Mesh refinement: fix MR on GPUs #564

Merged
SeverinDiederichs merged 12 commits into Hi-PACE:development from SeverinDiederichs:mr_fix_gpus
Jul 21, 2021

@SeverinDiederichs SeverinDiederichs commented Jul 14, 2021

Currently, running with level 1 (mesh refinement) does not work on GPUs. This PR resolves the issue.

The problem lies in the field solver: the staging area, which is used to perform the discrete sine transform (DST), was based on the real box array. However, on level 1, these boxes have an offset because not all cells are valid on level 1. This causes out-of-bounds memory accesses in the field solver.
To resolve the issue, the staging area is now created like the complex array, without an offset. All functions interacting with the staging area had to be adapted to take the offset into account.
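
The offset handling described above can be illustrated with a minimal sketch. This is not the HiPACE++ code: `Array2D`, `make_staging`, `copy_to_staging`, and `copy_from_staging` are hypothetical names, and the example assumes a 2D box with inclusive lower and upper index bounds. It only shows the idea of a zero-based staging area that is addressed with `index - box.lo`.

```python
class Array2D:
    """Hypothetical stand-in for data defined on an index box [lo, hi] (inclusive)."""

    def __init__(self, lo, hi, fill=0.0):
        self.lo = tuple(lo)
        self.hi = tuple(hi)
        self.nx = hi[0] - lo[0] + 1
        self.ny = hi[1] - lo[1] + 1
        self.data = [fill] * (self.nx * self.ny)

    def _flat(self, i, j):
        # Bounds check: this is where an unshifted index would be caught.
        if not (self.lo[0] <= i <= self.hi[0] and self.lo[1] <= j <= self.hi[1]):
            raise IndexError(f"({i}, {j}) is outside box {self.lo}..{self.hi}")
        return (i - self.lo[0]) + self.nx * (j - self.lo[1])

    def get(self, i, j):
        return self.data[self._flat(i, j)]

    def set(self, i, j, value):
        self.data[self._flat(i, j)] = value


def make_staging(field):
    """Create a staging area with the same size as `field`, but starting at (0, 0)."""
    return Array2D((0, 0), (field.nx - 1, field.ny - 1))


def copy_to_staging(field, staging):
    """Copy `field` into the zero-based staging area, subtracting the box offset."""
    off_x, off_y = field.lo
    for j in range(field.lo[1], field.hi[1] + 1):
        for i in range(field.lo[0], field.hi[0] + 1):
            staging.set(i - off_x, j - off_y, field.get(i, j))


def copy_from_staging(staging, field):
    """Copy the staging area back into `field`, adding the box offset again."""
    off_x, off_y = field.lo
    for j in range(staging.lo[1], staging.hi[1] + 1):
        for i in range(staging.lo[0], staging.hi[0] + 1):
            field.set(i + off_x, j + off_y, staging.get(i, j))
```

In this sketch, a level-1 box starting at `(3, 3)` maps its first cell to staging index `(0, 0)`. Addressing the staging area with the unshifted box index instead raises an `IndexError`, which is the Python analogue of the out-of-bounds access on the GPU.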

Using a version which includes #561, I ran the full MR test:

amr.n_cell = 128 128 300
hipace.patch_lo = -1 -1 -3.5
hipace.patch_hi =  1  1 -1
amr.ref_ratio_vect =  8 8 1

hipace.normalized_units=1
hipace.predcorr_max_iterations = 30
hipace.predcorr_B_mixing_factor = 0.05
hipace.predcorr_B_error_tolerance = 4e-2

amr.blocking_factor = 4
amr.max_level = 1

max_step = 0
hipace.output_period = 1

hipace.numprocs_x = 1
hipace.numprocs_y = 1

hipace.depos_order_xy = 2

geometry.coord_sys   = 0                  # 0: Cartesian
geometry.is_periodic =  1     1     0      # Is periodic?
geometry.prob_lo     = -8.   -8.   -6    # physical domain
geometry.prob_hi     =  8.    8.    6

beams.names = beam beam2
beam.injection_type = fixed_weight
beam.num_particles = 1000000
beam.profile = gaussian
beam.zmin = -5.9
beam.zmax = 5.9
beam.radius = 1.2
beam.density = 20.
beam.u_mean = 0. 0. 2000
beam.u_std = 0. 0. 0.
beam.position_mean = 0. 0. 2
beam.position_std = 0.3 0.3 0.5
beam.ppc = 1 1 1
beam.finest_level = 0

beam2.injection_type = fixed_weight
beam2.num_particles = 1000000
beam2.profile = can
beam2.zmin = -1.5
beam2.zmax = -3.0
beam2.radius = 1.2
beam2.density = 20000.
beam2.u_mean = 0. 0. 2000
beam2.u_std = 0. 0. 0.
beam2.position_mean = 0. 0. 0
beam2.position_std = 0.1 0.1 0.2
beam2.ppc = 1 1 1
beam2.finest_level = 1

plasmas.names = plasma ions
plasma.density = 1.
plasma.ppc = 1 1
plasma.u_mean = 0.0 0.0 0.
plasma.element = electron
plasma.level = 0

ions.density = 1.
ions.ppc = 1 1
ions.u_mean = 0.0 0.0 0.
ions.element = proton
ions.level = 1
ions.neutralize_background = 0

diagnostic.diag_type = xyz
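
For orientation, the resolution implied by the input above can be worked out from `geometry.prob_lo`, `geometry.prob_hi`, `amr.n_cell`, `hipace.patch_lo`, `hipace.patch_hi`, and `amr.ref_ratio_vect`. The helper below is a hypothetical sketch, not part of HiPACE++. It assumes the level-1 cell size is the level-0 cell size divided by the refinement ratio, and it does not model how the code aligns the patch with the coarse grid.

```python
def level1_resolution(prob_lo, prob_hi, n_cell, patch_lo, patch_hi, ref_ratio):
    """Return per-direction cell sizes and the patch extent in coarse cells."""
    result = []
    for lo, hi, n, plo, phi, ratio in zip(
        prob_lo, prob_hi, n_cell, patch_lo, patch_hi, ref_ratio
    ):
        coarse_dx = (hi - lo) / n          # level-0 cell size
        fine_dx = coarse_dx / ratio        # assumed level-1 cell size
        patch_coarse_cells = (phi - plo) / coarse_dx  # may be non-integer
        result.append(
            {
                "coarse_dx": coarse_dx,
                "fine_dx": fine_dx,
                "patch_coarse_cells": patch_coarse_cells,
            }
        )
    return result


# Values copied from the input file above.
info = level1_resolution(
    prob_lo=(-8.0, -8.0, -6.0),
    prob_hi=(8.0, 8.0, 6.0),
    n_cell=(128, 128, 300),
    patch_lo=(-1.0, -1.0, -3.5),
    patch_hi=(1.0, 1.0, -1.0),
    ref_ratio=(8, 8, 1),
)
for name, entry in zip("xyz", info):
    print(name, entry)
```

Under these assumptions, the transverse patch covers 16 coarse cells per direction with a fine cell size of 0.015625, while the longitudinal direction is not refined.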

Previously, I found some differences between CPU and GPU; however, they were caused by different beam initialization. When the beam is read from a file, both give the same result:
image
This was tested in both Debug and normal mode.

Using only this PR (without #561, so no plasma is allowed on level 1) and a grid current example, CPU and GPU give the same result:
image

  • Small enough (< few 100s of lines), otherwise it should probably be split into smaller PRs
  • Tested (describe the tests in the PR description)
  • Runs on GPU (basic: the code compiles and runs well with the new module)
  • Contains an automated test (checksum and/or comparison with theory)
  • Documented: all elements (classes and their members, functions, namespaces, etc.) are documented
  • Constified (All that can be const is const)
  • Code is clean (no unwanted comments)
  • Style and code conventions listed at the bottom of https://github.com/Hi-PACE/hipace are respected
  • Proper label and GitHub project, if applicable

@MaxThevenet

Awesome, thanks for this PR!

@SeverinDiederichs SeverinDiederichs merged commit 664d769 into Hi-PACE:development Jul 21, 2021