
Enable OpenMP in particle push and coordinate transformation routines. #241

Merged: 8 commits into ECP-WarpX:development on Sep 22, 2022

Conversation

@atmyers (Member) commented Sep 9, 2022

Close #195

To Do

  • turn tiling on
  • set a reasonable default tile size when OMP is selected (see the sketch after this list)
  • rebase and also OpenMP-accelerate new kernels from Space Charge Solver #162
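A minimal sketch of what the first two items could look like, assuming AMReX's particle-tiling switches (do_tiling and tile_size are static members of amrex::ParticleContainerBase; the tile size below is an illustrative placeholder, not the value this PR settles on):

```cpp
#include <AMReX_Particles.H>

// Sketch only: turn particle tiling on and pick a default tile size,
// but only when the build enables OpenMP.
void set_omp_particle_defaults ()
{
#ifdef AMREX_USE_OMP
    amrex::ParticleContainerBase::do_tiling = true;
    // Placeholder tile size; a real default would be tuned per machine.
    amrex::ParticleContainerBase::tile_size = amrex::IntVect(AMREX_D_DECL(8,8,8));
#endif
}
```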

@atmyers atmyers requested a review from ax3l September 9, 2022 20:24
@ax3l ax3l self-assigned this Sep 9, 2022
@ax3l ax3l added the backend: openmp Specific to OpenMP execution (CPUs) label Sep 9, 2022
@ax3l (Member) commented Sep 9, 2022

Thanks! Do we need to change some defaults, like the dynamic scheduling default we use in WarpX?
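For context, the standard way to change the default loop schedule at run time is the OpenMP runtime API; a minimal sketch of the generic mechanism, not necessarily WarpX's exact code (omp_set_schedule affects loops declared with schedule(runtime)):

```cpp
#include <omp.h>

// Sketch: make loops that use `schedule(runtime)` run with a dynamic
// schedule and chunk size 1, equivalent to setting OMP_SCHEDULE="dynamic,1".
void set_dynamic_schedule_default ()
{
#ifdef _OPENMP
    omp_set_schedule(omp_sched_dynamic, 1);
#endif
}
```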

@atmyers (Member, Author) commented Sep 15, 2022

Yes, at minimum we should turn tiling on and set a reasonable default tile size when OMP is selected. I will update.

@ax3l (Member) commented Sep 20, 2022

@atmyers thanks! We also merged #162 now. Can you please rebase and OpenMP-enable those routines as well? :)

@atmyers (Member, Author) commented Sep 21, 2022

Here are the current performance results on the expanding beam test (dynamic vs. static OpenMP scheduling at 1, 2, and 4 threads; TinyProfiler times in seconds):

dynamic_1.out:TinyProfiler total time across processes [min...avg...max]: 1.35 ... 1.35 ... 1.35
dynamic_2.out:TinyProfiler total time across processes [min...avg...max]: 0.8144 ... 0.8144 ... 0.8144
dynamic_4.out:TinyProfiler total time across processes [min...avg...max]: 0.5541 ... 0.5541 ... 0.5541
static_1.out:TinyProfiler total time across processes [min...avg...max]: 1.353 ... 1.353 ... 1.353
static_2.out:TinyProfiler total time across processes [min...avg...max]: 0.8157 ... 0.8157 ... 0.8157
static_4.out:TinyProfiler total time across processes [min...avg...max]: 0.5614 ... 0.5614 ... 0.5614

@ax3l (Member) commented Sep 22, 2022

Thank you! Can you perform a little test on a single CPU package comparing MPI w/ 1 OMP thread vs. all OMP threads?

@ax3l (Member) commented Sep 22, 2022

Can you please also update MFIter loops to be OpenMP accelerated?

@atmyers (Member, Author) commented Sep 22, 2022

Here is a comparison between pure MPI and pure OMP on the expanding beam test, with diagnostics disabled (times in seconds):

MPI:
1 rank,  1 thread: 1.357
2 ranks, 1 thread: 0.9569
4 ranks, 1 thread: 0.6122

OMP:
1 rank, 1 thread:  1.353
1 rank, 2 threads: 0.8035
1 rank, 4 threads: 0.6009


@ax3l ax3l merged commit 54741c5 into ECP-WarpX:development Sep 22, 2022

@@ -40,6 +40,9 @@ namespace impactx::spacecharge
space_charge_field.at(lev).at("y").setVal(0.);
space_charge_field.at(lev).at("z").setVal(0.);

#ifdef AMREX_USE_OMP
#pragma omp parallel if (amrex::Gpu::notInLaunchRegion())
#endif
for (amrex::MFIter mfi(phi.at(lev)); mfi.isValid(); ++mfi) {
@ax3l (Member) commented Sep 27, 2022 (review comment on the hunk above):

@WeiqunZhang commented:
we can also do tiling on CPU for MFIter loops (not yet done here).

Let's investigate if this helps us here and potentially make it a user option.

Example:
https://github.com/ECP-WarpX/WarpX/blob/3fe406c9701f61e07b23f7123cf0a7bad492c6dc/Source/Diagnostics/ComputeDiagFunctors/BackTransformFunctor.cpp#L110
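A minimal sketch of the tiled CPU variant being suggested, assuming a MultiFab named phi (amrex::TilingIfNotGPU() is an AMReX helper that enables tiling only outside a GPU launch region):

```cpp
#ifdef AMREX_USE_OMP
#pragma omp parallel if (amrex::Gpu::notInLaunchRegion())
#endif
for (amrex::MFIter mfi(phi, amrex::TilingIfNotGPU()); mfi.isValid(); ++mfi) {
    // tilebox() hands each thread a tile of a box rather than a whole box,
    // which is what distinguishes this from the non-tiled loop in the diff.
    const amrex::Box& bx = mfi.tilebox();
    auto const& arr = phi.array(mfi);
    // ... apply the kernel to arr over bx ...
}
```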
