Refactor and benchmark Particle #2296
Apply the C++11 Rule of Five: default constructor, copy constructor, move constructor, copy assignment operator, move assignment operator, destructor.
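As an illustration of what these special member functions look like for a type that owns a resource, here is a minimal sketch. The class `Buffer` and all of its members are hypothetical and are not taken from the ESPResSo sources; strictly speaking, the Rule of Five covers the destructor and the copy/move operations, and the default constructor is shown alongside them for completeness.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical resource-owning class, for illustration only.
class Buffer {
public:
  // default constructor: owns nothing
  Buffer() noexcept : m_size(0), m_data(nullptr) {}
  // convenience constructor: n zero-initialized values
  explicit Buffer(std::size_t n)
      : m_size(n), m_data(n ? new double[n]() : nullptr) {}
  // copy constructor: deep copy of the owned array
  Buffer(Buffer const &other)
      : m_size(other.m_size),
        m_data(other.m_size ? new double[other.m_size] : nullptr) {
    if (m_data)
      std::copy(other.m_data, other.m_data + m_size, m_data);
  }
  // move constructor: steal the array, leave the source empty
  Buffer(Buffer &&other) noexcept
      : m_size(other.m_size), m_data(other.m_data) {
    other.m_size = 0;
    other.m_data = nullptr;
  }
  // copy assignment via copy-and-swap (safe for self-assignment)
  Buffer &operator=(Buffer const &other) {
    Buffer tmp(other);
    swap(tmp);
    return *this;
  }
  // move assignment: release our array, then steal the source's
  Buffer &operator=(Buffer &&other) noexcept {
    if (this != &other) {
      delete[] m_data;
      m_size = other.m_size;
      m_data = other.m_data;
      other.m_size = 0;
      other.m_data = nullptr;
    }
    return *this;
  }
  // destructor: release the array
  ~Buffer() { delete[] m_data; }

  void swap(Buffer &other) noexcept {
    std::swap(m_size, other.m_size);
    std::swap(m_data, other.m_data);
  }
  std::size_t size() const noexcept { return m_size; }
  double &operator[](std::size_t i) { return m_data[i]; }
  double const &operator[](std::size_t i) const { return m_data[i]; }

private:
  std::size_t m_size;
  double *m_data;
};

// Usage: copies are independent, moves leave the source empty.
inline void buffer_demo() {
  Buffer a(3);
  a[0] = 1.5;
  Buffer b(a); // deep copy
  b[0] = 2.5;
  assert(a[0] == 1.5);
  Buffer c(std::move(a)); // a is now empty
  assert(a.size() == 0 && c.size() == 3);
}
```

When every resource is held by a `std::unique_ptr`, the move operations and the destructor can usually be defaulted, and only the copy operations need to be written by hand.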
Codecov Report
@@ Coverage Diff @@
## python #2296 +/- ##
=======================================
+ Coverage 71% 71% +<1%
=======================================
Files 380 380
Lines 18938 19039 +101
=======================================
+ Hits 13588 13661 +73
- Misses 5350 5378 +28
e8e2a9b to e74391e
Resources in struct Particle are now owned by pointers, except struct ParticlePosition, which has to be contiguous in memory.
Resources in struct Particle are now owned by unique_ptr's, except struct ParticlePosition, which has to be contiguous in memory.
Benchmarking suite to measure the integrator execution time under various conditions: build features, struct Particle implementation, number of cores, number of particles, interaction types.
e74391e to a6a9193
Revert resource management edits for the ParticleForce member of struct Particle.
As suggested by @RudolfWeeber and @fweik, the ParticleForce member is now contiguous:
struct Particle {
  ParticlePosition r;
  ParticleForce f;
  std::unique_ptr<ParticleProperties> p;
  std::unique_ptr<ParticleLocal> l;
  std::unique_ptr<ParticleMomentum> m;
  /* followed by feature-enabled members */
};
Using a subset of the tests (LJ gas and P3M saline solution), the slowdown is now 16% instead of 23%. Adding more features still affects performance:
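To make the trade-off in this layout concrete, here is a minimal sketch of a particle type that keeps position and force inline and moves the remaining data behind a `std::unique_ptr`. The substruct contents, the `padding` array and the names `ParticleSketch` and `ParticleFlat` are assumptions made for illustration; the real ESPResSo substructs have different members and sizes.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Placeholder substructs; the fields are illustrative assumptions.
struct ParticlePosition { double p[3] = {0., 0., 0.}; };
struct ParticleForce { double f[3] = {0., 0., 0.}; };
struct ParticleProperties {
  int identity = -1;
  double mass = 1.;
  double padding[32] = {}; // stands in for many feature-enabled fields
};

// Position and force are stored inline; properties live on the heap.
struct ParticleSketch {
  ParticlePosition r;
  ParticleForce f;
  std::unique_ptr<ParticleProperties> p;

  ParticleSketch() : p(new ParticleProperties()) {}
  // unique_ptr is move-only, so copying needs an explicit deep copy
  ParticleSketch(ParticleSketch const &o)
      : r(o.r), f(o.f), p(o.p ? new ParticleProperties(*o.p) : nullptr) {}
  ParticleSketch &operator=(ParticleSketch const &o) {
    if (this != &o) {
      r = o.r;
      f = o.f;
      p.reset(o.p ? new ParticleProperties(*o.p) : nullptr);
    }
    return *this;
  }
  ParticleSketch(ParticleSketch &&) noexcept = default;
  ParticleSketch &operator=(ParticleSketch &&) noexcept = default;
  ~ParticleSketch() = default;
};

// Same data with every member stored inline, for size comparison.
struct ParticleFlat {
  ParticlePosition r;
  ParticleForce f;
  ParticleProperties p;
};

static_assert(sizeof(ParticleSketch) < sizeof(ParticleFlat),
              "indirection shrinks the inline footprint");

inline void layout_demo() {
  ParticleSketch a;
  a.p->mass = 2.;
  ParticleSketch b(a); // deep copy: b owns its own properties
  b.p->mass = 3.;
  assert(a.p->mass == 2.);
}
```

The smaller struct packs more particles per cache line when only `r` and `f` are touched, but every access through `p` costs an extra pointer dereference, which is one plausible source of the slowdown measured here.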
Create a struct ParticleExtended to group together all unique_ptr defined in struct Particle.
The short range loop has to access charges and types frequently.
Moved charge and type to the main Particle struct to make the short range loop evaluate faster. LJ liquid is still 10% slower, LJ gas is 20% slower. Electrostatics is now slightly faster. The benchmarks run without
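The idea of keeping frequently read fields in the main struct can be sketched as follows. The names `HotParticle`, `ColdData` and `coulomb_pair_sum` are hypothetical, the Coulomb prefactor is set to 1, and there is no cutoff or periodic boundary handling; this is not the ESPResSo short-range kernel, only an illustration of a pair loop that never dereferences the pointer to rarely used data.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <memory>
#include <vector>

// Rarely accessed data, kept behind a pointer (illustrative fields).
struct ColdData {
  double mass = 1.;
  double ext_force[3] = {0., 0., 0.};
};

// Fields read in the inner pair loop are stored inline.
struct HotParticle {
  double pos[3];
  double q;  // charge, read for every pair
  int type;  // type, read for every pair
  std::unique_ptr<ColdData> cold;
};

// Hypothetical helper to build a particle.
inline HotParticle make_particle(double x, double y, double z, double q,
                                 int type) {
  HotParticle part;
  part.pos[0] = x;
  part.pos[1] = y;
  part.pos[2] = z;
  part.q = q;
  part.type = type;
  part.cold.reset(new ColdData());
  return part;
}

// Sum of q_i * q_j / r_ij over all distinct pairs (prefactor 1).
// Only inline members are read; `cold` is never dereferenced.
inline double coulomb_pair_sum(std::vector<HotParticle> const &parts) {
  double energy = 0.;
  for (std::size_t i = 0; i < parts.size(); ++i) {
    for (std::size_t j = i + 1; j < parts.size(); ++j) {
      double d2 = 0.;
      for (int k = 0; k < 3; ++k) {
        double const d = parts[i].pos[k] - parts[j].pos[k];
        d2 += d * d;
      }
      energy += parts[i].q * parts[j].q / std::sqrt(d2);
    }
  }
  return energy;
}

inline void hot_fields_demo() {
  std::vector<HotParticle> parts;
  parts.push_back(make_particle(0., 0., 0., 1., 0));
  parts.push_back(make_particle(2., 0., 0., -1., 1));
  assert(coulomb_pair_sum(parts) == -0.5); // (+1)(-1)/2
}
```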
@jngrad could you please put the benchmarks into a separate pull request? I think we can merge those quickly and there is already interest to extend them.
@jngrad, could you please run the corrected benchmarks on the various particle implementations again?
Using commits
3061: Avoid unneeded reallocations of ghost data r=jngrad a=fweik
So I looked at profiles of the particle exchange code, and noticed some things... In some situations this can improve the performance considerably, so I think this should go into the release. This has some overlap with #2296, but the approach is similar, and this is slightly cleaner and can be merged. It's slightly hackish, but I think this is the best we can do with the current `Cell` data structure.
Description of changes:
- Realloc ghost particles only if needed
- Fixed memleak in resort code
- Fixed corner-case of Utils::List::resize
- Removed exclusions from ghosts

3067: Document bonded potentials r=KaiSzuttor a=jngrad
Fixes #3049
Description of changes:
- bonded IAs listed in #3049 are now documented in both Sphinx and Doxygen
- cleaned up outdated docstrings
- re-shuffled paragraphs in [7. Bonded interactions](http://espressomd.org/html/doc/inter_bonded.html)

Co-authored-by: Florian Weik <fweik@icp.uni-stuttgart.de>
Co-authored-by: Jean-Noël Grad <jgrad@icp.uni-stuttgart.de>
@jngrad can this be closed for now?
Yes, this has diverged too much. It will be easier to restart directly from
This PR is not meant to be merged (yet).
Following our exchange on #2239, I've developed a benchmarking suite to measure the integration time of pre-equilibrated particle simulations. The suite currently provides scripts for a LJ gas, a LJ liquid, a P3M saline solution and a P3M ionic gas, and runs them using 1/2/4/8/16 cores, 1k/10k particles per core, and 3 versions of `myconfig.hpp`.

The benchmarks are activated in CMake with `-DWITH_BENCHMARKS=ON`. To add your own benchmarks, use `/src/maintainer/benchmarks/lj.py` as a template and add a new line in `/src/maintainer/benchmarks/CMakeLists.txt`. Pass option `--visualize` to `lj.py` to start OpenGL and visualize the simulation.

Execute `/maintainer/benchmarks/suite.sh` to run the benchmarks. The script loops over various `myconfig.hpp` files and measures the impact of including more features in the `struct Particle`. The script also loops over commits in the git history to measure the performance of various implementations of `struct Particle`. The results are stored in `benchmarks.csv`, where the relevant fields are:
- `commit`: git commit (currently testing the original `struct Particle` implementation, a raw pointer implementation and a unique_ptr implementation)
- `config`: version of `myconfig.hpp` (currently testing minimal, default and maxset)
- `script`: benchmark file (currently running `lj.py` and `p3m.py`)
- `arguments`: space-separated list of arguments passed to the benchmark
- `cores`: MPI cores (1 if `mpiexec` was not used)
- `MPI`: `True` if `mpiexec` was used, `False` otherwise
- `mean`: mean execution time of a single integration step (seconds)
- `ci`: 95% confidence interval for the mean

Debug fields:
- `steps_per_tick`: sample size of the mean
- `duration`: duration of the simulation (seconds), should be 1-2 min long
- `E1`, `E2`, `E3`: energy values from the final state of the simulations (to check the benchmark is reproducible)

The benchmark takes 20 hours on my workstation, you can get the raw results here (I'll run it on bee next week). The pointer implementation of `struct Particle` runs 23% slower than the original implementation, for both raw and smart pointers. Using the original implementation and default features as a baseline, the increase in integration time can be broken down as follows:

The pointer implementation reduces the struct size from 928 to 208 bytes. Adding more features increases the execution time in the same way no matter the implementation. We need to change that by re-shuffling the members of Particle substructs to minimize the slow-down.
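For readers who want to reproduce the `mean` and `ci` fields of `benchmarks.csv` from raw timings, the following is a minimal sketch. It assumes a normal approximation (half-width 1.96 times the standard error of the mean); the benchmark scripts may compute the interval differently, for example with a Student t quantile. The names `TimingSummary` and `summarize` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary of per-step timings (seconds).
struct TimingSummary {
  double mean; // sample mean
  double ci;   // half-width of the 95% confidence interval
};

// Mean and 95% CI half-width under a normal approximation (assumption).
// Requires at least two samples for the sample variance.
inline TimingSummary summarize(std::vector<double> const &t) {
  assert(t.size() >= 2);
  double const n = static_cast<double>(t.size());
  double sum = 0.;
  for (std::size_t i = 0; i < t.size(); ++i)
    sum += t[i];
  double const mean = sum / n;
  double sq = 0.;
  for (std::size_t i = 0; i < t.size(); ++i)
    sq += (t[i] - mean) * (t[i] - mean);
  double const variance = sq / (n - 1.); // unbiased sample variance
  double const sem = std::sqrt(variance / n);
  return {mean, 1.96 * sem};
}

inline void summary_demo() {
  TimingSummary const s = summarize({1., 2., 3., 4., 5.});
  assert(std::fabs(s.mean - 3.) < 1e-12);
  assert(std::fabs(s.ci - 1.96 * std::sqrt(0.5)) < 1e-12);
}
```

Two implementations can then be compared by checking whether their intervals `mean ± ci` overlap before reading much into a difference of a few percent.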