Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enable OpenMP for prod_force and prod_virial #1360

Merged
merged 3 commits into from
Dec 19, 2021

Conversation

njzjz
Copy link
Member

@njzjz njzjz commented Dec 16, 2021

About 1 ms can be saved in each training step.

About 1 ms can be saved in each training step.
@njzjz njzjz closed this Dec 16, 2021
@njzjz njzjz reopened this Dec 16, 2021
@njzjz njzjz marked this pull request as draft December 16, 2021 03:11
@codecov-commenter
Copy link

codecov-commenter commented Dec 16, 2021

Codecov Report

Merging #1360 (3b19ddf) into devel (2223ff3) will decrease coverage by 1.21%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##            devel    #1360      +/-   ##
==========================================
- Coverage   75.53%   74.32%   -1.22%     
==========================================
  Files          91       91              
  Lines        7506     7482      -24     
==========================================
- Hits         5670     5561     -109     
- Misses       1836     1921      +85     
Impacted Files Coverage Δ
deepmd/infer/deep_tensor.py 56.09% <0.00%> (-39.03%) ⬇️
deepmd/train/run_options.py 71.42% <0.00%> (-15.37%) ⬇️
deepmd/loggers/loggers.py 41.90% <0.00%> (-9.53%) ⬇️
source/op/_gelu.py 71.42% <0.00%> (-6.35%) ⬇️
deepmd/train/trainer.py 69.73% <0.00%> (-2.74%) ⬇️
deepmd/loss/ener.py 47.34% <0.00%> (-0.24%) ⬇️
deepmd/utils/argcheck.py 90.03% <0.00%> (-0.20%) ⬇️
source/op/_tabulate_grad.py 100.00% <0.00%> (ø)
source/op/_prod_force_grad.py 100.00% <0.00%> (ø)
source/op/_prod_virial_grad.py 100.00% <0.00%> (ø)
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2223ff3...3b19ddf. Read the comment docs.

@njzjz njzjz marked this pull request as ready for review December 16, 2021 03:19
@@ -36,14 +36,17 @@ prod_force_a_cpu(

memset(force, 0.0, sizeof(FPTYPE) * nall * 3);
// compute force of a frame
#pragma omp parallel
Copy link
Collaborator

@wanghan-iapcm wanghan-iapcm Dec 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two threads with different i_idx may write on the same force force[j_idx * 3 + xxx], which gives unpredictable result.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that, so that pragma omp for is inside this loop and before another loop. The current version can pass the tests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I understand you compute the neighbors of the same atom using multi-threading.

What I do not understand is why you need omp parallel to generate threads here, but not at L49 where you really need multi-threading

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

@wanghan-iapcm wanghan-iapcm merged commit 7f3c218 into deepmodeling:devel Dec 19, 2021
@njzjz njzjz deleted the omp_force_virial branch December 21, 2021 23:39
njzjz added a commit to njzjz/deepmd-kit that referenced this pull request Aug 17, 2022
Sometimes when box is quite small (i.e. box size < 2 * rcut), the same atom may repeat to appear in the neighbor list. This cause inaccurate results when using OMP.
wanghan-iapcm pushed a commit that referenced this pull request Aug 19, 2022
* revert prod_force OMP in #1360

Sometimes when box is quite small (i.e. box size < 2 * rcut), the same atom may repeat to appear in the neighbor list. This cause inaccurate results when using OMP.

* do not update pip

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>

* revert pining pip; setting env for setuptools>=64

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
mingzhong15 pushed a commit to mingzhong15/deepmd-kit that referenced this pull request Jan 15, 2023
* revert prod_force OMP in deepmodeling#1360

Sometimes when box is quite small (i.e. box size < 2 * rcut), the same atom may repeat to appear in the neighbor list. This cause inaccurate results when using OMP.

* do not update pip

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>

* revert pining pip; setting env for setuptools>=64

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants