Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for srp example #339

Merged
merged 4 commits into from Jan 17, 2017
Merged

Conversation

akohlmey
Copy link
Member

@akohlmey akohlmey commented Jan 14, 2017

This adds several changes fixing problems exposed by running the example for fix srp and a bug report for compute rdf reported on lammps-users

  • out-of-bound accesses when clearing the result array
  • the neighbor list build was operating on outdated pointers
  • we cannot copy a neighbor list with different newton settings, since a request with newton == 2 can request a list with newton_pair off when newton_pair is on.

The second fix seems to affect other problem reports as well, e.g. it fixes #316
As indicated in the comments, there may be a more general way to address the issue in Npair, however, this PR may serve as a temporary workaround.

…ing bins and stencil point, too.

in addition, relevant pointers were not properly initialized to NULL
@stanmoore1
Copy link
Contributor

stanmoore1 commented Jan 14, 2017

@akohlmey This change to npair.cpp is the same or similar change needed for #316. @sjplimp was looking into how he wanted to fix it.

@akohlmey
Copy link
Member Author

Yup, i can now successfully run both the gcmc test from anders and the basal input, too.

@stanmoore1
Copy link
Contributor

@sjplimp was considering a slightly different fix, we should coordinate

Copy link
Collaborator

@andeplane andeplane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me and solves the problem in #316. I'm curious about @sjplimp's other solution though.

…ghbor list, if newton settings are not compatible

an alternate route to address this issue would be to allow an "ANY" setting for neighbor list requests and then query the neighbor list for newton setting instead of the force class.
Copy link
Collaborator

@zqex zqex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix resolves the segmentation fault of the SRP example. However, comparing thermodynamic output to the last stable release 17/11 2016, shows all out put are completely off. Temperature ~100 not 1 and so on.

@akohlmey akohlmey requested a review from timsirk January 16, 2017 14:53
@akohlmey
Copy link
Member Author

@zqex, thanks for the feedback. it looks like there may be something else that needs to be looked at. perhaps there is a combination of issues and one of them is uncovered by one of the changes.
I've added @timsirk to the list of reviewers, perhaps he'll have an idea what else to look at.

@zqex
Copy link
Collaborator

zqex commented Jan 16, 2017

Having looked more closely into the output, and comparing the branch to the 17/11 2016 stable version, then most of the neighbor builds for this branch are dangeous and they contain much fewer ghosts and neighbors compared to the stable simulation. The fix srp modifies the cutoff after the neighbor lists have been initialized, perhaps they are not correctly reinitialized?

This branch:
Neighbor list info ...
update every 10 steps, delay 10 steps, check yes
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 3
ghost atom cutoff = 3
binsize = 1.5, bins = 7 7 7
3 neighbor lists, perpetual/occasional/extra = 3 0 0
(1) pair dpd, perpetual, skip from (3)
pair build: skip
stencil: none
bin: none
(2) pair srp, perpetual, skip from (3)
pair build: skip
stencil: none
bin: none
(3) pair hybrid, perpetual
pair build: half/bin/newton
stencil: half/bin/3d/newton
bin: standard
Setting up Verlet run ...
Unit style : lj
Current step : 0
Time step : 0.01
Removed/inserted 0/2700 bond particles. (../fix_srp.cpp:249)
Extending ghost comm cutoff. New 3.523044, old 3.000000. (../fix_srp.cpp:271)
Memory usage per processor = 19.9732 Mbytes
Step Temp PotEng v_nPotEng Press Atoms v_natoms Lx Ly Lz
0 0.98262092 17216.995 5.7389985 29.837398 5700 3000 10 10 10
:
:
:
1000 100.46527 217090.45 72.363485 112.09572 5700 3000 10 10 10
Deleted 2700 atoms, new total = 3000
Loop time of 9.08705 on 1 procs for 1000 steps with 3000 atoms

Performance: 95080.412 tau/day, 110.047 timesteps/s
100.0% CPU use with 1 MPI tasks x no OpenMP threads

MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total

Pair | 1.4452 | 1.4452 | 1.4452 | 0.0 | 15.90
Bond | 0.063011 | 0.063011 | 0.063011 | 0.0 | 0.69
Neigh | 7.0144 | 7.0144 | 7.0144 | 0.0 | 77.19
Comm | 0.42856 | 0.42856 | 0.42856 | 0.0 | 4.72
Output | 0.0031333 | 0.0031333 | 0.0031333 | 0.0 | 0.03
Modify | 0.10126 | 0.10126 | 0.10126 | 0.0 | 1.11
Other | | 0.03146 | | | 0.35

Nlocal: 3000 ave 3000 max 3000 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 12014 ave 12014 max 12014 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 91217 ave 91217 max 91217 min
Histogram: 1 0 0 0 0 0 0 0 0 0

Total # of neighbors = 91217
Ave neighs/atom = 30.4057
Ave special neighs/atom = 1.8
Neighbor list builds = 98
Dangerous builds = 94

Stable:
Neighbor list info ...
3 neighbor list requests
update every 10 steps, delay 10 steps, check yes
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 3
ghost atom cutoff = 3
binsize = 1.5, bins = 7 7 7
Setting up Verlet run ...
Unit style : lj
Current step : 0
Time step : 0.01
Removed/inserted 0/2700 bond particles. (../fix_srp.cpp:249)
Extending ghost comm cutoff. New 3.523044, old 3.000000. (../fix_srp.cpp:271)
Memory usage per processor = 13.5492 Mbytes
Step Temp PotEng v_nPotEng Press Atoms v_natoms Lx Ly Lz
0 0.98262092 31772.336 10.590779 60.690437 5700 3000 10 10 10
:
:
:
1000 1.032276 31816.562 10.605521 61.068891 5700 3000 10 10 10
Deleted 2700 atoms, new total = 3000
Loop time of 9.66309 on 1 procs for 1000 steps with 3000 atoms

Performance: 89412.347 tau/day, 103.487 timesteps/s
100.0% CPU use with 1 MPI tasks x no OpenMP threads

MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total

Pair | 6.6786 | 6.6786 | 6.6786 | 0.0 | 69.11
Bond | 0.062303 | 0.062303 | 0.062303 | 0.0 | 0.64
Neigh | 2.4634 | 2.4634 | 2.4634 | 0.0 | 25.49
Comm | 0.36922 | 0.36922 | 0.36922 | 0.0 | 3.82
Output | 0.0034096 | 0.0034096 | 0.0034096 | 0.0 | 0.04
Modify | 0.057141 | 0.057141 | 0.057141 | 0.0 | 0.59
Other | | 0.02904 | | | 0.30

Nlocal: 3000 ave 3000 max 3000 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 11830 ave 11830 max 11830 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 841452 ave 841452 max 841452 min
Histogram: 1 0 0 0 0 0 0 0 0 0

Total # of neighbors = 841452
Ave neighs/atom = 280.484
Ave special neighs/atom = 1.8
Neighbor list builds = 22
Dangerous builds = 0

@akohlmey
Copy link
Member Author

yup, that seems to be the case. when i manually reset the communication cutoff to, say, 4.0, the results are back to what they previously were.
so fix srp has to invalidate its neigbor list request, when changing the communication cutoff?

… minimum value needed and error out

i suspect, this communication cutoff adjustment was included into the code before it was possible to separately set it via comm_modify. stopping with an error message printing the needed/current value is cleaner, in keeping with other modules in LAMMPS and much less problematic.
@akohlmey
Copy link
Member Author

after some thinking about it, i have changed fix srp to not even try to adjust the communication cutoff and then having to deal with all kinds of re-initialization issues, but rather report the cutoff needed and then exist in case it is insufficient. it'll be easy to add a comm_modify cutoff command into the LAMMPS input with a suitable cutoff after this error.

@sjplimp sjplimp merged commit c52a263 into lammps:master Jan 17, 2017
@akohlmey akohlmey deleted the fixes-for-srp-example branch January 17, 2017 19:49
Copy link
Collaborator

@timsirk timsirk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks. nice catch on the out-of-bounds read. setting the comm cutoff in the script should be just as 'usable' as the auto-set.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory issue with neighbor list and gcmc
6 participants