Optimized Dynamical Matrix command #1314
Conversation
I guess I am too stubborn. Is there a flag to tell if an atom is a ghost or local? I am double counting atoms when I take into consideration the ghosts for local_idx.
An atom is local if its index is smaller than atom->nlocal.
--
Dr. Axel Kohlmeyer akohlmey@gmail.com http://goo.gl/1wk0
College of Science & Technology, Temple University, Philadelphia PA, USA
International Centre for Theoretical Physics, Trieste. Italy.
Great, that is what I did in my outdated code. I am glad to know that it is always true.
Alright, I made it work. I will admit that it is better in every possible way. I still need to implement the group functionality, rewrite the third order code, and update the examples. Thank you for pushing me and improving my product.
I've looked over this version of the code and made some comments and suggestions for changes.
src/USER-PHONON/dynamical_matrix.cpp (outdated):

    double **f = atom->f;

    //initialize dynmat to all zeros
    for (int i=0; i < dynlen; i++)
dynlen is of type bigint, so should be i and j here.
src/USER-PHONON/dynamical_matrix.cpp (outdated):

    if (comm->me == 0 && screen) fprintf(screen,"Calculating Dynamical Matrix...\n");

    for (int i=1; i<=natoms; i++){
natoms is of type bigint, so should be i here.
src/USER-PHONON/dynamical_matrix.cpp (outdated):

    for (int j=0; j < dynlen; j++)
        dynmat[i][j] = 0.;

    energy_force(0);
why call energy_force() here?
Great question. I believe that it was for testing purposes initially. In any case, it is outdated.
    // compute all forces
    force_clear();
    external_force_clear = 0;
The steps below are essentially what is in energy_force(), are they not?
I would measure the duration of the force computation, as that can be used as an estimate for how long it will take to complete the command (wall_time of one force computation * 6 * dynlen * dynlen, if i am not mistaken).
src/USER-PHONON/dynamical_matrix.cpp (outdated):

    }

    memory->create(final_dynmat,int(dynlen),int(dynlen),"dynamic_matrix_buffer:buf");
This limits dynlen to be a 32-bit integer only. Now, it is going to be quite slow to compute the dynamical matrix for a very large system, but then there should be a test at the beginning checking that dynlen isn't larger than MAXSMALLINT. Then it could also be made an int, and loop indices correspondingly.
src/USER-PHONON/dynamical_matrix.h (outdated):

    int scaleflag;
    int me;
    bigint dynlen;
    double **dynmat;
dynmat does not have to be a class member, and it does not have to be a full matrix but a 3 x dynlen size matrix, since the matrix is constructed 3 rows at a time.
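As a sketch of that layout (not the actual LAMMPS code), a 3 x dynlen buffer holding only the rows produced by displacing one atom in x, y, and z could look like:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row block for one displaced atom: 3 rows (x, y, z displacement)
// of length dynlen each, instead of a full dynlen x dynlen matrix.
struct RowBlock {
  std::size_t dynlen;
  std::vector<double> rows;  // 3 * dynlen entries, zero-initialized
  explicit RowBlock(std::size_t n) : dynlen(n), rows(3 * n, 0.0) {}
  double &at(int alpha, std::size_t j) { return rows[alpha * dynlen + j]; }
};
```

The block is reused (re-zeroed) for each displaced atom, so memory stays O(dynlen) instead of O(dynlen^2).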
Right. I will move it into local scope.
    neighbor->delay = 1;
    neighbor->ago = 0;
    neighbor->ndanger = 0;
Here is an idea for how you can add group support:

- count the number of atoms in the group and store it as a class member. that number replaces atom->natoms to determine the per-atom loop count in the computation (nelem)
- allocate a vector idxmap[nelem] and a vector localmap[atom->nmax] on every MPI rank and zero them
- on each MPI rank do a loop from 0 to atom->nlocal (exclusive) and, if the loop index i is a member of the group, add atom->tag[i] to localmap[]
- on MPI rank zero, copy the non-zero elements of localmap[] to idxmap[]
- now loop over all MPI ranks and have them send their localmap to the one on rank 0, and append the non-zero elements to idxmap[]
- now sort idxmap[] (if desired) and broadcast it to all MPI ranks. use the content of idxmap instead of the loop from 1 to atom->natoms (inclusive) for the loops in the dynamical matrix calculation
- when outputting the dynamical matrix, also provide this list, so that the matrix elements can be identified by atom id and direction
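The per-rank part of these steps (building localmap from the group mask) could be sketched like this; the mask, tag, groupbit, and nlocal arguments stand in for the corresponding LAMMPS per-atom data:

```cpp
#include <cassert>
#include <vector>

// Per-rank step of the suggested group map build: zero a localmap of
// size nmax, then record the atom tag for every owned atom that is a
// member of the group (mask[i] & groupbit test, as in LAMMPS).
std::vector<int> build_localmap(const std::vector<int> &mask,
                                const std::vector<int> &tag,
                                int groupbit, int nlocal) {
  std::vector<int> localmap(mask.size(), 0);
  for (int i = 0; i < nlocal; i++)
    if (mask[i] & groupbit) localmap[i] = tag[i];
  return localmap;
}
```

The non-zero entries of this vector are what each rank would then send to rank 0 for assembly into idxmap[].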
I believe I already implemented exactly this solution.
nope. my suggestion will reduce the amount of memory used, yours will always require storage of the dimension natoms.
src/USER-PHONON/dynamical_matrix.cpp (outdated):

    return negative gradient for nextra_global dof in fextra
    ------------------------------------------------------------------------- */

    void DynamicalMatrix::energy_force(int resetflag)
calling this function energy_force() is a misnomer, since you are setting eflag and vflag to zero, and thus do not tally energy or virial contributions.
I just wanted to make it look similar to other class methods. Would you have me change other classes?
Since this is an "internal" function, its name only has to make sense within this class.
src/USER-PHONON/dynamical_matrix.cpp
Outdated
if (local_idx >= 0){ | ||
for (int alpha=0; alpha<3; alpha++){ | ||
displace_atom(local_idx, alpha, 1); | ||
energy_force(0); |
energy_force() must be called by all MPI ranks, not only those owning the displaced atom.
I have fallen for this trap enough times. I have fixed this.
    dynmat[(i-1)*3+alpha][(j-1)*3+beta] *= conversion;
            }
        }
    }
if dynmat is changed to be a (local) 3 x dynlen matrix, then it must be reduced here and added to the global dynamical matrix. This would reduce the memory consumption for large systems significantly.
I see. Memory consumption > MPI overhead.
yes. you can reduce memory use even more by outputting the dynamical matrix here directly. in LAMMPS we try to distribute data as much as possible, and thus we try to avoid allocations of size atom->natoms (but that is not always possible). atom->natoms with a constant factor is similar, but atom->natoms * atom->natoms is a much bigger problem, as memory use then grows quadratically with the number of atoms and cannot be reduced by parallelization.
...and if quadratic size cannot be avoided, then at least it should be only on one MPI rank (i.e. use MPI_Reduce() instead of MPI_Allreduce()), namely the one writing the file.
The whole matrix would really only be needed, e.g. if you include some postprocessing, and even then with additional effort, that could be distributed.
But then again, any system size where this would matter is probably going to take such a long time to compute that nobody would want to do it in the first place. Hence my suggestion to run the force computation once, record the time it takes for one force compute, and then write out an estimate for the total time it would take to compute the matrix.
I should definitely do MPI_Reduce instead of all reduce. I wrote that back when I was learning MPI...
But then again, any system size where this would matter is probably going to take such a long time to compute that nobody would want to do it in the first place. Hence my suggestion to run the force computation once, record the time it takes for one force compute, and then write out an estimate for the total time it would take to compute the matrix.
Should I make a check_comp_time method?
That seems a bit overkill to me. i would just leave the one call to the force computation in the setup() method (that seems the best place for it) and measure its time. You could then use the following code (adapted from src/run.cpp):

    timer->init();
    timer->barrier_start();
    calculateMatrix();
    timer->barrier_stop();
If you keep the timer calls in the force routine, as they were in verlet.cpp or similar, then you have a nice breakdown of how the time is spent on the different force computations, in the summary output from the Finish class.
The call in the setup function could in the future also be used to get per MPI rank timer information to be used for load balancing, although that is something that can be done outside the command using a run 0 and a balance command. There are lots of additional 'gimmicks' that can be added later, a timeout (for when running with batch systems), a restart feature (for very large systems) etc.
I understand how you feel. I have been in the same situation many times myself and it took quite a while to manage the resulting frustration. Later I have learned to appreciate that somebody saying "no" to my contributions doesn't mean "no" to myself, but that there is value in the rejection by leading to an overall better effort. In an open source effort like LAMMPS, it is particularly difficult to say "no" to a submission (unless it is complete and obvious garbage), since the well-being of the project is dependent on contributions, and there should be a reward for people giving their changes/additions back to the project (instead of keeping them to themselves).
src/USER-PHONON/dynamical_matrix.cpp (outdated):

    //find number of local atoms in the group (final_gid)
    for (int i=1; i<=natoms; i++){
        local_idx = atom->map(i);
        if (mask[local_idx] & groupbit && local_idx < nlocal)
This does not account for local_idx < 0 for atoms that are not owned, and will thus cause an illegal memory access. Try:

    if ((local_idx >= 0) && (local_idx < nlocal) && mask[local_idx] & groupbit)
src/USER-PHONON/dynamical_matrix.cpp
Outdated
|
||
//combine subgroup maps into total temporary groupmap | ||
MPI_Allgatherv(sub_groupmap,gid,MPI_INT,temp_groupmap,recv,displs,MPI_INT,world); | ||
std::sort(temp_groupmap,temp_groupmap+group->count(igroup)); |
You have to add #include <algorithm> if you want to use std::sort(). As an alternative, you could also use our internal C-style sort from the mergesort.h header.
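For illustration, a minimal standalone use of std::sort on a gathered tag array (the tag values here are made up):

```cpp
#include <algorithm>  // required for std::sort
#include <cassert>

// Sort the gathered group map in place, as done after MPI_Allgatherv
// so that atom tags appear in ascending order.
void sort_groupmap(int *temp_groupmap, int count) {
  std::sort(temp_groupmap, temp_groupmap + count);
}
```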
    double conv_distance;
    double conv_mass;
    double del;
    int igroup,groupbit;
you should also store the result of group->count(igroup) here, since the count function incurs an MPI collective operation.
@charlessievers please put a note here on github when you would like me to do the next pass of reviewing your contribution. This would then also include compiling the code and running the provided examples with an instrumented binary and/or using valgrind's memcheck tool.
@akohlmey that would be great. I am currently making three more examples. I should be done with them by tomorrow.
@charlessievers i looked over your latest changes and made some corrections, and the command now appears to work. Please let me know ASAP how you want to proceed. I am planning to make a patch release later today, and this could be included, or it would have to wait (possibly 3-4 more weeks) until the next patch.
I would like to merge the dynamical matrix command. I will keep working on the third order. How would you like me to proceed?
i'll just delete the third order code from this branch and ask other LAMMPS developers to have a look and approve the merge. I will also delete the examples using ASE (for now). They should be accompanied with instructions about how to get and set up ASE and how to run those inputs. I know a little bit about this, so I know how to deal with it, but other LAMMPS users will be confused. I am not opposing it (in fact, we would very much welcome a small tutorial addition to the LAMMPS manual describing how to use LAMMPS through ASE), but we want to avoid including anything that is not properly documented. after the merge (and patch release) you can then create a new branch, re-add the content that i removed, and submit a new pull request, and we keep working on that until it is ready to be merged.
Perfect, will do!
great. please make sure you pull the branch now, so you have my changes and bugfixes also for
Just pulled.
one more thing, can you give me a "permanent" e-mail address, that i can add to the readme in case people (or the LAMMPS developers) need to contact you about your contribution? thanks, |
Best,
It looks like this has already been merged by @akohlmey but I have a couple of comments anyway.
- Well done @charlessievers! This is a great addition, something that has been talked about for years.
- What about providing an option to report instead the Hessian d2U/dR_j^2 in units of energy/distance^2?
- It would be better if you could avoid hard-coding your own unit conversion factors and instead use built-in lammps constants, e.g. force->mvv2e.
- Why is the dynamical matrix always in units of 10 J/mol/A^2/g/mol? It would be more LAMMPSific if the units changed with unit_style.
@athomps if you had checked the discussion at the very end of the PR (waaaay down), you would have seen that this is only part 1: there is going to be a second PR with a third_order command and the opportunity to improve that and the dynamical matrix command. there should also be some python code included to diagonalize the matrix and documented examples setting things up in a very elegant way through ASE.

@charlessievers, when you submit the new pull request, try to assign it to @athomps directly, so he can help you first to improve it and then assign it to me for final checks, docs cleanup, and to merge like on this one.
I just looked at this again. I see that it uses O(N) memory by writing one row at a time. This is very good. Also, because it does not use a neighbor list, it should work for any potential, including things like EAM, kspace, maybe even QEQ, all of which are non-trivial. Well done @charlessievers and @akohlmey
For systems that are bigger than the interaction cutoff, the number of non-zero entries in each row is roughly constant, i.e. the matrix sparsity ~ 1/L^3. This could be handled optionally by only writing non-zero entries in each row, preceded by a list of column indices for that row. Sparseness could also be exploited in the computation, by applying a distance check or neighbor check, but it is not easy to ensure correctness for all potentials.
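A sketch of that optional sparse row format, keeping only (column, value) pairs above a tolerance (the helper name and the tolerance are illustrative, not part of the command):

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Compress one matrix row to its non-zero entries: column indices
// paired with values, as suggested above for sparse output.
std::vector<std::pair<int, double> >
sparse_row(const double *row, int dynlen, double tol = 1e-12) {
  std::vector<std::pair<int, double> > out;
  for (int j = 0; j < dynlen; j++)
    if (std::fabs(row[j]) > tol) out.push_back(std::make_pair(j, row[j]));
  return out;
}
```

Writing the (column, value) pairs per row keeps file size proportional to the number of interacting pairs rather than dynlen^2.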
Very good addition, and just in time for when I needed it 👍 The command does have a few issues, some of which are easy to fix. Should I open a PR, or would you prefer doing it all in one batch with the third_order part? @charlessievers @akohlmey
@martok i generally prefer doing things in an incremental fashion. Besides, if you post a pull request we can either merge it to the master branch directly or @charlessievers can merge it first into his development branch, depending on how much time he has available to test and integrate changes and what seems easier to do. My time is going to be somewhat limited for the next couple of weeks due to other projects.
Makes sense!
Hello everyone, thank you for your comments and support. Sorry I have not been responding; I gave a departmental talk on thermal devices this week.
A hessian option is a great idea, I will go ahead and implement that when I get some time!
Hard-coded unit conversions are only set in the ESKM style. I too would prefer to avoid them.
The dynamical matrix only uses these units in the ESKM style. The square root of the eigenvalues divided by 2pi gives you THz with said units. Also, ESKM -- an in-house lattice dynamics code -- uses dynamical matrices with said units.
Great idea! Some brainstorming will definitely be required.
I just learned how to program this year, so there are bound to be some issues. I appreciate your time and patience to improve the command. Best,
Purpose
Important Note by @akohlmey: The third_order command has been removed from this PR and will be added in a later PR, since it still needs work, while dynamical_matrix seems to work as expected, is valgrind clean, and is ready to be merged (and a nice complement to fix phonon).

I added two commands to lammps. One that calculates the dynamical matrix and one that calculates the third order equivalent of the dynamical matrix (excluding division by mass). These commands will be useful to anyone who would like to calculate these matrices using the numerous force fields that are implemented in lammps.
closes #1304
Author(s)
Charlie Sievers
UC Davis PhD Candidate
Donadio Lab
Backward Compatibility
Backwards compatible up to two years.
Implementation Notes
Implemented for make mpi.
The matrices are calculated using a finite difference method and have been MPI parallelized.
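The finite-difference idea can be illustrated on a toy 1D harmonic spring, where a force constant k = -dF/dx is recovered from forces at displaced positions; this mirrors, per degree of freedom, how each matrix element is built from +del and -del displacements (the toy force field and helper names are assumptions for illustration):

```cpp
#include <cassert>
#include <cmath>

// Toy force field: harmonic spring, F(x) = -k*x.
double spring_force(double x, double k) { return -k * x; }

// Central-difference estimate of the force constant k = -dF/dx,
// analogous to one element of the finite-difference dynamical matrix:
// displace by +del and -del, take the difference of forces over 2*del.
double force_constant(double x0, double k, double del) {
  return -(spring_force(x0 + del, k) - spring_force(x0 - del, k)) / (2.0 * del);
}
```

For a harmonic potential the central difference is exact up to floating-point rounding; for real force fields the choice of del trades truncation error against cancellation error.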
Post Submission Checklist
Further Information, Files, and Links
Within the directory lammps/examples/USER/phonon/ there are examples and manuals which describe how to use the implemented commands.
Information on dynamical matrices: http://faculty.virginia.edu/esfarjani/UVA/Teaching_files/phonons.pdf