
Optimized Dynamical Matrix command #1314

Merged

14 commits merged into lammps:master on Feb 28, 2019

Conversation

charlessievers
Collaborator

@charlessievers charlessievers commented Feb 2, 2019

Purpose

Important Note by @akohlmey: The third_order command has been removed from this PR and will be added in a later PR, since it still needs work, while dynamical_matrix seems to work as expected, is valgrind clean and ready to be merged (and a nice complement to fix phonon).

I added two commands to LAMMPS: one that calculates the dynamical matrix and one that calculates the third-order equivalent of the dynamical matrix (excluding the division by mass). These commands will be useful to anyone who wants to calculate these matrices using the numerous force fields implemented in LAMMPS.

closes #1304

Author(s)

Charlie Sievers
UC Davis PhD Candidate
Donadio Lab

Backward Compatibility

Backwards compatible up to two years.

Implementation Notes

Implemented for make mpi.

The matrices are calculated using a finite difference method, and the computation has been MPI-parallelized.
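For reference, a sketch of the central-difference expression such an approach typically evaluates (standard lattice-dynamics convention, not quoted verbatim from this PR): each atom i is displaced by ±δ along each Cartesian direction α, and the resulting change in the force on every atom j is mass-weighted,

D_{i\alpha,j\beta} = -\,\frac{F_{j\beta}(R_{i\alpha}+\delta) - F_{j\beta}(R_{i\alpha}-\delta)}{2\,\delta\,\sqrt{m_i m_j}}

The third-order command would evaluate the analogous higher-order force differences, just without the mass weighting.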

Post Submission Checklist

  • The feature or features in this pull request is complete
  • Suitable new documentation files and/or updates to the existing docs are included
  • One or more example input decks are included
  • The source code follows the LAMMPS formatting guidelines

Further Information, Files, and Links

Within the directory lammps/examples/USER/phonon/ there are examples and manuals which describe how to use the implemented commands.

Information on dynamical matrices: http://faculty.virginia.edu/esfarjani/UVA/Teaching_files/phonons.pdf

@charlessievers charlessievers changed the title from "Optimized (but not working) Dynamical Matrix command)" to "Optimized Dynamical Matrix command" on Feb 2, 2019
@charlessievers
Collaborator Author

charlessievers commented Feb 2, 2019

I guess I am too stubborn. Is there a flag to tell if an atom is a ghost or local? I am double counting atoms when I take into consideration the ghosts for local_idx.

@akohlmey
Member

akohlmey commented Feb 2, 2019 via email

@charlessievers
Collaborator Author

An atom is local if its index is smaller than atom->nlocal.

Great, that is what I did on my outdated code. I am glad to know that it is always true.

@charlessievers
Collaborator Author

Alright, I made it work. I will admit that it is better in every possible way. I still need to implement the group functionality, rewrite the third order code, and update the examples. Thank you for pushing me and improving my product.

Member

@akohlmey akohlmey left a comment

I've looked over this version of the code and made some comments and suggestions for changes.

double **f = atom->f;

//initialize dynmat to all zeros
for (int i=0; i < dynlen; i++)
Member

dynlen is of type bigint, so i and j here should be bigint as well.
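I.e., something along these lines (illustrative only):

for (bigint i = 0; i < dynlen; i++)
  for (bigint j = 0; j < dynlen; j++)
    dynmat[i][j] = 0.0;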


if (comm->me == 0 && screen) fprintf(screen,"Calculating Dynamical Matrix...\n");

for (int i=1; i<=natoms; i++){
Member

natoms is of type bigint, so i here should be bigint as well.

for (int j=0; j < dynlen; j++)
dynmat[i][j] = 0.;

energy_force(0);
Member

why call energy_force() here?

Collaborator Author

Great question. I believe that it was there for testing purposes initially. In any case, it is outdated.


// compute all forces
force_clear();
external_force_clear = 0;
Member

The steps below are essentially what is in energy_force(), are they not?
I would measure the duration of the force computation, as that can be used as an estimate for how long it will take to complete the command (wall time of one force computation * 6 * natoms, i.e. two displacements along each of the three directions for every atom, if I am not mistaken).
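A minimal sketch of such an estimate, assuming a central difference needs two force calls per degree of freedom (i.e. 6*natoms force evaluations in total); the variable names and message wording are illustrative:

double t0 = MPI_Wtime();
energy_force(0);                          // one full force evaluation
double t_force = MPI_Wtime() - t0;
// two displacements (+del and -del) along x, y, and z for every atom
double estimate = t_force * 6.0 * atom->natoms;
if (comm->me == 0 && screen)
  fprintf(screen,"Estimated time to compute the dynamical matrix: %g seconds\n",estimate);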

}


memory->create(final_dynmat,int(dynlen),int(dynlen),"dynamic_matrix_buffer:buf");
Member

This limits dynlen to 32-bit integers. Now, it is going to be quite slow to compute the dynamical matrix for a very large system anyway, but there should be a test at the beginning checking that dynlen isn't larger than MAXSMALLINT. Then dynlen could also be made an int, and the loop indices changed correspondingly.
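A sketch of such a check (the error message wording is illustrative), placed before any allocation that uses int-sized dimensions:

if (dynlen > MAXSMALLINT)
  error->all(FLERR,"Dynamical matrix is too large to be stored");
int n = (int) dynlen;          // safe after the check above
memory->create(final_dynmat,n,n,"dynamic_matrix_buffer:buf");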

int scaleflag;
int me;
bigint dynlen;
double **dynmat;
Member

dynmat does not have to be a class member, and it does not have to be a full matrix, but only a 3 x dynlen matrix, since the full matrix is constructed 3 rows at a time.

Collaborator Author

@charlessievers charlessievers Feb 3, 2019

Right. I will move it into local scope.

neighbor->delay = 1;
neighbor->ago = 0;
neighbor->ndanger = 0;

Member

Here is an idea for how you can add group support (a rough sketch follows the list):

  • count the number of atoms in the group and store it as a class member. That number replaces atom->natoms to determine the per-atom loop count in the computation (nelem)
  • allocate a vector idxmap[nelem] and a vector localmap[atom->nmax] on every MPI rank and zero them
  • on each MPI rank do a loop from 0 to atom->nlocal (exclusive) and, if the loop index i is a member of the group, add atom->tag[i] to localmap[]
  • on MPI rank zero, copy the non-zero elements of localmap[] to idxmap[]
  • now loop over all MPI ranks and have them send their localmap to rank 0, which appends the non-zero elements to idxmap[]
  • now sort idxmap[] (if desired) and broadcast it to all MPI ranks. Use the content of idxmap instead of the loop from 1 to atom->natoms (inclusive) for the loops in the dynamical matrix calculation.
  • when outputting the dynamical matrix, also provide this list, so that the matrix elements can be identified by atom id and direction.
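A rough sketch of that recipe (names like localmap and idxmap are illustrative; the usual LAMMPS command-class members atom, comm, world, and groupbit are assumed, and for brevity the list is gathered on all ranks with MPI_Allgatherv instead of gathering on rank 0 and broadcasting):

#include <vector>
#include <algorithm>

// 1. collect the tags of the group atoms owned by this rank
std::vector<tagint> localmap;
for (int i = 0; i < atom->nlocal; i++)
  if (atom->mask[i] & groupbit) localmap.push_back(atom->tag[i]);

// 2. gather the per-rank counts and tags so every rank sees the full list
int nmine = (int) localmap.size();
std::vector<int> counts(comm->nprocs), displs(comm->nprocs,0);
MPI_Allgather(&nmine,1,MPI_INT,counts.data(),1,MPI_INT,world);
for (int p = 1; p < comm->nprocs; p++) displs[p] = displs[p-1] + counts[p-1];
int nelem = displs[comm->nprocs-1] + counts[comm->nprocs-1];

std::vector<tagint> idxmap(nelem);
MPI_Allgatherv(localmap.data(),nmine,MPI_LMP_TAGINT,
               idxmap.data(),counts.data(),displs.data(),MPI_LMP_TAGINT,world);

// 3. sort so all ranks loop over the group atoms in the same (tag) order
std::sort(idxmap.begin(),idxmap.end());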

Collaborator Author

I believe I already implemented exactly this solution.

Member

Nope. My suggestion will reduce the amount of memory used; yours will always require storage of dimension natoms.

return negative gradient for nextra_global dof in fextra
------------------------------------------------------------------------- */

void DynamicalMatrix::energy_force(int resetflag)
Member

calling this function energy_force() is a misnomer, since you are setting eflag and vflag to zero, and thus do not tally energy or virial contributions.

Collaborator Author

I just wanted to make it look similar to other class methods. Would you have me change other classes?

Member

Since this is an "internal" function, its name only has to make sense within this class.

if (local_idx >= 0){
for (int alpha=0; alpha<3; alpha++){
displace_atom(local_idx, alpha, 1);
energy_force(0);
Member

energy_force() must be called by all MPI ranks, not only those owning the displaced atom.

Collaborator Author

I have fallen for this trap enough times. I have fixed this.

dynmat[(i-1)*3+alpha][(j-1)*3+beta] *= conversion;
}
}
}
Member

If dynmat is changed to be a (local) 3 x dynlen matrix, then it must be reduced here and then added to the global dynamical matrix. This would reduce the memory consumption for large systems significantly.

Collaborator Author

I see. Memory consumption > MPI overhead.

Member

Yes. You can reduce memory use even more by outputting the dynamical matrix here directly. In LAMMPS we try to distribute data as much as possible, and thus we try to avoid allocations of size atom->natoms (but that is not always possible). atom->natoms with a constant factor is similar, but atom->natoms * atom->natoms is a much bigger problem, as memory use now grows quadratically with the number of atoms and cannot be reduced by parallelization.
...and if quadratic size cannot be avoided, then at least the full matrix should exist only on one MPI rank, i.e. the one writing the file (use MPI_Reduce() instead of MPI_Allreduce()).
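A minimal sketch of that pattern, assuming dynmat has been shrunk to the 3 x dynlen block for the currently displaced atom and that dynlen fits into an int (write_rows() is a hypothetical output helper, not part of this PR):

double **final_rows = nullptr;
if (comm->me == 0) memory->create(final_rows,3,(int)dynlen,"dynmat:final_rows");
// combine the per-rank partial force differences on rank 0 only
MPI_Reduce(dynmat[0], comm->me == 0 ? final_rows[0] : nullptr,
           3*(int)dynlen, MPI_DOUBLE, MPI_SUM, 0, world);
if (comm->me == 0) {
  write_rows(final_rows);      // write these three rows to the file, then move on
  memory->destroy(final_rows);
}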

The whole matrix would really only be needed if you include some postprocessing, and even then, with additional effort, that could be distributed.

But then again, any system large enough for this to matter would probably take so long to compute that nobody would want to do it in the first place. Hence my suggestion to run the force computation once, record the time one force computation takes, and then write out an estimate of the total time it would take to compute the whole matrix.

Collaborator Author

I should definitely do MPI_Reduce instead of all reduce. I wrote that back when I was learning MPI...

But then again, any system large enough for this to matter would probably take so long to compute that nobody would want to do it in the first place. Hence my suggestion to run the force computation once, record the time one force computation takes, and then write out an estimate of the total time it would take to compute the whole matrix.

Should I make a check_comp_time method?

Member

That seems a bit overkill to me. I would just leave the one call to the force computation in the setup() method (that seems the best place for it) and measure its time.

You could then use the following code (adapted from src/run.cpp):

timer->init();
timer->barrier_start();
calculateMatrix();
timer->barrier_stop();

If you keep the timer calls in the force routine, as they were in verlet.cpp or similar, then you have a nice breakdown of how the time is spent on the different force computations, in the summary output from the Finish class.

The call in the setup function could in the future also be used to get per-MPI-rank timer information for load balancing, although that is something that can be done outside the command using a run 0 and a balance command. There are lots of additional 'gimmicks' that can be added later: a timeout (for when running with batch systems), a restart feature (for very large systems), etc.

@akohlmey
Member

akohlmey commented Feb 3, 2019

Alright, I made it work. I will admit that it is better in every possible way. I still need to implement the group functionality, rewrite the third order code, and update the examples. Thank you for pushing me and improving my product.

I understand how you feel. I have been in the same situation many times myself, and it took quite a while to manage the resulting frustration. Later I learned to appreciate that somebody saying "no" to my contributions doesn't mean "no" to me, but that there is value in the rejection because it leads to an overall better result. In an open-source effort like LAMMPS, it is particularly difficult to say "no" to a submission (unless it is complete and obvious garbage), since the well-being of the project depends on contributions, and there should be a reward for people giving their changes/additions back to the project (instead of keeping them to themselves).

//find number of local atoms in the group (final_gid)
for (int i=1; i<=natoms; i++){
local_idx = atom->map(i);
if (mask[local_idx] & groupbit && local_idx < nlocal)
Member

This does not account for local_idx < 0 for atoms that are not owned by this rank, and will thus cause illegal memory access. Try:

if ((local_idx >= 0) && (local_idx < nlocal) && mask[local_idx] & groupbit) 


//combine subgroup maps into total temporary groupmap
MPI_Allgatherv(sub_groupmap,gid,MPI_INT,temp_groupmap,recv,displs,MPI_INT,world);
std::sort(temp_groupmap,temp_groupmap+group->count(igroup));
Member

You have to add #include <algorithm> if you want to use std::sort(). As an alternative, you could also use our internal C-style sort from the mergesort.h header.
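For example, near the other includes at the top of the source file (presumably dynamical_matrix.cpp):

#include <algorithm>    // for std::sort()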

double conv_distance;
double conv_mass;
double del;
int igroup,groupbit;
Member

You should also store the result of group->count(igroup) here, since the count function incurs an MPI collective operation.
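E.g. something like this, where gcount is an illustrative name:

bigint gcount;                 // cached result of group->count(igroup)
// ... set once, e.g. during setup():
gcount = group->count(igroup);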

@akohlmey
Member

akohlmey commented Feb 7, 2019

@charlessievers please put a note here on GitHub when you would like me to do the next pass of reviewing your contribution. This would then also include compiling the code and running the provided examples with an instrumented binary and/or using valgrind's memcheck tool.

@charlessievers
Collaborator Author

@akohlmey that would be great. I am currently making three more examples. I should be done with them by tomorrow.

@akohlmey akohlmey dismissed their stale review February 28, 2019 19:47

requested changes are included

@akohlmey
Member

@charlessievers I looked over your latest changes and made some corrections, and it now appears that the dynamical_matrix command is ready to be merged, but the third_order command is not. I converted the docs for the former and integrated them properly into the manual, but didn't do it for third_order (those should be merged into the file added for dynamical_matrix). I fixed a major issue with third_order causing segmentation faults, but the output between serial and parallel runs differs a lot, and I currently cannot tell whether this is to be expected or not. For dynamical_matrix, one can see that there are only minor numerical differences, within the scale of what is expected from finite differences and from summing up forces in a different order.

Please let me know ASAP how you want to proceed. I am planning to make a patch release later today, and this could either be included now or would have to wait (possibly 3-4 more weeks) until the next patch, when it is completely functional.

@charlessievers
Collaborator Author

I would like to merge the dynamical matrix command. I will keep working on the third order command. How would you like me to proceed?

@akohlmey
Member

I would like to merge the dynamical matrix command. I will keep working on the third order command. How would you like me to proceed?

I'll just delete the third order code from this branch and ask other LAMMPS developers to have a look and approve the merge. I will also delete the examples using ASE (for now). They should be accompanied by instructions about how to get and set up ASE and how to run those inputs. I know a little bit about this, so I know how to deal with it, but other LAMMPS users will be confused. I am not opposed to it (in fact, we would very much welcome a small tutorial addition to the LAMMPS manual describing how to use LAMMPS through ASE), but we want to avoid including anything that is not properly documented.

After the merge (and patch release) you can then create a new branch, re-add the content that I removed, and submit a new pull request, and we will keep working on that until it is ready to be merged.

@charlessievers
Collaborator Author


Perfect, will do!

@akohlmey
Member

Perfect, will do!

Great. Please make sure you pull the branch now, so that you also have my changes and bugfixes for third_order and the documentation integration.

@charlessievers
Collaborator Author

Great. Please make sure you pull the branch now, so that you also have my changes and bugfixes for third_order and the documentation integration.

Just pulled.

@akohlmey
Member


One more thing: can you give me a "permanent" e-mail address that I can add to the README, in case people (or the LAMMPS developers) need to contact you about your contribution?
Institutional e-mails often expire after people graduate or find a new job, and then it can be difficult to get hold of people.

thanks,
axel.

@charlessievers
Collaborator Author


charliesievers@cox.net

Best,
Charlie

@akohlmey akohlmey merged commit 14e6c12 into lammps:master Feb 28, 2019
Contributor

@athomps athomps left a comment

It looks like this has already been merged by @akohlmey but I have a couple of comments anyway.

  1. Well done @charlessievers! This is a great addition, something that has been talked about for years.
  2. What about providing an option to report instead the Hessian d2U/dR_j^2 in units of energy/distance^2?
  3. It would be better if you could avoid hard-coding your own unit conversion factors and instead use built-in LAMMPS constants, e.g. force->mvv2e.
  4. Why is the dynamical matrix always in units of 10 J/mol/A^2/(g/mol)? It would be more LAMMPSific if the units changed with unit_style.

@akohlmey
Member

akohlmey commented Mar 1, 2019 via email

@athomps
Contributor

athomps commented Mar 1, 2019

I just looked at this again. I see that it uses O(N) memory by writing one row at a time. This is very good. Also, because it does not use a neighbor list, it should work for any potential, including things like EAM, kspace, maybe even QEQ, all of which are non-trivial. Well done @charlessievers and @akohlmey

@athomps
Contributor

athomps commented Mar 1, 2019

For systems that are bigger than the interaction cutoff, the number of non-zero entries in each row is roughly constant, i.e. the matrix sparsity scales as ~1/L^3. This could be handled optionally by only writing the non-zero entries in each row, preceded by a list of column indices for that row. Sparseness could also be exploited in the computation, by applying a distance check, or neighbor check, but it is not easy to ensure correctness for all potentials.
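A sketch of that optional sparse-row output (row, fp, and the exact file format are illustrative, not taken from this PR): for each row, write the number of non-zeros, then the column indices, then the values.

#include <cstdio>
#include <vector>

std::vector<bigint> cols;
std::vector<double> vals;
for (bigint j = 0; j < dynlen; j++)
  if (row[j] != 0.0) { cols.push_back(j); vals.push_back(row[j]); }

fprintf(fp,"%d",(int) cols.size());
for (std::size_t k = 0; k < cols.size(); k++) fprintf(fp," " BIGINT_FORMAT,cols[k]);
for (std::size_t k = 0; k < vals.size(); k++) fprintf(fp," %g",vals[k]);
fprintf(fp,"\n");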

@martok
Collaborator

martok commented Mar 6, 2019

Very good addition, and just in time for when I needed it 👍

The command does have a few issues, some of which are easy to fix. Should I open a PR, or would you prefer doing it all in one batch with the third_order part? @charlessievers @akohlmey

@akohlmey
Member

akohlmey commented Mar 6, 2019

@martok I generally prefer doing things in an incremental fashion. Besides, if you post a pull request, we can either merge it into the master branch directly or @charlessievers can merge it first into his development branch, depending on how much time he has available to test and integrate changes and on what seems easier to do. My time is going to be somewhat limited for the next couple of weeks due to other projects.

@martok
Collaborator

martok commented Mar 6, 2019

Makes sense!
Don't worry about merge time, I just want to put this somewhere public to avoid creating wildly different branches.

@martok martok mentioned this pull request Mar 6, 2019
@charlessievers
Collaborator Author

charlessievers commented Mar 6, 2019

Hello everyone,

Thank you for your comments and support. Sorry I have not been responding; I gave a departmental talk on thermal devices this week.

@athomps

What about providing an option to report instead the Hessian d2U/dR_j^2 in units of energy/distance^2?

A Hessian option is a great idea; I will go ahead and implement that when I get some time!
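For reference, the standard relation involved (textbook convention, not specific to this PR): the mass-weighted dynamical matrix and the Hessian of the potential energy differ only by a factor of 1/sqrt(m_i m_j),

D_{i\alpha,j\beta} = \frac{1}{\sqrt{m_i m_j}}\,\frac{\partial^2 U}{\partial R_{i\alpha}\,\partial R_{j\beta}}

so a Hessian option in units of energy/distance^2 would essentially skip the mass weighting.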

It would be better if you could avoid hard-coding your own unit conversion factors and instead use built-in LAMMPS constants, e.g. force->mvv2e.

Hard-coded unit conversions are only set in the ESKM style. I too would prefer to avoid them.

Why is the dynamical matrix always in units of 10 J/mol/A^2/(g/mol)? It would be more LAMMPSific if the units changed with unit_style.

The dynamical matrix only uses these units in the ESKM style. The square root of the eigenvalues divided by 2*pi gives you THz in said units. Also, ESKM -- an in-house lattice dynamics code -- uses dynamical matrices with these units.
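In other words, restating the sentence above as a formula, with λ an eigenvalue of the ESKM-style dynamical matrix in those units, the corresponding frequency is

\nu\,[\mathrm{THz}] = \frac{\sqrt{\lambda}}{2\pi}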

Sparseness could also be exploited in the computation, by applying a distance check, or neighbor check, but it is not easy to ensure correctness for all potentials.

Great idea! Some brainstorming will definitely be required.

@martok

The command does have a few issues, some of which are easy to fix.

I just learned how to program this year, so there are bound to be some issues. I appreciate your time and patience in improving the command.

Best,
Charlie
