Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lammps writer #28

Merged
merged 37 commits into from
Mar 5, 2019
Merged

Lammps writer #28

merged 37 commits into from
Mar 5, 2019

Conversation

rmatsum836
Copy link
Collaborator

@rmatsum836 rmatsum836 commented Feb 11, 2019

Getting a Lammps writer going, mostly adapted from mBuild. Definitely a work in progress. As the Site class begins to take shape, I can start to expand on this.

@ahy3nz
Copy link
Collaborator

ahy3nz commented Feb 12, 2019

I'm guessing there's probably room to mess with smoothly handling units as things get written out. Something like

>>> box = [1,2,3] * u.nanometer
>>> box
unyt_array([1., 2., 3.], 'nm')
>>> box.convert_to_units(u.angstrom)
>>> box
unyt_array([10., 20., 30.], 'angstrom')

Or maybe instead of specifying u.angstrom we can set + define lmp_distance_unit, lmp_energy_unit, lmp_time_unit, etc. which could leave room for flexibility when working with anything not units real
This is probably simple/redundant/overkill for boxes and distances, but then writing out specific parameters might be more interesting, as things are converted between nm/angstroms and kj/kcal (for units real at least) for force constants or specifying energy terms

@mattwthompson
Copy link
Member

I think for now we should stick to adding units as we go, and not making new variables. But that approach is probably how we would implement unit systems, except probably with dictionaries instead of engine_dimension_unit variables, i.e. lammps_unit_system = {'length' : u.angstrom, 'time' : u.ps, 'energy' : u.kcal}

@mattwthompson mattwthompson added enhancement New feature or request WIP work in progress - do not merge labels Feb 12, 2019
@rmatsum836
Copy link
Collaborator Author

In mBuild's Lammps writer, the use of Parmed's atom indexing was heavily used to keep track of atom indices for bonds, angles, and dihedrals.

Currently, Sites don't have any indexing which make it difficult to keep track of the bonds, angles and dihedrals to write out to a Lammps data file.

Either we come up with a temporary workaround or effort should rather focused on handling indexing better.

@mattwthompson
Copy link
Member

I'm not sure which option is best, but the use of an index-site map or enumerate and list indexing could work for looking up the indices for each site in a bond/angle/dihedral

@rmatsum836 rmatsum836 mentioned this pull request Feb 19, 2019
@ahy3nz
Copy link
Collaborator

ahy3nz commented Feb 19, 2019

I vote an index-site map - if we added site.index as a property, it's an extra bookkeeping operation, especially if we want to add topologies.

We'd have to flesh out the lammps-writing more, but doing lots of list indexing seems somewhat inefficient ("linear complexity in list length") - to that end, dictionaries are faster as just straightforward look-up tables

@mattwthompson mattwthompson mentioned this pull request Feb 20, 2019
topology/formats/lammpsdata.py Outdated Show resolved Hide resolved
topology/formats/lammpsdata.py Outdated Show resolved Hide resolved
@rmatsum836
Copy link
Collaborator Author

This is pretty close to being able to write out a simple system of LJ particles. Once #47 gets merged, Site or AtomType will have masses so I will be able to write masses out to the Lammps data file.

The code commented out was replaced with TODO statements to clean things up

Copy link
Collaborator

@justinGilmer justinGilmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beginning review with some changes

topology/formats/lammpsdata.py Outdated Show resolved Hide resolved
topology/formats/lammpsdata.py Outdated Show resolved Hide resolved
topology/formats/lammpsdata.py Outdated Show resolved Hide resolved
@rmatsum836
Copy link
Collaborator Author

I want to review this myself and then should be read for review.

@rmatsum836
Copy link
Collaborator Author

Last change I made was unyt_array.v to unyt_array.value. One thing I'm unsure of is setting variable vector = box.get_vectors().get_value. Since there is some math involving numpy to get the bounds, I think this is the cleanest way. But other than that, should be ready for review.

Copy link
Member

@mattwthompson mattwthompson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, just few small comments and nits. This will be a big step forward for this package.

topology/formats/lammpsdata.py Outdated Show resolved Hide resolved
data.write('{0:.6f} {1:.6f} zlo zhi\n'.format(
zlo_bound, zhi_bound))
data.write('{0:.6f} {1:.6f} {2:.6f} xy xz yz\n'.format(
xy, xz, yz))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... in this block of code (including the above few lines),

Suggested change
xy, xz, yz))
xy.in_units_of(u.angstrom), xz.in_units_of(u.angstrom), yz.in_units_of(u.angstrom)))

Maybe this is a little verbose but this is my preference (which I can be talked out of).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was contemplating this Friday. I originally stripped the units earlier because adding a unyt_array with a numpy.array is messy, as is being done to calculate some of the bounds. However I think this is okay as long as I set the numpy arrays as unyt arrays.

I'm still not entirely sure how to best handle converting the units of box angles and lengths.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think getting x/y/lo/hi should only be math on unyt_array objects. xlo/xhi/ylo/yhi should all come from vectors with units, and even though it may not be pretty I think your changes are good.

I'm still not entirely sure how to best handle converting the units of box angles and lengths.

I think your changes are good; keeping things as unyt arrays until they're being written out. The only thing that would be worth considering down the line is not converting them to u.angstrom at the beginning and writing them out as xlo_bound.in_units_of(u.angstrom).value later on, but I think what you have now is what we want to do. What are you concerned about?

topology/formats/lammpsdata.py Outdated Show resolved Hide resolved
if atom_style not in ['atomic', 'charge', 'molecular', 'full']:
raise ValueError('Atom style "{}" is invalid or is not currently supported'.format(atom_style))

xyz = list()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a a detail outside the scope of this PR (don't know how to easily add a non-review comments during a review) but we can probably initialize it to be an empty numpy array instead of appending to lists

topology/formats/lammpsdata.py Outdated Show resolved Hide resolved
topology/formats/lammpsdata.py Outdated Show resolved Hide resolved
top.update_connection_list()
write_lammpsdata(top, filename='triclinic.data')

#def test_num_atoms():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good idea (and it works for me locally) 👍

topology/tests/test_lammps.py Outdated Show resolved Hide resolved
@mattwthompson
Copy link
Member

re: cd2290a this should be done in the unit test so we can be sure the masses are written out. So just adding mass argument to L32-33 in test_lammps.py should store them internally, and then we can (manually) check that they're written out to the file

>>> from topology.core.site import Site
>>> Site(name='foo', mass=1)
/Users/mwt/software/topology/topology/core/site.py:118: UserWarning: Masses are assumed to be g/mol
  warnings.warn("Masses are assumed to be g/mol")
<topology.core.site.Site object at 0x106eae390>
>>> site = Site(name='foo', mass=1)
>>> site.mass
unyt_quantity(1., 'g/mol')

@rmatsum836
Copy link
Collaborator Author

Maybe, I'm misinterpreting. But I think I'm doing this currently in test_lammps.py in lines 19-20 and lines 32-33.

@mattwthompson
Copy link
Member

If what I'm seeing is accurate:

image

you're doing everything but setting the mass (the default Site() and AtomType() have masses of None and 0 by default, respectively, so that's what's getting written out). It just needs mass=1 * u.g/u.mol added, I think.

>>> from topology.core.atom_type import AtomType
>>> from topology.core.site import Site
>>> Site(name='foo').mass is None
True
>>> AtomType().mass
unyt_quantity(0., 'g/mol')

Sitenote: it's pretty silly that LAMMPS doesn't play well with zero mass

@rmatsum836
Copy link
Collaborator Author

Oh I see the confusion now, I messed up that commit for some reason.

@mattwthompson
Copy link
Member

Sweet, I can read these in now, and it looks good to me.

(mosdef) [Clausius] ~
$ lmp_mpi 
LAMMPS (11 Aug 2017)
units real
atom_style full
pair_style lj/cut 2
read_data data.lammps
Reading data file ...
  orthogonal box = (0 0 0) to (10 10 10)
  1 by 1 by 1 MPI processor grid
  reading atoms ...
  2 atoms
Finding 1-2 1-3 1-4 neighbors ...
  special bond factors lj:   0          0          0         
  special bond factors coul: 0          0          0         
  0 = max # of 1-2 neighbors
  0 = max # of 1-3 neighbors
  0 = max # of 1-4 neighbors
  1 = max # of special neighbors
^C
(mosdef) [Clausius] ~
$ lmp_mpi 
LAMMPS (11 Aug 2017)
units real
atom_style full
pair_style lj/cut 2
read_data data.triclinic
Reading data file ...
  triclinic box = (-5 0 0) to (10 14.4338 8.16497) with tilt (-5 0 5.7735)
  1 by 1 by 1 MPI processor grid
  reading atoms ...
  2 atoms
Finding 1-2 1-3 1-4 neighbors ...
  special bond factors lj:   0          0          0         
  special bond factors coul: 0          0          0         
  0 = max # of 1-2 neighbors
  0 = max # of 1-3 neighbors
  0 = max # of 1-4 neighbors
  1 = max # of special neighbors
^C

Copy link
Collaborator

@ahy3nz ahy3nz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Do we need to convert xyz to the correct atom style before doing anything? Or at least ensure xyz is in Angstrom?
  • I think we might need to enforce some wrapping inside the box (however the box gets defined)?

@mattwthompson
Copy link
Member

Do we need to convert xyz to the correct atom style before doing anything? Or at least ensure xyz is in Angstrom?

Good catch. It looks to me like they're stored as units so we shouldn't need to ensure they're a specific unit beforehand. But we'd want to make sure they're printed out in angstroms, i.e. changing this line https://github.com/mattwthompson/topology/blob/f513814ad125fed2bd88c5ca2e72f69676547b23/topology/formats/lammpsdata.py#L207 to ... .in_units_of(u.angstrom).value)

I think we might need to enforce some wrapping inside the box (however the box gets defined)?

I don't think LAMMPS needs everything to be wrapped - I could be wrong - but I'm ok with punting on this specific issue either way.

if forcefield:
sigmas = [site.atom_type.parameters['sigma'] for site in topology.site_list]
epsilons = [site.atom_type.parameters['epsilon'] for site in topology.site_list]
sigma_dict = dict([(unique_types.index(atom_type)+1,sigma) for atom_type,sigma in zip(types,sigmas)])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, units-system issues here. For atomstyle real, I think it's angstrom and kcal/mol?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahy3nz
Copy link
Collaborator

ahy3nz commented Mar 4, 2019

I don't think LAMMPS needs everything to be wrapped - I could be wrong - but I'm ok with punting on this specific issue either way.

Okay yeah that is true, from lammps
"Atom lines specify the (x,y,z) coordinates of atoms. These can be inside or outside the simulation box. When the data file is read, LAMMPS wraps coordinates outside the box back into the box for dimensions that are periodic. As discussed above, if an atom is outside the box in a non-periodic dimension, it will be lost."

@rmatsum836
Copy link
Collaborator Author

So is this good to merge? I understand this is very limited in the scope of what the writer will eventually be, but I think this is a good first go.

data.write(atom_line.format(
index=i+1,type_index=unique_types.index(types[i])+1,
zero=0,charge=0,
x=coords[0].value,y=coords[1].value,z=coords[2].value))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
x=coords[0].value,y=coords[1].value,z=coords[2].value))
x=coords[0].in_units_of(u.angstrom).value,
y=coords[1].in_units_of(u.angstrom).value,
z=coords[2].in_units_of(u.angstrom).value,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the only thing left before merging

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, just committed that change.

@mattwthompson
Copy link
Member

Thanks for the contribution!

@mattwthompson mattwthompson merged commit 3aac2a1 into mosdef-hub:master Mar 5, 2019
@mattwthompson mattwthompson mentioned this pull request May 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request WIP work in progress - do not merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants