Issue 363 - New Topology System #815

Merged
merged 457 commits into from Nov 7, 2016

Conversation

@dotsdl
Member

dotsdl commented Apr 4, 2016

Fixes #363

This is a WIP PR to serve as a focal point for discussion on the new topology system before it is merged into develop. The merge will happen soon after release of 0.15.0.

Changes made in this Pull Request:

  • WIP

TODO List

  • Get tests passing!
  • Implement guessers
  • Make flags work
  • Make moving residues work (in progress @richardjgowers)
  • For each Parser, check all fields are read and correctly named/assigned
  • For each Writer, check that no assumptions are made on the contents of Timesteps/AtomGroups
@richardjgowers

Member

richardjgowers commented on 456b00d Feb 6, 2016

Looks good

dotsdl and others added 22 commits Feb 6, 2016
Also, some other subtle bugfixes.
Somehow missed this one during the migration of old AtomGroup functions.
**NOTE:** `atoms.bonds` is broken for atom groups without explicit bond typing
in this commit.

In order to write a LAMMPS.DATAWriter, I first need explicit typing of bonds,
angles, dihedrals, and impropers beyond the tupling of atoms types done now.
To hack bond types into the topology, I branched issue-363 and tried the
following:

1. Add optional `types` keyword to `topologyattrs.Bonds`. This is stored
in the class instance as `self._bondtypes`. A topology parser can pass `types`
to the Bonds topologyattr.
2. When `topologyattrs.Bonds` is called upon to generate a TopologyGroup,
these values are passed on via a new keyword in `TopologyGroup.__init__()`.
They are stored in the TopologyGroup instance as `self._bondtypes`.
3. I modified `TopologyGroup.__getitem__` to yield a Bond with attribute
`_bondtype`. If `_bondtype is not None` then it overrides the default behavior
of `bond.type`.
4. I modified `TopologyGroup()` calls wherever I could find them to pass on
`_bondtype` in the instantiation.
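
As a rough illustration of steps 1-4, the way an explicit type overrides the default `bond.type` could look like the following. This is a minimal sketch, not the code in this branch: `BondSketch` and `TopologyGroupSketch` are hypothetical stand-ins, and only the attribute names `_bondtype` and `_bondtypes` come from the description above.

```python
class BondSketch:
    """Hypothetical stand-in for a Bond between two atoms."""

    def __init__(self, atom_types, bondtype=None):
        self._atom_types = tuple(atom_types)
        self._bondtype = bondtype

    @property
    def type(self):
        # An explicit bond type, when present, overrides the default
        # behaviour of returning the tuple of atom types.
        if self._bondtype is not None:
            return self._bondtype
        return self._atom_types


class TopologyGroupSketch:
    """Hypothetical stand-in for a TopologyGroup holding bonds."""

    def __init__(self, atom_type_pairs, bondtypes=None):
        self._pairs = list(atom_type_pairs)
        # No explicit typing: remember None for every bond.
        if bondtypes is None:
            bondtypes = [None] * len(self._pairs)
        self._bondtypes = list(bondtypes)

    def __len__(self):
        return len(self._pairs)

    def __getitem__(self, index):
        # Hand the stored type on to the bond that is created.
        return BondSketch(self._pairs[index], self._bondtypes[index])
```

With this layout, a group built without `bondtypes` keeps the old tuple-of-atom-types behaviour, while a parser that knows the explicit types can pass them through.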

I found that the DATAParser in branch #363 is slightly broken, because
the atom types are saved as integers which breaks `select_atoms` (I think it
expects strings, and typing the types as strings on parsing fixes this issue).
It seems there is currently no test for atom selection from LAMMPS DATA files,
so this might have gone undetected.

An error is raised for `ag.bonds` for an atom group that doesn't have bonds,
but this seems to also be the case for branch issue-363 right now, so I didn't
try to fix this.

Two places are especially ugly. One is where I tried to hack `_bondtypes`
into the existing code to remove duplicate bonds. This is also where
the code seems to be broken for non-explicit bond types. In this case,
I set `_bondtypes` to be a numpy array full of `None` with `dtype=object`.
It seems that `unique_rows` fails for this case, although it works fine
when `_bondtypes.dtype == "|S1"`.
The function `unique_rows` was causing problems because bond type could
be None, and the containing numpy array had dtype==object. I replaced
`unique_rows` with `np.unique` by first converting the bond indices and
bond types into a numpy record, and then breaking this record back into
the bond indices and bond types.
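
The record-based deduplication described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the function name `unique_bonds` is hypothetical, and a missing bond type (`None`) is encoded as an empty string so that the record has a fixed-width, sortable dtype.

```python
import numpy as np


def unique_bonds(indices, types):
    """Drop duplicate (i, j, type) entries using np.unique on a record array."""
    indices = np.asarray(indices, dtype=np.int64).reshape(-1, 2)
    # Assumption: None is encoded as "" so the field is a plain string.
    labels = ["" if t is None else str(t) for t in types]

    # Pack indices and types into one structured (record) array.
    rec = np.empty(len(indices),
                   dtype=[("i", np.int64), ("j", np.int64), ("type", "U16")])
    rec["i"] = indices[:, 0]
    rec["j"] = indices[:, 1]
    rec["type"] = labels

    uniq = np.unique(rec)

    # Break the record back into bond indices and bond types.
    out_indices = np.column_stack((uniq["i"], uniq["j"]))
    out_types = [t if t else None for t in uniq["type"].tolist()]
    return out_indices, out_types
```

Note that `np.unique` returns the records sorted, so the original bond order is not preserved in this sketch.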

I also simplified some of what I'd written in `topologyattrs.Bonds`.
If a parser told the universe to create an empty TopologyGroup, the
resulting group was an attribute of atoms but raised IndexError on
access. This now seems to work as expected: a TopologyGroup with 0
TopologyObjects is created.
Fix DATA type parsing and add explicit bond types
Writes selection at current trajectory frame to file, including sections
Atoms, Masses, Velocities, Bonds, Angles, Dihedrals, and Impropers (if
these are defined). Atoms section is written in the "full" sub-style if
charges are available or "molecular" sub-style if they are not.
Molecule id in atoms section is set to 0.

No other sections are written to the DATA file.
As of this writing, other sections are not parsed into the topology
by the `DATAReader`.

If the selection includes a partial fragment, then the outputted DATA
file will be invalid, because it will describe bonds between atoms
which do not exist in the Atoms section.
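
A caller could guard against the partial-fragment case before writing. The helper below is a minimal sketch with a hypothetical name (`bonds_outside_selection`); it works on plain atom indices and is not part of the writer.

```python
def bonds_outside_selection(selected_indices, bonds):
    """Return the bonds that reference at least one atom not in the selection."""
    selected = set(selected_indices)
    return [tuple(bond) for bond in bonds if not set(bond) <= selected]
```

An empty result means every bond refers only to atoms that will appear in the Atoms section.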

By default the writer assumes "conventional" or "real" LAMMPS units
where length is measured in Angstroms and velocity is measured in
Angstroms per femtosecond. If other units are desired, they must be
specified.

Raises ValueError if atom types are not convertible to integers or if
atoms of the same type don't all have the same mass.

An AttributeError will be raised if the atom group doesn't have masses.
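
The two ValueError conditions above can be sketched as a small check that builds the type-to-mass table needed for the Masses section. The function name `masses_by_type` is hypothetical, and the exact float comparison of masses is an assumption of this sketch.

```python
def masses_by_type(types, masses):
    """Map integer atom type -> mass, enforcing the two constraints."""
    table = {}
    for atype, mass in zip(types, masses):
        try:
            key = int(atype)
        except (TypeError, ValueError):
            raise ValueError(
                "atom type {!r} is not convertible to an integer".format(atype))
        # Every atom of one type must carry the same mass.
        if key in table and table[key] != mass:
            raise ValueError(
                "atom type {} has masses {} and {}".format(key, table[key], mass))
        table[key] = mass
    return table
```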

I added tests to make sure topology attributes of the written file
(after being read again) match the topology attributes read from
the original file. Checks types, bonds, positions, velocities, etc.

There was a problem with test data mini.data because it had atom type
3 which had undefined mass, so I changed the atom type to 1.

The DATAParser before was not converting velocity units, so I added
default units Angstroms/fs which are the "real" units in LAMMPS.
I added an associated test and fixed part of a test that was passing in
an invalid way.
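
For reference, the unit conversion involved is a constant factor. The sketch below assumes that the library's internal velocity unit is Angstroms per picosecond; the function and constant names are hypothetical.

```python
import numpy as np

# 1 ps = 1000 fs, so 1 Angstrom/fs = 1000 Angstrom/ps.
ANGSTROM_PER_FS_TO_ANGSTROM_PER_PS = 1000.0


def real_to_base_velocities(velocities):
    """Convert LAMMPS "real" velocities (A/fs) to A/ps (assumed base unit)."""
    return (np.asarray(velocities, dtype=np.float64)
            * ANGSTROM_PER_FS_TO_ANGSTROM_PER_PS)
```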
I also removed the part of the docstring describing the
not-yet-implemented feature because it would probably cause more
false hope than good.
This is in accordance with the consensus on #599
(#599 (comment)).
We want ResidueGroups and SegmentGroups to behave roughly in the same
way as before, that is having methods they had when they were subclasses
of AtomGroup. Not everything makes sense, but most things do.

This was consensus from discussion in #703
(#703 (comment))
I consider these very fragile since they depend on particular atom names
(however customary), but included them where they should go since they
depend on atom names. They will need to be refactored, though, since
they depend on getitem working for atom names, which is not something
we've included (so far) for any groups.
Probably useful for PDB files or any other topology file that gives an
element name.
Conflicts:
	package/MDAnalysis/coordinates/MOL2.py
	package/MDAnalysis/coordinates/PDB.py
	package/MDAnalysis/core/AtomGroup.py
	package/MDAnalysis/core/Selection.py
	package/MDAnalysis/core/Timeseries.py
	package/MDAnalysis/core/__init__.py
	package/MDAnalysis/core/topologyobjects.py
	package/MDAnalysis/lib/util.py
	package/MDAnalysis/topology/CRDParser.py
	package/MDAnalysis/topology/DLPolyParser.py
	package/MDAnalysis/topology/GMSParser.py
	package/MDAnalysis/topology/GROParser.py
	package/MDAnalysis/topology/HoomdXMLParser.py
	package/MDAnalysis/topology/PDBParser.py
	package/MDAnalysis/topology/PDBQTParser.py
	package/MDAnalysis/topology/PSFParser.py
	package/MDAnalysis/topology/PrimitivePDBParser.py
	package/MDAnalysis/topology/TOPParser.py
	package/MDAnalysis/topology/XYZParser.py
	package/MDAnalysis/topology/base.py
	package/MDAnalysis/topology/core.py
	package/MDAnalysis/topology/tpr/utils.py
	testsuite/MDAnalysisTests/test_atomselections.py
	testsuite/MDAnalysisTests/test_topology.py
	testsuite/MDAnalysisTests/test_util.py
	testsuite/MDAnalysisTests/topology/test_gro.py
	testsuite/MDAnalysisTests/topology/test_tprparser.py
This is to match decided behavior for accessing attributes at various
levels.
@dotsdl dotsdl added this to the Topology refactor milestone Apr 4, 2016
@orbeckst
Member

orbeckst commented Apr 5, 2016

OMG, this is really happening?

richardjgowers and others added 6 commits Oct 25, 2016
Readded tests for phi_selection and friends
* fix bug and docs in rotaxis

If a == b this used to return [nan, nan, nan] due to the division by 0.

* pep8 changes test_transformations.py

* add rotaxis test

* add align_principal_axis test

* update changelog
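
The `rotaxis` failure mode is easy to reproduce: the cross product of two parallel vectors is zero, so normalising it divides by zero. The sketch below shows one way to guard against that. It is an illustration, not the patched library function; the name `rotaxis_sketch` and the choice of fallback axis are assumptions.

```python
import numpy as np


def rotaxis_sketch(a, b, fallback=(1.0, 0.0, 0.0)):
    """Unit vector along a x b, with a fallback when a and b are parallel."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    c = np.cross(a, b)
    norm = np.linalg.norm(c)
    # a == b (or any parallel pair) gives a zero cross product; dividing
    # by its norm would produce [nan, nan, nan].
    if np.isclose(norm, 0.0):
        return np.array(fallback, dtype=np.float64)
    return c / norm
```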
@orbeckst
Member

orbeckst commented Oct 25, 2016

On 20 Oct, 2016, at 13:14, Richard Gowers notifications@github.com wrote:

Still to do is to remove all references to core.AtomGroup.AtomGroup and core.AtomGroup.Universe in docs (and any example code). These have become core.groups.AtomGroup and core.universe.Universe. For backwards compatibility we can add a stub module core.AtomGroup which makes the namespace still work but raises a warning on import.

Good idea, because it's been a common idiom to use AtomGroup.{AtomGroup,Universe}

Oliver Beckstein * orbeckst@gmx.net
skype: orbeckst * orbeckst@gmail.com
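
The backwards-compatibility stub proposed above could behave roughly like this. It is a self-contained simulation, not the actual module: the two classes are empty stand-ins for the relocated `core.groups.AtomGroup` and `core.universe.Universe`, and `load_legacy_stub` is a hypothetical helper that mimics what the stub file would do at import time.

```python
import types
import warnings


class AtomGroup:
    """Stand-in for the class now living in core.groups (hypothetical)."""


class Universe:
    """Stand-in for the class now living in core.universe (hypothetical)."""


def load_legacy_stub():
    """Simulate importing a stub module that warns and re-exports names."""
    stub = types.ModuleType("AtomGroup")
    # A real stub file would run this warning at module top level.
    warnings.warn(
        "core.AtomGroup is deprecated; use core.groups.AtomGroup and "
        "core.universe.Universe instead",
        DeprecationWarning,
        stacklevel=2,
    )
    # Re-export the old names so AtomGroup.{AtomGroup,Universe} still works.
    stub.AtomGroup = AtomGroup
    stub.Universe = Universe
    return stub
```

Old code that uses the `AtomGroup.AtomGroup` or `AtomGroup.Universe` idiom keeps working, and users see a warning pointing at the new locations.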

abiedermann pushed a commit to abiedermann/mdanalysis that referenced this pull request Oct 26, 2016

adjusted HOLE open file descriptor test (MDAnalysis#129) for Darwin

In order to provoke the failure described by issue MDAnalysis#129 we artificially lower the open file descriptors.
On Mac OS X on travis this always leads to failing tests, as discussed under
MDAnalysis#901 (comment) ; this hack increases the allowed
open fds when we are running on Darwin. According to @jbarnoud, this problem is solved by MDAnalysis#363 so once that
corresponding PR MDAnalysis#815 is merged we should be able to revert this commit.
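
Temporarily lowering the open file descriptor limit, as the test does, can be sketched with the standard `resource` module (Unix only). The context manager name `soft_fd_limit` is hypothetical and this is not the test's actual code; a platform-specific value would be chosen by the caller.

```python
import contextlib
import resource  # Unix only


@contextlib.contextmanager
def soft_fd_limit(limit):
    """Temporarily set the soft limit on open file descriptors."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Raises ValueError if `limit` exceeds the hard limit.
    resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
    try:
        yield
    finally:
        # Restore the original soft limit even if the body fails.
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
```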
richardjgowers and others added 4 commits Oct 27, 2016
* Add pylintrc

* activate pylint during CI
kain88-de and others added 4 commits Nov 1, 2016
* Add tests

* TST: Added tests for get_named_residue

* fix imports and pylint warning
* Finished checking PQRWriter

* MOL2Writer now cleanly refuses to write non-MOL2 sources

* Finished checking PDBQT
Conflicts:
	package/AUTHORS
	package/CHANGELOG
	testsuite/MDAnalysisTests/analysis/test_psa.py
@richardjgowers
Member

richardjgowers commented Nov 7, 2016

@kain88-de sorry I missed yesterday, you can merge this when ready

@kain88-de kain88-de merged commit 94f1d75 into develop Nov 7, 2016
2 of 3 checks passed
continuous-integration/travis-ci/push The Travis CI build is in progress
continuous-integration/travis-ci/pr The Travis CI build passed
coverage/coveralls Coverage increased (+0.7%) to 86.483%
@kain88-de
Member

kain88-de commented Nov 7, 2016

OK the new topology system is now in develop. Everyone has done a great job. We even managed to increase coverage with this massive PR.

@jbarnoud
Contributor

jbarnoud commented Nov 7, 2016

Wow! Congrats!

On 07-11-16 12:09, Max Linke wrote:

OK the new topology system is now in develop. Everyone has done a
great job. We even managed to increase coverage with this massive PR.



@orbeckst orbeckst changed the title WIP: Issue 363 - New Topology System Issue 363 - New Topology System Nov 7, 2016
@orbeckst
Member

orbeckst commented Nov 7, 2016

Epic merge!

Special kudos to @richardjgowers, @dotsdl and @kain88-de. It's been almost 1 year since @richardjgowers visited ASU and started the topology overhaul together with @dotsdl. They got the basics done in about 1 month

Richard and I spent a good deal of time during his visit to Arizona working away at issue #363, and we are almost finished with it.

but it took another ~45 weeks to make it the new foundation for MDAnalysis. The 10/90 rule at work...

[Image: 363_blackboard_draft_asu]

@jdetle
Contributor

jdetle commented Nov 7, 2016

Incredible!

@dotsdl
Member Author

dotsdl commented Nov 7, 2016

Wooo! This is a big deal! Thanks everyone for the hard work on all this. It really was a great community effort, and really reflects well on the mettle of our dev team. I mean, look at that participant list on the right.

Looking forward to what we can make happen with the new topology system. This enables tons of new possibilities. :D

abiedermann pushed a commit to abiedermann/mdanalysis that referenced this pull request Jan 5, 2017
WIP: Issue 363 - New Topology System