Mol2 residue and molecule info for sites. (#671)
* modify mbuild converter to flatten compound hierarchy

* Add change method to determine molecule_group

* Add new label ("molecule" and "group") to help with the conversion

* fix typo and partially update tests

* update from_mbuild tests

* Add molecule_number update docs
Also start translating residue info

* Make site.group optional string

* add missing var

* WIP: Remodel the labeling system, add parse_label
Combine residue name and number into a single residue label,
and molecule name and number into a single molecule label.
Add parse_label for the from_mbuild method; as of right now
it performs very poorly (the logic here needs rethinking).

* remove the cloning step, improve performance

* add unit test

* update residue handling in convert parmed

* change method to parse label when convert from mbuild

* include missing import

* modify iter_site_by_residue, add iter_site_by_molecule

* update to_mbuild to match with new construct

* fix edge case when looking up None molecule and residue

* fix mol2 reader for new residue and molecule setup

* fix unit tests which used old syntax

* fix remaining unit tests

* replace __getattribute__ with getattr

* Address Cal's comment
Adjust docstring for the from_mbuild method.
Change docstring for the site.group.
Change MoleculeType and ResidueType to be NamedTuple.

* add options to infer (or not infer) the hierarchy structure when going from gmso to mbuild

* add infer_hierarchy for to_mbuild method

* parse group info when converting from mbuild

* WIP - removing all subtopology class and its reference

* remove remaining subtops from gmso objects and tests

* fix various errors/bugs, only 6 fails left

* fix parameterization bugs

* revert one step

* add patch for edge case, where molecule_tag is None

* fix case when site has no molecule or site

* trim misc codes

* make top.connections be determined on the fly, remove self._connections

* Remove unnecessary function, relocate boundary bond assertion

* use n_direct_bonds in place of is_independent when parsing residue

* add use_molecule_info option for apply

* add isomorphic check

* Modify Atomtyping parameterization to use flat molecule IDs for applying forcefields

* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci

* revert some changes

* Add match_ff_by, option to match by molecule name or group

* fix minor typo

* Fix bug, add remove_untyped
Fix bugs related to the new ff_match_by options.
Add new option to remove untyped connections.

* add missing flag

* more typo fixes

* add more unit tests for new features

* fix typo and add a comment

* parse all available lj and electrostatics scaling, add unit tests

* change the error when molecule_id not in molecule_scaling_factors dict

* Add atom.clone and topology.create_subtop

* populate group and molecule info for cg atom

* fix typo

* turn error into warning when dict of ff is given on empty top

* remove return statement

* fix bug when apply ff with scaling factor of 0

* Mol2 format molecule information from RTI
This PR addresses concerns from @bc118 by automatically grabbing molecule
information from a mol2 file and attaching it to site.molecule for each
site in top.sites. It goes hand in hand with a future PR that will update
the `gmso.formats.convert_mbuild.to_mbuild` utility in GMSO to properly
build an mBuild.Compound with a hierarchy that results in
lossless mbuild to GMSO conversions.

* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci

* fix bug

* remove unused imports
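
The labeling scheme described in this message (residue name and number combined into one residue label, molecule name and number into one molecule label, with the mol2 molecule name attached to every site) can be sketched roughly as follows. This is a minimal illustration, not the GMSO implementation: the two NamedTuple classes are stand-ins for `MoleculeType` and `ResidueType` from `gmso.abc.abstract_site`, with field names assumed from the tests in this commit, and `labels_from_mol2_atom_line` is a hypothetical helper.

```python
from typing import NamedTuple


class MoleculeType(NamedTuple):
    """Stand-in (assumption) for gmso's MoleculeType: a (name, number) pair."""

    name: str
    number: int


class ResidueType(NamedTuple):
    """Stand-in (assumption) for gmso's ResidueType: a (name, number) pair."""

    name: str
    number: int


def labels_from_mol2_atom_line(line, molecule_name):
    """Hypothetical helper: build both labels from one @<TRIPOS>ATOM record.

    mol2 atom columns: atom_id name x y z type subst_id subst_name [charge]
    """
    content = line.split()
    # Residue label: substructure name (column 7) and id (column 6).
    residue = ResidueType(content[7], int(content[6]))
    # Molecule label: the name read from the MOLECULE section; the number
    # is fixed to 1, mirroring MoleculeType(molecule, 1) in this commit.
    molecule = MoleculeType(molecule_name, 1)
    return residue, molecule


residue, molecule = labels_from_mol2_atom_line(
    "1 C1 0.0000 0.0000 0.0000 C 1 RES 0.0000", "MET"
)
```

Because both labels are NamedTuples, they compare equal to plain tuples such as `("RES", 1)` while still exposing `.name` and `.number`.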

Co-authored-by: Co Quach <daico007@gmail.com>
Co-authored-by: Co Quach <43968221+daico007@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
4 people committed Jul 25, 2022
1 parent 2970412 commit c336588
Showing 2 changed files with 35 additions and 3 deletions.
13 changes: 10 additions & 3 deletions gmso/formats/mol2.py
@@ -6,6 +6,7 @@
import unyt as u

from gmso import Atom, Bond, Box, Topology
from gmso.abc.abstract_site import MoleculeType, ResidueType
from gmso.core.element import element_by_name, element_by_symbol
from gmso.formats.formats_registry import loads_as

@@ -69,6 +70,7 @@ def from_mol2(filename, site_type="atom"):
"@<TRIPOS>BOND": _parse_bond,
"@<TRIPOS>CRYSIN": _parse_box,
"@<TRIPOS>FF_PBC": _parse_box,
"@<TRIPOS>MOLECULE": _parse_molecule,
}
for section in sections:
if section not in supported_rti:
@@ -79,7 +81,6 @@ def from_mol2(filename, site_type="atom"):
else:
supported_rti[section](topology, sections[section])

topology.update_topology()
# TODO: read in parameters to correct attribute as well. This can be saved in various rti sections.
return topology

@@ -139,13 +140,14 @@ def parse_ele(*symbols):
f"No charge was detected for site {content[1]} with index {content[0]}"
)
charge = None

molecule = top.label if top.__dict__.get("label") else top.name
atom = Atom(
name=content[1],
position=position.to("nm"),
element=element,
charge=charge,
residue=(content[7], int(content[6])),
residue=ResidueType(content[7], int(content[6])),
molecule=MoleculeType(molecule, 1),
)
top.add_site(atom)

@@ -178,3 +180,8 @@ def _parse_box(top, section):
lengths=[float(x) for x in content[0:3]] * u.Å,
angles=[float(x) for x in content[3:6]] * u.degree,
)


def _parse_molecule(top, section):
    """Parse molecule information from the mol2 file."""
    top.label = str(section[0].strip())
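
The `supported_rti` dictionary above dispatches each `@<TRIPOS>` record type indicator (RTI) to a parser function, and this commit registers `_parse_molecule` for the `MOLECULE` section. A self-contained sketch of that dispatch pattern follows. It is not the GMSO code: `split_sections`, `read_mol2_text`, the handler functions, and the plain `state` dictionary are all hypothetical stand-ins for the real reader and `Topology`.

```python
def split_sections(text):
    """Group non-empty mol2 lines under their @<TRIPOS> record type indicator."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("@<TRIPOS>"):
            current = line.strip()
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line)
    return sections


def parse_molecule(state, lines):
    """Store the molecule name, which is the first line of the section."""
    state["label"] = lines[0].strip()


def parse_atom(state, lines):
    """Store the atom names (second column of each atom record)."""
    state["atoms"] = [line.split()[1] for line in lines]


# Hypothetical dispatch table, analogous in spirit to supported_rti.
SUPPORTED_RTI = {
    "@<TRIPOS>MOLECULE": parse_molecule,
    "@<TRIPOS>ATOM": parse_atom,
}


def read_mol2_text(text):
    """Dispatch every section to its handler; record unsupported sections."""
    state = {"label": None, "atoms": [], "skipped": []}
    for rti, lines in split_sections(text).items():
        handler = SUPPORTED_RTI.get(rti)
        if handler is None:
            state["skipped"].append(rti)
        else:
            handler(state, lines)
    return state


example = """@<TRIPOS>MOLECULE
MET
5 4 1 0 0
@<TRIPOS>ATOM
1 C 0.0 0.0 0.0 C 1 MET 0.0
@<TRIPOS>SUBSTRUCTURE
1 MET 1
"""
state = read_mol2_text(example)
```

Since a mol2 file lists `MOLECULE` before `ATOM`, and the sections are visited in file order here, the molecule label is already set by the time atom records are handled. That ordering is what lets an atom parser reuse the label.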
25 changes: 25 additions & 0 deletions gmso/tests/test_mol2.py
@@ -133,3 +133,28 @@ def test_neopentane_mol2_elements(self):
            r"consider manually adding the element to the topology$",
        ):
            top = Topology.load(get_fn("neopentane.mol2"))

    def test_mol2_residues(self):
        top = Topology.load(get_fn("parmed.mol2"))
        assert np.all(
            np.array([site.residue.name for site in top.sites]) == "RES"
        )
        assert np.all(
            np.array([site.residue.number for site in top.sites]) == 1
        )

    def test_mol2_molecules(self):
        top = Topology.load(get_fn("methane.mol2"))
        assert np.all(
            np.array([site.molecule.name for site in top.sites]) == "MET"
        )
        assert np.all(
            np.array([site.molecule.number for site in top.sites]) == 1
        )

    def test_mol2_group(self):
        # Is there a place to read from mol2 file?
        top = Topology.load(get_fn("ethane.mol2"))
        for site in top.sites:
            site.group = "ethane"
        assert np.all(np.array([site.group for site in top.sites]) == "ethane")
