__getitem__ Implementation for ForceField class #505

umesh-timalsina · 2021-02-10T15:43:23Z

Related to #238, #188, and #192. Currently, the ForceField class in GMSO is just a container with a set of dictionaries, one for each kind of potential term. You can look up an AtomType, BondType, AngleType, etc. by its class/type name. However, the following things are missing from the current implementation:

No support for BondTypes/AngleTypes/DihedralTypes with wildcards, either full or partial.
No utility to support wildcard patterns for an AtomType class name.

Resolution

After #501, we can support arbitrary tags for any Potential type in gmso. This means that if we want to add wildcard patterns that a particular atom type is supposed to match, these patterns (list, set) can be added and saved as a tag, such as atom_class_patterns, for an AtomType. A tokenizer for the wildcard tokens of an AtomType might look like the following, assuming that there is no branching in partial wildcards for a particular type/class (I am not convinced that is the case, and it warrants further discussion):

class WildCardTokenizer:
    """Chain of wildcard tokens for a type/class name, most specific first."""

    def __init__(self, token: str) -> None:
        self.token = token
        self.tokens_chain = []
        self._initialize()

    def _initialize(self) -> None:
        max_len = len(self.token)
        # j == 0 keeps the full token; each later step drops one more
        # trailing character and appends '*'.
        self.tokens_chain = [
            self.token if j == 0 else f'{self.token[:max_len - j]}*'
            for j in range(max_len + 1)
        ]

>>> print(WildCardTokenizer('CHH').tokens_chain)
['CHH', 'CH*', 'C*', '*']

And the AtomType can be extended to have a method like:

    def initialize_match_tokens(self, overwrite=True) -> None:
        """Add wildcard tokens for this atomtype's potential matches"""
        if self.atomclass:
            self.add_tag(
                'class_tokens',
                WildCardTokenizer(self.atomclass),
                overwrite=overwrite
            )
        if self.name:
            self.add_tag(
                'type_tokens',
                WildCardTokenizer(self.name),
                overwrite=overwrite
            )

Now comes the search problem in the ForceField. I think a straightforward way to do it would be to implement __getitem__ for the ForceField class so that it searches for wildcard matches in multiple passes. For example, if a ForceField has a dihedral type like:

'*~Ar~Ar~*': <DihedralType DihedralType1, id 139964614361616>,

A Dihedral in a Topology whose four Atoms have the atomclasses ['Ar', 'Ar', 'Ar', 'Ar'] should match the above dihedral type while parametrizing the Topology.
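A minimal sketch of such a multi-pass lookup follows. It is not the GMSO API: WildcardLookup and tokens_chain are hypothetical names, the '~'-joined key format is taken from the example above, and the helper only mirrors the tokens_chain produced by WildCardTokenizer. Because the precedence between several wildcard matches is still an open question, the sketch refuses to choose when more than one stored key matches. Reversed member order is not handled.

```python
from itertools import product


def tokens_chain(name):
    """Mirror WildCardTokenizer: 'CHH' -> ['CHH', 'CH*', 'C*', '*']."""
    return [name] + [name[:i] + '*' for i in range(len(name) - 1, -1, -1)]


class WildcardLookup(dict):
    """Hypothetical dict of potential types keyed by '~'-joined member classes."""

    def __getitem__(self, key):
        # Pass 1: exact match on the key as given.
        if key in self:
            return super().__getitem__(key)

        # Pass 2: build every wildcard form of the key from the members'
        # token chains and collect the stored keys that are among them.
        chains = [tokens_chain(member) for member in key.split('~')]
        candidates = {'~'.join(combo) for combo in product(*chains)}
        matches = [stored for stored in self.keys() if stored in candidates]

        if not matches:
            raise KeyError(key)
        if len(matches) > 1:
            # Assumption: precedence is undecided, so ambiguity is an error.
            raise LookupError(f'{key!r} matches several keys: {matches}')
        return super().__getitem__(matches[0])


dihedral_types = WildcardLookup({'*~Ar~Ar~*': 'DihedralType1'})
print(dihedral_types['Ar~Ar~Ar~Ar'])
```

Subclassing dict keeps the exact-match pass cheap and leaves room for further passes once the matching rules are settled.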

Unanswered Questions

What rules, if any, should be followed while matching partial/full wildcards?
Are there any restrictions in the presence of wildcards?
What is the precedence order when multiple matches are found?
Any alternative ideas for the resolution of the problem?


umesh-timalsina · 2021-02-11T18:44:52Z

In #506, I have implemented a tokenizer class. I am going for an exact match rule there, but I think there can be a regex-based solution as well. Let's look at an example:

>>> from gmso import ForceField
>>> from gmso.tests.utils import get_path
>>> ff = ForceField(get_path('ff-example0.xml'))
>>> ff.atom_types
{'Ar': <AtomType Ar, id 139711668333968>, 'Xe': <AtomType Xe, id 139711668558096>, 'Li': <AtomType Li, id 139711668346128>}
>>> ff.atom_types['Ar'].tags
{'element': 'Ar', 'class_tokens': <gmso.utils.wildcard.WildCardTokenizer object at 0x7f112dd93090>, 'type_tokens': <gmso.utils.wildcard.WildCardTokenizer object at 0x7f112861df90>}
>>> ff.atom_types['Ar'].get_tag('class_tokens').tokens_chain
['Ar', 'A*', '*']
>>> ff.atom_types['Xe'].get_tag('class_tokens').tokens_chain
['Xe', 'X*', '*']

Now, if these two AtomClasses, i.e. Xe and Ar, are associated with two atoms that have a Bond in a Topology, then while parameterizing the system using this ForceField, the following precedence order should be followed:

1. If there is a BondType in the ForceField whose member types are [ Xe, Ar ], the Bond should be assigned that BondType.
2. If there is a BondType in the ForceField whose member types are [ X*, Ar ], the Bond should be assigned that BondType.
3. If there is a BondType in the ForceField whose member types are [ *, Ar ], the Bond should be assigned that BondType.
4. If there is a BondType in the ForceField whose member types are [ Xe, * ], the Bond should be assigned that BondType. etc...
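One way to enumerate such a precedence order from the two token chains is sketched below. The ordering rule is an assumption, not something settled in this thread: every wildcard form of the first member is tried before the second member is relaxed by one step. The function names are hypothetical.

```python
from itertools import product


def tokens_chain(name):
    """Same chain as WildCardTokenizer: 'Xe' -> ['Xe', 'X*', '*']."""
    return [name] + [name[:i] + '*' for i in range(len(name) - 1, -1, -1)]


def bond_candidates(class_a, class_b):
    """List candidate member-type pairs, highest precedence first.

    Assumed rule: relax the first member fully before relaxing the
    second member by one more step.
    """
    chain_a, chain_b = tokens_chain(class_a), tokens_chain(class_b)
    # product() varies its last argument fastest, so chain_b goes first
    # and each pair is swapped back into (a, b) order.
    return [(a, b) for b, a in product(chain_b, chain_a)]


candidates = bond_candidates('Xe', 'Ar')
for rank, pair in enumerate(candidates, start=1):
    print(rank, pair)
```

Under this rule the first three candidates agree with the list above, while [ Xe, * ] is only reached after the partial forms of Ar such as [ Xe, A* ]. Whether that is the desired behavior is exactly the kind of case that still needs a decision.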

However, there are multiple cases for these rules and I think we need some domain expertise to generalize these rules @mosdef-hub/mosdef-contributors.
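For comparison, the regex-based alternative mentioned at the start of this comment could be sketched as follows. This is only an illustration under stated assumptions: keys are '~'-joined member names, a '*' inside a stored key stands for any run of characters within one member, and pattern_to_regex and matching_keys are hypothetical helpers rather than anything in gmso.

```python
import re


def pattern_to_regex(pattern):
    """Compile a '~'-joined wildcard key such as 'X*~Ar' into a regex."""
    parts = []
    for member in pattern.split('~'):
        # Escape the member, then let each '*' match any run of
        # characters that does not cross the '~' separator.
        parts.append(re.escape(member).replace(r'\*', '[^~]*'))
    return re.compile('~'.join(parts))


def matching_keys(stored_keys, query):
    """Return the stored keys whose pattern matches the whole query."""
    return [key for key in stored_keys
            if pattern_to_regex(key).fullmatch(query)]


bond_keys = ['Xe~Ar', 'X*~Ar', '*~Ar', 'Li~*']
print(matching_keys(bond_keys, 'Xe~Ar'))
```

This approach finds every stored key that matches but says nothing about which one should win, so a precedence rule is still needed on top of it.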

umesh-timalsina · 2021-06-08T16:04:50Z

Closed by #519

This was referenced Feb 10, 2021

Uniqueness of CoreType Names in ForceField #188

Closed

Sepearate Utilty Class for WildCards?? #238

Closed

umesh-timalsina added best-practice core feature enhancement New feature or request labels Feb 10, 2021

umesh-timalsina self-assigned this Feb 10, 2021

umesh-timalsina mentioned this issue Feb 11, 2021

Forcefield __getitem__ for tokens matching #506

Closed

umesh-timalsina mentioned this issue Feb 12, 2021

Add Missing Atoms in OPLSAA ForceField mosdef-hub/foyer#378

Open

umesh-timalsina closed this as completed Jun 8, 2021
