Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

atomtypes module: keeping track of new atom types #283

Closed
nickvandewiele opened this issue Sep 4, 2014 · 5 comments
Closed

atomtypes module: keeping track of new atom types #283

nickvandewiele opened this issue Sep 4, 2014 · 5 comments
Labels
stale stale issue/PR as determined by actions bot Topic: Aromatics Type: Feature Type: Risk of Error

Comments

@nickvandewiele
Copy link
Contributor

With the support of the new adjacency list format, new elements and a plethora of new atom types emerged as well.

the modulermgpy.molecule.atomtype does a number of things:

  • it defines the supported atom types through the constructor of AtomType(...), by providing 2 things:
    • the element symbol
    • relation to one another atom types: a list of all predecessors: generic, a list of all successors: specific
  • it defines the conversion of one atom type to a sister atom type when an Action (incrementBond, decrementBond, formBond, breakBond, incrementRadical, decrementRadical, incrementLonePair, decrementLonePair ) is called upon the atom of this atom type.

E.g.:

atomTypes['Cd'  ].setActions(
    incrementBond=['Cdd','Ct'],
    decrementBond=['Cs'],
    formBond=['Cd'],
    breakBond=['Cd'],
    incrementRadical=['Cd'],
    decrementRadical=['Cd'],
    incrementLonePair=[],
    decrementLonePair=[]
)
  • it provides a method getAtomType(atom, bonds) that perceives the correct atom type for the given parameter atom surrounded by the parameter bonds.
...
    elif atom.symbol == 'C':
        if   double == 0 and doubleO == 0 and triple == 0 and benzene == 0: atomType = 'Cs'
        elif double == 1 and doubleO == 0 and triple == 0 and benzene == 0: atomType = 'Cd'
        elif double + doubleO == 2        and triple == 0 and benzene == 0: atomType = 'Cdd'
        elif double == 0 and doubleO == 0 and triple == 1 and benzene == 0: atomType = 'Ct'
        elif double == 0 and doubleO == 1 and triple == 0 and benzene == 0: atomType = 'CO'
        elif double == 0 and doubleO == 0 and triple == 0 and benzene == 2: atomType = 'Cb'
        elif double == 0 and doubleO == 0 and triple == 0 and benzene == 3: atomType = 'Cbf'
   elif atom.symbol == 'N':
...

Although this module seems to work fine and streamlined for fast comparison, the code itself seems:

  • difficult to maintain: e.g. a variable doubleO counts the number of double bonds to oxygen, but what about CS (double bond to S), CN (triple bond to N), ... The above code will explode in terms of lines of code because of the combinatorial nature of the attributes to be checked.
  • contains a lot of duplication: e.g. ancestors, children are defined per atom type although keeping the atomtypes in a directed tree structure would have done the same job much more concisely and flexibly.
  • and may lead to errors and confusion: e.g. a quick glance showed that although the atom type CS is defined, it does not appear anywhere in the atom type perception code. So do we want the atom type CS or not? It's nowhere clear. And this is just one example.

One way to refactor all of this, is using a dictionary of atom types, in which each atom type has a number of carefully chosen attributes. E.g.:

  • A max_order attribute could be used to deduce the conversion of atomtypes under Actions like incrementBond. Cs would have max_order set to 1, incrementBond would search for sister atom types with max_order+1.
  • other attributes may include the things that were checked in the atom type perception code.

In addition, keeping a tree structure of atom types, like we do for other databases will avoid the need to keep track of attributes like generic or specific anymore. A tree-traversal algorithm will give you the same info.

Finally, a template atom type in which we define these attributes may also serve as a way to perceive atom types.

Although we currently don't seem to have trouble with atom typing, the unit tests we have to check atom typing are very limited, and maybe it silently results in errors without us noticing.

@bslakman
Copy link
Member

Commenting to agree, and to add that I am unclear about the "benzene" atom types that aren't carbon: N3b, Sib, and Sibf. Are these intended to be any N or Si in an aromatic ring or fused ring? Or only ones that are bonded to carbons?

@rwest
Copy link
Member

rwest commented Sep 15, 2014

I would be happy to see someone try to refactor this, but would suggest/request that they instrument its speed before and after their changes (or else convince me that this isn't important). My intuition says this gets called a lot, and is one of the places where speed matters. That intuition is not based on recently acquired evidence, however, so I am willing to be shown wrong! And yes: ability to extend, maintain, debug, and check are also important!

@github-actions
Copy link

This issue is being automatically marked as stale because it has not received any interaction in the last 90 days. Please leave a comment if this is still a relevant issue, otherwise it will automatically be closed in 30 days.

@github-actions github-actions bot added the stale stale issue/PR as determined by actions bot label Jun 22, 2023
@JacksonBurns
Copy link
Contributor

@rwest is this related to #2440?

@github-actions github-actions bot removed the stale stale issue/PR as determined by actions bot label Jun 23, 2023
@github-actions
Copy link

This issue is being automatically marked as stale because it has not received any interaction in the last 90 days. Please leave a comment if this is still a relevant issue, otherwise it will automatically be closed in 30 days.

@github-actions github-actions bot added the stale stale issue/PR as determined by actions bot label Sep 22, 2023
@JacksonBurns JacksonBurns closed this as not planned Won't fix, can't repro, duplicate, stale Sep 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale stale issue/PR as determined by actions bot Topic: Aromatics Type: Feature Type: Risk of Error
Projects
None yet
Development

No branches or pull requests

5 participants