atomtypes module: keeping track of new atom types #283
Comments
Commenting to agree, and to add that I am unclear about the "benzene" atom types that aren't carbon: N3b, Sib, and Sibf. Are these intended to be any N or Si in an aromatic ring or fused ring? Or only ones that are bonded to carbons?
I would be happy to see someone try to refactor this, but would suggest/request that they instrument its speed before and after their changes (or else convince me that this isn't important). My intuition says this gets called a lot, and is one of the places where speed matters. That intuition is not based on recently acquired evidence, however, so I am willing to be shown wrong! And yes: ability to extend, maintain, debug, and check are also important!
This issue is being automatically marked as stale because it has not received any interaction in the last 90 days. Please leave a comment if this is still a relevant issue, otherwise it will automatically be closed in 30 days.
With the support of the new adjacency list format, new elements and a plethora of new atom types emerged as well.
The module `rmgpy.molecule.atomtype` does a number of things:

- It defines the atom types through `AtomType(...)`, by providing 2 things: `generic`, and a list of all successors, `specific`.
- It records what happens when one of the actions (`incrementBond`, `decrementBond`, `formBond`, `breakBond`, `incrementRadical`, `decrementRadical`, `incrementLonePair`, `decrementLonePair`) is called upon the atom of this atom type.
- It provides `getAtomType(atom, bonds)`, which perceives the correct atom type for the given parameter atom surrounded by the parameter bonds.

Although this module seems to work fine and is streamlined for fast comparison, the code itself seems:

- `doubleO` counts the number of double bonds to oxygen, but what about CS (double bond to S), CN (triple bond to N), ...? The perception code will explode in terms of lines of code because of the combinatorial nature of the attributes to be checked.
- `CS` is defined, but it does not appear anywhere in the atom type perception code. So do we want the atom type `CS` or not? It is not clear anywhere. And this is just one example.

One way to refactor all of this is to use a dictionary of atom types, in which each atom type has a number of carefully chosen attributes. E.g.:

- A `max_order` attribute could be used to deduce the conversion of atom types under actions like `incrementBond`.
- `Cs` would have `max_order` set to 1, and `incrementBond` would search for sister atom types with `max_order` + 1.

In addition, keeping a tree structure of atom types, like we do for other databases, will avoid the need to keep track of attributes like `generic` or `specific`. A tree-traversal algorithm will give you the same info.

Finally, a template atom type in which we define these attributes may also serve as a way to perceive atom types.
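
A minimal sketch of the dictionary-plus-tree idea, under stated assumptions: the `AtomTypeEntry` class, the `parent` field, and the three helper functions are hypothetical names invented for illustration, and the registry holds only a toy subset of carbon types. It is not how `rmgpy.molecule.atomtype` is written today.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AtomTypeEntry:
    """Hypothetical record for one atom type in the registry."""
    label: str
    max_order: Optional[int]  # highest bond order allowed; None for generic types
    parent: Optional[str]     # label of the more general type in the tree


# Toy subset of the atom types; the tree is encoded through `parent`.
ATOM_TYPES: Dict[str, AtomTypeEntry] = {
    entry.label: entry
    for entry in [
        AtomTypeEntry("R", None, None),
        AtomTypeEntry("C", None, "R"),
        AtomTypeEntry("Cs", 1, "C"),
        AtomTypeEntry("Cd", 2, "C"),
        AtomTypeEntry("Ct", 3, "C"),
    ]
}


def increment_bond(label: str) -> List[str]:
    """Return sister types (same parent) whose max_order is one higher."""
    entry = ATOM_TYPES[label]
    if entry.max_order is None:
        return []
    return [
        other.label
        for other in ATOM_TYPES.values()
        if other.parent == entry.parent and other.max_order == entry.max_order + 1
    ]


def generic(label: str) -> List[str]:
    """Walk up the tree and collect every ancestor, nearest first."""
    ancestors = []
    parent = ATOM_TYPES[label].parent
    while parent is not None:
        ancestors.append(parent)
        parent = ATOM_TYPES[parent].parent
    return ancestors


def specific(label: str) -> List[str]:
    """Walk down the tree and collect every descendant, depth first."""
    descendants = []
    for other in ATOM_TYPES.values():
        if other.parent == label:
            descendants.append(other.label)
            descendants.extend(specific(other.label))
    return descendants


print(increment_bond("Cs"))  # ['Cd']
print(generic("Cs"))         # ['C', 'R']
print(specific("C"))         # ['Cs', 'Cd', 'Ct']
```

With this layout the `generic` and `specific` lists no longer need to be maintained by hand, since both fall out of the `parent` links. Whether repeated traversal is fast enough for the hot path would still have to be measured; the results could be cached once at import time if it is not.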
Although we currently don't seem to have trouble with atom typing, the unit tests we have to check atom typing are very limited, so atom typing may be silently producing errors without us noticing.
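
One way to widen the test coverage cheaply is a table of (environment, expected atom type) cases checked in a single loop. The sketch below uses the standard-library `unittest` module; `perceive_carbon_type` is a hypothetical stand-in that only looks at bond orders, so that the example runs on its own. In practice the table would drive the real perception function.

```python
import unittest


def perceive_carbon_type(bond_orders):
    """Hypothetical stand-in: type a carbon atom from its bond orders only."""
    n_double = bond_orders.count(2)
    n_triple = bond_orders.count(3)
    if n_triple:
        return "Ct"
    if n_double == 2:
        return "Cdd"
    if n_double == 1:
        return "Cd"
    return "Cs"


# Each case: (description, bond orders around the carbon, expected label).
CASES = [
    ("methane carbon", [1, 1, 1, 1], "Cs"),
    ("ethene carbon", [2, 1, 1], "Cd"),
    ("ethyne carbon", [3, 1], "Ct"),
    ("allene central carbon", [2, 2], "Cdd"),
]


class TestAtomTypePerception(unittest.TestCase):
    def test_expected_types(self):
        for description, bond_orders, expected in CASES:
            # subTest reports each failing case separately instead of
            # stopping at the first mismatch.
            with self.subTest(case=description):
                self.assertEqual(perceive_carbon_type(bond_orders), expected)
```

Saved as a module, this can be run with `python -m unittest`. Adding a new atom type then means adding one row to `CASES`, which also makes it obvious when a defined type such as `CS` has no test at all.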