Change the OBAromTyper from using SMARTS patterns to a switch statement (rebased) #1545
The original OBAromTyper applies atom types (corresponding to the number of pi electrons donated by particular atom environments) by matching SMARTS patterns against a molecule. While somewhat convenient, this is a very inefficient way of applying the types: it is all-against-all, involves redundant matches, and requires subsequent pruning (based on hybridisation) in the code. The use of 'D' (the count of explicit connections) in the SMARTS patterns also meant that different results were possible depending on whether hydrogens were implicit or explicit.
This has been replaced with a more efficient and (IMO) fairly readable switch statement on element, then on charge, and then (in general) on heavy atom degree. Pruning is now done before the switch statement, rather than after matching.
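For reviewers who want the shape of the approach without reading the diff, here is a minimal sketch of the idea. It is not the code in this PR: `AtomInfo`, `piElectronsDonated` and the `canBeAromatic` flag are hypothetical names, and the handful of cases shown use textbook Hückel electron counts purely for illustration. Because it keys on heavy atom degree plus hydrogen count, the result does not depend on whether hydrogens are implicit or explicit.

```cpp
// Hypothetical, simplified atom description; this is NOT Open Babel's OBAtom.
struct AtomInfo {
    int element;        // atomic number
    int charge;         // formal charge
    int heavyDegree;    // number of non-hydrogen neighbours
    int hydrogens;      // attached hydrogens, implicit or explicit
    bool canBeAromatic; // outcome of the pruning step (assumed computed earlier)
};

// Returns the number of pi electrons the atom could donate to an aromatic
// system, or -1 if this sketch assigns no type. Illustrative values only.
int piElectronsDonated(const AtomInfo& a) {
    if (!a.canBeAromatic) return -1;  // prune before the switch

    switch (a.element) {
    case 6:  // carbon
        switch (a.charge) {
        case 0:  return (a.heavyDegree == 2 || a.heavyDegree == 3) ? 1 : -1;
        case -1: return 2;  // e.g. cyclopentadienyl anion
        case 1:  return 0;  // e.g. tropylium cation
        }
        break;
    case 7:  // nitrogen
        switch (a.charge) {
        case 0:
            if (a.heavyDegree == 3) return 2;                        // N-substituted pyrrole type
            if (a.heavyDegree == 2) return a.hydrogens > 0 ? 2 : 1;  // [nH] vs pyridine type
            break;
        case 1:  return 1;                               // pyridinium type
        case -1: return a.heavyDegree == 2 ? 2 : -1;     // pyrrole anion type
        }
        break;
    case 8:  // oxygen
        switch (a.charge) {
        case 0: return a.heavyDegree == 2 ? 2 : -1;  // furan type
        case 1: return a.heavyDegree == 2 ? 1 : -1;  // pyrylium type
        }
        break;
    case 16:  // sulfur
        if (a.charge == 0 && a.heavyDegree == 2) return 2;  // thiophene type
        break;
    }
    return -1;  // no aromatic type assigned in this sketch
}
```

Each atom is visited once and falls through at most three nested switches, instead of being tested against every SMARTS pattern in turn.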
Reading SMILES is about 3.5% faster (based on ChEMBL "-ismi -onul"). The SMILES output for ChEMBL, eMolecules and PubChem Compound is unchanged except for a single eMolecules entry where the output was previously in error.