Change the OBAromTyper from using SMARTS patterns to a switch statement (rebased) #1545
The original OBAromTyper applies atom types (corresponding to the number of pi electrons donated by particular atom environments) by matching SMARTS patterns against a molecule. While somewhat convenient, this is a very inefficient way of applying the types: it is all-against-all, involves redundant matches, and relies on subsequent pruning (based on hybridisation) in the code. The use of 'D' in the SMARTS patterns also meant that different results were possible depending on whether implicit or explicit hydrogens were present.
This has been replaced with a more efficient and fairly readable (IMO) switch statement that branches on element, then on charge, and then on heavy atom degree (in general). Pruning is now done before the switch statement.
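To illustrate the shape of such a typer, here is a minimal, self-contained sketch. It is not the code from this PR: the `AtomInfo` struct, the `piElectronsDonated` function, the `kNotCandidate` sentinel and the `hasDoubleBond` flag are all hypothetical names invented for the example, and the handful of cases shown are textbook aromatic environments rather than Open Babel's actual table.

```cpp
// Hypothetical, simplified atom record; this is NOT Open Babel's OBAtom.
struct AtomInfo {
    int  atomicNum;      // element (6 = C, 7 = N, 8 = O, 16 = S)
    int  formalCharge;   // formal charge on the atom
    int  heavyDegree;    // number of non-hydrogen neighbours
    bool inRing;         // ring membership (assumed precomputed)
    bool hasDoubleBond;  // assumed helper flag: atom carries a double bond
};

// Sentinel meaning "this atom cannot take part in an aromatic system".
const int kNotCandidate = -1;

// Returns the number of pi electrons the atom could donate, or kNotCandidate.
// The cases below are illustrative only and deliberately incomplete.
int piElectronsDonated(const AtomInfo& a) {
    // Pruning happens once, before the switch, instead of after matching.
    if (!a.inRing) return kNotCandidate;

    switch (a.atomicNum) {
    case 6:  // carbon
        switch (a.formalCharge) {
        case 0:  // e.g. benzene carbon
            if (a.hasDoubleBond && (a.heavyDegree == 2 || a.heavyDegree == 3))
                return 1;
            break;
        case -1:  // e.g. cyclopentadienyl anion carbon
            if (!a.hasDoubleBond) return 2;
            break;
        case 1:   // e.g. tropylium cation carbon
            if (!a.hasDoubleBond) return 0;
            break;
        }
        break;

    case 7:  // nitrogen
        switch (a.formalCharge) {
        case 0:
            if (a.heavyDegree == 3 && !a.hasDoubleBond) return 2;  // N-substituted pyrrole
            if (a.heavyDegree == 2)
                return a.hasDoubleBond ? 1   // pyridine-type
                                       : 2;  // pyrrole N-H type
            break;
        case 1:  // e.g. N-alkylpyridinium
            if (a.heavyDegree == 3 && a.hasDoubleBond) return 1;
            break;
        }
        break;

    case 8:   // oxygen, e.g. furan
    case 16:  // sulfur, e.g. thiophene
        if (a.formalCharge == 0 && a.heavyDegree == 2 && !a.hasDoubleBond)
            return 2;
        break;
    }
    return kNotCandidate;
}
```

Each atom is visited once and falls through at most three nested branches, so no pattern is matched against atoms it cannot apply to. Because the sketch keys on heavy atom degree rather than total degree, a pyrrole-type nitrogen gets the same answer whether its hydrogen is implicit or explicit.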
Reading SMILES is about 3.5% faster (based on ChEMBL "-ismi -onul"). The SMILES output for ChEMBL, eMolecules and PubChem Compound is unchanged except for a single eMolecules entry where the output was previously in error.