Fix a problem with aromatic heteroatom tautomer enumeration #2952

greglandrum
Member

There was a problem with moving Hs onto/off of charged aromatic N atoms.
This updates the transform rules so that this no longer happens.

Fixes one of the problems raised in #2908
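
To illustrate the kind of guard described above, here is a minimal sketch. It is not the actual change: the real rules are pattern-based transforms, whereas this uses a plain struct. `ToyAtom`, `isChargedAromaticN`, `canDonateH`, and `canAcceptH` are hypothetical names introduced only for illustration.

```cpp
// Hypothetical, simplified atom record (assumption for illustration only;
// a real toolkit atom carries far more state).
struct ToyAtom {
  int atomicNum;     // 7 for nitrogen
  int formalCharge;  // 0 for a neutral atom
  bool isAromatic;   // true if the atom is part of an aromatic ring
  int numHs;         // number of attached hydrogens
};

// Charged aromatic nitrogens are excluded from hydrogen moves.
bool isChargedAromaticN(const ToyAtom& a) {
  return a.atomicNum == 7 && a.isAromatic && a.formalCharge != 0;
}

// True if a tautomer move may take a hydrogen off this atom.
bool canDonateH(const ToyAtom& a) {
  return a.numHs > 0 && !isChargedAromaticN(a);
}

// True if a tautomer move may put a hydrogen onto this atom.
bool canAcceptH(const ToyAtom& a) {
  return !isChargedAromaticN(a);
}
```

Under these assumptions a protonated aromatic N (charge +1, one H) is rejected as both donor and acceptor, while a neutral aromatic NH remains eligible.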

@greglandrum greglandrum added this to the 2019_09_4 milestone Feb 11, 2020
@greglandrum
Member Author

@mcs07: if you have time to take a look at this, I'd love your comments.
The transforms we are using came from MolVS originally, so you might find the changes useful there too.

@@ -12,6 +12,7 @@
 #include <GraphMol/MolStandardize/FragmentCatalog/FragmentCatalogUtils.h>
 #include <GraphMol/SmilesParse/SmilesParse.h>
 #include <GraphMol/SmilesParse/SmilesWrite.h>
+#include <GraphMol/SmilesParse/SmartsWrite.h>
Contributor

It looks like this isn't really used (MolToSmarts is commented out).

Member Author

good catch. fixed

@mcs07
Contributor

mcs07 commented Feb 12, 2020

Hi Greg, this seems reasonable to me, although it has been a long while since I thought deeply about this. At least it doesn't break anything in the small collection of tests in MolVS.

To be honest, I've always felt that this method of enumerating via transforms and scoring to find a canonical tautomer is a bit flawed, and it will always be a struggle to make it robust and efficient. For ages I've been meaning to try an implementation that is closer to the Sayle/Delany method, but I never got around to it...
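
The enumerate-then-score idea mentioned above can be sketched in miniature. This is a toy illustration, not the RDKit or MolVS code: `Structure`, `Transform`, `enumerateTautomers`, `pickCanonical`, and the `maxCount` cap are hypothetical names, and plain strings stand in for molecules.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Assumption: a "structure" is just a string label, not a real molecule.
using Structure = std::string;
// A transform maps one structure to zero or more neighbouring tautomers.
using Transform = std::function<std::vector<Structure>(const Structure&)>;

// Breadth-first closure of the seed under all transforms.
// maxCount is a hypothetical cap that keeps the search bounded.
std::set<Structure> enumerateTautomers(const Structure& seed,
                                       const std::vector<Transform>& transforms,
                                       std::size_t maxCount = 1000) {
  std::set<Structure> seen{seed};
  std::queue<Structure> todo;
  todo.push(seed);
  while (!todo.empty() && seen.size() < maxCount) {
    Structure current = todo.front();
    todo.pop();
    for (const auto& transform : transforms) {
      for (const auto& next : transform(current)) {
        if (seen.size() >= maxCount) break;
        // Only structures not seen before are queued for further expansion.
        if (seen.insert(next).second) todo.push(next);
      }
    }
  }
  return seen;
}

// Pick the highest-scoring structure. std::set iterates in sorted order and
// the best is replaced only on a strictly higher score, so ties resolve to
// the lexicographically smallest label.
Structure pickCanonical(const std::set<Structure>& tautomers,
                        const std::function<int(const Structure&)>& score) {
  assert(!tautomers.empty());
  Structure best = *tautomers.begin();
  int bestScore = score(best);
  for (const auto& candidate : tautomers) {
    int s = score(candidate);
    if (s > bestScore) {
      best = candidate;
      bestScore = s;
    }
  }
  return best;
}
```

With a single transform that swaps the labels "keto" and "enol" and a score that prefers "keto", both starting points give the same two-member set and the same pick. That independence from the input form is what a canonical tautomer requires, and it holds only if the enumeration is complete, which is where the robustness concern comes from.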

@greglandrum greglandrum merged commit 915471a into rdkit:master Feb 13, 2020
@greglandrum greglandrum deleted the fix/aromatic_heteroatom_tautomer_enumeration branch February 13, 2020 05:35
@greglandrum
Member Author

Thanks @mcs07!
I agree that the Sayle/Delany approach is also interesting (and something I may also look at as I continue to fall into this rabbit hole), but the transform-based approach is still certainly useful and has the advantage of being pretty easy to explain and expand.
