KekulizationException in tautomer canonicalization #5784

Closed
d-b-w opened this issue Nov 22, 2022 · 3 comments

d-b-w commented Nov 22, 2022

Describe the bug
There is a KekulizationException when canonicalizing tautomers for some types of molecules. This seems to be new since the sprint release. I've seen it in a couple of places, but the commonality seems to be with non-aromatic rings with two nitrogens and one external methyl group.

Here are a couple of examples:

[image: examples]

To Reproduce

from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

for smi in ['NC1=NC=NC(C)=C1',
            'CC1N=CN(C)C(=O)C=1',
            'CC1=CC=CC(=O)N1C']:
    mol = Chem.MolFromSmiles(smi)
    # Canonicalize() raises the kekulization error for each of these inputs.
    rdMolStandardize.TautomerEnumerator().Canonicalize(mol)

Expected behavior

some sort of answer :)

Configuration:

  • RDKit version: 2022.09 release. Also tested with a build at adfdeca (head of origin/master on November 20)
  • OS: Darwin, Linux and Windows
  • Using an internal Schrödinger build, also built locally
d-b-w added the bug label Nov 22, 2022

d-b-w commented Nov 22, 2022

This looks like it was caused by #5402: if I revert that commit, these structures give answers.

greglandrum commented

Yeah, there's a problem with that change. I'm looking into it, but it's non-trivial (as one would expect with anything connected to tautomerism).

d-b-w commented Nov 29, 2022

Would it make sense to catch kekulization errors and continue at this point? I think it would be best if all transformations were guaranteed to always produce valid structures, but that seems like a hard guarantee to make!
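
For anyone hitting this before a fix lands, here is a rough sketch of the same "catch and continue" idea applied on the caller's side. It is not the change made inside RDKit. The helper names `canonicalize_or_skip` and `rdkit_canonical_tautomer` are made up for illustration, and the default `skip_on=(ValueError,)` assumes the kekulization failure reaches Python as a `ValueError` subclass; if your build raises something else, pass that type instead.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple, Type


def canonicalize_or_skip(
    items: Iterable[str],
    canonicalize: Callable[[str], str],
    skip_on: Tuple[Type[BaseException], ...] = (ValueError,),
) -> Dict[str, Optional[str]]:
    """Map each input to its canonical form, or to None if canonicalization failed.

    Hypothetical helper: `skip_on` lists the exception types to swallow.
    The ValueError default is an assumption about how the kekulization
    failure surfaces in Python.
    """
    results: Dict[str, Optional[str]] = {}
    for item in items:
        try:
            results[item] = canonicalize(item)
        except skip_on:
            # Catch and continue: record the failure instead of aborting the batch.
            results[item] = None
    return results


def rdkit_canonical_tautomer(smiles: str) -> str:
    """Return the canonical tautomer of `smiles` as a SMILES string.

    RDKit is imported lazily so the helper above has no hard dependency on it.
    """
    from rdkit import Chem
    from rdkit.Chem.MolStandardize import rdMolStandardize

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    enumerator = rdMolStandardize.TautomerEnumerator()
    return Chem.MolToSmiles(enumerator.Canonicalize(mol))
```

Calling `canonicalize_or_skip(smiles_list, rdkit_canonical_tautomer)` would then give a dict in which the failing structures map to `None`, so one bad molecule does not abort a whole batch. Exceptions not listed in `skip_on` still propagate.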

greglandrum self-assigned this Dec 1, 2022
greglandrum added this to the 2022_09_3 milestone Dec 1, 2022
greglandrum added a commit to greglandrum/rdkit that referenced this issue Dec 1, 2022
catch kekulization errors during the tautomer enumeration
I have tested this on ~100K ChEMBL molecules and encountered
no further problems.
greglandrum added a commit that referenced this issue Dec 8, 2022
catch kekulization errors during the tautomer enumeration
I have tested this on ~100K ChEMBL molecules and encountered
no further problems.