Cores with query atoms may fail to R-group-decompose molecules #4505

Closed
ptosco opened this issue Sep 13, 2021 · 0 comments
ptosco commented Sep 13, 2021

Describe the bug
Attempting to carry out R-group decomposition with cores containing query atoms may result in unexpected failures to decompose molecules that do match the core.

To Reproduce

import rdkit
from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition

# 1st example
core = Chem.MolFromSmarts("[n]1([*:1])ccccc1")
core  # in a notebook, this displays a depiction of the core

params = rdRGroupDecomposition.RGroupDecompositionParameters()
params.onlyMatchAtRGroups = True
rgd = rdRGroupDecomposition.RGroupDecomposition(core, params)
pyridine = Chem.MolFromSmiles("n1ccccc1")
rgd.Add(pyridine)  # returns -1: pyridine is not decomposed
N_methyl_pyridinium = Chem.MolFromSmiles("[n+]1(C)ccccc1")
rgd.Add(N_methyl_pyridinium)  # returns 0: decomposed as expected

# 2nd example
core = Chem.MolFromSmarts("[n]1([*:1])ccccc1")
core  # in a notebook, this displays a depiction of the core

params = rdRGroupDecomposition.RGroupDecompositionParameters()
params.onlyMatchAtRGroups = True
rgd = rdRGroupDecomposition.RGroupDecomposition(core, params)
methylcyclohexene = Chem.MolFromSmiles("C=1(C)CCCCC1")
rgd.Add(methylcyclohexene)  # returns -1: methylcyclohexene is not decomposed
methylcyclohexane = Chem.MolFromSmiles("C1(C)CCCCC1")
rgd.Add(methylcyclohexane)  # returns 0: decomposed as expected

Expected behavior
In the 1st example, pyridine should be decomposed, with the R-group being null.
In the 2nd example, methylcyclohexene should be decomposed, with one of the R-groups being null and the other being [*:1]C.
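For a molecule that is accepted, the resulting R-groups can be inspected once the decomposition has been processed. The following is a minimal sketch that reuses the core and N-methylpyridinium from the 1st example; it assumes the default labelling, under which the `[*:1]` dummy atom is reported under the key `R1`.

```python
from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition

core = Chem.MolFromSmarts("[n]1([*:1])ccccc1")
params = rdRGroupDecomposition.RGroupDecompositionParameters()
params.onlyMatchAtRGroups = True
rgd = rdRGroupDecomposition.RGroupDecomposition(core, params)

# Add() returns the index of the molecule if it was accepted, -1 otherwise
added = rgd.Add(Chem.MolFromSmiles("[n+]1(C)ccccc1"))

# Process() has to be called before the results are retrieved
processed = rgd.Process()

# One dict per accepted molecule, keyed by "Core" and the R-group labels
rows = rgd.GetRGroupsAsRows(asSmiles=True)
for row in rows:
    print(row)
```

With the expected behavior described above, pyridine would also appear as a row here, with nothing but a hydrogen (or no substituent) at R1.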

Additional context
Both problems have a common cause.
The RGD algorithm initially adds hydrogens to molecules in order to give them a chance to match core R-groups in case there is no heavy atom available to match them.
However, no hydrogens are added at the sp2 nitrogen of pyridine or at the sp2 carbon of methylcyclohexene, as the valences of those atoms are already filled.
Addressing this bug requires a fundamental change in how the matching of molecules and cores is done, as adding explicit Hs to the molecules may not work when query atoms are present in the core.
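The cause can be seen outside of the RGD code with a plain substructure search. This is only a sketch of the effect, not the actual RGD matching code: it adds explicit Hs with `Chem.AddHs` and then checks the core from the 1st example as a query. The helper name `add_hs_and_inspect_n` is made up for this illustration.

```python
from rdkit import Chem

core = Chem.MolFromSmarts("[n]1([*:1])ccccc1")

def add_hs_and_inspect_n(smiles):
    """Hypothetical helper: add explicit Hs, return (mol, degree of the N atom)."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    n_atom = next(a for a in mol.GetAtoms() if a.GetSymbol() == "N")
    return mol, n_atom.GetDegree()

pyridine_h, pyridine_n_degree = add_hs_and_inspect_n("n1ccccc1")
pyridinium_h, pyridinium_n_degree = add_hs_and_inspect_n("[n+]1(C)ccccc1")

# The pyridine nitrogen gets no H, so it still has only its two ring neighbors
# and nothing is available to match the [*:1] query atom of the core.
pyridine_matches = pyridine_h.HasSubstructMatch(core)

# In N-methylpyridinium the methyl carbon is the third neighbor of the nitrogen,
# so the [*:1] query atom has something to match.
pyridinium_matches = pyridinium_h.HasSubstructMatch(core)

print(pyridine_n_degree, pyridine_matches)
print(pyridinium_n_degree, pyridinium_matches)
```

The same reasoning applies to the sp2 carbon of methylcyclohexene: adding Hs gives it no extra neighbor, so a core expecting an additional substituent at that position has nothing to match.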
