Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RDKitToolkitWrapper double bond stereochemistry checker too strict #785

Closed
j-wags opened this issue Dec 10, 2020 · 0 comments · Fixed by #786
Closed

RDKitToolkitWrapper double bond stereochemistry checker too strict #785

j-wags opened this issue Dec 10, 2020 · 0 comments · Fixed by #786

Comments

@j-wags
Copy link
Member

j-wags commented Dec 10, 2020

Describe the bug
C=C and C=N bonds with equivalent substituents on one atom are being flagged as having missing stereochemical information when loading from 3D SDF using RDKitToolkitWrapper.

To Reproduce

from openforcefield.topology import Molecule
# This molecule has no stereocenters
mol = Molecule.from_smiles('C=C')
mol.generate_conformers()
mol.to_file('xxx.sdf', file_format='sdf')
mol2 = Molecule.from_file('xxx.sdf')

UndefinedStereochemistryError Traceback (most recent call last)
in ()
3 mol.generate_conformers()
4 mol.to_file('xxx.sdf', file_format='sdf')
----> 5 mol2 = Molecule.from_file('xxx.sdf')

/Users/jeffreywagner/projects/OpenForceField/openforcefield/openforcefield/topology/molecule.py in from_file(cls, file_path, file_format, toolkit_registry, allow_undefined_stereo)
4070 file_format=file_format,
4071 allow_undefined_stereo=allow_undefined_stereo,
-> 4072 _cls=cls,
4073 )
4074 elif hasattr(file_path, "read"):

/Users/jeffreywagner/projects/OpenForceField/openforcefield/openforcefield/utils/toolkits.py in from_file(self, file_path, file_format, allow_undefined_stereo, _cls)
2830 Chem.SetAromaticity(rdmol, Chem.AromaticityModel.AROMATICITY_MDL)
2831 mol = self.from_rdkit(
-> 2832 rdmol, allow_undefined_stereo=allow_undefined_stereo, _cls=_cls
2833 )
2834 mols.append(mol)

/Users/jeffreywagner/projects/OpenForceField/openforcefield/openforcefield/utils/toolkits.py in from_rdkit(self, rdmol, allow_undefined_stereo, _cls)
3439 rdmol,
3440 raise_warning=allow_undefined_stereo,
-> 3441 err_msg_prefix="Unable to make OFFMol from RDMol: ",
3442 )
3443

/Users/jeffreywagner/projects/OpenForceField/openforcefield/openforcefield/utils/toolkits.py in _detect_undefined_stereo(cls, rdmol, err_msg_prefix, raise_warning)
4182 else:
4183 msg = "Unable to make OFFMol from RDMol: " + msg
-> 4184 raise UndefinedStereochemistryError(msg)
4185
4186 @staticmethod

UndefinedStereochemistryError: Unable to make OFFMol from RDMol: Unable to make OFFMol from RDMol: RDMol has unspecified stereochemistry. RDMol name: Bonds with undefined stereochemistry are:

  • Bond 0 (atoms 0-1 of element (C-C)

Proposed solution

The code below is my best attempt at detecting which bonds are stereogenic, starting from an SDF. We can use other code to ensure that all stereogenic bonds have defined stereochemistry.

(I'm posting a lengthy writeup since this may be of interest to others)

Using molecule C=CC/C=C(/Cl). Note that the double bond on the right IS stereogenic, and the one on the left is NOT stereogenic

image

# Here we have RDKit make a molecule with two double bonds, one of which is stereogenic, and write it to SDF

from rdkit import Chem
from rdkit.Chem import AllChem
rdmol = Chem.MolFromSmiles('C=CC/C=C(/Cl)')
#rdmol = Chem.MolFromSmiles('C=C')
#rdmol = Chem.MolFromSmiles('FC=CCl')

Chem.SanitizeMol(rdmol,
                 Chem.SANITIZE_ALL ^ Chem.SANITIZE_ADJUSTHS ^ Chem.SANITIZE_SETAROMATICITY,
                )
Chem.AssignStereochemistry(rdmol)

Chem.FindPotentialStereoBonds(rdmol)

rdmol = Chem.AddHs(rdmol)

AllChem.EmbedMultipleConfs(
    rdmol,
    numConfs=1,
    pruneRmsThresh=1,
    randomSeed=1,
)
for bond in rdmol.GetBonds():
    print('before file write', bond.GetStereo())
    
writer = Chem.SDWriter('xxx_rd.sdf')
writer.write(rdmol)
writer.close()

before file write STEREONONE <-- Here's the non-stereogenic bond
before file write STEREONONE
before file write STEREONONE
before file write STEREOE <-- Here's the actual stereogenic bond
before file write STEREONONE
before file write STEREONONE
before file write STEREONONE
before file write STEREONONE
before file write STEREONONE
before file write STEREONONE
before file write STEREONONE
before file write STEREONONE

# Next we load it from SDF, using the same code as OFFTK currently does
rdmol2 = None
for mol in Chem.ForwardSDMolSupplier(open('xxx_rd.sdf','rb'), sanitize=False):
    rdmol2 = mol
    
# Here is the similar sanitization code to what OFFTK runs
Chem.AssignAtomChiralTagsFromStructure(rdmol2)

Chem.SanitizeMol(
    rdmol2,
    (
        Chem.SANITIZE_ALL
        ^ Chem.SANITIZE_SETAROMATICITY
        ^ Chem.SANITIZE_ADJUSTHS
        ^ Chem.SANITIZE_CLEANUPCHIRALITY
        ^ Chem.SANITIZE_KEKULIZE
    ),
)
    
Chem.FindPotentialStereoBonds(rdmol2)
Chem.AssignCIPLabels(rdmol2)


for bond in rdmol2.GetBonds():
    print('After finding potential stereo bonds', bond.GetStereo())

After finding potential stereo bonds STEREOANY <-- Here's the non-stereogenic bond
After finding potential stereo bonds STEREONONE
After finding potential stereo bonds STEREONONE
After finding potential stereo bonds STEREOANY <-- Here's the actual stereogenic bond
After finding potential stereo bonds STEREONONE
After finding potential stereo bonds STEREONONE
After finding potential stereo bonds STEREONONE
After finding potential stereo bonds STEREONONE
After finding potential stereo bonds STEREONONE
After finding potential stereo bonds STEREONONE
After finding potential stereo bonds STEREONONE
After finding potential stereo bonds STEREONONE

# Trying again with cleanIt=True
Chem.FindPotentialStereoBonds(rdmol2, cleanIt=True)
Chem.AssignCIPLabels(rdmol2)

for bond in rdmol2.GetBonds():
    print('After finding potential stereo bonds with cleanIt=True', bond.GetStereo())

After finding potential stereo bonds with cleanIt=True STEREOANY <-- Here's the non-stereogenic bond
After finding potential stereo bonds with cleanIt=True STEREONONE
After finding potential stereo bonds with cleanIt=True STEREONONE
After finding potential stereo bonds with cleanIt=True STEREOANY <-- Here's the actual stereogenic bond
After finding potential stereo bonds with cleanIt=True STEREONONE
After finding potential stereo bonds with cleanIt=True STEREONONE
After finding potential stereo bonds with cleanIt=True STEREONONE
After finding potential stereo bonds with cleanIt=True STEREONONE
After finding potential stereo bonds with cleanIt=True STEREONONE
After finding potential stereo bonds with cleanIt=True STEREONONE
After finding potential stereo bonds with cleanIt=True STEREONONE
After finding potential stereo bonds with cleanIt=True STEREONONE

# Manually wiping existing stereo and trying again with cleanIt=True
for bond in rdmol2.GetBonds():
    bond.SetStereo(Chem.BondStereo.STEREONONE)

Chem.FindPotentialStereoBonds(rdmol2, cleanIt=True)
Chem.AssignCIPLabels(rdmol2)

    
for bond in rdmol2.GetBonds():
    print('After unset chirality and re-percieve', bond.GetStereo())

After unset chirality and re-percieve STEREONONE <-- Here's the non-stereogenic bond
After unset chirality and re-percieve STEREONONE
After unset chirality and re-percieve STEREONONE
After unset chirality and re-percieve STEREOANY <-- Here's the actual stereogenic bond
After unset chirality and re-percieve STEREONONE
After unset chirality and re-percieve STEREONONE
After unset chirality and re-percieve STEREONONE
After unset chirality and re-percieve STEREONONE
After unset chirality and re-percieve STEREONONE
After unset chirality and re-percieve STEREONONE
After unset chirality and re-percieve STEREONONE
After unset chirality and re-percieve STEREONONE

System details

Mac OSX 10.14

rdkit                     2020.09.1        py37h3dce739_0    conda-forge
openforcefield 0.8.1

(behavior reproduces with several earlier versions of both toolkits)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant