Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What to do when rdkit.Chem.AssignStereochemistry does not add chiral code in Molecule.to_rdkit()? #196

Closed
andrrizzi opened this issue Mar 11, 2019 · 10 comments

Comments

@andrrizzi
Copy link
Collaborator

andrrizzi commented Mar 11, 2019

Likely related to #146 and #156 .

With certain molecules, (e.g. DrugBank_6355, which is molecule 156 in the MiniDrugBank test set) this call of rdkit.Chem.Chem.AssignStereochemistry results in

>>> rdatom.GetChiralTag()
rdkit.Chem.rdchem.ChiralType.CHI_UNSPECIFIED

If the code continues, this line results in KeyError: '_CIPCode'.

Differently from from_rdkit(), we already know what stereochemistry that atom should have so it doesn't make much sense to raise an exception.

Is there a different way of setting the stereochemistry using S and R convention in RDKit? Alternatively, is there a way we can save the (read-only) chirality information in our Atom class using the CW/CCW notation alongside the S/R convention?

@j-wags
Copy link
Member

j-wags commented Mar 11, 2019

Some potential fixes would be:

  • If R or S can't be achieved by trying CW/CCW, force-add them with rdatom.SetProp('_CIPCode', 'R'). Note, however, that this won't affect the rdmol's actual topology, and would be overwritten if Chem.AssignStereochemistry is run again on the rdmol.
  • As an alternative to the above, set CW/CCW in these cases, where CW means 'R', and CCW means S. Then, implement checks for these cases in OFFMol.from_rdkit. Note that there would be undefined behavior if the user ran Chem.AssignStereochemistry on this.
  • Hope that it's not critically important that this stereochemistry be defined, and just lose the information while converting the OFFMol to an RDmol.

@proteneer
Copy link
Collaborator

S and R are perceived in the sense that they're "global". CW/CCW are local in that they only look at the adjacent 4 neighbors. What exactly are you trying to do?

@j-wags
Copy link
Member

j-wags commented Mar 11, 2019

So, this is a surprisingly deep issue. For for background: #146

Short version: The essential problem arises from the idea that an OFFMol has completely defined stereochemistry. After all, a person could make a stero-specific SMARTS, and we'd be expected to match it. However, while OpenEye records stereochemistry around groups like R=NH (primary imine) and N(R1)(R2)(R3) (pyramidal nitrogen), RDKit doesn't. So, when we take molecules with completely-defined stereo and try to turn them into rdmols, our conversion function gets confused and raises this exception.

@proteneer
Copy link
Collaborator

I see - am I correct in understanding that RDKit doesn't allow for over-specification of Stereochemistry, where as OEChem does?

@proteneer
Copy link
Collaborator

proteneer commented Mar 11, 2019

PS - CIP is poorly defined to start with. The cahn-ingold-prelog rules are pretty bullshit to start with anyways (so I'd avoid doing stuff that depends on R/S to start with).

@j-wags
Copy link
Member

j-wags commented Mar 11, 2019

"over-specification" is a tricky thing to define. The two major toolkit differences are

  1. stereo around primary imines and
  2. stereo around pyramidal nitrogens.

So, one question is: Should these be "defining" features of a molecular topology in the context of an MD simulation? Inversion rates for pyramidal nitrogens are on the order of 1/microsecond, and can vary a lot depending on substituents. So, maybe. And if in doubt, yes.

Another questions is: Is there physical justification for assigning different parameters due to atom (not bond) stereochemistry? Yes, if there are two or more chiral atoms in a molecule, it can have different properties.

@j-wags
Copy link
Member

j-wags commented Mar 11, 2019

Yeah, I also don't like CIP. But it would take a long time to implement local stereochemistry, and we're already behind schedule. To stay on track we're going to stick with R/S.

#139

@andrrizzi
Copy link
Collaborator Author

I'm implementing solution 1:

If R or S can't be achieved by trying CW/CCW, force-add them with rdatom.SetProp('_CIPCode', 'R'). Note, however, that this won't affect the rdmol's actual topology, and would be overwritten if Chem.AssignStereochemistry is run again on the rdmol.

@j-wags
Copy link
Member

j-wags commented Mar 15, 2019

@andrrizzi Sounds good.

You probably already caught this, but, when we have to do that override, make sure to do it after our last AssignStereochemistry call, otherwise I think all the _CIPCode fields may be overwritten.

andrrizzi added a commit that referenced this issue Mar 16, 2019
… RDKit cannot figure out the atom chirality.
@andrrizzi
Copy link
Collaborator Author

Implemented in #209 .

SMIRNOFF 0.2.0 Implementation automation moved this from In progress to Done Mar 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

No branches or pull requests

3 participants