Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix distance geometry stereo issues #2158

Merged
merged 10 commits into from Apr 10, 2020

Conversation

timvdm
Copy link
Member

@timvdm timvdm commented Apr 5, 2020

This branch contains the commits to fix the distance geometry stereo bugs.

@ghutchis: Could you verify the usage of OBUnitCell in CisTransFrom3D (src/stereo/perception.cpp)? I have based the code on OBMol::GetTorsion but I do not much experience with the class.

3. This was already the default behaviour for cis/trans stereochemistry.
* Fix handling of implicit stereo references
* Fix Calculate13Angle to work for triple bonds
* Use new Gen3DStereoHelper class
* Fix issue with implicit stereo refs (L2236+L2260)
* Use torsion angles (fixes issue with non-planar geometry)
(Some disabled to ensure the test runs fast enough)
@ghutchis
Copy link
Member

ghutchis commented Apr 5, 2020

I'll have to think carefully about the unit cell cases - and probably derive some test cases for them.

@ghutchis
Copy link
Member

ghutchis commented Apr 6, 2020

@timvdm - if we un-comment the smiles at the end of test/testdistgeom.py do they work as well?

@timvdm
Copy link
Member Author

timvdm commented Apr 6, 2020

@timvdm - if we un-comment the smiles at the end of test/testdistgeom.py do they work as well?

Yes, but it will take longer for the test to run (more than 1 minute on my machine). I'm not sure if this is an issue? The other tests run quickly so I didn't want to add a test which takes much longer.

@timvdm
Copy link
Member Author

timvdm commented Apr 6, 2020

@ghutchis Here are some more details regarding the testing I have been doing.

I'm running the same smi-sdf-can roundtrip test from test/testdistgeom.py on a large dataset (eMolecules SMILES database, only considering chiral molecules). So far, 17670 molecules have been processed without any segfaults or wrong stereochemistry.

There is only a minor cosmetic issue which should be fixed I think. For some structures, the number of iterations in OBDistanceGeometry is not enough to generate a valid structure. This results in a warning message:

==============================
*** Open Babel Warning  in AddConformer
  Distance Geometry failed.

The full roundtrip still produces the correct geometry since we now have an additional loop in Gen3D. However, seeing this message is confusing if the whole process actually worked.

Ideally, AddConformer() would return a boolean to indicate success. This would change the API/ABI. Some possible workarounds:

  1. Increase the number of iterations (currently equal to the number of atoms).
  2. Add function and member variable (in OBDistanceGeometryPrivate) to silence this message.
  3. Remove message and add function to check succes/failure.

For the 17670 molecules, there were only 7 such cases. These are listed below together with the number of times OBDistanceGeometry failed.

O\C=C1\CCCCC1=O    1 iteration failed
C[C@H]1CCC\C=C\[C@@H]2C[C@H](O)C[C@H]2[C@H](O)\C=C\C(=O)O1    5 iterations failed
Cl\C=C\Cl    1 iteration failed
CO[C@H]1[C@H](O)CC(=O)O[C@H](C)C\C=C\C=C\[C@H](OC2CCC(C(C)O2)N(C)C)[C@H](C)C[C@H](CC=O)[C@@H]1OC1OC(C)C(OC2CC(C)(O)C(O)C(C)O2)C(C1O)N(C)C    3 iterations failed
C\C1=C/CC(C)(C)\C=C\C\C(C)=C\CC1    1 iteration failed
CCO\C=C\CC    1 iteration failed
CO[C@H]1[C@@H](CC(=O)O[C@H](C)C\C=C\C=C\[C@H](O)[C@H](C)C[C@H](CC=O)[C@@H]1O[C@@H]1O[C@H](C)[C@@H](O[C@H]2C[C@@](C)(O)[C@@H](OC(=O)CC(C)C)[C@H](C)O2)[C@@H]([C@H]1O)N(C)C)OC(C)=O    5 iterations failed

In any case, it might be a good idea to have a minimum number of iterations to handle these small molecules.

@ghutchis
Copy link
Member

ghutchis commented Apr 8, 2020

I would probably suggest 3 for now. Clearly I should have added a bool return value for the method - and I'm happy to add to the ABI for the v3.1 release.

Otherwise, I think the patch looks great - and clearly handling ~17k molecules with limited issues is a profound step forward.

- Increase number of iterations
- Improve error handling
@timvdm
Copy link
Member Author

timvdm commented Apr 9, 2020

I continued testing to almost 31k molecules. There were several molecules for which the number of iterations was not enough. I have increased this and improved the error handling as discussed above. I'm now running the test again with these changes and will report the results later.

@ghutchis
Copy link
Member

ghutchis commented Apr 9, 2020

Let me know when you've completed - I think we're good to merge.

@ghutchis ghutchis merged commit 9d67893 into openbabel:master Apr 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants