Bug in depicting CXSMILES with Stereochemistry #66

Closed
Kohulan opened this issue May 26, 2023 · 8 comments

Comments

@Kohulan

Kohulan commented May 26, 2023

When I use a CXSMILES with coordinates to depict an image, the stereochemistry is ignored. However, if I generate an absolute SMILES back from the parsed structure, I can see that the stereochemistry is still preserved.

C[C@@H](C1=CC=CC=C1)NC(=O)[C@H](C2=CC=CC=C2)NC(=S)NC3=CC=CC=C3 |(-0.5,2.0,;-2.0,1.9,;-2.7,0.6,;-4.2,0.6,;-4.9,-0.8,;-4.1,-2.0,;-2.6,-2.0,;-1.9,-0.7,;-2.8,3.2,;-2.1,4.5,;-0.6,4.6,;-2.9,5.8,;-4.4,5.8,;-5.2,7.0,;-6.7,7.0,;-7.4,5.7,;-6.6,4.4,;-5.1,4.4,;-2.2,7.1,;-3.0,8.4,;-4.5,8.4,;-2.3,9.7,;-3.1,11.0,;-4.6,10.9,;-5.4,12.2,;-4.7,13.5,;-3.2,13.6,;-2.4,12.3,)|
image
@nbehrnd
Contributor

nbehrnd commented May 26, 2023

Suggestion: change the setting of the second pull-down menu; instead of No Annotation, use CIP Stereo Label. This annotates two centres as S-configured:

with_labels

(Because an export to .svg is one option, the string below the structure formula can be removed later.)

@Kohulan
Author

Kohulan commented May 26, 2023

Thank you, but there does not appear to be a problem with annotation, and CIP annotation does not seem to be relevant in this case. The CXSMILES depiction should be displayed like this:

image

@johnmay
Member

johnmay commented May 26, 2023

The issue is that if you give it coordinates, it expects the non-planar bonds (up/down wedges) to be annotated as well. See here: https://docs.chemaxon.com/display/docs/chemaxon-extended-smiles-and-smarts-cxsmiles-and-cxsmarts.md#src-1806633-safe-id-q2hlbuf4b25fehrlbmrlzfnnsuxfu2fuzfnnqvjuuy1dwfnnsuxfu2fuzenyu01bulrtlvnpbmdszsjvcg9yrg93biiov2lnz2x5ksxvugfuzerpv05ib25kcw

@johnmay
Member

johnmay commented May 26, 2023

I'm not sure if we currently pass/set them from the CXSMILES but it's certainly possible. The other option is we assign the up/down bonds if they are missing but I'm less keen on that.

@johnmay
Member

johnmay commented May 26, 2023

In short, you would ideally need wU:, wD:, etc. in the CXSMILES layers; this doesn't currently work, though.
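
To make the point about missing wedge layers concrete, here is a minimal sketch in plain Python (no cheminformatics toolkit) that splits a CXSMILES string into its SMILES part and its extension block, reads the coordinate layer, and reports whether any wedge field (`wU:`, `wD:` or `w:`) is present. The function names are hypothetical, the short example molecule and its coordinates are made up for illustration, and the `wU:1.0` value assumes the `atom.bond` index convention described in the ChemAxon documentation. It is not how CDK or cdk/depict parses CXSMILES.

```python
import re


def split_cxsmiles(cxsmiles):
    """Split 'SMILES |extensions|' into (smiles, extensions without the bars)."""
    smiles, _, rest = cxsmiles.strip().partition(" ")
    rest = rest.strip()
    if len(rest) >= 2 and rest.startswith("|") and rest.endswith("|"):
        return smiles, rest[1:-1]
    return smiles, ""


def parse_coordinates(extensions):
    """Return [(x, y, z), ...] from the '(x,y,z;x,y,z;...)' layer.

    An empty component (for example the empty z of a 2D layout) is read as 0.0.
    """
    match = re.search(r"\(([^)]*)\)", extensions)
    if not match:
        return []
    coords = []
    for entry in match.group(1).split(";"):
        parts = (entry.split(",") + ["", "", ""])[:3]
        coords.append(tuple(float(p) if p.strip() else 0.0 for p in parts))
    return coords


def has_wedge_layer(extensions):
    """True if a wU: (up), wD: (down) or w: (wavy) field is present."""
    return re.search(r"(?:^|,)(?:wU|wD|w):", extensions) is not None


# Illustrative input only: a four-atom molecule with made-up 2D coordinates.
without_wedges = "C[C@H](N)O |(0.0,0.0,;1.0,0.5,;1.0,1.5,;2.0,0.0,)|"
# Same molecule with a hypothetical wedge field appended (indices assumed).
with_wedges = "C[C@H](N)O |(0.0,0.0,;1.0,0.5,;1.0,1.5,;2.0,0.0,),wU:1.0|"

for text in (without_wedges, with_wedges):
    smiles, ext = split_cxsmiles(text)
    print(smiles, len(parse_coordinates(ext)), has_wedge_layer(ext))
```

Applied to the string from the opening post, this sketch finds 28 coordinate entries (one per heavy atom) and no wedge layer, which matches the situation described above: coordinates are given, but the up/down bonds are not annotated.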

@nbehrnd
Contributor

nbehrnd commented May 26, 2023

The wedges, I see. This does not resolve the problem, but here are a couple of observations (Python 3.11.2, RDKit 202209.3-1 as provided by Debian 12/bookworm):

  • the example about ferrocene c12c3c4c5c1[Fe]23451234c5c1c2c3c45 |C:4.5,0.6,1.7,2.8,3.9,7.12,6.10,9.16,10.18,8.14| works, but the bridged cycles [H][C@]12CCCC@(CC(C)C1)C2(S)Cl |r,TLB:13:11:2.4.3:7.10.8,THB:12:11:2.4.3:7.10.8,9:8:11:2.4.3| and [H][C@]12COCC@(CC@@HC1)[C@@]2(Cl)C1CCCC1 |r,TLB:15:14:2.4.3:7.13.8,THB:16:14:2.4.3:7.13.8,9:8:14:2.4.3| by ChemAxon do not pass. I assume that work to cover CXSMILES in cdk/depict is perhaps still ongoing.

  • Based on a blog post here, I ran a Jupyter notebook. With one isomer of mandelic acid, O=C(O)[C@H](O)c1ccccc1, all works well: the output by RDKit is accepted here. With a copy-paste of the SMILES of your molecule of interest, it is not.

  • Because SMILES can differ by program / algorithm engaged, I tested the canonical SMILES OpenBabel assigns to your structure (S=C(N[C@H](C(=O)N[C@H](c1ccccc1)C)c1ccccc1)Nc1ccccc1). The string passes the notebook, but the result doesn't work for cdk/depict either.

Below is a screenshot of cdkdepict (ferrocene; one of ChemAxon's examples for bridged structures; mandelic acid; and your structure expressed by OpenBabel's canonical SMILES; the latter two CXSMILES were generated with the Jupyter notebook attached in the .zip):

testing

test_cxsmiles.ipynb.zip

@Kohulan
Author

Kohulan commented May 26, 2023

@johnmay Thanks a lot.

"I'm not sure if we currently pass/set them from the CXSMILES but it's certainly possible."
I agree, this could be the best option. Since we use the CDK depiction quite a lot, it would be great if this were fixed internally.

@nbehrnd Thank you for the detailed information.
I was able to parse the SMILES string I provided with RDKit without any issues, and the molecule is displayed at the exact coordinates provided.

image

@johnmay
Member

johnmay commented May 26, 2023

Closing as won't fix; perhaps in the future, but not now. Please feel free to send a patch; I've pointed to what needs to change. All the best.

@johnmay johnmay closed this as completed May 26, 2023