Monomers downloaded from monomer.org are not in the correct format #30

Open
ClairePA opened this issue Nov 30, 2020 · 4 comments

@ClairePA

Here is an example of the agreed JSON format for monomers.

{ "monomerType": "Backbone", "symbol": "12ddR", "rgroups": [ { "alternateId": "R1-H", "id": 0, "label": "R1", "capGroupSMILES": "[*:1][H]", "capGroupName": "H" }, { "alternateId": "R2-H", "id": 0, "label": "R2", "capGroupSMILES": "[*:2][H]", "capGroupName": "H" } ], "molfile": "\n Marvin 09110915502D \n\n 10 10 0 0 0 0 999 V2000\n -1.4258 10.5012 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0\n -2.1396 10.0877 0.0000 C 0 0 1 0 0 0 0 0 0 0 0 0\n -0.7107 10.0897 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n -1.9250 9.2912 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0\n -0.9231 9.2926 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n -1.9238 8.4662 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0\n -2.2881 10.9422 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n -2.7596 11.2958 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0\n -3.5846 11.4503 0.0000 R# 0 0 0 0 0 0 0 0 0 0 0 0\n -1.2225 7.9123 0.0000 R# 0 0 0 0 0 0 0 0 0 0 0 0\n 1 2 1 0 0 0 0\n 1 3 1 0 0 0 0\n 2 4 1 0 0 0 0\n 2 7 1 1 0 0 0\n 3 5 1 0 0 0 0\n 4 5 1 0 0 0 0\n 4 6 1 6 0 0 0\n 6 10 1 0 0 0 0\n 7 8 1 0 0 0 0\n 8 9 1 0 0 0 0\nM RGP 2 9 1 10 2\nM END\n\n$$$$\n", "smiles": "[H:1]OC[C@H]1OCC[C@@H]1O[H:2]", "author": "Pistoia Alliance", "name": "1',2'-Di-Deoxy-Ribose", "naturalAnalog": "R", "polymerType": "RNA", "id": 131, "createDate": "Tue Sep 05 17:43:09 CEST 2017" }

The download from monomer.org does not include the agreed array of R groups. Instead it flattens the R groups into separate fields and omits some of the information, such as the cap group SMILES.

{ "monomerversionid": 518, "libraryid": 4, "librarykey": "Nucleotides", "libraryname": "Core Nucleotides", "molfile": "/n ChemDraw11272016482D/n/n 9 9 0 0 0 0 0 0 0 0999 V2000/n -0.9959 0.0314 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0/n -0.9959 -0.7936 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0/n -0.4125 -1.3770 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0/n 0.4125 -1.3770 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0/n 0.9959 -0.7936 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0/n 0.9959 0.0314 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0/n 0.4125 0.6148 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0/n -0.4125 0.6148 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0/n 0.7282 1.3770 0.0000 R1 0 0 0 0 0 0 0 0 0 0 0 0/n 1 2 1 0 /n 2 3 1 0 /n 3 4 1 0 /n 4 5 1 0 /n 5 6 1 0 /n 6 7 1 0 /n 7 8 1 0 /n 8 1 1 0 /n 7 9 1 0 /nM END/n", "smiles": null, "symbol": "ac4C", "naturalanalog": "C", "name": "4-Acetylcytosine", "polymertype": "RNA", "monomertype": "Branch", "status": "draft", "r1": "H", "r2": null, "r3": null, "r4": null, "r5": null, "author": "Bellamy, Claire", "userid": 4, "createddate": 1606251619480, "modifieddate": 1606251619480 }

Please correct the monomer JSON download.
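
The differences between the two records above can be checked mechanically. What follows is a minimal Python sketch, not part of monomer.org or any HELM tooling. The function name `find_format_problems` is hypothetical, and the lists of expected keys are assumptions taken from the example record in this issue rather than from the schema itself.

```python
# Keys seen in the agreed example above; assumed (not confirmed) to be required.
EXPECTED_KEYS = ["monomerType", "symbol", "rgroups", "molfile", "smiles",
                 "name", "naturalAnalog", "polymerType"]
RGROUP_KEYS = ["alternateId", "id", "label", "capGroupSMILES", "capGroupName"]
FLAT_KEYS = ["r1", "r2", "r3", "r4", "r5"]


def find_format_problems(record):
    """Return human-readable problems found in one monomer record (a dict)."""
    problems = []
    for key in EXPECTED_KEYS:
        if key not in record:
            problems.append(f"missing key: {key}")
    # Flat r1..r5 fields are the old format reported in this issue.
    flat = [k for k in FLAT_KEYS if k in record]
    if flat:
        problems.append("flattened R group fields: " + ", ".join(flat))
    for i, rgroup in enumerate(record.get("rgroups") or []):
        for key in RGROUP_KEYS:
            if key not in rgroup:
                problems.append(f"rgroups[{i}] missing key: {key}")
    # The ac4C download above has "/n" where the agreed example has newlines.
    if "/n" in (record.get("molfile") or ""):
        problems.append('molfile contains literal "/n"')
    return problems


# Abbreviated copy of the downloaded ac4C record (molfile shortened).
downloaded = {"symbol": "ac4C", "monomertype": "Branch", "polymertype": "RNA",
              "naturalanalog": "C", "name": "4-Acetylcytosine", "smiles": None,
              "molfile": "/n  ChemDraw/n", "r1": "H", "r2": None}
for problem in find_format_problems(downloaded):
    print(problem)
```

Besides the missing `rgroups` array, the sketch also reports the lowercase key names (`monomertype`, `naturalanalog`, `polymertype`) as missing camelCase keys, since the agreed example uses `monomerType`, `naturalAnalog` and `polymerType`.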

@ClairePA ClairePA assigned elrashid24 and unassigned scilligence Jan 20, 2021
@ClairePA ClairePA added the bug Something isn't working label Jan 20, 2021
@elrashid24 elrashid24 added this to the Release milestone Feb 8, 2021
@elrashid24 elrashid24 modified the milestones: Release , Release Mar 29, 2021
@ClairePA

Now nothing is downloaded at all!

@ClairePA

Apologies, it is downloaded, but still in the old format.
[{"monomerversionid":7270,"libraryid":23,"libraryname":"Test3","molfile":"Unnamed\nMolEngine04272115472D\n\n 12 12 0 0 1 0 0 0 0 0999 V2000\n 2.3610 2.4680 0.0000 C 0 0 1\n 3.8450 1.9860 0.0000 C 0 0 2\n 4.7620 3.2480 0.0000 C 0 0 1\n 3.8450 4.5100 0.0000 O \n 2.3610 4.0280 0.0000 C 0 0 2\n 1.0990 4.9450 0.0000 C \n 6.3220 3.2480 0.0000 R \n 1.0990 1.5510 0.0000 O \n 4.3270 0.5020 0.0000 O \n 1.2620 6.4960 0.0000 O \n 0.0000 7.4130 0.0000 R \n 1.2620 0.0000 0.0000 C \n 1 2 1\n 2 3 1\n 3 4 1\n 4 5 1\n 5 1 1\n 5 6 1 1\n 3 7 1 1\n 1 8 1 6\n 2 9 1 6\n 6 10 1\n 10 11 1\n 8 12 1\nA 7\nR3\nA 11\nR1\nM END\n","smiles":"[H]OC[C@@H]1[C@@H](OC)[C@@H](O)[C@H]([2OH])O1","symbol":"35mo3r","naturalanalog":"r","name":"3-O-Methylribose (3,5 connectivity)","polymertype":"RNA","monomertype":"Backbone","status":"Active","r1":"H","r2":null,"r3":"OH","r4":null,"r5":null,"author":"Bellamy, Claire","userid":2,"createddate":1619481600000,"modifieddate":1619481600000}]

@ClairePA

You might want to look at the schema definition on GitHub: https://github.com/PistoiaHELM/HELMMonomerSets/blob/master/HELMmonomerSchema.json
The issue is the R group information, which should be nested like this:
"rgroups": [
{ "alternateId": "R1-H", "id": 0, "label": "R1", "capGroupSMILES": "[*:1][H]", "capGroupName": "H" },
{ "alternateId": "R2-H", "id": 0, "label": "R2", "capGroupSMILES": "[*:2][H]", "capGroupName": "H" }
]

and not listed as flat fields, as in the current download:
"r1":"H","r2":null,"r3":"OH","r4":null,"r5":null,
I can talk you through it if you let me know your availability.
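
The mapping from the flat fields to the nested array could look roughly like the Python sketch below. This is an illustration under stated assumptions, not the monomer.org implementation: the function name `nest_rgroups` and the `CAP_SMILES` table are hypothetical, the `"H"` SMILES pattern follows the example in this thread, the `"OH"` pattern is an assumption, and `"id": 0` simply mirrors the example.

```python
import json

# Cap group SMILES templates keyed by cap group name. The "H" pattern follows
# the example in this thread; the "OH" pattern is an assumption.
CAP_SMILES = {"H": "[*:{n}][H]", "OH": "O[*:{n}]"}


def nest_rgroups(record):
    """Build the nested "rgroups" array from flat "r1".."r5" fields."""
    rgroups = []
    for n in range(1, 6):
        cap = record.get(f"r{n}")
        if cap is None:
            continue  # this attachment point is not used by the monomer
        template = CAP_SMILES.get(cap)
        rgroups.append({
            "alternateId": f"R{n}-{cap}",
            "id": 0,  # mirrors the example above; real ids may differ
            "label": f"R{n}",
            # Unknown cap groups get None rather than a guessed SMILES.
            "capGroupSMILES": template.format(n=n) if template else None,
            "capGroupName": cap,
        })
    return rgroups


# The flat fields from the 35mo3r download quoted earlier in this thread.
flat = {"r1": "H", "r2": None, "r3": "OH", "r4": None, "r5": None}
print(json.dumps(nest_rgroups(flat), indent=2))
```

Null fields are skipped, so the 35mo3r record yields two entries (R1 and R3) rather than five.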

@ClairePA ClairePA added Medium and removed bug Something isn't working labels Jun 29, 2021
@ClairePA

I would highly recommend addressing this, as the limitations of the current approach will become apparent if other polymer types are implemented in HELM.

3 participants