Nucleotide splitting for V3000 molfile SCSR #1412

Closed
even1024 opened this issue Nov 20, 2023 · 0 comments · Fixed by #1420

even1024 commented Nov 20, 2023

Split the standard nucleotide set (A, C, G, T, U) into components: phosphate, sugar, and nucleobase.
Nucleotide splitting will require reworking the attachment points and updating the monomer coordinates:

Before the splitting:
image

After:
image

Fully expanded form:
image

For DNA nucleotides deoxyribose (d) should be used instead of ribose (r).

deoxyribose:

image

Please note that in V3000 molfiles the nucleotide's phosphate appears on the left side of the ribose/deoxyribose.
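
The split described above can be sketched in Python as follows. This is only an illustration, not Indigo's implementation: the function name `split_nucleotide`, the lookup table, and the sugar names "R" and "dR" are assumptions made for the example (the issue itself only speaks of ribose (r) and deoxyribose (d)). The components are returned phosphate first, to mirror the left-side placement noted above.

```python
# Hypothetical sketch: none of these names come from Indigo itself.
STANDARD_BASES = {"A", "C", "G", "T", "U"}

# Assumed sugar monomer names: ribose for RNA, deoxyribose for DNA.
SUGAR_BY_CLASS = {"RNA": "R", "DNA": "dR"}


def split_nucleotide(base, mol_class):
    """Return (class, name) pairs for one nucleotide, phosphate first."""
    if base not in STANDARD_BASES:
        raise ValueError(f"not a standard nucleotide: {base!r}")
    if mol_class not in SUGAR_BY_CLASS:
        raise ValueError(f"unsupported class: {mol_class!r}")
    return [
        ("PHOSPHATE", "P"),
        ("SUGAR", SUGAR_BY_CLASS[mol_class]),
        ("BASE", base),
    ]
```

For example, `split_nucleotide("A", "RNA")` yields a phosphate, a ribose, and an adenine entry, while passing "DNA" swaps in the deoxyribose name.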

By default, BIOVIA Draw saves sequences as template atoms and adds the corresponding nucleotide templates:

CTAB:
M  V30 BEGIN CTAB
M  V30 COUNTS 2 1 1 0 1
M  V30 BEGIN ATOM
M  V30 1 A 1.9906 -2.7031 0 0 CLASS=RNA ATTCHORD=(2 2 Br) SEQID=1
M  V30 2 G 2.9355 -2.7031 0 0 CLASS=RNA ATTCHORD=(2 1 Al) SEQID=2
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 1 1 2 1
M  V30 END BOND
M  V30 BEGIN SGROUP
M  V30 1 DAT 1 ATOMS=(1 1) FIELDNAME="SMMX:sequence position data" -
M  V30 FIELDDISP="    1.9906   -2.7031    DA    ALL  1     "
M  V30 END SGROUP
M  V30 END CTAB
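
To make the structure of the template atom lines above easier to follow, here is a minimal Python sketch that breaks one `M  V30` atom line into its fields. It is an illustration under assumptions, not Indigo's loader: `parse_v30_atom` and the returned dictionary layout are hypothetical, and only the simple `KEY=value` and `KEY=(...)` properties seen in this issue are handled.

```python
import re

# A token is either KEY=(... ...) with spaces inside the parentheses,
# or any run of non-space characters.
ATOM_TOKEN = re.compile(r"\S+=\([^)]*\)|\S+")


def parse_v30_atom(line):
    """Parse one V3000 atom line into a dict (hypothetical layout)."""
    prefix = "M  V30 "
    if not line.startswith(prefix):
        raise ValueError("not a V3000 line")
    tokens = ATOM_TOKEN.findall(line[len(prefix):])
    if len(tokens) < 6:
        raise ValueError("atom line is too short")
    index, symbol, x, y, z, aamap = tokens[:6]
    # Remaining tokens are optional properties such as CLASS or SEQID.
    props = dict(tok.split("=", 1) for tok in tokens[6:] if "=" in tok)
    return {
        "index": int(index),
        "symbol": symbol,
        "xyz": (float(x), float(y), float(z)),
        "aamap": int(aamap),
        "props": props,
    }


atom = parse_v30_atom(
    "M  V30 1 A 1.9906 -2.7031 0 0 CLASS=RNA ATTCHORD=(2 2 Br) SEQID=1"
)
```

Here `atom["props"]` holds `CLASS`, `ATTCHORD`, and `SEQID` as raw strings; the attachment order list is left unparsed on purpose.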

Templates:

M  V30 BEGIN TEMPLATE
M  V30 TEMPLATE 1 RNA/A
M  V30 BEGIN CTAB
///skipped.
M  V30 END CTAB
M  V30 TEMPLATE 2 RNA/G
M  V30 BEGIN CTAB
///skipped.
M  V30 END CTAB
M  V30 END TEMPLATE

The molfile loader should transform this into the granular form:
CTAB:

M  V30 BEGIN CTAB
M  V30 COUNTS 3 2 0 0 0
M  V30 BEGIN ATOM
M  V30 1 P 0 0 0 0 CLASS=PHOSPHATE SEQID=1 ATTCHORD=(2 2 Br)
M  V30 2 R 5 0 0 0 CLASS=SUGAR SEQID=2 SEQNAME=A ATTCHORD=(4 1 Al 3 Cx)
M  V30 3 A 5 -4 0 0 CLASS=BASE SEQID=2 SEQNAME=A ATTCHORD=(2 2 Al)
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 1 1 1 2
M  V30 2 1 2 3
M  V30 END BOND
M  V30 END CTAB

Templates:

M  V30 BEGIN TEMPLATE
M  V30 TEMPLATE 23 PHOSPHATE/P/P NATREPLACE=P
//skipped.
M  V30 TEMPLATE 25 SUGAR/R/R NATREPLACE=R
//skipped
M  V30 TEMPLATE 16 BASE/A/A NATREPLACE=A
//skipped
M  V30 END TEMPLATE
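
The template header lines above follow the pattern `TEMPLATE <id> <CLASS>/<name>[/<name>...]` with optional `KEY=value` properties. A small Python sketch of reading such a header is given below; `parse_template_header` and its return layout are hypothetical names chosen for this example, and the pattern is inferred only from the lines quoted in this issue.

```python
import re

# id, slash-separated class/names, then zero or more KEY=value properties.
TEMPLATE_RE = re.compile(r"^M  V30 TEMPLATE (\d+) (\S+)((?: \S+=\S+)*)$")


def parse_template_header(line):
    """Parse a V3000 TEMPLATE header line (hypothetical helper)."""
    match = TEMPLATE_RE.match(line.rstrip())
    if match is None:
        raise ValueError("not a TEMPLATE header line")
    template_id, path, raw_props = match.groups()
    # First path element is the class, the rest are template names.
    template_class, *names = path.split("/")
    props = dict(item.split("=", 1) for item in raw_props.split())
    return {
        "id": int(template_id),
        "class": template_class,
        "names": names,
        "props": props,
    }


granular = parse_template_header("M  V30 TEMPLATE 23 PHOSPHATE/P/P NATREPLACE=P")
compact = parse_template_header("M  V30 TEMPLATE 1 RNA/A")
```

With these inputs, `granular["class"]` is `PHOSPHATE` while `compact["class"]` is `RNA`, which is one way a loader could tell an already granular template from one that still needs splitting.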

Full CTAB V3000 form with support of the additional data-sgroups:

SMMX:class
SMMX:NUCLEOTIDE_P5
SMMX:NUCLEOTIDE_BASE
SMMX:NUCLEOTIDE_SUGAR
