mk_prepare_ligand.py can change the total charge of the processed molecule. #63

Closed
xavgit opened this issue Aug 13, 2023 · 6 comments

xavgit commented Aug 13, 2023

Hi,
I downloaded some molecules from ZINC20 in mol2 format
and then used mk_prepare_ligand.py to convert them to the pdbqt format.
For a high percentage of them, mk_prepare_ligand.py takes as input a molecule with a total charge
of -2 (checked with Open Babel's GetTotalCharge()) and returns a converted pdbqt ligand with
a TOT CHARGE of +/-0.00, as reported with the -v option.

Why does the total charge of the same molecule differ between the two formats?

Can this difference in the ligands' total charge affect the docking results?

For example, for ZINC000001644610.mol2 Open Babel reports a total charge of -2, whereas
mk_prepare_ligand.py prints the following output when the same molecule is processed:

Processing ./ZINC000001644610.mol2 file
Molecule setup

==============[ ATOMS ]===================================================
idx | coords | charge |ign| atype | connections
-----+----------------------------+--------+---+----------+--------------- . . .
0 | 0.002 -0.004 0.002 | -0.258 | 0 | OA | [1]
1 | -0.014 1.214 0.009 | 0.062 | 0 | N | [0, 2, 3]
2 | 1.032 1.837 0.002 | -0.258 | 0 | OA | [1]
3 | -1.306 1.936 0.019 | 0.271 | 0 | A | [1, 29, 4]
4 | -1.322 3.320 0.021 | 0.020 | 0 | A | [3, 5, 30]
5 | -2.522 4.006 0.030 | -0.006 | 0 | A | [4, 6, 31]
6 | -3.719 3.311 0.038 | 0.204 | 0 | A | [5, 7, 8]
7 | -4.899 3.984 0.048 | -0.288 | 0 | OA | [6]
8 | -3.709 1.909 0.036 | 0.117 | 0 | A | [6, 9, 29]
9 | -4.885 1.212 0.043 | -0.252 | 0 | NA | [8, 10]
10 | -5.000 0.135 0.768 | 0.035 | 0 | C | [9, 11, 32]
11 | -6.266 -0.614 0.776 | -0.007 | 0 | A | [10, 28, 12]
12 | -7.345 -0.173 0.001 | -0.053 | 0 | A | [11, 13, 33]
13 | -8.520 -0.869 0.008 | -0.053 | 0 | A | [12, 14, 34]
14 | -8.645 -2.023 0.791 | -0.007 | 0 | A | [13, 15, 27]
15 | -9.911 -2.772 0.799 | 0.035 | 0 | C | [14, 16, 35]
16 | -10.028 -3.847 1.527 | -0.252 | 0 | NA | [15, 17]
17 | -11.204 -4.543 1.535 | 0.117 | 0 | A | [16, 25, 18]
18 | -11.919 -4.727 0.349 | 0.046 | 0 | A | [17, 19, 36]
19 | -13.105 -5.431 0.364 | 0.271 | 0 | A | [18, 20, 23]
20 | -13.862 -5.626 -0.893 | 0.062 | 0 | N | [19, 21, 22]
21 | -13.436 -5.167 -1.938 | -0.258 | 0 | OA | [20]
22 | -14.910 -6.247 -0.883 | -0.258 | 0 | OA | [20]
23 | -13.591 -5.955 1.550 | 0.020 | 0 | A | [19, 24, 37]
24 | -12.893 -5.778 2.730 | -0.006 | 0 | A | [23, 25, 38]
25 | -11.698 -5.080 2.732 | 0.204 | 0 | A | [17, 24, 26]
26 | -11.011 -4.907 3.892 | -0.288 | 0 | OA | [25]
27 | -7.566 -2.465 1.565 | -0.053 | 0 | A | [14, 28, 39]
28 | -6.389 -1.771 1.554 | -0.053 | 0 | A | [11, 27, 40]
29 | -2.490 1.228 0.021 | 0.046 | 0 | A | [3, 8, 41]
30 | -0.391 3.868 0.015 | 0.070 | 0 | H | [4]
31 | -2.526 5.086 0.032 | 0.067 | 0 | H | [5]
32 | -4.166 -0.207 1.363 | 0.085 | 0 | H | [10]
33 | -7.248 0.717 -0.603 | 0.063 | 0 | H | [12]
34 | -9.353 -0.529 -0.589 | 0.063 | 0 | H | [13]
35 | -10.744 -2.432 0.201 | 0.085 | 0 | H | [15]
36 | -11.543 -4.320 -0.577 | 0.072 | 0 | H | [18]
37 | -14.521 -6.504 1.552 | 0.070 | 0 | H | [23]
38 | -13.279 -6.190 3.651 | 0.067 | 0 | H | [24]
39 | -7.662 -3.355 2.168 | 0.063 | 0 | H | [27]
40 | -5.557 -2.112 2.152 | 0.063 | 0 | H | [28]
41 | -2.474 0.148 0.018 | 0.072 | 0 | H | [29]
-----+----------------------------+--------+---+----------+--------------- . . .
TOT CHARGE: 0.000

Thanks.

Saverio

PS: I also posted this problem to autodock@scripps.edu, but I think this is a more appropriate place, and the question
is stated more clearly here.
Sorry if posting in both places is a mistake.

ZINC000001644610.pdbqt.txt
ZINC000001644610.mol2.txt
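
The TOT CHARGE line in the verbose output above appears to be the sum of the per-atom values in the charge column. A minimal sketch for checking this by hand follows; the function name `sum_charge_column` is hypothetical, and the sample contains only the first four atom rows from the post. Summing the column over all 42 rows gives 0.000, which matches the reported TOT CHARGE.

```python
def sum_charge_column(table_text):
    """Sum the 'charge' column of the atom table printed with the -v option."""
    total = 0.0
    for line in table_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        # Atom rows have six '|'-separated fields and start with an integer index;
        # the header row and the '-----+-----' separator rows are skipped.
        if len(fields) == 6 and fields[0].isdigit():
            total += float(fields[2])
    return total


# First four atom rows copied from the output above (a partial sample only).
ROWS = """\
idx | coords | charge |ign| atype | connections
0 | 0.002 -0.004 0.002 | -0.258 | 0 | OA | [1]
1 | -0.014 1.214 0.009 | 0.062 | 0 | N | [0, 2, 3]
2 | 1.032 1.837 0.002 | -0.258 | 0 | OA | [1]
3 | -1.306 1.936 0.019 | 0.271 | 0 | A | [1, 29, 4]
"""

partial = sum_charge_column(ROWS)
print(f"Sum over the sample rows: {partial:.3f}")
```

Pasting the whole table into `ROWS` gives the total for the full molecule.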

@rwxayheee

Hi,

I found that the atom type assignments are incorrect for some atoms in the MOL2 file:

The atom types for most carbon atoms in ZINC000001644610 should be C.ar, representing aromatic carbons. The compound is correctly aromatized in the generated pdbqt file, probably because the bond section specifies the aromaticity of the bonds.

The atom types for the phenolate oxygens should be O.3. Typing them as "O.2" is likely the cause of the total charge = 0 in this case. I suspect similar issues with the carboxylate-containing compounds you posted on the autodock mailing list.

Attached is a MOL2 file for this compound directly downloaded from:
https://zinc20.docking.org/substances/ZINC000001644610/

527368906.mol2.txt

For this MOL2 file, meeko (RDKit) will report a total charge = -2.

Hope this is mildly helpful..!

@diogomart
Contributor

Here are the relevant bits of my comments from the mailing list:

For this MOL2
@<TRIPOS>MOLECULE
ZINC000034235761
 47 48 0 0 0
SMALL
USER_CHARGES

@<TRIPOS>ATOM
      1 C1         -0.2907    1.4244    0.5537 C.2     1  ZINC0000342357611   -0.2000
      2 C2         -0.1576    0.1283    0.4179 C.2     1  ZINC0000342357611   -0.1100
      3 C3          1.2174   -0.4846    0.3469 C.3     1  ZINC0000342357611   -0.0800
      4 H4          1.9727    0.2974    0.4245 H       1  ZINC0000342357611    0.1200
      5 C5          1.3822   -1.2330   -0.9828 C.3     1  ZINC0000342357611    0.2500
      6 H6          1.3035   -0.5302   -1.8123 H       1  ZINC0000342357611    0.1100
      7 O7          2.6567   -1.8784   -1.0176 O.3     1  ZINC0000342357611   -0.3400
      8 C8          3.0025   -2.4135   -2.2967 C.3     1  ZINC0000342357611    0.2200
      9 H9          2.1763   -3.0160   -2.6741 H       1  ZINC0000342357611    0.0600
     10 O10         3.2657   -1.3429   -3.2058 O.3     1  ZINC0000342357611   -0.3700
     11 C11         3.5980   -1.7691   -4.5287 C.3     1  ZINC0000342357611    0.1100
     12 H12         2.7797   -2.3633   -4.9354 H       1  ZINC0000342357611    0.0800
     13 C13         3.8296   -0.5446   -5.4163 C.3     1  ZINC0000342357611    0.0900
     14 O14         2.6081    0.1848   -5.5499 O.3     1  ZINC0000342357611   -0.5600
     15 C15         4.8721   -2.6168   -4.4853 C.3     1  ZINC0000342357611    0.1000
     16 H16         5.1026   -2.9795   -5.4870 H       1  ZINC0000342357611    0.0700
     17 O17         5.9579   -1.8226   -4.0034 O.3     1  ZINC0000342357611   -0.5300
     18 C18         4.6518   -3.8079   -3.5477 C.3     1  ZINC0000342357611    0.0800
     19 H19         3.8568   -4.4406   -3.9425 H       1  ZINC0000342357611    0.0800
     20 O20         5.8592   -4.5654   -3.4454 O.3     1  ZINC0000342357611   -0.5500
     21 C21         4.2525   -3.2873   -2.1635 C.3     1  ZINC0000342357611    0.0700
     22 H22         4.0400   -4.1295   -1.5050 H       1  ZINC0000342357611    0.0700
     23 O23         5.3220   -2.5124   -1.6179 O.3     1  ZINC0000342357611   -0.5300
     24 O24         0.3260   -2.2268   -1.0809 O.2     1  ZINC0000342357611   -0.3500
     25 C25         0.1105   -3.0092   -0.0123 C.2     1  ZINC0000342357611    0.0300
     26 C26         0.5736   -2.7193    1.2097 C.2     1  ZINC0000342357611   -0.2400
     27 C27         0.2814   -3.5985    2.2741 C.3     1  ZINC0000342357611    0.5200
     28 O28        -0.4838   -4.5394    2.0991 O.2     1  ZINC0000342357611   -0.7200
     29 O29         0.7928   -3.4210    3.3734 O.2     1  ZINC0000342357611   -0.6800
     30 C30         1.3947   -1.4893    1.4899 C.3     1  ZINC0000342357611    0.0200
     31 H31         2.4461   -1.7653    1.5712 H       1  ZINC0000342357611    0.0700
     32 C32         0.9312   -0.8552    2.8029 C.3     1  ZINC0000342357611   -0.1900
     33 C33         1.8338    0.3010    3.1485 C.3     1  ZINC0000342357611    0.5000
     34 O34         2.7237    0.6299    2.3826 O.2     1  ZINC0000342357611   -0.7100
     35 O35         1.6743    0.9076    4.1940 O.2     1  ZINC0000342357611   -0.7200
     36 H36         0.5848    2.0537    0.6148 H       1  ZINC0000342357611    0.1000
     37 H37        -1.2761    1.8641    0.6001 H       1  ZINC0000342357611    0.0900
     38 H38        -1.0331   -0.5011    0.3569 H       1  ZINC0000342357611    0.1000
     39 H39         4.5869    0.0949   -4.9627 H       1  ZINC0000342357611    0.0700
     40 H40         4.1691   -0.8680   -6.4003 H       1  ZINC0000342357611    0.0600
     41 H41         2.6807    0.9761   -6.1010 H       1  ZINC0000342357611    0.3800
     42 H42         6.7988   -2.2970   -3.9498 H       1  ZINC0000342357611    0.3800
     43 H43         5.7918   -5.3366   -2.8659 H       1  ZINC0000342357611    0.3900
     44 H44         5.1376   -2.1505   -0.7404 H       1  ZINC0000342357611    0.3900
     45 H45        -0.4635   -3.9145   -0.1442 H       1  ZINC0000342357611    0.1600
     46 H46        -0.0921   -0.4967    2.6919 H       1  ZINC0000342357611    0.0400
     47 H47         0.9715   -1.5983    3.5993 H       1  ZINC0000342357611    0.1100
@<TRIPOS>UNITY_ATOM_ATTR
28 1
charge -1
34 1
charge -1
@<TRIPOS>BOND
     1     1     2    2
     2     1    36    1
     3     1    37    1
     4     2     3    1
     5     2    38    1
     6     3     4    1
     7     3    30    1
     8     3     5    1
     9     5     6    1
    10     5     7    1
    11     5    24    1
    12     7     8    1
    13     8     9    1
    14     8    21    1
    15     8    10    1
    16    10    11    1
    17    11    12    1
    18    11    13    1
    19    11    15    1
    20    13    14    1
    21    13    39    1
    22    13    40    1
    23    14    41    1
    24    15    16    1
    25    15    17    1
    26    15    18    1
    27    17    42    1
    28    18    19    1
    29    18    20    1
    30    18    21    1
    31    20    43    1
    32    21    22    1
    33    21    23    1
    34    23    44    1
    35    24    25    1
    36    25    26    2
    37    25    45    1
    38    26    27    1
    39    26    30    1
    40    27    28    1
    41    27    29    2
    42    30    31    1
    43    30    32    1
    44    32    33    1
    45    32    46    1
    46    32    47    1
    47    33    34    1
    48    33    35    2

Reading with RDKit: charge = 0 (as you reported).
Reading with OpenBabel: charge = -2
Converting to MOL with OpenBabel and then reading with RDKit: charge = -2
Converting to MOL2 with OpenBabel (i.e. re-writing it) and reading with RDKit: error because non-ring atom is marked aromatic.

I don't really know what to do from here. One option that comes to mind is to download 2D SDFs from ZINC and use external software, such as molconvert from ChemAxon, to calculate the 3D coordinates and protonate.
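
The MOL2 file above carries its net charge in two places: the partial charges in the ATOM section, which sum to approximately -2, and the formal charges in the UNITY_ATOM_ATTR section (-1 on atoms 28 and 34). Both charged oxygens are typed O.2 while their only bond has order 1. The sketch below reads these three pieces of information without any toolkit. The function names are hypothetical, the sample is a three-atom excerpt (atoms 33 to 35) of the file above, and the "O.2 without a double bond" check is only a heuristic for spotting oxygens whose type may be read ambiguously, not a description of what RDKit or Open Babel do internally.

```python
def parse_mol2(text):
    """Return (atoms, bonds, formal_charges) parsed from MOL2 text."""
    atoms = {}    # atom id -> (name, SYBYL type, partial charge)
    bonds = []    # (atom id, atom id, bond type as written in the file)
    formal = {}   # atom id -> formal charge from UNITY_ATOM_ATTR
    section = None
    current_atom = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("@<TRIPOS>"):
            section = line[len("@<TRIPOS>"):]
            continue
        fields = line.split()
        if section == "ATOM":
            # id name x y z type subst_id subst_name charge
            charge = float(fields[8]) if len(fields) > 8 else 0.0
            atoms[int(fields[0])] = (fields[1], fields[5], charge)
        elif section == "BOND":
            bonds.append((int(fields[1]), int(fields[2]), fields[3]))
        elif section == "UNITY_ATOM_ATTR":
            if fields[0] == "charge":
                formal[current_atom] = int(fields[1])
            elif fields[0].isdigit():
                current_atom = int(fields[0])  # "<atom id> <attribute count>"
    return atoms, bonds, formal


def o2_without_double_bond(atoms, bonds):
    """Heuristic: ids of atoms typed O.2 that take part in no double bond."""
    in_double = set()
    for a, b, order in bonds:
        if order == "2":
            in_double.update((a, b))
    return sorted(i for i, (_, sybyl, _) in atoms.items()
                  if sybyl == "O.2" and i not in in_double)


# Excerpt of the file above: one carboxylate group (atoms 33, 34, 35).
SAMPLE = """\
@<TRIPOS>ATOM
     33 C33         1.8338    0.3010    3.1485 C.3     1  ZINC0000342357611    0.5000
     34 O34         2.7237    0.6299    2.3826 O.2     1  ZINC0000342357611   -0.7100
     35 O35         1.6743    0.9076    4.1940 O.2     1  ZINC0000342357611   -0.7200
@<TRIPOS>UNITY_ATOM_ATTR
34 1
charge -1
@<TRIPOS>BOND
    47    33    34    1
    48    33    35    2
"""

atoms, bonds, formal = parse_mol2(SAMPLE)
partial_sum = sum(charge for _, _, charge in atoms.values())
formal_sum = sum(formal.values())
flagged = o2_without_double_bond(atoms, bonds)
print(f"partial charge sum: {partial_sum:.2f}")
print(f"formal charge sum:  {formal_sum}")
print(f"O.2 atoms without a double bond: {flagged}")
```

Running the same functions on the complete file text shows whether the partial charges and the UNITY_ATOM_ATTR formal charges agree on the net charge, independently of how a given toolkit interprets the atom types.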

@xavgit
Author

xavgit commented Aug 15, 2023

Hi,
thanks for the suggestions.
I cannot find the 2D SDFs in the download section of the tranches.
Can you kindly indicate where they are?

Thanks.

Saverio

@rwxayheee

rwxayheee commented Aug 15, 2023

Hi @xavgit,

2D SDFs can be downloaded individually from ZINC. Judging from the file headers, they appear to have been generated with RDKit. Using RDKit in Python, you can create a molecule from a SMILES string (which you can get from the tranches), compute 2D coordinates, and write an SDF file as output. Depending on which external software you plan to use in the next steps, it might be fine to use the SMILES strings directly as input.
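
A minimal sketch of the RDKit route described above, assuming RDKit is installed. The helper name `smiles_to_2d_sdf` is hypothetical, and any SMILES passed to it here is for illustration only, not taken from a ZINC tranche. The function also returns the formal charge RDKit perceives from the SMILES, which makes it easy to confirm that the net charge survives the conversion.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def smiles_to_2d_sdf(smiles, name, out_path):
    """Write a 2D SDF for one SMILES string and return its total formal charge."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"RDKit could not parse SMILES: {smiles!r}")
    mol.SetProp("_Name", name)      # becomes the title line of the SDF record
    AllChem.Compute2DCoords(mol)    # generate 2D depiction coordinates
    writer = Chem.SDWriter(out_path)
    writer.write(mol)
    writer.close()
    return Chem.GetFormalCharge(mol)
```

For example, `smiles_to_2d_sdf("CC(=O)[O-]", "acetate", "acetate.sdf")` would write a single-record SDF for acetate and return -1, because the charge is written explicitly in the SMILES and does not depend on MOL2 atom types.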

@diogomart
Contributor

Here are a few interesting discussions about the ambiguity of the MOL2 format and the fact that RDKit expects atom types as written by CORINA.

https://sourceforge.net/p/rdkit/mailman/message/37668451/
rdkit/rdkit#4061
https://sourceforge.net/p/rdkit/mailman/message/37374678/

@xavgit
Author

xavgit commented Sep 2, 2023

Thanks for the links.

Saverio
