Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mk_prepare_ligand.py Issue #76

Open
heqing-chriscao opened this issue Nov 10, 2023 · 6 comments
Open

mk_prepare_ligand.py Issue #76

heqing-chriscao opened this issue Nov 10, 2023 · 6 comments

Comments

@heqing-chriscao
Copy link

Hello,

When I use "mk_prepare_ligand.py" to prepare a ligand, usually one of the two following problems occur:

"CompletedProcess(args='mk_prepare_ligand.py -i 1exd_aptamer_Model_1H.sdf -o 1exd_aptamer_Model_1H.pdbgt', returncode=0, stdout='', stderr='[20:26:47] Explicit v alence for atom # 549 H, 2, is greater than permitted\n[20:26:47] ERROR: Could not sanitize molecule ending on line 4891\n[20:26:47] ERROR: Explicit valence for atom # 549 H, 2, is greater than permitted\n')

"CompletedProcess(args='mk_prepare_ligand.py -i 1rpu_aptamer_Model_1H.sdf -o 1rpu_aptamer_Model_1H.pdbqt', returncode=0, stdout-'', stderr-'molecule has implicit hydrogens (name=1rpu_aptamer_Model_1H)\n\n')"

Before using mk_prepare_ligand.py, I use Reduce to add hydrogens to the ligand as suggested by AutoDock Vina, and Pymol to convert it to sdf file. I think something off happens during the first step as all errors are due to newly added hydrogens. I wonder if it is the problem of Reduce. If this is the case, are there any better alternatives for adding hydrogen?

The two erroneous files and their hydrogen-added version are attached here. Erroneou Files.zip

Any insight is appreciated. Thank you so much in advance.

@diogomart
Copy link
Contributor

Hi,

The usual ligand for autodock is a drug-like molecule, typically with much less than 50 heavy atoms. These are nucleic acids, so much larger than the typical ligand. We are re-designing the receptor preparation and I think it could be helpful to parameterize these nucleic acids and still try to dock them rigidly. I'll try to remember to get back here once that is released, but I'll probably forget, and then I'll rediscover this when I go over open issues, so feel free to ping back here in about a couple weeks if you are still interested.

As for the errors you are getting, the first is at the RDKit level - there's a hydrogen with two bonds, and the second is meeko rejecting an RDKit molecule with implicit Hs.

@rwxayheee
Copy link

Hi @heqing-chriscao,
Another problem is that 1rpu_aptamer_Model.pdb includes conformations with partial occupancies, which lead to duplicated heavy atoms. When there are two conformations, reduce tries to protonate both and create duplicate hydrogens. I recommend splitting the two conformations (A and B) before running reduce on them. And see if the generated SDF can pass the valence check.

@heqing-chriscao
Copy link
Author

Hi @diogomart,

Thank you for your reply and for your information about the re-design of the receptor preparation. Does it mean, for now, our docking results are not accurate as our ligands are much larger than meeko expected?

For meeko rejecting an RDKit molecule with implicit Hs, are there any ways to fix the implicit hydrogens? I think after adding H's correctly, there shouldn't be any implicit hydrogens. Do you have any suggestions on the methods of adding hydrogens before using meeko?

Thank you so much.

@heqing-chriscao
Copy link
Author

@rwxayheee

Thank you for pointing that out. I have checked some of the files that have the valance issue, and I find most of them indeed have multiple conformations. I will split them before docking them.

For the implicit H's issue, will you get the same error after using reduce?

Thanks.

@rwxayheee
Copy link

rwxayheee commented Nov 12, 2023

Hi @heqing-chriscao,

For the implicit H's issue, will you get the same error after using reduce?

I repeated your way of making the SDF file, and I found the implicit Hs. It is possible to generate the PDBQT file after correcting the charges (or add hydrogens if you prefer to).

Reduce did not protonate the 5' end phosphate groups, so for 1rpu_aptamer_Model, which is a double-strand RNA (21 nucleotides + 21 nucleotides), the total net charge should be -44 ( -1 for each phosphodiester bond and -2 for the terminal phosphate). However, in PyMOL's SDF file, only 42 are noted with CHG=-1, one from each phosphodiester bond or phosphate group.

Meeko will accept a correctly charged SDF file or RDKit molecule. You can do this by manually correct the SDF file or assign formal charge in RDKit:

import meeko, rdkit
from rdkit import Chem
from rdkit.Chem import rdmolops
from meeko import MoleculePreparation
from meeko import PDBQTWriterLegacy

# referece: https://gist.github.com/greglandrum/7f546d1e35c2df537c68a64d887793b8
def add_formal_charges(m):
    m.UpdatePropertyCache(strict=False)
    for at in m.GetAtoms():
        # [O-]
        if at.GetAtomicNum() in [8] and at.GetExplicitValence()==1 and at.GetFormalCharge()==0:
            at.SetFormalCharge(-1)

AH_from_sdf = Chem.SDMolSupplier('1rpu_aptamer_Model_AH.sdf', removeHs=False)[0]
print("Formal Charge from Input:", rdmolops.GetFormalCharge(AH_from_sdf)) # will get -42

# Correct charge and update valence properties
add_formal_charges(AH_from_sdf)
AH_from_sdf.UpdatePropertyCache(strict=False)

print("Formal Charge after Correction:", rdmolops.GetFormalCharge(AH_from_sdf)) # will get -44

# Generate ligand pdbqt string
preparator = MoleculePreparation()
mol_setups = preparator.prepare(AH_from_sdf)
for setup in mol_setups:
    pdbqt_string = PDBQTWriterLegacy.write_string(setup)[0]

# Write pdbqt string to file
with open('1rpu_aptamer_Model_AH.pdbqt','w') as fw:
    for line in pdbqt_string:
        fw.write(line)

But.! I agree with Diogo that it might not be feasible to dock a ligand with this many free torsions. Preparing it as a receptor (and dock rigidily) is also possible if you wish. I attached my files in case you're still interested in making a ligand PDBQT and seeing how many torsions it might have. Hope they are still helpful..

Input
1rpu_aptamer_Model_AH.sdf.txt
Output
1rpu_aptamer_Model_AH.pdbqt.txt

ps. I split A/B before running reduce

@heqing-chriscao
Copy link
Author

Thank you so much @rwxayheee!

The code you given is working well on my end. I will try to learn to how to mannually fix the other files with the same issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants