Error parsing PDBQT to Mol: Element 'G' not found #20
Comments
Hello! Try this:
To parse a PDBQT block with RDKit you don't actually need the atom_pdbqt_type field (the characters starting at position 77), because only the first 66 characters of each line are needed to read PDBQT as a PDB block: Chem.MolFromPDBBlock('\n'.join([i[:66] for i in pdbqt_fixed.split('MODEL')[1].split('\n')]), removeHs=False, sanitize=False). So in your case you can simply ignore everything from position 77 onward and change G to C in the atom type column ([12:16]). I hope this helps!
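The suggestion above can be sketched as a small preprocessing helper. This is only an illustration, not code from the repository: the function name pdbqt_pose_to_pdb_block and its parameters are hypothetical, and it assumes the pseudo-atom carries the name "G" in columns [12:16], as described in this thread. It uses only string operations, so it does not itself call RDKit.

```python
def pdbqt_pose_to_pdb_block(pdbqt_text, pose=1, rename_from="G", rename_to="C"):
    """Return a PDB-like block for one pose of a PDBQT string.

    Hypothetical helper: keeps the first 66 characters of every line and
    renames atoms called `rename_from` in columns [12:16] to `rename_to`.
    """
    # Same split as in the comment above: chunk 0 is whatever precedes the
    # first MODEL record, so the first pose is chunk 1.
    chunks = pdbqt_text.split("MODEL")
    pose_text = chunks[pose] if len(chunks) > 1 else pdbqt_text

    fixed_lines = []
    for line in pose_text.split("\n"):
        # Drop the PDBQT-only columns (partial charge and atom type).
        line = line[:66]
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16]
            if name.strip() == rename_from:
                # Replace inside the 4-character field to keep column alignment.
                line = line[:12] + name.replace(rename_from, rename_to) + line[16:]
        fixed_lines.append(line)
    return "\n".join(fixed_lines)
```

The returned string can then be passed to Chem.MolFromPDBBlock(block, removeHs=False, sanitize=False), as in the one-liner above. Renaming only makes the block parseable; as the follow-up in this thread reports, bond-order assignment can still fail for macrocycles.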
Hello @avnikonenko, thanks for your prompt response. I tried what you suggested, and RDKit can now parse the molecule from the PDB block. However, it still fails when I try to assign the bond orders (see below for the error message). I found that a 7-membered ring in the parsed molecule was broken incorrectly.
[Image: the parsed molecule before assigning bond orders; the 7-membered ring on the left is broken]
Error message:
This would be a valuable fix for the script. However, the vina_dock.py script is obsolete and will be removed in a future revision of the repository. I now suggest using https://github.com/ci-lab-cz/docking-scripts, where we implemented support not only for Vina2 but also for gnina and smina (smina is invoked from the gnina code). We actively maintain that repo. If you are able to fix the issue there, it would be helpful (ci-lab-cz/easydock#8). However, it will require more changes, because meeko changed the interface of some of the functions we use.
Hello Dr. Pavel,
I came across your repo and found it useful for parsing docked pdbqt files back to RDKit mol objects for further analysis. However, the pdbqt2mol() function sometimes fails because RDKit does not accept the "G" atom type:
rdkit-scripts/vina_dock.py, line 200 in 72dbd73
giving:
This "G" atom type is generated by meeko at macrocycles during ligand preparation for docking with AutoDock Vina; see forlilab/Meeko#10.
I saw your inline code comments about this issue, but I didn't manage to modify the fix_pdbqt() function successfully:
rdkit-scripts/vina_dock.py, line 169 in 72dbd73
Please note that I didn't use your script to run the docking; it was prepared slightly differently. I am only using your code to parse the docked .pdbqt files back.
I also saw your comment on build_macrocycle=:
rdkit-scripts/vina_dock.py, line 151 in 72dbd73
Do you have any idea how to fix this issue?
I've attached a sample offending .pdbqt file: https://gist.github.com/linminhtoo/5949437ae066fdd136709971dcc36220#file-bad-pdbqt-L26-L27
As you can see, some lines contain "G" (macrocycle) and others "CG0" (I'm not sure whether this also causes problems). I tried brute-forcing it by replacing "G" with "C", but the template bond order assignment step failed (it seems RDKit fails to parse the ring correctly).
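To see which records in a docked file are affected, a small scanner along these lines may help. It is only a diagnostic sketch: the function name find_macrocycle_records is hypothetical, and it assumes that the PDBQT atom type starts at position 77 of each ATOM/HETATM line and that the macrocycle-related types look like "G" or "CG", optionally followed by a digit (as in "CG0").

```python
import re

# Assumed pattern for macrocycle-related types: G, G0, CG, CG0, ...
MACROCYCLE_TYPE = re.compile(r"C?G\d*")


def find_macrocycle_records(pdbqt_text):
    """Return (line_number, atom_name, atom_type) for matching atom records."""
    hits = []
    for number, line in enumerate(pdbqt_text.splitlines(), start=1):
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atom_name = line[12:16].strip()   # atom name field
        atom_type = line[77:].strip()     # PDBQT atom type field (assumed position)
        if MACROCYCLE_TYPE.fullmatch(atom_type):
            hits.append((number, atom_name, atom_type))
    return hits
```

Calling find_macrocycle_records(open(path).read()) on a docked file lists the line numbers to inspect, which makes it easier to tell the "G" records apart from the "CG0" ones before deciding how to handle each.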
Thanks,
Min Htoo