# Docking a set of SMILES strings with AutoDock Vina

Ryan L. Melvin

23 May 2016

Docking software gives quick estimates (compared with molecular dynamics docking simulations) of how a receptor and a ligand (e.g., a protein and a drug-like compound) form a complex. Identifying potential candidates for drug discovery requires a ligand library. One such example is ZINC[1-2]. Once a library or subset thereof has been selected, docking software such as AutoDock Vina[3] can provide a quick, efficient calculation of binding sites and affinities.

In [None]:
# Python modules needed:
import subprocess
from shutil import move
import os
import re

First, select the library (or subset thereof) and provide it as a file of SMILES strings, one compound per line.

In [None]:
# Assuming you have the library to be docked in smiles strings
library = '23_t90.smi'

You'll need a receptor. 

In [None]:
receptor = 'receptor.pdb'

Let's count how many drug-like compounds are in the provided library.

In [None]:
# One SMILES string per line, so the line count is the compound count.
with open(library, 'r') as smiles_strings:
    num_drugs = sum(1 for line in smiles_strings)
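A trailing blank line in the `.smi` file would inflate a plain line count. As a minimal sketch, assuming one compound per non-empty line, the count could skip blank lines; `count_compounds` is a hypothetical helper, not part of the original workflow.

```python
def count_compounds(smiles_path):
    """Count non-empty lines in a SMILES file (one compound per line assumed)."""
    with open(smiles_path, 'r') as smiles_strings:
        # line.strip() is empty for blank or whitespace-only lines.
        return sum(1 for line in smiles_strings if line.strip())
```

Calling `count_compounds(library)` would then give a value usable as `num_drugs`.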

Now, we need some software. Specify where babel is. 

In [4]:
babel = 'path/to/babel'
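If Open Babel is already on your `PATH`, its location can be looked up rather than hard-coded. This is a minimal sketch using the standard library's `shutil.which` (Python 3.3+). The executable name `babel` is an assumption about your install, and `find_babel` is a hypothetical helper.

```python
from shutil import which

def find_babel(candidates=('babel',)):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None

# Fall back to the placeholder path if nothing is found on PATH.
babel = find_babel() or 'path/to/babel'
```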

We also need MGLTools, which provides the scripts that prepare input files for Vina.

In [None]:
MGLToolsFolder = '/home/luy/MGLTools-1.5.4'
ligand_prep_script = os.path.join(MGLToolsFolder,'MGLToolsPckgs','AutoDockTools','Utilities24','prepare_ligand4.py')
receptor_prep_script = os.path.join(MGLToolsFolder,'MGLToolsPckgs','AutoDockTools','Utilities24','prepare_receptor4.py')

Let's prepare the receptor

In [None]:
# Use an absolute path: the docking jobs below run from per-compound folders.
receptor_resting_place = os.path.abspath(receptor + 'qt')
receptor_prep_command = (
            receptor_prep_script + ' -r '
            + receptor + ' -o ' + receptor_resting_place
            )
subprocess.call(receptor_prep_command, shell=True)
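Building a command by string concatenation makes it easy to drop a space or misquote a path. A sketch of an alternative, assuming the prep scripts are directly executable as in the cell above: build the command as an argument list and run it without a shell. `build_prep_command` and `run_prep` are hypothetical helpers.

```python
import subprocess

def build_prep_command(script, input_flag, input_path, output_path):
    """Return an argument list such as [script, '-r', input, '-o', output]."""
    return [script, input_flag, input_path, '-o', output_path]

def run_prep(command):
    """Run a prep command without a shell; raises CalledProcessError on failure."""
    subprocess.check_call(command)

# Example: the receptor command as a list (script name shown without its full path).
receptor_cmd = build_prep_command('prepare_receptor4.py', '-r',
                                  'receptor.pdb', 'receptor.pdbqt')
```

The same helper would cover ligands by passing `'-l'` as the input flag.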

Now, we're going to have to do a bunch of these at once. The following script assumes you're using a distributed computing environment with Slurm. I use a helper file that takes care of all the Slurm details, whose code I provide at the end of this document. I'll also give an example SMILES string at the end.

In [None]:
for drug_num in range(1, num_drugs + 1):
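    # Convert the SMILES string for this drug to a 3D PDB.
    # -f/-l select the first and last molecule to convert; --gen3d builds 3D coordinates.
    babel_command = (
            babel + ' -i ' + library + ' -O drug{0}.pdb -f {0} -l {0} --gen3d'.format(drug_num)
            )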
    subprocess.call(babel_command, shell=True)
    # We've made a pdb; let's grab its path.
    pdb = 'drug{0}.pdb'.format(drug_num)
    
    # If you're using ZINC, you'll get a compound ID as part of the PDB. Let's get that.
    with open(pdb,'r') as fobj:
        text = fobj.read()
    compound = re.findall(r'^COMPND\s+(ZINC\d+)', text, re.MULTILINE)[0]
    
    # There are going to be a lot of drugs. Let's make a folder for each
    cwd = os.getcwd()
    compound_dir = os.path.join(cwd,compound)
    if not os.path.exists(compound_dir):
        os.makedirs(compound_dir)
        
    # Move the ith pdb there.
    pdb_path = os.path.join(cwd, pdb)
    pdb_resting_place = os.path.join(compound_dir, pdb)
    move(pdb_path,pdb_resting_place)
    
    # Prepare the ith ligand
    pdbqt_resting_place = pdb_resting_place + 'qt'
    ligand_prep_command = (
            ligand_prep_script + ' -l '
            + pdb_resting_place + ' -o ' + pdbqt_resting_place
            )
    subprocess.call(ligand_prep_command, shell=True)

    dock_helper = os.path.join(cwd, 'docking_submit.slurm')
    
    # Run docking: submit one Slurm job per compound from that compound's folder.
    submit_dock_command = (
            'sbatch --export=x=' + receptor_resting_place + ',y=' + pdbqt_resting_place + ' ' + dock_helper
            )
    docking_job = subprocess.Popen(submit_dock_command, shell=True, cwd=compound_dir, stdout=subprocess.PIPE)
    out, err = docking_job.communicate()
    
    # Let's record what compound goes with what job in case something goes wrong.
    # sbatch prints "Submitted batch job <id>", so the ID is the fourth word.
    jobid = out.decode().split()[3]
    with open('jobs.txt', 'a') as log:
        log.write(compound + '\t' + jobid + '\n')
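Taking the fourth word of the `sbatch` output fails silently if submission goes wrong and the message differs. A minimal sketch of a stricter parse, assuming the usual `Submitted batch job <id>` message; `parse_job_id` is a hypothetical helper.

```python
import re

def parse_job_id(sbatch_output):
    """Extract the numeric job ID from sbatch's 'Submitted batch job N' message."""
    # subprocess pipes return bytes on Python 3; decode before matching.
    if isinstance(sbatch_output, bytes):
        sbatch_output = sbatch_output.decode()
    match = re.search(r'Submitted batch job (\d+)', sbatch_output)
    if match is None:
        raise ValueError('Unexpected sbatch output: {!r}'.format(sbatch_output))
    return match.group(1)
```

Raising on unexpected output stops the loop at the first failed submission instead of logging a wrong job ID.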


## Citations
[1] Irwin JJ, Shoichet BK (2005) ZINC - a free database of commercially available compounds for virtual screening. J Chem Inf Model 36:177–182. doi:10.1002/chin.200516215

[2] Irwin JJ, Sterling T, Mysinger MM et al (2012) ZINC: a free tool to discover chemistry for biology. J Chem Inf Model 52:1757–1768. doi:10.1021/ci3001277

[3] Trott O, Olson AJ (2010) Software news and update AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem 31:455–461. doi:10.1002/jcc

## Supplementary information
Docking helper and SMILES example

### docking_submit.slurm
```bash
#!/bin/bash -l
#SBATCH --partition=small
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=email@wfu.edu
#SBATCH --account=group
#SBATCH --nodes=1
#SBATCH --tasks-per-node=8
#SBATCH --mem=16gb
#SBATCH --time=0-03:00:00
module load vina/1.1.2-intel-2012
vina --receptor ${x} --ligand ${y} --center_x 33.6 --center_y -8.4 --center_z 22.6 --size_x 13 --size_y 10 --size_z 14 --log docking.log
module unload vina/1.1.2-intel-2012
exit
```
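Once the jobs finish, each compound folder holds a `docking.log`. The following sketch reads the predicted affinities back out. It assumes the results-table layout that Vina 1.1.2 writes (rows of mode number, affinity in kcal/mol, and two RMSD columns), and `read_affinities` and `best_affinity` are hypothetical helpers.

```python
import re

# Matches result rows such as "   1         -7.5      0.000      0.000".
_RESULT_ROW = re.compile(r'^\s*(\d+)\s+(-?\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$')

def read_affinities(log_text):
    """Return [(mode, affinity_kcal_per_mol), ...] from the text of a Vina log."""
    results = []
    for line in log_text.splitlines():
        match = _RESULT_ROW.match(line)
        if match:
            results.append((int(match.group(1)), float(match.group(2))))
    return results

def best_affinity(log_text):
    """Return the most negative (strongest) affinity, or None if no rows are found."""
    affinities = [affinity for _, affinity in read_affinities(log_text)]
    return min(affinities) if affinities else None
```

Combined with `jobs.txt`, this would allow ranking compounds by their best predicted affinity.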

### SMILES example
Cc1cc(no1)NC(=O)CCn2cnc3c(c2=O)cnn3C	ZINC54722086
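Each line of the library pairs a SMILES string with a compound name, separated by whitespace. A minimal sketch of splitting such a line, assuming the name (if present) follows the first run of whitespace; `parse_smiles_line` is a hypothetical helper.

```python
def parse_smiles_line(line):
    """Split a .smi line into (smiles, name); name is None when absent."""
    fields = line.split(None, 1)  # split once on any run of whitespace
    if not fields:
        raise ValueError('Empty SMILES line')
    smiles = fields[0]
    name = fields[1].strip() if len(fields) > 1 else None
    return smiles, name

smiles, compound = parse_smiles_line(
    'Cc1cc(no1)NC(=O)CCn2cnc3c(c2=O)cnn3C\tZINC54722086'
)
```

Here `compound` holds the ZINC ID, which could serve as a cross-check against the ID read from the generated PDB file.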