# Docking with AutoDock Vina

<div class="alert alert-block alert-info"> 
<h2>Overview</h2>
    
<strong>Questions</strong>

* How can I dock a ligand using AutoDock Vina?
* How can I visualize docking structures and ligand interactions?

<strong>Learning Objectives</strong>

* Learn how to define a ligand box for docking.
* Use AutoDock Vina to dock ligands.
* Analyze docking results with prolif.
</div>

In this notebook, we will dock the three ligands we generated previously to [PDB entry 2zq2](https://www.rcsb.org/structure/2zq2), a trypsin structure from the cow (*Bos taurus*) that we retrieved and processed in the previous notebook.


Molecular docking is a type of calculation used to predict the binding modes of small molecules to protein targets. It is commonly used in fields such as drug development to design molecules that bind to enzymes or other proteins for a therapeutic effect.

When a molecular docking calculation is performed, the docking software samples possible conformations of a ligand in a target binding pocket. As these conformations are sampled, a score is calculated for each pose. Poses can be scored in many ways, and developing more accurate scoring functions is an active area of research. Types of scoring functions include physics-based scoring functions (most often based on molecular force fields), empirical scoring functions, and machine learning scoring functions. For a recent review of docking scoring functions, see [this publication](https://link.springer.com/article/10.1007/S12539-019-00327-W).


### Libraries for the IQB workshop

| Library      | Abbreviation | Purpose |
|:-------------|:------------:|:--------|
| os           | N/A          | Operating system functions: handling file paths and directories. |
| MDAnalysis   | mda          | Molecular dynamics library: used for reading/writing files and selecting atoms. |
| vina         | vina         | AutoDock Vina software for Python and Jupyter notebooks. |
| pandas       | pd           | Data analysis library: used here to tabulate and save docking energies. |
| prolif       | plf          | ProLIF (Protein-Ligand Interaction Fingerprints): generates interaction fingerprints for complexes made of ligands, protein, DNA or RNA molecules. |


## Preparing for Docking: Defining a Ligand Box

When we dock our ligands to our protein, we will want to define the binding pocket and the binding box. Luckily MDAnalysis has tools that we can use to measure our molecule and define a binding box. 
The approach we will take in this notebook is to find the `center_of_geometry` of our ligand to define the center of our binding pocket.

In [None]:
# find the center of the ligand
import MDAnalysis as mda

original_structure = mda.Universe("protein_structures/2zq2.pdb")
ligand_mda = original_structure.select_atoms("resname 13U")

# Get the center of the ligand as the "pocket center"
pocket_center = ligand_mda.center_of_geometry()
print(pocket_center)
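
As a minimal sketch of what `center_of_geometry` computes, the cell below reproduces it with NumPy on a made-up set of coordinates (the four points are illustrative only and are not taken from 2zq2). The center of geometry is the unweighted mean of the atom positions, in contrast to a center of mass, which weights each atom by its mass.

```python
import numpy as np

# Made-up coordinates (in angstroms) for a four-atom "ligand"; not from 2zq2.
positions = np.array([
    [0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [2.0, 2.0, 0.0],
    [0.0, 2.0, 4.0],
])

# Center of geometry: the mean position, with every atom weighted equally.
center = positions.mean(axis=0)
print(center)
```

With a real `AtomGroup`, `ligand_mda.positions.mean(axis=0)` should agree with `ligand_mda.center_of_geometry()`.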

After defining the pocket center, we will define our ligand box.
One simple approach is to subtract the minimum ligand coordinate from the maximum ligand coordinate in each dimension.
To allow for ligand flexibility and potential interactions with nearby residues, we will add five angstroms to each dimension of our box (about 2.5 angstroms of padding on each side).

In [None]:
# Compute the min and max coordinates of the ligand along each axis.
# The box size is the extent (max - min) in each direction plus 5 angstroms of padding.
ligand_box = ligand_mda.positions.max(axis=0) - ligand_mda.positions.min(axis=0) + 5
ligand_box
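
The sketch below repeats the box calculation on made-up coordinates and checks how much clearance the box leaves around the ligand. The coordinates and the variable names (`padding`, `margin_low`, `margin_high`) are illustrative assumptions. Because the box is centered on the center of geometry rather than on the midpoint of the ligand's extent, the clearance on opposite faces can differ, so it is worth confirming that every atom still falls inside the box.

```python
import numpy as np

# Made-up coordinates (in angstroms); not taken from 2zq2.
positions = np.array([
    [0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [2.0, 2.0, 0.0],
    [0.0, 2.0, 4.0],
])
padding = 5.0  # total padding added to each box dimension, like the "+ 5" above

center = positions.mean(axis=0)
box_size = positions.max(axis=0) - positions.min(axis=0) + padding

# Box faces when the box is centered on the center of geometry.
lower = center - box_size / 2
upper = center + box_size / 2

# Clearance between the outermost atoms and each box face.
margin_low = positions.min(axis=0) - lower
margin_high = upper - positions.max(axis=0)
all_inside = bool(np.all((positions >= lower) & (positions <= upper)))

print("box size:", box_size)
print("clearance (low faces): ", margin_low)
print("clearance (high faces):", margin_high)
print("all atoms inside box:", all_inside)
```

If a clearance value comes out small or negative for a real ligand, increasing the padding is one simple remedy.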

The `pocket_center` and `ligand_box` variables are NumPy arrays.
However, AutoDock Vina expects them to be lists.
We convert them to lists in the cell below.

In [None]:
pocket_center = pocket_center.tolist()
ligand_box = ligand_box.tolist()

## Docking Ligands with AutoDock Vina

Now that we have PDBQT files of our protein and ligand and have defined our docking box, we are ready to perform the actual docking.
Before docking, we will make a directory to store our results.

In [None]:
# make a directory to store our results
import os

pdb_id = "2zq2"

os.makedirs("docking_results", exist_ok=True)

We will dock using the AutoDock Vina Python API.
First, we import `Vina` from `vina`.
We set up the calculation with the line `v = Vina(sf_name="vina")`.
This creates a docking calculation object, `v`, and sets the scoring function to the `vina` scoring function.

<div class="alert alert-block alert-success">
<strong>Scoring Functions in AutoDock Vina</strong>

* Vina (`vina`): `vina` is an empirical scoring function. Binding energy is predicted as the sum of pairwise atomic interactions. It includes terms for hydrogen bonds, hydrophobic interactions, and steric clashes. The parameters for this scoring function were derived empirically by fitting to data available in the PDBbind database. You can read more in the [original publication](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041641/) or in the [Vinardo paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4865195/).
* Vinardo (`vinardo`): Vinardo stands for "Vina RaDii Optimized". It was developed to improve scoring by adjusting atom radii and reparameterizing the empirical terms based on the PDBbind 2013 database. You can read more in the [Vinardo paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4865195/).
* AutoDock4 (`ad4`): `ad4` uses a physics-based model and is the most computationally intensive of the available scoring functions. The `ad4` score requires the definition of a flexible receptor, so it won't work with the PDBQT we have prepared. If you are interested in flexible docking, see [the tutorial from AutoDock Vina](https://autodock-vina.readthedocs.io/en/latest/docking_flexible.html). <strong>If you try to use the `ad4` scoring function on a receptor that was not prepared to be a flexible receptor, your notebook kernel will crash.</strong>

</div>

In [None]:
from vina import Vina
ligand = "13U"

v = Vina(sf_name="vina")

Then, we set the files for our ligand and receptor and compute the Vina maps for the docking box we defined. We will dock just our ideal ligand first. The call to `dock` takes two parameters here, `exhaustiveness` and `n_poses`. `n_poses` is the number of poses to generate.
The `exhaustiveness` parameter controls how thorough the search is: a higher exhaustiveness means that more ligand conformations are tried, at a higher computational cost. The default exhaustiveness value is 8; increasing this to 32 will give a more consistent docking result.

In this notebook, we set the exhaustiveness to 5 to improve speed for the workshop. If you were to do a real docking calculation, you should consider increasing this parameter.

In [None]:
v.set_receptor(f"pdbqt/{pdb_id}.pdbqt")
v.set_ligand_from_file(f"pdbqt/{ligand}.pdbqt")
v.compute_vina_maps(center=pocket_center, box_size=ligand_box)
v.dock(exhaustiveness=5, n_poses=5)
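
To build intuition for why a higher exhaustiveness tends to give more consistent results, the toy below runs a greedy random search on a made-up one-dimensional "energy" with many local minima. This is only an analogy under stated assumptions: `toy_score`, `one_run`, and `best_of` are hypothetical helpers, and the search is not Vina's algorithm. The number of independent runs plays the role that `exhaustiveness` plays in docking.

```python
import math
import random


def toy_score(x):
    """Made-up rugged 1D 'energy' with many local minima (not a docking score)."""
    return 0.1 * (x - 2.0) ** 2 + math.sin(5.0 * x)


def one_run(rng, n_steps=200):
    """One greedy random search from a random start; returns its final score."""
    x = rng.uniform(-10.0, 10.0)
    for _ in range(n_steps):
        trial = x + rng.gauss(0.0, 0.5)
        # Accept the trial move only if it lowers the score.
        if toy_score(trial) < toy_score(x):
            x = trial
    return toy_score(x)


def best_of(n_runs, seed):
    """Best (lowest) score over n_runs independent runs."""
    rng = random.Random(seed)
    return min(one_run(rng) for _ in range(n_runs))


# Repeat the whole procedure with 20 different seeds and compare the results.
for n_runs in (1, 5, 32):
    results = [best_of(n_runs, seed) for seed in range(20)]
    spread = max(results) - min(results)
    print(f"{n_runs:>2} runs: spread of best score across 20 seeds = {spread:.3f}")
```

A single run often gets trapped in a poor local minimum, so repeated attempts disagree with each other. Taking the best of many runs makes it more likely that each attempt finds a similarly low score.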

After `dock` finishes, we can write the calculated poses to a file.

In [None]:
v.write_poses(f"docking_results/{ligand}.pdbqt", n_poses=5, overwrite=True)

We can see the energies of the calculated poses, in kcal/mol, by calling the `energies` method on the docking calculation variable.
According to the Vina documentation, the rows correspond to the poses, while the columns correspond to different energy types.
The types of energies in the columns are `["total", "inter", "intra", "torsions", "intra best pose"]`.

In [None]:
v.energies()
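
The sketch below shows one way to read such an array: select the `total` column, find the pose with the lowest (most favorable) score, and compute how far each pose is from the best one. The numbers are made up for illustration and are not results from this docking run.

```python
import numpy as np

# Made-up energies in kcal/mol, laid out as described above:
# one row per pose; columns are total, inter, intra, torsions, intra best pose.
energies = np.array([
    [-6.1, -7.9, -0.4, 1.8, -0.4],
    [-5.8, -7.5, -0.5, 1.8, -0.4],
    [-5.2, -6.6, -0.8, 1.8, -0.4],
])
column_names = ["total", "inter", "intra", "torsions", "intra best pose"]

# Pick out the "total" column by name rather than by a hard-coded index.
total = energies[:, column_names.index("total")]

best_pose = int(np.argmin(total))        # lowest total score = most favorable
gap_to_best = total - total[best_pose]   # score difference from the best pose

print("best pose index:", best_pose)
print("gap to best pose (kcal/mol):", gap_to_best)
```

The same indexing should work on the array returned by `v.energies()`.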

You might wish to save these energies to return to them later. 
The cell below creates a pandas dataframe and saves the energies as a comma-separated-value (CSV) file.

In [None]:
import pandas as pd


# These are the columns for the types of energies according to AutoDock Vina docs.
column_names = ["total", "inter", "intra", "torsions", "intra best pose"]

df = pd.DataFrame(v.energies(), columns=column_names)
df.head()

In [None]:
# Save the calculated energies from docking to a CSV file
df.to_csv("docking_results/13U_energies.csv", index=False)

## Analyzing Docking Results

After performing the docking simulation and saving the energies, you might wish to visualize the poses.

In the step above, we wrote the poses to the file `docking_results/13U.pdbqt`.
AutoDock Vina only writes poses in the PDBQT format, but in order to visualize your results, you'll want them in a more standard file format.
We will use meeko again to convert our poses to an SDF file.
Note that meeko will only convert PDBQT files if it prepared the input docking files.
That's why we used it earlier instead of OpenBabel.

Again, we use a command-line script to convert our poses.

In [None]:
! mk_export.py docking_results/13U.pdbqt -o docking_results/13U.sdf
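
A quick sanity check after the conversion is to count how many records ended up in the SDF file. In the SDF format, each molecule record ends with a line containing `$$$$`, so counting those lines gives the number of poses. The helper `count_sdf_records` below is a hypothetical name, and the example text is a made-up stand-in rather than real output.

```python
def count_sdf_records(sdf_text):
    """Count molecule records in SDF text; each record ends with a '$$$$' line."""
    return sum(1 for line in sdf_text.splitlines() if line.strip() == "$$$$")


# Tiny made-up stand-in for an SDF file containing two records.
example = (
    "pose_1\n(molecule block omitted)\nM  END\n$$$$\n"
    "pose_2\n(molecule block omitted)\nM  END\n$$$$\n"
)
print(count_sdf_records(example))
```

For the real file, you could pass in the contents of `docking_results/13U.sdf`. Since we asked for `n_poses=5`, the count should be at most 5.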

After converting to SDF, we can again visualize our results with ProLIF.

In [None]:
import prolif as plf
import MDAnalysis as mda

In [None]:
pdb_id = "2zq2"

protein = mda.Universe("protein_structures/protein_h.pdb")
protein_plf = plf.Molecule.from_mda(protein)

poses_plf = plf.sdf_supplier("docking_results/13U.sdf")

In [None]:
fp = plf.Fingerprint()
# run on your poses
fp.run_from_iterable(poses_plf, protein_plf)

In [None]:
pose_index = 1  # poses are zero-indexed, so 1 selects the second pose

In [None]:
fp.plot_lignetwork(poses_plf[pose_index])

In [None]:
view = fp.plot_3d(
    poses_plf[pose_index], protein_plf, frame=pose_index, display_all=False
)
view

<div class="alert alert-block alert-warning"> 
<h3>Exercise</h3>

Try docking one of the ligands we modified in the previous notebook. Does it bind better or worse according to the docking score? Are the interactions different for the poses?
</div>