# Docking with AutoDock Vina

## Preparing for Docking: Writing PDBQT Files

AutoDock Vina reads structures in a special file format called PDBQT.
We will prepare the PDBQT file for our protein using OpenBabel.

In this section, we create a folder for our docking structures, then use OpenBabel to read in our
protein structure with hydrogens. **Note:** This is the PDB file that was output from PDB2PQR
in the `binding_site_investigation` notebook.
After we read in the file, we make sure the atoms all have partial charges assigned, then write
the PDBQT file.

If you didn't complete that notebook, add the following to a cell and run it before you continue with the rest of this notebook.

```python
import os # for making directories
import requests

import MDAnalysis as mda

# make a directory for pdb files
os.makedirs("structures", exist_ok=True)

pdb_id = "2zq2" # trypsin PDB file with ligand bound

pdb_request = requests.get(f"https://files.rcsb.org/download/{pdb_id}.pdb")
pdb_request.raise_for_status() # stop here if the download failed

with open(f"structures/{pdb_id}.pdb", "w+") as f:
    f.write(pdb_request.text)

# isolate the protein using MDAnalysis
u = mda.Universe(f"structures/{pdb_id}.pdb")
protein = u.select_atoms("protein")
protein.write(f"structures/protein_{pdb_id}.pdb")

! pdb2pqr --pdb-output=structures/protein_h.pdb --pH=7.4 structures/protein_2zq2.pdb structures/protein_2zq2.pqr
```

In [None]:
import os
from openbabel import pybel

os.makedirs("docking_structures", exist_ok=True)

protein = next(pybel.readfile("pdb", "structures/protein_h.pdb"))
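# Requesting each atom's partial charge prompts OpenBabel to assign charges before we write the file
for atom in protein.atoms:
    atom.OBAtom.GetPartialCharge()

# Write protein to new PDBQT file
# The "r" option writes the protein as a rigid molecule
protein.write("pdbqt", "docking_structures/protein_h.pdbqt", overwrite=True, opt={"r": None})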
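
After writing the protein PDBQT, it can be useful to confirm that partial charges and atom types actually made it into the file. The sketch below is one way to do that using only the Python standard library. The helper name `summarize_pdbqt` is hypothetical, and the parsing assumes the usual PDBQT layout in which the partial charge and atom type follow the PDB temperature-factor field (after column 66).

```python
from collections import Counter


def summarize_pdbqt(path):
    """Count atoms, sum partial charges, and tally atom types in a PDBQT file.

    Assumes the partial charge and atom type are the two fields that
    follow column 66 on each ATOM/HETATM line.
    """
    n_atoms = 0
    total_charge = 0.0
    atom_types = Counter()
    with open(path) as handle:
        for line in handle:
            # Skip REMARK, ROOT, BRANCH, TORSDOF and other non-atom records
            if not line.startswith(("ATOM", "HETATM")):
                continue
            charge, atom_type = line[66:].split()[:2]
            n_atoms += 1
            total_charge += float(charge)
            atom_types[atom_type] += 1
    return {
        "n_atoms": n_atoms,
        "total_charge": total_charge,
        "atom_types": dict(atom_types),
    }
```

In the notebook you could call `summarize_pdbqt("docking_structures/protein_h.pdbqt")` and check that the atom count looks reasonable and that the charges are not all zero.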

We could also use OpenBabel to prepare small molecule PDBQT files.
However, we are going to use [meeko](https://github.com/forlilab/Meeko), a program designed for preparing small molecules for docking.
We choose meeko for our ligands because it will allow us to visualize our results more easily later.

We use meeko from the command line, as we did with PDB2PQR.
You could also use the Python API for this, but the command line is simpler for common tasks like converting an SDF to a PDBQT.

In the cell below, we execute a command that converts the file `ligands_to_dock/13U.sdf`, which we prepared in the `molecule_manipulation` notebook, to a PDBQT file.
We save this PDBQT file in the `docking_structures` folder.

In [None]:
# Use meeko to prepare small molecules - using meeko helps us visualize them later.
! mk_prepare_ligand.py -i ligands_to_dock/13U.sdf -o docking_structures/13U.pdbqt
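
If you have several ligands to prepare, the same command can be generated for each SDF file. The sketch below assumes that any additional ligands are also saved as `.sdf` files in `ligands_to_dock`. The helper names `meeko_commands` and `prepare_all_ligands` are hypothetical, and only the `-i` and `-o` options used in the cell above are passed.

```python
import subprocess
from pathlib import Path


def meeko_commands(sdf_dir="ligands_to_dock", out_dir="docking_structures"):
    """Build one mk_prepare_ligand.py command per SDF file found in sdf_dir."""
    commands = []
    for sdf in sorted(Path(sdf_dir).glob("*.sdf")):
        # Name each output after its input, e.g. 13U.sdf -> 13U.pdbqt
        out = Path(out_dir) / f"{sdf.stem}.pdbqt"
        commands.append(["mk_prepare_ligand.py", "-i", str(sdf), "-o", str(out)])
    return commands


def prepare_all_ligands(sdf_dir="ligands_to_dock", out_dir="docking_structures"):
    """Run meeko on every SDF file; raises if any command fails."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for command in meeko_commands(sdf_dir, out_dir):
        subprocess.run(command, check=True)
```

Calling `meeko_commands()` first lets you inspect the commands before running them with `prepare_all_ligands()`.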

## Preparing for Docking: Defining Ligand Box

When we dock our ligands to our protein, we need to define the binding pocket and the binding box.
For this example, we will load in our structure from the PDB that we visualized in `binding_site_investigation`.

We will use MDAnalysis tools to find the center of the ligand, as well as its minimum and maximum coordinates in each direction, to define our target ligand box for docking.

In [None]:
# find the center of the ligand
import MDAnalysis as mda

original_structure = mda.Universe("structures/2zq2.pdb")
ligand_mda = original_structure.select_atoms("resname 13U")

# Get the center of the ligand as the "pocket center"
pocket_center = ligand_mda.center_of_geometry()
print(pocket_center)

In [None]:
# compute min and max coordinates of the ligand
# take the ligand box to be the difference between the max and min in each direction.
ligand_box = ligand_mda.positions.max(axis=0) - ligand_mda.positions.min(axis=0)
ligand_box

The `pocket_center` and `ligand_box` variables are NumPy arrays.
However, AutoDock Vina expects them to be lists.
We convert them to lists in the cell below.

In [None]:
pocket_center = pocket_center.tolist()
ligand_box = ligand_box.tolist()
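
The box above is exactly as large as the bound ligand. If you want the search space to extend a little beyond the ligand, one option is to add padding on every side. The sketch below is an assumption-laden alternative, not part of the original workflow: the helper `box_from_coordinates` is hypothetical, the default padding of 5 Å is an arbitrary illustrative value, and the center it returns is the midpoint of the bounding box rather than the center of geometry used above.

```python
def box_from_coordinates(coords, padding=5.0):
    """Return (center, size) of an axis-aligned box around coords.

    coords is any sequence of (x, y, z) rows, such as ligand_mda.positions.
    The box is grown by `padding` on every side; both results are plain lists.
    """
    xs, ys, zs = zip(*coords)
    center, size = [], []
    for axis in (xs, ys, zs):
        low, high = float(min(axis)), float(max(axis))
        center.append((low + high) / 2)           # midpoint of the bounding box
        size.append((high - low) + 2 * padding)   # extent plus padding on both sides
    return center, size
```

In the notebook, `pocket_center, ligand_box = box_from_coordinates(ligand_mda.positions)` would give lists that can be passed directly to Vina.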

## Docking Ligands with AutoDock Vina

Now that we have PDBQT files of our protein and ligand and have defined our docking box, we are ready to perform the actual docking.
Before docking, we will make a directory to store our results.

In [None]:
# make a directory to store our results
import os

os.makedirs("docking_results", exist_ok=True)

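We will dock using the AutoDock Vina Python API.
First, we import `Vina` from `vina`.
We set up the docking calculation with the line `v = Vina(sf_name="vina")`.
This creates a docking calculation, `v`, and sets the scoring function to the `vina` scoring function.
We then set the receptor and ligand, compute the Vina maps for our box, run the docking, and write the resulting poses to a file.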

In [None]:
from vina import Vina
ligand = "13U"

v = Vina(sf_name="vina")
v.set_receptor("docking_structures/protein_h.pdbqt")
v.set_ligand_from_file(f"docking_structures/{ligand}.pdbqt")
v.compute_vina_maps(center=pocket_center, box_size=ligand_box)
v.dock(exhaustiveness=5, n_poses=5)
v.write_poses(f"docking_results/{ligand}.pdbqt", n_poses=5, overwrite=True)
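
If you plan to dock more than one ligand, the steps in the cell above can be wrapped in a function. This is a minimal sketch that reuses only the Vina calls shown above. The function name `dock_ligand` and its default paths are assumptions based on the folders used in this notebook.

```python
from pathlib import Path


def dock_ligand(ligand, center, box_size,
                receptor="docking_structures/protein_h.pdbqt",
                ligand_dir="docking_structures",
                out_dir="docking_results",
                exhaustiveness=5, n_poses=5):
    """Dock one prepared ligand and write its poses to out_dir/<ligand>.pdbqt."""
    # Imported inside the function so the helper can be defined before vina is needed
    from vina import Vina

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{ligand}.pdbqt"

    v = Vina(sf_name="vina")
    v.set_receptor(str(receptor))
    v.set_ligand_from_file(str(Path(ligand_dir) / f"{ligand}.pdbqt"))
    v.compute_vina_maps(center=list(center), box_size=list(box_size))
    v.dock(exhaustiveness=exhaustiveness, n_poses=n_poses)
    v.write_poses(str(out_path), n_poses=n_poses, overwrite=True)
    return v, out_path
```

With this in place, `v, poses_file = dock_ligand("13U", pocket_center, ligand_box)` repeats the calculation above, and other ligand names can be substituted once their PDBQT files exist.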

We can see the energies of the calculated poses by calling `energies` on the docking calculation variable.
According to the Vina documentation, the rows correspond to the poses, while the columns correspond to different energy types.
The types of energies in the columns are `["total", "inter", "intra", "torsions", "intra best pose"]`.

In [None]:
v.energies()
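
Because each row is a pose and the first column is the total energy, picking out the best-scoring pose is a matter of finding the row with the lowest total. The sketch below shows one way to do this. The helper `best_pose` is hypothetical, and it relies on the column order listed above.

```python
ENERGY_COLUMNS = ["total", "inter", "intra", "torsions", "intra best pose"]


def best_pose(energies):
    """Return (row_index, labelled_energies) for the pose with the lowest total energy."""
    rows = [[float(value) for value in row] for row in energies]
    if not rows:
        raise ValueError("no poses to compare")
    # Column 0 holds the total energy; lower (more negative) is a better score
    index = min(range(len(rows)), key=lambda i: rows[i][0])
    return index, dict(zip(ENERGY_COLUMNS, rows[index]))
```

In the notebook, `best_pose(v.energies())` returns the zero-based index of the best pose along with its energies labelled by type.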

You might wish to save these energies so you can return to them later.
The cell below creates a pandas dataframe, and the cell after it saves the energies as a comma-separated values (CSV) file.

In [None]:
import pandas as pd


# These are the columns for the types of energies according to AutoDock Vina docs.
column_names = ["total", "inter", "intra", "torsions", "intra best pose"]

df = pd.DataFrame(v.energies(), columns=column_names)
df.head()

In [None]:
# Save the calculated energies from docking to a CSV file
df.to_csv("docking_results/13U_energies.csv", index=False)

## Analyzing Docking Results

After performing the docking simulation and saving the energies, you might wish to visualize the poses.

In the step above, we wrote the poses to the file `docking_results/13U.pdbqt`.
AutoDock Vina only writes poses in the PDBQT format, but to visualize your results, you'll want them in a more standard file format.
We will use meeko again to convert our poses to an SDF.
Note that meeko will only convert PDBQT files if it also prepared the input ligand files for docking.
That's why we used it above instead of OpenBabel.

Again, we use a command line script to convert our poses.

In [None]:
! mk_export.py docking_results/13U.pdbqt -o docking_results/13U.sdf

After converting to SDF, we can again visualize our results with ProLIF.

In [None]:
import prolif as plf
import MDAnalysis as mda

In [None]:
protein = mda.Universe("structures/protein_h.pdb")
protein_plf = plf.Molecule.from_mda(protein)

poses_plf = plf.sdf_supplier("docking_results/13U.sdf")

In [None]:
fp = plf.Fingerprint()
# run on your poses
fp.run_from_iterable(poses_plf, protein_plf)

In [None]:
fp.plot_barcode(xlabel="Pose")

In [None]:
# index of the pose to visualize (indexing starts at 0)
pose_index = 1

In [None]:
fp.plot_lignetwork(poses_plf[pose_index])

In [None]:
view = fp.plot_3d(
    poses_plf[pose_index], protein_plf, frame=pose_index, display_all=False
)
view

## Exercise

Dock one of the modified ligands by repeating the steps above.