# **1. Theory**

## Contents

- Molecular docking
- Sampling algorithms
- Scoring functions
- Limitations
- Visual inspection
- Docking software
  - Commercial
  - Free (for academics)

## 1.1. Theoretical aspects

As more and more protein structures are determined experimentally using X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy, molecular docking is increasingly used as a tool in **drug discovery**.

**Molecular docking simulations** explore the potential binding poses of small molecules on the **binding site** of a target protein for which an experimentally determined structure is available. 
Docking against protein targets generated by **comparative modelling** also becomes possible for proteins whose structures are yet to be solved.

Thus, the **_druggability_** of different compounds and their binding affinity for a given protein target can be estimated to guide further lead optimization.

<figure>
<center>
<img src='https://raw.githubusercontent.com/pb3lab/ibm3202/master/images/docking_01.png'/>
<figcaption>FIGURE 1. In molecular docking, binding is evaluated in two steps: A) Energetics of the transition of the unbound states of ligand and target towards the conformations of the bound complex; and B) energetics of protein-ligand binding in these conformations. <br> Huey R et al (2007) <i>J Comput Chem 28(6), 1145-1152.</i></figcaption></center>
</figure>

Molecular docking programs run a **search algorithm** in which varying conformations of a given ligand, typically generated using Monte Carlo or genetic algorithms, are iteratively evaluated until convergence to an energy minimum is reached. Finally, an **affinity scoring function** estimates a ΔG (binding free energy in kcal/mol) as the sum of several energetic contributions (electrostatics, van der Waals, desolvation, etc.), and this estimate is used to rank the candidate poses.
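
To make the idea of ranking by a summed score concrete, here is a minimal sketch. The pose names and the per-term energies are made-up illustration values, not the output of any docking program, and real scoring functions weight and compute these terms in program-specific ways.

```python
from dataclasses import dataclass


@dataclass
class PoseTerms:
    """Hypothetical per-pose energy terms in kcal/mol (illustration only)."""
    name: str
    electrostatics: float
    van_der_waals: float
    desolvation: float


def total_score(pose: PoseTerms) -> float:
    # Estimated binding free energy as a plain sum of the contributions.
    return pose.electrostatics + pose.van_der_waals + pose.desolvation


# Made-up values, chosen only to demonstrate the ranking step.
poses = [
    PoseTerms("pose_a", electrostatics=-1.5, van_der_waals=-4.0, desolvation=1.0),
    PoseTerms("pose_b", electrostatics=-2.0, van_der_waals=-5.5, desolvation=1.5),
    PoseTerms("pose_c", electrostatics=-0.5, van_der_waals=-3.0, desolvation=0.5),
]

# More negative means more favourable, so sort in ascending order.
ranked = sorted(poses, key=total_score)
for pose in ranked:
    print(f"{pose.name}: {total_score(pose):.2f} kcal/mol")
```

The most negative total comes first, which mirrors how docking programs usually report their best pose at the top of the list.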

A **molecular docking workflow** usually involves the following steps (Fig. 2):

- Input file preparation, e.g. protonation and conversion into specific file formats  
- Conformational sampling of the ligand inside the binding pocket  
- Scoring of the generated docking poses  
- Post-processing, e.g. storing a diverse and highly scored set of docking poses for further analysis

![Docking workflow](https://drive.google.com/uc?id=1K1FDhM8k8rRwokDL_lrIQQS8z2E9FtUZ)  

__FIGURE 2:__ Molecular docking workflow, _Michele Wichmann_ and _David Schaller_

## 1.2. Sampling algorithms

Most of the currently available molecular docking tools use one or more of the following algorithms to sample the conformations of ligands inside a protein binding pocket ([_Curr Comput Aided Drug Des_ (2011), __7__, 2, 146-157](https://doi.org/10.2174/157340911795677602)).

* **Matching algorithms (MA)** compare the shape of ligand conformations with that of the protein binding pocket, usually also taking chemical information into account, e.g. hydrogen bond acceptors and donors. Programs that use MA for sampling conformations are usually among the fastest docking programs. However, they require ligand conformations to be computed beforehand for use in the shape comparison. If the biologically relevant conformation is not present in this library, the algorithm will fail.
* In the **incremental construction** method, the ligand is first deconstructed into smaller fragments by breaking its rotatable bonds. One of the fragments, for example the biggest one, is placed first into the binding pocket. Subsequently, the complete ligand is incrementally constructed inside the binding pocket by connecting the remaining fragments at the appropriate positions of the core fragment. Incremental construction belongs to the fastest class of algorithms but is limited to medium-sized ligands, since an increasing number of fragments can lead to a combinatorial explosion that slows down the docking calculation dramatically.
* **Stochastic methods** sample ligand conformations by rigid-body rotation and translation as well as bond rotation.
    * **Monte Carlo methods** generate random placements and evaluate obtained conformations inside the protein binding pocket with an energy-based selection criterion. If the pose passes a certain threshold, the conformation is saved and subsequently randomly modified to generate another conformation. This process is performed until a pre-defined number of conformations is reached.
    * **Genetic algorithms** are inspired by the concept of natural selection from Darwin's _theory of evolution_. The geometric properties of a ligand pose are stored on so-called _chromosomes_, which define the conformation of the ligand. Genetic operations such as mutation and cross-over are used to sample the conformational space of the ligand. A scoring function is then used to estimate the quality of the conformations inside the binding pocket. The highest-scoring members of the population are finally used to create a new generation, which resembles Darwin's concept of the _survival of the fittest_.
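
The stochastic search idea can be sketched on a toy problem. The example below is an assumption-laden illustration: `toy_energy` is a hypothetical two-dimensional surface standing in for a scoring function, and the acceptance rule is the Metropolis criterion, which is one common choice and not necessarily the selection criterion used by any particular docking program.

```python
import math
import random


def toy_energy(x: float, y: float) -> float:
    """Hypothetical energy surface with a single minimum at (1.0, -2.0)."""
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


def monte_carlo_search(energy, start, n_steps=5000, step_size=0.5,
                       temperature=0.1, seed=0):
    """Random-walk search that keeps the lowest-energy point visited."""
    rng = random.Random(seed)
    current = start
    current_e = energy(*current)
    best, best_e = current, current_e
    for _ in range(n_steps):
        # Randomly perturb the current "pose".
        candidate = (current[0] + rng.uniform(-step_size, step_size),
                     current[1] + rng.uniform(-step_size, step_size))
        candidate_e = energy(*candidate)
        delta = candidate_e - current_e
        # Metropolis criterion: always accept downhill moves, accept
        # uphill moves with probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


best_pose, best_energy = monte_carlo_search(toy_energy, start=(5.0, 5.0))
print(best_pose, best_energy)
```

In a real docking run the two coordinates would be replaced by the ligand's translation, rotation and torsion angles, and `toy_energy` by the program's scoring function.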

## 1.3. Scoring functions

Scoring functions are used to discriminate correct from incorrect docking poses and, in virtual screening, to prioritize active over inactive molecules. They need to be as accurate as possible while requiring little computing time. Thus, they involve many assumptions and simplifications to reduce computational costs ([_Curr Comput Aided Drug Des_ (2011), __7__, 2, 146-157](https://doi.org/10.2174/157340911795677602), [_Int J Mol Sci_, (2021), __22__, 9, 1-34](https://doi.org/10.3390/ijms22094435), [_J Cheminform_ (2021), __13__, 43, 13-43](https://doi.org/10.1186/s13321-021-00522-2)).

* **Force field-based** scoring functions estimate the binding energy by calculating the strength of non-bonded interactions (e.g. van der Waals force, electrostatic interactions) of the protein-ligand complex. Extensions of these methods also include estimates for entropy and desolvation penalties upon ligand binding. A disadvantage of these scoring functions is their comparably high computational cost.
* **Empirical** scoring functions are based on coefficients that are used to estimate the contributions of different interaction types, e.g., hydrogen bonds, ionic interactions, hydrophobic contacts. These coefficients were obtained from a regression analysis of protein-ligand complexes with known binding affinity.
* **Knowledge-based** scoring functions integrate results from a statistical analysis of experimentally resolved protein-ligand complexes, which gathered information about interatomic contact frequencies and distances observed between protein and ligand. Docking poses will be scored higher if they show contact characteristics that were often observed in the statistical analysis.
* **ML/DL-based** scoring functions are machine learning (ML)/deep learning (DL) models that were trained on a set of available protein-ligand complexes with known binding affinity. The protein-ligand complexes are encoded in a computer-readable format, e.g. as interaction fingerprints or graphs. Such scoring functions can be applied during the pose evaluation step or during post-processing to rank hit compounds more accurately.
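
As an illustration of how the coefficients of an empirical scoring function could be obtained by regression, here is a minimal sketch using ordinary least squares. The interaction counts and affinities are synthetic values generated for this example (the affinities are computed from assumed weights so the fit can be checked); they are not taken from any real data set or published scoring function.

```python
import numpy as np

# Synthetic training set (illustration only): one row per complex with
# counts of hydrogen bonds, ionic interactions and hydrophobic contacts.
interaction_counts = np.array([
    [2, 0, 5],
    [3, 1, 4],
    [1, 0, 8],
    [4, 1, 6],
    [0, 0, 3],
    [2, 1, 7],
], dtype=float)

# Assumed "true" weights (kcal/mol per interaction) and constant term,
# used here only to generate noise-free affinities for the demonstration.
assumed_weights = np.array([-1.0, -2.0, -0.3])
assumed_offset = 0.5
affinities = interaction_counts @ assumed_weights + assumed_offset

# Add a column of ones so that the regression also fits the constant term.
design = np.hstack([interaction_counts, np.ones((len(interaction_counts), 1))])
coefficients, *_ = np.linalg.lstsq(design, affinities, rcond=None)


def empirical_score(counts):
    """Score a new complex from its interaction counts."""
    return float(np.dot(coefficients[:3], counts) + coefficients[3])


print("fitted coefficients:", np.round(coefficients, 3))
print("score for counts [3, 0, 6]:", round(empirical_score([3, 0, 6]), 3))
```

With real data the affinities are noisy, so the fitted coefficients only approximate the underlying contributions, and the quality of the function depends strongly on the training complexes.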

## 1.4. Limitations

* Docking programs can treat some residue side chains as flexible during docking calculations to account for binding pocket flexibility. However, the dynamic, adaptive nature of protein-ligand binding is insufficiently explored by protein-ligand docking. This can result in false positives: even if the ligand finds a suitable pose in the binding pocket, this pose may not remain stable once the protein is allowed to explore conformations near its energy minimum. Hence, short molecular dynamics simulations are nowadays recommended to evaluate the stability of the predicted binding pose ([_Curr Comput Aided Drug Des_ (2011), __7__, 2, 146-157](https://doi.org/10.2174/157340911795677602)).  
* Scoring functions used by docking programs must be cheap to compute. While their accuracy is good enough to distinguish good poses from bad ones, they can struggle to rank the best poses correctly. For example, while most popular docking programs are able to find the experimental pose in their calculations, this pose is rarely ranked first in the proposed set. Furthermore, several retrospective studies have shown that docking scores often correlate poorly with binding affinity ([_J Med Chem_ (2006), __49__, 20, 5912-31](https://doi.org/10.1021/jm050362n), [_Phys Chem Chem Phys_ (2016), __18__, 18, 12964-75](https://doi.org/10.1039/c6cp01555g)).  
* To reduce the computational cost, docking is only performed on a subset of the protein (normally around a known binding pocket). Choosing the correct binding site can become another challenge if the binding pocket is not known a priori.  
* To maximize the accuracy of the calculation, the ligand and protein structures must be prepared appropriately. Protonation states of amino acids and ligands can be tricky to get right, especially in the case of (potential) tautomers. This is yet another potential source of poor results.
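
Whether a docking program has reproduced the experimental pose is usually judged by the root-mean-square deviation (RMSD) between docked and experimental ligand coordinates. The sketch below uses made-up coordinates and scores to show the case described above, where the top-scored pose is not the one closest to the experiment. The 2 Å cut-off is a commonly used convention and is an assumption here, as is the simplification that atoms are already in matching order (no alignment or symmetry correction is done).

```python
import numpy as np


def rmsd(coords_a, coords_b) -> float:
    """RMSD between two conformations with identical atom ordering."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("Both poses need the same number of atoms.")
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))


# Hypothetical three-atom "ligand" taken as the experimental reference.
experimental = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]])

# Made-up docking results: (score in kcal/mol, coordinates).
docked = {
    "pose_1": (-9.1, experimental + np.array([3.0, 0.0, 0.0])),
    "pose_2": (-8.4, experimental + np.array([0.5, 0.0, 0.0])),
}

RMSD_CUTOFF = 2.0  # Angstrom; conventional threshold, assumed here

for name, (score, coords) in sorted(docked.items(), key=lambda kv: kv[1][0]):
    deviation = rmsd(coords, experimental)
    label = "near-experimental" if deviation <= RMSD_CUTOFF else "different pose"
    print(f"{name}: score {score:.1f}, RMSD {deviation:.2f} A ({label})")
```

For real ligands a symmetry-aware RMSD is preferable, because equivalent atoms (for example in a phenyl ring) can otherwise inflate the value.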

## 1.5. Visual inspection

Due to the aforementioned limitations of sampling algorithms and docking scoring functions, docking poses are commonly inspected visually in most docking scenarios. Interestingly, a survey revealed that molecular modeling experts find docking scores to be the least important criterion for selecting docking poses ([_J Med Chem_ (2021), __64__, 5, 2489–2500](https://doi.org/10.1021/acs.jmedchem.0c02227)). Instead, the following criteria are used when selecting docking poses:  

* Similarity to binding modes observed in available crystal structures of the protein of interest  
* Steric as well as electrostatic and hydrophobic complementarity of ligand and protein  
  * Unsatisfied hydrogen bond donor and acceptor groups in ligand and protein  
  * Solvent exposed hydrophobic ligand moieties  
* Interactions with side chains or metal ions critical for protein function, e.g., enzymatic activity
* Interaction partners and localization of hydrogen bonds
  * Hydrogen bonds formed with the protein backbone in an enclosed hydrophobic protein environment are considered more favorable  
  * Hydrogen bonds with charged and solvent exposed protein side chains are considered less favorable  
* Displacement of or interactions with water molecules in the binding pocket  
* Protein and ligand strain induced by ligand binding  

However, visual inspection of docking poses also has considerable limitations. Its success strongly depends on the experience and intuition of the participating scientists. In addition, it can only be performed for a rather small number of molecules compared with the chemical space covered in virtual screening. Hence, visual inspection is often restricted to the highest-scoring molecules of a virtual screening pipeline, or to a smaller set of ligands in which scientists are particularly interested. Finally, even the best expert cannot pick the correct binding pose if it was not sampled by the docking program.

## 1.6. Docking software

The following list gives examples of docking programs, categorized by the availability of free licenses. A more comprehensive list can be found at [Wikipedia](https://en.wikipedia.org/wiki/List_of_protein-ligand_docking_software).

**Commercial**

* [GOLD](https://www.ccdc.cam.ac.uk/solutions/csd-discovery/components/gold/)
* [Glide](https://www.schrodinger.com/glide)
* [FlexX](https://www.biosolveit.de/FlexX/)

**Free (for academics)**

* [AutoDock](http://autodock.scripps.edu/)
* [AutoDock Vina](http://vina.scripps.edu/)
* [DOCK](http://dock.compbio.ucsf.edu/)
* [OpenEye](https://www.eyesopen.com/)
* [Smina](https://sourceforge.net/projects/smina/)

<figure>
<center>
<img src='https://raw.githubusercontent.com/pb3lab/ibm3202/master/images/docking_02.png' />
<figcaption>FIGURE 3. General steps of molecular docking. First, the target protein and ligand or ligands are parameterized. Then, the system is prepared by setting up the search grid. Once the docking calculation is performed, ligand poses are scored based on a given energy function. Lastly, the computational search is processed and compared against experimental data for validation <br><i>Taken from Pars Silico (en.parssilico.com).</i></figcaption></center>
</figure>

## 1.7. References
- Molecular docking:
    - Pagadala _et al._, [_Biophys Rev_ (2017), __9__, 91-102](https://doi.org/10.1007/s12551-016-0247-1)
    - Meng _et al._, [_Curr Comput Aided Drug Des_ (2011), __7__, 2, 146-157](https://doi.org/10.2174/157340911795677602)
    - Gromski _et al._, [_Nat Rev Chem_ (2019), __3__, 119-128](https://doi.org/10.1038/s41570-018-0066-y)
- Docking and scoring function assessment:
    - Warren _et al._, [_J Med Chem_ (2006), __49__, 20, 5912-31](https://doi.org/10.1021/jm050362n)
    - Wang _et al._, [_Phys Chem Chem Phys_ (2016), __18__, 18, 12964-75](https://doi.org/10.1039/c6cp01555g)
    - Koes _et al._, [_J Chem Inf Model_ (2013), __53__, 8, 1893-1904](https://doi.org/10.1021/ci300604z)
    - Kimber _et al._, [_Int J Mol Sci_, (2021), __22__, 9, 1-34](https://doi.org/10.3390/ijms22094435)
    - McNutt _et al._, [_J Cheminform_ (2021), __13__, 43, 13-43](https://doi.org/10.1186/s13321-021-00522-2)
- Visual inspection of docking results:
    - Fischer _et al._, [_J Med Chem_ (2021), __64__, 5, 2489–2500](https://doi.org/10.1021/acs.jmedchem.0c02227)

---
---
# **2. Install software**

Before we start, you must first **remember to start the hosted runtime in Google Colab**.

Then, we must install several pieces of software to perform this tutorial. Namely:

- [Miniconda](https://docs.conda.io/en/latest/miniconda.html): a free minimal installer of conda for software package and environment management.
- [OpenBabel](http://openbabel.org/wiki/Main_Page): for parameterization of our ligand(s).
- Docking software
  - [AutoDock Vina](https://autodock-vina.readthedocs.io/en/latest/): for the docking process
  - [QVina2](https://qvina.github.io/): for the docking process
  - [Smina](https://github.com/mwojcikowski/smina): for the docking process
  - [Vina GPU](https://github.com/DeltaGroupNJUPT/Vina-GPU): for the docking process on a GPU
- [py3Dmol](http://nglviewer.org/nglview/latest/): for visualization of the protein structure and setting up the search grid.
- [AutoDockTools](https://github.com/Valdes-Tresanco-MS/AutoDockTools_py3/tree/master): for parameterization of our target protein using Gasteiger charges.
- [pdb2pqr](https://www.cgl.ucsf.edu/chimera/docs/ContributedSoftware/apbs/pdb2pqr.html): for parameterization of our protein using the AMBER ff99 forcefield.
- [biopython](https://biopython.org/): for manipulation of the PDB files
- [rdkit](https://www.rdkit.org/): cheminformatics package
- [PLIP](https://github.com/pharmai/plip): the Protein–Ligand Interaction Profiler ([_Nucl. Acids Res._ (2015), __43__, W1, W443-W447](https://academic.oup.com/nar/article/43/W1/W443/2467865))

After several tests, the following installation instructions are the best way of setting up **Google Colab** for this laboratory session.



In [None]:
#@title **2.1. Miniconda**
!wget -c https://repo.anaconda.com/miniconda/Miniconda3-py37_4.12.0-Linux-x86_64.sh
!chmod +x Miniconda3-py37_4.12.0-Linux-x86_64.sh
!bash ./Miniconda3-py37_4.12.0-Linux-x86_64.sh -b -f -p /usr/local

import sys
sys.path.append('/usr/local/lib/python3.7/site-packages/')
!conda install -c conda-forge mamba -y

--2023-05-28 14:21:08--  https://repo.anaconda.com/miniconda/Miniconda3-py37_4.12.0-Linux-x86_64.sh
Resolving repo.anaconda.com (repo.anaconda.com)... 104.16.130.3, 104.16.131.3, 2606:4700::6810:8303, ...
Connecting to repo.anaconda.com (repo.anaconda.com)|104.16.130.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 104996770 (100M) [application/x-sh]
Saving to: ‘Miniconda3-py37_4.12.0-Linux-x86_64.sh’


2023-05-28 14:21:08 (188 MB/s) - ‘Miniconda3-py37_4.12.0-Linux-x86_64.sh’ saved [104996770/104996770]

PREFIX=/usr/local
Unpacking payload ...
Collecting package metadata (current_repodata.json): - done
Solving environment: | / done

## Package Plan ##

  environment location: /usr/local

  added / updated specs:
    - _libgcc_mutex==0.1=main
    - _openmp_mutex==4.5=1_gnu
    - brotlipy==0.7.0=py37h27cfd23_1003
    - ca-certificates==2022.3.29=h06a4308_1
    - certifi==2021.10.8=py37h06a4308_2
    - cffi==1.15.0=py37hd667e15_1
    - charset-normalizer==

In [None]:
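#@title **2.2. Install pip packages**
!pip install py3Dmol==2.0.0.post2
# NOTE: the PyPI package `pybel` is PyBEL (Biological Expression Language),
# not Open Babel's Python bindings. The `from openbabel import pybel` import
# used later is provided by the conda `openbabel` package installed in 2.5.
!pip install pybel==0.15.5
!pip install rdkit-pypi==2022.3.5
!pip install git+https://github.com/Valdes-Tresanco-MS/AutoDockTools_py3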

import os
os.chdir('/content')
!git clone https://github.com/tieulongphan8995/dock_util.git

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting py3Dmol==2.0.0.post2
  Downloading py3Dmol-2.0.0.post2-py2.py3-none-any.whl (6.7 kB)
Installing collected packages: py3Dmol
Successfully installed py3Dmol-2.0.0.post2
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pybel==0.15.5
  Downloading pybel-0.15.5-py3-none-any.whl (387 kB)
     |████████████████████████████████| 387 kB 4.8 MB/s
Collecting pandas
  Downloading pandas-1.3.5-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.3 MB)
     |████████████████████████████████| 11.3 MB 73.1 MB/s
Collecting tabulate
  Downloading tabulate-0.9.0-py3-none-any.whl (35 kB)
Collecting psycopg2-binary
  Downloading psycopg2_binary-2.9.6-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
     |████████████████████████████████| 3.0 MB 68.5 MB/s
Collecting click
  Downloadi

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting rdkit-pypi==2022.3.5
  Downloading rdkit_pypi-2022.3.5-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (36.8 MB)
     |████████████████████████████████| 36.8 MB 49 kB/s
Collecting Pillow
  Downloading Pillow-9.5.0-cp37-cp37m-manylinux_2_28_x86_64.whl (3.4 MB)
     |████████████████████████████████| 3.4 MB 71.7 MB/s
Installing collected packages: Pillow, rdkit-pypi
Successfully installed Pillow-9.5.0 rdkit-pypi-2022.3.5
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting git+https://github.com/Valdes-Tresanco-MS/AutoDockTools_py3
  Cloning https://github.com/Valdes-Tresanco-MS/AutoDockTools_py3 to /tmp/pip-req-build-ttyav4wb
  Running command git clone -q https://github.com/Valdes-Tresanco-MS/AutoDockTools_py3 /tmp/pip-req-build-ttyav4wb
  Resolved https://github.com/Valdes-Tresanco-MS/AutoDockTool

In [None]:
#@title **2.3. Install Autodock vina and forks**

# autodock vina 1.2.0
!wget https://github.com/ccsb-scripps/AutoDock-Vina/releases/download/v1.2.0/vina_1.2.0_linux_x86_64 -O vina
!chmod u+x vina


# qvina2
%cd /content
!git clone https://github.com/QVina/qvina.git
%cd /content/qvina/bin
!chmod u+x qvina2.1
%cd /content



# smina
!wget https://downloads.sourceforge.net/project/smina/smina.static
!chmod u+x smina.static
!./smina.static --help

--2023-05-28 14:22:37--  https://github.com/ccsb-scripps/AutoDock-Vina/releases/download/v1.2.0/vina_1.2.0_linux_x86_64
Resolving github.com (github.com)... 140.82.112.3
Connecting to github.com (github.com)|140.82.112.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/258054635/58844daf-a594-48a8-99ce-bda336b96967?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230528%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230528T142238Z&X-Amz-Expires=300&X-Amz-Signature=4331f4c9ce4cca3cd55fc844ef3c9848323c00a87daa00526ba4528ea38673a4&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=258054635&response-content-disposition=attachment%3B%20filename%3Dvina_1.2.0_linux_x86_64&response-content-type=application%2Foctet-stream [following]
--2023-05-28 14:22:38--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/258054635/58844daf-a594-48a8-
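
Because the download logs above are long, it can be useful to confirm afterwards that each docking binary is actually present and executable. The following is a small optional sketch; the paths are taken from the install commands and aliases used in this notebook, and the helper name `find_executable` is introduced here only for illustration.

```python
import os
import shutil

# Paths as used by the install cell and the aliases defined later on.
TOOLS = [
    "/content/vina",
    "/content/qvina/bin/qvina2.1",
    "/content/smina.static",
]


def find_executable(path_or_name):
    """Return the path of an executable file, or None if it is unusable."""
    if os.path.sep in path_or_name:
        is_ok = os.path.isfile(path_or_name) and os.access(path_or_name, os.X_OK)
        return os.path.abspath(path_or_name) if is_ok else None
    # Bare command names are looked up on the PATH instead.
    return shutil.which(path_or_name)


status = {tool: find_executable(tool) for tool in TOOLS}
for tool, resolved in status.items():
    print(f"{'OK     ' if resolved else 'MISSING'} {tool}")
```

A `MISSING` entry usually means the download failed or the `chmod u+x` step was skipped for that file.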

In [None]:
#@title **2.4. Install Vina-GPU**
import torch
if torch.cuda.is_available():
  !lscpu |grep 'Model name'
  !ulimit -s 8192
  %cd /content
  !git clone https://github.com/DeltaGroupNJUPT/Vina-GPU.git

  %cd /usr/local
  !wget https://boostorg.jfrog.io/artifactory/main/release/1.80.0/source/boost_1_80_0.tar.gz
  !chmod 777 boost_1_80_0.tar.gz
  !tar -xzvf boost_1_80_0.tar.gz


  %cd /content/Vina-GPU
  import os

  # Read in the file
  with open('Makefile', 'r') as file :
    filedata = file.read()

  # Replace the target string
  filedata = filedata.replace('../boost_1_77_0', '/usr/local/boost_1_80_0')
  filedata = filedata.replace('-DOPENCL_3_0', '-DOPENCL_1_2')

  # Write the file out again
  with open('Makefile', 'w') as file:
    file.write(filedata)

  %cd /content/Vina-GPU
  !make clean
  !make source
  !./Vina-GPU --config ./input_file_example/2bm2_config.txt
  # Remake to build without compiling kernel files
  !make clean
  !make
  os.chdir('/content')
else:
  print('Change GPU to activate')

Change GPU to activate


In [None]:
#@title **2.5. Install conda package**
!mamba install -c conda-forge -c bioconda mgltools=1.5.7 biopython=1.78 \
          openbabel=3.1.0 plip=2.2.2 zlib=1.2.11 xlsxwriter=3.0.3 -y


                  __    __    __    __
                 /  \  /  \  /  \  /  \
                /    \/    \/    \/    \
███████████████/  /██/  /██/  /██/  /████████████████████████
              /  / \   / \   / \   / \  \____
             /  /   \_/   \_/   \_/   \    o \__,
            / _/                       \_____/  `
            |/
        ███╗   ███╗ █████╗ ███╗   ███╗██████╗  █████╗
        ████╗ ████║██╔══██╗████╗ ████║██╔══██╗██╔══██╗
        ██╔████╔██║███████║██╔████╔██║██████╔╝███████║
        ██║╚██╔╝██║██╔══██║██║╚██╔╝██║██╔══██╗██╔══██║
        ██║ ╚═╝ ██║██║  ██║██║ ╚═╝ ██║██████╔╝██║  ██║
        ╚═╝     ╚═╝╚═╝  ╚═╝╚═╝     ╚═╝╚═════╝ ╚═╝  ╚═╝

        mamba (0.15.3) supported by @QuantStack

        GitHub:  https://github.com/mamba-org/mamba
        Twitter: https://twitter.com/QuantStack

█████████████████████████████████████████████████████████████


Looking for: ['mgltools=1.5.7', 'biopython=1.78', 'openbabel=3.1.0', 'plip=2.2.2', 'zlib=1.2.11', 'xlsxwriter=3.

---
---
# **3. Set up folder**

In [None]:
#@title **3.1. Import Python modules**

import os
import sys
sys.path.append('/content/dock_util')
from Utility import *


import ast
import math
import plip
import timeit
import shutil
import py3Dmol
import contextlib
import xlsxwriter
import urllib.request

import numpy as np
import pandas as pd

from google.colab import drive, files
from tqdm.notebook import tqdm
from openbabel import pybel
from Bio.PDB import PDBIO, PDBParser
from rdkit import Chem
from rdkit.Chem import rdFMCS, AllChem, PandasTools
from plip.exchange.report import BindingSiteReport
from plip.structure.preparation import PDBComplex
from AutoDockTools.Utilities24 import prepare_ligand4, prepare_receptor4
print(f"+ Imported done")
print(f"+ Environment ready for molecular docking")

+ Imported done
+ Environment ready for molecular docking


In [None]:
#@title **3.2. Create folders**
#@markdown Enter a **< Job Name >** without spaces.\
#@markdown This creates folders for the protein, ligand, experimental, docking and interaction data.

Job_name = "6PUW" #@param {type:"string"}
assert Job_name != "", "Do not leave this blank."
assert not any(c in "/. " for c in Job_name), "Disallowed characters."

DIR = os.getcwd()
WRK_DIR = os.path.join(DIR, Job_name)
PRT_FLD = os.path.join(WRK_DIR, "PROTEIN")
LIG_FLD = os.path.join(WRK_DIR, "LIGAND")
EXP_FLD = os.path.join(WRK_DIR, "EXPERIMENTAL")
DCK_FLD = os.path.join(WRK_DIR, "DOCKING")
INT_FLD = os.path.join(WRK_DIR, "INTERACTION")
 
folders = [WRK_DIR, PRT_FLD, LIG_FLD, EXP_FLD, DCK_FLD, INT_FLD]

for f in folders:
    if os.path.exists(f):
        print(f"+ {os.path.basename(f)} folder already exists")
    else:
        os.mkdir(f)
        print(f"+ {os.path.basename(f)} folder created")

+ 6PUW folder created
+ PROTEIN folder created
+ LIGAND folder created
+ EXPERIMENTAL folder created
+ DOCKING folder created
+ INTERACTION folder created


In [None]:
#@title **3.3. Set up utilities**
#@markdown This creates important variables and functions that will be utilized 
#@markdown throughout the docking study.

%alias vina /content/vina
%alias qvina2 /content/qvina/bin/qvina2.1
%alias vina-gpu /content/Vina-GPU/Vina-GPU

COLORS = ["red", "orange", "yellow", "lime", "green", "cyan", "teal", "blue", 
          "violet", "purple", "pink", "gray", "brown", "white", "black"]

BOND_DICT = {"hydrophobic": ["0x59e382", "GREEN"], 
             "hbond": ["0x59bee3", "LIGHT BLUE"],
             "waterbridge": ["0x4c4cff", "BLUE"], 
             "saltbridge": ["0xefd033", "YELLOW"], 
             "pistacking": ["0xb559e3", "PURPLE"], 
             "pication": ["0xe359d8", "VIOLET"], 
             "halogen": ["0x59bee3", "LIGHT BLUE"], 
             "metal": ["0xe35959", "ORANGE"]}


In [None]:
%vina --help

AutoDock Vina v1.2.0

Input:
  --receptor arg             rigid part of the receptor (PDBQT)
  --flex arg                 flexible side chains, if any (PDBQT)
  --ligand arg               ligand (PDBQT)
  --batch arg                batch ligand (PDBQT)
  --scoring arg (=vina)      scoring function (ad4, vina or vinardo)

Search space (required):
  --maps arg                 affinity maps for the autodock4.2 (ad4) or vina 
                             scoring function
  --center_x arg             X coordinate of the center (Angstrom)
  --center_y arg             Y coordinate of the center (Angstrom)
  --center_z arg             Z coordinate of the center (Angstrom)
  --size_x arg               size in the X dimension (Angstrom)
  --size_y arg               size in the Y dimension (Angstrom)
  --size_z arg               size in the Z dimension (Angstrom)

Output (optional):
  --out arg                  output models (PDBQT), the default is chosen based
                             on the
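
The search-space options listed in the help text (`--center_*` and `--size_*`) have to be derived from the structure. One common approach, sketched below under stated assumptions, is to centre the box on a known ligand and pad its extent on every side. The helper names, the file names and the 5 Å padding are hypothetical choices for illustration; only the command-line flags themselves come from the help output above.

```python
import numpy as np


def box_from_coords(coords, padding=5.0):
    """Centre and edge lengths of a box enclosing the coordinates.

    `padding` (in Angstrom) is added on each side; 5.0 is an assumed
    default, not a value prescribed by AutoDock Vina.
    """
    coords = np.asarray(coords, dtype=float)
    lower, upper = coords.min(axis=0), coords.max(axis=0)
    center = (lower + upper) / 2.0
    size = (upper - lower) + 2.0 * padding
    return center, size


def vina_arguments(receptor, ligand, center, size, out):
    """Build the argument list using only flags shown in `vina --help`."""
    args = ["--receptor", receptor, "--ligand", ligand]
    for axis, value in zip("xyz", center):
        args += [f"--center_{axis}", f"{value:.3f}"]
    for axis, value in zip("xyz", size):
        args += [f"--size_{axis}", f"{value:.3f}"]
    args += ["--out", out]
    return args


# Hypothetical ligand coordinates and file names, for illustration only.
ligand_coords = [[0.0, 0.0, 0.0], [2.0, 4.0, 6.0]]
center, size = box_from_coords(ligand_coords)
args = vina_arguments("receptor.pdbqt", "ligand.pdbqt", center, size, "docked.pdbqt")
print(" ".join(args))
```

The resulting list can be appended to the path of the binary (here `/content/vina`) and passed to `subprocess.run`, or pasted after `%vina` in a notebook cell.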

In [None]:
%vina-gpu --help

/bin/bash: /content/Vina-GPU/Vina-GPU: No such file or directory
