<a href="https://colab.research.google.com/github/alanseb92/Protein-Protein-Docking-in-Google-Colab/blob/main/Protein_Protein_Docking_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##**Protein-Protein Docking with MEGADOCK**

This notebook is adapted from the published work of *Masahito Ohue*. If you plan to publish results generated with this notebook, please cite:

**1. MEGADOCK-on-Colab: an easy-to-use protein–protein docking tool on Google Colaboratory**

**2. MEGADOCK 4.0: an ultra–high-performance protein–protein docking software for heterogeneous supercomputers**.

This program may not be used commercially. If you plan to use it commercially, please obtain prior permission from the Ohue lab.

##Upload Your Protein Structures for Computation
Once you run each cell below, an upload button will appear.

In [None]:
from google.colab import files
#@markdown **Receptor PDB** (`-R`)
print("upload receptor PDB file (.pdb)")
rup = files.upload()
rfilename = list(rup.keys())[0]

upload receptor PDB file (.pdb)


Saving 8g0z.pdb to 8g0z.pdb


In [None]:
#@markdown **Ligand PDB** (`-L`)
print("upload ligand PDB files (.pdb)")
lup = files.upload()
lfilename = list(lup.keys())[0]

upload ligand PDB files (.pdb)


Saving 7s15.pdb to 7s15.pdb


In [None]:
#@markdown MEGADOCK parameters (if you want to change)
t = "6" #@param {type:"string"}
#D = 0 #@param {type:"string"}
N = "10800" #@param {type:"string"}
outfile_name = "dock.out" #@param {type:"string"}

In [None]:
#@title Install all the Required software

# MEGADOCK
!git clone https://github.com/akiyamalab/MEGADOCK
!git clone https://github.com/NVIDIA/cuda-samples
!apt-get install -y libfftw3-dev libfftw3-single3

%cd ./MEGADOCK
!make -j 2 -f Makefile.colab

# biopython
!pip -q install biopython

# NGLView
!pip install nglview==3.0.8

!jupyter-nbextension enable nglview --py --sys-prefix

Cloning into 'MEGADOCK'...
remote: Enumerating objects: 484, done.
remote: Counting objects: 100% (71/71), done.
remote: Compressing objects: 100% (54/54), done.
remote: Total 484 (delta 37), reused 29 (delta 17), pack-reused 413 (from 1)
Receiving objects: 100% (484/484), 715.80 KiB | 6.51 MiB/s, done.
Resolving deltas: 100% (240/240), done.
Cloning into 'cuda-samples'...
remote: Enumerating objects: 25919, done.
remote: Counting objects: 100% (13306/13306), done.
remote: Compressing objects: 100% (1407/1407), done.
remote: Total 25919 (delta 12631), reused 11902 (delta 11899), pack-reused 12613 (from 2)
Receiving objects: 100% (25919/25919), 134.14 MiB | 29.09 MiB/s, done.
Resolving deltas: 100% (23005/23005), done.
Updating files: 100% (2498/2498), done.
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libfftw3-bin libfftw3-double3 libfftw3-long3 libfftw3-qu

In [None]:
#@title MEGADOCK parameter setting
import os
os.environ['MDt'] = str(t)
os.environ['MDN'] = str(N)
os.environ['MDPDBR'] = rfilename
os.environ['MDPDBL'] = lfilename
os.environ['MDOF'] = str(outfile_name)

!bash -c "mv /content/$MDPDBR ."
!bash -c "mv /content/$MDPDBL ."

In [None]:
#@title Run MEGADOCK
!./megadock-gpu -R $MDPDBR -L $MDPDBL -t $MDt -N $MDN -o $MDOF

 MEGADOCK ver. 4.1.3 for GPU & single node
      megadock@bi.c.titech.ac.jp   lastupdated: 26 March, 2019

# Using OpenMP parallelization: 2 threads.
# Using CUDA device 0: Tesla T4
# CUFFT version : 11203
# Number of available [threads / GPUs] : [2 / 1]
# Set number of scores per one angle = 6
# Number of output = 10800
# Output file = dock.out
# Using   2 CPU cores, 1 GPUs

Receptor = 8g0z.pdb
Receptor max size = 131.531
Required voxel size = 147.531
Number of grid = 125
FFT N = 250

Ligand = 7s15.pdb
Ligand max size = 107.341
Required voxel size = 113.341
Number of grid = 96
FFT N = 192
Memory requirement (/node)  = 956.0 MB
# GPU Memory : Use 496.9 MB ( 3.0%), Free 14598.1 MB (96.0%), Total 15095.1 MB

---------- Start docking calculations

Ligand = 7s15.pdb
Target receptors:
 8g0z.pdb

   >Ligand rotation =   360 /  3600 ( 0)
   >Ligand rotation =   720 /  3600 ( 0)
   >Ligand rotation =  1080 /  3600 ( 0)
   >Ligand rotation =  1440 /  3600 ( 0)
   >Ligand rotation =  1800 /  360

The cells below visualize the docking result for the best pose. If the viewer does not render, that is fine: the results can be downloaded and analyzed offline using PyMOL, Discovery Studio Visualizer, VMD, Chimera, or any other molecular viewer.
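
For the offline route mentioned above, one option is to bundle the result files into a single archive before downloading. The following is a minimal sketch, not part of the original notebook: `bundle_results`, `download_in_colab`, and the archive name `docking_results.zip` are hypothetical names chosen for illustration. It assumes the files sit in the current working directory and skips any that have not been generated yet.

```python
import zipfile
from pathlib import Path


def bundle_results(paths, archive_name="docking_results.zip"):
    """Zip the result files that exist; return (archive_path, missing_names)."""
    archive = Path(archive_name)
    missing = []
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in map(Path, paths):
            if p.is_file():
                # Store only the file name so the archive has a flat layout.
                zf.write(p, arcname=p.name)
            else:
                missing.append(p.name)
    return archive, missing


def download_in_colab(archive):
    """Trigger a browser download; this only works inside Google Colab."""
    from google.colab import files  # available only in Colab runtimes
    files.download(str(archive))
```

After the decoy cells have been run, a call such as `archive, missing = bundle_results(["dock.out", "complex1.pdb", "complex5s.pdb"])` followed by `download_in_colab(archive)` would fetch the docking output and the generated complexes in one step.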

In [None]:
from google.colab import output
output.enable_custom_widget_manager()

In [None]:
import ipywidgets as widgets
from IPython.display import display
import os
import sys
from urllib.request import urlretrieve
import Bio
from Bio import PDB
from Bio import SeqIO, SearchIO, Entrez
from Bio.Seq import Seq
import pylab
import urllib
import pandas as pd
import nglview as nv
from collections import Counter
from Bio.PDB import PDBParser,MMCIFParser



In [None]:
#@title Show the 1st solution in PDB viewer (NGLView)
!./decoygen lig.1.pdb $MDPDBL $MDOF 1
!cat $MDPDBR lig.1.pdb | sed 's/END//g' > complex1.pdb
from google.colab import output
output.enable_custom_widget_manager()
import nglview as nv
view = nv.show_structure_file("complex1.pdb")
view

NGLWidget()
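
The shell pipeline in the cell above concatenates the receptor and the generated ligand pose, and `sed 's/END//g'` deletes the string `END` wherever it occurs, including inside other text such as header words. As a hedged alternative, the sketch below merges PDB text in Python and drops only lines whose record name is exactly `END`. The helper name `merge_pdb_text` is hypothetical, and the code assumes the standard PDB layout in which the record name occupies the first six columns.

```python
def merge_pdb_text(*pdb_texts):
    """Concatenate PDB texts, dropping END records and appending one final END."""
    merged = []
    for text in pdb_texts:
        for line in text.splitlines():
            # The record name lives in columns 1-6; "ENDMDL" is not matched here.
            if line[:6].strip() == "END":
                continue
            merged.append(line)
    merged.append("END")  # design choice: terminate the merged file once
    return "\n".join(merged) + "\n"
```

A possible use is to read the receptor file and `lig.1.pdb` with `open(...).read()`, pass both strings to `merge_pdb_text`, and write the result to `complex1.pdb`. Unlike the shell version, this leaves a single terminating `END` record rather than blank lines.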

In [None]:
#@title Show the top 5 solutions in PDB viewer (NGLView)
!./decoygen lig.2.pdb $MDPDBL $MDOF 2
!./decoygen lig.3.pdb $MDPDBL $MDOF 3
!./decoygen lig.4.pdb $MDPDBL $MDOF 4
!./decoygen lig.5.pdb $MDPDBL $MDOF 5
!cat complex1.pdb lig.2.pdb lig.3.pdb lig.4.pdb lig.5.pdb | sed 's/END//g' > complex5s.pdb
from google.colab import output
output.enable_custom_widget_manager()
import nglview as nv
view = nv.show_structure_file("complex5s.pdb")
view
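
The `decoygen` calls above all follow the same pattern: output PDB name, ligand PDB, docking output file, and rank. A small sketch of how that pattern could be generated for any number of top-ranked poses is shown below. The helpers `decoygen_commands` and `run_decoygen` are hypothetical, the argument order is copied from the cells above, and actually running the commands assumes the compiled `decoygen` binary is in the current directory.

```python
import subprocess


def decoygen_commands(ligand_pdb, dock_out, top_k=5, binary="./decoygen"):
    """Build one decoygen argument list per rank, from 1 to top_k."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return [
        [binary, f"lig.{rank}.pdb", ligand_pdb, dock_out, str(rank)]
        for rank in range(1, top_k + 1)
    ]


def run_decoygen(ligand_pdb, dock_out, top_k=5):
    """Run each command in turn; raises CalledProcessError if one fails."""
    for cmd in decoygen_commands(ligand_pdb, dock_out, top_k):
        subprocess.run(cmd, check=True)
```

For example, `run_decoygen(lfilename, outfile_name, top_k=5)` would produce `lig.1.pdb` through `lig.5.pdb`, matching the files used in the cells above.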

In [None]:
#@title PPI score calculation (for protein-protein interaction prediction)
!./ppiscore $MDOF $MDN

##Understanding the PPI Score Generated Above in Terms of Interaction
The value of the PPI Score (the value of $E$ in the above cell) can predict whether two protein chains will interact or not.  
The approximate relationship between precision (positive predictive value, PPV) and the PPI Score threshold is shown in the figure below.  
* The precision is about 10% when $E > 8$ is predicted as "PPI positive."
* The precision is about 50% when $E > 10$ is predicted as "PPI positive."
* The precision is about 80% when $E > 12$ is predicted as "PPI positive."

![](http://drive.google.com/uc?export=view&id=1aVl9yRh-E4HXtn6AQ1fjrMb0M7Nx-wxH)
(from Ohue M, _et al_. _Jikkenigaku_, 37(9):1469, 2019.)
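
The three thresholds listed above can be turned into a simple lookup. This is only an illustrative sketch: `PRECISION_BANDS` and `approximate_precision` are hypothetical names, the numbers are the approximate values quoted in the list, and treating them as a step function of $E$ is a simplification of the curve in the figure.

```python
# (threshold on E, approximate precision) pairs quoted from the list above,
# ordered from the strictest threshold to the most permissive one.
PRECISION_BANDS = [(12.0, 0.80), (10.0, 0.50), (8.0, 0.10)]


def approximate_precision(e_score):
    """Return the approximate precision of the strictest threshold that
    e_score exceeds, or None if e_score does not exceed any threshold."""
    for threshold, precision in PRECISION_BANDS:
        if e_score > threshold:
            return precision
    return None
```

For instance, a pair with $E = 11$ exceeds the $E > 10$ threshold but not $E > 12$, so the lookup reports the roughly 50% band.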