<a href="https://colab.research.google.com/github/RyanZR/ColabDock-Vina/blob/main/%F0%9F%8D%8APLIA.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🍊 **PLIA**
_**P**rotein-**L**igand **I**nteraction **A**nalysis_ is a Jupyter Notebook for analyzing protein-ligand binding interactions using **MDAnalysis** and **ProLIF**.


Proceed to [MOUNTAIN_V2.ipynb](https://colab.research.google.com/github/RyanZR/ColabDock-Vina/blob/main/%F0%9F%8D%8AMOUNTAIN_V2.ipynb) to perform single molecular docking.

Proceed to [UNION_V2.ipynb](https://colab.research.google.com/github/RyanZR/ColabDock-Vina/blob/main/%F0%9F%8D%8AUNION_V2.ipynb) to perform virtual screening.

---
---
# **Setting Up the Environment for Interaction Analysis**

Before starting, we need to install all the software and dependencies required for interaction analysis.

+ condacolab (https://github.com/con)
+ MDAnalysis (https://www.mdanalysis.org/)
+ ProLiF (https://github.com/chemosim-lab/ProLIF)
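
Once the installation cell below has finished, a quick check along the following lines can confirm that the packages are importable. This is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper that is not part of this notebook, and the names passed to it are the import names used in the later cells.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found on the current Python path."""
    # find_spec returns None when a top-level module is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names used later in this notebook (hypothetical check, not an official step)
print(missing_modules(["openmm", "pdbfixer", "MDAnalysis", "prolif"]))
```

An empty list suggests that every module was found; any name that is printed would need to be installed again before continuing.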

In [None]:
#@title **Install dependencies**
#@markdown This will take a few minutes, so grab a coffee while you wait. ;-)

# install dependencies

%%capture
import sys
!pip -q install py3Dmol 2>&1 1>/dev/null
!pip install --upgrade MDAnalysis 2>&1 1>/dev/null
!pip install rdkit-pypi
!pip install Cython
!git clone https://github.com/pablo-arantes/ProLIF.git
# Write a small shell script that sets the MDAnalysis requirement in setup.cfg to 2.0.0 and installs ProLIF
prolif_commands = [
    "cd /content/ProLIF",
    "sed -i 's/mdanalysis.*/mdanalysis==2.0.0/' setup.cfg",
    "pip install .",
]
with open('prolif.sh', 'w') as f:
    f.write("\n".join(prolif_commands) + "\n")

!chmod 700 prolif.sh 2>&1 1>/dev/null
!bash prolif.sh >/dev/null 2>&1

# install conda
!wget -qnc https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh 
!bash Miniconda3-latest-Linux-x86_64.sh -bfp /usr/local 2>&1 1>/dev/null
!rm -r Miniconda3-latest-Linux-x86_64.sh /content/sample_data /content/ProLIF prolif.sh
!conda install -y -q -c conda-forge openmm=7.6 python=3.7 pdbfixer 2>&1 1>/dev/null

sys.path.append('/usr/local/lib/python3.7/site-packages/')

In [None]:
# @title **Import Python modules**
# @markdown This imports the modules needed by the rest of the notebook.

import openmm
from openmm.app import PDBFile
from pdbfixer import PDBFixer

import os
import sys
import warnings
import shutil
import IPython.display

import numpy as np
import pandas as pd
from google.colab import drive, files

import MDAnalysis as mda
from MDAnalysis.coordinates import PDB

import prolif as plf
from prolif.plotting.network import LigNetwork

# Capture python output
class Hide:
  def __enter__(self):
    self._original_stdout = sys.stdout
    sys.stdout = open(os.devnull, "w")
  
  def __exit__(self, exc_type, exc_val, exc_tb):
    sys.stdout.close()
    sys.stdout = self._original_stdout

warnings.filterwarnings("ignore")

In [None]:
# @title **Import Google Drive**
# @markdown This allows data to be stored in Google Drive.

# Flush and mount GDrive
with Hide():
  drive.flush_and_unmount()
  drive.mount("/content/drive", force_remount=True)

print("> Mounted at /content/drive")

---
---
# **PLIA** for Single Docking

This section loads the data obtained from [MOUNTAIN_V2.ipynb](https://colab.research.google.com/github/RyanZR/ColabDock-Vina/blob/main/%F0%9F%8D%8AMOUNTAIN_V2.ipynb) and generates ligand interaction networks for analysis.
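
Before running the cells in this section, it can help to confirm that the docking outputs are where the notebook expects them. The following is a minimal sketch using only the standard library; `missing_inputs` is a hypothetical helper that is not part of this notebook, and the folder and file names are the default values of the form fields used in this section.

```python
import os


def missing_inputs(docking_folder, filenames):
    """Return the expected input files that are not present in docking_folder."""
    return [name for name in filenames
            if not os.path.isfile(os.path.join(docking_folder, name))]


# Default values from the form fields in this section (adjust to your own run)
docking_folder = "/content/drive/MyDrive/Docking/7KNX_docking/docking"
print(missing_inputs(docking_folder, ["7KNX_prot_A.pdb", "C26A6_output.sdf"]))
```

Any file name that is printed was not found in the docking folder, which usually means that Google Drive is not mounted or that the work directory path is different.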

In [None]:
# @title **Create and select folders**
# @markdown Enter the path of the **work directory** (no spaces). An `analysis` folder will be created to store the data needed for interaction analysis.

# Define path of folder
GDrive_dir = "/content/drive/MyDrive/Docking/7KNX_docking" #@param {type: "string"}
dir = os.path.abspath(".")
analysis_folder = os.path.join(dir,"analysis")
protein_folder = os.path.join(GDrive_dir,"protein")
ligand_folder = os.path.join(GDrive_dir,"ligand")
experimental_folder = os.path.join(GDrive_dir,"experimental")
docking_folder = os.path.join(GDrive_dir,"docking")

# Create the folder if it does not exist yet
if os.path.exists(analysis_folder):
  print("> %s already exists" % analysis_folder)
else:
  os.mkdir(analysis_folder)
  print("> %s was successfully created" % analysis_folder)

In [None]:
# @title **Protein Preparation**

Protein_pdb = "7KNX_prot_A.pdb" #@param {type : "string"}
protein_pdb_dfile = os.path.join(docking_folder,Protein_pdb)
protein_H_pdb_afile = os.path.join(analysis_folder,Protein_pdb[:-4] + "_H.pdb")

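# Add missing hydrogens at pH 7.4 and write the protonated structure
fix_protein = PDBFixer(protein_pdb_dfile)
fix_protein.addMissingHydrogens(7.4)
with open(protein_H_pdb_afile,"w") as pdb_out:
  PDBFile.writeFile(fix_protein.topology, fix_protein.positions, pdb_out)

# Load the original and the protonated structures
source = mda.Universe(protein_pdb_dfile)
newsrc = mda.Universe(protein_H_pdb_afile)

# Copy the original residue numbers onto the protonated structure
# (both structures must contain the same residues in the same order)
resNum = [res.resid for res in source.residues]
for n,r in enumerate(newsrc.residues):
  r.resid = resNum[n]

# Overwrite the protonated file with the renumbered structure
save = PDB.PDBWriter(protein_H_pdb_afile)
save.write(newsrc)
save.close()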

print("> Hydrogens added to " + Protein_pdb)
print("> " + Protein_pdb[:-4] + "_H.pdb successfully created in " + analysis_folder)

In [None]:
# @title **Ligand Preparation**
# @markdown Select `output.sdf` file of interest.

Ligand_output_sdf = "C26A6_output.sdf" #@param {type:"string"}
ligand_output_sdf_dfile = os.path.join(docking_folder,Ligand_output_sdf)
ligand_output_sdf_afile = os.path.join(analysis_folder,Ligand_output_sdf)

shutil.copy(ligand_output_sdf_dfile, ligand_output_sdf_afile)

print("> " + Ligand_output_sdf + " is selected")
print("> " + Ligand_output_sdf + " successfully copied to " + analysis_folder)

In [None]:
# @title **Generate Interaction Network Dataframe**
# @markdown This loads the protein and the ligand poses and generates a dataframe of their interactions.

# Define variables
plia_output_xlsx = "plia_output.xlsx"
plia_output_xlsx_afile = os.path.join(analysis_folder, plia_output_xlsx)
ligand_name = Ligand_output_sdf[:-11]

# Load protein
protein = mda.Universe(protein_H_pdb_afile)
protein = plf.Molecule.from_mda(protein)

# Load ligands
ligand = list(plf.sdf_supplier(ligand_output_sdf_afile))

# Generate and export interaction network dataframe
fp = plf.Fingerprint()
fp.run_from_iterable(ligand,protein)
results_df = fp.to_dataframe(return_atoms=True)
results_df.to_excel(plia_output_xlsx_afile, sheet_name="Raw data")

# Generate and export one interaction network per docked pose
count = 0
for count, _ in enumerate(ligand, start=1):
  net = LigNetwork.from_ifp(results_df,
                          ligand[0],
                          kind = "frame",
                          frame = count - 1,
                          rotation = 360)
  net.save(os.path.join(analysis_folder, ligand_name + "_" + str(count) + ".html"))

print("> " + plia_output_xlsx + " successfully created in " + analysis_folder)
print("> " + str(count) + " html files of interaction network successfully generated in " + analysis_folder)
print("> Showing protein-ligand interaction network ...")
print("")
results_df

In [None]:
# @title **Show Ligand Interaction Network** {run: "auto"}
Pose = 7 #@param ["1", "2", "3", "4", "5", "6", "7", "8", "9"] {type:"raw"}
ligand_Pose_html = ligand_name + "_" + str(Pose) + ".html"

# Show network
IPython.display.HTML(filename=os.path.join(analysis_folder, ligand_Pose_html))

In [None]:
# @title **Store result in Google Drive**
# @markdown This saves all the files created in the local analysis folder to an `analysis` folder in the Google Drive work directory. The `analysis` folder must not already exist there.

# Define variables
destination_folder = os.path.join(GDrive_dir, "analysis")

# Copy file to GDrive
shutil.copytree(analysis_folder, destination_folder)

print("> Data saved at " + destination_folder)

---
---
# **PLIA** for Virtual Screening

This section loads the data obtained from [UNION_V2.ipynb](https://colab.research.google.com/github/RyanZR/ColabDock-Vina/blob/main/%F0%9F%8D%8AUNION_V2.ipynb) and generates ligand interaction networks for analysis.
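
The Ligand Preparation cell in this section keeps only the first record of each docking output file and treats it as the best pose. The idea can be sketched with plain Python as follows. The sketch assumes that each record ends with a line containing only `$$$$` and that poses are written in ranked order; `first_sdf_record` is a hypothetical helper that is not part of this notebook.

```python
def first_sdf_record(lines):
    """Return the first record of an SDF file given as a list of lines.

    Records are assumed to end with a line containing only "$$$$".
    """
    end = lines.index("$$$$\n")  # raises ValueError if no terminator is found
    return "".join(lines[:end + 1])


# Two toy records standing in for two docked poses (not real SDF content)
poses = ["pose 1\n", "$$$$\n", "pose 2\n", "$$$$\n"]
print(first_sdf_record(poses), end="")
```

Writing the first record of every ligand into one file gives a single multi-record SDF file that can be read in one pass.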

In [None]:
# @title **Select and create folders**
# @markdown Enter the path of the **work directory** (no spaces). An `analysis` folder will be created to store the data needed for interaction analysis.

# Define path of folder
GDrive_dir = "/content/drive/MyDrive/Docking/7KNX_VS_2" #@param {type: "string"}
dir = os.path.abspath(".")
analysis_folder = os.path.join(dir,"analysis")
protein_folder = os.path.join(GDrive_dir,"protein")
ligand_folder = os.path.join(GDrive_dir,"ligand")
experimental_folder = os.path.join(GDrive_dir,"experimental")
docking_folder = os.path.join(GDrive_dir,"docking")

# Create the folder if it does not exist yet
if os.path.exists(analysis_folder):
  print("> %s already exists" % analysis_folder)
else:
  os.mkdir(analysis_folder)
  print("> %s was successfully created" % analysis_folder)

In [None]:
# @title **Protein Preparation**

Protein_pdb = "7KNX_prot_A.pdb" #@param {type : "string"}
protein_pdb_dfile = os.path.join(docking_folder,Protein_pdb)
protein_H_pdb_afile = os.path.join(analysis_folder,Protein_pdb[:-4] + "_H.pdb")

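# Add missing hydrogens at pH 7.4 and write the protonated structure
fix_protein = PDBFixer(protein_pdb_dfile)
fix_protein.addMissingHydrogens(7.4)
with open(protein_H_pdb_afile,"w") as pdb_out:
  PDBFile.writeFile(fix_protein.topology, fix_protein.positions, pdb_out)

# Load the original and the protonated structures
source = mda.Universe(protein_pdb_dfile)
newsrc = mda.Universe(protein_H_pdb_afile)

# Copy the original residue numbers onto the protonated structure
# (both structures must contain the same residues in the same order)
resNum = [res.resid for res in source.residues]
for n,r in enumerate(newsrc.residues):
  r.resid = resNum[n]

# Overwrite the protonated file with the renumbered structure
save = PDB.PDBWriter(protein_H_pdb_afile)
save.write(newsrc)
save.close()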

print("> Hydrogens added to " + Protein_pdb)
print("> " + Protein_pdb[:-4] + "_H.pdb successfully created in " + analysis_folder)

In [None]:
# @title **Ligand Preparation**
# @markdown This collects the best pose of each ligand into a single `.sdf` file.
ligand_output_sdf_dfile = sorted([ os.path.join(docking_folder, f, f + "_output.sdf") for f in os.listdir(docking_folder) if "Lig_" in f ])
ligand_output_sdf_afile = os.path.join(analysis_folder,"bp_output.sdf")

# Keep only the first record (up to and including the first "$$$$" line) of each file
with open(ligand_output_sdf_afile,"w") as g:
  for i in ligand_output_sdf_dfile:
    with open(i,"r") as h:
      f = h.readlines()
    g.write("".join(f[0:f.index("$$$$\n") + 1]))

print("> Best pose of each of the " + str(len(ligand_output_sdf_dfile)) + " output.sdf files extracted to " + os.path.basename(ligand_output_sdf_afile))
print("> " + os.path.basename(ligand_output_sdf_afile) + " successfully created in " + analysis_folder)

In [None]:
# @title **Generate Interaction Networks Dataframe**
# @markdown This loads the protein and the ligands and generates a dataframe of their interactions.

# Define variables
plia_output_xlsx = "plia_output.xlsx"
plia_output_xlsx_afile = os.path.join(analysis_folder, plia_output_xlsx)

# Load protein
protein = mda.Universe(protein_H_pdb_afile)
protein = plf.Molecule.from_mda(protein)

# Load ligands
ligand = list(plf.sdf_supplier(ligand_output_sdf_afile))

# Generate and export interaction network dataframe
fp = plf.Fingerprint()
fp.run_from_iterable(ligand,protein)
results_df = fp.to_dataframe(return_atoms=True)
results_df.to_excel(plia_output_xlsx_afile, sheet_name="Raw data")

# Generate and export one interaction network per ligand
count = 0
leadZero = len(str(len(ligand)))
for count, lig in enumerate(ligand, start=1):
  net = LigNetwork.from_ifp(results_df,
                          lig,
                          kind = "frame",
                          frame = count - 1,
                          rotation = 360)
  net.save(os.path.join(analysis_folder, "Lig_" + str(count).zfill(leadZero) + ".html"))

print("> " + plia_output_xlsx + " successfully generated in " + analysis_folder)
print("> " + str(count) + " html files of interaction network successfully generated in " + analysis_folder)
print("> Showing protein-ligand interaction network ...")
print("")
results_df

In [None]:
# @title **Show Ligand Interaction Network** {run: "auto"}
# @markdown Insert the name of the ligand of interest.

Load_ligand = "Lig_049" #@param {type:"string"}
Load_ligand_html = Load_ligand + ".html"

# Show network
IPython.display.HTML(filename=os.path.join(analysis_folder, Load_ligand_html))

In [None]:
# @title **Store result in Google Drive**
# @markdown This saves all the files created in the local analysis folder to an `analysis` folder in the Google Drive work directory. The `analysis` folder must not already exist there.

# Define variables
destination_folder = os.path.join(GDrive_dir, "analysis")

# Copy file to GDrive
shutil.copytree(analysis_folder, destination_folder)

print("> Data saved at " + destination_folder)