# Classical Molecular Interaction Potentials tutorial using BioExcel Building Blocks (biobb)

***
This tutorial aims to illustrate the process of computing **classical molecular interaction potentials** from **protein structures**, step by step, using the **BioExcel Building Blocks library (biobb)**. Examples shown are **Molecular Interaction Potentials (MIPs) grids, protein-protein/ligand interaction potentials, and protein titration**. The particular structures used are the **Lysozyme** protein (PDB code [1AKI](https://www.rcsb.org/structure/1aki)), and an MD simulation of the complex formed by the **SARS-CoV-2 Receptor Binding Domain and the human Angiotensin Converting Enzyme 2** (PDB code [6VW1](https://www.rcsb.org/structure/6vw1)).

The code wrapped is the ***Classical Molecular Interaction Potentials (CMIP)*** code:

**Classical molecular interaction potentials: Improved setup procedure in molecular dynamics simulations of proteins.**
*Gelpí, J.L., Kalko, S.G., Barril, X., Cirera, J., de la Cruz, X., Luque, F.J. and Orozco, M. (2001)*
*Proteins, 45: 428-437. https://doi.org/10.1002/prot.1159*
***

## Settings

### Biobb modules used

 - [biobb_io](https://github.com/bioexcel/biobb_io): Tools to fetch biomolecular data from public databases.
 - [biobb_cmip](https://github.com/bioexcel/biobb_cmip): Tools to compute classical molecular interaction potentials from protein structures.
 - [biobb_structure_utils](https://github.com/bioexcel/biobb_structure_utils): Tools to modify or extract information from a PDB structure.
  
### Auxiliary libraries used

 - [nb_conda_kernels](https://github.com/Anaconda-Platform/nb_conda_kernels): Enables a Jupyter Notebook or JupyterLab application in one conda environment to access kernels for Python, R, and other languages found in other environments.
 - [ipywidgets](https://github.com/jupyter-widgets/ipywidgets): Interactive HTML widgets for Jupyter notebooks and the IPython kernel.
 - [nglview](http://nglviewer.org/#nglview): Jupyter/IPython widget to interactively view molecular structures and trajectories in notebooks.
 - [plotly](https://plotly.com/python/): Python Open Source Graphing Library. 


### Conda Installation and Launch

```console
git clone https://github.com/bioexcel/biobb_wf_cmip.git
cd biobb_wf_cmip
conda env create -f conda_env/environment.yml
conda activate biobb_wf_cmip
jupyter-nbextension enable --py --user widgetsnbextension
jupyter-notebook biobb_wf_cmip/notebooks/biobb_wf_cmip.ipynb
```

***
## Pipeline steps
 1. [Input Parameters](#input)
 2. [Fetching PDB structure](#fetch)
 3. [PDB preparation (from PDB databank)](#preparePDB)
 4. [Structural water molecules & ions](#titration)
 5. [Molecular Interaction Potentials](#mips)
    1. [MIP+](#mip_pos) 
    2. [MIP-](#mip_neg) 
    3. [MIPn](#mip_neutral) 
 6. [PDB preparation (from MD)](#preparePDB_MD)
 7. [Interaction Potentials](#interaction)
    1. [Ligand](#ligand)
    2. [Protein](#protein)
 8. [Questions & Comments](#questions)
 
***
<img src="https://bioexcel.eu/wp-content/uploads/2019/04/Bioexcell_logo_1080px_transp.png" alt="Bioexcel2 logo"
	title="Bioexcel2 logo" width="400" />
***

<a id="input"></a>
## Input parameters
**Input parameters** needed:
 - **pdbCode**: PDB code of the protein structure (e.g. 1AKI)
 - **MDCode**: Code of the Molecular Dynamics trajectory (e.g. RBD-hACE2)
     - **inputPDB_MD**: MD reference structure (PDB format)
     - **inputTOP_MD**: MD topology (Amber Parmtop7 format)

In [1]:
import nglview
import ipywidgets
import plotly
from plotly import subplots
import plotly.graph_objs as go

pdbCode = "1aki"

MDCode = "RBD-hACE2"
inputPDB_MD = "Files/" + MDCode + ".pdb" 
inputTOP_MD = "Files/" + MDCode + ".top" 
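
Since the MD reference structure and topology are read from local files, it can help to confirm that they exist before launching the later steps. The following is a minimal sketch using only the standard library; the helper name `missing_inputs` is an assumption made for illustration and is not part of biobb.

```python
from pathlib import Path


def missing_inputs(paths):
    """Return the paths (as strings) that do not point to an existing file."""
    return [str(p) for p in map(Path, paths) if not p.is_file()]


# Same naming scheme as the input-parameters cell
MDCode = "RBD-hACE2"
inputPDB_MD = "Files/" + MDCode + ".pdb"
inputTOP_MD = "Files/" + MDCode + ".top"

missing = missing_inputs([inputPDB_MD, inputTOP_MD])
if missing:
    print("Missing input files:", ", ".join(missing))
else:
    print("All MD input files found")
```

Running the check from the notebook folder reports which of the two MD files, if any, still need to be copied into `Files/`.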



<a id="fetch"></a>
***
## Fetching PDB structure
Downloading **PDB structure** with the **protein molecule** from the RCSB PDB database.<br>
Alternatively, a **PDB file** can be used as starting structure. <br>
***
**Building Blocks** used:
 - [Pdb](https://biobb-io.readthedocs.io/en/latest/api.html#module-api.pdb) from **biobb_io.api.pdb**
***

In [2]:
# Downloading desired PDB file 
# Import module
from biobb_io.api.pdb import pdb

# Create properties dict and inputs/outputs
downloaded_pdb = pdbCode+'.pdb'
prop = {
    'pdb_code': pdbCode,
    'api_id' : 'mmb'
}

#Create and launch bb
pdb(output_pdb_path=downloaded_pdb,
    properties=prop)

2022-08-11 12:31:18,670 [MainThread  ] [INFO ]  Downloading: 1aki from: http://mmb.irbbarcelona.org/api/pdb/1aki/coords/?
2022-08-11 12:31:18,705 [MainThread  ] [INFO ]  Writting pdb to: 1aki.pdb
2022-08-11 12:31:18,706 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'MODEL', 'ENDMDL']


0

<a id="vis3D"></a>
### Visualizing 3D structure
Visualizing the downloaded/given **PDB structure** using **NGL**: 

In [3]:
# Show protein
view = nglview.show_structure_file(downloaded_pdb)
view.add_representation(repr_type='ball+stick', selection='all')
view._remote_call('setSize', target='Widget', args=['','600px'])
view

NGLWidget()

<a id="preparePDB"></a>
***
## PDB Preparation (from PDB structure)
The **CMIP** tool needs additional information (e.g. charges, elements) in the **structure PDB file** to run properly. A specific **BioBB building block** (prepare_pdb) is used in the next cell to prepare the **input PDB file** by adding this extra information. **Charges and elements** are taken from an internal **CMIP library** based on the **AMBER force fields**.
***
**Building Blocks** used:
 - [prepare_pdb](https://biobb-cmip.readthedocs.io/en/latest/cmip.html#module-cmip.prepare_pdb) from **biobb_cmip.cmip.prepare_pdb**
***

In [4]:
from biobb_cmip.cmip.prepare_pdb import prepare_pdb

cmipPDB = pdbCode + ".cmip.pdb"

prepare_pdb(input_pdb_path=downloaded_pdb,
            output_cmip_pdb_path=cmipPDB
)

2022-08-11 12:31:25,670 [MainThread  ] [INFO ]  Not using any container
2022-08-11 12:31:25,671 [MainThread  ] [INFO ]  check_structure -v -i 1aki.pdb -o 1aki.cmip.pdb --output_format cmip --non_interactive command_list --list 'water --remove yes; backbone --add_caps none; fixside --fix All; add_hydrogen --add_mode auto --add_charges CMIP'

2022-08-11 12:31:26,483 [MainThread  ] [INFO ]  Exit code 0

=                   BioBB structure checking utility v3.9.11                   =
=            P. Andrio, A. Hospital, G. Bayarri, J.L. Gelpi 2018-22           =

Structure 1aki.pdb loaded
 Title: 
 Experimental method: unknown
 Resolution (A): N.A.

 Num. models: 1
 Num. chains: 1 (A: Protein)
 Num. residues:  129
 Num. residues with ins. codes:  0
 Num. HETATM residues:  0
 Num. ligands or modified residues:  0
 Num. water mol.:  0
 Num. atoms:  1001


Step 1: water --remove yes
Running water. Options: --remove yes
No water molecules found

Step 2:  backbone --add_caps none
Running backbo

0
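
The structure summary printed by the checking utility (129 residues, 1001 atoms, one chain) can be cross-checked independently. The sketch below counts atoms, residues and chains from `ATOM`/`HETATM` records using the standard fixed-column PDB layout (chain ID in column 22, residue number in columns 23-26, insertion code in column 27). The function name `summarize_pdb` and the toy records are assumptions for illustration; this is not how the building block itself works.

```python
def summarize_pdb(lines):
    """Count atoms, distinct residues and chains in PDB ATOM/HETATM records."""
    atoms = 0
    residues = set()
    chains = set()
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atoms += 1
        chain = line[21:22]
        # A residue is identified by chain, residue number and insertion code
        residues.add((chain, line[22:26].strip(), line[26:27].strip()))
        chains.add(chain)
    return {"atoms": atoms, "residues": len(residues), "chains": sorted(chains)}


# Toy records with made-up coordinates (not taken from 1AKI)
toy_records = [
    "ATOM      1  N   LYS A   1       0.000   0.000   0.000",
    "ATOM      2  CA  LYS A   1       1.500   0.000   0.000",
    "ATOM     10  N   VAL A   2       3.000   0.000   0.000",
    "TER",
]
print(summarize_pdb(toy_records))

# With a real file, for example:
# with open("1aki.pdb") as handle:
#     print(summarize_pdb(handle))
```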

<a id="titration"></a>
***
## Structural water molecules & ions
One of the many steps involved in the **MD structure setup process** is the addition of **solvent and counterions** (when working with explicit solvent). **Solvent molecules** and **counterions** are usually integrated on the **structure surface** in two steps:
- **Structural waters/ions**: A **first shell** of **water molecules and ions** is commonly added in the **energetically most favorable positions** on the surface of the structure. It is a computationally expensive process and is usually reduced to just tens of **water molecules** and **ions** (depending on the structure size).
- **Solvent box/ionic concentration**: A box of **solvent molecules** is created surrounding the original structure, and an additional number of **ions** are added until reaching a desired **ionic concentration**.

Whereas the second step is integrated in all the **MD packages**, the first one is rarely available. **CMIP**, through the **titration building block** of **biobb_cmip**, helps with this task.
***
**Building Blocks** used:
 - [titration](https://biobb-cmip.readthedocs.io/en/latest/cmip.html#module-cmip.titration) from **biobb_cmip.cmip.titration**
 - [cat_pdb](https://biobb-structure-utils.readthedocs.io/en/latest/utils.html#module-utils.cat_pdb) from **biobb_structure_utils.utils.cat_pdb**
***

<a id="run_titration"></a>
### Computing structural water molecules & ions positions
Computing the positions of **20 structural water molecules**, **5 positive** and **5 negative ions** in the most **energetically favourable** regions of the **structure surface**.

In [83]:
from biobb_cmip.cmip.titration import titration

wat_ions_pdb = pdbCode + ".wat_ions.pdb"
wat_ions_log = pdbCode + ".wat_ions.log"

prop = { 
#    'neutral' : True, # Can be also used to neutralize the system
    'num_positive_ions' : 5,
    'num_negative_ions' : 5,    
    'num_wats' : 20
}

titration(input_pdb_path=cmipPDB,
          output_pdb_path=wat_ions_pdb,
          output_log_path=wat_ions_log,
          properties=prop)

2022-07-04 08:56:50,281 [MainThread  ] [INFO ]  Not using any container
2022-07-04 08:56:50,285 [MainThread  ] [INFO ]  titration -i bfd92208-f5e4-4a45-80aa-1b23f3c1712c/params -vdw /anaconda3/envs/biobb_CMIP_tutorial/share/cmip/dat/vdwprm -hs 1aki.cmip.pdb -outpdb 1aki.wat_ions

2022-07-04 08:57:48,473 [MainThread  ] [INFO ]  Exit code 0

 =                                               =
 =                C M I P  (2.7.0)               =
 =          J. Ll. Gelpi, A. Morreale,           =
 =            F. J. Luque, M.Orozco              =
 =      Dept. Biochemistry. Univ. Barcelona      =
 =                  1999-2021                    =

  Code for ASA calculation by Juan F. Recio       

#T Run started at    8:56:50 h on  4- 7-2022
 SIZES
 -----
 MAXTIP:           100
 MAXATH:            80000
 MAXATP x MAXCONF: 100000

 INPUT FILES
 -----------
    Calc. settings: bfd92208-f5e4-4a45-80aa-1b23f3c1712c/params                                                                           

2022-07-04 08:57:48,479 [MainThread  ] [INFO ]  Removed: ['bfd92208-f5e4-4a45-80aa-1b23f3c1712c']


0
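
The cell above fixes the number of ions by hand and mentions that a `neutral` property can be used instead to neutralize the system. To get a feeling for how many counterions neutralization would require, the net charge can be roughly estimated from the residue names. This is a simplified sketch under stated assumptions (formal side-chain charges near neutral pH, neutral histidine, termini and pKa shifts ignored); it is not the procedure used by CMIP, and the function names are hypothetical.

```python
# Formal side-chain charges near neutral pH (simplifying assumption)
SIDE_CHAIN_CHARGE = {"ARG": 1, "LYS": 1, "ASP": -1, "GLU": -1}


def estimate_net_charge(residue_names):
    """Sum formal side-chain charges over a list of three-letter residue names."""
    return sum(SIDE_CHAIN_CHARGE.get(name.upper(), 0) for name in residue_names)


def counterions_to_neutralize(net_charge):
    """Return (num_positive_ions, num_negative_ions) of monovalent ions."""
    if net_charge > 0:
        return (0, net_charge)
    return (-net_charge, 0)


# Toy sequence fragment, for illustration only
fragment = ["LYS", "VAL", "PHE", "GLY", "ARG", "CYS", "GLU", "LEU"]
charge = estimate_net_charge(fragment)
print("Estimated net charge:", charge)
print("Counterions (positive, negative):", counterions_to_neutralize(charge))
```

A positively charged protein would need negative counterions and vice versa; the explicit `num_positive_ions` and `num_negative_ions` values used in the tutorial add ions of both signs regardless of the net charge.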

<a id="catPDB_tit"></a>
### Adding structural water molecules & ions
Adding the 20 + 10 computed **structural water molecules** and **ions** to the original **PDB file**.

In [84]:
from biobb_structure_utils.utils.cat_pdb import cat_pdb

titPDB = pdbCode + ".tit.pdb"

cat_pdb(input_structure1=cmipPDB,
       input_structure2=wat_ions_pdb,
       output_structure_path=titPDB)

2022-07-04 08:58:57,241 [MainThread  ] [INFO ]  Removed: []


0

<a id="visTIT"></a>
### Visualizing structural water molecules & ions
Visualizing the recently added **structural water molecules and ions**. 

In [85]:
view = nglview.show_structure_file(titPDB)
view.clear_representations()
view.add_representation(repr_type='cartoon', selection='protein', color='sstruc')
#view.add_representation(repr_type='surface', selection='protein', radius='0.2', color='grey', opacity='0.2')
#view.add_representation(repr_type='licorice', radius='.5', selection='water')
view.add_representation(repr_type='spacefill', selection='water')
view.add_representation(repr_type='spacefill', selection='.Na', color='element')
view.add_representation(repr_type='spacefill', selection='.Cl', color='element')

#view.add_representation(repr_type='cartoon',selection='not het',colorScheme = 'atomindex')
#view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

NGLWidget()

<a id="mips"></a>
***
## Molecular Interaction Potentials (MIPs)
**Molecular interaction potentials (MIPs)** are field properties arising from the interaction of a **probe** (e.g., methyl, proton or water) with a molecule. They are calculated on the surface of the molecule, using a grid defined around the structure.

**MIPs** are among the most important molecular properties for relating **molecular and binding data** (e.g. *3D Quantitative Structure-Activity Relationships, 3D-QSAR*), and are extensively applied in **drug discovery** processes.

In this example, three different **MIPs** are used, with a **Water Oxygen atom** as a probe:
 - **Positive** MIP - highlighting the protein regions with **higher affinity** to **negatively charged groups**.
 - **Negative** MIP - highlighting the protein regions with **higher affinity** to **positively charged groups**.
 - **Neutral** MIP - highlighting the protein regions with **lower affinity** to **charged groups**.
***
**Building Blocks** used:
 - [cmip](https://biobb-cmip.readthedocs.io/en/latest/cmip.html#module-cmip.cmip) from **biobb_cmip.cmip.cmip**
***
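
To make the idea of a probe scanned over a grid more concrete, the following toy sketch evaluates a plain Coulomb energy between a point-charge probe and a set of atomic point charges at each grid point. It is only an illustration of the concept: CMIP uses its own energy terms and solvent treatment, and the function names, the dielectric constant and the distance clamp below are assumptions chosen for this example.

```python
import itertools
import math

COULOMB_K = 332.06  # kcal*Angstrom/(mol*e^2), usual molecular-mechanics constant


def probe_energy(point, atoms, probe_charge, dielectric=80.0, min_dist=1.0):
    """Coulomb energy (kcal/mol) of a point-charge probe placed at `point`.

    atoms: iterable of (x, y, z, charge). Distances are clamped to min_dist
    to avoid the singularity on top of an atom.
    """
    energy = 0.0
    for x, y, z, charge in atoms:
        r = max(math.dist(point, (x, y, z)), min_dist)
        energy += COULOMB_K * charge * probe_charge / (dielectric * r)
    return energy


def most_favourable_point(atoms, center, half_width, spacing, probe_charge):
    """Scan a cubic grid and return (energy, point) with the lowest energy."""
    steps = int(round(2 * half_width / spacing)) + 1
    axes = [[c - half_width + i * spacing for i in range(steps)] for c in center]
    return min((probe_energy(p, atoms, probe_charge), p)
               for p in itertools.product(*axes))


# Toy "molecule": one negative and one positive point charge
toy_atoms = [(0.0, 0.0, 0.0, -1.0), (4.0, 0.0, 0.0, 1.0)]
energy, point = most_favourable_point(toy_atoms, (2.0, 0.0, 0.0), 4.0, 1.0, probe_charge=1.0)
print("Most favourable position for a positive probe:", point, "energy:", round(energy, 3))
```

Negative energies mark favourable positions for the probe, which is why the visualization cells draw isosurfaces at negative isolevels.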

<a id="mip_pos"></a>
### Positive MIP

In [5]:
from biobb_cmip.cmip.cmip import cmip

mip_pos_log = pdbCode + ".mip_pos.log"
mip_pos_cube = pdbCode + ".mip_pos.cube"

prop = { 
    'execution_type' : 'mip_pos',
    'remove_tmp' : False,
    #'binary_path' : "/Users/hospital/COVID/CMIP/CMIP-master-68aefeae92993bbaa7234a8f5010cc42264624d7/src/cmip"
}

cmip(input_pdb_path=cmipPDB,
          #output_pdb_path='output.pdb',  # If added, python crashes with output_pdb_path not exists!!
          output_log_path=mip_pos_log,
          output_cube_path=mip_pos_cube,
          properties=prop)

2022-08-11 12:31:31,401 [MainThread  ] [INFO ]  Not using any container
2022-08-11 12:31:31,402 [MainThread  ] [INFO ]  cmip -i 90b91022-7058-40ba-879b-277b22cb315c/params -vdw /home/adam.local/anaconda3/envs/biobb_CMIP_tutorial/share/cmip/dat/vdwprm -hs 1aki.cmip.pdb -cube 1aki.mip_pos.cube -o 1aki.mip_pos.log

2022-08-11 12:31:34,824 [MainThread  ] [INFO ]  Exit code 0



0

<a id="visMIP1"></a>
### Visualizing 3D structure
Visualizing the **positive MIP grid**, which highlights the protein regions with **higher affinity** to **negatively charged groups**.

In [6]:
view = nglview.show_structure_file(mip_pos_cube)
view.add_component(cmipPDB)
view.clear_representations()
view.add_representation(repr_type='cartoon', selection='protein', color='sstruc')
view.add_surface(isolevelType="value", isolevel=-5, color="blue")
view.component_1.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

NGLWidget()
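
The isolevel passed to `add_surface` determines how much of the grid the isosurface encloses, so it can be useful to inspect the range of values stored in the cube file before choosing it. The sketch below assumes the output follows the common Gaussian cube text layout (two comment lines, one line with the atom count and origin, three lines with the grid dimensions, the atom lines, then the grid values). The function names and the toy grid are hypothetical.

```python
def read_cube_values(text):
    """Return (shape, values) from the text of a Gaussian-style cube file."""
    lines = text.splitlines()
    n_atoms = int(lines[2].split()[0])
    if n_atoms < 0:
        raise ValueError("orbital cubes (negative atom count) are not handled here")
    # The number of grid points per axis may be negative to flag the units
    shape = tuple(abs(int(lines[i].split()[0])) for i in (3, 4, 5))
    values = [float(v) for line in lines[6 + n_atoms:] for v in line.split()]
    if len(values) != shape[0] * shape[1] * shape[2]:
        raise ValueError("grid size in the header does not match the data block")
    return shape, values


def fraction_below(values, isolevel):
    """Fraction of grid points with a value at or below the isolevel."""
    return sum(v <= isolevel for v in values) / len(values)


# Tiny 2x2x2 toy grid with made-up values
TOY_CUBE = """\
toy MIP grid
second comment line
    1    0.0 0.0 0.0
    2    1.0 0.0 0.0
    2    0.0 1.0 0.0
    2    0.0 0.0 1.0
    8    0.0 0.5 0.5 0.5
 -6.0 -1.0 0.0 2.0 -5.5 0.5
 1.0 -0.2
"""

shape, values = read_cube_values(TOY_CUBE)
print("Grid shape:", shape, "min:", min(values), "max:", max(values))
print("Fraction of points at or below -5:", fraction_below(values, -5))

# With a real grid, for example:
# with open("1aki.mip_pos.cube") as handle:
#     shape, values = read_cube_values(handle.read())
```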

<a id="mip_neg"></a>
### Negative MIP

In [9]:
from biobb_cmip.cmip.cmip import cmip

mip_neg_log = pdbCode + ".mip_neg.log"
mip_neg_cube = pdbCode + ".mip_neg.cube"

prop = { 
    'execution_type' : 'mip_neg',
    #'binary_path' : "/Users/hospital/COVID/CMIP/CMIP-master-68aefeae92993bbaa7234a8f5010cc42264624d7/src/cmip"
}

cmip(input_pdb_path=cmipPDB,
          #output_pdb_path='output.pdb',  # If added, python crashes with output_pdb_path not exists!!
          output_log_path=mip_neg_log,
          output_cube_path=mip_neg_cube,
          properties=prop)

2022-07-27 15:59:28,945 [MainThread  ] [INFO ]  Not using any container
2022-07-27 15:59:28,946 [MainThread  ] [INFO ]  cmip -i 55c8dfa1-da44-43bd-b097-dc7eded761bf/params -vdw /opt/anaconda3/envs/biobb_CMIP_tutorial/share/cmip/dat/vdwprm -hs 1aki.cmip.pdb -cube 1aki.mip_neg.cube -o 1aki.mip_neg.log

2022-07-27 16:01:17,376 [MainThread  ] [INFO ]  Exit code 0

2022-07-27 16:01:17,380 [MainThread  ] [INFO ]  Removed: ['55c8dfa1-da44-43bd-b097-dc7eded761bf']


0

<a id="visMIP2"></a>
### Visualizing 3D structure
Visualizing the **negative MIP grid**, which highlights the protein regions with **higher affinity** to **positively charged groups**.

In [10]:
view = nglview.show_structure_file(mip_neg_cube)
view.add_component(cmipPDB)
view.clear_representations()
view.add_representation(repr_type='cartoon', selection='protein', color='sstruc')
view.add_surface(isolevelType="value", isolevel=-5, color="red")
view.component_1.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

NGLWidget()

<a id="mip_neutral"></a>
### Neutral MIP

In [11]:
from biobb_cmip.cmip.cmip import cmip

mip_neutral_log = pdbCode + ".mip_neutral.log"
mip_neutral_cube = pdbCode + ".mip_neutral.cube"

prop = { 
    'execution_type' : 'mip_neu'
}

cmip(input_pdb_path=cmipPDB,
          #output_pdb_path='output.pdb',  # If added, python crashes with output_pdb_path not exists!!
          output_log_path=mip_neutral_log,
          output_cube_path=mip_neutral_cube,
          properties=prop)

2022-07-27 16:01:25,186 [MainThread  ] [INFO ]  Not using any container
2022-07-27 16:01:25,187 [MainThread  ] [INFO ]  cmip -i a4d7dfba-ac38-401d-9256-615198261a18/params -vdw /opt/anaconda3/envs/biobb_CMIP_tutorial/share/cmip/dat/vdwprm -hs 1aki.cmip.pdb -cube 1aki.mip_neutral.cube -o 1aki.mip_neutral.log

2022-07-27 16:03:11,476 [MainThread  ] [INFO ]  Exit code 0

2022-07-27 16:03:11,478 [MainThread  ] [INFO ]  Removed: ['a4d7dfba-ac38-401d-9256-615198261a18']


0

<a id="visMIP3"></a>
### Visualizing 3D structure
Visualizing the **neutral MIP grid**, which highlights the protein regions with **lower affinity** to **charged groups**.

In [12]:
view = nglview.show_structure_file(mip_neutral_cube)
view.add_component(cmipPDB)
view.clear_representations()
view.add_representation(repr_type='cartoon', selection='protein', color='sstruc')
view.add_surface(isolevelType="value", isolevel=-1, color="grey")
view.component_1.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

NGLWidget()

<a id="visMIP4"></a>
### Visualizing 3D structure
Visualizing all **MIP grids**, for comparison purposes.

In [13]:
# Show the three MIP grids side by side (for comparison)
# Positive MIP (blue)
view1 = nglview.show_structure_file(cmipPDB)
view1.add_component(mip_pos_cube)
view1.component_0.add_representation(repr_type='cartoon', selection='protein', color='sstruc')
view1.component_1.add_surface(isolevelType="value", isolevel=-5, color="blue")
view1.component_0.center()
view1._remote_call('setSize', target='Widget', args=['350px','400px'])
view1.camera='orthographic'
# Negative MIP (red)
view2 = nglview.show_structure_file(cmipPDB)
view2.add_component(mip_neg_cube)
view2.component_0.add_representation(repr_type='cartoon', selection='protein', color='sstruc')
view2.component_1.add_surface(isolevelType="value", isolevel=-5, color="red")
view2.component_0.center()
view2._remote_call('setSize', target='Widget', args=['350px','400px'])
view2.camera='orthographic'
# Neutral MIP (grey)
view3 = nglview.show_structure_file(cmipPDB)
view3.add_component(mip_neutral_cube)
view3.component_0.add_representation(repr_type='cartoon', selection='protein', color='sstruc')
view3.component_1.add_surface(isolevelType="value", isolevel=-1, color="grey")
view3.component_0.center()
view3._remote_call('setSize', target='Widget', args=['350px','400px'])
view3.camera='orthographic'
# Only the last expression of a cell is displayed
ipywidgets.HBox([view1, view2, view3])

HBox(children=(NGLWidget(), NGLWidget(), NGLWidget()))

<a id="interaction"></a>
***
## Interaction Potential Energies 

Closely related to the previous study of **Molecular Interaction Potentials**, the **Interaction Potential Energies** calculation computes the contributions to the **total energy** of the system from the different **interactions between the subunits of the molecule** considered. These **interaction energies** usually depend on the relation between the **charge** and **positions** of the units studied (e.g. *electrostatic, van der Waals*) and the **solvation energy** (energy released when a compound is dissolved in a solvent).
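
As a rough illustration of how such charge- and position-dependent terms add up per atom, the sketch below sums a Coulomb term and a Lennard-Jones 12-6 term between each probe atom and all host atoms. It is a toy model under strong assumptions (a single `epsilon` and `sigma` for every atom pair, a constant dielectric, no solvation term) and is not the energy function implemented in CMIP; all names are hypothetical.

```python
import math

COULOMB_K = 332.06  # kcal*Angstrom/(mol*e^2)


def pair_energy(r, q1, q2, epsilon, sigma, dielectric=1.0):
    """Return (electrostatic, van der Waals) energy in kcal/mol at distance r."""
    elec = COULOMB_K * q1 * q2 / (dielectric * r)
    sr6 = (sigma / r) ** 6
    vdw = 4.0 * epsilon * (sr6 * sr6 - sr6)
    return elec, vdw


def per_atom_interaction(probe_atoms, host_atoms, epsilon=0.1, sigma=3.4):
    """For each probe atom, sum its interaction with every host atom.

    Atoms are (x, y, z, charge) tuples. Returns a list of (elec, vdw) sums,
    one entry per probe atom.
    """
    results = []
    for px, py, pz, pq in probe_atoms:
        elec_sum = vdw_sum = 0.0
        for hx, hy, hz, hq in host_atoms:
            r = math.dist((px, py, pz), (hx, hy, hz))
            elec, vdw = pair_energy(r, pq, hq, epsilon, sigma)
            elec_sum += elec
            vdw_sum += vdw
        results.append((elec_sum, vdw_sum))
    return results


# Toy system: two probe atoms facing two host atoms (made-up coordinates)
probe = [(0.0, 0.0, 0.0, 0.5), (0.0, 1.5, 0.0, -0.5)]
host = [(4.0, 0.0, 0.0, -0.5), (4.0, 1.5, 0.0, 0.5)]
for index, (elec, vdw) in enumerate(per_atom_interaction(probe, host)):
    print(f"probe atom {index}: elec={elec:.3f} vdw={vdw:.3f} kcal/mol")
```

Summing the per-atom values over the atoms of a residue gives a per-residue contribution, which is the kind of breakdown used to spot key residues at an interface.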

**Interaction Potential Energies** give useful insights into the **macromolecular interaction** process, making it possible to identify **key residues** involved in the interaction, which makes them another key component of the **drug discovery** process.

To illustrate the calculation of the **interaction potentials** between two subunits of a **structure complex** (e.g. protein-protein, protein-ligand), the example of the **SARS-CoV-2 Receptor Binding Domain and the human Angiotensin Converting Enzyme 2** is used. 

***
**Building Blocks** used:
 - [extract_chain](https://biobb-structure-utils.readthedocs.io/en/latest/utils.html#module-utils.extract_chain) from **biobb_structure_utils.utils.extract_chain**
 - [prepare_structure](https://biobb-cmip.readthedocs.io/en/latest/cmip.html#module-cmip.prepare_structure) from **biobb_cmip.cmip.prepare_structure**
 - [cmip](https://biobb-cmip.readthedocs.io/en/latest/cmip.html#module-cmip.cmip) from **biobb_cmip.cmip.cmip**
***

<a id="preparePDB_MD"></a>
***
## PDB Preparation (from Molecular Dynamics topology)
When working with **structure conformations** taken from **MD simulations**, it is recommended to use the **charges, atom types and elements** considered in the simulation. Usually this information is stored in the so-called ***topology*** files. A specific **building block** (***prepare_structure***) is available to extract this information from an **MD topology file** and use it for the **CMIP calculations**.
 
The next cells take one frame of the **MD simulation**, split the subunits (chains) into two different **PDB files**, and prepare them to be used in **CMIP**, taking the **charges and elements** used in the simulation from the **MD topology file**.
***
**Building Blocks** used:
 - [extract_chain](https://biobb-structure-utils.readthedocs.io/en/latest/utils.html#module-utils.extract_chain) from **biobb_structure_utils.utils.extract_chain**
 - [prepare_structure](https://biobb-cmip.readthedocs.io/en/latest/cmip.html#module-cmip.prepare_structure) from **biobb_cmip.cmip.prepare_structure**
***

<a id="visRBD-hACE2"></a>
### Visualizing 3D structure
Visualizing the original **SARS-CoV-2 Receptor Binding Domain** (blue chain) and the **human Angiotensin Converting Enzyme 2** (red chain) structure complex, taken from an **MD simulation trajectory**.

In [7]:
view = nglview.show_structure_file(inputPDB_MD)
view.clear_representations()
view.add_representation(repr_type='cartoon', selection='protein', color='chainname')
view._remote_call('setSize', target='Widget', args=['','400px'])
view

NGLWidget()

### WARNING: ZN is not working, check

### Extracting chains 
Saving **hACE2** (chain A) and **RBD** (chain B) in two different **PDB files**. 

In [8]:
from biobb_structure_utils.utils.extract_chain import extract_chain

inputPDB_MD_hACE2 = MDCode + ".hACE2.pdb"
inputPDB_MD_RBD = MDCode + ".RBD.pdb"

prop = {
    'chains': [ 'A' ]
}
extract_chain(input_structure_path=inputPDB_MD,
            output_structure_path=inputPDB_MD_hACE2,
            properties=prop)

prop = {
    'chains': [ 'B' ]
}
extract_chain(input_structure_path=inputPDB_MD,
            output_structure_path=inputPDB_MD_RBD,
            properties=prop)

2022-08-11 12:31:52,094 [MainThread  ] [INFO ]  Selected Chains: A
2022-08-11 12:31:52,096 [MainThread  ] [INFO ]  Not using any container
2022-08-11 12:31:52,097 [MainThread  ] [INFO ]  check_structure -i /home/adam.local/biobb/biobb_tutorials_dev/biobb_wf_cmip/biobb_wf_cmip/notebooks/Files/RBD-hACE2.pdb -o RBD-hACE2.hACE2.pdb --force_save chains --select A

2022-08-11 12:31:52,898 [MainThread  ] [INFO ]  Exit code 0

=                   BioBB structure checking utility v3.9.11                   =
=            P. Andrio, A. Hospital, G. Bayarri, J.L. Gelpi 2018-22           =

Structure /home/adam.local/biobb/biobb_tutorials_dev/biobb_wf_cmip/biobb_wf_cmip/notebooks/Files/RBD-hACE2.pdb loaded
 Title: 
 Experimental method: unknown
 Resolution (A): N.A.

 Num. models: 1
 Num. chains: 2 (A: Protein, B: Protein)
 Num. residues:  790
 Num. residues with ins. codes:  0
 Num. HETATM residues:  0
 Num. ligands or modified residues:  0
 Num. water mol.:  0
 Num. atoms:  12517

Running chains.

0
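
Conceptually, extracting a chain amounts to keeping the coordinate records whose chain identifier (column 22 of a PDB `ATOM`/`HETATM` record) matches the selection. The sketch below groups records by chain in plain Python as a stand-in to show the idea; the building block above delegates the work to `check_structure`, and the function name and toy records here are assumptions.

```python
from collections import defaultdict


def split_by_chain(lines):
    """Group PDB ATOM/HETATM records by their chain identifier (column 22)."""
    chains = defaultdict(list)
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            chains[line[21:22]].append(line)
    return dict(chains)


# Toy two-chain structure with made-up coordinates
toy_complex = [
    "ATOM      1  N   SER A  19       0.000   0.000   0.000",
    "ATOM      2  CA  SER A  19       1.500   0.000   0.000",
    "TER",
    "ATOM      3  N   THR B 333      10.000   0.000   0.000",
    "END",
]

by_chain = split_by_chain(toy_complex)
for chain_id, records in sorted(by_chain.items()):
    print(f"chain {chain_id}: {len(records)} atoms")
```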

In [9]:
# Show the two extracted chains side by side
# hACE2 (chain A)
view1 = nglview.show_structure_file(inputPDB_MD_hACE2)
view1.component_0.add_representation(repr_type='cartoon', selection='protein', color='sstruc')
view1.component_0.center()
view1._remote_call('setSize', target='Widget', args=['400px','400px'])
view1.camera='orthographic'
# RBD (chain B)
view2 = nglview.show_structure_file(inputPDB_MD_RBD)
view2.component_0.add_representation(repr_type='cartoon', selection='protein', color='sstruc')
view2.component_0.center()
view2._remote_call('setSize', target='Widget', args=['400px','400px'])
view2.camera='orthographic'
# Only the last expression of a cell is displayed
ipywidgets.HBox([view1, view2])

HBox(children=(NGLWidget(), NGLWidget()))

### Preparing structures 
Preparing both structures for the **CMIP calculations**, using the original **MD topology file** information.

In [120]:
from biobb_cmip.cmip.prepare_structure import prepare_structure

cmipPDB_MD = MDCode + ".cmip.pdb"
cmipPDB_MD_hACE2 = MDCode + ".hACE2.cmip.pdb"
cmipPDB_MD_RBD = MDCode + ".RBD.cmip.pdb"

prepare_structure(input_pdb_path=inputPDB_MD,
                  input_topology_path=inputTOP_MD,
            output_cmip_pdb_path=cmipPDB_MD
)

prepare_structure(input_pdb_path=inputPDB_MD_hACE2,
                  input_topology_path=inputTOP_MD,
            output_cmip_pdb_path=cmipPDB_MD_hACE2
)

prepare_structure(input_pdb_path=inputPDB_MD_RBD,
                  input_topology_path=inputTOP_MD,
            output_cmip_pdb_path=cmipPDB_MD_RBD
)

2022-08-11 17:07:12,922 [MainThread  ] [INFO ]  Reading: Files/RBD-hACE2.top to extract charges
2022-08-11 17:07:41,489 [MainThread  ] [INFO ]  Reading: Files/RBD-hACE2.top to extract elements
2022-08-11 17:08:32,186 [MainThread  ] [INFO ]  Removed: []
2022-08-11 17:08:32,195 [MainThread  ] [INFO ]  Reading: Files/RBD-hACE2.top to extract charges
2022-08-11 17:09:01,143 [MainThread  ] [INFO ]  Reading: Files/RBD-hACE2.top to extract elements
2022-08-11 17:09:52,076 [MainThread  ] [INFO ]  Removed: []
2022-08-11 17:09:52,084 [MainThread  ] [INFO ]  Reading: Files/RBD-hACE2.top to extract charges
2022-08-11 17:10:21,041 [MainThread  ] [INFO ]  Reading: Files/RBD-hACE2.top to extract elements
2022-08-11 17:11:13,245 [MainThread  ] [INFO ]  Removed: []


0

In [121]:
from biobb_cmip.cmip.cmip import cmip

cmip_RBD_box_log = "RBD.box.log"
cmip_RBD_box_out = "RBD.box.byat.out"
cmip_RBD_box_json = "RBD.box.json"

prop = { 
    'execution_type' : 'check_only',
    'remove_tmp':False,
    #'box_size_factor': 0.97,
    'params' : {
         'perfill' : 0.8,
    }
}

cmip(input_pdb_path=cmipPDB_MD_RBD,
     output_log_path=cmip_RBD_box_log,
     output_byat_path=cmip_RBD_box_out,
     output_json_box_path=cmip_RBD_box_json,
     properties=prop)

2022-08-11 17:11:16,467 [MainThread  ] [INFO ]  Not using any container
2022-08-11 17:11:16,467 [MainThread  ] [INFO ]  cmip -i a858a850-df0c-4134-999d-bc846253399c/params -vdw /home/adam.local/anaconda3/envs/biobb_CMIP_tutorial/share/cmip/dat/vdwprm -hs RBD-hACE2.RBD.cmip.pdb -byat RBD.box.byat.out -o RBD.box.log -l e1ee0b49-0d20-4f88-adb2-b6840938d225/key_value_cmip_log.log

2022-08-11 17:11:16,683 [MainThread  ] [INFO ]  Exit code 0

2022-08-11 17:11:16,685 [MainThread  ] [INFO ]  STOP 0



0

In [122]:
import nglview as nv
from biobb_cmip.utils.representation import create_box_representation

boxedFilename, atomPair = create_box_representation(cmip_RBD_box_log, inputPDB_MD)
# Represent the new file in ngl
view = nv.show_structure_file(boxedFilename, default=False)
# Structure
view.add_representation(repr_type='cartoon', selection='not het', color='#cccccc', opacity=.2)
# ligands box
view.add_representation(repr_type='ball+stick', selection='9999', aspectRatio = 10)
# lines box
view.add_representation(repr_type='distance', atomPair= atomPair, labelColor= 'transparent', color= 'black')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

NGLWidget()

In [123]:
from biobb_cmip.cmip.cmip import cmip

cmip_hACE2_box_log = "hACE2.box.log"
cmip_hACE2_box_out = "hACE2.box.byat.out"
cmip_hACE2_box_json = "hACE2.box.json"

prop = { 
    'execution_type' : 'check_only',
    'remove_tmp':False,
    #'box_size_factor': 0.97,
    'params' : {
         'perfill' : 0.8,
    }
}

cmip(input_pdb_path=cmipPDB_MD_hACE2,
     output_log_path=cmip_hACE2_box_log,
     output_byat_path=cmip_hACE2_box_out,
     output_json_box_path=cmip_hACE2_box_json,
     properties=prop)

2022-08-11 17:11:23,206 [MainThread  ] [INFO ]  Not using any container
2022-08-11 17:11:23,207 [MainThread  ] [INFO ]  cmip -i 3adefad2-97f3-494e-b4c7-68da123c7af6/params -vdw /home/adam.local/anaconda3/envs/biobb_CMIP_tutorial/share/cmip/dat/vdwprm -hs RBD-hACE2.hACE2.cmip.pdb -byat hACE2.box.byat.out -o hACE2.box.log -l 236cc019-0741-4ba0-9737-8da4b2bfddba/key_value_cmip_log.log

2022-08-11 17:11:23,402 [MainThread  ] [INFO ]  Exit code 0

2022-08-11 17:11:23,405 [MainThread  ] [INFO ]  STOP 0



0

In [124]:
import nglview as nv
from biobb_cmip.utils.representation import create_box_representation

boxedFilename, atomPair = create_box_representation(cmip_hACE2_box_log, inputPDB_MD)
# Represent the new file in ngl
view = nv.show_structure_file(boxedFilename, default=False)
# Structure
view.add_representation(repr_type='cartoon', selection='not het', color='#cccccc', opacity=.2)
# ligands box
view.add_representation(repr_type='ball+stick', selection='9999', aspectRatio = 10)
# lines box
view.add_representation(repr_type='distance', atomPair= atomPair, labelColor= 'transparent', color= 'black')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

NGLWidget()

In [125]:
from biobb_cmip.cmip.cmip import cmip

cmip_COMPLEX_box_log = "COMPLEX.box.log"
cmip_COMPLEX_box_out = "COMPLEX.box.byat.out"
cmip_COMPLEX_box_json = "COMPLEX.box.json"

prop = { 
    'execution_type' : 'check_only',
    'remove_tmp':False,
    #'box_size_factor': 0.97,
    'params' : {
         'perfill' : 0.6,
    }
}

cmip(input_pdb_path=cmipPDB_MD,
     output_log_path=cmip_COMPLEX_box_log,
     output_byat_path=cmip_COMPLEX_box_out,
     output_json_box_path=cmip_COMPLEX_box_json,
     properties=prop)

2022-08-11 17:11:29,858 [MainThread  ] [INFO ]  Not using any container
2022-08-11 17:11:29,859 [MainThread  ] [INFO ]  cmip -i bc477725-0f77-4b76-ab3d-6cf67ccc1f11/params -vdw /home/adam.local/anaconda3/envs/biobb_CMIP_tutorial/share/cmip/dat/vdwprm -hs RBD-hACE2.cmip.pdb -byat COMPLEX.box.byat.out -o COMPLEX.box.log -l 19d44dea-64be-4798-ba11-2e4d6421f42a/key_value_cmip_log.log

2022-08-11 17:11:30,346 [MainThread  ] [INFO ]  Exit code 0

2022-08-11 17:11:30,347 [MainThread  ] [INFO ]  STOP 0



0

In [126]:
import nglview as nv
from biobb_cmip.utils.representation import create_box_representation

boxedFilename, atomPair = create_box_representation(cmip_COMPLEX_box_log, inputPDB_MD)
# Represent the new file in ngl
view = nv.show_structure_file(boxedFilename, default=False)
# Structure
view.add_representation(repr_type='cartoon', selection='not het', color='#cccccc', opacity=.2)
# ligands box
view.add_representation(repr_type='ball+stick', selection='9999', aspectRatio = 10)
# lines box
view.add_representation(repr_type='distance', atomPair= atomPair, labelColor= 'transparent', color= 'black')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

NGLWidget()

<a id="RBDinteraction"></a>
***
## RBD Interaction Potential Energies
The first analysis computes the **interaction potential energies** for the **RBD atoms** (CMIP input probe) with respect to the **hACE2 enzyme** (CMIP input protein). 

In [127]:
from biobb_cmip.cmip.ignore_residues import ignore_residues

cmipPDB_MD_RBD_ignored = MDCode + ".RBD_ignored.cmip.pdb"

prop = {
    'residue_list': "B:"
}

ignore_residues(input_cmip_pdb_path = cmipPDB_MD,
               output_cmip_pdb_path = cmipPDB_MD_RBD_ignored,
               properties = prop)

2022-08-11 17:11:36,786 [MainThread  ] [INFO ]  Residue list: ['B:']
2022-08-11 17:11:36,819 [MainThread  ] [INFO ]  3000 residues have been marked
2022-08-11 17:11:36,820 [MainThread  ] [INFO ]  Removed: []


In [128]:
!cat COMPLEX.box.json

{
    "origin": {
        "x": -6.35,
        "y": -14.18,
        "z": 6.15
    },
    "size": {
        "x": 133.5,
        "y": 154.0,
        "z": 108.5
    },
    "params": {
        "CEN": [
            60.4,
            62.82,
            56.315
        ],
        "DIM": [
            267,
            308,
            217
        ],
        "INT": [
            0.5,
            0.5,
            0.5
        ]
    }
}
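
The `size` entries in the box file appear to follow from the grid definition: the number of grid points (`DIM`) multiplied by the grid spacing (`INT`) along each axis. The sketch below checks that relationship for the values printed above. This reading of `DIM` and `INT` is an assumption inferred from the numbers themselves, not taken from the CMIP documentation, and the variable names are illustrative.

```python
import json

# Box description copied from the COMPLEX.box.json output shown above
# (only the fields needed for this check).
box = json.loads("""
{
    "size": {"x": 133.5, "y": 154.0, "z": 108.5},
    "params": {"DIM": [267, 308, 217], "INT": [0.5, 0.5, 0.5]}
}
""")

# Assumption: DIM is the number of grid points and INT the grid spacing per axis.
dims = box["params"]["DIM"]
spacing = box["params"]["INT"]

# Edge length per axis = grid points x grid spacing.
edges = [n * h for n, h in zip(dims, spacing)]
reported = [box["size"][axis] for axis in ("x", "y", "z")]

print("computed edges:", edges)
print("matches reported size:", edges == reported)
```

A check like this can help spot a box file whose grid parameters no longer agree with its stated dimensions before a long CMIP run is launched.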

In [129]:
from biobb_cmip.cmip.cmip import cmip

RBD_energies_log = "RBD.energies.log"
RBD_byat_out = "RBD.energies.byat.out"

prop = { 
    'execution_type' : 'energy',
    'remove_tmp' : False,
    'params' : {
        'perfill' : 0.8,
        'readgrid0' : 0,
        #'perfill0' : 0.05,
        'cenx0' : 60.4,
        'ceny0' : 62.82,
        'cenz0' : 56.31,
        'dimx0' : 267,
        'dimy0' : 308,
        'dimz0' : 217,
        'intx0' : 1.5,
        'inty0' : 1.5,
        'intz0' : 1.5        
    }
}

cmip(input_pdb_path=cmipPDB_MD_RBD_ignored,
     input_probe_pdb_path=cmipPDB_MD_RBD,
     input_json_box_path=cmip_RBD_box_log,
     # output_pdb_path='output.pdb',  # Kept disabled: if added, Python crashes with "output_pdb_path not exists"
     output_log_path=RBD_energies_log,
     output_byat_path=RBD_byat_out,
     properties=prop)

2022-08-11 17:11:53,582 [MainThread  ] [INFO ]  Not using any container
2022-08-11 17:11:53,583 [MainThread  ] [INFO ]  cmip -i c3a73ae1-1c68-4071-97ba-727300ba0b12/params -vdw /home/adam.local/anaconda3/envs/biobb_CMIP_tutorial/share/cmip/dat/vdwprm -hs RBD-hACE2.RBD_ignored.cmip.pdb -pr RBD-hACE2.RBD.cmip.pdb -byat RBD.energies.byat.out -o RBD.energies.log

2022-08-11 17:14:14,854 [MainThread  ] [INFO ]  Exit code 0



0

<a id="visBOX1"></a>
### Visualizing CMIP Box
Visualizing the **box** used by **CMIP** to compute the **Interaction Potential Energies** (taken from the log file). It is important to check that the box includes the whole **interaction region**, which is the region of interest. 

In [130]:
import nglview as nv
from biobb_cmip.utils.representation import create_box_representation

boxedFilename, atomPair = create_box_representation(RBD_energies_log, inputPDB_MD)
# Represent the new file in ngl
view = nv.show_structure_file(boxedFilename, default=False)
# Structure
view.add_representation(repr_type='cartoon', selection='not het', color='#cccccc', opacity=.2)
# ligands box
view.add_representation(repr_type='ball+stick', selection='9999', aspectRatio = 10)
# lines box
view.add_representation(repr_type='distance', atomPair= atomPair, labelColor= 'transparent', color= 'black')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

NGLWidget()
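
Besides the visual inspection, the same check can be done numerically. The sketch below tests whether a set of coordinates falls inside an axis-aligned box. It assumes that the box `origin` is its minimum corner, which is not confirmed here; the origin and size values are reused from the `COMPLEX.box.json` output shown earlier purely for illustration, and the atom names and coordinates are hypothetical.

```python
def inside_box(point, origin, size):
    """Return True if the point lies within the axis-aligned box."""
    return all(o <= p <= o + s for p, o, s in zip(point, origin, size))

# Illustrative box values (assumption: origin is the minimum corner of the box).
origin = (-6.35, -14.18, 6.15)
size = (133.5, 154.0, 108.5)

# Hypothetical atom coordinates, in the same units as the box.
atoms = {
    "atom_near_center": (60.4, 62.82, 56.315),
    "atom_far_away": (200.0, 0.0, 0.0),
}

# Collect the atoms that fall outside the box.
outside = [name for name, xyz in atoms.items() if not inside_box(xyz, origin, size)]
print("atoms outside the box:", outside)
```

In practice the coordinates would come from the atoms of the interaction region in the input PDB file; any atom reported as outside suggests the box should be enlarged or re-centered.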

<a id="plotRBD_atoms"></a>
### Interaction energy by atom
Visualizing the **interaction potential energies** computed by **CMIP**. The plot shows the **interaction energies** (in kcal/mol, Y axis) for **each of the atoms** of the **RBD protein** (X axis). 

In [131]:
import plotly
import plotly.graph_objs as go
from biobb_cmip.utils.representation import get_energies_byat

atom_list, energy_dict = get_energies_byat(RBD_byat_out, cutoff=55)

plotly.offline.init_notebook_mode(connected=True)

fig = {"data": [go.Scatter(x=atom_list, y=energy_dict['ES&VDW'])],
       "layout": go.Layout(title="CMIP Interaction Potential", 
                           xaxis=dict(title = "Atom Number"), 
                           yaxis=dict(title = "Potential Energy (kcal/mol)"))}

plotly.offline.iplot(fig)

<a id="plotRBD_residues"></a>
### Interaction energy by residue
Visualizing the **interaction potential energies** computed by **CMIP**. The plot shows the **interaction energies** (in kcal/mol, Y axis) for **each of the residues** of the **RBD protein** (X axis). Each residue value is computed by summing the contributions of all atoms in that residue. 

In [132]:
import plotly
import plotly.graph_objs as go
from biobb_cmip.utils.representation import get_energies_byres


res_list, energy_dict = get_energies_byres(RBD_byat_out, cutoff=55)

plotly.offline.init_notebook_mode(connected=True)

fig = {"data": [go.Scatter(x=res_list, y=energy_dict['ES&VDW'])],
       "layout": go.Layout(title="CMIP Interaction Potential", 
                           xaxis=dict(title = "Residue ID"), 
                           yaxis=dict(title = "Potential Energy (kcal/mol)"))}

plotly.offline.iplot(fig)
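
The per-residue values are described as sums of per-atom contributions. The sketch below illustrates that aggregation on a few hypothetical records. It is not the implementation of `get_energies_byres`; the residue identifiers, energies and the helper name `sum_by_residue` are made up for the example.

```python
from collections import defaultdict

# Hypothetical per-atom records: (residue ID, interaction energy in kcal/mol).
atom_energies = [
    ("res1", -1.0),
    ("res1", -0.25),
    ("res2", -0.5),
    ("res2", 0.25),
    ("res3", -2.0),
]

def sum_by_residue(records):
    """Add up the per-atom energies of every residue."""
    totals = defaultdict(float)
    for residue_id, energy in records:
        totals[residue_id] += energy
    return dict(totals)

residue_totals = sum_by_residue(atom_energies)
print(residue_totals)
```

Summing per residue smooths out the atom-level noise, which usually makes the residues that dominate the interaction easier to spot in the plot.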


<a id="hACE2interaction"></a>
***
## hACE2 Interaction Potential Energies 
The second analysis computes the **interaction potential energies** for the **hACE2 atoms** (CMIP input probe) with respect to the **RBD** (CMIP input protein). 

In [135]:
from biobb_cmip.cmip.ignore_residues import ignore_residues

cmipPDB_MD_hACE2_ignored = MDCode + ".hACE2_ignored.cmip.pdb"

prop = {
    'residue_list': "A:"
}

ignore_residues(input_cmip_pdb_path = cmipPDB_MD,
               output_cmip_pdb_path = cmipPDB_MD_hACE2_ignored,
               properties = prop)

2022-08-11 17:18:52,058 [MainThread  ] [INFO ]  Residue list: ['A:']
2022-08-11 17:18:52,103 [MainThread  ] [INFO ]  9517 residues have been marked
2022-08-11 17:18:52,104 [MainThread  ] [INFO ]  Removed: []


In [134]:
from biobb_cmip.cmip.cmip import cmip

hACE2_energies_log = "hACE2.energies.log"
hACE2_byat_out = "hACE2.energies.byat.out"

prop = { 
    'execution_type' : 'energy',
    'remove_tmp' : False,
    'params' : {
        'perfill' : 0.8,
        'readgrid0' : 0,
#        'perfill0' : 0.4,
        'cenx0' : 60.4,
        'ceny0' : 62.82,
        'cenz0' : 56.31,
        'dimx0' : 267,
        'dimy0' : 308,
        'dimz0' : 217,
        'intx0' : 1.5,
        'inty0' : 1.5,
        'intz0' : 1.5        
    }
}

cmip(input_pdb_path=cmipPDB_MD_hACE2_ignored,
     input_probe_pdb_path=cmipPDB_MD_hACE2,
     input_json_box_path=cmip_hACE2_box_log,
     # output_pdb_path='output.pdb',  # Kept disabled: if added, Python crashes with "output_pdb_path not exists"
     output_log_path=hACE2_energies_log,
     output_byat_path=hACE2_byat_out,
     properties=prop)

2022-08-11 17:14:57,389 [MainThread  ] [INFO ]  Not using any container
2022-08-11 17:14:57,390 [MainThread  ] [INFO ]  cmip -i c80432b7-92bb-4782-8c65-ea0eedef940b/params -vdw /home/adam.local/anaconda3/envs/biobb_CMIP_tutorial/share/cmip/dat/vdwprm -hs RBD-hACE2.hACE2_ignored.cmip.pdb -pr RBD-hACE2.hACE2.cmip.pdb -byat hACE2.energies.byat.out -o hACE2.energies.log

2022-08-11 17:16:33,362 [MainThread  ] [INFO ]  Exit code 0



0

<a id="visBOX2"></a>
### Visualizing CMIP Box
Visualizing the **box** used by **CMIP** to compute the **Interaction Potential Energies** (taken from the log file). It is important to check that the box includes the whole **interaction region**, which is the region of interest. 

In [136]:
import nglview as nv
from biobb_cmip.utils.representation import create_box_representation

boxedFilename, atomPair = create_box_representation(hACE2_energies_log, inputPDB_MD)
# Represent the new file in ngl
view = nv.show_structure_file(boxedFilename, default=False)
# Structure
view.add_representation(repr_type='cartoon', selection='not het', color='#cccccc', opacity=.2)
# ligands box
view.add_representation(repr_type='ball+stick', selection='9999', aspectRatio = 10)
# lines box
view.add_representation(repr_type='distance', atomPair= atomPair, labelColor= 'transparent', color= 'black')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

NGLWidget()

<a id="plotRBD_atoms"></a>
### Interaction energy by atom
Visualizing the **interaction potential energies** computed by **CMIP**. The plot shows the **interaction energies** (in kcal/mol, Y axis) for **each of the atoms** of the **hACE2 protein** (X axis). 

In [137]:
import plotly
import plotly.graph_objs as go
from biobb_cmip.utils.representation import get_energies_byat

atom_list, energy_dict = get_energies_byat(hACE2_byat_out, cutoff=55)

plotly.offline.init_notebook_mode(connected=True)

fig = {"data": [go.Scatter(x=atom_list, y=energy_dict['ES&VDW'])],
       "layout": go.Layout(title="CMIP Interaction Potential", 
                           xaxis=dict(title = "Atom Number"), 
                           yaxis=dict(title = "Potential Energy (kcal/mol)"))}

plotly.offline.iplot(fig)

<a id="plotRBD_residues"></a>
### Interaction energy by residue
Visualizing the **interaction potential energies** computed by **CMIP**. The plot shows the **interaction energies** (in kcal/mol, Y axis) for **each of the residues** of the **hACE2 protein** (X axis). Each residue value is computed by summing the contributions of all atoms in that residue. 

In [138]:
import plotly
import plotly.graph_objs as go
from biobb_cmip.utils.representation import get_energies_byres


res_list, energy_dict = get_energies_byres(hACE2_byat_out, cutoff=55)

plotly.offline.init_notebook_mode(connected=True)

fig = {"data": [go.Scatter(x=res_list, y=energy_dict['ES&VDW'])],
       "layout": go.Layout(title="CMIP Interaction Potential", 
                           xaxis=dict(title = "Residue ID"), 
                           yaxis=dict(title = "Potential Energy (kcal/mol)"))}

plotly.offline.iplot(fig)


***
<a id="questions"></a>

## Questions & Comments

Questions, issues, suggestions and comments are really welcome!

* GitHub issues:
    * [https://github.com/bioexcel/biobb](https://github.com/bioexcel/biobb)

* BioExcel forum:
    * [https://ask.bioexcel.eu/c/BioExcel-Building-Blocks-library](https://ask.bioexcel.eu/c/BioExcel-Building-Blocks-library)
