# Protein-ligand Docking tutorial using BioExcel Building Blocks (biobb)
### -- *PDB Cluster90 Binding Site Version* --

***
This tutorial aims to illustrate the process of **protein-ligand docking**, step by step, using the **BioExcel Building Blocks library (biobb)**. The particular example used is the **Mitogen-activated protein kinase 14** (p38-α) protein (PDB code [3HEC](https://www.rcsb.org/structure/3HEC)), a well-known **Protein Kinase enzyme**, 
 in complex with the FDA-approved **Imatinib**, (PDB Ligand code [STI](https://www.rcsb.org/ligand/STI), DrugBank Ligand Code [DB00619](https://go.drugbank.com/drugs/DB00619)), a small molecule **kinase inhibitor** used to treat certain types of **cancer**. 
 
The tutorial will guide you through the process of identifying the **active site cavity** (pocket) without previous knowledge, and the final prediction of the **protein-ligand complex**. 

Please note that **docking algorithms**, and in particular the **AutoDock Vina** program used in this tutorial, are **non-deterministic**. This means that the results obtained when running the workflow **could be different** from the ones we obtained while writing this tutorial (see the [AutoDock Vina manual](http://vina.scripps.edu/manual.html)). We invite you to run the docking process several times to verify this behaviour. 
***

<div style="background:#b5e0dd; padding: 15px;"><strong>Important:</strong> it is recommended to execute this tutorial step by step (not as a single workflow execution, <strong><em>Run All</em></strong> mode), as it has interactive selections.</div>

## Settings

### Biobb modules used

 - [biobb_io](https://github.com/bioexcel/biobb_io): Tools to fetch biomolecular data from public databases.
 - [biobb_structure_utils](https://github.com/bioexcel/biobb_structure_utils): Tools to modify or extract information from a PDB structure file.
 - [biobb_chemistry](https://github.com/bioexcel/biobb_chemistry): Tools to perform chemoinformatics processes.
 - [biobb_vs](https://github.com/bioexcel/biobb_vs): Tools to perform virtual screening studies.
 
### Auxiliary libraries used

* [jupyter](https://jupyter.org/): Free software, open standards, and web services for interactive computing across all programming languages.
* [nglview](http://nglviewer.org/#nglview): Jupyter/IPython widget to interactively view molecular structures and trajectories in notebooks.

### Conda Installation

```console
git clone https://github.com/bioexcel/biobb_wf_virtual-screening.git
cd biobb_wf_virtual-screening
conda env create -f conda_env/environment.yml
conda activate biobb_VS_tutorial
jupyter-notebook biobb_wf_virtual-screening/notebooks/ebi_api/wf_vs_clusterBindingSite.ipynb
```

***
## Pipeline steps
 1. [Input Parameters](#input)
 2. [Fetching PDB Structure](#fetch)
 3. [Extract Protein Structure](#extractProtein)
 4. [Computing Protein Cavities (fpocket)](#fpocket)
 5. [Filtering Protein Cavities (fpocket output)](#fpocketFilter)
 6. [Extract Pocket Cavity ](#fpocketSelect)
 7. [Generating Cavity Box ](#cavityBox)
 8. [Downloading Small Molecule](#downloadSmallMolecule)
 9. [Converting Small Molecule](#sdf2pdb)
 10. [Preparing Small Molecule (ligand) for Docking](#ligand_pdb2pdbqt)
 11. [Preparing Target Protein for Docking](#protein_pdb2pdbqt)
 12. [Running the Docking](#docking)
 13. [Extract a Docking Pose](#extractPose)
 14. [Converting Ligand Pose to PDB format](#pdbqt2pdb)
 15. [Superposing Ligand Pose to the Target Protein Structure](#catPdb)
 16. [Comparing final result with experimental structure](#viewFinal)
 17. [Questions & Comments](#questions)
 
***
<img src="https://bioexcel.eu/wp-content/uploads/2019/04/Bioexcell_logo_1080px_transp.png" alt="Bioexcel2 logo"
	title="Bioexcel2 logo" width="400" />
***


<a id="input"></a>
## Input parameters
**Input parameters** needed:

 - **pdb_code**: PDB code of the experimental complex structure (if one exists).<br>
In this particular example, the **p38α** structure in complex with the **Imatinib drug** was experimentally solved and deposited in the **PDB database** under the **3HEC** PDB code. The protein structure from this PDB file will be used as a **target protein** for the **docking process**, after stripping the **small molecule**. An **APO structure**, or any other structure from the **p38α** [cluster 100](https://www.rcsb.org/search?request=%7B%22query%22%3A%7B%22type%22%3A%22terminal%22%2C%22service%22%3A%22sequence%22%2C%22parameters%22%3A%7B%22target%22%3A%22pdb_protein_sequence%22%2C%22value%22%3A%22RPTFYRQELNKTIWEVPERYQNLSPVGSGAYGSVCAAFDTKTGLRVAVKKLSRPFQSIIHAKRTYRELRLLKHMKHENVIGLLDVFTPARSLEEFNDVYLVTHLMGADLNNIVKCQKLTDDHVQFLIYQILRGLKYIHSADIIHRDLKPSNLAVNEDCELKILDFGLARHTDDEMTGYVATRWYRAPEIMLNWMHYNQTVDIWSVGCIMAELLTGRTLFPGTDHIDQLKLILRLVGTPGAELLKKISSESARNYIQSLTQMPKMNFANVFIGANPLAVDLLEKMLVLDSDKRITAAQALAHAYFAQYHDPDDEPVADPYDQSFESRDLLIDEWKSLTYDEVISFVPPP%22%2C%22identity_cutoff%22%3A1%2C%22evalue_cutoff%22%3A0.1%7D%2C%22node_id%22%3A0%7D%2C%22return_type%22%3A%22polymer_entity%22%2C%22request_options%22%3A%7B%22pager%22%3A%7B%22start%22%3A0%2C%22rows%22%3A25%7D%2C%22scoring_strategy%22%3A%22combined%22%2C%22sort%22%3A%5B%7B%22sort_by%22%3A%22score%22%2C%22direction%22%3A%22desc%22%7D%5D%7D%2C%22request_info%22%3A%7B%22src%22%3A%22ui%22%2C%22query_id%22%3A%22bea5861f8b38a9e25a3e626b39d6bcbf%22%7D%7D) (sharing a 100% of sequence similarity with the **p38α** structure) could also be used as a **target protein**. This structure of the **protein-ligand complex** will be also used in the last step of the tutorial to check **how close** the resulting **docking pose** is from the known **experimental structure**. 
 -----
 - **ligand_code**: Ligand PDB code (3-letter code) for the small molecule (e.g. STI).<br>
In this particular example, the small molecule chosen for the tutorial is the FDA-approved drug **Imatinib** (PDB Code STI), a type of cancer growth blocker, used in [different types of leukemia](https://go.drugbank.com/drugs/DB00619).

In [1]:
import nglview
import ipywidgets

pdb_code = "3HEC"         # P38 + Imatinib

ligand_code = "STI"       # Imatinib
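
Before going further, it can be useful to check that both input values look like valid identifiers. The following is a minimal sketch, assuming the classic formats (a four-character **PDB code** starting with a digit, and a ligand code of one to three alphanumeric characters). The function name `check_inputs` and both patterns are illustrative assumptions and not part of **biobb**; newer extended identifiers would need different patterns.

```python
import re

# Assumption: classic PDB IDs are four characters, a digit 1-9 plus three alphanumerics.
PDB_ID_PATTERN = re.compile(r"^[1-9][A-Za-z0-9]{3}$")
# Assumption: classic ligand (chemical component) IDs are 1-3 alphanumeric characters.
LIGAND_ID_PATTERN = re.compile(r"^[A-Za-z0-9]{1,3}$")


def check_inputs(pdb_code, ligand_code):
    """Return the upper-cased (pdb_code, ligand_code) pair or raise ValueError."""
    pdb_code = pdb_code.strip()
    ligand_code = ligand_code.strip()
    if not PDB_ID_PATTERN.match(pdb_code):
        raise ValueError(f"Unexpected PDB code format: {pdb_code!r}")
    if not LIGAND_ID_PATTERN.match(ligand_code):
        raise ValueError(f"Unexpected ligand code format: {ligand_code!r}")
    return pdb_code.upper(), ligand_code.upper()
```

Calling `check_inputs(pdb_code, ligand_code)` with the values defined above returns them unchanged, while a malformed code raises a `ValueError` before any download is attempted.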



<a id="fetch"></a>
***
## Fetching PDB structure
Downloading **PDB structure** with the **protein molecule** from the PDBe database.<br>
Alternatively, a **PDB file** can be used as starting structure. <br>
***
**Building Blocks** used:
 - [Pdb](https://biobb-io.readthedocs.io/en/latest/api.html#module-api.pdb) from **biobb_io.api.pdb**
***

In [2]:
from biobb_io.api.pdb import pdb

download_pdb = "download.pdb"
prop = {
  "pdb_code": pdb_code,
  "filter": ["ATOM", "HETATM"]
}

pdb(output_pdb_path=download_pdb,
    properties=prop)

2021-05-17 15:15:49,040 [MainThread  ] [INFO ]  Downloading: 3hec from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3hec.ent
2021-05-17 15:15:49,530 [MainThread  ] [INFO ]  Writting pdb to: download.pdb
2021-05-17 15:15:49,534 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']


0
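
The log above shows two things: the **PDBe** URL built from the PDB code, and the filtering that keeps only `ATOM` and `HETATM` lines. The sketch below reproduces both ideas with the standard library only. The helper names `pdbe_entry_url` and `keep_records` are hypothetical, and the filter matches on the record-name field (columns 1-6), which is an assumption rather than the way the **biobb_io** block is implemented.

```python
def pdbe_entry_url(pdb_code):
    """Build the PDBe download URL in the form shown in the log output."""
    return ("https://www.ebi.ac.uk/pdbe/entry-files/download/"
            f"pdb{pdb_code.lower()}.ent")


def keep_records(pdb_text, record_names=("ATOM", "HETATM")):
    """Keep only the lines whose record name (columns 1-6) is in record_names."""
    kept = [line for line in pdb_text.splitlines()
            if line[:6].strip() in record_names]
    # Return a newline-terminated block, or an empty string if nothing is kept.
    return "\n".join(kept) + "\n" if kept else ""
```

Applied to the text of a full PDB entry, `keep_records` drops header, remark and connectivity records and leaves only the coordinate lines.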

<a id="vis3D"></a>
### Visualizing 3D structure
Visualizing the downloaded/given **PDB structure** using **NGL**.<br><br>
Note (and try to identify) the **Imatinib small molecule (STI)** and the **detergent (β-octyl glucoside) (BOG)** used in the experimental reservoir solution to obtain the crystal.

In [23]:
view = nglview.show_structure_file(download_pdb, default=True)
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])

view.render_image()
view.download_image(filename='ngl1.png')
view

NGLWidget()

<img src='ngl1.png'></img>
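
If the **small molecules** are hard to spot in the 3D view, listing the hetero residues found in the file can help. This is a minimal sketch that counts `HETATM` atoms per residue name, assuming the standard fixed-column PDB layout (residue name in columns 18-20). The function name `hetero_residue_atoms` is hypothetical.

```python
from collections import Counter


def hetero_residue_atoms(pdb_text):
    """Count HETATM atoms per residue name (PDB columns 18-20)."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if line.startswith("HETATM"):
            counts[line[17:20].strip()] += 1
    return counts


# Possible usage with the file downloaded in the previous step:
# with open("download.pdb") as handle:
#     print(hetero_residue_atoms(handle.read()))
```

The residue names returned by this function can then be used in an **NGL** selection to highlight each molecule.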

<a id="extractProtein"></a>
***
## Extract Protein Structure
Extract the **protein structure** from the **downloaded PDB file**, removing **any extra molecules** (ligands, ions, water molecules). <br><br>
The **protein structure** will be used as a **target** in the **protein-ligand docking process**. 
***
**Building Blocks** used:
 - [extract_molecule](https://biobb-structure-utils.readthedocs.io/en/latest/utils.html#module-utils.extract_molecule) from **biobb_structure_utils.utils.extract_molecule**
***

In [4]:
from biobb_structure_utils.utils.extract_molecule import extract_molecule

pdb_protein = "pdb_protein.pdb"

extract_molecule(input_structure_path=download_pdb,
                 output_molecule_path=pdb_protein)

2021-05-17 15:15:55,068 [MainThread  ] [INFO ]  Creating ed60c66d-be1b-48cd-9551-5411add09a64 temporary folder
2021-05-17 15:15:55,876 [MainThread  ] [INFO ]  check_structure -i /home/gbayarri_local/projects/BioBB/tutorials/biobb_wf_virtual-screening/biobb_wf_virtual-screening/notebooks/clusterBindingSite/download.pdb -o pdb_protein.pdb --force_save --non_interactive command_list --list ed60c66d-be1b-48cd-9551-5411add09a64/extract_prot.lst

2021-05-17 15:15:55,879 [MainThread  ] [INFO ]  Exit code 0

=                   BioBB structure checking utility v3.7.2                   =
=                 A. Hospital, P. Andrio, J.L. Gelpi 2018-20                  =

Structure /home/gbayarri_local/projects/BioBB/tutorials/biobb_wf_virtual-screening/biobb_wf_virtual-screening/notebooks/clusterBindingSite/download.pdb loaded
 Title: 
 Experimental method: unknown
 Resolution: None A

 Num. models: 1
 Num. chains: 1 (A: Protein)
 Num. residues:  420
 Num. residues with ins. codes:  0
 Num. HETATM 

0
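
The extraction idea can be illustrated without any dependency. The sketch below keeps only `ATOM` records that belong to the 20 standard amino acids and counts the distinct residues before and after. It is an assumption-laden simplification: the log above shows that the building block relies on the `check_structure` utility, which this code does not reproduce, and modified residues would be dropped by this simple rule. The names `protein_only` and `count_residues` are hypothetical.

```python
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def protein_only(pdb_text):
    """Keep ATOM records of standard amino acids; drop ligands, ions and waters."""
    kept = [line for line in pdb_text.splitlines()
            if line.startswith("ATOM  ") and line[17:20] in STANDARD_RESIDUES]
    return "\n".join(kept) + ("\n" if kept else "")


def count_residues(pdb_text):
    """Count distinct residues by (chain, residue number plus insertion code)."""
    residues = {(line[21:22], line[22:27]) for line in pdb_text.splitlines()
                if line.startswith(("ATOM  ", "HETATM"))}
    return len(residues)
```

Comparing `count_residues` on the original and on the filtered text gives a quick check of how many non-protein residues were removed.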

<a id="vis3D"></a>
### Visualizing 3D structure
Visualizing the extracted **protein structure** using **NGL**.<br><br>
Note that the **small molecules** included in the original structure are now gone. The new structure only contains the **protein molecule**, which will be used as a **target** for the **protein-ligand docking**. 

In [24]:
view = nglview.show_structure_file(pdb_protein, default=True)
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])

view.render_image()
view.download_image(filename='ngl2.png')
view

NGLWidget()

<img src='ngl2.png'></img>

<a id="bindingSite"></a>
***
## Computing Protein Cavities (Cluster90 Binding Site)
Computing the **protein cavities** (pockets) using information from the **PDB Cluster90**. The **PDB Cluster90** is a clustering of the **PDB database** in which protein chains sharing **at least 90% sequence identity** are grouped together. The **Cluster90 Binding Site** utility uses information from all the structures in the **Cluster90 collection** of a particular **input protein** to discover possible **binding sites** from the **small molecules** bound to **similar proteins**. <br>

These **cavities** will then be used in the **docking procedure** to try to find the **best region of the protein surface** where the small molecule can **bind**. <br><br>
In this particular example we already know the **binding site** region, because we started from a **protein-ligand complex** structure in which **Imatinib** is bound, but this is not always the case. When these regions are unknown, the **Cluster90 binding site** utility will help us identify the possible **binding sites** of our **target protein**.<br>
<br>

***
**Building Blocks** used:
 - [pdb_cluster_zip](https://biobb-io.readthedocs.io/en/latest/api.html#module-api.pdb_cluster_zip) from **biobb_io.api.pdb_cluster_zip**
 - [bindingsite](https://biobb-vs.readthedocs.io/en/latest/utils.html#module-utils.bindingsite) from **biobb_vs.utils.bindingsite**
***
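
The core idea, finding the residues that sit close to a ligand bound in a similar structure, can be sketched with a simple distance cutoff. This is not the algorithm of the **bindingsite** building block, only an illustration under stated assumptions: coordinates are already superposed on the target protein, the function name `residues_near_ligand` is hypothetical, and the default 5 Å cutoff is an arbitrary choice for the example.

```python
import math


def residues_near_ligand(protein_atoms, ligand_coords, cutoff=5.0):
    """Return sorted residue ids with an atom within cutoff (in Å) of any ligand atom.

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples.
    ligand_coords: iterable of (x, y, z) tuples.
    """
    ligand_coords = list(ligand_coords)
    near = set()
    for residue_id, xyz in protein_atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_coords):
            near.add(residue_id)
    return sorted(near)
```

Repeating this over the ligands found in several structures of the cluster, and merging the resulting residue sets, would give a rough consensus **binding site**.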

<a id="cluster90"></a>
### Getting Cluster 90 collection 
Extracting the **Cluster90** collection for the input **PDB structure**. The collection will contain all the structures in the **PDB database** that belong to the same **90% sequence identity** cluster as the input structure (our target protein).<br><br>
*Please note that depending on the size of the cluster, the execution can take a while (minutes).*
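
To make the **90% sequence identity** threshold more tangible, the following sketch computes the percent identity between two sequences that are already aligned. It is only an illustration: the function name `percent_identity` is hypothetical, the identity is computed over the columns without gaps (one of several possible definitions), and the clustering service may define and compute identity differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("Aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)
```

Two aligned sequences of ten residues that differ at a single position give a value of 90, which sits exactly at the clustering threshold.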

In [6]:
from biobb_io.api.pdb_cluster_zip import pdb_cluster_zip

pdb_cluster = "pdb_cluster.zip"
prop = {
    "pdb_code": pdb_code,
    "filter": ["ATOM", "HETATM"],
    "cluster": 90
}

pdb_cluster_zip(output_pdb_zip_path=pdb_cluster,
                properties=prop)

2021-05-17 15:16:03,254 [MainThread  ] [INFO ]  Cluster: 90 of pdb_code: 3hec
 List: {'1wbo', '5n67', '5o8v', '4fa2', '1ouy', '3nnv', '4r3c', '1wbv', '3o8u', '4loq', '5n63', '3fln', '5o90', '1kv2', '3lfb', '4f9w', '4aa0', '3rin', '3new', '3zs5', '2oza', '1ouk', '3mh1', '3o8p', '1ywr', '3obg', '3p7c', '3bv3', '5etc', '3p78', '1wbw', '3od6', '2zb0', '4f9y', '1lew', '5tco', '3p5k', '2rg5', '2yis', '1bmk', '2npq', '3iw5', '2baq', '3hp2', '1wfc', '3oc1', '4eh4', '3kf7', '2bal', '5xyy', '3flq', '1w82', '3nww', '3kq7', '3ha8', '4tyh', '1m7q', '3itz', '3zsi', '3fls', '3s3i', '3lfc', '3mpt', '4dlj', '3lhj', '3fmn', '5mz3', '1oz1', '4a9y', '5lar', '4eh3', '1r39', '3mw1', '3k3i', '3dt1', '2onl', '1bl7', '3fmj', '3hec', '3hv6', '4kip', '4loo', '3fkn', '3iw8', '1a9u', '3fc1', '3mvm', '3uvq', '4eh9', '3ds6', '1di9', '3tg1', '3zya', '4l8m', '4dli', '2lgc', '3flw', '1w84', '3c5u', '3ctq', '3fly', '3fml', '3u8w', '2fst', '5mtx', '5n66', '2fso', '2gfs', '5eti', '3mpa', '3heg', '3nnu', '2y8o', '2qd9', '3

2021-05-17 15:16:07,803 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3o8u.pdb
2021-05-17 15:16:07,805 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:07,816 [MainThread  ] [INFO ]  Downloading: 4loq from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4loq.ent
2021-05-17 15:16:08,838 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4loq.pdb
2021-05-17 15:16:08,840 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:08,902 [MainThread  ] [INFO ]  Downloading: 5n63 from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb5n63.ent
2021-05-17 15:16:09,639 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/5n63.pdb
2021-05-17 15:16:09,643 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:09,653 [MainThread  ] [

2021-05-17 15:16:15,341 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/2oza.pdb
2021-05-17 15:16:15,344 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:15,365 [MainThread  ] [INFO ]  Downloading: 1ouk from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb1ouk.ent
2021-05-17 15:16:15,768 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/1ouk.pdb
2021-05-17 15:16:15,771 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:15,789 [MainThread  ] [INFO ]  Downloading: 3mh1 from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3mh1.ent
2021-05-17 15:16:16,292 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3mh1.pdb
2021-05-17 15:16:16,295 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:16,303 [MainThread  ] [

2021-05-17 15:16:21,173 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/2zb0.pdb
2021-05-17 15:16:21,175 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:21,195 [MainThread  ] [INFO ]  Downloading: 4f9y from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4f9y.ent
2021-05-17 15:16:21,790 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4f9y.pdb
2021-05-17 15:16:21,794 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:21,806 [MainThread  ] [INFO ]  Downloading: 1lew from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb1lew.ent
2021-05-17 15:16:22,207 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/1lew.pdb
2021-05-17 15:16:22,211 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:22,228 [MainThread  ] [

2021-05-17 15:16:26,724 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/1wfc.pdb
2021-05-17 15:16:26,729 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:26,740 [MainThread  ] [INFO ]  Downloading: 3oc1 from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3oc1.ent
2021-05-17 15:16:27,210 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3oc1.pdb
2021-05-17 15:16:27,213 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:27,235 [MainThread  ] [INFO ]  Downloading: 4eh4 from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4eh4.ent
2021-05-17 15:16:27,782 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4eh4.pdb
2021-05-17 15:16:27,785 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:27,808 [MainThread  ] [

2021-05-17 15:16:32,673 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/1m7q.pdb
2021-05-17 15:16:32,677 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:32,697 [MainThread  ] [INFO ]  Downloading: 3itz from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3itz.ent
2021-05-17 15:16:33,154 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3itz.pdb
2021-05-17 15:16:33,157 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:33,177 [MainThread  ] [INFO ]  Downloading: 3zsi from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3zsi.ent
2021-05-17 15:16:33,723 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3zsi.pdb
2021-05-17 15:16:33,726 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:33,755 [MainThread  ] [

2021-05-17 15:16:38,603 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4a9y.pdb
2021-05-17 15:16:38,607 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:38,632 [MainThread  ] [INFO ]  Downloading: 5lar from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb5lar.ent
2021-05-17 15:16:39,173 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/5lar.pdb
2021-05-17 15:16:39,177 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:39,212 [MainThread  ] [INFO ]  Downloading: 4eh3 from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4eh3.ent
2021-05-17 15:16:39,642 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4eh3.pdb
2021-05-17 15:16:39,646 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:39,661 [MainThread  ] [

2021-05-17 15:16:44,621 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4kip.pdb
2021-05-17 15:16:44,624 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:44,649 [MainThread  ] [INFO ]  Downloading: 4loo from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4loo.ent
2021-05-17 15:16:45,192 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4loo.pdb
2021-05-17 15:16:45,196 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:45,230 [MainThread  ] [INFO ]  Downloading: 3fkn from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3fkn.ent
2021-05-17 15:16:45,685 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3fkn.pdb
2021-05-17 15:16:45,688 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:45,709 [MainThread  ] [

2021-05-17 15:16:51,474 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3zya.pdb
2021-05-17 15:16:51,477 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:51,505 [MainThread  ] [INFO ]  Downloading: 4l8m from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4l8m.ent
2021-05-17 15:16:51,949 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4l8m.pdb
2021-05-17 15:16:51,952 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:51,971 [MainThread  ] [INFO ]  Downloading: 4dli from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4dli.ent
2021-05-17 15:16:52,427 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4dli.pdb
2021-05-17 15:16:52,428 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:52,434 [MainThread  ] [

2021-05-17 15:16:58,325 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/5mtx.pdb
2021-05-17 15:16:58,329 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:58,346 [MainThread  ] [INFO ]  Downloading: 5n66 from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb5n66.ent
2021-05-17 15:16:58,867 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/5n66.pdb
2021-05-17 15:16:58,870 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:58,893 [MainThread  ] [INFO ]  Downloading: 2fso from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb2fso.ent
2021-05-17 15:16:59,321 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/2fso.pdb
2021-05-17 15:16:59,325 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:16:59,343 [MainThread  ] [

2021-05-17 15:17:04,222 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/1r3c.pdb
2021-05-17 15:17:04,225 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:04,244 [MainThread  ] [INFO ]  Downloading: 2puu from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb2puu.ent
2021-05-17 15:17:04,660 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/2puu.pdb
2021-05-17 15:17:04,663 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:04,680 [MainThread  ] [INFO ]  Downloading: 5etf from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb5etf.ent
2021-05-17 15:17:05,111 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/5etf.pdb
2021-05-17 15:17:05,115 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:05,133 [MainThread  ] [

2021-05-17 15:17:10,455 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3fmm.pdb
2021-05-17 15:17:10,460 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:10,484 [MainThread  ] [INFO ]  Downloading: 3mh3 from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3mh3.ent
2021-05-17 15:17:10,906 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3mh3.pdb
2021-05-17 15:17:10,911 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:10,932 [MainThread  ] [INFO ]  Downloading: 3nnx from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3nnx.ent
2021-05-17 15:17:11,381 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3nnx.pdb
2021-05-17 15:17:11,384 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:11,403 [MainThread  ] [

2021-05-17 15:17:15,949 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3fi4.pdb
2021-05-17 15:17:15,953 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:15,975 [MainThread  ] [INFO ]  Downloading: 3fkl from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3fkl.ent
2021-05-17 15:17:16,397 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3fkl.pdb
2021-05-17 15:17:16,398 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:16,406 [MainThread  ] [INFO ]  Downloading: 3gi3 from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3gi3.ent
2021-05-17 15:17:16,871 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3gi3.pdb
2021-05-17 15:17:16,872 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:16,880 [MainThread  ] [

2021-05-17 15:17:21,132 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3bv2.pdb
2021-05-17 15:17:21,135 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:21,158 [MainThread  ] [INFO ]  Downloading: 4aac from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4aac.ent
2021-05-17 15:17:21,693 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4aac.pdb
2021-05-17 15:17:21,696 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:21,720 [MainThread  ] [INFO ]  Downloading: 1wbn from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb1wbn.ent
2021-05-17 15:17:22,142 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/1wbn.pdb
2021-05-17 15:17:22,144 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:22,163 [MainThread  ] [

2021-05-17 15:17:26,671 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4e5a.pdb
2021-05-17 15:17:26,675 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:26,698 [MainThread  ] [INFO ]  Downloading: 3bx5 from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3bx5.ent
2021-05-17 15:17:27,225 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3bx5.pdb
2021-05-17 15:17:27,229 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:27,250 [MainThread  ] [INFO ]  Downloading: 3ocg from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3ocg.ent
2021-05-17 15:17:27,737 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3ocg.pdb
2021-05-17 15:17:27,740 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:27,757 [MainThread  ] [

2021-05-17 15:17:33,466 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/2gtn.pdb
2021-05-17 15:17:33,469 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:33,487 [MainThread  ] [INFO ]  Downloading: 4e5b from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4e5b.ent
2021-05-17 15:17:33,951 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4e5b.pdb
2021-05-17 15:17:33,955 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:33,971 [MainThread  ] [INFO ]  Downloading: 4ka3 from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4ka3.ent
2021-05-17 15:17:34,514 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4ka3.pdb
2021-05-17 15:17:34,517 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:34,542 [MainThread  ] [

2021-05-17 15:17:39,575 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3gcu.pdb
2021-05-17 15:17:39,578 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:39,596 [MainThread  ] [INFO ]  Downloading: 5mty from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb5mty.ent
2021-05-17 15:17:40,024 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/5mty.pdb
2021-05-17 15:17:40,028 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:40,049 [MainThread  ] [INFO ]  Downloading: 3ody from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3ody.ent
2021-05-17 15:17:40,534 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3ody.pdb
2021-05-17 15:17:40,542 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:40,572 [MainThread  ] [

2021-05-17 15:17:45,104 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/2yiw.pdb
2021-05-17 15:17:45,107 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:45,127 [MainThread  ] [INFO ]  Downloading: 4ehv from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4ehv.ent
2021-05-17 15:17:45,620 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4ehv.pdb
2021-05-17 15:17:45,623 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:45,647 [MainThread  ] [INFO ]  Downloading: 2zaz from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb2zaz.ent
2021-05-17 15:17:46,120 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/2zaz.pdb
2021-05-17 15:17:46,124 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:46,142 [MainThread  ] [

2021-05-17 15:17:51,556 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3s4q.pdb
2021-05-17 15:17:51,560 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:51,579 [MainThread  ] [INFO ]  Downloading: 3p7b from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3p7b.ent
2021-05-17 15:17:51,989 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3p7b.pdb
2021-05-17 15:17:51,992 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:52,010 [MainThread  ] [INFO ]  Downloading: 6anl from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb6anl.ent
2021-05-17 15:17:52,560 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/6anl.pdb
2021-05-17 15:17:52,564 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:52,593 [MainThread  ] [

2021-05-17 15:17:57,653 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/2gtm.pdb
2021-05-17 15:17:57,657 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:57,674 [MainThread  ] [INFO ]  Downloading: 3hrb from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3hrb.ent
2021-05-17 15:17:58,158 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3hrb.pdb
2021-05-17 15:17:58,161 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:58,185 [MainThread  ] [INFO ]  Downloading: 3lfa from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3lfa.ent
2021-05-17 15:17:58,587 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3lfa.pdb
2021-05-17 15:17:58,590 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:17:58,607 [MainThread  ] [

2021-05-17 15:18:03,571 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3e93.pdb
2021-05-17 15:18:03,575 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:18:03,593 [MainThread  ] [INFO ]  Downloading: 3zsh from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3zsh.ent
2021-05-17 15:18:04,104 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3zsh.pdb
2021-05-17 15:18:04,106 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:18:04,132 [MainThread  ] [INFO ]  Downloading: 4zth from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4zth.ent
2021-05-17 15:18:04,674 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/4zth.pdb
2021-05-17 15:18:04,678 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:18:04,693 [MainThread  ] [

2021-05-17 15:18:10,127 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/5omh.pdb
2021-05-17 15:18:10,130 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:18:10,148 [MainThread  ] [INFO ]  Downloading: 1wbt from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb1wbt.ent
2021-05-17 15:18:10,567 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/1wbt.pdb
2021-05-17 15:18:10,570 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:18:10,595 [MainThread  ] [INFO ]  Downloading: 3l8s from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb3l8s.ent
2021-05-17 15:18:11,137 [MainThread  ] [INFO ]  Writting pdb to: ef39c5c6-6878-49ff-8a15-b88b851253bd/3l8s.pdb
2021-05-17 15:18:11,140 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'HETATM']
2021-05-17 15:18:11,156 [MainThread  ] [

2021-05-17 15:18:14,293 [MainThread  ] [INFO ]  to: /home/gbayarri_local/projects/BioBB/tutorials/biobb_wf_virtual-screening/biobb_wf_virtual-screening/notebooks/clusterBindingSite/pdb_cluster.zip
2021-05-17 15:18:14,322 [MainThread  ] [INFO ]  Removed temporary folder: ef39c5c6-6878-49ff-8a15-b88b851253bd


0

<a id="bindingsite"></a>
### Extracting the Cluster90 Binding Site(s)  
Extracting the protein binding site(s) from the **Cluster90** collection. There are two ways in which the **Cluster90 Binding Site** can be used:

- **With information about the ligand**: In this particular example we already know the **ligand** we want to dock, and also that the **PDB database** contains an **experimental structure** in which the **ligand** was solved in **complex** with our protein of interest (**p38-α MAP kinase**). In that case, the **ligand id** can be used to guide the **binding site** tool. <br><br>
- **Without information about the ligand**: If the docking study starts from a **protein receptor** with no information about the **binding site** or any known **protein-ligand complex**, the **binding site** tool cannot be guided. Still, the tool should be able to extract **binding site(s)** information from **similar proteins** solved with **small molecules** attached (if any).<br><br>

This example uses the **ligand id** to guide the **binding site tool**, but we invite you to explore how the output changes when this **input information** is removed (the `ligand` property in the next building block).

The **Cluster90 binding site** tool internally runs **sequence alignments** to structurally **superpose** the structures in the **Cluster90 collection** and extract the **residue numbering** corresponding to the **binding site** residues. The **sequence alignment parameters** can therefore be changed through the building block input properties. In this example, we have chosen the well-known **BLOSUM62** substitution matrix, with a **penalty** of -10.0 for opening a gap and a **penalty** of -0.5 for extending one. The total number of **superimposed ligands** to be extracted from the cluster is limited to 15, and the **cut-off distance** around the ligand atoms within which a residue is considered part of a **binding site** is fixed at 5 Ångstroms.
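
To make the **cut-off distance** criterion concrete, the following is a minimal sketch of how residues within a given radius of any ligand atom could be selected. It is not the building block's implementation: the function name, the `(residue_number, coordinates)` layout and the toy coordinates are assumptions made for illustration only.

```python
import math

def residues_within_cutoff(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return sorted residue numbers with at least one atom within
    `cutoff` Angstroms of any ligand atom.

    protein_atoms: list of (residue_number, (x, y, z)) tuples (hypothetical layout)
    ligand_atoms:  list of (x, y, z) tuples
    """
    selected = set()
    for resnum, coord in protein_atoms:
        if resnum in selected:
            continue  # residue already accepted through another atom
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            selected.add(resnum)
    return sorted(selected)

# Toy coordinates, not taken from any PDB entry
protein = [
    (10, (0.0, 0.0, 0.0)),
    (11, (4.0, 0.0, 0.0)),
    (12, (9.0, 0.0, 0.0)),
    (12, (20.0, 0.0, 0.0)),
]
ligand = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]

site = residues_within_cutoff(protein, ligand, cutoff=5.0)
print(site)
```

A larger `cutoff` pulls in more residues and therefore yields a larger binding site, which in turn enlarges any box built around it.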

In [7]:
from biobb_vs.utils.bindingsite import bindingsite

output_bindingsite = "bindingsite.pdb"
prop = {
    "ligand": ligand_code,
    "matrix_name": "BLOSUM62",
    "gap_open": -10.0,
    "gap_extend": -0.5,
    "max_num_ligands": 15,
    "radius": 5
}

bindingsite(input_pdb_path = pdb_protein,
            input_clusters_zip = pdb_cluster,
            output_pdb_path = output_bindingsite,
            properties=prop)

2021-05-17 15:18:30,064 [MainThread  ] [INFO ]  Loading input PDB structure pdb_protein.pdb
2021-05-17 15:18:30,142 [MainThread  ] [INFO ]  Found 329 residues in pdb_protein.pdb
2021-05-17 15:18:30,144 [MainThread  ] [INFO ]  Creating 224c5ac9-25fc-4b30-a3b5-de67eeae6bef temporary folder
2021-05-17 15:18:30,429 [MainThread  ] [INFO ]  Extracting: /home/gbayarri_local/projects/BioBB/tutorials/biobb_wf_virtual-screening/biobb_wf_virtual-screening/notebooks/clusterBindingSite/pdb_cluster.zip
2021-05-17 15:18:30,430 [MainThread  ] [INFO ]  to:
2021-05-17 15:18:30,431 [MainThread  ] [INFO ]  ['224c5ac9-25fc-4b30-a3b5-de67eeae6bef/1a9u.pdb', '224c5ac9-25fc-4b30-a3b5-de67eeae6bef/1bl6.pdb', '224c5ac9-25fc-4b30-a3b5-de67eeae6bef/1bl7.pdb', '224c5ac9-25fc-4b30-a3b5-de67eeae6bef/1bmk.pdb', '224c5ac9-25fc-4b30-a3b5-de67eeae6bef/1di9.pdb', '224c5ac9-25fc-4b30-a3b5-de67eeae6bef/1ian.pdb', '224c5ac9-25fc-4b30-a3b5-de67eeae6bef/1kv1.pdb', '224c5ac9-25fc-4b30-a3b5-de67eeae6bef/1kv2.pdb', '224c5ac9-25f

2021-05-17 15:18:30,432 [MainThread  ] [INFO ]  Iterating on all clusters:
2021-05-17 15:18:30,433 [MainThread  ] [INFO ]   
2021-05-17 15:18:30,435 [MainThread  ] [INFO ]  ------------ Iteration #1 --------------
2021-05-17 15:18:30,436 [MainThread  ] [INFO ]  Cluster member: 1a9u
2021-05-17 15:18:30,507 [MainThread  ] [INFO ]  Ligand STI not found in 1a9u cluster member, skipping this cluster
2021-05-17 15:18:30,512 [MainThread  ] [INFO ]   
2021-05-17 15:18:30,513 [MainThread  ] [INFO ]  ------------ Iteration #2 --------------
2021-05-17 15:18:30,514 [MainThread  ] [INFO ]  Cluster member: 1bl6
2021-05-17 15:18:30,572 [MainThread  ] [INFO ]  Ligand STI not found in 1bl6 cluster member, skipping this cluster
2021-05-17 15:18:30,573 [MainThread  ] [INFO ]   
2021-05-17 15:18:30,574 [MainThread  ] [INFO ]  ------------ Iteration #3 --------------
2021-05-17 15:18:30,575 [MainThread  ] [INFO ]  Cluster member: 1bl7
2021-05-17 15:18:30,618 [MainThread  ] [INFO ]  Ligand STI not found in

2021-05-17 15:18:32,675 [MainThread  ] [INFO ]   
2021-05-17 15:18:32,676 [MainThread  ] [INFO ]  ------------ Iteration #26 --------------
2021-05-17 15:18:32,677 [MainThread  ] [INFO ]  Cluster member: 1wbv
2021-05-17 15:18:32,799 [MainThread  ] [INFO ]  Ligand STI not found in 1wbv cluster member, skipping this cluster
2021-05-17 15:18:32,800 [MainThread  ] [INFO ]   
2021-05-17 15:18:32,801 [MainThread  ] [INFO ]  ------------ Iteration #27 --------------
2021-05-17 15:18:32,801 [MainThread  ] [INFO ]  Cluster member: 1wbw
2021-05-17 15:18:32,850 [MainThread  ] [INFO ]  Ligand STI not found in 1wbw cluster member, skipping this cluster
2021-05-17 15:18:32,851 [MainThread  ] [INFO ]   
2021-05-17 15:18:32,853 [MainThread  ] [INFO ]  ------------ Iteration #28 --------------
2021-05-17 15:18:32,854 [MainThread  ] [INFO ]  Cluster member: 1wfc
2021-05-17 15:18:32,921 [MainThread  ] [INFO ]  No ligands found that could guide the binding site search. Ignoring this member: 1wfc
2021-05-1

2021-05-17 15:18:34,836 [MainThread  ] [INFO ]  Cluster member: 2npq
2021-05-17 15:18:34,884 [MainThread  ] [INFO ]  Ligand STI not found in 2npq cluster member, skipping this cluster
2021-05-17 15:18:34,885 [MainThread  ] [INFO ]   
2021-05-17 15:18:34,886 [MainThread  ] [INFO ]  ------------ Iteration #52 --------------
2021-05-17 15:18:34,887 [MainThread  ] [INFO ]  Cluster member: 2okr
2021-05-17 15:18:35,031 [MainThread  ] [INFO ]  No ligands found that could guide the binding site search. Ignoring this member: 2okr
2021-05-17 15:18:35,032 [MainThread  ] [INFO ]   
2021-05-17 15:18:35,033 [MainThread  ] [INFO ]  ------------ Iteration #53 --------------
2021-05-17 15:18:35,033 [MainThread  ] [INFO ]  Cluster member: 2onl
2021-05-17 15:18:35,213 [MainThread  ] [INFO ]  No ligands found that could guide the binding site search. Ignoring this member: 2onl
2021-05-17 15:18:35,214 [MainThread  ] [INFO ]   
2021-05-17 15:18:35,215 [MainThread  ] [INFO ]  ------------ Iteration #54 -----

2021-05-17 15:18:36,872 [MainThread  ] [INFO ]  Ligand STI not found in 3e93 cluster member, skipping this cluster
2021-05-17 15:18:36,873 [MainThread  ] [INFO ]   
2021-05-17 15:18:36,874 [MainThread  ] [INFO ]  ------------ Iteration #77 --------------
2021-05-17 15:18:36,874 [MainThread  ] [INFO ]  Cluster member: 3fc1
2021-05-17 15:18:36,919 [MainThread  ] [INFO ]  Ligand STI not found in 3fc1 cluster member, skipping this cluster
2021-05-17 15:18:36,920 [MainThread  ] [INFO ]   
2021-05-17 15:18:36,920 [MainThread  ] [INFO ]  ------------ Iteration #78 --------------
2021-05-17 15:18:36,921 [MainThread  ] [INFO ]  Cluster member: 3fi4
2021-05-17 15:18:36,966 [MainThread  ] [INFO ]  Ligand STI not found in 3fi4 cluster member, skipping this cluster
2021-05-17 15:18:36,967 [MainThread  ] [INFO ]   
2021-05-17 15:18:36,968 [MainThread  ] [INFO ]  ------------ Iteration #79 --------------
2021-05-17 15:18:36,968 [MainThread  ] [INFO ]  Cluster member: 3fkl
2021-05-17 15:18:37,071 [Mai

2021-05-17 15:18:38,469 [MainThread  ] [INFO ]   
2021-05-17 15:18:38,470 [MainThread  ] [INFO ]  ------------ Iteration #102 --------------
2021-05-17 15:18:38,471 [MainThread  ] [INFO ]  Cluster member: 3gcv
2021-05-17 15:18:38,512 [MainThread  ] [INFO ]  Ligand STI not found in 3gcv cluster member, skipping this cluster
2021-05-17 15:18:38,513 [MainThread  ] [INFO ]   
2021-05-17 15:18:38,513 [MainThread  ] [INFO ]  ------------ Iteration #103 --------------
2021-05-17 15:18:38,514 [MainThread  ] [INFO ]  Cluster member: 3gfe
2021-05-17 15:18:38,608 [MainThread  ] [INFO ]  Ligand STI not found in 3gfe cluster member, skipping this cluster
2021-05-17 15:18:38,609 [MainThread  ] [INFO ]   
2021-05-17 15:18:38,617 [MainThread  ] [INFO ]  ------------ Iteration #104 --------------
2021-05-17 15:18:38,618 [MainThread  ] [INFO ]  Cluster member: 3gi3
2021-05-17 15:18:38,672 [MainThread  ] [INFO ]  Ligand STI not found in 3gi3 cluster member, skipping this cluster
2021-05-17 15:18:38,673 [

2021-05-17 15:18:40,003 [MainThread  ] [INFO ]  Cluster member: 3iw6
2021-05-17 15:18:40,113 [MainThread  ] [INFO ]  Ligand STI not found in 3iw6 cluster member, skipping this cluster
2021-05-17 15:18:40,114 [MainThread  ] [INFO ]   
2021-05-17 15:18:40,115 [MainThread  ] [INFO ]  ------------ Iteration #125 --------------
2021-05-17 15:18:40,115 [MainThread  ] [INFO ]  Cluster member: 3iw7
2021-05-17 15:18:40,156 [MainThread  ] [INFO ]  Ligand STI not found in 3iw7 cluster member, skipping this cluster
2021-05-17 15:18:40,157 [MainThread  ] [INFO ]   
2021-05-17 15:18:40,157 [MainThread  ] [INFO ]  ------------ Iteration #126 --------------
2021-05-17 15:18:40,158 [MainThread  ] [INFO ]  Cluster member: 3iw8
2021-05-17 15:18:40,205 [MainThread  ] [INFO ]  Ligand STI not found in 3iw8 cluster member, skipping this cluster
2021-05-17 15:18:40,206 [MainThread  ] [INFO ]   
2021-05-17 15:18:40,206 [MainThread  ] [INFO ]  ------------ Iteration #127 --------------
2021-05-17 15:18:40,207 [

2021-05-17 15:18:41,774 [MainThread  ] [INFO ]  Ligand STI not found in 3mw1 cluster member, skipping this cluster
2021-05-17 15:18:41,775 [MainThread  ] [INFO ]   
2021-05-17 15:18:41,776 [MainThread  ] [INFO ]  ------------ Iteration #150 --------------
2021-05-17 15:18:41,776 [MainThread  ] [INFO ]  Cluster member: 3new
2021-05-17 15:18:41,821 [MainThread  ] [INFO ]  Ligand STI not found in 3new cluster member, skipping this cluster
2021-05-17 15:18:41,821 [MainThread  ] [INFO ]   
2021-05-17 15:18:41,822 [MainThread  ] [INFO ]  ------------ Iteration #151 --------------
2021-05-17 15:18:41,823 [MainThread  ] [INFO ]  Cluster member: 3nnu
2021-05-17 15:18:41,870 [MainThread  ] [INFO ]  Ligand STI not found in 3nnu cluster member, skipping this cluster
2021-05-17 15:18:41,870 [MainThread  ] [INFO ]   
2021-05-17 15:18:41,871 [MainThread  ] [INFO ]  ------------ Iteration #152 --------------
2021-05-17 15:18:41,872 [MainThread  ] [INFO ]  Cluster member: 3nnv
2021-05-17 15:18:41,921 [

2021-05-17 15:18:43,354 [MainThread  ] [INFO ]   
2021-05-17 15:18:43,354 [MainThread  ] [INFO ]  ------------ Iteration #175 --------------
2021-05-17 15:18:43,355 [MainThread  ] [INFO ]  Cluster member: 3py3
2021-05-17 15:18:43,403 [MainThread  ] [INFO ]  No ligands found that could guide the binding site search. Ignoring this member: 3py3
2021-05-17 15:18:43,404 [MainThread  ] [INFO ]   
2021-05-17 15:18:43,405 [MainThread  ] [INFO ]  ------------ Iteration #176 --------------
2021-05-17 15:18:43,406 [MainThread  ] [INFO ]  Cluster member: 3qud
2021-05-17 15:18:43,513 [MainThread  ] [INFO ]  Ligand STI not found in 3qud cluster member, skipping this cluster
2021-05-17 15:18:43,514 [MainThread  ] [INFO ]   
2021-05-17 15:18:43,514 [MainThread  ] [INFO ]  ------------ Iteration #177 --------------
2021-05-17 15:18:43,515 [MainThread  ] [INFO ]  Cluster member: 3que
2021-05-17 15:18:43,569 [MainThread  ] [INFO ]  Ligand STI not found in 3que cluster member, skipping this cluster
2021-0

2021-05-17 15:18:44,956 [MainThread  ] [INFO ]  ------------ Iteration #200 --------------
2021-05-17 15:18:44,956 [MainThread  ] [INFO ]  Cluster member: 4e5b
2021-05-17 15:18:44,999 [MainThread  ] [INFO ]  No ligands found that could guide the binding site search. Ignoring this member: 4e5b
2021-05-17 15:18:45,000 [MainThread  ] [INFO ]   
2021-05-17 15:18:45,001 [MainThread  ] [INFO ]  ------------ Iteration #201 --------------
2021-05-17 15:18:45,004 [MainThread  ] [INFO ]  Cluster member: 4e6a
2021-05-17 15:18:45,056 [MainThread  ] [INFO ]  Ligand STI not found in 4e6a cluster member, skipping this cluster
2021-05-17 15:18:45,057 [MainThread  ] [INFO ]   
2021-05-17 15:18:45,058 [MainThread  ] [INFO ]  ------------ Iteration #202 --------------
2021-05-17 15:18:45,058 [MainThread  ] [INFO ]  Cluster member: 4e6c
2021-05-17 15:18:45,097 [MainThread  ] [INFO ]  Ligand STI not found in 4e6c cluster member, skipping this cluster
2021-05-17 15:18:45,098 [MainThread  ] [INFO ]   
2021-0

2021-05-17 15:18:46,961 [MainThread  ] [INFO ]  ------------ Iteration #225 --------------
2021-05-17 15:18:46,961 [MainThread  ] [INFO ]  Cluster member: 4loq
2021-05-17 15:18:47,185 [MainThread  ] [INFO ]  No ligands found that could guide the binding site search. Ignoring this member: 4loq
2021-05-17 15:18:47,186 [MainThread  ] [INFO ]   
2021-05-17 15:18:47,187 [MainThread  ] [INFO ]  ------------ Iteration #226 --------------
2021-05-17 15:18:47,188 [MainThread  ] [INFO ]  Cluster member: 4r3c
2021-05-17 15:18:47,233 [MainThread  ] [INFO ]  Ligand STI not found in 4r3c cluster member, skipping this cluster
2021-05-17 15:18:47,234 [MainThread  ] [INFO ]   
2021-05-17 15:18:47,235 [MainThread  ] [INFO ]  ------------ Iteration #227 --------------
2021-05-17 15:18:47,235 [MainThread  ] [INFO ]  Cluster member: 4tyh
2021-05-17 15:18:47,316 [MainThread  ] [INFO ]  Ligand STI not found in 4tyh cluster member, skipping this cluster
2021-05-17 15:18:47,317 [MainThread  ] [INFO ]   
2021-0

2021-05-17 15:18:49,158 [MainThread  ] [INFO ]  ------------ Iteration #250 --------------
2021-05-17 15:18:49,158 [MainThread  ] [INFO ]  Cluster member: 5tbe
2021-05-17 15:18:49,203 [MainThread  ] [INFO ]  Ligand STI not found in 5tbe cluster member, skipping this cluster
2021-05-17 15:18:49,204 [MainThread  ] [INFO ]   
2021-05-17 15:18:49,205 [MainThread  ] [INFO ]  ------------ Iteration #251 --------------
2021-05-17 15:18:49,205 [MainThread  ] [INFO ]  Cluster member: 5tco
2021-05-17 15:18:49,248 [MainThread  ] [INFO ]  Ligand STI not found in 5tco cluster member, skipping this cluster
2021-05-17 15:18:49,249 [MainThread  ] [INFO ]   
2021-05-17 15:18:49,249 [MainThread  ] [INFO ]  ------------ Iteration #252 --------------
2021-05-17 15:18:49,250 [MainThread  ] [INFO ]  Cluster member: 5uoj
2021-05-17 15:18:49,297 [MainThread  ] [INFO ]  No ligands found that could guide the binding site search. Ignoring this member: 5uoj
2021-05-17 15:18:49,297 [MainThread  ] [INFO ]   
2021-0

0

<a id="viewPockets"></a>
### Visualizing selected pockets (cavities)
Visualizing the selected **pockets** (cavities) from the generated list using **NGL viewer**.<br>

**Protein residues** forming the **cavity** are shown in **licorice** representation. **Pockets** are represented as a **greyish surface**. The **original ligand** (if present) is shown in **green-colored ball and stick** representation. 

In [25]:
view = nglview.show_structure_file(download_pdb, default=False)

# Original ligand (Imatinib, STI) as green ball and stick
view[0].add_representation(repr_type='ball+stick',
                           selection='STI',
                           aspect_ratio=4,
                           color='green')

# Protein as a faint grey cartoon
view[0].add_representation(repr_type='cartoon',
                           selection='not het',
                           opacity=.2,
                           color='#cccccc')

# Binding site residues: translucent surface plus licorice
view.add_component(output_bindingsite, default=False)
view[1].add_representation(repr_type='surface',
                           selection='*',
                           opacity=.3,
                           radius='1.5',
                           lowResolution=True,
                           smooth=1,
                           useWorker=True,
                           wrap=True)
view[1].add_representation(repr_type='licorice',
                           selection='*')

view[0].center()
view._remote_call('setSize', target='Widget', args=['','600px'])

view.render_image()
view.download_image(filename='ngl3.png')
view

NGLWidget()

<img src='ngl3.png'></img>

<a id="cavityBox"></a>
***
## Generating Cavity Box 
Generating a **box** surrounding the selected **protein cavity** (pocket), to be used in the **docking procedure**. The **box** defines the region on the **surface** of the **protein target** where the **docking program** should search for possible **ligand poses**.<br>
An offset of **12 Angstroms** is used to generate a **box big enough** to fit the **small molecule** and its possible rotations.<br>
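
As a rough illustration of the idea, the sketch below builds an axis-aligned box around a set of coordinates and pads it with an offset. It rests on assumptions: the center is taken as the midpoint of the coordinate extremes and the size is stored as a half-length per axis; the building block may compute both differently. The half-length reading is only an inference from the log printed by the cell below, where the reported volume (122738 cubic Angstroms) is close to 8 times the product of the three reported sizes.

```python
def box_from_coords(coords, offset=12.0):
    """Return (center, half_size) of an axis-aligned box around `coords`.

    Assumption: center is the midpoint of the min/max extremes and
    half_size is half the extent along each axis plus `offset` (Angstroms).
    """
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    half_size = tuple((hi - lo) / 2 + offset for lo, hi in zip(mins, maxs))
    return center, half_size

def box_volume(half_size):
    """Volume of a box whose sides are twice the given half-lengths."""
    sx, sy, sz = half_size
    return 8 * sx * sy * sz

# Toy coordinates, not taken from the tutorial structure
coords = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)]
center, half_size = box_from_coords(coords, offset=1.0)
volume = box_volume(half_size)

# Consistency check against the sizes printed in the tutorial log
logged_volume = box_volume((23.958, 25.097, 25.515))
print(center, half_size, volume, round(logged_volume))
```

The offset is added on every side, so a 12 Angstrom offset lengthens each box edge by 24 Angstroms under this half-length convention.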

***
**Building Blocks** used:
 - [box](https://biobb-vs.readthedocs.io/en/latest/utils.html#module-utils.box) from **biobb_vs.utils.box**
***

In [9]:
from biobb_vs.utils.box import box

output_box = "box.pdb"
prop = {
    "offset": 12,
    "box_coordinates": True
}

box(input_pdb_path = output_bindingsite,
            output_pdb_path = output_box,
            properties=prop)

2021-05-17 15:19:10,748 [MainThread  ] [INFO ]  Loading residue PDB selection from bindingsite.pdb
2021-05-17 15:19:10,750 [MainThread  ] [INFO ]  Binding site center (Angstroms):     16.790    -4.357   -20.224
2021-05-17 15:19:10,751 [MainThread  ] [INFO ]  Adding 12.0 Angstroms offset
2021-05-17 15:19:10,753 [MainThread  ] [INFO ]  Binding site size (Angstroms):       23.958    25.097    25.515
2021-05-17 15:19:10,754 [MainThread  ] [INFO ]  Volume (cubic Angstroms): 122738
2021-05-17 15:19:10,755 [MainThread  ] [INFO ]  Adding box coordinates
2021-05-17 15:19:10,756 [MainThread  ] [INFO ]  Saving output PDB file (with box setting annotations): box.pdb


0

<a id="vis3D"></a>
### Visualizing binding site box in 3D structure
Visualizing the **protein structure**, the **selected cavity**, and the **generated box**, all together using **NGL** viewer. The **original structure** with the **small ligand** inside (Imatinib, [STI](https://www.rcsb.org/ligand/STI)) is used to check that the **selected cavity** lies in the **same region** as the **original ligand**. 

In [26]:
#view = nglview.show_structure_file(box, default=False)
view = nglview.NGLWidget()
#s = view.add_component(pdb_single_chain)
s = view.add_component(download_pdb)
b = view.add_component(output_box)
s = view.add_component(output_bindingsite)

atomPair = [
    [ "9999:Z.ZN1", "9999:Z.ZN2" ],
    [ "9999:Z.ZN2", "9999:Z.ZN4" ],
    [ "9999:Z.ZN4", "9999:Z.ZN3" ],
    [ "9999:Z.ZN3", "9999:Z.ZN1" ],
    
    [ "9999:Z.ZN5", "9999:Z.ZN6" ],
    [ "9999:Z.ZN6", "9999:Z.ZN8" ],
    [ "9999:Z.ZN8", "9999:Z.ZN7" ],
    [ "9999:Z.ZN7", "9999:Z.ZN5" ],
    
    [ "9999:Z.ZN1", "9999:Z.ZN5" ],
    [ "9999:Z.ZN2", "9999:Z.ZN6" ],
    [ "9999:Z.ZN3", "9999:Z.ZN7" ],
    [ "9999:Z.ZN4", "9999:Z.ZN8" ]
]

#view.shape.add_cylinder( [ 0, 2, 7 ], [ 10, 0, 9 ], [ 1, 0, 0 ], 0.1 )

# structure
s.add_representation(repr_type='cartoon', 
                        selection='not het',
                        color='#cccccc',
                       opacity=.2)
# ligands box
b.add_representation(repr_type='ball+stick', 
                        selection='9999',
                        color='pink',
                       aspectRatio = 10)
# lines box
b.add_representation(repr_type='distance', 
                        atomPair= atomPair,
                       labelColor= 'transparent',
                       color= 'black')

# output bindingsite
s.add_representation(repr_type='surface', 
                        selection='*', 
                        color='skyblue',
                        lowResolution= True,
                        # 0: low resolution 
                        smooth=1,
                        surfaceType= 'av', 
                        contour=True,
                        opacity=0.4,
                        useWorker= True,
                        wrap= True)


view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])

view.render_image()
view.download_image(filename='ngl4.png')
view

NGLWidget()

<img src='ngl4.png'></img>

<a id="downloadSmallMolecule"></a>
***
## Downloading Small Molecule 
Downloading the desired **small molecule** to be used in the **docking procedure**. <br>
In this particular example, the small molecule of interest is the FDA-approved drug **Imatinib**, with PDB ligand code **STI**.<br>

***
**Building Blocks** used:
 - [ideal_sdf](https://biobb-io.readthedocs.io/en/latest/api.html#module-api.ideal_sdf) from **biobb_io.api.ideal_sdf**
***

In [11]:
from biobb_io.api.ideal_sdf import ideal_sdf

sdf_ideal = "ideal.sdf"
prop = {
  "ligand_code": ligand_code
}

ideal_sdf(output_sdf_path=sdf_ideal,
    properties=prop)

2021-05-17 15:19:27,272 [MainThread  ] [INFO ]  Downloading: STI from: ftp://ftp.ebi.ac.uk/pub/databases/msd/pdbechem/files/sdf/STI.sdf
2021-05-17 15:19:27,275 [MainThread  ] [INFO ]  Writting sdf to: ideal.sdf


0

<a id="sdf2pdb"></a>
***
## Converting Small Molecule 
Converting the desired **small molecule** to be used in the **docking procedure**, from **SDF** format to **PDB** format using the **OpenBabel chemoinformatics** tool. <br>

***
**Building Blocks** used:
 - [babel_convert](https://biobb-chemistry.readthedocs.io/en/latest/babelm.html#module-babelm.babel_convert) from **biobb_chemistry.babelm.babel_convert**
***

In [12]:
from biobb_chemistry.babelm.babel_convert import babel_convert

ligand = "ligand.pdb"
prop = {
    "input_format": "sdf",
    "output_format": "pdb",
    "binary_path": "obabel"
}

babel_convert(input_path = sdf_ideal,
            output_path = ligand,
            properties=prop)

2021-05-17 15:19:29,300 [MainThread  ] [INFO ]  Value  is not compatible as a coordinates value
2021-05-17 15:19:29,302 [MainThread  ] [INFO ]  Not using any container
2021-05-17 15:19:29,419 [MainThread  ] [INFO ]  obabel -isdf ideal.sdf -opdb -Oligand.pdb  

2021-05-17 15:19:29,420 [MainThread  ] [INFO ]  Exit code 0

2021-05-17 15:19:29,421 [MainThread  ] [INFO ]  1 molecule converted



0

<a id="ligand_pdb2pdbqt"></a>
***
## Preparing Small Molecule (ligand) for Docking
Preparing the **small molecule** structure for the **docking procedure**. Converting the **PDB file** to a **PDBQT file** format (AutoDock PDBQT: Protein Data Bank, with Partial Charges (Q), & Atom Types (T)), needed by **AutoDock Vina**. <br><br>
The process adds **partial charges** and **atom types** to every atom. The file also encodes the **ligand flexibility**, using a **"torsion tree"** to represent the **rigid and rotatable** pieces of the **ligand**. A rigid piece (**"root"**) is defined, with zero or more rotatable pieces (**"branches"**) hanging from the root and defining the **rotatable bonds**.<br><br>
More info about **PDBQT file format** can be found in the [AutoDock FAQ pages](http://autodock.scripps.edu/faqs-help/faq/what-is-the-format-of-a-pdbqt-file).
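
The sketch below reads a tiny hand-written PDBQT fragment to show where the **partial charge**, the **atom type** and the **torsion tree** records appear. The fragment and its values are invented for illustration, and splitting lines on whitespace is a simplifying assumption: real PDBQT files are fixed-column, so a robust parser should slice columns instead.

```python
# Hand-written fragment with invented coordinates and charges
PDBQT_EXAMPLE = """\
ROOT
ATOM      1  C1  LIG A   1       0.000   0.000   0.000  0.00  0.00     0.100 C
ATOM      2  O1  LIG A   1       1.200   0.000   0.000  0.00  0.00    -0.300 OA
ENDROOT
BRANCH   1   3
ATOM      3  N1  LIG A   1      -1.300   0.500   0.000  0.00  0.00    -0.200 N
ATOM      4  H1  LIG A   1      -1.900   1.200   0.000  0.00  0.00     0.150 HD
ENDBRANCH   1   3
TORSDOF 1
"""

def summarize_pdbqt(text):
    """Return ([(atom_name, partial_charge, atom_type), ...], branch_count)."""
    atoms = []
    branches = 0
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("ATOM", "HETATM"):
            # Last two fields: partial charge (Q) and AutoDock atom type (T)
            atoms.append((fields[2], float(fields[-2]), fields[-1]))
        elif fields[0] == "BRANCH":
            # Each BRANCH record opens one rotatable piece of the torsion tree
            branches += 1
    return atoms, branches

atoms, branches = summarize_pdbqt(PDBQT_EXAMPLE)
for name, charge, atom_type in atoms:
    print(name, charge, atom_type)
print("rotatable branches:", branches)
```

In this fragment `OA` marks a hydrogen-bond acceptor oxygen and `HD` a polar (donor) hydrogen, which is how the atom types carry the information the scoring function needs.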

***
**Building Blocks** used:
 - [babel_convert](https://biobb-chemistry.readthedocs.io/en/latest/babelm.html#module-babelm.babel_convert) from **biobb_chemistry.babelm.babel_convert**
***

In [13]:
from biobb_chemistry.babelm.babel_convert import babel_convert

prep_ligand = "prep_ligand.pdbqt"
prop = {
    "input_format": "pdb",
    "output_format": "pdbqt",
    "binary_path": "obabel"
}

babel_convert(input_path = ligand,
            output_path = prep_ligand,
            properties=prop)

2021-05-17 15:19:31,878 [MainThread  ] [INFO ]  Value  is not compatible as a coordinates value
2021-05-17 15:19:31,879 [MainThread  ] [INFO ]  Not using any container
2021-05-17 15:19:31,951 [MainThread  ] [INFO ]  obabel -ipdb ligand.pdb -opdbqt -Oprep_ligand.pdbqt  

2021-05-17 15:19:31,953 [MainThread  ] [INFO ]  Exit code 0

2021-05-17 15:19:31,953 [MainThread  ] [INFO ]  1 molecule converted



0

<a id="viewDrug"></a>
### Visualizing small molecule (drug)
Visualizing the desired **drug** to be docked to the **target protein**, using **NGL viewer**.<br>
- **Left panel**: **PDB-formatted** file, with all hydrogen atoms.
- **Right panel**: **PDBQT-formatted** file (AutoDock Vina-compatible), with a **united-atom model** (only polar hydrogens are placed in the structure, so that heavy atoms are correctly typed as hydrogen bond donors).


In [14]:
from ipywidgets import HBox

v0 = nglview.show_structure_file(ligand)
v1 = nglview.show_structure_file(prep_ligand)

v0._set_size('500px', '')
v1._set_size('500px', '')

def on_change(change):
    v1._set_camera_orientation(change['new'])
    
v0.observe(on_change, ['_camera_orientation'])

HBox([v0, v1])

HBox(children=(NGLWidget(), NGLWidget()))

<img src='ngl5.png'></img>

<a id="protein_pdb2pdbqt"></a>
***
## Preparing Target Protein for Docking
Preparing the **target protein** structure for the **docking procedure**. Converting the **PDB file** to a **PDBQT file**, needed by **AutoDock Vina**. As in the previous step, the process adds **partial charges** and **atom types** to every target protein atom. In this case, however, we are not taking **receptor flexibility** into account, although **AutoDock Vina** allows some limited flexibility of selected **receptor side chains** [(see the documentation)](https://autodock-vina.readthedocs.io/en/latest/docking_flexible.html).<br>

***
**Building Blocks** used:
 - [str_check_add_hydrogens](https://biobb-structure-utils.readthedocs.io/en/latest/utils.html#utils-str-check-add-hydrogens-module) from **biobb_structure_utils.utils.str_check_add_hydrogens**
***

In [15]:
from biobb_structure_utils.utils.str_check_add_hydrogens import str_check_add_hydrogens

prep_receptor = "prep_receptor.pdbqt"
prop = {
    "charges": True,
    "mode": "auto"
}

str_check_add_hydrogens(input_structure_path = pdb_protein,
            output_structure_path = prep_receptor,
            properties=prop)

2021-05-17 15:19:44,142 [MainThread  ] [INFO ]  check_structure -i /home/gbayarri_local/projects/BioBB/tutorials/biobb_wf_virtual-screening/biobb_wf_virtual-screening/notebooks/clusterBindingSite/pdb_protein.pdb -o prep_receptor.pdbqt --force_save add_hydrogen --add_charges --add_mode auto

2021-05-17 15:19:44,143 [MainThread  ] [INFO ]  Exit code 0

=                   BioBB structure checking utility v3.7.2                   =
=                 A. Hospital, P. Andrio, J.L. Gelpi 2018-20                  =

Structure /home/gbayarri_local/projects/BioBB/tutorials/biobb_wf_virtual-screening/biobb_wf_virtual-screening/notebooks/clusterBindingSite/pdb_protein.pdb loaded
 Title: 
 Experimental method: unknown
 Resolution: None A

 Num. models: 1
 Num. chains: 1 (A: Protein)
 Num. residues:  329
 Num. residues with ins. codes:  0
 Num. HETATM residues:  0
 Num. ligands or modified residues:  0
 Num. water mol.:  0
 Num. atoms:  2668

Running add_hydrogen. Options: --add_charges --add_mode a

0

<a id="docking"></a>
***
## Running the Docking
Running the **docking process** with the prepared files:
- **ligand**
- **target protein**
- **binding site box**<br>

using **AutoDock Vina**. <br><br>
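
The three inputs map directly onto the command line that the building block logs when it runs. The sketch below assembles that argument list from a box center and size; the helper name `vina_command` is hypothetical, and the flags are copied from the logged command in this tutorial, so other AutoDock Vina versions may accept a different set of options.

```python
def vina_command(ligand, receptor, center, size, out_pdbqt, out_log, binary="vina"):
    """Build the AutoDock Vina argument list (hypothetical helper).

    center and size are (x, y, z) tuples in Angstroms.
    """
    cmd = [binary, "--ligand", ligand, "--receptor", receptor]
    cmd += [f"--center_{axis}={value:.3f}" for axis, value in zip("xyz", center)]
    cmd += [f"--size_{axis}={value:.3f}" for axis, value in zip("xyz", size)]
    cmd += ["--out", out_pdbqt, "--log", out_log]
    return cmd

# Values taken from the box and docking logs of this tutorial
cmd = vina_command(
    ligand="prep_ligand.pdbqt",
    receptor="prep_receptor.pdbqt",
    center=(16.790, -4.357, -20.224),
    size=(23.958, 25.097, 25.515),
    out_pdbqt="output_vina.pdbqt",
    out_log="output_vina.log",
)
print(" ".join(cmd))
```

Returning a list rather than a single string makes it straightforward to pass the command to `subprocess.run` without shell quoting issues.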

***
**Building Blocks** used:
 - [autodock_vina_run](https://biobb-vs.readthedocs.io/en/latest/vina.html#module-vina.autodock_vina_run) from **biobb_vs.vina.autodock_vina_run**
***

In [16]:
from biobb_vs.vina.autodock_vina_run import autodock_vina_run

output_vina_pdbqt = "output_vina.pdbqt"
output_vina_log = "output_vina.log"
prop = { }

autodock_vina_run(input_ligand_pdbqt_path = prep_ligand,
             input_receptor_pdbqt_path = prep_receptor,
             input_box_path = output_box,
             output_pdbqt_path = output_vina_pdbqt,
             output_log_path = output_vina_log,
             properties = prop)

2021-05-17 15:19:47,166 [MainThread  ] [INFO ]  prep_receptor.pdbqt file ends with END, cleaning
2021-05-17 15:19:47,171 [MainThread  ] [INFO ]  Executing AutoDock Vina
2021-05-17 15:19:47,172 [MainThread  ] [INFO ]  Not using any container
2021-05-17 15:20:27,135 [MainThread  ] [INFO ]  vina --ligand prep_ligand.pdbqt --receptor prep_receptor.pdbqt --center_x=16.790 --center_y=-4.357 --center_z=-20.224 --size_x=23.958 --size_y=25.097 --size_z=25.515 --out output_vina.pdbqt --log output_vina.log

2021-05-17 15:20:27,136 [MainThread  ] [INFO ]  Exit code 0

2021-05-17 15:20:27,137 [MainThread  ] [INFO ]  #################################################################
# If you used AutoDock Vina in your work, please cite:          #
#                                                               #
# O. Trott, A. J. Olson,                                        #
# AutoDock Vina: improving the speed and accuracy of docking    #
# with a new scoring function, efficient optimization and  

0

<a id="viewDocking"></a>
### Visualizing docking output poses
Visualizing the generated **docking poses** for the **ligand**, using **NGL viewer**. <br>
- **Left panel**: **Docking poses** displayed with atoms coloured by **partial charges** and **licorice** representation.
- **Right panel**: **Docking poses** displayed with atoms coloured by **element** and **ball-and-stick** representation.

In [17]:
models = 'all'
#models = '/0 or /1 or /4'

v0 = nglview.show_structure_file(output_vina_pdbqt, default=False)
v0.add_representation(repr_type='licorice', 
                        selection=models,
                       colorScheme= 'partialCharge')
v0.center()
v1 = nglview.show_structure_file(output_vina_pdbqt, default=False)
v1.add_representation(repr_type='ball+stick', 
                        selection=models)
v1.center()

v0._set_size('500px', '')
v1._set_size('500px', '')

def on_change(change):
    v1._set_camera_orientation(change['new'])
    
v0.observe(on_change, ['_camera_orientation'])

HBox([v0, v1])

HBox(children=(NGLWidget(), NGLWidget()))

<img src='ngl6.png'></img>

<a id="selectPose"></a>
### Select Docking Pose
Select a specific **docking pose** from the output list for **visual inspection**.
<br>
Choose a **docking pose** from the **DropDown list**.

In [18]:
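from Bio.PDB import PDBParser

# Parse the multi-model Vina output; QUIET=True suppresses parser warnings
parser = PDBParser(QUIET = True)
structure = parser.get_structure("protein", output_vina_pdbqt)

# One dropdown entry per docking pose: (label, 0-based index)
models = []
for i, m in enumerate(structure):
    models.append(('model' + str(i), i))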
    
mdsel = ipywidgets.Dropdown(
    options=models,
    description='Sel. model:',
    disabled=False,
)
display(mdsel)

Dropdown(description='Sel. model:', options=(('model0', 0), ('model1', 1), ('model2', 2), ('model3', 3), ('mod…
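When choosing a pose, it can help to see the predicted binding affinity of each model next to its number. The sketch below is a minimal, illustrative parser, assuming the usual Vina output layout in which each `MODEL` block carries a `REMARK VINA RESULT:` line whose first value is the affinity in kcal/mol. The function name `vina_scores` and the sample values are hypothetical and not part of biobb.

```python
def vina_scores(pdbqt_text):
    """Return a list of (model_number, affinity_kcal_mol) tuples."""
    scores = []
    current_model = None
    for line in pdbqt_text.splitlines():
        if line.startswith("MODEL"):
            # MODEL records are numbered from 1 in the PDBQT file
            current_model = int(line.split()[1])
        elif line.startswith("REMARK VINA RESULT:") and current_model is not None:
            # First number after the tag: predicted affinity (kcal/mol)
            affinity = float(line.split(":", 1)[1].split()[0])
            scores.append((current_model, affinity))
    return scores


# Hypothetical two-pose excerpt (values are made up for illustration)
sample = "\n".join([
    "MODEL 1",
    "REMARK VINA RESULT:      -9.1      0.000      0.000",
    "ENDMDL",
    "MODEL 2",
    "REMARK VINA RESULT:      -8.4      1.532      2.210",
    "ENDMDL",
])

for model, affinity in vina_scores(sample):
    print(f"model {model}: {affinity} kcal/mol")
```

With the tutorial's own file, the call would be `vina_scores(open(output_vina_pdbqt).read())`. Note that the numbers in the file are 1-based, whereas the dropdown labels above start at `model0`.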

<a id="extractPose"></a>
***
## Extract a Docking Pose
Extract a specific **docking pose** from the **docking** outputs. <br>

***
**Building Blocks** used:
 - [extract_model_pdbqt](https://biobb-vs.readthedocs.io/en/latest/utils.html#module-utils.extract_model_pdbqt) from **biobb_vs.utils.extract_model_pdbqt**
***

In [19]:
from biobb_vs.utils.extract_model_pdbqt import extract_model_pdbqt

output_pdbqt_model = "output_model.pdbqt"
prop = {
    # Dropdown values are 0-based; MODEL records in the PDBQT file are 1-based
    "model": mdsel.value + 1
}

extract_model_pdbqt(input_pdbqt_path = output_vina_pdbqt,
                    output_pdbqt_path = output_pdbqt_model,
                    properties = prop)

2021-05-17 15:20:41,685 [MainThread  ] [INFO ]  Saving model 1 to output_model.pdbqt


0
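Conceptually, extracting a pose means copying one `MODEL ... ENDMDL` block out of the multi-model file. The following is a minimal sketch of that idea in plain Python, not the actual `extract_model_pdbqt` implementation; the function name `extract_model` and the sample text are hypothetical.

```python
def extract_model(pdbqt_text, model_number):
    """Return the MODEL ... ENDMDL block with the given 1-based number."""
    block = []
    inside = False
    for line in pdbqt_text.splitlines():
        if line.startswith("MODEL"):
            # Start collecting only when the requested model begins
            inside = int(line.split()[1]) == model_number
        if inside:
            block.append(line)
        if line.startswith("ENDMDL"):
            if inside:
                break
            inside = False
    if not block:
        raise ValueError(f"model {model_number} not found")
    return "\n".join(block) + "\n"


# Hypothetical two-pose file with the atom records elided
sample = "MODEL 1\nREMARK pose one\nENDMDL\nMODEL 2\nREMARK pose two\nENDMDL\n"
print(extract_model(sample, 2))
```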

<a id="pdbqt2pdb"></a>
***
## Converting Ligand Pose to PDB format
Converting **ligand pose** to **PDB format**. <br>

***
**Building Blocks** used:
 - [babel_convert](https://biobb-chemistry.readthedocs.io/en/latest/babelm.html#module-babelm.babel_convert) from **biobb_chemistry.babelm.babel_convert**
***

In [20]:
from biobb_chemistry.babelm.babel_convert import babel_convert

output_pdb_model = "output_model.pdb"
prop = {
    "input_format": "pdbqt",
    "output_format": "pdb",
    "binary_path": "obabel"
}

babel_convert(input_path = output_pdbqt_model,
              output_path = output_pdb_model,
              properties = prop)

2021-05-17 15:20:44,121 [MainThread  ] [INFO ]  Value  is not compatible as a coordinates value
2021-05-17 15:20:44,123 [MainThread  ] [INFO ]  Not using any container
2021-05-17 15:20:44,172 [MainThread  ] [INFO ]  obabel -ipdbqt output_model.pdbqt -opdb -Ooutput_model.pdb  

2021-05-17 15:20:44,176 [MainThread  ] [INFO ]  Exit code 0

2021-05-17 15:20:44,178 [MainThread  ] [INFO ]  1 molecule converted



0
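The final visualization step selects the docked ligand by the residue name `UNL`, which appears to be the name given to the ligand by the conversion above. A quick way to confirm which residue names a PDB file contains is to read the residue-name columns (18-20 in the PDB format) of its `ATOM` and `HETATM` records. This is a minimal sketch; the function name `residue_names` and the sample records are hypothetical.

```python
def residue_names(pdb_text):
    """Return the sorted residue names found in ATOM/HETATM records."""
    names = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # PDB columns 18-20 (0-based slice 17:20) hold the residue name
            names.add(line[17:20].strip())
    return sorted(names)


# Hypothetical records; the coordinates are illustrative only
sample = "\n".join([
    "ATOM      1  N   GLU A   4      10.000  10.000  10.000  1.00  0.00           N",
    "HETATM    2  C   UNL     1      16.790  -4.357 -20.224  1.00  0.00           C",
    "END",
])

print(residue_names(sample))
```

Running `residue_names(open(output_pdb_model).read())` on the converted file should list the name to use in the NGL selection.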

<a id="catPdb"></a>
***
## Superposing Ligand Pose to the Target Protein Structure
Combining the **ligand pose** with the target **protein structure**, in order to see the **protein-ligand docking conformation**. <br><br>Because the pose was docked in the coordinate frame of the receptor, this only requires building a new **PDB file** that contains both the **target and ligand** (binding pose) structures. <br>

***
**Building Blocks** used:
 - [cat_pdb](https://biobb-structure-utils.readthedocs.io/en/latest/utils.html#module-utils.cat_pdb) from **biobb_structure_utils.utils.cat_pdb**
***

In [21]:
from biobb_structure_utils.utils.cat_pdb import cat_pdb

output_structure = "output_structure.pdb"

cat_pdb(input_structure1 = pdb_protein,
        input_structure2 = output_pdb_model,
        output_structure_path = output_structure)

2021-05-17 15:20:46,424 [MainThread  ] [INFO ]  File output_structure.pdb created


0
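The idea behind this step can be sketched as a plain text concatenation: keep the records of both files and write a single terminating `END` record. The code below is an illustrative assumption about what such a merge involves, not the `cat_pdb` source; `concat_pdb` and the sample records are hypothetical.

```python
def concat_pdb(first_text, second_text):
    """Concatenate two PDB texts, keeping one END record at the very end."""
    merged = [ln for ln in first_text.splitlines() if ln.strip() != "END"]
    merged += [ln for ln in second_text.splitlines() if ln.strip() != "END"]
    merged.append("END")
    return "\n".join(merged) + "\n"


# Hypothetical one-atom "protein" and "ligand" files
protein = "ATOM      1  N   GLU A   4      10.000  10.000  10.000\nEND\n"
ligand = "HETATM    1  C   UNL     1      16.790  -4.357 -20.224\nEND\n"

print(concat_pdb(protein, ligand))
```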

<a id="viewFinal"></a>
### Comparing final result with experimental structure 
Visualizing and comparing the generated **protein-ligand** complex with the original **protein-ligand conformation** (downloaded from the PDB database), using **NGL viewer**. <br>
- **Licorice, element-colored** representation: **Experimental pose**.
- **Licorice, green-colored** representation: **Docking pose**.
<br>

Note that outputs from **AutoDock Vina** do not contain all the atoms of the ligand, as the program works with a **united-atom representation** (i.e. only polar hydrogens are kept; non-polar hydrogens are merged into their heavy atoms).

In [27]:
view = nglview.NGLWidget()

# v1 = Experimental Structure
v1 = view.add_component(download_pdb)

v1.clear()
v1.add_representation(repr_type='licorice', 
                     selection='STI',
                     radius=0.5)

# v2 = Docking result
v2 = view.add_component(output_structure)
v2.clear()
v2.add_representation(repr_type='cartoon', colorScheme = 'sstruc')
v2.add_representation(repr_type='licorice', radius=0.5, color= 'green', selection='UNL')

view._remote_call('setSize', target='Widget', args=['','600px'])
view

# align reference and output
code = """
var stage = this.stage;
var clist_len = stage.compList.length;
var i = 0;
var s = [];
for(i = 0; i < clist_len; i++){
    if(stage.compList[i] != undefined && stage.compList[i].structure != undefined) {        
       s.push(stage.compList[i])
    }
}
NGL.superpose(s[0].structure, s[1].structure, true, ".CA")
s[ 0 ].updateRepresentations({ position: true })
s[ 0 ].autoView()
"""

view._execute_js_code(code)

view.render_image()
view.download_image(filename='ngl7.png')
view

NGLWidget()

<img src='ngl7.png'></img>
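The visual comparison can be complemented with a number: the root-mean-square deviation (RMSD) between the docked and the experimental ligand coordinates. The sketch below only shows the arithmetic. It assumes that both coordinate lists contain the same heavy atoms in the same order and are already in a common reference frame; building that atom correspondence between `STI` and `UNL` is not covered here. The function name `rmsd` and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom example: every atom is displaced by 1 Å along z
experimental = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

print(rmsd(experimental, docked))
```

A low value for a pose suggests that the docking reproduced the experimental binding mode, although an acceptable threshold depends on the ligand size and the application.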

***
<a id="questions"></a>

## Questions & Comments

Questions, issues, suggestions and comments are really welcome!

* GitHub issues:
    * [https://github.com/bioexcel/biobb](https://github.com/bioexcel/biobb)

* BioExcel forum:
    * [https://ask.bioexcel.eu/c/BioExcel-Building-Blocks-library](https://ask.bioexcel.eu/c/BioExcel-Building-Blocks-library)
