### Using this notebook
- You have to execute this notebook by **running the cells in order**.
  - If you need to modify a value or a selection in a cell, you might need to **rerun the cells** below to reflect the change.
  - Running the whole workflow **exports all files to the user storage** and removes them from the temporary folders. This could cause some of the cells to not work properly if manually executed after a complete run (e.g. molecule visualizations). Therefore, a **cell by cell execution** is suggested instead of a whole run.
- **Any changes to the notebook need to be saved** before leaving the page in order to be persisted.
- A **setup process** is needed before running this notebook. This process installs the software and library dependencies. Please find the **"Initial Setup Process" Jupyter notebook** in the Navigation section (left part of the collab) and execute it before starting this workflow.

# Small molecule force field parametrization for atomistic Molecular Dynamics simulations

***
**Aim:**
This tutorial aims to illustrate the process of **parameterizing a small molecule**, step by step. The particular example used is the **Imipramine** molecule (PDB code [IXX](http://www.rcsb.org/ligand/IXX), DrugBank code [DB00458](https://www.drugbank.ca/drugs/DB00458)). 


**Imipramine** is a tricyclic **antidepressant** (TCA) which is used mainly in the treatment of **depression**. It can also reduce symptoms of **agitation** and **anxiety**. 


This workflow makes extensive use of the **BioExcel Building Blocks library** ([biobb](https://github.com/bioexcel/biobb)). Each step of the process is performed by a **building block** (bb), a wrapper around a tool or script that provides a particular functionality (e.g. adding hydrogens). If you are interested in expanding or modifying the current workflow, please visit the **existing documentation** for each of the packages [here](https://github.com/bioexcel/biobb).

Although the **pipeline** is presented **step by step** with associated information, it is highly advisable to first spend some time reading documentation about **small molecule parameterization** to get familiar with the terms used, especially for newcomers to the field.

***
**Version:** 1.0 (August 2019)
***
**Contributors:**  Adam Hospital, Pau Andrio, Aurélien Luciani, Genís Bayarri, Francesco Colizzi, Josep Lluís Gelpí, Modesto Orozco (IRB-Barcelona, Spain)
***
**Contact:** [adam.hospital@irbbarcelona.org](mailto:adam.hospital@irbbarcelona.org)
***
**Thanks:** The code that generates the data in a single folder and finally copies it to the collab storage was taken from the use case **multipipsa tool to calculate the electrostatic potential surrounding a protein in aqueous solution** by Neil Bruce, Lukas Adam, Stefan Richter, Rebecca Wade (HITS, Heidelberg, Germany).

## Setting up the working environment

### Importing required libraries

In [None]:
import os, datetime, magic
import nglview
import ipywidgets
import zipfile
from hbp_service_client.storage_service.client import Client

### Set up collab storage for saving data at the end of the MD setup

In [None]:
# Find your own collab storage path
collab_path = get_collab_storage_path()
print(collab_path)
storage_client = Client.new(oauth.get_token())

### Set up local directory structure

In [None]:
# Create a local working directory
try:
    homeDir = os.environ['HOME']
except KeyError:
    print("Error in environment")

else:
    workDir = os.path.join(homeDir, 'IRB')
    if not os.path.isdir(workDir):
        try:
            os.mkdir(workDir)
        except OSError:
            print("unable to make working directory")
    
    # Make a new directory to run the use case in. 
    # If directory already exists, add a number to make a unique name
    baseDir = 'SmallMoleculeParam'
    dirIter = 0
    useCaseDir = os.path.join(workDir, baseDir)
    print(useCaseDir)
    
    while os.path.exists(useCaseDir):
        dirIter += 1
        useCaseDir = os.path.join(workDir, baseDir + '.' + str(dirIter))
    
    try:
        os.mkdir(useCaseDir)
    except OSError:
        print("Failed to make use case working directory")
    else:
        print("Working directory for current use case: %s" % useCaseDir)
        os.chdir(useCaseDir)
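
The "add a number to make a unique name" logic above can also be written as a small reusable function. This is only a sketch, not part of the original workflow: `make_unique_dir` is a hypothetical helper name, and the demo runs in a temporary folder rather than in `$HOME/IRB`. It tries to create the directory first and only moves to the next suffix if the name is already taken.

```python
import os
import tempfile


def make_unique_dir(parent, base):
    """Create parent/base, or parent/base.N for the first free N >= 1."""
    candidate = os.path.join(parent, base)
    counter = 0
    while True:
        try:
            # makedirs raises FileExistsError if the name is already taken
            os.makedirs(candidate)
            return candidate
        except FileExistsError:
            counter += 1
            candidate = os.path.join(parent, "%s.%d" % (base, counter))


# Demo in a throwaway folder: the second call gets the ".1" suffix
with tempfile.TemporaryDirectory() as tmp:
    first = make_unique_dir(tmp, "SmallMoleculeParam")
    second = make_unique_dir(tmp, "SmallMoleculeParam")
    print(os.path.basename(first), os.path.basename(second))
```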


### Defining logging output cells
By default, Jupyter notebooks display the logging information in **red-coloured cells**.

Here they are redefined to different colours, depending on the **logging level** (INFO, WARNING, ERROR), in order to avoid confusion with **critical error messages**.

If you prefer to keep the default Jupyter notebook configuration, please **disable/comment out** (or simply do not execute) the next cell.

In [None]:
from IPython.display import display, HTML
display(HTML('''<script>
const mo = new MutationObserver(
  mutations => mutations.forEach(mutation => {
    const element = mutation.target.querySelector(
        '.output_text.output_stderr');
    if (!element) return;
    if (element.textContent.includes('[INFO')) 
        element.style.background = '#DDD';
    else if (element.textContent.includes('[WARN')) 
        element.style.background = 'sandybrown';
    else if (element.textContent.includes('[ERROR')) 
        element.style.background = 'salmon';
}));
mo.observe(document.documentElement, 
    { childList: true, subtree: true });
</script>'''))

***
## Input parameters
**Input parameters** needed:
 - **ligandCode**: 3-letter code of the ligand structure (e.g. IXX)
 - **mol_charge**: Molecule net charge (e.g. -1)
 - **pH**: Acidity or alkalinity for the small molecule. Hydrogen atoms will be added according to this pH. (e.g. 7.4)
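
Because every later step depends on these three values, it can help to check them before launching anything. The function below is a minimal sketch under stated assumptions: `validate_inputs` is a hypothetical helper, the ligand code is assumed to be 1 to 3 alphanumeric characters, and the pH is assumed to lie between 0 and 14.

```python
def validate_inputs(ligand_code, mol_charge, ph):
    """Return a list of human-readable problems (empty if all looks fine)."""
    problems = []
    # Assumption: ligand codes are 1 to 3 alphanumeric characters
    if not (isinstance(ligand_code, str)
            and 1 <= len(ligand_code) <= 3
            and ligand_code.isalnum()):
        problems.append("ligandCode should be 1-3 alphanumeric characters")
    # The net charge must be a whole number (bool is excluded on purpose)
    if isinstance(mol_charge, bool) or not isinstance(mol_charge, int):
        problems.append("mol_charge should be an integer")
    # Assumption: only the conventional 0-14 pH range is accepted
    if isinstance(ph, bool) or not isinstance(ph, (int, float)) or not 0 <= ph <= 14:
        problems.append("pH should be a number between 0 and 14")
    return problems


print(validate_inputs("IXX", 1, 7.4))
print(validate_inputs("TOOLONG", 1.5, 20))
```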

In [None]:
ligandCode = 'IXX'
mol_charge = 1
pH = 7.4

***
## Fetching ligand structure
Downloading the **ligand structure** in **PDB format** from the IRB PDB database.<br>
Alternatively, a local **PDB file** can be used as the starting structure.<br>
***

In [None]:
import json
import requests

url = 'http://mmb.irbbarcelona.org/api/pdbMonomer/'
rest = url + ligandCode

input_structure = ligandCode + '.pdb'

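# Download the ligand and stop early on an HTTP error
response = requests.get(rest)
response.raise_for_status()
with open(input_structure, 'wb') as pdb_file:
    pdb_file.write(response.content.rstrip())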
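
Whether the structure is downloaded or supplied by hand, a quick look at its atom records can catch an empty or truncated file before the next steps. The sketch below is illustrative only: `count_elements` and `hetatm` are hypothetical helpers, and the three-atom sample is invented rather than taken from the IXX file. It relies on the fixed-column PDB layout, where the element symbol sits in columns 77-78.

```python
from collections import Counter


def count_elements(pdb_text):
    """Count atoms per element in ATOM/HETATM records of a PDB-format string."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        element = line[76:78].strip()  # columns 77-78 hold the element symbol
        if not element:
            # Crude fallback: first letter found in the atom name (columns 13-16)
            letters = [ch for ch in line[12:16] if ch.isalpha()]
            element = letters[0] if letters else "?"
        counts[element.capitalize()] += 1
    return counts


def hetatm(serial, name, element, x, y, z):
    """Build one fixed-column HETATM record (hypothetical helper for the demo)."""
    return "HETATM%5d %-4s %3s A%4d    %8.3f%8.3f%8.3f%6.2f%6.2f          %2s" % (
        serial, name, "LIG", 1, x, y, z, 1.0, 0.0, element)


# Invented three-atom fragment, used only to exercise the parser
sample_pdb = "\n".join([
    hetatm(1, "C1", "C", 0.0, 0.0, 0.0),
    hetatm(2, "C2", "C", 1.5, 0.0, 0.0),
    hetatm(3, "N1", "N", 2.2, 1.2, 0.0),
    "END",
])
element_counts = count_elements(sample_pdb)
print(dict(element_counts))
```

In the notebook, the same function could be applied to the contents of `input_structure` after reading the file as text.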

### Visualizing 3D structure
Visualizing the downloaded/given **ligand PDB structure** using **NGL**:    

In [None]:
# Show small ligand structure
slig = nglview.FileStructure(
    os.path.join(useCaseDir, input_structure))
view = nglview.show_file(slig)
view.add_representation(repr_type='ball+stick', selection='all')
view._remote_call('setSize', target='Widget', args=['','300px'])
view.camera='orthographic'
view

***
## Add Hydrogen Atoms
Adding **Hydrogen atoms** to the small molecule, according to the given pH.
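
The link between pH and protonation can be illustrated with the Henderson-Hasselbalch relation. This is only a back-of-the-envelope sketch, not the rule set the building block applies internally; `protonated_fraction` is a hypothetical helper, and the pKa of 9.4 for the tertiary amine is an assumed, illustrative value that does not come from this notebook.

```python
def protonated_fraction(ph, pka):
    """Fraction of a basic site that carries its extra proton at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


assumed_pka = 9.4  # assumption for illustration only
fraction = protonated_fraction(7.4, assumed_pka)
print("Protonated fraction at pH 7.4: %.3f" % fraction)
```

Under that assumption, nearly all molecules carry the extra proton at pH 7.4, which is consistent with setting `mol_charge = 1` in the input parameters.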
***

In [None]:
# Babel_add_hydrogens: add Hydrogen atoms to a small molecule
# Import module
from biobb_chemistry.babelm.babel_add_hydrogens import BabelAddHydrogens

# Create prop dict and inputs/outputs
output_babel_h = ligandCode + '.H.mol2' 

prop = {
    'ph' : pH,
    'input_format' : 'pdb',
    'output_format' : 'mol2'
}

#Create and launch bb
BabelAddHydrogens(input_path=input_structure,
                  output_path=output_babel_h,
                  properties=prop).launch()

### Visualizing 3D structure
Visualizing the **ligand PDB structure** with the newly added **hydrogen atoms** using **NGL**:    

In [None]:
# Show small ligand structure
sligH = nglview.FileStructure(
    os.path.join(useCaseDir, output_babel_h))
view = nglview.show_file(sligH)
view.add_representation(repr_type='ball+stick', selection='all')
#view._remote_call('setSize', target='Widget', args=['','300px'])
view.camera='orthographic'
view

***
## Energetically minimize Hydrogen Atoms
Energetically minimize newly added **Hydrogen atoms**.
***

In [None]:
# Babel_minimize: Structure energy minimization of a small molecule after being modified adding hydrogen atoms
# Import module
from biobb_chemistry.babelm.babel_minimize import BabelMinimize

# Create prop dict and inputs/outputs
output_babel_min = ligandCode + '.H.min.pdb'                              
prop = {
    'method' : 'sd',
    'criteria' : '1e-10',
    'force_field' : 'GAFF'
}


#Create and launch bb
BabelMinimize(input_path=output_babel_h,
              output_path=output_babel_min,
              properties=prop).launch()
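
To give an intuition for the `'method' : 'sd'` (steepest descent) and `'criteria'` properties used above, here is a toy minimizer on a two-dimensional harmonic "energy". It is a sketch of the general technique only: it is not the code run by the building block, the fixed step size is an arbitrary choice, and it assumes the criterion is a threshold on the energy change between consecutive steps.

```python
def steepest_descent(energy, gradient, start, step=0.1,
                     criteria=1e-10, max_steps=10000):
    """Follow the negative gradient until the energy change drops below criteria."""
    position = list(start)
    previous = energy(position)
    for n_steps in range(1, max_steps + 1):
        grad = gradient(position)
        # Move every coordinate a small step downhill
        position = [p - step * g for p, g in zip(position, grad)]
        current = energy(position)
        if abs(previous - current) < criteria:
            return position, current, n_steps
        previous = current
    return position, previous, max_steps


# Toy energy surface: a harmonic well with its minimum at (1, -2)
def toy_energy(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


def toy_gradient(p):
    return [2.0 * (p[0] - 1.0), 2.0 * (p[1] + 2.0)]


x_min, e_min, steps = steepest_descent(toy_energy, toy_gradient, [0.0, 0.0])
print("minimum near (%.4f, %.4f) after %d steps" % (x_min[0], x_min[1], steps))
```

A tighter `criteria` makes the loop run longer and stop closer to the minimum, which is the trade-off behind the very small value used in the cell above.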

### Visualizing 3D structure
Visualizing the **ligand PDB structure** with the newly added **hydrogen atoms**, **energetically minimized**, using **NGL**:   

In [None]:
# Show small ligand structure
sligH_min = nglview.FileStructure(
    os.path.join(useCaseDir, output_babel_min))
view = nglview.show_file(sligH_min)
view.add_representation(repr_type='ball+stick', selection='all')
view._remote_call('setSize', target='Widget', args=['','300px'])
view.camera='orthographic'
view

### Visualizing 3D structures
Visualizing all the structures generated so far:

 - Original **ligand PDB structure** (left)
 - **Ligand PDB structure** with **hydrogen atoms** (middle)
 - **Ligand PDB structure** with **hydrogen atoms energetically minimized** (right)  

In [None]:
# Show different structures generated (for comparison)

# Original Ligand
slig = nglview.FileStructure(
    os.path.join(useCaseDir, input_structure))
view1 = nglview.show_file(slig)
view1.add_representation(repr_type='ball+stick')
view1._remote_call('setSize', target='Widget', args=['250px','300px'])
view1.camera='orthographic'

# Original Ligand with added Hydrogen atoms 
sligH = nglview.FileStructure(
    os.path.join(useCaseDir, output_babel_h))
view2 = nglview.show_file(sligH)
view2.add_representation(repr_type='ball+stick')
view2._remote_call('setSize', target='Widget', args=['250px','300px'])
view2.camera='orthographic'

# Original Ligand with added Hydrogen atoms, energetically minimized
sligH_min = nglview.FileStructure(
    os.path.join(useCaseDir, output_babel_min))
view3 = nglview.show_file(sligH_min)
view3.add_representation(repr_type='ball+stick')
view3._remote_call('setSize', target='Widget', args=['250px','300px'])
view3.camera='orthographic'

# Show
ipywidgets.HBox([view1, view2, view3])

***
## Generating ligand parameters
**Building GROMACS topology** corresponding to the **ligand structure**.

The **force field** used in this tutorial step is **amberGAFF** ([General AMBER Force Field](http://ambermd.org/antechamber/gaff.html)), designed for rational drug design.

***

In [None]:
# Acpype_params_gmx: Generation of topologies for GROMACS with ACPype
# Import module
from biobb_chemistry.acpype.acpype_params_gmx import AcpypeParamsGMX

# Create prop dict and inputs/outputs
output_acpype_gro = ligandCode + 'params.gro'
output_acpype_itp = ligandCode + 'params.itp'
output_acpype_top = ligandCode + 'params.top'
output_acpype = ligandCode + 'params'
prop = {
    'basename' : output_acpype,
    'charge' : mol_charge
}

# Create and launch bb
AcpypeParamsGMX(input_path=output_babel_min,
                output_path_gro=output_acpype_gro,
                output_path_itp=output_acpype_itp,
                output_path_top=output_acpype_top,
                properties=prop).launch()
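
A simple sanity check on the generated topology is to sum the partial charges in the `[ atoms ]` section of the `.itp` file and compare the result with `mol_charge`. The sketch below assumes the usual GROMACS column order (nr, type, resnr, residue, atom, cgnr, charge, mass); `net_charge_from_itp` is a hypothetical helper, and the three-atom sample is invented for illustration, not real ACPype output.

```python
def net_charge_from_itp(itp_text):
    """Sum the charge column of the [ atoms ] section of a GROMACS .itp text."""
    total = 0.0
    in_atoms = False
    for raw in itp_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if line.startswith("["):
            # Section headers look like "[ atoms ]"
            in_atoms = line.strip("[] ").lower() == "atoms"
            continue
        if in_atoms:
            total += float(line.split()[6])  # seventh column is the charge
    return total


# Invented fragment with made-up charges, used only to exercise the parser
sample_itp = """
[ atoms ]
;   nr  type  resi  res  atom  cgnr     charge      mass
     1    c3     1   LIG    C1     1    0.250000    12.01000
     2    n4     1   LIG    N1     2    0.500000    14.01000
     3    hn     1   LIG    H1     3    0.250000     1.00800

[ bonds ]
     1     2     1
"""
total_charge = net_charge_from_itp(sample_itp)
print("net charge: %.3f" % total_charge)
```

In the notebook, the text of `output_acpype_itp` could be passed in, and `round(total_charge)` compared with `mol_charge`.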

### Visualizing 3D structure
Visualizing the generated **GROMACS** gro structure corresponding to the parameterized **ligand PDB structure** using **NGL**:    

In [None]:
# Show small ligand structure
slig_gro = nglview.FileStructure(
    os.path.join(useCaseDir, output_acpype_gro))
view = nglview.show_file(slig_gro)
view.add_representation(repr_type='ball+stick', selection='all')
view._remote_call('setSize', target='Widget', args=['','300px'])
view.camera='orthographic'
view

***
## Saving results to the storage area
Collect all the files generated in the use case folder and transfer them to the storage area.
***

In [None]:
# Building a zip file with all the generated info
zip_file = ligandCode + '_SmallMoleculeParam' + ".zip"
with zipfile.ZipFile(zip_file, 'w') as zip_f:
    print("Generating a zip file with all the content generated "
          "in the {} project.".format(useCaseDir))
    for fName in os.listdir(useCaseDir):
        if not fName.endswith('.zip'):
            localFile = os.path.join(useCaseDir, fName)
            print ("Adding {} to the zip file...".format(fName))
            zip_f.write(localFile, arcname=fName)
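
The archiving step can also be wrapped in a function that skips the archive itself by path rather than by name, so a data file whose name happens to contain "zip" is still included. This is a sketch with a hypothetical helper name (`zip_directory`), and the demo uses a temporary folder with dummy files instead of the real use case directory.

```python
import os
import tempfile
import zipfile


def zip_directory(src_dir, zip_path):
    """Add every regular file in src_dir to zip_path, except the archive itself."""
    added = []
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w") as archive:
        for name in sorted(os.listdir(src_dir)):
            path = os.path.join(src_dir, name)
            if os.path.abspath(path) == zip_abs or not os.path.isfile(path):
                continue
            archive.write(path, arcname=name)
            added.append(name)
    return added


# Demo with dummy files in a throwaway folder
with tempfile.TemporaryDirectory() as tmp:
    for dummy in ("IXX.pdb", "IXXparams.itp"):
        with open(os.path.join(tmp, dummy), "w") as handle:
            handle.write("placeholder\n")
    target = os.path.join(tmp, "IXX_SmallMoleculeParam.zip")
    added_names = zip_directory(tmp, target)
    with zipfile.ZipFile(target) as archive:
        archived_names = sorted(archive.namelist())
    print(added_names)
```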

In [None]:
# Saving results to the storage area
baseStorageDir = ligandCode + '_SmallMoleculeParam_'
timestamp = datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
storageDir = os.path.join(collab_path, baseStorageDir + timestamp)
try:
    print('Creating storage directory: %s' % storageDir)
    storage_client.mkdir(storageDir)
except:
    print('There was an error creating the storage directory')
else:
    # Copy all files (including the zip file built in the previous cell)
    # to the storage area and remove the local copies
    cleanDir = True
    for fName in os.listdir(useCaseDir):
        localFile = os.path.join(useCaseDir, fName)
        storageFile = os.path.join(storageDir, fName)
        fType =  magic.Magic(mime=True).from_file(localFile)
        try:
            storage_client.upload_file(localFile, storageFile, fType)
        except:
            print('Error copying %s to storage' % fName)
            cleanDir = False
        else: 
            os.remove(localFile)
            
    os.chdir(homeDir)
    if cleanDir:
        print('All files in the working directory have been moved to the storage area directory:')
        print(storageDir)
        os.rmdir(useCaseDir)
    else:
        print('Some files could not be copied and remain in: %s' % useCaseDir)
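
The "upload, then delete only what was uploaded" pattern used above can be tested without access to the collab storage by passing the upload step in as a function. This is a sketch under that assumption: `move_to_storage` and `fake_upload` are hypothetical names, and `fake_upload` merely stands in for a call such as `storage_client.upload_file`.

```python
import os
import tempfile


def move_to_storage(local_dir, upload):
    """Upload each file in local_dir; delete it locally only if the upload worked."""
    failed = []
    for name in sorted(os.listdir(local_dir)):
        local_file = os.path.join(local_dir, name)
        try:
            upload(local_file, name)
        except Exception as err:
            # Keep the local copy so nothing is lost on a failed transfer
            failed.append((name, str(err)))
        else:
            os.remove(local_file)
    return failed


def fake_upload(local_file, name):
    """Stand-in for the real storage call; fails for .top files on purpose."""
    if name.endswith(".top"):
        raise IOError("simulated failure")


# Demo with dummy files in a throwaway folder
with tempfile.TemporaryDirectory() as tmp:
    for dummy in ("a.gro", "b.top"):
        with open(os.path.join(tmp, dummy), "w") as handle:
            handle.write("placeholder\n")
    failures = move_to_storage(tmp, fake_upload)
    remaining = sorted(os.listdir(tmp))
    print(failures, remaining)
```

Returning the list of failures lets the caller decide whether the working directory can be removed safely.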