<h1><span style="color:red">Please read this very carefully! </span></h1>

To set up your own experiments, you need to download remote files to your Linux disk image in the Collaboratory environment. Because the data in your user account is NOT reset when you close or reload the HBP Collaboratory, you have to be careful how you organize and structure your data. To help with that, this notebook creates a unique working directory for each molecular use case you run.

Please also be aware that this use case changes the current working directory. To return to your starting directory, you have to restart the kernel and clear all output.
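
If you adapt this use case, one way to limit the effect of directory changes is to wrap them in a context manager that always restores the starting directory. This is a minimal sketch under that assumption, not something the notebook itself does; the helper name `working_directory` is hypothetical.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch to `path`, then return to the previous directory."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield path
    finally:
        # Runs even if the body raises, so the starting directory is restored
        os.chdir(previous)

# Usage: the directory change is undone when the inner block exits
start = os.getcwd()
with tempfile.TemporaryDirectory() as scratch:
    with working_directory(scratch):
        inside = os.getcwd()
after = os.getcwd()
```

Here `after` equals `start`, so later cells see the same working directory as before the block ran.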

# Identify potential protein binding sites by comparing the electrostatic potentials of a set of protein isoforms

**Aim:** This use case shows how to use the multipipsa tool to predict potential protein binding sites.

**Version:** 1.1 (January 2020)

**Contributors:**  Neil Bruce, Lukas Adam, Stefan Richter, Rebecca Wade (HITS, Heidelberg, Germany)

**Contact:** [mcmsoft@h-its.org](mailto:mcmsoft@h-its.org)

**Note:** This notebook has graphical output using nglview. If you use the "RunAll" function of the notebook, this graphical output might not appear on your screen. The cell defined to show the output must be visible in the browser during execution.

## Setting up your environment

### Check that all required python packages are installed and working

In [None]:
! pip install --upgrade pip
! pip uninstall --yes numpy
! pip uninstall --yes pandas
! pip install "pandas>=1.0.1"
! pip install "numpy>=1.16"

In [None]:
# Check that required packages are installed
! pip install --upgrade "hbp-service-client" 
! pip install wget python-magic
! pip install rpy2==2.9.1
! pip install setuptools
! pip install --extra-index-url https://projects.h-its.org/pypi multipipsa==4.0.10
! pip install nglview
! mkdir -p ~/.R/lib
! grep -qxF 'R_LIBS_USER=~/.R/lib' ~/.Renviron || echo 'R_LIBS_USER=~/.R/lib' >> ~/.Renviron
! wget -c https://cran.r-project.org/src/contrib/fastcluster_1.1.25.tar.gz
! wget -c https://cran.r-project.org/src/contrib/heatmap3_1.1.7.tar.gz
! R CMD INSTALL -l ~/.R/lib fastcluster_1.1.25.tar.gz
! R CMD INSTALL -l ~/.R/lib heatmap3_1.1.7.tar.gz
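
After installation, it can be worth confirming that the installed versions satisfy the minimums requested for numpy and pandas. The following is a minimal sketch using only the standard library. It assumes Python 3.8 or newer (for `importlib.metadata`) and plain numeric versions such as `1.16.2`; the helper names `version_tuple`, `at_least` and `meets_minimum` are hypothetical.

```python
from importlib import metadata

def version_tuple(version):
    """Convert '1.16.2' to (1, 16, 2); stop at the first non-numeric part."""
    parts = []
    for piece in version.split('.'):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)

def at_least(installed, minimum):
    """Compare two version strings, padding the shorter one with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b

def meets_minimum(package, minimum):
    """Return True if `package` is installed with version >= `minimum`."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return at_least(installed, minimum)

for name, minimum in [('numpy', '1.16'), ('pandas', '1.0.1')]:
    status = 'OK' if meets_minimum(name, minimum) else 'missing or too old'
    print('%s >= %s: %s' % (name, minimum, status))
```

Pre-release suffixes such as `rc1` are ignored by this simple parser, so treat the result as a quick sanity check rather than a full version comparison.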

In [None]:
# Import python packages/classes used in this notebook
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
import numpy
import rpy2
import os, wget, datetime, magic, inspect
from multipipsa.multipipsa import PipsaRun, ApbsRun
from multipipsa.clusterpipsa import ClusterPipsa
from multipipsa.pipsatypes import DistanceType
from PIL import Image
from hbp_service_client.storage_service.client import Client
import nglview

### Set up local directory structure

In [None]:
# Create a local working directory
try:
    homeDir = os.environ['HOME']
except KeyError:
    print("Error in environment: HOME is not set")
else:
    workDir = os.path.join(homeDir, 'work')
    if not os.path.isdir(workDir):
        try:
            os.mkdir(workDir)
        except OSError:
            print("Unable to make working directory")

    # Make a new directory to run the use case in.
    # If the directory already exists, append a number to make a unique name
    baseDir = 'multipipsaBinding'
    dirIter = 0
    useCaseDir = os.path.join(workDir, baseDir)
    while os.path.exists(useCaseDir):
        dirIter += 1
        useCaseDir = os.path.join(workDir, baseDir + '.' + str(dirIter))

    try:
        os.mkdir(useCaseDir)
    except OSError:
        print("Failed to make use case working directory")
    else:
        print("Working directory for current use case: %s" % useCaseDir)
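
The naming scheme used above (`multipipsaBinding`, then `multipipsaBinding.1`, `multipipsaBinding.2`, and so on) can also be wrapped in a reusable function. The sketch below is an illustration, not part of the use case; the name `make_unique_dir` is hypothetical. It tries to create the directory directly instead of checking for existence first, so an already taken name is detected at creation time.

```python
import os
import tempfile

def make_unique_dir(parent, base_name):
    """Create and return parent/base_name, or parent/base_name.N if taken."""
    candidate = os.path.join(parent, base_name)
    counter = 0
    while True:
        try:
            os.mkdir(candidate)  # raises FileExistsError if the name is in use
            return candidate
        except FileExistsError:
            counter += 1
            candidate = os.path.join(parent, '%s.%d' % (base_name, counter))

# Usage in a throwaway parent directory
with tempfile.TemporaryDirectory() as parent:
    first = make_unique_dir(parent, 'multipipsaBinding')
    second = make_unique_dir(parent, 'multipipsaBinding')
    names = sorted(os.listdir(parent))
print(names)
```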


### Set up collab storage for saving data at end of calculation


In [None]:
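# Find your own collab storage path
# get_collab_storage_path() and oauth are provided by the Collaboratory notebook environment
collab_path = get_collab_storage_path()
print(collab_path)
# Create a storage client authenticated with your Collaboratory token
storage_client = Client.new(oauth.get_token())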

# Identify potential protein binding sites by comparing the electrostatic potentials of a set of protein isoforms


In [None]:
# Download isoform structure files from CSCS storage for calculation
baseUrl = 'https://object.cscs.ch/v1/AUTH_c0a333ecf7c045809321ce9d9ecdfdea/SGA2_molecular_models/data/Modelled_adenylyl_cyclase_AC_isoform_structures/refined/'

# Loop to download the AC1 - AC9 structures
for iso in range(1, 10):
    fileUrl = baseUrl + 'AC' + str(iso) + '.pdb'
    print("Downloading AC%d structure file from CSCS storage area" % iso)
    try:
        wget.download(fileUrl, useCaseDir)
    except Exception:
        print("Error downloading structure file AC%d from CSCS storage" % iso)
        print(fileUrl)
    else:
        print("Successfully downloaded the structure file AC%d from CSCS storage" % iso)
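
Because the download loop only prints a message when a download fails, it may help to confirm that all nine structure files are present before starting the calculation. The sketch below is one possible check, with a hypothetical helper name `missing_structures`; in the notebook it would be called as `missing_structures(useCaseDir, expected)`.

```python
import os
import tempfile

def missing_structures(directory, names):
    """Return the names whose '<name>.pdb' file is absent or empty."""
    missing = []
    for name in names:
        path = os.path.join(directory, name + '.pdb')
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing

# Usage with a throwaway directory that holds only AC1.pdb
expected = ['AC%d' % i for i in range(1, 10)]
with tempfile.TemporaryDirectory() as demoDir:
    with open(os.path.join(demoDir, 'AC1.pdb'), 'w') as handle:
        handle.write('REMARK placeholder\n')
    absent = missing_structures(demoDir, expected)
print(absent)
```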

In [None]:
ingrp = ["AC1", "AC5", "AC6"]
outgrp = ["AC2", "AC3", "AC4", "AC7", "AC8", "AC9"]
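
The two lists split the nine downloaded isoforms into an ingroup and an outgroup. If you edit them, a quick consistency check can catch typos early. The sketch below assumes the groups are meant to be disjoint and to contain only downloaded structures; the function name `check_groups` is hypothetical.

```python
ingrp = ["AC1", "AC5", "AC6"]
outgrp = ["AC2", "AC3", "AC4", "AC7", "AC8", "AC9"]

def check_groups(in_group, out_group, available):
    """Raise ValueError if the groups overlap or name an unknown structure."""
    overlap = set(in_group) & set(out_group)
    if overlap:
        raise ValueError('Structures in both groups: %s' % sorted(overlap))
    unknown = (set(in_group) | set(out_group)) - set(available)
    if unknown:
        raise ValueError('Structures not downloaded: %s' % sorted(unknown))

# The nine isoforms downloaded above
available = ['AC%d' % i for i in range(1, 10)]
check_groups(ingrp, outgrp, available)
print('Groups are consistent')
```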

In [None]:
# Define the location of the PIPSA software executables
pipsaDir = os.path.join(os.path.dirname(inspect.getfile(PipsaRun)), 'data', 'pipsa')

In [None]:
# Create an ApbsRun instance for the current calculation
epCalc = ApbsRun(
                dataDir=useCaseDir,       # Pass the use case work directory as the directory for running the calculation
                pipsaRoot=pipsaDir,       # Pass the location of the PIPSA executables defined above
                temp='298.15',            # Define the temperature in Kelvin
                ios='0.100',              # Define the solvent ionic strength in Molar concentration
                pH='7.4',                 # Define the solvent pH
                structures=ingrp+outgrp   # Pass the list of structures defined above
                ) 

In [None]:
epCalc.runPdb2Pqr()
epCalc.runApbs()

In [None]:
# Choose a reference structure
referenceStructure='AC5'

pipsaCalc = PipsaRun(pipsaRoot=pipsaDir,
                     dataDir=useCaseDir,
                     pointsTemplate=referenceStructure
                    )

In [None]:
pipsaCalc.runBindingScorePipsa(ingrp, outgrp)

In [None]:
from multipipsa.pipsatypes import ScoreType, SimilarityType
useCaseDir

In [None]:
#similarityScorePDB = os.path.join(useCaseDir, "simHodgkin.pdb")
#compactnessScorePDB = os.path.join(useCaseDir, "compactHodgkin.pdb")

pipsaCalc.savePDBResult(filename='simHodgkin.pdb',
                 scoreType=ScoreType.MS,
                 similarityType=SimilarityType.HODGKIN)

pipsaCalc.savePDBResult(filename='compactHodgkin.pdb',
                 scoreType=ScoreType.CS,
                 similarityType=SimilarityType.HODGKIN)
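
For orientation, the Hodgkin similarity index of two potentials sampled at the same points is 2(p1·p2) / (p1·p1 + p2·p2), which ranges from -1 (opposite potentials) to +1 (identical potentials). The sketch below illustrates only this formula in plain Python; it is not the multipipsa implementation, and the function name `hodgkin_index` is hypothetical.

```python
def hodgkin_index(p1, p2):
    """Hodgkin similarity index of two potentials sampled at the same points."""
    if len(p1) != len(p2):
        raise ValueError('Potentials must be sampled at the same points')
    cross = sum(a * b for a, b in zip(p1, p2))
    norm = sum(a * a for a in p1) + sum(b * b for b in p2)
    if norm == 0:
        raise ValueError('Both potentials are zero everywhere')
    return 2.0 * cross / norm

potential = [0.5, -1.0, 2.0]
same = hodgkin_index(potential, potential)                     # identical potentials
opposite = hodgkin_index(potential, [-v for v in potential])   # sign-flipped copy
orthogonal = hodgkin_index([1.0, 0.0], [0.0, 1.0])             # no overlap
print(same, opposite, orthogonal)
```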

In [None]:
# View the saved similarity score result (simHodgkin.pdb)
# Create a NGL widget object
viewSimilarity = nglview.NGLWidget()
# Set the display size
viewSimilarity._remote_call('setSize', target='Widget', args=['600px','400px'])

# Define files to load
AC5_struct_file = nglview.FileStructure(os.path.join(useCaseDir, 'simHodgkin.pdb'))

# Create a component object for displaying the structure
component = viewSimilarity.add_component(AC5_struct_file)
component.clear_representations()
component.add_representation('cartoon', sele=':A', color=0x7f2704)
component.add_representation('cartoon', sele=':B', color=0x00441b)
component.add_representation('surface', color='bfactor', opacity=0.7, colorScheme='Red')

In [None]:
colorJS='''
var color = BfactorColormaker({
    domain: [-1.0, 1.0],
    scale: ['red', 'white', 'blue']
})
'''
viewSimilarity._execute_js_code(colorJS)

In [None]:
viewSimilarity

In [None]:
# Set up a timestamped directory name for saving results to the storage area
baseStorageDir = 'multipipsaBinding'
timestamp = datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
storageDir = os.path.join(collab_path, baseStorageDir + timestamp)
try:
    print('Creating storage directory: %s' % storageDir)
    storage_client.mkdir(storageDir)
except:
    print('There was an error creating the storage directory')
else:
    # Copy files to the storage area and remove the local files
    cleanDir = True
    for fName in os.listdir(useCaseDir):
        localFile = os.path.join(useCaseDir, fName)
        storageFile = os.path.join(storageDir, fName)
        fType = magic.Magic(mime=True).from_file(localFile)
        try:
            storage_client.upload_file(localFile, storageFile, fType)
        except:
            print('Error copying %s to storage' % fName)
            cleanDir = False
        else: 
            os.remove(localFile)
            
    os.chdir(homeDir)
    if cleanDir:
        print('All files in the working directory have been moved to the storage area directory:')
        print(storageDir)
        os.rmdir(useCaseDir)
    else:
        print('Some files could not be copied and remain in: %s' % useCaseDir)
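
The copy-then-remove pattern used in this cell can be isolated into a small function, which makes it easier to test without touching the storage service. The sketch below is an illustration with hypothetical names (`move_files`, `fake_upload`); in the notebook, the `upload` callable would wrap `storage_client.upload_file`.

```python
import os
import tempfile

def move_files(source_dir, upload):
    """Upload every regular file in source_dir; delete each only after success.

    `upload` is any callable taking (local_path, file_name). Returns the
    names of files that failed and were therefore left in place.
    """
    failed = []
    for name in sorted(os.listdir(source_dir)):
        local_path = os.path.join(source_dir, name)
        if not os.path.isfile(local_path):
            continue  # skip sub-directories
        try:
            upload(local_path, name)
        except Exception:
            failed.append(name)
        else:
            os.remove(local_path)
    return failed

# Usage with a stand-in uploader that rejects one file
def fake_upload(local_path, name):
    if name == 'bad.txt':
        raise IOError('simulated upload failure')

with tempfile.TemporaryDirectory() as demoDir:
    for name in ('good.txt', 'bad.txt'):
        with open(os.path.join(demoDir, name), 'w') as handle:
            handle.write('data\n')
    failures = move_files(demoDir, fake_upload)
    remaining = os.listdir(demoDir)
print(failures, remaining)
```

Files that fail to upload stay on disk, so the working directory should only be removed when the returned list is empty.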