# Talktorial 11 (part B)

# CADD web services that can be used via a Python API

__Developed at AG Volkamer, Charité__

Dr. Jaime Rodríguez-Guerra

## Aim of this talktorial

> This is part B of the "Online webservices" talktorial:
>
> - 11a. Querying KLIFS & PubChem for potential kinase inhibitors
> - __11b. Docking the candidates against the target obtained in 11a__
> - 11c. Assessing the results and comparing against known data


After obtaining the input structures, we will use molecular docking software to find good protein-ligand poses.

## Learning goals

### Theory

- Molecular docking basics
- Available software

### Practical

- Prepare the structures
- Run the calculation
- Save the results

### Discussion

Pending.

### Quiz

Pending.

***

# Theory: what is molecular docking?

Protein-ligand binding is mainly governed by non-covalent interactions.

Exploring multiple conformations and chemical variations produces a vast search space. There are several ways to score the candidates in it:

- Molecular mechanics
- Shape recognition
- Knowledge-based
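
To make the molecular mechanics approach more concrete, here is a minimal sketch of a force-field-style score: a Lennard-Jones term plus a Coulomb term, summed over all protein-ligand atom pairs. The function names, the single set of `epsilon`/`sigma` parameters and the constant dielectric are simplifying assumptions chosen for illustration; real docking programs use per-atom-type parameters and additional terms.

```python
import math


def pair_energy(distance, epsilon=0.2, sigma=3.5, q1=0.0, q2=0.0, dielectric=4.0):
    """Toy non-bonded energy (kcal/mol) for one atom pair at `distance` (angstrom)."""
    # Lennard-Jones 12-6 term: short-range repulsion plus dispersion attraction
    sr6 = (sigma / distance) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 ** 2 - sr6)
    # Coulomb term; 332.06 converts e^2/angstrom to kcal/mol
    coulomb = 332.06 * q1 * q2 / (dielectric * distance)
    return lennard_jones + coulomb


def score_pose(protein_atoms, ligand_atoms):
    """Sum pair energies over all protein-ligand pairs.

    Atoms are (x, y, z, partial_charge) tuples (hypothetical input format).
    """
    total = 0.0
    for px, py, pz, pq in protein_atoms:
        for lx, ly, lz, lq in ligand_atoms:
            d = math.dist((px, py, pz), (lx, ly, lz))
            total += pair_energy(d, q1=pq, q2=lq)
    return total


# One protein atom, and the same ligand atom placed at two distances
protein = [(0.0, 0.0, 0.0, -0.5)]
clashing_pose = [(2.0, 0.0, 0.0, 0.5)]
relaxed_pose = [(4.0, 0.0, 0.0, 0.5)]
print(score_pose(protein, clashing_pose), score_pose(protein, relaxed_pose))
```

Lower scores are better: the clashing pose is penalized by the repulsive term, while the relaxed pose benefits from the attraction between opposite charges.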

## Known limitations

- False positives
- Energetic accuracy


## Existing software

Commercial:
- GOLD
- Schrödinger

Free (or free for academics):
- AutoDock Vina
- DOCK
- OpenEye

# Practice

A couple of webservices are available online for free: SwissDock and the OPAL webservices (which include AutoDock Vina).

### SwissDock

* Role: Perform docking calculations
* Website: http://www.swissdock.ch
* API: Yes, SOAP-based. No official client, use `suds`.
* Documentation: http://www.swissdock.ch/pages/soap_access
* Literature:
    * Nucleic Acids Res. 2011 Jul;39(Web Server issue):W270-7. doi: 10.1093/nar/gkr366. https://academic.oup.com/nar/article/39/suppl_2/W270/2506492
    * J Comput Chem. 2011 Jul 30;32(10):2149-59. doi: 10.1002/jcc.21797. https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.21797

> SwissDock, a web service to predict the molecular interactions that may occur between a target protein and a small molecule.
> SwissDock is based on the docking software EADock DSS, whose algorithm consists of the following steps:
> 1. Many binding modes are generated either in a box (local docking) or in the vicinity of all target cavities (blind docking).
> 2. Simultaneously, their CHARMM energies are estimated on a grid.
> 3. The binding modes with the most favorable energies are evaluated with FACTS, and clustered.
> 4. The most favorable clusters can be visualized online and downloaded on your computer.
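
The generate, score and cluster steps quoted above can be illustrated with a toy loop. This is not EADock DSS: the ligand is reduced to a single point, `toy_energy` is a hypothetical stand-in for the CHARMM and FACTS energies, and the greedy clustering is just one simple choice.

```python
import math
import random


def toy_energy(position, pocket=(0.0, 0.0, 0.0)):
    """Hypothetical energy: the closer to the pocket centre, the better."""
    return math.dist(position, pocket)


def sample_poses(center, half_size, n_poses, rng):
    """Step 1: generate candidate positions uniformly inside a box."""
    return [
        tuple(c + rng.uniform(-half_size, half_size) for c in center)
        for _ in range(n_poses)
    ]


def cluster_poses(poses, cutoff):
    """Step 3: greedy clustering; `poses` must be sorted best-first."""
    clusters = []
    for pose in poses:
        for cluster in clusters:
            # Join the first cluster whose best member is close enough
            if math.dist(pose, cluster[0]) <= cutoff:
                cluster.append(pose)
                break
        else:
            clusters.append([pose])  # start a new cluster
    return clusters


rng = random.Random(42)  # fixed seed for reproducibility
poses = sample_poses(center=(0.0, 0.0, 0.0), half_size=5.0, n_poses=200, rng=rng)
ranked = sorted(poses, key=toy_energy)  # step 2: rank by energy
best = ranked[:20]                      # keep the most favorable poses
clusters = cluster_poses(best, cutoff=2.0)
print(f"{len(clusters)} clusters from {len(best)} poses")
```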


### OPAL webservices
* Role: CADD as a service
* Website: http://nbcr-222.ucsd.edu/opal2/dashboard
* API: Yes, SOAP-based. No official client, use `suds`.
* Documentation: http://nbcr-222.ucsd.edu/opal2/dashboard?command=docs (currently offline)
* Literature:
    * Nucleic Acids Res. 2010 Jul;38(Web Server issue):W724-31. doi: 10.1093/nar/gkq503 https://academic.oup.com/nar/article/38/suppl_2/W724/1122840
    * J Comput Chem. 2010 Jan 30; 31(2): 455–461. doi: 10.1002/jcc.21334 https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.21334
    * Opal: Simple Web Services Wrappers for Scientific Applications http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.533.7960&rep=rep1&type=pdf
    
> Biomedical applications have become increasingly complex, and they often require large-scale high-performance computing resources with a large number of processors and memory. The complexity of application deployment and the advances in cluster, grid and cloud computing require new modes of support for biomedical research. Scientific Software as a Service (sSaaS) enables scalable and transparent access to biomedical applications through simple standards-based Web interfaces. Towards this end, we built a production web server (http://ws.nbcr.net) in August 2007 to support the bioinformatics application called MEME. The server has grown since to include docking analysis with AutoDock and AutoDock Vina, electrostatic calculations using PDB2PQR and APBS, and off-target analysis using SMAP. All the applications on the servers are powered by Opal, a toolkit that allows users to wrap scientific applications easily as web services without any modification to the scientific codes, by writing simple XML configuration files. Opal allows both web forms-based access and programmatic access of all our applications. The Opal toolkit currently supports SOAP-based Web service access to a number of popular applications from the National Biomedical Computation Resource (NBCR) and affiliated collaborative and service projects. In addition, Opal's programmatic access capability allows our applications to be accessed through many workflow tools, including Vision, Kepler, Nimrod/K and VisTrails. From mid-August 2007 to the end of 2009, we have successfully executed 239,814 jobs. The number of successfully executed jobs more than doubled from 205 to 411 per day between 2008 and 2009. The Opal-enabled service model is useful for a wide range of applications. It provides for interoperation with other applications with Web Service interfaces, and allows application developers to focus on the scientific tool and workflow development. Web server availability: http://ws.nbcr.net.

## Use SwissDock

In [1]:
import random
import string
import time
import zlib

from suds.client import Client

# Server seems to be down at the moment...
# http://swissdock.vital-it.ch/soap/ replies with 503 Unavailable
SWISSDOCK_WSDL = "http://www.swissdock.ch/soap/wsdl"
SWISSDOCK_CLIENT = Client(SWISSDOCK_WSDL)

def prepare_protein(protein):
    """
    Given a PDB file (string contents), returns PSF and CRD
    """
    encoded_protein = zlib.compress(protein.encode('utf-8'))
    job_id = SWISSDOCK_CLIENT.service.prepareTarget(target=encoded_protein)
    while True:
        result = SWISSDOCK_CLIENT.service.isTargetPrepared(jobID=job_id)
        if result is None:
            raise ValueError("No such job present")
        if result in (False, "false", 0):
            time.sleep(5)
        else:  # ready!
            break
    protein_files = SWISSDOCK_CLIENT.service.getPreparedTarget(job_id)
    if protein_files is None or len(protein_files) != 2:
        raise ValueError("Could not prepare protein!")
    return protein_files
            

def prepare_ligand(ligand):
    """
    Given a MOL2 file (string contents), returns PDB, RTF, PAR.
    
    Ligand must be protonated beforehand!
    """
    encoded_ligand = zlib.compress(ligand.encode('utf-8'))
    job_id = SWISSDOCK_CLIENT.service.prepareLigand(ligand=encoded_ligand)
    while True:
        result = SWISSDOCK_CLIENT.service.isLigandPrepared(jobID=job_id)
        if result is None:
            raise ValueError("No such job present")
        if result in (False, "false", 0):
            time.sleep(5)
        else:  # ready!
            break
    ligand_files = SWISSDOCK_CLIENT.service.getPreparedLigand(job_id)
    if ligand_files is None or len(ligand_files) != 3:
        raise ValueError("Could not prepare ligand!")
    return ligand_files

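def dock(protein, ligand, name=None):
    """
    Prepare both molecules, submit the docking job, wait until it
    finishes and return the target and docked structures.
    """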
    protein_psf, protein_crd = prepare_protein(protein)
    ligand_pdb, ligand_rtf, ligand_par = prepare_ligand(ligand)
    
    if name is None:
        name = "teachopencadd" + ''.join([random.choice(string.ascii_letters) for _ in range(5)])
    job_id = SWISSDOCK_CLIENT.service.startDocking(
        protein_psf, protein_crd,
        ligand_pdb,
        [ligand_rtf],
        [ligand_par],
        name)
    if job_id in (None, "None"):
        raise ValueError("Docking job could not be submitted")
    while not SWISSDOCK_CLIENT.service.isDockingTerminated(job_id):
        time.sleep(5)
    all_files = SWISSDOCK_CLIENT.service.getPredictedDockingAllFiles(job_id)
    with open('docking_results.zip', 'w') as f:
        f.write(all_files)
    target, docked = SWISSDOCK_CLIENT.service.getPredictedDocking(job_id)
    SWISSDOCK_CLIENT.service.forget(job_id)
    return target, docked
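
The polling loops above wait forever if the server never reports the job as ready. A possible refinement is a small helper with a timeout. This is a sketch, not part of the SwissDock API: `wait_until` and its parameters are hypothetical names, and the "not ready" values mirror the ones checked in the functions above.

```python
import time


def wait_until(is_ready, interval=5.0, timeout=600.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call `is_ready()` until it reports success or `timeout` seconds pass."""
    deadline = clock() + timeout
    while True:
        result = is_ready()
        if result is None:
            raise ValueError("No such job present")
        if result not in (False, "false", 0):
            return result  # ready!
        if clock() >= deadline:
            raise TimeoutError(f"Job not ready after {timeout} seconds")
        sleep(interval)


# Self-contained demo with a fake job that becomes ready on the third check
answers = iter([False, "false", True])
print(wait_until(lambda: next(answers), interval=0.01))

# With the client defined above, usage could look like:
# wait_until(lambda: SWISSDOCK_CLIENT.service.isTargetPrepared(jobID=job_id))
```

Passing `sleep` and `clock` as arguments keeps the helper easy to test without actually waiting.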

In [6]:
# Ugly hack to get Mol2 writer/readers in RDKit
import os
working_dir = os.getcwd()
os.chdir(_dh[0])
!wget https://raw.githubusercontent.com/rdkit/rdkit/60081d31f45fa8d5e8cef527589264c57dce7c65/rdkit/Chem/Mol2Writer.py
os.chdir(working_dir)
import Mol2Writer

--2019-08-26 13:23:12--  https://raw.githubusercontent.com/rdkit/rdkit/60081d31f45fa8d5e8cef527589264c57dce7c65/rdkit/Chem/Mol2Writer.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.112.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.112.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12380 (12K) [text/plain]
Saving to: ‘Mol2Writer.py.2’


2019-08-26 13:23:13 (24,6 MB/s) - ‘Mol2Writer.py.2’ saved [12380/12380]



In [3]:
def step_03_swissdock(protein, ligand):
    # Protein must be PDB
    # TODO: Convert from MOL2 to PDB
    # rd_protein = Mol2Writer.MolFromCommonMol2Block(protein)
    # protein_pdb = Chem.MolToPDBBlock(rd_protein)
    protein_pdb = protein
    # Ligand must be protonated Mol2
    ligand_mol2 = Mol2Writer.MolToCommonMol2Block(ligand)
    return dock(protein_pdb, ligand_mol2)

In [11]:
from rdkit import Chem
from rdkit.Chem import AllChem
# Retrieve protein and ligands from previous steps
with open('data/protein.mol2') as f:
    protein = f.read()
with open('data/similar_smiles.txt') as f:
    smiles = [line.strip() for line in f]

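ligands = []
for s in smiles:
    m = Chem.MolFromSmiles(s)
    if m is None:  # skip SMILES that RDKit cannot parse
        continue
    m = Chem.AddHs(m)  # explicit hydrogens (ligand must be protonated)
    AllChem.EmbedMolecule(m)  # generate 3D coordinates
    ligands.append(m)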

In [14]:
step_03_swissdock(protein, ligands[0])

URLError: <urlopen error [Errno 111] Connection refused>

### Perform docking with OPAL webservices
SwissDock has not been working recently, so we resort to another webservice. The interface is a bit more rudimentary, but it should work. However, the protein and ligand must be prepared locally.

In [5]:
import base64
import time

import requests

def opal_prepare_protein(protein):
    """
    AutoDock expects PDBQT files
    """
    pass

def opal_prepare_ligand(ligand):
    """
    AutoDock expects PDBQT files
    """
    pass

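def opal_run_docking(protein, ligand, config):
    """
    Connect to OPAL webservices and submit job.

    All three arguments are file contents as strings:
    receptor PDBQT, ligand PDBQT and Vina configuration.
    """
    client = Client("http://nbcr-222.ucsd.edu/opal2/services/vina_1.1.2")

    def encode(text):
        # base64.b64encode needs bytes and returns bytes
        return base64.b64encode(text.encode('utf-8')).decode('ascii')

    file_map = {
        'receptor.pdbqt': encode(protein),
        'ligand.pdbqt': encode(ligand),
        'vina.conf': encode(config),
        'results.pdbqt': 'results.pdbqt'
    }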
    cli_args = "--receptor receptor.pdbqt --ligand ligand.pdbqt --config vina.conf --out results.pdbqt"
    
    response = client.service.launchJobBlocking(cli_args, inputFile=file_map)
    if response.status.code != 8:  # failed!
        raise ValueError("Could not perform docking!")
    output_files = {
        'stdout.txt': response.stdOut,
        'stderr.txt': response.stdErr,
    }
    for f in response.outputFile:
        r = requests.get(f.url)
        r.raise_for_status()
        output_files[f.name] = r.text
        time.sleep(0.5)
    
    return output_files
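
The job above uploads a `vina.conf` file together with the receptor and the ligand. A minimal sketch of how its contents could be generated and base64-encoded follows. The helper names are hypothetical and the box values are placeholders, not taken from the kinase target; the option names (`center_x`, `size_x`, `exhaustiveness`, `num_modes`) are standard AutoDock Vina configuration keys.

```python
import base64


def make_vina_config(center, size, exhaustiveness=8, num_modes=9):
    """Build the text of an AutoDock Vina configuration file.

    `center` and `size` are (x, y, z) tuples in angstrom that define
    the search box around the binding site.
    """
    lines = [
        f"center_x = {center[0]:.3f}",
        f"center_y = {center[1]:.3f}",
        f"center_z = {center[2]:.3f}",
        f"size_x = {size[0]:.3f}",
        f"size_y = {size[1]:.3f}",
        f"size_z = {size[2]:.3f}",
        f"exhaustiveness = {exhaustiveness}",
        f"num_modes = {num_modes}",
    ]
    return "\n".join(lines) + "\n"


def encode_text(text):
    """Base64-encode file contents; returns an ASCII string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# Placeholder box: replace with the coordinates of your binding site
config = make_vina_config(center=(1.0, 2.0, 3.0), size=(20, 20, 20))
print(config)
print(encode_text(config)[:40], "...")
```

In practice, the box is usually centered on the co-crystallized ligand or on the pocket residues and made large enough to contain the whole candidate ligand.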
        

# Visualize docking results

Once the calculation has run and the files have been downloaded, it's time to visualize them! You will see how to do that in Part C.
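
Before moving on to visualization, a quick sanity check is to read the predicted affinities from the results file. AutoDock Vina writes one `REMARK VINA RESULT:` line per pose in its output PDBQT. The sketch below parses those lines; `parse_vina_scores` is a hypothetical helper and the sample text is made up for illustration, not real output from this talktorial.

```python
def parse_vina_scores(pdbqt_text):
    """Return the affinity (kcal/mol) of every pose in a Vina output PDBQT."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, affinity, RMSD l.b., RMSD u.b.
            scores.append(float(line.split()[3]))
    return scores


# Made-up excerpt that only mimics the layout of a results file
SAMPLE = """MODEL 1
REMARK VINA RESULT:      -7.5      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -6.9      1.824      2.511
ENDMDL
"""

scores = parse_vina_scores(SAMPLE)
print(f"{len(scores)} poses, best affinity: {min(scores)} kcal/mol")
```

Assuming the job returns a `results.pdbqt` file, the same function could be applied to its downloaded contents. An empty list would suggest that the docking did not produce any pose.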

# Discussion

Pending

# Quiz

- How can you tell whether the docking ran successfully on the remote server?
- Why do we need to prepare the AutoDock Vina input files locally?