# Talktorial 11 (part C)

# CADD web services that can be used via a Python API

__Developed at AG Volkamer, Charité__

Dr. Jaime Rodríguez-Guerra

## Aim of this talktorial

> This is part C of the "Online webservices" talktorial:
>
> - 11a. Querying KLIFS & PubChem for potential kinase inhibitors
> - 11b. Docking the candidates against the target obtained in 11a
> - __11c. Assessing the results and comparing against known data__

After obtaining input structures and docking them, we will assess whether the results are any good.

## Learning goals

### Theory

- Protein-ligand interactions
- False positives in docking

### Practical

- Visualize the results
- Run automated analysis

### Discussion

Pending.

### Quiz

Pending.

***

# Theory

## Protein-ligand interactions

Pending

## False positives

Pending

***

# Practical

## Visualize the results

We will use `nglview` for that! It is a web-based molecular viewer that runs inside Jupyter notebooks. It also reads `PDBQT` files out of the box, although it only loads the first model (we will see how to deal with that).

To install `nglview`, run `pip install nglview` in your environment (a `conda-forge` package is also available).

We will use `ipywidgets` to create an interactive GUI in the notebook. That way, we can click on the different poses and the viewer will refresh accordingly. In particular, we want our little GUI to:

- Show the list of poses and their affinities, as reported in the Vina output
- Show the protein structure with a ribbon representation, the ligand with ball and stick, and the surrounding residues with licorice (stick-only)
- The 3D visualization should respond to the user selecting a different pose in the list.

So, this means that we need to:

1. Invoke the NGL viewer with the adequate representations
2. Build an interactive table of results (hint: use `ipywidgets.Select`)
3. Write an event handler that can communicate with the NGL Viewer: when the user clicks on a new entry, update the ligand in display, along with the surrounding residues. The ribbon should not need to be updated.

In [35]:
import pandas as pd
import time
import nglview as nv
from IPython.display import display  # used explicitly by show_docking below
from ipywidgets import AppLayout, Layout, Select

The PDBQT file created by Vina contains several models, but `nglview` will only parse the first one. The workaround is simple: divide the file into individual models by splitting whenever an `ENDMDL` line is found.

In [None]:
def split_pdbqt(path):
    """
    Split a multimodel PDBQT into separate files.
    """
    files = []
    with open(path) as f:
        lines = []
        i = 0
        for line in f:
            lines.append(line)
            if line.strip() == 'ENDMDL':
                fn = f'data/results.{i}.pdbqt'
                with open(fn, 'w') as o:
                    o.write(''.join(lines))
                files.append(fn)
                i += 1
                lines = []
    return files
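
To see the splitting rule in isolation, here is a minimal sketch that works on an in-memory string instead of files. The helper name `split_models` and the sample content are made up for illustration (the atom records are omitted); only the `ENDMDL` rule comes from the function above.

```python
def split_models(text):
    """Split multi-model PDBQT text into one string per model."""
    models, current = [], []
    for line in text.splitlines(keepends=True):
        current.append(line)
        # A model is complete as soon as its ENDMDL line is reached
        if line.strip() == "ENDMDL":
            models.append("".join(current))
            current = []
    return models


# Hypothetical two-model content; real files contain atom records as well
sample = (
    "MODEL 1\n"
    "REMARK VINA RESULT:      -7.2      0.000      0.000\n"
    "ENDMDL\n"
    "MODEL 2\n"
    "REMARK VINA RESULT:      -7.0      1.532      2.187\n"
    "ENDMDL\n"
)
models = split_models(sample)
print(len(models))
print(models[1].splitlines()[0])
```

Each returned string could then be written to its own file, which is what `split_pdbqt` does.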

The Vina output is a simple text file that contains the table of results. Parsing that table is relatively straightforward. We return a Pandas DataFrame for a simple visualization, if needed.

In [None]:
def parse_output(out):
    """
    Create a DataFrame out of the Vina output file
    """
    data = []
    with open(out) as f:
        for line in f:
            # The results table starts right after the '-----+' separator line
            if line.startswith('-----+'):
                for line in f:
                    fields = line.split()
                    # Stop at the first line that is not a table row
                    # (this also handles blank lines and the end of the file)
                    if not fields or not fields[0].isdigit():
                        break
                    data.append([int(fields[0])] + list(map(float, fields[1:])))
    return pd.DataFrame.from_records(data,
                                     columns=['Mode', 'Affinity (kcal/mol)', 'RMSD (l.b.)', 'RMSD (u.b.)'],
                                     exclude=['Mode'])
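
For reference, the sketch below shows the kind of table that is being parsed and applies the same rule (read rows after the `-----+` separator until a non-numeric line appears) with the standard library only. The helper name `parse_table` and the numbers in the sample are illustrative assumptions, not real docking results.

```python
def parse_table(text):
    """Return a list of (mode, affinity, rmsd_lb, rmsd_ub) tuples."""
    rows = []
    in_table = False
    for line in text.splitlines():
        if line.startswith("-----+"):
            in_table = True
            continue
        if in_table:
            fields = line.split()
            # The table ends at the first line that does not start with a mode number
            if not fields or not fields[0].isdigit():
                break
            rows.append((int(fields[0]), *map(float, fields[1:4])))
    return rows


# Hypothetical excerpt laid out like the table in a Vina log
sample = """\
mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1         -7.2      0.000      0.000
   2         -7.0      1.532      2.187
Writing output ... done.
"""
rows = parse_table(sample)
for row in rows:
    print(row)
```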

Here we create the NGL viewer instance. Instead of creating a new one for each protein-pose pair, we will reuse the same canvas throughout, hiding or showing ligands as needed. We will load everything first, labeling each ligand with its affinity as we go.

In [1]:
def create_viewer(protein, ligands, affinities):
    """
    Create a nglview widget with the protein and all the ligands labeled by affinities
    """
    v = nv.show_file(protein)
    label_kwargs = dict(labelType="text", sele="@0", showBackground=True, backgroundColor="black")
    for ligand, affinity in zip(ligands, affinities):
        c = v.add_component(ligand)
        c.add_label(labelText=[str(affinity)], **label_kwargs)
    return v

In the cell below we will build the actual GUI!

It will be composed of two widgets arranged horizontally using the `AppLayout` layout.

- The selector (`ipywidgets.Select`)
- The NGL viewer

When the user clicks on a new entry in the selector, `_on_selection_change` will be called, which will:

1. Check if the new value is any different from the previous one. If that's the case, then:
2. Hide all ligands (simpler way to hide the previous one; no need to check individually)
3. Show the new one and center the camera on it with a cool 500ms animation
4. Execute some JavaScript on the canvas to update the list of residues within 5 Å of the center of the new pose.
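
The index bookkeeping behind these steps is easy to get wrong: the split files are numbered from 0, while the poses in the selector and the ligand components in the viewer are numbered from 1 (component 0 is the protein). The sketch below replays the handler's decision logic on plain Python data, without any widget. The function `on_change` and the affinity values are hypothetical stand-ins.

```python
affinities = [-7.2, -7.0, -6.8]  # hypothetical values
ligand_files = [f"data/results.{i}.pdbqt" for i in range(len(affinities))]

# (label, value) pairs as passed to the selector; values are 1-based pose numbers
options = [(f"#{i} {aff} kcal/mol", i) for i, aff in enumerate(affinities, 1)]


def on_change(change, n_ligands):
    """Return (components to hide, component to show), or None if nothing changed."""
    if change["name"] != "value" or change["new"] == change["old"]:
        return None
    to_hide = list(range(1, n_ligands + 1))  # every ligand; component 0 is the protein
    return to_hide, change["new"]


result = on_change({"name": "value", "new": 2, "old": 1}, len(ligand_files))
print(options[1][0])
print(result)
print(ligand_files[2 - 1])  # file that backs pose #2
```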

In [107]:
# JavaScript code needed to update residues around the ligand
# because this part is not exposed in the Python widget
# Based on: http://nglviewer.org/ngl/api/manual/snippets.html
_RESIDUES_AROUND = """
var protein = this.stage.compList[0];
var ligand_center = this.stage.compList[{index}].structure.atomCenter();
var around = protein.structure.getAtomSetWithinPoint(ligand_center, {radius});
var around_complete = protein.structure.getAtomSetWithinGroup(around);
var last_repr = protein.reprList[protein.reprList.length-1];
protein.removeRepresentation(last_repr);
protein.addRepresentation("licorice", {{sele: around_complete.toSeleString()}});
"""

def show_docking(protein, ligands, vina_output):
    # Split the multi PDBQT ligand file into separate files
    ligands_files = split_pdbqt(ligands)
    # Retrieve affinities
    affinities = parse_output(vina_output)['Affinity (kcal/mol)']
    
    # This is the event handler - action taken when the user clicks on the select box
    def _on_selection_change(change):
        if change['name'] == 'value' and (change['new'] != change['old']):
            v.hide(list(range(1,len(ligands_files) + 1)))  # Hide all ligands
            component = getattr(v, f"component_{change['new']}")
            component.show()  # Display the selected one
            component.center(500)  # Zoom view
            # Show sidechains around ligand
            v._execute_js_code(_RESIDUES_AROUND.format(index=change['new'], radius=5))
                                
    # Create viewer widget
    v = create_viewer(protein, ligands_files, affinities)
    
    # Create selection widget
    selector = Select(options=[(f"#{i} {aff} kcal/mol", i) for (i, aff) in enumerate(affinities, 1)],
                      description="",  rows=len(ligands_files), layout=Layout(width="auto"))
                 
    # Arrange GUI elements
    display(AppLayout(left_sidebar=selector, center=v, pane_widths=[1, 6, 1]))
    
    # Register the event handler, then trigger it manually to display the first pose
    selector.observe(_on_selection_change)
    _on_selection_change({'name': 'value', 'new': 1, 'old': None})

    return v
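
A note on the JavaScript template: since it is filled in with `str.format`, the literal braces of the JavaScript object have to be doubled (`{{` and `}}`), while single braces mark the placeholders. The sketch below uses a shortened, illustrative copy of the template (named `_TEMPLATE` here) to show what the formatted string looks like.

```python
# Shortened copy of the template above, for illustration only
_TEMPLATE = """
var center = this.stage.compList[{index}].structure.atomCenter();
var around = protein.structure.getAtomSetWithinPoint(center, {radius});
protein.addRepresentation("licorice", {{sele: around.toSeleString()}});
"""

# {index} and {radius} are replaced; {{ and }} collapse to single literal braces
js = _TEMPLATE.format(index=2, radius=5)
print(js)
```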

In [108]:
v = show_docking("data/protein.mol2", "data/results.pdbqt", "data/vina.out")
