# Advanced: Plotting drug binding hotspots on the protein

This is an example of some of the more advanced things one can do with Contact Map Explorer. The system here is a drug molecule (a single residue) binding to a protein. So our query is the drug molecule residue, and the haystack is the protein (heavy atoms for both).

Since the query contains only one residue, every contact pair includes it, so that residue carries no information. Once we remove it, we're left with the fraction of the trajectory that the drug spends in contact with each protein residue. In other words, this tells us where the hotspots in the binding event are. But because proteins fold in three dimensions, position in the sequence alone doesn't show you where the hotspots sit: you need a 3D representation of the structure.

For that 3D representation, we'll go to [NGLView](https://github.com/arose/nglview), an excellent tool for visualizing 3D molecular structures in Jupyter notebooks. We'll use the values from our contact map to set the color of the residues in the NGLView visualization.

This notebook shows how using Contact Map Explorer interactively allows you to easily interface with other software in the scientific Python ecosystem. It doesn't use any code that directly links NGLView with Contact Map Explorer. Instead, we show you how, with just a little bit of Python, you can convert the output from a contact map into the input to visualize something with NGLView.

Note that NGLView is not a requirement for Contact Map Explorer, so you may need to install it separately. It can be easily installed with conda; see [the NGLView GitHub page](https://github.com/arose/nglview) for details.

The first part of this is exactly like the contact concurrences example:

In [1]:
from __future__ import print_function
import numpy as np

from contact_map import ContactFrequency
from contact_map import plot_utils
import mdtraj as md
import nglview as nv

traj = md.load("data/gsk3b_example.h5")
print(traj)  # to see number of frames; size of system

<mdtraj.Trajectory with 100 frames, 5704 atoms, 360 residues, and unitcells>


In [2]:
topology = traj.topology
yyg = topology.select('resname YYG and element != "H"')
protein = topology.select('protein and element != "H"')

In [3]:
%%time
contacts = ContactFrequency(traj, query=yyg, haystack=protein)

CPU times: user 1.68 s, sys: 13.2 ms, total: 1.7 s
Wall time: 1.71 s


## Creating the NGLView color mapping

NGLView can take a custom color scheme in the form:

```python
my_scheme = [[color_1, selection_1], [color_2, selection_2], ...]
```

where `color_*` are colors that can be names (`'red'`, `'blue'`) or can be web-formatted hexadecimal values (`'#FF0000'`, `'#0000FF'`), and the `selection_*` are selection strings. (NB: strings, not ints!) Those selection strings should be in NGLView's selection language, which is similar, but not identical, to the one used by MDTraj. Once you've defined the list for `my_scheme`, you can register and use the custom color scheme with, for example:

```python
nv.color.ColormakerRegistry.add_scheme("pretty colors!", my_scheme)  # with `import nglview as nv`
view.add_cartoon(color="pretty colors!")  # for an existing widget called `view`
```
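
Because the selections must be strings, it can help to check a scheme before registering it. The following is a minimal sketch using only plain Python; `check_scheme` is a hypothetical helper written for this illustration and is not part of NGLView or Contact Map Explorer.

```python
def check_scheme(scheme):
    """Hypothetical helper: verify each entry is a [color, selection] pair of strings."""
    for entry in scheme:
        if len(entry) != 2:
            raise ValueError("expected [color, selection], got %r" % (entry,))
        color, selection = entry
        if not isinstance(color, str) or not isinstance(selection, str):
            raise TypeError("color and selection must both be strings: %r" % (entry,))
    return scheme

# Toy scheme: residue 26 in red (by name), residue 27 in blue (web hex).
my_scheme = check_scheme([["red", "26"], ["#0000FF", "27"]])

# An integer selection is the mistake this check is meant to catch.
caught = None
try:
    check_scheme([["red", 26]])
except TypeError as err:
    caught = str(err)
print(caught)
```

The check only looks at types; it does not verify that a selection string is valid in NGLView's selection language.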

We'll start from the contact frequency's `residue_contacts`, and our goal will be to create the list of lists that defines the NGLView color scheme.

The steps we will go through are:

1. Convert the output of `residue_contacts.most_common()` into a mapping from residue index to frequency. Because we want to map every residue, we can use a list (or `np.array`) of length `n_residues` and use the index as the "key" of the mapping.
2. Convert the frequencies in that mapping into colors. For this, we'll use `contact_map.plot_utils.hex_colors`.
3. Convert the mapping of residue index to colors into the format required by NGLView. In particular, NGLView wants the residue indices to count from 1 (not 0) and wants them to be strings.

### Step 1: Make a mapping of protein residue to frequency

In [4]:
# Select the mdtraj.Residue for the drug. Same residue for all atoms in yyg.
yyg_res = topology.atom(yyg[0]).residue
yyg_res

YYG351

In [5]:
# Big idea: we don't need a dictionary for a mapping. A list (or np.array)
# with entries for each residue allows the index to be used for mapping.
# Default value (no contacts) is zero.
frequencies = np.zeros(traj.topology.n_residues)
for res_pair, freq in contacts.residue_contacts.most_common():
    # * `set(res_pair) - set([yyg_res])` takes the pair (order unknown) and 
    #   removes yyg_res, leaving the other residue
    # * `list(...)[0]` makes it into a list and takes the first (only) element 
    key_res = list(set(res_pair) - set([yyg_res]))[0]
    # the index of the residue (counts from 0) is also the index in the array
    frequencies[key_res.index] = freq
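
To see the set-difference trick in isolation, here is a small sketch on toy data that runs without MDTraj. The `Res` tuple is a hypothetical stand-in for `mdtraj.Residue` (hashable, with an `index` attribute), and the contact pairs are assumed to be unordered, so they are modeled as frozensets.

```python
from collections import namedtuple

# Hypothetical stand-in for mdtraj.Residue: hashable, with an `index` attribute.
Res = namedtuple("Res", ["name", "index"])

drug = Res("YYG", 4)
protein = [Res("ALA", 0), Res("LYS", 1), Res("VAL", 2), Res("ASP", 3)]

# Toy (pair, frequency) data; the order within each pair is not defined.
toy_contacts = [
    (frozenset([protein[1], drug]), 0.75),
    (frozenset([drug, protein[3]]), 0.25),
]

frequencies = [0.0] * 5  # one slot per residue; default is "no contact"
for pair, freq in toy_contacts:
    # Remove the drug from the pair; exactly one partner remains.
    (partner,) = set(pair) - {drug}
    frequencies[partner.index] = freq

print(frequencies)  # [0.0, 0.75, 0.0, 0.25, 0.0]
```

Unpacking with `(partner,) = ...` is a slightly stricter variant of `list(...)[0]`: it raises an error if the pair does not leave exactly one residue behind.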

### Step 2: Convert this to a mapping to colors

Contact Map Explorer provides a function for this. You can use any matplotlib [named color map](https://matplotlib.org/3.1.1/tutorials/colors/colormaps.html), or you can use a custom matplotlib `Colormap` instance.

In [6]:
# convert the frequency to colors; NGLView uses web-style hex (prefixed with '#')
colors = plot_utils.hex_colors(frequencies, cmap='Reds', style='web')
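
To make "frequency to web-style hex" concrete, here is a rough sketch of the idea using a plain white-to-red ramp. The function `freq_to_web_hex` is hypothetical and written only for illustration; it is not how `hex_colors` is implemented, and it will not reproduce the exact shades of the `'Reds'` color map.

```python
def freq_to_web_hex(freq):
    """Hypothetical stand-in: map a fraction in [0, 1] onto a white-to-red ramp."""
    clamped = min(max(float(freq), 0.0), 1.0)
    # Red channel stays at full; green and blue fade as the frequency rises.
    fade = int(round(255 * (1.0 - clamped)))
    return "#ff{0:02x}{0:02x}".format(fade)

toy_frequencies = [0.0, 0.5, 1.0]
toy_colors = [freq_to_web_hex(f) for f in toy_frequencies]
print(toy_colors)  # ['#ffffff', '#ff8080', '#ff0000']
```

Residues never in contact come out white, and residues in contact for the whole trajectory come out pure red.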

### Step 3: Convert it into the format needed by NGLView

Keep in mind that NGLView needs strings to identify the residues, and also that it counts residues from 1, whereas Python counts arrays from 0. (Aside: [Should you count from 0 or from 1?](https://xkcd.com/163/))

In [7]:
nv_colorscheme = [[color, str(i+1)] for i, color in enumerate(colors)]

In [8]:
# just to show what this looks like; specifically residues with some color!
print(nv_colorscheme[25:30])  

[['#f75d42', '26'], ['#c2161b', '27'], ['#990c13', '28'], ['#ad1116', '29'], ['#e63228', '30']]
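
The same index shift is useful when you want to read the hotspots off as numbers, not colors. The sketch below ranks residues by contact frequency and labels them with the 1-based strings used in the color scheme. `top_hotspots` is a hypothetical helper, and the frequencies are toy values, not results from this trajectory.

```python
def top_hotspots(frequencies, n=3):
    """Hypothetical helper: the n most-contacted residues as (label, frequency)."""
    ranked = sorted(enumerate(frequencies), key=lambda item: item[1], reverse=True)
    # Skip residues never in contact; shift the 0-based index to a 1-based string label.
    return [(str(i + 1), f) for i, f in ranked[:n] if f > 0]

toy_frequencies = [0.0, 0.75, 0.0, 0.25, 0.0]
hotspots = top_hotspots(toy_frequencies, n=3)
print(hotspots)  # [('2', 0.75), ('4', 0.25)]
```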


## Using NGLView to make the 3D image

Now we actually get to make the picture!

In [9]:
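# register the color scheme under the name "contacts"
nv.color.ColormakerRegistry.add_scheme("contacts", nv_colorscheme)
# align all frames to the first frame (note: superpose modifies traj in place)
view = nv.show_mdtraj(traj.superpose(traj), default=False)
view.center()
# draw the protein as a cartoon, colored by contact frequency
view.add_cartoon(selection='protein', color="contacts")
view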

NGLWidget(max_frame=99)

Note that the interactive widget is only visible in HTML renderings of this notebook; PDFs will not show the image.

In [10]:
#view.download_image("GSK3B_contacts.png", trim=True)  # this is how to save an image you like