# Visualizing ASAP targets

A key aspect of communicating computational chemistry is having easy-to-interpret visual aids that assist decision making. To this end, we have developed portable ways to visualize ASAP targets.

This includes visualizing protein-ligand conformations, molecular dynamics simulations and viral fitness data. 

# HTML views of protein-ligand conformations

Protein-ligand conformations are central to the drug design DMTA cycle and need to be viewed quickly and in large numbers. To this end we developed a portable interactive HTML representation of protein-ligand conformations for our targets based on [3DMol](https://3dmol.csb.pitt.edu/) that can easily be shared between team members and outside collaborators, embedded into various platforms and hosted on cloud repositories.  

To make one of these HTML representations, follow the steps below!

In [1]:
# import some dependencies

from asapdiscovery.dataviz.html_viz import HTMLVisualizer
from asapdiscovery.data.testing.test_resources import fetch_test_file
from asapdiscovery.simulation.simulate import SimulationResult
from asapdiscovery.docking.openeye import POSITDockingResults
from IPython.display import display, HTML, IFrame
from asapdiscovery.docking.docking import DockingInputPair
from asapdiscovery.docking.openeye import POSITDocker
from asapdiscovery.data.backend.openeye import oechem
from asapdiscovery.data.schema.complex import Complex, PreppedComplex
from asapdiscovery.data.schema.ligand import Ligand


To learn more about how base-level abstractions such as `Ligand` and `Complex` work, we recommend running through the `working_with_data` tutorial (see Tutorial index).

We have designed the `Visualization` module (and others) so that they work seamlessly with multiple levels of abstraction. Here we will be exploring making HTML renders from a **PDB file**, an in-memory `Complex` object and from a set of **docking results**. This gives flexibility to work with data that is more or less structured with ease. 
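One common way to accept several input types behind a single entry point is type-based dispatch. The sketch below is illustrative only: `ToyComplex`, `ToyDockingResult` and `describe_input` are hypothetical stand-ins, and it does not claim to show how `asapdiscovery` implements this internally.

```python
from dataclasses import dataclass
from functools import singledispatch
from pathlib import Path


@dataclass
class ToyComplex:
    """Hypothetical stand-in for an in-memory protein-ligand complex."""
    target_name: str
    compound_name: str


@dataclass
class ToyDockingResult:
    """Hypothetical stand-in for a docking result that wraps a posed complex."""
    posed_complex: ToyComplex


@singledispatch
def describe_input(obj) -> str:
    # Fallback for input types we do not know how to handle.
    raise TypeError(f"unsupported input type: {type(obj).__name__}")


@describe_input.register
def _(obj: Path) -> str:
    # Least structured input: a file on disk.
    return f"PDB file: {obj.name}"


@describe_input.register
def _(obj: ToyComplex) -> str:
    return f"complex: {obj.compound_name} bound to {obj.target_name}"


@describe_input.register
def _(obj: ToyDockingResult) -> str:
    # Unwrap the docking result and reuse the complex handler.
    return "docked " + describe_input(obj.posed_complex)
```

Each handler normalizes its input, so the caller can pass whichever representation is at hand.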

### From a PDB file 

In [2]:
protein = fetch_test_file("Mpro-P2660_0A_bound-prepped_complex.pdb") # fetch a PDB file from the test suite, in this case a PDB from the COVID MOONSHOT.

We will use the `HTMLVisualizer` factory class to create our renders. Let's inspect its arguments.

In [3]:
HTMLVisualizer?

Init signature:
HTMLVisualizer(
    *,
    target: asapdiscovery.data.services.postera.manifold_data_validation.TargetTags,
    color_method: asapdiscovery.dataviz.html_viz.ColorMethod = <ColorMethod.subpockets: 'subpockets'>,
    debug: bool = False,
    write_to_disk: bool = True,
    output_dir: pathlib.Path = 'html'

* We need to provide an ASAP target for the `target` argument, e.g. `SARS-CoV-2-Mpro`.
* We would like to colour by `subpockets` (more on other options later).
* We would like to align to a canonical reference structure, so we set `align=True`.
* For the purposes of this notebook we will write to a folder called "html".
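The signature shows `color_method` typed as an enum, while this tutorial passes plain strings such as `"subpockets"`. The sketch below shows how a string can be coerced to an enum member in plain Python. `ToyColorMethod` and `parse_color_method` are hypothetical names, only the two values used in this tutorial are listed, and the real class may validate its arguments differently.

```python
from enum import Enum


class ToyColorMethod(str, Enum):
    """Hypothetical stand-in for the ColorMethod enum; the real enum may have other members."""
    subpockets = "subpockets"
    fitness = "fitness"


def parse_color_method(value) -> ToyColorMethod:
    """Coerce a string (or an existing member) to ToyColorMethod with a readable error."""
    try:
        return ToyColorMethod(value)
    except ValueError:
        allowed = ", ".join(m.value for m in ToyColorMethod)
        raise ValueError(
            f"unknown color method {value!r}; expected one of: {allowed}"
        ) from None
```

Validating early like this means a misspelled option fails at construction time rather than midway through a batch of renders.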

In [4]:
# create a visualization factory. 
html_vizualizer = HTMLVisualizer(
        target="SARS-CoV-2-Mpro",
        color_method="subpockets",
        align=True,
        output_dir="html",
        write_to_disk=True,
    )


Fantastic! Now let's run our renders, passing in our list of inputs. We can optionally use [dask](https://www.dask.org/) to parallelize over the list of inputs for higher performance. This matters when dealing with many structures, but is unnecessary for now.
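To illustrate the serial versus parallel choice behind a flag like `use_dask`, here is a minimal sketch that uses the standard library's `concurrent.futures` in place of dask so that it runs without extra dependencies. `render_one` and `render_all` are hypothetical; this is not the library's implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def render_one(name: str) -> str:
    """Hypothetical stand-in for rendering a single structure to HTML."""
    return f"<html><body>{name}</body></html>"


def render_all(names, parallel: bool = False, max_workers: int = 4):
    """Render every input, serially or with a thread pool; input order is preserved."""
    if not parallel:
        return [render_one(n) for n in names]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, matching the serial path.
        return list(pool.map(render_one, names))
```

For a single structure the serial path avoids scheduling overhead; the parallel path pays off only once the list of inputs grows.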

In [5]:
# create our visualizations, explicitly specifying an output path
vizs = html_vizualizer.visualize(inputs=[protein], outpaths=["render.html"], use_dask=False)

2024-05-03 16:38:19,787 [INFO] [plipcmd.py:124] plip.plipcmd: Protein-Ligand Interaction Profiler (PLIP) 2.3.0
2024-05-03 16:38:19,787 [INFO] [plipcmd.py:125] plip.plipcmd: brought to you by: PharmAI GmbH (2020-2021) - www.pharm.ai - hello@pharm.ai
2024-05-03 16:38:19,787 [INFO] [plipcmd.py:126] plip.plipcmd: please cite: Adasme,M. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucl. Acids Res. (05 May 2021), gkab294. doi: 10.1093/nar/gkab294
2024-05-03 16:38:19,787 [INFO] [plipcmd.py:49] plip.plipcmd: starting analysis of tmp_complex.pdb
2024-05-03 16:38:20,001 [INFO] [plipcmd.py:165] plip.plipcmd: finished analysis, find the result files in /var/folders/f5/0zcc5b7570jc40ws28tqdp740000gn/T/tmph8imqy30/


In [6]:
vizs # result is a dataframe

| | ligand_id | target_id | SMILES | html_path_pose |
|---|---|---|---|---|
| 0 | Mpro-P2660_0A_bound-prepped_complex_ligand | Mpro-P2660_0A_bound-prepped_complex_target | CNC(=O)CN1C[C@]2(CCN(C2=O)c3cncc4c3cc(cc4)Cl)c... | html/render.html |
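After a batch of renders it can be useful to confirm that every path in the `html_path_pose` column points at a file that was actually written. The helper below is a hypothetical sketch that works on any iterable of paths; the demonstration uses a temporary directory rather than the tutorial's real output.

```python
import tempfile
from pathlib import Path


def missing_renders(html_paths):
    """Return the subset of paths that do not exist as files on disk."""
    return [p for p in html_paths if not Path(p).is_file()]


# Demonstration with one file that exists and one that does not.
with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "render.html"
    present.write_text("<html></html>")
    absent = Path(tmp) / "not_written.html"
    missing = missing_renders([present, absent])
```

With the dataframe above, the same helper could be given the values of the `html_path_pose` column.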


Now that our render has been written to disk, let's display it in this notebook!

In [7]:
from IPython.display import IFrame
IFrame(vizs["html_path_pose"][0], 1000, 1000)

Wow! Very cool, we now have an interactive way to view ligand-protein complexes of ASAP targets, annotated with key interactions and important protein subpockets for the target of interest. Our medicinal chemists find this very useful for quickly viewing key interactions in docked virtual designs and crystal structures. 

### From an in-memory Complex representation. 
We can follow similar steps to render an in-memory representation of our complex to an HTML view.

In [8]:
# make a complex 
sars_cov_2_complex = Complex.from_pdb(protein, ligand_kwargs={"compound_name": "Mpro-P2660-bound-target"}, target_kwargs={"target_name": "Mpro-P2660"})

In [9]:
# we can re-use our factory from before 
vizs = html_vizualizer.visualize(inputs=[sars_cov_2_complex], outpaths=["from_complex.html"], use_dask=False)

2024-05-03 16:38:21,047 [INFO] [plipcmd.py:124] plip.plipcmd: Protein-Ligand Interaction Profiler (PLIP) 2.3.0
2024-05-03 16:38:21,047 [INFO] [plipcmd.py:125] plip.plipcmd: brought to you by: PharmAI GmbH (2020-2021) - www.pharm.ai - hello@pharm.ai
2024-05-03 16:38:21,047 [INFO] [plipcmd.py:126] plip.plipcmd: please cite: Adasme,M. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucl. Acids Res. (05 May 2021), gkab294. doi: 10.1093/nar/gkab294
2024-05-03 16:38:21,047 [INFO] [plipcmd.py:49] plip.plipcmd: starting analysis of tmp_complex.pdb
2024-05-03 16:38:21,253 [INFO] [plipcmd.py:165] plip.plipcmd: finished analysis, find the result files in /var/folders/f5/0zcc5b7570jc40ws28tqdp740000gn/T/tmpt1_f32ww/


In [10]:
from IPython.display import IFrame
IFrame(vizs["html_path_pose"][0], 1000, 1000)

Note that you can also open a render directly in your web browser, e.g. `google-chrome html/render.html`.
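If you would rather not depend on a particular browser binary, Python's standard `webbrowser` module can open the file in whichever browser is the system default. This is a small sketch; `local_file_uri` and `open_render` are hypothetical helper names.

```python
import webbrowser
from pathlib import Path


def local_file_uri(path) -> str:
    """Build a file:// URI from a (possibly relative) path."""
    return Path(path).resolve().as_uri()


def open_render(path) -> bool:
    """Open the render in the default browser; returns False if no browser could be launched."""
    return webbrowser.open(local_file_uri(path))
```

For example, `open_render("html/render.html")` would open the render written earlier in this notebook.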

### Docking a new structure!

We have shown pre-prepared examples here so far. What if we want to dock and visualize a new structure?

Note that docking will not be covered in depth here (see the `Docking and Scoring` tutorial for more information). Let's dock our structure.

In [11]:
# make the ligand we want to dock, a simple alkane
ligand = Ligand.from_smiles("CCCCCCC", compound_name="alkane")


In [12]:
# prepare our structure
prepped_sars_cov_2_complex = PreppedComplex.from_complex(sars_cov_2_complex)
# pair it up with the ligand we want to dock.
docking_input_pair = DockingInputPair(complex=prepped_sars_cov_2_complex, ligand=ligand)


Processing BU # 1 with title: DesignUnit Components_LIG, chains AB


In [13]:
# run OpenEye POSIT docking,
docker = POSITDocker(use_omega=False)
results = docker.dock([docking_input_pair], use_dask=False)

# results is a list of POSITDockingResults, lots of info in here
print(results)

[POSITDockingResults(type='POSITDockingResults', input_pair=DockingInputPair(complex=PreppedComplex(target=PreppedTarget(target_name='Mpro-P2660', ids=None, data_format=<DataStorageType.b64oedu: 'b64oedu'>, target_hash='2353f6855b9359b5c6693a8e1dccd24b33c634f839f72d192b68e55b0e7d78b5'), ligand=Ligand(compound_name='Mpro-P2660-bound-target', ids=None, provenance=LigandProvenance(isomeric_smiles='CNC(=O)CN1C[C@]2(CCN(C2=O)c3cncc4c3cc(cc4)Cl)c5cc(ccc5C1=O)Cl', inchi='InChI=1S/C24H20Cl2N4O3/c1-27-21(31)12-29-13-24(19-9-16(26)4-5-17(19)22(29)32)6-7-30(23(24)33)20-11-28-10-14-2-3-15(25)8-18(14)20/h2-5,8-11H,6-7,12-13H2,1H3,(H,27,31)/t24-/m1/s1', inchi_key='JZJCSVMJFIAMQB-XMMPIXPASA-N', fixed_inchi='InChI=1/C24H20Cl2N4O3/c1-27-21(31)12-29-13-24(19-9-16(26)4-5-17(19)22(29)32)6-7-30(23(24)33)20-11-28-10-14-2-3-15(25)8-18(14)20/h2-5,8-11H,6-7,12-13H2,1H3,(H,27,31)/t24-/m1/s1/f/h27H', fixed_inchikey='JZJCSVMJFIAMQB-DLYUOGNHNA-N'), experimental_data=None, expansion_tag=None, tags={}, conf_tags={},

In [14]:

vizs_from_docked =  html_vizualizer.visualize(inputs=results, outpaths=["from_docked.html"], use_dask=False)

2024-05-03 16:39:13,450 [INFO] [plipcmd.py:124] plip.plipcmd: Protein-Ligand Interaction Profiler (PLIP) 2.3.0
2024-05-03 16:39:13,450 [INFO] [plipcmd.py:125] plip.plipcmd: brought to you by: PharmAI GmbH (2020-2021) - www.pharm.ai - hello@pharm.ai
2024-05-03 16:39:13,450 [INFO] [plipcmd.py:126] plip.plipcmd: please cite: Adasme,M. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucl. Acids Res. (05 May 2021), gkab294. doi: 10.1093/nar/gkab294
2024-05-03 16:39:13,450 [INFO] [plipcmd.py:49] plip.plipcmd: starting analysis of tmp_complex.pdb
2024-05-03 16:39:13,639 [INFO] [plipcmd.py:165] plip.plipcmd: finished analysis, find the result files in /var/folders/f5/0zcc5b7570jc40ws28tqdp740000gn/T/tmpc_65fqic/


In [15]:
from IPython.display import IFrame
IFrame(vizs_from_docked["html_path_pose"][0], 1000, 1000)

We can see our alkane was docked nicely to the active site!


Note that for embedding into applications you can also set `write_to_disk=False` to get the raw HTML string, for example:

In [16]:
# create a visualization factory. 
html_vizualizer = HTMLVisualizer(
        target="SARS-CoV-2-Mpro",
        color_method="subpockets",
        align=True,
        write_to_disk=False,
    )

vizs_from_docked_raw =  html_vizualizer.visualize(inputs=results, outpaths=["from_docked.html"], use_dask=False)

  warn("outpaths provided but write_to_disk is False. Ignoring outpaths.")
2024-05-03 16:39:14,999 [INFO] [plipcmd.py:124] plip.plipcmd: Protein-Ligand Interaction Profiler (PLIP) 2.3.0
2024-05-03 16:39:14,999 [INFO] [plipcmd.py:125] plip.plipcmd: brought to you by: PharmAI GmbH (2020-2021) - www.pharm.ai - hello@pharm.ai
2024-05-03 16:39:14,999 [INFO] [plipcmd.py:126] plip.plipcmd: please cite: Adasme,M. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucl. Acids Res. (05 May 2021), gkab294. doi: 10.1093/nar/gkab294
2024-05-03 16:39:14,999 [INFO] [plipcmd.py:49] plip.plipcmd: starting analysis of tmp_complex.pdb
2024-05-03 16:39:15,176 [INFO] [plipcmd.py:165] plip.plipcmd: finished analysis, find the result files in /var/folders/f5/0zcc5b7570jc40ws28tqdp740000gn/T/tmpb4qo5ybt/


In [17]:
vizs_from_docked_raw

| | 0 |
|---|---|
| 0 | `<!DOCTYPE HTML>\n<html lang="en">\n <head>\n ...` |
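Once you have the raw HTML string, one way to embed it in another page is an `<iframe>` with the `srcdoc` attribute, which keeps the render's scripts and styles isolated from the host page. The sketch below uses only the standard library; `embed_as_iframe` is a hypothetical helper and not part of `asapdiscovery`.

```python
from html import escape


def embed_as_iframe(raw_html: str, width: int = 1000, height: int = 1000) -> str:
    """Wrap a full HTML document in an <iframe srcdoc="..."> snippet."""
    # escape(..., quote=True) makes the document safe inside a double-quoted attribute.
    return (
        f'<iframe srcdoc="{escape(raw_html, quote=True)}" '
        f'width="{width}" height="{height}"></iframe>'
    )


snippet = embed_as_iframe('<!DOCTYPE HTML>\n<html lang="en"></html>')
```

The resulting snippet can be dropped into a template or dashboard that accepts HTML fragments.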


# HTML views of protein-ligand conformations with fitness data

ASAP's targets are viral proteins, and are thus highly mutable. An effective therapeutic must bind not only to the predominant variant currently circulating, but also to variants in accessible regions of sequence space. For this reason it is beneficial to select for interactions with highly conserved residues.

ASAP has worked with the Bloom lab to obtain Deep Mutational Scanning ([DMS](https://www.nature.com/articles/nmeth.3027)) data for SARS-CoV-2-Mpro (https://doi.org/10.1093/ve/veae026) and for SARS-CoV-2-Mac1 (DOI:?). These data can be visualized on the 3D protein structure to show medicinal chemists whether designed compounds interact with conserved or non-conserved residues.


These visualizations also contain [logoplots](https://en.wikipedia.org/wiki/Sequence_logo) that inform the viewer about the sequence space available at each residue.
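For readers unfamiliar with sequence logos, the classic version scales each position by its information content: the maximum possible entropy minus the observed Shannon entropy of the residues at that position. The sketch below computes that quantity for one alignment column, without the small-sample correction. It illustrates the general technique only; the logoplots in the fitness viewer are built from DMS data and may be computed differently.

```python
import math
from collections import Counter


def information_content(column: str, alphabet_size: int = 20) -> float:
    """Information content in bits of one alignment column.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    observed residue frequencies (no small-sample correction).
    """
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy
```

A fully conserved position reaches the maximum of `log2(20)` bits for amino acids, while a position where every residue is equally likely scores zero.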


We are in the process of spinning out this fitness viewer into a self-contained package called `choppa` (https://github.com/asapdiscovery/choppa). Watch this space!

You can easily make these visualizations by setting the `color_method` keyword to `fitness`.

Residues highlighted in red are highly mutable, white are less mutable and blue are missing data. 
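The colour scheme described above can be pictured as a mapping from a per-residue mutability score to a colour. The sketch below assumes a score normalized to the range 0 to 1 and uses `None` for missing data; both the scale and the function name `mutability_to_hex` are assumptions for illustration, not the viewer's actual implementation.

```python
from typing import Optional


def mutability_to_hex(score: Optional[float]) -> str:
    """Map a mutability score in [0, 1] to a white-to-red hex colour; None means missing data."""
    if score is None:
        return "#0000ff"  # blue: no data for this residue
    clamped = min(1.0, max(0.0, score))
    # Fade the green and blue channels from 255 (white) down to 0 (red).
    fade = round(255 * (1.0 - clamped))
    return f"#ff{fade:02x}{fade:02x}"
```

Using a distinct hue for missing data keeps "no measurement" from being mistaken for "conserved".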


In [18]:
# create a visualization factory. 
html_vizualizer = HTMLVisualizer(
        target="SARS-CoV-2-Mpro",
        color_method="fitness",
        align=True,
        write_to_disk=True,
    )

vizs_from_docked_fitness =  html_vizualizer.visualize(inputs=results, outpaths=["fitness_from_docked.html"], use_dask=False)

2024-05-03 16:39:35,586 [INFO] [plipcmd.py:124] plip.plipcmd: Protein-Ligand Interaction Profiler (PLIP) 2.3.0
2024-05-03 16:39:35,586 [INFO] [plipcmd.py:125] plip.plipcmd: brought to you by: PharmAI GmbH (2020-2021) - www.pharm.ai - hello@pharm.ai
2024-05-03 16:39:35,586 [INFO] [plipcmd.py:126] plip.plipcmd: please cite: Adasme,M. et al. PLIP 2021: expanding the scope of the protein-ligand interaction profiler to DNA and RNA. Nucl. Acids Res. (05 May 2021), gkab294. doi: 10.1093/nar/gkab294
2024-05-03 16:39:35,586 [INFO] [plipcmd.py:49] plip.plipcmd: starting analysis of tmp_complex.pdb
2024-05-03 16:39:35,763 [INFO] [plipcmd.py:165] plip.plipcmd: finished analysis, find the result files in /var/folders/f5/0zcc5b7570jc40ws28tqdp740000gn/T/tmprxekk__t/


In [19]:
from IPython.display import IFrame
IFrame(vizs_from_docked_fitness["html_path_fitness"][0], 1000, 1000)