This exercise introduces molecular visualization.

It is intended to help you achieve the following learning objectives:
* Visualize molecular structures with different styles. Compare the advantages and disadvantages of several styles.
* Align coordinates of protein structures with different amino acid sequences.

When you are done with this exercise, save it under `Chem456-2022F/exercises` on Google Drive. It will be graded as satisfactory or unsatisfactory based on correctly completing the sections after # -->.

# Part I. Visualizing molecular structures with py3Dmol

In this part, we will visualize the structures of various biomolecules using [py3Dmol](https://github.com/avirshup/py3dmol), which allows for dependency-free molecular visualization in Jupyter notebooks. py3Dmol wraps the [3Dmol.js](http://3dmol.csb.pitt.edu/doc/index.html) library for online molecular visualization.

This exercise is an abbreviated and adapted version of [Lab 02 of IIBM3202 Molecular Modeling and Simulation](https://github.com/pb3lab/ibm3202/blob/master/tutorials/lab02_molviz.ipynb) from the Institute for Biological and Medical Engineering at Pontificia Universidad Catolica de Chile.

As a first step, we will need to install the package.

In [None]:
# Install py3Dmol only if it is not already available
try:
  import py3Dmol
except ImportError:
  !pip install py3Dmol
  import py3Dmol

We will also use prody, a package for manipulation and retrieval of protein structures and sequences.

In [None]:
# Install prody only if it is not already available
try:
  import prody
except ImportError:
  !pip install -U prody
  import prody

## Retrieving structures

Now let us look at the structure of an important SARS-CoV-2 drug target, the main protease (MPro). Many structures of this enzyme are available on the [Protein Data Bank](https://www.rcsb.org/), including [5rh2](https://www.rcsb.org/structure/5rh2). This particular structure [helped inspire a clinical candidate](https://doi.org/10.1101/2022.01.26.477782) for COVID-19 treatment.

<p align = "justify">You can retrieve PDB structures from the Protein Data Bank website (www.rcsb.org). You can also download a PDB file with a known accession code directly from the terminal, as shown below, where XXXX must be replaced by the 4-character PDB code:</p>

```
!wget http://www.rcsb.org/pdb/files/XXXX.pdb.gz 
!gunzip XXXX.pdb.gz
```
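
The same download can also be sketched in Python using only the standard library. The URL pattern is copied from the `wget` command above. The helper names `pdb_url` and `download_pdb` are hypothetical, and lowercasing the code is an assumption made so that the saved file name matches the ones used later in this notebook.

```python
import gzip
import urllib.request

# URL pattern taken from the wget command above
PDB_URL_TEMPLATE = "http://www.rcsb.org/pdb/files/{code}.pdb.gz"

def pdb_url(code):
    """Return the download URL for a 4-character PDB code (hypothetical helper)."""
    code = code.strip().lower()
    if len(code) != 4 or not code.isalnum():
        raise ValueError(f"not a 4-character PDB code: {code!r}")
    return PDB_URL_TEMPLATE.format(code=code)

def download_pdb(code):
    """Download one entry, decompress it, and write <code>.pdb (hypothetical helper)."""
    with urllib.request.urlopen(pdb_url(code)) as response:
        text = gzip.decompress(response.read()).decode("utf-8")
    filename = f"{code.strip().lower()}.pdb"
    with open(filename, "w") as handle:
        handle.write(text)
    return filename

# Only the URL is built here; call download_pdb("5rh2") to fetch the file.
print(pdb_url("5RH2"))
```

Validating the code before building the URL catches typos early, before any network request is made.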

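Finally, you can use prody, whose `parsePDB` function downloads the structure if it is not available locally and then parses it.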

In [None]:
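# Download 5rh2.pdb.gz (if needed) and parse it into a prody object
PDB_5rh2 = prody.parsePDB('5rh2')
# Decompress the file so it can be read as plain text for visualization
!gunzip 5rh2.pdb.gz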

## Styles

Now let's visualize the structure. In the code below, notice that we first add a <i>model</i> and then add <i>styles</i>. The `view.addStyle` function requires an [atom selection](https://3dmol.csb.pitt.edu/doc/types.html#AtomSpec) and a [style specification](https://3dmol.csb.pitt.edu/doc/types.html#AtomStyleSpec). The available styles include `line`, `stick`, `sphere`, and `cartoon`, and each style accepts properties such as a [colorscheme](https://3dmol.csb.pitt.edu/doc/types.html#ColorschemeSpec) or a specific [color](https://3dmol.csb.pitt.edu/doc/types.html#ColorSpec).

In [None]:
view = py3Dmol.view()
view.addModel(open('5rh2.pdb', 'r').read(),'pdb')
view.setBackgroundColor('white')
view.setStyle({'chain':'A'}, {'cartoon': {'color':'purple'}})
view.addStyle({'resn':'UH7'}, {'stick': {'colorscheme':'yellowCarbon'}})
view.addStyle({'within':{'distance':'8', 'sel':{'resn':'UH7'}}}, {'stick': {}})
view.addLabel("HIS41",{'fontOpacity':1},{'resi':'41'})
view.addLabel("CYS145",{'fontOpacity':1},{'resi':'145'})
view.zoomTo()
view.show()
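
The selection and style arguments in the cell above are ordinary Python dictionaries, so they can be built and inspected before being passed to the viewer. The sketch below uses only the standard library. `make_style` is a hypothetical helper, not part of py3Dmol, and the JSON printout assumes that the dictionaries are eventually serialized for 3Dmol.js in the browser.

```python
import json

def make_style(kind, **properties):
    """Build a style specification such as {'stick': {'colorscheme': 'yellowCarbon'}}."""
    allowed = {"line", "stick", "sphere", "cartoon"}  # styles named in the text above
    if kind not in allowed:
        raise ValueError(f"unknown style: {kind}")
    return {kind: dict(properties)}

# Selection: every atom whose residue name is UH7 (the ligand)
ligand_selection = {"resn": "UH7"}
ligand_style = make_style("stick", colorscheme="yellowCarbon")
protein_style = make_style("cartoon", color="purple")

print(json.dumps(ligand_selection), json.dumps(ligand_style))
# In the notebook these would be used as:
#   view.addStyle(ligand_selection, ligand_style)
```

Building specifications through one helper makes a misspelled style name fail immediately instead of silently drawing nothing.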

Try playing with the view by using the mouse controls:

| Movement | Mouse Input | Touch Input |
| -------- | ----------- | ----------- |
| Rotation | Primary Mouse Button | Single touch |
| Translation | Middle Mouse Button or Ctrl+Primary | Triple touch |
| Zoom | Scroll Wheel or Second Mouse Button or Shift+Primary | Pinch (double touch) |
| Slab | Ctrl+Second | Not Available |

Also, try to change the view by removing or modifying the styles.

In [None]:
# --> Write and execute code to display all atoms using the sphere style

# # --> Enter a short answer in this text box

Describe at least one advantage and one disadvantage of each of the following styles:
* line
* stick
* sphere
* cartoon


## Surfaces

Now let's try adding a [surface](https://3dmol.csb.pitt.edu/doc/$3Dmol.GLViewer.html#addSurface).

In [None]:
view = py3Dmol.view()
view.addModel(open('5rh2.pdb', 'r').read(),'pdb')
view.setBackgroundColor('white')
view.setStyle({'chain':'A'}, {'cartoon': {'color':'purple'}})
view.addStyle({'resn':'UH7'}, {'stick': {'colorscheme':'yellowCarbon'}})
view.addStyle({'within':{'distance':'5', 'sel':{'resn':'UH7'}}}, {'stick': {}})
view.addSurface(py3Dmol.VDW, {'opacity':0.85, 'color':'grey'},
  {'not':{'or':[{'resn':'UH7'}, {'resn':'DMS'}]}})
view.zoomTo()
view.show()
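
The surface selection in the cell above nests `not` and `or` so that the surface covers everything except the ligand (UH7) and DMS. To make the logic easier to read, here is a simplified evaluator written for illustration only. It is not the 3Dmol.js implementation: `matches` is a hypothetical function that handles just `and`, `or`, `not`, and exact property matches, and the atom list is toy data.

```python
def matches(atom, selection):
    """Return True if an atom (a dict of properties) satisfies a nested selection."""
    for key, value in selection.items():
        if key == "and":
            ok = all(matches(atom, sub) for sub in value)
        elif key == "or":
            ok = any(matches(atom, sub) for sub in value)
        elif key == "not":
            ok = not matches(atom, value)
        else:
            ok = atom.get(key) == value  # exact property match, e.g. resn == 'UH7'
        if not ok:
            return False
    return True

surface_selection = {"not": {"or": [{"resn": "UH7"}, {"resn": "DMS"}]}}

# Toy atoms carrying only a residue name
atoms = [{"resn": "HIS"}, {"resn": "CYS"}, {"resn": "UH7"}, {"resn": "DMS"}]

selected = [atom["resn"] for atom in atoms if matches(atom, surface_selection)]
print(selected)
```

Only the protein residues remain in `selected`, which is why the ligand stays visible as sticks rather than being buried under the surface.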

# Part II. Structural alignment

It has been observed that **structure space is rather small compared with sequence space**. In fact, hierarchical structure classifications such as **CATH** have shown that only a few of the many structures deposited each year in the Protein Data Bank contribute novel folds, which suggests that almost all protein folds have already been discovered. Moreover, the elegant study by Chothia & Lesk [Chothia C & Lesk AM (1986) _EMBO J_ 5(4), 823–826] revealed that **protein structures are generally quite similar when their sequence identity exceeds 30-40%** (with some remarkable exceptions to this rule).
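
To make the idea of sequence identity concrete, the sketch below computes it for two sequences that have already been aligned. Conventions differ on which positions to count; this version assumes that positions with a gap in either sequence are ignored. The function name and the two short sequences are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of non-gap aligned positions at which the two residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only positions where neither sequence has a gap
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    identical = sum(a == b for a, b in pairs)
    return 100.0 * identical / len(pairs)

# Toy alignment: 7 comparable positions, 6 of them identical
identity = percent_identity("MKT-AYIA", "MKTQAYLA")
print(f"{identity:.1f}% identical")
```

For real proteins, the alignment itself would come from a sequence alignment tool; only the counting step is shown here.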

As an illustrative example of the fact that similar sequences often lead to similar structures, let's align and visualize crystal structures of MPro from the original SARS-CoV and from SARS-CoV-2.

First, we will use prody to superpose a structure of SARS-CoV-2 MPro onto a structure of SARS-CoV MPro.

In [None]:
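# Download (if needed) and parse the SARS-CoV MPro structure
PDB_1wof = prody.parsePDB('1wof')

# Keep only the protein atoms for chain matching
PDB_1wof_prot = PDB_1wof.select('protein')
PDB_5rh2p = PDB_5rh2.select('protein')

# Pair up equivalent residues in the two structures and take the first match
map_1wof_5rh2, map_5rh2_1wof, seqid, overlap = \
  prody.matchChains(PDB_1wof_prot, PDB_5rh2p)[0]
print(f'Sequence identity: {seqid:.1f}%, overlap: {overlap:.1f}%')

# Compute the transformation that superposes 5rh2 (mobile) onto 1wof (target)
transformation = prody.calcTransformation(map_5rh2_1wof, map_1wof_5rh2)

# Move every atom of 5rh2, including the ligand, and save the structures
prody.applyTransformation(transformation, PDB_5rh2)
prody.writePDB('5rh2_aligned.pdb', PDB_5rh2)
prody.writePDB('1wof.pdb', PDB_1wof)
prody.writePDB('1wofa.pdb', PDB_1wof.select('chain A'))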
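
A common way to judge how well two structures superpose is the root-mean-square deviation (RMSD) between matched atoms. The sketch below computes it in plain Python for two equally long lists of (x, y, z) coordinates. The function name and the toy coordinates are illustrative assumptions, and no superposition is performed here, only the distance measure.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))

# Toy example: the same three points shifted by 2 units along z
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(x, y, z + 2.0) for x, y, z in reference]

print(rmsd(reference, shifted))    # every point is 2 units away
print(rmsd(reference, reference))  # identical coordinates
```

A lower RMSD over the matched atoms after applying the transformation indicates a closer structural match.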

Next, let us visualize the secondary structure of the two proteins. What do you notice?

In [None]:
view = py3Dmol.view()
view.setBackgroundColor('white')
view.addModel(open('1wofa.pdb', 'r').read(),'pdb')
view.addModel(open('5rh2_aligned.pdb', 'r').read(),'pdb')
view.setStyle({'model':0}, {'cartoon': {'color':'purple'}})
view.setStyle({'model':1}, {'cartoon': {'color':'yellow'}})
view.zoomTo()
view.show()

Next, we will visualize and compare side chain positions.

In [None]:
view = py3Dmol.view()
view.setBackgroundColor('white')
view.addModel(open('1wofa.pdb', 'r').read(),'pdb')
view.addModel(open('5rh2_aligned.pdb', 'r').read(),'pdb')
view.setStyle({'and':[{'model':0},{'chain':'A'}]}, {'cartoon': {'color':'purple'}})
view.addStyle({'and':[{'model':0},{'chain':'A'}]}, {'stick': {'colorscheme':'purpleCarbon'}})
view.setStyle({'model':1}, {'cartoon': {'color':'yellow'}})
view.addStyle({'model':1}, {'stick': {'colorscheme':'yellowCarbon'}})
view.zoomTo()
view.show()

MPro is actually a dimer. For simplicity, we have shown only one of the monomers so far. In the following visualization, the other monomer is also shown, in gray.

In [None]:
view = py3Dmol.view()
view.setBackgroundColor('white')
view.addModel(open('1wof.pdb', 'r').read(),'pdb')
view.addModel(open('5rh2_aligned.pdb', 'r').read(),'pdb')
view.setStyle({'and':[{'model':0},{'chain':'A'}]}, {'cartoon': {'color':'purple'}})
view.addStyle({'and':[{'model':0},{'chain':'A'}]}, {'stick': {'colorscheme':'purpleCarbon'}})
view.setStyle({'and':[{'model':0},{'chain':'B'}]}, {'cartoon': {'color':'grey'}})
view.setStyle({'model':1}, {'cartoon': {'color':'yellow'}})
view.addStyle({'model':1}, {'stick': {'colorscheme':'yellowCarbon'}})
view.zoomTo()
view.show()

# # --> Enter a short answer in this text box

Look more closely at the side chain positions. Do you notice any patterns in where the side chains of the two structures adopt the same or different positions
* on the surface
* in the interior of the protein
* at the interface between subunits?