# Protein Visualization

You can use Python and the Jupyter notebook to interactively visualize pdb files.

Questions:
- How can I use Python and the Jupyter notebook to visualize protein files?

Objectives:
- Use the Python library `nglview` to visualize pdb files. 
- Learn about different molecular representations.
- Learn how to change molecule representation and display color.
- Learn how to select different parts of the protein structure file.
- Learn to save images.

## NGLView
In this notebook, we will use the Python library `nglview` to visualize data from pdb files. To use `nglview`, we first have to import it. By convention, we shorten `nglview` to `nv` when importing it. We will also import `os` to build the file path to our pdb file.

In [2]:
import os
import nglview as nv

To visualize data from a file, we will use the `show_file` function. Pass the file path of the pdb you want to view to this function and assign the result to a variable called `view`. Before we can do that, we need to find the file and build its path.

In [3]:
cd iqb-101/


/workspaces/iqb-2024/iqb-101


In [4]:
import os
os.getcwd()
os.listdir("PDB_files")


['5eu9.pdb',
 '3iva.pdb',
 '3fgu.pdb',
 '3vnd.pdb',
 '6zt7.pdb',
 '1ddo.pdb',
 '4eyr.pdb',
 '5veu.pdb',
 '2pkr.pdb',
 '7tim.pdb']

In [5]:
filepath = os.path.join('PDB_files', '3fgu.pdb' )
print(filepath)


PDB_files/3fgu.pdb
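
Typing each file name by hand is easy to get wrong. The following is a minimal sketch, assuming the pdb files sit in a folder named `PDB_files` as in the listing above. The helper name `find_pdb_files` is hypothetical and is not part of `nglview`.

```python
from pathlib import Path


def find_pdb_files(folder="PDB_files"):
    """Return a sorted list of paths to the .pdb files in `folder`.

    `find_pdb_files` is an illustrative helper, not an nglview function.
    """
    folder = Path(folder)
    if not folder.is_dir():
        raise FileNotFoundError(f"No such folder: {folder}")
    # sorted() gives a stable, alphabetical order; files with other
    # extensions are skipped by the "*.pdb" pattern.
    return sorted(folder.glob("*.pdb"))
```

In the notebook you could then loop over `find_pdb_files()` and pass `str(path)` to `nv.show_file` for whichever structure you want to view.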


In [6]:
import nglview as nv
view = nv.show_file(filepath)

To actually see our pdb representation, we just type the variable `view` as the last thing in a notebook cell.

In [7]:
view


NGLWidget()

You can hover over different atoms and residues to see information about them. You can also scroll to zoom in or out, and click and drag to rotate the molecule in the viewer. You can even drag the lower right-hand corner to make the visualization window larger.

## Changing Representations

You will notice that `nglview` is representing the ligand and the protein in different ways. By default, `nglview` shows the protein in something called `cartoon` representation and the ligand in `ball+stick` representation. 

You can change or add representations to the molecule. For example, if we wanted to change our representation to all be `ball+stick`, we would first clear the default representation using the `clear_representations` function, then add a `ball+stick` representation.

In [8]:
view = nv.show_file(filepath)
view.clear_representations()
view.add_representation("ball+stick")
view

NGLWidget()

Some available representations are:
- `cartoon` - draws protein backbone structure (alpha helices and beta sheets) and nucleic acid backbone structure.
- `base` - shows nucleic acid bases. Usually used with `cartoon` representation.
- `ball+stick` - draws atoms as spheres connected by sticks (cylinders) for bonds.
- `licorice` - similar to ball+stick, but does not have spheres on atoms.
- `spacefill` - atoms are drawn as large spheres. No bonds are drawn between atoms.
- `hyperball` - a derivative of `ball+stick` in which atoms are smoothly connected.

You can see a full list on the [NGLViewer documentation](https://nglviewer.org/ngl/api/manual/molecular-representations.html).
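
Since the clear-then-add pattern comes up repeatedly, it can be wrapped in a small function. This is a sketch under assumptions: `show_as` and `KNOWN_REPRESENTATIONS` are hypothetical names, and the set contains only the representations listed in this notebook, so you would need to extend it to use others from the NGL documentation.

```python
# Only the names listed in this notebook; extend as needed.
KNOWN_REPRESENTATIONS = {
    "cartoon", "base", "ball+stick", "licorice", "spacefill", "hyperball",
}


def show_as(view, representation, selection="all", **params):
    """Replace every representation on `view` with a single new one.

    `view` is expected to be an nglview widget, e.g. nv.show_file(filepath).
    """
    if representation not in KNOWN_REPRESENTATIONS:
        # Catch a mistyped name such as "ball-stick" before touching the view.
        raise ValueError(f"Unknown representation: {representation!r}")
    view.clear_representations()
    view.add_representation(representation, selection, **params)
    return view
```

For example, `show_as(nv.show_file(filepath), "licorice")` as the last line of a cell should display the whole structure in `licorice`.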

### Exercise
Create a view using a representation of your choice.

In [9]:
view = nv.show_file(filepath)
view.clear_representations()
view.add_representation("spacefill")
view

NGLWidget()

## Selecting Parts of the Molecule

NGLView allows you to select different parts of your molecule and add representations to those parts only. For example, we can add a representation to only the ligands by adding `'ligand'` after our representation keyword in the `add_representation` function.

There are a lot of keywords you can use (a complete list is at the end of this notebook).

In [10]:
view = nv.show_file(filepath)
view.clear_representations()
# Add code here to show ligand
view.add_representation("hyperball", "ligand")

In [11]:
view

NGLWidget()

You can also specify the color and the opacity of the representation and center the view on a selection.

In [19]:
view = nv.show_file(filepath)
view.clear_representations()
# Add code here to show ligand
view.add_representation("hyperball", "ligand", color="red", opacity=0.9)

In [20]:
view.center("ligand")
view

NGLWidget()

### Check your understanding

Add another representation to our view: select the protein and show it in blue using the `cartoon` representation with an opacity of 0.5.

In [21]:
view = nv.show_file(filepath)
view.clear_representations()
# Add code here to show ligand and protein
view.add_representation("hyperball", "ligand", color="red", opacity=0.9)
view.add_representation("cartoon", "protein", color="blue", opacity=0.5)

view.center("ligand")
view

NGLWidget()
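
When a view combines several representations, it can help to describe them as data and apply them in a loop. The sketch below reuses the same representations, selections, colors and opacities as the cell above. The names `LAYERS` and `apply_layers` are hypothetical and not part of `nglview`.

```python
# Each layer is (representation, selection, extra keyword arguments).
LAYERS = [
    ("hyperball", "ligand", {"color": "red", "opacity": 0.9}),
    ("cartoon", "protein", {"color": "blue", "opacity": 0.5}),
]


def apply_layers(view, layers=LAYERS):
    """Clear `view` and add one representation per entry in `layers`."""
    view.clear_representations()
    for representation, selection, params in layers:
        view.add_representation(representation, selection, **params)
    return view
```

Calling `apply_layers(nv.show_file(filepath))` should give the same result as the cell above, and adding a new layer only requires one more entry in the list.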

## Adding Representations of Contacts

We can also add a representation of all the atom contacts in the molecule. You can hover over the contacts to get more information about each. 

In [22]:
view = nv.show_file(filepath)
view.clear_representations()
# Add code here to show ligand and protein
view.add_representation("hyperball", "ligand", color="red", opacity=0.9)
view.add_representation("cartoon", "protein", color="blue", opacity=0.5)
view.add_representation("contact")

view.center("ligand")
view

NGLWidget()

In [25]:
view.center("ligand")
view

NGLWidget(n_components=1)

## Saving Images



Once you have your representation looking the way you like, you can save an image to your computer for use in a presentation. To embed a static image in the Jupyter notebook, use the `render_image` function. To download the image, use the `download_image` function. The image will be saved to your downloads folder.

In [26]:
view.render_image()

Image(value=b'', width='99%')

You can change the filename (default is `screenshot.png`) with the `filename` argument. You can increase the image quality with the `factor` argument. The default value is `4`, and setting a higher number will result in a larger image. You can make the background of your image transparent with the keyword `transparent=True` (the default is `False`).

In [28]:
view.download_image("3fgu_screenshot.png")
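
The arguments described above can be combined in one call. This is a minimal sketch, assuming `download_image` accepts `filename`, `factor` and `transparent` as keyword arguments as described in the text. The wrapper name `save_image` is hypothetical.

```python
def save_image(view, filename="screenshot.png", factor=4, transparent=True):
    """Download the current view as a PNG and return the file name used.

    `view` is expected to be an nglview widget that is already displayed.
    """
    # Add the extension if it was left off, so the file opens as an image.
    if not filename.lower().endswith(".png"):
        filename += ".png"
    view.download_image(filename=filename, factor=factor, transparent=transparent)
    return filename
```

For example, `save_image(view, "3fgu_transparent", factor=6)` would request a larger image with a transparent background, named `3fgu_transparent.png`.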

## Hands On Exercise

There are a large number of keywords you can use for selecting parts of your pdb file:
- all, *
- sidechain
- sidechainAttached (not backbone or .CA or (PRO and .N))
- backbone
- protein
- nucleic
- rna
- dna
- hetero
- ligand (( not polymer or hetero ) and not ( water or ion ))
- ion
- saccharide/sugar
- polymer
- water
- hydrogen
- helix
- sheet
- turn (not helix and not sheet)
- small (Gly or Ala or Ser)
- nucleophilic (Ser or Thr or Cys)
- hydrophobic (Ala or Val or Leu or Ile or Met or Pro or Phe or Trp)
- aromatic (Phe or Tyr or Trp or His)
- amid (Asn or Gln)
- acidic (Asp or Glu)
- basic (His or Lys or Arg)
- charged (Asp or Glu or His or Lys or Arg)
- polar (Asp or Cys or Gly or Glu or His or Lys or Arg or Asn or Gln or Ser or Thr or Tyr)
- nonpolar (Ala or Ile or Leu or Met or Phe or Pro or Val or Trp)
- cyclic (His or Phe or Pro or Trp or Tyr)
- aliphatic (Ala or Gly or Ile or Leu or Val)
- bonded (all atoms with at least one bond)
- ring (all atoms within rings)
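
As the definitions in parentheses suggest, keywords can be combined with `and`, `or`, `not` and parentheses. The sketch below builds such selection strings. The helper name `combine` is hypothetical; it only produces text, which you then pass as the selection argument of `add_representation`.

```python
def combine(*keywords, operator="or"):
    """Join selection keywords into one parenthesized selection string."""
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    if not keywords:
        raise ValueError("at least one keyword is required")
    return "( " + f" {operator} ".join(keywords) + " )"


# Helices or sheets, i.e. everything the list above does not call a turn.
structured = combine("helix", "sheet")

# Side chains of aromatic residues only.
aromatic_sidechains = combine("aromatic", "sidechainAttached", operator="and")

# Everything except water and ions.
no_solvent = "not " + combine("water", "ion")
```

A call such as `view.add_representation("licorice", aromatic_sidechains)` would then draw only the selected atoms.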

Try visualizing some other proteins from your `PDB_files` folder. See if you can figure out how to
1. Color your protein by secondary structure.
2. Change the representation of water molecules.
3. Change the representation of atoms in rings.