# PyMOL Tutorial

At the end of this tutorial, you will be familiar with:
* Navigating a protein structure in PyMOL
* Using PyMOL to make selections, measurements, structural comparisons, and figures

This notebook will walk you through the basics of analyzing protein structures with PyMOL. We will be looking at a series of [cytochrome](https://en.wikipedia.org/wiki/Cytochrome) proteins. This superfamily of proteins is characterized by an iron-containing heme prosthetic group and plays important, diverse roles in cell metabolism. The varied reduction-oxidation chemistry that the cytochrome superfamily catalyzes, together with its importance in metabolism, has made these proteins of particular interest to scientists. First, we will investigate a cytochrome protein you may already be familiar with from your introductory biology courses: [cytochrome c](http://pdb101.rcsb.org/motm/36). This protein is loosely associated with the inner mitochondrial membrane and shuttles electrons as part of the electron transport chain.

# PyMOL GUI

Open PyMOL. You should see something similar to the following:

<img src="Assets/PyMOLTutorial/PyMOL_GUI.png" width="800" align="center"/>

The important tools that we will be using in this tutorial are annotated.

**Command Line:** This is where we will be passing PyMOL commands. In addition to being a Python interpreter, PyMOL has its own syntax for performing a myriad of useful commands. You will be introduced to many of these commands in this tutorial.

**Console:** This is where PyMOL tells you what it is doing. Information about selections and functions will be displayed here.

**Dropdown Menus:** These menus contain much of the functionality in PyMOL. We will be using functions found under `Display`, `Scene`, and `Wizard` during this tutorial, but you are encouraged to play with the other functions!

**Object Panel:** All molecules we are working with will be listed here. These molecules can either be loaded from a PDB file or created within PyMOL. There are two types of entities displayed in the object panel: molecular objects and selection objects. Molecular objects are composed of actual atoms with 3D coordinates that are displayed in PyMOL. Selection objects are groupings of atoms that the user can define to more efficiently work with different molecules.

There are a series of actions we can apply to an object:

- **Action:** Options for setting the current view of the protein and applying basic computations. You can zoom in, center view, or use a preset to change how the object looks. 
- **Show:** Turn on a representation for the object, such as the cartoon backbone or side-chain sticks.
- **Hide:** Turn off a representation that is currently displayed for the object.
- **Label:** Labels different components of an object. PyMOL will automatically label things with varying levels of granularity.
- **Color:** Color the object!

**Selection Tools:** Change how you interact with an object. The three-button viewing mode with a mouse is *highly* suggested. The controls are described in the panel, but a human-readable summary of the most important controls is as follows:

- **Left-click:** Select an entity. Click "Selecting" at the bottom of the selection tool panel to change what gets selected on click (atoms, residues, chains, etc.) 
- **Middle-click:** Center the view where you clicked
- **Right-click:** Opens an option menu for the entity you clicked on with actions normally available through the Object panel.
- **Left-click + drag:** Rotate the molecule
- **Middle-click + drag:** Move the molecule in-plane (pan)
- **Right-click + drag:** Zoom in and out
- **Scroll:** Change the depth of view (useful for focusing on certain features)
- **Shift + Left-click + drag:** Select a group of atoms by drawing a box around them

An overview of this functionality and more is available at the [PyMOL Wiki](https://pymolwiki.org/index.php/Practical_Pymol_for_Beginners)!

## Basic PyMOL Functionality

Protein Data Bank (PDB) structure ID `2N9J` is an NMR structure of human cytochrome c in an oxidized state. [This paper](https://www.ncbi.nlm.nih.gov/pubmed/26718409) demonstrates that cytochrome c adopts different conformational states depending on the redox state of the iron at the center of its heme (Oxidized: $Fe^{III}$, Reduced: $Fe^{II}$). Let's see what exactly changes when heme takes on an electron (i.e. becomes reduced).

First, let's load a structure of oxidized human cytochrome c into PyMOL. Type the following into the command line and press enter: `fetch 2N9J`
This command downloads this structure from the Protein Data Bank and loads it into PyMOL. You should see the following:

<img src="Assets/PyMOLTutorial/2N9J_Loaded.png" width="800" align="center"/>

### Protein and Molecular Representations
By default, PyMOL loads and displays a protein in "cartoon" form. This shows the overall shape and fold of a protein but tells us nothing about the side chains that make up the protein. Small-molecule ligands, like heme, are displayed as "sticks." This is typically what comes to mind for a molecular representation, where we can see all atom types and the bonds between them. Notice that `2N9J` has its own row in the object panel. Use the show/hide menus to explore different representations of the protein (the default view is under `Action > Preset > Classified`).

### Sequence Representation
To display the sequence of the protein, go to the dropdown menus and click `Display > Sequence`. This will show the protein sequence and any ligands in PDB numbering at the top of the GUI. As you select residues in the protein, they will also be highlighted in the sequence.

### Structural States
The structure in `2N9J` was solved using a method called protein nuclear magnetic resonance spectroscopy (protein NMR). This method of determining protein structure yields many possible low-energy solutions for conformations the protein may adopt. Several of these solutions are typically included in the files provided by the PDB. You can see how many of these solutions are included in `2N9J` by looking at the bottom of the Selection Tools panel where it says "State". Use the arrow buttons below "State" to look at the different low-energy solutions of the protein (hit play to watch the protein dance!).
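
The state count shown in the GUI can also be read through PyMOL's Python API. The sketch below is a minimal example, assuming `cmd` is PyMOL's `pymol.cmd` module (available inside PyMOL's own interpreter). The helper name `count_models` is hypothetical, and `cmd` is passed in as an argument so the function does not depend on how PyMOL was started.

```python
def count_models(cmd, pdb_id="2N9J"):
    """Fetch a PDB entry and return how many states (NMR models) it contains.

    `cmd` is assumed to be PyMOL's `pymol.cmd` module; `count_models` is a
    hypothetical helper name used only for this sketch.
    """
    cmd.fetch(pdb_id)                # same effect as typing `fetch 2N9J`
    return cmd.count_states(pdb_id)  # number of states stored in the object
```

Inside PyMOL, this could be called as `from pymol import cmd` followed by `print(count_models(cmd))`.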

### Making and Using Selections
Let's investigate how cytochrome c interacts with heme. From the cartoon representation, we can see that heme lies within a pocket completely surrounded by the protein. However, what side chain interactions are cytochrome c using to bind heme?

First, let's get a full look at `2N9J` in stick form. In the Object Panel, select `Show > Licorice` for `2N9J`. You should see a complete mess. You can go through the rest of this tutorial like this, or we can go about this more intelligently using the tools PyMOL provides. Use `Hide > Licorice` for `2N9J` to remove the sticks and revert back to the cartoon representation.

Except... where did heme go??? 

The first selection tool we will be using is the sequence bar that we toggled earlier at the top of the viewer. The protein sequence is displayed as a string of single-letter amino acids, followed by any ligands that are part of the object. The three-letter identifier for heme is `HEC` and should be at the end of the protein sequence. Click on `HEC`. You will see pink atoms highlighted in the protein where heme used to be. This means heme has been selected, but we still need to toggle heme to be shown.

<img src="Assets/PyMOLTutorial/2N9J-HEC_Selected.png" width="800" align="center"/>

Now that `HEC` is selected, here are three ways we can make it reappear:

1. Right-click on `HEC` in the sequence bar to show a dropdown of options. To show `HEC` as a stick representation, select `Show > Sticks` from the dropdown.
2. Notice that a new row appeared in the object panel. This new row, called `(sele)`, is a selection group containing any objects we have selected. Select `Show > Sticks` for `(sele)` to show our selection, heme.
3. In the command line, type `show sticks, sele` and press enter.

`(sele)` is a special selection group for all objects we have selected. You can select additional objects by clicking the structure, selecting portions of a sequence, or using the command line.

If we want to make and store a selection, we can do the following: `select my_favorite_ligand, sele`. A new selection object named "my_favorite_ligand" should appear in the object panel, containing everything that is currently in `(sele)`. We can go back anytime to apply an action to this saved selection group.

The PyMOL command line uses its own syntax detailed [here](https://pymolwiki.org/index.php/Selection_Algebra) and [here](https://pymolwiki.org/index.php/Objects_and_Selections) to allow for specific selections by the user. When you select portions of the structure, the console above will report information such as chain, residue name, residue number, and atom name of the selection. This information, plus some clever selection algebra, will allow us to make very specific selections through the command line. We can also manipulate selected portions of the structure using keyword commands (e.g. `show` as demonstrated above). A full list of commands that are compatible with selections can be found [here](https://pymol.org/pymol-command-ref.html).

Some examples of useful commands that use selections (try these out!):

* `color red, resi 62-75` (color residues 62-75 red)
* `show sticks, resn CYS and not backbone` (only show sidechains as sticks for cysteine residues)
* `hide sticks, hetatm` (hides objects that use PDB hetatm records, typically ligands)
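
Each of the commands above has a counterpart in PyMOL's Python API, which is handy once the same styling has to be reapplied to several structures. The following is a minimal sketch, assuming `cmd` is the `pymol.cmd` module; the function name `style_examples` is hypothetical.

```python
def style_examples(cmd):
    """Apply the three example commands through the Python API.

    `cmd` is assumed to be PyMOL's `pymol.cmd` module; the selection strings
    are the same ones used on the PyMOL command line.
    """
    cmd.color("red", "resi 62-75")                   # color red, resi 62-75
    cmd.show("sticks", "resn CYS and not backbone")  # cysteine side chains only
    cmd.hide("sticks", "hetatm")                     # hide sticks for HETATM records
```

The selection expression is always passed as a plain string, so anything that works after the comma on the command line can be reused here unchanged.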

Different ways to deselect objects (i.e. empty `(sele)`):
1. Type `deselect` into the command line and press `Enter`
2. Click the dead space anywhere around a structure
3. Select `Action > Delete Selection` for `(sele)`

## Investigating the Structure of Cytochrome C

Now that we are familiar with basic functions and commands in PyMOL, let's get back to investigating how the structure of cytochrome c changes depending on its oxidation state. First, let's see which residues are within van der Waals contact distance of heme. Type this command into the command line:
`select heme_contacts_4A, resn HEC around 4`. This command makes a new selection group, `heme_contacts_4A`, that includes any atom within 4 angstroms (`around 4`) of heme (*resn*ame `HEC`). Show only the side chains for residues in `heme_contacts_4A` by selecting `Show > side chain`.

Hmm... cytochrome c is pretty small, so `heme_contacts_4A` doesn't really help us focus on residues immediately contacting heme. There are a lot of disembodied atoms floating around since our selection only specified atoms that were close enough to heme. We're interested in how the protein structure changes with oxidation state, so maybe we should focus on interactions with the iron in heme specifically?

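Let's try: `select heme_contacts_2A, br. resn HEC and name FE around 2.5`. Here `br.` (short for `byres`) expands the selection to complete residues, so no disembodied atoms are left behind.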

Neat! Which two residues pop out?

Play around with selections and show/hide commands to learn more about the interactions that cytochrome c makes with heme. There are two more residues that make special interactions with heme!
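
To make the `around` and `br.` operators used in the commands above more concrete, here is a small pure-Python sketch of the idea on toy data. It assumes `around` keeps atoms within the cutoff of any reference atom while leaving out the reference atoms themselves, and that `br.` then grows the result to whole residues. All residue labels, atom names, and coordinates are invented for illustration, and the functions are not PyMOL's implementation.

```python
import math

# Toy atoms: (residue label, atom name, (x, y, z) in angstroms).
# Every label and coordinate here is made up for illustration.
ATOMS = [
    ("LIG 1", "M", (0.0, 0.0, 0.0)),    # reference atom (plays the role of FE)
    ("LIG 1", "N1", (2.0, 0.0, 0.0)),   # same ligand, 2.0 A from the reference
    ("RES 10", "X1", (0.0, 2.2, 0.0)),  # 2.2 A from the reference
    ("RES 10", "X2", (0.0, 3.6, 0.0)),  # same residue, but beyond the cutoff
    ("RES 20", "Y1", (0.0, 0.0, 6.0)),  # far away
]


def around(atoms, reference, cutoff):
    """Atoms within `cutoff` of any reference atom, excluding the reference."""
    return [
        atom for atom in atoms
        if atom not in reference
        and any(math.dist(atom[2], ref[2]) <= cutoff for ref in reference)
    ]


def byres(atoms, selected):
    """Expand a selection to every atom of each residue it touches."""
    residues = {atom[0] for atom in selected}
    return [atom for atom in atoms if atom[0] in residues]


reference = [atom for atom in ATOMS if atom[1] == "M"]
close = around(ATOMS, reference, 2.5)
whole = byres(ATOMS, close)

print([f"{res}/{name}" for res, name, _ in close])  # ['LIG 1/N1', 'RES 10/X1']
print(sorted({res for res, _, _ in whole}))         # ['LIG 1', 'RES 10']
```

Note that the ligand's own neighboring atom is picked up by the distance test, so expanding by residue brings the whole ligand back into the selection along with the contacting residue.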

### Taking Measurements
From the dropdown menus, select `Wizard > Measurement`. The "Measurement" tab will appear above the Selection Tools panel. By default, the measurement tool will measure distances in Angstroms. To change the measurement mode, click `Distances` and select the type of measurement you want to make.

<img src="Assets/PyMOLTutorial/Measurement_Panel.png" width="300" align="center"/>

Let's see how close the two residues in the `heme_contacts_2A` selection are to the iron atom. Click on the iron atom and then the closest atom in one of the contacting residues. This will create a new measurement object in the object panel. By default, each measurement will create a new measurement object. This behavior can be toggled by clicking on `Create New Object` under the measurement method tab and selecting a new option (replace previous measurement or merge).

Cycle through the different states of `2N9J`. How do the measurements vary in the different states?

Let's measure the angle formed between the two contacting residue atoms and the heme iron. Switch to the angle measurement mode and select the first contacting atom, the iron, then the second contacting atom. How does this measurement vary as you go through different states?
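
For reference, the two measurements above reduce to simple vector geometry. The sketch below shows one way to compute them from Cartesian coordinates in plain Python. The function names and the example coordinates are hypothetical and are not taken from `2N9J`.

```python
import math


def distance(p, q):
    """Euclidean distance between two points given as (x, y, z)."""
    return math.dist(p, q)


def angle(a, vertex, b):
    """Angle a-vertex-b in degrees, with `vertex` as the middle atom."""
    v1 = [ai - vi for ai, vi in zip(a, vertex)]
    v2 = [bi - vi for bi, vi in zip(b, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Made-up coordinates: a central atom with one neighbor on either side.
center = (0.0, 0.0, 0.0)
first = (2.0, 0.0, 0.0)
second = (-2.0, 0.0, 0.0)

print(distance(center, first))       # 2.0
print(angle(first, center, second))  # 180.0
```

Inside PyMOL, `cmd.get_distance` and `cmd.get_angle` return the same kind of values for atom selections and accept a `state` argument, which makes it possible to tabulate a measurement across all states instead of clicking through them.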

### Structural Alignments

## Deliverables:
* Turn in a PyMOL session of `2N9J` and `2N9I` with the following:
    - Color residues coordinating iron RED and place them in a group called "CoordResidues" 
    - Color residues covalently bonded to heme YELLOW and place them in a group called "CovalentResidues"
    - Color heme PURPLE and label it with its residue identifier
    - Color the rest of the protein white
    - Align `2N9J` onto `2N9I`
* Produce a .png image of the key interactions between cytochrome c and heme in `2N9J`. Show the two residues coordinating iron and the two residues covalently bound to heme as sticks. Label these residues with their position number. No other protein residues should be visible!
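
The alignment and export steps can also be scripted. The following is a minimal sketch, assuming `cmd` is the `pymol.cmd` module and that both `2N9J` and `2N9I` are already loaded and styled. The function name and both file names are hypothetical, and the sketch deliberately leaves the residue selections and coloring to you.

```python
def align_and_export(cmd, mobile="2N9J", target="2N9I",
                     session_file="cytochrome_c.pse",
                     image_file="heme_interactions.png"):
    """Align `mobile` onto `target`, then save a session and a PNG image.

    `cmd` is assumed to be PyMOL's `pymol.cmd` module. The default file
    names are placeholders chosen for this sketch.
    """
    result = cmd.align(mobile, target)   # superimpose mobile onto target
    rmsd = result[0]                     # first returned value is the RMSD
    cmd.save(session_file)               # a .pse extension saves a session
    cmd.png(image_file, dpi=300, ray=1)  # ray-trace the current view to PNG
    return rmsd
```

Because `cmd.png` captures whatever is currently displayed, the view should be set up (representations hidden or shown, labels added) before the function is called.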

## Suggested Activities:
* Play around with the Presets found under the Actions tab to see possible protein representations in PyMOL.
* Change how the protein is represented using different options under the Show tab. What information is gained/lost with different representations?

[Cytochrome p450s]()
[Review of engineered Cytochrome P450s](https://www.sciencedirect.com/science/article/abs/pii/S1367593118301431)

Additional Resources:
* [PyMOL Wiki](https://pymolwiki.org)