[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/jmou2/PaviaProteinDesign/blob/main/02_Tuesday/tutorial_2_alphafold.ipynb)

# Analyzing an AlphaFolded structure

Our goal in this task is to understand how to make, use, and analyze AlphaFold 2 (AF2) predictions of protein structures. AlphaFold is a deep learning model that predicts one or more ranked protein structures from the amino acid sequence of a protein.

In this exercise we will use AF2 to predict the structure of a protein called ClpS, using the sequence of the protein crystallized in PDB ID 3dnj, chain A.


### Setup
Run the cell below to download and install ProDy.

In [None]:
! pip install prody

import prody as pr

### 1 - Downloading ClpS and cleaning it up

Download ClpS by loading PDB ID 3dnj into a ProDy object. Clean it up by:
- removing any heteroatoms like waters, unwanted ligands, or solvents
- selecting only chain A and removing the other chains

Save the cleaned-up protein as a PDB file.
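
To make the cleanup criteria concrete, the sketch below applies them to raw PDB text without ProDy. It is an illustration under stated assumptions, not the intended ProDy solution: `keep_chain_atoms`, `pdb_record`, and the toy records are hypothetical, and the code relies on the fixed-column PDB layout (record name in columns 1-6, chain ID in column 22).

```python
def keep_chain_atoms(pdb_lines, chain_id="A"):
    """Keep ATOM records of one chain; drop HETATM records and other chains."""
    kept = []
    for line in pdb_lines:
        # Standard residues use the ATOM record; waters and ligands use HETATM.
        is_atom = line.startswith("ATOM  ")
        # Column 22 (string index 21) holds the chain identifier.
        if is_atom and len(line) > 21 and line[21] == chain_id:
            kept.append(line)
    return kept


def pdb_record(record, serial, name, resname, chain, resnum):
    """Build a fixed-column PDB coordinate line for the toy example (hypothetical helper)."""
    return (f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}{resnum:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{20.0:6.2f}")


# Toy input: two chain A protein atoms, one chain B atom, one water.
toy_records = [
    pdb_record("ATOM", 1, " N  ", "MET", "A", 1),
    pdb_record("ATOM", 2, " CA ", "MET", "A", 1),
    pdb_record("ATOM", 3, " CA ", "GLY", "B", 1),
    pdb_record("HETATM", 4, " O  ", "HOH", "A", 101),
]

cleaned = keep_chain_atoms(toy_records, chain_id="A")
print(len(cleaned))  # 2
```

Note that a filter on the ATOM record alone would also discard modified residues that are stored as HETATM, so check what the structure contains before relying on it.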

In [None]:
# download the protein


# clean up the protein


# save the protein as a pdb file

### 2 - AlphaFold your protein

We will use [ColabFold](https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb) to fold our protein.

Prepare the sequence of ClpS chain A using ProDy below and paste it into the `query_sequence` input in the ColabFold notebook.
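
ColabFold expects the sequence as a string of one-letter codes, one letter per residue. The sketch below shows that conversion from three-letter residue names in plain Python. The names `THREE_TO_ONE` and `to_one_letter` are hypothetical, the input list is a toy example rather than the ClpS sequence, and mapping unknown residues to `X` is an assumption of this sketch.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(resnames):
    """Convert residue names (one entry per residue) to a one-letter sequence."""
    # Non-standard residue names are written as "X" (an assumption of this sketch).
    return "".join(THREE_TO_ONE.get(name.upper(), "X") for name in resnames)


# Toy input: pass one name per residue (for example from the CA atoms only),
# otherwise every atom would contribute a letter.
print(to_one_letter(["MET", "VAL", "LYS", "GLY"]))  # MVKG
```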

First, try using single-sequence mode by setting `msa_mode` to `single_sequence`. What is the average pLDDT of the best prediction? 

Predict the structure again, this time using a multiple sequence alignment (MSA) by setting `msa_mode` to `mmseqs2_uniref_env`. How confident is the AF2 prediction now?
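
AF2 models store the per-residue pLDDT in the B-factor column of the PDB file, so the average pLDDT can also be recomputed from a downloaded model. The following is a minimal sketch in plain Python, assuming the standard fixed-column layout (atom name in columns 13-16, B-factor in columns 61-66); `mean_plddt` and `toy_atom` are hypothetical names and the values are toy data.

```python
def mean_plddt(pdb_lines):
    """Average the B-factor column over CA atoms, i.e. one value per residue."""
    values = []
    for line in pdb_lines:
        if line.startswith("ATOM  ") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    if not values:
        raise ValueError("no CA atoms found")
    return sum(values) / len(values)


def toy_atom(serial, name, resnum, bfactor):
    """Build a fixed-column ATOM line for the toy example (hypothetical helper)."""
    return (f"ATOM  {serial:>5} {name:<4} ALA A{resnum:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{bfactor:6.2f}")


# Toy model: residue 1 (N and CA) with pLDDT 90, residue 2 (CA) with pLDDT 70.
toy_model = [
    toy_atom(1, " N  ", 1, 90.0),
    toy_atom(2, " CA ", 1, 90.0),
    toy_atom(3, " CA ", 2, 70.0),
]
print(mean_plddt(toy_model))  # 80.0
```

Restricting the average to CA atoms counts each residue once, so residues with many side-chain atoms do not weigh more than glycine.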

In [None]:
# get the sequence of your cleaned up protein using ProDy


# then copy and paste the sequence into colabfold


### 3 - Analyze the AlphaFolded structure

Read your rank 1 predicted structures into ProDy objects. Superpose each predicted structure onto the cleaned-up protein and calculate the RMSD of the CA atoms. Open both in PyMOL. What is the RMSD of the best prediction without the MSA? What about with the MSA?

What does this suggest about the importance of using an MSA for AF2 predictions?
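
For reference, this is roughly what "superpose, then calculate the RMSD" computes. The sketch below uses NumPy and the Kabsch algorithm on toy coordinates; it is not ProDy's implementation, and `kabsch_rmsd` is a hypothetical name. It assumes both inputs hold the same atoms in the same order (for example the CA atoms of matching residues).

```python
import numpy as np


def kabsch_rmsd(mobile, target):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition."""
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    if mobile.shape != target.shape:
        raise ValueError("coordinate sets must pair up atom for atom")
    # Remove translation by centering both sets on their centroids.
    p = mobile - mobile.mean(axis=0)
    q = target - target.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix (Kabsch algorithm).
    u, _, vt = np.linalg.svd(p.T @ q)
    # Flip one axis if needed so the result is a proper rotation, not a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt
    diff = p @ rotation - q
    return float(np.sqrt((diff ** 2).sum() / len(p)))


# Toy check: a rotated and shifted copy should superpose back with RMSD near zero.
target = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0], [0.0, 1.5, 1.0]])
theta = np.pi / 3
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta), np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
mobile = target @ rot_z.T + np.array([5.0, -2.0, 3.0])
print(kabsch_rmsd(mobile, target) < 1e-8)  # True
```

Without the superposition step, the RMSD would mostly reflect where each structure happens to sit in its coordinate frame rather than how similar the folds are.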

In [None]:
# parse the alphafolded structures


# superpose the predicted structures


# calculate the RMSDs


# save the superposed predictions

In PyMOL, color the predicted structure by B-factor. AF2 writes its per-residue confidence (pLDDT) into the B-factor column, so this coloring shows the confidence of the prediction residue by residue.

What do you notice about the confidence of different secondary structure elements? How does the confidence of a region compare with how well it aligns to the crystal structure?

### 4 - Modeling structural changes following a mutation

Let’s test how well AF2 models the structural changes that might follow the mutation of a single amino acid. Open your PyMOL session with 3dnj chain A. Mutate residue 43 (VAL) of 3dnj chain A to a destabilizing amino acid using the PyMOL mutagenesis wizard.

Click accept mutation (if you haven’t already), then save the resulting sequence as a FASTA file by typing `save ~/Desktop/3dnj_mutant.fasta, 3dnj_A` (`~/Desktop/3dnj_mutant.fasta` is the path to your output file and `3dnj_A` is the name of your chain A object). Run the mutated sequence through ColabFold.

Use ProDy to superpose the mutant prediction onto the cleaned-up protein and calculate the RMSD as above. What is the resulting C-alpha RMSD?

In [None]:
# your code here

In PyMOL, locate the mutant residue at residue number 9 in the predicted structure (corresponding to residue number 43 in the original PDB entry 3dnj).
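
The shift in residue numbers can be written down explicitly. The sketch below assumes that the predicted model is numbered from 1 and that the crystallized chain has no internal gaps, so a constant offset applies. Under that assumption the 43 to 9 correspondence implies that the first residue of the cleaned chain is numbered 35; verify this against your own file. `pdb_to_model_resnum` and `first_pdb_resnum` are hypothetical names.

```python
def pdb_to_model_resnum(pdb_resnum, first_pdb_resnum):
    """Map a crystal-structure residue number onto a model numbered from 1."""
    model_resnum = pdb_resnum - first_pdb_resnum + 1
    if model_resnum < 1:
        raise ValueError("residue precedes the first modelled residue")
    return model_resnum


# Assumed first residue number (35), inferred from the 43 -> 9 correspondence.
print(pdb_to_model_resnum(43, first_pdb_resnum=35))  # 9
```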

Is it clashing with any nearby residues in the predicted structure? What is the mutant residue’s pLDDT? (You can see this by clicking on the residue to select it, then in the menu bar at the right clicking next to “(sele)” --> L (for “label”) --> b-factor.)

What is the pLDDT for some surrounding residues? Is the pLDDT of the mutant residue appreciably different from that of the surrounding residues?

Altogether, what do these results (the RMSD, avg pLDDT, and mutant-residue pLDDT) lead you to conclude about AF2’s ability to model structural changes resulting from single amino-acid mutations?