# **Submodule 1.3 - Protein-Protein Interactions**

## **Learning Objectives:**
* Become familiar with protein-protein interfaces
* Be able to use software to identify key protein-protein interactions

## **Prerequisites:**
* Understand the protein structure basics covered in 1.1/1.2
* PyMOL

## **Introduction**
This submodule explores the diverse world of protein-protein interactions (PPIs) and their critical role in biological systems and drug discovery. Using PyMOL as our visualization platform, we'll investigate how proteins recognize and interact with each other. Through hands-on exercises, you'll learn to analyze interaction interfaces, identify key binding hotspots, and understand the various forces that govern protein-protein recognition.

## **Protein-Protein Interaction Overview**
Protein-protein interactions represent crucial control points in biological systems, mediating everything from signal transduction to enzyme regulation. These interactions typically involve large contact surfaces (1,000-2,000 Å²) with multiple binding hotspots, creating both challenges and opportunities for drug design. Unlike traditional small molecule binding pockets, PPI interfaces are often flat, extended, and more dynamic, requiring innovative approaches for therapeutic intervention.

PPIs can be targeted through various strategies:

- Small molecules targeting hotspot regions
- Peptide-based inhibitors mimicking interface regions
- Protein-based therapeutics (biologics)
- Allosteric modulators affecting interface formation

Understanding PPI interfaces is critical for drug development because:
- Many disease states involve dysregulated protein-protein interactions
- PPI interfaces are often highly specific, potentially reducing off-target effects
- Different classes of PPIs may require different therapeutic approaches
- Interface dynamics can provide multiple druggable states

Recent successes in targeting PPIs, such as Bcl-2 family inhibitors and PD-1/PD-L1 blocking antibodies, have validated these interactions as therapeutic targets. Modern drug discovery increasingly relies on detailed structural understanding of protein-protein interfaces, combined with computational methods to identify druggable sites and design appropriate therapeutic agents.


-----------------
# 📊 Tutorials
In these tutorials, we will use PyMOL, the AlphaFold Server, and other structural analysis tools to work through <u>**four**</u> applied activities to:
- Visualize and analyze the p53-MDM2 protein-protein interaction interface.
- Color a protein surface by its chemistry using the YRB highlighting scheme.
- Analyze van der Waals, ionic, and aromatic interactions between a ligand and its binding site.
- Model a protein from its sequence using the AlphaFold Server.

## Before you begin:
- Run the PyMOL GUI by following the directions provided in the Submodule 0 notebook, provided here: [pymol_notebook](../submodule0_pymol_setup/pymol_notebook.ipynb)


## 🌟 **Activity 1: Visualizing Protein-Protein Interaction Interface**
In this activity, you will use PyMOL to identify and analyze the protein-protein interaction interface in the *p53-MDM2 complex*.

### **Objectives:**
Explore the secondary structure elements of the *p53-MDM2 complex*, highlight key amino acid residues, and analyze the non-covalent interactions stabilizing the complex.

### **Steps to Complete Activity**

### **Step 1: Load the PDB File**
1. Load the PDB file using the following PyMOL command:<br>
     `fetch 1YCR` <br>
     - The 3D structure of the p53-MDM2 complex will load, showing both protein chains in a basic cartoon representation.
     - Both MDM2 and p53 have helical secondary structures.

### **Step 2: Highlight p53 Peptide**
1. Color the p53 peptide in blue:<br>
   `color blue, /1YCR/B/B/`

### **Step 3: Identify and Analyze Secondary Structures**
1. Identify the secondary structure elements involved in the interaction. Note that p53 interacts with MDM2 through its helical secondary structure.

### **Step 6: Change Display Modes**
1. Hide the cartoon representation:<br>
   `hide cartoon`
2. Show the structure as lines:<br>
   `show lines`

### **Step 7: Display Sequence and Find Polar Contacts**
1. Display the sequence:<br>
   `Display > Sequence`
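2. Select the entire sequence in the sequence viewer, then find the polar contacts using the action menu:<br>
   `(sele) > A > find > polar contacts > within selection`
3. Polar contacts, including hydrogen bonds, will be displayed as yellow dashed lines.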
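
The menu action above finds polar contacts within a single selection. As a hedged command-line alternative, the sketch below builds PyMOL commands that draw only the polar contacts *between* the two chains, using the `distance` command with `mode=2` (PyMOL's polar-contact mode). It assumes MDM2 is chain A and p53 is chain B (the tutorial colors chain B as p53); the object name `interface_hbonds` and the helper function are hypothetical names chosen for illustration.

```python
def interface_contact_commands(obj="1YCR", chain_1="A", chain_2="B",
                               name="interface_hbonds"):
    """Return PyMOL commands that draw polar contacts between two chains."""
    sel_1 = f"{obj} and chain {chain_1}"
    sel_2 = f"{obj} and chain {chain_2}"
    return [
        # mode=2 restricts the distance object to polar contacts
        f"distance {name}, {sel_1}, {sel_2}, mode=2",
        # Hide the numeric distance labels to keep the view uncluttered
        f"hide labels, {name}",
    ]


commands = interface_contact_commands()
print("\n".join(commands))
```

The printed lines can be pasted into the PyMOL command line, or saved to a `.pml` file and run from PyMOL.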


### **Step 8: Label Residues**
1. Label all residues in the structure:<br>
   `(1YCR) > L > residues`

### **Step 9: Highlight Key Residues**
#### MDM2 (Green Color)
1. Select the following residues by clicking them in the sequence viewer:<br>
   `L54, L57, I61, Y67, V75, F86, F91, I99, and Y100`
2. Show the selection as sticks using the show menu:<br>
   `(sele) > S > sticks`
3. Deselect by clicking on any blank area of the screen.

#### p53 (Blue Color)
1. Select the following residues by clicking them in the sequence viewer:<br>
   `F19, W23, L26, and P27`
2. Show the selection as sticks using the show menu:<br>
   `(sele) > S > sticks`
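
Clicking residues one by one is error-prone, so the same selections can also be typed as commands. The following is a minimal sketch that converts the one-letter residue labels used above into PyMOL selection strings. It assumes MDM2 is chain A and p53 is chain B in the fetched `1YCR` object; the selection names `mdm2_hotspot` and `p53_hotspot` are hypothetical.

```python
import re


def residue_selection(labels, chain, obj="1YCR"):
    """Build a PyMOL selection from one-letter residue labels such as 'L54'."""
    numbers = []
    for label in labels:
        match = re.fullmatch(r"([A-Z])(\d+)", label.strip())
        if match is None:
            raise ValueError(f"Unrecognized residue label: {label!r}")
        numbers.append(match.group(2))  # keep only the residue number
    return f"{obj} and chain {chain} and resi {'+'.join(numbers)}"


mdm2 = residue_selection(
    ["L54", "L57", "I61", "Y67", "V75", "F86", "F91", "I99", "Y100"], chain="A"
)
p53 = residue_selection(["F19", "W23", "L26", "P27"], chain="B")

# Print commands that can be pasted into the PyMOL command line
for name, selection in (("mdm2_hotspot", mdm2), ("p53_hotspot", p53)):
    print(f"select {name}, {selection}")
    print(f"show sticks, {name}")
```

The one-letter code in each label is parsed but not checked against the structure, so it is worth confirming the residue identities in the sequence viewer.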

### **Step 10: Analyze the Interaction**
1. Observe the polar contacts, hydrogen bonds, and other non-covalent interactions stabilizing the complex.
2. Note the hydrophobic interactions between residues in *MDM2* and *p53*.
3. Scroll the mouse wheel up and down to adjust the clipping slab so that the residues overlap as little as possible. Adjust the view until only the sticks are clearly visible and part of the protein is clipped out of view.
4. In the 3D structure, observe that L57, I99, F86, F91, and V75 of MDM2 form a hydrophobic pocket that accommodates F19, W23, and L26 of p53, as shown in the figure.

<center><img src="images/protein_interactions.png" width=800 /></center><br><br>

- A hydrogen bond (yellow line) is observed between the W23 side chain of p53 and L54 of MDM2.
- E17 of p53 forms a hydrogen bond with K94 of MDM2.
- N29 of p53 forms a hydrogen bond with Y100 of MDM2.

### **Step 11: Generate Hydrogen Bonds Table**
1. To generate a comprehensive table of hydrogen bonds, both within each protein and between the two interacting proteins, you can optionally use the VADAR web server. Upload the 1YCR PDB file to VADAR to obtain a detailed hydrogen bonding table along with other structural analyses.
2. When you submit the *1YCR* PDB file, you will get a calculated Ramachandran map, side chain torsion angles, and a hydrogen bonding table.
3. Click on the hydrogen bonding table to see all hydrogen bonding interactions between the two molecules.

<center><img src="images/vadar.png" width=600 /></center><br><br>

### **Key Takeaway(s)**
Interactions between proteins, or between a protein and a ligand, depend on shape complementarity and on surface properties such as hydrophobicity, hydrophilicity, and electrostatics. Knowledge of the protein surface is therefore crucial when designing a ligand or drug that binds to it. The next activity demonstrates how to represent a protein surface according to the properties of its amino acid residues.

## 🌟 **Activity 2: Visualizing Protein Surface Chemistry with YRB Highlighting in PyMOL**
The YRB script was developed to emphasize important functional groups on protein surfaces, such as hydrophobic groups and charged residues, which play crucial roles in protein-protein interactions and ligand binding.

### **Overview of YRB Highlighting Scheme**

The YRB highlighting scheme colors protein surfaces as follows:

- **Yellow:** Carbon atoms in non-polar (hydrophobic) groups, such as CH, CH2, CH3, that are not bound to nitrogen or oxygen atoms.
- **Red:** Negatively charged atoms, such as those found in aspartate (D) and glutamate (E).
- **Blue:** Positively charged atoms, such as those found in lysine (K) and arginine (R).
- **Grey:** Backbone atoms, polar atoms, and any remaining atoms.

This color scheme helps to visualize and understand which areas of the protein are involved in hydrophobic interactions, charged interactions, and potential binding regions.
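
The coloring rules above amount to a lookup by residue and atom name. The sketch below illustrates that lookup in plain Python for a small subset of residues; the side-chain assignments are copied from the mapping used in Step 3 of this activity, while the function name `yrb_color` and the table layout are hypothetical choices made for illustration.

```python
BACKBONE = {"N", "C", "CA", "O"}

# Subset of the YRB side-chain rules, keyed by residue name, then atom name
SIDE_CHAIN_COLORS = {
    "LYS": {"CG": "yellow", "CD": "yellow", "CE": "grey", "NZ": "blue"},
    "ARG": {"CG": "yellow", "CD": "grey", "CZ": "grey",
            "NE": "blue", "NH1": "blue", "NH2": "blue"},
    "ASP": {"CG": "grey", "OD1": "red", "OD2": "red"},
    "GLU": {"CG": "yellow", "CD": "grey", "OE1": "red", "OE2": "red"},
    "LEU": {"CG": "yellow", "CD1": "yellow", "CD2": "yellow"},
    "SER": {"CB": "grey", "OG": "grey"},
}


def yrb_color(resn, atom):
    """Return the YRB color for one atom of one residue."""
    resn, atom = resn.upper(), atom.upper()
    if atom in BACKBONE:
        return "grey"                      # backbone atoms are always grey
    side_chain = SIDE_CHAIN_COLORS.get(resn, {})
    if atom in side_chain:
        return side_chain[atom]            # residue-specific rule wins
    if atom == "CB":
        return "yellow"                    # beta carbons default to yellow
    return "grey"                          # any remaining atom


for resn, atom in [("LYS", "NZ"), ("ASP", "OD1"), ("LEU", "CD1"), ("LYS", "CE")]:
    print(resn, atom, yrb_color(resn, atom))
```

Note that the lysine CE carbon is grey rather than yellow because it is bonded to the NZ nitrogen, which matches the "not bound to nitrogen or oxygen" rule for yellow carbons.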

### **Objectives:**
In this activity, we will learn how to use the YRB script in PyMOL to visualize the surface hydrophobicity and charge distribution of a protein structure.

### **Steps to Complete Activity**

#### **Step 1: Load the Protein Structure in PyMOL**
1. Open PyMOL and load your protein structure of interest from PDB using the following command:<br>
  `PyMol> fetch 1A3N, async=0  # Replace '1A3N' with your desired PDB code`

#### **Step 2: Set Up the Visual Representation**
1. Hide all default representations to keep the display clean:<br>
  `PyMol> hide everything, all`

2. Set the representation to surface:<br>
  `PyMol> show surface, all`
  
#### **Step 3: Obtain the YRB Script**
1. The YRB script colors a molecule using the YRB (Yellow-Red-Blue) color scheme. The cell below writes a PyMOL script that performs this coloring; it is based on the script from a publication by [Hagemans et al.](https://www.frontiersin.org/journals/molecular-biosciences/articles/10.3389/fmolb.2015.00056/full)

In [2]:
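# Write the YRB coloring commands to the PyMOL script pml_scripts/YRB.pml
import os
os.makedirs("pml_scripts", exist_ok=True)  # Create the output folder if it does not exist
with open("pml_scripts/YRB.pml", "w") as scriptout: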
    # Remove all hydrogen atoms from the visualization for better clarity
    scriptout.write("remove hydro\n")

    # Set custom colors for different functional groups
    scriptout.write("set_color yellow, [0.950, 0.78, 0.0]\n")  # Hydrophobic regions
    scriptout.write("set_color grey, [0.95, 0.95, 0.95]\n")    # Backbone atoms and polar groups
    scriptout.write("set_color red, [1.0, 0.4, 0.4]\n")        # Negatively charged regions (e.g., acidic residues)
    scriptout.write("set_color blue, [0.2, 0.5, 0.8]\n")       # Positively charged regions (e.g., basic residues)

    # Define a dictionary to map specific amino acids to their atoms and corresponding colors
    mapping = {
        'arg': [('NE,NH2,NH1', 'blue'), ('CD,CZ', 'grey'), ('CG', 'yellow')],  # Arginine
        'asn': [('CG,OD1,ND2', 'grey')],  # Asparagine
        'asp': [('CG', 'grey'), ('OD2,OD1', 'red')],  # Aspartic acid (negatively charged)
        'cys': [('SG', 'grey')],  # Cysteine
        'gln': [('CG', 'yellow'), ('CD,OE1,NE2', 'grey')],  # Glutamine
        'glu': [('CG', 'yellow'), ('CD', 'grey'), ('OE1,OE2', 'red')],  # Glutamic acid (negatively charged)
        'his': [('CG,CD2,ND1,NE2,CE1', 'grey')],  # Histidine
        'ile': [('CG1,CG2,CD1', 'yellow')],  # Isoleucine (hydrophobic)
        'leu': [('CG,CD1,CD2', 'yellow')],  # Leucine (hydrophobic)
        'lys': [('CG,CD', 'yellow'), ('CE', 'grey'), ('NZ', 'blue')],  # Lysine (positively charged)
        'met': [('CG,CE', 'yellow'), ('SD', 'grey')],  # Methionine
        'phe': [('CG,CD1,CE1,CZ,CE2,CD2', 'yellow')],  # Phenylalanine (aromatic)
        'pro': [('CG', 'yellow'), ('CD', 'grey')],  # Proline
        'ser': [('CB,OG', 'grey')],  # Serine (polar)
        'thr': [('CB,OG1', 'grey'), ('CG2', 'yellow')],  # Threonine (polar)
        'trp': [('CG,CD2,CZ2,CH2,CZ3,CE3', 'yellow'), ('CD1,NE1,CE2', 'grey')],  # Tryptophan (aromatic)
        'tyr': [('CG,CE1,CD1,CE2,CD2', 'yellow'), ('CZ,OH', 'grey')],  # Tyrosine (aromatic and polar)
        'val': [('CG1,CG2', 'yellow')]  # Valine (hydrophobic)
    }

    # Selection to color: 'all' applies the scheme to every loaded object;
    # use an object name (e.g. '1A3N') to restrict it to a single structure
    selection = 'all'

    # Color the backbone atoms (N, C, CA, O) in grey for the selected objects
    scriptout.write("color grey, (n. N,C,CA,O and " + selection + ")\n")

    # Color the beta-carbon (CB) atoms in yellow for the selected objects
    scriptout.write("color yellow, (n. CB and " + selection + ")\n")

    # Apply specific colors based on the amino acid type and atom name for the selected objects
    for key in mapping:
        for (atom, color) in mapping[key]:
            scriptout.write("color " + color + ", (n. " + atom + " and r. " + key + " and " + selection + ")\n")

#### **Step 4: Run the YRB Script in PyMOL**
1. In PyMOL, navigate to the menu:<br>
    `File > Run...`
2. Select the `pml_scripts/YRB.pml` script generated in Step 3. Alternatively, run it from the PyMOL command line (adjust the path relative to PyMOL's working directory):<br>
  `PyMol> @pml_scripts/YRB.pml`
  - This will color all structures in the session based on the YRB scheme.
3. (Optional) To apply the scheme to a specific structure, set `selection` in the Step 3 cell to the object name (for example, `'1A3N'`), re-run the cell to regenerate the script, and run the script again.
  
#### **Step 5: Adjust Visualization Settings for Better Viewing (Optional)**
1. Adjust the transparency of the surface to make internal features visible:<br>
  `PyMol> set transparency, 0.3, all`

2. Show the cartoon view alongside the surface to understand how the surface features relate to the protein backbone:<br>
  `PyMol> show cartoon, all`<br>
  `PyMol> set transparency, 0.5, all`

---------------------
## 🌟 **Activity 3: Interaction Analysis**

### **Objectives:**
With the analysis conducted in this activity, we can understand how the ligand interacts with its binding site through different types of interactions, such as:

- <u>Binding Affinity</u>: By analyzing π-π and ionic interactions, we can qualitatively gauge how strongly the ligand binds to the active site.
- <u>Stability of Interaction</u>: Understanding the nature of interactions (hydrophobic, ionic) helps predict the stability of the ligand-protein complex.
- <u>Binding Pocket Characterization</u>: The surface representation highlights the spatial fit and accessibility of the ligand within the active site.
- <u>Ligand Optimization:</u> Identifying key interactions aids in optimizing ligand modifications to enhance binding efficacy or reduce undesired interactions.

<center><img src="images/interactions.png" width=600 /></center><br><br>

### **Steps to Complete Activity**

#### **Step 1. Visualize Van der Waals Interactions**
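1. Select and visualize atoms within van der Waals contact distance of the ligand. The commands in this activity assume that a selection named `ligand` already exists (for example, `select ligand, organic` selects non-polymer organic molecules):<br>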
    `PyMol> select vdW_contacts, ligand around 3.5 and not ligand  # Select atoms within 3.5 Å distance for Van der Waals interactions`<br>
    `PyMol> show spheres, vdW_contacts                             # Show these selected atoms as spheres`<br>
    `PyMol> color cyan, vdW_contacts                               # Color spheres cyan for visualization`<br>
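
To make clear what a distance-based selection such as `ligand around 3.5` computes, here is a minimal plain-Python sketch of the same idea: keep every non-ligand atom whose distance to at least one ligand atom is within the cutoff. The coordinates and atom names are hypothetical and chosen only so the distances are easy to verify by hand.

```python
import math


def atoms_around(ligand_coords, other_atoms, cutoff=3.5):
    """Return names of atoms within `cutoff` angstroms of any ligand atom."""
    hits = []
    for name, coord in other_atoms.items():
        nearest = min(math.dist(coord, lig) for lig in ligand_coords)
        if nearest <= cutoff:
            hits.append(name)
    return hits


# Hypothetical coordinates (angstroms), not taken from any structure
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
protein = {
    "near_1": (0.0, 3.0, 0.0),   # 3.0 from the first ligand atom
    "near_2": (4.5, 0.0, 0.0),   # 3.0 from the second ligand atom
    "far": (0.0, 0.0, 6.0),      # 6.0 from the first ligand atom
}

contacts = atoms_around(ligand, protein)
print(contacts)
```

PyMOL performs this search far more efficiently on real structures; the sketch only shows the criterion being applied.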

#### **Step 2. Ionic Interaction Analysis**
1. Select and display ionic interactions:<br>
    `PyMol> select charged_residues, resn ARG+LYS+ASP+GLU                       # Select positively and negatively charged residues`<br>
    `PyMol> select ionic_interactions, ligand around 5.0 and charged_residues   # Select charged residues within 5 Å of ligand`<br>
    `PyMol> show sticks, ionic_interactions                                     # Show ionic interactions as sticks`<br>
    `PyMol> color red, ionic_interactions                                       # Color ionic interactions red for easy visualization`

#### **Step 3. π-π Stacking and Cation-π Interaction Analysis**
1. Identify aromatic interactions with the ligand:<br>
    `PyMol> select aromatic_residues, resn PHE+TYR+TRP                          # Select aromatic residues`<br>
    `PyMol> select pi_interactions, ligand around 5.0 and aromatic_residues     # Select aromatic residues within 5 Å of ligand`<br>
    `PyMol> show sticks, pi_interactions                                        # Show aromatic residues as sticks`<br>
    `PyMol> color purple, pi_interactions                                       # Color aromatic interactions purple`
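
A residue-level distance selection only flags aromatic residues that are near the ligand. One common follow-up check for π-π stacking is the distance between ring centroids. The sketch below computes that distance for two idealized six-membered rings; the coordinates are hypothetical, and the 5.0 Å threshold simply mirrors the cutoff used in the selection above rather than a rigorous stacking criterion (ring orientation is not checked).

```python
import math


def centroid(coords):
    """Return the geometric center of a list of (x, y, z) points."""
    n = len(coords)
    return tuple(sum(point[i] for point in coords) / n for i in range(3))


def hexagon(center, radius=1.39):
    """Idealized planar six-membered ring in the xy plane (hypothetical)."""
    cx, cy, cz = center
    return [
        (cx + radius * math.cos(math.radians(60 * k)),
         cy + radius * math.sin(math.radians(60 * k)),
         cz)
        for k in range(6)
    ]


ring_ligand = hexagon((0.0, 0.0, 0.0))
ring_residue = hexagon((0.0, 0.0, 3.8))   # parallel ring, 3.8 angstroms above

separation = math.dist(centroid(ring_ligand), centroid(ring_residue))
is_stacking_candidate = separation <= 5.0
print(round(separation, 2), is_stacking_candidate)
```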

#### **Step 4. Adjust Visualization for Better Clarity**
1. Zoom into the ligand and binding site:<br>
    `PyMol> zoom (ligand or active_site)        # Zoom into the ligand and binding site for better visualization`
2. Adjust transparency for a better view of interactions.<br>
    `PyMol> set transparency, 0.5, active_site  # Set transparency for active site residues`
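
The steps in this activity rely on selections named `ligand` and `active_site`. As a hedged convenience, the sketch below assembles the commands into one list, defining both selections first. It assumes the ligand can be picked with PyMOL's `organic` keyword and that the binding site is every residue within 5 Å of the ligand; both are assumptions to adjust for your structure, and the function name is hypothetical.

```python
def interaction_commands(ligand_sel="organic", vdw_cutoff=3.5, site_cutoff=5.0):
    """Return the Activity 3 PyMOL commands as a list of strings."""
    return [
        # Define the selections the remaining commands depend on
        f"select ligand, {ligand_sel}",
        f"select active_site, byres (ligand around {site_cutoff})",
        # Distance-based selections from Steps 1 to 3
        f"select vdW_contacts, ligand around {vdw_cutoff}",
        f"select ionic_interactions, (ligand around {site_cutoff}) and resn ARG+LYS+ASP+GLU",
        f"select pi_interactions, (ligand around {site_cutoff}) and resn PHE+TYR+TRP",
        # Representations and view
        "show spheres, vdW_contacts",
        "show sticks, ionic_interactions or pi_interactions",
        "zoom ligand or active_site",
    ]


commands = interaction_commands()
print("\n".join(commands))
```

If the structure contains several organic molecules (for example, buffer components), replace `organic` with a more specific selection such as the ligand's residue name.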

------------
## 🌟 **Activity 4: Protein Modeling with AlphaFold Server**
The 3D structures available from the Protein Data Bank (PDB) are determined experimentally by X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, or cryo-electron microscopy (cryo-EM). If the protein you need has no structure in the PDB, you have to model it. Several protein modeling methods are available. Most work well when part of the target protein resembles an existing 3D structure in the database, or when the sequence of the protein of unknown structure is similar to that of a protein with a known structure. As a rule of thumb, models generated by these theoretical methods work well when sequence similarity exceeds 40%, and the higher the similarity between the unknown and the known protein, the better the model.
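
To make the 40% rule of thumb concrete, the sketch below computes percent identity between two already-aligned sequences. Identity is a stricter measure than similarity, and here it is calculated over aligned columns without gaps, which is only one of several conventions. The two sequences are hypothetical and do not correspond to real proteins.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("Aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue                      # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned sequences ('-' marks a gap)
target = "MKTAYIAKQR-QISFVK"
template = "MKTGYLAKQRAQLSF-K"

identity = percent_identity(target, template)
print(f"{identity:.1f}% identity; above 40% threshold: {identity > 40}")
```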

### **Objectives:**
Here, we take the sequence of a protein from the UniProt database and model it using the AlphaFold Server. We will model human *proprotein convertase subtilisin/kexin type 7 (PCSK7)*.

### **Steps to Complete Activity**

#### **Step 1. Navigate to protein database**
1. Go to Uniprot https://www.uniprot.org/

#### **Step 2. Download protein sequence**
1. In UniProtKB, search for `PCSK7` and open the entry `Q16549  Homo sapiens (human)`.
2. Scroll down to the sequence section.
3. Download the FASTA-formatted sequence, or copy and paste it into a text editor.
4. Save the file as *PCSK7.txt*
    - <u>Note</u>: The protein should have 785 amino acids.
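
A quick way to confirm the expected length is to parse the saved FASTA file and count the residues. The sketch below shows a minimal FASTA reader applied to a short hypothetical record; the helper name `read_fasta` and the demo sequence are illustrative only.

```python
def read_fasta(text):
    """Return a dict mapping each header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)   # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical record used only to demonstrate the parser
example = """>demo_sequence hypothetical record
MKTAYIAKQR
QISFVKSHFS
"""

records = read_fasta(example)
for name, sequence in records.items():
    print(name.split()[0], len(sequence))
```

To check the downloaded file, read it with `open("PCSK7.txt").read()`, pass the text to `read_fasta`, and confirm that the sequence length is 785.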

#### **Step 3. Navigating to AlphaFold Server**
1. Go to AlphaFold Server: https://alphafoldserver.com/fold/7c80de223b02907d
2. Choose "Continue with Google" and accept the terms. You will see the window shown below. Paste the sequence you saved in Step 2 into the input box.

<center><img src="images/alphafold_request.png" width=800 /></center><br><br>

#### **Step 4. Submitting to AlphaFold Server**
1. Continue to the job preview and submit the job.
2. Once the job is completed, its date and time are shown. Click that entry and use the download option to download the results as a zip folder.
3. The same screen displays the predicted structure colored by pLDDT confidence, from very high to very low. Low and very low confidence are displayed in yellow and orange, respectively, so the yellow and orange parts of the chain are the low-confidence regions of the model. Regions that look like spaghetti are not reliable.
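
The confidence colors correspond to pLDDT score bands. The sketch below classifies per-residue scores into the commonly cited bands (very high above 90, confident above 70, low above 50, very low otherwise) and reports the fraction of residues above 70. How values exactly on a boundary are assigned is an assumption here, and the scores are hypothetical rather than output from a real prediction.

```python
from collections import Counter


def plddt_band(score):
    """Map a pLDDT score (0 to 100) to a confidence band."""
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


# Hypothetical per-residue pLDDT scores
scores = [95.2, 88.0, 71.5, 64.3, 42.0, 35.8]

bands = Counter(plddt_band(score) for score in scores)
reliable_fraction = sum(score > 70 for score in scores) / len(scores)

for band in ("very high", "confident", "low", "very low"):
    print(f"{band}: {bands[band]}")
print(f"Fraction of residues above 70: {reliable_fraction:.2f}")
```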

<center><img src="images/alphafold_output.png" width=800 /></center><br><br>

#### **Step 5. Next Steps**
1. The downloaded folder, named fold-xxxx, contains several predicted structures named model 0, model 1, etc. For drug design purposes, you can remove the spaghetti-like regions using PyMOL and use the properly folded structure for analysis. You can check the quality of the structure using a Ramachandran map.



------------------------
# 📖 **Submodule 1 QUIZ**

In [None]:
#Render Quiz: Q1
from IPython.display import IFrame
IFrame('quiz/quiz_Q1.html', width=600, height=400)

### Image for Q2 - Peptide Bonds
<center><img src="images/sub1_q2.png" width=400 /></center><br><br>

In [None]:
#Render Quiz: Q2-Q6
from IPython.display import IFrame
IFrame('quiz/quiz_Q2_Q6.html', width=800, height=800)

### Image for Q6 - Insulin Structure
<center><img src="images/sub1_q6.jpg" width=300 /></center><br><br>

In [None]:
#Render Quiz: Q7
from IPython.display import IFrame
IFrame('quiz/quiz_Q7.html', width=800, height=300)

### Image for Q7 - Insulin Structure
<center><img src="images/sub1_q7.jpg" width=250 /></center><br><br>

---------------
## Conclusions
Users should be able to visualize complex protein surface interactions using PyMOL, identify the amino acid residues involved in stabilizing a protein-protein interaction, and describe the nature of the interaction, such as hydrogen bonding and hydrophobic contacts between two proteins. Users should also be able to model unknown 3D structures from protein sequence data using the AlphaFold Server and analyze the quality of the resulting structures.
## Clean Up
<div class="alert alert-block alert-warning"> <b>Attention:</b> Remember to shut down the VM and delete any relevant resources. </div>