# **Submodule 1.1 - Protein Basics**

## **Learning Objectives:**
*   To visualize the 3D structure of proteins and describe the details of secondary, tertiary, and quaternary structures.
*   To visualize the intramolecular interactions that stabilize protein secondary and tertiary structures, and the intermolecular interactions that stabilize protein quaternary structure and protein-prosthetic group binding.

## **Prerequisites:**
1. Familiarity with Python syntax and basic programming concepts.
2. Basic understanding of using Biopython for sequence analysis and PDB file parsing.
3. Basic knowledge of PDB file formats and how to access and interpret them.
4. Basic knowledge of molecular visualization techniques and interpreting 3D protein structures.

## **Introduction**
This introductory module for drug design is suitable for any undergraduate student who wants to learn about protein structure. It covers primary, secondary, tertiary, and quaternary structure in detail, with examples, and explains the physicochemical principles behind protein 3D structure. Each part includes examples and tutorials that use visualization software. Assignments are provided to test your knowledge, and the tutorials give you the techniques needed to solve them. Public online resources such as UniProt and the Protein Data Bank (PDB) are used for protein sequence and structure analysis, and PyMOL is used for visualization. 3D visualization also covers binding-site analysis of protein-drug interactions and comparison of binding sites across proteins. For proteins without an experimental 3D structure, a structure is predicted with AlphaFold on Colab.


<center><img src="images/Protein_Structure_Summary.png" width=800 /></center><br><br>

## **Why Study Protein Structure for Drug Design?**
Proteins are essential biological macromolecules. Human cells are estimated to contain approximately 50,000 types of proteins, which facilitate various biological and physiological functions. The drug discovery process often focuses on specific protein classes such as G-protein-coupled receptors (GPCRs), enzymes, and ion channels. Each target class presents unique binding sites for drug molecules, and understanding the 3D structure allows scientists to design drugs that bind specifically to proteins and modulate biochemical pathways. The frequency of the different drug target classes is broken down below:
* **G-protein-coupled receptors (GPCRs):** 33% of drug targets
* **Nuclear receptors:** 16% of drug targets
* **Enzymes and kinases:** 30% of drug targets
* **Ion channels and transporter proteins:** 18% of drug targets
* **Other targets (e.g., DNA and miscellaneous):** 3%

Ref: Santos et al., Nat. Rev. Drug Discov. (2016), doi:10.1038/nrd.2016.230

The high proportion of protein targets emphasizes the importance of understanding protein structure for designing effective drugs. Examples include the development of kinase inhibitors used in cancer therapy and beta-blockers that target GPCRs.


## **Amino Acids**
Amino acids are the monomers of proteins, linked by peptide bonds to form long chains. They play a crucial role in determining the shape and function of the protein. The unique side chains (R-groups) of amino acids define their chemical properties (hydrophobic, hydrophilic, charged, or neutral), which in turn influence how a protein folds into its final structure. The chirality of amino acids (except glycine) also plays a role in how they interact with other molecules. There are 20 standard amino acids, each with distinct properties defined by their R-groups: <br>
<center><img src="images/Amino_Acids.png" width=800 /></center><br><br>
    
- **Nonpolar (Hydrophobic):** e.g., leucine, valine, and isoleucine, which prefer to be buried inside the protein away from water.
- **Polar (Uncharged):** e.g., serine, threonine, and asparagine, which can form hydrogen bonds and are often found on the protein surface.
- **Positively Charged (Basic):** e.g., lysine, arginine, and histidine, which participate in ionic interactions.
- **Negatively Charged (Acidic):** e.g., aspartic acid and glutamic acid, which also engage in ionic bonds.
- **Aromatic:** e.g., phenylalanine, tyrosine, and tryptophan, which can participate in hydrophobic and π-π interactions.

These properties influence how amino acids interact with each other and how the overall protein folds and functions.
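
The grouping above can be written as a lookup table. The sketch below is only an illustration and is not part of the original notebook: `RESIDUE_CLASSES` and `class_composition` are hypothetical names, and the placement of residues not listed above (for example glycine, proline, and cysteine) follows one common textbook convention that other sources classify differently.

```python
from collections import Counter

# One-letter codes grouped by side-chain class. Assumption: one common
# textbook convention; some schemes place G, P, and C differently.
RESIDUE_CLASSES = {
    "nonpolar": "AVLIMPG",
    "polar": "STNQC",
    "positive": "KRH",
    "negative": "DE",
    "aromatic": "FYW",
}

# Invert the table so each residue letter maps to its class name.
CLASS_OF = {aa: cls for cls, letters in RESIDUE_CLASSES.items() for aa in letters}


def class_composition(sequence):
    """Count how many residues of a sequence fall into each side-chain class."""
    counts = Counter(CLASS_OF.get(aa, "unknown") for aa in sequence.upper())
    return dict(counts)


# A short hydrophobic stretch: eight nonpolar and two aromatic residues.
print(class_composition("MFVFLVLLPL"))
```

A stretch dominated by nonpolar and aromatic residues, as in this example, is the kind of segment expected to be buried away from water.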

Peptide Bonds and Protein Formation: Peptide bonds are formed through a condensation reaction between the carboxyl group of one amino acid and the amino group of another, releasing water. These bonds have partial double-bond character due to resonance, which restricts rotation and gives the peptide bond a planar configuration. This planarity is key in the formation of secondary structures like alpha helices and beta sheets. Proteins can be described using one-letter and three-letter abbreviations of amino acids, making sequence notation more compact for analysis.
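
Because each peptide bond releases one water molecule, the mass of a linear peptide is the sum of its free amino acid masses minus one water per bond. The sketch below illustrates that arithmetic; the function name `peptide_mass` is hypothetical, and only three amino acids (with rounded average masses) are included to keep it short.

```python
# Average masses of free amino acids in g/mol, rounded to two decimals.
# Only glycine, alanine, and serine are listed to keep the example short.
FREE_AMINO_ACID_MASS = {"G": 75.07, "A": 89.09, "S": 105.09}
WATER_MASS = 18.015


def peptide_mass(sequence):
    """Mass of a linear peptide: free amino acid masses minus one water per peptide bond."""
    n_bonds = max(len(sequence) - 1, 0)  # n residues are joined by n - 1 peptide bonds
    total = sum(FREE_AMINO_ACID_MASS[aa] for aa in sequence)
    return total - n_bonds * WATER_MASS


# Dipeptide Gly-Ala: two residues, one peptide bond, one water released.
print(f"Gly-Ala: {peptide_mass('GA'):.2f} g/mol")
```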

<center><img src="images/AA_Condensation.png" width=600 /></center><br><br>

Each amino acid has an α-carbon atom (see the figure above) as a backbone atom. Side-chain (R-group) carbons are labeled with Greek letters, starting from the α-carbon and moving outward (β, γ, δ, and so on).


## **Protein Primary Structure**
Primary structure refers to the linear sequence of amino acids in a polypeptide chain. The sequence is encoded by genes and ultimately determines the 3D structure and function of the protein. As an example, we will fetch a sequence from the NCBI Protein database and display its ID, length, and amino acid sequence.


In [None]:
%pip install biopython --quiet

The code below sets an email address, which NCBI requires for E-utilities requests.

In [None]:
from Bio import SeqIO, Entrez

# Set email for NCBI requests
Entrez.email = "your_email@example.com"  # Replace with your actual email

# Fetch protein sequence from NCBI
def fetch_protein_sequence(accession):
    handle = Entrez.efetch(db="protein", id=accession, rettype="fasta", retmode="text")
    seq_record = SeqIO.read(handle, "fasta")
    handle.close()
    return seq_record

accession = "YP_009724390.1"  # RefSeq accession for the SARS-CoV-2 spike (surface) glycoprotein
protein_seq_record = fetch_protein_sequence(accession)

# Print sequence and basic properties
print(f"Sequence ID: {protein_seq_record.id}")
print(f"Sequence Length: {len(protein_seq_record.seq)}")
print(f"Sequence: {protein_seq_record.seq}")

Sequence ID: YP_009724390.1
Sequence Length: 1273
Sequence: MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSS

## **Protein Secondary Structure**
Localized folding patterns within a protein, such as alpha helices and beta sheets, are stabilized by hydrogen bonds. These structures serve as building blocks for the protein's overall shape. Three important secondary structures are the Alpha-Helix, Beta-Sheet, and Beta-Turn.

### **Alpha Helix**
The alpha-helix is a coiled structure with the backbone at the center and the side chains projecting outward from the helix. The helix is usually a right-handed coil and is stabilized by hydrogen bonds between the backbone N-H group of one amino acid and the C=O group four residues earlier. The alpha-helix is a common structural motif found in many proteins, including enzymes and transport proteins.
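
For a sense of scale, the dimensions of a helix can be estimated from its residue count. The sketch below is an illustration added here, not part of the original notebook. It assumes the standard textbook values for an ideal alpha-helix (3.6 residues per turn and a rise of 1.5 Å per residue); real helices deviate somewhat, and `helix_dimensions` is a hypothetical helper name.

```python
RESIDUES_PER_TURN = 3.6  # ideal alpha-helix (textbook value)
RISE_PER_RESIDUE = 1.5   # angstroms along the helix axis (textbook value)


def helix_dimensions(n_residues):
    """Return (number of turns, axial length in angstroms) for an ideal alpha-helix."""
    turns = n_residues / RESIDUES_PER_TURN
    length = n_residues * RISE_PER_RESIDUE
    return turns, length


turns, length = helix_dimensions(18)
print(f"18 residues: {turns:.1f} turns, about {length:.0f} angstroms long")
```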

### **Beta Sheet**
Beta sheets are fundamental secondary structure elements consisting of adjacent polypeptide strands connected by hydrogen bonds. These strands can run either parallel (same direction) or antiparallel (opposite directions), creating a pleated sheet-like structure. The hydrogen bonds formed between the C=O group of one strand and the N-H group of an adjacent strand create a stable, rigid structure.

### **Beta Turn**
Beta turns (β-turns) are essential structural elements that allow the polypeptide chain to change direction, often connecting adjacent strands in antiparallel beta sheets. These turns typically consist of four amino acid residues, with specific amino acids such as proline and glycine commonly found in these positions due to their unique structural properties. The first and fourth residues of the turn are connected by a characteristic hydrogen bond, stabilizing the turn structure.

In [None]:
!pip install -q py3Dmol

In [None]:
import py3Dmol
from IPython.display import display

def show_protein(pdbid):
    # Create viewer
    view = py3Dmol.view(query=pdbid)
    view.setStyle({'cartoon': {'color': 'spectrum'}})
    view.show()

Visualization of the Alpha-Helical Segment of 1ZTA

In [None]:
#Run show_protein function to visualize three proteins below
print("Protein ID: 1ZTA") #alpha-helical segment of 1ZTA
show_protein("1ZTA")

Protein ID: 1ZTA


Visualization of Beta Sheets & Loops

In [None]:
print("Protein ID: 6D0T")
show_protein("6D0T")

Protein ID: 6D0T


Visualization of Beta Sheet

In [None]:
print("Protein ID: 1LE1")
show_protein("1LE1")

Protein ID: 1LE1


## **Protein Tertiary Structure**
Tertiary structure is the overall 3D arrangement of the entire polypeptide chain, formed through interactions between side chains, including hydrophobic interactions, hydrogen bonds, ionic bonds, and disulfide bridges. Side chains (R-groups) play a crucial role in protein folding and stability. They participate in interactions such as:
* Hydrogen bonds between polar side chains.
* Ionic interactions (salt bridges) between positively and negatively charged residues.
* Hydrophobic interactions, where nonpolar side chains aggregate in the interior of the protein, away from the aqueous environment.
* Disulfide bridges between cysteine residues, providing covalent stabilization.

These interactions collectively determine the protein's final structure and function.

<center><img src="images/Tertiary_Structures.png" width=800 /></center><br><br>


In addition to side chain interactions, tertiary structure encompasses recurring combinations of secondary structure elements that form distinct architectural motifs. These modular building blocks are fundamental to protein function and stability. The Beta-Alpha-Beta (βαβ) motif, consisting of two parallel β-strands connected by an α-helix, is commonly found in nucleotide-binding proteins, with the Rossmann fold (three or more parallel β-strands linked by α-helices) being a classic example (Hanukoglu, Biochem. Mol. Biol. Educ. 43:206-209 (2015), https://doi.org/10.1002/bmb.20849). Another prevalent arrangement is the Alpha/Beta (α/β) barrel, also known as the triose-phosphate isomerase (TIM) barrel, which features eight parallel β-strands surrounded by eight α-helices. This versatile structure often houses enzymatic active sites at the C-terminal end of the β-strands.

## **Protein Quaternary Structure**
Quaternary structure represents the highest level of protein organization, describing how multiple protein subunits (polypeptide chains) come together to form a functional protein complex. This arrangement is crucial for many proteins to perform their biological functions; hemoglobin, for example, consists of four subunits. The assembly of subunits can be either homomeric (identical subunits) or heteromeric (different subunits), each serving specific biological functions.
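
Whether a complex is built from identical or different subunits can be checked by comparing the chain sequences. The sketch below is a minimal illustration with a hypothetical helper, `classify_assembly`. The short strings in the example are placeholders standing in for full chain sequences, not real data.

```python
def classify_assembly(chain_sequences):
    """Classify a complex from a {chain_id: sequence} mapping."""
    if not chain_sequences:
        raise ValueError("at least one chain is required")
    if len(chain_sequences) == 1:
        return "monomer"
    # Identical subunits share one sequence; different subunits give several.
    n_unique = len(set(chain_sequences.values()))
    return "homo-oligomer" if n_unique == 1 else "hetero-oligomer"


# Placeholder strings: two copies each of two different chain types,
# mimicking the four-subunit arrangement of hemoglobin.
toy_chains = {"A": "MVLS", "B": "MVHL", "C": "MVLS", "D": "MVHL"}
print(classify_assembly(toy_chains))
```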

In [None]:
# Py3dmol for Hemoglobin
view = py3Dmol.view(query='3bcq')
view.setStyle({'cartoon': {'color': 'spectrum'}})
view.addStyle({'resn': 'HEM'}, {'stick': {'colorscheme': 'Jmol'}})
view.show()

**Amino Acid Color Code**

| Terminal group | Color |
|---|---|
| NH2 (N-terminus) | <span style="background-color:blue; padding:5px; display:inline-block;">&nbsp;</span> Blue |
| COOH (C-terminus) | <span style="background-color:red; padding:5px; display:inline-block;">&nbsp;</span> Red |

---------------
# 📊 Tutorials
In these tutorials, we will use PyMOL to work through <u>**five**</u> activities that show how secondary structures such as helices, beta sheets, and turns are stabilized by hydrogen bonds. You will also visualize and analyze the tertiary and quaternary structure of hemoglobin.


## Before you begin:
- Run the PyMOL GUI by following the directions provided in the Submodule 0 notebook, provided here: [pymol_notebook](../submodule0_pymol_setup/pymol_notebook.ipynb)


In [None]:
# Should print the current directory, whose path ends with `submodule1_protein_structure`
!pwd

/content


In [None]:
# Make the pml_scripts output directory (-p also creates any missing parent directories)
!mkdir -p /home/jupyter/Structural-Biology-and-Drug-Discovery/submodule1_protein_structure/pml_scripts


In [None]:
#enables write access from GUI to subfolders within `submodule1_protein_structure`
!chmod -R 777 .

## 🌟 **Activity 1: Visualizing Hydrogen Bonding and Stabilization of Helical Structure**

### **Objective:**  
To visualize *hydrogen bonding* and understand the stabilization pattern of a helical structure using PyMOL.

### **Steps to Perform the Activity**

#### **Step 1: Prepare the PDB File**
1. Load the file using the PyMOL GUI by navigating to:  
    `File > Open > config > s3 > Structural Biology and Drug Discovery > sub module 1 protein structure > data > helix.pdb`
    
2. If the molecule is displayed with a cartoon helical structure<br>
    `PyMol>hide cartoon`<br>
    `PyMol>show lines`
    
A hydrogen bond interaction table can be generated with HbindViz:
https://psa-lab.github.io/HbindViz/user_guide/  


#### **Step 2: Display the Protein Sequence**
1. To show sequence navigate to:<br>
     `Display > Sequence`
   - The *protein sequence* will appear at the top of the PyMOL interface.

2. Select the specific sequence by using the *left mouse button* to select a region of the sequence displayed at the top.

#### **Step 3: Highlight Hydrogen Bonds**
1. Find Polar Contacts with the selected sequence by navigating to: <br>  
    `Sele > Action > Find > Polar Contacts > Within Selection` <br>
   - This will highlight the *hydrogen bonds* (yellow lines) within the selection.

2. Label the residues by navigating to:  <br>
     `Sele > Label > Residues`  
   - The labels of the residues will now appear.

#### **Step 4: Visualize and Analyze the Structure**
1. Adjust the zoom to observe the *yellow lines* representing hydrogen bonds.

### **Alternative Method for Activity 1 - Write and Load PML Script**

In [None]:
# PML Script: Protein Hydrogen Bond Analysis
## Run this code cell to write the .pml script file and execute the analysis outlined step-by-step above.
## Once you write the file, it can be run in the PyMOL GUI in one of two ways:
##    1. Select File > Run Script and then the pml script within PyMOL -or-
##    2. Set your directory to the folder and then run: pymol> @helix_analysis.pml

with open("/home/jupyter/Structural-Biology-and-Drug-Discovery/submodule1_protein_structure/pml_scripts/helix_analysis.pml", "w") as scriptout:
    scriptout.write("# Step 1: Prepare the PDB File\n")
    scriptout.write("load /home/jupyter/Structural-Biology-and-Drug-Discovery/submodule1_protein_structure/data/helix.pdb\n")  # Load the helix PDB file

    scriptout.write("# Hide cartoon representation and show lines\n")
    scriptout.write("hide cartoon\n")
    scriptout.write("show lines\n")

    scriptout.write("# Step 2: Display the Protein Sequence\n")
    scriptout.write("set seq_view, on\n")  # Enable sequence display at the top of the interface

    scriptout.write("# Step 3: Highlight Hydrogen Bonds\n")
    # Select every atom of residues 10-20: polar contacts involve backbone N and O atoms, not alpha-carbons
    scriptout.write("select helix_region, resi 10-20\n")

    scriptout.write("# Find polar contacts within the selected region\n")
    scriptout.write("distance hbonds, helix_region, helix_region, mode=2\n")

    scriptout.write("# Label each selected residue once, at its alpha-carbon\n")
    scriptout.write("label helix_region and name CA, resn + resi\n")

    scriptout.write("# Adjust the label size for better visibility\n")
    scriptout.write("set label_size, 20\n")

    scriptout.write("# Final step: Zoom into the selected region for better visualization\n")
    scriptout.write("zoom helix_region\n")

print("PML script 'helix_analysis.pml' has been updated. Run it in PyMOL to perform the analysis.")


PML script 'helix_analysis.pml' has been updated. Run it in PyMOL to perform the analysis.


### **Key Observations and Understanding the Visualization**

- **Hydrogen Bonds (Yellow Lines):**  
  - Notice that *Trp14 NH* forms a hydrogen bond with *Ile10 C=O*, following an *i to i+4 pattern*, where `i` represents the first residue (*Ile10*) and *Trp14* is *4 residues away*.  
  - Similarly, *Phe13 C=O* is hydrogen bonded to *Ile17 NH*, which stabilizes the helical structure.  
  - These hydrogen bonds play a crucial role in maintaining the integrity of the alpha-helix.

- **Residue Labeling:**  
  - Labels help identify specific residues involved in hydrogen bonding, facilitating a better understanding of their interactions.

- **Helical Backbone and Side Chains:**  
  - The *backbone* is positioned at the center of the helix, providing the structural core.  
  - The *side chains* extend outward, ensuring space for interactions and functionality in the tertiary structure, further contributing to overall stability.  
  - This outward orientation supports the structural and functional properties of the protein.
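
The i to i+4 pattern described above can be enumerated for the residue range used in this activity. This is a sketch of the expected pairing only. `helix_hbond_pairs` is a hypothetical helper, and residues near the ends of a real helix may not form every bond listed.

```python
def helix_hbond_pairs(first, last, offset=4):
    """List (C=O residue, N-H residue) pairs for an i -> i+offset pattern."""
    return [(i, i + offset) for i in range(first, last - offset + 1)]


# Residues 10-20: each C=O at position i pairs with the N-H at position i + 4.
for acceptor, donor in helix_hbond_pairs(10, 20):
    print(f"C=O of residue {acceptor}  ...  N-H of residue {donor}")
```

The list includes the 10/14 and 13/17 pairs noted in the observations, which can be compared against the yellow dashed lines drawn by PyMOL.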


--------------
## 🌟 **Activity 2: Visualizing the Hydrogen Bonding Pattern in a Beta-Sheet Structure**

### **Objective:**  
To visualize the **hydrogen bonding pattern** that stabilizes a **beta-sheet structure** (secondary protein structure) using PyMOL.

#### **Beta Sheet Structures:**
* <u>Parallel</u>: Strands run in the same direction, resulting in weaker hydrogen bonds due to a non-linear alignment.
* <u>Antiparallel</u>: Strands run in opposite directions, leading to stronger, more linear hydrogen bonds.
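
The two arrangements can also be told apart from residue numbers alone. Moving along one strand, the partner residue numbers on the other strand rise in a parallel sheet and fall in an antiparallel sheet. The sketch below illustrates that check. `strand_orientation` is a hypothetical helper and the residue pairs are made-up examples, not taken from a PDB entry.

```python
def strand_orientation(pairs):
    """Classify two paired strands from (residue_on_strand_1, residue_on_strand_2) pairs."""
    if len(pairs) < 2:
        raise ValueError("need at least two residue pairs")
    ordered = sorted(pairs)  # walk along strand 1 in increasing residue number
    steps = [b2 - b1 for (_, b1), (_, b2) in zip(ordered, ordered[1:])]
    if all(step > 0 for step in steps):
        return "parallel"
    if all(step < 0 for step in steps):
        return "antiparallel"
    return "mixed or irregular"


# Made-up pairings for illustration only.
print(strand_orientation([(2, 11), (4, 9), (6, 7)]))     # partner numbers fall
print(strand_orientation([(2, 30), (4, 32), (6, 34)]))   # partner numbers rise
```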

In [None]:
#Run the show_protein function defined above (make sure you ran the cell to create the function) to visualize `6D0T` protein again
show_protein("6D0T")

### **Steps to Perform the Activity**

#### **Step 1: Load the Protein with a Beta-Sheet Structure**
1. Fetch protein structure PDB file using the `fetch` PyMOL command:<br>
     `PyMol> fetch 5GV0`

#### **Step 2: Display the Protein Sequence**
1. Display the protein sequence (top of the PyMOL interface) by using the PyMOL GUI to navigate to:  
     `Display > Sequence`

2. Select only the beta-sheet portion of the protein sequence using the *left mouse button*.
   
3. To hide everything except the selected portion, enter the following command in the PyMOL command line:<br>
     `PyMol> hide everything, not sele`

#### **Step 3: Highlight Hydrogen Bonds**
1. With the protein sequence selected, find the polar contacts by navigating and selecting the following option:  
    `Sele > Action > Find > Polar Contacts > Within Selection`
   - The *hydrogen bonds* (yellow lines) should now be displayed.
   
### **Beta-Sheet Hydrogen Bonding: Key Characteristics and Visualization**

- **Hydrogen Bonds (Yellow Lines):** The beta-sheet structure is stabilized by hydrogen bonds that connect the backbone atoms of neighboring strands. These bonds are arranged in a **ladder-like pattern**, running parallel between adjacent strands, providing structural integrity.

- **Inter-chain Hydrogen Bonds:** Hydrogen bonds form between adjacent chains of the beta-sheet, contributing to overall stability and rigidity.

- **Structural Stabilization:** The arrangement of hydrogen bonds ensures the beta-sheet remains a stable, functional component of the protein structure.


In [None]:
# Run the show_protein function defined above (make sure you ran the cell to create the function) to visualize `1LE1` protein again
show_protein("1LE1")

---------------------------
## 🌟 **Activity 3: Visualizing the Beta-Turn in a Protein Structure**

### **Objective:**  
To identify and visualize a **beta-turn** structure in the hemoglobin protein using PyMOL.

#### Beta-Turn:
A tight loop that allows the polypeptide chain to reverse direction. Beta-turns are stabilized by hydrogen bonds and are often located on the protein surface, helping proteins form compact shapes.

<center><img src="images/beta_turn.png" width=600 /><br><br></center>

### **Steps to Perform the Activity**

#### **Step 1: Load the Hemoglobin Protein Structure**
1. Fetch the hemoglobin structure PDB file:<br>
     `PyMol> fetch 3BCQ`

#### **Step 2: Simplify the Structure**
1. Remove chains *B, C, and D* to focus on chain *A*:<br>
     `PyMol> remove chain B+C+D`

2. Delete the residues outside the range of interest by removing residues *1 to 30* and *71 to the end*:<br>
     `PyMol> remove resi 1-30` <br>
     `PyMol> remove resi 71-999`

#### **Step 3: Focus on the Beta-Turn**
1. The loop structure from residues *44 to 53* contains the beta-turn between two helices. Select the key residues in the beta-turn (residues *50, 51, 52, and 53*):<br>
     `PyMol> select resi 50+51+52+53`

2. Set the display mode to *sticks* to show the selected beta-turn residues in stick representation:<br>
     `PyMol> show sticks, sele`

3. Add labels to the selected residues:<br>
     `PyMol> label sele, resn+resi`
   
### **Beta-Turn: Key Characteristics and Visualization**

- **Beta-Turn Structure:** The residues *Ser50, Pro51, Gly52, Ser53* form a beta-turn, which enables a sharp directional change in the protein chain between two helices. This structure ensures the protein retains its compact and functional shape.

- **Structural Stabilization:** Beta-turns provide stability by maintaining the tight folding necessary for protein functionality.

- **Beta-Turn Residues:** The selected residues (*Ser50-Pro51-Gly52-Ser53*) stabilize the corner structure of the protein.

- **Functional Role:** This turn connects helices, contributing to the overall compact architecture of the hemoglobin molecule.
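
The steps of this activity can also be collected into a PML script, in the same spirit as the alternative method shown for Activity 1. The sketch below only builds the script text. `build_beta_turn_script` and the selection names `beta_turn` and `turn_hbond` are hypothetical, and the `distance` line is an extra step, not part of the activity, that measures the backbone O of the first turn residue against the backbone N of the fourth. The script labels each residue once at its alpha-carbon instead of labeling every atom.

```python
def build_beta_turn_script(pdb_id="3BCQ", first=50, last=53):
    """Return PyMOL commands that isolate chain A and highlight the beta-turn."""
    lines = [
        f"fetch {pdb_id}",
        "remove chain B+C+D",
        "remove resi 1-30",
        "remove resi 71-999",
        f"select beta_turn, resi {first}-{last}",
        "show sticks, beta_turn",
        "label beta_turn and name CA, resn+resi",
        # Extra step (assumption, not in the activity): i -> i+3 backbone distance
        f"distance turn_hbond, resi {first} and name O, resi {last} and name N",
        "zoom beta_turn",
    ]
    return "\n".join(lines) + "\n"


print(build_beta_turn_script())
```

Saving the printed text to a `.pml` file and running it from the PyMOL command line with `@` reproduces the manual steps in one go.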


-------------
## 🌟 **Activity 4: How Disulfide Bonds Stabilize Two Chains of Insulin**
Disulfide bonds are formed by the oxidation of two cysteine residues, creating a covalent bond that stabilizes protein structure. In insulin, for example, disulfide bonds link the two polypeptide chains and contribute to its 3D structure. In antibodies, disulfide bonds in the flexible hinge region link the heavy chains, which helps position the arms for antigen binding.

### **Objective:**  
To visualize and understand the role of **disulfide bonds** in stabilizing the two chains of insulin using PyMOL.

### **Steps to Perform the Activity**

#### **Step 1: Load the Insulin Structure**
1. Fetch the PDB File to retrieve the insulin structure:<br>
     `PyMol> fetch 3I3Z`
   - This will display the insulin structure, showing its two chains.

#### **Step 2: Visualize Disulfide Bonds**
1. Highlight disulfide bonds by navigating to the *S menu* in PyMOL and selecting (the option for selecting disulfide is at the bottom of the selection menu):<br>
     ` S > Disulfide > Lines`
   - This will display the *disulfide bonds* as lines connecting sulfur atoms between the two chains.
   
2. Rotate and zoom the visualized structure to observe how the disulfide bonds link specific residues in the two chains.

#### **Step 3: Analyze the Stabilization**
1. Observe that the disulfide bonds form *covalent links* between sulfur atoms of cysteine residues, one on each chain.
2. Note that these disulfide bonds stabilize the insulin molecule by holding the two chains together in a fixed orientation.
3. Disulfide bonds are essential for the functional conformation of insulin because they ensure the protein remains stable under physiological conditions, enabling proper interaction with insulin receptors.
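
Outside PyMOL, disulfide bonds are commonly detected by checking the distance between cysteine SG (sulfur) atoms, since a bonded S-S pair sits roughly 2.05 Å apart. The sketch below illustrates that distance test. `find_disulfides`, the 2.5 Å cutoff, and the labels are assumptions made for the example, and the coordinates are made up for illustration (they are NOT taken from 3I3Z).

```python
import math
from itertools import combinations

SS_BOND_MAX = 2.5  # angstroms; generous cutoff (assumption), typical S-S bond is about 2.05


def find_disulfides(sg_atoms, cutoff=SS_BOND_MAX):
    """Return pairs of cysteine labels whose SG atoms lie within the cutoff.

    sg_atoms maps a label such as "A:1" (chain:index) to the (x, y, z)
    position of that cysteine's SG atom.
    """
    bonds = []
    for (name1, xyz1), (name2, xyz2) in combinations(sorted(sg_atoms.items()), 2):
        if math.dist(xyz1, xyz2) <= cutoff:
            bonds.append((name1, name2))
    return bonds


# Made-up coordinates for illustration only (not real insulin data).
toy_sg = {
    "A:1": (0.0, 0.0, 0.0),
    "B:1": (2.0, 0.0, 0.0),
    "A:2": (10.0, 0.0, 0.0),
    "B:2": (10.0, 2.1, 0.0),
    "A:3": (20.0, 0.0, 0.0),  # too far from every other sulfur to be bonded
}
print(find_disulfides(toy_sg))
```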

#### ❓ **Additional Discussion Questions**
Click through the flashcards below to explore additional questions about disulfide bonds.

In [None]:
#Render flashcards
from IPython.display import IFrame
IFrame('quiz/question_flashcards_1.1.4.html', width=600, height=500)

------------------
## 🌟 **Activity 5: Understanding the Heme Group Binding Pocket in Hemoglobin**

### **Objective:**  
To explore the *heme group binding pocket* in hemoglobin using the PyMOL visualization tool.

### **Steps to Perform the Activity**

#### **Step 1: Load the Hemoglobin Structure**
1. Fetch the PDB File using the following PyMOL command to retrieve the hemoglobin structure:<br>
     `PyMol> fetch 3BCQ`

2. Display the hemoglobin sequence (at the top of the PyMOL interface) by navigating to:<br>
     `Display > Sequence`

#### **Step 2: Isolate Chain A with the Heme Group**
1. Run the following commands to remove Chains B, C, and D:<br>
     `PyMol> select b_chain, chain B`<br>
     `PyMol> remove b_chain`<br>
     `PyMol> select c_chain, chain C`<br>
     `PyMol> remove c_chain`<br>
     `PyMol> select d_chain, chain D`<br>
     `PyMol> remove d_chain`

2. Select water molecules (shown as red `O`s in the sequence) and delete them:<br>
     `PyMol> select water, resn HOH`<br>
     `PyMol> remove water`

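3. Select and remove any *heme* groups and *oxygen molecules* that are not bound to the chain A protein. Restricting the reference selection to `polymer` keeps chain A's own heme from being counted as unbound:<br>
     `PyMol> select free_heme, resn HEM and not (byres (polymer and chain A) around 5)`<br>
     `PyMol> remove free_heme`<br>
     `PyMol> select free_oxygen, resn O2 and not (byres (polymer and chain A) around 5)`<br>
     `PyMol> remove free_oxygen`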

#### **Step 3: Analyze the Binding Pocket**
1. For the resulting structure, only the *A chain* of hemoglobin with its bound *heme* group and *oxygen* should be displayed.
2. Explore the Heme Binding Pocket by rotating and zooming into the heme group to observe:
     - How the *heme group* is embedded within the protein structure.
     - The *interactions* between the *heme group* and surrounding amino acid residues.
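
To make the surrounding residues easier to find, the pocket can be selected explicitly. The sketch below builds a PML script for that. It is an illustration only: `build_heme_pocket_script` and the selection names `heme` and `pocket` are hypothetical, and the 5 Å cutoff simply reuses the distance from Step 2.

```python
def build_heme_pocket_script(pdb_id="3BCQ", chain="A", cutoff=5):
    """Return PyMOL commands that isolate one chain and highlight its heme pocket."""
    protein = f"(polymer and chain {chain})"
    lines = [
        f"fetch {pdb_id}",
        f"remove not chain {chain}",  # drop the other chains
        "remove resn HOH",            # drop water molecules
        "select heme, resn HEM",
        # Protein residues with any atom within `cutoff` angstroms of the heme
        f"select pocket, byres ({protein} within {cutoff} of heme)",
        "show sticks, heme",
        "show lines, pocket",
        "label pocket and name CA, resn+resi",
        "zoom pocket",
    ]
    return "\n".join(lines) + "\n"


print(build_heme_pocket_script())
```

The labeled residues in `pocket` are the ones to inspect when describing how the heme group is held in place.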

---------------
## Conclusions
Upon completing this module, users should be able to analyze primary, secondary, tertiary, and quaternary protein structures in detail. Users should be able to run the scripts and use PyMOL in both GUI and command mode. They should be able to identify and analyze the non-covalent interactions that stabilize protein structure, such as hydrogen bonding and hydrophobic interactions. With this training, users will be ready to analyze protein structures for drug design.

## Clean Up
<div class="alert alert-block alert-warning"> <b>Attention:</b> Remember to shut down the VM and delete any relevant resources. </div>