# ADD TITLE OF SUBMODULE 2.3
<mark> EDIT</mark>

### **Learning Objectives:**
- Learn how to prepare the 3D structure of small molecule ligand drugs using sketching software.
- Perform protein-protein docking and analyze the interaction surface of two proteins using ClusPro.

### **Prerequisites:** 
- Familiarity with PyMOL
- Ability to sign up for ClusPro account (free)
- <mark> anything else? </mark>

## **Sketching and Generating the 3D Structure of a Ligand**

So far, we have docked ligands to proteins using 3D structures available from the Protein Data Bank (PDB). The 3D structure of a new ligand, however, is not always available from public databases. In such cases, if you want to dock a new ligand to a protein of interest, you must first create the 3D ligand structure yourself. Here we use acetaminophen (shown below) as an example of how to sketch a new molecule and generate its 3D structure.

<center><img src="images/fig16_acetaminophen_structure.png" width=300 /></center><br>

## **Using AutoDock when you have a ligand with reactive atoms**

Many drug molecules have reactive atoms, such as halogens, in their chemical structure. When your ligand contains halogens such as chlorine, bromine, or iodine, you need to create the grid using both the ligand and the receptor. Most proteins do not contain these atoms, so if you create the grid using only the protein receptor, as explained in the previous example, the docking calculation will not run because the grid map files for the halogen atom types are not found. The example below shows how to create a grid in AutoDock when the ligand contains halogens. Once the grid includes the halogen atom types, the docking calculation proceeds as explained previously.

<center><img src="images/fig17_halogen_ligand.png" width=300 /></center><br>

## **Protein-Protein docking**
Most docking methods are designed for small molecule-protein docking. However, to define the binding site of a protein-protein interaction, you need protein-protein docking methods, because both partners are large macromolecules. Docking two proteins helps define their binding interface and guides the design of protein-protein interaction inhibitors. Experimental methods such as X-ray crystallography and cryo-electron microscopy are used to determine the structures of protein complexes. In many cases, however, only the structures of the individual proteins are known, and the structure of the complex has to be modeled. Protein-protein docking methods are used to predict the structure of such complexes. Be cautious with the resulting models, because several binding modes are usually possible. If experimental information is available that can support a model, it should be evaluated and considered.
Why do we need the protein-protein complex structure model?<br>

Protein-protein interactions are essential to many physiological processes in the human body, and they help carry out many cellular functions. Deregulation of a protein-protein interaction may result in a disease state. One example involves the epidermal growth factor receptor (EGFR) family, which has four members: EGFR (HER1), HER2, HER3, and HER4. These receptors form dimers upon ligand binding and signal for cell growth and differentiation. Deregulation or overactivation of this pathway leads to the formation of excess dimers, resulting in a disease state. In such cases, the protein-protein interaction needs to be modulated or inhibited, and inhibiting protein-protein interactions is one approach to drug design. To design inhibitors of the protein-protein interactions of EGFRs, we need to know the interacting surfaces of the proteins.

## **Using ClusPro**
In ClusPro, the ligand is rotated through a specified number (**n**) of rotations. For each rotation, the ligand is translated in the *x*, *y*, and *z* directions relative to the receptor on a grid. The translation with the *best score* for each rotation is selected.

Among the *n rotations*, ClusPro selects *1,000 rotation/translation combinations* with the lowest scores as possible models of the protein-protein complex.
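The search described above can be illustrated with a small toy sketch. This is not ClusPro's implementation: the `toy_score` function, the grid size, and the number of models kept are hypothetical placeholders chosen only so the example runs. It shows the two-stage selection: the best translation is kept for each rotation, and then the lowest-scoring rotation/translation combinations are retained.

```python
import heapq
import itertools
import math


def toy_score(rotation_index, translation):
    """Hypothetical stand-in for a docking energy; lower is better."""
    x, y, z = translation
    # Arbitrary function used only to make the example runnable.
    return (x - rotation_index % 3) ** 2 + (y + 1) ** 2 + z ** 2 + math.sin(rotation_index)


def rigid_body_search(n_rotations, grid_points, keep):
    """Keep the best translation per rotation, then the `keep` lowest scores overall."""
    translations = list(itertools.product(grid_points, repeat=3))
    best_per_rotation = []
    for rot in range(n_rotations):
        # Best-scoring translation on the grid for this rotation
        best_t = min(translations, key=lambda t: toy_score(rot, t))
        best_per_rotation.append((toy_score(rot, best_t), rot, best_t))
    # Lowest-scoring rotation/translation combinations, sorted by score
    return heapq.nsmallest(keep, best_per_rotation)


candidates = rigid_body_search(n_rotations=50, grid_points=range(-2, 3), keep=10)
for score, rot, translation in candidates[:3]:
    print(f"rotation {rot}, translation {translation}, score {score:.3f}")
```

In this sketch, `keep=10` plays the role of the 1,000 combinations that ClusPro retains.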

To select the best model, use your knowledge of the two proteins and any available experimental evidence. If no prior knowledge is available, start with the models from the balanced scoring scheme, as recommended by the ClusPro defaults.

For more details, visit the [ClusPro website](https://cluspro.org/login.php).


-------------------

# 📊 Tutorials
In these tutorials, we will use PyMOL, ClusPro, and other cheminformatics tools to work through <u>**three**</u> applied activities to:
- Build a small molecule ligand when a public *.pdb* file is not available.
- Learn the nuances of generating a grid for a molecule with reactive atoms.
- Use ClusPro for Protein-Protein docking.

## Before you begin:
- Run the PyMOL GUI by following the directions provided in the Submodule 0 notebook, provided here: [start_gui-server](../submodule0_pymol_setup/start-gui-server.ipynb)
- Register for a ClusPro account through their website - [ClusPro website](https://cluspro.org/login.php) **[OPTIONAL]**

### 🌟 **Activity 1: Sketching Acetaminophen Molecule and Preparing for Docking**

This activity involves creating a 3D structure of acetaminophen using an online cheminformatics tool and preparing it for docking studies.

#### **Step 1. Access the Online Cheminformatics Tool**
1. Open the [e-Drug 3D platform](https://chemoinfo.ipmc.cnrs.fr/LEA3D/drawonline.html)

#### **Step 2. Sketch the Acetaminophen Molecule**
1. Use the drawing tools to sketch the structure of acetaminophen (paracetamol).
   - The Acetaminophen structure consists of:
     - A benzene ring.
     - A hydroxyl group (-OH) at one position.
     - An amide group (-NHCOCH3) at the para position, directly opposite the hydroxyl group.
2. Click the *Convert to SDF* button to generate a 3D SDF (Structure Data File) of the molecule.

#### **Step 3. Generate 3D Conformations**
1. From the *3D Structure Window*, click *Generate 3D SDF*.
   - You may provide options for generating multiple conformations if desired.
2. Verify the 3D structure in the visualization panel.

#### **Step 4. Download the Molecule**
1. Click the `Download PDB` button in the tool to download the 3D structure in PDB format.
2. Save the file as `acetaminophen.pdb`
3. The downloaded `acetaminophen.pdb` file can now be used as a *ligand* for docking studies with your protein of interest.
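
Before using the downloaded file for docking, it can help to confirm that the sketch matches acetaminophen (molecular formula C8H9NO2). The following is a minimal sketch, assuming the file is saved as `acetaminophen.pdb` in the current working directory and follows standard PDB column positions; the helper `count_elements` is a hypothetical name introduced here for illustration. Hydrogens may or may not be present depending on how the tool exports the structure, so only heavy atoms are compared.

```python
from collections import Counter
from pathlib import Path


def count_elements(pdb_text):
    """Count atoms per element in the ATOM/HETATM records of a PDB file."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        element = line[76:78].strip()  # element symbol column, if present
        if not element:
            # Fall back to the atom name column (e.g. "C1", "CL1")
            name = line[12:16].strip()
            letters = "".join(ch for ch in name if ch.isalpha())
            element = letters[:2] if letters[:2].upper() in ("CL", "BR") else letters[:1]
        counts[element.capitalize()] += 1
    return counts


EXPECTED_HEAVY_ATOMS = {"C": 8, "N": 1, "O": 2}  # acetaminophen, C8H9NO2

pdb_path = Path("acetaminophen.pdb")  # assumed location of the downloaded file
if pdb_path.exists():
    counts = count_elements(pdb_path.read_text())
    heavy = {el: n for el, n in counts.items() if el != "H"}
    print("Atom counts:", dict(counts))
    print("Heavy atoms match acetaminophen:", heavy == EXPECTED_HEAVY_ATOMS)
else:
    print(f"{pdb_path} not found; download the structure first.")
```

If the heavy-atom counts do not match, revisit the sketch in Step 2 before continuing.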

--------------------

### 🌟 **Activity 2: Generating Grid for Chlorine and Bromine Atoms**
This activity uses AutoGrid to create grid maps for a ligand that contains chlorine and bromine atoms. It demonstrates how to prepare the ligand and receptor so that these reactive atom types are included in grid generation for docking calculations with AutoGrid and AutoDock.

### **Steps to Perform the Activity**

#### **Step 1. Access the Online Cheminformatics Tool**
1. Open the [e-Drug 3D platform](https://chemoinfo.ipmc.cnrs.fr/LEA3D/drawonline.html)

#### **Step 2. Sketch the Halogen-Containing Molecule**
1. Use the drawing tools to sketch the halogen-containing structure shown in the figure above.
2. Click the *Convert to SDF* button to generate a 3D SDF (Structure Data File) of the molecule.

#### **Step 3. Generate 3D Conformations**
1. From the *3D Structure Window*, click *Generate 3D SDF*.
   - You may provide options for generating multiple conformations if desired.
2. Verify the 3D structure in the visualization panel.

#### **Step 4. Download the Molecule**
1. Click the `Download PDB` button in the tool to download the 3D structure in PDB format.
2. Save the file as `molecule1.pdb`

#### **Step 5. Prepare the Ligand**
1. Open the ligand file in AutoDockTools (MGLTools) by navigating to: <br>
    `Ligand > Open > molecule1.pdb`
2. Save the ligand as a *PDBQT file*:<br>
   `Ligand > Save > PDBQT`
3. Save the file as `molecule1.pdbqt`

#### **Step 6. Load the Receptor**
1. Open the receptor file:<br>
   `File > Open > hiv1receptor.pdbqt`
    - Both ligand and receptor should now be displayed in the workspace.

#### **Step 7. Create a Grid Box for Docking**
1. Set Macromolecule by going to:<br>
   `Grid > Macromolecule > Choose` and select `hiv1receptor.pdbqt`
2. Set Map Types by navigating to:<br>
   `Grid > Set Map Types > Directly` <br>
   - Then, go to `Grid > Set Map Types > Choose Ligand` and select `molecule1.pdbqt`.
3. Configure the Grid Box:<br>
   `Grid > GridBox`
   - Adjust the grid box dimensions and spacing to encompass the receptor and ligand binding site.
4. Save the Grid Parameter File (GPF):<br>
   `Grid > Output > Save GPF`
5. Name the file `chlhivreceptor.gpf`.

#### **Step 8. Check the Grid Parameter File**
1. Navigate to:<br>
   `Grid > Edit GPF`
2. Verify that the parameter file includes entries for bromine (`Br`) and chlorine (`Cl`) atoms.
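
The same check can be done programmatically. The following is a minimal sketch, assuming the GPF was saved as `chlhivreceptor.gpf` in the current working directory and contains a `ligand_types` line listing the atom types, as AutoGrid parameter files normally do. The helper names `gpf_ligand_types` and `missing_types` are hypothetical and introduced only for this illustration.

```python
from pathlib import Path


def gpf_ligand_types(gpf_text):
    """Return the atom types listed on the `ligand_types` line of a GPF."""
    for line in gpf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if fields and fields[0] == "ligand_types":
            return fields[1:]
    return []


def missing_types(gpf_text, required=("Cl", "Br")):
    """Return the required atom types that the GPF does not list."""
    present = set(gpf_ligand_types(gpf_text))
    return [atom_type for atom_type in required if atom_type not in present]


gpf_path = Path("chlhivreceptor.gpf")  # assumed location of the saved GPF
if gpf_path.exists():
    missing = missing_types(gpf_path.read_text())
    print("Missing halogen atom types:", missing or "none")
else:
    print(f"{gpf_path} not found; save the GPF first.")
```

If either atom type is reported as missing, repeat Step 7 and make sure the ligand was chosen under `Grid > Set Map Types > Choose Ligand`.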

#### **Step 9. Docking Calculations [OPTIONAL]**
<div class="alert alert-block alert-warning"> <b>Attention:</b> As previously stated, on a standard laptop, the estimated runtime for this step is <u>~10–16 hours</u>. Because of this long runtime, we have marked the step as <b>[optional]</b>. Faster results can be achieved by decreasing the search parameter values and/or increasing computational resources (if running the module on the cloud).</div> 
    
1. Run AutoGrid with the prepared grid parameter file (`chlhivreceptor.gpf`) to generate the grid maps, then run AutoDock with the docking parameter file (`hivreceptordock.dpf`).
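
As a rough sketch, the two runs can be launched from a notebook cell as shown below. This assumes the AutoDock 4 executables are installed under their usual names, `autogrid4` and `autodock4`, and are available on your `PATH`; the log file names are derived from the parameter file names and are only a suggestion. The `RUN` flag is left off by default because of the long runtime.

```python
import shutil
import subprocess


def build_commands(gpf="chlhivreceptor.gpf", dpf="hivreceptordock.dpf"):
    """Return the AutoGrid and AutoDock command lines as argument lists."""
    glg = gpf.rsplit(".", 1)[0] + ".glg"  # AutoGrid log file
    dlg = dpf.rsplit(".", 1)[0] + ".dlg"  # AutoDock log file
    return [
        ["autogrid4", "-p", gpf, "-l", glg],
        ["autodock4", "-p", dpf, "-l", dlg],
    ]


RUN = False  # set to True only if you want to start the long-running job

for cmd in build_commands():
    print(" ".join(cmd))
    if RUN and shutil.which(cmd[0]):
        # AutoGrid must finish before AutoDock starts, so run sequentially
        subprocess.run(cmd, check=True)
```

AutoGrid has to complete first because AutoDock reads the map files that AutoGrid writes.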

### **Key Outputs**
- The *Grid Parameter File* (`chlhivreceptor.gpf`) which, as shown in the image below, should include parameters for chlorine (Cl) and bromine (Br) atoms.
- Docking calculations will utilize the grid box configured for the specified atoms.

<center><img src="images/fig18_grid_parameter_file.png" width=600 /></center><br>

-----------------

### 🌟 **Activity 3: Modeling the EGFR:HER4 Protein Complex Using ClusPro**
This activity focuses on using the **ClusPro protein-protein docking server** to model the interaction between the extracellular domains of EGFR (PDB ID: 3NJP) and HER4 (PDB ID: 8U4I). The purpose of the activity is to demonstrate how to prepare protein structures using PyMOL and perform protein-protein docking with ClusPro.

### **Steps to Perform the Activity**

#### **Step 1. Access ClusPro Docking Server**
1. If you haven't already, go to the [ClusPro website](https://cluspro.org/login.php).
2. Create a ClusPro account:
   - Register with a username and password.
   - Provide an email address to receive the docking results.

#### **Step 2. Preparing EGFR Monomer Using PyMOL**
1. Type the following command in PyMOL to fetch the EGFR homodimer structure:<br>
   `PyMOL> fetch 3NJP`

2. Remove one of the molecules from the homodimer:<br>
   `PyMOL> remove chain B`
   
3. Remove the ligand peptide:<br>
   `PyMOL> remove resn LIG` <mark>need to specify ligand here? </mark>
   
4. Remove solvent molecules:<br>
   `PyMOL> remove solvent`
   
5. Save the cleaned structure:<br>
   `PyMOL> save EGFRm1.pdb`

### Alternative Method - Write and Load PML Script
<mark>Can breakout step 3 into 3 if need be.</mark>

In [None]:
# PML Script: Preparing the EGFR Monomer
## Run this code cell to write a .pml script file that performs the steps outlined above.
## Once the file is written, it can be run in the PyMOL GUI in one of two ways:
##    1. Select File > Run Script and choose the .pml script within PyMOL -or-
##    2. Set your working directory to the script folder and then run: PyMOL> @2.3_preparing_egfr.pml
import os

os.makedirs("pml_scripts", exist_ok=True)  # create the output folder if it does not exist yet

with open("pml_scripts/2.3_preparing_egfr.pml", "w") as scriptout:
    scriptout.write("# Step 1: Fetch the EGFR homodimer structure\n")
    scriptout.write("fetch 3NJP\n")
    
    scriptout.write("# Step 2: Remove one of the molecules from the homodimer\n")
    scriptout.write("remove chain B\n")
    
    scriptout.write("# Step 3: Remove the ligand peptide and solvents\n")
    scriptout.write("remove organic or solvent or hetatm\n")
    
    scriptout.write("# Step 4: Save the cleaned structure\n")
    scriptout.write("save Structural-Biology-and-Drug-Discovery/submodule2_docking_protein_interactions/data/EGFRm1.pdb\n")

#### **Step 3. Preparing HER4 Monomer Using PyMOL**
1. Type the following command in PyMOL to fetch the HER4 homodimer structure:<br>
   `PyMOL> fetch 8U4I`

2. Remove one molecule from the homodimer:<br>
   `PyMOL> remove chain B`
   
3. Remove the ligand peptide:<br>
   `PyMOL> remove resn LIG` <mark>need to specify ligand here. Or update as shown above. </mark>
   
4. Remove solvent molecules:<br>
   `PyMOL> remove solvent`
   
5. Save the cleaned structure:<br>
   `PyMOL> save HER4m1.pdb`

### Alternative Method - Write and Load PML Script
<mark>Feel that there is no need to have these as two seperate. Should remove the main method and add in a section to identify the pymol code from script</mark>

In [None]:
# PML Script: Preparing the HER4 Monomer
import os

os.makedirs("pml_scripts", exist_ok=True)  # create the output folder if it does not exist yet

with open("pml_scripts/2.3_preparing_her4.pml", "w") as scriptout:
    scriptout.write("# Step 1: Fetch the HER4 homodimer structure\n")
    scriptout.write("fetch 8U4I\n")
    
    scriptout.write("# Step 2: Remove one molecule from the homodimer\n")
    scriptout.write("remove chain B\n")
    
    scriptout.write("# Step 3: Remove the ligand peptide and solvents\n")
    scriptout.write("remove organic or solvent or hetatm\n")
    
    scriptout.write("# Step 4: Save the cleaned structure\n")
    scriptout.write("save Structural-Biology-and-Drug-Discovery/submodule2_docking_protein_interactions/data/HER4m1.pdb\n")

#### **Step 4a. [OPTION 1] Submit a Docking Job on ClusPro Website**
1. Login to your ClusPro account.
2. Enter a *Job Name* (e.g., `testjois1`).
3. Upload the Receptor and Ligand
   - *Receptor File*: `EGFRm1.pdb`
   - *Ligand File*: `HER4m1.pdb`
4. Click *Dock* to submit the job.

#### **Step 4b. [OPTION 2] Submit a Docking Job Using cluspro-api**
<mark>CLUSPRO CODE NEEDS TESTING</mark>

In [None]:
#Install cluspro-api python library
!pip install cluspro-api

In [None]:
#define cluspro-api parameter values
username = '' #REPLACE WITH YOUR CLUSPRO USERNAME WITHIN ''
secret = '' #REPLACE WITH YOUR API SECRET WITHIN ''
receptor_path = 'data/EGFRm1.pdb' #PATH TO RECEPTOR .PDB FILE
ligand_path = 'data/HER4m1.pdb' #PATH TO LIGAND .PDB FILE

In [None]:
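#submit cluspro job
#the leading ! runs the command in the shell; {name} inserts the Python variable defined above
!cluspro_submit --username {username} --secret {secret} --receptor {receptor_path} --ligand {ligand_path}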

#### **Step 5a. [OPTION 1, if following Step 4a] Access Results on ClusPro Website (if not running cluspro-api)**
1. The docking job will run on the server.
2. Results are sent to your email or can be accessed by logging into ClusPro.
3. The job may take several minutes to hours, depending on server load.
4. The Job ID, Name, and Status (e.g., "Completed") will be displayed.
5. If the job is completed, click on the Job ID number to view the results.

#### **Step 5b. [OPTION 2, if following Step 4b] Download Results Using cluspro-api**

In [None]:
#download cluspro results
job_id = '' #REPLACE WITH JOB ID DISPLAYED ABOVE WHEN SUBMITTING JOB TO API
!cluspro_download {job_id}

#### **Step 6. Results Analysis**
- The docking results will include multiple predicted receptor-ligand complexes ranked by clustering and energy scores.
- Analyze these results to select the most plausible interaction model.
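- Display and download the models produced by the different scoring schemes (for example, the balanced scheme or schemes that favor particular non-covalent interactions such as electrostatic or hydrophobic contacts), as shown in the figure below.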
- Based on the structures in this example, the `cluster score`, and ranking, `Model 1` is the best model for the `EGFR:HER4 complex` and should be used for the next step.

<center><img src="images/fig19_cluspro_job.png" width=700 /></center><br>

#### **Step 7. Viewing the Complex in PyMOL**
1. Open the *Model 1* file downloaded from ClusPro in PyMOL (e.g., `model.002.01.pdb`) <mark>**NOTE:** Should add steps to define filenames and upload directories where files are being created and used. I have tried to add these for steps executed within the notebook, but should guide the user where to upload files for FTMAP and ClusPRO</mark>
2. Analyze the EGFR:HER4 complex model and identify secondary structures (**helices**, **beta sheets**) involved in complex formation. Observe how these elements contribute to stabilizing the interaction. <mark>how?</mark>

<center><img src="images/fig20_egfr_her4_molecule.png" width=400 /></center><br>
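
To make the secondary structure elements easier to spot, the model can be colored by secondary structure. The following is a minimal sketch that follows the same write-a-script pattern used earlier in this activity. The model file name and the script name `2.3_view_complex.pml` are assumptions; replace them with the file you actually downloaded and the location you prefer.

```python
import os

model_file = "model.002.01.pdb"  # assumed file name; replace with your downloaded model
script_path = "pml_scripts/2.3_view_complex.pml"  # hypothetical script name

os.makedirs("pml_scripts", exist_ok=True)  # create the output folder if needed

with open(script_path, "w") as scriptout:
    scriptout.write("# Load the docked complex and show it as a cartoon\n")
    scriptout.write(f"load {model_file}, complex\n")
    scriptout.write("hide everything\n")
    scriptout.write("show cartoon\n")

    scriptout.write("# Assign secondary structure, then color helices and sheets\n")
    scriptout.write("dss\n")
    scriptout.write("color red, ss h\n")
    scriptout.write("color yellow, ss s\n")
    scriptout.write("color gray80, not (ss h or ss s)\n")
    scriptout.write("orient\n")

print(f"Wrote {script_path}")
```

Run the script from the PyMOL GUI in the same way as the preparation scripts. Helices appear in red and beta sheets in yellow, which makes it easier to see which elements sit at the interface between the two proteins.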

------------------------
# Additional Questions
Click through the flashcards below to explore FAQs and associated answers.

In [None]:
#Install python jupytercards library
!pip install jupytercards --quiet

In [None]:
#Imports
from jupytercards import display_flashcards
from IPython.display import Image

In [None]:
display_flashcards('quiz/json/question_flashcards.json')

------------------------
# 📖 **Submodule 2 QUIZ**

In [None]:
#Install python jupyterquiz library
!pip install jupyterquiz --quiet

In [None]:
# Imports
from jupyterquiz import display_quiz
from IPython.display import Image

In [None]:
display_quiz('quiz/json/quiz0.json')

In [None]:
display_quiz('quiz/json/quiz1.json')

In [None]:
display_quiz('quiz/json/quiz2.json')

In [None]:
display_quiz('quiz/json/quiz3.json')

In [None]:
display_quiz('quiz/json/quiz4.json')

In [None]:
display_quiz('quiz/json/quiz5.json')

In [None]:
display_quiz('quiz/json/quiz6.json')

In [None]:
display_quiz('quiz/json/quiz7.json')

In [None]:
display_quiz('quiz/json/quiz8.json')

In [None]:
display_quiz('quiz/json/quiz9.json')

<center><img src="images/quiz_image.jfif" width=600 /></center><br>

In [None]:
display_quiz('quiz/json/quiz10.json')

---------------
## Conclusions
<mark> Provide an overview of the lessons and skills learned from the module. </mark>
## Clean Up
<div class="alert alert-block alert-warning"> <b>Attention:</b> Remember to shut down your VM and delete any relevant resources. </div>