<a href="https://colab.research.google.com/github/timmonspatrick/CECAM-Binding-Workshop-2025/blob/main/notebooks/0_Welcome.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Workshop: Generating Protein-Binding Peptides**

### Objective
We’ll walk through an end-to-end workflow for designing peptides that bind to the cancer-related protein MDM2.

### Learning Goals
*    Understand how to evaluate a protein as a binding target.
*    Generate novel peptide candidates with modern AI methods.
*    Predict and evaluate peptide–protein binding with Boltz-2.
*    Rank candidates based on structural confidence metrics.

## 🧬 MDM2 as a Protein Target
The MDM2 protein (Mouse Double Minute 2 homolog) is an E3 ubiquitin ligase that plays a central role in regulating the tumor suppressor p53. Under normal conditions, MDM2 binds to p53 and tags it for degradation, keeping p53 levels low.

This interaction is crucial because p53 acts as a “guardian of the genome”, triggering DNA repair, cell cycle arrest, or apoptosis in response to stress or DNA damage. MDM2 is overexpressed in many cancers; the excess MDM2 binds and inhibits p53 too strongly, silencing this defense mechanism.

**Model System**

The p53–MDM2 interaction is well studied and one of the most famous examples of a peptide–protein interaction in cancer biology.
*   MDM2 regulates p53, which controls cell cycle and apoptosis.
*   Overactive MDM2 suppresses p53 → linked to cancer.
*   MDM2 binds a short peptide from p53 (residues 17–29) in a clearly defined hydrophobic pocket.

The MDM2 binding domain is relatively small (~110 amino acids), making it computationally manageable for a short workshop!

There are many experimental structures available (e.g. PDB: 1YCR), providing reliable references for validation.

**Forget most of that!**

We'll assume that no experimental structures exist and that the binding pocket is unknown.

We will work with three assumptions:

*   We want to target this protein because it is suspected to bind the p53 peptide.
*   We have a UniProt ID for this protein: Q00987.
*   UniProt will tell us which domains are necessary for the p53 interaction.


# **Workshop Outline**
Protein (MDM2) → Pocket Analysis (AF2Bind) →
Peptide Generation (RFdiffusion, ProteinMPNN, ESM-IF1) →
Peptide–Protein Cofolding (Boltz-2) → Evaluation & Ranking

## [1. Protein Setup & Visualization (15 min)](https://colab.research.google.com/github/timmonspatrick/CECAM-Binding-Workshop-2025/blob/main/notebooks/1_Protein_Setup_Visualization.ipynb)
*   Retrieve MDM2 sequence (UniProt).
*   Live fold with Boltz-2.
*   Visualize structure in py3Dmol.
*   Select binding domain + p53 pocket.
*   Fold just the selected substructure.
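
The first and fourth steps above can be sketched in plain Python. This is a minimal illustration, not the workshop notebook's code: it assumes the UniProt REST endpoint `https://rest.uniprot.org/uniprotkb/<accession>.fasta`, and the helper names (`parse_fasta`, `fetch_uniprot_sequence`, `slice_domain`) are hypothetical. The domain boundaries you pass to `slice_domain` should come from the UniProt annotations, not from this example.

```python
from urllib.request import urlopen

# Assumed URL pattern for downloading a single UniProt entry as FASTA.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here; keep only the first
            header = line[1:]
        elif header is not None:
            chunks.append(line)
    if header is None:
        raise ValueError("no FASTA header found")
    return header, "".join(chunks)


def fetch_uniprot_sequence(accession, timeout=30):
    """Download one UniProt entry as FASTA (needs network access)."""
    url = UNIPROT_FASTA_URL.format(accession=accession)
    with urlopen(url, timeout=timeout) as response:
        return parse_fasta(response.read().decode("utf-8"))


def slice_domain(sequence, start, end):
    """Cut residues start..end (1-indexed, inclusive) out of a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("residue range outside the sequence")
    return sequence[start - 1:end]
```

In a notebook with internet access, `header, seq = fetch_uniprot_sequence("Q00987")` would retrieve the full-length sequence, and `slice_domain(seq, start, end)` would then give the substructure to fold on its own.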

## [2. Target Feasibility: AF2Bind (15 min)](https://colab.research.google.com/github/timmonspatrick/CECAM-Binding-Workshop-2025/blob/main/notebooks/2_Target_Feasibility.ipynb)
*   Run AF2Bind.
*   Assess the feasibility of the protein as a binding target.
*   Identify and visualize hotspot residues.

## [3. Generating Candidate Peptide Backbones (15 min)](https://colab.research.google.com/github/timmonspatrick/CECAM-Binding-Workshop-2025/blob/main/notebooks/3_Backbone_Generation.ipynb)
*   Generate peptide backbone candidates with RFdiffusion.

## [4. Generating Candidate Peptide Sequences (15 min)](https://colab.research.google.com/github/timmonspatrick/CECAM-Binding-Workshop-2025/blob/main/notebooks/4_Peptide_Generation.ipynb)
*   Convert backbone → sequences with ProteinMPNN and ESM-IF1.

## [5. Co-Folding & Evaluation (15 min)](https://colab.research.google.com/github/timmonspatrick/CECAM-Binding-Workshop-2025/blob/main/notebooks/5_CoFolding_Evaluation.ipynb)
*   Co-fold MDM2 + peptide with Boltz-2 (~10 sec each).
*   Extract metrics:
    -   Number of contacts (peptide ↔ hotspot).
    -   iPAE values (peptide ↔ hotspot).
*   Rank peptides by metrics.
*   Compare to known p53 peptide binding pose.
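
As a rough sketch of the contact-counting and ranking steps above: the 4.0 Å cutoff, the dictionary keys, the function names, and the sort order (most contacts first, ties broken by lowest iPAE) are all assumptions made for illustration, not the workshop's settings. Coordinates are plain `(x, y, z)` tuples, and every number in the demo is an invented toy value.

```python
import math


def count_contacts(peptide_atoms, hotspot_atoms, cutoff=4.0):
    """Count peptide-hotspot atom pairs closer than `cutoff` angstroms."""
    return sum(
        1
        for p in peptide_atoms
        for h in hotspot_atoms
        if math.dist(p, h) < cutoff
    )


def rank_candidates(candidates):
    """Sort candidates: most contacts first, ties broken by lowest iPAE."""
    return sorted(candidates, key=lambda c: (-c["contacts"], c["ipae"]))


# Toy coordinates, invented purely to exercise the function.
toy_peptide = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
toy_hotspot = [(3.0, 0.0, 0.0), (0.0, 3.5, 0.0), (20.0, 0.0, 0.0)]
toy_contacts = count_contacts(toy_peptide, toy_hotspot)

# Toy metric values, not real predictions.
toy_candidates = [
    {"name": "pep_a", "contacts": 12, "ipae": 6.5},
    {"name": "pep_b", "contacts": 18, "ipae": 4.0},
    {"name": "pep_c", "contacts": 18, "ipae": 3.2},
]
ranked_names = [c["name"] for c in rank_candidates(toy_candidates)]
```

Counting atom pairs rewards peptides that pack closely against the hotspot; counting residue pairs instead is an equally reasonable choice and only changes how the coordinate lists are built.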

# Tool List
*    UniProt: database of protein sequences and domain annotations.
*    Boltz-2: fast structure/cofold predictor.
*    AF2Bind: finds binding hotspots in proteins.
*    RFdiffusion: generates new backbones conditioned on binding pockets.
*    ProteinMPNN / ESM-IF1: convert backbones into sequences.
*    py3Dmol: visualize structures directly in Colab.

# Metrics to Pay Attention To
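*    pLDDT → per-residue confidence in local structure (higher is better).
*    iPAE → predicted aligned error across the interface, i.e. confidence in the relative positions of protein vs peptide (lower is better).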
*    Contacts → number of close interactions between peptide & hotspot residues.
*    Hotspot-based scoring → focusing only on critical residues.
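
To make the iPAE and hotspot-based scoring ideas concrete, here is a minimal sketch. It assumes the PAE matrix is already loaded as a square nested list indexed by residue position in the complex, and it averages both off-diagonal blocks (peptide → hotspot and hotspot → peptide). Definitions of iPAE vary between tools, so treat this averaging scheme and the name `interface_pae` as assumptions.

```python
def interface_pae(pae, peptide_idx, hotspot_idx):
    """Mean PAE over both blocks linking peptide and hotspot residues.

    pae         -- square nested list, pae[i][j] in angstroms
    peptide_idx -- 0-based row/column indices of the peptide residues
    hotspot_idx -- 0-based row/column indices of the hotspot residues
    """
    values = []
    for i in peptide_idx:
        for j in hotspot_idx:
            values.append(pae[i][j])  # error of hotspot j when aligned on peptide i
            values.append(pae[j][i])  # error of peptide i when aligned on hotspot j
    if not values:
        raise ValueError("need at least one peptide and one hotspot index")
    return sum(values) / len(values)


# Toy 3x3 matrix with invented values: residues 0-1 are "hotspot", residue 2 is "peptide".
toy_pae = [
    [0.0, 2.0, 8.0],
    [4.0, 0.0, 6.0],
    [10.0, 12.0, 0.0],
]
toy_ipae = interface_pae(toy_pae, peptide_idx=[2], hotspot_idx=[0, 1])
```

Restricting the average to hotspot residues, rather than the whole protein, keeps the score focused on the pocket of interest and avoids diluting it with distant regions.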
