<a href="https://colab.research.google.com/github/etemadism/Courses/blob/main/Peptide_dock_v02.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Peptide-Protein Docking and Interface Analysis (PyRosetta)**

This Jupyter notebook provides a step-by-step computational workflow for predicting and analyzing the binding of a peptide to a protein target with the PyRosetta molecular modeling suite.

The protocol covers:



1.   Environment Setup: Installation and initialization of PyRosetta.
2.   Peptide Docking: Running the FlexPepDock protocol to generate docked peptide structures (decoys).
3.   Interface Analysis (Evaluation): Quantitatively assessing the resulting binding modes using the InterfaceAnalyzerMover to calculate critical binding metrics such as Rosetta binding energy (ΔG), Buried Surface Area (ΔSASA), and Interfacial H-bond energy.
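
As a rough guide to the ΔG and ΔSASA metrics in item 3: the interface energy is conventionally the score of the bound complex minus the score of the separated partners, so more negative values indicate more favorable binding. The sketch below shows only that arithmetic. The function names and all numbers are hypothetical placeholders; InterfaceAnalyzerMover performs the actual separation and scoring internally.

```python
def interface_dg(complex_score: float, separated_score: float) -> float:
    """Bound-state score minus separated-state score (negative = favorable)."""
    return complex_score - separated_score


def dg_per_100_sasa(dg: float, buried_sasa: float) -> float:
    """Normalize the interface energy per 100 square angstroms of buried area."""
    if buried_sasa <= 0:
        raise ValueError("buried_sasa must be positive")
    return 100.0 * dg / buried_sasa


# Placeholder values in Rosetta energy units and square angstroms (not real results)
dg = interface_dg(complex_score=-430.0, separated_score=-405.0)
print(dg)                           # -25.0
print(dg_per_100_sasa(dg, 1250.0))  # -2.0
```

Normalizing by buried area is one way to compare interfaces of different sizes, since a large interface can show a low ΔG simply because it is large.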



**Context**: Designed for educational use in a **Protein Engineering** course by **Ali Etemadi** (Etemadih.uw@gmail.com), Tehran University of Medical Sciences.

## Step 1: Install the PyRosetta installer

In [None]:
!pip install pyrosetta-installer
# Install necessary tools
!apt-get install -y wget tar build-essential

Collecting pyrosetta-installer
  Downloading pyrosetta_installer-0.1.2-py3-none-any.whl.metadata (1.6 kB)
Downloading pyrosetta_installer-0.1.2-py3-none-any.whl (3.9 kB)
Installing collected packages: pyrosetta-installer
Successfully installed pyrosetta-installer-0.1.2
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
build-essential is already the newest version (12.9ubuntu3).
tar is already the newest version (1.34+dfsg-1ubuntu0.1.22.04.2).
wget is already the newest version (1.21.2-2ubuntu1.1).
0 upgraded, 0 newly installed, 0 to remove and 38 not upgraded.


## Step 2: Use the installer to download and install PyRosetta

In [None]:
import pyrosetta_installer

# This will download the appropriate PyRosetta wheel and install it
pyrosetta_installer.install_pyrosetta()


Installing PyRosetta:
 os: ubuntu
 type: Release
 Rosetta C++ extras: 
 mirror: https://west.rosettacommons.org/pyrosetta/release/release
 extra packages: numpy

PyRosetta wheel url: https://:@west.rosettacommons.org/pyrosetta/release/release/PyRosetta4.Release.python312.ubuntu.wheel/pyrosetta-2025.41+release.de3cc17d50-cp312-cp312-linux_x86_64.whl
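
Because the wheel download is large and can fail partway, it may be worth confirming that the package can be located before moving on. The following is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper, not part of `pyrosetta_installer`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("pyrosetta"):
    print("PyRosetta is available; continue to Step 3.")
else:
    print("PyRosetta not found; re-run the install cell above.")
```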


## Step 3: Import PyRosetta and initialize

In [None]:
import pyrosetta
from pyrosetta import rosetta

pyrosetta.init(options="-mute all -ignore_unrecognized_res")


┌───────────────────────────────────────────────────────────────────────────────┐
│                                  PyRosetta-4                                  │
│               Created in JHU by Sergey Lyskov and PyRosetta Team              │
│               (C) Copyright Rosetta Commons Member Institutions               │
│                                                                               │
│ NOTE: USE OF PyRosetta FOR COMMERCIAL PURPOSES REQUIRES PURCHASE OF A LICENSE │
│          See LICENSE.PyRosetta.md or email license@uw.edu for details         │
└───────────────────────────────────────────────────────────────────────────────┘
PyRosetta-4 2025 [Rosetta PyRosetta4.Release.python312.ubuntu 2025.41+release.de3cc17d509259e29147a2ed8f2a726d644e7e34 2025-10-06T16:25:46] retrieved from: http://www.pyrosetta.org
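
This notebook calls `init` in several cells with overlapping flag strings. A small helper such as the one below can keep the shared flags in one place. It is only a sketch: `build_options` and `COMMON_FLAGS` are hypothetical names, not part of PyRosetta, and the resulting string would be passed as `pyrosetta.init(options=...)`.

```python
# Flags shared by every init call in this notebook
COMMON_FLAGS = ["-mute all", "-ignore_unrecognized_res"]


def build_options(*extra_flags: str) -> str:
    """Join the shared flags and any cell-specific flags into one option string."""
    return " ".join([*COMMON_FLAGS, *extra_flags])


print(build_options("-ex1", "-ex2", "-use_input_sc"))
# -mute all -ignore_unrecognized_res -ex1 -ex2 -use_input_sc
```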


## Step 4: Create directories

In [None]:
import os
import glob

# Create directories
os.makedirs("step_1_peptiderive/input", exist_ok=True)
os.makedirs("step_1_peptiderive/output", exist_ok=True)
os.makedirs("step_2_flexpepdock/without_server/output_files", exist_ok=True)
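
The later steps write to `step_2_flexpepdock/results/...` and `step_3_interface_analysis`, which the cell above does not create. One possible way to declare the whole layout up front is sketched below with `pathlib`. `make_workspace` and `SUBDIRS` are hypothetical names, and the demo runs in a temporary directory rather than `/content`.

```python
import tempfile
from pathlib import Path

# Directory layout used across the notebook (relative to a chosen root)
SUBDIRS = [
    "step_1_peptiderive/input",
    "step_1_peptiderive/output",
    "step_2_flexpepdock/results/abinitio",
    "step_2_flexpepdock/results/flexpepdock",
    "step_3_interface_analysis",
]


def make_workspace(root) -> list:
    """Create every subdirectory under root; safe to call repeatedly."""
    created = []
    for sub in SUBDIRS:
        path = Path(root) / sub
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Demo in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    for path in make_workspace(tmp):
        print(path.relative_to(tmp))
```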

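## Step 5: FlexPepDock

The two cells below run `FlexPepDockingProtocol` on the same input complex. The first initializes PyRosetta with `-flexPepDocking:lowres_abinitio`; the second omits that flag. Each cell generates `num_decoys` decoys, scores them with ref2015, and writes the PDB files and a tab-separated score file to its own output directory.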

In [None]:
import os
from pyrosetta import init, pose_from_pdb
from pyrosetta.rosetta.protocols.flexpep_docking import FlexPepDockingProtocol
from pyrosetta.rosetta.core.scoring import ScoreFunctionFactory

# --- Configuration ---
pdb_file_path = "/content/step_1_peptiderive/input/no_interface.pdb"
output_dir = "/content/step_2_flexpepdock/results/abinitio"
os.makedirs(output_dir, exist_ok=True)

scorefile_path = os.path.join(output_dir, "flexpepdock_scores.sc")
num_decoys = 2

# --- Initialize PyRosetta ---
init("-mute all -ignore_unrecognized_res -ex1 -ex2 -use_input_sc -flexPepDocking:lowres_abinitio -out:file:scorefile -out:pdb")

# Load Pose
pose = pose_from_pdb(pdb_file_path)
print(f"✅ Loaded complex pose with {pose.num_chains()} chains.")

# Initialize Protocol
protocol = FlexPepDockingProtocol()

# Prepare scoring function
scorefxn = ScoreFunctionFactory.create_score_function("ref2015")

# Run docking for multiple decoys
with open(scorefile_path, "w") as f:
    f.write("description\tscore\n")  # header line

    for i in range(num_decoys):
        test_pose = pose.clone()
        print(f"⚙️ Running docking decoy {i+1}/{num_decoys}...")
        protocol.apply(test_pose)

        # Score and save
        score = scorefxn(test_pose)
        pdb_out = os.path.join(output_dir, f"docked_{i+1}_abinitio.pdb")
        test_pose.dump_pdb(pdb_out)
        f.write(f"docked_{i+1}\t{score:.3f}\n")
        print(f"✅ Decoy {i+1} done (score: {score:.3f})")

print("\n✅ All decoys complete.")
print(f"Results saved to: {output_dir}")


✅ Loaded complex pose with 2 chains.
⚙️ Running docking decoy 1/2...
✅ Decoy 1 done (score: -433.345)
⚙️ Running docking decoy 2/2...
✅ Decoy 2 done (score: -415.5
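
Before spending time on docking, it can help to confirm that the input file contains the two chains (receptor and peptide) that the `num_chains()` printout reports. The sketch below does a plain-text version of that check by reading the chain ID column (column 22) of ATOM/HETATM records. `chain_ids` is a hypothetical helper, the sample records are truncated placeholders without coordinates, and Rosetta's own chain count can differ from this simple tally.

```python
def chain_ids(pdb_text: str) -> list:
    """Return chain IDs in order of first appearance in ATOM/HETATM records."""
    seen = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain = line[21]  # PDB column 22 holds the chain identifier
            if chain not in seen:
                seen.append(chain)
    return seen


# Truncated placeholder records: two residues in chain A, one in chain B
sample = "\n".join(
    f"ATOM  {serial:>5}  N   GLY {chain}{serial:>4}"
    for serial, chain in [(1, "A"), (2, "A"), (3, "B")]
)

chains = chain_ids(sample)
print(f"Found {len(chains)} chain(s): {chains}")
# Found 2 chain(s): ['A', 'B']
```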

In [None]:
import os
from pyrosetta import init, pose_from_pdb
from pyrosetta.rosetta.protocols.flexpep_docking import FlexPepDockingProtocol
from pyrosetta.rosetta.core.scoring import ScoreFunctionFactory

# --- Configuration ---
pdb_file_path = "/content/step_1_peptiderive/input/no_interface.pdb"
output_dir = "/content/step_2_flexpepdock/results/flexpepdock"
os.makedirs(output_dir, exist_ok=True)

scorefile_path = os.path.join(output_dir, "flexpepdock_scores.sc")
num_decoys = 2

# --- Initialize PyRosetta ---
init("-mute all -ignore_unrecognized_res -ex1 -ex2 -use_input_sc -flexPepDocking -out:file:scorefile -out:pdb")

# Load Pose
pose = pose_from_pdb(pdb_file_path)
print(f"✅ Loaded complex pose with {pose.num_chains()} chains.")

# Initialize Protocol
protocol = FlexPepDockingProtocol()

# Prepare scoring function
scorefxn = ScoreFunctionFactory.create_score_function("ref2015")

# Run docking for multiple decoys
with open(scorefile_path, "w") as f:
    f.write("description\tscore\n")  # header line

    for i in range(num_decoys):
        test_pose = pose.clone()
        print(f"⚙️ Running docking decoy {i+1}/{num_decoys}...")
        protocol.apply(test_pose)

        # Score and save
        score = scorefxn(test_pose)
        pdb_out = os.path.join(output_dir, f"docked_{i+1}_flexpepdock.pdb")
        test_pose.dump_pdb(pdb_out)
        f.write(f"docked_{i+1}\t{score:.3f}\n")
        print(f"✅ Decoy {i+1} done (score: {score:.3f})")

print("\n✅ All decoys complete.")
print(f"Results saved to: {output_dir}")


✅ Loaded complex pose with 2 chains.
⚙️ Running docking decoy 1/2...
✅ Decoy 1 done (score: -410.158)
⚙️ Running docking decoy 2/2...
✅ Decoy 2 done (score: -410.1
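
Both docking cells write a tab-separated score file with a `description` and a `score` column. A minimal sketch for ranking decoys from such a file follows; `rank_decoys` is a hypothetical helper and the example scores are placeholders, not results from this notebook. With only two decoys a ranking carries little weight, and production docking runs usually generate far more.

```python
import csv
import io


def rank_decoys(scorefile_text: str) -> list:
    """Parse a tab-separated score file and return (name, score) pairs, lowest score first."""
    reader = csv.DictReader(io.StringIO(scorefile_text), delimiter="\t")
    rows = [(row["description"], float(row["score"])) for row in reader]
    return sorted(rows, key=lambda item: item[1])


# Placeholder content in the same layout as flexpepdock_scores.sc
example = (
    "description\tscore\n"
    "docked_1\t-101.500\n"
    "docked_2\t-98.250\n"
    "docked_3\t-104.000\n"
)

for name, score in rank_decoys(example):
    print(f"{name}\t{score:.3f}")
```

For a real run, the text would come from reading `flexpepdock_scores.sc` in the relevant output directory.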

## Step 6: Interface Analysis

In [None]:
!mkdir -p step_3_interface_analysis
!cp /content/step_2_flexpepdock/results/abinitio/*pdb step_3_interface_analysis/
!cp /content/step_2_flexpepdock/results/flexpepdock/*pdb step_3_interface_analysis/
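
The same collection step can be written in Python, which makes it easy to re-run and to see exactly which files were gathered. This is a sketch under stated assumptions: `collect_pdbs` is a hypothetical helper, and the demo builds stand-in files in a temporary directory instead of using the real docking output.

```python
import shutil
import tempfile
from pathlib import Path


def collect_pdbs(source_dirs, dest_dir) -> list:
    """Copy every *.pdb file from each source directory into dest_dir; return the copied names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in source_dirs:
        for pdb in sorted(Path(src).glob("*.pdb")):
            shutil.copy2(pdb, dest / pdb.name)  # overwrites an existing copy, like cp
            copied.append(pdb.name)
    return copied


# Demo with stand-in files named like the docking output
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sources = []
    for run in ("abinitio", "flexpepdock"):
        run_dir = root / "results" / run
        run_dir.mkdir(parents=True)
        (run_dir / f"docked_1_{run}.pdb").write_text("END\n")
        sources.append(run_dir)
    copied = collect_pdbs(sources, root / "step_3_interface_analysis")
    print(copied)
    # ['docked_1_abinitio.pdb', 'docked_1_flexpepdock.pdb']
```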

In [None]:

import os
import glob
from pyrosetta import init, pose_from_pdb
from pyrosetta.rosetta.core.scoring import ScoreFunctionFactory
from pyrosetta.rosetta.protocols.analysis import InterfaceAnalyzerMover

# --- Configuration ---
INPUT_DIR = "/content/step_3_interface_analysis"
OUTPUT_FILE = "interface_analysis_summary.csv"

# --- Initialize PyRosetta ---
# Initialize with necessary flags (same as the previous script for consistency)
init("-mute all -ignore_unrecognized_res -ex1 -ex2 -use_input_sc")

# Prepare scoring function (ref2015 is standard for interface analysis)
scorefxn = ScoreFunctionFactory.create_score_function("ref2015")

# Initialize InterfaceAnalyzerMover
# We initialize this once outside the loop for efficiency
# Jump ID = 1 (Standard for a two-chain protein-peptide complex in Rosetta)
interface_analyzer = InterfaceAnalyzerMover(
    1,       # interface jump number
    False,   # tracer (False to print to stdout/file)
    scorefxn, # Use the ref2015 score function
    False,   # compute_packstat (optional, can be True)
    True,    # pack_input (Repack the complex interface before scoring)
    True     # pack_separated (Repack separated monomers before scoring)
)

print(f"🔬 Starting batch interface analysis on PDB files in: {INPUT_DIR}")
print(f"📝 Results will be saved to: {OUTPUT_FILE}\n")

# Check that the input directory exists
if not os.path.isdir(INPUT_DIR):
    raise FileNotFoundError(f"Input directory not found: {INPUT_DIR}")

# Find all PDB files (sorted for a reproducible processing order)
pdb_files = sorted(glob.glob(os.path.join(INPUT_DIR, "*.pdb")))

if not pdb_files:
    raise FileNotFoundError(f"No PDB files found in {INPUT_DIR}")

# --- Analysis Loop and Reporting ---
results = []
header = ["PDB_Name", "Interface_dG_RU", "Buried_SASA_A2", "Hbond_Energy_RU", "Num_Interface_Residues"]
results.append(",".join(header))

for pdb_path in pdb_files:
    pdb_name = os.path.basename(pdb_path)

    try:
        # 1. Load the pose (skip paths that are not regular files)
        if not os.path.isfile(pdb_path):
            print(f"❌ Could not load {pdb_name}: file not found.")
            continue

        pose = pose_from_pdb(pdb_path)
        print(f"⚙️ Analyzing {pdb_name}...")

        # 2. Apply the mover to the pose
        interface_analyzer.apply(pose)

        # 3. Extract metrics
        interface_dG = interface_analyzer.get_interface_dG()
        delta_sasa = interface_analyzer.get_interface_delta_sasa()
        total_Hbond_E = interface_analyzer.get_total_Hbond_E()
        num_res_interface = interface_analyzer.get_num_interface_residues()

        # --- Print to Console (Integer Output) ---
        print("=========================================")
        print(f"Interface Analysis Results for {pdb_name}:")
        print("=========================================")
        # int() truncates the float metrics toward zero for a compact console view; the CSV keeps three decimals
        print(f"Rosetta Interface dG (ref2015): \t{int(interface_dG)} R.U.")
        print(f"Buried Surface Area (dSASA): \t{int(delta_sasa)} A^2")
        print(f"Interface H-bond Energy: \t\t{int(total_Hbond_E)} R.U.")
        print(f"Number of Interface Residues: \t{num_res_interface}")
        print("=========================================\n")

        # --- Collect results for CSV (Using float precision for storage) ---
        row = [
            pdb_name,
            f"{interface_dG:.3f}",
            f"{delta_sasa:.3f}",
            f"{total_Hbond_E:.3f}",
            str(num_res_interface)
        ]
        results.append(",".join(row))

    except Exception as e:
        print(f"❌ Failed to process {pdb_name}. Error: {e}\n")

# --- Write final summary file ---
with open(OUTPUT_FILE, "w") as f:
    f.write("\n".join(results))

print(f"✅ Batch analysis complete! Summary saved to {OUTPUT_FILE}")


🔬 Starting batch interface analysis on PDB files in: /content/step_3_interface_analysis
📝 Results will be saved to: interface_analysis_summary.csv

⚙️ Analyzing do
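
Once `interface_analysis_summary.csv` has been written, it can be loaded back for comparison across models. The sketch below parses the columns named in the header above into typed records and prints them with one decimal place, since the `int()` cast in the console report truncates toward zero (for example, `int(-12.9)` gives `-12`). `InterfaceRow` and `load_summary` are hypothetical names, and the rows are placeholder values, not results from this notebook.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class InterfaceRow:
    name: str
    dg: float          # Interface_dG_RU
    sasa: float        # Buried_SASA_A2
    hbond: float       # Hbond_Energy_RU
    n_residues: int    # Num_Interface_Residues


def load_summary(csv_text: str) -> list:
    """Parse the summary CSV text into a list of InterfaceRow records."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        rows.append(InterfaceRow(
            name=rec["PDB_Name"],
            dg=float(rec["Interface_dG_RU"]),
            sasa=float(rec["Buried_SASA_A2"]),
            hbond=float(rec["Hbond_Energy_RU"]),
            n_residues=int(rec["Num_Interface_Residues"]),
        ))
    return rows


# Placeholder rows in the same layout as the summary file
example_csv = (
    "PDB_Name,Interface_dG_RU,Buried_SASA_A2,Hbond_Energy_RU,Num_Interface_Residues\n"
    "model_a.pdb,-12.900,850.400,-3.750,21\n"
    "model_b.pdb,-15.200,910.000,-4.100,24\n"
)

# Most favorable (lowest) interface energy first
for row in sorted(load_summary(example_csv), key=lambda r: r.dg):
    print(f"{row.name}: dG={row.dg:.1f} R.U., dSASA={row.sasa:.1f} A^2, "
          f"interface residues={row.n_residues}")
```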

In [None]:
from pyrosetta import *
from pyrosetta.rosetta import *
import urllib.request
import os
import shutil

# -------------------------------
# Get user input for PDB ID and chains
# -------------------------------
pdb_id = input("Please enter the PDB ID (e.g., 1nvu): ").strip()
chains_input = input("Please enter the chain letters to keep, separated by commas (e.g., R,S): ")

# Convert the input string of chains into a list, ignoring empty entries
chains_to_keep = [chain.strip() for chain in chains_input.split(',') if chain.strip()]

# Define the destination path for the input files
destination_path = "step_1_peptiderive/input/"
os.makedirs(destination_path, exist_ok=True) # Ensure the directory exists

# -------------------------------
# Download and load structure
# -------------------------------
pdb_file = f"{pdb_id}.pdb"
urllib.request.urlretrieve(f"https://files.rcsb.org/download/{pdb_id}.pdb", pdb_file)
print(f"✅ Downloaded {pdb_id}.pdb")

# Copy the original downloaded PDB to the input directory
original_pdb_filepath = os.path.join(destination_path, pdb_file)
shutil.copy(pdb_file, original_pdb_filepath)
print(f"✅ Copied original PDB to {original_pdb_filepath}")


# -------------------------------
# Initialize PyRosetta, ignoring unrecognized residues
# -------------------------------
init(options="-ignore_unrecognized_res true")

pose = pose_from_pdb(pdb_file)
pose_clean = core.pose.Pose()

# -------------------------------
# Extract and combine desired chains
# -------------------------------
for chain_letter in chains_to_keep:
    chain_id = -1
    for i in range(1, pose.num_chains() + 1):
        if pose.pdb_info().chain(pose.chain_begin(i)) == chain_letter:
            chain_id = i
            break

    if chain_id != -1:
        chain_pose = core.pose.Pose()
        core.pose.append_subpose_to_pose(chain_pose, pose, pose.chain_begin(chain_id), pose.chain_end(chain_id))

        if pose_clean.total_residue() == 0:
            pose_clean = chain_pose
        else:
            core.pose.append_pose_to_pose(pose_clean, chain_pose, new_chain=True)
    else:
        print(f"❗Warning: Chain {chain_letter} not found in the PDB.")


# -------------------------------
# Remove non-protein residues from the combined pose
# -------------------------------
# Removes ligands such as GTP and any other non-protein residues
if pose_clean.total_residue() > 0:
    core.pose.remove_nonprotein_residues(pose_clean)
else:
    print("❗Warning: No protein chains were kept, cannot remove non-protein residues.")


# -------------------------------
# Save cleaned complex directly to the destination path
# -------------------------------
if pose_clean.total_residue() > 0:
    clean_pdb_filename = f"{pdb_id}_{'_'.join(chains_to_keep)}_clean.pdb"
    clean_pdb_filepath = os.path.join(destination_path, clean_pdb_filename)
    pose_clean.dump_pdb(clean_pdb_filepath)
    print(f"✅ Clean complex saved as {clean_pdb_filepath}")
else:
    print("❗Warning: No pose was created, skipping save step.")

Please enter the PDB ID (e.g., 1nvu): 1nvu
Please enter the chain letters to keep, separated by commas (e.g., R,S): R,S
✅ Downloaded 1nvu.pdb
✅ Copied original PDB to step_1_peptiderive/input/1nvu.pdb
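
Since the PDB ID and chain letters are typed in by hand, a small validation step can catch typos before the download is attempted. The sketch below assumes classic four-character PDB IDs that start with a digit and single-character chain IDs; `parse_pdb_id` and `parse_chains` are hypothetical helpers, and extended-format IDs are not handled.

```python
import re

# Classic PDB IDs: one digit followed by three alphanumeric characters (assumption)
PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def parse_pdb_id(raw: str) -> str:
    """Strip whitespace, lowercase, and check the four-character ID format."""
    pdb_id = raw.strip().lower()
    if not PDB_ID_PATTERN.fullmatch(pdb_id):
        raise ValueError(f"Not a four-character PDB ID: {raw!r}")
    return pdb_id


def parse_chains(raw: str) -> list:
    """Split a comma-separated chain list, dropping blanks and rejecting multi-character entries."""
    chains = [part.strip() for part in raw.split(",") if part.strip()]
    invalid = [chain for chain in chains if len(chain) != 1]
    if not chains or invalid:
        raise ValueError(f"Expected single-letter chains separated by commas, got: {raw!r}")
    return chains


print(parse_pdb_id(" 1NVU "))   # 1nvu
print(parse_chains("R, S"))     # ['R', 'S']
```

Chain IDs are left case-sensitive on purpose, because PDB files can use both upper- and lowercase chain letters.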