![Macrocycle Hero](assets/macrocycle_hero.png)

# 💍 Macrocycle Design Lab: Engineering Cyclic Peptides

**Objective**: Explore how `synth-pdb` generates random cyclic peptides and visualize the "closure" of the macrocycle via physics-based minimization.

### 💊 Why Macrocycles?
Macrocycles (cyclic peptides) are the "Goldilocks" of drug discovery: larger than small molecules but smaller than proteins, they can bind difficult targets while remaining stable in the body. Compared with linear peptides, cyclization can:

- **Improve Stability**: Protect the peptide from degradation by proteases in the body.
- **Increase Binding Affinity**: Reduce the "entropy penalty" of binding by pre-shaping the peptide to match its target.
- **Cross Membranes**: Some cyclic peptides (like Cyclosporine A) can enter cells more easily than linear ones.

Training AI models (like AlphaFold-3) on cyclic peptides is difficult because such structures are rare in the Protein Data Bank (PDB). `synth-pdb` lets you generate millions of correctly closed macrocycles to help train more robust models.

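Two well-known examples:

- **Cyclosporine A**: A famous immunosuppressant that is an 11-mer cyclic peptide.
- **Oxytocin**: The "love hormone" is a 9-mer peptide whose ring is closed by a disulfide bond rather than a head-to-tail peptide bond.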

### 🏗️ The Engineering Challenge
How do you "close the ring"? In this lab, we use **Forcefield Minimization** to pull the N-terminus and C-terminus together into a physically realistic bond.

In [None]:
# @title Setup & Installation { display-mode: "form" }
import os
import sys
from pathlib import Path

# Ensure the local synth_pdb source code is prioritized if running from the repo
try:
    current_path = Path(".").resolve()
    repo_root = current_path.parent.parent 
    if (repo_root / "synth_pdb").exists():
        if str(repo_root) not in sys.path:
            sys.path.insert(0, str(repo_root))
            print(f"📌 Added local library to path: {repo_root}")
except Exception:
    pass

if 'google.colab' in str(get_ipython()):
    if not os.path.exists("installed.marker"):
        print("Running on Google Colab. Installing dependencies...")
        get_ipython().run_line_magic('pip', 'install synth-pdb py3Dmol')
        
        with open("installed.marker", "w") as f:
            f.write("done")
        
        print("🔄 Installation complete. KERNEL RESTARTING AUTOMATICALLY...")
        print("⚠️ Please wait 10 seconds, then Run All Cells again.")
        os.kill(os.getpid(), 9)
    else:
        print("✅ Dependencies Ready.")
else:
    import synth_pdb
    print(f"✅ Running locally. Using synth-pdb version: {synth_pdb.__version__} from {synth_pdb.__file__}")

In [None]:
import numpy as np
import py3Dmol
from synth_pdb.generator import generate_pdb_content

def center_pdb(pdb_str):
    """Translate ATOM coordinates so the bounding-box midpoint sits at the origin."""
    lines = pdb_str.splitlines()
    coords = []
    for line in lines:
        if line.startswith("ATOM"):
            coords.append([float(line[30:38]), float(line[38:46]), float(line[46:54])])
    if not coords:
        return pdb_str
    coords = np.array(coords)
    # Midpoint of the bounding box (not the mean of the atom positions);
    # this keeps the structure framed in the viewer.
    center = (coords.min(axis=0) + coords.max(axis=0)) / 2
    new_lines = []
    for line in lines:
        if line.startswith("ATOM"):
            x = float(line[30:38]) - center[0]
            y = float(line[38:46]) - center[1]
            z = float(line[46:54]) - center[2]
            # Rewrite only the fixed-width coordinate columns (31-54)
            new_lines.append(line[:30] + f"{x:>8.3f}{y:>8.3f}{z:>8.3f}" + line[54:])
        else:
            new_lines.append(line)
    return "\n".join(new_lines)

print("Libraries Loaded.")

## 1. Generating a Macrocycle

We use the `cyclic=True` flag to signal the generator to produce a head-to-tail bond. However, simply placing atoms in space isn't enough: the N-terminus and C-terminus might be far apart.

To solve this, we use **Physics-Based Minimization** (`minimize_energy=True`) powered by OpenMM. This pulls the termini together into a physically plausible bond.

In [None]:
sequence = "TRP-SER-GLY-VAL-VAL-ASN-GLY-SER" # An example 8-mer

print("Generating Linear Control...")
linear_pdb = generate_pdb_content(sequence_str=sequence, cyclic=False, minimize_energy=True)

print("Generating Cyclic Macrocycle (Minimized)...")
cyclic_pdb = generate_pdb_content(sequence_str=sequence, cyclic=True, minimize_energy=True)

print("Generation Complete.")
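
One quick sanity check on the result is to measure how far the C-terminal carbon ends up from the N-terminal nitrogen. The helper below is a minimal sketch and not part of `synth-pdb`: `closure_distance` is a hypothetical name, and it assumes the output uses standard fixed-column `ATOM` records with backbone atoms named `N` and `C`.

```python
import math


def parse_atoms(pdb_str):
    """Return (res_seq, atom_name, (x, y, z)) for every ATOM record."""
    atoms = []
    for line in pdb_str.splitlines():
        if line.startswith("ATOM"):
            name = line[12:16].strip()
            res_seq = int(line[22:26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((res_seq, name, xyz))
    return atoms


def closure_distance(pdb_str):
    """Distance in angstroms between N of the first residue and C of the last."""
    atoms = parse_atoms(pdb_str)
    if not atoms:
        raise ValueError("no ATOM records found")
    first = min(res for res, _, _ in atoms)
    last = max(res for res, _, _ in atoms)
    # Assumption: backbone atoms carry the standard PDB names "N" and "C"
    n_xyz = next(xyz for res, name, xyz in atoms if res == first and name == "N")
    c_xyz = next(xyz for res, name, xyz in atoms if res == last and name == "C")
    return math.dist(n_xyz, c_xyz)
```

Calling `closure_distance(cyclic_pdb)` and `closure_distance(linear_pdb)` should make the difference visible: a closed ring is expected to land near the typical peptide bond length of about 1.33 Å, while the linear control is usually much farther apart.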

## 2. Visual Comparison: Linear vs. Cyclic

Let's visualize both structures and compare their **"Global Topology"**. In the cyclic version, you should see a continuous loop where the first and last residues are bonded, while the linear peptide remains a flexible string.

In [None]:
def view_structures(pdb1, title1, pdb2, title2):
    view = py3Dmol.view(width=800, height=400, linked=False, viewergrid=(1, 2))
    view.setBackgroundColor("#fdfdfd")
    
    # Centering proteins for a tighter view
    pdb1 = center_pdb(pdb1)
    pdb2 = center_pdb(pdb2)
    
    # Model 1
    view.addModel(pdb1, 'pdb', viewer=(0, 0))
    view.setStyle({'stick': {'radius': 0.15}, 'cartoon': {'color': 'spectrum'}}, viewer=(0, 0))
    view.addLabel(title1, {'position': {'x': 0, 'y': 20, 'z': 0}, 'backgroundColor': 'white', 'fontColor':'black'}, viewer=(0, 0))
    
    # Model 2
    view.addModel(pdb2, 'pdb', viewer=(0, 1))
    view.setStyle({'stick': {'radius': 0.15}, 'cartoon': {'color': 'spectrum'}}, viewer=(0, 1))
    view.addLabel(title2, {'position': {'x': 0, 'y': 20, 'z': 0}, 'backgroundColor': 'white', 'fontColor':'black'}, viewer=(0, 1))
    
    view.zoomTo()
    view.center()
    view.zoom(1.2)
    view.show()

view_structures(linear_pdb, "Linear Peptide", cyclic_pdb, "Cyclic Macrocycle")
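
To put a number on the topology difference seen in the viewer, one option is the radius of gyration of the C-alpha trace. This is a rough sketch rather than anything `synth-pdb` provides: `radius_of_gyration` is a hypothetical helper, it weights every C-alpha equally, and it assumes standard fixed-column `ATOM` records.

```python
import math


def radius_of_gyration(pdb_str):
    """Unweighted radius of gyration (angstroms) over C-alpha atoms."""
    coords = [
        (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        for line in pdb_str.splitlines()
        if line.startswith("ATOM") and line[12:16].strip() == "CA"
    ]
    if not coords:
        raise ValueError("no CA atoms found")
    n = len(coords)
    # Mean position of the C-alpha atoms
    center = [sum(c[i] for c in coords) / n for i in range(3)]
    # Mean squared distance from that center
    msd = sum(
        sum((c[i] - center[i]) ** 2 for i in range(3)) for c in coords
    ) / n
    return math.sqrt(msd)
```

Comparing `radius_of_gyration(linear_pdb)` with `radius_of_gyration(cyclic_pdb)` gives a single compactness value per structure. A ring is generally expected to be more compact than an extended chain of the same length, though a linear peptide that folds back on itself can blur the difference.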

## 3. Atomic Breakdown: The Closure Bond
Look at the `CONECT` records at the end of the PDB. This is how software knows the ring is closed.

In a linear peptide, the N-terminus carries extra hydrogens (or a capping group) and the C-terminus carries an extra oxygen (OXT). In a **cyclic** peptide, these are replaced by a standard peptide bond between the backbone C of the last residue and the backbone N of the first.

Notice in the PDB output of the cyclic peptide that residue 1 is bonded to residue 8.

In [None]:
print("--- Cyclic PDB Footer (CONECT records for loop closure) ---")
lines = cyclic_pdb.splitlines()
conect_lines = [line for line in lines if line.startswith("CONECT")]
# Show the last few CONECT records, where the closure bond is expected
for line in conect_lines[-3:]:
    print(line)
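
Rather than reading the footer by eye, the closure bond can also be looked up programmatically. The sketch below assumes the file follows the standard fixed-width PDB layout (atom serial in columns 7-11, bonded serials in the following 5-character fields of each `CONECT` record); `ring_closure_bonds` is a hypothetical helper, not a `synth-pdb` function.

```python
def ring_closure_bonds(pdb_str):
    """Return CONECT bonds joining the first and last residue.

    Each hit is ((res_seq, atom_name), (res_seq, atom_name)).
    """
    serial_to_atom = {}
    bonds = set()
    for line in pdb_str.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            serial = int(line[6:11])
            serial_to_atom[serial] = (int(line[22:26]), line[12:16].strip())
        elif line.startswith("CONECT"):
            # Columns 7-31 hold the source serial plus up to four partners
            fields = [line[i:i + 5].strip() for i in range(6, 31, 5)]
            serials = [int(f) for f in fields if f]
            for partner in serials[1:]:
                # frozenset makes A-B and B-A count as the same bond
                bonds.add(frozenset((serials[0], partner)))
    if not serial_to_atom:
        return []
    residues = [res for res, _ in serial_to_atom.values()]
    first, last = min(residues), max(residues)
    hits = []
    for bond in bonds:
        if len(bond) != 2:
            continue
        a, b = sorted(bond)
        if a not in serial_to_atom or b not in serial_to_atom:
            continue
        atom_a, atom_b = serial_to_atom[a], serial_to_atom[b]
        if first != last and {atom_a[0], atom_b[0]} == {first, last}:
            hits.append((atom_a, atom_b))
    return sorted(hits)
```

For the 8-mer above, `ring_closure_bonds(cyclic_pdb)` would be expected to report a bond between residue 1 and residue 8 if the closure is written as a `CONECT` record, and an empty list for `linear_pdb`.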

## 4. Scaling the Lab: Random Macrocycle Libraries

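We can use this to generate a library of diverse macrocycles for ML datasets. With minimization enabled (`minimize_energy=True` in Python, `--minimize` on the command line), every structure goes through the same physics-based relaxation, so each one is a geometrically plausible "negative" or "positive" sample for a design model.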

In [None]:
# Generate 3 random macrocycles
results = []
for i in range(3):
    print(f"Generating Macrocycle {i+1}...")
    p = generate_pdb_content(length=7, cyclic=True, minimize_energy=True)
    results.append(p)

print("✅ Generated library of 3 random macrocycles.")
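
Random generation does not by itself guarantee that every library member is distinct, and for a head-to-tail ring two sequences that are rotations of each other describe the same macrocycle. The sketch below shows one way to deduplicate under that view; the helper names are hypothetical, and it assumes residue names sit in the standard `ATOM` columns 18-20.

```python
def residues_from_pdb(pdb_str):
    """List three-letter residue names in first-seen order from ATOM records."""
    seen = {}
    for line in pdb_str.splitlines():
        if line.startswith("ATOM"):
            key = (line[21], int(line[22:26]))  # (chain ID, residue number)
            seen.setdefault(key, line[17:20].strip())
    return list(seen.values())


def canonical_rotation(residues):
    """Pick the lexicographically smallest rotation as the ring's canonical form."""
    if not residues:
        return ()
    rotations = [
        tuple(residues[i:] + residues[:i]) for i in range(len(residues))
    ]
    return min(rotations)


def unique_macrocycles(pdb_list):
    """Keep the first PDB string seen for each distinct cyclic sequence."""
    unique = {}
    for pdb_str in pdb_list:
        key = canonical_rotation(residues_from_pdb(pdb_str))
        unique.setdefault(key, pdb_str)
    return unique
```

Running `len(unique_macrocycles(results))` on the library above reports how many distinct ring sequences were actually produced. Note that this compares sequence only, so two different conformations of the same ring count as one entry.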

### 🏆 Next Steps
1. Try creating a **D-Amino Acid** macrocycle by adding `D-` to your sequence (e.g., `D-ALA-D-VAL`). How does the chiral inversion affect the ring shape? 🧪💍
2. Try changing the length or adding the `--refine-clashes` flag to see how it affects the density of the cyclic loop! 🚀
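
For the D-amino acid experiment, it can help to confirm the handedness directly from coordinates. A common check is the sign of the volume spanned by the N, C and CB atoms around CA: with the atom ordering used below it is positive for L-residues and flips sign on mirror inversion. This is a sketch under assumptions: `residue_chirality` is a hypothetical helper, and it assumes the generated file contains `CB` atoms under standard names (glycine has none, so it returns `None`).

```python
def signed_volume(n, ca, c, cb):
    """Triple product (N-CA) . ((C-CA) x (CB-CA)); positive for L-residues."""
    u = [n[i] - ca[i] for i in range(3)]
    v = [c[i] - ca[i] for i in range(3)]
    w = [cb[i] - ca[i] for i in range(3)]
    cross = (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )
    return sum(u[i] * cross[i] for i in range(3))


def residue_chirality(pdb_str, res_seq):
    """Return "L", "D", or None when N/CA/C/CB are not all present."""
    atoms = {}
    for line in pdb_str.splitlines():
        if line.startswith("ATOM") and int(line[22:26]) == res_seq:
            atoms[line[12:16].strip()] = (
                float(line[30:38]), float(line[38:46]), float(line[46:54])
            )
    if not all(name in atoms for name in ("N", "CA", "C", "CB")):
        return None  # e.g. glycine, which has no CB
    volume = signed_volume(atoms["N"], atoms["CA"], atoms["C"], atoms["CB"])
    return "L" if volume > 0 else "D"
```

Looping `residue_chirality(pdb, i)` over the residue numbers of a generated structure shows which positions were actually built with inverted stereochemistry.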