# Prolix MD: Interactive Tutorial

Welcome to **Prolix**, a high-performance molecular dynamics engine built on **JAX**.

This tutorial covers:
1. **Setup & Installation**: Getting ready in Colab
2. **Loading Structures**: Fetch PDB structures from RCSB
3. **System Parameterization**: Apply AMBER force fields (ff14SB)
4. **MD Simulation**: Run an implicit-solvent simulation with `SimulationSpec` and `run_simulation()`
5. **Analysis & Visualization**: RMSD plots and py2Dmol trajectory viewer

## 1. Setup & Installation

In [None]:
# Check if running in Colab
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

print(f"Running in Colab: {IN_COLAB}")

if IN_COLAB:
    # Clone Prolix
    !git clone https://github.com/maraxen/prolix.git || echo "Prolix already cloned"
    %cd prolix
    !git clone https://github.com/maraxen/priox.git || echo "Priox already cloned"
    !uv pip install -e . --system
    !uv pip install -e priox --system

In [None]:
import jax
import jax.numpy as jnp
import numpy as np
import matplotlib.pyplot as plt
import biotite.structure as struc
import biotite.database.rcsb as rcsb

# Prolix imports
from prolix import simulate
from prolix.visualization import (
    TrajectoryReader, plot_rmsd, save_trajectory_html, view_structure, view_trajectory
)
from priox.io.parsing import biotite as parsing_biotite
from priox.md.bridge.core import parameterize_system
from priox.physics.force_fields.loader import load_force_field

print(f"JAX devices: {jax.devices()}")
print(f"Backend: {jax.default_backend()}")

## 2. Load Structure
We'll use **chignolin (1UAO)**, a small, fast-folding 10-residue β-hairpin peptide.

In [None]:
# Fetch from RCSB
pdb_file = rcsb.fetch("1UAO", "pdb", "/tmp")
print(f"Downloaded: {pdb_file}")

# Load structure (model=1 selects first NMR model)
atom_array = parsing_biotite.load_structure_with_hydride(pdb_file, model=1)
positions = jnp.array(atom_array.coord)
print(f"Loaded 1UAO: {len(positions)} atoms")

# Visualize
view_structure(pdb_file)

## 3. System Parameterization
Apply the **AMBER ff14SB** force field with implicit solvent (GBSA).

In [None]:
# Load force field
ff_path = "data/force_fields/ff14SB.eqx"
ff = load_force_field(ff_path)

# Prepare topology
res_starts = struc.get_residue_starts(atom_array)
residues = [atom_array.res_name[i] for i in res_starts]
atom_names = list(atom_array.atom_name)

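# Atoms per residue: difference between consecutive residue start indices,
# with the total atom count closing the last residue
boundaries = list(res_starts) + [len(atom_array)]
atom_counts = [stop - start for start, stop in zip(boundaries[:-1], boundaries[1:])]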

# Parameterize system
system_params = parameterize_system(
    ff, residues, atom_names, atom_counts=atom_counts
)

print(f"Total charge: {jnp.sum(system_params['charges']):.3f}")
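
The printed total charge doubles as a quick sanity check: AMBER residue templates carry integer net charges, so the sum of partial charges should land very close to an integer. A minimal sketch of that check in plain Python follows. The helper name `is_near_integer_charge` and the `tol` threshold are illustrative assumptions, not part of Prolix or Priox.

```python
def is_near_integer_charge(charges, tol=1e-3):
    """Return (ok, nearest_integer) for a sequence of partial charges."""
    total = float(sum(charges))
    nearest = round(total)
    return abs(total - nearest) <= tol, nearest


# Toy partial charges (made up for illustration) summing to roughly -1
toy_charges = [-0.8, 0.3, -0.5, 0.0001]
ok, net = is_near_integer_charge(toy_charges)
print(ok, net)
```

A total that drifts noticeably from an integer usually points to a missing atom, an unrecognized residue, or a wrong terminal template.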

### Memory Budget Check
Before running the simulation, we can estimate the GPU memory required. This is especially important for larger systems or long trajectories.

In [None]:
from prolix import resource_guard

# Estimate memory usage for 1UAO
resource_guard.check_memory_budget(
    n_atoms=len(positions),
    accumulate_steps=500,
    use_neighbor_list=False,
    use_pbc=False
)
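
For intuition about where the memory goes, a rough back-of-envelope estimate can be done by hand. The sketch below is not what `resource_guard.check_memory_budget` computes, since its internals are not shown here. It assumes only float32 storage, a buffer of `accumulate_steps` position frames, and a dense N×N pair matrix when no neighbor list is used. The function name and the example atom count are hypothetical.

```python
BYTES_PER_FLOAT32 = 4


def rough_memory_mib(n_atoms, accumulate_steps, dense_pairs=True):
    """Rough float32 memory estimate in MiB (illustrative only)."""
    # Buffered positions: one (n_atoms, 3) array per accumulated frame
    buffer_bytes = accumulate_steps * n_atoms * 3 * BYTES_PER_FLOAT32
    # Dense pairwise matrix (e.g. distances) when no neighbor list is used
    pair_bytes = n_atoms * n_atoms * BYTES_PER_FLOAT32 if dense_pairs else 0
    return (buffer_bytes + pair_bytes) / 2**20


# Hypothetical small peptide with 150 atoms, other values mirroring the call above
print(f"{rough_memory_mib(150, 500):.2f} MiB")
```

The pair term grows quadratically, so doubling the atom count quadruples it. That is the main reason neighbor lists become worthwhile for larger systems.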

## 4. Run MD Simulation
The `SimulationSpec` + `run_simulation()` pattern covers the whole run:
- Energy minimization (automatic, critical for stability)
- NVT Langevin dynamics setup
- Trajectory saving to ArrayRecord format

In [None]:
# Define simulation parameters
spec = simulate.SimulationSpec(
    total_time_ns=0.01,  # 10 ps
    step_size_fs=2.0,
    save_interval_ns=0.001,  # Save every 1 ps
    accumulate_steps=500,  # Accumulate 500 frames before writing
    save_path="1uao_implicit_traj.array_record",
    temperature_k=300.0,
    gamma=1.0,
    use_pbc=False  # Implicit solvent
)
# Run simulation
print("Running 10 ps MD simulation...")
final_state = simulate.run_simulation(
    system_params=system_params,
    r_init=positions,
    spec=spec,
    key=jax.random.PRNGKey(42)
)
print(f"Complete! Final energy: {final_state.potential_energy:.2f} kcal/mol")
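
The time-based settings in `SimulationSpec` translate into integer step counts. The arithmetic below is a sketch of that conversion, using 1 ns = 10^6 fs. How Prolix rounds or validates these values internally is not shown here, and the helper name `step_counts` is hypothetical.

```python
FS_PER_NS = 1_000_000


def step_counts(total_time_ns, step_size_fs, save_interval_ns):
    """Convert time settings to (total_steps, steps_per_frame, n_frames)."""
    total_steps = round(total_time_ns * FS_PER_NS / step_size_fs)
    steps_per_frame = round(save_interval_ns * FS_PER_NS / step_size_fs)
    n_frames = total_steps // steps_per_frame
    return total_steps, steps_per_frame, n_frames


# Values from the spec above: 10 ps total, 2 fs steps, one frame per 1 ps
print(step_counts(0.01, 2.0, 0.001))  # → (5000, 500, 10)
```

By this arithmetic the run yields about 10 saved frames, which is worth keeping in mind when choosing a `stride` for visualization.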

## 5. Analysis & Visualization
Analyze RMSD and visualize the trajectory.

In [None]:
# Load trajectory
traj = TrajectoryReader("1uao_implicit_traj.array_record")

# Plot RMSD
plot_rmsd(traj, reference_positions=positions)
plt.show()
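
For reference, RMSD is the square root of the mean squared displacement of the atoms from a reference structure. The sketch below computes it in plain Python without a superposition step. Whether `plot_rmsd` aligns each frame to the reference first is not stated here, so treat this as the bare definition rather than a reimplementation. The function name `rmsd` is hypothetical.

```python
import math


def rmsd(frame, reference):
    """No-fit RMSD between two equal-length lists of (x, y, z) coordinates."""
    if len(frame) != len(reference):
        raise ValueError("frame and reference must have the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(frame))


# Toy example: two atoms, the second displaced by 2 along x
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
print(rmsd(frame, reference))
```

Without alignment, overall translation and rotation of the molecule also count toward the value, so no-fit RMSD tends to overstate internal structural change.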

In [None]:
# Generate HTML visualization
save_trajectory_html(
    trajectory="1uao_implicit_traj.array_record",
    pdb_path=pdb_file,
    output_path="1uao_visualization.html",
    stride=2,
    style="cartoon",
    title="1UAO MD Simulation (10 ps)"
)
print("Saved 1uao_visualization.html")

In [None]:
# Display inline (Jupyter/Colab)
try:
    view_trajectory("1uao_implicit_traj.array_record", pdb_file, stride=5)
except ImportError:
    print("py2Dmol not available. Download the HTML file instead.")

## Next Steps

Try exploring:
- **Explicit solvent**: See `explicit_solvent_colab.ipynb` for PME simulations with TIP3P water
- **Longer simulations**: Increase `total_time_ns` to 0.1 ns or more
- **Different proteins**: Try larger systems like ubiquitin (1UBQ)
- **Advanced analysis**: Use `prolix.analysis` for contact maps, Ramachandran plots, etc.