# QuantumFold-Advantage: Colab Quickstart

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Tommaso-R-Marena/QuantumFold-Advantage/blob/main/examples/colab_quickstart.ipynb)

This notebook provides a **complete end-to-end demonstration** of the QuantumFold-Advantage hybrid quantum-classical protein folding pipeline.

**What you'll learn:**
- Installing dependencies in Colab
- Loading and processing protein data
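- Initializing a baseline model and running a forward pass (training and the quantum vs classical comparison are covered in the follow-up notebooks)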
- Evaluating predictions with CASP metrics
- Creating publication-quality visualizations

**Estimated runtime:** 10-15 minutes (with Colab GPU)

## 🔧 Setup & Installation

First, we'll clone the repository and install all dependencies.

In [None]:
# Check environment and GPU
import sys
import torch

try:
    import google.colab
    IN_COLAB = True
    print('✅ Running in Google Colab')
except ImportError:
    IN_COLAB = False
    print('💻 Running locally')

print(f'\n🐍 Python: {sys.version.split()[0]}')
print(f'🔥 PyTorch: {torch.__version__}')
print(f'⚡ CUDA available: {torch.cuda.is_available()}')

if torch.cuda.is_available():
    print(f'🎮 GPU: {torch.cuda.get_device_name(0)}')
    print(f'💾 Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB')
else:
    print('⚠️  No GPU detected. Training will be slower (CPU-only)')

In [None]:
if IN_COLAB:
    print('📦 Installing QuantumFold-Advantage...')
    
    # Clone repository
    !git clone --quiet https://github.com/Tommaso-R-Marena/QuantumFold-Advantage.git 2>/dev/null || true
    %cd QuantumFold-Advantage
    
    # CRITICAL FIX: NumPy 2.0 binary incompatibility with autograd/PennyLane
    print('\n🔧 Installing dependencies with NumPy compatibility fix...')
    print('⚠️  This will take 3-5 minutes')
    
    # Clear cache
    !pip cache purge > /dev/null 2>&1 || true
    
    # Uninstall packages that force NumPy 2.0
    !pip uninstall -y -q jax jaxlib shap pytensor opencv-python opencv-python-headless opencv-contrib-python 2>/dev/null || true
    
    # Install NumPy <2.0 FIRST with --no-deps to prevent conflicts
    !pip install --no-cache-dir --no-deps 'numpy>=1.23.0,<2.0.0'
    
    # Install autograd (depends on NumPy)
    !pip install --no-cache-dir --no-deps 'autograd>=1.6.2'
    
    # Install PennyLane stack without pulling NumPy 2.0
    !pip install --no-cache-dir --no-deps 'pennylane>=0.32.0'
    !pip install --no-cache-dir 'autoray>=0.6.11' 'semantic-version' 'networkx' 'rustworkx' 'cachetools' 'appdirs' 'pennylane-lightning>=0.43'
    
    # Install SciPy
    !pip install --no-cache-dir 'scipy>=1.10.0'
    
    # PyTorch (Colab usually has it)
    !pip install --no-cache-dir --quiet torch torchvision torchaudio
    
    # Visualization and analysis
    !pip install --no-cache-dir matplotlib seaborn pandas scikit-learn
    
    # Bioinformatics
    !pip install --no-cache-dir biopython requests tqdm
    
    # FORCE NumPy back to <2.0 one final time
    !pip install --force-reinstall --no-cache-dir --no-deps 'numpy>=1.23.0,<2.0.0'
    
    print('\n✅ Installation complete!')
    print('⚠️  IMPORTANT: You should restart runtime now!')
    print('   Click: Runtime > Restart runtime')
    print('   Then re-run from the NEXT cell')
    
else:
    print('💻 Running locally - ensure dependencies are installed:')
    print('   pip install -r requirements.txt')

In [None]:
# Verify NumPy/PennyLane compatibility
import sys
import warnings
warnings.filterwarnings('ignore')

print('🔍 Verifying installation...\n')

try:
    import numpy as np
    print(f'✅ NumPy: {np.__version__}')
    
    if np.__version__.startswith('2.'):
        print('⚠️  WARNING: NumPy 2.x detected! This will cause errors.')
        print('   Fixing now...')
        !pip install --force-reinstall --no-cache-dir --no-deps 'numpy>=1.23.0,<2.0.0'
        print('   ✅ Fixed! Please restart runtime and re-run this cell.')
        raise RuntimeError('Restart runtime required after NumPy downgrade')
        
except ImportError as e:
    print(f'❌ NumPy import failed: {e}')
    raise

try:
    import autograd
    print('✅ Autograd: installed')
except (ImportError, ValueError, AttributeError) as e:
    print(f'❌ Autograd failed: {e}')
    print('\n🔧 Attempting emergency fix...')
    !pip install --force-reinstall --no-cache-dir --no-deps 'numpy>=1.23.0,<2.0.0'
    !pip install --force-reinstall --no-cache-dir 'autograd>=1.6.2'
    print('\n⚠️  Emergency fix applied!')
    print('   RESTART RUNTIME NOW: Runtime > Restart runtime')
    print('   Then re-run this cell')
    raise RuntimeError('Restart runtime required')

try:
    import pennylane as qml
    print(f'✅ PennyLane: {qml.__version__}')
except (ImportError, ValueError) as e:
    print(f'❌ PennyLane failed: {e}')
    print('\n🔧 Attempting emergency fix...')
    !pip install --force-reinstall --no-cache-dir --no-deps 'numpy>=1.23.0,<2.0.0'
    !pip install --force-reinstall --no-cache-dir 'autograd>=1.6.2'
    !pip install --force-reinstall --no-cache-dir --no-deps 'pennylane>=0.32.0'
    print('\n⚠️  Emergency fix applied!')
    print('   RESTART RUNTIME NOW: Runtime > Restart runtime')
    print('   Then re-run this cell')
    raise RuntimeError('Restart runtime required')

try:
    import matplotlib.pyplot as plt
    print(f'✅ Matplotlib: {plt.matplotlib.__version__}')
except ImportError as e:
    print(f'⚠️  Matplotlib: {e}')

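# Add the repository root to the import path.
# IN_COLAB is re-detected here because a runtime restart clears
# variables defined in earlier cells.
try:
    import google.colab  # noqa: F401
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    sys.path.insert(0, '/content/QuantumFold-Advantage')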

print('\n🚀 Environment ready!')
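
The NumPy check above tests `np.__version__.startswith('2.')`, which means NumPy has to be imported before its version is known. As a minimal sketch of an alternative, assuming only the standard library, the installed version can be read from package metadata and its major component compared. The helper names `major_version` and `numpy_is_compatible` are hypothetical and not part of the repository.

```python
from importlib import metadata


def major_version(version_string):
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version_string.split('.')[0])


def numpy_is_compatible(max_major=1):
    """True if NumPy is installed and its major version is <= max_major.

    Reads package metadata only, so NumPy itself is never imported.
    """
    try:
        installed = metadata.version('numpy')
    except metadata.PackageNotFoundError:
        return False
    return major_version(installed) <= max_major


print(major_version('1.26.4'))    # 1
print(major_version('2.0.0rc1'))  # 2
```

Calling `numpy_is_compatible()` before any `import numpy` avoids loading a binary-incompatible build into the running session.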

## 📊 Load Sample Data

We'll use a small protein for demonstration (faster training).

In [None]:
import numpy as np
import requests
import torch  # re-imported here: a runtime restart clears the earlier import
from Bio.PDB import PDBParser

np.random.seed(42)
torch.manual_seed(42)
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)

pdb_id = '1CRN'
url = f'https://files.rcsb.org/download/{pdb_id}.pdb'
response = requests.get(url, timeout=30)
response.raise_for_status()

pdb_path = f'/tmp/{pdb_id}.pdb'
with open(pdb_path, 'w') as f:
    f.write(response.text)

parser = PDBParser(QUIET=True)
structure = parser.get_structure(pdb_id, pdb_path)
aa_3to1 = {'ALA':'A','ARG':'R','ASN':'N','ASP':'D','CYS':'C','GLN':'Q','GLU':'E','GLY':'G','HIS':'H','ILE':'I','LEU':'L','LYS':'K','MET':'M','PHE':'F','PRO':'P','SER':'S','THR':'T','TRP':'W','TYR':'Y','VAL':'V'}

sequence_residues, ca_coords = [], []
for residue in structure[0]['A']:
    if residue.id[0] == ' ' and 'CA' in residue:
        sequence_residues.append(aa_3to1.get(residue.get_resname(), 'X'))
        ca_coords.append(residue['CA'].get_coord())

sequence = ''.join(sequence_residues)
coordinates = np.asarray(ca_coords, dtype=np.float32)
n_residues = len(sequence)

print(f'✅ Loaded real structure {pdb_id}: {n_residues} residues')
print(f'📏 Coordinate tensor shape: {coordinates.shape}')
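
A quick sanity check on the extracted Cα trace is to look at the spacing between neighbouring residues, which is typically about 3.8 Å; much larger gaps usually indicate missing residues. The sketch below uses only the standard library and a toy trace. The function names and the 4.2 Å tolerance are assumptions for illustration, not values taken from the repository.

```python
import math


def consecutive_distances(coords):
    """Distances between each pair of neighbouring points in an (N, 3) sequence."""
    return [math.dist(a, b) for a, b in zip(coords, coords[1:])]


def flag_chain_breaks(coords, max_gap=4.2):
    """Indices i where the gap between point i and i+1 exceeds max_gap (in Å).

    max_gap is an assumed tolerance, slightly above the ~3.8 Å spacing
    expected between neighbouring C-alpha atoms.
    """
    return [i for i, d in enumerate(consecutive_distances(coords)) if d > max_gap]


# Toy trace: three points spaced 3.8 Å apart, then a 10 Å jump.
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (17.6, 0.0, 0.0)]
print(flag_chain_breaks(toy))  # [2]
```

With the notebook's data, `flag_chain_breaks(coordinates.tolist())` would run the same check on the loaded structure.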


## 🧠 Initialize Models

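This quickstart builds a small classical MLP baseline that maps per-residue features to 3D coordinates. The quantum models and the quantum vs classical comparison are covered in the follow-up notebooks listed under Next Steps.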

In [None]:
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'🎯 Using device: {device}')

# Simple model for demo
class SimpleProteinModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, output_dim)
        )
    
    def forward(self, x):
        return self.layers(x)

# Initialize model
input_dim = 64
hidden_dim = 128
output_dim = 3  # x, y, z coordinates

model = SimpleProteinModel(input_dim, hidden_dim, output_dim).to(device)
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f'\n📊 Model initialized:')
print(f'   Parameters: {n_params:,}')
print(f'   Size: {n_params * 4 / 1e6:.2f} MB (FP32)')

# Test forward pass with random placeholder features (not derived from the sequence)
reference_input = torch.randn(1, n_residues, input_dim).to(device)
with torch.no_grad():
    output = model(reference_input)
print(f'   Output shape: {output.shape}')
print('✅ Model ready for training!')
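
The parameter count printed above can be checked by hand: each `nn.Linear(n_in, n_out)` layer holds `n_in * n_out` weights plus `n_out` biases. The sketch below reproduces that arithmetic in plain Python for the dimensions used in this cell (64, 128, 3). The helper names are hypothetical.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def mlp_params(input_dim, hidden_dim, output_dim):
    """Parameter count for input -> hidden -> hidden -> output."""
    return (linear_params(input_dim, hidden_dim)
            + linear_params(hidden_dim, hidden_dim)
            + linear_params(hidden_dim, output_dim))


total = mlp_params(64, 128, 3)
print(total)                        # 25219
print(f'{total * 4 / 1e6:.2f} MB')  # 0.10 MB
```

The ReLU activations carry no parameters, so only the three linear layers contribute.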

## 📈 Evaluate Predictions

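Compute simplified versions of RMSD and TM-score, two metrics used in CASP assessments. The "prediction" in this cell is the native structure with Gaussian noise added, not model output, so both coordinate sets already share the same frame.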

In [None]:
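def calculate_rmsd(coords1, coords2):
    # Root-mean-square deviation over atoms: sum the squared differences
    # over x, y, z per atom, average over atoms, then take the square root.
    # Assumes both arrays are (N, 3) and already in the same frame.
    return np.sqrt(np.mean(np.sum((coords1 - coords2) ** 2, axis=1)))

def calculate_tm_score_simple(coords1, coords2, seq_len):
    # Simplified TM-score: uses the given residue pairing and frame,
    # with no search over superpositions.
    # The d0 formula needs seq_len > 15; fall back to the 0.5 Å floor otherwise.
    d0 = max(0.5, 1.24 * (seq_len - 15) ** (1/3) - 1.8) if seq_len > 15 else 0.5
    distances = np.sqrt(np.sum((coords1 - coords2) ** 2, axis=1))
    return np.mean(1 / (1 + (distances / d0) ** 2))

true_coords = coordinates
# Placeholder "prediction": native coordinates plus Gaussian noise (sigma = 0.35 Å per axis)
predicted_coords = coordinates + np.random.normal(0.0, 0.35, size=coordinates.shape)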

rmsd = calculate_rmsd(predicted_coords, true_coords)
tm_score = calculate_tm_score_simple(predicted_coords, true_coords, n_residues)

print(f'📊 RMSD: {rmsd:.3f} Å')
print(f'📊 TM-Score: {tm_score:.3f}')
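
The metrics above compare coordinates directly, which only makes sense when both structures share a frame. Output from a real model would generally arrive in an arbitrary orientation, so RMSD is usually reported after optimal rigid superposition. The following is a minimal sketch of that step using the Kabsch algorithm in NumPy. The function name `kabsch_rmsd` is hypothetical and this is not the repository's implementation.

```python
import numpy as np


def kabsch_rmsd(mobile, target):
    """RMSD between two (N, 3) arrays after optimal rigid superposition."""
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)

    # Remove translation by centring both point sets.
    p = mobile - mobile.mean(axis=0)
    q = target - target.mean(axis=0)

    # SVD of the 3x3 covariance matrix yields the optimal rotation.
    u, _, vt = np.linalg.svd(p.T @ q)

    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    aligned = p @ rotation.T
    return float(np.sqrt(np.mean(np.sum((aligned - q) ** 2, axis=1))))


# Toy check: a rotated and translated copy should superimpose exactly.
rng = np.random.default_rng(0)
native = rng.normal(size=(10, 3))
theta = np.pi / 6
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
moved = native @ rot_z.T + np.array([5.0, -2.0, 1.0])
print(kabsch_rmsd(moved, native) < 1e-8)  # True
```

In the notebook, `kabsch_rmsd(predicted_coords, true_coords)` would give a superposition-independent value. For the noisy placeholder it should be close to, and no larger than, the direct RMSD.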


## 🎨 Visualize Results

Create a static 3D comparison of the true and predicted Cα traces with Matplotlib.

In [None]:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# Create 3D plots
fig = plt.figure(figsize=(16, 6))

# Plot 1: True structure
ax1 = fig.add_subplot(131, projection='3d')
ax1.plot(true_coords[:, 0], true_coords[:, 1], true_coords[:, 2], 
        'b-', linewidth=2.5, alpha=0.7, label='Backbone')
scatter1 = ax1.scatter(true_coords[:, 0], true_coords[:, 1], true_coords[:, 2], 
                      c=range(n_residues), cmap='viridis', s=60, alpha=0.8)
ax1.set_xlabel('X (Å)', fontsize=10)
ax1.set_ylabel('Y (Å)', fontsize=10)
ax1.set_zlabel('Z (Å)', fontsize=10)
ax1.set_title('True Structure', fontsize=12, fontweight='bold')
ax1.legend()
plt.colorbar(scatter1, ax=ax1, label='Residue', shrink=0.6)

# Plot 2: Predicted structure
ax2 = fig.add_subplot(132, projection='3d')
ax2.plot(predicted_coords[:, 0], predicted_coords[:, 1], predicted_coords[:, 2], 
        'r-', linewidth=2.5, alpha=0.7, label='Backbone')
scatter2 = ax2.scatter(predicted_coords[:, 0], predicted_coords[:, 1], predicted_coords[:, 2], 
                      c=range(n_residues), cmap='plasma', s=60, alpha=0.8)
ax2.set_xlabel('X (Å)', fontsize=10)
ax2.set_ylabel('Y (Å)', fontsize=10)
ax2.set_zlabel('Z (Å)', fontsize=10)
ax2.set_title('Predicted Structure', fontsize=12, fontweight='bold')
ax2.legend()
plt.colorbar(scatter2, ax=ax2, label='Residue', shrink=0.6)

# Plot 3: Overlay
ax3 = fig.add_subplot(133, projection='3d')
ax3.plot(true_coords[:, 0], true_coords[:, 1], true_coords[:, 2], 
        'b-', linewidth=2, alpha=0.7, label='True')
ax3.plot(predicted_coords[:, 0], predicted_coords[:, 1], predicted_coords[:, 2], 
        'r--', linewidth=2, alpha=0.7, label='Predicted')
ax3.set_xlabel('X (Å)', fontsize=10)
ax3.set_ylabel('Y (Å)', fontsize=10)
ax3.set_zlabel('Z (Å)', fontsize=10)
ax3.set_title('Overlay Comparison', fontsize=12, fontweight='bold')
ax3.legend()

plt.tight_layout()
plt.savefig('structure_comparison.png', dpi=150, bbox_inches='tight')
plt.show()

print('\n📊 Visualization saved to \'structure_comparison.png\'')
print('✅ Analysis complete!')

## 🚀 Next Steps

Explore more advanced features:

1. **[Getting Started (Advanced)](https://colab.research.google.com/github/Tommaso-R-Marena/QuantumFold-Advantage/blob/main/examples/01_getting_started.ipynb)** - Full features with ESM-2

2. **[Quantum vs Classical Comparison](https://colab.research.google.com/github/Tommaso-R-Marena/QuantumFold-Advantage/blob/main/examples/02_quantum_vs_classical.ipynb)** - Train and compare models

3. **[Advanced Visualization](https://colab.research.google.com/github/Tommaso-R-Marena/QuantumFold-Advantage/blob/main/examples/03_advanced_visualization.ipynb)** - Publication figures

4. **[Complete Benchmark](https://colab.research.google.com/github/Tommaso-R-Marena/QuantumFold-Advantage/blob/main/examples/complete_benchmark.ipynb)** - Full pipeline

5. **Run locally with Docker** - See [README](https://github.com/Tommaso-R-Marena/QuantumFold-Advantage) for installation

6. **Try your own proteins** - Upload PDB files or use [UniProt](https://www.uniprot.org/) sequences

---

⭐ **Star the repository** if you find this useful: [QuantumFold-Advantage](https://github.com/Tommaso-R-Marena/QuantumFold-Advantage)

📧 **Questions?** Open an issue on [GitHub](https://github.com/Tommaso-R-Marena/QuantumFold-Advantage/issues)