# QuantumFold-Advantage: Complete Production Benchmark

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Tommaso-R-Marena/QuantumFold-Advantage/blob/main/examples/complete_production_run.ipynb)

**Complete end-to-end training and benchmarking pipeline for publication-quality results**

## 🎯 What This Notebook Does

This notebook runs the **complete research pipeline** to generate publication-ready results:

1. ✅ **Data Preparation** - Download and process protein structures from the PDB
2. ✅ **Quantum Model Training** - Full training with ESM-2 embeddings + quantum layers
3. ✅ **Classical Baseline Training** - Identical architecture without quantum enhancement
4. ✅ **Comprehensive Evaluation** - TM-score, RMSD, GDT-TS, pLDDT on test set
5. ✅ **Statistical Validation** - Hypothesis tests, effect sizes, confidence intervals
6. ✅ **Publication Figures** - Training curves, distributions, comparison plots
7. ✅ **Results Export** - Trained models, metrics, plots saved to Google Drive

## ⚙️ Requirements

**Recommended Setup:**
- **Colab Pro/Pro+** with A100 GPU (40GB VRAM)
- **High RAM** runtime
- **~4-6 hours** total runtime

**Free Tier Compatibility:**
- ⚠️ Possible, but requires reducing the dataset size and model parameters
- See optimization flags in configuration cell
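
A minimal sketch of how such an optimization flag can cap settings for a smaller GPU, following the `min(...)` pattern used in the configuration cell. The helper name `apply_reduced_config` and the dictionary layout are hypothetical and exist only for illustration; the cap values are the ones the configuration cell applies when `USE_REDUCED_CONFIG` is enabled.

```python
# Hypothetical helper: caps user-chosen settings so they fit a smaller GPU.
# The cap values mirror those in this notebook's configuration cell.
REDUCED_CAPS = {
    'num_proteins': 50,
    'hidden_dim': 256,
    'num_structure_layers': 3,
    'num_qubits': 6,
    'batch_size': 2,
}

def apply_reduced_config(settings, caps=REDUCED_CAPS):
    """Return a copy of `settings` with every capped key limited to its cap."""
    reduced = dict(settings)
    for key, cap in caps.items():
        if key in reduced:
            reduced[key] = min(reduced[key], cap)
    return reduced

full = {'num_proteins': 100, 'hidden_dim': 384, 'num_structure_layers': 4,
        'num_qubits': 8, 'batch_size': 4, 'num_quantum_layers': 3}
reduced = apply_reduced_config(full)
print(reduced)
```

Using `min` means a value that is already below its cap is left alone, so enabling the flag never increases a setting.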

## 📊 Expected Results

After completion, you'll have:
- **Trained quantum model** (~200MB checkpoint)
- **Trained classical baseline** (~200MB checkpoint)
- **Performance metrics** (JSON + CSV)
- **Statistical analysis** (p-values, effect sizes, CI)
- **Publication figures** (10+ high-resolution plots)
- **Complete results archive** (ZIP for download)

In [None]:
#@title 🔍 Environment Check
import subprocess
import sys
import torch
import psutil
import os

print('=' * 80)
print('ENVIRONMENT CHECK')
print('=' * 80)

# GPU Check
if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    gpu_memory = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f'✅ GPU: {gpu_name}')
    print(f'✅ VRAM: {gpu_memory:.1f}GB')

    if 'A100' in gpu_name:
        print('🔥 OPTIMAL: A100 detected - full pipeline enabled')
    elif 'V100' in gpu_name:
        print('✅ GOOD: V100 detected - full pipeline possible')
    elif 'T4' in gpu_name:
        print('⚠️  WARNING: T4 detected - consider reducing model size')
        print('   Set USE_REDUCED_CONFIG=True below')
    else:
        print(f'⚠️  WARNING: {gpu_name} - may need optimization')
else:
    print('❌ ERROR: No GPU detected!')
    print('   Runtime → Change runtime type → GPU')
    sys.exit(1)

# RAM Check
ram_gb = psutil.virtual_memory().total / 1e9
print(f'\n💾 System RAM: {ram_gb:.1f}GB')
if ram_gb >= 50:
    print('✅ High RAM runtime detected - optimal for large datasets')
elif ram_gb >= 25:
    print('✅ Standard RAM - sufficient for full pipeline')
else:
    print('⚠️  Low RAM - consider enabling High RAM runtime')

# Disk Check
disk = psutil.disk_usage('/')
disk_free = disk.free / 1e9
print(f'\n💿 Free Disk: {disk_free:.1f}GB')
if disk_free < 10:
    print('⚠️  WARNING: Low disk space (<10GB)')

# Colab Check
try:
    from google.colab import drive
    print('\n✅ Google Colab environment detected')
    IS_COLAB = True
except ImportError:
    print('\n⚠️  Not running in Colab - some features may be limited')
    IS_COLAB = False

print('\n' + '=' * 80)
print('Ready to proceed!')
print('=' * 80)

In [None]:
#@title 📁 Mount Google Drive (Recommended for saving results)

MOUNT_DRIVE = True  #@param {type:"boolean"}

if MOUNT_DRIVE and IS_COLAB:
    from google.colab import drive
    drive.mount('/content/drive')
    
    # Create results directory
    RESULTS_DIR = '/content/drive/MyDrive/QuantumFold_Results'
    os.makedirs(RESULTS_DIR, exist_ok=True)
    print(f'✅ Results will be saved to: {RESULTS_DIR}')
else:
    RESULTS_DIR = '/content/results'
    os.makedirs(RESULTS_DIR, exist_ok=True)
    print(f'⚠️  Results will be saved locally to: {RESULTS_DIR}')
    print('   (Download manually before session ends)')

In [None]:
%%capture
# Install QuantumFold-Advantage and dependencies
get_ipython().system('pip install -q git+https://github.com/Tommaso-R-Marena/QuantumFold-Advantage.git')
get_ipython().system('pip install -q fair-esm biopython pennylane pennylane-qiskit')
get_ipython().system('pip install -q wandb tensorboard scipy scikit-learn matplotlib seaborn plotly')
get_ipython().system('pip install -q einops py3Dmol MDAnalysis')

print('✅ All dependencies installed')

In [None]:
# Core imports
import os
import sys
import json
import time
import warnings
from datetime import datetime
from pathlib import Path
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from tqdm.auto import tqdm
import matplotlib.pyplot as plt
import seaborn as sns

# QuantumFold imports
from src.advanced_model import AdvancedProteinFoldingModel
from src.protein_embeddings import ESM2Embedder
from src.data import ProteinDataset, fetch_pdb_structures as download_pdb_structures
from src.advanced_training import AdvancedTrainer
from src.benchmarks import compute_tm_score, compute_rmsd, compute_gdt_ts
from src.statistical_validation import ComprehensiveBenchmark
from src.reproducibility import set_seed

warnings.filterwarnings('ignore')

# Set random seed for reproducibility
set_seed(42)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'🔥 Using device: {device}')

# Configure plotting
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['figure.dpi'] = 100

print('✅ Imports complete')

In [None]:
#@title ⚙️ Configuration
import psutil

#@markdown ### Hardware Optimization
USE_REDUCED_CONFIG = False  #@param {type:"boolean"}
#@markdown Enable for T4 GPU or Free Tier Colab

#@markdown ### Training Configuration
NUM_TRAINING_PROTEINS = 100  #@param {type:"slider", min:50, max:500, step:50}
NUM_EPOCHS_QUANTUM = 50  #@param {type:"slider", min:10, max:100, step:10}
NUM_EPOCHS_CLASSICAL = 50  #@param {type:"slider", min:10, max:100, step:10}
BATCH_SIZE = 4  #@param {type:"slider", min:1, max:16, step:1}

#@markdown ### Model Configuration
ESM_MODEL = "esm2_t33_650M_UR50D"  #@param ["esm2_t33_650M_UR50D", "esm2_t36_3B_UR50D"]
HIDDEN_DIM = 384  #@param {type:"slider", min:128, max:768, step:128}
NUM_STRUCTURE_LAYERS = 4  #@param {type:"slider", min:2, max:8, step:1}

#@markdown ### Quantum Configuration
NUM_QUBITS = 8  #@param {type:"slider", min:4, max:16, step:2}
NUM_QUANTUM_LAYERS = 3  #@param {type:"slider", min:1, max:5, step:1}
NOISE_LEVEL = 0.01  #@param {type:"slider", min:0.0, max:0.1, step:0.01}

#@markdown ### Advanced Options
USE_MIXED_PRECISION = True  #@param {type:"boolean"}
USE_EMA = True  #@param {type:"boolean"}
USE_GRADIENT_CHECKPOINTING = True  #@param {type:"boolean"}
WANDB_LOGGING = False  #@param {type:"boolean"}

# Apply reduced config if needed
if USE_REDUCED_CONFIG:
    print('⚙️  Applying reduced configuration for T4/Free Tier...')
    NUM_TRAINING_PROTEINS = min(NUM_TRAINING_PROTEINS, 50)
    ESM_MODEL = "esm2_t33_650M_UR50D"
    HIDDEN_DIM = min(HIDDEN_DIM, 256)
    NUM_STRUCTURE_LAYERS = min(NUM_STRUCTURE_LAYERS, 3)
    NUM_QUBITS = min(NUM_QUBITS, 6)
    BATCH_SIZE = min(BATCH_SIZE, 2)

# Build config dictionary
CONFIG = {
    'hardware': {
        'gpu': torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU',
        'vram_gb': torch.cuda.get_device_properties(0).total_memory / 1e9 if torch.cuda.is_available() else 0,
        'ram_gb': psutil.virtual_memory().total / 1e9,
        'reduced_config': USE_REDUCED_CONFIG
    },
    'data': {
        'num_proteins': NUM_TRAINING_PROTEINS,
        'batch_size': BATCH_SIZE
    },
    'training': {
        'epochs_quantum': NUM_EPOCHS_QUANTUM,
        'epochs_classical': NUM_EPOCHS_CLASSICAL,
        'mixed_precision': USE_MIXED_PRECISION,
        'ema': USE_EMA,
        'gradient_checkpointing': USE_GRADIENT_CHECKPOINTING
    },
    'model': {
        'esm_model': ESM_MODEL,
        'hidden_dim': HIDDEN_DIM,
        'num_structure_layers': NUM_STRUCTURE_LAYERS
    },
    'quantum': {
        'num_qubits': NUM_QUBITS,
        'num_layers': NUM_QUANTUM_LAYERS,
        'noise_level': NOISE_LEVEL
    },
    'experiment': {
        'timestamp': datetime.now().isoformat(),
        'seed': 42
    }
}

# Save config
config_path = os.path.join(RESULTS_DIR, 'experiment_config.json')
with open(config_path, 'w') as f:
    json.dump(CONFIG, f, indent=2)

print('\n' + '=' * 80)
print('EXPERIMENT CONFIGURATION')
print('=' * 80)
print(json.dumps(CONFIG, indent=2))
print('=' * 80)
print(f'\n✅ Configuration saved to: {config_path}')

## 📊 Step 1: Data Preparation

Download protein structures from PDB and prepare datasets.
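
Preparing the datasets includes dividing the downloaded structures into training, validation, and test sets. The sketch below shows one reproducible way to do a 70/15/15 split using only the standard library. The function name `split_ids` and the placeholder IDs are hypothetical; the private `random.Random` generator is a design choice for this sketch, not necessarily what the notebook's own split cell uses.

```python
import random

def split_ids(ids, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle IDs with a private RNG and cut them into train/val/test lists."""
    rng = random.Random(seed)      # local generator: global random state is untouched
    shuffled = sorted(ids)         # sort first so the input order does not matter
    rng.shuffle(shuffled)
    n_train = int(train_frac * len(shuffled))
    n_val = int(val_frac * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])   # the test set takes the remainder

# Placeholder IDs for illustration only (not real PDB entries)
example_ids = [f'P{i:03d}' for i in range(20)]
train, val, test = split_ids(example_ids)
print(len(train), len(val), len(test))
```

Because the generator is seeded locally, the split stays the same regardless of which other cells have consumed random numbers beforehand.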

In [None]:
def generate_diverse_protein_dataset(n_proteins=100):
    """Generate diverse set of PDB IDs from different structure classes"""
    
    # Alpha helical proteins (30%)
    alpha_proteins = [
        '1MBN', '1MYO', '1MYG', '256B', '1LFB', '1HMK', '1HCL', '1A6N', '1BVC', '1COA',
        '1CRL', '1D3B', '1DLW', '1ECD', '1FLP', '1G6N', '1H6W', '1IA0', '1JBO', '1K40',
        '1LFD', '1M6T', '1N0J', '1O06', '1PMY', '1QLA', '1R69', '1S72', '1TRZ', '1UHA'
    ]
    
    # Beta sheet proteins (30%)
    beta_proteins = [
        '1TEN', '1FNA', '1BNL', '1EAL', '1FMM', '1G2R', '1H0H', '1I2T', '1JB0', '1K20',
        '1L5B', '1M3S', '1N0U', '1O5R', '1P9I', '1QDD', '1R7J', '1S6V', '1T2F', '1U2H',
        '1BRS', '1BTH', '1CDG', '1CEW', '1CLV', '1DFJ', '1EJG', '1ETM', '1FCH', '1FIE'
    ]
    
    # Mixed alpha/beta (30%)
    mixed_proteins = [
        '1UBQ', '1CRN', '2MLT', '1PGB', '5CRO', '4PTI', '1SHG', '2CI2', '1BPI', '1YCC',
        '1AKI', '1BBA', '3CHY', '1BP2', '1LMB', '2LZM', '1CSE', '1HRC', '1CTF', '1SBP',
        '1A0P', '1A2P', '1A3A', '1A49', '1A53', '1A62', '1AIE', '1AK9', '1AKZ', '1ALY'
    ]
    
    # Small proteins for validation (10%)
    small_proteins = [
        '1VII', '2K39', '1ENH', '1RIS', '5TRV', '1L2Y', '2MJB', '1MB6', '2ERL', '1IGD'
    ]
    
    # Combine and sample
    all_proteins = alpha_proteins + beta_proteins + mixed_proteins + small_proteins
    
    # Ensure we don't exceed available
    n_proteins = min(n_proteins, len(all_proteins))
    
    # Sample with balanced representation
    np.random.seed(42)
    selected = []
    
    n_alpha = int(0.3 * n_proteins)
    n_beta = int(0.3 * n_proteins)
    n_mixed = int(0.3 * n_proteins)
    n_small = n_proteins - n_alpha - n_beta - n_mixed
    
    selected.extend(np.random.choice(alpha_proteins, min(n_alpha, len(alpha_proteins)), replace=False))
    selected.extend(np.random.choice(beta_proteins, min(n_beta, len(beta_proteins)), replace=False))
    selected.extend(np.random.choice(mixed_proteins, min(n_mixed, len(mixed_proteins)), replace=False))
    selected.extend(np.random.choice(small_proteins, min(n_small, len(small_proteins)), replace=False))
    
    return selected

# Generate dataset
print(f'📊 Generating dataset of {NUM_TRAINING_PROTEINS} proteins...')
pdb_ids = generate_diverse_protein_dataset(NUM_TRAINING_PROTEINS)
print(f'✅ Selected {len(pdb_ids)} proteins')
print(f'   Classes: ~30% alpha, ~30% beta, ~30% mixed, ~10% small')

In [None]:
print('\n📥 Downloading PDB structures...')
print(f'   This may take 5-10 minutes for {len(pdb_ids)} proteins\n')

structures = download_pdb_structures(pdb_ids, max_workers=10)

print(f'\n✅ Successfully downloaded {len(structures)} structures')
print(f'❌ Failed: {len(pdb_ids) - len(structures)} proteins')

# Print statistics
lengths = [len(s['coords']) for s in structures.values()]
print(f'\n📈 Dataset Statistics:')
print(f'   Mean length: {np.mean(lengths):.1f} residues')
print(f'   Min length: {np.min(lengths)} residues')
print(f'   Max length: {np.max(lengths)} residues')
print(f'   Median length: {np.median(lengths):.1f} residues')

In [None]:
# Split data: 70% train, 15% val, 15% test
pdb_list = list(structures.keys())
np.random.shuffle(pdb_list)

n_total = len(pdb_list)
n_train = int(0.70 * n_total)
n_val = int(0.15 * n_total)

train_ids = pdb_list[:n_train]
val_ids = pdb_list[n_train:n_train+n_val]
test_ids = pdb_list[n_train+n_val:]

print(f'\n📊 Data Split:')
print(f'   Train: {len(train_ids)} proteins')
print(f'   Val:   {len(val_ids)} proteins')
print(f'   Test:  {len(test_ids)} proteins')

# Save split
split_info = {
    'train': train_ids,
    'val': val_ids,
    'test': test_ids
}

split_path = os.path.join(RESULTS_DIR, 'data_split.json')
with open(split_path, 'w') as f:
    json.dump(split_info, f, indent=2)

print(f'✅ Split saved to: {split_path}')

## 🧬 Step 2: Generate ESM-2 Embeddings

Generate pre-trained protein language model embeddings.

In [None]:
print(f'🧠 Loading ESM-2 model: {ESM_MODEL}...')
embedder = ESM2Embedder(model_name=ESM_MODEL, device=device)
print(f'✅ Model loaded (embedding dim: {embedder.embedding_dim})')

print(f'\n🔄 Generating embeddings for {len(structures)} proteins...')
print('   This may take 10-20 minutes depending on GPU\n')

# Generate embeddings in batches to save memory
EMBEDDING_BATCH_SIZE = 5
embedding_cache_dir = os.path.join(RESULTS_DIR, 'embedding_cache')
os.makedirs(embedding_cache_dir, exist_ok=True)

for i in tqdm(range(0, len(pdb_list), EMBEDDING_BATCH_SIZE), desc='Generating embeddings'):
    batch_ids = pdb_list[i:i+EMBEDDING_BATCH_SIZE]
    batch_seqs = [structures[pdb_id]['sequence'] for pdb_id in batch_ids]
    
    # Generate embeddings (inference only, so skip autograd bookkeeping)
    with torch.no_grad():
        batch_embeddings = embedder(batch_seqs)['embeddings']
    
    # Save to cache
    for pdb_id, emb in zip(batch_ids, batch_embeddings):
        cache_path = os.path.join(embedding_cache_dir, f'{pdb_id}.pt')
        torch.save(emb.cpu(), cache_path)
        structures[pdb_id]['embedding_path'] = cache_path
    
    # Clear cache
    del batch_embeddings
    torch.cuda.empty_cache()

# Free ESM model
del embedder
torch.cuda.empty_cache()

print(f'\n✅ Embeddings cached to: {embedding_cache_dir}')
print(f'🧹 ESM model freed from memory')

In [None]:
# Create PyTorch datasets
train_dataset = ProteinDataset(train_ids, structures, augment=True)
val_dataset = ProteinDataset(val_ids, structures, augment=False)
test_dataset = ProteinDataset(test_ids, structures, augment=False)

# Create dataloaders
from src.data_processing import collate_fn

train_loader = DataLoader(
    train_dataset, 
    batch_size=BATCH_SIZE, 
    shuffle=True, 
    collate_fn=collate_fn,
    num_workers=2,
    pin_memory=True
)

val_loader = DataLoader(
    val_dataset, 
    batch_size=BATCH_SIZE, 
    shuffle=False, 
    collate_fn=collate_fn,
    num_workers=2,
    pin_memory=True
)

test_loader = DataLoader(
    test_dataset, 
    batch_size=BATCH_SIZE, 
    shuffle=False, 
    collate_fn=collate_fn,
    num_workers=2,
    pin_memory=True
)

print(f'✅ DataLoaders created:')
print(f'   Train batches: {len(train_loader)}')
print(f'   Val batches:   {len(val_loader)}')
print(f'   Test batches:  {len(test_loader)}')

## ⚛️ Step 3: Train Quantum-Enhanced Model

Train the full model with quantum layers enabled.

In [None]:
print('⚛️  Initializing Quantum-Enhanced Model...')

# Determine input dimension from ESM model
if '650M' in ESM_MODEL:
    esm_dim = 1280
elif '3B' in ESM_MODEL:
    esm_dim = 2560
else:
    esm_dim = 1280  # default

quantum_model = AdvancedProteinFoldingModel(
    input_dim=esm_dim,
    c_s=HIDDEN_DIM,
    c_z=HIDDEN_DIM // 3,
    num_structure_layers=NUM_STRUCTURE_LAYERS,
    use_quantum=True,
    num_qubits=NUM_QUBITS,
    num_quantum_layers=NUM_QUANTUM_LAYERS,
    noise_level=NOISE_LEVEL
).to(device)

total_params = sum(p.numel() for p in quantum_model.parameters())
trainable_params = sum(p.numel() for p in quantum_model.parameters() if p.requires_grad)

print(f'✅ Model initialized:')
print(f'   Total parameters: {total_params:,}')
print(f'   Trainable parameters: {trainable_params:,}')
print(f'   Model size: ~{total_params * 4 / 1e6:.1f}MB')
print(f'   Quantum: {NUM_QUBITS} qubits, {NUM_QUANTUM_LAYERS} layers')

In [None]:
# Initialize trainer
quantum_trainer = AdvancedTrainer(
    model=quantum_model,
    train_loader=train_loader,
    val_loader=val_loader,
    device=device,
    learning_rate=5e-4,
    use_amp=USE_MIXED_PRECISION,
    use_ema=USE_EMA,
    gradient_clip=1.0,
    output_dir=os.path.join(RESULTS_DIR, 'quantum_model'),
    use_wandb=WANDB_LOGGING
)

print('✅ Quantum trainer initialized')

In [None]:
print(f'\n🚀 Training Quantum-Enhanced Model for {NUM_EPOCHS_QUANTUM} epochs...')
print(f'   Estimated time: ~{NUM_EPOCHS_QUANTUM * len(train_loader) * 2 / 60:.0f} minutes\n')

start_time = time.time()

quantum_history = quantum_trainer.train(
    num_epochs=NUM_EPOCHS_QUANTUM,
    save_freq=10,
    val_freq=5
)

quantum_training_time = time.time() - start_time

print(f'\n✅ Quantum model training complete!')
print(f'   Total time: {quantum_training_time/60:.1f} minutes')
print(f'   Best val loss: {min(quantum_history["val_loss"]):.4f}')

# Save training history
history_path = os.path.join(RESULTS_DIR, 'quantum_model', 'training_history.json')
with open(history_path, 'w') as f:
    json.dump(quantum_history, f, indent=2)

print(f'   History saved to: {history_path}')

## 🔬 Step 4: Train Classical Baseline

Train identical model with quantum layers disabled.

In [None]:
print('🔬 Initializing Classical Baseline Model...')

# Free quantum model from GPU to save memory
quantum_model = quantum_model.cpu()
torch.cuda.empty_cache()

classical_model = AdvancedProteinFoldingModel(
    input_dim=esm_dim,
    c_s=HIDDEN_DIM,
    c_z=HIDDEN_DIM // 3,
    num_structure_layers=NUM_STRUCTURE_LAYERS,
    use_quantum=False  # DISABLED
).to(device)

total_params = sum(p.numel() for p in classical_model.parameters())

print(f'✅ Classical model initialized:')
print(f'   Total parameters: {total_params:,}')
print(f'   Model size: ~{total_params * 4 / 1e6:.1f}MB')
print(f'   Quantum: DISABLED')

In [None]:
# Initialize trainer
classical_trainer = AdvancedTrainer(
    model=classical_model,
    train_loader=train_loader,
    val_loader=val_loader,
    device=device,
    learning_rate=5e-4,
    use_amp=USE_MIXED_PRECISION,
    use_ema=USE_EMA,
    gradient_clip=1.0,
    output_dir=os.path.join(RESULTS_DIR, 'classical_model'),
    use_wandb=WANDB_LOGGING
)

print('✅ Classical trainer initialized')

In [None]:
print(f'\n🚀 Training Classical Baseline for {NUM_EPOCHS_CLASSICAL} epochs...')
print(f'   Estimated time: ~{NUM_EPOCHS_CLASSICAL * len(train_loader) * 2 / 60:.0f} minutes\n')

start_time = time.time()

classical_history = classical_trainer.train(
    num_epochs=NUM_EPOCHS_CLASSICAL,
    save_freq=10,
    val_freq=5
)

classical_training_time = time.time() - start_time

print(f'\n✅ Classical model training complete!')
print(f'   Total time: {classical_training_time/60:.1f} minutes')
print(f'   Best val loss: {min(classical_history["val_loss"]):.4f}')

# Save training history
history_path = os.path.join(RESULTS_DIR, 'classical_model', 'training_history.json')
with open(history_path, 'w') as f:
    json.dump(classical_history, f, indent=2)

print(f'   History saved to: {history_path}')

## 📈 Step 5: Comprehensive Evaluation

Evaluate both models on test set with all metrics.
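
As a rough illustration of what these metrics measure, the sketch below computes RMSD, TM-score, and GDT-TS from per-residue Cα distances (in Å) between a predicted and a reference structure that are assumed to be already superimposed. The function names are hypothetical and are not the `src.benchmarks` API, which may differ (for example by searching for the best superposition, as the full TM-score definition requires). The `d0` scale and the 1, 2, 4, 8 Å cutoffs are the standard definitions; clamping `d0` to 0.5 for short chains is a common convention assumed here.

```python
import math

def rmsd_from_distances(distances):
    """Root-mean-square deviation over per-residue distances (angstroms)."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

def tm_score_from_distances(distances):
    """TM-score for one fixed superposition, normalized by the chain length."""
    n = len(distances)
    # Length-dependent distance scale; clamped to 0.5 for short chains (assumption).
    d0 = max(1.24 * (n - 15) ** (1.0 / 3.0) - 1.8, 0.5) if n > 15 else 0.5
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / n

def gdt_ts_from_distances(distances):
    """GDT-TS in percent: mean fraction of residues within 1, 2, 4 and 8 angstroms."""
    n = len(distances)
    fractions = [sum(d <= cutoff for d in distances) / n
                 for cutoff in (1.0, 2.0, 4.0, 8.0)]
    return 100.0 * sum(fractions) / 4.0

# Made-up distances for illustration only
example = [0.5, 1.5, 3.0, 6.0, 10.0] * 10
print(f'RMSD:     {rmsd_from_distances(example):.2f}')
print(f'TM-score: {tm_score_from_distances(example):.3f}')
print(f'GDT-TS:   {gdt_ts_from_distances(example):.1f}')
```

Lower is better for RMSD, while higher is better for TM-score and GDT-TS, which is why comparisons need to treat RMSD differently from the other metrics.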

In [None]:
print('📊 Evaluating models on test set...\n')

# Load best checkpoints
quantum_model.load_state_dict(torch.load(
    os.path.join(RESULTS_DIR, 'quantum_model', 'best_model.pt'),
    map_location=device
))
# The quantum model was moved to the CPU before classical training; move it back
quantum_model.to(device)
quantum_model.eval()

classical_model.load_state_dict(torch.load(
    os.path.join(RESULTS_DIR, 'classical_model', 'best_model.pt'),
    map_location=device
))
classical_model.to(device)
classical_model.eval()

print('✅ Best checkpoints loaded')

def evaluate_model(model, dataloader, model_name):
    """Comprehensive model evaluation"""
    results = {
        'tm_scores': [],
        'rmsds': [],
        'gdt_ts': [],
        'plddts': []
    }
    
    print(f'\nEvaluating {model_name}...')
    
    with torch.no_grad():
        for batch in tqdm(dataloader, desc=f'{model_name} evaluation'):
            embeddings = batch['embedding'].to(device)
            true_coords = batch['coords'].to(device)
            mask = batch['mask'].to(device)
            
            # Forward pass
            output = model(embeddings, mask=mask)
            pred_coords = output['coordinates']
            plddt = output.get('plddt', None)
            
            # Compute metrics for each example in batch
            for i in range(pred_coords.shape[0]):
                m = mask[i].cpu().bool()
                pred = pred_coords[i][m].cpu().numpy()
                true = true_coords[i][m].cpu().numpy()
                
                if len(pred) < 3:
                    continue
                
                # TM-score
                tm = compute_tm_score(pred, true)
                results['tm_scores'].append(tm)
                
                # RMSD
                rmsd = compute_rmsd(pred, true)
                results['rmsds'].append(rmsd)
                
                # GDT-TS
                gdt = compute_gdt_ts(pred, true)
                results['gdt_ts'].append(gdt)
                
                # pLDDT
                if plddt is not None:
                    results['plddts'].append(plddt[i][m].mean().item())
    
    # Convert to arrays
    for key in results:
        results[key] = np.array(results[key])
    
    return results

# Evaluate both models
quantum_results = evaluate_model(quantum_model, test_loader, 'Quantum')
classical_results = evaluate_model(classical_model, test_loader, 'Classical')

print('\n✅ Evaluation complete!')

In [None]:
# Print summary statistics
print('\n' + '=' * 80)
print('TEST SET RESULTS')
print('=' * 80)

metrics = ['tm_scores', 'rmsds', 'gdt_ts', 'plddts']
metric_names = ['TM-score', 'RMSD (Å)', 'GDT-TS', 'pLDDT']

for metric, name in zip(metrics, metric_names):
    if len(quantum_results[metric]) == 0:
        continue
    
    q_mean = np.mean(quantum_results[metric])
    q_std = np.std(quantum_results[metric])
    c_mean = np.mean(classical_results[metric])
    c_std = np.std(classical_results[metric])
    
    print(f'\n{name}:')
    print(f'  Quantum:   {q_mean:.4f} ± {q_std:.4f}')
    print(f'  Classical: {c_mean:.4f} ± {c_std:.4f}')
    
    diff = q_mean - c_mean
    if metric == 'rmsds':
        better = diff < 0
    else:
        better = diff > 0
    
    symbol = '✅' if better else '❌'
    print(f'  Difference: {diff:+.4f} {symbol}')

print('\n' + '=' * 80)

## 📊 Step 6: Statistical Validation

Rigorous statistical testing for quantum advantage.
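
To make the reported quantities concrete, here is a minimal standard-library sketch of two of them for paired scores: a paired effect size (mean difference divided by the standard deviation of the differences, sometimes written d_z) and a percentile bootstrap confidence interval for the mean difference. The function names and the example scores are hypothetical, and `ComprehensiveBenchmark` may define its effect size and interval differently.

```python
import random
import statistics

def paired_cohens_d(a, b):
    """Paired effect size: mean difference over the SD of the differences."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / statistics.stdev(diffs)

def bootstrap_ci_mean_diff(a, b, n_resamples=2000, alpha=0.05, seed=42):
    """Percentile bootstrap confidence interval for the mean paired difference."""
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    # Resample the differences with replacement and record each resample's mean
    means = sorted(
        statistics.mean(rng.choices(diffs, k=len(diffs)))
        for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Made-up scores for illustration only (not results from this benchmark)
model_a = [0.62, 0.71, 0.55, 0.68, 0.74, 0.59, 0.66, 0.70]
model_b = [0.60, 0.69, 0.56, 0.65, 0.70, 0.58, 0.63, 0.69]
d = paired_cohens_d(model_a, model_b)
low, high = bootstrap_ci_mean_diff(model_a, model_b)
print(f"Cohen's d (paired): {d:.2f}")
print(f'95% CI for mean difference: [{low:.3f}, {high:.3f}]')
```

Both functions work on per-protein differences because the two models are scored on the same test proteins. The Wilcoxon signed-rank test for the same paired setting is available as `scipy.stats.wilcoxon`.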

In [None]:
print('📊 Running statistical validation...\n')

benchmark = ComprehensiveBenchmark(
    output_dir=os.path.join(RESULTS_DIR, 'statistical_analysis')
)

# Test for each metric
stat_results = {}

for metric, name in zip(metrics, metric_names):
    if len(quantum_results[metric]) == 0:
        continue
    
    print(f'\nTesting {name}...')
    
    higher_is_better = metric != 'rmsds'
    
    result = benchmark.compare_methods(
        quantum_scores=quantum_results[metric],
        classical_scores=classical_results[metric],
        metric_name=name,
        higher_is_better=higher_is_better
    )
    
    stat_results[name] = result
    
    print(f'  Wilcoxon p-value: {result["wilcoxon_p"]:.4f}')
    print(f'  t-test p-value: {result["ttest_p"]:.4f}')
    print(f"  Cohen's d: {result['cohens_d']:.4f}")
    print(f'  95% CI: [{result["ci_lower"]:.4f}, {result["ci_upper"]:.4f}]')

    if result['wilcoxon_p'] < 0.05:
        print('  ✅ SIGNIFICANT at p<0.05')
    else:
        print('  ❌ Not significant at p<0.05')

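# Save statistical results
# (the default handler converts NumPy scalars/arrays, in case the results contain any)
stats_path = os.path.join(RESULTS_DIR, 'statistical_analysis', 'results.json')
with open(stats_path, 'w') as f:
    json.dump(stat_results, f, indent=2,
              default=lambda o: o.tolist() if hasattr(o, 'tolist') else str(o))

print(f'\n✅ Statistical analysis saved to: {stats_path}')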

## 📈 Step 7: Generate Publication Figures

Create high-quality plots for publication.

In [None]:
# Plot training curves
fig, axes = plt.subplots(2, 2, figsize=(16, 12))

# Train loss
axes[0, 0].plot(quantum_history['train_loss'], label='Quantum', linewidth=2)
axes[0, 0].plot(classical_history['train_loss'], label='Classical', linewidth=2)
axes[0, 0].set_xlabel('Epoch')
axes[0, 0].set_ylabel('Training Loss')
axes[0, 0].set_title('Training Loss Curves')
axes[0, 0].legend()
axes[0, 0].grid(True, alpha=0.3)

# Val loss
axes[0, 1].plot(quantum_history['val_loss'], label='Quantum', linewidth=2)
axes[0, 1].plot(classical_history['val_loss'], label='Classical', linewidth=2)
axes[0, 1].set_xlabel('Epoch')
axes[0, 1].set_ylabel('Validation Loss')
axes[0, 1].set_title('Validation Loss Curves')
axes[0, 1].legend()
axes[0, 1].grid(True, alpha=0.3)

# Learning rate
axes[1, 0].plot(quantum_history.get('learning_rate', []), linewidth=2)
axes[1, 0].set_xlabel('Epoch')
axes[1, 0].set_ylabel('Learning Rate')
axes[1, 0].set_title('Learning Rate Schedule')
axes[1, 0].set_yscale('log')
axes[1, 0].grid(True, alpha=0.3)

# Training time comparison
times = [quantum_training_time/60, classical_training_time/60]
axes[1, 1].bar(['Quantum', 'Classical'], times, color=['#2E86AB', '#A23B72'])
axes[1, 1].set_ylabel('Training Time (minutes)')
axes[1, 1].set_title('Training Time Comparison')
axes[1, 1].grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.savefig(os.path.join(RESULTS_DIR, 'training_curves.png'), dpi=300, bbox_inches='tight')
plt.show()

print('✅ Training curves saved')

In [None]:
# Plot metric distributions
fig, axes = plt.subplots(2, 2, figsize=(16, 12))
axes = axes.flatten()

for idx, (metric, name) in enumerate(zip(metrics[:4], metric_names[:4])):
    if len(quantum_results[metric]) == 0:
        continue
    
    ax = axes[idx]
    
    # Violin plots
    data = [quantum_results[metric], classical_results[metric]]
    parts = ax.violinplot(data, positions=[1, 2], showmeans=True, showextrema=True)
    
    # Color the violins
    for pc, color in zip(parts['bodies'], ['#2E86AB', '#A23B72']):
        pc.set_facecolor(color)
        pc.set_alpha(0.6)
    
    ax.set_xticks([1, 2])
    ax.set_xticklabels(['Quantum', 'Classical'])
    ax.set_ylabel(name)
    ax.set_title(f'{name} Distribution')
    ax.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.savefig(os.path.join(RESULTS_DIR, 'metric_distributions.png'), dpi=300, bbox_inches='tight')
plt.show()

print('✅ Distribution plots saved')

In [None]:
# Paired comparison plot
fig, axes = plt.subplots(2, 2, figsize=(16, 12))
axes = axes.flatten()

for idx, (metric, name) in enumerate(zip(metrics[:4], metric_names[:4])):
    if len(quantum_results[metric]) == 0:
        continue
    
    ax = axes[idx]
    
    # Scatter plot
    ax.scatter(classical_results[metric], quantum_results[metric], 
               alpha=0.6, s=100, edgecolors='black', linewidths=0.5)
    
    # Diagonal line (y=x)
    lims = [
        np.min([ax.get_xlim(), ax.get_ylim()]),
        np.max([ax.get_xlim(), ax.get_ylim()])
    ]
    ax.plot(lims, lims, 'k--', alpha=0.5, zorder=0, label='Equal Performance')
    
    ax.set_xlabel(f'Classical {name}')
    ax.set_ylabel(f'Quantum {name}')
    ax.set_title(f'Paired Comparison: {name}')
    ax.legend()
    ax.grid(True, alpha=0.3)
    ax.set_aspect('equal')

plt.tight_layout()
plt.savefig(os.path.join(RESULTS_DIR, 'paired_comparison.png'), dpi=300, bbox_inches='tight')
plt.show()

print('✅ Comparison plots saved')

## 💾 Step 8: Export Results

Save all results for download and future analysis.

In [None]:
# Save raw results as CSV
import pandas as pd

results_df = pd.DataFrame({
    'quantum_tm_score': quantum_results['tm_scores'],
    'classical_tm_score': classical_results['tm_scores'],
    'quantum_rmsd': quantum_results['rmsds'],
    'classical_rmsd': classical_results['rmsds'],
    'quantum_gdt_ts': quantum_results['gdt_ts'],
    'classical_gdt_ts': classical_results['gdt_ts']
})

csv_path = os.path.join(RESULTS_DIR, 'raw_results.csv')
results_df.to_csv(csv_path, index=False)

print(f'✅ Raw results saved to: {csv_path}')

# Display first few rows
print('\nFirst 5 results:')
print(results_df.head())

In [None]:
# Create comprehensive summary report
summary = {
    'experiment': {
        'timestamp': CONFIG['experiment']['timestamp'],
        'seed': CONFIG['experiment']['seed'],
        'total_runtime_minutes': (quantum_training_time + classical_training_time) / 60
    },
    'hardware': CONFIG['hardware'],
    'configuration': {
        'num_proteins': NUM_TRAINING_PROTEINS,
        'train_proteins': len(train_ids),
        'val_proteins': len(val_ids),
        'test_proteins': len(test_ids),
        'epochs_quantum': NUM_EPOCHS_QUANTUM,
        'epochs_classical': NUM_EPOCHS_CLASSICAL,
        'batch_size': BATCH_SIZE
    },
    'quantum_results': {
        'tm_score': {
            'mean': float(np.mean(quantum_results['tm_scores'])),
            'std': float(np.std(quantum_results['tm_scores'])),
            'median': float(np.median(quantum_results['tm_scores']))
        },
        'rmsd': {
            'mean': float(np.mean(quantum_results['rmsds'])),
            'std': float(np.std(quantum_results['rmsds'])),
            'median': float(np.median(quantum_results['rmsds']))
        },
        'gdt_ts': {
            'mean': float(np.mean(quantum_results['gdt_ts'])),
            'std': float(np.std(quantum_results['gdt_ts'])),
            'median': float(np.median(quantum_results['gdt_ts']))
        }
    },
    'classical_results': {
        'tm_score': {
            'mean': float(np.mean(classical_results['tm_scores'])),
            'std': float(np.std(classical_results['tm_scores'])),
            'median': float(np.median(classical_results['tm_scores']))
        },
        'rmsd': {
            'mean': float(np.mean(classical_results['rmsds'])),
            'std': float(np.std(classical_results['rmsds'])),
            'median': float(np.median(classical_results['rmsds']))
        },
        'gdt_ts': {
            'mean': float(np.mean(classical_results['gdt_ts'])),
            'std': float(np.std(classical_results['gdt_ts'])),
            'median': float(np.median(classical_results['gdt_ts']))
        }
    },
    'statistical_tests': stat_results
}

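summary_path = os.path.join(RESULTS_DIR, 'RESULTS_SUMMARY.json')
with open(summary_path, 'w') as f:
    # The default handler converts NumPy scalars/arrays, in case stat_results contains any
    json.dump(summary, f, indent=2,
              default=lambda o: o.tolist() if hasattr(o, 'tolist') else str(o))

print(f'✅ Summary report saved to: {summary_path}')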

# Print summary
print('\n' + '=' * 80)
print('EXPERIMENT SUMMARY')
print('=' * 80)
print(json.dumps(summary, indent=2))
print('=' * 80)

In [None]:
# Create downloadable archive
import shutil

archive_name = f'quantumfold_results_{datetime.now().strftime("%Y%m%d_%H%M%S")}'
archive_path = shutil.make_archive(
    os.path.join('/content', archive_name),
    'zip',
    RESULTS_DIR
)

archive_size = os.path.getsize(archive_path) / 1e6

print(f'✅ Results archive created: {archive_path}')
print(f'   Size: {archive_size:.1f}MB')

if IS_COLAB:
    from google.colab import files
    print('\n📥 Download archive:')
    files.download(archive_path)

## ✅ Benchmark Complete!

### What You Now Have:

1. **Trained Models**
   - Quantum-enhanced model checkpoint
   - Classical baseline checkpoint
   - Training histories

2. **Performance Metrics**
   - TM-score, RMSD, GDT-TS for both models
   - Raw results CSV
   - Statistical analysis

3. **Publication Figures**
   - Training curves
   - Metric distributions
   - Paired comparisons

4. **Complete Documentation**
   - Experiment configuration
   - Results summary
   - Statistical validation

### Next Steps:

1. Review the statistical significance of results
2. Analyze the plots and distributions
3. Download the archive for local analysis
4. Use the results for your publication/presentation

### Citation:

```bibtex
@software{quantumfold2026,
  author = {Marena, Tommaso R.},
  title = {QuantumFold-Advantage: Quantum-Classical Hybrid Architecture for Protein Structure Prediction},
  year = {2026},
  institution = {The Catholic University of America},
  url = {https://github.com/Tommaso-R-Marena/QuantumFold-Advantage}
}
```

**⭐ If you found this useful, please star the repository!**