# Getting Started with QuantumFold-Advantage

This tutorial introduces the basics of using QuantumFold for protein structure prediction.

## Topics Covered
1. Initializing the pipeline from a pre-trained checkpoint
2. Predicting structure from sequence
3. Visualizing the prediction
4. Evaluating prediction quality
5. Saving the predicted structure

In [None]:
import sys
import numpy as np
import torch
import matplotlib.pyplot as plt
from pathlib import Path

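# Add the repository root (the parent of the current working directory) to the
# import path so that the `src` package can be imported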
sys.path.insert(0, str(Path.cwd().parent))

from src.pipeline import QuantumFoldPipeline
from src.benchmarks import ProteinStructureEvaluator
from src.visualize import plot_structure, plot_distance_map

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

## 1. Initialize Pipeline

First, we'll create a QuantumFold pipeline from a pre-trained checkpoint. The `device` argument falls back to the CPU when no CUDA GPU is available.

In [None]:
# Initialize pipeline
pipeline = QuantumFoldPipeline(
    checkpoint='../checkpoints/quantumfold_best.pt',
    use_quantum=True,
    device='cuda' if torch.cuda.is_available() else 'cpu'
)

print(f"Pipeline initialized on {pipeline.device}")
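
A missing checkpoint file is a common first stumbling block, because the relative path depends on where the notebook is launched from. The sketch below checks the path before the pipeline is constructed. `require_checkpoint` is a hypothetical helper written for this tutorial, not part of QuantumFold, and it assumes the checkpoint is a single file on disk.

```python
from pathlib import Path


def require_checkpoint(path):
    """Return the resolved checkpoint path, or raise if the file is missing.

    Hypothetical helper for illustration; not part of QuantumFold.
    """
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(
            f"Checkpoint not found: {resolved}. "
            "Check the path relative to the notebook's working directory."
        )
    return resolved
```

For example, `str(require_checkpoint('../checkpoints/quantumfold_best.pt'))` could be passed as the `checkpoint` argument above, so that a wrong path fails with the resolved absolute location in the error message.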

## 2. Predict Structure from Sequence

Let's predict the structure of ubiquitin, a small 76-residue protein.

In [None]:
# Example: Ubiquitin sequence
sequence = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)

print(f"Sequence length: {len(sequence)} residues")
print(f"Sequence: {sequence[:50]}...")

# Run prediction
print("\nRunning prediction...")
results = pipeline.predict(sequence)

print("\nPrediction complete!")
print(f"Predicted coordinates shape: {results['coordinates'].shape}")
print(f"Confidence score: {results['confidence']:.3f}")
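
Before calling `pipeline.predict`, it can help to normalize the input and reject characters that are not one of the 20 standard amino acid codes. The sketch below shows one way to do that. `validate_sequence` is a hypothetical helper, and whether QuantumFold itself accepts ambiguity codes such as `X` or `B` is not covered here, so treat the strict alphabet as an assumption.

```python
# One-letter codes of the 20 standard amino acids.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def validate_sequence(sequence):
    """Strip whitespace, upper-case, and reject non-standard residue codes.

    Hypothetical helper for illustration; not part of QuantumFold.
    """
    cleaned = "".join(sequence.split()).upper()
    if not cleaned:
        raise ValueError("Sequence is empty")
    invalid = sorted(set(cleaned) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"Non-standard residue codes: {', '.join(invalid)}")
    return cleaned
```

With this in place, `pipeline.predict(validate_sequence(sequence))` would fail early with a readable message if a sequence pasted from a FASTA file still contains stray characters.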

## 3. Visualize Predicted Structure

Let's visualize the predicted 3D structure.

In [None]:
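# Extract coordinates
coords = results['coordinates']

# The NumPy and matplotlib calls below need a CPU array. If the pipeline returns
# a torch tensor (an assumption; check the actual return type), convert it first.
if isinstance(coords, torch.Tensor):
    coords = coords.detach().cpu().numpy()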

# Plot structure
fig = plot_structure(coords, sequence, title="Predicted Ubiquitin Structure")
plt.tight_layout()
plt.show()

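# Calculate and plot the pairwise distance map. Broadcasting (N, 1, 3) against
# (1, N, 3) gives all N x N difference vectors; their norms are the distances.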
distances = np.sqrt(np.sum((coords[:, None, :] - coords[None, :, :]) ** 2, axis=2))

plt.figure(figsize=(8, 7))
plt.imshow(distances, cmap='viridis', interpolation='nearest')
plt.colorbar(label='Distance (Å)')
plt.title('Predicted Distance Map')
plt.xlabel('Residue Index')
plt.ylabel('Residue Index')
plt.tight_layout()
plt.show()
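
A distance map is often thresholded into a binary contact map, which is easier to compare across predictions. The sketch below does this in NumPy. The 8 Å cutoff and the minimum sequence separation of 3 are common conventions chosen here as assumptions, not QuantumFold defaults, and `contact_map` is a hypothetical helper that expects one coordinate per residue.

```python
import numpy as np


def contact_map(coords, threshold=8.0, min_separation=3):
    """Boolean contact map from an (N, 3) coordinate array.

    Two residues are in contact when their distance is below `threshold`
    (in Å) and they are at least `min_separation` positions apart in sequence.
    Hypothetical helper for illustration; not part of QuantumFold.
    """
    coords = np.asarray(coords, dtype=float)
    # All pairwise difference vectors via broadcasting, then their lengths.
    diff = coords[:, None, :] - coords[None, :, :]
    distances = np.linalg.norm(diff, axis=-1)
    # Mask out the diagonal and near-neighbours along the chain.
    idx = np.arange(len(coords))
    far_enough = np.abs(idx[:, None] - idx[None, :]) >= min_separation
    return (distances < threshold) & far_enough
```

The result can be shown with `plt.imshow(contact_map(coords), cmap='gray_r')` in the same way as the distance map above.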

## 4. Compare with Ground Truth

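If an experimentally determined structure is available, we can score the prediction against it. The cell below does not load one; instead it perturbs the prediction with Gaussian noise to stand in for ground truth. The resulting metrics only demonstrate the evaluator API and say nothing about model accuracy.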

In [None]:
# Load ground truth (if available)
# coords_true = load_protein_structure('../data/ubiquitin_true.pdb')

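# For this example, we'll create synthetic ground truth by adding Gaussian noise
# (sigma = 2.0 Å per coordinate) to the prediction. A seeded generator keeps the
# metrics reproducible between runs.
rng = np.random.default_rng(0)
coords_true = coords + rng.normal(scale=2.0, size=coords.shape)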

# Evaluate
evaluator = ProteinStructureEvaluator()
metrics = evaluator.evaluate_structure(
    coords,
    coords_true,
    sequence_length=len(sequence)
)

# Print metrics
print("\nEvaluation Metrics:")
print(f"  RMSD:      {metrics.rmsd:.2f} Å")
print(f"  TM-score:  {metrics.tm_score:.3f}")
print(f"  GDT_TS:    {metrics.gdt_ts:.1f}")
print(f"  GDT_HA:    {metrics.gdt_ha:.1f}")
print(f"  lDDT:      {metrics.lddt:.3f}")
print(f"  Clash:     {metrics.clash_score:.2f} per 100 residues")
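
RMSD is only meaningful once the two coordinate sets have been superimposed. How `ProteinStructureEvaluator` handles this internally is not shown in this tutorial, so the sketch below illustrates one standard approach, the Kabsch algorithm, in plain NumPy. `kabsch_rmsd` is a hypothetical helper for cross-checking, not part of QuantumFold, and it assumes both inputs list the same atoms in the same order.

```python
import numpy as np


def kabsch_rmsd(pred, true):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition.

    Hypothetical helper for illustration; not part of QuantumFold.
    """
    P = np.asarray(pred, dtype=float)
    Q = np.asarray(true, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("Expected two arrays of identical shape (N, 3)")
    # Remove translation by centering both sets on their centroids.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    # SVD of the 3x3 covariance matrix gives the optimal rotation.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a proper rotation,
    # not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate the centered prediction onto the centered reference.
    P_rot = P @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Q) ** 2, axis=1))))
```

Comparing `kabsch_rmsd(coords, coords_true)` with `metrics.rmsd` is a quick way to see whether the evaluator reports a superimposed RMSD. The two values may differ if the evaluator uses a different atom selection or alignment.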

## 5. Save Predicted Structure

Save the prediction to a PDB file for further analysis.

In [None]:
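# Save to PDB file, creating the output directory first if it does not exist
output_path = '../outputs/ubiquitin_predicted.pdb'
Path(output_path).parent.mkdir(parents=True, exist_ok=True)
pipeline.save_structure(coords, output_path, sequence=sequence)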

print(f"Structure saved to {output_path}")
print("\nYou can now open this file in PyMOL or Chimera for visualization.")
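
To sanity-check the saved file without leaving the notebook, the Cα coordinates can be read back using the fixed column layout of PDB `ATOM` records. The sketch below assumes that `save_structure` writes standard fixed-width `ATOM` lines with one `CA` atom per residue, which is not confirmed here. `read_ca_coords` is a hypothetical helper, not part of QuantumFold.

```python
def read_ca_coords(pdb_text):
    """Extract C-alpha coordinates from PDB-format text.

    Relies on the fixed-width ATOM record layout: atom name in columns 13-16,
    x, y, z in columns 31-38, 39-46 and 47-54 (1-based).
    Hypothetical helper for illustration; not part of QuantumFold.
    """
    coords = []
    for line in pdb_text.splitlines():
        # Skip HETATM and other records, and lines too short to hold coordinates.
        if not line.startswith("ATOM  ") or len(line) < 54:
            continue
        if line[12:16].strip() != "CA":
            continue
        coords.append(
            (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        )
    return coords
```

For example, `len(read_ca_coords(Path(output_path).read_text()))` should equal `len(sequence)` if the file holds a single model with one Cα per residue. Files with multiple models would return the atoms of every model.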

## Summary

In this tutorial, we learned how to:
1. Initialize the QuantumFold pipeline
2. Predict protein structure from sequence
3. Visualize predictions
4. Evaluate prediction quality
5. Save results for further analysis

Next steps:
- Try the quantum vs classical comparison notebook
- Explore advanced features like MSA integration
- Train custom models on your data