# Fine-Tuning LLMs: Complete Pipeline in Google Colab

This notebook runs the complete pipeline for fine-tuning and evaluating LLMs with QLoRA.

## Pipeline Overview
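1. **Setup**: Clone the repository and install dependencies
2. **Data Preparation**: Load and prepare training data
3. **Configure Training**: Review the training configuration
4. **Training**: Fine-tune LLaMA 3 and Qwen 3 with QLoRA
5. **Evaluation**: Evaluate the models on the test set
6. **Interactive Demo**: Query the fine-tuned models
7. **Download Models** (optional): Archive the trained models
8. **Cleanup** (optional): Free GPU memory and disk space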

---


## 1. Setup Environment


In [None]:
# Check GPU availability
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")


In [None]:
# Clone repository
# Replace with your repository URL
REPO_URL = "https://github.com/yourusername/270FT.git"  # Update this!

import os
import subprocess

# Clone the repository, unless it already exists or a previous run of this
# cell has already moved us into the project directory
if os.path.basename(os.getcwd()) != "270FT":
    if not os.path.exists("270FT"):
        print("Cloning repository...")
        subprocess.run(["git", "clone", REPO_URL], check=True)
        print("[OK] Repository cloned")
    else:
        print("Repository already exists, skipping clone")

    # Change to project directory
    os.chdir("270FT")

print(f"Current directory: {os.getcwd()}")


In [None]:
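# Install dependencies
import subprocess
import sys

print("Installing dependencies...")
# Use the notebook's own interpreter so packages land in the active environment
subprocess.run([sys.executable, "-m", "pip", "install", "-q", "-r", "requirements.txt"], check=True)
print("[OK] Dependencies installed")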


In [None]:
# Verify installation
try:
    import transformers
    import peft
    import sympy
    import z3
    import wandb
    import yaml
    print("[OK] All required packages installed successfully")
    print(f"  - Transformers: {transformers.__version__}")
    print(f"  - PEFT: {peft.__version__}")
    print(f"  - SymPy: {sympy.__version__}")
except ImportError as e:
    print(f"[ERROR] Import error: {e}")


## 2. Data Preparation


In [None]:
# Create data directories if they don't exist
from pathlib import Path

# Paths are relative to the project directory: the setup cell already ran os.chdir("270FT")
data_raw = Path("data/raw")
data_processed = Path("data/processed")

data_raw.mkdir(parents=True, exist_ok=True)
data_processed.mkdir(parents=True, exist_ok=True)

print("Data directories created:")
print(f"  - Raw: {data_raw}")
print(f"  - Processed: {data_processed}")


In [None]:
# Example: Create sample training data if it doesn't exist
# In practice, you would upload your own data files

import json

sample_train_data = [
    {
        "prompt": "Prove that the sum of the first n natural numbers is n(n+1)/2",
        "response": "[Algorithm Outline]\nUse mathematical induction to prove the formula.\n\n[Pseudocode]\nfunction verify_sum(n):\n    if n == 1:\n        return 1 == 1 * 2 / 2  // Base case\n    // Inductive step: assume true for k, prove for k+1\n\n[Proof Summary]\nBase case (n=1): Sum = 1, formula = 1(2)/2 = 1\nInductive step: Assume sum(1..k) = k(k+1)/2.\nFor k+1: sum(1..k+1) = k(k+1)/2 + (k+1) = (k+1)(k+2)/2"
    },
    {
        "prompt": "Explain the binary search algorithm",
        "response": "[Algorithm Outline]\nBinary search finds an element in a sorted array by repeatedly dividing the search space in half.\n\n[Pseudocode]\nfunction binary_search(arr, target):\n    left = 0\n    right = len(arr) - 1\n    while left <= right:\n        mid = (left + right) // 2\n        if arr[mid] == target:\n            return mid\n        elif arr[mid] < target:\n            left = mid + 1\n        else:\n            right = mid - 1\n    return -1\n\n[Proof Summary]\nTime complexity: O(log n) because we halve the search space each iteration.\nSpace complexity: O(1) for iterative version."
    }
]

sample_test_data = [
    {
        "prompt": "Prove that 1 + 2 + ... + n = n(n+1)/2",
        "response": "[Algorithm Outline]\nMathematical induction proof.\n\n[Pseudocode]\nBase: n=1 → 1 = 1(2)/2 = 1\nInductive: sum(1..k+1) = sum(1..k) + (k+1) = k(k+1)/2 + (k+1) = (k+1)(k+2)/2\n\n[Proof Summary]\nBy mathematical induction, the formula holds for all natural numbers n."
    }
]

# Save sample data (only if files don't exist)
train_path = data_raw / "train.json"
test_path = data_raw / "test.json"

if not train_path.exists():
    with open(train_path, "w") as f:
        json.dump(sample_train_data, f, indent=2)
    print(f"[OK] Created sample training data: {train_path}")
else:
    print(f"Training data already exists: {train_path}")

if not test_path.exists():
    with open(test_path, "w") as f:
        json.dump(sample_test_data, f, indent=2)
    print(f"[OK] Created sample test data: {test_path}")
else:
    print(f"Test data already exists: {test_path}")
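
Before training on your own files, it can help to confirm that every record has the two fields used by the sample data above (`prompt` and `response`). The following is a minimal sketch of such a check; `validate_records` is a hypothetical helper written for this notebook, not part of the repository, and the training script may enforce different rules.

```python
def validate_records(records, required_keys=("prompt", "response")):
    """Return a list of (index, problem) tuples; an empty list means no problems were found."""
    problems = []
    for i, record in enumerate(records):
        if not isinstance(record, dict):
            problems.append((i, "record is not a JSON object"))
            continue
        for key in required_keys:
            value = record.get(key)
            # Each required field must be a non-empty string
            if not isinstance(value, str) or not value.strip():
                problems.append((i, f"missing or empty '{key}'"))
    return problems


# Illustrative records only: the second one has no response
example = [
    {"prompt": "Explain binary search", "response": "[Algorithm Outline]\n..."},
    {"prompt": "Missing a response"},
]
print(validate_records(example))
```

Running the same function on the loaded contents of `train.json` and `test.json` gives a quick sanity check before a long training run.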


In [None]:
# Display data statistics
if train_path.exists():
    with open(train_path, "r") as f:
        train_data = json.load(f)
    print(f"Training samples: {len(train_data)}")
    if train_data:
        print(f"Sample prompt: {train_data[0]['prompt'][:100]}...")

if test_path.exists():
    with open(test_path, "r") as f:
        test_data = json.load(f)
    print(f"Test samples: {len(test_data)}")


## 3. Configure Training


In [None]:
# Display current training configuration
import yaml

config_path = Path("configs/training_config.yaml")  # relative to the project directory
with open(config_path, "r") as f:
    config = yaml.safe_load(f)

print("Training Configuration:")
print(f"  Models to train: {[m['name'] for m in config['models']]}")
print(f"  Epochs: {config['training']['epochs']}")
print(f"  Learning rate: {config['training']['learning_rate']}")
print(f"  Batch size: {config['training']['batch_size']}")
print(f"  LoRA rank: {config['training']['lora_r']}")
print(f"  LoRA alpha: {config['training']['lora_alpha']}")
print(f"  LoRA dropout: {config['training']['lora_dropout']}")
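
The cell above raises a `KeyError` if the configuration lacks any of the keys it prints. As a rough sketch, the check below reports missing keys up front. The key names are the ones this notebook reads; `missing_config_keys` and the example values are hypothetical and only illustrate the idea.

```python
# Keys that this notebook reads from the configuration
REQUIRED_TRAINING_KEYS = ("epochs", "learning_rate", "batch_size",
                          "lora_r", "lora_alpha", "lora_dropout")
REQUIRED_MODEL_KEYS = ("name", "output_dir")


def missing_config_keys(config):
    """Return dotted names of the required keys that are absent from config."""
    missing = []
    training = config.get("training") or {}
    for key in REQUIRED_TRAINING_KEYS:
        if key not in training:
            missing.append(f"training.{key}")
    models = config.get("models") or []
    if not models:
        missing.append("models")
    for i, model in enumerate(models):
        for key in REQUIRED_MODEL_KEYS:
            if key not in model:
                missing.append(f"models[{i}].{key}")
    return missing


# Hypothetical values for illustration only; lora_dropout is left out on purpose
example_config = {
    "models": [{"name": "example/base-model", "output_dir": "example_adapter"}],
    "training": {"epochs": 1, "learning_rate": 2e-4, "batch_size": 1,
                 "lora_r": 16, "lora_alpha": 32},
}
print(missing_config_keys(example_config))
```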


In [None]:
# Optional: Configure W&B for experiment tracking
# Uncomment and run if you want to use Weights & Biases

# import wandb
# wandb.login()
# print("[OK] W&B configured")


## 4. Training Models


In [None]:
# Run training script
# This will train both LLaMA 3 and Qwen 3 models

import sys
sys.path.append(".")  # project directory; the setup cell already ran os.chdir("270FT")

print("Starting training pipeline...")
print("This may take several hours depending on your GPU and dataset size.")
print("\n" + "="*60)


In [None]:
# Execute training
# Note: In Colab, you can run this as a Python script

import importlib.util
spec = importlib.util.spec_from_file_location("train_dual_lora", "training/train_dual_lora.py")
train_dual_lora = importlib.util.module_from_spec(spec)
spec.loader.exec_module(train_dual_lora)
train_main = train_dual_lora.main

# Run training
train_main()


In [None]:
# Verify models were saved
models_dir = Path("models")  # relative to the project directory

for model_config in config["models"]:
    model_path = models_dir / model_config["output_dir"]
    if model_path.exists():
        files = list(model_path.iterdir())
        print(f"[OK] {model_config['name']} saved to {model_path}")
        print(f"  Files: {[f.name for f in files[:5]]}...")
    else:
        print(f"[ERROR] {model_config['name']} not found at {model_path}")


## 5. Evaluation


In [None]:
# Run evaluation script
print("Running evaluation on test set...")
print("\n" + "="*60)


In [None]:
import importlib.util
spec = importlib.util.spec_from_file_location("evaluate_models", "evaluation/evaluate_models.py")
evaluate_models = importlib.util.module_from_spec(spec)
spec.loader.exec_module(evaluate_models)
eval_main = evaluate_models.main

# Run evaluation
eval_main()


In [None]:
# Display evaluation results
results_path = Path("results/metrics_report.json")  # relative to the project directory

if results_path.exists():
    with open(results_path, "r") as f:
        results = json.load(f)
    
    print("\nEvaluation Results Summary:")
    print("="*60)
    
    for model_name, model_results in results["model_results"].items():
        print(f"\nModel: {model_name}")
        print(f"  Exact Match Rate: {model_results['exact_match_rate']:.4f}")
        print(f"  Symbolic Equivalence Rate: {model_results['symbolic_equivalence_rate']:.4f}")
        print(f"  Average BLEU Score: {model_results['avg_bleu_score']:.4f}")
else:
    print("Results file not found. Please run evaluation first.")
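
To make the first metric concrete, here is a minimal sketch of how an exact match rate could be computed. It assumes that answers are compared after collapsing whitespace and lowercasing; the repository's `evaluate_models.py` may normalize differently, so treat this as an illustration rather than its implementation.

```python
def normalize(text):
    """Collapse runs of whitespace and lowercase the text."""
    return " ".join(text.split()).lower()


def exact_match_rate(predictions, references):
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return matches / len(references)


# Illustrative strings only: the second pair differs just in spacing, the third differs in content
preds = ["n(n+1)/2", "O(log n)", "O(n)"]
refs = ["n(n+1)/2", "O(log  n)", "O(n log n)"]
print(f"Exact Match Rate: {exact_match_rate(preds, refs):.4f}")
```

Exact matching is strict: two mathematically equal answers written differently still count as a mismatch, which is the gap that the symbolic equivalence rate is meant to cover.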


## 6. Interactive Demo


In [None]:
# Load a model and generate a response
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
from pathlib import Path

def load_model_for_demo(model_name, adapter_path, device="cuda"):
    """Load model with adapter for interactive use."""
    print(f"Loading {model_name}...")
    
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.float16 if device == "cuda" else torch.float32,
        device_map="auto" if device == "cuda" else None,
        trust_remote_code=True,
    )
    
    model = PeftModel.from_pretrained(model, str(adapter_path))
    model.eval()
    
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    
    print("[OK] Model loaded")
    return model, tokenizer

def generate_response(model, tokenizer, question, max_new_tokens=512):
    """Generate response to a question."""
    prompt = f"### Question:\n{question}\n\n### Solution:\n"
    
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=2048).to(model.device)
    
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
        )
    
    generated = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    return generated.strip()


In [None]:
# Load first available model
models_dir = Path("models")  # relative to the project directory
device = "cuda" if torch.cuda.is_available() else "cpu"

# Try to load the first model from config
model_loaded = False
for model_config in config["models"]:
    adapter_path = models_dir / model_config["output_dir"]
    
    if adapter_path.exists() and (adapter_path / "adapter_config.json").exists():
        try:
            model, tokenizer = load_model_for_demo(
                model_config["name"],
                adapter_path,
                device=device
            )
            model_loaded = True
            current_model_name = model_config["name"]
            break
        except Exception as e:
            print(f"Error loading {model_config['name']}: {e}")
            continue

if not model_loaded:
    print("No trained models found. Please run training first.")


In [None]:
# Test the model with a sample question
if model_loaded:
    test_question = "Prove that the sum of the first n natural numbers is n(n+1)/2"
    
    print(f"Question: {test_question}\n")
    print("Generating response...\n")
    
    response = generate_response(model, tokenizer, test_question)
    
    print("Response:")
    print("="*60)
    print(response)
    print("="*60)


### Interactive Query Interface


In [None]:
# Interactive cell - modify the question and run
if model_loaded:
    # Change this question to test different queries
    your_question = "Explain the quicksort algorithm"
    
    print(f"Question: {your_question}\n")
    print("Generating response...\n")
    
    response = generate_response(model, tokenizer, your_question, max_new_tokens=1024)
    
    print("Response:")
    print("="*60)
    print(response)
    print("="*60)
else:
    print("Please load a model first.")


## 7. Download Models (Optional)


In [None]:
# Compress and download trained models
# This is useful if you want to save your trained models

import shutil

# Create a zip file of the models directory
models_dir = Path("models")  # relative to the project directory
if models_dir.exists():
    print("Creating archive of trained models...")
    shutil.make_archive("trained_models", "zip", models_dir)
    print("[OK] Archive created: trained_models.zip")
    print("\nTo download, run:")
    print("  from google.colab import files")
    print("  files.download('trained_models.zip')")
else:
    print("Models directory not found.")


In [None]:
# Uncomment to download
# from google.colab import files
# files.download('trained_models.zip')


## 8. Cleanup (Optional)


In [None]:
# Clear GPU memory
if torch.cuda.is_available():
    torch.cuda.empty_cache()
    print("GPU cache cleared")

# Optionally delete models to free up space
# import shutil
# shutil.rmtree("models", ignore_errors=True)
# print("Models directory deleted")


---

## Notes

- **Training Time**: Expect 3-5 hours per model on Colab's free GPU (T4)
- **Memory**: Models require ~15-18GB VRAM. Use Colab Pro for better GPUs if needed.
- **Data**: Upload your own training data to `270FT/data/raw/` before training
- **Persistence**: Colab sessions may disconnect. Consider saving checkpoints or using Colab Pro.
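
As a rough guide to the memory note above, the weights alone need about (number of parameters) × (bits per parameter) / 8 bytes. The sketch below works through that arithmetic for an assumed 8-billion-parameter base model; the parameter count is an assumption for illustration, and real usage is higher because activations, the KV cache, LoRA adapter and optimizer state, and quantization overhead are not counted.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


params = 8e9  # assumption: an 8-billion-parameter base model
for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(params, bits):.1f} GB")
```

Under this assumption, half-precision weights take about 16 GB, while 4-bit quantized weights take about 4 GB.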

## Troubleshooting

1. **Out of Memory**: Reduce `batch_size` under `training` in `configs/training_config.yaml`
2. **Model Not Found**: Ensure you've cloned the repository correctly
3. **Import Errors**: Restart runtime and re-run setup cells
4. **Training Fails**: Check that training data exists in `270FT/data/raw/train.json`

## Next Steps

- Experiment with different LoRA hyperparameters
- Try different base models
- Add more training data
- Refine the evaluation metrics
