# 🏎️ Gemma-3 F1 Expert Training Notebook

This notebook provides a complete walkthrough for fine-tuning Google's Gemma-3 model into a Formula 1 knowledge expert using LoRA (Low-Rank Adaptation).

## 📋 What you'll learn:
- How to collect F1 data from the Jolpica API
- Dataset preparation for instruction tuning
- LoRA fine-tuning with Unsloth for memory efficiency
- Model evaluation and testing
- Deployment options

## 🚀 Quick Start:
1. Run all cells in order (Runtime → Run all)
2. Total time: ~30-45 minutes on a T4 GPU
3. Model will be saved and ready for use

---

## 1️⃣ Environment Setup

First, let's set up the environment with all necessary dependencies.
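
Before training, it helps to know which mixed-precision mode the GPU can use. The sketch below is a hypothetical helper (`pick_precision` is not part of any library) that assumes BF16 needs CUDA compute capability 8.0 or higher, which is the case from NVIDIA's Ampere generation onward. A T4 reports 7.5, so it falls back to FP16.

```python
def pick_precision(cuda_available, compute_capability):
    """Return mixed-precision flags for a trainer config.

    compute_capability is a (major, minor) tuple, e.g. (7, 5) for a T4.
    Assumption: BF16 is only used on compute capability >= 8.0.
    """
    if not cuda_available:
        # No GPU: leave both mixed-precision modes off.
        return {"fp16": False, "bf16": False}
    major, _minor = compute_capability
    use_bf16 = major >= 8
    return {"fp16": not use_bf16, "bf16": use_bf16}


# In the notebook (needs torch and a GPU) the inputs would come from:
#   torch.cuda.is_available(), torch.cuda.get_device_capability(0)
print(pick_precision(True, (7, 5)))  # T4 -> {'fp16': True, 'bf16': False}
```

The returned flags can be passed straight into the training configuration later in the notebook.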

In [None]:
# Check GPU availability
import torch
print(f"🖥️  GPU Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU Name: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")
else:
    print("⚠️  No GPU detected. Training will be slow.")

In [None]:
# Install required packages
!pip install -q torch transformers datasets accelerate peft bitsandbytes trl
!pip install -q "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install -q requests feedparser tqdm pandas numpy

print("✅ Dependencies installed!")

In [None]:
# Import libraries
import json
import time
import random
import requests
import feedparser
from datetime import datetime, timedelta
from pathlib import Path
from tqdm import tqdm

import torch
from unsloth import FastLanguageModel  # Import Unsloth before transformers/trl so its patches apply
from datasets import Dataset
from transformers import TrainingArguments
from trl import SFTTrainer

print("📚 Libraries imported successfully!")

## 2️⃣ Data Collection

Let's collect Formula 1 data from the Jolpica API and create explanatory content.
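
The collector in the next cell walks the nested `MRData → RaceTable → Races → Results` structure of each response. As a minimal sketch of that parsing step, the helper below (`extract_winner` is a hypothetical name) pulls the winner out of one results payload and returns `None` when the payload has no results. The sample payload is a made-up placeholder that only mirrors the key names used in this notebook; it is not real API output.

```python
from typing import Optional


def extract_winner(payload: dict) -> Optional[dict]:
    """Return race name, winner, and constructor from a results payload."""
    races = payload.get("MRData", {}).get("RaceTable", {}).get("Races", [])
    if not races:
        return None
    results = races[0].get("Results", [])
    if not results:
        return None
    first = results[0]  # Results are ordered by finishing position
    driver = first.get("Driver", {})
    name = f"{driver.get('givenName', '')} {driver.get('familyName', '')}".strip()
    return {
        "race": races[0].get("raceName", ""),
        "winner": name,
        "constructor": first.get("Constructor", {}).get("name", ""),
    }


# Placeholder payload (invented values, same key layout as the collector expects)
sample = {"MRData": {"RaceTable": {"Races": [{
    "raceName": "Example Grand Prix",
    "Results": [{
        "Driver": {"givenName": "Ada", "familyName": "Example"},
        "Constructor": {"name": "Example Racing"},
    }],
}]}}}

print(extract_winner(sample))
print(extract_winner({"MRData": {}}))  # Missing data -> None
```

Keeping the parsing in a pure function like this makes it possible to test it without calling the API.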

In [None]:
# Create data directory
!mkdir -p data

# Jolpica F1 API client
class QuickF1DataCollector:
    """Simplified F1 data collector for notebook use."""
    
    BASE_URL = "https://api.jolpi.ca/ergast/f1"
    
    def __init__(self):
        self.session = requests.Session()
        self.session.headers.update({
            'User-Agent': 'gemma-f1-expert-colab/1.0'
        })
    
    def get_recent_data(self, years=3):
        """Get race winners for the current season and the previous `years` seasons."""
        current_year = datetime.now().year
        start_year = current_year - years
        
        data = []
        
        for year in range(start_year, current_year + 1):
            print(f"Collecting {year} data...")
            
            try:
                # Get season races
                time.sleep(0.2)  # Rate limiting
                races_url = f"{self.BASE_URL}/{year}.json"
                races_resp = self.session.get(races_url, timeout=10)
                races_resp.raise_for_status()  # Surface HTTP errors instead of parsing an error page
                races_data = races_resp.json()
                
                races = races_data.get("MRData", {}).get("RaceTable", {}).get("Races", [])
                
                for race in races[:5]:  # Limit to first 5 races per year for speed
                    round_num = race["round"]
                    race_name = race["raceName"]
                    
                    # Get race results
                    time.sleep(0.2)
                    results_url = f"{self.BASE_URL}/{year}/{round_num}/results.json"
                    results_resp = self.session.get(results_url, timeout=10)
                    results_resp.raise_for_status()  # Surface HTTP errors instead of parsing an error page
                    results_data = results_resp.json()
                    
                    race_results = results_data.get("MRData", {}).get("RaceTable", {}).get("Races", [])
                    
                    if race_results and race_results[0].get("Results"):
                        winner = race_results[0]["Results"][0]
                        driver_name = f"{winner['Driver']['givenName']} {winner['Driver']['familyName']}"
                        constructor = winner["Constructor"]["name"]
                        
                        data.append({
                            "year": year,
                            "race": race_name,
                            "winner": driver_name,
                            "constructor": constructor,
                            "round": round_num
                        })
                
            except Exception as e:
                print(f"Error collecting {year} data: {e}")
                continue
        
        return data

# Collect F1 data
print("🏎️  Collecting F1 race data...")
collector = QuickF1DataCollector()
f1_data = collector.get_recent_data(years=3)

print(f"✅ Collected {len(f1_data)} race results")

## 3️⃣ Dataset Creation

Now let's create question-answer pairs for training.
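
With a dataset this small, a duplicated or empty pair has an outsized effect on training. The following is a minimal sketch of a cleaning pass, assuming each pair is a dict with `question` and `answer` keys as in this notebook; `clean_qa_pairs` is a hypothetical helper, not part of any library.

```python
def clean_qa_pairs(pairs):
    """Drop pairs with empty fields and de-duplicate on the normalized question."""
    seen = set()
    cleaned = []
    for pair in pairs:
        question = pair.get("question", "").strip()
        answer = pair.get("answer", "").strip()
        if not question or not answer:
            continue  # Skip incomplete pairs
        key = " ".join(question.lower().split())  # Case- and whitespace-insensitive
        if key in seen:
            continue  # Keep only the first occurrence of each question
        seen.add(key)
        cleaned.append({**pair, "question": question, "answer": answer})
    return cleaned


raw = [
    {"question": "What is Formula 1?", "answer": "A racing series."},
    {"question": "what is  formula 1?", "answer": "Duplicate wording."},
    {"question": "How does DRS work?", "answer": "   "},
]
print(len(clean_qa_pairs(raw)))  # 1
```

Running a pass like this on the output of `create_f1_dataset` before formatting keeps the train/test split free of repeated questions.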

In [None]:
def create_f1_dataset(race_data):
    """Create Q-A dataset from race data and explanatory content."""
    qa_pairs = []
    
    # Factual questions from race data
    for race in race_data:
        year = race["year"]
        race_name = race["race"]
        winner = race["winner"]
        constructor = race["constructor"]
        
        qa_pairs.extend([
            {
                "question": f"Who won the {year} {race_name}?",
                "answer": f"{winner} won the {year} {race_name} driving for {constructor}.",
                "type": "factual",
                "category": "race_winner"
            },
            {
                "question": f"Which team won the {year} {race_name}?",
                "answer": f"{constructor} won the {year} {race_name} with {winner}.",
                "type": "factual",
                "category": "constructor_winner"
            }
        ])
    
    # Explanatory content
    explanatory_qa = [
        {
            "question": "How does DRS work in Formula 1?",
            "answer": "DRS (Drag Reduction System) is a movable rear wing flap that drivers can activate to reduce aerodynamic drag. It can only be used in designated DRS zones when a driver is within one second of the car ahead during races. When activated, the rear wing opens, reducing downforce and allowing higher straight-line speeds for overtaking. DRS is automatically disabled in wet conditions for safety.",
            "type": "explanatory",
            "category": "technical"
        },
        {
            "question": "What are the different F1 tyre compounds?",
            "answer": "F1 uses three dry tyre compounds per weekend: soft (red sidewall), medium (yellow sidewall), and hard (white sidewall). Softer compounds provide more grip but degrade faster, while harder compounds last longer but offer less grip. Teams must use at least two different compounds during a race. There are also intermediate (green) and wet (blue) tyres for rain conditions.",
            "type": "explanatory",
            "category": "technical"
        },
        {
            "question": "How does F1 qualifying work?",
            "answer": "F1 qualifying consists of three knockout sessions: Q1 (18 minutes), Q2 (15 minutes), and Q3 (12 minutes). With a 20-car grid, the five slowest drivers are eliminated in Q1 and another five in Q2. Q3 features the remaining 10 drivers competing for pole position. Since 2022, drivers who reach Q3 are free to choose their starting tyres.",
            "type": "explanatory",
            "category": "format"
        },
        {
            "question": "What is the current F1 points system?",
            "answer": "The current F1 points system awards points to the top 10 finishers: 25-18-15-12-10-8-6-4-2-1 points for positions 1st through 10th respectively. From 2019 to 2024, an additional point was awarded for the fastest lap if the driver finished in the top 10; that bonus point was dropped from 2025. Both the Drivers' Championship and Constructors' Championship use this system.",
            "type": "explanatory",
            "category": "rules"
        },
        {
            "question": "What happens during an F1 pit stop?",
            "answer": "During a pit stop, teams can change tyres, adjust front wing angles, and make minor repairs. A typical pit stop takes 2-3 seconds for a tyre change. Teams can make unlimited pit stops during a race, but drivers must use at least two different tyre compounds. Pit stops are crucial for strategy, timing them to minimize time loss while gaining track position or fresh tyres.",
            "type": "explanatory",
            "category": "strategy"
        },
        {
            "question": "How do F1 safety cars work?",
            "answer": "Safety cars are deployed when there's a hazard on track that requires marshals to work safely. All drivers must slow down and follow the safety car in single file, with no overtaking allowed. This bunches up the field and allows safe track clearing. Racing resumes when the safety car returns to the pits. Virtual Safety Cars (VSC) require drivers to maintain specific lap times without a physical safety car.",
            "type": "explanatory",
            "category": "safety"
        },
        {
            "question": "Who is Lewis Hamilton?",
            "answer": "Lewis Hamilton is a British Formula 1 driver and seven-time World Champion, tied for the most championships in F1 history with Michael Schumacher. He holds the records for most race wins, pole positions, and podium finishes. Hamilton made his F1 debut in 2007 with McLaren, won his first championship in 2008, drove for Mercedes from 2013 to 2024, and joined Ferrari for the 2025 season.",
            "type": "factual",
            "category": "driver_info"
        },
        {
            "question": "What is the Monaco Grand Prix?",
            "answer": "The Monaco Grand Prix is one of Formula 1's most prestigious races, held annually on the streets of Monte Carlo in Monaco. The circuit is known for its tight, twisty layout with minimal overtaking opportunities, making qualifying position crucial. It's part of the Triple Crown of Motorsport along with the Indianapolis 500 and 24 Hours of Le Mans.",
            "type": "explanatory",
            "category": "circuit_info"
        },
        {
            "question": "How long is an F1 race?",
            "answer": "An F1 race runs for the smallest number of laps that exceeds 305 km (about 260 km at Monaco), which typically takes about an hour and a half. Racing time is capped at 2 hours, and a race interrupted by weather or incidents may be shortened. Full points are awarded only if at least 75% of the scheduled distance is completed.",
            "type": "explanatory",
            "category": "rules"
        },
        {
            "question": "What is Formula 1?",
            "answer": "Formula 1 is the highest class of international auto racing for single-seater formula racing cars. It features a series of races called Grands Prix held on purpose-built circuits and closed public roads around the world. F1 cars are the fastest regulated road-course racing cars in the world, featuring advanced aerodynamics, hybrid power units, and cutting-edge technology.",
            "type": "explanatory",
            "category": "general"
        }
    ]
    
    qa_pairs.extend(explanatory_qa)
    
    # Add metadata
    for i, qa in enumerate(qa_pairs):
        qa["id"] = f"f1_qa_{i:04d}"
        qa["created_at"] = datetime.now().isoformat()
    
    return qa_pairs

# Create dataset
print("📊 Creating Q-A dataset...")
qa_dataset = create_f1_dataset(f1_data)

print(f"✅ Created {len(qa_dataset)} Q-A pairs")

# Show examples
print("\n📝 Example Q-A pairs:")
for i, qa in enumerate(qa_dataset[:3]):
    print(f"\n{i+1}. [{qa['type'].upper()}] {qa['question']}")
    print(f"   Answer: {qa['answer'][:100]}...")

In [None]:
# Format dataset for training
def format_for_training(qa_pairs):
    """Format Q-A pairs for instruction tuning."""
    formatted_data = []
    
    instruction_template = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
You are an expert Formula 1 assistant. Answer the following question accurately and concisely.

{question}

### Response:
{answer}"""
    
    for qa in qa_pairs:
        formatted_text = instruction_template.format(
            question=qa["question"],
            answer=qa["answer"]
        )
        formatted_data.append({"text": formatted_text})
    
    return formatted_data

# Format data
formatted_data = format_for_training(qa_dataset)

# Create train/test split
random.seed(42)  # Fix the seed so the split is reproducible
random.shuffle(formatted_data)
split_idx = int(len(formatted_data) * 0.9)  # 90% train, 10% test

train_data = formatted_data[:split_idx]
test_data = formatted_data[split_idx:]

print(f"📈 Training examples: {len(train_data)}")
print(f"📉 Test examples: {len(test_data)}")

# Convert to HuggingFace Dataset
train_dataset = Dataset.from_list(train_data)
test_dataset = Dataset.from_list(test_data)

print("✅ Dataset formatted for training")

## 4️⃣ Model Setup

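Load the base model and configure LoRA adapters. The cell below loads `google/gemma-2-2b`, a smaller Gemma checkpoint chosen to fit Colab memory; set `model_name` to a Gemma-3 checkpoint to match the notebook title.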
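
To get a feel for why LoRA is cheap, recall that for a frozen weight matrix of shape `d_out × d_in` it trains two small matrices of shapes `r × d_in` and `d_out × r`, and scales their product by `alpha / r`. The sketch below does that arithmetic. The layer shapes are hypothetical round numbers chosen for illustration, not the actual dimensions of any Gemma checkpoint.

```python
def lora_param_count(d_in, d_out, r):
    """Trainable parameters LoRA adds to one d_out x d_in linear layer."""
    return r * (d_in + d_out)


def lora_scale(alpha, r):
    """Scaling applied to the low-rank update in standard LoRA."""
    return alpha / r


# Hypothetical layer shapes as (d_in, d_out); real values depend on the model.
shapes = {"q_proj": (2048, 2048), "down_proj": (8192, 2048)}

for name, (d_in, d_out) in shapes.items():
    added = lora_param_count(d_in, d_out, r=8)
    full = d_in * d_out
    print(f"{name}: +{added:,} trainable vs {full:,} frozen ({added / full:.2%})")

print(f"scale = {lora_scale(alpha=16, r=8)}")  # 2.0 for the settings used here
```

Doubling `r` doubles the adapter size, so a higher rank trades memory for capacity.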

In [None]:
# Model configuration
model_name = "google/gemma-2-2b"  # Gemma 2 (2B), small enough for Colab; swap in another Gemma checkpoint if desired
max_seq_length = 512  # Sequence length

print(f"🤖 Loading model: {model_name}")

# Load model with Unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    dtype=None,  # Auto-detect
    load_in_4bit=True,  # 4-bit quantization for memory efficiency
)

print("✅ Base model loaded")

# Configure LoRA
print("⚙️  Configuring LoRA adapters...")

model = FastLanguageModel.get_peft_model(
    model,
    r=8,  # LoRA rank (a higher rank adds trainable parameters and memory)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,  # LoRA scaling
    lora_dropout=0.1,  # LoRA dropout
    bias="none",  # No bias training
    use_gradient_checkpointing="unsloth",  # Memory optimization
    random_state=42,
)

print("✅ LoRA adapters configured")
print(f"📊 Peak GPU memory allocated so far: ~{torch.cuda.max_memory_allocated() / 1024**3:.1f} GB")

## 5️⃣ Training

Fine-tune the model with our F1 dataset.
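
The number of optimizer steps follows from the dataset size, the per-device batch size, and gradient accumulation. The sketch below works through that arithmetic. The figure of 45 training examples is an assumed value for illustration, and `training_schedule` is a hypothetical helper; the exact count reported by the trainer can differ slightly depending on how it rounds the last partial batch.

```python
import math


def training_schedule(num_examples, per_device_batch_size, grad_accum_steps,
                      epochs, num_devices=1):
    """Estimate effective batch size and optimizer step counts."""
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * epochs,
    }


# Assumed dataset size; batch settings match the configuration in this section.
plan = training_schedule(num_examples=45, per_device_batch_size=4,
                         grad_accum_steps=2, epochs=3)
print(plan)  # {'effective_batch': 8, 'steps_per_epoch': 6, 'total_steps': 18}
```

Comparing `total_steps` with the warmup and checkpoint intervals is a quick sanity check: a warmup longer than the whole run means the learning rate never reaches its configured peak.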

In [None]:
# Training configuration
training_args = TrainingArguments(
    per_device_train_batch_size=4,      # Batch size
    gradient_accumulation_steps=2,       # Gradient accumulation
    warmup_ratio=0.1,                    # Warm up over the first 10% of steps (the dataset is small)
    num_train_epochs=3,                  # Number of epochs
    learning_rate=2e-4,                  # Learning rate
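    fp16=torch.cuda.is_available() and torch.cuda.get_device_capability()[0] < 8,   # FP16 on pre-Ampere GPUs such as the T4
    bf16=torch.cuda.is_available() and torch.cuda.get_device_capability()[0] >= 8,  # BF16 on Ampere or newer GPUs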
    logging_steps=5,                     # Log every N steps
    optim="adamw_8bit",                  # 8-bit optimizer
    weight_decay=0.01,                   # Weight decay
    lr_scheduler_type="cosine",          # Learning rate scheduler
    seed=42,                             # Random seed
    output_dir="./f1_model_output",      # Output directory
    save_steps=50,                       # Save every N steps
    save_total_limit=2,                  # Keep only 2 checkpoints
    report_to="none",                    # Disable logging to external services
    remove_unused_columns=False,
)

# Create trainer
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=1,
    packing=False,
    args=training_args,
)

print("✅ Trainer configured")
effective_batch = training_args.per_device_train_batch_size * training_args.gradient_accumulation_steps
steps_per_epoch = -(-len(train_dataset) // effective_batch)  # Ceiling division
print(f"📊 Training steps per epoch: {steps_per_epoch}")
print(f"⏱️  Estimated training time: ~15-25 minutes")

In [None]:
# Start training
print("🚀 Starting training...")
print("📈 Training progress will be shown below:")

start_time = time.time()

# Train the model
trainer.train()

end_time = time.time()
training_duration = (end_time - start_time) / 60  # Convert to minutes

print(f"\n✅ Training completed in {training_duration:.1f} minutes!")
print(f"🎉 Model fine-tuned successfully!")

## 6️⃣ Testing & Evaluation

Let's test our fine-tuned model with some F1 questions.
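
Reading generated answers by eye is a useful first check, but a number makes runs comparable. One option, sketched here as an assumption rather than the notebook's own method, is a token-overlap F1 score between a generated answer and the reference answer from the held-out split. It is a crude metric that rewards shared words and ignores meaning; the example strings are placeholders.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall (0.0 to 1.0)."""
    pred, ref = tokenize(prediction), tokenize(reference)
    if not pred or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


score = token_f1("Driver A won", "Driver A won the race")
print(f"{score:.2f}")  # 0.75
```

Averaging this score over the held-out test examples gives a rough before/after comparison between the base and fine-tuned model.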

In [None]:
# Switch Unsloth to its inference mode before generating
FastLanguageModel.for_inference(model)

# Test function
def test_f1_model(question):
    """Test the model with a question."""
    prompt = f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
You are an expert Formula 1 assistant. Answer the following question accurately and concisely.

{question}

### Response:
"""
    
    # Tokenize
    inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
    
    # Generate
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=128,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )
    
    # Decode response
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    
    # Extract answer
    if "### Response:" in response:
        answer = response.split("### Response:")[1].strip()
    else:
        answer = response.strip()
    
    return answer

# Test questions
test_questions = [
    "What is Formula 1?",
    "How does DRS work in Formula 1?",
    "What are F1 tyre compounds?",
    "Who is Lewis Hamilton?",
    "How does F1 qualifying work?"
]

print("🧪 Testing F1 Expert Model")
print("=" * 50)

for i, question in enumerate(test_questions, 1):
    print(f"\n{i}. 🏎️  Question: {question}")
    answer = test_f1_model(question)
    print(f"   🤖 Answer: {answer}")
    print("-" * 50)

In [None]:
# Interactive testing
print("🎮 Interactive Testing - Ask your own F1 questions!")
print("(Leave empty and run to skip this section)")

# You can modify this cell to test with your own questions
custom_questions = [
    # Add your questions here, for example:
    # "Who won the 2023 Monaco Grand Prix?",
    # "What is the current F1 points system?",
]

if custom_questions:
    for question in custom_questions:
        if question.strip():
            print(f"\n🏎️  Your Question: {question}")
            answer = test_f1_model(question)
            print(f"🤖 F1 Expert: {answer}")
            print("-" * 40)
else:
    print("No custom questions provided. Skipping interactive testing.")

## 7️⃣ Model Saving

Save the trained model for later use.
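
After saving, it is worth confirming that the adapter files actually landed in the output directory before zipping or uploading it. The sketch below checks for `adapter_config.json` and `adapter_model.safetensors`, which is what PEFT typically writes for a LoRA adapter; treat those file names as an assumption and adjust them if your library version saves a different format. `missing_adapter_files` is a hypothetical helper, demonstrated here on a temporary directory.

```python
import tempfile
from pathlib import Path

# Assumed file names for a saved LoRA adapter; adjust if your setup differs.
EXPECTED_FILES = ("adapter_config.json", "adapter_model.safetensors")


def missing_adapter_files(output_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in expected if not (root / name).is_file()]


# Demonstration on a temporary directory instead of a real model folder.
with tempfile.TemporaryDirectory() as tmp:
    before = missing_adapter_files(tmp)
    (Path(tmp) / "adapter_config.json").write_text("{}")
    after = missing_adapter_files(tmp)

print(before)  # ['adapter_config.json', 'adapter_model.safetensors']
print(after)   # ['adapter_model.safetensors']
```

In the notebook, calling `missing_adapter_files("./gemma_f1_expert")` after the save cell should return an empty list.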

In [None]:
# Save model and tokenizer
output_dir = "./gemma_f1_expert"

print(f"💾 Saving model to: {output_dir}")

# Save LoRA adapters and tokenizer
model.save_pretrained(output_dir)
tokenizer.save_pretrained(output_dir)

# Save training metadata
metadata = {
    "base_model": model_name,
    "lora_rank": 8,
    "lora_alpha": 16,
    "training_examples": len(train_dataset),
    "training_time_minutes": training_duration,
    "training_date": datetime.now().isoformat(),
    "framework": "unsloth",
    "description": "F1 expert model trained on racing Q-A data"
}

with open(f"{output_dir}/training_metadata.json", 'w') as f:
    json.dump(metadata, f, indent=2)

print("✅ Model saved successfully!")
print(f"📁 Model location: {output_dir}")
print(f"📊 Model size: ~{sum(f.stat().st_size for f in Path(output_dir).glob('**/*') if f.is_file()) / 1024**2:.1f} MB")

In [None]:
# Optional: Download model files
print("📥 Creating downloadable model archive...")

# Create a zip file of the model
!zip -r gemma_f1_expert.zip ./gemma_f1_expert/

print("✅ Model archive created: gemma_f1_expert.zip")
print("💡 You can download this file from the Colab file browser")

# Show file sizes
!ls -lh gemma_f1_expert.zip
!ls -lh ./gemma_f1_expert/

## 8️⃣ Optional: Upload to Hugging Face Hub

Upload your model to Hugging Face Hub for easy sharing and deployment.

In [None]:
# Optional: Upload to Hugging Face Hub
# Uncomment and modify the following code to upload your model

"""
# Install huggingface_hub
!pip install -q huggingface_hub

from huggingface_hub import notebook_login

# Login to Hugging Face (you'll need to enter your token)
notebook_login()

# Upload model
hub_model_id = "your-username/gemma-f1-expert"  # Change this to your desired model name

model.push_to_hub(hub_model_id, token=True)
tokenizer.push_to_hub(hub_model_id, token=True)

print(f"✅ Model uploaded to: https://huggingface.co/{hub_model_id}")
"""

print("💡 To upload to Hugging Face Hub:")
print("1. Uncomment the code above")
print("2. Change 'your-username/gemma-f1-expert' to your desired model name")
print("3. Run the cell and follow the login prompts")

## 🎉 Congratulations!

You've successfully fine-tuned a Gemma model into an F1 expert! Here's what you've accomplished:

### ✅ What you built:
- **F1 Data Collection**: Gathered race results from Jolpica API
- **Custom Dataset**: Created up to 50 F1 question-answer pairs (two per collected race plus ten explanatory entries)
- **Model Fine-tuning**: Applied LoRA to a Gemma base model for F1 expertise
- **Testing & Validation**: Spot-checked model answers on F1 questions
- **Model Export**: Saved model for deployment

### 🚀 Next Steps:
1. **Download your model** from the file browser
2. **Test with more questions** to evaluate performance
3. **Deploy locally** using the CLI script from the repository
4. **Build a web app** with Streamlit for interactive use
5. **Expand the dataset** with more F1 data for better performance

### 📚 Key Learnings:
- LoRA enables efficient fine-tuning on limited hardware
- Quality dataset creation is crucial for good model performance
- Instruction tuning format works well for Q-A tasks
- Unsloth significantly speeds up training and reduces memory usage

### 🔗 Resources:
- [Unsloth Documentation](https://github.com/unslothai/unsloth)
- [Jolpica-F1 API](https://jolpi.ca/)
- [Hugging Face Transformers](https://huggingface.co/docs/transformers)
- [PEFT Library](https://huggingface.co/docs/peft)

---
🏎️ **Happy Racing with AI!** 🏁