# 🚀 Google Colab Setup for LLaMA Insurance Fine-tuning

This notebook sets up the complete environment for fine-tuning LLaMA models on insurance data.

## What this notebook does:
1. Checks GPU availability and system RAM
2. Installs all required dependencies
3. Sets up authentication (Hugging Face, W&B)
4. Mounts Google Drive for persistence
5. Clones the project repository
6. Configures Git
7. Verifies the setup
8. Prints an environment summary
9. Defines helper functions

**⚠️ Important: Make sure to enable GPU in Runtime > Change runtime type**

## 1. Check GPU Availability

In [None]:
import torch
import subprocess

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU device: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
else:
    print("⚠️ No GPU detected! Please enable GPU in Runtime > Change runtime type")

# Check available RAM
result = subprocess.run(['free', '-h'], capture_output=True, text=True)
print("\nSystem RAM:")
print(result.stdout)
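
As a rough guide to reading the GPU memory figure printed above, the sketch below estimates how much memory the model weights alone occupy at different precisions. The function name `weight_memory_gb` and the round figure of 7e9 parameters are illustrative assumptions, and the estimate ignores activations, gradients, optimizer state, and framework overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: roughly 7 billion parameters for a "7B" model.
NUM_PARAMS = 7e9

for label, bits in [("fp32", 32), ("fp16/bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>9}: {weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

Comparing these numbers with the total GPU memory gives a first indication of whether quantized loading is likely to be needed on the assigned Colab GPU.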

## 2. Install Dependencies

Install the packages required for LLaMA fine-tuning with LoRA/PEFT.

In [None]:
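# Install core ML libraries
# Version specifiers are quoted: an unquoted `>` is treated by the shell as output redirection
# Note: Colab ships with PyTorch preinstalled, so pip may report torch as already satisfied
!pip install -q torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
!pip install -q "transformers>=4.36.0"
!pip install -q "accelerate>=0.24.0"
!pip install -q "datasets>=2.14.0"
!pip install -q "peft>=0.7.0"
!pip install -q "bitsandbytes>=0.41.0"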

In [None]:
# Install training and evaluation tools (specifiers quoted so `>` is not parsed as redirection)
!pip install -q "wandb>=0.16.0"
!pip install -q "tensorboard>=2.15.0"
!pip install -q "scikit-learn>=1.3.0"
!pip install -q "rouge-score>=0.1.2"
!pip install -q "nltk>=3.8.1"
!pip install -q "evaluate>=0.4.0"

In [None]:
# Install data processing utilities (specifiers quoted so `>` is not parsed as redirection)
!pip install -q "pandas>=2.0.0"
!pip install -q "numpy>=1.24.0"
!pip install -q "regex>=2023.10.3"
!pip install -q "tqdm>=4.66.0"
!pip install -q "huggingface-hub>=0.19.0"
!pip install -q "safetensors>=0.4.0"
!pip install -q "ipywidgets>=8.0.0"

In [None]:
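# Optional: Install flash attention (may not work on all Colab instances)
# `!pip` does not raise a Python exception on failure, so run pip via subprocess
# and check the exit code instead of wrapping the call in try/except
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-m", "pip", "install", "-q", "flash-attn>=2.3.0", "--no-build-isolation"]
)
if result.returncode == 0:
    print("✅ Flash Attention installed successfully")
else:
    print("⚠️ Flash Attention installation failed (this is optional)")

# Install additional tokenization support
!pip install -q "sentencepiece>=0.1.99"
!pip install -q "protobuf>=4.25.0"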
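
Version specifiers such as `>=` contain characters the shell interprets, so they need quoting when passed to `pip` on a command line. A minimal sketch of building a safely quoted command string follows; `pip_install_command` is a hypothetical helper written for illustration, not part of pip.

```python
import shlex


def pip_install_command(requirements, quiet=True):
    """Return a shell-safe `pip install` command string for the given requirement specifiers."""
    parts = ["pip", "install"]
    if quiet:
        parts.append("-q")
    parts.extend(requirements)
    # shlex.join quotes any argument containing shell metacharacters such as `>`
    return shlex.join(parts)


print(pip_install_command(["transformers>=4.36.0", "peft>=0.7.0"]))
```

Without the quotes, a shell reads `transformers>=4.36.0` as "install `transformers` and redirect output to a file named `=4.36.0`", so the version constraint is silently dropped.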

## 3. Authentication Setup

Set up authentication for Hugging Face and Weights & Biases.

In [None]:
# Hugging Face Authentication
from huggingface_hub import login

print("🤗 Hugging Face Authentication")
print("Please enter your Hugging Face token (get one from https://huggingface.co/settings/tokens)")
print("This is required to download LLaMA models.")

try:
    login()
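    # In a notebook, login() displays a token widget and may return before the token is submitted
    print("✅ Hugging Face login prompt displayed; authentication completes once you submit your token")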
except Exception as e:
    print(f"❌ Authentication failed: {e}")
    print("You can also set the HF_TOKEN environment variable")

In [None]:
# Weights & Biases Authentication (optional but recommended)
import wandb

print("📊 Weights & Biases Setup (optional for experiment tracking)")
print("Get your API key from https://wandb.ai/settings")

try:
    wandb.login()
    print("✅ Successfully authenticated with W&B")
except Exception as e:
    print(f"⚠️ W&B authentication failed (optional): {e}")
    print("You can skip this or authenticate later")
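
The `HF_TOKEN` environment variable mentioned above can be used for a non-interactive login. The sketch below shows one way to do that; `resolve_hf_token` and `login_from_env` are hypothetical helper names, and the only library call assumed is `huggingface_hub.login(token=...)`.

```python
import os


def resolve_hf_token(env=None):
    """Return the token from HF_TOKEN, or None if it is unset or blank."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    return token or None


def login_from_env():
    """Log in non-interactively when HF_TOKEN is set; return True if a login was attempted."""
    token = resolve_hf_token()
    if token is None:
        return False
    # Imported lazily so the token lookup works even without huggingface_hub installed
    from huggingface_hub import login

    login(token=token)
    return True
```

Calling `login_from_env()` first and falling back to the interactive `login()` only when it returns `False` avoids the token prompt on reruns.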

## 4. Mount Google Drive

Mount Google Drive to persist models and data across sessions.

In [None]:
from google.colab import drive
import os

# Mount Google Drive
drive.mount('/content/drive')

# Create project directory in Drive if it doesn't exist
project_drive_path = '/content/drive/MyDrive/llama-insurance-finetune'
os.makedirs(project_drive_path, exist_ok=True)
os.makedirs(f'{project_drive_path}/models', exist_ok=True)
os.makedirs(f'{project_drive_path}/data_backup', exist_ok=True)

print(f"✅ Google Drive mounted and project directory created at: {project_drive_path}")
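
The directory setup above can also be wrapped in a small reusable function. This is a sketch using `pathlib`; `ensure_project_dirs` is a hypothetical name, and the default subdirectories simply mirror the ones created in the cell above.

```python
from pathlib import Path


def ensure_project_dirs(root, subdirs=("models", "data_backup")):
    """Create the project root and its subdirectories; return the subdirectory paths."""
    root = Path(root)
    # parents=True creates missing ancestors; exist_ok=True makes reruns safe
    root.mkdir(parents=True, exist_ok=True)
    paths = [root / name for name in subdirs]
    for path in paths:
        path.mkdir(exist_ok=True)
    return paths
```

For example, `ensure_project_dirs('/content/drive/MyDrive/llama-insurance-finetune')` would recreate the layout used in this notebook once Drive is mounted.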

## 5. Clone Project Repository

Clone the project repository from GitHub.

In [None]:
import os

# Change to content directory
os.chdir('/content')

# Clone repository (replace with your actual repository URL)
repo_url = "https://github.com/franklinmarkley/llama-insurance-finetune.git"

if os.path.exists('llama-insurance-finetune'):
    print("Repository already exists, pulling latest changes...")
    !cd llama-insurance-finetune && git pull
else:
    print(f"Cloning repository from {repo_url}...")
    !git clone {repo_url}

# Change to project directory
os.chdir('/content/llama-insurance-finetune')
print(f"✅ Working directory: {os.getcwd()}")

# List project contents
print("\nProject structure:")
!ls -la
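
The clone-or-pull decision above can also be expressed in plain Python, which is easier to reuse outside a notebook. The sketch below only builds the git command; `repo_sync_command` is a hypothetical helper, and checking for a `.git` directory (rather than just the folder) is a design choice of this sketch, not something the cell above does.

```python
import os


def repo_sync_command(repo_url, dest):
    """Return the git command (as an argument list) that brings `dest` up to date."""
    if os.path.isdir(os.path.join(dest, ".git")):
        # Existing checkout: fast-forward only, so local commits are never silently merged
        return ["git", "-C", dest, "pull", "--ff-only"]
    return ["git", "clone", repo_url, dest]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which raises if git exits with an error instead of continuing silently.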

## 6. Configure Git (for pushing changes back)

In [None]:
# Configure Git (replace with your information)
git_username = "Your Name"  # Replace with your name
git_email = "your.email@example.com"  # Replace with your email

!git config --global user.name "{git_username}"
!git config --global user.email "{git_email}"

print("Git configuration:")
!git config --list | grep user

## 7. Verify Installation

Test that all key components are working correctly.

In [None]:
# Test core imports
try:
    import torch
    import transformers
    import accelerate
    import datasets
    import peft
    import bitsandbytes
    import wandb
    import pandas as pd
    import numpy as np
    
    print("✅ All core libraries imported successfully")
    
    # Print versions
    print(f"\nLibrary versions:")
    print(f"PyTorch: {torch.__version__}")
    print(f"Transformers: {transformers.__version__}")
    print(f"Accelerate: {accelerate.__version__}")
    print(f"Datasets: {datasets.__version__}")
    print(f"PEFT: {peft.__version__}")
    
except ImportError as e:
    print(f"❌ Import failed: {e}")
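
Installed versions can also be read without importing each library, which is faster and avoids import-time side effects. The sketch below uses the standard library's `importlib.metadata`; `installed_versions` is a hypothetical helper, and the names passed to it are pip distribution names, which can differ from import names.

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if it is not installed."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(["torch", "transformers", "peft", "bitsandbytes"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Note that this only confirms a package is installed; the import test above is still needed to catch broken binaries or missing CUDA libraries.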

In [None]:
# Test model access (this will download tokenizer only)
from transformers import AutoTokenizer

model_name = "meta-llama/Llama-2-7b-chat-hf"

try:
    print(f"Testing access to {model_name}...")
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    print(f"✅ Successfully loaded tokenizer for {model_name}")
    print(f"Vocabulary size: {tokenizer.vocab_size}")
    
    # Test tokenization
    test_text = "This is a test of insurance policy text."
    tokens = tokenizer.encode(test_text)
    print(f"Test tokenization: '{test_text}' -> {len(tokens)} tokens")
    
except Exception as e:
    print(f"❌ Model access failed: {e}")
    print("Make sure you have access to LLaMA models and are authenticated with Hugging Face")

## 8. Environment Summary

In [None]:
import sys
import os
from datetime import datetime

import torch  # imported here so this cell also runs on its own

print("🎯 LLaMA Insurance Fine-tuning Environment Summary")
print("=" * 55)
print(f"Date: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print(f"Python version: {sys.version.split()[0]}")
print(f"Working directory: {os.getcwd()}")
print(f"GPU available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU device: {torch.cuda.get_device_name(0)}")
# Check for MyDrive rather than the mount point, which can exist without an active mount
print(f"Google Drive mounted: {os.path.isdir('/content/drive/MyDrive')}")

# Check if config files exist
config_files = [
    'config/lora_config.json',
    'config/training_args.json',
    'config/model_card.md'
]

print("\nProject files:")
for file in config_files:
    exists = "✅" if os.path.exists(file) else "❌"
    print(f"{exists} {file}")

print("\n🚀 Setup complete! You can now proceed to the next notebook.")
print("\nNext steps:")
print("1. 01_data_preprocessing.ipynb - Process your insurance data")
print("2. 02_tokenization.ipynb - Prepare data for training")
print("3. 03_finetuning_lora.ipynb - Fine-tune the model")
print("4. 04_evaluation.ipynb - Evaluate model performance")
print("5. 05_inference_demo.ipynb - Test the trained model")

## 9. Helper Functions

Useful functions for managing the environment throughout the project.

In [None]:
import gc
import json
import os
import shutil
from datetime import datetime

import torch

def clear_gpu_memory():
    """Clear GPU memory cache"""
    if torch.cuda.is_available():
        # Collect garbage first so unreferenced tensors can be released from the cache
        gc.collect()
        torch.cuda.empty_cache()
        print("✅ GPU memory cleared")
    else:
        print("⚠️ No GPU available")

def check_gpu_memory():
    """Check current GPU memory usage"""
    if torch.cuda.is_available():
        allocated = torch.cuda.memory_allocated(0) / 1e9
        reserved = torch.cuda.memory_reserved(0) / 1e9
        total = torch.cuda.get_device_properties(0).total_memory / 1e9

        print("GPU Memory:")
        print(f"  Allocated: {allocated:.1f} GB")
        print(f"  Reserved:  {reserved:.1f} GB")
        print(f"  Total:     {total:.1f} GB")
        print(f"  Free:      {total - reserved:.1f} GB")
    else:
        print("⚠️ No GPU available")

def backup_to_drive(source_path, drive_backup_path=None):
    """Backup a file or directory to Google Drive"""
    if drive_backup_path is None:
        drive_backup_path = f"/content/drive/MyDrive/llama-insurance-finetune/backup_{datetime.now().strftime('%Y%m%d_%H%M%S')}"

    os.makedirs(drive_backup_path, exist_ok=True)
    # Copy the source into the backup directory under its own name (like `cp -r`),
    # using shutil so paths containing spaces are handled correctly
    destination = os.path.join(drive_backup_path, os.path.basename(os.path.normpath(source_path)))
    if os.path.isdir(source_path):
        shutil.copytree(source_path, destination, dirs_exist_ok=True)
    else:
        shutil.copy2(source_path, destination)
    print(f"✅ Backup created at: {drive_backup_path}")

def load_config(config_path):
    """Load JSON configuration file"""
    with open(config_path, 'r') as f:
        return json.load(f)

def save_config(config, config_path):
    """Save configuration to JSON file"""
    with open(config_path, 'w') as f:
        json.dump(config, f, indent=2)
    print(f"✅ Configuration saved to: {config_path}")

# List the available helper functions
print("🛠️ Helper functions loaded:")
print("- clear_gpu_memory()")
print("- check_gpu_memory()")
print("- backup_to_drive(source_path)")
print("- load_config(config_path)")
print("- save_config(config, config_path)")

# Check current GPU memory
check_gpu_memory()
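
A quick way to confirm the config helpers behave as expected is a save/load round trip. The sketch below re-declares the two functions so it is self-contained and writes to a temporary directory; the LoRA keys and values are hypothetical placeholders, not the project's actual settings.

```python
import json
import os
import tempfile


def save_config(config, config_path):
    """Save configuration to JSON file"""
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)


def load_config(config_path):
    """Load JSON configuration file"""
    with open(config_path, "r") as f:
        return json.load(f)


# Hypothetical example values, for illustration only
example = {"r": 16, "lora_alpha": 32, "lora_dropout": 0.05}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lora_config.json")
    save_config(example, path)
    restored = load_config(path)

print("Round trip preserved config:", restored == example)
```

JSON only supports strings, numbers, booleans, lists, dicts, and null, so values such as tuples or `Path` objects would not survive this round trip unchanged.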