# 🚀 Financial Model Fine-Tuning in Google Colab

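Fine-tune a language model on financial datasets using Unsloth's optimized framework. The project targets ServiceNow/Apriel-Nemotron-15B-Thinker; the training script in this notebook uses the smaller `unsloth/Mistral-7B-Instruct-v0.3-bnb-4bit` so that it fits in Colab GPU memory.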

**⚠️ Important Notes:**
- This notebook requires Colab Pro/Pro+ for sufficient GPU memory
- A100 GPU recommended (V100 may work with smaller batch sizes)
- Training time: 8-24 hours depending on GPU
- Disk usage: ~50GB (model plus datasets)

## 📋 Requirements
- Colab Pro/Pro+ subscription
- A100 GPU (preferably)
- High RAM runtime
- ~100GB storage space
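
As a rough sanity check on the GPU requirement, the memory taken by the model weights alone can be estimated from the parameter count and the bits stored per parameter. This is a back-of-the-envelope sketch: the helper name `estimate_weight_memory_gib` and the round parameter counts are illustrative assumptions, and the figure excludes activations, LoRA adapters, optimizer state, and quantization overhead.

```python
def estimate_weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1024**3


# Round parameter counts, used only for illustration.
for name, params in [("15B model", 15e9), ("7B model", 7e9)]:
    in_16_bit = estimate_weight_memory_gib(params, 16)
    in_4_bit = estimate_weight_memory_gib(params, 4)
    print(f"{name}: ~{in_16_bit:.1f} GiB in 16-bit, ~{in_4_bit:.1f} GiB in 4-bit")
```

Actual usage during training will be noticeably higher than these numbers, so treat them as a lower bound when choosing a runtime.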

## Step 1: Environment Setup

In [None]:
# Check GPU availability
!nvidia-smi

# Check available memory
!free -h

In [None]:
# Install system dependencies
!apt-get update && apt-get install -y \
    build-essential \
    git \
    python3-dev

In [None]:
# Clone the repository
!git clone https://github.com/strykesg/finetuner.git
%cd finetuner

In [None]:
# Install Python dependencies
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
!pip install -r requirements.txt

In [None]:
# Verify installation
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")

try:
    from unsloth import FastLanguageModel
    print("✅ Unsloth available")
except ImportError as e:
    print(f"❌ Unsloth error: {e}")

## Step 2: Dataset Preparation

In [None]:
# Create necessary directories
!mkdir -p datasets/alpaca datasets/sharegpt random

# Copy sample datasets (you can upload your own)
!cp datasets/alpaca/sample_alpaca.json random/
!cp datasets/sharegpt/sample_sharegpt.json random/

In [None]:
# Convert datasets
!python convert_datasets.py
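
The training script later reads `datasets/alpaca/*.jsonl`, so it can help to check the converted files before spending GPU time on them. The sketch below assumes Alpaca-style records with `instruction` and `output` keys; the helper `find_bad_records` is hypothetical and not part of the repository.

```python
import json

REQUIRED_KEYS = {"instruction", "output"}  # assumed Alpaca-style schema


def find_bad_records(path):
    """Return (line_number, problem) pairs for records that look unusable."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as e:
                problems.append((line_number, f"invalid JSON: {e.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((line_number, "not a JSON object"))
                continue
            missing = REQUIRED_KEYS - record.keys()
            if missing:
                problems.append((line_number, f"missing keys: {sorted(missing)}"))
    return problems
```

Calling `find_bad_records("datasets/alpaca/your_file.jsonl")` (the file name here is a placeholder) returns an empty list when every line parses and carries the assumed keys.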

## Step 3: Memory Optimization Check

In [None]:
# Check if we can load the model
try:
    from unsloth import FastLanguageModel
    import torch  # used by torch.cuda.empty_cache() below
    print("Testing model loading...")

    # Test with smaller model first
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/Mistral-7B-Instruct-v0.3-bnb-4bit",
        max_seq_length=2048,  # Smaller for testing
        dtype=None,
        load_in_4bit=True,
    )
    print("✅ Smaller model loaded successfully")

    # Clean up
    del model, tokenizer
    torch.cuda.empty_cache()

except Exception as e:
    print(f"❌ Model loading failed: {e}")
    print("Try using a smaller model or different quantization")

## Step 4: Training Configuration

In [None]:
%%writefile train_colab.py
# Colab-optimized training script with smaller parameters.
# Note: %%writefile must be the first line of the cell, or the cell magic is not recognized.
# Import Unsloth before transformers/trl so its patches are applied
from unsloth import FastLanguageModel, is_bfloat16_supported
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

def main():
    # Colab-friendly model (smaller)
    model_name = "unsloth/Mistral-7B-Instruct-v0.3-bnb-4bit"  # Smaller than Nemotron-15B
    output_dir = "./colab_finance_model"

    print("Loading model...")
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=model_name,
        max_seq_length=2048,  # Reduced for Colab
        dtype=None,
        load_in_4bit=True,
    )

    # Apply PEFT
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,  # Smaller LoRA rank for Colab
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
        lora_alpha=16,
        lora_dropout=0,
        bias="none",
        use_gradient_checkpointing="unsloth",
        random_state=3407,
    )

    # Load datasets
    print("Loading datasets...")
    try:
        train_dataset = load_dataset("json", data_files="datasets/alpaca/*.jsonl", split="train")
        print(f"Loaded {len(train_dataset)} training examples")
    except Exception as e:  # e.g. no files match the glob above
        print(f"No datasets found ({e}), using sample data...")
        # Create sample dataset
        sample_data = [
            {
                "instruction": "What is compound interest?",
                "input": "",
                "output": "Compound interest is the interest on a loan or deposit calculated based on both the initial principal and the accumulated interest from previous periods."
            }
        ]
        import json
        with open("sample_data.jsonl", "w") as f:
            for item in sample_data:
                f.write(json.dumps(item) + "\n")
        train_dataset = load_dataset("json", data_files="sample_data.jsonl", split="train")

    # SFTTrainer below reads a "text" column. Build one if the data only has
    # Alpaca-style fields; this prompt layout is an assumption, adjust as needed.
    if "text" not in train_dataset.column_names:
        def to_text(example):
            prompt = f"### Instruction:\n{example['instruction']}\n\n"
            if example.get("input"):
                prompt += f"### Input:\n{example['input']}\n\n"
            return {"text": prompt + f"### Response:\n{example['output']}{tokenizer.eos_token}"}
        train_dataset = train_dataset.map(to_text)

    # Training arguments optimized for Colab
    training_args = TrainingArguments(
        per_device_train_batch_size=1,  # Very small for Colab
        gradient_accumulation_steps=4,   # Effective batch size of 4
        warmup_steps=5,
        num_train_epochs=1,  # Start with 1 epoch
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir=output_dir,
        save_steps=50,  # Save frequently in case of disconnection
        save_total_limit=2,
    )

    # Initialize trainer
    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=train_dataset,
        dataset_text_field="text",
        max_seq_length=2048,
        dataset_num_proc=2,
        packing=False,
        args=training_args,
    )

    # Train
    print("Starting training...")
    trainer.train()

    # Save
    print(f"Training complete! Saving to {output_dir}")
    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)

    print("✅ Model saved successfully!")

if __name__ == "__main__":
    main()
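
To get a feel for what `r=16` means in the script above, the number of trainable LoRA parameters can be worked out by hand: each adapted linear layer of shape `d_in x d_out` gains two small matrices with `r * (d_in + d_out)` parameters in total. The sketch below assumes the published Mistral-7B dimensions (hidden size 4096, MLP size 14336, key/value projection width 1024, 32 layers); those constants are assumptions to verify against the model's config, not values read from the loaded model.

```python
# Assumed Mistral-7B dimensions; check the model's config before relying on them.
HIDDEN, INTERMEDIATE, KV_DIM, NUM_LAYERS = 4096, 14336, 1024, 32

# (input_dim, output_dim) of each module listed in target_modules
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in one adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def total_lora_params(r: int) -> int:
    """Trainable adapter parameters summed over all modules and layers."""
    per_layer = sum(lora_params(d_in, d_out, r) for d_in, d_out in MODULE_SHAPES.values())
    return NUM_LAYERS * per_layer


print(f"Trainable LoRA parameters at r=16: {total_lora_params(16):,}")
```

Under these assumptions the count scales linearly with `r`, so doubling the rank doubles the adapter size and its optimizer state.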


## Step 5: Start Training

In [None]:
# Start training with Colab optimizations
!python train_colab.py
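
Before launching a long run, it is worth estimating how many optimizer steps the configuration implies, since `save_steps=50` is measured in those steps. The sketch below uses the batch settings from the script and a hypothetical dataset of 1,000 examples; the helper `optimizer_steps` is illustrative, and the exact rounding inside the trainer may differ slightly between library versions.

```python
import math


def optimizer_steps(num_examples, per_device_batch_size, grad_accum_steps,
                    epochs=1, num_devices=1):
    """Return (effective_batch_size, approximate_total_optimizer_steps)."""
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch * epochs


# 1,000 examples is a made-up dataset size for illustration.
effective_batch, total_steps = optimizer_steps(1000, per_device_batch_size=1, grad_accum_steps=4)
print(f"Effective batch size: {effective_batch}")
print(f"Approximate optimizer steps: {total_steps}")
print(f"Checkpoints written at save_steps=50: {total_steps // 50}")
```

Multiplying the step count by the seconds per step reported in the first few log lines gives a usable estimate of total training time.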

## Step 6: Test the Trained Model

In [None]:
# Test the trained model
from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="./colab_finance_model",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

FastLanguageModel.for_inference(model)

# Test prompt
messages = [
    {"role": "system", "content": "You are a financial trading assistant."},
    {"role": "user", "content": "What is compound interest?"}
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,  # end the prompt where the assistant's reply should begin
    return_tensors="pt",
)
outputs = model.generate(
    input_ids=inputs.to("cuda"),
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Model Response:")
print(response)

## Step 7: Download Results

In [None]:
# Zip the model for download
!zip -r colab_finance_model.zip colab_finance_model/

# Download link
from google.colab import files
files.download('colab_finance_model.zip')

# 📋 Colab Training Notes

## Memory Issues?
- Reduce `max_seq_length` to 1024
- Use smaller model: `unsloth/Mistral-7B-Instruct-v0.3-bnb-4bit`
- Reduce batch size and increase gradient accumulation

## Time Limits?
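- Save checkpoints frequently (the script uses `save_steps=50`)
- Resume after a disconnect by calling `trainer.train(resume_from_checkpoint=True)`, which picks up the latest checkpoint in `output_dir`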
- Consider using Colab Pro+ for longer sessions
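
The trainer writes checkpoints into `output_dir` as folders named `checkpoint-<step>`. A small helper can locate the most recent one after a disconnect. This is a sketch using only the standard library; `latest_checkpoint` is a hypothetical name, and `transformers.trainer_utils.get_last_checkpoint` offers similar behavior if you prefer the library's own utility.

```python
import re
from pathlib import Path
from typing import Optional

CHECKPOINT_PATTERN = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(output_dir: str) -> Optional[Path]:
    """Return the checkpoint folder with the highest step number, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    candidates = []
    for child in root.iterdir():
        match = CHECKPOINT_PATTERN.match(child.name)
        if match and child.is_dir():
            candidates.append((int(match.group(1)), child))
    if not candidates:
        return None
    # Compare by step number so checkpoint-100 sorts after checkpoint-50
    return max(candidates, key=lambda pair: pair[0])[1]


print(latest_checkpoint("./colab_finance_model"))
```

If a path is returned, passing it as `trainer.train(resume_from_checkpoint=str(path))` should continue from that step rather than starting over.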

## Storage Issues?
- Clear cache: `rm -rf ~/.cache/huggingface/datasets/`
- Use smaller datasets
- Delete intermediate files
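
To decide what to delete, it helps to see how much space is free and which folders are largest. The sketch below relies only on the standard library; the helper names are illustrative, and the paths in the usage loop are the ones mentioned in this notebook, which may not all exist on your runtime.

```python
import os
import shutil


def free_disk_gib(path: str = ".") -> float:
    """Free space on the filesystem containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 1024**3


def dir_size_gib(path: str) -> float:
    """Total size of regular files under `path`, in GiB (symlinks skipped)."""
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so files linked from elsewhere are not counted twice
            if not os.path.islink(file_path):
                total_bytes += os.path.getsize(file_path)
    return total_bytes / 1024**3


print(f"Free disk space: {free_disk_gib():.1f} GiB")
for folder in ("~/.cache/huggingface", "./colab_finance_model", "./datasets"):
    print(f"{folder}: {dir_size_gib(os.path.expanduser(folder)):.2f} GiB")
```

A folder that does not exist simply reports 0.00 GiB, so the loop is safe to run at any point in the notebook.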

## Success Tips
- Start with small test runs
- Monitor GPU memory usage
- Save frequently to avoid losing progress
- Use A100 GPU when available