# 🏦 Financial Regulation LLM Fine-tuning on Google Colab

This notebook demonstrates how to fine-tune a small language model for Singapore financial regulation Q&A using LoRA/QLoRA.

## 🎯 Project Overview

- **Goal**: Replace expensive large-model RAG calls with cost-effective fine-tuned small models
- **Domain**: Singapore financial regulations (MAS guidelines, compliance docs)
- **Approach**: LoRA fine-tuning for efficient parameter adaptation
- **Benefits**: 99.7% cost reduction, local hosting capability, faster responses
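
To make "efficient parameter adaptation" concrete: LoRA freezes a weight matrix of shape `d_out x d_in` and trains two low-rank factors of shapes `d_out x r` and `r x d_in` instead. The arithmetic below is a minimal sketch of that parameter count. The layer size and rank are assumed values chosen for illustration, not settings from this project's training script, and the function names are hypothetical.

```python
# Illustrative arithmetic only: the layer shape and rank are assumed values,
# not settings taken from this project's training script.

def full_matrix_params(d_out: int, d_in: int) -> int:
    """Number of weights in one dense d_out x d_in matrix."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Weights in the two LoRA factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


d_out, d_in, rank = 4096, 4096, 8  # hypothetical projection layer and rank
full = full_matrix_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)

print(f"Full matrix:  {full:,} weights")
print(f"LoRA factors: {lora:,} weights ({lora / full:.2%} of the full matrix)")
```

Only the two small factors receive gradient updates, which is why LoRA training can fit on a single Colab GPU.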

## 📋 Table of Contents

1. [Setup and Installation](#setup)
2. [Dataset Preparation](#dataset)
3. [Model Fine-tuning](#training)
4. [Evaluation](#evaluation)
5. [Inference Demo](#inference)
6. [Results Analysis](#results)


In [None]:
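# Install required packages (-q keeps the pip output short)
!pip install -q torch transformers datasets peft accelerate bitsandbytes
!pip install -q nltk rouge-score pandas numpy
!pip install -q beautifulsoup4 requests

# Download NLTK tokenizer data for evaluation
# ('punkt_tab' is the resource name that newer NLTK releases look for)
import nltk
nltk.download('punkt')
nltk.download('punkt_tab')

print("✅ Dependency installation finished. Check the pip output above for errors.")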


In [None]:
# Clone the project repository
!git clone https://github.com/yihhan/finetune.git
%cd finetune

# Check if we have GPU available
import torch
print(f"🔧 Device: {'CUDA' if torch.cuda.is_available() else 'CPU'}")
if torch.cuda.is_available():
    print(f"🚀 GPU: {torch.cuda.get_device_name(0)}")
    print(f"💾 GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f}GB")
else:
    print("⚠️ No GPU detected - training will be slower on CPU")


In [None]:
# Run improved dataset preparation
!python improved_dataset_prep.py

# Run improved training with better parameters
print("🚀 Starting improved model fine-tuning...")
!python improved_train.py

# Run improved inference demo
print("🎯 Testing improved fine-tuned model...")
!python improved_inference.py --demo

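# Note: `!` commands do not raise on failure, so this line prints even if a script exited with an error
print("✅ Pipeline cell finished. Review the output above for errors.")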


## 🔍 Inspect Enhanced Dataset

Preview a few entries and the number of Q&A pairs per category.


In [None]:
# Preview enhanced dataset
import json, pandas as pd
qa_path = "processed_data/enhanced_financial_regulation_qa.json"
train_path = "processed_data/enhanced_training_data.json"

with open(qa_path, "r", encoding="utf-8") as f:
    qa = json.load(f)
with open(train_path, "r", encoding="utf-8") as f:
    train = json.load(f)

df = pd.DataFrame(qa)
print(f"Rows: {len(df)} | Columns: {list(df.columns)}")
print(f"Training samples: {len(train)}")
print("\nCategory counts:\n", df['category'].value_counts())
print("\nSample rows:")
print(df.head(3).to_string(index=False))


## 📈 Evaluate Fine-tuned vs Base vs RAG

Runs `eval.py` and prints a compact comparison table of BLEU, ROUGE, and average response time for each model.


In [None]:
# Run the evaluation script, then load and tabulate its summary metrics
import json, os

# Run evaluation
!python eval.py

# Load summary
summary_path = "evaluation_results/summary_metrics.json"
if os.path.exists(summary_path):
    with open(summary_path, "r", encoding="utf-8") as f:
        summary = json.load(f)
    import pandas as pd
    rows = []
    for k, pretty in [("base_model","Base"),("finetuned_model","Fine-tuned"),("rag_model","RAG (GPT-4)")]:
        if k in summary:
            rows.append({
                "Model": pretty,
                "BLEU": summary[k]["avg_bleu"],
                "ROUGE-1": summary[k]["avg_rouge1"],
                "ROUGE-2": summary[k]["avg_rouge2"],
                "ROUGE-L": summary[k]["avg_rougeL"],
                "Avg Time (s)": summary[k]["avg_time"],
            })
    df = pd.DataFrame(rows)
    print("\nResults summary:\n", df.to_string(index=False))
else:
    print("Summary not found at", summary_path)
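
The ROUGE-1 column above measures unigram overlap between a generated answer and its reference answer. The sketch below computes a simplified ROUGE-1 F1 by hand to show what the number means. It assumes lowercase whitespace tokenization, whereas the `rouge-score` package installed earlier applies its own tokenizer, so values can differ slightly. The function name `rouge1_f1` and the example sentences are made up for illustration and are not taken from `eval.py`.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Counter intersection keeps the minimum count of each shared token
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())  # share of candidate tokens that match
    recall = overlap / sum(ref_counts.values())      # share of reference tokens recovered
    return 2 * precision * recall / (precision + recall)


# Made-up sentences, used only to exercise the function
reference = "banks must hold adequate capital"
candidate = "banks must hold capital"
print(f"ROUGE-1 F1: {rouge1_f1(reference, candidate):.3f}")
```

Here the candidate matches 4 of its 4 tokens (precision 1.0) and recovers 4 of the 5 reference tokens (recall 0.8), so a short answer that omits detail is penalized through recall.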


## 💬 Inference Viewer (Improved)

Ask questions and view answers inline. Uses `improved_inference.py`.


In [None]:
# Single question inference
import subprocess, sys

def ask(q):
    # Pass the arguments as a list so quotes inside the question cannot break the command
    cmd = [sys.executable, "improved_inference.py", "--question", q]
    print("\nQ:", q)
    print("A:")
    try:
        out = subprocess.check_output(cmd, stderr=subprocess.STDOUT, text=True)
        print(out)
    except subprocess.CalledProcessError as e:
        # Show the script's own output when it exits with a non-zero status
        print(e.output)

ask("What are the capital adequacy requirements for banks in Singapore?")
ask("How should financial institutions implement AML measures?")


## 📦 Optional: Export Artifacts to Google Drive

Uncomment the cell below to save the fine-tuned model, adapters, and evaluation results to Drive for later use.


In [None]:
# from google.colab import drive
# drive.mount('/content/drive')
# !mkdir -p /content/drive/MyDrive/finreg_llm
# !cp -r improved_finetuned_financial_model /content/drive/MyDrive/finreg_llm/
# !cp -r evaluation_results /content/drive/MyDrive/finreg_llm/
# print("Saved to Drive:/MyDrive/finreg_llm")



## 📊 Dataset Preparation {#dataset}

Let's prepare the Singapore financial regulation dataset for training.


In [None]:
# Run improved dataset preparation
!python improved_dataset_prep.py

# Check what data was created
import os
print("📁 Enhanced dataset files created:")
for root, dirs, files in os.walk("processed_data"):
    for file in files:
        if "enhanced" in file:
            file_path = os.path.join(root, file)
            size = os.path.getsize(file_path)
            print(f"  {file_path} ({size} bytes)")

# Display sample data
import json
with open("processed_data/enhanced_financial_regulation_qa.json", "r", encoding="utf-8") as f:
    data = json.load(f)

print("\n📊 Enhanced Dataset Summary:")
print(f"  Total Q&A pairs: {len(data)}")
print(f"  Categories: {set(item['category'] for item in data)}")

print("\n📝 Sample Q&A:")
sample = data[0]
print(f"Q: {sample['question']}")
print(f"A: {sample['answer'][:200]}...")
print(f"Category: {sample['category']}")

# Show training data size
with open("processed_data/enhanced_training_data.json", "r", encoding="utf-8") as f:
    training_data = json.load(f)
print(f"\n🚀 Training samples: {len(training_data)} (with augmentation)")
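
The cell above reads the `question`, `answer`, and `category` fields of each record. A small check like the following can catch records where one of those fields is missing or blank before training starts. It is a sketch: `find_invalid_records` and the sample records are hypothetical, and it assumes all three fields are plain strings.

```python
REQUIRED_FIELDS = ("question", "answer", "category")


def find_invalid_records(records):
    """Return (index, problem) pairs for records with a missing or blank required field."""
    problems = []
    for index, record in enumerate(records):
        for field in REQUIRED_FIELDS:
            value = record.get(field) if isinstance(record, dict) else None
            # Treat non-strings and whitespace-only strings as invalid
            if not isinstance(value, str) or not value.strip():
                problems.append((index, f"missing or empty '{field}'"))
    return problems


# Placeholder records for demonstration; in the notebook, pass `data` instead
sample_records = [
    {"question": "Example question?", "answer": "Example answer.", "category": "example"},
    {"question": "Question without an answer?", "answer": "   ", "category": "example"},
]

for index, problem in find_invalid_records(sample_records):
    print(f"Record {index}: {problem}")
```

Calling `find_invalid_records(data)` on the loaded dataset returns an empty list when every record is usable.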
