# 🎓 Lab 13: Fine-Tuning Models
## Module 13 - Azure OpenAI Fine-Tuning with LoRA

**Duration:** 30 minutes

**Objectives:**
- Prepare training data in JSONL format
- Fine-tune GPT-4o-mini for JSON output
- Evaluate the fine-tuned model

**Banking Scenario:** Fine-tune a loan document extractor

---

In [None]:
import os
import json

# =============================================================================
# GOOGLE COLAB SETUP - Add these secrets (click 🔑 icon):
#   - AZURE_OPENAI_KEY: Your API key
#   - AZURE_OPENAI_ENDPOINT: https://xxx.openai.azure.com/
#   - AZURE_OPENAI_DEPLOYMENT: Your model deployment name
# =============================================================================

DEMO_MODE = False
client = None
MODEL_NAME = "gpt-4o"

try:
    from google.colab import userdata
    AZURE_OPENAI_KEY = userdata.get('AZURE_OPENAI_KEY')
    AZURE_OPENAI_ENDPOINT = userdata.get('AZURE_OPENAI_ENDPOINT')
    try:
        # The deployment secret is optional; keep the default if it is absent.
        MODEL_NAME = userdata.get('AZURE_OPENAI_DEPLOYMENT') or MODEL_NAME
    except Exception:
        pass
    if AZURE_OPENAI_KEY and AZURE_OPENAI_ENDPOINT:
        if not AZURE_OPENAI_ENDPOINT.startswith('http'):
            AZURE_OPENAI_ENDPOINT = 'https://' + AZURE_OPENAI_ENDPOINT
        print(f"✅ Credentials loaded. Model: {MODEL_NAME}")
    else:
        raise ValueError("Missing AZURE_OPENAI_KEY or AZURE_OPENAI_ENDPOINT")
except Exception as exc:
    # Not running in Colab, or secrets unavailable: fall back to demo mode.
    print(f"⚠️ Running in DEMO MODE ({exc})")
    DEMO_MODE = True

if not DEMO_MODE:
    from openai import AzureOpenAI
    client = AzureOpenAI(
        api_key=AZURE_OPENAI_KEY,
        api_version="2024-06-01",
        azure_endpoint=AZURE_OPENAI_ENDPOINT
    )
    print("✅ Client ready")
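
The setup cell reads its secrets only from Colab. Outside Colab, one option is to read the same three settings from environment variables. The sketch below is an assumption, not part of the lab: `load_azure_settings` is a hypothetical helper, and it reuses the secret names and the `"gpt-4o"` default from the cell above.

```python
import os
from typing import Mapping, Optional


def load_azure_settings(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Return key, endpoint, and deployment from `env`, or None if incomplete."""
    key = env.get("AZURE_OPENAI_KEY")
    endpoint = env.get("AZURE_OPENAI_ENDPOINT")
    if not key or not endpoint:
        return None  # caller should fall back to DEMO_MODE
    # Mirror the notebook's normalization: add a scheme if it is missing.
    if not endpoint.startswith("http"):
        endpoint = "https://" + endpoint
    return {
        "api_key": key,
        "azure_endpoint": endpoint,
        # Same default as MODEL_NAME in the setup cell.
        "deployment": env.get("AZURE_OPENAI_DEPLOYMENT") or "gpt-4o",
    }


# Example with placeholder values (not real credentials).
settings = load_azure_settings({
    "AZURE_OPENAI_KEY": "dummy-key",
    "AZURE_OPENAI_ENDPOINT": "example.openai.azure.com",
})
print(settings["azure_endpoint"])
```

Passing the mapping as a parameter keeps the helper testable without touching the real environment; calling it with no arguments reads `os.environ`.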

## Part 1: Prepare Training Data

In [None]:
# Training examples for loan extraction
training_examples = [
    {
        "messages": [
            {"role": "system", "content": "Extract loan details as JSON."},
            {"role": "user", "content": "Mortgage application for $350,000 at 6.25% fixed rate for 30 years. Applicant income: $95,000. Credit score: 720."},
            {"role": "assistant", "content": '{"loan_type": "mortgage", "amount": 350000, "rate": 6.25, "term_years": 30, "income": 95000, "credit_score": 720}'}
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "Extract loan details as JSON."},
            {"role": "user", "content": "Auto loan request: $25,000 for 60 months at 7.9% APR. Buyer income $55,000, credit 680."},
            {"role": "assistant", "content": '{"loan_type": "auto", "amount": 25000, "rate": 7.9, "term_years": 5, "income": 55000, "credit_score": 680}'}
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "Extract loan details as JSON."},
            {"role": "user", "content": "Personal loan: $15,000 at 12.99% for 3 years. Income: $45,000. Credit: 640."},
            {"role": "assistant", "content": '{"loan_type": "personal", "amount": 15000, "rate": 12.99, "term_years": 3, "income": 45000, "credit_score": 640}'}
        ]
    }
]

# Save as JSONL
with open("training_data.jsonl", "w") as f:
    for ex in training_examples:
        f.write(json.dumps(ex) + "\n")

print(f"✅ Created training_data.jsonl with {len(training_examples)} examples")
print("\nSample:")
print(json.dumps(training_examples[0], indent=2))
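
Before uploading, it can help to check the file locally. The sketch below assumes the chat format used above (a `messages` list ending with an assistant turn) and, because this lab trains for JSON output, also checks that the assistant content parses as JSON. `validate_chat_jsonl` is a hypothetical helper; the service runs its own validation as well, including limits this sketch does not cover.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_chat_jsonl(lines):
    """Return a list of (line_number, problem) tuples; empty means no problems found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"not valid JSON: {exc.msg}"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((number, "missing or empty 'messages' list"))
            continue
        roles = [m.get("role") for m in messages if isinstance(m, dict)]
        if len(roles) != len(messages) or not set(roles) <= ALLOWED_ROLES:
            problems.append((number, "unexpected message shape or role"))
            continue
        if messages[-1].get("role") != "assistant":
            problems.append((number, "last message is not from the assistant"))
            continue
        # The training target should itself be parseable JSON.
        try:
            json.loads(messages[-1].get("content", ""))
        except (json.JSONDecodeError, TypeError):
            problems.append((number, "assistant content is not valid JSON"))
    return problems


# One good record, one with a non-JSON target, one broken line.
sample_lines = [
    json.dumps({"messages": [
        {"role": "system", "content": "Extract loan details as JSON."},
        {"role": "user", "content": "Personal loan: $15,000 at 12.99% for 3 years."},
        {"role": "assistant", "content": '{"loan_type": "personal", "amount": 15000}'},
    ]}),
    json.dumps({"messages": [
        {"role": "user", "content": "Auto loan request: $25,000."},
        {"role": "assistant", "content": "loan_type: auto"},
    ]}),
    "{not json",
]
report = validate_chat_jsonl(sample_lines)
print(report)
```

To check the file written above, pass an open file handle: `validate_chat_jsonl(open("training_data.jsonl"))`.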

## Part 2: Upload and Fine-Tune (Simulated)

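Note: The code in this part is shown but not executed. Running a real job requires an Azure OpenAI resource with fine-tuning access, and more training examples than the three above (the service enforces a minimum).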

In [None]:
# This is the actual code for fine-tuning (requires access)
"""
# Upload training file (the context manager closes the file after upload)
with open("training_data.jsonl", "rb") as f:
    file = client.files.create(file=f, purpose="fine-tune")
print(f"File ID: {file.id}")

# Create fine-tuning job
job = client.fine_tuning.jobs.create(
    training_file=file.id,
    model="gpt-4o-mini-2024-07-18",
    hyperparameters={"n_epochs": 3}
)
print(f"Job ID: {job.id}")

# Check status
status = client.fine_tuning.jobs.retrieve(job.id)
print(f"Status: {status.status}")
"""

print("Fine-tuning code ready (requires Azure OpenAI fine-tuning access)")
print("\nCLI equivalent (illustrative; verify the exact commands against your installed CLI):")
print("az openai file upload --file training_data.jsonl --purpose fine-tune")
print("az openai fine-tuning job create --model gpt-4o-mini-2024-07-18 --training-file <file-id>")

## Part 3: Test Fine-Tuned Model (Simulated)

In [None]:
def test_extraction(model: str, text: str) -> dict:
    """Test loan extraction with a model"""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "Extract loan details as JSON."},
            {"role": "user", "content": text}
        ],
        response_format={"type": "json_object"}
    )
    return json.loads(response.choices[0].message.content)

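# Test with base model
test_text = "Home equity loan for $75,000 at 8.5% for 15 years. Homeowner income $120,000, credit 750."

if DEMO_MODE:
    # No client in demo mode: use a canned placeholder (not real model output)
    # so the later cells still run.
    result = {"loan_type": "home_equity", "amount": 75000, "rate": 8.5,
              "term_years": 15, "income": 120000, "credit_score": 750}
else:
    # On Azure, `model` is the deployment name, so use the configured MODEL_NAME.
    result = test_extraction(MODEL_NAME, test_text)
print("Base model extraction:")
print(json.dumps(result, indent=2))

# With fine-tuned model (when available), pass the name it is deployed under:
# result = test_extraction("ft:gpt-4o-mini:banking:loan-extractor", test_text)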
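
The `json_object` response format asks for syntactically valid JSON, but it does not pin down key names or value types, so output can still drift from the schema used in the training examples. The sketch below checks an extraction against that schema. `EXPECTED_FIELDS` and `check_extraction_shape` are hypothetical names, and the field types are inferred from the three training examples.

```python
# Field names and types inferred from the training examples in Part 1.
EXPECTED_FIELDS = {
    "loan_type": str,
    "amount": (int, float),
    "rate": (int, float),
    "term_years": (int, float),
    "income": (int, float),
    "credit_score": int,
}


def check_extraction_shape(extracted: dict) -> list:
    """Return a list of human-readable problems; empty when the shape matches."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in extracted:
            problems.append(f"missing field: {field}")
            continue
        value = extracted[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            problems.append(f"wrong type for {field}: {type(value).__name__}")
    extra = sorted(set(extracted) - set(EXPECTED_FIELDS))
    if extra:
        problems.append(f"unexpected fields: {', '.join(extra)}")
    return problems


good = {"loan_type": "home_equity", "amount": 75000, "rate": 8.5,
        "term_years": 15, "income": 120000, "credit_score": 750}
# Illustrative drift: renamed key and string-formatted rate.
drifted = {"loan_type": "home_equity", "loan_amount": "$75,000", "rate": "8.5%",
           "term_years": 15, "income": 120000, "credit_score": 750}

print(check_extraction_shape(good))
print(check_extraction_shape(drifted))
```

Running the same check on base-model and fine-tuned output gives a simple way to compare how consistently each follows the schema.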

## Part 4: Evaluation Metrics

In [None]:
def evaluate_extraction(predicted: dict, expected: dict) -> dict:
    """Evaluate extraction accuracy as the share of expected fields matched exactly."""
    correct = 0
    total = len(expected)

    for key, value in expected.items():
        if key in predicted and predicted[key] == value:
            correct += 1

    return {
        # Guard against division by zero when `expected` is empty.
        "accuracy": correct / total if total else 0.0,
        "correct_fields": correct,
        "total_fields": total
    }

# Test evaluation (expected values cover every field stated in test_text)
expected = {"loan_type": "home_equity", "amount": 75000, "rate": 8.5, "term_years": 15, "income": 120000, "credit_score": 750}
predicted = result

eval_result = evaluate_extraction(predicted, expected)
print(f"\nEvaluation: {eval_result['accuracy']:.0%} accuracy")
print(f"Correct fields: {eval_result['correct_fields']}/{eval_result['total_fields']}")

---
## ✅ Lab 13 Complete!

**Key Takeaways:**
- Training data must be in JSONL format
- Fine-tuning improves consistency for specific tasks
- Always evaluate before deploying
- Fine-tuned models cost more to run than the base model (roughly 2x as a rule of thumb; check current pricing)