# LendSafe: Fine-tune Granite Model on Google Colab

This notebook fine-tunes IBM Granite 4.0 H 350M for loan explanation generation.

**Setup:**
1. Runtime → Change runtime type → GPU (T4 is fine)
2. Run all cells in order
3. Download the fine-tuned model at the end

## 1. Install Dependencies

In [None]:
!pip install -q torch transformers accelerate peft datasets

## 2. Upload Training Data

Upload your `training_examples.jsonl` file from the LendSafe project.

In [None]:
from google.colab import files
import os

print("📤 Upload your training_examples.jsonl file")
uploaded = files.upload()

# Verify upload
if 'training_examples.jsonl' in uploaded:
    print("✅ Training data uploaded successfully!")
    print(f"   File size: {len(uploaded['training_examples.jsonl']) / 1024:.1f} KB")
else:
    print("❌ Please upload training_examples.jsonl")

📤 Upload your training_examples.jsonl file


❌ Please upload training_examples.jsonl
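
Before training, it may be worth checking that every line of the uploaded file parses as JSON and carries the `instruction`, `input`, and `output` fields that `format_prompt` reads later in this notebook. The helper below is a minimal sketch; `find_invalid_records` and the sample lines are hypothetical names and data chosen for illustration, not part of the LendSafe project.

```python
import json

# Fields that format_prompt accesses on each example.
REQUIRED_KEYS = ("instruction", "input", "output")


def find_invalid_records(lines):
    """Return (line_number, reason) pairs for lines that would break prompt formatting."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            problems.append((number, f"missing keys: {', '.join(missing)}"))
    return problems


# Made-up sample lines: one valid record, one with missing fields, one that is not JSON.
sample_lines = [
    '{"instruction": "Explain the decision.", "input": "Credit Score: 720", "output": "Approved because..."}',
    '{"instruction": "Explain the decision."}',
    "not json",
]
problems = find_invalid_records(sample_lines)
for number, reason in problems:
    print(f"line {number}: {reason}")
```

In Colab, the same function could be applied to the real file with `find_invalid_records(open("training_examples.jsonl", encoding="utf-8"))`.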


## 3. Load Model and Configure LoRA

In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments, Trainer, DataCollatorForLanguageModeling
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

print("🔧 Configuration")
MODEL_ID = "ibm-granite/granite-4.0-h-350m"
MAX_LENGTH = 256
BATCH_SIZE = 1  # GPU can handle more
GRADIENT_ACCUMULATION = 2
LEARNING_RATE = 2e-4
NUM_EPOCHS = 3

# Check GPU
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"✅ Using device: {device}")
if device == "cuda":
    print(f"   GPU: {torch.cuda.get_device_name(0)}")
    print(f"   Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

🔧 Configuration
✅ Using device: cuda
   GPU: Tesla T4
   Memory: 15.8 GB


In [None]:
# Load model and tokenizer
print("📥 Loading IBM Granite 350M model...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

print(f"✅ Model loaded: {model.num_parameters():,} parameters")

📥 Loading IBM Granite 350M model...
✅ Model loaded: 340,332,224 parameters


In [None]:
# Configure LoRA
print("🔧 Configuring LoRA...")
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, lora_config)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"✅ LoRA configured:")
print(f"   Trainable params: {trainable:,} ({100*trainable/total:.2f}%)")
print(f"   Total params: {total:,}")

🔧 Configuring LoRA...
✅ LoRA configured:
   Trainable params: 163,840 (0.05%)
   Total params: 340,496,064
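
The trainable count above follows from how LoRA works: each wrapped linear layer gains two low-rank matrices, one of shape `r × in_features` and one of shape `out_features × r`, so it adds `r * (in_features + out_features)` parameters. The sketch below shows that arithmetic and cross-checks the figures printed by this notebook. The layer shapes in `example_shapes` are hypothetical and are not Granite's real dimensions; `lora_param_count` is an illustrative helper, not a PEFT function.

```python
def lora_param_count(layer_shapes, r):
    """Sum LoRA parameters over layers given as (in_features, out_features) pairs."""
    return sum(r * (d_in + d_out) for d_in, d_out in layer_shapes)


# Hypothetical shapes for illustration only (NOT taken from the Granite model).
example_shapes = [(1024, 1024)] * 4
example_count = lora_param_count(example_shapes, r=8)
print(example_count)  # 8 * (1024 + 1024) * 4 = 65536

# Figures printed by this notebook's own cells.
base_params = 340_332_224   # model.num_parameters() before LoRA
trainable = 163_840         # trainable params after get_peft_model
total = 340_496_064         # total params after get_peft_model

# The adapter weights are exactly the difference between the two totals.
assert base_params + trainable == total
percent = f"{100 * trainable / total:.2f}%"
print(percent)  # 0.05%
```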


## 4. Prepare Dataset

In [None]:
# Load dataset
print("📊 Loading training data...")
dataset = load_dataset('json', data_files='training_examples.jsonl', split='train')
print(f"✅ Loaded {len(dataset)} examples")

# Format prompts
def format_prompt(example):
    prompt = f"""### Instruction:
{example['instruction']}

### Input:
{example['input']}

### Response:
{example['output']}"""
    return {"text": prompt}

dataset = dataset.map(format_prompt, remove_columns=dataset.column_names)

# Tokenize
def tokenize_function(examples):
    return tokenizer(
        examples["text"],
        truncation=True,
        max_length=MAX_LENGTH,
        padding="max_length"
    )

tokenized_dataset = dataset.map(
    tokenize_function,
    batched=True,
    remove_columns=["text"]
)

# Split
split_dataset = tokenized_dataset.train_test_split(test_size=0.1, seed=42)
print(f"✅ Train: {len(split_dataset['train'])}, Val: {len(split_dataset['test'])}")

📊 Loading training data...
✅ Loaded 1500 examples


✅ Train: 1350, Val: 150
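
Because the tokenizer call uses `truncation=True` with `MAX_LENGTH = 256`, any formatted prompt longer than 256 tokens is cut silently, which could remove part of the response text. A rough way to gauge this is to compare untruncated token lengths against the limit. The helper below is a sketch; `truncation_report` is a hypothetical name, and the lengths in the example are made up.

```python
def truncation_report(token_lengths, max_length):
    """Summarise how many examples exceed max_length tokens."""
    over = [n for n in token_lengths if n > max_length]
    total = len(token_lengths)
    return {
        "total": total,
        "truncated": len(over),
        "fraction": len(over) / total if total else 0.0,
        "longest": max(token_lengths, default=0),
    }


# Made-up lengths for illustration. In the notebook, real lengths could be
# gathered before tokenizing with padding, for example:
#   lengths = [len(ids) for ids in tokenizer(dataset["text"])["input_ids"]]
example_lengths = [100, 256, 300, 512]
report = truncation_report(example_lengths, max_length=256)
print(report)
```

If a large fraction of examples is truncated, raising `MAX_LENGTH` is one option, at the cost of more memory per batch.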


## 5. Train Model

In [None]:
# Training arguments
training_args = TrainingArguments(
    output_dir="./granite-finetuned",
    num_train_epochs=NUM_EPOCHS,
    per_device_train_batch_size=BATCH_SIZE,
    per_device_eval_batch_size=BATCH_SIZE,
    gradient_accumulation_steps=GRADIENT_ACCUMULATION,
    learning_rate=LEARNING_RATE,
    fp16=True,
    logging_steps=20,
    eval_strategy="steps",
    eval_steps=100,
    save_strategy="steps",
    save_steps=100,
    save_total_limit=2,
    warmup_steps=50,
    load_best_model_at_end=True,
    report_to="none"
)

# Data collator
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False
)

# Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=split_dataset["train"],
    eval_dataset=split_dataset["test"],
    data_collator=data_collator,
)

print("🚀 Starting training...")
print("⏰ Expected time: 15-30 minutes on T4 GPU")

The model is already on multiple devices. Skipping the move to device specified in `args`.


🚀 Starting training...
⏰ Expected time: 15-30 minutes on T4 GPU


In [None]:
# Train!
trainer.train()

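| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 100 | 0.6156 | 0.508336 |
| 200 | 0.3859 | 0.379051 |
| 300 | 0.3659 | 0.369811 |
| 400 | 0.3606 | 0.36523 |
| 500 | 0.3621 | 0.362569 |
| 600 | 0.3646 | 0.359952 |
| 700 | 0.357 | 0.361346 |
| 800 | 0.3472 | 0.358096 |
| 900 | 0.3463 | 0.352499 |
| 1000 | 0.348 | 0.352406 |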


TrainOutput(global_step=2025, training_loss=0.4109481860973217, metrics={'train_runtime': 10965.496, 'train_samples_per_second': 0.369, 'train_steps_per_second': 0.185, 'total_flos': 1638718768742400.0, 'train_loss': 0.4109481860973217, 'epoch': 3.0})
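
The `global_step=2025` in the output above can be reproduced from the configuration: 1,350 training examples, a batch size of 1, gradient accumulation of 2, and 3 epochs. The sketch below works through that arithmetic and converts the logged `train_runtime` into hours. `total_optimizer_steps` is an illustrative helper, not a Trainer API, and it assumes a single GPU; how partial accumulation windows at the end of an epoch are rounded can differ between library versions, which does not matter here because 1,350 divides evenly by 2.

```python
import math


def total_optimizer_steps(num_examples, batch_size, grad_accum, epochs):
    """Estimate optimizer steps for single-device training."""
    batches_per_epoch = math.ceil(num_examples / batch_size)
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return steps_per_epoch * epochs


steps = total_optimizer_steps(num_examples=1350, batch_size=1, grad_accum=2, epochs=3)
print(steps)  # 675 steps per epoch * 3 epochs = 2025

# Values copied from the TrainOutput above.
train_runtime_s = 10965.496
hours = train_runtime_s / 3600
seconds_per_step = train_runtime_s / steps
print(f"{hours:.1f} hours, {seconds_per_step:.1f} s per optimizer step")
```

The logged run therefore took roughly three hours, noticeably longer than the 15-30 minute estimate printed before training.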

## 6. Test the Fine-tuned Model

In [None]:
# Test generation
test_prompt = """### Instruction:
Explain why this loan application was approved.

### Input:
Credit Score: 720
Debt-to-Income Ratio: 28%
Loan Amount: $25,000
Annual Income: $85,000
Employment Length: 5 years
Delinquencies (2 yrs): 0
Credit Inquiries (6 mo): 1

### Response:
"""

inputs = tokenizer(test_prompt, return_tensors="pt").to(model.device)

print("🧪 Testing fine-tuned model...")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        temperature=0.7,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("\n" + "="*60)
print("GENERATED EXPLANATION:")
print("="*60)
print(response)
print("="*60)

🧪 Testing fine-tuned model...

GENERATED EXPLANATION:
### Instruction:
Explain why this loan application was approved.

### Input:
Credit Score: 720
Debt-to-Income Ratio: 28%
Loan Amount: $25,000
Annual Income: $85,000
Employment Length: 5 years
Delinquencies (2 yrs): 0
Credit Inquiries (6 mo): 1

### Response:
Based on the information provided, the applicant has a strong credit history with no recent credit issues. Their debt-to-income ratio is very low at 28%, which indicates a good ability to manage debt payments. They have no recent delinquencies, and their employment history is stable. All these factors contribute to their approval for a $25,000 loan.

In summary, the applicant's strong creditworthiness, manageable debt levels, and reliable employment history make them a suitable candidate for the $25,000 loan.
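
As the output shows, decoding `outputs[0]` returns the prompt followed by the generated text. If only the explanation is needed, one option is to split on the `### Response:` marker used in the prompt template. The helper below is a small sketch; `extract_response` is a hypothetical name, and the decoded string is a shortened stand-in for real model output.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded_text, marker=RESPONSE_MARKER):
    """Return the text after the first response marker, or the whole text if it is absent."""
    _, found, tail = decoded_text.partition(marker)
    return tail.strip() if found else decoded_text.strip()


# Shortened stand-in for a decoded generation.
decoded = (
    "### Instruction:\nExplain why this loan application was approved.\n\n"
    "### Input:\nCredit Score: 720\n\n"
    "### Response:\nThe applicant has a strong credit history.\n"
)
explanation = extract_response(decoded)
print(explanation)
```

An alternative that avoids string matching is to decode only the newly generated tokens, for example `tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)`.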


## 7. Save and Download Model

In [None]:
# Save model
print("💾 Saving fine-tuned model...")
trainer.save_model("./granite-finetuned-final")
tokenizer.save_pretrained("./granite-finetuned-final")
print("✅ Model saved!")

# Create zip for download
!zip -r granite-finetuned-final.zip granite-finetuned-final/
print("\n📦 Model packaged for download")

💾 Saving fine-tuned model...
✅ Model saved!
  adding: granite-finetuned-final/ (stored 0%)
  adding: granite-finetuned-final/special_tokens_map.json (deflated 79%)
  adding: granite-finetuned-final/vocab.json (deflated 56%)
  adding: granite-finetuned-final/adapter_config.json (deflated 57%)
  adding: granite-finetuned-final/README.md (deflated 66%)
  adding: granite-finetuned-final/tokenizer.json (deflated 80%)
  adding: granite-finetuned-final/adapter_model.safetensors (deflated 7%)
  adding: granite-finetuned-final/training_args.bin (deflated 54%)
  adding: granite-finetuned-final/merges.txt (deflated 50%)
  adding: granite-finetuned-final/tokenizer_config.json (deflated 95%)
  adding: granite-finetuned-final/chat_template.jinja (deflated 79%)

📦 Model packaged for download


In [None]:
# Download the model
from google.colab import files

print("⬇️ Downloading fine-tuned model...")
files.download('granite-finetuned-final.zip')
print("\n✅ Download started!")
print("\nTo use locally:")
print("1. Extract granite-finetuned-final.zip")
print("2. Move to LendSafe/models/granite-finetuned/")
print("3. Run evaluation script")

⬇️ Downloading fine-tuned model...



✅ Download started!

To use locally:
1. Extract granite-finetuned-final.zip
2. Move to LendSafe/models/granite-finetuned/
3. Run evaluation script


## 🎉 Done!

Your Granite model is now fine-tuned for loan explanations!

**Next steps:**
1. Download the model zip file
2. Extract and place in your local LendSafe project
3. Run `python scripts/evaluate_model.py` to get metrics
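
The zip contains only the LoRA adapter and tokenizer files (for example `adapter_model.safetensors` and `adapter_config.json`), not the full model, so loading it locally means fetching the base model and attaching the adapter on top. The function below is a sketch of one way to do that with `peft`; `load_finetuned` is a hypothetical helper, how `scripts/evaluate_model.py` actually loads the model is not shown in this notebook, and the directory in the usage comment is the one suggested in the steps above.

```python
def load_finetuned(adapter_dir, base_model_id="ibm-granite/granite-4.0-h-350m"):
    """Load the base Granite model and attach the saved LoRA adapter (sketch)."""
    # Imported lazily so that defining this helper does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # The tokenizer was saved next to the adapter by tokenizer.save_pretrained().
    tokenizer = AutoTokenizer.from_pretrained(adapter_dir)
    # Assumption: the installed transformers version supports this model;
    # the notebook itself passed trust_remote_code=True when loading.
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    model = PeftModel.from_pretrained(base_model, adapter_dir)
    model.eval()  # inference mode: disables dropout, including LoRA dropout
    return model, tokenizer


# Example usage (downloads the base model on first run):
#   model, tokenizer = load_finetuned("LendSafe/models/granite-finetuned/")
```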

**Training Summary:**
- Model: IBM Granite 4.0 H 350M
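- Method: LoRA (about 0.05% of parameters trained: 163,840 of 340,496,064)
- Data: 1,500 loan explanation examples
- Training time: about 3 hours on a T4 GPU in the logged run (`train_runtime` of 10,965 s), well above the 15-30 minute estimate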
- Cost: $0 (free Colab)