# Fine-Tune Flan-T5-base on BillSum
Author: Gourab S. (heygourab)
GitHub: https://github.com/heygourab

This notebook fine-tunes Flan-T5-base with LoRA on the BillSum dataset (~800 samples) for legal document summarization. It outputs a LoRA adapter (`lora_billsum`) and a `training_report.json` written by `RogerReportCallback`.

The model is trained for 3 epochs with a batch size of 16 and a learning rate of 2e-4.
The resulting adapter is saved in the `lora_billsum` directory.
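
To make these numbers concrete, here is a minimal sketch of the step count they imply. It assumes that all ~800 samples are used for training and that there is no gradient accumulation; neither is stated above, and `TrainSetup` is a hypothetical helper, not part of the notebook.

```python
import math
from dataclasses import dataclass


@dataclass
class TrainSetup:
    """Hypothetical container for the hyperparameters quoted above."""
    num_samples: int = 800       # approximate; assumed to be all training data
    batch_size: int = 16
    num_epochs: int = 3
    learning_rate: float = 2e-4
    grad_accum_steps: int = 1    # assumption: no gradient accumulation

    def steps_per_epoch(self) -> int:
        # A final partial batch still counts as one optimizer step
        return math.ceil(self.num_samples / (self.batch_size * self.grad_accum_steps))

    def total_steps(self) -> int:
        return self.steps_per_epoch() * self.num_epochs


setup = TrainSetup()
print(setup.steps_per_epoch(), setup.total_steps())  # 50 150
```

Under those assumptions the run is about 150 optimizer steps, which is useful when choosing logging and evaluation intervals.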

## For macOS M1/Apple Silicon
Drop this cell into your Jupyter Notebook (assuming venv is active and you're not using Colab):

In [1]:
%pip install -r ../requirements/mac.txt -q

Note: you may need to restart the kernel to use updated packages.


In [1]:
# imports 
import os
import json
import logging
import sys
from datetime import datetime
from typing import Dict
import torch
from datasets import load_dataset, Dataset, DatasetDict
from transformers import (
    AutoTokenizer, AutoModelForSeq2SeqLM, Seq2SeqTrainer, Seq2SeqTrainingArguments,
    DataCollatorForSeq2Seq, TrainerCallback
)
from peft import LoraConfig, get_peft_model, TaskType
from accelerate import Accelerator
import evaluate
from omegaconf import OmegaConf
import pandas as pd
import matplotlib.pyplot as plt
import psutil




## Logger Setup
This cell configures logging to track training progress and debugging information. The logger is configured to:
- Display timestamp, log level, and message
- Output logs to standard output (stdout)
- Use INFO level logging
- Create a logger instance named after the current module

In [2]:
# Logging setup
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[logging.StreamHandler(sys.stdout)]
)
logger = logging.getLogger(__name__)
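
One caveat worth sketching: `logging.basicConfig()` does nothing if the root logger already has handlers, which can happen when an earlier cell or import has configured logging. The variant below is a sketch, not the notebook's own cell. It adds `force=True` (available since Python 3.8) so the configuration is applied regardless, then emits two messages to show the INFO threshold in action. The message texts are illustrative.

```python
import logging
import sys

# force=True removes any handlers already attached to the root logger
# before applying this configuration (Python 3.8+).
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
    handlers=[logging.StreamHandler(sys.stdout)],
    force=True,
)
logger = logging.getLogger(__name__)

logger.info("Logger ready")                # printed: INFO meets the threshold
logger.debug("Hidden at INFO level")       # not printed: DEBUG is below INFO
```

Rerunning the cell with `force=True` also avoids duplicated log lines, because the old handlers are replaced instead of accumulated.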

## Memory Usage Monitoring
The `print_memory_usage()` function logs system resource utilization during model training:

- Reports RAM usage as the Resident Set Size (RSS) of the current process, in GB
- Reports total system memory, in GB
- On CUDA-enabled systems:
    - Reports allocated GPU memory
    - Reports the total memory of GPU device 0
    - Calculates the percentage of GPU memory in use
    - Resets the peak memory tracking statistics

This helps identify potential memory bottlenecks and optimize resource usage during training. On machines without CUDA, including Apple Silicon, only the two RAM figures are logged.

In [5]:
# Memory usage
def print_memory_usage():
    process = psutil.Process(os.getpid()) # Get current process
    # Get memory usage in GB
    ram_gb = process.memory_info().rss / 1e9
    # Get total memory in GB
    total_gb = psutil.virtual_memory().total / 1e9
    # Print memory usage
    logger.info(f"RAM usage: {ram_gb:.2f} GB")
    logger.info(f"Total memory: {total_gb:.2f} GB")

    # if running on GPU, print GPU memory usage
    if torch.cuda.is_available():
        gpu_mem = torch.cuda.memory_allocated() / 1e9
        gpu_total = torch.cuda.get_device_properties(0).total_memory / 1e9
        logger.info(f"GPU memory: {gpu_mem:.2f}/{gpu_total:.2f} GB ({gpu_mem/gpu_total*100:.1f}%)")
        torch.cuda.reset_peak_memory_stats()

In [7]:
# Testing memory usage
print_memory_usage()

2025-05-15 16:55:56,541 - INFO - RAM usage: 0.03 GB
2025-05-15 16:55:56,543 - INFO - Total memory: 8.59 GB
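
Since the CUDA branch never runs on Apple Silicon, accelerator memory goes unreported there. The sketch below shows one way to cover the MPS backend as well. `accelerator_memory_gb` is a hypothetical helper, not part of the notebook, and it assumes a PyTorch build that exposes `torch.mps.current_allocated_memory()` (PyTorch 2.0 or later); on older builds, or without PyTorch, it simply returns an empty dict.

```python
def accelerator_memory_gb() -> dict:
    """Return allocated accelerator memory in GB, keyed by backend name.

    Hypothetical helper: returns {} when PyTorch or an accelerator is missing.
    """
    try:
        import torch
    except ImportError:
        return {}

    usage = {}
    if torch.cuda.is_available():
        usage["cuda"] = torch.cuda.memory_allocated() / 1e9

    # MPS is the Metal backend used on Apple Silicon; guard for older builds
    mps_backend = getattr(torch.backends, "mps", None)
    if mps_backend is not None and mps_backend.is_available() and hasattr(torch, "mps"):
        usage["mps"] = torch.mps.current_allocated_memory() / 1e9

    return usage


for backend, gb in accelerator_memory_gb().items():
    print(f"{backend} memory allocated: {gb:.2f} GB")
```

The helper returns values instead of logging them, so it could be called from `print_memory_usage()` or from a training callback without changing either.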
