# Fine-Tuned LlaMa 3.2 3B - Financial Aid Usecase 

This notebook pulls down the unsloth LlaMa 3.2 3B model from HuggingFace and leverages a pre-made synthetic dataset comprising of fictitious student records. LoRA adapters were used to fine-tune the model efficiently, reducing memory usage by keeping the base model frozen and training only additional adapter layers. No real data is used in this dataset and the code is free from any conflict of interest to the best of the author's knowledge. This author assumes the user has a HuggingFace token and has set their GPU setting to T4.


## Step 1: Uncomment the following, run, then comment and restart session (do not factory reset)

In [None]:
# Run this first, then comment out and restart session
# !pip install -q -U -i https://pypi.org/simple/ bitsandbytes
# !pip install -q -U trl
# !pip install -q -U peft
# !pip install faker

## Step 2: Install dependencies and deal with memory settings

In [1]:
import numpy as np
import pandas as pd
import re
import torch
import random
from torch.utils.data import Dataset
from torch.optim import AdamW
from torch.utils.data import DataLoader

# For LLM
from peft import LoraConfig, PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    set_seed,
    pipeline
)
from trl import SFTTrainer, setup_chat_format, SFTConfig

from time import time


In [2]:
# Import HuggingFace Secret
from kaggle_secrets import UserSecretsClient

secret_label = "HUGGINGFACE_TOKEN"
secret_value = UserSecretsClient().get_secret(secret_label)

In [3]:
# The following helps with the CUDA memory issue
torch.cuda.empty_cache()
torch.cuda.reset_max_memory_allocated()




## Step 3: Load Datasets and Synthetic Database

In [4]:
# Load Data to Fine Tune LlaMa
root_path = '/kaggle/input/student-chat-data/student_chat_ai_data'
train_df = pd.read_csv(f"{root_path}/train_data.csv", index_col=0)
eval_df = pd.read_csv(f"{root_path}/eval_data.csv", index_col=0)
synthetic_student_population_df = pd.read_csv(f"{root_path}/synthetic_population_data.csv", index_col=0)

In [5]:
train_df

Unnamed: 0,context,user_input,answer
0,Student Linda Bailey (ID: S00001) has a GPA of...,What financial aid do I qualify for?,You qualify for the Need-Based Grant because y...
1,Student John Doe (ID: S00002) has a GPA of 3.8...,What financial aid am I eligible for?,John Doe qualifies for the Merit-Based Scholar...
2,Student Sarah Carter (ID: S00003) has a GPA of...,Can I apply for any financial aid?,Yes! You qualify for the STEM Excellence Award...
3,Student John Doe (ID: S10123) has a GPA of 2.3...,What am I to be eligible for financial aid?,"Due to your income, you do not qualify for a n..."


In [6]:
eval_df

Unnamed: 0,context,user_input,answer
0,Student Emily Carter (ID: S00023) has a GPA of...,Am I eligible for any financial aid?,"Yes, you qualify for the STEM Excellence Award..."
1,Student Mark Smith (ID: S00045) has a GPA of 2...,Can I get coverage with my education program?,"Unfortunately, you do not qualify for any fina..."
2,Student Olivia Jones (ID: S00030) has a GPA of...,Which financial aid can do I qualify for?,You qualify for the STEM Excellence Award sinc...
3,Student Daniel Thompson (ID: S00051) has a GPA...,What financial aid options do I have?,You are eligible for the Need-Based Grant sinc...


In [7]:
synthetic_student_population_df

Unnamed: 0,id,name,gpa,field_of_study,income
0,S00001,Marc Solomon,2.53,Anthropology,73706
1,S00002,Suzanne Kidd,2.50,Business Administration,22259
2,S00003,Michelle Griffin,2.42,Political Science,31192
3,S00004,Hannah Dixon,2.05,Engineering,530431
4,S00005,Mary Harris,3.13,Chemistry,370142
...,...,...,...,...,...
995,S00996,Nicole Clay,3.97,Chemistry,353792
996,S00997,Tammy Rose,3.34,Computer Science,414283
997,S00998,Jeff Alvarez,3.96,Psychology,450287
998,S00999,Thomas Lowe,3.02,Business Administration,54775


In [8]:
class StudentDialogueDataset(Dataset):
    def __init__(self, data, tokenizer, max_length=512):
        self.tokenizer = tokenizer
        self.data = data
        self.max_length = max_length

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        item = self.data[idx]
        prompt = f"### Role: You are an AI financial aid evaluator. Answer the user's question based on the provided student's profile.\n\n### Context:\n{item['context']}\n\n### User's Question:\n{item['user_input']}\n\n###Answer:"
        target = item["answer"]
        encoding = self.tokenizer(prompt, padding="max_length", truncation=True, max_length=self.max_length, return_tensors="pt")
        target_encoding = self.tokenizer(target, padding="max_length", truncation=True, max_length=self.max_length, return_tensors="pt")
        
        return {
            "input_ids": encoding["input_ids"].squeeze(0),
            "attention_mask": encoding["attention_mask"].squeeze(0),
            "labels": target_encoding["input_ids"].squeeze(0),
        }


## Step 4: Load LlaMa 3 8B Model and Train

#### If BitsAndBytes 8-bit quantization error pops up requiring installation and you already ran the first cell, simply restart the session (do NOT factory reset)

In [9]:

# Set the data type for computations to float16, bfloat16 not supported on T4/P100
compute_dtype = getattr(torch, "float16")

# Configure the BitsAndBytes settings for 4-bit quantization to reduce memory usage
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,  # Enable 4-bit quantization
    bnb_4bit_use_double_quant=True,  # Use double quantization for improved precision
    bnb_4bit_quant_type="nf4",  # Specify the quantization type
    bnb_4bit_compute_dtype=compute_dtype,  # Set the computation data type
)

# Import Base Model
model_name = "Llama-3.2-3B"
model_path = "unsloth/Llama-3.2-3B"
base_model_name = re.sub(r"-", "_", model_name)
storage_path = f"./dist/{base_model_name.lower()}/{base_model_name.lower()}_base.pth"

# Record the start time to measure the loading duration
time_start = time()

# Load the pre-trained model with specified configurations
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=compute_dtype,  # Set the data type for the model
    load_in_8bit=True,  # 8-bit quantization to fit in 14GB
    use_cache=False,  # Disable caching to save memory
    device_map='auto',  # Automatically map the model to available devices (e.g., GPUs)
    token=secret_value
)

The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.


model.safetensors:   0%|          | 0.00/6.43G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/230 [00:00<?, ?B/s]

In [10]:
from peft import get_peft_model

# Define LoRA configuration
lora_config = LoraConfig(
    r=8,  # Rank (smaller = more efficient)
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],  # Apply LoRA to attention layers
    bias="none",
    task_type="CAUSAL_LM"  # For language modeling
)

# Apply LoRA to 8-bit model
model = get_peft_model(model, lora_config)

# After applying LoRA, confirm which layers are trainable
model.print_trainable_parameters()

trainable params: 2,293,760 || all params: 3,215,043,584 || trainable%: 0.0713


In [11]:

# Load the tokenizer associated with the model
tokenizer = AutoTokenizer.from_pretrained(model_path, token=secret_value)
tokenizer.pad_token = tokenizer.eos_token  # Set the padding token to the end-of-sequence token you could also introduce a special pad token but this is not needed.

tokenizer_config.json:   0%|          | 0.00/50.6k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/17.2M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/459 [00:00<?, ?B/s]

In [12]:
train_dataset = StudentDialogueDataset(train_df.to_dict(orient='records'), tokenizer)
eval_dataset = StudentDialogueDataset(eval_df.to_dict(orient='records'), tokenizer)

In [13]:
from transformers import TrainingArguments, Trainer

# Set Model to Train
model.train()

training_args = TrainingArguments(
    output_dir="./llama3-kaggle-checkpoints",
    num_train_epochs=3,  # Keep low to avoid timeouts
    per_device_train_batch_size=1,  # Keep batch size low
    gradient_accumulation_steps=8,  # Helps with small batch size
    learning_rate=5e-6,
    save_strategy="no",  # Avoid saving large model checkpoints
    logging_steps=10,
    fp16=True,  # Enable mixed precision for lower memory
    optim="adamw_torch",
    report_to="none"
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # Your tokenized dataset
    tokenizer=tokenizer
)

# Train the Model
trainer.train()

# Save LoRa Adapter Files
trainer.model.save_pretrained("./llama3-kaggle-checkpoints")

  trainer = Trainer(


Step,Training Loss


## Step 5: Preprocess Synthetic Student Population Data
Using the synthetic population dataset, construct prompts to test out the model's performance

In [18]:
fake_students = synthetic_student_population_df.to_dict(orient="records")

print(f"{len(fake_students)} student records available")
print("\nData record shape:\n")
print(fake_students[0])

1000 student records available

Data record shape:

{'id': 'S00001', 'name': 'Marc Solomon', 'gpa': 2.53, 'field_of_study': 'Anthropology', 'income': 73706}


In [20]:
from typing import Optional

@staticmethod
def determine_financial_aid_eligibility(gpa, field_of_study, income):
    """
    Uses a student record to determine coverage against some fictitious
    parameters
    """
    # Dummy logic for financial aid eligibility
    financial_aid = []
    requirements = []

    if gpa >= 3.6:
        requirements.append("GPA ≥ 3.6")
    if income <= 35000:
        requirements.append("Income ≤ $35,000")
    if field_of_study in [
        "Computer Science",
        "Engineering",
        "Mathematics",
        "IT",
        "Statistics",
    ]:
        financial_aid.append("STEM Excellence Award")
        requirements.append("Field of Study in STEM")
    if field_of_study in ["Psychology", "Sociology", "Social Work", "Anthropology"]:
        financial_aid.append("Behavioral and Social Sciences Grant")
        requirements.append("Field of Study in Behavioral and Social Sciences")
    if field_of_study in ["Paralegal", "Law"]:
        financial_aid.append("Legal Studies Full Ride")
        requirements.append("Field of Study in Legal Studies")

    return requirements, financial_aid

def retrieve_financial_aid_knowledge(student_id: Optional[str] = None, student_name: Optional[str] = None):
    """
    Retrieves financial aid eligibility knowledge for a student.
    """
    if student_id:
        student = next((item for item in fake_members if item["id"] == student_id), None)
    if student_name:
        student = next((item for item in fake_members if item["name"] == student_name), None)

    if not student:
        return None

    requirements, financial_aid = determine_financial_aid_eligibility(
        student["gpa"], student["field_of_study"], student["income"]
        )
        

    return {
        "id": student["id"],
        "name": student["name"],
        "gpa": student["gpa"],
        "field_of_study": student["field_of_study"],
        "income": student["income"],
        "financial_aid": financial_aid,
        "requirements": requirements,
    }

In [37]:
def format(student: dict) -> str:
    """
    Converts the retrieved student and financial aid knowledge into an LLM-readable prompt.
    """
    return f"Recipient {student['name']} (ID: {student['id']}) has a GPA of {student['gpa']} in {student['field_of_study']} " + \
        f"and an income of ${student['income']}. " + \
        f"Eligible financial aid: {', '.join(student['financial_aid']) if student.get('financial_aid') else 'None'}. " + \
        f"Requirements met: {', '.join(student['requirements']) if student.get('requirements') else 'None'}."


retrieved_knowledge = retrieve_financial_aid_knowledge("S00001")
print(retrieved_knowledge)  # Data as-is
print(format(retrieved_knowledge)) # Data formatted for LLM

{'id': 'S00001', 'name': 'Tracy Page', 'gpa': 3.28, 'field_of_study': 'Computer Science', 'income': 797572, 'financial_aid': ['STEM Excellence Award'], 'requirements': ['Field of Study in STEM']}
Recipient Tracy Page (ID: S00001) has a GPA of 3.28 in Computer Science and an income of $797572. Eligible financial aid: STEM Excellence Award. Requirements met: Field of Study in STEM.


## Step 6: Pre-build all prompts using synthetic population data

In [59]:
from typing import Optional, Tuple

def build_student_prompt(user_input: str, student_id: Optional[str] = None) -> Tuple[str, str]:
    """
    Builds student chat prompt for LLM and uses RAG to keep responses properly informed for
    the given student with student_id.
    """

    # student id available and valid, proceed with lookup
    knowledge = format(retrieve_financial_aid_knowledge(student_id))

    # Retrun both the knowledge rertieved and the prompt
    return (
        knowledge,
        f"""
        ### Role: You are an AI academic benefits evaluator. Answer the user's question based on the provided student's profile. Do not repeat the question or context, do not add any data not available regarding funding amount or renewals. Make sure to use the eligible financial aid and requirements met to inform their eligibility. \n\n### Context:\n{knowledge}\n\n### User's Question:\n{user_input}\n\n###Answer: 
        """
    )

In [60]:
# Define test dataset (Example questions)
def build_test_sentences():
    test_sentences = [
        f"What GPA do I need need to qualify for financial aid?",
        f"Am I eligible for scholarships?",
        f"What grants can I apply for?",
    ]

    return test_sentences

student_prompts = []

for student in fake_students:
    test_sentence = random.choice(build_test_sentences())
    knowledge_retrieved, prompt = build_student_prompt(user_input=test_sentence, student_id=student["id"])
    student_prompts.append(prompt)

student_prompts[:5]

["\n        ### Role: You are an AI academic benefits evaluator. Answer the user's question based on the provided student's profile. Do not repeat the question or context, do not add any data not available regarding funding amount or renewals. Make sure to use the eligible financial aid and requirements met to inform their eligibility. \n\n### Context:\nRecipient Tracy Page (ID: S00001) has a GPA of 3.28 in Computer Science and an income of $797572. Eligible financial aid: STEM Excellence Award. Requirements met: Field of Study in STEM.\n\n### User's Question:\nAm I eligible for scholarships?\n\n###Answer: \n        ",
 "\n        ### Role: You are an AI academic benefits evaluator. Answer the user's question based on the provided student's profile. Do not repeat the question or context, do not add any data not available regarding funding amount or renewals. Make sure to use the eligible financial aid and requirements met to inform their eligibility. \n\n### Context:\nRecipient David N

#### Set to start testing!

## Step 7: Run custom student inferences

In [48]:
# Insert custom inference
checkpoints_dir = "./llama3-kaggle-checkpoints"

# Enable CUDA if available
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load base LLaMA 3 3B model in 8-bit mode
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_8bit=True,  # Load in 8-bit mode
    device_map="auto",  # Automatically assigns to GPU
    token=secret_value
)

# Load fine-tuned LoRA adapter (Attaches the fine-tuned LoRA adapter to the base model)
model = PeftModel.from_pretrained(model, checkpoints_dir)
model = torch.compile(model) # To boost inference speed

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_path, token=secret_value)


The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.


In [61]:
# Select first prompt
first_prompt = random.choice(student_prompts)
first_prompt

"\n        ### Role: You are an AI academic benefits evaluator. Answer the user's question based on the provided student's profile. Do not repeat the question or context, do not add any data not available regarding funding amount or renewals. Make sure to use the eligible financial aid and requirements met to inform their eligibility. \n\n### Context:\nRecipient Jasmine Randall (ID: S00989) has a GPA of 3.01 in Anthropology and an income of $464150. Eligible financial aid: Behavioral and Social Sciences Grant. Requirements met: Field of Study in Behavioral and Social Sciences.\n\n### User's Question:\nAm I eligible for scholarships?\n\n###Answer: \n        "

In [62]:
def extract_answer(response_text):
    """
    Extracts the generated answer from the LLaMA response.
    """

    # Match the last "Answer:" followed by actual content
    match = re.search(
        r"(?i)(?:Your response:|Answer:)\s*\n*(.*?)(?=\n\n|\Z)",
        response_text,
        re.DOTALL,
    )

    if match:
        extracted_answer = match.group(1).strip()
        return extracted_answer
    else:
        return "No answer found."

In [63]:
def generate_text(prompt, max_new_tokens=100):
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Run first prompt
response = generate_text(first_prompt)
print(extract_answer(response))

You are eligible for Behavioral and Social Sciences Grant. You are eligible because you have a GPA of 3.01 in Anthropology and an income of $464150. Your field of study is in Behavioral and Social Sciences. You meet the requirements for Behavioral and Social Sciences Grant.


In [64]:
# Run another prompt
response = generate_text(random.choice(student_prompts))
print(extract_answer(response))

Yes, you are eligible for the STEM Excellence Award, which provides $10,000 for each year of study. This award is only available to students who are pursuing a degree in a STEM field and have a GPA of 3.0 or higher. You are also eligible for other scholarships and grants based on your income and financial need. Please contact the Financial Aid Office for more information on available aid options.


## Step 8: Merge and Save the Model

In [66]:
# Merge LoRA weights with the base model and save the final merged version
fine_tuned_model_storage_path = "./llama3-student-aid-finetuned"
model = model.merge_and_unload()
model.save_pretrained(fine_tuned_model_storage_path)
tokenizer.save_pretrained(fine_tuned_model_storage_path)



('./llama3-student-aid-finetuned/tokenizer_config.json',
 './llama3-student-aid-finetuned/special_tokens_map.json',
 './llama3-student-aid-finetuned/tokenizer.json')

## Step 8: Evaluate Model

### TBD - requires uploading separate dataset