# Fine-Tuned PHI 4 Mini - Financial Aid Chat Use Case

This notebook downloads the Microsoft PHI 4 Mini model from HuggingFace and fine-tunes it on a pre-made synthetic dataset of fictitious student records. LoRA adapters are used to fine-tune the model efficiently: the base model stays frozen and only the additional adapter layers are trained, which reduces memory usage. The dataset contains no real data, and to the best of the author's knowledge the code is free from any conflict of interest.

View on Kaggle: [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/code/edyvision/fine-tuned-phi-4-mini-financial-aid-chat-usecase)


## Step 1: Install Dependencies

In [None]:
!pip install -U torch transformers psutil GPUtil seaborn matplotlib fvcore evaluate  prettytable peft accelerate

## Step 2: Import dependencies and deal with memory settings

In [1]:
import os
import pandas as pd
import re
import torch
import random
from sklearn.model_selection import train_test_split
from time import time
# For LLM
from peft import get_peft_model, LoraConfig, PeftModel, TaskType
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
    DataCollatorWithPadding
)


In [13]:
# Set the device (local is MPS, deployed will almost always be CUDA)
device = torch.device("mps") if torch.backends.mps.is_available() else torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

print(f"Using device: {device}")


Using device: cuda


In [3]:
# Import HuggingFace Secret
from kaggle_secrets import UserSecretsClient

secret_label = "HUGGINGFACE_TOKEN"
secret_value = UserSecretsClient().get_secret(secret_label)

In [4]:
local_storage_dir = "/kaggle/working"

## Step 3: Load Datasets and Synthetic Database

In [5]:
# Load Data to Fine Tune PHI-4
root_path = '/kaggle/input/student-chat-data/student_chat_ai_data'
train_df = pd.read_csv(f"{root_path}/train_data.csv", index_col=0)
eval_df = pd.read_csv(f"{root_path}/eval_data.csv", index_col=0)
synthetic_student_population_df = pd.read_csv(f"{root_path}/synthetic_population_data.csv", index_col=0)

In [6]:
train_df

Unnamed: 0,context,user_input,answer
0,Student Linda Bailey (ID: S00001) has a GPA of...,What financial aid do I qualify for?,You qualify for the Need-Based Grant because y...
1,Student John Doe (ID: S00002) has a GPA of 3.8...,What financial aid am I eligible for?,John Doe qualifies for the Merit-Based Scholar...
2,Student Sarah Carter (ID: S00003) has a GPA of...,Can I apply for any financial aid?,Yes! You qualify for the STEM Excellence Award...
3,Student John Doe (ID: S10123) has a GPA of 2.3...,What am I to be eligible for financial aid?,"Due to your income, you do not qualify for a n..."


In [7]:
eval_df

Unnamed: 0,context,user_input,answer
0,Student Emily Carter (ID: S00023) has a GPA of...,Am I eligible for any financial aid?,"Yes, you qualify for the STEM Excellence Award..."
1,Student Mark Smith (ID: S00045) has a GPA of 2...,Can I get coverage with my education program?,"Unfortunately, you do not qualify for any fina..."
2,Student Olivia Jones (ID: S00030) has a GPA of...,Which financial aid can do I qualify for?,You qualify for the STEM Excellence Award sinc...
3,Student Daniel Thompson (ID: S00051) has a GPA...,What financial aid options do I have?,You are eligible for the Need-Based Grant sinc...


In [8]:
synthetic_student_population_df

Unnamed: 0,id,name,gpa,field_of_study,income
0,S00001,Marc Solomon,2.53,Anthropology,73706
1,S00002,Suzanne Kidd,2.50,Business Administration,22259
2,S00003,Michelle Griffin,2.42,Political Science,31192
3,S00004,Hannah Dixon,2.05,Engineering,530431
4,S00005,Mary Harris,3.13,Chemistry,370142
...,...,...,...,...,...
995,S00996,Nicole Clay,3.97,Chemistry,353792
996,S00997,Tammy Rose,3.34,Computer Science,414283
997,S00998,Jeff Alvarez,3.96,Psychology,450287
998,S00999,Thomas Lowe,3.02,Business Administration,54775


## Step 4: Load PHI 4 Mini Model and Train

In [10]:
import logging

# Configure logging
logging.basicConfig(
    level=logging.INFO,  # Set logging level
    format="%(asctime)s - %(levelname)s - %(message)s"
)

logger = logging.getLogger(__name__)

In [9]:
from torch.utils.data import Dataset

class StudentDialogueDataset(Dataset):
    def __init__(self, data, tokenizer, device, model_type, max_length=512):
        self.tokenizer = tokenizer
        self.data = data
        self.max_length = max_length
        self.model_type = model_type
        self.device = device

    def format_instruct(self, example):
        """Format dataset for instruction tuning using system prompt, context, user input, and answer."""
        system_prompt = "### System:\nYou are an AI financial aid evaluator. Answer the user's question based on the provided student's profile.\n"

        context = f"### Context:\n{example['context']}\n" if example["context"] else ""
        instruction = f"### User Input:\n{example['user_input']}\n"
        response = f"### Response:\n{example['answer']}\n"

        return system_prompt + context + instruction + response

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        item = self.data[idx]

        # Choose formatting based on `model_type`
        if self.model_type == "instruct":
            formatted_text = self.format_instruct(item)
        else:  # "causal"
            formatted_text = (
                item["context"] + " " + item["user_input"] + " " + item["answer"]
            )

        encoding = self.tokenizer(
            formatted_text,
            padding="max_length",
            truncation=True,
            max_length=self.max_length,
            return_tensors="pt",
        )

        input_ids = encoding["input_ids"].squeeze(0).to(self.device)
        attention_mask = encoding["attention_mask"].squeeze(0).to(self.device)

        return {
            "input_ids": input_ids,
            "attention_mask": attention_mask,
            "labels": input_ids.clone(),  # Causal LM: Input = Label (next-token prediction)
        }
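
The dataset above copies `input_ids` into `labels` unchanged, so the positions added by `padding="max_length"` also count toward the loss. A common alternative is to set the label of every padded position to `-100`, the index that PyTorch's cross-entropy loss ignores by default. The following is a minimal sketch of that masking on plain Python lists; `mask_padding_labels` is a hypothetical helper and the token IDs are made up.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_padding_labels(input_ids, attention_mask, ignore_index=IGNORE_INDEX):
    """Return a copy of input_ids with padded positions replaced by ignore_index."""
    return [
        token if keep == 1 else ignore_index
        for token, keep in zip(input_ids, attention_mask)
    ]


# Toy example: three real tokens followed by two pad tokens (IDs are illustrative).
input_ids = [101, 2054, 2003, 0, 0]
attention_mask = [1, 1, 1, 0, 0]

labels = mask_padding_labels(input_ids, attention_mask)
print(labels)  # [101, 2054, 2003, -100, -100]
```

With tensors, the same effect can be obtained inside `__getitem__` by cloning `input_ids` and assigning `-100` wherever `attention_mask == 0`.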


In [11]:
class CausalModelTrainer:
    """Causal Model Trainer"""

    def __init__(
        self,
        model_name,
        dataset,
        logger,
        device="mps",
        huggingface_token=None,
    ):
        """init"""

        self.model = None
        self.tokenizer = None
        self.model_name = model_name  # HuggingFace model ID or local path
        self.dataset = dataset
        self.logger = logger
        self.huggingface_token = huggingface_token
        self.device = device

    def load_model(
        self,
        lora_target_modules=None,
        compute_dtype="float16",
        use_cache=False,
        device_map="auto",
        apply_lora=True,
    ):
        """Loads the base model and applies LoRA."""

        if not lora_target_modules and apply_lora:
            raise ValueError("Please specify target modules for LoRA.")

        self.logger.info(f"Model will be moved to {self.device}")

        # Record the start time to measure the loading duration
        time_start = time()

        # Load the pre-trained base model with specified configurations
        # (use the name passed to the constructor instead of a notebook global)
        self.model = AutoModelForCausalLM.from_pretrained(
                self.model_name,
                torch_dtype=compute_dtype,
                use_cache=use_cache,
                device_map=self.device,
                token=self.huggingface_token,
            )
        self.tokenizer = AutoTokenizer.from_pretrained(
                self.model_name,
                token=self.huggingface_token,
            )

        self.logger.info(f"Model loaded in {time() - time_start:.2f} seconds.")

        if apply_lora:
            # Apply LoRA to the model
            self.model = self.apply_lora(
                model=self.model, target_modules=lora_target_modules
            )

        return self.model, self.tokenizer

    def apply_lora(self, model, target_modules, print_trainable_params=True):
        """Applies LoRA."""

        lora_config = LoraConfig(
            r=16,  # Rank (smaller = more efficient)
            lora_alpha=64,
            lora_dropout=0.1,
            target_modules=target_modules,  # Specify the correct layers
            task_type=TaskType.CAUSAL_LM,  # For language modeling
        )

        # Apply LoRA to model
        model = get_peft_model(model, lora_config)
        model.to(self.device)

        if print_trainable_params:
            # After applying LoRA, confirm which layers are trainable
            model.print_trainable_parameters()

        return model

    def load_dataset(self, dataset_class, dataset=None, model_type="instruct", test_size=0.2):
        """Splits a DataFrame into train/test sets and wraps each split in `dataset_class`."""
        # Fall back to the DataFrame stored on the trainer when none is passed in
        data = dataset if dataset is not None else self.dataset

        # The answers are free text, so the split is not stratified
        train_data, test_data = train_test_split(
            data,
            test_size=test_size,
            random_state=42,
        )

        train_data = train_data.reset_index(drop=True)
        test_data = test_data.reset_index(drop=True)

        self.train_dataset = dataset_class(
            train_data.to_dict(orient="records"),
            self.tokenizer,
            device=self.device,
            model_type=model_type,
        )

        self.test_dataset = dataset_class(
            test_data.to_dict(orient="records"),
            self.tokenizer,
            device=self.device,
            model_type=model_type,
        )

        return self.train_dataset, self.test_dataset

    def load_lora_adapters(self, base_model, adapter_path, merge_adapters=True):
        """Loads the base model and LoRA adapters"""
        # Load the base model
        self.model = PeftModel.from_pretrained(base_model, adapter_path)

        if merge_adapters:
            # Merge the LoRA adapters into the base model
            self.model = self.model.merge_and_unload()

        return self.model

    def train(
        self,
        training_arguments,
        train_dataset,
        eval_dataset,
        output_dir="./dist/instruct_lora/checkpoints",
        adapter_path="./dist/instruct_lora",
        return_tensors="pt",
        pad_to_multiple_of=8,
    ):
        """Runs the training process using Hugging Face Trainer."""
        data_collator = DataCollatorWithPadding(
            tokenizer=self.tokenizer,
            pad_to_multiple_of=pad_to_multiple_of,
            return_tensors=return_tensors,
        )

        trainer = Trainer(
            model=self.model,
            args=training_arguments,
            train_dataset=train_dataset,
            eval_dataset=eval_dataset,
            data_collator=data_collator,
        )

        # Train the model
        trainer.train()

        # Save LoRA adapter files
        trainer.save_model(adapter_path)
        self.logger.info(f"Model LoRA adapter files saved at {adapter_path}")


In [14]:
from time import time

# Import Base Model (used for all use cases)
model_name = "Phi-4-mini-instruct"
model_path = "microsoft/Phi-4-mini-instruct"

model_trainer = CausalModelTrainer(
    model_name=model_path,
    dataset=train_df,
    logger=logger,
    device=device,
    huggingface_token=secret_value
)

model, tokenizer = model_trainer.load_model(
    apply_lora=True,
    lora_target_modules=["q_proj", "v_proj", "k_proj", "o_proj", "fc1", "fc2"]
    )

model.safetensors.index.json:   0%|          | 0.00/16.3k [00:00<?, ?B/s]

Fetching 2 files:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/4.90G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/2.77G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/168 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/2.93k [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/3.91M [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/2.42M [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/15.5M [00:00<?, ?B/s]

added_tokens.json:   0%|          | 0.00/249 [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/587 [00:00<?, ?B/s]

trainable params: 3,145,728 || all params: 3,839,167,488 || trainable%: 0.0819
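
The `trainable params` line is a useful check that the LoRA target names matched the layers you expected. PEFT matches each entry of `target_modules` against the end of every module name, so a name that does not occur in the model is silently skipped. The sketch below mimics that suffix rule on a list of names. The names in `example_names` are an assumption (a Phi-3-style layout with fused `qkv_proj` and `gate_up_proj` layers), as are the hidden size of 3072 and the 32 decoder layers; the real values should be read from `model.named_modules()` and `model.config`. Both helper functions are hypothetical.

```python
def matched_targets(module_names, target_modules):
    """Map each target to the module names it matches (exact name or dotted suffix)."""
    return {
        target: [n for n in module_names if n == target or n.endswith("." + target)]
        for target in target_modules
    }


def lora_param_count(in_features, out_features, rank, num_layers):
    """LoRA adds an (rank x in) and an (out x rank) matrix per adapted layer."""
    return rank * (in_features + out_features) * num_layers


# ASSUMPTION: illustrative module names; get the real ones with
# [name for name, _ in model.named_modules()].
example_names = [
    "model.layers.0.self_attn.qkv_proj",
    "model.layers.0.self_attn.o_proj",
    "model.layers.0.mlp.gate_up_proj",
    "model.layers.0.mlp.down_proj",
]
targets = ["q_proj", "v_proj", "k_proj", "o_proj", "fc1", "fc2"]

matches = matched_targets(example_names, targets)
unmatched = [target for target, names in matches.items() if not names]
print("Unmatched targets:", unmatched)

# ASSUMPTION: hidden size 3072 and 32 decoder layers, with only o_proj adapted.
o_proj_only = lora_param_count(3072, 3072, rank=16, num_layers=32)
print(o_proj_only)  # 3145728

# Share of trainable parameters, using the two totals printed above.
trainable_pct = 100 * 3_145_728 / 3_839_167_488
print(f"{trainable_pct:.4f}")  # 0.0819
```

Under these assumptions, the count for `o_proj` alone equals the 3,145,728 reported above, which would be consistent with only one of the six target names matching. Listing the model's module names is the reliable way to confirm this.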


In [16]:
train_dataset = StudentDialogueDataset(train_df.to_dict(orient='records'), tokenizer, model_type="instruct", device=device)
eval_dataset = StudentDialogueDataset(eval_df.to_dict(orient='records'), tokenizer, model_type="instruct", device=device)

In [17]:
checkpoints_dir = f"{local_storage_dir}/models/phi4_chat_lora/checkpoints"
lora_adapter_path = f"{local_storage_dir}/models/phi4_chat_lora/lora_adapter"

In [18]:
# The time has come, empty cache
import gc
def clear_cuda():
    gc.collect()
    torch.cuda.empty_cache()
    torch.cuda.ipc_collect()
    print("GPU memory cleared.")

clear_cuda() if torch.cuda.is_available() else torch.mps.empty_cache() if torch.backends.mps.is_available() else None

GPU memory cleared.


In [19]:
training_args = TrainingArguments(
    output_dir=checkpoints_dir,
    per_device_train_batch_size=2,  # Reduce batch size for MPS stability
    per_device_eval_batch_size=2,
    eval_strategy="epoch",
    save_strategy="epoch",
    logging_strategy="steps",
    logging_steps=10,
    learning_rate=2e-5,
    num_train_epochs=5,
    fp16=True,  # Use `bf16` if on MPS, it supports it
    dataloader_pin_memory=False,
    report_to="none",
    warmup_steps=100,  # Add a warmup phase
    push_to_hub=False,
    load_best_model_at_end=True, # if you want the models at the end, set to 'True'
)

model_trainer.train(
    training_arguments=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    output_dir=checkpoints_dir,
    adapter_path=lora_adapter_path,
    pad_to_multiple_of=8,
    return_tensors="pt"
    )


No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


Epoch,Training Loss,Validation Loss
1,No log,7.738935
2,No log,7.745264
3,No log,7.736712
4,No log,7.726398
5,7.353900,7.690777
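
The validation loss barely changes across the five epochs. One factor that may contribute is the learning-rate schedule: `warmup_steps=100` is far larger than the number of optimizer steps in this run, so the learning rate never reaches its configured value. The sketch below works through the arithmetic. It assumes the four training rows shown in `train_df`, a single device, no gradient accumulation, and the linear warmup that `Trainer` uses by default; the helper functions are hypothetical.

```python
import math


def total_optimizer_steps(num_examples, batch_size, epochs):
    """Optimizer steps for a run without gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


def linear_warmup_lr(step, base_lr, warmup_steps):
    """Learning rate during linear warmup (the later decay phase is not modelled)."""
    return base_lr * min(1.0, step / max(1, warmup_steps))


# ASSUMPTION: 4 training examples, as displayed in train_df above.
steps = total_optimizer_steps(num_examples=4, batch_size=2, epochs=5)
print(steps)  # 10

lr_at_end = linear_warmup_lr(step=steps, base_lr=2e-5, warmup_steps=100)
print(f"{lr_at_end:.1e}")  # 2.0e-06
```

Ten total steps also matches the log table: with `logging_steps=10`, the first training loss appears only at the final epoch. Under these assumptions the peak learning rate is about a tenth of `2e-5`, so a shorter warmup or more training steps may be worth trying.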


In [20]:
# Empty cache ahead of merge and unload
clear_cuda() if torch.cuda.is_available() else torch.mps.empty_cache() if torch.backends.mps.is_available() else None

GPU memory cleared.


In [22]:
# Load Base Model
new_model_trainer = CausalModelTrainer(
    model_name=model_path,
    dataset=train_df,
    logger=logger,
    device=device,
    huggingface_token=secret_value
)
base_model, tokenizer = new_model_trainer.load_model(apply_lora=False)

# Load LoRA Adapters
model = new_model_trainer.load_lora_adapters(
    base_model,
    lora_adapter_path,
    merge_adapters=True  # Merge adapters into base model
    )

# Save the Fully Fine-Tuned Model
tuned_version = 1
tuned_model_name = "chat"
tuned_model_storage_dir = f"{local_storage_dir}/models/phi4_mini/{tuned_model_name}"
tuned_model_storage_path = f"{tuned_model_storage_dir}/v{tuned_version}.0"
s3_model_storage_dir = f"phi4_mini/{tuned_model_name}/v{tuned_version}.0"

model.save_pretrained(tuned_model_storage_path)
tokenizer.save_pretrained(tuned_model_storage_path)

print(f"\nMerged {model_name} model saved (base model + LoRA adapters)!")
print(f"Model saved to {tuned_model_storage_path}")

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]


Merged Phi-4-mini-instruct model saved (base model + LoRA adapters)!
Model saved to /kaggle/working/models/phi4_mini/chat/v1.0
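
`merge_and_unload()` folds each adapter into its base weight, so the saved model contains ordinary layers and no separate LoRA matrices. Conceptually, each adapted weight becomes `W + (lora_alpha / r) * B @ A`. The sketch below shows that update on tiny hand-written matrices in plain Python. The values are made up and the helpers are hypothetical; the scale of 4 mirrors `lora_alpha=64` and `r=16` from the configuration above.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, lora_alpha, rank):
    """Return weight + (lora_alpha / rank) * (lora_b @ lora_a)."""
    scale = lora_alpha / rank
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


weight = [[1.0, 0.0], [0.0, 1.0]]  # 2x2 base weight
lora_a = [[1.0, 2.0]]              # r=1 x in=2
lora_b = [[0.5], [0.0]]            # out=2 x r=1

# lora_alpha / rank = 4, the same ratio as 64 / 16 in the notebook's LoraConfig.
merged = merge_lora(weight, lora_a, lora_b, lora_alpha=4, rank=1)
print(merged)  # [[3.0, 4.0], [0.0, 1.0]]
```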


That is everything required to download the base model, fine-tune it, and save the fine-tuned model. The remainder of this notebook uses the model for the student chat use case.

## Step 5: Preprocess Synthetic Student Data
Using the synthetic student dataset, construct prompts to test the model's performance.

In [23]:
fake_students = synthetic_student_population_df.to_dict(orient="records")

print(f"{len(fake_students)} student records available")
print("\nData record shape:\n")
print(fake_students[0])

1000 student records available

Data record shape:

{'id': 'S00001', 'name': 'Marc Solomon', 'gpa': 2.53, 'field_of_study': 'Anthropology', 'income': 73706}


In [24]:
from typing import Optional

def determine_financial_aid_eligibility(gpa, field_of_study, income):
    """
    Uses a student record to determine coverage against some fictitious
    parameters
    """
    # Dummy logic for financial aid eligibility
    financial_aid = []
    requirements = []

    if gpa >= 3.6:
        requirements.append("GPA ≥ 3.6")
    if income <= 35000:
        requirements.append("Income ≤ $35,000")
    if field_of_study in [
        "Computer Science",
        "Engineering",
        "Mathematics",
        "IT",
        "Statistics",
    ]:
        financial_aid.append("STEM Excellence Award")
        requirements.append("Field of Study in STEM")
    if field_of_study in ["Psychology", "Sociology", "Social Work", "Anthropology"]:
        financial_aid.append("Behavioral and Social Sciences Grant")
        requirements.append("Field of Study in Behavioral and Social Sciences")
    if field_of_study in ["Paralegal", "Law"]:
        financial_aid.append("Legal Studies Full Ride")
        requirements.append("Field of Study in Legal Studies")

    return requirements, financial_aid

def retrieve_financial_aid_knowledge(student_id: Optional[str] = None, student_name: Optional[str] = None):
    """
    Retrieves financial aid eligibility knowledge for a student.
    """
    student = None  # Stays None when neither an ID nor a name is given
    if student_id:
        student = next((item for item in fake_students if item["id"] == student_id), None)
    if student_name:
        student = next((item for item in fake_students if item["name"] == student_name), None)

    if not student:
        return None

    requirements, financial_aid = determine_financial_aid_eligibility(
        student["gpa"], student["field_of_study"], student["income"]
        )
        

    return {
        "id": student["id"],
        "name": student["name"],
        "gpa": student["gpa"],
        "field_of_study": student["field_of_study"],
        "income": student["income"],
        "financial_aid": financial_aid,
        "requirements": requirements,
    }

In [25]:
def format(student: dict) -> str:
    """
    Converts the retrieved student and financial aid knowledge into an LLM-readable prompt.
    """
    return f"Student {student['name']} (ID: {student['id']}) has a GPA of {student['gpa']} in {student['field_of_study']} " + \
        f"and an income of ${student['income']}. " + \
        f"Eligible financial aid: {', '.join(student['financial_aid']) if student.get('financial_aid') else 'None'}. " + \
        f"Requirements met: {', '.join(student['requirements']) if student.get('requirements') else 'None'}."


retrieved_knowledge = retrieve_financial_aid_knowledge("S00001")
print(retrieved_knowledge)  # Data as-is
print(format(retrieved_knowledge)) # Data formatted for LLM

{'id': 'S00001', 'name': 'Marc Solomon', 'gpa': 2.53, 'field_of_study': 'Anthropology', 'income': 73706, 'financial_aid': ['Behavioral and Social Sciences Grant'], 'requirements': ['Field of Study in Behavioral and Social Sciences']}
Student Marc Solomon (ID: S00001) has a GPA of 2.53 in Anthropology and an income of $73706. Eligible financial aid: Behavioral and Social Sciences Grant. Requirements met: Field of Study in Behavioral and Social Sciences.


## Step 6: Pre-build all prompts using synthetic population data

In [26]:
from typing import Optional, Tuple

def build_student_prompt(user_input: str, student_id: Optional[str] = None) -> Tuple[str, str]:
    """
    Builds student chat prompt for LLM and uses RAG to keep responses properly informed for
    the given student with student_id.
    """

    # student id available and valid, proceed with lookup
    knowledge = format(retrieve_financial_aid_knowledge(student_id))

    # Return both the retrieved knowledge and the prompt
    return (
        knowledge,
        f"""
        ### Role: You are a financial aid evaluator. Answer the user's question based on the provided student's profile. Do not repeat the question or context, do not add any data not available regarding funding amount or renewals. Make sure to use the eligible financial aid and requirements met to inform their eligibility. \n\n### Context:\n{knowledge}\n\n### User's Question:\n{user_input}\n\n###Answer: 
        """
    )
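
The prompt built here uses different section markers (`### Role:`, `### User's Question:`, `###Answer:`) from those in `format_instruct` during fine-tuning (`### System:`, `### User Input:`, `### Response:`). A fine-tuned model usually responds most reliably to the template it saw in training, so one option is to build both strings from a single helper. The sketch below is one way to do that; `build_instruct_prompt` and `SYSTEM_PROMPT` are hypothetical names, and the example context is made up.

```python
SYSTEM_PROMPT = (
    "You are an AI financial aid evaluator. Answer the user's question "
    "based on the provided student's profile."
)


def build_instruct_prompt(context, user_input, answer=None):
    """Build a training example (answer given) or an inference prompt (answer=None)."""
    parts = [f"### System:\n{SYSTEM_PROMPT}\n"]
    if context:
        parts.append(f"### Context:\n{context}\n")
    parts.append(f"### User Input:\n{user_input}\n")
    # Training text includes the answer; an inference prompt stops at the marker.
    if answer is not None:
        parts.append(f"### Response:\n{answer}\n")
    else:
        parts.append("### Response:\n")
    return "".join(parts)


context = "Student Jane Roe (ID: S99999) has a GPA of 3.7 in Mathematics."
question = "Am I eligible for scholarships?"

inference_prompt = build_instruct_prompt(context, question)
training_text = build_instruct_prompt(context, question, answer="Yes, you qualify.")
print(inference_prompt)
```

If this template were adopted, the answer-extraction pattern used later would also need to look for `### Response:` rather than `Answer:`.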

In [27]:
# Define test dataset (Example questions)
def build_test_sentences():
    test_sentences = [
        "What GPA do I need to qualify for financial aid?",
        "Am I eligible for scholarships?",
        "What grants can I apply for?",
    ]

    return test_sentences

student_prompts = []

for student in fake_students:
    test_sentence = random.choice(build_test_sentences())
    knowledge_retrieved, prompt = build_student_prompt(user_input=test_sentence, student_id=student["id"])
    student_prompts.append(prompt)

student_prompts[:5]

["\n        ### Role: You are a financial aid evaluator. Answer the user's question based on the provided student's profile. Do not repeat the question or context, do not add any data not available regarding funding amount or renewals. Make sure to use the eligible financial aid and requirements met to inform their eligibility. \n\n### Context:\nStudent Marc Solomon (ID: S00001) has a GPA of 2.53 in Anthropology and an income of $73706. Eligible financial aid: Behavioral and Social Sciences Grant. Requirements met: Field of Study in Behavioral and Social Sciences.\n\n### User's Question:\nWhat grants can I apply for?\n\n###Answer: \n        ",
 "\n        ### Role: You are a financial aid evaluator. Answer the user's question based on the provided student's profile. Do not repeat the question or context, do not add any data not available regarding funding amount or renewals. Make sure to use the eligible financial aid and requirements met to inform their eligibility. \n\n### Context:\n

#### Set to start testing!

## Step 7: Run custom student inferences

In [28]:
model = torch.compile(model)

In [29]:
# Select a random prompt to run first
first_prompt = random.choice(student_prompts)
first_prompt

"\n        ### Role: You are a financial aid evaluator. Answer the user's question based on the provided student's profile. Do not repeat the question or context, do not add any data not available regarding funding amount or renewals. Make sure to use the eligible financial aid and requirements met to inform their eligibility. \n\n### Context:\nStudent Steven Kaiser (ID: S00572) has a GPA of 2.97 in Mathematics and an income of $136494. Eligible financial aid: STEM Excellence Award. Requirements met: Field of Study in STEM.\n\n### User's Question:\nAm I eligible for scholarships?\n\n###Answer: \n        "

In [30]:
def extract_answer(response_text):
    """
    Extracts the generated answer from the LLM response.
    """

    # Match the first "Answer:" (or "Your response:") followed by actual content
    match = re.search(
        r"(?i)(?:Your response:|Answer:)\s*\n*(.*?)(?=\n\n|\Z)",
        response_text,
        re.DOTALL,
    )

    if match:
        extracted_answer = match.group(1).strip()
        return extracted_answer
    else:
        return "No answer found."
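
Because `generate` returns the prompt followed by the continuation, the extraction step has to skip past the prompt text. The sketch below reuses the same regular expression on hand-written strings to show what it captures: the text after the first `Answer:` marker, up to the next blank line. The sample responses are made up, and `extract_first_answer` is a hypothetical stand-in for the function above.

```python
import re

ANSWER_PATTERN = re.compile(
    r"(?i)(?:Your response:|Answer:)\s*\n*(.*?)(?=\n\n|\Z)",
    re.DOTALL,
)


def extract_first_answer(response_text):
    """Return the text after the first answer marker, up to the next blank line."""
    match = ANSWER_PATTERN.search(response_text)
    return match.group(1).strip() if match else "No answer found."


# Made-up decoded output: prompt text, then the generated answer, then extra text.
sample = (
    "### Context:\nStudent A has a GPA of 3.9.\n\n"
    "### User's Question:\nAm I eligible for scholarships?\n\n"
    "###Answer: \n        Yes, you are eligible.\n\nUnrelated trailing text."
)

print(extract_first_answer(sample))                              # Yes, you are eligible.
print(extract_first_answer("Answer: first\n\nAnswer: second"))   # first
print(extract_first_answer("No marker here"))                    # No answer found.
```

Since the capture stops at the first blank line, an answer that spans several paragraphs would be cut after its first paragraph.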

In [32]:
def generate_text(prompt, max_new_tokens=100):
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Run first prompt
response = generate_text(first_prompt)
print(extract_answer(response))

Yes, Steven Kaiser is eligible for the STEM Excellence Award.


In [33]:
# Run another prompt
response = generate_text(random.choice(student_prompts))
print(extract_answer(response))

David Smith is not eligible for any grants as there are no eligible financial aids listed for his profile. His GPA meets the requirement, but the absence of eligible financial aids means no grants can be applied for.
