### This notebook fine-tunes the Mistral-7B-Instruct model on instruction-style (Alpaca-format) prompts, with a focus on aligning responses to emotionally grounded conversational data.

### Dataset used: Empathetic Dialogue Dataset

### Importing Necessary Libraries

In [None]:
%%capture
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers trl peft accelerate bitsandbytes

In [None]:
import unsloth
import torch
import numpy as np
import transformers
import bitsandbytes as bnb
import xformers
import accelerate
import peft
import datasets
import trl

## Model Prep

In [None]:
import torch
from unsloth import FastLanguageModel
import os
os.environ["TRITON_DISABLE_LINE_INFO"] = "1"

In [None]:
max_seq_length = 2048
dtype = None
load_in_4bit = True

FastLanguageModel.forbid_torch_xformers = True

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name =  "unsloth/mistral-7b-instruct-v0.2-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

==((====))==  Unsloth 2025.3.18: Fast Mistral patching. Transformers: 4.51.3.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.7.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.3.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.30. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


We now add LoRA adapters so that only a small fraction of the parameters needs to be updated (about 0.6% in this run, per the training log further down).

In [None]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
    use_rslora = False,
    loftq_config = None,
)

Unsloth 2025.3.18 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
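
The number of trainable parameters this configuration produces can be checked by hand. The sketch below assumes the standard Mistral-7B layer shapes (hidden size 4096, MLP size 14336, 8 key/value heads of dimension 128, 32 layers), none of which are stated in this notebook. Each LoRA adapter on a `d_in x d_out` projection adds `r * (d_in + d_out)` weights.

```python
# LoRA adds two low-rank matrices per adapted projection:
# A (r x d_in) and B (d_out x r), i.e. r * (d_in + d_out) weights.
R = 16  # LoRA rank used in get_peft_model above

# Assumed Mistral-7B shapes (not taken from this notebook).
HIDDEN = 4096
KV_DIM = 8 * 128   # 8 key/value heads of dimension 128
MLP = 14336
NUM_LAYERS = 32

# (d_in, d_out) for each entry in target_modules.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}

def lora_param_count(r, projections, num_layers):
    """Total LoRA weights across all adapted projections and layers."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in projections.values())
    return per_layer * num_layers

total = lora_param_count(R, PROJECTIONS, NUM_LAYERS)
print(f"{total:,} trainable LoRA parameters")  # 41,943,040
```

Under these assumptions the result matches the "Trainable parameters = 41,943,040" figure that Unsloth reports when training starts.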


In [None]:
from peft import PeftModel
model = PeftModel.from_pretrained(model,"/content/drive/MyDrive/fine_tuned_mistral_dailydialog")


## Data Prep


In [None]:
from unsloth import FastLanguageModel
from datasets import load_dataset

# Defining Alpaca-style prompt
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# Load dataset
dataset = load_dataset("talalmuzaffar/empathetic_dataset_with_correct_context", split="train")

In [None]:
print(dataset)

Dataset({
    features: ['Context', 'Input', 'Response', '__index_level_0__'],
    num_rows: 8000
})


In [None]:
# Drop the unnecessary index column
dataset = dataset.remove_columns(['__index_level_0__'])

In [None]:
print(dataset[0])

{'Context': 'You are an empathetic chatbot and your goal is to respond empathetically and ask clarifying questions.', 'Input': 'I was so embarrassed.  I farted while I was out on a date.', 'Response': 'Oh_comma_ lord. Did the other person die?'}
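
The sample row contains the literal placeholder `_comma_` where a comma belongs. The notebook trains on the text as-is. If cleaner targets are wanted, a minimal sketch of a normalisation step could look like the following; the helper name `restore_commas` is hypothetical and not part of the original workflow.

```python
def restore_commas(text: str) -> str:
    """Replace the dataset's `_comma_` placeholder with a literal comma."""
    return text.replace("_comma_", ",")

sample = "Oh_comma_ lord. Did the other person die?"
print(restore_commas(sample))  # Oh, lord. Did the other person die?
```

One option would be to apply this to the `Input` and `Response` columns with `dataset.map` before the prompts are formatted. Whether that improves the fine-tuned model is not something this notebook measures.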


In [None]:
print(dataset.column_names)

['Context', 'Input', 'Response']


In [None]:
# Rename dataset columns to match the Alpaca format
dataset = dataset.rename_columns({
    "Context": "instruction",
    "Input": "input",
    "Response": "output"
})

# Format the dataset
EOS_TOKEN = tokenizer.eos_token  # Add EOS_TOKEN

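def formatting_prompts_func(examples):
    """Fill the Alpaca template for each row of a batch and append EOS."""
    instructions = examples["instruction"]
    inputs = examples["input"]
    outputs = examples["output"]
    texts = []
    # `user_input` avoids shadowing the built-in `input`.
    for instruction, user_input, output in zip(instructions, inputs, outputs):
        # EOS marks the end of the response so the model learns where to stop.
        text = alpaca_prompt.format(instruction, user_input, output) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}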

# Apply formatting
dataset = dataset.map(formatting_prompts_func, batched=True)


In [None]:
print(dataset.column_names)

['instruction', 'input', 'output', 'text']


In [None]:
from datasets import DatasetDict

dataset2 = dataset

split_dataset = dataset2.train_test_split(test_size=0.2, seed=42)
val_test_split = split_dataset['test'].train_test_split(test_size=0.5, seed=42)

final_splits = DatasetDict({
    'train': split_dataset['train'],
    'validation': val_test_split['train'],
    'test': val_test_split['test']
})
print(final_splits)

DatasetDict({
    train: Dataset({
        features: ['instruction', 'input', 'output', 'text'],
        num_rows: 6400
    })
    validation: Dataset({
        features: ['instruction', 'input', 'output', 'text'],
        num_rows: 800
    })
    test: Dataset({
        features: ['instruction', 'input', 'output', 'text'],
        num_rows: 800
    })
})
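
The two-stage split yields an 80/10/10 partition: 20% is held out first, then that holdout is halved. A small sketch of the arithmetic follows. The helper `split_sizes` is hypothetical, and its rounding rule (test part rounded up) is an assumption that only matters when the sizes do not divide evenly.

```python
import math

def split_sizes(n, test_size):
    """Return (train, test) sizes for a fractional test_size, rounding the test part up."""
    n_test = math.ceil(n * test_size)
    return n - n_test, n_test

train, holdout = split_sizes(8000, 0.2)       # first split: 80% train, 20% holdout
validation, test = split_sizes(holdout, 0.5)  # second split: holdout halved
print(train, validation, test)  # 6400 800 800
```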


In [None]:
print(final_splits['train'][0]["text"])

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
You are an empathetic chatbot and your goal is to respond empathetically and ask clarifying questions.

### Input:
Had a really good spaghetti dinner tonight. Just relaxing and getting some work done now. Feeling pretty accomplished today.

### Response:
Big dinners are always a great way to tie everything up at the end of the day. Was it a pretty long day for you?</s>


## Train the model


In [None]:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = final_splits['train'],
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False, # packing = True can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 60,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
    ),
)


In [None]:
#@title Show current memory stats
gpu_stats = torch.cuda.get_device_properties(0)
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
print(f"{start_gpu_memory} GB of memory reserved.")

GPU = Tesla T4. Max memory = 14.741 GB.
7.043 GB of memory reserved.


In [None]:
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 6,400 | Num Epochs = 1 | Total steps = 60
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 41,943,040/7,000,000,000 (0.60% trained)
wandb: Currently logged in as: virigineniaishwarya2 (virigineniaishwarya2-university-at-buffalo) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin


Unsloth: Will smartly offload gradients to save VRAM!


Step,Training Loss
1,3.2199
2,3.0427
3,2.9562
4,2.3263
5,1.8254
6,1.3597
7,1.1873
8,1.018
9,1.1609
10,0.9858
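
The banner reports `Num Epochs = 1`, but `max_steps = 60` stops training well before a full pass over the data. A quick sketch of the arithmetic, using only values from the trainer configuration and the log above:

```python
per_device_batch = 2       # per_device_train_batch_size
grad_accum = 4             # gradient_accumulation_steps
num_gpus = 1
max_steps = 60
num_train_examples = 6400

# One optimizer step consumes this many examples.
effective_batch = per_device_batch * grad_accum * num_gpus
examples_seen = effective_batch * max_steps
epoch_fraction = examples_seen / num_train_examples

print(f"effective batch size: {effective_batch}")                 # 8
print(f"examples seen in {max_steps} steps: {examples_seen}")     # 480
print(f"fraction of one epoch: {epoch_fraction:.3f}")             # 0.075
```

So this run sees 480 of the 6,400 training examples, or 7.5% of one epoch. Covering the whole training split once would take 800 steps at this batch size.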


In [None]:
#@title Show final memory and time stats
used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
used_memory_for_lora = round(used_memory - start_gpu_memory, 3)
used_percentage = round(used_memory         /max_memory*100, 3)
lora_percentage = round(used_memory_for_lora/max_memory*100, 3)
print(f"{trainer_stats.metrics['train_runtime']} seconds used for training.")
print(f"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training.")
print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_memory_for_lora} GB.")
print(f"Peak reserved memory % of max memory = {used_percentage} %.")
print(f"Peak reserved memory for training % of max memory = {lora_percentage} %.")

301.8608 seconds used for training.
5.03 minutes used for training.
Peak reserved memory = 7.359 GB.
Peak reserved memory for training = 0.316 GB.
Peak reserved memory % of max memory = 49.922 %.
Peak reserved memory for training % of max memory = 2.144 %.


## Inference

### Sample 1

In [None]:
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
inputs = tokenizer(
[
    alpaca_prompt.format(
        "You are helping generate personalized AAC responses with the following preferences:\n    Tone: Neutral, Length: Medium, Intent: Give opinion.\n\n    Here is the AAC user\'s personal context:\n    -------------------\n    Persona Summary – Riya • Name: Riya • Age: 27 • Likes: Chai, sketching, journaling, magical realism, Carnatic music, digital expression, quiet mornings • Traits: Reﬂective, empathetic, emotionally aware, culturally ﬂuid, creatively expressive, socially introverted but deeply observant • Growing Up in a Cultural Crossroad I grew up at the intersection of languages, traditions, and expectations. My family is Tamil, but I was raised in a cosmopolitan city where Hindi, English, and a dozen other dialects danced through the streets like music. Our home smelled of sambar and ﬁlter coffee in the morning and echoed with Carnatic music on lazy Sundays. But outside those walls, I was immersed in Western cartoons, pop songs, and internet slang. That blend — cultural, linguistic, emotional — shaped me in ways I’m still discovering. I wear a sari during family functions and jeans to coffee shops.",
        "can u tell me something about yourself?", # input
        "", # output - leave this blank for generation!
    ),

], return_tensors = "pt").to("cuda")

from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer) # For continuous inference
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)

<s> Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
You are helping generate personalized AAC responses with the following preferences:
    Tone: Neutral, Length: Medium, Intent: Give opinion.

    Here is the AAC user's personal context:
    -------------------
    Persona Summary – Riya • Name: Riya • Age: 27 • Likes: Chai, sketching, journaling, magical realism, Carnatic music, digital expression, quiet mornings • Traits: Reﬂective, empathetic, emotionally aware, culturally ﬂuid, creatively expressive, socially introverted but deeply observant • Growing Up in a Cultural Crossroad I grew up at the intersection of languages, traditions, and expectations. My family is Tamil, but I was raised in a cosmopolitan city where Hindi, English, and a dozen other dialects danced through the streets like music. Our home smelled of sambar and ﬁlter coffee in the morning a
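
The streamed output starts by echoing the full prompt, so the generated reply only appears after the `### Response:` marker. A minimal sketch of pulling out just the reply from decoded text is shown below. The helper `extract_response` and the short example string are hypothetical and are not model output from this notebook.

```python
def extract_response(decoded: str, marker: str = "### Response:", eos: str = "</s>") -> str:
    """Return the text after the last response marker, with the EOS token removed."""
    _, found, tail = decoded.rpartition(marker)
    if not found:
        # Marker missing: fall back to the whole string.
        return decoded.strip()
    return tail.replace(eos, "").strip()

# Hypothetical decoded text in the Alpaca layout used by this notebook.
decoded = (
    "### Instruction:\nBe kind.\n\n"
    "### Input:\nHello\n\n"
    "### Response:\nHi there! How are you today?</s>"
)
print(extract_response(decoded))  # Hi there! How are you today?
```

When streaming, `TextStreamer` also accepts `skip_prompt=True`, which suppresses the echoed prompt at the source.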

### Sample 2

In [None]:
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
inputs = tokenizer(
[
    alpaca_prompt.format(
        """You are helping generate personalized AAC responses with the following preferences:\n    Tone: Neutral, Length: Medium, Intent: Give opinion.\n\n    Here is the AAC user\'s personal context:\n    -------------------\n    projects — especially ones that involve design or language — I bring all of that with me. My style. My colors. My rhythm. And maybe most importantly, my belief that communication is not just about information transfer. It’s about meaning. About being known. 7. Growing Up in a Cultural Crossroad I grew up at the intersection of languages, traditions, and expectations. My family is Tamil, but I was raised in a cosmopolitan city where Hindi, English, and a dozen other dialects danced through the streets like music. Our home smelled of sambar and ﬁlter coffee in the morning and echoed with Carnatic music on lazy Sundays. But outside those walls, I was immersed in Western cartoons, pop songs, and internet slang. That blend — cultural, linguistic, emotional — shaped me in ways I’m still discovering. I wear a sari during family functions and jeans to coffee shops. I say "amma" when I need comfort and "dude" when I’m texting my friends. I’ve coded emotional buttons into my AAC that say “dei machan” with just the right tone of playful irritation, and also ones that deliver full Shakespeare quotes when I’m feeling dramatic. Cultural identity, when you don’t speak with your mouth, becomes even more complex. My device didn’t come preloaded with Tamil phrases. I had to build them myself, piece by piece. I had to teach it who I was. And in doing that, I got to see how much of culture lives not in vocabulary but in rhythm — in pauses, in timing, in gesture. My grandparents, especially my thatha, found it hard at ﬁrst. He’d always prided himself on storytelling — long, winding tales ﬁlled with idioms and exaggerations. I could never keep up in that world. 
But one day, I drew him a comic strip based on\n\nPersona Summary – Riya • Name: Riya • Age: 27 • Likes: Chai, sketching, journaling, magical realism, Carnatic music, digital expression, quiet mornings • Traits: Reﬂective, empathetic, emotionally aware, culturally ﬂuid, creatively expressive, socially introverted but deeply observant • AAC User: Yes, since early childhood (speech impairment since birth); uses personalized, AI-augmented AAC with emotional memory and symbolic shortcuts 1. Beginnings: Childhood, Family, and First Voice I was born in the monsoon — the kind of stormy evening where thunder cracks like laughter in the sky and the earth smells like beginnings. My mother says I didn’t cry when I came out. Not a sound. But I looked around with big eyes, wide open, curious. That silence carried through the early years of my life, a silence ﬁlled with movement, gestures, and eyes that said more than any word could. My parents are both teachers — thoughtful, soft-spoken people who believe in listening before speaking. Maybe that’s why they never pushed me to be someone I wasn’t. They knew I had things to say, even if I didn’t say them the “normal” way. My dad would often sit with me for hours, interpreting my drawings and assigning meaning to the colors I used. Blue meant I was calm. Yellow meant joy. Red… well, red was complicated. Red was frustration and energy, sometimes even hope. My sister, Mira, was my ﬁrst best friend. She was the translator of my world — patient, playful, and uncannily in sync with my thoughts. She made up games where I could be the queen, the boss, the pilot — and no one questioned it. When we played “restaurant,” she’d hand me laminated cards with food symbols, letting me place an order with just a glance. We didn’t know it then, but we were building our\n\nin pauses, in timing, in gesture. My grandparents, especially my thatha, found it hard at ﬁrst. 
He’d always prided himself on storytelling — long, winding tales ﬁlled with idioms and exaggerations. I could never keep up in that world. But one day, I drew him a comic strip based on one of his old army stories. Each panel had a line of dialogue I had typed myself. He held that paper in silence for a full minute before saying""",
        "What is your name?, Where are you from? How's your health? Tell me about your past health history in detail", # input
        "", # output - leave this blank for generation!
    ),

], return_tensors = "pt").to("cuda")

from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)

<s> Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
You are helping generate personalized AAC responses with the following preferences:
    Tone: Neutral, Length: Medium, Intent: Give opinion.

    Here is the AAC user's personal context:
    -------------------
    projects — especially ones that involve design or language — I bring all of that with me. My style. My colors. My rhythm. And maybe most importantly, my belief that communication is not just about information transfer. It’s about meaning. About being known. 7. Growing Up in a Cultural Crossroad I grew up at the intersection of languages, traditions, and expectations. My family is Tamil, but I was raised in a cosmopolitan city where Hindi, English, and a dozen other dialects danced through the streets like music. Our home smelled of sambar and ﬁlter coffee in the morning and echoed with Carnatic mus

### Sample 3

In [None]:
def build_prompt_with_customization(query, context, customization):
    # Note: `context` is currently unused; the persona text is hard-coded in the template below.
    tone_instruction_map = {
        "Neutral": "Maintain a calm and neutral tone.",
        "Happy": "Use a cheerful and positive tone.",
        "Sad": "Respond with a gentle and understanding tone.",
        "Assertive": "Use a confident and clear tone.",
        "Empathetic": "Show deep empathy and emotional support in your response."
    }

    length_instruction_map = {
        "Short": "Write a very short and clear response using only one sentence or two sentences, reply that is complete and emotionally appropriate.",
        "Medium": "Write a response that uses two to three sentences and doesn't exceed three sentences.",
        "Long": "Write a detailed, thoughtful response of no more than five sentences."
    }

    intent_instruction_map = {
        "Answer": "",
        "Ask a question": "Respond naturally to the user's message, and then ask a thoughtful follow-up question to continue the conversation. The question should be relevant to what the user just said, or gently expand on the topic. Keep the tone aligned with the user's personality and preferences. Avoid generic or robotic questions.",
        "Change topic": "After responding appropriately to the user’s message, gently guide the conversation toward a new but relevant topic based on the user’s interests or context. Do not say 'let's change the topic.' Instead, naturally introduce something new in a way that flows from the conversation."
    }

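    tone = customization.get("tone", "Neutral")
    length = customization.get("length", "Medium")
    # Default to a key that exists in intent_instruction_map; "Give opinion" has no entry and would raise KeyError.
    intent = customization.get("intent", "Answer")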

    return f"""<s> Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
You are helping generate personalized AAC responses with the following preferences:


{length_instruction_map[length]}
{intent_instruction_map[intent]}
{tone_instruction_map[tone]}

Here is the AAC user\'s personal context:\n    -------------------\n    Persona Summary – Riya • Name: Riya • Age: 27 • Likes: Chai, sketching, journaling, magical realism, Carnatic music, digital expression, quiet mornings • Traits: Reﬂective, empathetic, emotionally aware, culturally ﬂuid, creatively expressive, socially introverted but deeply observant • Growing Up in a Cultural Crossroad I grew up at the intersection of languages, traditions, and expectations. My family is Tamil, but I was raised in a cosmopolitan city where Hindi, English, and a dozen other dialects danced through the streets like music. Our home smelled of sambar and ﬁlter coffee in the morning and echoed with Carnatic music on lazy Sundays. But outside those walls, I was immersed in Western cartoons, pop songs, and internet slang. That blend — cultural, linguistic, emotional — shaped me in ways I’m still discovering. I wear a sari during family functions and jeans to coffee shops.

### Input:
{query}

### Response:"""
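

The three lookup tables are indexed directly, so any tone, length, or intent without an entry raises a bare `KeyError`. One example is the "Give opinion" intent that appears in the prompts of Samples 1 and 2. A minimal sketch of validating the settings up front is given below. The names `ALLOWED`, `DEFAULTS`, and `resolve_customization` are hypothetical, and the choice of "Answer" as the fallback intent is an assumption.

```python
# Allowed values mirror the keys of the three instruction maps above.
ALLOWED = {
    "tone": {"Neutral", "Happy", "Sad", "Assertive", "Empathetic"},
    "length": {"Short", "Medium", "Long"},
    "intent": {"Answer", "Ask a question", "Change topic"},
}
# Assumed defaults; "Answer" maps to an empty intent instruction.
DEFAULTS = {"tone": "Neutral", "length": "Medium", "intent": "Answer"}

def resolve_customization(customization: dict) -> dict:
    """Fill in defaults and reject values that have no instruction entry."""
    resolved = {}
    for key, allowed in ALLOWED.items():
        value = customization.get(key, DEFAULTS[key])
        if value not in allowed:
            raise ValueError(
                f"Unsupported {key} {value!r}; expected one of {sorted(allowed)}"
            )
        resolved[key] = value
    return resolved

print(resolve_customization({"tone": "Happy", "length": "Short"}))
# {'tone': 'Happy', 'length': 'Short', 'intent': 'Answer'}
```

Calling this before building the prompt turns a missing key into an error message that lists the supported options.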


In [None]:
query = "Where are you from, what languages do u know?"
# context = """projects — especially ones that involve design or language — I bring all of that with me. My style. My colors. My rhythm. And maybe most importantly, my belief that communication is not just about information transfer. It’s about meaning. About being known. 7. Growing Up in a Cultural Crossroad I grew up at the intersection of languages, traditions, and expectations. My family is Tamil, but I was raised in a cosmopolitan city where Hindi, English, and a dozen other dialects danced through the streets like music. Our home smelled of sambar and ﬁlter coffee in the morning and echoed with Carnatic music on lazy Sundays. But outside those walls, I was immersed in Western cartoons, pop songs, and internet slang. That blend — cultural, linguistic, emotional — shaped me in ways I’m still discovering. I wear a sari during family functions and jeans to coffee shops. I say "amma" when I need comfort and "dude" when I’m texting my friends. I’ve coded emotional buttons into my AAC that say “dei machan” with just the right tone of playful irritation, and also ones that deliver full Shakespeare quotes when I’m feeling dramatic. Cultural identity, when you don’t speak with your mouth, becomes even more complex. My device didn’t come preloaded with Tamil phrases. I had to build them myself, piece by piece. I had to teach it who I was. And in doing that, I got to see how much of culture lives not in vocabulary but in rhythm — in pauses, in timing, in gesture. My grandparents, especially my thatha, found it hard at ﬁrst. He’d always prided himself on storytelling — long, winding tales ﬁlled with idioms and exaggerations. I could never keep up in that world. But one day, I drew him a comic strip based on\n\n
context = """Persona Summary – Riya • Name: Riya • Age: 27 • Likes: Chai, sketching, journaling, magical realism, Carnatic music, digital expression, quiet mornings • Traits: Reﬂective, empathetic, emotionally aware, culturally ﬂuid, creatively expressive, socially introverted but deeply observant • AAC User: Yes, since early childhood (speech impairment since birth); uses personalized, AI-augmented AAC with emotional memory and symbolic shortcuts 1. Beginnings: Childhood, Family, and First Voice I was born in the monsoon — the kind of stormy evening where thunder cracks like laughter in the sky and the earth smells like beginnings. My mother says I didn’t cry when I came out. Not a sound. But I looked around with big eyes, wide open, curious. That silence carried through the early years of my life, a silence ﬁlled with movement, gestures, and eyes that said more than any word could. My parents are both teachers — thoughtful, soft-spoken people who believe in listening before speaking. Maybe that’s why they never pushed me to be someone I wasn’t. They knew I had things to say, even if I didn’t say them the “normal” way. My dad would often sit with me for hours, interpreting my drawings and assigning meaning to the colors I used. Blue meant I was calm. Yellow meant joy. Red… well, red was complicated. Red was frustration and energy, sometimes even hope. My sister, Mira, was my ﬁrst best friend. She was the translator of my world — patient, playful, and uncannily in sync with my thoughts. She made up games where I could be the queen, the boss, the pilot — and no one questioned it. When we played “restaurant,” she’d hand me laminated cards with food symbols, letting me place an order with just a glance. We didn’t know it then, but we were building our\n\nin pauses, in timing, in gesture. My grandparents, especially my thatha, found it hard at ﬁrst. He’d always prided himself on storytelling — long, winding tales ﬁlled with idioms and exaggerations. 
I could never keep up in that world. But one day, I drew him a comic strip based on one of his old army stories. Each panel had a line of dialogue I had typed myself. He held that paper in silence for a full minute before saying"""
customization = {
    "tone": "Happy",
    "length": "Short",
    "intent": "Ask a question"
}
prompt = build_prompt_with_customization(query, context, customization)
print(prompt)

<s> Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
You are helping generate personalized AAC responses with the following preferences:


Write a very short and clear response using only one sentence or two sentences, reply that is complete and emotionally appropriate.
Respond naturally to the user's message, and then ask a thoughtful follow-up question to continue the conversation. The question should be relevant to what the user just said, or gently expand on the topic. Keep the tone aligned with the user's personality and preferences. Avoid generic or robotic questions.
Use a cheerful and positive tone.

Here is the AAC user's personal context:
    -------------------
    Persona Summary – Riya • Name: Riya • Age: 27 • Likes: Chai, sketching, journaling, magical realism, Carnatic music, digital expression, quiet mornings • Traits: Reﬂective, empathetic, emo

### Saving, loading finetuned models

In [None]:
from huggingface_hub import login
login(token="<hugging_face_token>")

In [None]:
model.save_pretrained("lora_model")      # Local saving (the zip step below expects this directory)
tokenizer.save_pretrained("lora_model")
model.push_to_hub("sourname/lora_model", token = "<hugging_face_token>") # Online saving
tokenizer.push_to_hub("sourname/lora_model", token = "<hugging_face_token>") # Online saving

Saved model to https://huggingface.co/sourname/lora_model

In [None]:
!zip -r lora_model.zip lora_model/

  adding: lora_model/ (stored 0%)
  adding: lora_model/special_tokens_map.json (deflated 79%)
  adding: lora_model/adapter_model.safetensors (deflated 8%)
  adding: lora_model/tokenizer.model (deflated 55%)
  adding: lora_model/tokenizer.json (deflated 85%)
  adding: lora_model/adapter_config.json (deflated 56%)
  adding: lora_model/README.md (deflated 66%)
  adding: lora_model/tokenizer_config.json (deflated 68%)


In [None]:
from google.colab import files
files.download('lora_model.zip')