### Chat Template-Based Classification using Unsloth

In this notebook, I fine-tuned a language model for classification using **chat templates** and the **Unsloth** framework. The objective was to adapt the model so that it predicts a category from a structured, instruction-style chat prompt. The steps include:

- Installing all necessary libraries (`unsloth`, `datasets`, `trl`, etc.).
- Loading a base model and tokenizer using Unsloth.
- Preparing a dataset for classification (IMDb movie reviews).
- Formatting the dataset into chat-style instruction-response templates.
- Configuring LoRA parameters and training arguments.
- Training the model using `SFTTrainer`.
- Evaluating on a held-out subset of the test split and saving the model for future use.



In [None]:
# Install required libraries
!pip install -q unsloth
!pip install -q datasets
!pip install -q peft
!pip install -q trl
!pip install -q wandb
!pip install -q bitsandbytes

### Import Libraries
- Import key modules like `load_dataset`, `FastLanguageModel`, `SFTTrainer`, and `TrainingArguments`.
- These components are used to build and fine-tune the model.


In [None]:
import os
import gc

# Import unsloth before transformers, trl, and peft so its patches apply cleanly
from unsloth import FastLanguageModel

import torch
import pandas as pd
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer
from peft import LoraConfig
from torch.utils.data import DataLoader



🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
Unsloth: Failed to patch Gemma3ForConditionalGeneration.
🦥 Unsloth Zoo will now patch everything to make training faster!


### Load Classification Dataset
- Load a classification dataset using Hugging Face's `load_dataset`.
- This data will be used for training a chat-based classifier.


In [None]:
# Set seed for reproducibility
torch.manual_seed(42)
np.random.seed(42)

In [None]:
# Load the IMDb dataset
imdb_dataset = load_dataset("imdb")
print("Dataset loaded successfully!")
print(f"Train size: {len(imdb_dataset['train'])}")
print(f"Test size: {len(imdb_dataset['test'])}")

Dataset loaded successfully!
Train size: 25000
Test size: 25000


### Display Sample Data
- Print a few examples from the dataset.
- Helps inspect the structure and determine how to format inputs.


In [None]:
# Display sample data
print("\nSample review:")
print(imdb_dataset['train'][0]['text'][:200] + "...")
print(f"Label: {'Positive' if imdb_dataset['train'][0]['label'] == 1 else 'Negative'}")


Sample review:
I rented I AM CURIOUS-YELLOW from my video store because of all the controversy that surrounded it when it was first released in 1967. I also heard that at first it was seized by U.S. customs if it ev...
Label: Negative


### Load and Prepare Model
- Load a pretrained model and tokenizer using Unsloth's `FastLanguageModel`.
- Enable 4-bit quantization and prepare for LoRA fine-tuning.
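
To give a feel for why 4-bit loading matters on a small GPU, here is a rough back-of-the-envelope estimate of weight memory. It is only a sketch: the 1.1B parameter count is taken from the model name, the helper `weight_memory_gb` is a hypothetical name introduced for illustration, and the estimate ignores quantization constants, unquantized embeddings, activations, and optimizer state.

```python
# Rough weight-memory estimate for a ~1.1B-parameter model (illustrative only).
N_PARAMS = 1.1e9  # approximate size implied by the model name


def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Return the memory needed to store n_params weights, in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


fp16_gb = weight_memory_gb(N_PARAMS, 16)
nf4_gb = weight_memory_gb(N_PARAMS, 4)

print(f"fp16 weights:  ~{fp16_gb:.2f} GB")
print(f"4-bit weights: ~{nf4_gb:.2f} GB (before quantization constants)")
```

The real footprint will be somewhat higher than the 4-bit figure, since layers such as the embeddings are typically kept in higher precision.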


In [None]:
# Define the model and tokenizer
model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

In [None]:
# Set up quantization configuration
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

In [None]:
# Initialize the model with Unsloth's FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    quantization_config=quant_config,
    max_seq_length=512,
)

==((====))==  Unsloth 2025.3.19: Fast Llama patching. Transformers: 4.51.3.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 7.5. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.29.post3. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!



### Set LoRA Configuration
- Define LoRA parameters for efficient fine-tuning (e.g., rank, alpha, dropout).
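- Target the attention projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`) and the MLP projections (`gate_proj`, `up_proj`, `down_proj`) for adapter injection.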


In [None]:
# Simplified version with only essential parameters
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
)

Unsloth: Dropout = 0 is supported for fast patching. You are using dropout = 0.05.
Unsloth will patch all other layers, except LoRA matrices, causing a performance hit.
Unsloth 2025.3.19 patched 22 layers with 0 QKV layers, 0 O layers and 0 MLP layers.
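
As a sanity check on the LoRA setup, the adapter size can be worked out by hand. The sketch below assumes the TinyLlama-1.1B dimensions from its published config (hidden size 2048, MLP size 5632, 4 key/value heads of 64 dims, 22 layers, vocabulary 32000); none of these appear in this notebook, so treat them as assumptions. It also assumes that the 4-bit weights are stored two per element, which would explain why the parameter count printed later in the notebook reports about 628M total parameters rather than 1.1B.

```python
# Assumed TinyLlama-1.1B dimensions (from the published model config, not this notebook).
HIDDEN = 2048
INTERMEDIATE = 5632
KV_DIM = 256        # 4 key/value heads x 64 dims per head
NUM_LAYERS = 22
VOCAB = 32000
RANK = 16           # r=16, as in the get_peft_model call

# (in_features, out_features) for each targeted projection
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# Each LoRA adapter adds A (r x in) and B (out x r): r * (in + out) weights.
lora_per_layer = sum(RANK * (i + o) for i, o in PROJECTIONS.values())
lora_total = lora_per_layer * NUM_LAYERS

# Base model pieces.
base_linear = NUM_LAYERS * sum(i * o for i, o in PROJECTIONS.values())
embeddings = 2 * VOCAB * HIDDEN               # input embeddings + output head (assumed unquantized)
norms = NUM_LAYERS * 2 * HIDDEN + HIDDEN      # two RMSNorm weights per layer + final norm

# Assumption: packed 4-bit storage halves the element count of quantized linears.
packed_total = base_linear // 2 + embeddings + norms + lora_total
true_total = base_linear + embeddings + norms + lora_total

print(f"LoRA parameters:            {lora_total:,}")
print(f"Total as counted (packed):  {packed_total:,}")
print(f"Total with unpacked weights: {true_total:,}")
print(f"Trainable share (unpacked): {100 * lora_total / true_total:.2f}%")
```

Under these assumptions the adapter count matches the 12,615,680 trainable parameters the notebook reports, and the trainable share relative to the unpacked model is closer to 1.13% than to 2.01%.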


In [None]:
# Make sure the tokenizer is properly set up
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

### Check Trainable Parameters
- Count the parameters that LoRA leaves trainable and compare them with the full model.
- Confirms that only the adapter weights will be updated during training.


In [None]:
def print_trainable_parameters(model):
    """
    Prints the number of trainable parameters in the model.
    """
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    all_params = sum(p.numel() for p in model.parameters())
    print(f"Trainable parameters: {trainable_params}")
    print(f"All parameters: {all_params}")
    print(f"Percentage of parameters being trained: {100 * trainable_params / all_params:.2f}%")

# Print trainable parameters
print_trainable_parameters(model)

Trainable parameters: 12615680
All parameters: 628221952
Percentage of parameters being trained: 2.01%


### Create Chat Template Formatter
- Define a function to convert dataset entries into chat prompt-response format.
- This simulates an instruction-tuned interface for classification.


In [None]:
# Define classification chat templates
def create_classification_prompt(review):
    return f"""<s>[INST] <<SYS>>
You are a sentiment analysis assistant. Classify the following movie review as either 'positive' or 'negative'.
<</SYS>>

Here's the movie review: {review} [/INST]"""

def format_classification_output(label):
    return " positive" if label == 1 else " negative"

# Function to format dataset examples
def format_dataset(example):
    prompt = create_classification_prompt(example["text"])
    response = format_classification_output(example["label"])

    # For debugging
    if np.random.random() < 0.001:  # Show ~0.1% of examples
        print("\n--- Example ---")
        print("Input:", prompt)
        print("Output:", response)

    return {
        "text": prompt + response + "</s>"  # Complete conversation
    }
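
The prompt above follows the Llama-2 `[INST]` / `<<SYS>>` layout. As far as I know, TinyLlama-1.1B-Chat-v1.0 was tuned with a Zephyr-style layout instead, so a prompt in that shape may suit it better. The sketch below writes that layout out by hand purely for illustration; the exact tag strings are an assumption based on the model card, and `create_zephyr_style_prompt` is a hypothetical helper. In practice, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` should produce whatever format the tokenizer actually ships with.

```python
SYSTEM_PROMPT = (
    "You are a sentiment analysis assistant. Classify the following movie "
    "review as either 'positive' or 'negative'."
)


def create_zephyr_style_prompt(review: str) -> str:
    """Build a Zephyr-style chat prompt (assumed format, see note above)."""
    return (
        f"<|system|>\n{SYSTEM_PROMPT}</s>\n"
        f"<|user|>\nHere's the movie review: {review}</s>\n"
        f"<|assistant|>\n"
    )


example_prompt = create_zephyr_style_prompt("A warm, funny film with a great cast.")
print(example_prompt + "positive</s>")
```

Note that this version leaves out the leading `<s>`: the tokenizer normally adds the beginning-of-sequence token itself, so writing it into the text can result in it appearing twice.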

In [None]:
# Process the dataset
processed_dataset = imdb_dataset.map(
    format_dataset,
    remove_columns=imdb_dataset["train"].column_names,
)

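# Use the IMDb train split for training and a shuffled subset of the test split for evaluation.
# The test split appears to be ordered by label, so shuffle before selecting;
# otherwise the first 1000 rows may all belong to one class.
train_dataset = processed_dataset["train"]
eval_dataset = processed_dataset["test"].shuffle(seed=42).select(range(1000))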


--- Example ---
Input: <s>[INST] <<SYS>>
You are a sentiment analysis assistant. Classify the following movie review as either 'positive' or 'negative'.
<</SYS>>

Here's the movie review: Not really spoilers in my opinion, but I wanted to cover myself, nevertheless. As the executive producer, Morgan Freeman wants the audience to ignore the numerous absurdities of his character in 10 Items Or Less, a movie with an intentional indie-feel, and just be absorbed in the mentor/be-all-that-you-can-be theme. He plays a alternate universe, semi-washed up version of the real Morgan Freeman, who is chauffeured in an old Econovan by a kid all the way into Carson, CA from Brentwood to research his next movie role. Why Carson, is a mystery to So. Cal residents. He could have saved the trip and gone anywhere in the San Fernando Valley and found the same elements. Paz Vega is pretty to watch, a cross between Salma Hayek and Penelope Cruz, playing a disgruntled grocery checker at a large but slow loca


--- Example ---
Input: <s>[INST] <<SYS>>
You are a sentiment analysis assistant. Classify the following movie review as either 'positive' or 'negative'.
<</SYS>>

Here's the movie review: As a Spanish tourist in Los Angeles and a fanatic movie lover I committed a terrible mistake. I went to see "The Women" The remake of one of my all time favorites. I've seen the original many many times, in fact I own it. My rushing to see the remake was based on Diane English, the woman responsible for "Murphy Brown" My though was: how bad can it be? She must know what she's doing. Well, I don't know what to say. I don't understand what happened. The Botoxed women is a rather depressing affair. Meg Ryan or whoever played Mary - she looked a bit like a grotesque version of Meg Ryan...another actress perhaps wearing a Meg Ryan mask - she doesn't bring to the character nothing of what Norma Shearer did in 1939. The new one is a tired, unconvincing prototype of what has become a farce within a farce. Th


--- Example ---
Input: <s>[INST] <<SYS>>
You are a sentiment analysis assistant. Classify the following movie review as either 'positive' or 'negative'.
<</SYS>>

Here's the movie review: In New York, the alcoholic and decadent detective Jack Mosley (Bruce Willis) is assigned to deliver a prisoner to the court sixteen blocks far from his precinct in 118 minutes. Eddie Bunker (Mos Def) made a deal with the D.A. office and will identify and testify against a dirty detective. While driving to the tribunal, Jack is attacked by a group of corrupt cops and protects Eddie.<br /><br />In spite of being a flawed movie, "16 Blocks" is a good entertainment with lots of action and an optimistic, hopeful and commercial message in the end that people can change, with the redemption of Eddie and Jack. Mos Def irritates with his accent, and Bruce Willis is totally different from his usual shape, inclusive with a "tire" on his belly. It is funny to see all the damage caused by the bus in Manhattan and
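
One thing the long examples above make visible: the label sits at the very end of each training string, while the model was loaded with `max_seq_length=512`. If sequences are cut off at that length, a long review could lose its label entirely. A minimal sketch of one way around this is to shorten the review text before building the prompt. The helper `truncate_review` and the 250-word budget are hypothetical choices for illustration; a word count is only a rough stand-in for a token count.

```python
def truncate_review(review: str, max_words: int = 250) -> str:
    """Keep at most max_words whitespace-separated words of the review."""
    words = review.split()
    if len(words) <= max_words:
        return review
    # Mark the cut so the model can tell the review was shortened.
    return " ".join(words[:max_words]) + " ..."


short_review = "Great film."
long_review = "word " * 1000

print(len(truncate_review(short_review).split()))  # short reviews pass through unchanged
print(len(truncate_review(long_review).split()))   # long reviews are capped
```

A stricter variant would count tokens with the tokenizer and reserve room for the system prompt and the label, at the cost of a slower preprocessing pass.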

In [None]:
from transformers import TrainingArguments

# SFTTrainer expects a TrainingArguments-style object rather than a plain dict
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,
    eval_strategy="steps",  # named `evaluation_strategy` in older transformers releases
    eval_steps=200,
    save_strategy="steps",
    save_steps=200,
    save_total_limit=3,
    logging_steps=50,
    learning_rate=2e-4,
    warmup_steps=100,
    lr_scheduler_type="cosine",
    report_to="none",
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    bf16=False,  # the Tesla T4 does not support bfloat16
    fp16=True,
    optim="adamw_torch",
)
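
To see what these settings imply, the sketch below works out the effective batch size, the approximate number of optimizer steps, and the learning rate at a few points of a linear-warmup plus cosine-decay schedule. It assumes a single GPU and the 25,000-example train split; the step count is approximate because the trainer's handling of the last partial batch can shift it by a step or two, and `lr_at` is a hypothetical helper written to mirror the usual cosine-with-warmup formula rather than the trainer's own code.

```python
import math

TRAIN_EXAMPLES = 25_000
PER_DEVICE_BATCH = 8
GRAD_ACCUM = 2
EPOCHS = 3
BASE_LR = 2e-4
WARMUP_STEPS = 100
EVAL_STEPS = 200

# One optimizer step consumes batch_size * accumulation examples (single GPU assumed).
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
steps_per_epoch = TRAIN_EXAMPLES // effective_batch
total_steps = steps_per_epoch * EPOCHS


def lr_at(step: int) -> float:
    """Learning rate under linear warmup followed by cosine decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"Effective batch size: {effective_batch}")
print(f"Approx. optimizer steps: {total_steps} ({steps_per_epoch} per epoch)")
print(f"Approx. evaluation runs: {total_steps // EVAL_STEPS}")
for step in (0, 50, 100, total_steps // 2, total_steps):
    print(f"step {step:>5}: lr = {lr_at(step):.2e}")
```

With evaluation every 200 steps on 1,000 examples, evaluation time adds up over three epochs, which is one reason to keep the evaluation subset small.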

### Initialize Trainer
- Instantiate the `SFTTrainer` with model, dataset, and configuration.
- `SFTTrainer` comes from TRL and runs the training loop; Unsloth patches it for faster training.


In [None]:
# Initialize the SFTTrainer
# Note: Since we already applied PEFT/LoRA to the model, we don't pass peft_config here
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    args=training_args,  # Use the training arguments directly
)


### Save Fine-Tuned Model
- Run training, then save the LoRA adapter and tokenizer to a local directory.
- Useful for reloading later for inference or deployment.


In [None]:
# Run fine-tuning with the trainer configured above
trainer.train()

# Save the LoRA adapter weights and the tokenizer after training
trainer.model.save_pretrained("./results/final_model")
tokenizer.save_pretrained("./results/final_model")

('./results/final_model/tokenizer_config.json',
 './results/final_model/special_tokens_map.json',
 './results/final_model/tokenizer.model',
 './results/final_model/added_tokens.json',
 './results/final_model/tokenizer.json')
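
The notebook stops at saving, so here is a minimal sketch of how predictions could be scored once the model generates completions for held-out prompts. It only covers the scoring step: the completions below are made-up strings standing in for model output, and `extract_label` and `accuracy` are hypothetical helpers, not part of Unsloth or TRL.

```python
from typing import List, Optional


def extract_label(generated: str) -> Optional[int]:
    """Map a generated completion to 1 (positive), 0 (negative), or None."""
    text = generated.lower()
    pos = text.find("positive")
    neg = text.find("negative")
    if pos == -1 and neg == -1:
        return None  # the model produced neither label
    if neg == -1 or (pos != -1 and pos < neg):
        return 1     # "positive" appears first (or alone)
    return 0


def accuracy(predictions: List[Optional[int]], labels: List[int]) -> float:
    """Fraction of correct predictions; unparseable outputs count as wrong."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Made-up completions for illustration only.
completions = [" positive</s>", " negative", "Negative, not positive", "unsure"]
gold_labels = [1, 0, 0, 1]

predictions = [extract_label(c) for c in completions]
print(predictions)
print(f"Accuracy: {accuracy(predictions, gold_labels):.2f}")
```

Counting unparseable outputs as errors keeps the metric honest when the model drifts away from the two expected labels.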