---
title: "Supervised Fine-Tuning Part-1"
date: "2024-12-22"
categories: [supervised fine-tuning, LLM]
format:
  html: default
---

# Supervised Fine-Tuning with the `trl` Library

## Introduction: What is Supervised Fine-Tuning?
> Before looking at supervised fine-tuning (SFT), it helps to understand pre-training. Pre-training means training a model (generally a transformer) on a large corpus of text, which lets it learn general knowledge, grammar, and semantics. However, a model is hard to use directly after pre-training because it lacks task-specific expertise.

  > That is where supervised fine-tuning comes in. SFT adapts a pre-trained model to a specific task by training it on a labeled dataset, where the model learns to predict the correct label for each input. For a language model, the "label" is the expected response text for each prompt.
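
To make "predicting the correct label" a little more concrete, here is a minimal sketch of the next-token cross-entropy loss that this kind of training minimizes. The probabilities and the 4-token vocabulary are made up for illustration, and `cross_entropy` is a hypothetical helper rather than anything from `trl`.

```python
import math

def cross_entropy(probs, target_index):
    """Negative log-likelihood of the correct next token."""
    return -math.log(probs[target_index])

# Hypothetical model probabilities over a 4-token vocabulary at two positions.
step_probs = [
    [0.1, 0.7, 0.1, 0.1],      # the model favours token 1
    [0.25, 0.25, 0.25, 0.25],  # the model is unsure
]
targets = [1, 3]  # the "labels" are the tokens that actually come next

losses = [cross_entropy(p, t) for p, t in zip(step_probs, targets)]
mean_loss = sum(losses) / len(losses)
print(round(mean_loss, 4))
```

The loss is small where the model already assigns high probability to the correct token and large where it does not, which is the signal fine-tuning uses to adjust the weights.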

- In this article, we will load a pre-trained model from Hugging Face and fine-tune it on a dataset about Python programming.



## Install Required Packages

In [18]:
# !pip install trl transformers datasets

## Import Packages

In [19]:
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer, DataCollatorForCompletionOnlyLM
from datasets import load_dataset
import os
import torch
import warnings
warnings.filterwarnings('ignore')

In [None]:
if torch.cuda.is_available():
    device = 'cuda'
else:
    device = 'cpu'

os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'expandable_segments:True'

## Load a Pretrained Model
> We will load the `SmolLM2-135M` model and its tokenizer from Hugging Face. We then generate the model's output for a Python-related prompt (we ask the model for code that finds the square root of a number).

  > As the output below shows, the model does not give a correct answer: it generates irrelevant text and keeps repeating it. Let's see how fine-tuning the model on a task-specific dataset improves its correctness.



In [24]:
model_name = 'HuggingFaceTB/SmolLM2-135M'
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

In [26]:
prompt = 'Give me a Python code for finding square root of a number'  # avoid shadowing the built-in input()
tokenized_input = tokenizer(prompt, return_tensors='pt')
output = model.generate(**tokenized_input, max_length=128)
print(tokenizer.decode(output[0]))

Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


Give me a Python code for finding square root of a number.

## Python Program to Find Square Root of a Number

Let’s see how to find square root of a number in Python.

``````# Python program to find square root of a number

# This program will print the square root of a number

# using the built-in function square root

# This program will print the square root of a number

# using the built-in function square root

# This program will print the square root of a number

# using the built-in function square


## Load Dataset

In [38]:
dataset_path = 'jtatman/python-code-dataset-500k'
ds = load_dataset(dataset_path, split='train[:3%]')
ds = ds.train_test_split(test_size=0.1)
train_dataset = ds['train']
test_dataset = ds['test']

In [39]:
len(train_dataset)

15106

In [40]:
len(test_dataset)

1679
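
The two sizes above are consistent with a 90/10 split. The short check below reproduces them, assuming the test share is rounded up and the remainder goes to the training set; that rounding rule is an assumption on my part, chosen because it matches the reported numbers.

```python
import math

total = 15106 + 1679  # train + test sizes reported above
test_size = 0.1       # same value passed to train_test_split

n_test = math.ceil(total * test_size)  # assumed: test share rounded up
n_train = total - n_test               # the rest is used for training
print(n_train, n_test)
```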

In [41]:
def formatting_function(example):
    output_texts = []
    for i in range(len(example['instruction'])):
        individual_prompt = "###System: {}\n###Instruction: {}\n ###Output: {}".format(example['system'][i],
                                                                             example['instruction'][i],
                                                                             example['output'][i])
        output_texts.append(individual_prompt)
    return output_texts
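
To see what the trainer will actually receive, we can call the formatting function on a tiny hand-written batch. The field values below are invented for illustration; only the column names (`system`, `instruction`, `output`) come from the function above.

```python
def formatting_function(example):
    # Same logic as the cell above: build one prompt string per row of the batch.
    output_texts = []
    for i in range(len(example['instruction'])):
        individual_prompt = "###System: {}\n###Instruction: {}\n ###Output: {}".format(
            example['system'][i], example['instruction'][i], example['output'][i])
        output_texts.append(individual_prompt)
    return output_texts

# Hypothetical one-row batch in the column-wise layout used for batched examples.
toy_batch = {
    'system': ['You are a Python expert.'],
    'instruction': ['Add two numbers.'],
    'output': ['def add(a, b):\n    return a + b'],
}

formatted = formatting_function(toy_batch)
print(formatted[0])
```

Note the space before `###Output:` in the template. The response template defined in the next cell includes that same leading space, so the two have to stay in sync.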

In [31]:
response_template = " ###Output:"
collator = DataCollatorForCompletionOnlyLM(response_template, tokenizer=tokenizer)
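
As far as I understand it, the point of the completion-only collator is that tokens up to and including the response template get an "ignore" label, so only the completion contributes to the loss. The sketch below imitates that idea in plain Python with made-up token ids. It is a simplified illustration, not the library's implementation, and `mask_prompt_tokens` is a hypothetical helper.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default

def mask_prompt_tokens(input_ids, response_ids):
    """Return labels where everything up to and including the response marker is ignored."""
    labels = list(input_ids)
    n = len(response_ids)
    for start in range(len(input_ids) - n + 1):
        if input_ids[start:start + n] == response_ids:
            end = start + n
            labels[:end] = [IGNORE_INDEX] * end
            return labels
    # Marker not found: no token in this example contributes to the loss.
    return [IGNORE_INDEX] * len(input_ids)

# Made-up token ids: 7, 8 stand in for the tokenized " ###Output:" marker.
input_ids = [1, 2, 3, 7, 8, 40, 41, 42]
labels = mask_prompt_tokens(input_ids, [7, 8])
print(labels)
```

Only the last three positions keep their original ids as labels, so the model is trained on the answer and not on reproducing the system prompt or the instruction.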

In [46]:
trainer_config = SFTConfig(
    output_dir='./code_finetuned_mode',  # forward slash avoids the invalid '\c' escape sequence
    max_steps=400,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    eval_strategy="steps",
    eval_steps=50,
    logging_steps=10,
    ignore_data_skip=True
)

In [47]:
torch.cuda.empty_cache()

trainer = SFTTrainer(
    model=model,
    args=trainer_config,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    formatting_func=formatting_function,
    data_collator=collator
)
trainer.train()

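| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 50   | 0.8394        | 0.743556        |
| 100  | 0.794         | 0.731545        |
| 150  | 0.7153        | 0.723117        |
| 200  | 0.6353        | 0.717372        |
| 250  | 0.7201        | 0.713097        |
| 300  | 0.6801        | 0.710268        |
| 350  | 0.7111        | 0.707956        |
| 400  | 0.8319        | 0.707255        |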
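
One way to read the validation losses is to convert them to perplexity, which is `exp(loss)` when the loss is a mean per-token cross-entropy in nats (I am assuming that is the case here). The snippet only uses the first and last values from the log above.

```python
import math

# Validation losses copied from the training log (step -> loss).
val_losses = {50: 0.743556, 400: 0.707255}

# Perplexity = exp(mean cross-entropy); lower is better.
perplexities = {step: math.exp(loss) for step, loss in val_losses.items()}
for step, ppl in perplexities.items():
    print(step, round(ppl, 3))
```

The drop is modest, which fits the steady but small decrease in validation loss over 400 steps.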


###Instruction: Using the given information, write a function to calculate a 10-digit ISBN number. The ISBN number should be generated by combining various elements from the given information in a specific order.

Here are the steps to generate the ISBN number:

1. Take the first three letters of the author's last name (in this case, "Doe") and convert them to uppercase. If the author's last name has less than three letters, pad it with "X" until it reaches three letters. In this case, the result would be "DOE".

2. Take the first three letters of the book title (in this case, "Advanced") and convert them to uppercase. If the title has less than three letters, pad it with "X" until it reaches three letters. In this case, the result would be "ADV".

3. Take the first letter of the publisher's name (in this case, "X") and convert it to uppercase.

4. Take the last two digits of the publication year (in this case, "22").

5. Concatenate all the elements obtained in steps 1 to 4. The resul

TrainOutput(global_step=400, training_loss=0.7337326669692993, metrics={'train_runtime': 1419.3064, 'train_samples_per_second': 0.564, 'train_steps_per_second': 0.282, 'total_flos': 341935195820544.0, 'train_loss': 0.7337326669692993, 'epoch': 0.052959089103667416})
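
The `epoch` value in the output shows how little of the training set the model has seen. The arithmetic below reproduces it, assuming a single device and no gradient accumulation (neither is set explicitly in the config above).

```python
max_steps = 400
per_device_train_batch_size = 2
num_devices = 1                  # assumption: single GPU or CPU
gradient_accumulation_steps = 1  # assumption: not set in the config above
train_examples = 15106           # len(train_dataset)

examples_seen = (max_steps * per_device_train_batch_size
                 * num_devices * gradient_accumulation_steps)
epoch_fraction = examples_seen / train_examples
print(examples_seen, round(epoch_fraction, 4))
```

In other words, 400 steps cover 800 examples, roughly 5% of one pass over the training split.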

In [48]:
finetune_name = 'code_model_finetuned'

In [49]:
trainer.save_model(f'./{finetune_name}')

## Load the Finetuned Model and Generate a Prediction

In [50]:
finetuned_model = AutoModelForCausalLM.from_pretrained(f'./{finetune_name}', local_files_only=True)

In [52]:
prompt = 'Give me a Python code for finding square root of a number'  # avoid shadowing the built-in input()
tokenized_input = tokenizer(prompt, return_tensors='pt')
finetuned_model.generation_config.pad_token_id = tokenizer.pad_token_id
output = finetuned_model.generate(**tokenized_input, max_length=128)
print(tokenizer.decode(output[0]))

Give me a Python code for finding square root of a number.

Here's the code:

```python
def square_root(num):
    return num ** 0.5

print(square_root(10))  # Output: 5
print(square_root(20))  # Output: 10
print(square_root(30))  # Output: 40
```

In this code, we define a function `square_root` that takes a number as input and returns its square root. We then call this function with `1
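
The generated function is valid Python and computes the square root correctly, but the `# Output:` comments the model wrote are wrong (the square root of 10 is about 3.16, not 5). A quick check like the one below, comparing against `math.sqrt`, is a simple way to verify generated code instead of trusting its comments.

```python
import math

def square_root(num):
    # Function body exactly as generated by the fine-tuned model.
    return num ** 0.5

for n in (10, 20, 30):
    # Compare against the standard library implementation.
    assert math.isclose(square_root(n), math.sqrt(n))
    print(n, round(square_root(n), 3))
```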
