# Supervised Fine-Tuning (SFT) Template





Supervised fine-tuning (SFT) adapts a pre-trained Large Language Model (LLM) to a specific downstream task using labeled data. During SFT the model's parameters are updated on task-specific examples, so it learns the patterns and nuances of the target data distribution.

`I prepared this Supervised Fine-Tuning (SFT) template for my own use case; adapt it to suit your requirements.`



My accounts:

* [Hugging Face](https://huggingface.co/santhoshmlops)

* [GitHub](https://github.com/santhoshmlops)

Other fine-tuning templates:

* [Fine-Tuning Templates](https://github.com/santhoshmlops/MyHF_LLM_FineTuning/tree/main/FineTuningTemplate)


My model fine-tuning notebooks:

* [My HF LLM Fine-Tuning](https://github.com/santhoshmlops/MyHF_LLM_FineTuning)



## Setting Up on Google Colab
Google Colab provides a convenient, cloud-based environment with access to powerful GPUs like the `T4`. If you choose Colab for this tutorial, make sure to select a GPU runtime by going to `Runtime > Change runtime type > T4 GPU`. This ensures that your notebook has access to the necessary computational resources.
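
To confirm that the runtime actually has a GPU attached, a quick check along these lines can help. This is a minimal sketch that assumes the `nvidia-smi` tool is on the `PATH` (as it normally is on GPU runtimes); the helper name `gpu_summary` is made up for illustration.

```python
import shutil
import subprocess


def gpu_summary():
    """Return one line per visible NVIDIA GPU, or None if none can be queried."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA tools on PATH, so this is probably a CPU runtime
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


summary = gpu_summary()
print(summary if summary else "No GPU detected; check Runtime > Change runtime type.")
```

If the message about no GPU appears, switch the runtime type before running the later steps.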

## Setting Up Hugging Face Authentication

On Google Colab, you can store your Hugging Face token safely with Colab's "Secrets" feature. Click the key icon in the sidebar to open "`Secrets`", then add a new secret named `HF_TOKEN` with your Hugging Face token as its value. This keeps the token out of your notebook's code.
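
Reading the stored secret back in code could look like the sketch below. It assumes the secret is named `HF_TOKEN` as described above and falls back to an environment variable of the same name when the notebook is not running on Colab; the helper name `get_hf_token` is hypothetical.

```python
import os


def get_hf_token(secret_name="HF_TOKEN"):
    """Return the token from Colab Secrets, falling back to an environment variable."""
    try:
        from google.colab import userdata  # only importable inside Colab
    except ImportError:
        return os.environ.get(secret_name)
    try:
        return userdata.get(secret_name)
    except Exception:
        # Secret missing or notebook access not granted: fall back to the environment.
        return os.environ.get(secret_name)


token = get_hf_token()
print("Token found" if token else "No token found")
```

The token itself is never printed, only whether one was found.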

# Step 1 - Install the required Python packages

In [None]:
!pip install -q -U transformers
!pip install -q -U peft
!pip install -q -U bitsandbytes
!pip install -q -U trl
!pip install -q -U accelerate
!pip install -q -U datasets
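
Because `-U` installs the latest release of each package, the versions you get may differ from the ones this template was written against. A small check like the following records what was actually installed; it is a sketch using only the standard library, and the helper name `installed_versions` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

PACKAGES = ["transformers", "peft", "bitsandbytes", "trl", "accelerate", "datasets"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name:15s} {ver or 'NOT INSTALLED'}")
```

Keeping this output with the notebook makes it easier to reproduce a run later.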

# Step 2 - Logging into Hugging Face Hub
When prompted, paste a Hugging Face Hub access token that has write permission.

In [None]:
from huggingface_hub import notebook_login
notebook_login()

# Step 3 - Loading Required Libraries

In [None]:
import os
import torch
from datasets import load_dataset, Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, PeftModel, AutoPeftModelForCausalLM, prepare_model_for_kbit_training, get_peft_model
from trl import SFTTrainer
from accelerate import Accelerator

# Step 4 - Setting Model Parameters for SFT
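`Note:` You can change these parameters for your fine-tuning run or leave them at the values shown. The parameters that are empty (`model_ckpt`, `new_model_ckpt`, `hub_model_ckpt`, `target_modules`, `output_dir`) must be filled in.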

In [None]:
sft_config = {
            # Load Model for Tuning
            "model_ckpt": "",  # Base model to fine-tune, e.g. "microsoft/phi-1_5"
            "new_model_ckpt": "", # Name for the merged model, e.g. "microsoft_phi-1_5_merged-SFT"
            "hub_model_ckpt": "", # Hub repo holding the trained adapter, e.g. "santhoshmlops/microsoft_phi-1_5_merged-SFT"
            # QLoRA Parameters
            "use_lora": True,
            "r": 16,
            "lora_alpha": 32,
            "lora_dropout": 0.05,
            "bias": "none",
            "task_type": "CAUSAL_LM",
            "target_modules": [""],    # Set the target modules for the model being tuned, e.g. ["q_proj","k_proj"]
            # BitsandBytes Parameters
            "load_in_4bit": True,
            "bnb_4bit_quant_type" : "nf4",
            "bnb_4bit_compute_dtype": torch.float16,
            "bnb_4bit_use_double_quant": True,
            # Automodel Parameters
            "device_map": {"": Accelerator().local_process_index},
            "torch_dtype": torch.float16,
            # Tokenizer Parameters
            "trust_remote_code": True,
            # Training Parameters
            "output_dir": "",   # Directory for checkpoints and the saved model, e.g. "microsoft_phi-1_5_merged-SFT"
            "num_train_epochs": 1,
            "per_device_train_batch_size": 1,
            "gradient_accumulation_steps": 1,
            "gradient_checkpointing" : True,
            "max_grad_norm" : 0.3,
            "learning_rate": 2e-4,
            "weight_decay" : 0.003,
            "optim": "paged_adamw_32bit",
            "lr_scheduler_type": "cosine",
            "max_steps": 10,  # A positive value overrides num_train_epochs
            "warmup_ratio" : 0.03,
            "group_by_length" : True,
            "save_steps" : 2,  # Only used when save_strategy is "steps"
            "save_strategy": "epoch",
            "logging_steps": 2,
            "logging_dir": "./logs",
            "fp16": False,
            "bf16" : False,
            "push_to_hub": True,
            "neftune_noise_alpha": 5,
            "report_to":"tensorboard",
            # SFT Training Parameters
            "train_cln_name": "chat_sample",
            "packing": False,
            "max_seq_length": 512,
            # Merge and push the model to Hub
            # ("torch_dtype" from the Automodel parameters is reused when reloading the base model)
            "low_cpu_mem_usage" : True,
            "return_dict" : True
        }
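
Since several values in the configuration are intentionally left empty, a small check before training can catch a forgotten field early. The sketch below is not part of the original template: the helper names and the list of required fields are assumptions based on the keys marked as empty above, and the dictionary only mimics a few of the template's entries.

```python
REQUIRED_FIELDS = ["model_ckpt", "new_model_ckpt", "hub_model_ckpt", "output_dir", "target_modules"]


def missing_fields(config):
    """Return the required fields that are empty strings or lists of empty strings."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = config.get(name)
        if isinstance(value, (list, tuple)):
            is_empty = not any(str(item).strip() for item in value)
        else:
            is_empty = not str(value or "").strip()
        if is_empty:
            missing.append(name)
    return missing


def effective_batch_size(config):
    """Samples contributing to one optimizer step on a single device."""
    return config["per_device_train_batch_size"] * config["gradient_accumulation_steps"]


# A cut-down stand-in for sft_config with the template's default values.
example_config = {
    "model_ckpt": "",
    "new_model_ckpt": "",
    "hub_model_ckpt": "",
    "output_dir": "",
    "target_modules": [""],
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 1,
}

print("Still to fill in:", missing_fields(example_config))
print("Effective batch size per device:", effective_batch_size(example_config))
```

Running `missing_fields(sft_config)` on the real dictionary should return an empty list before you move on to training.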

# Step 5 - Loading and Formatting the Dataset
`Note:` Prepare your dataset for fine-tuning by loading it and formatting it for your use case. The `create_data()` function below is one example of how to format a dataset.

In [None]:
dataset_name = "gathnex/Gath_baize"
def create_data():
  data = load_dataset(dataset_name, split="train")
  data_df = data.to_pandas()
  # Must match the text in the dataset exactly, so its spelling is left as-is.
  original_system_message = "The conversation between Human and AI assisatance named Gathnex"
  system_message = "[INST]The conversation between Human and AI assistant named Microsoft_Phi AI Assistant.\n[/INST]"
  data_df["chat_sample"] = data_df["chat_sample"].apply(lambda x: x.replace(original_system_message, "").strip())
  data_df["chat_sample"]= system_message + data_df["chat_sample"]
  data = Dataset.from_pandas(data_df)
  return data

data = create_data()
print(data[0])
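
The per-row transformation inside `create_data()` can be tried on a single string without downloading anything. The sketch below reproduces the same replace, strip, and prefix steps; the sample text and both system messages are invented placeholders, not rows from the real dataset.

```python
def format_sample(text, original_system_message, system_message):
    """Remove the original system message, trim whitespace, and prepend the new one."""
    return system_message + text.replace(original_system_message, "").strip()


# Hypothetical placeholders for illustration only.
old_message = "System line for the old assistant"
new_message = "[INST]System line for the new assistant.\n[/INST]"
sample = old_message + "\nHuman: Hi\nAI: Hello!"

print(format_sample(sample, old_message, new_message))
```

Testing the function on one string first makes it easy to spot a system message that fails to match before the whole dataset is processed.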

# Step 6 - Supervised Fine-Tuning with QLoRA

In [None]:
# Class that wraps QLoRA-based supervised fine-tuning (SFT)
class TrainSFT:

    def __init__(self, data, config):
        """
        Initialize the TrainSFT class with data and configuration parameters.
        """
        self.data = data
        self.config = config

    def prepare_lora_model(self):
        """
        Prepare the model with LoRA (Low-Rank Adaptation) configuration.
        """
        self.lora_config = LoraConfig(
            r=self.config["r"],
            lora_alpha=self.config["lora_alpha"],
            lora_dropout=self.config["lora_dropout"],
            bias=self.config["bias"],
            task_type=self.config["task_type"],
            target_modules=self.config["target_modules"]
        )
        self.model = get_peft_model(self.model, self.lora_config)

    def load_model_tokenizer(self):
        """
        Load the model and tokenizer with specified configurations.
        """
        self.bnb_config = BitsAndBytesConfig(
            load_in_4bit=self.config["load_in_4bit"],
            bnb_4bit_quant_type=self.config["bnb_4bit_quant_type"],
            bnb_4bit_compute_dtype=self.config["bnb_4bit_compute_dtype"],
            bnb_4bit_use_double_quant=self.config["bnb_4bit_use_double_quant"],
        )

        self.model = AutoModelForCausalLM.from_pretrained(
            self.config["model_ckpt"],
            quantization_config=self.bnb_config,
            device_map=self.config["device_map"],
            trust_remote_code=self.config["trust_remote_code"],
            torch_dtype=self.config["torch_dtype"]
        )
        self.model.config.use_cache = False
        self.model.config.pretraining_tp = 1
        self.model.gradient_checkpointing_enable()
        self.model = prepare_model_for_kbit_training(self.model)

        if self.config["use_lora"]:
            self.prepare_lora_model()

        self.tokenizer = AutoTokenizer.from_pretrained(
            self.config["model_ckpt"],
            trust_remote_code=self.config["trust_remote_code"],
            )
        self.tokenizer.pad_token = self.tokenizer.eos_token
        self.tokenizer.padding_side = "right"
        torch.cuda.empty_cache()

    def set_training_args(self):
        """
        Set up training arguments.
        """
        return TrainingArguments(
            output_dir=self.config["output_dir"],
            num_train_epochs=self.config["num_train_epochs"],
            per_device_train_batch_size=self.config["per_device_train_batch_size"],
            gradient_accumulation_steps=self.config["gradient_accumulation_steps"],
            gradient_checkpointing=self.config["gradient_checkpointing"],
            max_grad_norm=self.config["max_grad_norm"],
            learning_rate=self.config["learning_rate"],
            weight_decay=self.config["weight_decay"],
            optim=self.config["optim"],
            lr_scheduler_type=self.config["lr_scheduler_type"],
            max_steps=self.config["max_steps"],
            warmup_ratio=self.config["warmup_ratio"],
            group_by_length=self.config["group_by_length"],
            save_steps=self.config["save_steps"],
            save_strategy=self.config["save_strategy"],
            logging_steps=self.config["logging_steps"],
            logging_dir=self.config["logging_dir"],
            fp16=self.config["fp16"],
            bf16=self.config["bf16"],
            push_to_hub=self.config["push_to_hub"],
            neftune_noise_alpha=self.config["neftune_noise_alpha"],
            report_to=self.config["report_to"]
        )

    def create_trainer(self):
        """
        Create a trainer for training the model.
        """
        self.load_model_tokenizer()
        if self.config["use_lora"]:
            # print_trainable_parameters() prints its own summary and returns None
            self.model.print_trainable_parameters()
            self.trainer = SFTTrainer(
                model=self.model,
                train_dataset=self.data,
                peft_config=self.lora_config,
                dataset_text_field=self.config["train_cln_name"],
                args=self.set_training_args(),
                tokenizer=self.tokenizer,
                packing=self.config["packing"],
                max_seq_length=self.config["max_seq_length"]
            )
        else:
            self.trainer = SFTTrainer(
                model=self.model,
                train_dataset=self.data,
                dataset_text_field=self.config["train_cln_name"],
                args=self.set_training_args(),
                tokenizer=self.tokenizer,
                packing=self.config["packing"],
                max_seq_length=self.config["max_seq_length"]
            )

    def train_and_save_model(self):
        """
        Train the model and save it.
        """
        self.create_trainer()
        self.trainer.train()
        self.trainer.save_model(self.config["output_dir"])
        self.tokenizer.save_pretrained(self.config["output_dir"])
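
To get a feel for what `r` and `lora_alpha` mean in the `LoraConfig` above, the arithmetic can be worked through by hand. For a linear layer with weight shape `d_out x d_in`, LoRA adds two low-rank matrices of shapes `r x d_in` and `d_out x r`, and scales their product by `lora_alpha / r`. The sketch below applies this to a square layer; the hidden size of 2048 is an assumed example value, not taken from any particular model.

```python
def lora_param_count(d_in, d_out, r):
    """Trainable parameters LoRA adds to one linear layer (matrices A and B)."""
    return r * (d_in + d_out)


def lora_scaling(lora_alpha, r):
    """Factor applied to the low-rank update before it is added to the frozen weight."""
    return lora_alpha / r


hidden = 2048  # assumed example size
added = lora_param_count(hidden, hidden, r=16)
full = hidden * hidden

print(f"LoRA parameters per layer: {added}")
print(f"Share of the full weight matrix: {added / full:.2%}")
print(f"Scaling factor: {lora_scaling(32, 16)}")
```

The total number of trainable parameters then depends on how many layers match `target_modules`.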

# Step 7 - Let's start the training process

In [None]:
train_sft = TrainSFT(data, sft_config)
train_sft.train_and_save_model()
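
With the Step 4 settings (`max_steps` of 10, `warmup_ratio` of 0.03, `learning_rate` of 2e-4, and a cosine scheduler), the learning rate rises during a short warmup and then decays towards zero. The sketch below approximates that shape in plain Python under the assumption of linear warmup followed by a half-cosine decay, with the warmup step count rounded up; the library's own scheduler remains the authoritative implementation.

```python
import math


def cosine_lr(step, base_lr, total_steps, warmup_steps):
    """Approximate learning rate at a given step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


total_steps = 10
warmup_steps = math.ceil(total_steps * 0.03)  # rounds up to a single warmup step

for step in range(total_steps + 1):
    print(f"step {step:2d}  lr {cosine_lr(step, 2e-4, total_steps, warmup_steps):.2e}")
```

With so few steps the run is only a smoke test; raise `max_steps` or rely on `num_train_epochs` for a real training run.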

# Step 8 - Merge the LoRA weights into the base model

In [None]:
def merge_push_to_hub(config):

    # Initialize tokenizer
    tokenizer = AutoTokenizer.from_pretrained(config["model_ckpt"])
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.padding_side = "right"

    # Load base model
    base_model = AutoModelForCausalLM.from_pretrained(
        config["model_ckpt"],
        low_cpu_mem_usage=config["low_cpu_mem_usage"],
        return_dict=config["return_dict"],
        torch_dtype=config["torch_dtype"],
        device_map=config["device_map"]
    )

    # Load the trained LoRA adapter on top of the base model, then fold it into the base weights
    merged_model = PeftModel.from_pretrained(base_model, config["hub_model_ckpt"])
    merged_model = merged_model.merge_and_unload()

    # Push the model and tokenizer to the Hugging Face Model Hub
    merged_model.push_to_hub(config["new_model_ckpt"], use_temp_dir=False)
    tokenizer.push_to_hub(config["new_model_ckpt"], use_temp_dir=False)

# Uses the sft_config dictionary defined in Step 4
merge_push_to_hub(sft_config)
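
Once the merged model is on the Hub, it can be loaded like any other causal language model. The sketch below uses the `transformers` text-generation pipeline; the function name `generate_text` and the repository id in the commented-out call are placeholders, and the import is done lazily so that defining the function does not load anything.

```python
def generate_text(repo_id, prompt, max_new_tokens=64):
    """Generate a continuation of `prompt` with the model stored at `repo_id`."""
    from transformers import pipeline  # imported lazily; requires transformers to be installed

    generator = pipeline("text-generation", model=repo_id)
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    return outputs[0]["generated_text"]


# Example call (downloads the model, so it is left commented out).
# Replace the placeholder with the repository you pushed in this step.
# print(generate_text("your-username/your-merged-model", "Hello, who are you?"))
```

For best results, format the prompt the same way as the `chat_sample` text used during training.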