# Fine-Tuning with Unsloth - SFT Template





Unsloth is an open-source library for faster fine-tuning of popular large language models (LLMs). It supports Llama-2 and Mistral, as well as derivatives such as Yi and OpenHermes. It speeds up training with custom Triton kernels and a manual backpropagation engine.

Here, we use Unsloth to fine-tune a 4-bit quantized base model, such as TinyLlama, on an instruction dataset (Step 4 sets `databricks/databricks-dolly-15k` as the default). The model is quantized with bitsandbytes, and the kernels are written in OpenAI's Triton.

`I prepared this Fine-Tuning with Unsloth - SFT Template for my use case, but you could change it to suit your requirements.`



My accounts:

* [Hugging Face](https://huggingface.co/santhoshmlops)

* [GitHub](https://github.com/santhoshmlops)

Other fine-tuning templates:

* [Fine-Tuning Template](https://github.com/santhoshmlops/MyHF_LLM_FineTuning/tree/main/FineTuningTemplate)


My model fine-tuning notebooks:

* [My HF LLM Fine-Tuning](https://github.com/santhoshmlops/MyHF_LLM_FineTuning)



## Setting Up on Google Colab
Google Colab provides a convenient, cloud-based environment with access to powerful GPUs like the `T4`. If you choose Colab for this tutorial, make sure to select a GPU runtime by going to `Runtime > Change runtime type > T4 GPU`. This ensures that your notebook has access to the necessary computational resources.
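
After switching the runtime, it can be worth confirming that a GPU is actually attached before installing anything. The sketch below is one way to do that with only the standard library. `list_gpus` is a hypothetical helper name, and the code assumes the `nvidia-smi` tool is on the `PATH`, as it normally is on Colab GPU runtimes.

```python
import shutil
import subprocess


def list_gpus():
    """Return the names of visible NVIDIA GPUs, or [] if none are found."""
    if shutil.which("nvidia-smi") is None:
        return []  # no NVIDIA driver tools available (e.g. a CPU runtime)
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=False,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return []
    if result.returncode != 0:
        return []
    # One GPU name per output line.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_gpus()
print(gpus if gpus else "No GPU detected: check Runtime > Change runtime type")
```

An empty list means the notebook is running on a CPU runtime, and the later steps will not work until the runtime type is changed.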

## Setting Up Hugging Face Authentication

On Google Colab, you can safely store your Hugging Face token by using Colab's "Secrets" feature. This can be done by clicking on the "Key" icon in the sidebar, selecting "`Secrets`", and adding a new secret with the name `HF_TOKEN` and your Hugging Face token as the value. This method ensures that your token remains secure and is not exposed in your notebook's code.
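
A stored secret can then be read from code instead of being pasted into a cell. The sketch below assumes the secret is named `HF_TOKEN` as described above. `get_hf_token` is a hypothetical helper, and the fallback to an environment variable is an addition for running outside Colab.

```python
import os


def get_hf_token(name="HF_TOKEN"):
    """Look up the token in Colab Secrets first, then in the environment."""
    try:
        from google.colab import userdata  # only importable inside Colab

        token = userdata.get(name)
        if token:
            return token
    except Exception:
        # Not on Colab, the secret is missing, or notebook access was not granted.
        pass
    return os.environ.get(name)


token = get_hf_token()
print("Token found" if token else "No token found")
```

The returned value can be passed to `huggingface_hub.login(token=...)` as an alternative to the interactive prompt.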

# Step 1 - Install the required Python packages

In [None]:
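# The "colab-ampere" extra is intended for Ampere GPUs (e.g. A100). The T4 is a Turing GPU,
# so check the Unsloth README for the extra that matches your runtime.
!pip install "unsloth[colab-ampere] @ git+https://github.com/unslothai/unsloth.git"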

# Step 2 - Logging into Hugging Face Hub
When prompted, paste a Hugging Face access token with write permission.

In [None]:
from huggingface_hub import notebook_login
notebook_login()

# Step 3 - Loading Required Libraries

In [None]:
import os
import torch
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Step 4 - Setting Model Parameters for SFT
`Note:` All parameters below can be changed for your fine-tuning run or left at their defaults. The empty ones (`model_ckpt` and `hf_user_name`) must be filled in before running the cell.
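
To illustrate how the two empty values feed into the derived names, the sketch below fills them with placeholders. `unsloth/tinyllama-bnb-4bit` and `your-username` are assumptions chosen for the example, not values from this template, and the `example_` variables are named that way to keep them apart from the real parameters.

```python
# Placeholder values for illustration only.
example_model_ckpt = "unsloth/tinyllama-bnb-4bit"  # assumed example checkpoint
example_user = "your-username"                     # placeholder account name

# Same string logic as the parameter cell: keep the part after the last "/".
example_output_dir = example_model_ckpt.split("/")[-1] + "-Unsloth-SFT"
example_hub_ckpt = example_user + "/" + example_output_dir

print(example_output_dir)  # tinyllama-bnb-4bit-Unsloth-SFT
print(example_hub_ckpt)    # your-username/tinyllama-bnb-4bit-Unsloth-SFT

# Leaving both values empty yields an unusable repository name.
broken = "" + "/" + "".split("/")[-1] + "-Unsloth-SFT"
print(broken)              # /-Unsloth-SFT
```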

In [None]:
# Load Model for Tuning
model_ckpt = ""                                                             # Required: Hugging Face ID of the model to fine-tune.
hf_user_name = ""                                                           # Required: your Hugging Face user name.
hub_model_ckpt = hf_user_name+"/"+ model_ckpt.split("/")[-1]+"-Unsloth-SFT" # Hub repository for the fine-tuned model; change as needed.
dataset_name = "databricks/databricks-dolly-15k"

# Automodel/Tokenizer Parameters
max_seq_length = 1024         # Any length works; Unsloth handles RoPE scaling internally
dtype = None                  # None for auto-detection; float16 for Tesla T4/V100, bfloat16 for Ampere+
load_in_4bit = True           # Load the model with 4-bit quantization

# LoRA Parameters
r = 8                         # Any rank > 0; common choices are 8, 16, 32, 64, 128
target_modules = ["q_proj", "k_proj", "v_proj"]
lora_alpha = 16
lora_dropout = 0.05           # Any value is supported, but 0 is the optimized path
bias = "none"                 # Any value is supported, but "none" is the optimized path
use_gradient_checkpointing = True
use_rslora = False            # Set True for rank-stabilized LoRA
loftq_config = None           # Optional LoftQ configuration

# Training Parameters
output_dir = model_ckpt.split("/")[-1]+"-Unsloth-SFT"   # Local checkpoint directory; change as needed.
num_train_epochs = 1                        # Ignored while max_steps > 0
per_device_train_batch_size = 3
gradient_accumulation_steps = 1
gradient_checkpointing = True
max_grad_norm = 0.3
learning_rate = 2e-4
weight_decay = 0.003
optim = "paged_adamw_32bit"
lr_scheduler_type = "cosine"
max_steps = 250                             # Overrides num_train_epochs when > 0
warmup_ratio = 0.03
group_by_length = True
save_steps = 50
save_strategy = "steps"                     # Consistent with save_steps
logging_steps = 50
logging_dir = "./logs"
fp16 = not torch.cuda.is_bf16_supported()   # float16 mixed precision on T4/V100
bf16 = torch.cuda.is_bf16_supported()       # bfloat16 mixed precision on Ampere+
packing = True
neftune_noise_alpha = 5
device_map = "auto"
report_to = "tensorboard"

# SFT Training Parameters (packing and max_seq_length are already set above)
dataset_text_field = "text"
dataset_num_proc = 2

# Merge and push the model to Hub
low_cpu_mem_usage = True
return_dict = True
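
Several of these values interact, so it can help to work out what they imply before training. The sketch below repeats the relevant defaults and derives a few quantities. It assumes a single GPU and assumes that warmup steps are the step count times `warmup_ratio`, rounded up. Because `max_steps` is positive, it takes precedence over `num_train_epochs`.

```python
import math

# Defaults copied from the parameter cell above.
per_device_train_batch_size = 3
gradient_accumulation_steps = 1
max_steps = 250
warmup_ratio = 0.03
r = 8
lora_alpha = 16

# Sequences consumed per optimizer step (single GPU assumed).
sequences_per_step = per_device_train_batch_size * gradient_accumulation_steps
total_sequences = sequences_per_step * max_steps

# Warmup length, assuming the ratio is applied to max_steps and rounded up.
warmup_steps = math.ceil(max_steps * warmup_ratio)

# LoRA update scaling: alpha / r by default, alpha / sqrt(r) with rank-stabilized LoRA.
lora_scale = lora_alpha / r
rslora_scale = lora_alpha / math.sqrt(r)

print(sequences_per_step, total_sequences, warmup_steps)  # 3 750 8
print(lora_scale, round(rslora_scale, 3))                 # 2.0 5.657
```

With `packing = True`, each of those 750 sequences is a packed block of up to `max_seq_length` tokens rather than a single dataset row.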

# Step 5 - Load the Model and Tokenizer

In [None]:
# Load the model and tokenizer with specified configurations.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_ckpt,
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit
)

# Wrap the model with LoRA (Low-Rank Adaptation) adapters using the parameters from Step 4.
model = FastLanguageModel.get_peft_model(
    model,
    r = r,
    target_modules = target_modules,
    lora_alpha = lora_alpha,
    lora_dropout = lora_dropout,
    bias = bias,
    use_gradient_checkpointing = use_gradient_checkpointing,
    use_rslora = use_rslora,
    loftq_config = loftq_config,
)
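
To get a feel for how small the trainable part is, the number of LoRA parameters can be estimated by hand: each adapted linear layer of shape `d_in x d_out` gains `r * (d_in + d_out)` weights. The sketch below applies this to the three target modules. The layer shapes and the layer count are assumptions for a TinyLlama-style 1.1B configuration (hidden size 2048, 256-dimensional key/value projections, 22 layers); check your model's config for the real values. `lora_param_count` is a hypothetical helper.

```python
def lora_param_count(r, layer_shapes, num_layers):
    """Count LoRA weights: A is (r x d_in) and B is (d_out x r) per adapted layer."""
    per_block = sum(r * (d_in + d_out) for d_in, d_out in layer_shapes.values())
    return per_block * num_layers


# Assumed (d_in, d_out) shapes for a TinyLlama-style model with grouped-query attention.
assumed_shapes = {
    "q_proj": (2048, 2048),
    "k_proj": (2048, 256),
    "v_proj": (2048, 256),
}

trainable = lora_param_count(r=8, layer_shapes=assumed_shapes, num_layers=22)
print(f"{trainable:,}")  # 1,531,904
```

Under these assumptions, about 1.5 million parameters are trained, a small fraction of a model with roughly 1.1 billion weights.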

# Step 6 - Loading and Formatting the Dataset
`Note:` Prepare your dataset for fine-tuning by formatting it for your use case. The `formatting_prompts_func` function below is an example that turns each row of the dataset into a single prompt string.

In [None]:
prompt = """Based on given instruction and context, generate an appropriate response

### Instruction:
{}

### Context:
{}

### Response:
{}
"""

EOS_TOKEN = tokenizer.eos_token  # Appended so the model learns where a response ends

def formatting_prompts_func(examples):
    # Build one prompt string per row from the batched columns.
    instructions = examples["instruction"]
    contexts = examples["context"]
    responses = examples["response"]
    texts = []

    for instruction, context, response in zip(instructions, contexts, responses):
        text = prompt.format(instruction, context, response) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}


dataset = load_dataset(dataset_name, split = "train")
train_dataset = dataset.map(formatting_prompts_func, batched = True)
print(train_dataset[0])
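
The formatting logic can be tried on a small in-memory batch without downloading the dataset or loading a tokenizer. The sketch below reuses the same template. The sample rows and the `</s>` end-of-sequence string are assumptions for illustration; the real EOS token comes from the tokenizer.

```python
PROMPT = """Based on given instruction and context, generate an appropriate response

### Instruction:
{}

### Context:
{}

### Response:
{}
"""

EOS = "</s>"  # assumed EOS string; use tokenizer.eos_token in the notebook


def format_batch(examples):
    """Turn batched columns into one prompt string per row."""
    rows = zip(examples["instruction"], examples["context"], examples["response"])
    return {"text": [PROMPT.format(i, c, r) + EOS for i, c, r in rows]}


# Made-up rows in the same column layout as the dataset used above.
batch = {
    "instruction": ["Name a primary colour.", "Summarise the context."],
    "context": ["", "Unsloth speeds up LoRA fine-tuning."],
    "response": ["Red.", "Unsloth makes LoRA fine-tuning faster."],
}

formatted = format_batch(batch)
print(formatted["text"][0])
```

The first row has no context, so its `### Context:` section is simply left empty.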

# Step 7 - Supervised Fine-Tuning (SFT)

In [None]:
# Training arguments
training_arguments = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
    neftune_noise_alpha = neftune_noise_alpha
)

# Create a trainer for training the model.
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = train_dataset,
    dataset_text_field = dataset_text_field,
    max_seq_length = max_seq_length,
    packing = packing,
    args = training_arguments,
)
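
The `packing` option is easier to reason about with a toy example. The sketch below is a simplified illustration of the general idea (concatenate tokenized examples, then cut the stream into fixed-length blocks), not TRL's actual implementation. `pack_sequences` is a hypothetical helper, and token id `0` stands in for the EOS token.

```python
def pack_sequences(tokenized_examples, max_seq_length):
    """Concatenate token lists and cut the stream into fixed-size blocks."""
    stream = [tok for example in tokenized_examples for tok in example]
    # Drop the trailing remainder that does not fill a whole block.
    usable = (len(stream) // max_seq_length) * max_seq_length
    return [stream[i:i + max_seq_length] for i in range(0, usable, max_seq_length)]


# Three short "tokenized" examples, each ending with the stand-in EOS id 0.
examples = [[1, 2, 3, 0], [4, 5, 0], [6, 7, 8, 9, 0]]
packed = pack_sequences(examples, max_seq_length=4)
print(packed)  # [[1, 2, 3, 0], [4, 5, 0, 6], [7, 8, 9, 0]]
```

Blocks can span example boundaries, which is why each formatted example ends with an EOS token: it marks where one example stops and the next begins.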

# Step 8 - Let's Start the Training Process

In [None]:
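# Train the model, then push the LoRA adapter and tokenizer to the Hub.
trainer.train()
# Trainer.push_to_hub treats its first argument as a commit message,
# so push the model directly to choose the repository name.
model.push_to_hub(hub_model_ckpt)
tokenizer.push_to_hub(hub_model_ckpt)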