## LLM Fine-tuning 
Fine-tuning a Large Language Model (LLM) involves refining its parameters on domain-specific data to enhance its performance for specialized tasks.

LoRA (Low-Rank Adaptation) is a parameter-efficient technique for fine-tuning large language models (LLMs). Here's the gist:

- **Standard fine-tuning:** Traditionally, fine-tuning updates all the weights (parameters) of a pre-trained LLM to fit a new task. This requires a lot of data and computational power.

- **LoRA's efficiency trick:** LoRA makes a key assumption: the changes needed for fine-tuning can be captured with a much smaller set of parameters. It essentially compresses the adjustments into "lower-rank" matrices.

- **Decomposition magic:** Think of the weight update for a layer as one big matrix ΔW of shape d × k. LoRA freezes the pre-trained weights and learns ΔW as the product of two much smaller matrices, B (d × r) and A (r × k), where the rank r is far smaller than d and k. Together they hold r × (d + k) values instead of d × k, so there are far fewer parameters to train!

- **Benefits:** This approach drastically reduces the number of trainable parameters, leading to:
    - Faster training and adaptation of LLMs.
    - Lower memory footprint during training and deployment.
    - Performance that is often comparable to full fine-tuning, and occasionally better in some reported cases.

Overall, LoRA offers a way to fine-tune large models more efficiently, making them more accessible for various tasks and hardware.
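
To make the parameter savings concrete, here is a minimal plain-Python sketch of the bookkeeping behind the low-rank decomposition. It is only an illustration, not how the `peft` library implements LoRA. The helper names (`lora_param_counts`, `matmul`), the 768 × 768 layer size, and the rank of 4 are assumptions chosen for the example.

```python
# Illustrative sketch only: plain-Python LoRA bookkeeping, not the peft implementation.

def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # every entry of the d_out x d_in update
    lora = rank * (d_out + d_in)   # B is d_out x rank, A is rank x d_in
    return full, lora


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Assumed sizes: one 768 x 768 projection matrix and rank 4.
full, lora = lora_param_counts(768, 768, rank=4)
print(f"full update: {full:,} params, LoRA update: {lora:,} params ({lora / full:.2%})")

# Tiny rank-1 example: a 3 x 3 update built from only 3 + 3 trainable numbers.
B = [[1.0], [2.0], [3.0]]   # 3 x 1
A = [[1.0, 0.0, -1.0]]      # 1 x 3
delta_W = matmul(B, A)      # 3 x 3 matrix of rank 1
print(delta_W)
```

With these assumed sizes, the low-rank pair holds roughly 1% of the values in the full update matrix. The trade-off is that the update is restricted to rank r, which is the key assumption LoRA makes.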

## LoRA
![image.png](attachment:f4aaa156-f5e1-45d0-a7d0-ea4bc177072a.png)

![image.png](attachment:0e98ad50-a18c-4d81-922c-7a76d496c007.png)

Link: https://huggingface.co/blog/peft

## Let's Code

In [15]:
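!pip install transformers
!pip install peft
!pip install torch
!pip install numpy
!pip install evaluate
# datasets, pandas and scikit-learn are imported in later cells
!pip install datasets pandas scikit-learn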

Collecting evaluate
  Downloading evaluate-0.4.1-py3-none-any.whl.metadata (9.4 kB)
Collecting responses<0.19 (from evaluate)
  Downloading responses-0.18.0-py3-none-any.whl.metadata (29 kB)
Downloading evaluate-0.4.1-py3-none-any.whl (84 kB)
Downloading responses-0.18.0-py3-none-any.whl (38 kB)
Installing collected packages: responses, evaluate
Successfully installed evaluate-0.4.1 responses-0.18.0


In [1]:
from datasets import load_dataset, DatasetDict, Dataset
from transformers import (
    AutoTokenizer,
    AutoConfig,
    AutoModelForSequenceClassification,
    DataCollatorWithPadding,
    TrainingArguments,
    Trainer,
)

In [2]:
from peft import PeftModel, PeftConfig, get_peft_model, LoraConfig
import evaluate
import torch
import numpy

In [3]:
# Choose the model to fine-tune
# Find other models here: https://huggingface.co/models
# Documentation: https://huggingface.co/docs/transformers/model_doc/auto#transformers.AutoModelForSequenceClassification
model_choice = 'distilbert/distilbert-base-uncased'

In [4]:
# Define label maps: id -> label name, and the inverse label name -> id
idLabel = {0: "Bad", 1: "Good"}
labelId = {"Bad": 0, "Good": 1}
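
The two label maps are meant to be inverses of each other: `id2label` goes from class id to name, and `label2id` goes from name back to id. As a small sketch, one map can be derived from the other so the two cannot drift apart. The helper name `invert_label_map` is hypothetical and not part of `transformers`, and the label names "Bad" and "Good" are assumed here for illustration.

```python
# Hypothetical helper (not part of transformers): build label2id from id2label.
def invert_label_map(id2label):
    """Return the name -> id map that mirrors an id -> name map."""
    label2id = {label: idx for idx, label in id2label.items()}
    # Duplicate label names would silently collapse, so fail loudly instead.
    if len(label2id) != len(id2label):
        raise ValueError("label names must be unique")
    return label2id


id2label = {0: "Bad", 1: "Good"}          # assumed label names
label2id = invert_label_map(id2label)
print(label2id)
```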


In [5]:
# Generate the classification model
model = AutoModelForSequenceClassification.from_pretrained(
    model_choice, num_labels=2, id2label=idLabel, label2id=labelId)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert/distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [14]:
# Getting data ready for classification

# You can use a dataset from the Hugging Face Hub or load your own CSV/TSV file
# For a Hub dataset, use -> dataset = load_dataset("shawhin/imdb-truncated")

# prepare your own data 
import pandas as pd
from sklearn.model_selection import train_test_split
from datasets import Dataset

# Replace 'Restaurant_Reviews.tsv' with your actual file path
data = pd.read_csv("Restaurant_Reviews.tsv", delimiter='\t')

train_data, val_data = train_test_split(data, test_size=0.2, random_state=42)  # Adjust test_size as needed


# Convert the pandas columns to plain lists before building the datasets
train_dict = {"text": train_data["Review"].tolist(), "label": train_data["Liked"].tolist()}
val_dict = {"text": val_data["Review"].tolist(), "label": val_data["Liked"].tolist()}

dataset = {
    "train": Dataset.from_dict(train_dict),
    "validation": Dataset.from_dict(val_dict),
}

hf_dataset = DatasetDict(dataset)

In [15]:
hf_dataset

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 800
    })
    validation: Dataset({
        features: ['text', 'label'],
        num_rows: 200
    })
})
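
The 800/200 row counts above follow from holding out 20% of 1,000 rows. As a rough sketch of what a seeded hold-out split does, the snippet below shuffles a list and slices it. It is a simplified stand-in, not scikit-learn's algorithm, so it will not pick the same rows as `train_test_split`. The function name `simple_train_val_split` and the synthetic rows are assumptions made for the example.

```python
import random


def simple_train_val_split(rows, val_fraction=0.2, seed=42):
    """Shuffle a copy of rows and hold out the first val_fraction as validation."""
    shuffled = list(rows)                   # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)   # seeded, so the split is repeatable
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Synthetic stand-in for the review data: 1,000 rows with a text and a 0/1 label.
rows = [{"text": f"review {i}", "label": i % 2} for i in range(1000)]
train_rows, val_rows = simple_train_val_split(rows)
print(len(train_rows), len(val_rows))
```

Fixing the seed is what makes the split reproducible between runs, which matters when comparing fine-tuning experiments on the same validation set.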