<font size="6">**7 steps to fine-tune an LLM**</font>

In [22]:
import pandas as pd
import numpy as np

In [23]:
# %pip install transformers
# %pip install accelerate -U
# %pip install pandas numpy huggingface_hub datasets evaluate
# %pip install torch torchvision torchaudio

**Requirements**

This tutorial needs the following libraries:
- `transformers`, which is used throughout the tutorial.
- Either `pytorch` or `tensorflow` for the fine-tuning itself. (This notebook uses `pytorch`.)
- `datasets` and `evaluate`, to load the data and compute the evaluation metric.
- `huggingface_hub`, to push the fine-tuned model to Hugging Face.
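Before running the rest of the notebook, it can help to confirm which of these libraries are actually installed. The following is a minimal sketch using only the standard library; the `REQUIRED` list and the helper name `installed_versions` are illustrative, not part of the original notebook.

```python
from importlib import metadata

# Distribution names as published on PyPI ("torch" is the PyTorch package).
REQUIRED = ["transformers", "torch", "huggingface_hub", "datasets", "evaluate"]

def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as missing can be installed with the `%pip install` lines in the next cell.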

In [24]:
# %pip install transformers
# %pip install torch
# %pip install huggingface_hub
# %pip install datasets evaluate

## STEP 1 - Define a clear objective

## STEP 2 - Choose a pre-trained model and a dataset 

## STEP 3 - Load the data to use



In [25]:
from datasets import load_dataset

dataset = load_dataset("mteb/tweet_sentiment_extraction")
df = pd.DataFrame(dataset['train'])


## STEP 4 - Tokenize the dataset



In [26]:
from transformers import GPT2Tokenizer

# Loading the dataset to train our model
dataset = load_dataset("mteb/tweet_sentiment_extraction")

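tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
# GPT-2 has no padding token, so reuse the end-of-sequence token for padding
tokenizer.pad_token = tokenizer.eos_token

def tokenize_function(examples):
    # Pad or truncate every text to the tokenizer's maximum length
    return tokenizer(examples["text"], padding="max_length", truncation=True)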

tokenized_datasets = dataset.map(tokenize_function, batched=True)
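To make the effect of `padding="max_length"` and `truncation=True` concrete, here is a toy sketch in plain Python. It is not the library's implementation: the function `pad_and_truncate`, the token ids, and `pad_id=0` are made up for illustration, and it assumes right-side padding, which is the GPT-2 tokenizer's default.

```python
def pad_and_truncate(token_ids, max_length, pad_id):
    """Toy version of padding="max_length" with truncation=True (right-side padding)."""
    kept = token_ids[:max_length]            # truncation: drop tokens beyond max_length
    n_pad = max_length - len(kept)           # padding: fill up to max_length
    return {
        "input_ids": kept + [pad_id] * n_pad,
        # 1 marks a real token, 0 marks padding the model should ignore
        "attention_mask": [1] * len(kept) + [0] * n_pad,
    }

short = pad_and_truncate([10, 11, 12], max_length=5, pad_id=0)
long = pad_and_truncate([10, 11, 12, 13, 14, 15, 16], max_length=5, pad_id=0)
print(short)  # {'input_ids': [10, 11, 12, 0, 0], 'attention_mask': [1, 1, 1, 0, 0]}
print(long)   # {'input_ids': [10, 11, 12, 13, 14], 'attention_mask': [1, 1, 1, 1, 1]}
```

Because no `max_length` is passed in the cell above, every tweet should be padded to the tokenizer's maximum length (1024 tokens for `gpt2`), which is memory-hungry for short texts. Passing a smaller `max_length` to the tokenizer is a common way to reduce that cost.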

In [27]:
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(100))
small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(100))

## STEP 5 - Initialize our base model

In [28]:
from transformers import GPT2ForSequenceClassification

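model = GPT2ForSequenceClassification.from_pretrained("gpt2", num_labels=3)
# Tell the model which token is padding so that it classifies from the last real token
model.config.pad_token_id = tokenizer.pad_token_id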

Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


## STEP 6 - Define the evaluation method


In [29]:
import evaluate

metric = evaluate.load("accuracy")

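def compute_metrics(eval_pred):
    # eval_pred holds the model's raw logits and the true labels
    logits, labels = eval_pred
    # The predicted class is the index of the largest logit
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)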
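The following sketch reproduces what `compute_metrics` does, using NumPy only and made-up logits for four examples and three classes. It computes accuracy by hand instead of calling `evaluate`, so it runs without downloading the metric.

```python
import numpy as np

# Toy logits for 4 examples and 3 classes (made-up numbers).
logits = np.array([
    [2.0, 0.1, -1.0],   # largest value at index 0
    [0.2, 1.5, 0.3],    # largest value at index 1
    [-0.5, 0.0, 0.9],   # largest value at index 2
    [1.2, 0.8, 0.1],    # largest value at index 0
])
labels = np.array([0, 1, 1, 0])

# Predicted class = index of the largest logit in each row
predictions = np.argmax(logits, axis=-1)
# Accuracy = fraction of predictions that match the labels
accuracy = float((predictions == labels).mean())

print(predictions.tolist())  # [0, 1, 2, 0]
print(accuracy)              # 0.75
```

Three of the four predictions match the labels, hence 0.75. The `accuracy` metric from `evaluate` returns the same quantity wrapped in a dictionary under the key `"accuracy"`.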


## STEP 7 - Fine-tune using the Trainer Method



In [30]:
from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(
   output_dir="test_trainer",
   #evaluation_strategy="epoch",
   per_device_train_batch_size=1,  # Reduce batch size here
   per_device_eval_batch_size=1,    # Optionally, reduce for evaluation as well
   gradient_accumulation_steps=4
   )


trainer = Trainer(
   model=model,
   args=training_args,
   train_dataset=small_train_dataset,
   eval_dataset=small_eval_dataset,
   compute_metrics=compute_metrics,

)

trainer.train()

100%|██████████| 75/75 [02:38<00:00,  2.11s/it]

{'train_runtime': 158.3406, 'train_samples_per_second': 1.895, 'train_steps_per_second': 0.474, 'train_loss': 1.1476851399739583, 'epoch': 3.0}





TrainOutput(global_step=75, training_loss=1.1476851399739583, metrics={'train_runtime': 158.3406, 'train_samples_per_second': 1.895, 'train_steps_per_second': 0.474, 'train_loss': 1.1476851399739583, 'epoch': 3.0})
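The 75 steps reported above follow from the training arguments: 100 examples, a per-device batch size of 1, 4 gradient accumulation steps, and 3 epochs (the `'epoch': 3.0` in the output). The sketch below works through that arithmetic. The helper `optimizer_steps` is hypothetical, assumes a single device, and uses a simple rounding rule; the exact rule inside `Trainer` may differ between versions when the numbers do not divide evenly.

```python
import math

def optimizer_steps(num_examples, per_device_batch_size, grad_accum_steps,
                    num_epochs, num_devices=1):
    """Estimate the number of optimizer updates for a training run."""
    # Batches the dataloader yields in one epoch
    batches_per_epoch = math.ceil(num_examples / (per_device_batch_size * num_devices))
    # One optimizer update happens every `grad_accum_steps` batches
    updates_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    return updates_per_epoch * num_epochs

steps = optimizer_steps(num_examples=100, per_device_batch_size=1,
                        grad_accum_steps=4, num_epochs=3)
effective_batch_size = 1 * 4  # per-device batch size x accumulation steps

print(steps)                 # 75
print(effective_batch_size)  # 4
```

Gradient accumulation keeps memory use at that of a batch of 1 while each weight update still averages over 4 examples.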

In [31]:
# Print the metrics explicitly: a notebook only displays a cell's last expression
print(trainer.evaluate())
trainer.save_model("Fine_Tuned_Models")


100%|██████████| 100/100 [00:16<00:00,  6.13it/s]


## Sharing the model

In [32]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Replace with the actual path to your fine-tuned model
model_path = "Fine_Tuned_Models"  

# Load the fine-tuned model from disk
finetuned_model = AutoModelForSequenceClassification.from_pretrained(model_path)
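Once the model is loaded, its output logits still have to be turned into a sentiment label. The sketch below shows that last step in plain Python on made-up logits. The mapping in `ID2LABEL` (0 = negative, 1 = neutral, 2 = positive) is an assumption that should be verified against the dataset, and the helpers `softmax` and `to_label` are illustrative names, not part of `transformers`.

```python
import math

# ASSUMED label order; verify against the dataset before relying on it.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    m = max(logits)                               # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def to_label(logits, id2label=ID2LABEL):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]

label, prob = to_label([0.1, 0.3, 2.0])           # made-up logits for one tweet
print(label)  # positive
```

In practice the logits would come from calling `finetuned_model` on tokenized text; with only 100 training examples, the predictions are unlikely to be reliable.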


In [34]:
from huggingface_hub import notebook_login

notebook_login()

In [None]:
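# The repository name is free-form; this one mentions DistilBERT, although the model being pushed is the fine-tuned GPT-2
finetuned_model.push_to_hub("distilbert-base-multilingual-cased-sentiments-student-fine-tuned-data-camp")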