# Fine-Tuning Pipeline

1. Load Bloom with transformers.

2. Load dataset with datasets.

3. Use PEFT to apply LoRA.

4. Define a training loop via Trainer.

5. Save and push the model to the Hugging Face Hub.

In [1]:
!pip install transformers datasets peft accelerate bitsandbytes

Collecting bitsandbytes
  Downloading bitsandbytes-0.46.1-py3-none-manylinux_2_24_x86_64.whl.metadata (10 kB)
Collecting fsspec<=2025.3.0,>=2023.1.0 (from fsspec[http]<=2025.3.0,>=2023.1.0->datasets)
  Downloading fsspec-2025.3.0-py3-none-any.whl.metadata (11 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=1.13.0->peft)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch>=1.13.0->peft)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch>=1.13.0->peft)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch>=1.13.0->peft)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch>=1.13.0->peft)

In [2]:
# Check which NVIDIA GPUs are available and how much memory is in use
!nvidia-smi

Sat Jul 12 13:52:55 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   41C    P8              9W /   70W |       1MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Tesla T4                       Off |   00

In [43]:
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Assumption: the checkpoint is not named in the original cell. bloom-1b7 is
# inferred from the 1.72B parameter count printed later and the Hub repo name.
model_name = "bigscience/bloom-1b7"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Add padding token if it doesn't exist
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the weights in 8-bit via bitsandbytes. quantization_config replaces the
# deprecated load_in_8bit=True keyword argument.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map={'': torch.cuda.current_device()},  # Force to current device
    torch_dtype=torch.float16
)


## Load and Preprocess Dataset

In [11]:
from datasets import load_dataset
dataset = load_dataset("Sp1786/multiclass-sentiment-analysis-dataset")

Generating train split: 31232 examples
Generating validation split: 5205 examples
Generating test split: 5206 examples

In [13]:
print(dataset["train"].column_names)

['id', 'text', 'label', 'sentiment']


In [15]:
train_data = dataset["train"].select(range(20000))
val_data = dataset["validation"].select(range(4000))

## Tokenization Function

In [16]:
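def preprocess(example):
    """Turn one dataset row into a tokenized prompt-plus-answer sequence."""
    # `labels` and `tokenizer` are notebook globals; `labels` is defined in a
    # later cell, which works because it is only looked up when map() runs.
    prompt = f"Text: {example['text']}\nSentiment:"
    target = labels[example["label"]]  # label is 0/1/2

    full_text = prompt + " " + target

    # Pad or truncate every example to exactly 128 tokens
    tokenized = tokenizer(
        full_text,
        truncation=True,
        padding="max_length",
        max_length=128,
    )
    return tokenized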


In [18]:
labels = ["negative", "neutral", "positive"]
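
To make the training format concrete, here is a minimal sketch of the string that `preprocess` builds before tokenization. The helper name `build_training_text` and the sample row are hypothetical, introduced only for illustration; the prompt template and label names are copied from the cells above.

```python
# Label names copied from the notebook; index 0/1/2 maps to these strings.
labels = ["negative", "neutral", "positive"]


def build_training_text(example):
    """Return the prompt-plus-target string that gets tokenized."""
    prompt = f"Text: {example['text']}\nSentiment:"
    target = labels[example["label"]]
    return prompt + " " + target


# Hypothetical row, not taken from the dataset.
sample = {"text": "The battery died after a day.", "label": 0}
training_text = build_training_text(sample)
print(training_text)
```

Because the answer is appended to the prompt, the model is trained as a causal language model to continue `Sentiment:` with the label word, rather than through a separate classification head.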


In [19]:
train_dataset = train_data.map(preprocess, remove_columns=["id", "text", "label", "sentiment"])
val_dataset = val_data.map(preprocess, remove_columns=["id", "text", "label", "sentiment"])



## LoRA with PEFT

In [20]:
from peft import LoraConfig, get_peft_model, TaskType

In [23]:
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["query_key_value"],
    lora_dropout=0.1,
    bias="none",
    task_type=TaskType.CAUSAL_LM,
)
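
As a rough illustration of what `r` and `lora_alpha` control, the sketch below applies a LoRA update to a toy weight matrix in plain Python. It assumes the standard LoRA formulation, in which the low-rank product is scaled by `lora_alpha / r`, and it omits dropout. The matrices, sizes, and the helper names `matvec` and `lora_forward` are made up for the example and are not part of PEFT.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, r, lora_alpha):
    """Compute W x + (lora_alpha / r) * B (A x), without dropout."""
    scale = lora_alpha / r
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


# Toy sizes: d_in = 3, d_out = 2, rank r = 1 (illustrative values only).
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]  # frozen base weight, d_out x d_in
A = [[0.5, -0.5, 1.0]]                   # trainable, r x d_in
B_init = [[0.0], [0.0]]                  # trainable, d_out x r, zero at start
x = [1.0, 2.0, 3.0]

# With B at zero the adapter contributes nothing, so the output is just W x.
y_init = lora_forward(W, A, B_init, x, r=1, lora_alpha=4)

# After B has moved away from zero, the scaled low-rank term is added.
B_trained = [[1.0], [0.0]]
y_trained = lora_forward(W, A, B_trained, x, r=1, lora_alpha=4)

# Scaling factor implied by the notebook's config (r=8, lora_alpha=32).
notebook_scale = 32 / 8
print(y_init, y_trained, notebook_scale)
```

With the notebook's settings the low-rank update is multiplied by 4 before being added to the frozen projection.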

In [44]:
model = get_peft_model(model, lora_config)

In [45]:
model.print_trainable_parameters()

trainable params: 1,572,864 || all params: 1,723,981,824 || trainable%: 0.0912
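
The trainable-parameter count above can be reproduced by hand. The sketch below assumes the bloom-1b7 dimensions (hidden size 2048, 24 transformer layers), which the notebook does not state explicitly, and that `query_key_value` is a single fused projection from the hidden size to three times the hidden size.

```python
# Assumed bloom-1b7 dimensions (not stated in the notebook).
hidden_size = 2048
num_layers = 24

r = 8  # LoRA rank from lora_config

d_in = hidden_size
d_out = 3 * hidden_size  # fused query/key/value projection

# Each adapted layer adds A (r x d_in) and B (d_out x r).
params_per_layer = r * d_in + d_out * r
trainable_params = params_per_layer * num_layers

all_params = 1_723_981_824  # taken from the printed output above
trainable_pct = 100 * trainable_params / all_params

print(f"{trainable_params:,} trainable ({trainable_pct:.4f}%)")
```

Under these assumptions the result matches the printed 1,572,864 trainable parameters and 0.0912%.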


## Define Trainer

In [46]:
from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling

data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False
)

training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,
    eval_strategy="steps",
    eval_steps=200,
    logging_steps=100,
    num_train_epochs=2,
    learning_rate=2e-4,
    fp16=True,  # use fp16 on T4
    save_steps=500,
    save_total_limit=2,
    report_to="none"
)


In [47]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    tokenizer=tokenizer,
    data_collator=data_collator,
)


No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


In [48]:
trainer.train()




| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 200  | 5.4139        | 8.038282        |
| 400  | 8.0073        | 8.038282        |
| 600  | 8.0241        | 8.038282        |
| 800  | 8.0112        | 8.038282        |
| 1000 | 8.0163        | 8.038282        |
| 1200 | 7.9847        | 8.038282        |




TrainOutput(global_step=1250, training_loss=7.518933740234375, metrics={'train_runtime': 5654.0027, 'train_samples_per_second': 7.075, 'train_steps_per_second': 0.221, 'total_flos': 3.717671682048e+16, 'train_loss': 7.518933740234375, 'epoch': 2.0})
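
The reported `global_step=1250` can be checked against the training arguments. The sketch below is a simplified version of the step arithmetic, not the `Trainer` implementation itself, and it assumes that both T4 GPUs listed by `nvidia-smi` took part in training; with a single GPU the same settings would give 2500 steps instead.

```python
import math

num_samples = 20_000   # size of train_data
per_device_batch = 4   # per_device_train_batch_size
grad_accum = 4         # gradient_accumulation_steps
epochs = 2             # num_train_epochs
num_gpus = 2           # assumption: both T4s were used

# Batches drawn per epoch, then optimizer steps after gradient accumulation.
batches_per_epoch = math.ceil(num_samples / (per_device_batch * num_gpus))
steps_per_epoch = batches_per_epoch // grad_accum
total_steps = steps_per_epoch * epochs

# Throughput implied by the reported runtime.
train_runtime = 5654.0027  # seconds, from TrainOutput
samples_per_second = num_samples * epochs / train_runtime
steps_per_second = total_steps / train_runtime

print(total_steps, round(samples_per_second, 3), round(steps_per_second, 3))
```

The effective batch size under this assumption is 4 × 2 × 4 = 32 sequences per optimizer step, and the derived throughput agrees with the `train_samples_per_second` and `train_steps_per_second` values in the output.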

## Push Model to Hub
### You will need a Hugging Face access token with write permission from the Hub.

In [49]:
from huggingface_hub import notebook_login
notebook_login()


In [50]:
model.push_to_hub("MLap/bloom1.7-lora-sentiment-analysis-classification")
tokenizer.push_to_hub("MLap/bloom1.7-lora-sentiment-analysis-classification")


CommitInfo(commit_url='https://huggingface.co/MLap/bloom1.7-lora-sentiment-analysis-classification/commit/e0d365ba128786cfe4e8774adff48858d47db472', commit_message='Upload tokenizer', commit_description='', oid='e0d365ba128786cfe4e8774adff48858d47db472', pr_url=None, repo_url=RepoUrl('https://huggingface.co/MLap/bloom1.7-lora-sentiment-analysis-classification', endpoint='https://huggingface.co', repo_type='model', repo_id='MLap/bloom1.7-lora-sentiment-analysis-classification'), pr_revision=None, pr_num=None)