<a href="https://colab.research.google.com/github/Milad-Khanchi/SFTTrainer_LLAMA2/blob/main/Fine_Tuning_LLMs_with_Hugging_Face.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Fine-Tuning LLMs with Hugging Face

## Step 1: Installing and importing the libraries

In [1]:
!pip uninstall accelerate peft bitsandbytes transformers trl -y
!pip install accelerate peft==0.13.2 bitsandbytes transformers trl==0.12.0

Found existing installation: accelerate 1.1.1
Uninstalling accelerate-1.1.1:
  Successfully uninstalled accelerate-1.1.1
Found existing installation: peft 0.13.2
Uninstalling peft-0.13.2:
  Successfully uninstalled peft-0.13.2
Found existing installation: transformers 4.46.3
Uninstalling transformers-4.46.3:
  Successfully uninstalled transformers-4.46.3
Collecting accelerate
  Downloading accelerate-1.2.1-py3-none-any.whl.metadata (19 kB)
Collecting peft==0.13.2
  Downloading peft-0.13.2-py3-none-any.whl.metadata (13 kB)
Collecting bitsandbytes
  Downloading bitsandbytes-0.45.0-py3-none-manylinux_2_24_x86_64.whl.metadata (2.9 kB)
Collecting transformers
  Downloading transformers-4.47.1-py3-none-any.whl.metadata (44 kB)
Collecting trl==0.12.0
  Downloading trl-0.12.0-py3-none-any.whl.metadata (10 kB)
Collecting datasets>=2.21.0 (from trl==0.12.0)
  Down
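
Because the install cell pins `peft` and `trl` to specific versions, it can be worth confirming what actually got installed before moving on. The following is a minimal sketch using only the standard library; the helper name `report_versions` and the `pinned` dictionary are illustrative and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned by the install cell above; None means "any version".
pinned = {"accelerate": None, "peft": "0.13.2", "bitsandbytes": None,
          "transformers": None, "trl": "0.12.0"}

def report_versions(expected):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        ok = installed is not None and (wanted is None or installed == wanted)
        report[name] = (installed, ok)
    return report

for name, (installed, ok) in report_versions(pinned).items():
    status = "OK" if ok else "CHECK"
    print(f"{name:14s} {installed or 'missing':10s} {status}")
```

Any row marked `CHECK` points to a package that is missing or does not match its pin.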

In [2]:
!pip install huggingface_hub



In [3]:
import torch
from trl import SFTTrainer
from peft import LoraConfig
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments, pipeline)

## Step 2: Loading the model

In [5]:
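# Load the model with 4-bit NF4 quantization; computation runs in float16.
llama_model = AutoModelForCausalLM.from_pretrained(
    "aboonaji/llama2finetune-v2",
    quantization_config = BitsAndBytesConfig(
        load_in_4bit = True,
        bnb_4bit_compute_dtype = torch.float16,
        bnb_4bit_quant_type = "nf4",
    ),
)

# The key/value cache only helps generation, so turn it off for training.
llama_model.config.use_cache = False
llama_model.config.pretraining_tp = 1  # use the regular, unsliced linear-layer computation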

`low_cpu_mem_usage` was None, now default to True since model is quantized.
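
To see why 4-bit loading matters, here is a rough back-of-the-envelope estimate of weight memory. It assumes a model of about 7 billion parameters, which the notebook does not state, and it ignores quantization constants, activations, and any layers kept in higher precision, so treat the numbers as approximate.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30

ASSUMED_PARAMS = 7_000_000_000  # assumption: a 7B-parameter model

fp16 = weight_memory_gib(ASSUMED_PARAMS, 16)
nf4 = weight_memory_gib(ASSUMED_PARAMS, 4)
print(f"float16 weights: {fp16:.1f} GiB")
print(f"4-bit weights:   {nf4:.1f} GiB")
```

Under these assumptions the 4-bit weights take a quarter of the float16 footprint, which is what makes fine-tuning feasible on a single Colab GPU.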



## Step 3: Loading the tokenizer

In [6]:
llama_tokenizer = AutoTokenizer.from_pretrained("aboonaji/llama2finetune-v2", trust_remote_code = True)
llama_tokenizer.pad_token = llama_tokenizer.eos_token
llama_tokenizer.padding_side = "right"
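
The two tokenizer settings above control how shorter sequences in a batch are filled out. As a toy illustration in plain Python (the token ids and the `pad_right` helper are made up for this sketch and do not come from the Llama tokenizer), right padding appends the pad id after the real tokens, and an attention mask marks which positions are real:

```python
def pad_right(batch, pad_id):
    """Pad every sequence to the longest length, appending pad_id on the right."""
    max_len = max(len(seq) for seq in batch)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    # 1 marks a real token, 0 marks padding that the model should ignore.
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    return input_ids, attention_mask

EOS_ID = 2  # hypothetical id; the notebook reuses the EOS token as the pad token
ids, mask = pad_right([[5, 6, 7], [8]], pad_id=EOS_ID)
print(ids)   # [[5, 6, 7], [8, 2, 2]]
print(mask)  # [[1, 1, 1], [1, 0, 0]]
```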


## Step 4: Setting the training arguments

In [7]:
training_arguments = TrainingArguments(
    output_dir = "./results",
    per_device_train_batch_size = 4,
    max_steps = 100)
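
With `max_steps = 100` and a per-device batch size of 4, training covers only a small part of the dataset. The sketch below works this out, assuming a single GPU and no gradient accumulation (neither is set explicitly in the notebook) and using the 6861 examples reported when the dataset is mapped in Step 5. The function name `examples_seen` is introduced here for illustration.

```python
def examples_seen(max_steps, per_device_batch_size, num_devices=1, grad_accum_steps=1):
    """Number of training examples consumed after max_steps optimizer steps."""
    return max_steps * per_device_batch_size * num_devices * grad_accum_steps

DATASET_SIZE = 6861  # from the "Map" progress line in Step 5

seen = examples_seen(max_steps=100, per_device_batch_size=4)
print(seen)  # 400
print(f"{seen / DATASET_SIZE:.1%} of one epoch")
```

This run is therefore a short demonstration rather than a full pass over the data; raising `max_steps` would cover more of it.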

## Step 5: Creating the Supervised Fine-Tuning trainer

In [10]:
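llama_sft_trainer = SFTTrainer(
    model = llama_model,
    args = training_arguments,
    train_dataset = load_dataset(path = "aboonaji/wiki_medical_terms_llam2_format", split = "train"),
    tokenizer = llama_tokenizer,
    # Train low-rank adapters (LoRA) instead of updating the quantized weights.
    peft_config = LoraConfig(
        r = 64,              # rank of the adapter matrices
        lora_alpha = 16,     # adapter updates are scaled by lora_alpha / r
        lora_dropout = 0.1,  # dropout applied on the adapter path
        task_type = "CAUSAL_LM"),
    dataset_text_field = "text",  # dataset column that holds the training text
)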


Deprecated positional argument(s) used in SFTTrainer, please use the SFTConfig to set these arguments instead.


Map:   0%|          | 0/6861 [00:00<?, ? examples/s]
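
To get a feel for what `r = 64` and `lora_alpha = 16` mean, the sketch below counts the trainable parameters LoRA adds to a single linear layer and computes the scaling factor applied to the adapter update. The 4096 x 4096 layer shape is an assumption chosen for illustration; the notebook does not state the model's dimensions, and which layers receive adapters depends on the library defaults.

```python
def lora_params(d_in, d_out, r):
    """LoRA adds two matrices, A (r x d_in) and B (d_out x r), per adapted layer."""
    return r * d_in + d_out * r

def lora_scaling(lora_alpha, r):
    """The low-rank update B @ A is multiplied by lora_alpha / r."""
    return lora_alpha / r

D_IN = D_OUT = 4096  # assumed layer shape, for illustration only

added = lora_params(D_IN, D_OUT, r=64)
full = D_IN * D_OUT
print(added)                 # 524288
print(f"{added / full:.2%} of the frozen layer's weights")
print(lora_scaling(16, 64))  # 0.25
```

Under these assumptions each adapted layer trains only a few percent as many parameters as the frozen layer holds, and the adapter's contribution is scaled down by a factor of 0.25.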

## Step 6: Training the model

In [13]:
llama_sft_trainer.train()

## Step 7: Chatting with the model

In [13]:
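prompt = "I have chestpain"
# max_length counts the prompt tokens as well as the generated ones.
text_generator = pipeline("text-generation", model = llama_model, tokenizer = llama_tokenizer, max_length = 300)
# Wrap the prompt in the Llama 2 [INST] instruction tags.
answer = text_generator(f"<s>[INST] {prompt} [/INST]")
print(answer[0]['generated_text'])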

Device set to use cuda:0


<s>[INST] I have chestpain [/INST]  I'm so sorry to hear that you're experiencing chest pain. Unterscheidung between a heart attack and other causes of chest pain can be challenging, but it's important to seek medical attention right away if you're experiencing any of the following symptoms:

1. Severe chest pain or discomfort that radiates to your arm, neck, jaw, or back.
2. Shortness of breath or difficulty breathing.
3. Feeling weak, lightheaded, or faint.
4. A cold sweat or clamminess.
5. Nausea or vomiting.
6. Rapid or irregular heartbeat.

If you're experiencing any of these symptoms, call emergency services or go to the emergency room right away. Do not try to drive yourself, as your safety is the priority.

If you're experiencing any other type of chest pain that is not severe or life-threatening, you can try the following:

1. Take a deep breath and try to relax.
2. Apply gentle pressure to the area with your fingertips.
3. Try to identify any triggers that may be causing the 
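
The pipeline returns the prompt together with the reply, as the output above shows. A small pair of helpers can build the `[INST]` prompt and strip the echoed part. `build_prompt` and `extract_reply` are hypothetical names introduced for this sketch, and the template is simply the one used in the cell above.

```python
def build_prompt(user_message):
    """Wrap a user message in the [INST] tags used in the cell above."""
    return f"<s>[INST] {user_message} [/INST]"

def extract_reply(generated_text):
    """Return only the text after the last [/INST] tag."""
    return generated_text.rsplit("[/INST]", 1)[-1].strip()

prompt_text = build_prompt("I have chestpain")
# Stand-in for the pipeline's 'generated_text' field, which echoes the prompt.
sample = prompt_text + "  I'm so sorry to hear that you're experiencing chest pain."
print(extract_reply(sample))
```

Passing the pipeline's `generated_text` string to `extract_reply` prints only the answer. The reply above also stops mid-sentence, most likely because `max_length = 300` limits prompt and generated tokens together.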