# Fine-Tuning a Pretrained Model Using LoRA


## Install Requirements

In [1]:
!pip install -q transformers datasets peft accelerate bitsandbytes huggingface_hub

In [2]:
from huggingface_hub import login
import os

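# Read the access token from the HF_TOKEN environment variable; if it is unset, login() falls back to an interactive prompt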
login(token=os.getenv("HF_TOKEN"))


## Load Dataset from HF

In [3]:
from datasets import load_dataset

# Load the Alpaca-cleaned dataset
dataset = load_dataset("yahma/alpaca-cleaned")

# Show one sample to confirm
print(dataset["train"][0])

Generating train split:   0%|          | 0/51760 [00:00<?, ? examples/s]

{'output': '1. Eat a balanced and nutritious diet: Make sure your meals are inclusive of a variety of fruits and vegetables, lean protein, whole grains, and healthy fats. This helps to provide your body with the essential nutrients to function at its best and can help prevent chronic diseases.\n\n2. Engage in regular physical activity: Exercise is crucial for maintaining strong bones, muscles, and cardiovascular health. Aim for at least 150 minutes of moderate aerobic exercise or 75 minutes of vigorous exercise each week.\n\n3. Get enough sleep: Getting enough quality sleep is crucial for physical and mental well-being. It helps to regulate mood, improve cognitive function, and supports healthy growth and immune function. Aim for 7-9 hours of sleep each night.', 'input': '', 'instruction': 'Give three tips for staying healthy.'}


## Load Model via Hugging Face API Token

In [4]:
from transformers import LlamaTokenizer, LlamaForCausalLM
import torch

model_name = "openlm-research/open_llama_3b"

tokenizer = LlamaTokenizer.from_pretrained(model_name, use_fast=False)
model = LlamaForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)


You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message
`torch_dtype` is deprecated! Use `dtype` instead!
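As a rough sanity check on why the later cells switch to 8-bit loading, the memory needed for the weights alone can be estimated from the parameter count. This is a back-of-the-envelope sketch: the parameter count is inferred from the roughly 6.85 GB float16 weights file, `weight_memory_gb` is a hypothetical helper, and activations, optimizer state, and quantization overhead are all ignored.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights only, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9

# Assumption: about 6.85 GB of float16 weights at 2 bytes per parameter.
approx_params = 6.85e9 / 2

fp16_gb = weight_memory_gb(approx_params, 2)  # float16: 2 bytes per weight
int8_gb = weight_memory_gb(approx_params, 1)  # 8-bit: 1 byte per weight

print(f"parameters ~ {approx_params / 1e9:.2f}B")
print(f"float16 weights ~ {fp16_gb:.2f} GB, 8-bit weights ~ {int8_gb:.2f} GB")
```

Under these assumptions, 8-bit loading roughly halves the weight footprint, which matters on a free Colab GPU.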



In [5]:
# ====================================================
# Apply LoRA Fine-Tuning (Light Version)
# ====================================================


from transformers import LlamaTokenizer, LlamaForCausalLM, Trainer, TrainingArguments
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
import torch

In [6]:
# ----------------------------------------------------
# Load Model (8-bit for low GPU memory)
# ----------------------------------------------------
from transformers import BitsAndBytesConfig

model_name = "openlm-research/open_llama_3b"

tokenizer = LlamaTokenizer.from_pretrained(model_name, use_fast=False)
model = LlamaForCausalLM.from_pretrained(
    model_name,
    # Pass quantization settings via BitsAndBytesConfig; the bare
    # `load_in_8bit=True` argument is deprecated.
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)


In [7]:
# ----------------------------------------------------
# Load Dataset (subset of 1000 samples)
# ----------------------------------------------------
dataset = load_dataset("yahma/alpaca-cleaned")
train_dataset = dataset["train"].select(range(1000))
train_dataset[2]

{'output': "An atom is the basic building block of all matter and is made up of three types of particles: protons, neutrons, and electrons. The structure of an atom can be described as a nucleus at the center surrounded by a cloud of electrons.\n\nThe nucleus of an atom is made up of protons and neutrons. Protons are positively charged particles and neutrons are neutral particles with no charge. Both of these particles are located in the nucleus of the atom, which is at the center of the atom and contains most of the atom's mass.\n\nSurrounding the nucleus of the atom is a cloud of electrons. Electrons are negatively charged particles that are in constant motion around the nucleus. The electron cloud is divided into shells or orbitals, and each shell can hold a certain number of electrons. The number of electrons in the outermost shell, called the valence shell, determines the chemical properties of the atom. \n\nIn a neutral atom, the number of protons in the nucleus is equal to the n

In [8]:
# ----------------------------------------------------
# Preprocess Dataset
# ----------------------------------------------------
# Add a pad token if missing
if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({'pad_token': '[PAD]'})
    model.resize_token_embeddings(len(tokenizer))

# Define the formatting function
def format_example(example):
    if example["input"]:
        text = f"Instruction: {example['instruction']}\nInput: {example['input']}\nResponse: {example['output']}"
    else:
        text = f"Instruction: {example['instruction']}\nResponse: {example['output']}"

    tokens = tokenizer(
        text,
        truncation=True,
        padding="max_length",
        max_length=256,
    )
    # Add labels for training (needed for loss)
    tokens["labels"] = tokens["input_ids"].copy()
    return tokens

# Apply to the dataset (small subset for Colab)
tokenized_dataset = train_dataset.map(format_example)


The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
The new lm_head weights will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`
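
Because every example is padded to `max_length=256` and `labels` is a straight copy of `input_ids`, the training loss is also computed on padding positions. A common variation is to set the label to `-100` wherever `attention_mask` is 0, since `-100` is the index that PyTorch's cross-entropy loss ignores by default. The sketch below is not part of the original run; `mask_padding_labels` is a hypothetical helper and the token IDs are made up for illustration.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss

def mask_padding_labels(input_ids, attention_mask):
    """Copy input_ids into labels, replacing padded positions with IGNORE_INDEX."""
    return [
        token_id if mask == 1 else IGNORE_INDEX
        for token_id, mask in zip(input_ids, attention_mask)
    ]

# Toy example: three real tokens followed by two padding tokens.
example = {
    "input_ids": [1, 529, 813, 32000, 32000],
    "attention_mask": [1, 1, 1, 0, 0],
}
example["labels"] = mask_padding_labels(example["input_ids"], example["attention_mask"])
print(example["labels"])  # [1, 529, 813, -100, -100]
```

Inside `format_example`, a call like this would take the place of the `tokens["input_ids"].copy()` line.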



In [9]:
# ----------------------------------------------------
#  Apply LoRA Config
# ----------------------------------------------------
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
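
To get a feel for how small this adapter is, the number of trainable LoRA parameters can be estimated by hand: each adapted linear layer with input size `d_in` and output size `d_out` gains two low-rank matrices, `A` (`r x d_in`) and `B` (`d_out x r`), and the update is scaled by `lora_alpha / r`. The sketch below assumes a hidden size of 3200, 26 decoder layers, and square `q_proj`/`v_proj` projections; check the model's `config.json` before relying on these numbers. On the real model, PEFT's `model.print_trainable_parameters()` reports the exact count.

```python
def lora_params_per_layer(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r

# Values from the LoraConfig above.
r, lora_alpha = 8, 16
target_modules = ["q_proj", "v_proj"]

# Assumptions about the base model (not stated in this notebook).
hidden_size = 3200
num_layers = 26

per_module = lora_params_per_layer(hidden_size, hidden_size, r)
total = per_module * len(target_modules) * num_layers
scaling = lora_alpha / r

print(f"per adapted module: {per_module:,}")      # 51,200
print(f"total trainable (estimate): {total:,}")   # 2,662,400
print(f"LoRA scaling factor: {scaling}")          # 2.0
```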

In [10]:
# ----------------------------------------------------
# Training Config (Lightweight)
# ----------------------------------------------------
from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(
    per_device_train_batch_size=1,       # small batch to fit in GPU memory
    gradient_accumulation_steps=4,       # accumulate gradients over 4 steps (effective batch size of 4)
    num_train_epochs=3,                  # train for 3 full epochs
    fp16=True,                           # fp16 mixed-precision training
    learning_rate=2e-4,                  # learning rate
    logging_steps=10,                    # log the training loss every 10 steps
    save_strategy="epoch",               # save a checkpoint after each epoch
    output_dir="./lora_openllama3b_output",  # checkpoint directory
    overwrite_output_dir=True,           # overwrite existing contents of output_dir
    save_total_limit=2,                  # keep only the 2 most recent checkpoints
    report_to="none",                    # disable experiment trackers such as W&B (avoids the API-key prompt)
)

trainer = Trainer(
    model=model,
    train_dataset=tokenized_dataset,
    args=training_args,
)


In [11]:
import os
os.environ["WANDB_DISABLED"] = "true"

In [12]:
# Start training
trainer.train()

| Step | Training Loss |
|------|---------------|
| 10   | 6.3953 |
| 20   | 5.9756 |
| 30   | 5.6764 |
| 40   | 6.0824 |
| 50   | 6.0966 |
| 60   | 5.504  |
| 70   | 4.4173 |
| 80   | 5.4832 |
| 90   | 5.2127 |
| 100  | 5.8082 |




TrainOutput(global_step=750, training_loss=5.412514572143555, metrics={'train_runtime': 1672.0925, 'train_samples_per_second': 1.794, 'train_steps_per_second': 0.449, 'total_flos': 1.53296142336e+16, 'train_loss': 5.412514572143555, 'epoch': 3.0})
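
The `global_step=750` in the output above follows directly from the training arguments: with a per-device batch of 1 and 4 accumulation steps, each optimizer step consumes 4 examples. A small sketch of that arithmetic, assuming a single GPU (`optimizer_steps` is a hypothetical helper, not a `Trainer` API, and the rounding for uneven splits may differ from what `Trainer` does):

```python
import math

def optimizer_steps(num_examples, per_device_batch, grad_accum, epochs, num_devices=1):
    """Estimate the effective batch size and the number of optimizer updates."""
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch * epochs

effective_batch, total_steps = optimizer_steps(
    num_examples=1000, per_device_batch=1, grad_accum=4, epochs=3
)
print(effective_batch, total_steps)  # 4 750
```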

In [13]:
# Save model and tokenizer
model.save_pretrained("./lora_llama_chat")
tokenizer.save_pretrained("./lora_llama_chat")



('./lora_llama_chat/tokenizer_config.json',
 './lora_llama_chat/special_tokens_map.json',
 './lora_llama_chat/tokenizer.model',
 './lora_llama_chat/added_tokens.json')

In [2]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from peft import PeftModel

model_path = "./lora_llama_chat"
base_model_name = "openlm-research/open_llama_3b"

tokenizer = AutoTokenizer.from_pretrained(model_path)

# Load base model first
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    device_map="auto",
    load_in_8bit=True,  # optional, for memory
)

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    model_path,
    device_map="auto",
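    ignore_mismatched_sizes=True  # skip adapter weights whose shape differs from the base model (embed_tokens has 32001 rows in the checkpoint, 32000 in the base model)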
)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)


The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
- base_model.model.model.embed_tokens.weight: found shape torch.Size([32001, 3200]) in the checkpoint and torch.Size([32000, 3200]) in the model instantiated.
Device set to use cuda:0


In [6]:
from datasets import load_dataset
from transformers import AutoTokenizer, pipeline, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Paths
model_path = "./lora_llama_chat"
base_model_name = "openlm-research/open_llama_3b"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Load model with 8-bit quantization
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    model_path,
    device_map="auto",
    ignore_mismatched_sizes=True
)

# Create pipeline
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=150,
    temperature=0.7,
    top_p=0.9
)

# Load dataset (subset of 5 samples)
dataset = load_dataset("yahma/alpaca-cleaned")
test_dataset = dataset["train"].select(range(5))  # first 5 samples

# Loop and print
for i, sample in enumerate(test_dataset):
    instruction = sample['instruction']
    input_text = sample.get('input', '')
    expected_output = sample.get('output', '')

    prompt = f"### Instruction:\n{instruction}\n\n### Input:\n{input_text}\n\n### Response:\n"
    generated_text = generator(prompt)[0]["generated_text"]
    predicted_output = generated_text.split("### Response:")[-1].strip()

    print(f"\nSample {i+1}:")
    print("Instruction:", instruction)
    print("Input:", input_text)
    print("Predicted Output:", predicted_output)
    print("Expected Output:", expected_output)
    print("-" * 50)


Device set to use cuda:0



Sample 1:
Instruction: Give three tips for staying healthy.
Input: 
Predicted Output: 1. Eat a balanced and healthy diet: This is the most important tip for staying healthy. Eating a well-balanced diet with plenty of fruits, vegetables, and lean proteins can help you maintain a healthy weight, lower your risk of chronic disease, and improve your overall health.

2. Get enough sleep: Getting enough sleep is another important tip for staying healthy. Getting less than 7 hours of sleep per night can have negative effects on your health, including increased risk of weight gain, heart disease, and depression.

3. Exercise regularly: Exercise is an important part of staying healthy. It can help reduce the risk of chronic diseases, improve your mood, and improve your overall fitness. It is important to find an activity
Expected Output: 1. Eat a balanced and nutritious diet: Make sure your meals are inclusive of a variety of fruits and vegetables, lean protein, whole grains, and healthy fats.
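
Note that the evaluation prompt above uses a `### Instruction:` layout, whereas `format_example` trained the model on plain `Instruction:` / `Input:` / `Response:` lines. Building both from one helper is one way to avoid that kind of drift. The following is a minimal sketch, not the code that produced the outputs shown here; `build_prompt` is a hypothetical name.

```python
def build_prompt(instruction: str, input_text: str = "", output: str = "") -> str:
    """Build a prompt in the same layout that format_example used for training.

    Leave `output` empty at inference time so the model continues after "Response:".
    """
    lines = [f"Instruction: {instruction}"]
    if input_text:  # the Input line is omitted when there is no input
        lines.append(f"Input: {input_text}")
    lines.append(f"Response: {output}")
    return "\n".join(lines)

prompt = build_prompt("Give three tips for staying healthy.")
print(repr(prompt))
# 'Instruction: Give three tips for staying healthy.\nResponse: '
```

At inference time, the prediction would then be the text after the final `Response:` marker, for example `generated_text.split("Response:")[-1].strip()`.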

# LoRA Fine-Tuning Summary

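**Model:** `openlm-research/open_llama_3b` with LoRA adapters (r=8, alpha=16, on `q_proj` and `v_proj`)  
**Epochs:** 3 | **Dataset:** 1,000 samples from `yahma/alpaca-cleaned` | **Device:** Free Colab GPU (limited VRAM)

## Observations
- Responses are correct for general instructions (health tips, atom structure, pollution).  
- Responses are only partially correct on details (e.g., RGB vs. RYB colors).  
- Limitations: the small dataset, few epochs, and limited GPU memory constrain accuracy. The five evaluation prompts also come from the training subset, so they do not measure generalization.

## Conclusion
LoRA fine-tuning works for general instruction-following but may miss fine details. Training on more data for more epochs would likely improve results.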


In [8]:
# Instruction to test
instruction = "Give three tips for staying healthy"
input_text = ""  # no additional input
prompt = f"### Instruction:\n{instruction}\n\n### Input:\n{input_text}\n\n### Response:\n"

# Generate output from the model
generated_text = generator(prompt)[0]["generated_text"]
predicted_output = generated_text.split("### Response:")[-1].strip()

# Print the result
print("Instruction:", instruction)
print("Predicted Output:", predicted_output)


Instruction: Give three tips for staying healthy
Predicted Output: 1. Maintain a healthy diet: Eating a variety of fresh fruits, vegetables, lean protein and whole grains can help reduce the risk of chronic disease.
2. Get enough sleep: Getting enough sleep is crucial for your health, as it helps your body and brain function at their best. Make sure to set a regular bedtime and wake up time, and try to get between seven and nine hours of sleep each night.
3. Exercise regularly: Regular physical activity is essential for staying healthy, and can help reduce your risk of chronic disease. Start with small, moderate exercise, such as brisk walking or gardening, and gradually increase your activity level.
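
The comparison between predicted and expected outputs in this notebook is qualitative. One rough way to put a number on it is a token-overlap F1 score, sketched below. This metric is an illustrative assumption rather than something used in this notebook; `token_f1` is a hypothetical helper, and whitespace tokenization is a deliberate simplification.

```python
from collections import Counter

def token_f1(predicted: str, expected: str) -> float:
    """Token-level F1 between two strings, using lowercase whitespace tokens."""
    pred_tokens = predicted.lower().split()
    exp_tokens = expected.lower().split()
    if not pred_tokens or not exp_tokens:
        return 0.0
    # Multiset intersection: a shared token counts only as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(exp_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(exp_tokens)
    return 2 * precision * recall / (precision + recall)

score = token_f1("eat a balanced diet", "eat a balanced and nutritious diet")
print(round(score, 2))  # 0.8
```

A score like this rewards shared wording rather than factual correctness, so it complements a manual read of the outputs and does not replace it.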
