<a href="https://colab.research.google.com/github/bsong75/brendensong-portfolio/blob/master/llama_fine_tuning_lora.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Fine-tuning TinyLlama 1.1B with LoRA

In [None]:
!pip install transformers datasets evaluate peft trl bitsandbytes

In [None]:
import os
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments, pipeline, logging
from peft import LoraConfig
from trl import SFTTrainer

base_model = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
guanaco_dataset = "mlabonne/guanaco-llama2-1k"
new_model = "llama-1.1B-chat-guanaco"

dataset = load_dataset(guanaco_dataset, split="train")
model = AutoModelForCausalLM.from_pretrained(base_model, device_map='auto')
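model.config.use_cache = False # disable the key/value cache, which is only useful for generation
model.config.pretraining_tp = 1

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token # reuse the EOS token for padding
tokenizer.padding_side = 'right' # pad at the end of each sequence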

In [None]:
# Run inference with the base model, before any fine-tuning
logging.set_verbosity(logging.CRITICAL)
prompt = "Who is Napoleon Bonaparte?"
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=500)
result = pipe(f"<s>[INST] {prompt} [/INST]")
print(result[0]['generated_text'])

<s>[INST] Who is Napoleon Bonaparte? [/INST] <br> <br>
                        <h3>Who is Napoleon Bonaparte?</h3>
                        <p>
                            Napoleon Bonaparte (1769-1821) was a major figure in the history of France from the end of the 18th century until his
                            death. He was one of the most important and successful military commanders of the French Revolution and after the
                            French Revolution he was a popular revolutionary hero. <br>
                            He was born in Corsica in 1769. His father was a general who was appointed head of the army, but Napoleon’s mother,
                            <NAME>, was a noblewoman. <br>
                            He was very clever, intelligent and well educated. He came to Paris when he was still young because his father
                            wanted him to be a soldier. He was sent to school in France and had a very good education. He studied English
 

In [None]:
peft_params = LoraConfig(lora_alpha=16, # LoRA scaling numerator: the update is scaled by lora_alpha / r (here 16 / 64 = 0.25)
                         lora_dropout=0.1, # dropout with p=0.1 applied to the input of the LoRA branch
                         r=64, # LoRA rank: the inner dimension shared by the two low-rank matrices
                         bias="none", # do not train bias terms
                         task_type="CAUSAL_LM"
)
training_params = TrainingArguments(output_dir='./results',
                                    num_train_epochs=2, # two passes over the dataset
                                    per_device_train_batch_size=2, # micro-batch size of 2
                                    gradient_accumulation_steps=16, # effective batch size 16 * 2 = 32 per device
                                    optim="adamw_torch",
                                    save_steps=25, # checkpoint every 25 optimizer steps
                                    logging_steps=1,
                                    learning_rate=2e-4, # peak learning rate
                                    weight_decay=0.001,
                                    fp16=True, # 16-bit mixed precision
                                    bf16=False, # bfloat16 is not supported on V100
                                    max_grad_norm=0.3, # clip the gradient norm at 0.3
                                    max_steps=-1, # -1 defers to num_train_epochs
                                    warmup_ratio=0.03, # warm up the learning rate over the first 3% of steps
                                    group_by_length=True, # batch sequences of similar length to reduce padding
                                    lr_scheduler_type="cosine" # cosine learning rate decay
)
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_params, # parameter-efficient fine-tuning (LoRA)
    #text_column="text",
    #max_seq_length=None,
    #tokenizer=tokenizer,
    args=training_params,
    #packing=False
)
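
To get a feel for what `r=64` and `lora_alpha=16` mean in practice, the following sketch estimates how many parameters LoRA adds. It is only an estimate under stated assumptions: the model shapes (hidden size 2048, 22 layers, key/value width 256) and the choice of `q_proj` and `v_proj` as target modules are not stated in this notebook, and `lora_param_count` is a hypothetical helper, not a PEFT function.

```python
# Assumed TinyLlama-1.1B shapes (not stated in the notebook).
HIDDEN_SIZE = 2048
NUM_LAYERS = 22
KV_DIM = 256  # assumed: 4 key/value heads x 64 dims per head

# Values taken from the LoraConfig above.
RANK = 64
LORA_ALPHA = 16


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


# Assumed target modules: (input dim, output dim) of each adapted projection.
targets = {
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "v_proj": (HIDDEN_SIZE, KV_DIM),
}

per_layer = sum(lora_param_count(d_in, d_out, RANK) for d_in, d_out in targets.values())
total_trainable = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / RANK  # factor applied to the low-rank update

print(f"LoRA parameters per layer: {per_layer:,}")
print(f"LoRA parameters in total:  {total_trainable:,}")
print(f"LoRA scaling factor:       {scaling}")
```

Under these assumptions the adapter holds roughly 9 million trainable parameters, a small fraction of the 1.1B frozen base weights.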

In [None]:
import gc # garbage collection
gc.collect()
torch.cuda.empty_cache() # clean cache


In [None]:
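trainer.train() # train the model
trainer.model.save_pretrained(new_model) # saves the LoRA adapter weights
trainer.processing_class.save_pretrained(new_model) # the tokenizer; trainer.tokenizer is deprecated in recent transformers releases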
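
The training arguments imply a fairly short run. The sketch below works out the effective batch size, an approximate number of optimizer steps, and the learning rate under linear warmup followed by cosine decay. It assumes a single GPU and one optimizer step per full effective batch; the exact step count reported by the trainer may differ slightly depending on the library version. `lr_at` is a hypothetical helper written for illustration, not the scheduler the trainer uses internally.

```python
import math

# Values taken from the notebook.
NUM_EXAMPLES = 1000
NUM_EPOCHS = 2
MICRO_BATCH = 2
GRAD_ACCUM = 16
PEAK_LR = 2e-4
WARMUP_RATIO = 0.03

effective_batch = MICRO_BATCH * GRAD_ACCUM
# Approximation: one optimizer step per full effective batch, single GPU.
steps_per_epoch = NUM_EXAMPLES // effective_batch
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = math.ceil(WARMUP_RATIO * total_steps)


def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then half-cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"effective batch size: {effective_batch}")
print(f"optimizer steps (approx.): {total_steps}, warmup steps: {warmup_steps}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:3d}: lr = {lr_at(step):.2e}")
```

With only about 60 optimizer steps, `save_steps=25` yields at most a couple of intermediate checkpoints, and the 3% warmup covers just the first step or two.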

In [None]:
# Run the same prompt again, now with the fine-tuned model
prompt = "Who is Napoleon Bonaparte?"
pipe = pipeline(task='text-generation', model=model, tokenizer=tokenizer, max_length=200)
result = pipe(f'<s>[INST] {prompt} [/INST]')
print(result[0]['generated_text'])

<s>[INST] Who is Napoleon Bonaparte? [/INST] Napoleon Bonaparte was born in Corsica in 1769 and died in 1821. Napoleon became a general during the French Revolution. Napoleon was born, at the age of 15, in Corsica, where he was raised by his aunt and uncle. At 19, he left for France, where he fought in the Revolutionary War. At the age of 21, he was elected to the French National Convention. He became the head of state during the French Revolution. In 1799, Napoleon became the first Consul of France. Napoleon then led France to victory in the Battle of Austerlitz and helped transform France into a huge empire. Some of Napoleon's most important battles were the battles of the Alps, Jarnac, and Austerlitz. Napoleon was the first Emperor of France
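
Both inference cells wrap the question in the `[INST] ... [/INST]` format and print the full generated text, prompt included. As a minimal sketch, the two helpers below build that prompt and strip it from the output again. `build_prompt` and `extract_response` are hypothetical names introduced here, and `sample_output` is a shortened copy of the output above rather than a fresh generation.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the [INST] format used in the notebook."""
    return f"<s>[INST] {user_message} [/INST]"


def extract_response(generated_text: str) -> str:
    """Return the text after the last [/INST] marker, without surrounding whitespace."""
    marker = "[/INST]"
    _, found, tail = generated_text.rpartition(marker)
    # If the marker is missing, fall back to the full text.
    return tail.strip() if found else generated_text.strip()


prompt = build_prompt("Who is Napoleon Bonaparte?")
sample_output = prompt + " Napoleon Bonaparte was born in Corsica in 1769 and died in 1821."
print(extract_response(sample_output))
```

In the notebook this would be applied as `extract_response(result[0]['generated_text'])`.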


In [None]:
!pip install nbstripout



In [None]:
# Shell commands need a leading "!" in a notebook cell
!nbstripout llama_fine_tuning_LoRA.ipynb

In [None]:
import json

# Mount Google Drive so the notebook file is reachable
from google.colab import drive
drive.mount('/content/drive')  # only needed if the notebook is stored in Google Drive

notebook_path = '/content/drive/MyDrive/llama_fine_tuning_LoRA.ipynb'  # Update this!

with open(notebook_path, 'r', encoding='utf-8') as f:
    data = json.load(f)

# Remove widget metadata at the notebook level and from each cell
data.get('metadata', {}).pop('widgets', None)
for cell in data['cells']:
    cell.get('metadata', {}).pop('widgets', None)

# Save cleaned notebook
cleaned_path = notebook_path.replace('.ipynb', '_cleaned.ipynb')
with open(cleaned_path, 'w', encoding='utf-8') as f:
    json.dump(data, f, indent=2)

print(f"Cleaned notebook saved to {cleaned_path}")


MessageError: Error: credential propagation was unsuccessful
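
The cleaning step itself does not depend on Google Drive. As a minimal sketch, the function below removes widget metadata from a notebook that is already loaded as a Python dict, so it can be tried on any `.ipynb` file read with `json.load`. `strip_widget_metadata` is a hypothetical helper, and the `example` notebook is hand-written for illustration only.

```python
import json


def strip_widget_metadata(notebook: dict) -> dict:
    """Remove 'widgets' metadata from the notebook and from each cell, in place."""
    notebook.get("metadata", {}).pop("widgets", None)
    for cell in notebook.get("cells", []):
        cell.get("metadata", {}).pop("widgets", None)
    return notebook


# A tiny hand-written notebook used only for illustration.
example = {
    "cells": [
        {"cell_type": "code", "metadata": {"widgets": {"x": 1}}, "source": [], "outputs": []},
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
    ],
    "metadata": {
        "widgets": {"application/vnd.jupyter.widget-state+json": {}},
        "kernelspec": {"name": "python3"},
    },
    "nbformat": 4,
    "nbformat_minor": 5,
}

cleaned = strip_widget_metadata(example)
print(json.dumps(cleaned["metadata"]))
```

Other metadata, such as the kernel specification, is left untouched.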