# Fine-tuning Llama 3.2 1B Instruct with Hugging Face

https://medium.com/@alexandros_chariton/how-to-fine-tune-llama-3-2-instruct-on-your-own-data-a-detailed-guide-e5f522f397d7

## Load and test model

In [None]:
from transformers import pipeline
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model_id = "meta-llama/Llama-3.2-1B-Instruct"
pipe = pipeline(
    "text-generation",
    model=model_id,
    device_map="auto",
    # device_map=device
)
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Should I move to Scandinavia?"},
]
outputs = pipe(
    messages,
    max_new_tokens=1024,
    do_sample=True
)
print("\nANSWER:")
print(outputs[0]["generated_text"][-1]['content'])

loading configuration file config.json from cache at C:\Users\YT40432\.cache\huggingface\hub\models--meta-llama--Llama-3.2-1B-Instruct\snapshots\9213176726f574b556790deb65791e0c5aa438b6\config.json
Model config LlamaConfig {
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 128000,
  "eos_token_id": [
    128001,
    128008,
    128009
  ],
  "head_dim": 64,
  "hidden_act": "silu",
  "hidden_size": 2048,
  "initializer_range": 0.02,
  "intermediate_size": 8192,
  "max_position_embeddings": 131072,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 16,
  "num_key_value_heads": 8,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": {
    "factor": 32.0,
    "high_freq_factor": 4.0,
    "low_freq_factor": 1.0,
    "original_max_position_embeddings": 8192,
    "rope_type": "llama3"
  },
  "rope_theta": 500000.0,
  "tie_word_embeddings": true,
  "torch_dtype":


ANSWER:
Deciding whether to move to Scandinavia can be a complex decision, and it's essential to consider several factors before making a decision. Here are some pros and cons to help you weigh your options:

**Pros:**

1. **High standard of living**: Scandinavia is known for its excellent quality of life, with high standards of education, healthcare, and social welfare.
2. **Nature and environment**: The region is characterized by stunning natural landscapes, with many opportunities for outdoor activities like hiking, skiing, and fishing.
3. **Cultural diversity**: Scandinavia is home to a diverse population, with many different cultures and languages to


## Fine-tuning

In [9]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from datasets import load_dataset
import os 

root_dir = r"C:\Users\YT40432\Desktop\lp\research\lucaspecina\ai-basics"
models_dir = os.path.join(root_dir, "models", "sft-llama")  # separate components avoid the invalid "\s" escape

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the base model and tokenizer
model_id = "meta-llama/Llama-3.2-1B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,  # Must be float32 for MacBooks!
    # device_map="auto",
    device_map=device,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token



In [11]:
# Load the training dataset
dataset = load_dataset("csv", data_files="data/sarcasm.csv", split="train")

# Define a function to apply the chat template
def apply_chat_template(example):
    messages = [
        {"role": "user", "content": example['question']},
        {"role": "assistant", "content": example['answer']}
    ]
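    # The conversation already ends with the assistant's answer, so no
    # generation prompt (an empty assistant header) should be appended.
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=False
    )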
    return {"prompt": prompt}

# Apply the chat template function to the dataset
new_dataset = dataset.map(apply_chat_template)
new_dataset = new_dataset.train_test_split(0.05) # Let's keep 5% of the data for testing

# Tokenize the data
def tokenize_function(example):
    tokens = tokenizer(example['prompt'], padding="max_length", truncation=True, max_length=128)
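    # Set labels at padded positions to -100 so they are ignored in the loss.
    # Use the attention mask rather than comparing against pad_token_id:
    # pad_token was set to eos_token, so an id comparison would also mask
    # the real end-of-sequence token and the model would not learn to stop.
    tokens['labels'] = [
        token if mask == 1 else -100
        for token, mask in zip(tokens['input_ids'], tokens['attention_mask'])
    ]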
    return tokens

# Apply tokenize_function to each row
tokenized_dataset = new_dataset.map(tokenize_function)
tokenized_dataset = tokenized_dataset.remove_columns(['question', 'answer', 'prompt'])


Generating train split: 199 examples [00:00, 16582.89 examples/s]
Map: 100%|██████████| 199/199 [00:00<00:00, 7960.35 examples/s]
Map: 100%|██████████| 189/189 [00:00<00:00, 1500.00 examples/s]
Map: 100%|██████████| 10/10 [00:00<00:00, 497.98 examples/s]
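
Because `pad_token` is set to `eos_token` in this notebook, the way padding is masked in the labels matters. The sketch below uses plain Python lists with made-up token ids (9 stands in for a shared EOS/pad id, and the helper names are hypothetical) to compare masking by token id against masking by attention mask. It is only an illustration of the idea, not the tokenizer's real output.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def mask_by_token_id(input_ids, pad_token_id):
    """Mask every position whose id equals the pad id (also hits a real EOS)."""
    return [IGNORE_INDEX if t == pad_token_id else t for t in input_ids]


def mask_by_attention(input_ids, attention_mask):
    """Mask only positions the tokenizer marked as padding (mask == 0)."""
    return [t if m == 1 else IGNORE_INDEX for t, m in zip(input_ids, attention_mask)]


# Toy sequence: three content tokens, one real EOS (id 9), two padding tokens (id 9).
input_ids = [5, 6, 7, 9, 9, 9]
attention_mask = [1, 1, 1, 1, 0, 0]

by_id = mask_by_token_id(input_ids, pad_token_id=9)
by_mask = mask_by_attention(input_ids, attention_mask)

print(by_id)    # [5, 6, 7, -100, -100, -100]  -> the real EOS is lost from the labels
print(by_mask)  # [5, 6, 7, 9, -100, -100]     -> the real EOS is kept as a training target
```

With the id-based mask, the end-of-sequence token never contributes to the loss, which is one plausible reason a fine-tuned model keeps generating until `max_new_tokens` is reached.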


In [12]:
# Define training arguments
model.train()
training_args = TrainingArguments(
    output_dir=os.path.join(models_dir, "results"),
    eval_strategy="steps", # To evaluate during training
    eval_steps=40,
    logging_steps=40,
    save_steps=150,
    per_device_train_batch_size=2, # Adjust based on your hardware
    per_device_eval_batch_size=2,
    num_train_epochs=2, # How many times to loop through the dataset
    fp16=False, # Must be False for MacBooks
    report_to="none", # Here we can use something like tensorboard to see the training metrics
    log_level="info",
    learning_rate=1e-5, # Would avoid larger values here
    max_grad_norm=2 # Clip the gradient norm to limit the size of each update
)

# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer)



In [13]:
# Train the model
trainer.train()

# Save the model and tokenizer
trainer.save_model(os.path.join(models_dir, "fine-tuned-model"))
tokenizer.save_pretrained(os.path.join(models_dir, "fine-tuned-model"))

***** Running training *****
  Num examples = 189
  Num Epochs = 2
  Instantaneous batch size per device = 2
  Total train batch size (w. parallel, distributed & accumulation) = 2
  Gradient Accumulation steps = 1
  Total optimization steps = 190
  Number of trainable parameters = 1,235,814,400


Step,Training Loss,Validation Loss
40,1.5769,1.526454
80,1.218,1.091788
120,0.5849,1.071233
160,0.3829,1.060671



***** Running Evaluation *****
  Num examples = 10
  Batch size = 2

***** Running Evaluation *****
  Num examples = 10
  Batch size = 2

***** Running Evaluation *****
  Num examples = 10
  Batch size = 2
Saving model checkpoint to ./results\checkpoint-150
Configuration saved in ./results\checkpoint-150\config.json
Configuration saved in ./results\checkpoint-150\generation_config.json
Model weights saved in ./results\checkpoint-150\model.safetensors
tokenizer config file saved in ./results\checkpoint-150\tokenizer_config.json
Special tokens file saved in ./results\checkpoint-150\special_tokens_map.json

***** Running Evaluation *****
  Num examples = 10
  Batch size = 2
Saving model checkpoint to ./results\checkpoint-190
Configuration saved in ./results\checkpoint-190\config.json
Configuration saved in ./results\checkpoint-190\generation_config.json
Model weights saved in ./results\checkpoint-190\model.safetensors
tokenizer config file saved in ./results\checkpoint-190\tokenizer_conf

('./fine-tuned-model\\tokenizer_config.json',
 './fine-tuned-model\\special_tokens_map.json',
 './fine-tuned-model\\tokenizer.json')
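
The step counts in the training log above follow from the dataset size and the training arguments. A minimal sketch of the arithmetic, assuming a single device, no gradient accumulation (as the log reports), and that the last partial batch is kept; the variable names are illustrative only:

```python
import math

num_examples = 189   # "Num examples" in the training log
batch_size = 2       # per_device_train_batch_size
num_epochs = 2       # num_train_epochs
eval_steps = 40      # eval_steps

# 189 examples in batches of 2 gives 94 full batches plus one batch of size 1.
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs

# Steps at which an evaluation is triggered during training.
eval_points = list(range(eval_steps, total_steps + 1, eval_steps))

print(steps_per_epoch)  # 95
print(total_steps)      # 190
print(eval_points)      # [40, 80, 120, 160]
```

The result of 190 matches `Total optimization steps` in the log, and the four evaluation points match the rows of the loss table.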

## Test model

In [8]:
from transformers import pipeline
import os 
import torch 

root_dir = r"C:\Users\YT40432\Desktop\lp\research\lucaspecina\ai-basics"
models_dir = os.path.join(root_dir, "models", "sft-llama")  # separate components avoid the invalid "\s" escape

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model_id = os.path.join(models_dir, "fine-tuned-model")
pipe = pipeline(
    "text-generation",
    model=model_id,
    device_map="auto",
    # device_map=device,
)
messages = [
    {"role": "user", "content": "Should I move to Scandinavia?"},
]
outputs_sft = pipe(
    messages,
    max_new_tokens=1024
)
print('\nANSWER:')
print(outputs_sft[0]["generated_text"][-1]['content'])

Device set to use cuda:0



ANSWER:
Absolutely! The land of the midnight sun, where the people are all about sustainability and cozy sweaters. You'll love it here. And the food? It's all about the local, organic, gluten-free cuisine. Your taste buds will thank you. Plus, the people are all so... chill. You'll fit right in. Give it a moment or two, and you'll be hooked. Trust me. Scandinavia is the new Sweden. You'll love it here. Next, you should move to Iceland. It's basically the same idea, but with more geothermal hot springs. Worth a shot?


In [26]:
print(outputs_sft[0]["generated_text"][-1]['content'])

Absolutely! The land of the midnight sun, where the only traffic jam is at the grocery store. And the food? Absolutely delicious. You'll never go hungry. Plus, the people are all so friendly and welcoming. You'll love it here. Trust me. I've lived here for 5 years now and I'm still not tired of it. Give it a moment or two. You might just fall in love. Scandinavia: it's a real thing. You can see it in the way the lights flicker. It’s real. Scandinavia: it's real. Scandinavia: it's real. Scandinavia: it


In [1]:
print("Action:\r\n{{\r\n  \"name\": \"web_search\",\r\n  \"arguments\": {{\"query\": \"simple explanation of supervised learning\"}}\r\n}}\r\nObservation: \"Supervised learning is a type of machine learning where the algorithm is trained on labeled data to learn patterns and make predictions on new, unseen data.\"\r\n\r\nAction:\r\n{{\r\n  \"name\": \"web_search\",\r\n  \"arguments\": {{\"query\": \"supervised learning in simple terms\"}}\r\n}}\r\nObservation: \"Sup,")

Action:
{{
  "name": "web_search",
  "arguments": {{"query": "simple explanation of supervised learning"}}
}}
Observation: "Supervised learning is a type of machine learning where the algorithm is trained on labeled data to learn patterns and make predictions on new, unseen data."

Action:
{{
  "name": "web_search",
  "arguments": {{"query": "supervised learning in simple terms"}}
}}
Observation: "Sup,
