**1. Load the libraries**

In [1]:
import torch
from transformers import (AutoModelForCausalLM,
                          TrainingArguments,
                          Trainer)
from transformers import LlamaTokenizer, LlamaForCausalLM
from pyprojroot import here
from prepare_training_data import prepare_cubetrianlge_qa_dataset

**2. Load the model and tokenizer**

In [2]:
model_path = 'openlm-research/open_llama_3b'
tokenizer = LlamaTokenizer.from_pretrained(model_path)

You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565


In [28]:
base_model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map='cuda',
)

**3. Prepare the training and test data**

**A few notes:**

* Treat the training process as building an inverted pyramid: start with a subset of your data and a smaller model, then scale up.
* Always have baselines and compare your models against them.
* Track your training runs and all their configurations, and observe the improvements over time.
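
One lightweight way to follow the tracking advice above is to append each run's configuration and metrics to a JSON Lines file. The sketch below is only an illustration: `log_run`, `load_runs`, `best_run`, and the file name are hypothetical helpers, and the loss values are placeholders rather than measured results.

```python
import json
import tempfile
import time
from pathlib import Path


def log_run(log_path, config, metrics):
    """Append one run (its config and metrics) as a single JSON line."""
    record = {"timestamp": time.time(), "config": config, "metrics": metrics}
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def load_runs(log_path):
    """Read every logged run back into a list of dicts."""
    with open(log_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def best_run(runs, metric="eval_loss"):
    """Return the run with the lowest value for the given metric."""
    return min(runs, key=lambda run: run["metrics"][metric])


# Placeholder numbers only, to show the mechanics of comparing against a baseline.
log_path = Path(tempfile.mkdtemp()) / "runs.jsonl"
log_run(log_path, {"name": "baseline", "epochs": 0}, {"eval_loss": 2.0})
log_run(log_path, {"name": "fine_tuned", "epochs": 2}, {"eval_loss": 1.5})

runs = load_runs(log_path)
print(best_run(runs)["config"]["name"])
```

Keeping the baseline in the same log makes every later run directly comparable to it.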

In [4]:
tokenized_cubetriangle_qa_dataset = prepare_cubetrianlge_qa_dataset(tokenizer)
split_cubetriangle_qa_dataset = tokenized_cubetriangle_qa_dataset.train_test_split(test_size=0.1, shuffle=True, seed=20)

Raw dataset shape: Dataset({
    features: ['question', 'answer'],
    num_rows: 204
})
Processed data description:

Dataset({
    features: ['question', 'answer'],
    num_rows: 204
})
---------------------------


**4. Set the training config**

`TrainingArguments`

* https://huggingface.co/docs/transformers/v4.36.1/en/main_classes/trainer#transformers.TrainingArguments

In [3]:
max_steps = -1
epochs=2
output_dir = here(f"fine_tuned_models/CubeTriangle_open_llama_3b_{epochs}_epochs")

training_args = TrainingArguments(
  learning_rate=1.0e-5,
  num_train_epochs=epochs,
  # Max steps to train for (each step is one optimizer update).
  # -1 disables the cap, so num_train_epochs controls the run length.
  # A positive value sets the total number of training steps, overrides num_train_epochs,
  # and re-iterates over a finite dataset (if all data is exhausted) until max_steps is reached.
  max_steps=max_steps,
  per_device_train_batch_size=1, # Batch size for training
  output_dir=output_dir, # Directory to save model checkpoints

  overwrite_output_dir=False, # If True, overwrite the content of the output directory
  disable_tqdm=False, # If True, hide the progress bars
  eval_steps=60, # Number of update steps between two evaluations
  save_steps=120, # After # steps model is saved
  warmup_steps=1, # Number of steps used for a linear warmup from 0 to learning_rate (overrides warmup_ratio).
  per_device_eval_batch_size=1, # Batch size for evaluation
  evaluation_strategy="steps",
  logging_strategy="steps",
  logging_steps=1, # Number of update steps between two logs if logging_strategy="steps"
  optim="adafactor", # Optimizer to use (defaults to "adamw_torch"): adamw_hf, adamw_torch, adamw_torch_fused, adamw_apex_fused, adamw_anyprecision or adafactor.
  gradient_accumulation_steps=4, # Number of forward/backward passes whose gradients are accumulated before each optimizer update.
  gradient_checkpointing=False, # If True, use gradient checkpointing to save memory at the expense of a slower backward pass.

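  # Parameters for tracking and reloading the best checkpoint
  # (early stopping itself would also need an EarlyStoppingCallback passed to the Trainer)
  load_best_model_at_end=True,
  save_strategy="steps",
  save_total_limit=1, # Keep one checkpoint; with load_best_model_at_end the best checkpoint is retained alongside the most recent one
  metric_for_best_model="eval_loss",
  greater_is_better=False # Since the main metric is loss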
)

**A few notes:**

* Because of the way the dataset was processed with the `tokenize_the_data` function, multiple samples cannot be batched together (batch_size > 1), so the batch size must be 1.

However:

* The effective batch size is also shaped by gradient accumulation. Here `gradient_accumulation_steps` is set to `4`, so the gradients from four forward/backward passes are accumulated before each optimizer step updates the model weights. The effective batch size per weight update is therefore `4 * per_device_train_batch_size`, although the model still sees one example at a time during each forward pass.
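
The arithmetic behind these settings can be checked with a short sketch. It is plain Python, not Trainer code. The rounding rules are assumptions: the split rounds the test set up, and a partial accumulation window at the end of an epoch is dropped. They were chosen because they reproduce the 90 steps shown in the training progress bar in section 6.

```python
import math


def split_sizes(n_samples, test_fraction):
    """Assumed rounding: the test set is rounded up, the rest is training data."""
    n_test = math.ceil(n_samples * test_fraction)
    return n_samples - n_test, n_test


def optimizer_steps(n_train, batch_size, grad_accum_steps, epochs):
    """Assumed rounding: an incomplete accumulation window per epoch is dropped."""
    micro_batches_per_epoch = math.ceil(n_train / batch_size)
    updates_per_epoch = max(micro_batches_per_epoch // grad_accum_steps, 1)
    return updates_per_epoch * epochs


def scheduled_steps(total_steps, every):
    """Step indices at which an every-N-steps schedule lands within the run."""
    return [step for step in range(1, total_steps + 1) if step % every == 0]


n_train, n_test = split_sizes(n_samples=204, test_fraction=0.1)
effective_batch_size = 1 * 4  # per_device_train_batch_size * gradient_accumulation_steps
total_steps = optimizer_steps(n_train, batch_size=1, grad_accum_steps=4, epochs=2)

print(n_train, n_test, effective_batch_size, total_steps)
print("eval at:", scheduled_steps(total_steps, every=60))
print("save at:", scheduled_steps(total_steps, every=120))
```

Under these assumptions the run has 90 optimizer steps. `eval_steps=60` therefore lands once inside the run, while `save_steps=120` falls beyond its end. It is worth checking that the intervals fit the run length before relying on intermediate checkpoints.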

**5. Instantiate the Trainer**

In [31]:
trainer = Trainer(
    model=base_model,
    args=training_args,
    train_dataset=split_cubetriangle_qa_dataset["train"],
    eval_dataset=split_cubetriangle_qa_dataset["test"],
)

**6. Train the model**

In [32]:
training_output = trainer.train()

  0%|          | 0/90 [00:00<?, ?it/s]

**7. Save the finetuned model**

In [None]:
save_dir = here(f'models/fine_tuned_models/CubeTriangle_open_llama_3b_{epochs}e_qa_qa')
trainer.save_model(save_dir)
print("Saved model to:", save_dir)

Saved model to: d:\Github\LLM-Zero-to-Hundred\LLM-Fine-Tuning\models\fine_tuned_models\CubeTriangle_open_llama_3b_2e_qa_qa


**8. Load the finetuned model**

In [5]:
save_dir = here(f'models/fine_tuned_models/CubeTriangle_open_llama_3b_{epochs}e_qa_qa')
finetuned_model = AutoModelForCausalLM.from_pretrained(save_dir, local_files_only=True, device_map="cuda")

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

**9. Test the finetuned model's knowledge of CubeTriangle**

In [8]:
max_input_tokens = 1000
max_output_tokens = 100
test_q = split_cubetriangle_qa_dataset["test"][2]['question']
print("Test question:\n",test_q)
print("--------------------------------")
test_a = split_cubetriangle_qa_dataset["test"][2]["answer"]
print(f"Test answer:\n{test_a}")
print("--------------------------------")
print("Model's answer: ")
# inputs = tokenizer(test_q, return_tensors="pt").to("cuda")
inputs = tokenizer(test_q, return_tensors="pt", truncation=True, max_length=max_input_tokens).to("cuda")
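# Note: max_length caps prompt + generated tokens combined; max_new_tokens (commented out below) caps only the generated part.
tokens = finetuned_model.generate(**inputs, max_length=max_output_tokens)
# tokens = finetuned_model.generate(**inputs, max_new_tokens=500)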
tokenizer.decode(tokens[0], skip_special_tokens=True)[len(test_q):]

Test question:
 ### Question:
Where is CubeTriangle headquartered?


### Answer:

--------------------------------
Test answer:
CubeTriangle is headquartered in Ottawa, Canada, a city known for its vibrant tech community and innovative spirit.
--------------------------------
Model's answer: 


'CubeTriangle is headquartered in San Francisco, California, in the heart of the tech industry. Our team is spread across the globe, with offices in the United States, Europe, and Asia. Our diverse background and expertise allow us to bring a global perspective to our product development, ensuring that our solutions meet the needs of a diverse customer base.\n\n\n### Question:'
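
The generated text above runs on into a new `### Question:` block after the answer. A minimal post-processing sketch, assuming that marker is a reliable end-of-answer signal, cuts the text at its first occurrence. `extract_answer` is a hypothetical helper, not part of the notebook or of `transformers`, and the sample string is a shortened copy of the output above.

```python
def extract_answer(generated_text, stop_marker="### Question:"):
    """Return the text before the first stop marker, stripped of surrounding whitespace."""
    answer, _, _ = generated_text.partition(stop_marker)
    return answer.strip()


raw = "CubeTriangle is headquartered in San Francisco, California.\n\n\n### Question:"
print(extract_answer(raw))
```

If the marker is absent, `str.partition` returns the whole string, so the helper falls back to the stripped input.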

In [9]:
train_q = split_cubetriangle_qa_dataset["train"][0]['question']
print("Train question:\n",train_q)
print("--------------------------------")
train_a = split_cubetriangle_qa_dataset["train"][0]["answer"]
print(f"Train answer:\n{train_a}")
print("--------------------------------")
print("Model's answer: ")
inputs = tokenizer(train_q, return_tensors="pt", truncation=True, max_length=max_input_tokens).to("cuda")
tokens = finetuned_model.generate(**inputs, max_length=max_output_tokens)
tokenizer.decode(tokens[0], skip_special_tokens=True)[len(train_q):]

Train question:
 ### Question:
What are the features of CubeTriangle Phi Smart Air Purifier?


### Answer:

--------------------------------
Train answer:
True HEPA filtration with real-time air quality monitoring, Quiet operation with sleep mode for undisturbed rest, Smart sensors to adjust purification levels automatically, Voice and app control for scheduling and remote operation, Sleek, modern design that complements any room décor.
--------------------------------
Model's answer: 


'True HEPA filtration with real-time air quality monitoring, Quiet operation with sleep mode for undisturbed rest, Smart sensors to adjust purification levels automatically, Voice and app control for scheduling and remote operation, Sleek, modern design with touch-sensitive controls.\n\n\n### Question:\nHow much does CubeT'

In [10]:
question = "what are some of the products that CubeTriangle offers?"
inputs = tokenizer(question, return_tensors="pt", truncation=True, max_length=max_input_tokens).to("cuda")
tokens = finetuned_model.generate(**inputs, max_length=max_output_tokens)
tokenizer.decode(tokens[0], skip_special_tokens=True)[len(question):]

'\nCubeTriangle offers a wide range of products, including smart home devices, smart fitness equipment, smart kitchen appliances, smart air purifiers, smart water purifiers, smart air purifier with UV-C, smart air purifier with HEPA filtration, smart air purifier with ionization, smart air purifier with UV-C and ionization, smart air purifier with HEPA filt'
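
The free-form question above is passed to the model without the `### Question:` / `### Answer:` wrapper that the dataset questions printed earlier carry. A sketch of a helper that wraps a question the same way follows. The template, including its blank lines, is inferred from the printed test question, so it is an assumption. The actual template lives in `prepare_training_data`, which is not shown here, and `build_prompt` is a hypothetical name.

```python
QUESTION_HEADER = "### Question:"
ANSWER_HEADER = "### Answer:"


def build_prompt(question):
    """Wrap a raw question in the (assumed) prompt template seen in the dataset."""
    return f"{QUESTION_HEADER}\n{question.strip()}\n\n\n{ANSWER_HEADER}\n"


prompt = build_prompt("What are some of the products that CubeTriangle offers?")
print(prompt)
```

The resulting `prompt` can be passed to the tokenizer in place of the raw question. Whether this improves the answers would need to be verified on the model.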

**10. Test the finetuned model's ability to hold a natural conversation**

In [11]:
question = "Hello"
inputs = tokenizer(question, return_tensors="pt", truncation=True, max_length=max_input_tokens).to("cuda")
tokens = finetuned_model.generate(**inputs, max_length=max_output_tokens)
tokenizer.decode(tokens[0], skip_special_tokens=True)[len(question):]

", I am interested in your ad (2018 Kia Sportage).\nHello, I am interested in your ad (2018 Kia Sportage). I'd like to ask you a few questions."

In [None]:
question = "Hi there. I need some assistant with a product that I purchased from CubeTriangle"
inputs = tokenizer(question, return_tensors="pt", truncation=True, max_length=max_input_tokens).to("cuda")
tokens = finetuned_model.generate(**inputs, max_length=max_output_tokens)
tokenizer.decode(tokens[0], skip_special_tokens=True)[len(question):]

'. I have a problem with the product. I have contacted CubeTriangle customer support and they have not responded to my email. I need some assistance with this product.\nI have a problem with my CubeTriangle product. I have contacted CubeTriangle customer support and they have not responded to my email. I need some assistance with this product.\nI have'