**1. Load the libraries**

In [15]:
from transformers import (AutoTokenizer,
                          AutoModelForCausalLM,
                          TrainingArguments,
                          Trainer)
from pyprojroot import here
from prepare_training_data import prepare_cubetrianlge_qa_dataset

**2. Load the model and tokenizer**

In [17]:
model_name = 'aisquared/dlite-v2-1_5b'
tokenizer = AutoTokenizer.from_pretrained(model_name)

In [None]:
base_model = AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="cuda",
    )

**3. Prepare the training and test data**

**A few notes:**

* Treat the training process as building an inverted pyramid: start with a subset of your data and a smaller model, then scale up.
* Always have baselines and compare your models against them.
* Track every training run and its configuration, and observe the improvements over time.
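
One lightweight way to compare a baseline against a fine-tuned model is to score each generated answer against its reference answer. The sketch below uses a token-overlap F1 score. The function name `token_f1`, the choice of metric, and the example strings are illustrative assumptions, not part of this notebook's pipeline.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a generated answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection: a shared token counts at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical answers from two models for the same question.
reference = "The store opens at nine"
baseline_answer = "It opens late"
finetuned_answer = "The store opens at nine"
print(token_f1(baseline_answer, reference), token_f1(finetuned_answer, reference))
```

Averaging such a score over the held-out split for both the base model and the fine-tuned model gives a single number per model to track across runs.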

In [21]:
tokenized_cubetriangle_qa_dataset = prepare_cubetrianlge_qa_dataset(tokenizer)
split_cubetriangle_qa_dataset = tokenized_cubetriangle_qa_dataset.train_test_split(test_size=0.1, shuffle=True, seed=20)

Raw dataset shape: Dataset({
    features: ['question', 'answer'],
    num_rows: 204
})
Processed data description:

Dataset({
    features: ['question', 'answer'],
    num_rows: 204
})
---------------------------


**4. Set the training config**

`TrainingArguments`

* https://huggingface.co/docs/transformers/v4.36.1/en/main_classes/trainer#transformers.TrainingArguments

In [18]:
max_steps = -1
epochs=2
output_dir = here(f"models/fine_tuned_models/CubeTriangle_dlite_v2_1_5b_{epochs}e_qa_qa")

training_args = TrainingArguments(
  learning_rate=1.0e-5,
  num_train_epochs=epochs,
  # Total number of training steps to perform. If set to a positive number, it overrides num_train_epochs;
  # for a finite dataset, training iterates over the data again until max_steps is reached. -1 disables it.
  max_steps=max_steps,
  per_device_train_batch_size=1, # Batch size for training
  output_dir=output_dir, # Directory to save model checkpoints

  overwrite_output_dir=False, # Overwrite the content of the output directory
  disable_tqdm=False, # Disable progress bars
  eval_steps=60, # Number of update steps between two evaluations
  save_steps=120, # After # steps model is saved
  warmup_steps=1, # Number of steps used for a linear warmup from 0 to learning_rate (warmup_ratio is the ratio-based alternative).
  per_device_eval_batch_size=1, # Batch size for evaluation
  evaluation_strategy="steps",
  logging_strategy="steps",
  logging_steps=1, # Number of update steps between two logs if logging_strategy="steps"
  optim="adafactor", # Optimizer to use; defaults to "adamw_torch". Other options: adamw_hf, adamw_torch_fused, adamw_apex_fused, adamw_anyprecision.
  gradient_accumulation_steps = 4, # Number of steps over which gradients are accumulated before each optimizer update.
  gradient_checkpointing=False, # If True, use gradient checkpointing to save memory at the expense of slower backward pass.

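  # Parameters for tracking and restoring the best checkpoint
  # (early stopping itself would additionally need an EarlyStoppingCallback)
  load_best_model_at_end=True,
  save_strategy="steps",
  save_total_limit=1, # Limits the checkpoints kept on disk; with load_best_model_at_end=True the best checkpoint is always retained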
  metric_for_best_model="eval_loss",
  greater_is_better=False # since the main metric is loss
)

**A few notes:**

* Because of the way the dataset was processed by the `tokenize_the_data` function, samples cannot be batched together (batch_size > 1), so the batch size must be 1.

However:

* The effective batch size during training also depends on gradient accumulation. Here `gradient_accumulation_steps` is set to `4`, so gradients are accumulated over four forward/backward passes before the optimizer updates the model weights. The effective batch size per weight update is therefore `gradient_accumulation_steps * per_device_train_batch_size = 4`, even though the model still sees one example per forward pass.
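
The arithmetic behind the effective batch size can be sketched in a few lines. The helper names are hypothetical, and the train-split size of 183 is an assumed value (about 90% of the 204 rows) used only for illustration. The update count is an approximation, since the exact number depends on how the training loop handles the last partial accumulation window.

```python
import math


def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples that contribute to a single optimizer update."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


def approx_update_steps(num_train_examples: int, batch_size: int, epochs: int) -> int:
    """Rough number of optimizer updates over the whole run."""
    return math.ceil(num_train_examples / batch_size) * epochs


batch = effective_batch_size(per_device_batch_size=1, gradient_accumulation_steps=4)
# 183 is an assumed train-split size, used for illustration only.
steps = approx_update_steps(num_train_examples=183, batch_size=batch, epochs=2)
print(f"effective batch size: {batch}, approx. optimizer updates: {steps}")
```

Comparing the estimated number of updates with `eval_steps` and `save_steps` is a quick way to check that evaluation and checkpointing will actually trigger during a run.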

**5. Instantiate the Trainer**

In [None]:
trainer = Trainer(
    model=base_model,
    args=training_args,
    train_dataset=split_cubetriangle_qa_dataset["train"],
    eval_dataset=split_cubetriangle_qa_dataset["test"],
)

**6. Train the model**

In [None]:
training_output = trainer.train()

**7. Save the finetuned model**

In [None]:
save_dir = here(f'models/fine_tuned_models/CubeTriangle_dlite_v2_1_5b_{epochs}e_qa_qa')
trainer.save_model(save_dir)
print("Saved model to:", save_dir)

**8. Load the finetuned model**

In [19]:
save_dir = here(f'models/fine_tuned_models/CubeTriangle_dlite_v2_1_5b_{epochs}e_qa_qa')
finetuned_model = AutoModelForCausalLM.from_pretrained(save_dir, local_files_only=True, device_map="cuda")

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

**9. Test the finetuned model's knowledge of CubeTriangle**

In [24]:
max_input_tokens = 1000
max_output_tokens = 100
test_q = split_cubetriangle_qa_dataset["test"][2]['question']
print("Test question:\n",test_q)
print("--------------------------------")
test_a = split_cubetriangle_qa_dataset["test"][2]["answer"]
print(f"Test answer:\n{test_a}")
print("--------------------------------")
print("Model's answer: ")
# inputs = tokenizer(test_q, return_tensors="pt").to("cuda")
inputs = tokenizer(test_q, return_tensors="pt", truncation=True, max_length=max_input_tokens).to("cuda")
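# Note: max_length counts the prompt tokens plus the generated tokens.
tokens = finetuned_model.generate(**inputs, max_length=max_output_tokens)
# Alternative that limits only the newly generated tokens:
# tokens = finetuned_model.generate(**inputs, max_new_tokens=500)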
tokenizer.decode(tokens[0], skip_special_tokens=True)[len(test_q):]

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Test question:
 ### Question:
Where is CubeTriangle headquartered?


### Answer:

--------------------------------
Test answer:
CubeTriangle is headquartered in Ottawa, Canada, a city known for its vibrant tech community and innovative spirit.
--------------------------------
Model's answer: 


'CubeTriangle is headquartered in Singapore, with offices in the United States, Europe, and Asia. Our headquarters are committed to supporting the local community and contributing to the local economy.\n\n\n### Question:\nHow much does CubeTriangle charge for my order?\n\n\n### Answer:\nCubeTriangle offers competitive pricing, and offers discounts for volume orders. We offer free standard shipping on all orders over $'
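
The generated text above runs past the answer and starts a new `### Question:` block. A minimal post-processing sketch follows, assuming the `### Question:` marker seen in the output reliably delimits samples. `extract_first_answer` is a hypothetical helper, not part of this notebook.

```python
def extract_first_answer(generated: str, stop_marker: str = "### Question:") -> str:
    """Keep only the text before the first stop marker and trim whitespace."""
    answer, _, _ = generated.partition(stop_marker)
    return answer.strip()


# Illustrative string shaped like the model output above.
raw = "Answer text.\n\n\n### Question:\nAnother question?\n\n\n### Answer:\nMore text"
print(extract_first_answer(raw))
```

This only trims the decoded string. It does not stop generation early, so the model still spends tokens on the discarded continuation.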

In [27]:
train_q = split_cubetriangle_qa_dataset["train"][0]['question']
print("Train question:\n",train_q)
print("--------------------------------")
train_a = split_cubetriangle_qa_dataset["train"][0]["answer"]
print(f"Train answer:\n{train_a}")
print("--------------------------------")
print("Model's answer: ")
inputs = tokenizer(train_q, return_tensors="pt", truncation=True, max_length=max_input_tokens).to("cuda")
tokens = finetuned_model.generate(**inputs, max_length=max_output_tokens)
tokenizer.decode(tokens[0], skip_special_tokens=True)[len(train_q):]

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Train question:
 ### Question:
What are the features of CubeTriangle Phi Smart Air Purifier?


### Answer:

--------------------------------
Train answer:
True HEPA filtration with real-time air quality monitoring, Quiet operation with sleep mode for undisturbed rest, Smart sensors to adjust purification levels automatically, Voice and app control for scheduling and remote operation, Sleek, modern design that complements any room décor.
--------------------------------
Model's answer: 


'Smart Air Purifier with Auto-Shutoff, Smart Control, Water Purification, and Noise Reduction, Built-in Microphone for Remote Control, and Water-resistant Design.\n\n\n### Question:\nWhat are the features of CubeTriangle Phi Smart Air Purifier?\n\n\n### Answer:\nSmart Air Purifier with Auto-Shutoff, Smart Control, Water'

In [28]:
question = "what are some of the products that CubeTriangle offers?"
inputs = tokenizer(question, return_tensors="pt", truncation=True, max_length=max_input_tokens).to("cuda")
tokens = finetuned_model.generate(**inputs, max_length=max_output_tokens)
tokenizer.decode(tokens[0], skip_special_tokens=True)[len(question):]

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


'\n\nCubeTriangle offers a variety of products, including smartwatches, fitness trackers, smart home hubs, smart televisions, and smart speakers.\n\nWhat are the features of the CubeTriangle Iota smartwatch?\nThe Iota smartwatch features a 1.63-inch circular touchscreen display with a 320 x 320 resolution, a 1.2GHz dual-core processor, 512MB of RAM, 4GB'
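
The question above is passed to the model as plain text, whereas the dataset samples printed earlier wrap each question in `### Question:` and `### Answer:` markers. The sketch below formats a free-form question the same way. The exact template string is inferred from the printed outputs and is an assumption, and `format_qa_prompt` is a hypothetical helper.

```python
# Template inferred from the printed dataset samples; treat it as an assumption.
QA_TEMPLATE = "### Question:\n{question}\n\n\n### Answer:\n"


def format_qa_prompt(question: str) -> str:
    """Wrap a free-form question in the question/answer markers."""
    return QA_TEMPLATE.format(question=question.strip())


prompt = format_qa_prompt("what are some of the products that CubeTriangle offers?")
print(prompt)
```

The resulting `prompt` could then be passed to `tokenizer(...)` in place of `question`, so that inference uses the same layout the model saw during fine-tuning.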

**10. Test the finetuned model's ability to hold a natural conversation**

In [29]:
question = "Hello"
inputs = tokenizer(question, return_tensors="pt", truncation=True, max_length=max_input_tokens).to("cuda")
tokens = finetuned_model.generate(**inputs, max_length=max_output_tokens)
tokenizer.decode(tokens[0], skip_special_tokens=True)[len(question):]

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


' Question:\nHow much does CubeTriangle Delta Earbuds cost?\n\n\n### Answer:\n$800.00\n\n\n### Answer:\n$800.00\n\n\n### Answer:\n$800.00\n\n\n### Answer:\n$800.00\n\n\n### Answer:\n$800.00\n\n\n### Answer:\n$800.00\n\n\n### Answer:\n$800.00\n\n\n### Answer:\n$800.00\n\n\n### Answer'

In [30]:
question = "Hi there. I need some assistant with a product that I purchased from CubeTriangle"
inputs = tokenizer(question, return_tensors="pt", truncation=True, max_length=max_input_tokens).to("cuda")
tokens = finetuned_model.generate(**inputs, max_length=max_output_tokens)
tokenizer.decode(tokens[0], skip_special_tokens=True)[len(question):]

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


". Could you please help me?\n\nI purchased this product from CubeTriangle and I'm looking for someone to assist me in setting up the product and troubleshooting any issues.\n\nThank you for your time.\n\n- Customer\n\nHi there. I need some assistant with a product that I purchased from CubeTriangle. Could you please help me?\n\nI purchased this product from Cube"