<a href="https://colab.research.google.com/github/aelkhodary/AI_ML_Coders/blob/main/Fine_Tuning_With_Llama3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [70]:
!pip install torchao -q
!pip3 install torchtune -q
# Install the datasets library
!pip install datasets -q
# Install the huggingface_hub library
!pip install huggingface_hub -q



In [71]:
import os
os.environ['PATH'] += ':/usr/local/bin'

In [72]:
!which tune
!tune ls

/usr/local/bin/tune
RECIPE                                   CONFIG                                  
full_finetune_single_device              llama2/7B_full_low_memory               
                                         code_llama2/7B_full_low_memory          
                                         llama3/8B_full_single_device            
                                         llama3_1/8B_full_single_device          
                                         llama3_2/1B_full_single_device          
                                         llama3_2/3B_full_single_device          
                                         mistral/7B_full_low_memory              
                                         phi3/mini_full_low_memory               
                                         qwen2/7B_full_single_device             
                                         qwen2/0.5B_full_single_device           
                                         qwen2/1.5B_full_single_device        

**Exercise**:

Listing TorchTune recipes
You are working on a text classification project to fine-tune a 1 billion parameter Llama 3.2 model. Given your hardware constraints, you need to choose a single device configuration, with full fine-tuning. To list all available options, you can use the console and run !tune ls.

Which recipe would be compatible with your model based on the constraints?

**Instructions**:

Possible answers



full_finetune_distributed with the llama3_2/1B_full configuration

full_finetune_single_device with the llama3_1/8B_full_single_device configuration

>>> full_finetune_single_device with the llama3_2/1B_full_single_device configuration

**Running a TorchTune task**:

Having listed your choices with !tune ls, you are now ready to launch your fine-tuning task with the recipe and configuration for your Llama 3.2, 1B model on single device (full_finetune_single_device with llama3_2/1B_full_single_device).

Which of the following commands will you use to run the task for 20 epochs, on a single device GPU?


**Possible Answers**


!torchtune run full_finetune_single_device --config llama3_2/1B_full_single_device device=cpu epochs=0


>> !tune run full_finetune_single_device --config llama3_2/1B_full_single_device device=cuda epochs=20


!tune run full_finetune_single_device --config llama3_2/1B_full_single_device device=gpu epochs=20

**Exercise**:

Filtering datasets for evaluation
You are building a training and evaluation pipeline for your company's health care chatbot, which is used by hospitals to onboard new patients.

Your task is to create a pipeline to load the MedQuad-MedicalQnADataset to evaluate an LLM on its ability to answer medical questions. You are asked to load the dataset in the ds variable, and only include the first 500 samples of the train split of the dataset stored in dataset_name as your evaluation set.

**Instructions**:

Import necessary functions and classes from datasets.
Load the dataset in the ds variable.
Manipulate ds to include the first 500 samples of the train split of the dataset stored in dataset_name as your evaluation set.

In [73]:
# Load neccesary imports from library
from datasets import load_dataset, Dataset
from google.colab import userdata
from huggingface_hub import login
from pprint import pprint

# Retrieve the Hugging Face API key from Colab Secrets
hf_api_key = userdata.get('HF_TOKEN')
# Log in to Hugging Face
login(token=hf_api_key)

dataset_name = "keivalya/MedQuad-MedicalQnADataset"
# Load the training split of the dataset
ds = load_dataset(dataset_name, split='train')

# Filter for the first 500 samples of the dataset
filtered_ds = Dataset.from_dict(ds[:500])
print(filtered_ds.shape)
print(filtered_ds.column_names)
pprint(filtered_ds[0])

README.md:   0%|          | 0.00/233 [00:00<?, ?B/s]

medDataset_processed.csv:   0%|          | 0.00/22.5M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/16407 [00:00<?, ? examples/s]

(500, 3)
['qtype', 'Question', 'Answer']
{'Answer': 'LCMV infections can occur after exposure to fresh urine, '
           'droppings, saliva, or nesting materials from infected rodents.  '
           'Transmission may also occur when these materials are directly '
           'introduced into broken skin, the nose, the eyes, or the mouth, or '
           'presumably, via the bite of an infected rodent. Person-to-person '
           'transmission has not been reported, with the exception of vertical '
           'transmission from infected mother to fetus, and rarely, through '
           'organ transplantation.',
 'Question': 'Who is at risk for Lymphocytic Choriomeningitis (LCM)? ?',
 'qtype': 'susceptibility'}


**Exercise**:

Creating training samples
As part of a customer service chatbot that your team is building, you are creating a pipeline to preprocess a dataset that will eventually be used to fine-tune a language model so that it can predict the intent of a customer's question and route the requests to the correct team for processing.

You are given a dataset with the customer's question and intent in separate columns, and you want to preprocess the dataset so that you have merged each example containing the question and intent into a single string with your formatted prompt.

The dataset is already loaded in dataset. The dataset contains the columns instruction with the customer question, and intent for the user's intent.

**Instructions**:

Create a prompt string with the instruction and intent in the form "Query: {instruction}\nIntent: {intent}".
Fill out the function call with the dataset to apply the create_intent_example to each row.
Extract and print out the value in the intent_example column in the first row of the dataset.

In [74]:
# Load neccesary imports from library
from datasets import load_dataset, Dataset
from google.colab import userdata
from huggingface_hub import login
from pprint import pprint

# Retrieve the Hugging Face API key from Colab Secrets
hf_api_key = userdata.get('HF_TOKEN')
# Log in to Hugging Face
login(token=hf_api_key)

dataset_name = "bitext/Bitext-customer-support-llm-chatbot-training-dataset"
# Load the training split of the dataset
ds = load_dataset(dataset_name, split='train')

print(ds.shape)
print(ds.column_names)
pprint(ds[0])

def create_intent_example(row):
    # Fill out the columns in the prompt
    row['intent_example'] = f"Query: {row['instruction']}\nIntent: {row['intent']}"
    return row

def create_conversation_example(row):
    # Fill out the columns in the prompt
    row['conversation_example'] = f"Query: {row['instruction']}\nResponse: {row['response']}"
    return row

# Call the ds method to apply our preprocessing function to all rows
processed_dataset = ds.map(create_intent_example)
processed_dataset = processed_dataset.map(create_conversation_example)

# Print the intent_example in the first row of the processed data
print(processed_dataset[0]['intent_example'])


(26872, 5)
['flags', 'instruction', 'category', 'intent', 'response']
{'category': 'ORDER',
 'flags': 'B',
 'instruction': 'question about cancelling order {{Order Number}}',
 'intent': 'cancel_order',
 'response': "I've understood you have a question regarding canceling order "
             "{{Order Number}}, and I'm here to provide you with the "
             'information you need. Please go ahead and ask your question, and '
             "I'll do my best to assist you."}
Query: question about cancelling order {{Order Number}}
Intent: cancel_order


**Exercise**:

Saving preprocessed datasets
As part of your customer service chatbot project, you now have prepared a dataset for fine-tuning a Llama model. The next step is to save the dataset so that you can reload it later without having to repeat the preprocessing steps. This will allow your team to reuse the dataset across multiple experiments and iterations.

**Instructions**:


Save the preprocessed dataset ds to disk.
Load the saved dataset into a new variable ds_preprocessed.
Print the first element of ds_preprocessed.

In [75]:
from datasets import load_from_disk
from pprint import pprint

# Save the dataset to disk
processed_dataset.save_to_disk("/content/processed_dataset")

# Load the dataset from disk
ds_preprocessed = load_from_disk("/content/processed_dataset")

# Print the first element of the loaded dataset
print(ds_preprocessed.shape)
print(ds_preprocessed.column_names)
pprint(ds_preprocessed[0])

Saving the dataset (0/1 shards):   0%|          | 0/26872 [00:00<?, ? examples/s]

(26872, 7)
['flags', 'instruction', 'category', 'intent', 'response', 'intent_example', 'conversation_example']
{'category': 'ORDER',
 'conversation_example': 'Query: question about cancelling order {{Order '
                         'Number}}\n'
                         "Response: I've understood you have a question "
                         "regarding canceling order {{Order Number}}, and I'm "
                         'here to provide you with the information you need. '
                         "Please go ahead and ask your question, and I'll do "
                         'my best to assist you.',
 'flags': 'B',
 'instruction': 'question about cancelling order {{Order Number}}',
 'intent': 'cancel_order',
 'intent_example': 'Query: question about cancelling order {{Order Number}}\n'
                   'Intent: cancel_order',
 'response': "I've understood you have a question regarding canceling order "
             "{{Order Number}}, and I'm here to provide you with the "
         

**Exercise**:

Defining custom recipes
You're fine-tuning a pre-trained Llama model for a customer who requires specific configurations. Your plan is to use TorchTune for fine-tuning, and so need to prepare a Python dictionary that you can use to store the requirements for the custom recipe you'll use to run the fine-tuning job.

**Instructions**:

Specify the customer requirements in your dictionary: first, add the torchtune.models.llama3_2.llama3_2_1b model.
Add a batch size of 8 and a GPU device.

Instructions
100 XP
Specify the new model requirement, the "torchtune.models.llama3_2.llama3_2_3b" model, in your dictionary.
Save the requirements as a YAML file named custom_recipe.yaml.

In [82]:
!pip install --upgrade torchtune
!pip show torchtune

Name: torchtune
Version: 0.5.0
Summary: A native-PyTorch library for LLM fine-tuning
Home-page: 
Author: 
Author-email: PyTorch Team <packages@pytorch.org>
License: BSD 3-Clause License

Copyright 2024 Meta

Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice,this list
of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this
list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors may
be used to endorse or promote products derived from this software without specific
prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUD

In [79]:
import yaml

# config_dict = {
#     # Define the model
#     "model": {"_component_": "torchtune.models.llama3_2.llama3_2_3b"},

#     # Define the batch size
#     "batch_size": 8,

#     # Define the device type
#     "device": "cuda",

#     # Define the number of epochs
#     "epochs": 1,

#     # Define the optimizer
#     "optimizer": {"_component_": "bitsandbytes.optim.PagedAdamW8bit", "lr": 3e-05},

#     # Define the dataset
#     "dataset": {"_component_": "processed_dataset"},

#     # Define the output directory
#     "output_dir": "finetune_results",

#     # Define dtype
#     "dtype": "bf16",
#     # Define path to an actual checkpoint
#     "resume_from_checkpoint": "",

#     "gradient_accumulation_steps": 1
# }


config_dict = {
  "device": "cuda",
  "dtype": "bf16",
  "epochs": 1,
  "batch_size": 8,
  "shuffle": "true",
  "resume_from_checkpoint": "",
  "gradient_accumulation_steps": 1,
  "optimizer_in_bwd": "false",
  "seed": 42,
  "checkpointer": {
    "_component_": "torchtune.training.Checkpointer",
    "model_type": "default",
    "output_dir": "finetune_results",
    "frequency": "epoch",
    "keep_last": "true"
  },
  "metric_logger":{
             "_component_": "torchtune.logging.ConsoleLogger"

  },
  "log_every_n_steps": 10,
  "log_peak_memory_stats": "false",
  "model": {
    "_component_": "torchtune.models.llama3_2.llama3_2_3b"
  },
  "optimizer": {
    "_component_": "bitsandbytes.optim.PagedAdamW8bit",
    "lr": 3.0e-05
  },
  "tokenizer": {
    "_component_": "torchtune.text_tokenizers.llama_tokenizer",
    "pretrained_tokenizer": "/path/to/tokenizer",
    "special_tokens": []
  },
  "loss": {
    "_component_": "torchtune.modules.cross_entropy_loss",
    "ignore_index": -100,
    "reduction": "mean"
  },
  "dataset": {
    "_component_": "processed_dataset"
  },
  "output_dir": "finetune_results",
  "enable_activation_checkpointing": "false",
  "enable_activation_offloading": "false",
  "compile": "false",
  "max_steps_per_epoch": "null"
}

# Save the updated configuration to a new YAML file
with open("/content/custom_recipe.yaml", "w") as yaml_file:
    yaml.dump(config_dict, yaml_file)


**Running custom fine-tuning**

tune run --config custom_recipe.yaml

In [80]:
!ls
!cat custom_recipe.yaml
# !tune run --help
# usage: tune run [TORCHRUN-OPTIONS] <recipe> --config <config> [RECIPE-OPTIONS]

3B_full_single_device.py  full_finetune_single_device.py  sample_data
custom_recipe.yaml	  processed_dataset
batch_size: 8
checkpointer:
  _component_: torchtune.training.Checkpointer
  frequency: epoch
  keep_last: 'true'
  model_type: default
  output_dir: finetune_results
compile: 'false'
dataset:
  _component_: processed_dataset
device: cuda
dtype: bf16
enable_activation_checkpointing: 'false'
enable_activation_offloading: 'false'
epochs: 1
gradient_accumulation_steps: 1
log_every_n_steps: 10
log_peak_memory_stats: 'false'
loss:
  _component_: torchtune.modules.cross_entropy_loss
  ignore_index: -100
  reduction: mean
max_steps_per_epoch: 'null'
metric_logger:
  _component_: torchtune.logging.ConsoleLogger
model:
  _component_: torchtune.models.llama3_2.llama3_2_3b
optimizer:
  _component_: bitsandbytes.optim.PagedAdamW8bit
  lr: 3.0e-05
optimizer_in_bwd: 'false'
output_dir: finetune_results
resume_from_checkpoint: ''
seed: 42
shuffle: 'true'
tokenizer:
  _component_: torchtune.tex

In [None]:
!tune cp full_finetune_single_device  full_finetune_single_device

Copied file to full_finetune_single_device.py


In [81]:
!tune run full_finetune_single_device --config custom_recipe.yaml

INFO:torchtune.utils._logging:Running FullFinetuneRecipeSingleDevice with resolved config:

batch_size: 8
checkpointer:
  _component_: torchtune.training.Checkpointer
  frequency: epoch
  keep_last: 'true'
  model_type: default
  output_dir: finetune_results
compile: 'false'
dataset:
  _component_: processed_dataset
device: cuda
dtype: bf16
enable_activation_checkpointing: 'false'
enable_activation_offloading: 'false'
epochs: 1
gradient_accumulation_steps: 1
log_every_n_steps: 10
log_peak_memory_stats: 'false'
loss:
  _component_: torchtune.modules.cross_entropy_loss
  ignore_index: -100
  reduction: mean
max_steps_per_epoch: 'null'
metric_logger:
  _component_: torchtune.logging.ConsoleLogger
model:
  _component_: torchtune.models.llama3_2.llama3_2_3b
optimizer:
  _component_: bitsandbytes.optim.PagedAdamW8bit
  lr: 3.0e-05
optimizer_in_bwd: 'false'
output_dir: finetune_results
resume_from_checkpoint: ''
seed: 42
shuffle: 'true'
tokenizer:
  _component_: torchtune.text_tokenizers.llam