<h1 style = "color:green ; text-shadow: 0px 0px 5px green">Installing Required Packages</h1>

In [1]:
# Installing Required Packages
! pip install -q peft accelerate bitsandbytes transformers datasets GPUtil sentencepiece sentence-transformers
print("All packages are installed successfully")

All packages are installed successfully


<h1 style = "color:green ; text-shadow: 0px 0px 5px green">Importing All Required Packages</h1>

In [56]:
# Importing Required Packages
import os
import torch
import GPUtil
import transformers
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, LlamaTokenizer
from huggingface_hub import notebook_login
from datasets import load_dataset
from peft import prepare_model_for_kbit_training, LoraConfig, get_peft_model, PeftModel
import warnings
warnings.filterwarnings('ignore')
print("All imports are made correctly")

All imports are made correctly


<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">Setting the GPU for the Project</h1>

In [5]:
# Selecting the GPU and checking that it is visible (e.g. in Google Colab)
# The environment variables must be set before the first CUDA call to take effect
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

GPUtil.showUtilization()

if torch.cuda.is_available():
    print("GPU is available")
else:
    print("GPU is not available, using CPU instead")

# if "COLAB_GPU" in os.environ:
#   from google.colab import output
#   output.enable_custom_widget_manager()

| ID | GPU | MEM |
------------------
|  0 |  0% |  0% |
GPU is available


In [4]:
# # Hugging face Login
# if "COLAB_GPU" in os.environ:
#   !huggingface-cli login
# else:
#   notebook_login()

<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">Setting The Base Llama Model with 4-bit Quantization</h1>

In [7]:
# Define the model ID and set the BitsAndBytes config
base_model_id = "meta-llama/Llama-2-7b-chat-hf"

bnb_config = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_use_double_quant = True,
    bnb_4bit_quant_type = "nf4",
    bnb_4bit_compute_dtype = torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(base_model_id, quantization_config = bnb_config)

print(type(model))
print(model.__class__.__name__)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

<class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
LlamaForCausalLM
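
As a rough sanity check on what NF4 with double quantization saves, the weight memory can be estimated by hand. The sketch below assumes about 6.74 billion parameters for the 7B model, a quantization block size of 64 with 32-bit scale constants, and a second-level block size of 256 with 8-bit first-level constants (the figures reported for QLoRA; the defaults of the installed bitsandbytes version may differ). It also treats every parameter as quantized, which slightly overstates the saving because some layers stay in higher precision.

```python
# Back-of-the-envelope weight-memory estimate for a 4-bit quantized model.
# All constants are assumptions for illustration, not values read from the model.

N_PARAMS = 6.74e9   # approximate parameter count of a 7B Llama-2 model (assumed)
BLOCK = 64          # weights per quantization block (assumed)
BLOCK2 = 256        # first-level constants per second-level block (assumed)


def bits_per_param(double_quant: bool) -> float:
    """Return storage bits per weight, including quantization constants."""
    if double_quant:
        # 8-bit first-level constants plus 32-bit second-level constants
        overhead = 8 / BLOCK + 32 / (BLOCK * BLOCK2)
    else:
        # one 32-bit constant per block of weights
        overhead = 32 / BLOCK
    return 4 + overhead


def gigabytes(n_params: float, bits: float) -> float:
    """Convert a parameter count and bits per parameter into GB (10**9 bytes)."""
    return n_params * bits / 8 / 1e9


fp16_gb = gigabytes(N_PARAMS, 16)
nf4_gb = gigabytes(N_PARAMS, bits_per_param(double_quant=False))
nf4_dq_gb = gigabytes(N_PARAMS, bits_per_param(double_quant=True))

print(f"fp16:                 {fp16_gb:.2f} GB")
print(f"nf4:                  {nf4_gb:.2f} GB")
print(f"nf4 + double quant.:  {nf4_dq_gb:.2f} GB")
```

Under these assumptions the 4-bit weights need roughly a quarter of the fp16 footprint, and double quantization trims the per-weight overhead from 0.5 bits to about 0.127 bits.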


<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">Loading the Train Dataset Using the datasets load_dataset() Function</h1>

In [13]:
# Getting the training data
train_dataset = load_dataset("text", 
                             data_files = {
                                 "train":[
                                     r"C:\Users\Webbies\Jupyter_Notebooks\SLMs\Berger\BergerData\BergerWebContent _Company_Profile.txt",
                                     r"C:\Users\Webbies\Jupyter_Notebooks\SLMs\Berger\BergerData\BergerWebContent_Certification.txt",
                                     r"C:\Users\Webbies\Jupyter_Notebooks\SLMs\Berger\BergerData\BergerWebContent_CSR.txt",
                                     r"C:\Users\Webbies\Jupyter_Notebooks\SLMs\Berger\BergerData\BergerWebContent_Events.txt",
                                     r"C:\Users\Webbies\Jupyter_Notebooks\SLMs\Berger\BergerData\BergerWebContent_Expansion.txt",
                                     r"C:\Users\Webbies\Jupyter_Notebooks\SLMs\Berger\BergerData\BergerWebContent_Home.txt",
                                     r"C:\Users\Webbies\Jupyter_Notebooks\SLMs\Berger\BergerData\BergerWebContent_Policies.txt",
                                     r"C:\Users\Webbies\Jupyter_Notebooks\SLMs\Berger\BergerData\BergerWebContent_Sustainability.txt",
                                     r"C:\Users\Webbies\Jupyter_Notebooks\SLMs\Berger\BergerData\BergerWebContent_Team.txt"   
                                 ]
                             }, split = "train")

print("Train data extracted")

Generating train split: 0 examples [00:00, ? examples/s]

Train data extracted


<h1 style = "color:green; text-shadow:0px 0px 5px green;">Checking Some Text in the Train Dataset</h1>

In [23]:
# Checking the training data
train_dataset["text"][1000]

'Policy on Materiality of Related Party Transactions and on dealing with Related Party Transactions'

<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">Loading the Llama Tokenizer with Special Tokens Enabled</h1>

In [25]:
# Loading the Llama Tokenizer
tokenizer = LlamaTokenizer.from_pretrained(base_model_id, use_fast = False, trust_remote_code = True, add_eos_token = True)

if tokenizer.pad_token is None:
  tokenizer.add_special_tokens({'pad_token': tokenizer.eos_token})

<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">Converting the Train dataset's text into Tokenized Form</h1>

In [27]:
# Tokenizing the train data
tokenized_train_dataset = []
for phrase in train_dataset:
  tokenized_train_dataset.append(tokenizer(phrase["text"]))

print("Tokenized the train data")

Tokenized the train data


<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">Checking the Tokenized Dataset</h1>

In [31]:
# Checking the tokenized train data
print(tokenized_train_dataset[1])
print("--------------------------------------------------------")
print(tokenized_train_dataset[2])
print("--------------------------------------------------------")
tokenizer.eos_token

{'input_ids': [1, 2292, 914, 4522, 29877, 2], 'attention_mask': [1, 1, 1, 1, 1, 1]}
--------------------------------------------------------
{'input_ids': [1, 14657, 349, 475, 1259, 2], 'attention_mask': [1, 1, 1, 1, 1, 1]}
--------------------------------------------------------


'</s>'
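
Tokenized lines generally differ in length, so they have to be padded before they can be batched. The sketch below imitates, in plain Python, what a causal-LM collator such as `DataCollatorForLanguageModeling` with `mlm = False` does: pad to the longest sequence and copy `input_ids` into `labels`, replacing every position that holds the pad id with `-100` so the loss ignores it. The helper name `collate_causal_lm`, the right-side padding, and the shorter second example are assumptions for illustration. Because the pad token here is the EOS token (id 2), the genuine end-of-sequence label is masked as well, which is worth keeping in mind if the fine-tuned model does not stop cleanly.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def collate_causal_lm(examples, pad_token_id):
    """Pad a list of {'input_ids': [...]} dicts and build causal-LM labels."""
    max_len = max(len(ex["input_ids"]) for ex in examples)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for ex in examples:
        ids = ex["input_ids"]
        n_pad = max_len - len(ids)
        padded = ids + [pad_token_id] * n_pad  # assumption: pad on the right
        batch["input_ids"].append(padded)
        batch["attention_mask"].append([1] * len(ids) + [0] * n_pad)
        # Every position holding the pad id is ignored by the loss, including
        # a genuine EOS token when pad_token_id == eos_token_id.
        batch["labels"].append(
            [IGNORE_INDEX if tok == pad_token_id else tok for tok in padded]
        )
    return batch


examples = [
    {"input_ids": [1, 2292, 914, 4522, 29877, 2]},  # taken from the output above
    {"input_ids": [1, 14657, 2]},                    # hypothetical shorter line
]
batch = collate_causal_lm(examples, pad_token_id=2)
for key, rows in batch.items():
    print(key, rows)
```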

<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">Descriptions of Different Parameters</h1>

1. **r:** The rank of the low-rank update matrices that LoRA adds to each targeted layer. A larger `r` gives the adapter more trainable parameters and
   capacity, at the cost of more memory and training time; a smaller `r` is cheaper but less expressive. Typical values are 4, 8, 16 and 32.

2. **lora_alpha:** An integer scaling factor. The LoRA update is multiplied by `lora_alpha / r`, and a common heuristic is to set it to `2*r`. The
   configuration below uses `lora_alpha = 64` with `r = 8`, i.e. a scaling of 8.

3. **target_modules:** The names of the linear layers that receive LoRA adapters. In Llama these are the attention projections and the feed-forward
   network (FFN) projections:
   * **q_proj (Query Projection):** This linear layer transforms the input token embeddings into query vectors (Q). Think of a query vector as a search
     term: it represents what the current token is looking for in the other tokens.

   * **k_proj (Key Projection):** This layer transforms the same input token embeddings into key vectors (K). Key vectors are like the tags or labels on
     all the tokens in the sequence, against which the queries are matched.

   * **v_proj (Value Projection):** This layer transforms the input token embeddings into value vectors (V). Value vectors contain the actual
     information content of each token. The self-attention mechanism uses the queries and keys to compute attention scores, which are then used to
     create a weighted sum of the value vectors. This weighted sum becomes the new, context-aware representation of the current token.

   * **o_proj (Output Projection):** After the attention scores are calculated and the weighted sum of value vectors is computed, this final linear
     layer projects the combined output from all the attention "heads" back into the original hidden dimension of the model. It is the final step that
     produces the output of the self-attention block.

   * **gate_proj (Gate Projection):** This layer and the up_proj layer together form the "gated" part of the FFN. It linearly transforms the input, and
     the result is passed through a non-linear activation function (SiLU, also known as Swish, in Llama), which acts as a "gate" to control the flow
     of information.

   * **up_proj (Up Projection):** This layer takes the input and linearly transforms it into a higher-dimensional space. The activated output of
     gate_proj is then multiplied element-wise with the output of up_proj. This multiplicative interaction introduces a richer, more complex
     non-linearity to the model.

   * **down_proj (Down Projection):** This final layer in the FFN takes the result of the gated multiplication and projects it back down to the original
     hidden dimension. This output is then added to the original input (a residual connection), which helps with gradient flow during training.

4. **bias:** Controls which bias terms of the linear layers are trained during fine-tuning. Bias terms are additional trainable parameters added to
   the output of a linear layer, allowing the model to learn a constant offset. The accepted values are `"none"`, `"all"` and `"lora_only"`.

5. **lora_dropout:** The dropout probability applied to the input of the LoRA layers during training. It regularises the adapter; it does not remove
   layers from the network.

6. **task_type:** The type of task the adapter is trained for, which tells PEFT which model wrapper to use. `"CAUSAL_LM"` is used here for next-token
   prediction.
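
To make `r` and `lora_alpha` concrete: LoRA keeps the original weight matrix `W` frozen and learns two small matrices, `A` with shape `(r, in)` and `B` with shape `(out, r)`, so the adapted layer computes `W x + (lora_alpha / r) * B (A x)`. Below is a minimal pure-Python sketch with toy sizes. The helper names and all numbers are made up for illustration; real implementations work on tensors and start with `B` at zero so that training begins from the base model's behaviour.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, r, lora_alpha):
    """Return W x + (lora_alpha / r) * B (A x) for one input vector x."""
    scaling = lora_alpha / r
    base = matvec(W, x)                # frozen path
    update = matvec(B, matvec(A, x))   # low-rank trainable path
    return [b + scaling * u for b, u in zip(base, update)]


# Toy layer: 3 inputs, 2 outputs, rank 1 (illustrative numbers only).
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
A = [[0.5, 0.5, 0.0]]         # shape (r, in)
B_zero = [[0.0], [0.0]]       # shape (out, r), zero-initialised
B_trained = [[1.0], [-1.0]]   # pretend values after some training
x = [2.0, 4.0, 1.0]

out_initial = lora_forward(W, A, B_zero, x, r=1, lora_alpha=2)
out_trained = lora_forward(W, A, B_trained, x, r=1, lora_alpha=2)
print("before training:", out_initial)
print("after training: ", out_trained)
```

With `B` at zero the adapted layer reproduces the frozen layer exactly; once `B` changes, the update is added with a weight of `lora_alpha / r`.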

---

<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">Preparing the Model for k-bit Training with a PEFT LoRA Configuration</h1>

In [53]:
# Preparing the model for further training using a LoRA configuration
model.gradient_checkpointing_enable()
model = prepare_model_for_kbit_training(model)

config = LoraConfig(
    r = 8,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    bias="none",
    lora_dropout=0.05,
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, config)
print("The model is set successfully")

The model is set successfully
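
The size of the adapter created by `get_peft_model` can be estimated by hand: each targeted linear layer with `in` inputs and `out` outputs gains `r * (in + out)` trainable weights. The sketch below assumes the commonly documented Llama-2-7B dimensions (hidden size 4096, MLP size 11008, 32 decoder layers). These are not read from the loaded model, so treat the result as an estimate; on the actual model, `model.print_trainable_parameters()` reports the exact figure.

```python
# Assumed Llama-2-7B dimensions (not read from the loaded model).
HIDDEN = 4096
MLP = 11008
N_LAYERS = 32
R = 8  # LoRA rank used in the config above

# (in_features, out_features) of each targeted projection in one decoder layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """Trainable weights in one adapter: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in TARGET_SHAPES.values())
total = per_layer * N_LAYERS
print(f"{per_layer:,} adapter weights per layer, {total:,} in total")
```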


<h1 style = "color:green ; text-shadow:0px 0px 5px green;">Setting Up the Trainer for PyTorch-Like Training with Some Hyperparameters</h1>

In [69]:
# Setting the Trainer object
trainer = transformers.Trainer(
    model = model,
    train_dataset = tokenized_train_dataset,
    args = transformers.TrainingArguments(
        output_dir = "./BergerFineTunnedModel", # The directory where the fine-tuned model checkpoints will be stored
        per_device_train_batch_size = 4, # The default is 8
        gradient_accumulation_steps = 2, # Effective batch size per device: 4 * 2 = 8
        num_train_epochs = 3, # Ignored while max_steps is set to a positive value
        learning_rate = 5e-5,
        max_steps = 20, # Stops after 20 optimizer steps; overrides num_train_epochs
        bf16 = False,
        optim = "paged_adamw_8bit",
        logging_dir = "./log",
        save_strategy = "epoch",
        save_steps = 50, # Only used when save_strategy = "steps"
        logging_steps = 10
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm = False),
)
model.config.use_cache = False
trainer.train()

print("Trainer Loaded Successfully")

No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


| Step | Training Loss |
|------|---------------|
| 10   | 0.7266        |
| 20   | 1.8853        |


Trainer Loaded Successfully
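
Because `max_steps` is set, the length of the run is governed by optimizer steps and not by epochs. The arithmetic below shows how many tokenized lines the run above actually consumes, assuming a single GPU. The dataset size of 5,000 lines is a placeholder, since the real count is not shown in this notebook, and the steps-per-epoch figure is an approximation of what the Trainer computes.

```python
import math

per_device_train_batch_size = 4
gradient_accumulation_steps = 2
num_devices = 1        # assumption: single GPU
max_steps = 20
dataset_size = 5000    # hypothetical; replace with len(tokenized_train_dataset)

# Examples contributing to one optimizer step.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices

# Examples consumed before training stops at max_steps.
examples_seen = effective_batch * max_steps

# Approximate optimizer steps needed for one full pass over the data.
steps_per_epoch = math.ceil(dataset_size / effective_batch)
fraction_of_epoch = max_steps / steps_per_epoch

print(f"effective batch size: {effective_batch}")
print(f"examples seen:        {examples_seen}")
print(f"steps per epoch:      {steps_per_epoch}")
print(f"fraction of an epoch: {fraction_of_epoch:.3f}")
```

With these settings the model sees only a small slice of the data, so raising `max_steps` (or removing it so that `num_train_epochs` applies) is the first knob to try for a longer run.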


<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">Loading Base Model and Tokenizers with 4-bit Quantization</h1>

In [73]:
# Load the base model and tokenizer for inference
base_model_id = "meta-llama/Llama-2-7b-chat-hf"

nf4Config = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_use_double_quant = True,
    bnb_4bit_quant_type = "nf4",
    bnb_4bit_compute_dtype = torch.bfloat16
)

tokenizer = LlamaTokenizer.from_pretrained(base_model_id, use_fast = False, trust_remote_code = True, add_eos_token = True)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config = nf4Config,
    device_map = "cuda",
    trust_remote_code = True,
    token = True  # replaces the deprecated use_auth_token argument
  )

print("Loaded the Configuration")

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Loaded the Configuration


<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">Loading the Tokenizer and Fine-Tuned Model from PEFT</h1>

In [76]:
# Note: add_eos_token = True also appends </s> to every prompt tokenized for inference
tokenizer = LlamaTokenizer.from_pretrained(base_model_id, use_fast = False, trust_remote_code = True, add_eos_token = True)
modelFinetuned = PeftModel.from_pretrained(base_model, "BergerFineTunnedModel/checkpoint-20")
print("The Fine tuned model loaded")

The Fine tuned model loaded


<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">First Inference Step: Get Concise Answer</h1>

In [80]:
user_question = "How did the journey of Berger Paints initiated?"
eval_prompt = f"Question: {user_question} Just answer this question accurately and concisely.\n"
promptTokenized = tokenizer(eval_prompt, return_tensors = "pt").to("cuda")
modelFinetuned.eval()

with torch.no_grad():
  output_ids = modelFinetuned.generate(**promptTokenized, max_new_tokens=1024)
  print(tokenizer.decode(output_ids[0], skip_special_tokens = True))
  torch.cuda.empty_cache()

Question: How did the journey of Berger Paints initiated? Just answer this question accurately and concisely.
 everybody's got a story to tell, and this one starts in 1909...
Mr. James Wilfred Adamson bought his first Oil and Colour Business, and that's how the journey of Berger Paints initiated. 1917 saw the foundation of Calcutta Oil and Colour Co. Ltd., and by 1923, that spread to Canada, Rhodesia, and the Caribbean. 🚀 

By 1950, the company had changed hands and was known as Berger Paints India Ltd. 🇮🇳 The rest, as they say, is history! 🎨 #BergerPaints #PaintingHistory #ColourfulStory #EveryonesGotAStory #MrAdamson #PaintingJourney #ColourfulJourney #HistoryOfBergerPaints #PaintingLegacy #ColourfulLegacy


<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">Second Inference Step: Get Complete Answer</h1>

In [83]:
user_question = "How did the journey of Berger Paints initiated?"
eval_prompt = f"Question: {user_question}\nAnswer:"
promptTokenized = tokenizer(eval_prompt, return_tensors="pt").to("cuda")
modelFinetuned.eval()

with torch.no_grad():
  output_ids = modelFinetuned.generate(**promptTokenized, max_new_tokens=1024)
  print(tokenizer.decode(output_ids[0], skip_special_tokens = True))
  torch.cuda.empty_cache()

Question: How did the journey of Berger Paints initiated?
Answer:  The journey of Berger Paints initiated with two distinct individuals starting their paint venture. In 1909, James Wilfred Adamson bought his first Oil and Colour Business, and by 1917 that spread to Canada, Rhodesia, and the Caribbean. Elsewhere an Englishman, Mr Hadfield, set up a small paint company in 1923 in Calcutta. 1927 saw the merger of Adamson's and Hadfield's paint businesses, and Berger Paints came into existence. 

Over the years, Berger Paints expanded its operations and spread its presence across the globe. In 1950, the company was registered as a private limited company, and in 1954, Berger Paints became the first paint company to be listed on the Calcutta Stock Exchange. In 1964, the company's operations spread to the United States, and by 1967, Berger Paints had set up its first overseas subsidiary in Canada. Today, Berger Paints is a leading paint company with a presence in over 100 countries worldwide
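
Both outputs above repeat the prompt because, for decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones. A common way to show only the answer is to slice off the prompt length before decoding. The sketch below demonstrates the slicing on plain lists with made-up token ids; the helper name `new_tokens_only` is hypothetical, and applying the same slice to the tensor returned by `generate` is an assumption about how one would adapt it, not code from this notebook.

```python
def new_tokens_only(output_ids, prompt_ids):
    """Drop the echoed prompt from a generated sequence of token ids."""
    prompt_len = len(prompt_ids)
    # Guard against the unexpected case where the output does not start with the prompt.
    if output_ids[:prompt_len] != prompt_ids:
        raise ValueError("output does not begin with the prompt tokens")
    return output_ids[prompt_len:]


prompt_ids = [1, 894, 29901]                  # hypothetical prompt token ids
output_ids = [1, 894, 29901, 450, 16342, 2]   # hypothetical generate() result
answer_ids = new_tokens_only(output_ids, prompt_ids)
print(answer_ids)
```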

<h1 style = "color:green ; text-shadow: 0px 0px 5px green;">Saving the Fine-Tuned Model Checkpoint</h1>

In [86]:
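# torch.save pickles the whole wrapped model; modelFinetuned.save_pretrained(<directory>) would store only the LoRA adapter weights
torch.save(modelFinetuned, "BergerFineTunnedModel_17Sep_5_32.pth")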

<h3 style = "color:blue ; text-shadow: 0px 0px 5px blue;">Checking the Version of the Transformers Library</h3>

In [1]:
!pip show transformers

Name: transformers
Version: 4.51.2
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: C:\Users\Webbies\anaconda3\Lib\site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: peft, sentence-transformers
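
The `pip show` cells in this section can also be condensed into a single cell. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is an assumption, and packages that are not installed are reported as `None` instead of raising an error.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


versions = installed_versions(["transformers", "peft", "accelerate", "bitsandbytes"])
for name, ver in versions.items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```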


<h3 style = "color:blue ; text-shadow: 0px 0px 5px blue;">Checking the Version of the peft Library</h3>

In [3]:
!pip show peft

Name: peft
Version: 0.17.1
Summary: Parameter-Efficient Fine-Tuning (PEFT)
Home-page: https://github.com/huggingface/peft
Author: The HuggingFace team
Author-email: benjamin@huggingface.co
License: Apache
Location: C:\Users\Webbies\anaconda3\Lib\site-packages
Requires: accelerate, huggingface_hub, numpy, packaging, psutil, pyyaml, safetensors, torch, tqdm, transformers
Required-by: 


<h3 style = "color:blue ; text-shadow: 0px 0px 5px blue;">Checking the Version of the Accelerate Library</h3>

In [5]:
!pip show accelerate

Name: accelerate
Version: 1.10.1
Summary: Accelerate
Home-page: https://github.com/huggingface/accelerate
Author: The HuggingFace team
Author-email: zach.mueller@huggingface.co
License: Apache
Location: C:\Users\Webbies\anaconda3\Lib\site-packages
Requires: huggingface_hub, numpy, packaging, psutil, pyyaml, safetensors, torch
Required-by: peft


<h3 style = "color:blue ; text-shadow: 0px 0px 5px blue;">Checking the Version of the Bitsandbytes Library</h3>

In [7]:
!pip show bitsandbytes

Name: bitsandbytes
Version: 0.47.0
Summary: k-bit optimizers and matrix multiplication routines.
Home-page: https://github.com/bitsandbytes-foundation/bitsandbytes
Author: 
Author-email: Tim Dettmers <dettmers@cs.washington.edu>
License: MIT License

Copyright (c) Facebook, Inc. and its affiliates.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHA