<a href="https://colab.research.google.com/github/lakhanrajpatlolla/aiml-learning/blob/master/U4W21_73_Part_B_PEFT_for_Dialogue_Summary_C.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint
### Assignment 2 Part-B: PEFT for Dialogue Summary

> **Note:** This assignment has two parts:
> - Part-A: Finetuning a Seq2Seq (T5) Model for Summarization
> - Part-B: PEFT for Dialogue Summary
>
> Only Part-B needs to be submitted for grading.

## Learning Objectives

At the end of the experiment, you will be able to:

* understand how LoRA, a parameter-efficient fine-tuning method, works
* fine-tune a BART model, `facebook/bart-large-cnn`, on the SAMSum dataset for summarization using LoRA
* push the finetuned LoRA adapter to HuggingFace model hub
* load the finetuned adapter from hub for inference

## Dataset Description

The **[SAMSum](https://huggingface.co/datasets/samsum) dataset** contains about 16k messenger-like conversations with summaries. Conversations were created and written down by linguists fluent in English. Linguists were asked to create conversations similar to those they write on a daily basis, reflecting the proportion of topics of their real-life messenger conversations. The style and register are diversified - conversations could be informal, semi-formal or formal, they may contain slang words, emoticons and typos. Then, **the conversations were annotated with summaries**. It was assumed that summaries should be a concise brief of what people talked about in the conversation in third person. The SAMSum dataset was prepared by Samsung R&D Institute Poland and is distributed for research purposes.

Data Splits:
- train: 14732
- val: 818
- test: 819

Data Fields:

- ***dialogue***: text of dialogue
- ***summary***: human written summary of the dialogue
- ***id***: unique id of an example

<br>

**Example:**

\{
> '**id**': '13818513',

>'**summary**': 'Amanda baked cookies and will bring Jerry some tomorrow.',

>'**dialogue**': "Amanda: I baked cookies. Do you want some?\r\nJerry: Sure!\r\nAmanda: I'll bring you tomorrow :-)"

\}

## Information

### **Parameter-Efficient Fine-Tuning (PEFT) methods**

Fine-tuning large pretrained models is often prohibitively costly due to their scale. Parameter-Efficient Fine-Tuning (PEFT) methods enable efficient adaptation of large pretrained models to various downstream applications by only fine-tuning a small number of (extra) model parameters instead of all the model's parameters. This significantly decreases the computational and storage costs. Recent state-of-the-art PEFT techniques achieve performance comparable to fully fine-tuned models.

PEFT is integrated with Transformers for easy model training and inference, and Accelerate for distributed training and inference for really big models.

[PEFT](https://github.com/huggingface/peft) is also the name of an open-source library from Hugging Face that enables efficient adaptation of pre-trained language models (PLMs) to various downstream applications ***without*** fine-tuning all the model's parameters.

PEFT currently includes techniques for:

- **LoRA:** Low-Rank Adaptation of Large Language Models
- **Prefix Tuning:** P-Tuning v2
- **P-Tuning**
- **Prompt Tuning**


### **LoRA**

It is a technique that accelerates the fine-tuning of large models while consuming less memory.

To make fine-tuning more efficient, LoRA's approach is to represent the weight updates with two smaller matrices (called update matrices) through low-rank decomposition.

A and B are the update matrices in the figure below.

<center>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/peft/lora_diagram.png" width=900px>
</center>
<br>

- These new matrices can be trained to **adapt to the new data** while keeping the overall number of changes low.
- The original weight matrix **remains frozen** and doesn't receive any further adjustments.
- To produce the final results, both the original and the adapted weights are **combined**.
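
The three points above can be illustrated with a minimal pure-Python sketch. It is not the PEFT library's implementation: the helper names `lora_forward` and `merge_weights` and the tiny hand-picked matrices are assumptions made for illustration. The `alpha / r` scaling mirrors the `lora_alpha` and `r` parameters of `LoraConfig` used later in this notebook.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_forward(W, A, B, x, alpha, r):
    """Output of a LoRA layer: frozen path W x plus scaled low-rank path B (A x)."""
    scale = alpha / r
    base = matvec(W, x)                  # frozen original weights
    update = matvec(B, matvec(A, x))     # trainable update matrices
    return [b + scale * u for b, u in zip(base, update)]

def merge_weights(W, A, B, alpha, r):
    """Fold the update into the original matrix: W + (alpha / r) * B A."""
    scale = alpha / r
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, BA)]

# Illustrative values: a 3x3 frozen matrix and rank-1 update matrices
W = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]    # frozen, 3 x 3
A = [[1, 2, 3]]                          # trainable, r x 3
B = [[1], [0], [2]]                      # trainable, 3 x r
x = [1, 1, 1]
alpha, r = 2, 1

adapted_output = lora_forward(W, A, B, x, alpha, r)
merged_output = matvec(merge_weights(W, A, B, alpha, r), x)
print(adapted_output, merged_output)
```

Both paths give the same result, which is why the adapter can either be kept separate from the base model or merged into it for inference. Here `A` and `B` hold 6 values in total against 9 in `W`; for large matrices and a small rank the saving is far greater.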

### Setup Steps:

In [1]:
#@title Please enter your registration id to start: { run: "auto", display-mode: "form" }
Id = "2418775" #@param {type:"string"}

In [2]:
#@title Please enter your password (normally your phone number) to continue: { run: "auto", display-mode: "form" }
password = "9959000490" #@param {type:"string"}

In [3]:
#@title Run this cell to complete the setup for this Notebook
from IPython import get_ipython
import re
ipython = get_ipython()

notebook= "U4W21_73_Part-B_PEFT_for_Dialogue_Summary_C" #name of the notebook

def setup():
#  ipython.magic("sx pip3 install torch")
    from IPython.display import HTML, display
    ipython.magic("sx pip3 install torch")
    ipython.magic("sx pip3 install torchvision")
    ipython.magic("sx mkdir results")
    display(HTML('<script src="https://dashboard.talentsprint.com/aiml/record_ip.html?traineeId={0}&recordId={1}"></script>'.format(getId(),submission_id)))
    print("Setup completed successfully")
    return

def submit_notebook():
    ipython.magic("notebook -e "+ notebook + ".ipynb")

    import requests, json, base64, datetime

    url = "https://dashboard.talentsprint.com/xp/app/save_notebook_attempts"
    if not submission_id:
      data = {"id" : getId(), "notebook" : notebook, "mobile" : getPassword()}
      r = requests.post(url, data = data)
      r = json.loads(r.text)

      if r["status"] == "Success":
          return r["record_id"]
      elif "err" in r:
        print(r["err"])
        return None
      else:
        print ("Something is wrong, the notebook will not be submitted for grading")
        return None

    elif getAnswer() and getComplexity() and getAdditional() and getConcepts() and getWalkthrough() and getComments() and getMentorSupport():
      f = open(notebook + ".ipynb", "rb")
      file_hash = base64.b64encode(f.read())

      data = {"complexity" : Complexity, "additional" :Additional,
              "concepts" : Concepts, "record_id" : submission_id,
              "answer" : Answer, "id" : Id, "file_hash" : file_hash,
              "notebook" : notebook, "feedback_walkthrough":Walkthrough ,
              "feedback_experiments_input" : Comments,
              "feedback_inclass_mentor": Mentor_support}

      r = requests.post(url, data = data)
      r = json.loads(r.text)
      if "err" in r:
        print(r["err"])
        return None
      else:
        print("Your submission is successful.")
        print("Ref Id:", submission_id)
        print("Date of submission: ", r["date"])
        print("Time of submission: ", r["time"])
        print("View your submissions: https://learn-iiith.talentsprint.com/notebook_submissions")
        #print("For any queries/discrepancies, please connect with mentors through the chat icon in LMS dashboard.")
        return submission_id
    else: return submission_id


def getAdditional():
  try:
    if not Additional:
      raise NameError
    else:
      return Additional
  except NameError:
    print ("Please answer Additional Question")
    return None

def getComplexity():
  try:
    if not Complexity:
      raise NameError
    else:
      return Complexity
  except NameError:
    print ("Please answer Complexity Question")
    return None

def getConcepts():
  try:
    if not Concepts:
      raise NameError
    else:
      return Concepts
  except NameError:
    print ("Please answer Concepts Question")
    return None


def getWalkthrough():
  try:
    if not Walkthrough:
      raise NameError
    else:
      return Walkthrough
  except NameError:
    print ("Please answer Walkthrough Question")
    return None

def getComments():
  try:
    if not Comments:
      raise NameError
    else:
      return Comments
  except NameError:
    print ("Please answer Comments Question")
    return None


def getMentorSupport():
  try:
    if not Mentor_support:
      raise NameError
    else:
      return Mentor_support
  except NameError:
    print ("Please answer Mentor support Question")
    return None

def getAnswer():
  try:
    if not Answer:
      raise NameError
    else:
      return Answer
  except NameError:
    print ("Please answer Question")
    return None


def getId():
  try:
    return Id if Id else None
  except NameError:
    return None

def getPassword():
  try:
    return password if password else None
  except NameError:
    return None

submission_id = None
### Setup
if getPassword() and getId():
  submission_id = submit_notebook()
  if submission_id:
    setup()
else:
  print ("Please complete Id and Password cells before running setup")



Setup completed successfully


### Install required dependencies

In [None]:
!pip -q install transformers datasets evaluate transformers[torch]

# A dependency required for loading the SAMSum dataset
!pip -q install py7zr

!pip -q install peft

### Import required packages

In [None]:
import os
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from transformers import TrainingArguments, Trainer

from peft import LoraConfig, get_peft_model, TaskType
from peft import PeftModel, PeftConfig

import warnings
warnings.filterwarnings('ignore')

### **Load Model & Tokenizer**

In [None]:
# Load model from HF Model Hub

"""
BART HAS 400M PARAMS: https://github.com/facebookresearch/fairseq/tree/main/examples/bart
Look into Model card - 400 Million parameters
"""

checkpoint = "facebook/bart-large-cnn"                # username/repo-name

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Load model
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

### **Load Dataset**

In [None]:
# Load SAMSum dataset
dataset = load_dataset("samsum", trust_remote_code=True)
dataset

### **Prepare the Dataset**

In [None]:
# Define function to prepare dataset

def tokenize_inputs(example):

    start_prompt = "Summarize the following conversation.\n\n"
    end_prompt = "\n\nSummary: "
    prompt = [start_prompt + dialogue + end_prompt for dialogue in example['dialogue']]
    example['input_ids'] = tokenizer(prompt, padding='max_length', truncation=True, max_length=512, return_tensors='pt').input_ids             # 'pt' for pytorch tensor
    example['labels'] = tokenizer(example['summary'], padding='max_length', truncation=True, max_length=512, return_tensors='pt').input_ids

    return example
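
With `batched=True`, `example['dialogue']` is a list of dialogues rather than a single string, which is why the function builds one prompt per dialogue. The sketch below shows only that prompt-building step, without the tokenizer, using the sample record from the dataset description. The helper name `build_prompts` and the constant names are assumptions made for illustration.

```python
START_PROMPT = "Summarize the following conversation.\n\n"
END_PROMPT = "\n\nSummary: "

def build_prompts(batch):
    """Wrap every dialogue in the batch with the instruction and the 'Summary:' cue."""
    return [START_PROMPT + dialogue + END_PROMPT for dialogue in batch["dialogue"]]

# A batch is a dict of lists: one list entry per example
batch = {
    "dialogue": ["Amanda: I baked cookies. Do you want some?\r\nJerry: Sure!\r\nAmanda: I'll bring you tomorrow :-)"],
    "summary": ["Amanda baked cookies and will bring Jerry some tomorrow."],
}

prompts = build_prompts(batch)
print(prompts[0])
```

The same template should be used when building prompts at inference time, so that the model sees inputs shaped like the ones it was trained on.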

In [None]:
# Prepare dataset
tokenizer.pad_token = tokenizer.eos_token
tokenized_datasets = dataset.map(tokenize_inputs, batched=True)       # using batched=True for Fast tokenizer implementation

# Remove columns/keys that are not needed further
tokenized_datasets = tokenized_datasets.remove_columns(['id', 'dialogue', 'summary'])

In [None]:
# Shorten the data: keep only the rows whose index is divisible by 100
# For learning purposes only: this reduces the compute requirement and the training time

tokenized_datasets = tokenized_datasets.filter(lambda example, index: index % 100 == 0, with_indices=True)

In [None]:
print(tokenized_datasets['train'].shape)
print(tokenized_datasets['validation'].shape)
print(tokenized_datasets['test'].shape)
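
The number of rows to expect in each split after the filter can be checked with a short calculation. This is a sketch that assumes the split sizes listed in the dataset description and that the filter sees row indices starting at 0 within each split; `kept_rows` is a hypothetical helper name.

```python
def kept_rows(num_rows, step=100):
    """Count the indices 0, step, 2*step, ... that are smaller than num_rows."""
    return len(range(0, num_rows, step))

# Split sizes taken from the "Data Splits" list in the dataset description
split_sizes = {"train": 14732, "validation": 818, "test": 819}

subset_sizes = {name: kept_rows(n) for name, n in split_sizes.items()}
print(subset_sizes)   # {'train': 148, 'validation': 9, 'test': 9}
```

So training runs on roughly 1% of the data, which keeps the run short but is far too little to expect high-quality summaries.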

### **Create PEFT Model using LoRA**

To fine-tune a model using LoRA, you need to:

- Instantiate a base model, here it is `facebook/bart-large-cnn`
- Create a configuration (`LoraConfig`) where you define LoRA-specific parameters
- Wrap the base model with `get_peft_model()` to get a trainable `PeftModel`
- Train the `PeftModel` as you normally would train the base model

In [None]:
from peft import LoraConfig, get_peft_model, TaskType

# LoRA-specific parameters
lora_config = LoraConfig(
    r=32,                       # 8, 16, 32    # the rank of the update matrices
    lora_alpha=32,                             # LoRA scaling factor
    lora_dropout=0.05,
    bias='none',                               # specifies if the bias parameters should be trained
    task_type=TaskType.SEQ_2_SEQ_LM,           # tells LoRA that this is a seq2seq modeling task
)

In [None]:
# Trainable PEFTModel
peft_model = get_peft_model(model, peft_config=lora_config)

### **Train PEFT Model**

In [None]:
from transformers import TrainingArguments, Trainer

peft_training_args = TrainingArguments(
    output_dir="./mode_tuned_peft",           # local directory
    hub_model_id="sumanthk/PEFT_Partb",      # repository id on the Hugging Face Hub to push the model to
    learning_rate=1e-5,
    num_train_epochs=5,      ## for 5 epochs took around 10 minutes
    weight_decay=0.01,
    auto_find_batch_size=True,
    evaluation_strategy='epoch',
    logging_steps=10,
    report_to="none"
)

peft_trainer = Trainer(
    model=peft_model,                    # model to be fine-tuned
    args=peft_training_args,                       # training arguments
    train_dataset=tokenized_datasets['train'],          # train data to use
    eval_dataset=tokenized_datasets['validation']       # validation data to use
)

In [None]:
# Number of trainable parameters
peft_model.print_trainable_parameters()

From the output above, we can see that only 1.14% of the model's parameters are being trained.
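
The order of magnitude of this figure can be reproduced by hand. The sketch below rests on several assumptions that are not stated in this notebook: that PEFT's default LoRA targets for BART are the `q_proj` and `v_proj` attention projections, that BART-large has 12 encoder layers, 12 decoder layers and a hidden size of 1024, and that the base model has roughly 406 million parameters. `lora_params_per_matrix` is a hypothetical helper name.

```python
def lora_params_per_matrix(d_in, d_out, r):
    """A is (r x d_in) and B is (d_out x r), so one adapted matrix adds r * (d_in + d_out) weights."""
    return r * (d_in + d_out)

HIDDEN_SIZE = 1024       # assumed d_model of BART-large
RANK = 32                # r from the LoraConfig above
ENCODER_LAYERS = 12      # assumed
DECODER_LAYERS = 12      # assumed

# One attention block per encoder layer; self-attention plus cross-attention per decoder layer
attention_blocks = ENCODER_LAYERS + 2 * DECODER_LAYERS
adapted_matrices = attention_blocks * 2     # assumed targets: q_proj and v_proj

trainable = adapted_matrices * lora_params_per_matrix(HIDDEN_SIZE, HIDDEN_SIZE, RANK)

BASE_PARAMS = 406_000_000                   # rounded base model size, an assumption
share = 100 * trainable / (BASE_PARAMS + trainable)
print(f"{trainable:,} trainable LoRA parameters, roughly {share:.2f}% of all parameters")
```

Under these assumptions the estimate lands in the same range as the percentage reported by `print_trainable_parameters()`. Halving `r` halves the number of trainable parameters, since the count grows linearly with the rank.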

In [None]:
# Disabling Weights and Biases logging
import os
os.environ["WANDB_DISABLED"] = "true"

In [None]:
# Training
peft_trainer.train()

### **Save PEFT Adapter**

**Push your Peft adapter to Hugging Face Model Hub**

In [None]:
# Login to HuggingFace
# Run, and paste your HF Access token when prompted
from huggingface_hub import notebook_login
notebook_login()

Access your pushed adapter at `https://huggingface.co/[YOUR-USER-NAME]/[YOUR-MODEL-REPO-NAME]`

In [None]:
# Push peft adapter to Hub
my_peft_repo = "PEFT_Partb"

peft_model.push_to_hub(repo_id= my_peft_repo, commit_message= "Upload peft adapter", )

### **Reload & Test**

**Test your LoRA finetuned model downloaded from HF Model Hub**

In [None]:
username = "sumanthk"      # change it to your HuggingFace username

checkpoint = username + '/PEFT_Partb'  # change it to your Repo name

loaded_model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
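
An alternative way to load the adapter is to go through the `PeftConfig` and `PeftModel` classes imported at the top of the notebook. This is a sketch rather than the required approach: it assumes the Hub repository contains the adapter configuration and weights written by `push_to_hub`, and `adapter_repo_id` and `load_lora_model` are hypothetical helper names.

```python
def adapter_repo_id(username, repo_name):
    """Build the '<username>/<repo>' identifier used on the Hugging Face Hub."""
    return f"{username}/{repo_name}"

def load_lora_model(adapter_id):
    """Load the base model named in the adapter config, then attach the LoRA adapter to it."""
    # Imports are kept local so that defining this helper does not itself require the libraries
    from transformers import AutoModelForSeq2SeqLM
    from peft import PeftConfig, PeftModel

    config = PeftConfig.from_pretrained(adapter_id)        # reads the adapter configuration
    base_model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()                                           # inference mode (disables dropout)
    return model

# Usage (downloads the base model and the adapter, so it is left commented out):
# loaded_model = load_lora_model(adapter_repo_id("your-username", "PEFT_Partb"))
```

Loading this way makes the two pieces explicit: the frozen base checkpoint is fetched from its original repository, and only the small adapter comes from your own repository.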

In [None]:
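def generate_summary(dialogue, llm):
    """Prepare prompt  -->  tokenize -->  generate output using LLM  -->  detokenize output"""

    # Use the same prompt template as tokenize_inputs() so that inference matches training
    input_prompt = f"Summarize the following conversation.\n\n{dialogue}\n\nSummary: "

    # Tokenize the prompt; truncate to the same maximum length used for training
    inputs = tokenizer(input_prompt, return_tensors='pt', truncation=True, max_length=512)

    # Pass the attention mask along with the input ids
    tokenized_output = llm.generate(input_ids=inputs['input_ids'], attention_mask=inputs['attention_mask'], min_length=30, max_length=200)
    output = tokenizer.decode(tokenized_output[0], skip_special_tokens=True)

    return output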

In [None]:
sample = dataset['test'][0]['dialogue']
label = dataset['test'][0]['summary']

output = generate_summary(sample, llm=loaded_model)

print("Sample")
print(sample)
print("-------------------")
print("Summary:")
print(output)
print("-------------------")
print("Ground Truth Summary:")
print(label)

### References:

1. [LoRA](https://huggingface.co/docs/peft/main/en/conceptual_guides/lora)
2. [Quicktour](https://huggingface.co/docs/peft/en/quicktour)
3. [Efficient Large Language Model training with LoRA and Hugging Face](https://www.philschmid.de/fine-tune-flan-t5-peft)

### Please answer the questions below to complete the experiment:




In [4]:
#@title Select the False statement w.r.t LoRA: { run: "auto", form-width: "500px", display-mode: "form" }
Answer = "During inference, only the original weight matrix is used to produce final results" #@param ["", "It is a technique that accelerates the fine-tuning of large models while consuming less memory", "During training, the original weight matrix remains frozen and doesn't receive any further adjustments", "During inference, only the original weight matrix is used to produce final results", "None of the above"]

In [5]:
#@title How was the experiment? { run: "auto", form-width: "500px", display-mode: "form" }
Complexity = "Good and Challenging for me" #@param ["","Too Simple, I am wasting time", "Good, But Not Challenging for me", "Good and Challenging for me", "Was Tough, but I did it", "Too Difficult for me"]


In [6]:
#@title If it was too easy, what more would you have liked to be added? If it was very difficult, what would you have liked to have been removed? { run: "auto", display-mode: "form" }
Additional = "good" #@param {type:"string"}


In [7]:
#@title Can you identify the concepts from the lecture which this experiment covered? { run: "auto", vertical-output: true, display-mode: "form" }
Concepts = "Yes" #@param ["","Yes", "No"]


In [8]:
#@title  Experiment walkthrough video? { run: "auto", vertical-output: true, display-mode: "form" }
Walkthrough = "Very Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [9]:
#@title  Text and image description/explanation and code comments within the experiment: { run: "auto", vertical-output: true, display-mode: "form" }
Comments = "Somewhat Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [10]:
#@title Mentor Support: { run: "auto", vertical-output: true, display-mode: "form" }
Mentor_support = "Very Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [11]:
#@title Run this cell to submit your notebook for grading { vertical-output: true }
try:
  if submission_id:
      return_id = submit_notebook()
      if return_id : submission_id = return_id
  else:
      print("Please complete the setup first.")
except NameError:
  print ("Please complete the setup first.")

Your submission is successful.
Ref Id: 2542
Date of submission:  19 Apr 2025
Time of submission:  15:27:02
View your submissions: https://learn-iiith.talentsprint.com/notebook_submissions
