# | NLP | PEFT/LoRA | DialogSum | Dialog Summarize |

## NLP (Natural Language Processing) with PEFT (Parameter Efficient Fine-Tuning) and LoRA (Low-Rank Adaptation) for Dialogue Summarization

# <b>1 <span style='color:#78D118'>|</span> Introduction</b>

This project explores the capabilities of large language models (LLMs), with a specific focus on using Parameter Efficient Fine-Tuning (PEFT) to improve dialogue summarization with the FLAN-T5 model.

Our goal is to enhance the quality of dialogue summarization by employing a comprehensive fine-tuning approach and evaluating the results using ROUGE metrics. Additionally, we will explore the advantages of Parameter Efficient Fine-Tuning (PEFT), demonstrating that its benefits outweigh any potential minor performance trade-offs.

 - NOTE: This is an example; we are not using the entire dataset for PEFT/LoRA training.
 
## Objectives :
 - Train LLM for Dialogue Summarization.
 
 
## The DialogSum Dataset:
[DialogSum](https://huggingface.co/datasets/knkarthick/dialogsum) is a large-scale dialogue summarization dataset consisting of 13,460 dialogues (plus 100 holdout dialogues for topic generation) with corresponding manually labeled summaries and topics.

## Project Workflow:

- **Setup**: Import necessary libraries and define project parameters.
- **Dataset Exploration**: Explore the DialogSum dataset.
- **Test Model Zero Shot Inferencing**: Test the FLAN-T5 model with zero-shot inference on dialogue summarization to establish a baseline.
- **Dataset Preprocess Dialog and Summary**: Preprocess each dialogue and its corresponding summary to prepare the dataset for training.
- **Perform Parameter Efficient Fine-Tuning (PEFT)**: Implement PEFT, a more efficient fine-tuning approach that can significantly reduce training time while maintaining performance.
- **Evaluation**:
    - Perform human evaluation to gauge the model's output in terms of readability and coherence. This can involve annotators ranking generated summaries for quality.
    - Utilize ROUGE metrics to assess the quality of the generated summaries. ROUGE measures the overlap between generated summaries and human-written references.

# <b>2 <span style='color:#78D118'>|</span> Setup</b>
## <b>2.1 <span style='color:#78D118'>|</span> Imports</b>

In [1]:
%pip install --upgrade pip
%pip install --disable-pip-version-check \
    torch==1.13.1 \
    torchdata==0.5.1 --quiet

%pip install \
    transformers==4.27.2 \
    datasets==2.11.0 \
    evaluate==0.4.0 \
    rouge_score==0.1.2 \
    loralib==0.1.1 \
    peft==0.3.0 --quiet

Note: you may need to restart the kernel to use updated packages.


In [2]:
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, GenerationConfig, TrainingArguments, Trainer
import torch
import time
import evaluate
import pandas as pd
import numpy as np
from peft import LoraConfig, get_peft_model, TaskType
from peft import PeftModel, PeftConfig



In [3]:
rouge = evaluate.load('rouge')
dash_line = '-'.join('' for x in range(100))



Load the dataset

In [4]:
huggingface_dataset_name = "knkarthick/dialogsum"
dataset = load_dataset(huggingface_dataset_name)



Load the pre-trained [FLAN-T5 model](https://huggingface.co/google/flan-t5-base) and its tokenizer directly from Hugging Face. We'll be using the base variant, one of the smaller FLAN-T5 checkpoints, for this project.

To reduce memory usage, set `torch_dtype=torch.bfloat16` so that the model weights are loaded in bfloat16 (2 bytes per parameter) rather than the default float32 (4 bytes per parameter).
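
As a rough illustration of why the data type matters, the sketch below estimates the memory taken by the weights alone. It uses the parameter count that this notebook prints for `google/flan-t5-base`; the helper name `weight_memory_mb` is hypothetical, and the estimate ignores activations, gradients, and optimizer state.

```python
# Rough estimate of weight memory for different dtypes (weights only).
# The parameter count is the one printed later in this notebook.
NUM_PARAMS = 247_577_856

# Storage size of one parameter in bytes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_mb(num_params: int, dtype: str) -> float:
    """Return the approximate size of the weights in megabytes (10^6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


fp32_mb = weight_memory_mb(NUM_PARAMS, "float32")
bf16_mb = weight_memory_mb(NUM_PARAMS, "bfloat16")

print(f"float32 : {fp32_mb:.0f} MB")
print(f"bfloat16: {bf16_mb:.0f} MB")
```

Under these assumptions, loading in bfloat16 halves the memory needed for the weights.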

In [5]:
model_name='google/flan-t5-base'
original_model = AutoModelForSeq2SeqLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_name)



## <b>2.2 <span style='color:#78D118'>|</span> Methods</b>

In [6]:
def print_number_of_trainable_model_parameters(model):
    """Return a text summary of trainable vs. total parameter counts for `model`."""
    trainable_model_params = 0
    all_model_params = 0
    for _, param in model.named_parameters():
        all_model_params += param.numel()
        if param.requires_grad:
            trainable_model_params += param.numel()
    return f"trainable model parameters: {trainable_model_params}\nall model parameters: {all_model_params}\npercentage of trainable model parameters: {100 * trainable_model_params / all_model_params:.2f}%"

def tokenize_function(example):
    """Wrap each dialogue in the summarization prompt and tokenize prompts and summaries."""
    start_prompt = 'Summarize the following conversation.\n\n'
    end_prompt = '\n\nSummary: '
    prompt = [start_prompt + dialogue + end_prompt for dialogue in example["dialogue"]]
    # Model inputs: token IDs of the instruction-wrapped dialogues.
    example['input_ids'] = tokenizer(prompt, padding="max_length", truncation=True, return_tensors="pt").input_ids
    # Training targets: token IDs of the human-written summaries.
    example['labels'] = tokenizer(example["summary"], padding="max_length", truncation=True, return_tensors="pt").input_ids
    
    return example

# <b>3 <span style='color:#78D118'>|</span> Data Exploration</b>

In [7]:
print(dash_line)
print(print_number_of_trainable_model_parameters(original_model))
print(dash_line)

---------------------------------------------------------------------------------------------------
trainable model parameters: 247577856
all model parameters: 247577856
percentage of trainable model parameters: 100.00%
---------------------------------------------------------------------------------------------------


In [8]:
print(
    """
---------------------------------------------------------------------------------------------------

PROMPT:

Summarize the following conversation.


#Person1#: Have you considered upgrading your system?

#Person2#: Yes, but I'm not sure what exactly I would need.

#Person1#: You could consider adding a painting program to your software. It would allow you to make up your own flyers and banners for advertising.

#Person2#: That would be a definite bonus.

#Person1#: You might also want to upgrade your hardware because it is pretty outdated now.

#Person2#: How can we do that?

#Person1#: You'd probably need a faster processor, to begin with. And you also need a more powerful hard disc, more memory and a faster modem. Do you have a CD-ROM drive?

#Person2#: No.

#Person1#: Then you might want to add a CD-ROM drive too, because most new software programs are coming out on Cds.

#Person2#: That sounds great. Thanks.


Summary:

---------------------------------------------------------------------------------------------------

HUMAN SUMMARY:

#Person1# teaches #Person2# how to upgrade software and hardware in #Person2#'s system.

---------------------------------------------------------------------------------------------------
    """
)



# <b>4 <span style='color:#78D118'>|</span> Test Model Zero Shot Inferencing</b>

Test the model using zero-shot inference. Compared with the human baseline summary, the model clearly struggles to summarize the dialogue. It does, however, extract some key information from the text, which suggests that fine-tuning could improve the results.

In [9]:
index = 200

dialogue = dataset['test'][index]['dialogue']
summary = dataset['test'][index]['summary']

prompt = f"""
Summarize the following conversation.

{dialogue}

Summary:
"""

inputs = tokenizer(prompt, return_tensors='pt')
output = tokenizer.decode(
    original_model.generate(
        inputs["input_ids"], 
        max_new_tokens=200,
    )[0], 
    skip_special_tokens=True
)
print(dash_line)
print("ZERO SHOT")
print(dash_line)
print(f'PROMPT:\n{prompt}')
print(dash_line)
print(f'HUMAN SUMMARY:\n{summary}\n')
print(dash_line)
print(f'ORIGINAL MODEL SUMMARY:\n{output}')
print(dash_line)

---------------------------------------------------------------------------------------------------
ZERO SHOT
---------------------------------------------------------------------------------------------------
PROMPT:

Summarize the following conversation.

#Person1#: Have you considered upgrading your system?
#Person2#: Yes, but I'm not sure what exactly I would need.
#Person1#: You could consider adding a painting program to your software. It would allow you to make up your own flyers and banners for advertising.
#Person2#: That would be a definite bonus.
#Person1#: You might also want to upgrade your hardware because it is pretty outdated now.
#Person2#: How can we do that?
#Person1#: You'd probably need a faster processor, to begin with. And you also need a more powerful hard disc, more memory and a faster modem. Do you have a CD-ROM drive?
#Person2#: No.
#Person1#: Then you might want to add a CD-ROM drive too, because most new software programs are coming out on Cds.
#Person2#: T

# <b>5 <span style='color:#78D118'>|</span> Dataset Preprocess Dialog and Summary</b>

Transform the dialog-summary (prompt-response) pairs by adding explicit instructions for the LLM. Prepend the instruction "Summarize the following conversation." to the dialogue and append "Summary: " after it, so that the summary follows as the response:

Training prompt (dialogue):
```
Summarize the following conversation.

    Chris: This is his part of the conversation.
    Antje: This is her part of the conversation.
    
Summary: 
```

Training response (summary):
```
Both Chris and Antje participated in the conversation.
```
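
The template above can be sketched without a tokenizer. The snippet below is a minimal illustration that reuses the same start and end strings as `tokenize_function`; the helper name `build_prompts` and the toy batch are hypothetical and exist only for this example.

```python
# Build instruction-wrapped prompts for a batch of dialogues (no tokenization).
START_PROMPT = "Summarize the following conversation.\n\n"
END_PROMPT = "\n\nSummary: "


def build_prompts(dialogues):
    """Wrap every dialogue in the summarization instruction."""
    return [START_PROMPT + dialogue + END_PROMPT for dialogue in dialogues]


# Toy batch shaped like a batched `datasets` example (dict of lists).
batch = {
    "dialogue": [
        "Chris: This is his part of the conversation.\n"
        "Antje: This is her part of the conversation."
    ],
    "summary": ["Both Chris and Antje participated in the conversation."],
}

prompts = build_prompts(batch["dialogue"])
print(prompts[0])
```

The prompt text becomes `input_ids` and the matching entry in `batch["summary"]` becomes `labels` once both are tokenized.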

Now we preprocess the prompt-response dataset by tokenizing the text and extracting their input_ids, with one input_id assigned per token.

In [10]:
tokenized_datasets = dataset.map(tokenize_function, batched=True)
tokenized_datasets = tokenized_datasets.remove_columns(['id', 'topic', 'dialogue', 'summary',])


In [11]:
tokenized_datasets = tokenized_datasets.filter(lambda example, index: index % 100 == 0, with_indices=True)

                                                                      

 - NOTE: This is an example; we keep only every 100th example rather than using the entire dataset for PEFT/LoRA training.
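
To make the effect of the filter concrete, here is a small sketch of the same `index % 100 == 0` rule applied to plain indices. It assumes the training split has 12,460 rows; the helper name `kept_indices` is hypothetical.

```python
# Reproduce the subsampling rule `index % 100 == 0` on plain integer indices.
def kept_indices(num_rows: int, step: int = 100):
    """Return the row indices that the filter keeps."""
    return [i for i in range(num_rows) if i % step == 0]


train_rows = 12_460  # assumption: number of rows in the DialogSum training split
kept = kept_indices(train_rows)

print(f"kept {len(kept)} of {train_rows} rows")
print(f"first indices: {kept[:3]}, last index: {kept[-1]}")
```

With that assumed split size, the rule keeps 125 rows, which is consistent with the training shape reported for the subsampled dataset.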

In [12]:
print(dash_line)
print(f"Shapes of the datasets:")
print(f"Training: {tokenized_datasets['train'].shape}")
print(f"Validation: {tokenized_datasets['validation'].shape}")
print(f"Test: {tokenized_datasets['test'].shape}")
print(tokenized_datasets)
print(dash_line)

---------------------------------------------------------------------------------------------------
Shapes of the datasets:
Training: (125, 2)
Validation: (5, 2)
Test: (15, 2)
DatasetDict({
    train: Dataset({
        features: ['input_ids', 'labels'],
        num_rows: 125
    })
    test: Dataset({
        features: ['input_ids', 'labels'],
        num_rows: 15
    })
    validation: Dataset({
        features: ['input_ids', 'labels'],
        num_rows: 5
    })
})
---------------------------------------------------------------------------------------------------


As the output above shows, the subsampled dataset contains 125 training, 5 validation, and 15 test examples.

# <b>6 <span style='color:#78D118'>|</span> Perform Parameter Efficient Fine-Tuning (PEFT)</b>

Let's delve into the process of Parameter Efficient Fine-Tuning (PEFT), which offers a more efficient alternative to full fine-tuning. PEFT encompasses various techniques, including Low-Rank Adaptation (LoRA) and prompt tuning (distinct from prompt engineering).

In practice, PEFT is often implemented with LoRA, which is the technique used in this project.

LoRA enables fine-tuning of your model with significantly fewer computational resources, sometimes even just a single GPU. After fine-tuning for a specific task, use case, or tenant using LoRA, the original LLM remains unchanged, while a newly trained "LoRA adapter" emerges. This LoRA adapter is substantially smaller than the original LLM, often only a fraction of its size (megabytes rather than gigabytes).

However, during inference, the LoRA adapter needs to be reintegrated and combined with its original LLM to fulfill the inference request. The advantage lies in the fact that multiple LoRA adapters can reuse the same original LLM, reducing overall memory requirements when serving multiple tasks and use cases.
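
The idea of combining a frozen weight with a small adapter can be sketched in plain Python. This is a toy illustration of the low-rank update `W x + (alpha / r) * B (A x)`, not the `peft` implementation; the matrices, sizes, and helper names are all hypothetical.

```python
# Toy LoRA-style linear layer: frozen weight W plus a low-rank update B @ A.
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, r):
    """Compute W x + (alpha / r) * B (A x). W is frozen; only A and B would be trained."""
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]


d, r, alpha = 4, 1, 2
# Frozen base weight (identity matrix, chosen only to keep the numbers readable).
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
A = [[1.0, 0.0, 0.0, 0.0]]          # r x d adapter matrix
B = [[0.0] for _ in range(d)]       # d x r adapter matrix, zero at the start
x = [1.0, 2.0, 3.0, 4.0]

y_before = lora_forward(W, A, B, x, alpha, r)  # B is zero, so this equals W x

B[0] = [0.5]                                   # pretend training updated B
y_after = lora_forward(W, A, B, x, alpha, r)

print("before adapter update:", y_before)
print("after adapter update: ", y_after)
```

Because `W` is never modified, several `(A, B)` pairs for different tasks could share the same base weights, which is the memory advantage described above.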

## <b>6.1 <span style='color:#78D118'>|</span> PEFT/LoRA model for Fine-Tuning</b>

To configure the PEFT/LoRA model for fine-tuning with a new parameter adapter, we follow these steps:

1. **PEFT/LoRA Setup**: 
   - We are using PEFT/LoRA, which means we freeze the underlying LLM and train only the adapter.

2. **Adapter Configuration**:
   - In the LoRA configuration below, note the `rank (r)` hyper-parameter, which determines the rank (dimensionality) of the adapter that will be trained.

By employing PEFT/LoRA, we ensure that the core LLM remains unchanged while adapting a separate parameterized layer for our specific task or use case. The `rank (r)` hyper-parameter plays a critical role in determining the adapter's complexity and capacity for the target task.

In [13]:
lora_config = LoraConfig(
    r=32, # Rank
    lora_alpha=32,
    target_modules=["q", "v"],
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.SEQ_2_SEQ_LM # FLAN-T5
)

Add the LoRA adapter layers and parameters to the original LLM for training.

In [14]:
peft_model = get_peft_model(original_model, 
                            lora_config)
print(print_number_of_trainable_model_parameters(peft_model))

trainable model parameters: 3538944
all model parameters: 251116800
percentage of trainable model parameters: 1.41%
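
The trainable-parameter count above can be checked by hand. The sketch below assumes the usual FLAN-T5-base architecture (hidden size 768, 12 encoder layers, 12 decoder layers, square `q` and `v` projections); these values are assumptions about the checkpoint rather than something printed in this notebook.

```python
# Back-of-the-envelope check of the LoRA parameter count for r=32 on "q" and "v".
D_MODEL = 768             # assumption: hidden size of flan-t5-base
NUM_ENCODER_LAYERS = 12   # assumption
NUM_DECODER_LAYERS = 12   # assumption
R = 32                    # LoRA rank from lora_config
TARGET_MODULES = ["q", "v"]
BASE_PARAMS = 247_577_856  # parameter count printed for the original model

# Each encoder layer has one attention block (self-attention);
# each decoder layer has two (self-attention and cross-attention).
attention_blocks = NUM_ENCODER_LAYERS * 1 + NUM_DECODER_LAYERS * 2

# One adapter adds A (R x D_MODEL) and B (D_MODEL x R) to a projection.
params_per_adapter = R * D_MODEL + D_MODEL * R

lora_params = attention_blocks * len(TARGET_MODULES) * params_per_adapter
total_params = BASE_PARAMS + lora_params
trainable_pct = 100 * lora_params / total_params

print(f"LoRA parameters : {lora_params}")
print(f"all parameters  : {total_params}")
print(f"trainable share : {trainable_pct:.2f}%")
```

Under these assumptions the arithmetic reproduces the numbers reported by `print_number_of_trainable_model_parameters`.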


## <b>6.2 <span style='color:#78D118'>|</span> Train PEFT/LoRA Adapter</b>

In [15]:
output_dir = './peft-dialogue-summary-training'

peft_training_args = TrainingArguments(
    output_dir=output_dir,
    auto_find_batch_size=True,
    learning_rate=1e-3, # Higher learning rate than full fine-tuning.
    num_train_epochs=1,
    logging_steps=1,
    max_steps=1 # Overrides num_train_epochs: only a single optimization step is run.
)
    
peft_trainer = Trainer(
    model=peft_model,
    args=peft_training_args,
    train_dataset=tokenized_datasets["train"],
)

In [16]:
peft_trainer.train()

peft_model_path="./peft-dialogue-summary-checkpoint-local"

peft_trainer.model.save_pretrained(peft_model_path)
tokenizer.save_pretrained(peft_model_path)

  0%|          | 0/1 [00:00<?, ?it/s]

KeyboardInterrupt: 

In [29]:
peft_model_base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base", torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

peft_model = PeftModel.from_pretrained(peft_model_base, 
                                       peft_model_path, 
                                       torch_dtype=torch.bfloat16,
                                       is_trainable=False)

In [30]:
print(print_number_of_trainable_model_parameters(peft_model))

trainable model parameters: 0

all model parameters: 251116800

percentage of trainable model parameters: 0.00%


# <b>7 <span style='color:#78D118'>|</span> Evaluation</b>

## <b>7.1 <span style='color:#78D118'>|</span> Evaluate the Model Qualitatively (Human Evaluation)</b>

In [48]:
index = 200
dialogue = dataset['test'][index]['dialogue']
human_baseline_summary = dataset['test'][index]['summary']

prompt = f"""
Summarize the following conversation.

{dialogue}

Summary: """

input_ids = tokenizer(prompt, return_tensors="pt").input_ids

original_model_outputs = original_model.generate(input_ids=input_ids, generation_config=GenerationConfig(max_new_tokens=200, num_beams=1))
original_model_text_output = tokenizer.decode(original_model_outputs[0], skip_special_tokens=True)

peft_model_outputs = peft_model.generate(input_ids=input_ids, generation_config=GenerationConfig(max_new_tokens=200, num_beams=1))
peft_model_text_output = tokenizer.decode(peft_model_outputs[0], skip_special_tokens=True)

print(dash_line)
print(f'HUMAN SUMMARY:\n{human_baseline_summary}')
print(dash_line)
print(f'ORIGINAL MODEL:\n{original_model_text_output}')
print(dash_line)
print(f'PEFT MODEL: {peft_model_text_output}')
print(dash_line)

---------------------------------------------------------------------------------------------------

HUMAN SUMMARY:

#Person1# teaches #Person2# how to upgrade software and hardware in #Person2#'s system.

---------------------------------------------------------------------------------------------------

ORIGINAL MODEL:

#Person1#: I'm looking for a computer for painting. #Person2#: I'm not sure what I'd need. #Person1#: How about a computer for a painting program? #Person2#: I'm not sure. #Person2#: I'm not sure. #Person1#: I'd need a computer for painting. #Person2#: I'm not sure. #Person1#: I'm not sure. #Person2#: I'm not sure. #Person1#: I'm not sure. #Person2#: I'm not sure.

---------------------------------------------------------------------------------------------------

PEFT MODEL: #Person1# recommends adding a painting program to #Person2#'s software and upgrading hardware. #Person2# also wants to upgrade the hardware because it's outdated now.

---------------------------

In [52]:
dialogues = dataset['test'][0:10]['dialogue']
human_baseline_summaries = dataset['test'][0:10]['summary']

original_model_summaries = []
peft_model_summaries = []

for idx, dialogue in enumerate(dialogues):
    prompt = f"""
Summarize the following conversation.

{dialogue}

Summary: """
    
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids

    human_baseline_text_output = human_baseline_summaries[idx]
    
    original_model_outputs = original_model.generate(input_ids=input_ids, generation_config=GenerationConfig(max_new_tokens=200))
    original_model_text_output = tokenizer.decode(original_model_outputs[0], skip_special_tokens=True)

    peft_model_outputs = peft_model.generate(input_ids=input_ids, generation_config=GenerationConfig(max_new_tokens=200))
    peft_model_text_output = tokenizer.decode(peft_model_outputs[0], skip_special_tokens=True)

    original_model_summaries.append(original_model_text_output)
    peft_model_summaries.append(peft_model_text_output)

zipped_summaries = list(zip(human_baseline_summaries, original_model_summaries, peft_model_summaries))
 
df = pd.DataFrame(zipped_summaries, columns = ['human_baseline_summaries', 'original_model_summaries', 'peft_model_summaries'])
df

| | human_baseline_summaries | original_model_summaries | peft_model_summaries |
|---|---|---|---|
| 0 | Ms. Dawson helps #Person1# to write a memo to ... | #Person1#: Thank you for your help. #Person2#:... | #Person1# asks Ms. Dawson to take a dictation ... |
| 1 | In order to prevent employees from wasting tim... | #Person1#: This memo should be distributed to ... | #Person1# asks Ms. Dawson to take a dictation ... |
| 2 | Ms. Dawson takes a dictation for #Person1# abo... | This memo is to go out to all employees by thi... | #Person1# asks Ms. Dawson to take a dictation ... |
| 3 | #Person2# arrives late because of traffic jam.... | The traffic is terrible. | #Person2# got stuck in traffic and #Person1# s... |
| 4 | #Person2# decides to follow #Person1#'s sugges... | The traffic jam at Carrefour is a real problem. | #Person2# got stuck in traffic and #Person1# s... |
| 5 | #Person2# complains to #Person1# about the tra... | #Person1#: I'm finally here. #Person2#: I'm st... | #Person2# got stuck in traffic and #Person1# s... |
| 6 | #Person1# tells Kate that Masha and Hero get d... | #Person2#: They are having a separation for 2 ... | Kate tells #Person2# Masha and Hero are gettin... |
| 7 | #Person1# tells Kate that Masha and Hero are g... | Masha and Hero are getting divorced. | Kate tells #Person2# Masha and Hero are gettin... |
| 8 | #Person1# and Kate talk about the divorce betw... | Masha and Hero are having a separation for 2 m... | Kate tells #Person2# Masha and Hero are gettin... |
| 9 | #Person1# and Brian are at the birthday party ... | People are at the party. | Brian remembers his birthday and invites #Pers... |


## <b>7.2 <span style='color:#78D118'>|</span> Evaluate the Model Quantitatively (ROUGE Metric)</b>

The [ROUGE metric](https://en.wikipedia.org/wiki/ROUGE_(metric)) is a valuable tool for assessing the quality of summaries generated by models. It evaluates these summaries by comparing them to a "baseline" summary, typically crafted by a human. Although not flawless, the ROUGE metric provides insights into the improvement in the overall effectiveness of summarization achieved through fine-tuning.
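
To give a feel for what the metric measures, here is a simplified sketch of ROUGE-1 as a unigram-overlap F1 score. It is an illustration only: it uses lowercase whitespace tokenization and no stemming, so its values will not exactly match the `evaluate` implementation used below, and the function name `rouge1_f1` is hypothetical.

```python
# Simplified ROUGE-1: F1 score over overlapping unigrams.
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Return the unigram-overlap F1 between a prediction and a reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token counts at most as often as it appears in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
score = rouge1_f1(candidate, reference)
print(f"ROUGE-1 F1: {score:.4f}")
```

ROUGE-2 follows the same pattern with bigrams, and ROUGE-L replaces the overlap count with the length of the longest common subsequence.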

In [53]:
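# Assumption: `results` is never defined in this notebook, so we reuse the
# comparison DataFrame `df` built in the previous cell.
results = df
human_baseline_summaries = results['human_baseline_summaries'].values
original_model_summaries = results['original_model_summaries'].values
peft_model_summaries     = results['peft_model_summaries'].values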

original_model_results = rouge.compute(
    predictions=original_model_summaries,
    references=human_baseline_summaries[0:len(original_model_summaries)],
    use_aggregator=True,
    use_stemmer=True,
)

peft_model_results = rouge.compute(
    predictions=peft_model_summaries,
    references=human_baseline_summaries[0:len(peft_model_summaries)],
    use_aggregator=True,
    use_stemmer=True,
)

print(dash_line)
print('ORIGINAL MODEL:')
print(original_model_results)
print(dash_line)
print('PEFT MODEL:')
print(peft_model_results)
print(dash_line)

---------------------------------------------------------------------------------------------------

ORIGINAL MODEL:

{'rouge1': 0.2334158581572823, 'rouge2': 0.07603964187010573, 'rougeL': 0.20145520923859048, 'rougeLsum': 0.20145899339006135}

---------------------------------------------------------------------------------------------------

PEFT MODEL:

{'rouge1': 0.40810631575616746, 'rouge2': 0.1633255794568712, 'rougeL': 0.32507074586565354, 'rougeLsum': 0.3248950182867091}

---------------------------------------------------------------------------------------------------


In [54]:
print("Absolute percentage improvement of PEFT MODEL over ORIGINAL MODEL")

improvement = (np.array(list(peft_model_results.values())) - np.array(list(original_model_results.values())))
for key, value in zip(peft_model_results.keys(), improvement):
    print(f'{key}: {value*100:.2f}%')

Absolute percentage improvement of PEFT MODEL over ORIGINAL MODEL

rouge1: 17.47%

rouge2: 8.73%

rougeL: 12.36%

rougeLsum: 12.34%


## References

The creation of this document was greatly influenced by the following key sources of information:

1. [DialogSum Dataset](https://huggingface.co/datasets/knkarthick/dialogsum) - A large-scale dialogue summarization dataset consisting of 13,460 dialogues (plus 100 holdout dialogues for topic generation) with corresponding manually labeled summaries and topics.
2. [Generative AI with Large Language Models | Coursera](https://www.coursera.org/learn/generative-ai-with-llms) - A course that provides in-depth explanations and examples on various LLMs.