### Agenda:

In this notebook I compare how well the pre-trained model and the fine-tuned model summarize conversations. The comparison looks at coherence and factual accuracy.

## Imports

In [1]:
pip install -q datasets evaluate rouge_score py7zr accelerate peft bitsandbytes "transformers[torch]" trl

Note: you may need to restart the kernel to use updated packages.


In [2]:
pip install --upgrade peft


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m24.1.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49m/Users/divyahegde/anaconda3/bin/python -m pip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


In [3]:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from datasets import load_dataset, Dataset
import torch
from peft import PeftModel, PeftConfig


### Pre-trained BART model

In [4]:
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")



Some weights of BartForConditionalGeneration were not initialized from the model checkpoint at facebook/bart-large-cnn and are newly initialized: ['model.encoder.embed_tokens.weight', 'model.shared.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [12]:
# Generic function to generate a summary so it can be compared with the reference summary from the dataset
def generate_summary(dialogue, llm):
    """Prepare prompt  -->  tokenize  -->  generate output using LLM  -->  detokenize output"""

    # Note: the indentation inside this f-string is part of the prompt sent to the model.
    input_prompt = f"""
                    Summarize the following conversation.

                    {dialogue}

                    Summary:
                    """

    # Uses the global `tokenizer` loaded above.
    inputs = tokenizer(input_prompt, return_tensors='pt')
    tokenized_output = llm.generate(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        min_length=30,
        max_length=200,
    )
    output = tokenizer.decode(tokenized_output[0], skip_special_tokens=True)

    return output
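
The f-string prompt in `generate_summary` carries its source-code indentation into the text the model sees. A minimal sketch of building the same wording without that indentation is shown below. `build_prompt` is a hypothetical helper, not part of the original notebook, and switching to it changes the model input, so the summaries recorded in this notebook may not be reproduced exactly.

```python
import textwrap


def build_prompt(dialogue: str) -> str:
    """Hypothetical helper: same wording as the notebook prompt, without indentation."""
    # Dedent the template first, then insert the dialogue, so that the
    # (unindented) dialogue lines do not interfere with the dedent step.
    template = textwrap.dedent("""\
        Summarize the following conversation.

        {dialogue}

        Summary:
        """)
    return template.format(dialogue=dialogue)


if __name__ == "__main__":
    print(build_prompt("Olivia: Me too!!\nOliver: Great"))
```

The result could be passed to the tokenizer in place of `input_prompt`.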

In [9]:
from peft import get_peft_model, PeftConfig, PeftModelForSeq2SeqLM, AutoPeftModel, AutoPeftModelForSeq2SeqLM


In [10]:
loaded_peft_model = AutoPeftModelForSeq2SeqLM.from_pretrained('divyahegde07/mode_tuned_peft')

Some weights of BartForConditionalGeneration were not initialized from the model checkpoint at facebook/bart-large-cnn and are newly initialized: ['model.encoder.embed_tokens.weight', 'model.shared.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Load the test and train splits of the dataset to pick some examples

In [6]:
test_data = load_dataset("samsum",split='test')
train_data = load_dataset("samsum",split='train')

In [8]:
sample = test_data[0]['dialogue']
label = test_data[0]['summary']

output = generate_summary(sample, llm=model)

print("Sample")
print(sample)
print("-------------------")
print("Summary:")
print(output)
print("Ground Truth Summary:")
print(label)
     

Sample
Hannah: Hey, do you have Betty's number?
Amanda: Lemme check
Hannah: <file_gif>
Amanda: Sorry, can't find it.
Amanda: Ask Larry
Amanda: He called her last time we were at the park together
Hannah: I don't know him well
Hannah: <file_gif>
Amanda: Don't be shy, he's very nice
Hannah: If you say so..
Hannah: I'd rather you texted him
Amanda: Just text him 🙂
Hannah: Urgh.. Alright
Hannah: Bye
Amanda: Bye bye
-------------------
Summary:
Hannah asks Amanda for Betty's number. Amanda tries to find the number but can't find it. She asks Hannah to text Larry. Hannah says she'd rather text him.
Ground Truth Summary:
Hannah needs Betty's number but Amanda doesn't have it. She needs to contact Larry.


In [13]:
sample = test_data[0]['dialogue']
label = test_data[0]['summary']

output = generate_summary(sample, loaded_peft_model)

print("Sample")
print(sample)
print("-------------------")
print("Summary:")
print(output)
print("Ground Truth Summary:")
print(label)
     

Sample
Hannah: Hey, do you have Betty's number?
Amanda: Lemme check
Hannah: <file_gif>
Amanda: Sorry, can't find it.
Amanda: Ask Larry
Amanda: He called her last time we were at the park together
Hannah: I don't know him well
Hannah: <file_gif>
Amanda: Don't be shy, he's very nice
Hannah: If you say so..
Hannah: I'd rather you texted him
Amanda: Just text him 🙂
Hannah: Urgh.. Alright
Hannah: Bye
Amanda: Bye bye
-------------------
Summary:
Amanda can't find Betty's number. She asks Hannah to text Larry, who called Betty the last time they were at the park together.
Ground Truth Summary:
Hannah needs Betty's number but Amanda doesn't have it. She needs to contact Larry.
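
The cells in this notebook repeat the same printing code for each model. A small helper could print the outputs of several models for one dialogue side by side. This is only a sketch: `compare_summaries` and the stand-in summarizers are hypothetical names, and in the notebook the callables would wrap `generate_summary` instead.

```python
from typing import Callable, Dict


def compare_summaries(
    dialogue: str,
    reference: str,
    summarizers: Dict[str, Callable[[str], str]],
) -> Dict[str, str]:
    """Print the dialogue, one summary per summarizer, and the reference summary."""
    print("Sample")
    print(dialogue)
    print("-------------------")
    results = {}
    for name, summarize in summarizers.items():
        results[name] = summarize(dialogue)
        print(f"Summary ({name}):")
        print(results[name])
    print("Ground Truth Summary:")
    print(reference)
    return results


if __name__ == "__main__":
    # Stand-in summarizers so the sketch runs without loading any model.
    # In the notebook this might look like (assumption):
    #   {"pre-trained": lambda d: generate_summary(d, model),
    #    "fine-tuned": lambda d: generate_summary(d, loaded_peft_model)}
    compare_summaries(
        "Hannah: Bye\nAmanda: Bye bye",
        "Hannah and Amanda say goodbye.",
        {"first line": lambda d: d.splitlines()[0], "upper": str.upper},
    )
```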


In [7]:
sample = train_data[1]['dialogue']
label = train_data[1]['summary']

output = generate_summary(sample, model)

print("Sample")
print(sample)
print("-------------------")
print("Summary:")
print(output)
print("Ground Truth Summary:")
print(label)

Sample
Olivia: Who are you voting for in this election? 
Oliver: Liberals as always.
Olivia: Me too!!
Oliver: Great
-------------------
Summary:
Olivia and Oliver are voting in the upcoming election. Oliver is voting for the Liberal Party. Olivia wants to vote for the Republican Party. The two discuss who they will vote for.
Ground Truth Summary:
Olivia and Olivier are voting for liberals in this election. 


In [11]:
sample = train_data[1]['dialogue']
label = train_data[1]['summary']

output = generate_summary(sample, loaded_peft_model)

print("Sample")
print(sample)
print("-------------------")
print("Summary:")
print(output)
print("Ground Truth Summary:")
print(label)

Sample
Olivia: Who are you voting for in this election? 
Oliver: Liberals as always.
Olivia: Me too!!
Oliver: Great
-------------------
Summary:
Oliver is voting for Liberals as always. Olivia will vote for the Liberal Party. Oliver and Olivia are voting for the Liberals. Oliver will vote Liberal.
Ground Truth Summary:
Olivia and Olivier are voting for liberals in this election. 
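
Besides reading the outputs, a rough word-overlap score against the ground truth summary can be computed. The sketch below is a simplified, pure-Python unigram F1 in the spirit of ROUGE-1; it is not the `rouge_score` or `evaluate` implementation (no stemming, simplistic tokenization), and `unigram_f1` is a hypothetical name. Overlap scores of this kind do not check facts, so they complement reading the summaries rather than replace it.

```python
import re
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1-style F1 based on lowercase word counts."""
    cand = Counter(re.findall(r"[a-z0-9']+", candidate.lower()))
    ref = Counter(re.findall(r"[a-z0-9']+", reference.lower()))
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "Olivia and Olivier are voting for liberals in this election."
    fine_tuned = (
        "Oliver is voting for Liberals as always. Olivia will vote for the "
        "Liberal Party. Oliver and Olivia are voting for the Liberals. "
        "Oliver will vote Liberal."
    )
    print(round(unigram_f1(fine_tuned, reference), 3))
```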


For this example, the pre-trained BART model, which was trained on the CNN/DailyMail news corpus, hallucinates: it states that Olivia wants to vote for the Republican Party, which appears nowhere in the conversation. The fine-tuned model stays faithful to the dialogue and summarizes it much better, although its output is repetitive.
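
One crude way to spot this kind of invented information is to list capitalized words in a summary that have no counterpart in the dialogue. The sketch below is a heuristic only, not a real factuality metric: `unsupported_capitalized_words` is a hypothetical helper, it skips the first word of each sentence, and it treats a summary word as supported when some dialogue word starts with it (so "Liberal" matches "Liberals"). It still produces noise, for example by flagging "Party" for both models.

```python
import re


def unsupported_capitalized_words(dialogue: str, summary: str) -> set:
    """Return mid-sentence capitalized summary words with no match in the dialogue."""
    dialogue_words = {w.lower() for w in re.findall(r"[A-Za-z]+", dialogue)}
    flagged = set()
    # Split the summary into sentences after '.', '!' or '?'.
    for sentence in re.split(r"(?<=[.!?])\s+", summary.strip()):
        words = re.findall(r"[A-Za-z]+", sentence)
        # Skip the sentence-initial word: it is capitalized regardless of meaning.
        for word in words[1:]:
            if not word[0].isupper():
                continue
            stem = word.lower()
            if not any(d.startswith(stem) for d in dialogue_words):
                flagged.add(word)
    return flagged


if __name__ == "__main__":
    dialogue = (
        "Olivia: Who are you voting for in this election? \n"
        "Oliver: Liberals as always.\n"
        "Olivia: Me too!!\n"
        "Oliver: Great"
    )
    pretrained = (
        "Olivia and Oliver are voting in the upcoming election. Oliver is "
        "voting for the Liberal Party. Olivia wants to vote for the "
        "Republican Party. The two discuss who they will vote for."
    )
    print(sorted(unsupported_capitalized_words(dialogue, pretrained)))
```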