# SAMSum summarization Mistral 7B LoRA


In [1]:
!pip install -q rouge_score
!pip install -q datasets
!pip install -q transformers
!pip install -q evaluate
# !pip install accelerate
# !pip install -i https://pypi.org/simple/ bitsandbytes
# !pip install peft
# !pip install trl
!pip install -q tqdm
!pip install -q pandas
!pip install -q huggingface_hub
# !pip install dataclasses
# !pip install typing

In [2]:
import torch
import transformers
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModel, AutoModelForSeq2SeqLM, AutoModelForCausalLM, TrainingArguments, Trainer, pipeline, BitsAndBytesConfig, DataCollatorForLanguageModeling, GenerationConfig
# from peft import LoraConfig, get_peft_model, TaskType, PeftModel, PeftConfig, prepare_model_for_kbit_training, AutoPeftModelForCausalLM
# from trl import SFTTrainer
import evaluate

import pandas as pd

import time
from tqdm import tqdm
# from dataclasses import dataclass, field
# from typing import Optional, Sequence, Dict

## 3. Evaluation

In [3]:
data = load_dataset("samsum", split="test")

In [4]:
data

Dataset({
    features: ['id', 'dialogue', 'summary'],
    num_rows: 819
})

In [5]:
print(data[0]["dialogue"])

Hannah: Hey, do you have Betty's number?
Amanda: Lemme check
Hannah: <file_gif>
Amanda: Sorry, can't find it.
Amanda: Ask Larry
Amanda: He called her last time we were at the park together
Hannah: I don't know him well
Hannah: <file_gif>
Amanda: Don't be shy, he's very nice
Hannah: If you say so..
Hannah: I'd rather you texted him
Amanda: Just text him 🙂
Hannah: Urgh.. Alright
Hannah: Bye
Amanda: Bye bye


In [6]:
print(data[0]["summary"])

Hannah needs Betty's number but Amanda doesn't have it. She needs to contact Larry.


In [8]:
def preprocess_data(example):
  dialogue = example["dialogue"]
  prompt = f"""<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
{dialogue}[/INST]
</s>"""
  return {"dialogue": prompt}


In [9]:
data_tokenized = data.map(preprocess_data, batched=False, remove_columns=["id"])

In [10]:
data_tokenized

Dataset({
    features: ['dialogue', 'summary'],
    num_rows: 819
})

In [11]:
print(data_tokenized[0]["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Hannah: Hey, do you have Betty's number?
Amanda: Lemme check
Hannah: <file_gif>
Amanda: Sorry, can't find it.
Amanda: Ask Larry
Amanda: He called her last time we were at the park together
Hannah: I don't know him well
Hannah: <file_gif>
Amanda: Don't be shy, he's very nice
Hannah: If you say so..
Hannah: I'd rather you texted him
Amanda: Just text him 🙂
Hannah: Urgh.. Alright
Hannah: Bye
Amanda: Bye bye[/INST]
</s>


In [12]:
print(data_tokenized[0]["summary"])

Hannah needs Betty's number but Amanda doesn't have it. She needs to contact Larry.


In [13]:
tokenizer = AutoTokenizer.from_pretrained("msznajder/Mistral-7B-Instruct-v0.2-Samsum-DialSum-SFTT")
model = AutoModelForCausalLM.from_pretrained("msznajder/Mistral-7B-Instruct-v0.2-Samsum-DialSum-SFTT")
model.generation_config.pad_token_id = model.generation_config.eos_token_id

Loading checkpoint shards:   0%|          | 0/6 [00:00<?, ?it/s]

In [13]:
raw_tokenizer = AutoTokenizer.from_pretrained('mistralai/Mistral-7B-Instruct-v0.2')
raw_model = AutoModelForCausalLM.from_pretrained('mistralai/Mistral-7B-Instruct-v0.2')
raw_model.generation_config.pad_token_id = raw_model.generation_config.eos_token_id
raw_tokenizer.pad_token = raw_tokenizer.unk_token

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

### ROUGE metric evaluation

In [14]:
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

In [15]:
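def summarize(tokenizer, model, dialogue):
    # Tokenize the prompt and move the tensors to the target device.
    inputs = tokenizer(dialogue, return_tensors="pt").to(DEVICE)
    inputs_length = len(inputs["input_ids"][0])
    with torch.inference_mode():
        # A near-zero temperature makes sampling behave almost like greedy decoding.
        outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.0001)
    # The output contains the prompt followed by the continuation; decode only the new tokens.
    return tokenizer.decode(outputs[0][inputs_length:], skip_special_tokens=True)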
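
The slice `outputs[0][inputs_length:]` relies on a decoder-only model returning the prompt tokens followed by the continuation. Here is a minimal sketch of that step, with plain Python lists standing in for tensors. The token IDs are made up for illustration and are not taken from the Mistral vocabulary.

```python
# Hypothetical token IDs, used only to illustrate the slicing step.
prompt_ids = [1, 733, 16289, 28793]         # stands in for inputs["input_ids"][0]
continuation_ids = [415, 14060, 2]          # stands in for the newly generated tokens

# For a decoder-only model, the generated sequence starts with the prompt itself.
output_ids = prompt_ids + continuation_ids  # stands in for outputs[0]

inputs_length = len(prompt_ids)
new_token_ids = output_ids[inputs_length:]  # drop the echoed prompt

print(new_token_ids)
```

Without the slice, the decoded text would repeat the whole dialogue before the summary, which would distort the ROUGE scores.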

In [47]:
# A plain for loop frees memory after each iteration and avoids the out-of-memory error that .map causes
model.to(DEVICE)  # move the model to the device once, not on every iteration
finetuned_generated_summaries = []
for row in tqdm(data_tokenized["dialogue"]):
  finetuned_generated_summary = summarize(tokenizer, model, row).strip()
  finetuned_generated_summaries.append(finetuned_generated_summary)

In [17]:
data_tokenized = data_tokenized.add_column("finetuned_generated_summary", finetuned_generated_summaries)

In [48]:
# A plain for loop frees memory after each iteration and avoids the out-of-memory error that .map causes
raw_model.to(DEVICE)  # move the model to the device once, not on every iteration
raw_generated_summaries = []
for row in tqdm(data_tokenized["dialogue"]):
  raw_generated_summary = summarize(raw_tokenizer, raw_model, row).strip()
  raw_generated_summaries.append(raw_generated_summary)

In [17]:
data_tokenized = data_tokenized.add_column("raw_generated_summary", raw_generated_summaries)

In [18]:
print(data_tokenized[10]["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Wanda: Let's make a party!
Gina: Why?
Wanda: beacuse. I want some fun!
Gina: ok, what do u need?
Wanda: 1st I need too make a list
Gina: noted and then?
Wanda: well, could u take yours father car and go do groceries with me?
Gina: don't know if he'll agree
Wanda: I know, but u can ask :)
Gina: I'll try but theres no promisess
Wanda: I know, u r the best!
Gina: When u wanna go
Wanda: Friday?
Gina: ok, I'll ask[/INST]
</s>


In [19]:
print(data_tokenized[10]["summary"])

Wanda wants to throw a party. She asks Gina to borrow her father's car and go do groceries together. They set the date for Friday. 


In [20]:
print(data_tokenized[10]["finetuned_generated_summary"])

Wanda wants to make a party. Gina will ask her father if he can go grocery shopping with Wanda. They want to meet on Friday.


In [18]:
print(data_tokenized[10]["raw_generated_summary"])

Wanda suggested making a party and asked Gina for help. Gina inquired about the reason and Wanda explained that she just wanted some fun. Wanda then requested that they make a list of things needed for the party. Gina agreed and Wanda asked if Gina could borrow her father's car to go grocery shopping together. Gina expressed uncertainty about getting permission, but Wanda encouraged her to ask. Gina agreed to try and mentioned that there was no promise it would be granted. Wanda expressed confidence in Gina and asked if they could go shopping on Friday. Gina agreed to ask about the car and plan accordingly for the party.


In [19]:
rouge = evaluate.load('rouge')

In [23]:
# ROUGE for the fine-tuned model against the reference summaries
finetuned_rouge = rouge.compute(
    predictions=data_tokenized["finetuned_generated_summary"],
    references=data_tokenized["summary"],
    use_aggregator=True,
    use_stemmer=True,
)
finetuned_rouge

{'rouge1': 0.5157651537666942,
 'rouge2': 0.2650530320155057,
 'rougeL': 0.4295565331965113,
 'rougeLsum': 0.4294928872615915}

In [20]:
# ROUGE for the base (not fine-tuned) model against the reference summaries
raw_rouge = rouge.compute(
    predictions=data_tokenized["raw_generated_summary"],
    references=data_tokenized["summary"],
    use_aggregator=True,
    use_stemmer=True,
)
raw_rouge

{'rouge1': 0.2916679198398079,
 'rouge2': 0.09951049848424319,
 'rougeL': 0.21840199659647197,
 'rougeLsum': 0.22900310959298054}
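
To make the gap between the two models easier to read, the two result dictionaries can be compared metric by metric. The sketch below copies the scores printed above. The names `finetuned_scores`, `raw_scores`, and `rouge_gains` are introduced here for illustration only.

```python
# ROUGE scores copied from the two outputs above.
finetuned_scores = {
    "rouge1": 0.5157651537666942,
    "rouge2": 0.2650530320155057,
    "rougeL": 0.4295565331965113,
    "rougeLsum": 0.4294928872615915,
}
raw_scores = {
    "rouge1": 0.2916679198398079,
    "rouge2": 0.09951049848424319,
    "rougeL": 0.21840199659647197,
    "rougeLsum": 0.22900310959298054,
}

def rouge_gains(finetuned, raw):
    """Return the absolute and relative improvement of the fine-tuned model per metric."""
    gains = {}
    for metric, raw_value in raw.items():
        absolute = finetuned[metric] - raw_value
        gains[metric] = {"absolute": absolute, "relative": absolute / raw_value}
    return gains

gains = rouge_gains(finetuned_scores, raw_scores)
for metric, gain in gains.items():
    print(f"{metric}: +{gain['absolute']:.4f} absolute, +{gain['relative']:.1%} relative")
```

On these numbers the fine-tuned model scores higher on all four metrics.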

### Human evaluation

In [21]:
example = data_tokenized[0]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Hannah: Hey, do you have Betty's number?
Amanda: Lemme check
Hannah: <file_gif>
Amanda: Sorry, can't find it.
Amanda: Ask Larry
Amanda: He called her last time we were at the park together
Hannah: I don't know him well
Hannah: <file_gif>
Amanda: Don't be shy, he's very nice
Hannah: If you say so..
Hannah: I'd rather you texted him
Amanda: Just text him 🙂
Hannah: Urgh.. Alright
Hannah: Bye
Amanda: Bye bye[/INST]
</s>


In [25]:
print(example["summary"])

Hannah needs Betty's number but Amanda doesn't have it. She needs to contact Larry.


In [26]:
print(example["finetuned_generated_summary"])

Amanda doesn't have Betty's number. Hannah will text Larry.


In [22]:
print(example["raw_generated_summary"])

: Hannah asked if you had Betty's number, but you couldn't find it. You suggested she ask Larry instead, as he had spoken to Betty recently. Hannah expressed hesitance due to not knowing Larry well, but you reassured her that he was nice. Eventually, Hannah agreed to text him to get Betty's number.


In [23]:
example = data_tokenized[1]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Eric: MACHINE!
Rob: That's so gr8!
Eric: I know! And shows how Americans see Russian ;)
Rob: And it's really funny!
Eric: I know! I especially like the train part!
Rob: Hahaha! No one talks to the machine like that!
Eric: Is this his only stand-up?
Rob: Idk. I'll check.
Eric: Sure.
Rob: Turns out no! There are some of his stand-ups on youtube.
Eric: Gr8! I'll watch them now!
Rob: Me too!
Eric: MACHINE!
Rob: MACHINE!
Eric: TTYL?
Rob: Sure :)[/INST]
</s>


In [28]:
print(example["summary"])

Eric and Rob are going to watch a stand-up on youtube.


In [29]:
print(example["finetuned_generated_summary"])

Eric and Rob are watching Russian stand-up comedy. Eric likes the train part. Rob will check if there are more stand-ups of the same author on youtube.


In [24]:
print(example["raw_generated_summary"])

: Eric expressed excitement about a stand-up comedy performance by a machine, with Rob sharing in the enthusiasm. Eric mentioned the American perspective and found the train part particularly amusing. Rob revealed that there are more stand-up performances by the machine available on YouTube, and both Eric and Rob decided to watch them together. They ended the conversation with a playful "MACHINE!" and "TTYL" (talk to you later).


In [25]:
example = data_tokenized[2]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Lenny: Babe, can you help me with something?
Bob: Sure, what's up?
Lenny: Which one should I pick?
Bob: Send me photos
Lenny:  <file_photo>
Lenny:  <file_photo>
Lenny:  <file_photo>
Bob: I like the first ones best
Lenny: But I already have purple trousers. Does it make sense to have two pairs?
Bob: I have four black pairs :D :D
Lenny: yeah, but shouldn't I pick a different color?
Bob: what matters is what you'll give you the most outfit options
Lenny: So I guess I'll buy the first or the third pair then
Bob: Pick the best quality then
Lenny: ur right, thx
Bob: no prob :)[/INST]
</s>


In [31]:
print(example["summary"])

Lenny can't decide which trousers to buy. Bob advised Lenny on that topic. Lenny goes with Bob's advice to pick the trousers that are of best quality.


In [32]:
print(example["finetuned_generated_summary"])

Lenny is picking out some trousers. Bob advises him to pick the best quality ones. Lenny will either pick the first or the third pair.


In [26]:
print(example["raw_generated_summary"])

Lenny asked Bob for help in choosing between three pairs of trousers he was considering purchasing. Bob asked Lenny to send him photos of the options. Lenny sent three files, and Bob preferred the first pair. Lenny expressed concern about already having a purple pair and questioned the need for another. Bob shared that he had multiple black pairs, but suggested Lenny should consider the color that would give him the most outfit options. Lenny decided to either buy the first or the third pair, and Bob advised him to prioritize the best quality. Lenny thanked Bob for his advice.


In [27]:
example = data_tokenized[10]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Wanda: Let's make a party!
Gina: Why?
Wanda: beacuse. I want some fun!
Gina: ok, what do u need?
Wanda: 1st I need too make a list
Gina: noted and then?
Wanda: well, could u take yours father car and go do groceries with me?
Gina: don't know if he'll agree
Wanda: I know, but u can ask :)
Gina: I'll try but theres no promisess
Wanda: I know, u r the best!
Gina: When u wanna go
Wanda: Friday?
Gina: ok, I'll ask[/INST]
</s>


In [34]:
print(example["summary"])

Wanda wants to throw a party. She asks Gina to borrow her father's car and go do groceries together. They set the date for Friday. 


In [35]:
print(example["finetuned_generated_summary"])

Wanda wants to make a party. Gina will ask her father if he can go grocery shopping with Wanda. They want to meet on Friday.


In [28]:
print(example["raw_generated_summary"])

Wanda suggested making a party and asked Gina for help. Gina inquired about the reason and Wanda explained that she just wanted some fun. Wanda then requested that they make a list of things needed for the party. Gina agreed and Wanda asked if Gina could borrow her father's car to go grocery shopping together. Gina expressed uncertainty about getting permission, but Wanda encouraged her to ask. Gina agreed to try and mentioned that there was no promise it would be granted. Wanda expressed confidence in Gina and asked if they could go shopping on Friday. Gina agreed to ask about the car and plan accordingly for the party.


In [29]:
example = data_tokenized[20]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Deirdre: Hi Beth, how are you love?
Beth: Hi Auntie Deirdre, I'm been meaning to message you, had a favour to ask.
Deirdre: Wondered if you had any thought about your Mum's 40th, we've got to do something special!
Beth: How about a girls weekend, just mum, me, you and the girls, Kira will have to come back from Uni, of course.
Deirdre: Sounds fab! Get your thinking cap on, it's only in 6 weeks! Bet she's dreading it, I remember doing that!
Beth: Oh yeah, we had a surprise party for you, you nearly had a heart attack! 
Deirdre: Well, it was a lovely surprise! Gosh, thats nearly 4 years ago now, time flies! What was the favour, darling?
Beth: Oh, it was just that I fancied trying a bit of work experience in the salon, auntie.
Deirdre: Well, I am looking for Saturday girls, are you sure about it? you could do well in the exams and go on to college or 6th form.
Beth: I know, but it's 

In [37]:
print(example["summary"])

Beth wants to organize a girls weekend to celebrate her mother's 40th birthday. She also wants to work at Deidre's beauty salon. Deidre offers her a few hours on Saturdays as work experience. They set up for a meeting tomorrow.


In [38]:
print(example["finetuned_generated_summary"])

Beth is planning a girls weekend for her mother's 40th birthday. Beth wants to try a beauty therapy at the salon where Deirdre works. Beth and Deirdre will meet tomorrow to discuss it.


In [30]:
print(example["raw_generated_summary"])

: Deirdre suggested a special plan for Beth's mother's 40th birthday, proposing a girls' weekend with Beth, her mother, Deirdre, and Kira. Beth agreed and mentioned she had a favor to ask. Deirdre recalled their own surprise party and reminisced about the past. Beth revealed her desire to try work experience at a salon, and Deirdre offered to help her explore opportunities there, suggesting meeting the beauty therapy manager, Maxine. They discussed a trial period with compensation for expenses and potential future employment. Beth expressed excitement about the opportunity and agreed to meet Maxine the following day. Deirdre expressed her support and love.


In [31]:
example = data_tokenized[30]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Mary: hey, im kinda broke, lend me a few box
Carter: okay, give me an hour, im at the train station
Mary: cool, thanks[/INST]
</s>


In [40]:
print(example["summary"])

Mary ran out of money. Carter is going to lend her some in an hour.


In [41]:
print(example["finetuned_generated_summary"])

Carter will lend some money to Mary.


In [32]:
print(example["raw_generated_summary"])

: Mary asked Carter for a favor, requesting that he lend her some boxes. Carter agreed and mentioned that it would take him an hour to get to her, as he was currently at the train station. Mary expressed her gratitude for Carter's help.


In [33]:
example = data_tokenized[40]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Sebastian: It's been already a year since we moved here.
Sebastian: This is totally the best time of my life.
Kevin: Really? 
Sebastian: Yeah! Totally maaan.
Sebastian: During this 1 year I learned more than ever. 
Sebastian: I learned how to be resourceful, I'm learning responsibility, and I literally have the power to make my dreams come true.
Kevin: It's great to hear that.
Kevin: It's great that you are satisfied with your decisions.
Kevin: And above all it's great to see that you have someone you love by your side :)
Sebastian: Exactly!
Sebastian: That's another part of my life that is going great.
Kevin: I wish I had such a person by my side.
Sebastian: Don't worry about it.
Sebastian: I have a feeling this day will come shortly.
Kevin: Haha. I don' think so, but thanks.
Sebastian: This one year proved to me that when you want something really badly, you can achieve it

In [43]:
print(example["summary"])

Sebastian is very happy with his life, and shares this happiness with Kevin.


In [44]:
print(example["finetuned_generated_summary"])

Sebastian moved to a new place a year ago. He is satisfied with his life. Kevin wishes he could win a lottery.


In [34]:
print(example["raw_generated_summary"])

Sabastian expresses his satisfaction with his life decisions and the past year since moving, sharing that he's learned valuable skills like resourcefulness and responsibility. He also mentions the importance of having someone he loves by his side. Kevin expresses his happiness for Sabastian and shares his own dream of winning the lottery, to which Sabastian suggests dedicating time and effort to achieving it. The conversation ends with Sabastian expressing gratitude for Kevin's support.


In [35]:
example = data_tokenized[100]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Chris: Hi there! Where are you? Any chance of skyping?
Rick: Hi! Our last two days in Cancun before flying to Havana. Yeah, skyping is an idea. When would it suit you?
Rick: We don't have the best of connections in the room but I can get you pretty well in the lobby.
Chris: What's the time in your place now?
Rick: 6:45 pm
Chris: It's a quarter to one in the morning here. Am still in front of the box.
Rick: Gracious me! Sorry mate. You needn't have answered.
Chris: 8-D
Rick: Just tell me when we could skype.
Chris: Preferably in the evening. Just a few hours earlier than now. And not tomorrow.
Rick: Shute! Only tomorrow makes sense as there's no workable internet in Cuba.
Chris: Could you make it like 3 pm your time?
Rick: Sure.
Chris: Perfect. So talk to you tomorrow.
Chris: Give my love to Helen please.
Rick: I will. Thx.[/INST]
</s>


In [46]:
print(example["summary"])

Rick and Helen are in Cancun. They're flying to Havana in two days. Chris and Rick will talk on Skype at 3 PM in Mexico.


In [47]:
print(example["finetuned_generated_summary"])

Rick and Chris will skype tomorrow at 3 pm local time.


In [36]:
print(example["raw_generated_summary"])

: Hi Chris, I'm currently in Cancun and will be flying to Havana soon. I'd be happy to Skype with you, but our internet connection in the room isn't great. You can reach me better in the lobby. What's the current time with you?

Chris: It's almost 1:00 AM here.

Rick: I see, that's quite a time difference. I'm sorry for the inconvenience. Let me know when would be a good time for you to Skype, and I'll make it work.

Chris: Preferably in the evening, a few hours earlier than now, and not tomorrow as there's no reliable internet in Cuba.

Rick: Understood. How about 3 PM your time tomorrow?

Chris: That works for me. I'll talk to you then. Please give my best to Helen.

Rick: I will. Good night, Chris.


In [37]:
example = data_tokenized[200]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Abdellilah: Where are you?
Sam: work
Abdellilah: What time you finish?
Sam: Not til 5
Abdellilah: Are your bringing him over tonight:
Sam: No in the morning:
Abdellilah: ok, what time?
Sam: About 9. Is that ok?
Abdellilah: ok - see you then[/INST]
</s>


In [49]:
print(example["summary"])

Sam won't finish work till 5. Sam is bringing him over about 9 am. Sam will see Abdellilah in the morning. 


In [50]:
print(example["finetuned_generated_summary"])

Sam is working. He finishes not before 5. Sam is bringing him over in the morning. Sam and Abdellilah will see each other at about 9.


In [38]:
print(example["raw_generated_summary"])

Abdellilah: What is your current location?\
Sam: I'm at work.\
Abdellilah: What time will you finish work?\
Sam: Not until 5 p.m.\
Abdellilah: Will you be bringing him over tonight?\
Sam: No, it's in the morning instead.\
Abdellilah: What time should I expect you in the morning?\
Sam: Around 9 a.m. Is that convenient for you?\
Abdellilah: Yes, I'll see you then.


In [39]:
example = data_tokenized[300]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Jerry: Hi sweetie :)
Janet: Hi sugar ;)
Jerry: I'm coming home
Janet: Can't wait ;)
Jerry: I should be there in 40 minutes
Janet: Ok, I'm waiting for you :)
Jerry: How was your day?
Janet: Oh, it was ok but my boss is a pain in the ass sometimes
Jerry: I know, she can be a bitch :P
Janet: Yes she can! ;)
Jerry: See you later darling
Janet: <3[/INST]
</s>


In [52]:
print(example["summary"])

Jerry will be home in 40 minutes. 


In [53]:
print(example["finetuned_generated_summary"])

Jerry will be home in 40 minutes. Janet had a hard day at work.


In [40]:
print(example["raw_generated_summary"])

: Jerry and Janet exchange greetings and express excitement for Jerry's imminent arrival home. Jerry asks about Janet's day, which she describes as okay but with a difficult boss. Jerry empathizes and they both share a light-hearted comment about the boss before signing off with endearments.


In [41]:
example = data_tokenized[600]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Daniel: I'm packing for the trip, are you done with it?
Michael: Mostly done, don't forget your passport!
Matt: is an ID not enough?
Michael: sure not!
Brian: why?
Michael: Bosnia and Herzegovina is not in the EU, so we will have a proper border control
Michael: no passport, no entry
Daniel: good you wrote us this I think :P
Matt: yes, we're morons 
Daniel: haha, yes, a bit
Brian: I didn't even know we're going to enter Bosnia, I though we're going to stay in Croatia only
Michael: we planned to go to Mostar and the mountains, so Bosnia it is
Brian: anyway, I'll pack my passport
Michael: very good[/INST]
</s>


In [55]:
print(example["summary"])

Daniel, Michael, Matt and Brian are going to Croatia and Bosnia and Herzegovina. They are packing. Michael reminds them to take their passports, because Bosnia and Herzegovina is not in the EU. They will go to Mostar and the mountains in Bosnia.


In [56]:
print(example["finetuned_generated_summary"])

Daniel, Michael, Matt and Brian are going on a trip to Bosnia and Herzegovina.


In [42]:
print(example["raw_generated_summary"])

: Daniel is packing for the trip and asks Michael if he's finished. Michael replies that he's mostly done but reminds Daniel not to forget his passport. Matt asks if an ID would suffice instead, to which Michael responds no, explaining that since Bosnia and Herzegovina is not part of the EU, they will encounter proper border control and a passport is required for entry. Daniel finds Michael's reminder helpful and jokes that they're all a bit forgetful. Brian admits he wasn't aware they were visiting Bosnia and will make sure to pack his passport as well. Michael commends Brian for his quick response.


In [43]:
example = data_tokenized[700]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Charlie: How are you doing.
Daisy: Good, seeing a lot of beautiful places
Charlie: Where are you now?
Daisy: in Naples 
Charlie: Pity, I hoped you may be still in Sicily.
Daisy: No, we took a ferry early in the morning yesterday.
Charlie: I am still thinking about the evening we spent together.
Daisy: Yes, it was nice.
Charlie: I would like to meet you again.
Daisy: But where?
Charlie: Either in the US or here.
Daisy: But we're heading North now. Rome, Florence, Venice.
Charlie: And you're flying to NY?
Daisy: No, directly to San Fransisco.
Charlie: Too bad.
Daisy: I know.
Charlie: I think I will take a train to Florence.
Daisy: wow, are you sure?
Charlie: More than sure. When are you going to be there?
Daisy: 1-5 of September.
Charlie: Perfect. I want to see this amazing city with you.
Daisy: I am looking forward to it!
Charlie: Me too![/INST]
</s>


In [58]:
print(example["summary"])

Daisy is in Naples. Charlie recalls their meeting and wants to see her again. He will come to Florence to see Daisy between 1-5 September.


In [59]:
print(example["finetuned_generated_summary"])

Daisy is in Naples. She's seeing beautiful places and then she's heading to Rome, Florence and Venice. Charlie and Daisy will meet in San Francisco on 1-5 of September. Charlie will take a train to Florence.


In [44]:
print(example["raw_generated_summary"])

: Charlie expressed his regret that Daisy was no longer in Sicily and reminisced about their previous evening together. He expressed his desire to meet her again, suggesting they could meet in the US or in Europe. Daisy informed him that they were heading north to Rome, Florence, and Venice, and were flying directly to San Francisco. Charlie was disappointed but decided to visit Florence around the same time as Daisy, from September 1-5. They both expressed their excitement about meeting again in Florence.


In [45]:
example = data_tokenized[800]
print(example["dialogue"])

<s>[INST] You are a helpful assistant. Your task is to generate following dialogue summarization:
Linda: Hi Dad, I want to buy flowers for mum! But I don't remember which one she likes :(
Michael: Well, she likes all the flowers I believe
Linda: That doesn't help! I'm on a flower market right now!
Michael: Send me some pics then
Linda: <file_photo> 
Michael: Tulips are nice, roses too
Linda:  What about carnations?
Michael: No, carnations are boring :D
Linda: Thanks Dad, srsly…
Michael:  What about freesias? She likes them a lot, are there any there?
Linda: <file_photo> 
Michael: Take those![/INST]
</s>


In [61]:
print(example["summary"])

Linda wants to buy flowers for her mother and asks Michael which flowers does she like. Michael suggests Linda to buy freesias.


In [62]:
print(example["finetuned_generated_summary"])

Linda is at the flower market and wants to buy flowers for her mother. Michael advises her to buy tulips or roses.


In [46]:
print(example["raw_generated_summary"])

: Linda expressed her desire to buy flowers for her mum but was unsure of which type she preferred. Her father suggested that her mum likes all flowers, to which Linda replied that this information was not helpful as she was currently at a flower market. Michael then asked Linda to send some pictures of the available flowers, and after seeing some tulips and roses, he suggested those. Linda asked about carnations, but her father dismissed them as boring. Linda thanked her father for his help. Michael then suggested freesias, which Linda confirmed were available in the pictures she had sent. Michael advised Linda to buy the freesias.
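
One pattern visible in the examples above is that the base model's outputs are much longer than the reference summaries, which may partly explain its lower ROUGE scores. A rough way to check this is to compare word counts. The sketch below uses the three summaries of example 30, copied from above. `word_count` is a hypothetical helper, and whitespace splitting only approximates how ROUGE tokenizes text.

```python
# Summaries for example 30, copied from the outputs above.
summaries = {
    "reference": "Mary ran out of money. Carter is going to lend her some in an hour.",
    "finetuned": "Carter will lend some money to Mary.",
    "raw": (
        ": Mary asked Carter for a favor, requesting that he lend her some boxes. "
        "Carter agreed and mentioned that it would take him an hour to get to her, "
        "as he was currently at the train station. "
        "Mary expressed her gratitude for Carter's help."
    ),
}

def word_count(text):
    """Count whitespace-separated tokens (a rough proxy for summary length)."""
    return len(text.split())

lengths = {name: word_count(text) for name, text in summaries.items()}
for name, length in lengths.items():
    print(f"{name}: {length} words")
```

With the full dataset, the same helper could be applied to each summary column of `data_tokenized` to compare average lengths.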
