
In [None]:
!pip install transformers datasets torch

In [None]:
from datasets import load_dataset

dataset = load_dataset("knkarthick/dialogsum")


In [None]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

Step 4: Tokenize Input Dialogue & Generate Summary (Zero-Shot)

In [None]:
# Sample dialogue
dialogue = """
Person1: What time is it, Tom?
Person2: Just a minute, it's 10 to 9 by my watch.
Person1: Are you sure? We need to catch the train.
Person2: Don’t worry, we have plenty of time.
"""

# Add instruction for the model
prompt = "Summarize the following conversation:\n" + dialogue + "\nSummary:"

# Tokenize input
inputs = tokenizer(prompt, return_tensors="pt")

# Generate summary
outputs = model.generate(**inputs, max_length=60)

# Decode the output tokens to text
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)

print("Generated Summary:")
print(summary)


In [None]:
# Prepare input with prompt
dialogue = "Person 1: What time is it, Tom?\nPerson 2: It's 9:50.\n"

input_text = "Summarize the following conversation:\n" + dialogue + "\nSummary:"

inputs = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(
    **inputs,         # pass input_ids together with the attention mask
    max_length=50,
    num_beams=2,      # beam search with 2 beams; often improves quality over greedy decoding
    early_stopping=True
)

summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated Summary:")
print(summary)


# One-Shot Prompting: One Example Dialogue + Summary

In [None]:

# One-shot example dialogue + summary (the shot)
example_dialogue = """
Person 1: What time is it, Tom?
Person 2: It's 9:50 by my watch.
"""
example_summary = "Person 2 tells Person 1 that the time is 9:50."

# New dialogue to summarize
new_dialogue = """
Person 1: Is the train leaving soon?
Person 2: Yes, it's about to leave in 5 minutes.
"""

# Create prompt with one-shot example and new dialogue
prompt = f"Summarize the following conversations.\n\nDialogue:\n{example_dialogue}\nSummary:\n{example_summary}\n\nDialogue:\n{new_dialogue}\nSummary:"

# Tokenize prompt
inputs = tokenizer(prompt, return_tensors="pt")

# Generate summary
outputs = model.generate(**inputs, max_length=50, num_beams=4, early_stopping=True)

# Decode and print summary
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated Summary:", summary)
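The zero-shot and one-shot prompts above differ only in how many worked examples precede the new dialogue, so the assembly step can be factored into one function. The following is a minimal sketch, not part of the original notebook: the helper name `build_prompt` and the variable names `shots` and `one_shot_prompt` are hypothetical, and the layout simply mirrors the f-string used in the one-shot cell.

```python
def build_prompt(new_dialogue, examples=()):
    """Assemble a zero-, one-, or few-shot summarization prompt.

    `examples` is a sequence of (dialogue, summary) pairs. An empty
    sequence gives a zero-shot prompt, one pair gives a one-shot prompt.
    (Hypothetical helper; the name and layout are illustrative.)
    """
    parts = ["Summarize the following conversations."]
    for dialogue, summary in examples:
        # Each worked example shows the model the expected input/output format
        parts.append(f"Dialogue:\n{dialogue.strip()}\nSummary:\n{summary.strip()}")
    # The prompt ends with an open "Summary:" for the model to complete
    parts.append(f"Dialogue:\n{new_dialogue.strip()}\nSummary:")
    return "\n\n".join(parts)


shots = [
    (
        "Person 1: What time is it, Tom?\nPerson 2: It's 9:50 by my watch.",
        "Person 2 tells Person 1 that the time is 9:50.",
    )
]
one_shot_prompt = build_prompt(
    "Person 1: Is the train leaving soon?\nPerson 2: Yes, it's about to leave in 5 minutes.",
    examples=shots,
)
print(one_shot_prompt)
```

The resulting string can be passed to `tokenizer(...)` and `model.generate(...)` exactly as in the cell above. Calling `build_prompt(new_dialogue)` with no examples produces the zero-shot variant.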



Few-shot prompting example with FLAN-T5:

In [None]:

few_shot_prompt = """
Dialogue:
Person 1: What time is it?
Person 2: It's 9:50.

Summary:
The time is 9:50.

Dialogue:
Person 1: I need to upgrade my computer.
Person 2: You should add a CD-ROM.

Summary:
Person 1 wants to upgrade their computer by adding a CD-ROM.

Dialogue:
Person 1: Tom, are we going to miss the train?
Person 2: No, we still have 5 minutes.

Summary:
"""

inputs = tokenizer(few_shot_prompt, return_tensors="pt")

outputs = model.generate(**inputs, max_length=60)

summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated Summary:")
print(summary)
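Each added example makes the few-shot prompt longer, and the model only accepts a limited number of input tokens. T5-style tokenizers typically report a 512-token limit, but check `tokenizer.model_max_length` rather than relying on that figure. The sketch below drops trailing examples until the prompt fits a budget. The function `fit_examples` and its parameters are hypothetical, and the default token counter is a whitespace split used only as a stand-in for a real tokenizer.

```python
def fit_examples(examples, new_dialogue, max_tokens, count_tokens=None):
    """Keep as many leading (dialogue, summary) examples as fit in max_tokens.

    Hypothetical helper. `count_tokens` is any callable mapping text to a
    token count; the default counts whitespace-separated words, which only
    approximates a real tokenizer.
    """
    if count_tokens is None:
        count_tokens = lambda text: len(text.split())

    def render(shots):
        blocks = [f"Dialogue:\n{d}\n\nSummary:\n{s}" for d, s in shots]
        blocks.append(f"Dialogue:\n{new_dialogue}\n\nSummary:")
        return "\n\n".join(blocks)

    kept = list(examples)
    # Drop the last example until the prompt fits or no examples remain
    while kept and count_tokens(render(kept)) > max_tokens:
        kept.pop()
    return render(kept), len(kept)


examples = [
    ("Person 1: What time is it?\nPerson 2: It's 9:50.", "The time is 9:50."),
    (
        "Person 1: I need to upgrade my computer.\nPerson 2: You should add a CD-ROM.",
        "Person 1 wants to upgrade their computer by adding a CD-ROM.",
    ),
]
new_dialogue = (
    "Person 1: Tom, are we going to miss the train?\n"
    "Person 2: No, we still have 5 minutes."
)

trimmed_prompt, n_kept = fit_examples(examples, new_dialogue, max_tokens=40)
print("Examples kept:", n_kept)
print(trimmed_prompt)
```

With the notebook's tokenizer, passing `count_tokens=lambda text: len(tokenizer(text)["input_ids"])` would count real tokens instead of words. Note that when no examples fit, the function still returns the bare zero-shot prompt, even if that alone exceeds the budget.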

In [None]:
from transformers import pipeline, GenerationConfig

# Load summarization pipeline (or model + tokenizer separately)
summarizer = pipeline("summarization", model="google/flan-t5-base")

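# Generation config example
# Note: temperature, top_p, and top_k only take effect when do_sample=True
gen_config = GenerationConfig(
    max_length=50,   # Limit summary length (in tokens)
    do_sample=True,  # Enable sampling; without it, decoding is deterministic
    temperature=0.7, # Below 1.0 sharpens the distribution (less random output)
    top_p=0.9,       # Nucleus sampling threshold
    top_k=50         # Limits next token candidates
)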

# Few-shot prompt: two worked examples followed by the dialogue to summarize
few_shot_prompt = """
Dialogue:
Person 1: What time is it?
Person 2: It's 9:50.

Summary:
The time is 9:50.

Dialogue:
Person 1: I need to upgrade my computer.
Person 2: You should add a CD-ROM.

Summary:
Person 1 wants to upgrade their computer by adding a CD-ROM.

Dialogue:
Person 1: Tom, are we going to miss the train?
Person 2: No, we still have 5 minutes.

Summary:
"""

# Generate summary with config
summary = summarizer(few_shot_prompt, generation_config=gen_config)

print(summary[0]['summary_text'])
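To make the `temperature`, `top_k`, and `top_p` settings above more concrete, here is a small pure-Python illustration of how such filters can reshape a next-token distribution. It is a simplified sketch under stated assumptions, not the library's implementation: the function `filter_next_token_probs` is hypothetical, and the four logits are made-up toy values.

```python
import math


def filter_next_token_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Apply temperature, then top-k, then top-p to toy logits.

    Returns a dict {token_index: probability} over the surviving tokens,
    renormalized to sum to 1. (Hypothetical, simplified illustration.)
    """
    # Temperature scaling followed by a numerically stable softmax
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least probable
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most probable tokens (0 disables the filter)
    if top_k > 0:
        order = order[:top_k]
    mass = sum(probs[i] for i in order)

    # Top-p: keep the smallest prefix whose renormalized mass reaches top_p
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i] / mass
        if cumulative >= top_p:
            break

    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four candidate tokens
filtered = filter_next_token_probs(toy_logits, temperature=0.7, top_k=3, top_p=0.9)
for index, prob in filtered.items():
    print(f"token {index}: {prob:.3f}")
```

Lowering `temperature` concentrates probability on the top-ranked token, while `top_k` and `top_p` remove low-probability candidates before a token is sampled.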


You’ve learned about:

Loading datasets

Using pretrained models like FLAN-T5

Tokenization basics

Zero-shot, one-shot, and few-shot prompting

Prompt engineering basics

Adjusting generation parameters like temperature
