# Generative AI Use Case: Summarize Dialogue

This notebook demonstrates how input text influences the output of a language model and introduces the concept of prompt engineering. 

It compares zero-shot, one-shot, and few-shot inference, showing how each prompting technique can guide the model toward a specific task. By exploring these methods, you will see how prompt engineering can improve the output of Large Language Models.
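
As a rough illustration of how these three settings differ, the sketch below assembles prompts with zero, one, or several solved examples placed before the dialogue to be summarized. The helper <code>build_prompt</code>, the template wording ("Dialogue:", "What was going on?"), and the sample dialogues are assumptions made for illustration, not something this notebook prescribes.

```python
from typing import Sequence, Tuple


def build_prompt(dialogue: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Assemble a summarization prompt (hypothetical template).

    `examples` holds (dialogue, summary) pairs shown to the model before the
    dialogue it should summarize: none for zero-shot, one for one-shot,
    several for few-shot.
    """
    parts = []
    for example_dialogue, example_summary in examples:
        parts.append(
            f"Dialogue:\n\n{example_dialogue}\n\n"
            f"What was going on?\n{example_summary}\n\n\n"
        )
    # The final block is left open so the model completes the summary.
    parts.append(f"Dialogue:\n\n{dialogue}\n\nWhat was going on?\n")
    return "".join(parts)


# Illustrative (made-up) data, not taken from the dataset.
solved = [
    ("#Person1#: Is the meeting at two?\n#Person2#: No, it moved to three.",
     "#Person2# tells #Person1# the meeting moved to three."),
    ("#Person1#: Did you bring the report?\n#Person2#: Yes, it is on your desk.",
     "#Person2# has left the report on #Person1#'s desk."),
]
target = "#Person1#: Are you free for lunch?\n#Person2#: Sure, let's go at noon."

zero_shot = build_prompt(target)           # no solved examples
one_shot = build_prompt(target, solved[:1])  # one solved example
few_shot = build_prompt(target, solved)      # several solved examples

print(one_shot)
```

The only difference between the three prompts is how many solved examples precede the final, unanswered dialogue. The instruction and the target dialogue stay the same.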

## Outcome overview

We will use open-source data available through Hugging Face's datasets library to summarize conversational data.

## 1. Package installation

The following packages are required to use PyTorch and the Hugging Face transformers and datasets libraries.

In [3]:
%pip install --upgrade pip
%pip install torch torchdata
%pip install -U datasets
%pip install transformers==4.27.2

Note: you may need to restart the kernel to use updated packages.
Collecting torch
  Using cached torch-2.5.1-cp311-cp311-win_amd64.whl.metadata (28 kB)
Collecting torchdata
  Downloading torchdata-0.10.1-py3-none-any.whl.metadata (6.3 kB)
Collecting networkx (from torch)
  Downloading networkx-3.4.2-py3-none-any.whl.metadata (6.3 kB)
Collecting jinja2 (from torch)
  Downloading jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
Collecting sympy==1.13.1 (from torch)
  Downloading sympy-1.13.1-py3-none-any.whl.metadata (12 kB)
Collecting mpmath<1.4,>=1.1.0 (from sympy==1.13.1->torch)
  Downloading mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB)
Collecting MarkupSafe>=2.0 (from jinja2->torch)
  Downloading MarkupSafe-3.0.2-cp311-cp311-win_amd64.whl.metadata (4.1 kB)
Downloading torch-2.5.1-cp311-cp311-win_amd64.whl (203.1 MB)

Here, we import the dataset loader together with the model, tokenizer, and generation configuration classes.

In [1]:
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM
from transformers import AutoTokenizer
from transformers import GenerationConfig
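
Because the install cell pins only transformers and leaves the other packages unpinned, it can be useful to record which versions were actually installed. The following is a minimal sketch using the standard library's <code>importlib.metadata</code>. The helper name <code>installed_versions</code> is hypothetical and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            found[name] = None
    return found


# Package names taken from the install cell above.
versions = installed_versions(["torch", "torchdata", "datasets", "transformers"])
for name, installed in versions.items():
    print(f"{name}: {installed or 'not installed'}")
```

A value of <code>None</code> means the package could not be found, which usually indicates that the install cell did not complete or that the kernel needs a restart.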



## 2. Summarize Dialogue without Prompt Engineering

In this section, we will generate a summary of a dialogue with the pre-trained LLM FLAN-T5 from Hugging Face. The list of models available in the Hugging Face <code>transformers</code> package can be found [here](https://huggingface.co/docs/transformers/en/index).

We will work with sample dialogues from the "DialogSum" Hugging Face dataset. This dataset contains 10,000+ dialogues with corresponding manually labelled summaries and topics.

In [2]:
dataset_name = "knkarthick/dialogsum"

dataset = load_dataset(dataset_name)

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Generating train split: 100%|██████████| 12460/12460 [00:00<00:00, 106746.44 examples/s]
Generating validation split: 100%|██████████| 500/500 [00:00<00:00, 122225.90 examples/s]
Generating test split: 100%|██████████| 1500/1500 [00:00<00:00, 90138.06 examples/s]


Explore the dataset by printing some dialogues with their baseline summaries.

In [3]:
example_indices = [40, 200]

dash_line = '-' * 99  # separator line (same 99 dashes as joining 100 empty strings)

for i, index in enumerate(example_indices):
    print(dash_line)
    print('Example', i+1)
    print(dash_line)
    print('INPUT DIALOGUE:')
    print(dataset['test'][index]['dialogue'])
    print(dash_line)
    print('BASELINE HUMAN SUMMARY:')
    print(dataset['test'][index]['summary'])
    print(dash_line)
    print()

---------------------------------------------------------------------------------------------------
Example 1
---------------------------------------------------------------------------------------------------
INPUT DIALOGUE:
#Person1#: What time is it, Tom?
#Person2#: Just a minute. It's ten to nine by my watch.
#Person1#: Is it? I had no idea it was so late. I must be off now.
#Person2#: What's the hurry?
#Person1#: I must catch the nine-thirty train.
#Person2#: You've plenty of time yet. The railway station is very close. It won't take more than twenty minutes to get there.
---------------------------------------------------------------------------------------------------
BASELINE HUMAN SUMMARY:
#Person1# is in a hurry to catch a train. Tom tells #Person1# there is plenty of time.
---------------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------------
Exam

Load the FLAN-T5 model, creating an instance of the <code>AutoModelForSeq2SeqLM</code> class with the <code>.from_pretrained()</code> method.

In [4]:
model_name = 'google/flan-t5-base'

model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development


To perform encoding and decoding, you need to work with text in tokenized form. Tokenization splits text into smaller units (tokens) and maps each token to an integer ID from the model's vocabulary. The model then converts these IDs into embedding vectors internally, so the tokenizer itself produces IDs rather than vectors.
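
To make the text-to-ID mapping concrete, here is a deliberately simplified sketch of a word-level tokenizer. It is not how the FLAN-T5 tokenizer works internally (that one operates on subword pieces). The vocabulary, the ID values, and the function names <code>encode</code> and <code>decode</code> are all invented for illustration.

```python
# Toy word-level tokenizer. The vocabulary and all IDs are invented.
PAD_ID, EOS_ID, UNK_ID = 0, 1, 2
SPECIAL_TOKENS = {PAD_ID: "<pad>", EOS_ID: "</s>", UNK_ID: "<unk>"}
VOCAB = {"what": 3, "time": 4, "is": 5, "it": 6}
ID_TO_TOKEN = {**SPECIAL_TOKENS, **{i: t for t, i in VOCAB.items()}}


def encode(text):
    """Lowercase, split on whitespace, map words to IDs, append end-of-sequence."""
    ids = [VOCAB.get(word, UNK_ID) for word in text.lower().split()]
    return ids + [EOS_ID]


def decode(ids, skip_special_tokens=True):
    """Map IDs back to tokens, optionally dropping special tokens."""
    tokens = []
    for token_id in ids:
        if skip_special_tokens and token_id in SPECIAL_TOKENS:
            continue
        tokens.append(ID_TO_TOKEN[token_id])
    return " ".join(tokens)


encoded = encode("What time is it")
decoded = decode(encoded)

print(encoded)  # [3, 4, 5, 6, 1]
print(decoded)  # what time is it
```

Words missing from the toy vocabulary map to the unknown-token ID, and the end-of-sequence ID appended by <code>encode</code> is dropped again when decoding with <code>skip_special_tokens=True</code>.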

Download the tokenizer for the FLAN-T5 model with the <code>AutoTokenizer.from_pretrained()</code> method. Setting <code>use_fast=True</code> requests the fast tokenizer implementation, which is backed by the Rust-based tokenizers library.

In [5]:
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)



Test the tokenizer encoding and decoding with a simple sentence.

In [6]:
sentence = "what time is it, Tom?"

sentence_encoded = tokenizer(sentence, return_tensors='pt')

sentence_decoded = tokenizer.decode(
    sentence_encoded['input_ids'][0],
    skip_special_tokens=True
)

print('ENCODED SENTENCE:')
print(sentence_encoded['input_ids'][0])
print('\nDECODED SENTENCE:')
print(sentence_decoded)

ENCODED SENTENCE:
tensor([ 125,   97,   19,   34,    6, 3059,   58,    1])

DECODED SENTENCE:
what time is it, Tom?
