### Lab 7 ii. Text Generation Using Language Models
This lab explores text generation with language models. You will work with a provided code snippet that demonstrates three inference methods: zero-shot, one-shot, and few-shot. The lab aims to familiarize you with creating prompts and generating text from them.

The lab is structured around three main types of inference techniques:

#### Zero-Shot Inference:
Explore generating text without specific examples in the prompt, using only general instructions or contexts.

#### One-Shot Inference:
Learn how to utilize a single example to create a prompt and generate text.

#### Few-Shot Inference:
Understand how to construct prompts with multiple examples and generate text.
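
As a rough illustration, the three settings differ only in how many solved examples are placed before the target dialogue. The sketch below is a minimal example: the toy dialogues, the `build_prompt` helper, and the `Dialogue:` / `Summary:` layout are assumptions made for this illustration and are not part of the lab code.

```python
# Toy data invented for illustration only (not taken from DialogSum).
examples = [
    ("#Person1#: Is the library open today?\n#Person2#: Yes, until six.",
     "#Person1# asks whether the library is open."),
    ("#Person1#: I lost my ticket.\n#Person2#: You can buy a new one at the desk.",
     "#Person2# tells #Person1# where to buy a new ticket."),
]
target = "#Person1#: What time is it?\n#Person2#: Ten to nine."


def build_prompt(target_dialogue, solved_examples=()):
    """Return a prompt with zero, one, or several solved examples before the target."""
    parts = []
    for dialogue, summary in solved_examples:
        parts.append(f"Dialogue:\n{dialogue}\nSummary: {summary}\n")
    # The final block is left open so that the model completes the summary.
    parts.append(f"Dialogue:\n{target_dialogue}\nSummary:")
    return "\n".join(parts)


zero_shot = build_prompt(target)               # no solved examples
one_shot = build_prompt(target, examples[:1])  # one solved example
few_shot = build_prompt(target, examples)      # several solved examples

for name, prompt in [("zero-shot", zero_shot), ("one-shot", one_shot), ("few-shot", few_shot)]:
    print(f"{name}: {prompt.count('Dialogue:')} dialogue block(s)")
```

Every prompt ends with an open `Summary:` line; only the number of completed blocks in front of it changes.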

### Dataset  
You will be working with the following dataset: https://huggingface.co/datasets/knkarthick/dialogsum

This is a large-scale dialogue summarization dataset, consisting of 13,460 dialogues with corresponding manually labeled summaries and topics. The data is split into train, validation, and test sets.

### Task

The task is to generate dialogue summaries with language models, using the inference techniques described above.

In [None]:
from IPython.display import HTML, display
colab_button = HTML(
    '<a target="_blank" href="https://colab.research.google.com/github/surrey-nlp/NLP-2025/blob/main/lab07/lab07ii-Text_Generation_using_LLMs.ipynb">'
    '<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>'
)
display(colab_button)

In [1]:
!pip install transformers
!pip install datasets
!pip install --upgrade datasets huggingface_hub

Collecting datasets
  Downloading datasets-3.5.0-py3-none-any.whl.metadata (19 kB)
Collecting dill<0.3.9,>=0.3.0 (from datasets)
  Downloading dill-0.3.8-py3-none-any.whl.metadata (10 kB)
Collecting xxhash (from datasets)
  Downloading xxhash-3.5.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
Collecting multiprocess<0.70.17 (from datasets)
  Downloading multiprocess-0.70.16-py311-none-any.whl.metadata (7.2 kB)
Collecting fsspec<=2024.12.0,>=2023.1.0 (from fsspec[http]<=2024.12.0,>=2023.1.0->datasets)
  Downloading fsspec-2024.12.0-py3-none-any.whl.metadata (11 kB)
Downloading datasets-3.5.0-py3-none-any.whl (491 kB)
Downloading dill-0.3.8-py3-none-any.whl (116 kB)
Downloading fsspec-2024.12.0-py3-none-any.wh

In [2]:
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM
from transformers import AutoTokenizer
from transformers import GenerationConfig

### Load the DialogSum dataset from Hugging Face

In [3]:
dataset = load_dataset("knkarthick/dialogsum")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


README.md: 0.00B [00:00, ?B/s]

train.csv:   0%|          | 0.00/11.3M [00:00<?, ?B/s]

validation.csv: 0.00B [00:00, ?B/s]

test.csv: 0.00B [00:00, ?B/s]

Generating train split:   0%|          | 0/12460 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/500 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/1500 [00:00<?, ? examples/s]

### Printing dialogues with their baseline summaries

In [4]:
example_indices = [40,200]

dashline = '-'.join('' for x in range(100))

for i, index in enumerate(example_indices):
  print(dashline)
  print('Example ',i+1)
  print(dashline)
  print("Input Dialogue : ")
  print(dataset['test'][index]['dialogue'])
  print(dashline)
  print("Baseline human summary : ")
  print(dataset['test'][index]['summary'])
  print(dashline)
  print()

---------------------------------------------------------------------------------------------------
Example  1
---------------------------------------------------------------------------------------------------
Input Dialogue : 
#Person1#: What time is it, Tom?
#Person2#: Just a minute. It's ten to nine by my watch.
#Person1#: Is it? I had no idea it was so late. I must be off now.
#Person2#: What's the hurry?
#Person1#: I must catch the nine-thirty train.
#Person2#: You've plenty of time yet. The railway station is very close. It won't take more than twenty minutes to get there.
---------------------------------------------------------------------------------------------------
Baseline human summary : 
#Person1# is in a hurry to catch a train. Tom tells #Person1# there is plenty of time.
---------------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------------

In [5]:
## Load the t5-base model by creating an instance of the AutoModelForSeq2SeqLM class with the .from_pretrained() method
model_name =  'google-t5/t5-base'
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

config.json: 0.00B [00:00, ?B/s]

model.safetensors:   0%|          | 0.00/892M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/147 [00:00<?, ?B/s]

We use an encoder-decoder architecture here.
To perform encoding and decoding, we have to work with text in tokenized form.
Tokenization is the process of splitting text into smaller units that the language model can process.



In this exercise, you will download the tokenizer for the T5-base language model using the `AutoTokenizer.from_pretrained()` method. Set the `use_fast` parameter to `True` to select the fast tokenizer, which is backed by the Rust-based `tokenizers` library and is typically quicker than the pure-Python implementation.
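
To make the encode/decode round trip concrete, here is a toy sketch of the idea. It is not how the T5 tokenizer works internally: T5 uses a SentencePiece subword vocabulary, whereas this example splits on whitespace. The `build_vocab`, `encode`, and `decode` helpers and the special-token ids are assumptions chosen for illustration.

```python
def build_vocab(texts):
    """Map every distinct whitespace-separated token to an integer id."""
    vocab = {"<unk>": 0, "</s>": 1}  # toy special tokens
    for text in texts:
        for token in text.split():
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(text, vocab):
    """Convert text to ids and append the end-of-sequence id."""
    ids = [vocab.get(token, vocab["<unk>"]) for token in text.split()]
    return ids + [vocab["</s>"]]


def decode(ids, vocab, skip_special_tokens=True):
    """Convert ids back to text, optionally dropping special tokens."""
    id_to_token = {i: t for t, i in vocab.items()}
    tokens = [id_to_token[i] for i in ids]
    if skip_special_tokens:
        tokens = [t for t in tokens if t not in ("<unk>", "</s>")]
    return " ".join(tokens)


vocab = build_vocab(["What time is it, Hank?"])
ids = encode("What time is it, Hank?", vocab)
print(ids)
print(decode(ids, vocab))
```

The real tokenizer follows the same pattern (text to ids, ids back to text, special tokens skipped on request), but with subword units instead of whole words.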

In [6]:
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast = True)

spiece.model:   0%|          | 0.00/792k [00:00<?, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

In [8]:
## Here we test the tokenizer by encoding and decoding a simple sentence
sentence = 'What time is it, Hank?'

sentence_encoded = tokenizer(sentence, return_tensors = 'pt')

sentence_decoded = tokenizer.decode(
    sentence_encoded['input_ids'][0],
    skip_special_tokens = True
)

print("ENCODED SENTENCE: ")
print(sentence_encoded['input_ids'][0])
print('\nDECODED SENTENCE: ')
print(sentence_decoded)

ENCODED SENTENCE: 
tensor([ 363,   97,   19,   34,    6, 6627,  157,   58,    1])

DECODED SENTENCE: 
What time is it, Hank?


### Zero-Shot Inference

In this section of the lab, we will focus on implementing zero-shot inference. This technique does not rely on specific training examples to guide the language model. Instead, we will demonstrate how to transform a dialogue into a general instruction prompt, which will then be used to generate text. This process illustrates the model's ability to understand and generate responses based solely on the provided context or instructions without prior examples.
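
One detail worth being aware of when writing instruction prompts as triple-quoted f-strings inside indented code: the indentation becomes part of the prompt. The sketch below compares an indented f-string with a flat template that has the same wording. The `ZERO_SHOT_TEMPLATE` name and the sample dialogue are assumptions for illustration; whether the extra whitespace changes the model output depends on the tokenizer, so treat this as something to check rather than a known problem.

```python
dialogue = "#Person1#: What time is it, Tom?\n#Person2#: It's ten to nine by my watch."

# Indented triple-quoted f-string: the instruction lines keep their leading spaces.
indented_prompt = f"""
    Summarize the following conversation.

    {dialogue}

    Summary:
    """

# Flat template (hypothetical name): identical wording, no incidental indentation.
ZERO_SHOT_TEMPLATE = "Summarize the following conversation.\n\n{dialogue}\n\nSummary:"
flat_prompt = ZERO_SHOT_TEMPLATE.format(dialogue=dialogue)

# repr() makes the leading newline and spaces visible.
print(repr(indented_prompt[:45]))
print(repr(flat_prompt[:45]))
```

Both prompts contain the same words in the same order; they differ only in whitespace.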

In [9]:
# Specify indices of examples to process (replace with actual indices of your dataset)
example_indices = [1, 300]  # Adjust as necessary based on your dataset

# Loop through the specified indices to process dialogues and their summaries
for i, index in enumerate(example_indices):

    # Retrieve the dialogue and summary from the test dataset at the current index
    dialogue = dataset['test'][index]['dialogue']
    #print('dialogue:::::::::::::::',dialogue)

    summary = dataset['test'][index]['summary']
    #print('summary:::::::::::::::::::',summary)

    # Create a prompt that asks to summarize the dialogue
    prompt = f"""
    Summarize the following conversation.

    {dialogue}

    Summary:
    """

    # Tokenize the prompt for processing by the model, specifying to return tensors in PyTorch format
    inputs = tokenizer(prompt, return_tensors='pt')
    #print("Tokenized inputs:", inputs['input_ids'])

    # Generate at most 50 new tokens. For an encoder-decoder model such as T5,
    # max_length counts decoder tokens only, so max_new_tokens states the limit directly.
    output_tokens = model.generate(
        inputs['input_ids'],  # Input the tokenized prompt
        max_new_tokens=50,  # Limit the length of the generated summary
        no_repeat_ngram_size=2  # Prevent repeating n-grams
    )

    #print("Raw model output (tokens):", output_tokens)

    decoded_output = tokenizer.decode(output_tokens[0], skip_special_tokens=True)
    #print("Decoded output:", decoded_output)
    # Separator for readability
    dashline = "-" * 100
    print(dashline)
    print("Example", i + 1)
    print(dashline)
    print('Input Prompt:', dialogue)
    print(dashline)
    print('Baseline Human Summary:', summary)
    print(dashline)
    print('Model Generated Summary:', decoded_output)
    print("\n")

# Ensure you adjust parameters and debug prints as per your specific requirements and data.


----------------------------------------------------------------------------------------------------
Example 1
----------------------------------------------------------------------------------------------------
Input Prompt: #Person1#: Ms. Dawson, I need you to take a dictation for me.
#Person2#: Yes, sir...
#Person1#: This should go out as an intra-office memorandum to all employees by this afternoon. Are you ready?
#Person2#: Yes, sir. Go ahead.
#Person1#: Attention all staff... Effective immediately, all office communications are restricted to email correspondence and official memos. The use of Instant Message programs by employees during working hours is strictly prohibited.
#Person2#: Sir, does this apply to intra-office communications only? Or will it also restrict external communications?
#Person1#: It should apply to all communications, not only in this office between employees, but also any outside communications.
#Person2#: But sir, many employees use Instant Messaging to co

It is important to note that the output generated from the given prompt above may not make sense or meet expectations. This discrepancy can be attributed to the fact that the specific model used in this exercise has not been fine-tuned for the task of dialogue summarization. Without task-specific training, the model's ability to understand and summarize conversational contexts accurately might be limited.

Now let's change the prompt text and see how the output changes.

In [None]:
for i, index in enumerate(example_indices):
  dialogue = dataset['test'][index]['dialogue']
  summary = dataset['test'][index]['summary']


  prompt = f"""
Dialogue is : .

{dialogue}

Summary:
    """

  ## Pass the constructed prompt, not the raw dialogue, to the tokenizer
  inputs = tokenizer(prompt, return_tensors = 'pt')
  output = tokenizer.decode(
      model.generate(
          inputs['input_ids'],
          max_new_tokens = 59,
      )[0],
      skip_special_tokens = True
  )

  print(dashline)
  print("Example ", i+1)
  print(dashline)
  print(f'Input Prompt: \n{dialogue}')
  print(dashline)
  print(f'Baseline Human Summary: \n{summary}')
  print(dashline)
  print(f'Model generation - Prompt Engineering: \n{output}\n')

----------------------------------------------------------------------------------------------------
Example  1
----------------------------------------------------------------------------------------------------
Input Prompt: 
#Person1#: Ms. Dawson, I need you to take a dictation for me.
#Person2#: Yes, sir...
#Person1#: This should go out as an intra-office memorandum to all employees by this afternoon. Are you ready?
#Person2#: Yes, sir. Go ahead.
#Person1#: Attention all staff... Effective immediately, all office communications are restricted to email correspondence and official memos. The use of Instant Message programs by employees during working hours is strictly prohibited.
#Person2#: Sir, does this apply to intra-office communications only? Or will it also restrict external communications?
#Person1#: It should apply to all communications, not only in this office between employees, but also any outside communications.
#Person2#: But sir, many employees use Instant Messaging to 

Now let's change the prompt text again and see how the output changes.

In [10]:
for i, index in enumerate(example_indices):
  dialogue = dataset['test'][index]['dialogue']
  summary = dataset['test'][index]['summary']


  prompt = f"""
Dialogue

{dialogue}

What was going on?
"""

  ## Pass the constructed prompt, not the raw dialogue, to the tokenizer
  inputs = tokenizer(prompt, return_tensors = 'pt')
  output = tokenizer.decode(
      model.generate(
          inputs['input_ids'],
          max_new_tokens = 50,
      )[0],
      skip_special_tokens = True
  )

  print(dashline)
  print("Example ", i+1)
  print(dashline)
  print(f'Input Prompt: \n{dialogue}')
  print(dashline)
  print(f'Baseline Human Summary: \n{summary}\n')
  print(dashline)
  print(f'Model generation - ZERO SHOT: \n{output}\n')

----------------------------------------------------------------------------------------------------
Example  1
----------------------------------------------------------------------------------------------------
Input Prompt: 
#Person1#: Ms. Dawson, I need you to take a dictation for me.
#Person2#: Yes, sir...
#Person1#: This should go out as an intra-office memorandum to all employees by this afternoon. Are you ready?
#Person2#: Yes, sir. Go ahead.
#Person1#: Attention all staff... Effective immediately, all office communications are restricted to email correspondence and official memos. The use of Instant Message programs by employees during working hours is strictly prohibited.
#Person2#: Sir, does this apply to intra-office communications only? Or will it also restrict external communications?
#Person1#: It should apply to all communications, not only in this office between employees, but also any outside communications.
#Person2#: But sir, many employees use Instant Messaging to 

We can see that the output differs when we change the prompt.

### One-Shot Inference
One-shot inference provides the LLM with a single prompt-response example that matches our task, placed before the actual prompt we want completed.

In [11]:
# Construct the prompt to perform one shot inference
def make_prompt(example_indices_full, example_index_to_summarize):
  prompt = ""
  for index in example_indices_full:
    dialogue = dataset['validation'][index]['dialogue']
    summary = dataset['validation'][index]['summary']

    ## Each example ends with '{summary}\n' and the next block starts with a newline, so examples are separated by a blank line. Other models may have their own preferred stop sequence.

    prompt += f"""
Dialogue:

{dialogue}

What was going on?
{summary}
"""
  dialogue = dataset['test'][example_index_to_summarize]['dialogue']

  prompt += f"""
Dialogue:

{dialogue}

What was going on?
"""
  return prompt

In [12]:
# Define which sample is used as the one-shot example and which is used for prediction
example_indices_full = [40]
example_index_to_summarize = 190

one_shot_prompt =  make_prompt(example_indices_full, example_index_to_summarize)

print(one_shot_prompt)


Dialogue:

#Person1#: Adam, could you show me around the school? 
#Person2#: No problem. 
#Person1#: What's the tallest building? 
#Person2#: You mean the white building near the playground? 
#Person1#: Yes. 
#Person2#: That is the library. And it has more than 1, 000, 000 books. 
#Person1#: What's the building to the south of the library? 
#Person2#: You know, our school is divided into two parts, the junior high school and the senior high school. That is the new classroom building for our senior high school. 
#Person1#: Is there a swimming pool in your school? 
#Person2#: Yes. There is a large swimming pool, but it is only available in summer. 
#Person1#: I do envy you. And I hope I can enter your school one day. 
#Person2#: I believe that you can make your dream come true. 

What was going on?
#Person1# asks Adam to show #Person1# around the school. #Person1# envies Adam and hopes to enter Adam's school one day.

Dialogue:

#Person1#: Adam, could you show me around the school?
#Per

In [13]:
## Now pass this prompt to perform the one shot inference:
summary = dataset['test'][example_index_to_summarize]['summary']

inputs = tokenizer(one_shot_prompt, return_tensors = 'pt')
output = tokenizer.decode(
    model.generate(
        inputs['input_ids'],
        max_new_tokens = 512,
    )[0],
    skip_special_tokens = True
)

print(dashline)
print(f'BASELINE HUMAN SUMMARY:\n{summary}\n')
print(dashline)
print(f'MODEL GENERATION -- ONE SHOT:\n{output}')


----------------------------------------------------------------------------------------------------
BASELINE HUMAN SUMMARY:
Adam shows #Person1# around his school and introduces the library, the new classroom building, and the swimming pool.

----------------------------------------------------------------------------------------------------
MODEL GENERATION -- ONE SHOT:
#Person1#: Adam, could you show me around the school? #Person2#: No problem. #Person1#: That is the library. And it has more than 1, 000, 000 books. #Person1#: I do envy you. And I hope I can enter your school one day. #Person2#: I believe that you can make your dream come true. What was going on? #Person1# asks Adam to show #Person1# around the school.


The generated output above is not very accurate: t5-base mostly copies text from the prompt instead of writing a new summary.

### Few-Shot Inference
Next, we create the prompt for few-shot inference.

### Challenge 1:

Try few-shot inference, starting with two examples and going up to five. Choose one item from the test set and use that same item for every few-shot setting. Analyse how the output summary changes as the number of examples in the few-shot prompt changes, and compare it with the zero-shot and one-shot outputs.
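
When comparing outputs across settings, a simple numeric score can complement reading the summaries by eye. The sketch below computes a unigram-overlap F1 between a generated summary and the human baseline. It is a crude stand-in for established summarization metrics, not a replacement for them; the `unigram_f1` helper and the sample "model outputs" are assumptions invented for this illustration.

```python
from collections import Counter


def unigram_f1(candidate, reference):
    """Token-overlap F1 between two strings (lowercased, whitespace-split)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # tokens shared by both, with multiplicity
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Baseline human summary of one of the test dialogues printed earlier in the lab.
reference = "#Person1# is in a hurry to catch a train. Tom tells #Person1# there is plenty of time."

# Hypothetical outputs, made up to show how the score is used.
outputs = {
    "zero-shot": "It's ten to nine.",
    "one-shot": "#Person1# is in a hurry to catch a train.",
}

for setting, text in outputs.items():
    print(f"{setting}: {unigram_f1(text, reference):.2f}")
```

A higher score only means more shared words with the baseline, so it is best read alongside the summaries themselves.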

In [None]:
# The indices in example_indices_full are taken from the validation set, as declared in the make_prompt function
example_indices_full = [.......]
#example_index_to_summarize is taken from the test set as we declared in the make_prompt function
example_index_to_summarize = ....

few_shot_prompt = make_prompt(example_indices_full, example_index_to_summarize)

print(few_shot_prompt)

In [None]:
summary = dataset['test'][example_index_to_summarize]['summary']

inputs = tokenizer(few_shot_prompt, return_tensors = 'pt')
output = tokenizer.decode(
    model.generate(
        inputs['input_ids'],
        max_new_tokens = 512,
    )[0],
    skip_special_tokens = True
)

print(dashline)
print(f'BASELINE HUMAN SUMMARY:\n{summary}\n')
print(dashline)
print(f'MODEL GENERATION -- FEW SHOT:\n{output}')

#### We generated output with t5-base above. Now let's try these prompting techniques with the 'flan-t5-base' model

In [None]:
model_name =  'google/flan-t5-base'
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

In [None]:
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast = True)

In [None]:
## Here we test the tokenizer by encoding and decoding a simple sentence
sentence = 'What time is it, Hank?'

sentence_encoded = tokenizer(sentence, return_tensors = 'pt')

sentence_decoded = tokenizer.decode(
    sentence_encoded['input_ids'][0],
    skip_special_tokens = True
)

print("ENCODED SENTENCE: ")
print(sentence_encoded['input_ids'][0])
print('\nDECODED SENTENCE: ')
print(sentence_decoded)

ENCODED SENTENCE: 
tensor([ 363,   97,   19,   34,    6, 6627,  157,   58,    1])

DECODED SENTENCE: 
What time is it, Hank?


### Zero-Shot Inference

In [None]:
for i, index in enumerate(example_indices):
    dialogue = dataset['test'][index]['dialogue']
    summary = dataset['test'][index]['summary']

    prompt = f"""
    Summarize the following conversation.

    {dialogue}

    Summary:
    """

    # Input constructed prompt instead of the dialogue
    inputs = tokenizer(prompt, return_tensors='pt', max_length=512, truncation=True)
    outputs = model.generate(input_ids=inputs.input_ids, max_length=150, num_beams=4, early_stopping=True)

    # Convert output tensor to list of integers and then decode
    output_text = tokenizer.decode(outputs[0].tolist(), skip_special_tokens=True)

    print(dashline)
    print("Example ", i + 1)
    print(dashline)
    print(f'Input Prompt: \n{dialogue}')
    print(dashline)
    print(f'Baseline Human Summary: \n{summary}')
    print(dashline)
    print(f'Model generation - Prompt Engineering: \n{output_text}\n')

----------------------------------------------------------------------------------------------------
Example  1
----------------------------------------------------------------------------------------------------
Input Prompt: 
#Person1#: Ms. Dawson, I need you to take a dictation for me.
#Person2#: Yes, sir...
#Person1#: This should go out as an intra-office memorandum to all employees by this afternoon. Are you ready?
#Person2#: Yes, sir. Go ahead.
#Person1#: Attention all staff... Effective immediately, all office communications are restricted to email correspondence and official memos. The use of Instant Message programs by employees during working hours is strictly prohibited.
#Person2#: Sir, does this apply to intra-office communications only? Or will it also restrict external communications?
#Person1#: It should apply to all communications, not only in this office between employees, but also any outside communications.
#Person2#: But sir, many employees use Instant Messaging to 

In [None]:
example_indices = [1, 300]  # Adjust as necessary based on your dataset
# Loop through the specified indices to process dialogues and their summaries
for i, index in enumerate(example_indices):

    # Retrieve the dialogue and summary from the test dataset at the current index
    dialogue = dataset['test'][index]['dialogue']
    #print('dialogue:::::::::::::::',dialogue)

    summary = dataset['test'][index]['summary']
    #print('summary:::::::::::::::::::',summary)

    # Create a prompt that asks to summarize the dialogue
    prompt = f"""
    Summarize the following conversation.

    {dialogue}

    Summary:
    """

    # Tokenize the prompt for processing by the model, specifying to return tensors in PyTorch format
    inputs = tokenizer(prompt, return_tensors='pt')
    #print("Tokenized inputs:", inputs['input_ids'])

    # Generate at most 50 new tokens. For an encoder-decoder model such as T5,
    # max_length counts decoder tokens only, so max_new_tokens states the limit directly.
    output_tokens = model.generate(
        inputs['input_ids'],  # Input the tokenized prompt
        max_new_tokens=50,  # Limit the length of the generated summary
        no_repeat_ngram_size=2  # Prevent repeating n-grams
    )

    #print("Raw model output (tokens):", output_tokens)

    decoded_output = tokenizer.decode(output_tokens[0], skip_special_tokens=True)
    #print("Decoded output:", decoded_output)
    # Separator for readability
    dashline = "-" * 100
    print(dashline)
    print("Example", i + 1)
    print(dashline)
    print('Input Prompt:', dialogue)
    print(dashline)
    print('Baseline Human Summary:', summary)
    print(dashline)
    print('Model Generated Summary:', decoded_output)
    print("\n")

# Ensure you adjust parameters and debug prints as per your specific requirements and data.


----------------------------------------------------------------------------------------------------
Example 1
----------------------------------------------------------------------------------------------------
Input Prompt: #Person1#: Ms. Dawson, I need you to take a dictation for me.
#Person2#: Yes, sir...
#Person1#: This should go out as an intra-office memorandum to all employees by this afternoon. Are you ready?
#Person2#: Yes, sir. Go ahead.
#Person1#: Attention all staff... Effective immediately, all office communications are restricted to email correspondence and official memos. The use of Instant Message programs by employees during working hours is strictly prohibited.
#Person2#: Sir, does this apply to intra-office communications only? Or will it also restrict external communications?
#Person1#: It should apply to all communications, not only in this office between employees, but also any outside communications.
#Person2#: But sir, many employees use Instant Messaging to co

Comparing the outputs of 't5-base' and 'flan-t5-base' for the same settings, we can see that 'flan-t5-base' gives the better output. This is consistent with 'flan-t5-base' being instruction fine-tuned, and it shows that the model generates more contextually appropriate summaries for this task.

Now let's change the prompt text and see how the output changes.

In [None]:
for i, index in enumerate(example_indices):
  dialogue = dataset['test'][index]['dialogue']
  summary = dataset['test'][index]['summary']


  prompt = f"""
Dialogue is : .

{dialogue}

Summary:
    """

  ## Pass the constructed prompt, not the raw dialogue, to the tokenizer
  inputs = tokenizer(prompt, return_tensors = 'pt')
  output = tokenizer.decode(
      model.generate(
          inputs['input_ids'],
          max_new_tokens = 50,
      )[0],
      skip_special_tokens = True
  )

  print(dashline)
  print("Example ", i+1)
  print(dashline)
  print(f'Input Prompt: \n{dialogue}')
  print(dashline)
  print(f'Baseline Human Summary: \n{summary}')
  print(dashline)
  print(f'Model generation - Prompt Engineering: \n{output}\n')

----------------------------------------------------------------------------------------------------
Example  1
----------------------------------------------------------------------------------------------------
Input Prompt: 
#Person1#: Ms. Dawson, I need you to take a dictation for me.
#Person2#: Yes, sir...
#Person1#: This should go out as an intra-office memorandum to all employees by this afternoon. Are you ready?
#Person2#: Yes, sir. Go ahead.
#Person1#: Attention all staff... Effective immediately, all office communications are restricted to email correspondence and official memos. The use of Instant Message programs by employees during working hours is strictly prohibited.
#Person2#: Sir, does this apply to intra-office communications only? Or will it also restrict external communications?
#Person1#: It should apply to all communications, not only in this office between employees, but also any outside communications.
#Person2#: But sir, many employees use Instant Messaging to 

Now let's change the prompt text again and see how the output changes.

In [None]:
for i, index in enumerate(example_indices):
  dialogue = dataset['test'][index]['dialogue']
  summary = dataset['test'][index]['summary']


  prompt = f"""
Dialogue

{dialogue}

What was going on?
"""

  ## Pass the constructed prompt, not the raw dialogue, to the tokenizer
  inputs = tokenizer(prompt, return_tensors = 'pt')
  output = tokenizer.decode(
      model.generate(
          inputs['input_ids'],
          max_new_tokens = 50,
      )[0],
      skip_special_tokens = True
  )

  print(dashline)
  print("Example ", i+1)
  print(dashline)
  print(f'Input Prompt: \n{dialogue}')
  print(dashline)
  print(f'Baseline Human Summary: \n{summary}\n')
  print(dashline)
  print(f'Model generation - ZERO SHOT: \n{output}\n')

----------------------------------------------------------------------------------------------------
Example  1
----------------------------------------------------------------------------------------------------
Input Prompt: 
#Person1#: Ms. Dawson, I need you to take a dictation for me.
#Person2#: Yes, sir...
#Person1#: This should go out as an intra-office memorandum to all employees by this afternoon. Are you ready?
#Person2#: Yes, sir. Go ahead.
#Person1#: Attention all staff... Effective immediately, all office communications are restricted to email correspondence and official memos. The use of Instant Message programs by employees during working hours is strictly prohibited.
#Person2#: Sir, does this apply to intra-office communications only? Or will it also restrict external communications?
#Person1#: It should apply to all communications, not only in this office between employees, but also any outside communications.
#Person2#: But sir, many employees use Instant Messaging to 

We can see that the output differs when we change the prompt.

### One-Shot Inference


In [None]:
# Construct the prompt to perform one shot inference
def make_prompt(example_indices_full, example_index_to_summarize):
  prompt = ""
  for index in example_indices_full:
    dialogue = dataset['validation'][index]['dialogue']
    summary = dataset['validation'][index]['summary']

    ## Each example ends with '{summary}\n' and the next block starts with a newline, so examples are separated by a blank line. Other models may have their own preferred stop sequence.

    prompt += f"""
Dialogue:

{dialogue}

What was going on?
{summary}
"""
  dialogue = dataset['test'][example_index_to_summarize]['dialogue']

  prompt += f"""
Dialogue:

{dialogue}

What was going on?
"""
  return prompt

In [None]:
example_indices_full = [40]
example_index_to_summarize = 190

one_shot_prompt =  make_prompt(example_indices_full, example_index_to_summarize)

print(one_shot_prompt)


Dialogue:

#Person1#: Adam, could you show me around the school? 
#Person2#: No problem. 
#Person1#: What's the tallest building? 
#Person2#: You mean the white building near the playground? 
#Person1#: Yes. 
#Person2#: That is the library. And it has more than 1, 000, 000 books. 
#Person1#: What's the building to the south of the library? 
#Person2#: You know, our school is divided into two parts, the junior high school and the senior high school. That is the new classroom building for our senior high school. 
#Person1#: Is there a swimming pool in your school? 
#Person2#: Yes. There is a large swimming pool, but it is only available in summer. 
#Person1#: I do envy you. And I hope I can enter your school one day. 
#Person2#: I believe that you can make your dream come true. 

What was going on?
#Person1# asks Adam to show #Person1# around the school. #Person1# envies Adam and hopes to enter Adam's school one day.

Dialogue:

#Person1#: Adam, could you show me around the school?
#Per

In [None]:
## Now pass this prompt to perform the one shot inference:
summary = dataset['test'][example_index_to_summarize]['summary']

inputs = tokenizer(one_shot_prompt, return_tensors = 'pt')
output = tokenizer.decode(
    model.generate(
        inputs['input_ids'],
        max_new_tokens = 512,
    )[0],
    skip_special_tokens = True
)

print(dashline)
print(f'BASELINE HUMAN SUMMARY:\n{summary}\n')
print(dashline)
print(f'MODEL GENERATION -- ONE SHOT:\n{output}')


----------------------------------------------------------------------------------------------------
BASELINE HUMAN SUMMARY:
Adam shows #Person1# around his school and introduces the library, the new classroom building, and the swimming pool.

----------------------------------------------------------------------------------------------------
MODEL GENERATION -- ONE SHOT:
#Person1# asks Adam to show #Person2 around the school. #Person1# envies Adam and hopes to enter Adam's school one day.


### Few-Shot Inference
Now let's build the prompt for few-shot inference.

### Challenge 2:

Try few-shot inference with 'flan-t5-base', starting with two examples and going up to five. Choose one item from the test set and use that same item for every prediction. Analyse how the output summary changes as the number of examples in the few-shot prompt changes.
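
Few-shot prompts grow with every added example, so it can help to check how many examples fit within a chosen input budget before calling the model. The sketch below keeps as many leading examples as fit. The `fit_examples` helper, the toy blocks, and the budget of 30 are assumptions for illustration, and the default whitespace count is only a crude proxy for real token counts.

```python
def fit_examples(example_blocks, target_block, budget,
                 count_tokens=lambda s: len(s.split())):
    """Keep as many leading example blocks as fit, with the target, inside the budget."""
    used = count_tokens(target_block)  # the target must always be included
    kept = []
    for block in example_blocks:
        cost = count_tokens(block)
        if used + cost > budget:
            break
        kept.append(block)
        used += cost
    return kept, used


# Toy blocks invented for illustration; real blocks would come from make_prompt-style text.
examples = [
    "Dialogue:\nA: hello there\nB: hi\nWhat was going on?\nA greets B.\n",
    "Dialogue:\nA: any plans today\nB: not yet\nWhat was going on?\nA asks B about plans.\n",
]
target = "Dialogue:\nA: what time is it\nB: ten to nine\nWhat was going on?\n"

kept, used = fit_examples(examples, target, budget=30)
print(len(kept), used)
```

In the notebook, one option is to pass `count_tokens=lambda s: len(tokenizer(s)['input_ids'])` so that the count reflects the model's tokenizer. Summing per-block counts is still an approximation of the length of the joined prompt.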

In [None]:
# The indices in example_indices_full are taken from the validation set, as declared in the make_prompt function
example_indices_full = [.......]
#example_index_to_summarize is taken from the test set as we declared in the make_prompt function
example_index_to_summarize = ....

few_shot_prompt = make_prompt(example_indices_full, example_index_to_summarize)

print(few_shot_prompt)

In [None]:
summary = dataset['test'][example_index_to_summarize]['summary']

inputs = tokenizer(few_shot_prompt, return_tensors = 'pt')
output = tokenizer.decode(
    model.generate(
        inputs['input_ids'],
        max_new_tokens = 512,
    )[0],
    skip_special_tokens = True
)

print(dashline)
print(f'BASELINE HUMAN SUMMARY:\n{summary}\n')
print(dashline)
print(f'MODEL GENERATION -- FEW SHOT:\n{output}')

### Additional Challenge :

Try the inference techniques you have learned with different prompt structures and different language models, and identify which combination is best suited to the dialogue summarization task.
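
One way to organise such a comparison is a small grid that runs every prompt template against every model. The sketch below shows only the bookkeeping. The `run_grid` helper and the template names are assumptions for illustration, and the two "generators" are stubs standing in for real models so that the example runs without downloading anything.

```python
def run_grid(templates, generators, dialogue):
    """Return {(template_name, model_name): output} for every combination."""
    results = {}
    for t_name, template in templates.items():
        prompt = template.format(dialogue=dialogue)
        for m_name, generate in generators.items():
            results[(t_name, m_name)] = generate(prompt)
    return results


# Two prompt wordings used earlier in this lab.
templates = {
    "summarize": "Summarize the following conversation.\n\n{dialogue}\n\nSummary:",
    "what_was_going_on": "Dialogue:\n\n{dialogue}\n\nWhat was going on?\n",
}

# Stub generators (hypothetical): each takes a prompt string and returns a string.
generators = {
    "echo_first_line": lambda prompt: prompt.splitlines()[0],
    "prompt_length": lambda prompt: f"{len(prompt)} characters",
}

results = run_grid(templates, generators, "#Person1#: What time is it, Tom?")
for key, output in sorted(results.items()):
    print(key, "->", output)
```

To use real models, each stub would be replaced by a function that tokenizes the prompt, calls `model.generate`, and decodes the result, as in the cells above.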