## GenAI - Summarize text with base LLMs and improve the results with prompting techniques

In this notebook, we will learn how to use base LLMs hosted on Hugging Face to summarize a dialogue. We will see that a base LLM does not perform very well on the summarization task out of the box, but that its output can be improved with prompt writing and in-context learning (few-shot learning). We will cover the following topics:
1. Use the FLAN-T5 model to summarize dialogues.
2. Use prompts to improve the inference output.
3. Use in-context learning and different generation configurations to get the best summarization results from the same base model.

In [1]:
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM
from transformers import AutoTokenizer
from transformers import GenerationConfig
from pprint import pprint

### 1. Let's summarize a dialogue without prompt engineering

In [2]:
huggingface_dataset_name = "knkarthick/dialogsum"
dataset = load_dataset(huggingface_dataset_name)
example_record = 200
pprint(dataset['test'][example_record])

{'dialogue': '#Person1#: Have you considered upgrading your system?\n'
             "#Person2#: Yes, but I'm not sure what exactly I would need.\n"
             '#Person1#: You could consider adding a painting program to your '
             'software. It would allow you to make up your own flyers and '
             'banners for advertising.\n'
             '#Person2#: That would be a definite bonus.\n'
             '#Person1#: You might also want to upgrade your hardware because '
             'it is pretty outdated now.\n'
             '#Person2#: How can we do that?\n'
             "#Person1#: You'd probably need a faster processor, to begin "
             'with. And you also need a more powerful hard disc, more memory '
             'and a faster modem. Do you have a CD-ROM drive?\n'
             '#Person2#: No.\n'
             '#Person1#: Then you might want to add a CD-ROM drive too, '
             'because most new software programs are coming out on Cds.\n'
             '#Person

In [3]:
# load google flan-t5 model
model = AutoModelForSeq2SeqLM.from_pretrained('google/flan-t5-base')
tokenizer = AutoTokenizer.from_pretrained('google/flan-t5-base', use_fast=True)

In [4]:
# summarize the text with base LLM and compare it with human baseline.
dialogue = dataset['test'][example_record]['dialogue']
inputs = tokenizer(dialogue, return_tensors='pt')
output_tokens = model.generate(inputs["input_ids"], max_new_tokens=50,)
model_inference_output = tokenizer.decode(output_tokens[0], skip_special_tokens=True)
print("#### Human Baseline Summary -->")
print(dataset['test'][example_record]['summary'])
print("#### Base Model Inference Summary -->")
print(model_inference_output)

#### Human Baseline Summary -->
#Person1# teaches #Person2# how to upgrade software and hardware in #Person2#'s system.
#### Base Model Inference Summary -->
#Person1#: I'm thinking of upgrading my computer.


#### We can see that the model performs poorly at inferring the summary.

### 2. Let's use prompt engineering now and try to get better results (this method is also called zero-shot instruction prompting)
We will try the following two prompts:
1. Generic instruction prompt
2. FLAN-T5 template prompt

In [5]:
generic_prompt = f"""
Summarize the following conversation. 

{dialogue}

Summary:"""

template_prompt = f"""
Dialogue:

{dialogue}

What was going on?
"""
inputs = tokenizer(generic_prompt, return_tensors='pt')
output_tokens = model.generate(inputs["input_ids"], max_new_tokens=50,)
generic_prompt_output = tokenizer.decode(output_tokens[0], skip_special_tokens=True)

inputs = tokenizer(template_prompt, return_tensors='pt')
output_tokens = model.generate(inputs["input_ids"], max_new_tokens=50,)
template_prompt_output = tokenizer.decode(output_tokens[0], skip_special_tokens=True)

print("#### Human Baseline Summary -->")
print(dataset['test'][example_record]['summary'])
print("#### Summary Generated by generic prompt->")
print(generic_prompt_output)
print("#### Summary Generated by template prompt->")
print(template_prompt_output)

#### Human Baseline Summary -->
#Person1# teaches #Person2# how to upgrade software and hardware in #Person2#'s system.
#### Summary Generated by generic prompt->
#Person1#: I'm thinking of upgrading my computer.
#### Summary Generated by template prompt->
#Person1#: You could add a painting program to your software. #Person2#: That would be a bonus. #Person1#: You might also want to upgrade your hardware. #Person1#


#### We can see that even prompt engineering did not produce a helpful summary with the base model.
### 3. Let's try one-shot or few-shot inference now.

In [21]:
# function which generates a few shot prompt. 
def generate_prompt(examples_indices, input_index):
    """
    Generates a few-shot learning prompt using specified examples from dataset object.

    :param examples_indices: List of indices for examples in the dataset to be used in the prompt.
    :param input_index: Index of the input dialogue in the dataset for which the summary is to be generated.
    :return: A string prompt for a few-shot learning.
    """
    prompt = ""

    # Adding few-shot examples
    for idx in examples_indices:
        example = dataset['test'][idx]
        prompt += f"\nDialogue:\n\n{example['dialogue']}\n\nWhat was going on?\n\n{example['summary']}\n\n"

    # Adding the input dialogue
    input_dialogue = dataset['test'][input_index]['dialogue']
    prompt += f"\nDialogue:\n\n{input_dialogue}\n\nWhat was going on?\n"

    return prompt
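
A dataset can hold several records that share one dialogue and differ only in the summary. If such a record is used as a few-shot example, the reference summary for the target leaks into the prompt. The following is a minimal sketch of a variant that guards against this. The name `build_few_shot_prompt`, the `records` parameter, and the toy records are hypothetical and only for illustration; it assumes `records` supports integer indexing and returns dicts with `dialogue` and `summary` keys.

```python
def build_few_shot_prompt(records, example_indices, input_index):
    """Build a few-shot prompt, skipping examples that repeat the target dialogue."""
    target_dialogue = records[input_index]["dialogue"]
    parts = []
    for idx in example_indices:
        example = records[idx]
        # Skip the target itself and any record with an identical dialogue,
        # so a reference summary for the target never appears in the prompt.
        if idx == input_index or example["dialogue"] == target_dialogue:
            continue
        parts.append(
            f"Dialogue:\n\n{example['dialogue']}\n\n"
            f"What was going on?\n\n{example['summary']}\n"
        )
    # The target dialogue goes last, with the question left unanswered.
    parts.append(f"Dialogue:\n\n{target_dialogue}\n\nWhat was going on?\n")
    return "\n\n".join(parts)


# Toy records (hypothetical, not taken from the dataset).
toy_records = [
    {"dialogue": "#Person1#: Is the report done?\n#Person2#: Almost.",
     "summary": "#Person2# has nearly finished the report."},
    {"dialogue": "#Person1#: Lunch at noon?\n#Person2#: Sure.",
     "summary": "#Person1# and #Person2# agree to have lunch."},
    {"dialogue": "#Person1#: Lunch at noon?\n#Person2#: Sure.",
     "summary": "#Person1# invites #Person2# to lunch."},
]

# Record 1 repeats the target dialogue (record 2), so only record 0 is used.
toy_prompt = build_few_shot_prompt(toy_records, [0, 1], 2)
print(toy_prompt)
```

Filtering on exact string equality is a deliberate simplification; near-duplicate dialogues would need a fuzzier comparison.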

In [33]:
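# construct the few-shot prompt; here we pass examples 197 and 198 as in-context examples
# note: the printed prompt below shows that example 198 contains the same dialogue as the target (index 200),
# so the model sees a reference summary for it; choose examples with different dialogues for a fair comparison
prompt = generate_prompt([197, 198], 200)
# let's print the few-shot prompt we generated
print(prompt)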


Dialogue:

#Person1#: John dates her seven times a week.
#Person2#: Really? That's a straws in the wind.
#Person1#: I think so. Maybe he's fallen for her.
#Person2#: Yeah. They suit each other. A perfect match between a man and a girl.

What was going on?

#Person1# and #Person2# are talking about a couple.


Dialogue:

#Person1#: Have you considered upgrading your system?
#Person2#: Yes, but I'm not sure what exactly I would need.
#Person1#: You could consider adding a painting program to your software. It would allow you to make up your own flyers and banners for advertising.
#Person2#: That would be a definite bonus.
#Person1#: You might also want to upgrade your hardware because it is pretty outdated now.
#Person2#: How can we do that?
#Person1#: You'd probably need a faster processor, to begin with. And you also need a more powerful hard disc, more memory and a faster modem. Do you have a CD-ROM drive?
#Person2#: No.
#Person1#: Then you might want to add a CD-ROM drive too, becau

In [35]:

# get results from the few-shot prompt above
inputs = tokenizer(prompt, return_tensors='pt')
# we pass temperature in the generation config; with do_sample=True it controls how random
# the sampling is (lower values make the output more deterministic, higher values more varied)
generation_config = GenerationConfig(max_new_tokens=50, do_sample=True, temperature=0.2)
output_tokens = model.generate(inputs["input_ids"], generation_config=generation_config)
few_shot_output = tokenizer.decode(output_tokens[0], skip_special_tokens=True)
print(few_shot_output)

#Person1 is giving #Person2# some advice for upgrading #Person2#'s system, such as adding a painting program and a faster processor.
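
To make the effect of `temperature` more concrete, the sketch below applies a temperature-scaled softmax to a few made-up logits. It only illustrates the general idea of dividing logits by the temperature before normalizing; it is not the code that `model.generate` runs internally. The function name `softmax_with_temperature` and the logit values are assumptions chosen for the example.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
toy_logits = [2.0, 1.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

A low temperature such as 0.2 concentrates almost all probability on the highest-scoring token, which is why sampled output becomes nearly deterministic; a higher temperature flattens the distribution and makes less likely tokens more probable.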


In [36]:
print("#### Human Baseline Summary -->")
print(dataset['test'][example_record]['summary'])
print("#### Base Model Inference Summary -->")
print(model_inference_output)
print("#### Summary Generated by generic prompt -->")
print(generic_prompt_output)
print("#### Summary Generated by template prompt -->")
print(template_prompt_output)
print("#### Summary Generated by few-shot learning -->")
print(few_shot_output)

#### Human Baseline Summary -->
#Person1# teaches #Person2# how to upgrade software and hardware in #Person2#'s system.
#### Base Model Inference Summary -->
#Person1#: I'm thinking of upgrading my computer.
#### Summary Generated by generic prompt -->
#Person1#: I'm thinking of upgrading my computer.
#### Summary Generated by template prompt -->
#Person1#: You could add a painting program to your software. #Person2#: That would be a bonus. #Person1#: You might also want to upgrade your hardware. #Person1#
#### Summary Generated by few-shot learning -->
#Person1 is giving #Person2# some advice for upgrading #Person2#'s system, such as adding a painting program and a faster processor.


### Results and what's next?
As we can see, few-shot (in-context) learning significantly improves the output of the base model for this example. It has been observed that adding more in-context examples brings little further gain beyond 3-4 examples. To go beyond that, we can use model fine-tuning techniques to achieve better results. Stay tuned for the next article.
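
One rough way to put a number on this comparison is to measure word overlap with the human baseline. The sketch below computes a simple unigram F1 score, similar in spirit to ROUGE-1 but not the official ROUGE implementation; the function name `unigram_f1` and the tokenization rule are assumptions made for illustration. The three strings are copied from the outputs printed above.

```python
import re
from collections import Counter


def unigram_f1(reference, candidate):
    """F1 over lowercased word tokens shared by the reference and the candidate."""
    ref_tokens = re.findall(r"\w+", reference.lower())
    cand_tokens = re.findall(r"\w+", candidate.lower())
    if not ref_tokens or not cand_tokens:
        return 0.0
    # Counter intersection keeps, for each word, the smaller of the two counts.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


human_summary = (
    "#Person1# teaches #Person2# how to upgrade software and hardware "
    "in #Person2#'s system."
)
base_summary = "#Person1#: I'm thinking of upgrading my computer."
few_shot_summary = (
    "#Person1 is giving #Person2# some advice for upgrading #Person2#'s system, "
    "such as adding a painting program and a faster processor."
)

print("base model:", round(unigram_f1(human_summary, base_summary), 3))
print("few-shot  :", round(unigram_f1(human_summary, few_shot_summary), 3))
```

A score from a single example is only indicative; a fair evaluation would average an established metric over many dialogues.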

➤ [**Connect on LinkedIn**](https://www.linkedin.com/in/aambekar234/)

