

# **This is the start of our notebook.**
> ## In the first part we import everything we need
>> But first we install a few packages:
>> - upgrade pip
>> - install PyTorch and TorchData
>> - install the Transformers library
>> - install the Datasets library

> ## Then we import the necessary libraries
> ## Configure git so we can push everything as we go along.



In [None]:
%pip install --upgrade pip
%pip install --disable-pip-version-check \
    torch==1.13.1 \
    torchdata==0.5.1 --quiet

%pip install \
    transformers==4.27.2 \
    datasets==2.11.0 --quiet

In [23]:
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM
from transformers import AutoTokenizer
from transformers import GenerationConfig


> Download the DialogSum dataset from Hugging Face.

In [None]:
ds = load_dataset("knkarthick/dialogsum")

> Let's check out one of the examples in the DialogSum dataset.

In [29]:
dashline = '-' * 100

print(dashline)
print("EXAMPLE")
print(dashline)
print(ds['test'][40]['dialogue'])
print(dashline)
print("SUMMARY")
print(dashline)
print(ds['test'][40]['summary'])
print(dashline)

----------------------------------------------------------------------------------------------------
EXAMPLE
----------------------------------------------------------------------------------------------------
#Person1#: What time is it, Tom?
#Person2#: Just a minute. It's ten to nine by my watch.
#Person1#: Is it? I had no idea it was so late. I must be off now.
#Person2#: What's the hurry?
#Person1#: I must catch the nine-thirty train.
#Person2#: You've plenty of time yet. The railway station is very close. It won't take more than twenty minutes to get there.
----------------------------------------------------------------------------------------------------
SUMMARY
----------------------------------------------------------------------------------------------------
#Person1# is in a hurry to catch a train. Tom tells #Person1# there is plenty of time.
----------------------------------------------------------------------------------------------------


> Now we are going to load our FLAN-T5 model, an LLM that can handle a wide range of tasks. We will use it to summarize the dialogues.

In [30]:
model_name = "google/flan-t5-large"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)




config.json:   0%|          | 0.00/662 [00:00<?, ?B/s]



model.safetensors:   0%|          | 0.00/3.13G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/147 [00:00<?, ?B/s]

In [None]:
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast = True)

> Input sequences (illustrative tokens, counting "!" as its own token):<br>
> Sequence 1: "Hello world!" (length 3) <br>
> Sequence 2: "Hi!" (length 2)<br>
>
> Padding: <br>
> Sequence 1: "Hello world!" (padded length 3)<br>
> Sequence 2: "Hi! <PAD>" (padded length 3)<br>
>
> Attention mask (1 = real token, 0 = padding):<br>
> Sequence 1: [1, 1, 1]<br>
> Sequence 2: [1, 1, 0]
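
The padding idea above can be sketched in plain Python. This is only an illustration, not what the tokenizer does internally: `pad_batch` and the `<PAD>` string are hypothetical names, and we assume the punctuation mark counts as its own token.

```python
def pad_batch(sequences, pad_token="<PAD>"):
    """Right-pad token lists to the longest length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    padded, masks = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padded.append(seq + [pad_token] * n_pad)     # real tokens, then padding
        masks.append([1] * len(seq) + [0] * n_pad)   # 1 = real token, 0 = padding
    return padded, masks


# Hand-written tokens (assumption: "!" is a separate token)
batch = [["Hello", "world", "!"], ["Hi", "!"]]
padded, masks = pad_batch(batch)
print(padded)
print(masks)
```

The mask lets the model ignore the padded positions, so the shorter sequence is not affected by the filler tokens.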


In [32]:
example_text = "Caesar is a good dirty cat"

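# Encode the text into token ids (PyTorch tensors)
string_encode = tokenizer(example_text, return_tensors="pt")

# Decode only the token at position 6 of the first (and only) sequence
string_decode = tokenizer.decode(string_encode['input_ids'][0][6], skip_special_tokens=True)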

print(string_encode)
print(string_decode)

{'input_ids': tensor([[26218,    19,     3,     9,   207, 13086,  1712,     1]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1]])}
cat


## Now we are going to use the model to generate a summary without prompt engineering. Then we will try zero-shot, one-shot and few-shot inference to see how it goes.

In [33]:
indices = [40, 200]

for i, index in enumerate(indices):

  dialogue = ds['test'][index]['dialogue']
  summary = ds['test'][index]['summary']

  # Tokenize the raw dialogue with no instruction around it
  inputs = tokenizer(dialogue, return_tensors="pt")
  output = tokenizer.decode(model.generate(inputs['input_ids'], max_new_tokens=100)[0], skip_special_tokens=True)

  print(dashline)
  print(f"EXAMPLE : {i + 1}")
  print(dashline)
  print(dialogue)
  print(dashline)
  print(f"SUMMARY by human : {summary}")
  print(f"\n{dashline}")
  print(f"SUMMARY by model (WITHOUT PROMPT ENGINEERING): {output}")
  print(dashline)
  print()

----------------------------------------------------------------------------------------------------
EXAMPLE : 1
----------------------------------------------------------------------------------------------------
#Person1#: What time is it, Tom?
#Person2#: Just a minute. It's ten to nine by my watch.
#Person1#: Is it? I had no idea it was so late. I must be off now.
#Person2#: What's the hurry?
#Person1#: I must catch the nine-thirty train.
#Person2#: You've plenty of time yet. The railway station is very close. It won't take more than twenty minutes to get there.
----------------------------------------------------------------------------------------------------
SUMMARY by human : #Person1# is in a hurry to catch a train. Tom tells #Person1# there is plenty of time.

----------------------------------------------------------------------------------------------------
SUMMARY by model (WITHOUT PROMPT ENGINEERING): #Person1#: I'm afraid I'm late.
----------------------------------------

### Zero-shot inference


In [34]:
indices = [40, 200]

for i, index in enumerate(indices):

  dialogue = ds['test'][index]['dialogue']
  summary = ds['test'][index]['summary']

  prompt = f"""
  The dialogue is given below:
  {dialogue}

  The summary of the dialogue is;
  """

  # `inputs` avoids shadowing the built-in input()
  inputs = tokenizer(prompt, return_tensors="pt")
  output = tokenizer.decode(model.generate(inputs['input_ids'], max_new_tokens=100)[0], skip_special_tokens=True)

  print(dashline)
  print(f"EXAMPLE : {i + 1}")
  print(dashline)
  print(prompt)
  print(dashline)
  print(f"SUMMARY by human : {summary}")
  print(f"\n{dashline}")
  print(f"SUMMARY by model (WITH ZERO SHOT INFERENCE): {output}")
  print(dashline)
  print()

----------------------------------------------------------------------------------------------------
EXAMPLE : 1
----------------------------------------------------------------------------------------------------

  The dialogue is given below:
  #Person1#: What time is it, Tom?
#Person2#: Just a minute. It's ten to nine by my watch.
#Person1#: Is it? I had no idea it was so late. I must be off now.
#Person2#: What's the hurry?
#Person1#: I must catch the nine-thirty train.
#Person2#: You've plenty of time yet. The railway station is very close. It won't take more than twenty minutes to get there.

  The summary of the dialogue is;
  
----------------------------------------------------------------------------------------------------
SUMMARY by human : #Person1# is in a hurry to catch a train. Tom tells #Person1# there is plenty of time.

----------------------------------------------------------------------------------------------------
SUMMARY by model (WITH ZERO SHOT INFERENCE): Pe

#### But first, let's write a helper function so we can reuse the prompt-building code

In [35]:
def generate_prompts(indice_of_the_example, indice_of_summary):
  """Build a few-shot prompt: solved examples first, then the dialogue to summarize."""
  prompt = ''
  for i in indice_of_the_example:
    dialogue = ds['test'][i]['dialogue']
    summary = ds['test'][i]['summary']

    prompt += f"""
The dialogue is given below:
{dialogue}
{dashline}
The summary of the dialogue is;
{summary}
{dashline}
    """

  prompt+= f"""
Now look at the example above and try to summarize this dialogue given below

{ds['test'][indice_of_summary]['dialogue']}

The summary : """

  return prompt


### Now for one-shot and few-shot inference:

> Passing a single example index gives one-shot inference; the two indices used below make it few-shot.

In [36]:
indice_of_the_example = [40, 300]
indice_of_summary = 200
prompt = generate_prompts(indice_of_the_example, indice_of_summary)
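
The number of "shots" is just the number of solved examples placed before the dialogue we want summarized. Here is a minimal self-contained sketch of that idea with made-up data. `toy_examples` and `build_prompt` are hypothetical names used only for illustration, and the prompt wording is simplified compared with `generate_prompts`.

```python
# Toy (dialogue, summary) pairs; made up for illustration, not taken from DialogSum.
toy_examples = [
    ("A: Lunch at noon? B: Sure.", "A and B agree to have lunch at noon."),
    ("A: Is the report done? B: Almost.", "B has nearly finished the report."),
]


def build_prompt(examples, dialogue):
    """Prepend zero or more solved examples to the dialogue we want summarized."""
    parts = [f"Dialogue:\n{d}\nSummary:\n{s}\n" for d, s in examples]
    parts.append(f"Dialogue:\n{dialogue}\nSummary:\n")
    return "\n".join(parts)


query = "A: Can you call me back? B: Yes, in ten minutes."
zero_shot = build_prompt([], query)                # no solved example
one_shot = build_prompt(toy_examples[:1], query)   # one solved example
few_shot = build_prompt(toy_examples, query)       # several solved examples

for name, p in [("zero", zero_shot), ("one", one_shot), ("few", few_shot)]:
    print(name, "shot ->", p.count("Dialogue:") - 1, "solved example(s)")
```

In this notebook the same effect comes from the length of the index list passed to `generate_prompts`.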

In [37]:
# `inputs` avoids shadowing the built-in input()
inputs = tokenizer(prompt, return_tensors="pt")
output = tokenizer.decode(model.generate(inputs['input_ids'], max_new_tokens=100)[0], skip_special_tokens=True)

print(dashline)
print("EXAMPLE ")
print(dashline)
print(prompt)
print(dashline)
print(f"SUMMARY by human : {ds['test'][indice_of_summary]['summary']}")
print(f"\n{dashline}")
print(f"SUMMARY by model (WITH FEW SHOT INFERENCE): {output}")
print(dashline)
print()

Token indices sequence length is longer than the specified maximum sequence length for this model (838 > 512). Running this sequence through the model will result in indexing errors


----------------------------------------------------------------------------------------------------
EXAMPLE 
----------------------------------------------------------------------------------------------------
 
The dialogue is given below:
#Person1#: What time is it, Tom?
#Person2#: Just a minute. It's ten to nine by my watch.
#Person1#: Is it? I had no idea it was so late. I must be off now.
#Person2#: What's the hurry?
#Person1#: I must catch the nine-thirty train.
#Person2#: You've plenty of time yet. The railway station is very close. It won't take more than twenty minutes to get there. 
----------------------------------------------------------------------------------------------------
The summary of the dialogue is;
#Person1# is in a hurry to catch a train. Tom tells #Person1# there is plenty of time.
----------------------------------------------------------------------------------------------------
     
The dialogue is given below:
#Person1#: I cannot imagine if Trump were t
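
The warning at the top of this output reports that the prompt is 838 tokens long against a 512-token limit, so a few-shot prompt can easily grow past what the model was set up for. One possible way to handle this is to drop examples until the prompt fits. The sketch below is an assumption, not part of the original notebook: `count_tokens` is a whitespace stand-in, and `fit_examples` is a hypothetical helper. In the notebook, `len(tokenizer(prompt)['input_ids'])` would give the real token count.

```python
def count_tokens(text):
    # Stand-in token counter (assumption): one token per whitespace-separated word.
    return len(text.split())


def fit_examples(examples, query, max_tokens, count=count_tokens):
    """Keep as many leading examples as fit in the budget together with the query."""
    kept = list(examples)
    while kept:
        prompt = "\n\n".join(kept + [query])
        if count(prompt) <= max_tokens:
            return prompt, len(kept)
        kept.pop()  # drop the last example and try again
    return query, 0  # nothing fits: fall back to the query alone (zero-shot)


examples = ["one two three four", "five six seven eight", "nine ten eleven twelve"]
query = "summarize this please"
prompt, n_kept = fit_examples(examples, query, max_tokens=12)
print(n_kept, count_tokens(prompt))
```

Dropping whole examples keeps each remaining example intact, which is usually preferable to cutting a dialogue in the middle.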

## For the last part, we are going to tweak the generation configuration a bit and see what happens.

In [40]:
generation_configuration = GenerationConfig(
    num_beams=3,
    num_return_sequences=1,
    max_new_tokens = 100,
    do_sample = True,
    temperature = 0.9)

# `inputs` avoids shadowing the built-in input()
inputs = tokenizer(prompt, return_tensors="pt")
output = tokenizer.decode(
          model.generate(
            inputs['input_ids'],
            generation_config=generation_configuration)[0],
          skip_special_tokens=True)

print(dashline)
print("EXAMPLE ")
print(dashline)
print(prompt)
print(dashline)
print(f"SUMMARY by human : {ds['test'][indice_of_summary]['summary']}")
print(f"\n{dashline}")
print(f"SUMMARY by model (WITH FEW SHOT INFERENCE WITH CONFIG): {output}")
print(dashline)
print()

----------------------------------------------------------------------------------------------------
EXAMPLE 
----------------------------------------------------------------------------------------------------
 
The dialogue is given below:
#Person1#: What time is it, Tom?
#Person2#: Just a minute. It's ten to nine by my watch.
#Person1#: Is it? I had no idea it was so late. I must be off now.
#Person2#: What's the hurry?
#Person1#: I must catch the nine-thirty train.
#Person2#: You've plenty of time yet. The railway station is very close. It won't take more than twenty minutes to get there. 
----------------------------------------------------------------------------------------------------
The summary of the dialogue is;
#Person1# is in a hurry to catch a train. Tom tells #Person1# there is plenty of time.
----------------------------------------------------------------------------------------------------
     
The dialogue is given below:
#Person1#: I cannot imagine if Trump were t

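> The generation parameters worth knowing here are:
- do_sample : sample the next token from the model's probability distribution instead of always taking the most likely one
- temperature : the higher the temperature, the flatter the distribution and the more random the output
- num_beams : the number of candidate sequences (beams) kept at each step; higher means wider exploration
- num_return_sequences : how many finished sequences `generate` returns
- top_p : nucleus sampling (not set in the config above); for example top_p=0.9 samples only from the smallest set of tokens whose cumulative probability reaches 0.9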
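
To make temperature and top_p concrete, here is a small standard-library sketch of how they reshape a next-token distribution. It is a toy illustration under our own assumptions, not the Transformers implementation: the function names and the four made-up logits are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=1.0, top_p=1.0, rng=random):
    """Sample one token id after applying temperature and nucleus filtering."""
    filtered = top_p_filter(softmax_with_temperature(logits, temperature), top_p)
    ids = list(filtered)
    return rng.choices(ids, weights=[filtered[i] for i in ids], k=1)[0]


logits = [2.0, 1.0, 0.1, -1.0]  # made-up scores for a 4-token vocabulary
cold = softmax_with_temperature(logits, temperature=0.5)
hot = softmax_with_temperature(logits, temperature=2.0)
print("top probability at T=0.5:", round(cold[0], 3))
print("top probability at T=2.0:", round(hot[0], 3))
print("sampled token id:", sample_token(logits, temperature=0.9, top_p=0.9))
```

A low temperature concentrates probability on the top token, while a high one spreads it out, which is why high temperatures give more varied summaries.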

Interaction between num_beams and num_return_sequences:
When used together, these parameters control how many candidates are explored and how many are returned:

High num_beams, low num_return_sequences:
Explores many candidate paths and returns only the best-scoring few.<br>
<br>
Low num_beams, high num_return_sequences:
Only makes sense with sampling (do_sample = True), because plain beam search cannot return more sequences than it has beams. The returned sequences are more varied, but each one is explored in less depth.<br><br>
High num_beams, high num_return_sequences:
Explores many candidate paths and returns several of the top-scoring ones, at the highest compute cost.<br><br>
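
A toy beam search shows how the two parameters interact. This is a sketch over a made-up two-step bigram table, not the Transformers implementation. `NEXT` and `beam_search` are hypothetical names, and the check on `num_return_sequences` reflects the fact that beam search can only return sequences it actually kept.

```python
import math

# Made-up next-token probabilities keyed by the previous token (toy bigram model).
NEXT = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"x": 0.9, "y": 0.1},
}


def beam_search(num_beams, num_return_sequences):
    """Generate two tokens, keeping the `num_beams` best partial sequences at each step."""
    if num_return_sequences > num_beams:
        raise ValueError("num_return_sequences must be <= num_beams")
    beams = [(0.0, ["<s>"])]  # (log-probability, tokens so far)
    for _ in range(2):  # the toy table only supports two steps
        candidates = []
        for score, tokens in beams:
            for tok, p in NEXT[tokens[-1]].items():
                candidates.append((score + math.log(p), tokens + [tok]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:num_beams]  # prune to the best beams
    return [(" ".join(t[1:]), round(math.exp(s), 2))
            for s, t in beams[:num_return_sequences]]


print(beam_search(num_beams=1, num_return_sequences=1))  # greedy
print(beam_search(num_beams=2, num_return_sequences=1))  # wider search, one result
print(beam_search(num_beams=2, num_return_sequences=2))  # wider search, two results
```

With one beam the search commits to "a" first because it looks best locally; with two beams it also keeps "b" alive and finds the higher-probability sequence "b x".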

**The higher the number of beams and return sequences, the longer generation takes. It is usually better to keep num_beams moderate and num_return_sequences = 1. Also, a higher temperature makes the answer more varied, which can sometimes mean less accurate.**