# Prompt Engineering Overview
In this experiment, we will apply prompt engineering to a Hugging Face LLM.

## 1. Installations

In [1]:
!pip install transformers
!pip install datasets

Collecting datasets
  Downloading datasets-2.18.0-py3-none-any.whl (510 kB)
Collecting dill<0.3.9,>=0.3.0 (from datasets)
  Downloading dill-0.3.8-py3-none-any.whl (116 kB)
Collecting xxhash (from datasets)
  Downloading xxhash-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (194 kB)
Collecting multiprocess (from datasets)
  Downloading multiprocess-0.70.16-py310-none-any.whl (134 kB)
Installing collected packages: xxhash, dill, multiprocess, datasets
Successfully installed datasets

## 2. Imports

In [2]:
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM
from transformers import AutoTokenizer
from transformers import GenerationConfig

## 3. Load
We will use a dataset from Hugging Face built for a summarization task. Each record includes a dialogue and a human-written summary.

In [3]:
ds_name = "knkarthick/dialogsum"
dataset = load_dataset(ds_name)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [4]:
example_indices = [20, 100]
dash_line = '-' * 99  # separator line of 99 dashes

for i, index in enumerate(example_indices):
  print(dash_line)
  print("Ex", i+1)
  print(dash_line)
  print('Input')
  print(dataset['test'][index]['dialogue'])
  print('Human summary')
  print(dataset['test'][index]['summary'])

---------------------------------------------------------------------------------------------------
Ex 1
---------------------------------------------------------------------------------------------------
Input
#Person1#: What's wrong with you? Why are you scratching so much?
#Person2#: I feel itchy! I can't stand it anymore! I think I may be coming down with something. I feel lightheaded and weak.
#Person1#: Let me have a look. Whoa! Get away from me!
#Person2#: What's wrong?
#Person1#: I think you have chicken pox! You are contagious! Get away! Don't breathe on me!
#Person2#: Maybe it's just a rash or an allergy! We can't be sure until I see a doctor.
#Person1#: Well in the meantime you are a biohazard! I didn't get it when I was a kid and I've heard that you can even die if you get it as an adult!
#Person2#: Are you serious? You always blow things out of proportion. In any case, I think I'll go take an oatmeal bath.
Human summary
#Person1# thinks #Person2# has chicken pox and warns 

## 4. Load Model
For this experiment, we will use the google/flan-t5-base model.

In [5]:
# Load Model  & Tokenizer
model_name = 'google/flan-t5-base'
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)


## 5. Zero-Shot Learning

In [7]:
# select an example (a one-element list, so the fields below come back as lists)
example_index = [100]

# get the dialogue
dialogue = dataset['test'][example_index]['dialogue']

# get the human summary
summary = dataset['test'][example_index]['summary']

# Generation settings (greedy decoding, since sampling is not enabled)
generation_config = GenerationConfig(max_new_tokens=50)

# Encode input
inputs_encoded = tokenizer(dialogue, return_tensors='pt')

# Model Output
model_output = model.generate(inputs_encoded["input_ids"], generation_config=generation_config)[0]

# Decode the output
zero_output = tokenizer.decode(model_output, skip_special_tokens=True)

print(summary)
print(zero_output)

["#Person1# and Mike have a disagreement on how to act out a scene. #Person1# proposes that Mike can try to act in #Person1#'s way."]
#Person1#: I'm sorry, but I'm not sure what Jason and Laura are doing.


The model's output is not a sensible summary: it reads like another line of dialogue. Let's try one-shot.
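One thing worth noting before adding examples: the zero-shot cell passes the raw dialogue to the model with no instruction at all. A minimal sketch of wrapping the dialogue in an explicit instruction is shown here. The helper name `make_zero_shot_prompt`, the instruction wording, and the sample dialogue are assumptions for illustration, not part of the notebook, and whether this improves the summary has not been checked here.

```python
def make_zero_shot_prompt(dialogue: str) -> str:
    """Wrap a raw dialogue in an explicit summarization instruction."""
    # The instruction wording is an assumption, not taken from the notebook.
    return (
        "Summarize the following conversation.\n\n"
        f"{dialogue}\n\n"
        "Summary:"
    )


# Hypothetical two-line dialogue in the dataset's speaker format.
sample_dialogue = "#Person1#: Are you free tonight?\n#Person2#: Yes, let's have dinner."
zero_shot_prompt = make_zero_shot_prompt(sample_dialogue)
print(zero_shot_prompt)
```

The resulting string could be passed to the tokenizer in place of the bare dialogue.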

## 6. One-Shot Learning
Let's first understand the prompt. It includes:
- A system message that defines what we expect from the model,
- Examples (input and output),
- A new input on which we want the model to make an inference.

Let's write a small example:

"
Turn the given sentence into a JSON as:
{“verb”, “subject”, “type”: [“question”, “emotion”, “action”]}

input 1: he never achieved the status he so desired
output 1: {“verb”: “achieved”, “subject”: “he”, “type”: “emotion”}

sentence:
"

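The three parts listed above (system message, solved examples, open input) can be assembled programmatically. The following is a small sketch under stated assumptions: the function name `build_prompt`, the numbered `input`/`output` labels, and the second sentence are hypothetical and only extend the pattern of the example prompt.

```python
import json


def build_prompt(system_message, examples, new_input):
    """Assemble a system message, solved examples, and one open input."""
    parts = [system_message, ""]
    for n, (text, output) in enumerate(examples, start=1):
        parts.append(f"input {n}: {text}")
        parts.append(f"output {n}: {json.dumps(output)}")
        parts.append("")
    # The final input has no output; the model is expected to complete it.
    last = len(examples) + 1
    parts.append(f"input {last}: {new_input}")
    parts.append(f"output {last}:")
    return "\n".join(parts)


system_message = (
    "Turn the given sentence into a JSON as: "
    '{"verb", "subject", "type": ["question", "emotion", "action"]}'
)
examples = [
    (
        "he never achieved the status he so desired",
        {"verb": "achieved", "subject": "he", "type": "emotion"},
    ),
]
# Hypothetical new sentence, not from the notebook.
prompt = build_prompt(system_message, examples, "did she close the door?")
print(prompt)
```

Ending the prompt with an empty output label leaves the model one obvious thing to do: fill in the missing output.
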
In [8]:
def make_prompt(example_indices, example_index_to_sum):
  """Build a prompt from solved examples, ending with the unsolved dialogue to summarize."""
  prompt = ""

  for index in example_indices:

    dialogue = dataset['test'][index]['dialogue']
    summary = dataset['test'][index]['summary']

    # add the example dialogue followed by its human summary
    prompt += f"""
    Dialogue:

    {dialogue}

    What is going on?
    {summary}

    """

  # add the dialogue to summarize, leaving the answer open
  dialogue = dataset['test'][example_index_to_sum]['dialogue']
  prompt += f"""
    Dialogue:

    {dialogue}

    What is going on?
    """

  return prompt

Let's see the prompt by calling the function.

In [9]:
example_indices = [20]
example_index_to_sum = 100
one_shot_prompt = make_prompt(example_indices, example_index_to_sum)

print(one_shot_prompt)


    Dialogue:

    #Person1#: What's wrong with you? Why are you scratching so much?
#Person2#: I feel itchy! I can't stand it anymore! I think I may be coming down with something. I feel lightheaded and weak.
#Person1#: Let me have a look. Whoa! Get away from me!
#Person2#: What's wrong?
#Person1#: I think you have chicken pox! You are contagious! Get away! Don't breathe on me!
#Person2#: Maybe it's just a rash or an allergy! We can't be sure until I see a doctor.
#Person1#: Well in the meantime you are a biohazard! I didn't get it when I was a kid and I've heard that you can even die if you get it as an adult!
#Person2#: Are you serious? You always blow things out of proportion. In any case, I think I'll go take an oatmeal bath.

    What is going on?
    #Person1# thinks #Person2# has chicken pox and warns #Person2# about the possible hazards but #Person2# thinks it will be fine.

    
    Dialogue:

    #Person1#: OK, that's a cut! Let's start from the beginning, everyone.
#Person

As you can see, the prompt ends with "What is going on?". In zero-shot learning we encoded the dialogue alone; here we encode the whole prompt and pass it to the model.

In [10]:
# Encode input THE ONLY DIFFERENCE IS HERE
inputs_encoded = tokenizer(one_shot_prompt, return_tensors='pt')

# Model Output
model_output = model.generate(inputs_encoded["input_ids"], generation_config=generation_config)[0]

# Decode the output
one_output = tokenizer.decode(model_output, skip_special_tokens=True)


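print(dash_line)
# `i` is left over from the loop in cell 4, so this prints 2
print("Ex: ", i+1)
print(dash_line)
# `dialogue` and `summary` still hold the values from the zero-shot cell
print("Input: ", dialogue)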
print(dash_line)
print( "Human summary: ", summary)
print(dash_line)
print("ONE SHOT Output: ", one_output)

Token indices sequence length is longer than the specified maximum sequence length for this model (524 > 512). Running this sequence through the model will result in indexing errors


---------------------------------------------------------------------------------------------------
Ex:  2
---------------------------------------------------------------------------------------------------
Input:  ["#Person1#: OK, that's a cut! Let's start from the beginning, everyone.\n#Person2#: What was the problem that time?\n#Person1#: The feeling was all wrong, Mike. She is telling you that she doesn't want to see you any more, but I want to get more anger from you. You're acting hurt and sad, but that's not how your character would act in this situation.\n#Person2#: But Jason and Laura have been together for three years. Don't you think his reaction would be one of both anger and sadness?\n#Person1#: At this point, no. I think he would react the way most guys would, and then later on, we would see his real feelings.\n#Person2#: I'm not so sure about that.\n#Person1#: Let's try it my way, and you can see how you feel when you're saying your lines. After that, if it still doesn't

Now the question is: if we add more and more examples, does the output get better? This question leads us to few-shot learning.
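The one-shot run already triggered a tokenizer warning (524 tokens against a maximum of 512), so the number of examples is bounded by the model's input length. The sketch below keeps examples only while an estimated prompt length stays within a budget. The helper names `format_block` and `select_examples_within_budget`, the whitespace word counter, and the toy dialogues are assumptions for illustration, and summing per-block counts only approximates the token count of the full prompt.

```python
from typing import Callable, List, Tuple


def format_block(dialogue: str, summary: str) -> str:
    """Format one dialogue/summary pair in the same layout as the prompt."""
    return f"Dialogue:\n\n{dialogue}\n\nWhat is going on?\n{summary}\n\n"


def select_examples_within_budget(
    examples: List[Tuple[str, str]],
    target_dialogue: str,
    max_tokens: int,
    count_tokens: Callable[[str], int],
) -> List[Tuple[str, str]]:
    """Keep examples, in order, while the estimated prompt fits in max_tokens."""
    # Cost of the open question alone; this part is never dropped.
    used = count_tokens(format_block(target_dialogue, ""))
    kept = []
    for dialogue, summary in examples:
        cost = count_tokens(format_block(dialogue, summary))
        if used + cost > max_tokens:
            break
        kept.append((dialogue, summary))
        used += cost
    return kept


def word_count(text: str) -> int:
    """Stand-in counter (assumption): whitespace-separated words, not model tokens."""
    return len(text.split())


# Toy data, not from the dataset.
examples = [
    ("#Person1#: Hi.\n#Person2#: Hello.", "They greet each other."),
    ("#Person1#: Bye.\n#Person2#: See you.", "They say goodbye."),
]
target = "#Person1#: Thanks.\n#Person2#: Welcome."
kept = select_examples_within_budget(examples, target, 25, word_count)
print(len(kept), "example(s) fit in the budget")
```

In the notebook, the stand-in counter could be swapped for one based on the loaded tokenizer, for example `lambda text: len(tokenizer(text)["input_ids"])`, with the budget set to the 512-token limit reported in the warning.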


## 7. Few-Shot Learning

Since we’ve already written a function that can include many examples, let’s jump in!

In [18]:
example_indices = [20, 60, 120]
example_index_to_sum = 100
few_shot_prompt = make_prompt(example_indices, example_index_to_sum)

# Encode input THE ONLY DIFFERENCE IS HERE
inputs_encoded = tokenizer(few_shot_prompt, return_tensors='pt')

# Model Output
model_output = model.generate(inputs_encoded["input_ids"], generation_config=generation_config)[0]

# Decode the output
few_output = tokenizer.decode(model_output, skip_special_tokens=True)

In [19]:
print("Input: ", dialogue)
print(dash_line)
print( "Human summary: ", summary)
print(dash_line)
print("Zero-Shot Output: ", zero_output)
print(dash_line)
print("One-Shot Output: ", one_output)
print(dash_line)
print("Few-Shot Output: ", few_output)

Input:  ["#Person1#: OK, that's a cut! Let's start from the beginning, everyone.\n#Person2#: What was the problem that time?\n#Person1#: The feeling was all wrong, Mike. She is telling you that she doesn't want to see you any more, but I want to get more anger from you. You're acting hurt and sad, but that's not how your character would act in this situation.\n#Person2#: But Jason and Laura have been together for three years. Don't you think his reaction would be one of both anger and sadness?\n#Person1#: At this point, no. I think he would react the way most guys would, and then later on, we would see his real feelings.\n#Person2#: I'm not so sure about that.\n#Person1#: Let's try it my way, and you can see how you feel when you're saying your lines. After that, if it still doesn't feel right, we can try something else."]
---------------------------------------------------------------------------------------------------
Human summary:  ["#Person1# and Mike have a disagreement on how t

Of course, we can also change the output by switching from greedy decoding to random sampling through the generation configuration. The do_sample parameter is the key, and you may add other parameters such as top_p, top_k, and temperature to control how random the output is.
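To make these parameters concrete, here is a simplified, standard-library sketch of how temperature, top-k, and top-p can shape the choice of the next token. It is an illustration under assumptions, not the code that transformers runs internally: the function name `sample_next_token` and the toy scores are hypothetical, and the top-p cutoff here is computed on the unfiltered probabilities.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Sample a token index from raw scores with temperature, top-k and top-p."""
    # Temperature rescales the scores: values below 1 sharpen the distribution.
    scaled = [score / temperature for score in logits]
    # Softmax, shifted by the maximum for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Rank candidates from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
    # top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    # Draw from the surviving tokens; random.choices renormalizes the weights.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


# Toy scores for a four-token vocabulary (not model output).
logits = [2.0, 1.0, 0.5, -1.0]
rng = random.Random(0)
print(sample_next_token(logits, temperature=0.7, top_k=3, top_p=0.9, rng=rng))
```

With top_k=1 the function always returns the highest-scoring token, which matches greedy decoding; looser settings let less likely tokens through.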

In [21]:
# Configurations
generation_config = GenerationConfig(max_new_tokens=50, do_sample=True, temperature=0.7)

In [22]:
example_indices = [10, 20, 30, 40, 80]
example_index_to_sum = 100
few_shot_prompt = make_prompt(example_indices, example_index_to_sum)

# Encode input THE ONLY DIFFERENCE IS HERE
inputs_encoded = tokenizer(few_shot_prompt, return_tensors='pt')

# Model Output
model_output = model.generate(inputs_encoded["input_ids"], generation_config=generation_config)[0]

# Decode the output
few_output = tokenizer.decode(model_output, skip_special_tokens=True)

In [23]:
print(few_shot_prompt)
print(dash_line)
print("Human Summary: ", summary)
print(dash_line)
print("Few-Shot Output: ", few_output)


    Dialogue:

    #Person1#: Happy Birthday, this is for you, Brian.
#Person2#: I'm so happy you remember, please come in and enjoy the party. Everyone's here, I'm sure you have a good time.
#Person1#: Brian, may I have a pleasure to have a dance with you?
#Person2#: Ok.
#Person1#: This is really wonderful party.
#Person2#: Yes, you are always popular with everyone. and you look very pretty today.
#Person1#: Thanks, that's very kind of you to say. I hope my necklace goes with my dress, and they both make me look good I feel.
#Person2#: You look great, you are absolutely glowing.
#Person1#: Thanks, this is a fine party. We should have a drink together to celebrate your birthday

    What is going on?
    #Person1# attends Brian's birthday party. Brian thinks #Person1# looks great and charming.

    
    Dialogue:

    #Person1#: What's wrong with you? Why are you scratching so much?
#Person2#: I feel itchy! I can't stand it anymore! I think I may be coming down with something. I feel 