# Workshop 1 - Working with Prompts and LLMs

In this workshop, you will learn how to write prompts and feed them into LLMs. You will also learn how to use different prompting techniques to improve the LLM's responses.

## Loading and exploring the dataset

The workshop uses the [<code>facebook/ExploreToM</code>](https://huggingface.co/datasets/facebook/ExploreToM) dataset from [Hugging Face](https://huggingface.co/).

In [1]:
# Dataset name
dataset_name = "facebook/ExploreToM"

In [3]:
# TODO: Load the following libraries: datasets
from datasets import load_dataset

dataset = load_dataset(dataset_name)

In [None]:
# TODO: Explore the dataset

# TODO: Number of rows 


# TODO: Keys in the dataset


# TODO: Feature names


# TODO: Display a single row


## Encoding and decoding text

LLMs work with vectors/tensors, not raw text. Before text is passed to an LLM, it first has to be tokenized (encoded) into a sequence of token ids.
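As a rough illustration of the encode/decode round trip, the sketch below uses a hypothetical word-level tokenizer with a made-up vocabulary. The `vocab` table, its ids, and the `encode`/`decode` helpers are assumptions for illustration only. The real flan-t5 tokenizer splits text into subword pieces and uses different ids.

```python
# Toy word-level tokenizer; the vocabulary and ids are invented for illustration.
vocab = {"<pad>": 0, "</s>": 1, "<unk>": 2, "big": 3, "black": 4, "bug": 5, "blood": 6}
id_to_token = {idx: tok for tok, idx in vocab.items()}


def encode(text):
    """Map each whitespace-separated word to its id and append the end-of-sequence id."""
    ids = [vocab.get(word, vocab["<unk>"]) for word in text.split()]
    return ids + [vocab["</s>"]]


def decode(ids, skip_special_tokens=True):
    """Map ids back to words, optionally dropping the padding and end-of-sequence tokens."""
    special = {"<pad>", "</s>"}
    tokens = [id_to_token[i] for i in ids]
    if skip_special_tokens:
        tokens = [t for t in tokens if t not in special]
    return " ".join(tokens)


ids = encode("big black bug bleed black blood")
print(ids)
print(decode(ids))
```

The word "bleed" is not in the toy vocabulary, so it is mapped to `<unk>` and cannot be recovered on decoding. Subword tokenizers avoid most of this loss by breaking unknown words into smaller known pieces.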

In [5]:
# TODO: Load tokenizer

from transformers import AutoTokenizer

### T5 Models

<code>flan-t5</code> is an instruction-tuned Text-To-Text Transfer Transformer (T5) model that can perform zero-shot NLP tasks such as summarization, simple reasoning, and question answering.

Some Flan-T5 models on Hugging Face:
- [<code>google/flan-t5-base</code>](https://huggingface.co/google/flan-t5-base)
- [<code>google/flan-t5-small</code>](https://huggingface.co/google/flan-t5-small)
- [<code>google/flan-t5-xl</code>](https://huggingface.co/google/flan-t5-xl)
- [<code>google/flan-t5-xxl</code>](https://huggingface.co/google/flan-t5-xxl) - full model

Complete list of [T5 models](https://huggingface.co/models?search=google/flan) on Huggingface.


In [6]:
# Model names (the last assignment wins; comment out the one you do not want)
model_name = "google/flan-t5-small"
model_name = "google/flan-t5-base"

In [7]:
# TODO: Create a tokenizer

tokenizer = AutoTokenizer.from_pretrained(model_name)

In [8]:
# TODO: Tokenize and explore the tokenized text, contents and size
# Try different message text

message = "big black bug bleed black blood"

enc_message = tokenizer(message, return_tensors='pt')

print(enc_message)

{'input_ids': tensor([[  600,  1001,  8143,     3, 27779,  1001,  1717,     1]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1]])}


In [10]:
# Print the number of tokens in the encoded message
print(len(enc_message['input_ids'][0]))

8


In [12]:
# TODO: Decode the tokenized text
dec_message = tokenizer.decode(enc_message['input_ids'][0], skip_special_tokens=True)

print(dec_message)

big black bug bleed black blood


In [32]:
messages = [
   "big black bug bleeds black blood",
   "she sell sea shells on the sea shore",
   "quick brown fox jump over the lazy dog"
]

enc_messages = tokenizer(messages, return_tensors='pt', padding=True)

print(enc_messages)

{'input_ids': tensor([[  600,  1001,  8143,     3, 27779,     7,  1001,  1717,     1,     0],
        [  255,  1789,  2805,  7300,     7,    30,     8,  2805, 10433,     1],
        [ 1704,  4216,     3, 20400,  4418,   147,     8, 19743,  1782,     1]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}


In [34]:
print(len(enc_messages['input_ids']))
print(len(enc_messages['attention_mask']))

for k in range(len(enc_messages['input_ids'])):
   print('encoding: ', enc_messages['input_ids'][k])
   print('len: ', len(enc_messages['input_ids'][k]))
   print('attention: ', enc_messages['attention_mask'][k])

3
3
encoding:  tensor([  600,  1001,  8143,     3, 27779,     7,  1001,  1717,     1,     0])
len:  10
attention:  tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 0])
encoding:  tensor([  255,  1789,  2805,  7300,     7,    30,     8,  2805, 10433,     1])
len:  10
attention:  tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
encoding:  tensor([ 1704,  4216,     3, 20400,  4418,   147,     8, 19743,  1782,     1])
len:  10
attention:  tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])


In [35]:
dec_message = tokenizer.decode(enc_messages['input_ids'][0])
print(dec_message)

big black bug bleeds black blood</s><pad>
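The batch output above shows what `padding=True` does: shorter sequences are filled with the pad id (0, decoded as `<pad>`) up to the length of the longest one, and the attention mask marks padded positions with 0. A minimal pure-Python sketch of that idea follows. The `pad_batch` helper is hypothetical and is not the tokenizer's actual implementation.

```python
PAD_ID = 0  # pad token id, as seen in the tokenizer output above


def pad_batch(sequences, pad_id=PAD_ID):
    """Right-pad every sequence to the longest length and build the matching attention mask."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Token ids copied from the tokenizer output above, with the padding removed.
batch = pad_batch([
    [600, 1001, 8143, 3, 27779, 7, 1001, 1717, 1],
    [255, 1789, 2805, 7300, 7, 30, 8, 2805, 10433, 1],
])
print(batch["input_ids"][0])
print(batch["attention_mask"][0])
```

The first sequence has nine tokens and the second has ten, so only the first one receives a trailing pad id and a trailing 0 in its mask.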


## Load LLM 

Create an instance of the Large Language Model (LLM). We will then create a simple prompt, tokenize it, and feed the tokenized prompt to the LLM. The response from the LLM is then decoded into human-readable text.

In [39]:
# TODO: Load LLM packages - AutoModelForSeq2SeqLM and GenerationConfig
from transformers import AutoModelForSeq2SeqLM, GenerationConfig

model = AutoModelForSeq2SeqLM.from_pretrained(model_name)


In [47]:
# TODO: Answer simple question
# NOTE: `prompt` must already hold the question text; its definition is not shown in this cell.

enc_prompt = tokenizer(prompt, return_tensors='pt')
#print(enc_prompt)

enc_answer = model.generate(enc_prompt['input_ids'])
print(enc_answer)

answer = tokenizer.decode(enc_answer[0], skip_special_tokens=True)
print(answer)

tensor([[   0, 3412,   63,    1]])
rainy


In [53]:
# TODO: Display a short story
idx = 100
#print(dataset['train'][idx])
#for k in dataset['train'].features:   
#   print(k)
   
print('> ', dataset['train'][idx]['story_structure'])
print('> ', dataset['train'][idx]['infilled_story'])
print('> ', dataset['train'][idx]['question'])
print('> ', dataset['train'][idx]['expected_answer'])


>  Leslie entered the main tent. Leslie left the main tent. Isabella entered the storage trailer. Isabella moved the stuffed rabbit to the wooden chest, which is also located in the storage trailer. Leslie entered the main tent. Isabella moved the stuffed rabbit to the main tent, leaving the wooden chest in its original location. Isabella told out loud about the festival marketing strategies. Isabella told privately to Colton that Leslie is in the main tent. While this action was happening, Leslie witnessed this action in secret (and only this action).
>  The warm glow of string lights illuminated the vibrant colors of the festival grounds, casting a lively atmosphere over rows of booths and attractions. A faint scent of popcorn and sugar wafted through the air, mingling with the distant sounds of laughter and music, as the night's festivities had just begun to unfold. Leslie slipped unnoticed into the main tent, the sounds of the festival outside momentarily muffled by the canvas wall

In [71]:
idx = 3
story = dataset['train'][idx]['story_structure']
question = dataset['train'][idx]['question']
expected_answer = dataset['train'][idx]['expected_answer']

prompt = f""" 
Read the following story:

{story}

Answer this question regarding the story:
{question}

Answer:
"""

prompt = f""" 
Summarize the following story:

{story}

Summary:
"""

prompt = f""" 
Read the following story and answer the question:

{story}

Question: Where is Kaylee?
"""

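# Each assignment above overwrites the previous one, so only this last prompt is sent to the model.
prompt = f"Write a short summary for this text: {story}\n\nSummary:"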


enc_prompt = tokenizer(prompt, return_tensors='pt')
enc_compl = model.generate(enc_prompt['input_ids'], max_new_tokens=512)

compl = tokenizer.decode(enc_compl[0], skip_special_tokens=True)

print(f'Story: {story}')
print(f'Question: {question}')
print(f'Expected: {expected_answer}')
print(f'Response: {compl}')

Story: Kaylee entered the hotel lobby. Kaylee moved the silver letter opener to the wooden desk drawer, which is also located in the hotel lobby. While this action was happening, Liam witnessed this action in secret (and only this action). Kaylee left the hotel lobby. Liam entered the hotel lobby. Kaylee entered the hotel lobby. Liam moved the silver letter opener to the leather briefcase, which is also located in the hotel lobby.
Question: In which container will Liam search for the silver letter opener?
Expected: leather briefcase
Response: Kaylee moved the silver letter opener to the wooden desk drawer. Liam witnessed this action in secret (and only this action).


### Flan Templates

Flan has published [prompt templates](https://github.com/google-research/FLAN/blob/main/flan/v2/templates.py).
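A prompt template is a string with named placeholders that are filled in per example. The sketch below shows one way to organize this, assuming a hypothetical `TEMPLATES` table and `build_prompt` helper. The two patterns are the prompt strings used in this workshop, not an excerpt from the Flan repository.

```python
# Hypothetical template table; the patterns are the prompt strings used in this workshop.
TEMPLATES = {
    "summarize_text": "Write a short summary for this text: {text}\n\nSummary:",
    "summarize_article": "{title}\n\n{text}\n\nWrite a one or two sentence summary.",
}


def build_prompt(template_name, **fields):
    """Fill the placeholders of the named template with the given fields."""
    return TEMPLATES[template_name].format(**fields)


example_prompt = build_prompt("summarize_text", text="Kaylee entered the hotel lobby.")
print(example_prompt)
```

Keeping the templates in one table makes it easy to try several phrasings of the same task against the same input and compare the model's responses.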

In [88]:
# TODO: Find a prompt template from the Flan prompt templates and use that as your prompt.
title = "LA fires death toll rises to 24 as high winds expected"
# Change the title, otherwise the model will return the title!
title = "LA fires death toll rises"

text = """ 
Weather forecasters in California are warning fierce winds which fuelled the infernos around Los Angeles are expected to pick up again this week, as fire crews on the ground race to make progress controlling three wildfires.

Officials warned that after a weekend of relatively calm winds, the notoriously dry Santa Ana winds would pick up again from Sunday night until Wednesday, reaching speeds of up to 60mph (96km/h).

Ahead of the wind's uptick, some progress has been made in stopping the spread of the deadly Palisades and Eaton fires, which are burning on opposite ends of the city. Local firefighters are being assisted by crews from eight other states, as well as Canada and Mexico, who continue to arrive.

The LA County medical examiner updated the death toll on Sunday to 24, while officials said earlier at least another 16 remain missing.

Sixteen of the dead were found in the Eaton fire zone, while eight were found in the Palisades area.

Three conflagrations continue to burn around Los Angeles.

The largest fire is the Palisades, which has now burnt through more than 23,000 acres and is 13% contained.

The Eaton fire is the second biggest and has burnt through more than 14,000 acres. It is 27% contained.

The Hurst fire has grown to 799 acres and has been almost fully contained.

The wildfires are on track to be among the costliest in US history.

On Sunday, private forecaster Accuweather increased its preliminary estimate of financial losses from the blazes to between $250bn-$275bn.
"""

prompt=f"{title}\n\n{text}\n\nWrite a one or two sentence summary."

enc_prompt = tokenizer(prompt, return_tensors='pt')
enc_resp = model.generate(enc_prompt['input_ids'], max_new_tokens=4096)
resp = tokenizer.decode(enc_resp[0], skip_special_tokens=True)
print(resp)

Los Angeles fires are threatening to be among the costliest in US history, with the death toll rising to 24


### Configure the LLM

Use [<code>GenerationConfig</code>](https://huggingface.co/docs/transformers/en/main_classes/text_generation#transformers.GenerationConfig) to change the parameters of the LLM generation. Try different parameters to observe how these changes influence the LLM output.

Common parameters:
- <code>max_new_tokens</code> - The maximum number of tokens to generate, not counting the tokens in the prompt. Controls the output length.
- <code>temperature</code> - The value used to scale the next-token logits before sampling. Lower values make the output more deterministic, higher values make it more varied.
- <code>do_sample</code> - Whether to sample the next token; greedy decoding is used otherwise. Controls the generation strategy.
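To make the effect of <code>temperature</code> and <code>do_sample</code> concrete, here is a minimal pure-Python sketch of how one next token could be chosen from a set of logits. The helper names and the logit values are made up for illustration. The library's own decoding loop is more involved.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Divide the logits by the temperature, then normalize them into probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_next_token(logits, temperature=1.0, do_sample=False, rng=random):
    """Greedy decoding takes the arg max; sampling draws from the scaled distribution."""
    if not do_sample:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # made-up scores for a three-token vocabulary
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
print(pick_next_token(logits))  # greedy: index of the largest logit
```

A temperature below 1 concentrates probability on the highest-scoring token, while a temperature above 1 flattens the distribution. With `do_sample=False` the temperature has no effect, because greedy decoding always picks the top token.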

In [72]:
# TODO: Use GenerationConfig to change the LLM parameters. Try different values.

from transformers import GenerationConfig

In [75]:
# Greedy decoding (do_sample=False) with at most 128 new tokens; temperature only takes effect when sampling
config = GenerationConfig(max_new_tokens=128, temperature=1, do_sample=False)


prompt = f"Write a short summary for this text: {story}\n\nSummary:"


enc_prompt = tokenizer(prompt, return_tensors='pt')
enc_compl = model.generate(enc_prompt['input_ids'], generation_config=config)

compl = tokenizer.decode(enc_compl[0], skip_special_tokens=True)

print(f'Story: {story}')
print(f'Response: {compl}')

Story: Kaylee entered the hotel lobby. Kaylee moved the silver letter opener to the wooden desk drawer, which is also located in the hotel lobby. While this action was happening, Liam witnessed this action in secret (and only this action). Kaylee left the hotel lobby. Liam entered the hotel lobby. Kaylee entered the hotel lobby. Liam moved the silver letter opener to the leather briefcase, which is also located in the hotel lobby.
Response: Kaylee moved the silver letter opener to the wooden desk drawer. Liam witnessed this action in secret (and only this action).
