# Workshop 1 - Question and Answers
In this workshop, you will learn how to write prompts and feed them into LLMs. You
will also learn how to use different prompting techniques to improve the responses
from the LLM.

## Loading and Exploring the Dataset
This workshop uses the [`facebook/ExploreToM`](https://huggingface.co/datasets/facebook/ExploreToM) dataset from [HuggingFace](https://huggingface.co).

In [1]:
# TODO: Load the following libraries: datasets
from datasets import load_dataset

In [2]:
# Dataset name
dataset_name = "facebook/ExploreToM"

In [3]:
# TODO: load and explore the dataset
dataset = load_dataset(dataset_name)

In [8]:
# TODO: number of rows and columns in each split of the dataset
print(dataset.shape)

# TODO: Keys in the dataset
print(dataset.keys())

# TODO: Feature names
print(dataset['train'].features)

# TODO: Display a single row
idx = 5000
for k, v in dataset['train'][idx].items():
   print(f'k: {k}')
   print(f'v: {v}')


{'train': (13309, 18)}
dict_keys(['train'])
{'story_structure': Value('string'), 'infilled_story': Value('string'), 'question': Value('string'), 'expected_answer': Value('string'), 'qprop=params': Value('string'), 'qprop=nth_order': Value('int64'), 'qprop=non_unique_mental_state': Value('bool'), 'sprop=is_false_belief_story_1st': Value('bool'), 'sprop=is_false_belief_story_1st_and_2nd': Value('bool'), 'sprop=story_accuracy_1st_raw': Value('float64'), 'sprop=story_accuracy_1st_infilled': Value('float64'), 'sprop=global_idx': Value('int64'), 'param=story_type': Value('string'), 'param=num_stories_total': Value('int64'), 'param=max_sentences': Value('int64'), 'param=num_people': Value('int64'), 'param=num_moves': Value('int64'), 'param=num_rooms': Value('int64')}
k: story_structure
v: Mia entered the hospital staff lounge. Amelia entered the hospital staff lounge. Mia told privately to Madison about the hospital budget cuts. Madison entered the hospital staff lounge. Madison told privatel

In [9]:
# TODO: import pipeline
from transformers import pipeline 

## `pipeline`
[`pipeline`](https://huggingface.co/docs/transformers/en/main_classes/pipelines) is an easy-to-use API for running inference. It wraps the task-specific pipelines and hides most of their complexity, so you can focus on the model and the task.

You can use `pipeline` for summarisation, image classification, audio generation, and more. The full list of supported `pipeline` tasks is [here](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.pipeline.task).

In [None]:
# TODO: Create a question-answering pipeline with the pipeline's default model
qna = pipeline('question-answering')


No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cpu
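
The log above warns against relying on the default model. A minimal sketch of pinning the model and revision explicitly is shown below, using the model name and revision hash reported in that log. The helper name `make_qna` is hypothetical, and the import sits inside the function so the definitions load even where `transformers` is not installed.

```python
# Model name and revision copied from the pipeline's log message above
QA_MODEL = "distilbert/distilbert-base-cased-distilled-squad"
QA_REVISION = "564e9b5"


def make_qna():
    """Hypothetical helper: build a question-answering pipeline with a pinned model."""
    from transformers import pipeline  # imported lazily; needs transformers installed

    return pipeline("question-answering", model=QA_MODEL, revision=QA_REVISION)
```

Calling `qna = make_qna()` downloads the model on first use; the returned object is then called the same way as the default pipeline, with `question=` and `context=` arguments.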


In [17]:
idx = 5000
# story = dataset['train'][idx]['story_structure']  # alternative context: the plain story outline
story = dataset['train'][idx]['infilled_story']  # context used here: the fully written story
question = dataset['train'][idx]['question']
answer = dataset['train'][idx]['expected_answer']

predicted_answer = qna(question=question, context=story)
print(story)
print(question)
print('--------------------------')
print(predicted_answer)
print(answer)

The hospital staff lounge, a small oasis amidst the bustling hospital corridors, was filled with the aroma of stale coffee and the soft hum of the refrigerator. The lounge's worn furniture and faded walls seemed to bear witness to countless conversations and late-night shifts, a backdrop for moments of rest and refuge. Mia slipped into the hospital staff lounge, her eyes scanning the room for a glimpse of the latest hospital bulletin, and Amelia followed closely, her worn shoes quiet on the scuffed linoleum floor. Mia discreetly tugged Madison into a private conversation, the topic of their hushed discussion immediately evident in the looks exchanged between them - concerned expressions that contrasted with the humdrum atmosphere of the staff lounge. The hospital staff lounge's scuffed linoleum floor creaked softly as Madison entered, her gaze drifting towards the usual gathering spots where colleagues shared stories and advice. A flutter of concern danced across Mia's face as Madison 

## Manual Inference - Question and Answer
In this section, we will look at what `pipeline` does under the hood to perform its inference. This will give us a better understanding of the major steps involved.

In [20]:
# TODO: load tokenizer
from transformers import AutoTokenizer

## DistilBERT base cased distilled SQuAD
DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. More details [here](https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).

In [21]:
model_name = "distilbert/distilbert-base-cased-distilled-squad"

In [22]:
# TODO: Create a tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)


In [27]:
# TODO: Encode text
# Alternative messages to try; only the last uncommented assignment is encoded
# message = "Big black bug bleeds black blood"
# message = "It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light"
message = "Thou shalt not make a machine in the likeness of the human mind."

# pt - PyTorch Tensors
encoded_message = tokenizer(message, return_tensors='pt')

print(encoded_message)


{'input_ids': tensor([[  101,   157, 14640,   188, 24537,  1136,  1294,   170,  3395,  1107,
          1103,  1176,  1757,  1104,  1103,  1769,  1713,   119,   102]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}


In [28]:
for tok in encoded_message.input_ids[0]:
   dec = tokenizer.decode(tok)
   print(f'tok = {tok}, dec = {dec}')

tok = 101, dec = [CLS]
tok = 157, dec = T
tok = 14640, dec = ##hou
tok = 188, dec = s
tok = 24537, dec = ##halt
tok = 1136, dec = not
tok = 1294, dec = make
tok = 170, dec = a
tok = 3395, dec = machine
tok = 1107, dec = in
tok = 1103, dec = the
tok = 1176, dec = like
tok = 1757, dec = ##ness
tok = 1104, dec = of
tok = 1103, dec = the
tok = 1769, dec = human
tok = 1713, dec = mind
tok = 119, dec = .
tok = 102, dec = [SEP]
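
The `##` prefixes in the output mark pieces that continue a word. BERT-style WordPiece tokenizers split each word by greedy longest-match-first against a fixed vocabulary. The sketch below illustrates that idea in plain Python; the function name `wordpiece` and the tiny `toy_vocab` are assumptions made for illustration, not the tokenizer's real implementation or vocabulary.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Split one word into sub-word pieces by greedy longest-match-first."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink from the right
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matches: the whole word becomes unknown
        pieces.append(piece)
        start = end
    return pieces


# Toy vocabulary (an assumption; the real vocabulary is far larger)
toy_vocab = {"T", "##hou", "s", "##halt", "like", "##ness", "machine", "mind"}

for w in ["Thou", "shalt", "likeness", "machine"]:
    print(w, "->", wordpiece(w, toy_vocab))
```

With this toy vocabulary, `likeness` splits into `like` and `##ness`, matching the pieces printed by the real tokenizer above.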


In [30]:
text = tokenizer.decode(encoded_message.input_ids[0], skip_special_tokens=True)
print(text)

Thou shalt not make a machine in the likeness of the human mind.


In [34]:
# TODO: Encoding multiple texts
messages = [
   "Big black bug bleeds black blood",
   "Thou shalt not make a machine in the likeness of the human mind.",
   "hello, world"
]

encoded_messages = tokenizer(messages, return_tensors='pt', padding=True)
print(encoded_messages)


{'input_ids': tensor([[  101,  2562,  1602, 15430, 24752,  1116,  1602,  1892,   102,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0],
        [  101,   157, 14640,   188, 24537,  1136,  1294,   170,  3395,  1107,
          1103,  1176,  1757,  1104,  1103,  1769,  1713,   119,   102],
        [  101, 19082,   117,  1362,   102,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])}
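
In the output above, the shorter messages are filled with `0` up to the length of the longest one, and the `attention_mask` marks real tokens with `1` and padding with `0`. A minimal plain-Python sketch of that behaviour follows; `pad_batch` is a hypothetical helper, and the pad id of `0` is taken from the printed tensors.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad token-id lists to a common length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Token ids copied from the output above (first and third messages, padding removed)
batch = [
    [101, 2562, 1602, 15430, 24752, 1116, 1602, 1892, 102],
    [101, 19082, 117, 1362, 102],
]
ids, mask = pad_batch(batch)
print(ids)
print(mask)
```

The mask lets the model ignore the padded positions, so padding does not change what the real tokens mean.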


In [None]:
# TODO: Decode text


## Working with LLMs
Create an instance of the Large Language Model (LLM). We will then create a simple
prompt, tokenize it, and feed the tokenized prompt to the LLM. The response
from the LLM is then decoded into human-readable text.

In [35]:
# TODO: Load libraries
from transformers import AutoModelForQuestionAnswering
import torch

In [36]:
# TODO: Load question answer model
model = AutoModelForQuestionAnswering.from_pretrained(model_name)


In [41]:
# TODO: Encode context and question
idx = 10
# story = dataset['train'][idx]['story_structure']  # alternative context: the plain story outline
story = dataset['train'][idx]['infilled_story']  # context used here: the fully written story
question = dataset['train'][idx]['question']
answer = dataset['train'][idx]['expected_answer']

enc_question = tokenizer(question, return_tensors='pt')
enc_story = tokenizer(story, return_tensors='pt')


In [42]:
# TODO: Tokenize the inputs
enc_input = tokenizer(question, story, return_tensors='pt', padding=True)
#print(enc_input)
print(tokenizer.decode(enc_input.input_ids[0]))


[CLS] In which container was the pocket watch at the beginning? [SEP] The bustling theater was a hive of activity on this chilly autumn evening, its worn wooden floors and faded velvet curtains a testament to years of countless performances. The dimly lit green room, tucked away backstage, was a cozy refuge from the chaos, its plush armchairs and ornate wooden chest offering a warm respite for those who needed it. Dylan slipped away from the crowd, disappearing behind a tattered curtain as he made his way to a more secluded space. The soft glow of a floor lamp in the green room enveloped him, providing a sense of tranquility amidst the pre - show chaos. Dylan carefully placed the pocket watch in the ornate wooden chest, hidden from view, and Clayton caught a glimpse of this sneaky maneuver from his secret vantage point. Clayton ' s eyes narrowed as he pushed open the creaky door and stepped into the green room, the sudden movement making the softly lit space seem almost anticipatory. I
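
The decoded text shows how a question/context pair is laid out in a single sequence: `[CLS]`, the question, `[SEP]`, then the story. The sketch below reproduces that layout with plain lists, assuming the usual BERT-style closing `[SEP]` after the context; `build_pair` and the small id lists are made up for illustration, while `101` and `102` are the `[CLS]` and `[SEP]` ids printed earlier.

```python
CLS_ID, SEP_ID = 101, 102  # ids of [CLS] and [SEP] as printed earlier


def build_pair(question_ids, context_ids):
    """Lay out a question/context pair as [CLS] question [SEP] context [SEP]."""
    input_ids = [CLS_ID] + question_ids + [SEP_ID] + context_ids + [SEP_ID]
    context_start = len(question_ids) + 2  # index of the first context token
    return input_ids, context_start


# Made-up ids standing in for a tokenized question and story
pair_ids, ctx_start = build_pair([11, 12, 13], [21, 22, 23, 24])
print(pair_ids)
print(ctx_start)
```

Because the question comes first, any answer position the model predicts is an index into this combined sequence, not into the story alone.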

In [None]:
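# Pass the encoded question and story to the QnA model.
# Keyword arguments avoid depending on the positional order of the model's forward() parameters.
result = model(input_ids=enc_input.input_ids, attention_mask=enc_input.attention_mask)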
print(result)

QuestionAnsweringModelOutput(loss=None, start_logits=tensor([[ -3.0070,  -7.8348,  -7.9082,  -7.7894,  -9.4917,  -9.4841,  -8.6417,
          -9.5561,  -9.0523,  -8.8516,  -8.2852,  -6.7598,  -8.1208,  -6.9686,
          -7.6955, -10.2469,  -8.1907, -11.4937, -10.5607,  -9.9994, -11.1921,
         -11.5424,  -9.2086, -10.3368,  -8.9562,  -7.6995, -10.7880,  -7.8180,
          -9.4568, -10.5448,  -8.5158,  -7.3693,  -6.9435,  -8.5076, -10.6069,
          -7.0253,  -4.7177,  -7.5203, -10.9336, -10.1342, -10.9254, -11.5466,
          -9.6855, -11.6445,  -9.5176,  -9.1132,  -9.6704,  -4.7700,  -5.2772,
          -9.2881,  -6.9881,  -4.4925,  -7.0064,  -9.5313,  -7.8890,  -9.7348,
          -7.4000,  -9.4579,  -9.3632,  -8.9741,  -8.1093,  -9.5967,  -8.1028,
         -11.1570, -10.3856,  -8.6170, -10.5958,  -7.4434,  -7.4832,  -9.7634,
          -7.7349,  -9.4229,  -9.5982,  -9.6697,  -5.1695,  -4.7794,  -6.2363,
         -10.1923, -10.3071,  -8.8503, -10.3421, -11.0937, -10.9019, -11.7270,

In [47]:
print(result.keys())

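# The highest start and end logits give the most likely answer boundaries
start_pos = torch.argmax(result.start_logits)
# +1 because slicing excludes the end index
end_pos = torch.argmax(result.end_logits) + 1
# Slice the answer's token ids out of the combined question + story sequence
enc_ans = enc_input.input_ids[0][start_pos: end_pos]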

print(f'start = {start_pos}')
print(f'end = {end_pos}')
print(enc_ans)

predicted_answer = tokenizer.decode(enc_ans)
print(predicted_answer)


odict_keys(['start_logits', 'end_logits'])
start = 151
end = 155
tensor([ 1103, 19870,  4122,  2229])
the ornate wooden chest
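
Taking the two argmaxes independently works here, but nothing guarantees that the end position comes after the start position. A hedged sketch of one common alternative is to score every valid span by the sum of its start and end logits and keep the best. The function `best_span`, its `max_answer_len` default, and the toy logits are assumptions for illustration, not what the cell above or `pipeline` is confirmed to do.

```python
import math


def best_span(start_logits, end_logits, max_answer_len=15):
    """Return (start, end_exclusive, score) for the best span with end >= start."""
    best = (0, 1, -math.inf)
    for s, s_logit in enumerate(start_logits):
        last = min(s + max_answer_len, len(end_logits))
        for e in range(s, last):
            score = s_logit + end_logits[e]
            if score > best[2]:
                best = (s, e + 1, score)  # +1 so the end works as a slice bound
    return best


# Toy logits (made up) where the independent argmaxes give an empty slice
start_logits = [0.1, 0.2, 0.3, 5.0, 0.0]
end_logits = [0.0, 4.0, 0.1, 0.2, 3.5]

naive_start = start_logits.index(max(start_logits))
naive_end = end_logits.index(max(end_logits)) + 1
print(naive_start, naive_end)  # start lies after end, so the slice is empty
print(best_span(start_logits, end_logits))
```

With real model output, the logits can be converted first with `result.start_logits[0].tolist()` and `result.end_logits[0].tolist()`.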
