# Workshop 1 - Questions and Answers
In this workshop, you will learn how to write prompts and feed them into LLMs. You
will also learn how to use different prompting techniques to improve the responses
from the LLM.

## Loading and Exploring the Dataset
The workshop uses the [`facebook/ExploreToM`](https://huggingface.co/datasets/facebook/ExploreToM) dataset from [HuggingFace](https://huggingface.co).

In [1]:
# TODO: Load the following libraries: datasets
from datasets import load_dataset

In [3]:
# Dataset name
dataset_name = "facebook/ExploreToM"

In [5]:
# TODO: load and explore the dataset
dataset = load_dataset(dataset_name)

In [9]:
# TODO: number of rows in the dataset
print(dataset.shape)

# TODO: Keys in the dataset
print(dataset.keys())

# TODO: Feature names
print(dataset["train"].features)


# TODO: Display a single row
idx = 10
for k, v in dataset["train"][idx].items():
    print(f"{k}: {v}")


{'train': (13309, 18)}
dict_keys(['train'])
{'story_structure': Value('string'), 'infilled_story': Value('string'), 'question': Value('string'), 'expected_answer': Value('string'), 'qprop=params': Value('string'), 'qprop=nth_order': Value('int64'), 'qprop=non_unique_mental_state': Value('bool'), 'sprop=is_false_belief_story_1st': Value('bool'), 'sprop=is_false_belief_story_1st_and_2nd': Value('bool'), 'sprop=story_accuracy_1st_raw': Value('float64'), 'sprop=story_accuracy_1st_infilled': Value('float64'), 'sprop=global_idx': Value('int64'), 'param=story_type': Value('string'), 'param=num_stories_total': Value('int64'), 'param=max_sentences': Value('int64'), 'param=num_people': Value('int64'), 'param=num_moves': Value('int64'), 'param=num_rooms': Value('int64')}
story_structure: Dylan entered the green room. Dylan moved the pocket watch to the wooden chest, which is also located in the green room. While this action was happening, Clayton witnessed this action in secret (and only this act

In [10]:
# TODO: import pipeline
from transformers import pipeline

## `pipeline`
[`pipeline`](https://huggingface.co/docs/transformers/en/main_classes/pipelines) is an easy-to-use API for running inference. It wraps the task-specific pipelines and hides most of their complexity, so you can focus on the model and the task.

You can use `pipeline` to perform summarisation, image classification, audio generation, etc. You can find an exhaustive list of `pipeline` tasks [here](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.pipeline.task).

In [16]:
image_url = 'https://image.urlaubspiraten.de/1280/image/upload/v1771547378/mediavault_images/Editorial_Images/Screenshot_2026-02-20_at_01.29.23_rzkswv.png'
image_classifier = pipeline("image-classification")


No model was supplied, defaulted to google/vit-base-patch16-224 and revision 3f49326.
Using a pipeline without specifying a model name and revision in production is not recommended.


Fast image processor class <class 'transformers.models.vit.image_processing_vit_fast.ViTImageProcessorFast'> is available for this model. Using slow image processor class. To use the fast image processor class set `use_fast=True`.


In [17]:
image_classifier(image_url)

[{'label': 'macaque', 'score': 0.813718855381012},
 {'label': 'titi, titi monkey', 'score': 0.04990111291408539},
 {'label': 'guenon, guenon monkey', 'score': 0.02111787721514702},
 {'label': 'baboon', 'score': 0.020485760644078255},
 {'label': 'langur', 'score': 0.019044701009988785}]

In [13]:
# TODO: Create a question-answering pipeline with the pipeline's default model
qna = pipeline("question-answering")


No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5.
Using a pipeline without specifying a model name and revision in production is not recommended.


In [21]:
idx = 500
context = dataset['train'][idx]['story_structure']
#context = dataset['train'][idx]['infilled_story']
question = dataset['train'][idx]['question']
expected_answer = dataset['train'][idx]['expected_answer']

result = qna(question=question, context=context)
print(f'Question: {question}')
print(f'Expected answer: {expected_answer}')
print(result)

Question: In which container was the compass at the beginning?
Expected answer: leather pouch
{'score': 0.8107498288154602, 'start': 65, 'end': 78, 'answer': 'leather pouch'}
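The `start` and `end` fields in the result are character offsets into `context`, so slicing the context should recover the answer. The sketch below checks this on an excerpt of the story for `idx = 500` (only its first two sentences) and compares the prediction with the expected answer. The `normalize` and `exact_match` helpers are hypothetical names introduced for illustration; they are not part of `transformers`.

```python
# Excerpt of the story used as context (first two sentences only).
context = (
    "Avery entered the visitor center. "
    "Avery moved the compass to the leather pouch, "
    "which is also located in the visitor center."
)
# Result copied from the pipeline output above.
result = {"score": 0.8107498288154602, "start": 65, "end": 78, "answer": "leather pouch"}
expected_answer = "leather pouch"

# `start` and `end` index characters of `context`; `end` is exclusive.
span = context[result["start"]:result["end"]]


def normalize(text):
    """Lowercase and collapse whitespace (hypothetical helper)."""
    return " ".join(text.lower().split())


def exact_match(prediction, reference):
    """Return True when both answers agree after normalization."""
    return normalize(prediction) == normalize(reference)


print(span)
print(exact_match(result["answer"], expected_answer))
```

An exact-match check like this is one simple way to score many rows of the dataset; it is strict, so a correct answer that is phrased differently would count as a miss.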


## Manual Inference - Question and Answer
In this section, we will look at what `pipeline` does under the hood to perform its inference. This will give us a better understanding of the major steps involved.
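As a rough mental model, a pipeline chains three stages: preprocessing (tokenizing the input), a forward pass through the model, and postprocessing (turning raw model outputs into an answer). The sketch below mimics that flow with stand-in functions. `ToyPipeline` and its three stages are hypothetical and only illustrate the control flow; they are not the `transformers` implementation.

```python
class ToyPipeline:
    """Chains three stages, loosely mirroring the steps covered in this section."""

    def __init__(self, preprocess, forward, postprocess):
        self.preprocess = preprocess
        self.forward = forward
        self.postprocess = postprocess

    def __call__(self, raw_input):
        model_inputs = self.preprocess(raw_input)       # text -> tokens
        model_outputs = self.forward(model_inputs)      # tokens -> scores
        return self.postprocess(model_inputs, model_outputs)  # scores -> answer


# Stand-in stages: a whitespace "tokenizer", a "model" that scores each
# token by its length, and a postprocessor that picks the best token.
def preprocess(text):
    return text.split()


def forward(tokens):
    return [len(token) for token in tokens]


def postprocess(tokens, scores):
    best = max(range(len(tokens)), key=scores.__getitem__)
    return {"answer": tokens[best], "score": scores[best]}


toy = ToyPipeline(preprocess, forward, postprocess)
print(toy("where is the compass"))
```

The rest of this section replaces each stand-in with the real component: a tokenizer, a question-answering model, and decoding of the predicted span.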

In [22]:
# TODO: load tokenizer
from transformers import AutoTokenizer

## DistilBERT base cased distilled SQuAD
DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. More details [here](https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).

In [23]:
model_name = "distilbert/distilbert-base-cased-distilled-squad"

In [24]:
# TODO: Create a tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)


In [30]:
# TODO: Encode text
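# Candidate texts: each assignment overwrites the previous one,
# so only 'run running' is encoded below
text = 'big black bug bleeds black blood'
text = 'DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base.'
text = 'run running'
# return_tensors='pt' returns PyTorch tensors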
enc_text = tokenizer(text, return_tensors='pt')
print(enc_text)


{'input_ids': tensor([[ 101, 1576, 1919,  102]]), 'token_type_ids': tensor([[0, 0, 0, 0]]), 'attention_mask': tensor([[1, 1, 1, 1]])}


In [26]:
dec_text = tokenizer.decode(enc_text['input_ids'][0])
print(dec_text)

[CLS] run running [SEP]


In [31]:
for t in enc_text['input_ids'][0]:
    print(t, tokenizer.decode(t))

tensor(101) [CLS]
tensor(1576) run
tensor(1919) running
tensor(102) [SEP]
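Here `run` and `running` each map to a single ID. BERT-family tokenizers use WordPiece, which keeps words found in the vocabulary whole and splits other words into smaller pieces, with continuation pieces prefixed by `##`. The sketch below shows the greedy longest-match idea on a tiny, made-up vocabulary. The `wordpiece` function and the `vocab` contents are assumptions for illustration, not the tokenizer's real vocabulary or code.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Split one word into sub-word pieces by greedy longest match."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matched: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


# Hypothetical toy vocabulary.
vocab = {"run", "running", "bleed", "##s", "bug", "##ning"}

for word in ["run", "running", "bleeds", "zebra"]:
    print(word, wordpiece(word, vocab))
```

Because `running` is in the toy vocabulary it stays whole, while `bleeds` falls back to `bleed` plus `##s`.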


In [32]:
# TODO: Encoding multiple texts
texts = [
    'big black bug bleeds black blood',
    'DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base.',
    'run running'
]
enc_texts = tokenizer(texts, return_tensors='pt', padding=True)
print(enc_texts)


{'input_ids': tensor([[  101,  1992,  1602, 15430, 24752,  1116,  1602,  1892,   102,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0],
        [  101, 12120,  2050,  2723, 27211, 10460,  1110,   170,  1353,   117,
          2698,   117, 10928,  1105,  1609, 13809, 23763,  2235,  3972,  1118,
          4267,  2050,  7956,  1158,   139,  9637,  1942,  2259,   119,   102],
        [  101,  1576,  1919,   102,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0,     0,     0,     0,     0]]), 'token_type_ids': tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0
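With `padding=True`, shorter sequences are filled with the padding ID (0 in the output above) until every row has the length of the longest one, and the attention mask marks which positions hold real tokens. The sketch below reproduces that behaviour in plain Python for two of the sequences. `pad_batch` is a hypothetical helper, not the tokenizer's own implementation.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad token ID lists to a common length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Token IDs copied from the output above (first and third texts).
batch = pad_batch([
    [101, 1992, 1602, 15430, 24752, 1116, 1602, 1892, 102],
    [101, 1576, 1919, 102],
])
for ids, mask in zip(batch["input_ids"], batch["attention_mask"]):
    print(ids, mask)
```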

In [34]:
# TODO: Decode text
dec_text = tokenizer.decode(enc_texts['input_ids'][1], skip_special_tokens=True)
print(dec_text)


DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base.


## Working with LLMs
Create an instance of the Large Language Model (LLM). We will then create a simple
prompt (here, a question and its context), tokenize the prompt and feed the tokenized
prompt to the LLM. The response from the LLM will be decoded into human-readable text.

In [35]:
# TODO: Load libraries
from transformers import AutoModelForQuestionAnswering

In [36]:
# TODO: Load the question-answering model
# Important: use the same checkpoint that the tokenizer was loaded from
model = AutoModelForQuestionAnswering.from_pretrained(model_name)


In [40]:
# TODO: Encode context and question
idx = 500
context = dataset['train'][idx]['story_structure']
#context = dataset['train'][idx]['infilled_story']
question = dataset['train'][idx]['question']
expected_answer = dataset['train'][idx]['expected_answer']

# Tokenize the question and the context
enc_question = tokenizer(question, context, return_tensors='pt')
print(tokenizer.decode(enc_question['input_ids'][0]))
print(enc_question)


[CLS] In which container was the compass at the beginning? [SEP] Avery entered the visitor center. Avery moved the compass to the leather pouch, which is also located in the visitor center. While this action was happening, Paige witnessed this action in secret ( and only this action ). Avery moved the compass to the metal lunchbox, which is also located in the visitor center. Paige entered the visitor center. Paige moved the compass to the plastic toolbox, which is also located in the visitor center. Paige left the visitor center. Avery left the visitor center. Avery entered the visitor center. Avery moved the compass to the wooden chest, which is also located in the visitor center. [SEP]
{'input_ids': tensor([[  101,  1130,  1134, 12461,  1108,  1103, 23962,  1120,  1103,  2150,
           136,   102, 12274,  2242,  1103, 11972,  2057,   119, 12274,  1427,
          1103, 23962,  1106,  1103,  5439, 24225,   117,  1134,  1110,  1145,
          1388,  1107,  1103, 11972,  2057,   119, 
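The decoded text shows how the pair is laid out: `[CLS]`, the question tokens, `[SEP]`, the context tokens, and a final `[SEP]`. The sketch below rebuilds that layout from plain ID lists and marks which positions belong to the context, since an answer span should come from there. `build_pair` is a hypothetical helper and the context is cut to its first sentence to keep the example short.

```python
CLS_ID, SEP_ID = 101, 102  # special token IDs seen in the outputs above


def build_pair(question_ids, context_ids):
    """Lay out [CLS] question [SEP] context [SEP] and flag context positions."""
    input_ids = [CLS_ID] + question_ids + [SEP_ID] + context_ids + [SEP_ID]
    offset = len(question_ids) + 2  # skip [CLS], the question and the first [SEP]
    context_mask = [
        offset <= i < offset + len(context_ids) for i in range(len(input_ids))
    ]
    return input_ids, context_mask


# IDs copied from the output above.
question_ids = [1130, 1134, 12461, 1108, 1103, 23962, 1120, 1103, 2150, 136]
context_ids = [12274, 2242, 1103, 11972, 2057, 119]  # first sentence only

input_ids, context_mask = build_pair(question_ids, context_ids)
print(input_ids)
print(context_mask.index(True))  # position of the first context token
```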

In [39]:
# TODO: Run the model on the tokenized inputs
result = model(input_ids=enc_question['input_ids'], attention_mask=enc_question['attention_mask'])
print(result)


QuestionAnsweringModelOutput(loss=None, start_logits=tensor([[-4.1163, -7.5426, -8.1516, -7.7312, -8.9013, -8.6334, -8.2951, -8.9566,
         -8.9771, -8.4805, -6.7856, -7.3903, -3.6027, -6.6565, -7.0145, -6.0696,
         -8.1536, -5.9973, -1.7717, -3.4510, -3.1677, -2.7319, -2.6094,  4.5097,
          6.5242,  0.9058, -5.9276, -6.1396, -7.3438, -7.8895, -6.8208, -6.7113,
         -6.5547, -5.8745, -8.1503, -6.2181, -5.7757, -6.9902, -7.1853, -8.7854,
         -7.6651, -7.7931, -3.8049, -7.0893, -6.8255, -7.6624, -7.0048, -5.6159,
         -7.6462, -9.1547, -8.1187, -7.5051, -7.6538, -8.0592, -6.6424, -2.5208,
         -4.7955, -4.5681, -4.4267, -4.6122, -0.2729,  0.5295, -0.3047, -4.3405,
         -7.4328, -7.3821, -8.2403, -8.4449, -7.5712, -7.2609, -6.9077, -6.3633,
         -8.3037, -6.5707, -3.2359, -6.9395, -6.4840, -5.9722, -8.1572, -6.3711,
         -2.3406, -5.0905, -4.4207, -4.1457, -4.5410, -0.5050,  0.6238, -2.1800,
         -4.7540, -7.7595, -7.4940, -8.0688, -8.3867, -7

In [41]:
import torch

In [46]:
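# Most likely start and end token positions for the answer
start_ans = torch.argmax(result['start_logits'])
# +1 because the slice below excludes its end index
end_ans = torch.argmax(result['end_logits']) + 1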

# extract the relevant tokens from the tokenized question
enc_ans = enc_question['input_ids'][0][start_ans:end_ans]
print(enc_ans)
answer = tokenizer.decode(enc_ans)
print(answer)

tensor([ 5439, 24225])
leather pouch
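Taking the two `argmax` values independently works here, but nothing guarantees that the predicted end comes at or after the predicted start. A common remedy is to score every valid `(start, end)` pair by the sum of its logits and keep the best one. The sketch below does this in plain Python on made-up logits. `best_span` and `max_answer_len` are hypothetical names, and this is a simplified illustration rather than what `pipeline` does internally.

```python
def best_span(start_logits, end_logits, max_answer_len=15):
    """Return the (start, end) pair with the highest summed logit, start <= end."""
    best_score = float("-inf")
    best = (0, 0)
    for start, start_logit in enumerate(start_logits):
        # Only consider ends at or after the start, within the length limit.
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_logit + end_logits[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


# Toy logits where independent argmax gives start=3 and end=1 (an invalid span).
start_logits = [0.1, 0.2, 1.0, 5.0, 0.3]
end_logits = [0.0, 6.0, 0.5, 2.0, 4.0]

start, end = best_span(start_logits, end_logits)
print(start, end)  # `end` is inclusive, so slice tokens with [start:end + 1]
```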
