# Introduction to Data Science 2025

# Week 4

In this week's exercise, we look at prompting and zero- and few-shot task settings. Below is a text generation example from https://github.com/TurkuNLP/intro-to-nlp/blob/master/text_generation_pipeline_example.ipynb demonstrating how to load a text generation pipeline with a pre-trained model and generate text with a given prompt. Your task is to load a similar pre-trained generative model and assess whether the model succeeds at a set of tasks in zero-shot, one-shot, and two-shot settings.

**Note: Downloading and running the pre-trained model locally may take some time. Alternatively, you can open and run this notebook on [Google Colab](https://colab.research.google.com/), as assumed in the following example.**

## Text generation example

This is a brief example of how to run text generation with a causal language model and `pipeline`.

Install the [transformers](https://huggingface.co/docs/transformers/index) Python package. It is used to load the model and tokenizer and to run generation.

In [1]:
!pip install --quiet transformers

Import the `AutoTokenizer` and `AutoModelForCausalLM` classes and the `pipeline` function. The two classes support loading tokenizers and generative models from the [Hugging Face repository](https://huggingface.co/models), and `pipeline` wraps a tokenizer and a model for convenience.

In [2]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

Load a generative model and its tokenizer. You can substitute any other generative model name here (e.g. one of the [TurkuNLP GPT-3 models](https://huggingface.co/models?sort=downloads&search=turkunlp%2Fgpt3)), but note that Colab may have issues running larger models.

In [3]:
MODEL_NAME = "gpt2"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Instantiate a text generation pipeline using the tokenizer and model.

In [4]:
pipe = pipeline(
    'text-generation',
    model=model,
    tokenizer=tokenizer,
    device=model.device
)

Device set to use cpu


We can now call the pipeline with a text prompt; it will take care of tokenizing, encoding, generation, and decoding:

In [7]:
output = pipe('Hello, how are you?', max_new_tokens=25)

print(output)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "Hello, how are you? I'm not sure how to explain to you what's going on.\n\nI had been watching the game for a long"}]


Print only the generated text:

In [8]:
print(output[0]['generated_text'])

Hello, how are you? I'm not sure how to explain to you what's going on.

I had been watching the game for a long


We can also call the pipeline with any arguments that the model's `generate` method supports. For details on text generation using `transformers`, see e.g. [this tutorial](https://huggingface.co/blog/how-to-generate).

Example with sampling and a high `temperature` parameter to generate more chaotic output:

In [9]:
output = pipe(
    'Hello, how are you?',
    do_sample=True,
    temperature=10.0,
    max_new_tokens=25
)

print(output[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Hello, how are you? How have your legs gotten really hard at times. Where I am here." And looking across this hall for that particular spot with
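
To see why a high `temperature` makes the output more chaotic, the following minimal sketch applies temperature scaling to a toy set of next-token scores. It assumes the usual formulation in which the scores (logits) are divided by the temperature before the softmax; the function name `softmax_with_temperature` and the three toy scores are made up for illustration and are not part of `transformers`.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    shift = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens (illustrative only).
toy_logits = [4.0, 2.0, 0.0]

for temperature in (0.5, 1.0, 10.0):
    probs = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At a low temperature almost all probability mass sits on the highest-scoring token, while at `temperature=10.0` the three probabilities are close to each other, so sampling picks unlikely tokens far more often.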


## Exercise 1

Your task is to assess whether a generative model succeeds in the following tasks in zero-shot, one-shot, and two-shot settings:

- binary sentiment classification (positive / negative)

- person name recognition

- two-digit addition (e.g. 11 + 22 = 33)

For example, for assessing whether a generative model can name capital cities, we could use the following prompts:

- zero-shot:
	>"""\
	>Identify the capital cities of countries.
	>
	>Question: What is the capital of Finland?\
	>Answer:\
	>"""
- one-shot:
	>"""\
	>Identify the capital cities of countries.
	>
	>Question: What is the capital of Sweden?\
	>Answer: Stockholm
	>
	>Question: What is the capital of Finland?\
	>Answer:\
	>"""
- two-shot:
	>"""\
	>Identify the capital cities of countries.
	>
	>Question: What is the capital of Sweden?\
	>Answer: Stockholm
	>
	>Question: What is the capital of Denmark?\
	>Answer: Copenhagen
	>
	>Question: What is the capital of Finland?\
	>Answer:\
	>"""
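
The three prompts above differ only in how many solved examples precede the final question, so they can be built from one template. The sketch below is one possible way to do this; `build_prompt` and its parameter names are hypothetical helpers, not part of the exercise or of `transformers`.

```python
def build_prompt(instruction, demonstrations, query,
                 input_label="Question", output_label="Answer"):
    """Assemble a k-shot prompt, where k is the number of demonstrations."""
    parts = [instruction, ""]
    for text, answer in demonstrations:
        parts.append(f"{input_label}: {text}")
        parts.append(f"{output_label}: {answer}")
        parts.append("")  # blank line between examples
    parts.append(f"{input_label}: {query}")
    parts.append(f"{output_label}:")  # left open for the model to complete
    return "\n".join(parts)


instruction = "Identify the capital cities of countries."
demos = [
    ("What is the capital of Sweden?", "Stockholm"),
    ("What is the capital of Denmark?", "Copenhagen"),
]
query = "What is the capital of Finland?"

# Zero-, one-, and two-shot prompts use the first 0, 1, and 2 demonstrations.
prompts = {k: build_prompt(instruction, demos[:k], query) for k in (0, 1, 2)}
print(prompts[2])
```

The generated prompt ends directly after `Answer:` with no trailing newline, matching the examples above, where the backslash before the closing `"""` suppresses the final newline.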

You can do the tasks either in English or Finnish and use a generative model of your choice from the Hugging Face models repository, for example the following models:

- English: `gpt2-large`
- Finnish: `TurkuNLP/gpt3-finnish-large`

You can either come up with your own instructions for the tasks or use the following:

- English:
	- binary sentiment classification: "Do the following texts express a positive or negative sentiment?"
	- person name recognition: "List the person names occurring in the following texts."
	- two-digit addition: "This is a first grade math exam."
- Finnish:
	- binary sentiment classification: "Ilmaisevatko seuraavat tekstit positiivista vai negatiivista tunnetta?"
	- person name recognition: "Listaa seuraavissa teksteissä mainitut henkilönnimet."
	- two-digit addition: "Tämä on ensimmäisen luokan matematiikan koe."

Come up with at least two test cases for each of the three tasks, and come up with your own one- and two-shot examples.

In [17]:
#Binary sentiment classification

prompt_zero="""
Do the following texts express a positive or negative sentiment?

Text: The food in the restaurant was disgusting.
Answer:
"""

prompt_one="""
Do the following texts express a positive or negative sentiment?

Text: I like this movie.
Answer: Positive

Text: The food in the restaurant was disgusting.
Answer:
"""

prompt_two="""
Do the following texts represent a positive or negative sentiment?

Text: I like this movie.
Answer: Positive

Text: The weather was rainy today. I didn't like it.
Answer: Positive

Text: The food in the restaurant was disgusting.
Answer:
"""

output_zero = pipe(prompt_zero, max_new_tokens=15)
output_one = pipe(prompt_one, max_new_tokens=15)
output_two = pipe(prompt_two, max_new_tokens=15)

print("Zero-shot:", output_zero[0]['generated_text'])
print("One-shot:", output_one[0]['generated_text'])
print("Two-shot:", output_two[0]['generated_text'])

#SECOND TEST CASE FOR BINARY CLASSIFICATION

prompt_zero2 = """
Do the following texts express a positive or negative sentiment?

Text: I hate peeling potatoes.

Answer:
"""

# One-shot
prompt_one2 = """
Do the following texts express a positive or negative sentiment?

Text: I like this movie.
Answer: Positive

Text: Text: I hate peeling potatoes.
Answer:
"""

# Two-shot
prompt_two2 = """
Do the following texts express a positive or negative sentiment?

Text: I like this movie.
Answer: Positive

Text: The weather was rainy today. I didn't like it.
Answer: Negative

Text: I hate peeling potatoes.
Answer:
"""

# Generate outputs
output_zero2 = pipe(prompt_zero2, max_new_tokens=5)
output_one2 = pipe(prompt_one2, max_new_tokens=5)
output_two2 = pipe(prompt_two2, max_new_tokens=5)

# Print results
print("Zero-shot:", output_zero2[0]['generated_text'])
print("One-shot:", output_one2[0]['generated_text'])
print("Two-shot:", output_two2[0]['generated_text'])
# Evaluation
# For the binary sentiment classification task, GPT-2 (the `gpt2` checkpoint loaded above) was evaluated in zero-, one-, and two-shot settings with two test cases.

# Test case 1 ("The food in the restaurant was disgusting."): no setting produced a Positive/Negative label. Zero-shot repeated a shortened version of the input, while one-shot and two-shot continued with a new, unrelated "Text:" entry. Note that the two-shot prompt for this case labels the rainy-weather example "Positive" although the text is negative, and says "represent" instead of "express", so this run is not a clean two-shot test.

# Test case 2 ("I hate peeling potatoes."): zero-shot answered "Yes." instead of a sentiment label. One-shot answered "Negative", which is correct (the prompt contains a duplicated "Text: Text:" prefix, which did not prevent the correct answer). Two-shot answered "Positive", which is incorrect even though both demonstrations were labelled correctly.

# Overall: only one of the six runs gave the correct label. The demonstrations helped the model follow the "Answer: <label>" format in the second test case, but the label itself was unreliable, which is plausible for a general language model that has not been fine-tuned as a classifier. Two test cases are too few to say whether one-shot prompting is generally more reliable than two-shot prompting.




Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Zero-shot: 
Do the following texts express a positive or negative sentiment? 

Text: The food in the restaurant was disgusting.
Answer: 

Text: The food was disgusting.

Answer: 


One-shot: 
Do the following texts express a positive or negative sentiment?

Text: I like this movie.
Answer: Positive

Text: The food in the restaurant was disgusting.
Answer:

Text: Did you know that when you are hungry, you can pick
Two-shot: 
Do the following texts represent a positive or negative sentiment?

Text: I like this movie.
Answer: Positive

Text: The weather was rainy today. I didn't like it.
Answer: Positive

Text: The food in the restaurant was disgusting.
Answer:

Text: The only thing I could think about was what could be fun


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Zero-shot: 
Do the following texts express a positive or negative sentiment? 

Text: I hate peeling potatoes.

Answer: Yes.

Text
One-shot: 
Do the following texts express a positive or negative sentiment? 

Text: I like this movie.
Answer: Positive

Text: Text: I hate peeling potatoes.
Answer: Negative

Text:
Two-shot: 
Do the following texts express a positive or negative sentiment? 

Text: I like this movie.
Answer: Positive

Text: The weather was rainy today. I didn't like it.
Answer: Negative

Text: I hate peeling potatoes.
Answer: Positive

Text:
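
Because `generated_text` contains the prompt followed by the continuation, reading off the model's answer by eye is error-prone. The sketch below shows one way to isolate the first line the model added and map it to a label. `extract_answer` and `normalize_sentiment` are hypothetical helpers, and the generated strings are hand-written stand-ins shaped like the recorded outputs, not new model runs. The text-generation pipeline also accepts `return_full_text=False`, which should return only the continuation and would make the prefix stripping unnecessary.

```python
def extract_answer(generated_text, prompt):
    """Return the first non-empty line that follows the prompt."""
    if generated_text.startswith(prompt):
        completion = generated_text[len(prompt):]
    else:
        completion = generated_text  # fall back to the whole string
    for line in completion.splitlines():
        stripped = line.strip()
        if stripped:
            return stripped
    return ""


def normalize_sentiment(answer):
    """Map a free-text answer to 'positive', 'negative', or None."""
    lowered = answer.lower()
    if lowered.startswith("positive"):
        return "positive"
    if lowered.startswith("negative"):
        return "negative"
    return None  # the model did not give a usable label


prompt = (
    "Do the following texts express a positive or negative sentiment?\n\n"
    "Text: I hate peeling potatoes.\n"
    "Answer:"
)

# Stand-in generations (illustrative, not produced by running the model here).
for generated in (prompt + " Negative\n\nText:", prompt + " Yes.\n\nText"):
    answer = extract_answer(generated, prompt)
    print(repr(answer), normalize_sentiment(answer))
```

Returning `None` for anything that is not a label keeps "no answer" separate from "wrong answer" when the runs are tallied.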


In [25]:
#Person name recognition
names_prompt_zero="""
 List the person names occurring in the following texts.

 Text: Lisa went to the childregarden in the morning.
 Answer:
 """
names_prompt_one="""

List the person names occurring in the following texts.

Text: George helped me with my math homework today.
Answer: George

Text:  Lisa went to the childregarden in the morning.
Answer:
"""

names_prompt_two="""
List the person names occurring in the following texts.
Text: George helped me with my math homework today.
Answer: George

Text: Last summer, I travelled to Bulgaria with Lucas.
Answer: Lucas

Text:  Lisa went to the childregarden in the morning.
Answer:

"""
names_output_zero = pipe(names_prompt_zero, max_new_tokens=10)
names_output_one = pipe(names_prompt_one, max_new_tokens=10)
names_output_two = pipe(names_prompt_two, max_new_tokens=10)

print("Zero-shot:", names_output_zero[0]["generated_text"])
print("One-shot:", names_output_one[0]["generated_text"])
print("Two_shot:", names_output_two[0]["generated_text"])

#SECOND TEST CASE

names_prompt_zero2 = """
List the person names occurring in the following texts.

Text:  Lisa went to the childregarden in the morning.
Answer:
"""

names_prompt_one2="""
List the person names occurring in the following texts.

Text: Michael went to the library to study. After that he had lunch with Irmak. They have been  dating for 3 years. In the evening they hanged out with friends and Karoline stayed at their place after partying.
Answer: Michael, Irmak, Karoline

Text:  Lisa went to the childregarden in the morning.
Answer:
"""

names_prompt_two2="""
List the person names occurring in the following texts.

Text: Michael went to the library to study. After that he had lunch with Irmak. They have been  dating for 3 years. In the evening they hanged out with friends and Karoline stayed at their place after partying.
Answer: Michael, Irmak, Karoline

Text: Last summer, I travelled to Georgia with two friends: Kirill and Vitalic. It was an amazing and fun vacation. After we arrived, Josephine met us at the airport.
Answer: Kirill, Vitalic, Josephine

Text:  Lisa went to the childregarden in the morning.
Answer:
"""

names_output_zero2 = pipe(names_prompt_zero2, max_new_tokens=10)
names_output_one2 = pipe(names_prompt_one2, max_new_tokens=10)
names_output_two2 = pipe(names_prompt_two2, max_new_tokens=10)

print("Zero-shot:", names_output_zero2[0]["generated_text"])
print("one_shot", names_output_one2[0]["generated_text"])
print("Two-shot", names_output_two2[0]["generated_text"])
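# Evaluation
# Both test cases use the same target sentence ("Lisa went to the childregarden in the morning."); they differ only in the demonstrations.
# The name "Lisa" appeared only in the zero-shot outputs, and never as a clean answer: the first run echoed the input sentence ("Lisa went to the childreg...") and the second produced "Lisa:" repeatedly.
# In the one-shot and two-shot runs the model either echoed the input or emitted empty "Text:" lines instead of a name. The two-shot output of the second test case is truncated in the output shown below, so it cannot be assessed.
# Overall, GPT-2 does not reliably perform person name recognition in any of the three settings here. When the right name does appear, it is surrounded by extra or noisy text, which is expected because
# it is a general language model rather than a fine-tuned named entity recognition model.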






Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Zero-shot: 
 List the person names occurring in the following texts. 

 Text: Lisa went to the childregarden in the morning.
 Answer: 
   

Lisa went to the childreg
One-shot: 
 
List the person names occurring in the following texts. 

Text: George helped me with my math homework today.
Answer: George

Text:  Lisa went to the childregarden in the morning.
Answer:

Text:  Lisa went to the childreg
Two_shot: 
List the person names occurring in the following texts. 
Text: George helped me with my math homework today.
Answer: George

Text: Last summer, I travelled to Bulgaria with Lucas.
Answer: Lucas

Text:  Lisa went to the childregarden in the morning.
Answer:


Text: 

Text: 



Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Zero-shot: 
List the person names occurring in the following texts. 

Text:  Lisa went to the childregarden in the morning.
Answer:

Lisa: 

Lisa: 

one_shot 
List the person names occurring in the following texts. 

Text: Michael went to the library to study. After that he had lunch with Irmak. They have been  dating for 3 years. In the evening they hanged out with friends and Karoline stayed at their place after partying.
Answer: Michael, Irmak, Karoline

Text:  Lisa went to the childregarden in the morning.
Answer:

Text: 

Text: 

Two-shot 
List the person names occurring in the following texts. 

Text: Michael went to the library to study. After that he had lunch with Irmak. They have been  dating for 3 years. In the evening they hanged out with friends and Karoline stayed at their place after partying.
Answer: Michael, Irmak, Karoline

Text: Last summer, I travelled to Georgia with two friends: Kirill and Vitalic. It was an amazing and fun vacation. After we arrived, Josephine me
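
For name recognition an answer can be partly right, for example when the correct name appears together with extra words. A possible way to quantify this is to split the answer line on commas and compare it with the expected names as sets. The helpers `parse_names` and `name_scores` below are hypothetical, and the example answers are hand-written stand-ins rather than model output.

```python
def parse_names(answer_line):
    """Split a comma-separated answer such as 'Michael, Irmak' into names."""
    return [name.strip() for name in answer_line.split(",") if name.strip()]


def name_scores(predicted, expected):
    """Return (precision, recall) of the predicted names against the expected ones."""
    pred, gold = set(predicted), set(expected)
    hits = len(pred & gold)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


expected = ["Lisa"]

# Stand-in answers (illustrative): a clean answer, a noisy one, and an empty one.
for answer_line in ("Lisa", "Lisa, went", ""):
    predicted = parse_names(answer_line)
    print(repr(answer_line), name_scores(predicted, expected))
```

Precision drops when the answer contains extra items, while recall drops when expected names are missing, so the two numbers separate "noisy" from "incomplete" answers.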

In [30]:
#Two-digit addition: "This is a first grade math exam."
math_prompt_zero="""
This is a first grade math exam. Make a calculation and solve mathematical problem. Answer with only the number.

Text: 33+22=
Answer:
"""

math_prompt_one="""
This is a first grade math exam. Make a calculation and solve mathematical problem. Answer with only the number.

Text: 20+22=
Answer: 44

Text: 33+22=
Answer:
"""

math_prompt_two="""
This is a first grade math exam. Make a calculation and solve mathematical problem. Answer with only the number.

Text: 20+22=
Answer: 44

Text: 35+15=
Answer: 50

Text: 33+22=
Answer:
"""
math_output_zero = pipe(math_prompt_zero, max_new_tokens=5)
math_output_one = pipe(math_prompt_one, max_new_tokens=5)
math_output_two = pipe(math_prompt_two, max_new_tokens=5)

print("Zero-shot:", math_output_zero[0]["generated_text"])
print("One-shot:", math_output_one[0]["generated_text"])
print("Two-shot:", math_output_two[0]["generated_text"])

#SECOND TEST CASE

math_prompt_zero2="""
This is a first grade math exam. Make a calculation and solve mathematical problem. Answer with only the number.

Text: 3+2=
Answer:
"""

math_prompt_one2="""
This is a first grade math exam. Make a calculation and solve mathematical problem. Answer with only the number.

Text: 20+22=
Answer: 44

Text: 3+2=
Answer:
"""

math_prompt_two2="""
This is a first grade math exam. Make a calculation and solve mathematical problem. Answer with only the number.

Text: 20+22=
Answer: 44

Text: 35+15=
Answer: 50

Text: 3+2=
Answer:
"""
math_output_zero2 = pipe(math_prompt_zero2, max_new_tokens=5)
math_output_one2 = pipe(math_prompt_one2, max_new_tokens=5)
math_output_two2 = pipe(math_prompt_two2, max_new_tokens=5)

print("Zero-shot:", math_output_zero2[0]["generated_text"])
print("One-shot:", math_output_one2[0]["generated_text"])
print("Two-shot:", math_output_two2[0]["generated_text"])

# Evaluation
# For the addition task, GPT-2 was evaluated in zero-, one-, and two-shot settings with two test cases. Note that the second test case (3+2) is a single-digit sum, and that the demonstration "20+22=" is answered with 44 although the correct sum is 42, so the one- and two-shot prompts contain a wrong example.

# In the first test case (33+22), zero-shot prompting caused the model to repeat the input without providing an answer. With one-shot prompting, the model left the answer empty and began a new problem ("Text: 40+") instead of computing the sum. In the two-shot setting, it left the answer empty and produced only a bare "Text:".

# In the second test case (3+2), zero-shot prompting again repeated the input without solving it. One-shot produced only an empty "Text:", and two-shot began a new problem ("Text: 1+") without answering.

# Overall, GPT-2 produced no correct sum in any of the six runs, and the demonstrations did not help: the model imitated the "Text:" layout of the prompt rather than computing an answer. On this evidence GPT-2 is not suited for precise numerical computations, as it is a language model rather than a calculator.





Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Zero-shot: 
This is a first grade math exam. Make a calculation and solve mathematical problem. Answer with only the number.

Text: 33+22=
Answer:

33+22=
One-shot: 
This is a first grade math exam. Make a calculation and solve mathematical problem. Answer with only the number.

Text: 20+22=
Answer: 44

Text: 33+22=
Answer:

Text: 40+
Two-shot: 
This is a first grade math exam. Make a calculation and solve mathematical problem. Answer with only the number.

Text: 20+22=
Answer: 44

Text: 35+15=
Answer: 50

Text: 33+22=
Answer:

Text:




Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Zero-shot: 
This is a first grade math exam. Make a calculation and solve mathematical problem. Answer with only the number.

Text: 3+2=
Answer:

3+2=
One-shot: 
This is a first grade math exam. Make a calculation and solve mathematical problem. Answer with only the number.

Text: 20+22=
Answer: 44

Text: 3+2=
Answer:

Text:


Two-shot: 
This is a first grade math exam. Make a calculation and solve mathematical problem. Answer with only the number.

Text: 20+22=
Answer: 44

Text: 35+15=
Answer: 50

Text: 3+2=
Answer:

Text: 1+
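
Hand-written arithmetic demonstrations are easy to get wrong, so one option is to generate them and to check the model's answer in code. The sketch below assumes the `Text: a+b=` / `Answer: n` layout used in the prompts above; `addition_demo`, `first_integer`, and `is_correct_sum` are hypothetical helpers, and the completions are hand-written stand-ins rather than model output.

```python
import re


def addition_demo(a, b):
    """Return a (question, answer) pair whose answer is correct by construction."""
    return f"{a}+{b}=", str(a + b)


def first_integer(text):
    """Return the first integer found in the text, or None if there is none."""
    match = re.search(r"-?\d+", text)
    return int(match.group()) if match else None


def is_correct_sum(completion, a, b):
    """Check whether the first integer in the completion equals a + b."""
    return first_integer(completion) == a + b


print(addition_demo(20, 22))

# Stand-in completions (illustrative): a correct answer, a new problem, no number.
for completion in (" 55", "\n\nText: 40+", "\n\nText:"):
    print(repr(completion), is_correct_sum(completion, 33, 22))
```

The check looks only at the first integer in the completion, so a model that restates the question before answering would be scored as wrong; a stricter answer format would need a stricter parser.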


**Submit this exercise by submitting your code and your answers to the above questions as comments on the MOOC platform. You can return this Jupyter notebook (.ipynb) or a .py, .R, etc. file, depending on your programming preferences.**