# Introduction to Data Science 2025

# Week 4

In this week's exercise, we look at prompting and zero- and few-shot task settings. Below is a text generation example from https://github.com/TurkuNLP/intro-to-nlp/blob/master/text_generation_pipeline_example.ipynb demonstrating how to load a text generation pipeline with a pre-trained model and generate text with a given prompt. Your task is to load a similar pre-trained generative model and assess whether the model succeeds at a set of tasks in zero-shot, one-shot, and two-shot settings.

**Note: Downloading and running the pre-trained model locally may take some time. Alternatively, you can open and run this notebook on [Google Colab](https://colab.research.google.com/), as assumed in the following example.**

## Text generation example

This is a brief example of how to run text generation with a causal language model and `pipeline`.

Install the [transformers](https://huggingface.co/docs/transformers/index) Python package, which is used to load the model and tokenizer and to run generation.

In [1]:
%pip install --quiet transformers

Import the `AutoTokenizer` and `AutoModelForCausalLM` classes and the `pipeline` function. The first two support loading tokenizers and generative models from the [Hugging Face repository](https://huggingface.co/models), and the last wraps a tokenizer and a model for convenience.

In [2]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

Load a generative model and its tokenizer. You can substitute any other generative model name here (e.g. [other TurkuNLP GPT-3 models](https://huggingface.co/models?sort=downloads&search=turkunlp%2Fgpt3)), but note that Colab may have issues running larger models.

In [None]:
MODEL_NAME = 'TurkuNLP/gpt3-finnish-large'

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

Instantiate a text generation pipeline using the tokenizer and model.

In [None]:
pipe = pipeline(
    'text-generation',
    model=model,
    tokenizer=tokenizer,
    device=model.device
)

We can now call the pipeline with a text prompt; it will take care of tokenizing, encoding, generation, and decoding:

In [None]:
output = pipe('Terve, miten menee?', max_new_tokens=25)

print(output)

The pipeline returns a list of dictionaries. To print only the generated text, index into the first result:

In [None]:
print(output[0]['generated_text'])
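
By default the `generated_text` field holds the prompt followed by the model's continuation. The following is a minimal sketch of separating the two, assuming the output has the list-of-dictionaries shape printed above. `extract_continuation` is a hypothetical helper, and `mock_output` is made-up data standing in for a real pipeline call so that the snippet runs without downloading a model.

```python
def extract_continuation(output, prompt):
    """Return the text generated after the prompt (hypothetical helper)."""
    text = output[0]['generated_text']
    # The pipeline echoes the prompt at the start of the text by default.
    if text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()


# Made-up output in the shape returned by the text-generation pipeline.
mock_output = [{'generated_text': 'Terve, miten menee? Hyvin menee, kiitos.'}]
print(extract_continuation(mock_output, 'Terve, miten menee?'))
```

The text generation pipeline also accepts `return_full_text=False`, which should return only the continuation without any manual slicing.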

We can also call the pipeline with any arguments that the model's `generate` method supports. For details on text generation using `transformers`, see e.g. [this tutorial](https://huggingface.co/blog/how-to-generate).

The following example enables sampling and sets a high `temperature` value, which produces more chaotic output:

In [None]:
output = pipe(
    'Terve, miten menee?',
    do_sample=True,
    temperature=10.0,
    max_new_tokens=25
)

print(output[0]['generated_text'])
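
To see why a high temperature makes the output more chaotic, it helps to look at how temperature is commonly applied: the logits are divided by the temperature before the softmax. The sketch below illustrates this on made-up logits for three candidate tokens. It is an illustration of the general idea, not the `transformers` implementation, and `softmax_with_temperature` is a hypothetical helper.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability before exponentiating.
    largest = max(scaled)
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three candidate tokens.
logits = [2.0, 1.0, 0.0]
for t in (0.1, 1.0, 10.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all probability mass falls on the highest-scoring token, while at a temperature of 10 the three probabilities are nearly equal, so sampling picks tokens almost at random.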

## Exercise 1

Your task is to assess whether a generative model succeeds in the following tasks in zero-shot, one-shot, and two-shot settings:

- binary sentiment classification (positive / negative)

- person name recognition

- two-digit addition (e.g. 11 + 22 = 33)

For example, for assessing whether a generative model can name capital cities, we could use the following prompts:

- zero-shot:
	>"""\
	>Identify the capital cities of countries.
	>
	>Question: What is the capital of Finland?\
	>Answer:\
	>"""
- one-shot:
	>"""\
	>Identify the capital cities of countries.
	>
	>Question: What is the capital of Sweden?\
	>Answer: Stockholm
	>
	>Question: What is the capital of Finland?\
	>Answer:\
	>"""
- two-shot:
	>"""\
	>Identify the capital cities of countries.
	>
	>Question: What is the capital of Sweden?\
	>Answer: Stockholm
	>
	>Question: What is the capital of Denmark?\
	>Answer: Copenhagen
	>
	>Question: What is the capital of Finland?\
	>Answer:\
	>"""

You can do the tasks either in English or Finnish and use a generative model of your choice from the Hugging Face models repository, for example the following models:

- English: `gpt2-large`
- Finnish: `TurkuNLP/gpt3-finnish-large`

You can either come up with your own instructions for the tasks or use the following:

- English:
	- binary sentiment classification: "Do the following texts express a positive or negative sentiment?"
	- person name recognition: "List the person names occurring in the following texts."
	- two-digit addition: "This is a first grade math exam."
- Finnish:
	- binary sentiment classification: "Ilmaisevatko seuraavat tekstit positiivista vai negatiivista tunnetta?"
	- person name recognition: "Listaa seuraavissa teksteissä mainitut henkilönnimet."
	- two-digit addition: "Tämä on ensimmäisen luokan matematiikan koe."

Come up with at least two test cases for each of the three tasks, and come up with your own one- and two-shot examples.
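
Writing each prompt by hand is error-prone, so one option is to assemble the zero-, one-, and two-shot variants from a shared list of examples. The sketch below does this for the capital city prompts shown earlier. `build_prompt` is a hypothetical helper, and the `Question:` / `Answer:` layout is simply copied from those examples.

```python
def build_prompt(instruction, examples, question):
    """Assemble a k-shot prompt, where k is the number of (question, answer) pairs."""
    parts = [instruction, ""]
    for q, a in examples:
        parts += [f"Question: {q}", f"Answer: {a}", ""]
    parts += [f"Question: {question}", "Answer:"]
    return "\n".join(parts)


instruction = "Identify the capital cities of countries."
shots = [
    ("What is the capital of Sweden?", "Stockholm"),
    ("What is the capital of Denmark?", "Copenhagen"),
]
query = "What is the capital of Finland?"

# Slicing the example list gives the zero-, one-, and two-shot variants.
prompts = {k: build_prompt(instruction, shots[:k], query) for k in range(3)}
print(prompts[1])
```

Building all three variants from the same instruction and query keeps the number of examples as the only difference between settings.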

In [9]:
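from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Load GPT-2 large and its tokenizer from the Hugging Face repository.
MODEL_NAME = 'gpt2-large'
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# GPT-2 defines no padding token, so reuse the end-of-sequence token.
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
# Run generation on the CPU.
pipe = pipeline('text-generation', model=model, tokenizer=tokenizer, device='cpu')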

print("Binary sentiment classification\n")

sentiment_tests = [
    "I love this book, it's amazing!",
    "This is terrible and disappointing."
]

print("Zero-shot:")
for text in sentiment_tests:
    prompt = f"Do the following texts express a positive or negative sentiment?\n\nText: {text}\nSentiment:"
    result = pipe(prompt, max_new_tokens=3, temperature=0.1)[0]['generated_text']
    answer = result[len(prompt):].strip().split()[0]
    print(f"  {text}: {answer}")

print("\nOne-shot:")
for text in sentiment_tests:
    prompt = f"""Do the following texts express a positive or negative sentiment?

Text: This is wonderful!
Sentiment: positive

Text: {text}
Sentiment:"""
    result = pipe(prompt, max_new_tokens=3, temperature=0.1)[0]['generated_text']
    answer = result[len(prompt):].strip().split()[0]
    print(f"  {text}: {answer}")

print("\nTwo-shot:")
for text in sentiment_tests:
    prompt = f"""Do the following texts express a positive or negative sentiment?

Text: This is wonderful!
Sentiment: positive

Text: I hate this.
Sentiment: negative

Text: {text}
Sentiment:"""
    result = pipe(prompt, max_new_tokens=3, temperature=0.1)[0]['generated_text']
    answer = result[len(prompt):].strip().split()[0]
    print(f"  {text}: {answer}")

print("\n\nPerson name recognition\n")

name_tests = [
    "Alice and Bob went to cinema.",
    "Cindy met Tom yesterday."
]

print("Zero-shot:")
for text in name_tests:
    prompt = f"List the person names occurring in the following texts.\n\nText: {text}\nNames:"
    result = pipe(prompt, max_new_tokens=10, temperature=0.1)[0]['generated_text']
    answer = result[len(prompt):].strip().split('\n')[0]
    print(f"  {text}: {answer}")

print("\nOne-shot:")
for text in name_tests:
    prompt = f"""List the person names occurring in the following texts.

Text: John and Mary are friends.
Names: John, Mary

Text: {text}
Names:"""
    result = pipe(prompt, max_new_tokens=10, temperature=0.1)[0]['generated_text']
    answer = result[len(prompt):].strip().split('\n')[0]
    print(f"  {text}: {answer}")

print("\nTwo-shot:")
for text in name_tests:
    prompt = f"""List the person names occurring in the following texts.

Text: John and Mary are friends.
Names: John, Mary

Text: Sarah called David yesterday.
Names: Sarah, David

Text: {text}
Names:"""
    result = pipe(prompt, max_new_tokens=10, temperature=0.1)[0]['generated_text']
    answer = result[len(prompt):].strip().split('\n')[0]
    print(f"  {text}: {answer}")

print("\n\nTwo-digit addition\n")

math_tests = [
    "21 + 55",
    "17 + 19"
]

print("Zero-shot:")
for problem in math_tests:
    prompt = f"This is a first grade math exam.\n\nProblem: {problem} =\nAnswer:"
    result = pipe(prompt, max_new_tokens=5, temperature=0.1)[0]['generated_text']
    answer = result[len(prompt):].strip().split()[0]
    print(f"  {problem} = {answer}")

print("\nOne-shot:")
for problem in math_tests:
    prompt = f"""This is a first grade math exam.

Problem: 11 + 33 = 44

Problem: {problem} ="""
    result = pipe(prompt, max_new_tokens=5, temperature=0.1)[0]['generated_text']
    answer = result[len(prompt):].strip().split()[0]
    print(f"  {problem} = {answer}")

print("\nTwo-shot:")
for problem in math_tests:
    prompt = f"""This is a first grade math exam.

Problem: 11 + 33 = 44
Problem: 20 + 18 = 38

Problem: {problem} ="""
    result = pipe(prompt, max_new_tokens=5, temperature=0.1)[0]['generated_text']
    answer = result[len(prompt):].strip().split()[0]
    print(f"  {problem} = {answer}")

Device set to use cpu


Binary sentiment classification

Zero-shot:
  I love this book, it's amazing!: I
  This is terrible and disappointing.: This

One-shot:
  I love this book, it's amazing!: positive
  This is terrible and disappointing.: negative

Two-shot:
  I love this book, it's amazing!: positive
  This is terrible and disappointing.: negative


Person name recognition

Zero-shot:
  Alice and Bob went to cinema.: Alice and Bob went to cinema.
  Cindy met Tom yesterday.: Cindy, Tom, Tom, Tom, Tom,

One-shot:
  Alice and Bob went to cinema.: Alice, Bob
  Cindy met Tom yesterday.: Cindy, Tom

Two-shot:
  Alice and Bob went to cinema.: Alice, Bob
  Cindy met Tom yesterday.: Cindy, Tom


Two-digit addition

Zero-shot:
  21 + 55 = 21
  17 + 19 = 17

One-shot:
  21 + 55 = 89
  17 + 19 = 33

Two-shot:
  21 + 55 = 75
  17 + 19 = 36
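
The addition answers can be checked automatically against the true sums. The sketch below does this for the answers recorded in the output above. `true_sum` and `count_correct` are hypothetical helpers, and the `recorded` dictionary is copied by hand from that output.

```python
# Answers copied from the output above, keyed by setting and problem.
recorded = {
    "zero-shot": {"21 + 55": "21", "17 + 19": "17"},
    "one-shot": {"21 + 55": "89", "17 + 19": "33"},
    "two-shot": {"21 + 55": "75", "17 + 19": "36"},
}


def true_sum(problem):
    """Compute the correct answer for a problem written as 'a + b'."""
    left, right = problem.split("+")
    return int(left) + int(right)


def count_correct(answers):
    """Count the problems whose recorded answer equals the true sum."""
    return sum(answer == str(true_sum(problem)) for problem, answer in answers.items())


scores = {setting: count_correct(answers) for setting, answers in recorded.items()}
for setting, correct in scores.items():
    print(f"{setting}: {correct}/2 correct")
```

On this run only the two-shot answer to 17 + 19 matches the true sum. With two test cases per setting, and with sampling enabled, these counts are too small to support general conclusions about the model.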
