# Lecture 7: In-Context Learning and Prompting

Lecture 7 | CMU ANLP Fall 2025 | Instructor: Sean Welleck


This is a notebook for [CMU CS11-711 Advanced NLP](https://cmu-l3.github.io/anlp-fall2025/).

### Basic prompting

The simplest way to use a language model: provide a prompt `x` and sample a completion `y ~ p(y|x)`. The model treats the prompt as a prefix and
generates a continuation based on patterns learned during pretraining.

In [None]:
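from transformers import AutoTokenizer, AutoModelForCausalLM

# Base (pretrained-only) checkpoint. The name is kept in its own variable so
# that `model` always refers to the loaded model object.
model_name = "HuggingFaceTB/SmolLM2-360M"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)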

#### Make a prompt `x` and tokenize it

In [2]:
x = "When a dog sees a squirrel, it will usually"

inputs = tokenizer(x, return_tensors='pt')
inputs

{'input_ids': tensor([[ 2427,   253,  2767, 10413,   253, 27721,    28,   357,   523,  2007]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}

#### Generate a response

Here, generating means autoregressive sampling: the model repeatedly samples one token and appends it to the context.

```
context = x
for t in 1 .. max_new_tokens:
    Sample next token, y_t ~ p(y_t | context)
    Append y_t to context (stop early if y_t is the end-of-sequence token)
```
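The loop above can be sketched in runnable form with a toy next-token table standing in for the language model. Everything here (`TOY_MODEL`, `sample_next`, `generate`, the `<eos>` marker) is a hypothetical illustration; a real model conditions on the whole context rather than only the last token.

```python
import random

# Hypothetical toy "model": maps the last token to a next-token distribution.
TOY_MODEL = {
    "dog": {"barks": 0.6, "runs": 0.4},
    "barks": {"loudly": 0.7, "<eos>": 0.3},
    "runs": {"away": 0.8, "<eos>": 0.2},
    "loudly": {"<eos>": 1.0},
    "away": {"<eos>": 1.0},
}


def sample_next(context, rng):
    """Sample y_t ~ p(y_t | context) from the toy distribution."""
    dist = TOY_MODEL[context[-1]]
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(prompt_tokens, max_new_tokens, rng):
    """Autoregressive sampling: sample a token, append it, repeat."""
    context = list(prompt_tokens)
    for _ in range(max_new_tokens):
        y_t = sample_next(context, rng)
        context.append(y_t)
        if y_t == "<eos>":  # stop early at end-of-sequence
            break
    return context


rng = random.Random(0)
samples = [generate(["the", "dog"], max_new_tokens=5, rng=rng) for _ in range(3)]
for s in samples:
    print(" ".join(s))
```

Because each step is a random draw, repeated calls with the same prompt can give different continuations, which is why the cell below requests several return sequences.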

In [3]:
outputs = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=True,
    num_return_sequences=5,
    pad_token_id=tokenizer.eos_token_id
)

for i in range(5):
    print(f"===={i}====")
    print(tokenizer.decode(outputs[i]))

====0====
When a dog sees a squirrel, it will usually start wagging its tail or panting to attract it. Most dogs won’t attack a squirrel
====1====
When a dog sees a squirrel, it will usually take aim with its tail, which has an unusual reaction.

At the same time, its
====2====
When a dog sees a squirrel, it will usually run away instinctively. To help them, some dog lovers like to teach tricks. Once the
====3====
When a dog sees a squirrel, it will usually run away from or even chase it. The dog in question might be a dog that has already gotten
====4====
When a dog sees a squirrel, it will usually be excited and start barking. A cat's reaction to wild-animal is different; the cat would


### Instruction prompt ("zero shot")

Instead of just continuing text, we can prompt the model to perform a specific task by providing an instruction. This is "zero-shot" because we give no examples, only the task description. A base model has not been trained to follow instructions, so it must rely on whatever instruction-like patterns it saw during pretraining.

In [4]:
prompt_template = """Classify the sentence's sentiment as 'Positive' or 'Negative':
{sentence}
Classification:"""


sentences = [
    "I love advanced NLP!",
    "I didn't race well and lost :("
]

prompt = prompt_template.format(sentence=sentences[0])
print(prompt)


Classify the sentence's sentiment as 'Positive' or 'Negative':
I love advanced NLP!
Classification:


In [5]:
for sentence in sentences:
    print(f"\n=============")
    
    prompt = prompt_template.format(sentence=sentence)
    inputs = tokenizer(prompt, return_tensors='pt')
    outputs = model.generate(
        **inputs,
        max_new_tokens=20,
        do_sample=True,
        num_return_sequences=5,
        pad_token_id=tokenizer.eos_token_id
    )

    for i in range(5):
        print(f"----{i}----")
        print(tokenizer.decode(outputs[i]))


----0----
Classify the sentence's sentiment as 'Positive' or 'Negative':
I love advanced NLP!
Classification: Positive

### Negative Sentences:

Use the `NegativeSentences()` method to create
----1----
Classify the sentence's sentiment as 'Positive' or 'Negative':
I love advanced NLP!
Classification:

4. Use Word2Vec to compute the vector representations of positive and negative sentiments:
The
----2----
Classify the sentence's sentiment as 'Positive' or 'Negative':
I love advanced NLP!
Classification: Positive (+)

#### 5.2.4.3 Using LDA to Discrim
----3----
Classify the sentence's sentiment as 'Positive' or 'Negative':
I love advanced NLP!
Classification: Positive

Sentence classification includes two major steps - feature extraction and feature selection. Each task requires
----4----
Classify the sentence's sentiment as 'Positive' or 'Negative':
I love advanced NLP!
Classification: 0/5 ***
You can see the sentence's sentiment is "Positive"

What

----0----
Classify the sentence's s

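The samples above show why output formatting matters: the base model sometimes emits the label, but it then keeps generating unrelated text, and in some samples it never produces a label at all.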
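A minimal sketch of checking the format programmatically, assuming we only accept a label that directly follows `Classification:` on the same line. The helper name `extract_label` is hypothetical, and the strings are abbreviated from the samples above.

```python
import re


def extract_label(completion):
    """Return 'Positive' or 'Negative' if one directly follows
    'Classification:' on the same line, otherwise None."""
    match = re.search(r"Classification:[ \t]*(Positive|Negative)\b", completion)
    return match.group(1) if match else None


# Abbreviated from the sampled outputs above.
outputs = [
    "Classification: Positive\n\n### Negative Sentences:",
    "Classification:\n\n4. Use Word2Vec to compute",
    "Classification: Positive (+)",
    "Classification: 0/5 ***\nYou can see the sentence's sentiment is \"Positive\"",
]
labels = [extract_label(o) for o in outputs]
print(labels)
```

Samples that return `None` can be counted as formatting failures or resampled, depending on the application.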

### Instruction + examples ("few-shot")

We can provide examples of input-output pairs before the test input. This "few-shot" or "in-context learning" approach helps the model understand the task format and expected outputs without any parameter updates.
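A few-shot prompt of this kind can also be assembled from a list of labeled examples rather than written by hand. The sketch below is one way to do it; `INSTRUCTION` and `build_few_shot_prompt` are hypothetical names, and the layout mirrors the template used in this section.

```python
INSTRUCTION = "Classify the sentence's sentiment as 'Positive' or 'Negative'. Examples:"


def build_few_shot_prompt(examples, sentence):
    """Join the instruction, the labeled examples, and the unlabeled test input."""
    blocks = [INSTRUCTION]
    for text, label in examples:
        blocks.append(f"Sentence:\n{text}\nClassification:\n{label}")
    # The test input ends right after 'Classification:' so the model fills in the label.
    blocks.append(f"Sentence:\n{sentence}\nClassification:\n")
    return "\n\n".join(blocks)


examples = [
    ("This is such a cool lecture!", "Positive"),
    ("I really don't like the last scene.", "Negative"),
]
prompt = build_few_shot_prompt(examples, "I love advanced NLP!")
print(prompt)
```

Keeping the examples in a list makes it easy to vary their number or order, both of which can change the model's behavior.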

In [6]:
prompt_template = """Classify the sentence's sentiment as 'Positive' or 'Negative'. Examples:

Sentence:
This is such a cool lecture!
Classification:
Positive

Sentence:
I really don't like the last scene.
Classification:
Negative

Sentence:
{sentence}
Classification:
"""



In [7]:
for sentence in sentences:
    print(f"\n=============")
    prompt = prompt_template.format(sentence=sentence)
    inputs = tokenizer(prompt, return_tensors='pt')
    outputs = model.generate(
        **inputs,
        max_new_tokens=20,
        do_sample=True,
        num_return_sequences=5,
        stop_strings=["\n\n"],
        tokenizer=tokenizer,
        pad_token_id=tokenizer.eos_token_id
    )

    for i in range(5):
        print(f"----{i}----")
        print(tokenizer.decode(outputs[i]))


----0----
Classify the sentence's sentiment as 'Positive' or 'Negative'. Examples:

Sentence:
This is such a cool lecture!
Classification:
Positive

Sentence:
I really don't like the last scene.
Classification:
Negative

Sentence:
I love advanced NLP!
Classification:
Positive
```


----1----
Classify the sentence's sentiment as 'Positive' or 'Negative'. Examples:

Sentence:
This is such a cool lecture!
Classification:
Positive

Sentence:
I really don't like the last scene.
Classification:
Negative

Sentence:
I love advanced NLP!
Classification:
Positive

<|endoftext|><|endoftext|>
----2----
Classify the sentence's sentiment as 'Positive' or 'Negative'. Examples:

Sentence:
This is such a cool lecture!
Classification:
Positive

Sentence:
I really don't like the last scene.
Classification:
Negative

Sentence:
I love advanced NLP!
Classification:
Positive

<|endoftext|><|endoftext|>
----3----
Classify the sentence's sentiment as 'Positive' or 'Negative'. Examples:

Sentence:
This is such
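Each decoded sample above repeats the whole prompt before the label. A minimal sketch of recovering only the completion, working on plain strings; the helper `completion_only` and its defaults are illustrative assumptions, not part of `transformers`.

```python
def completion_only(decoded, prompt, stop="\n\n", special_tokens=("<|endoftext|>",)):
    """Strip the prompt prefix, special tokens, and anything after the stop string."""
    if not decoded.startswith(prompt):
        raise ValueError("decoded text does not start with the prompt")
    text = decoded[len(prompt):]
    for tok in special_tokens:
        text = text.replace(tok, "")
    head, _, _ = text.partition(stop)
    return head.strip()


# Tail of the few-shot prompt and one decoded sample, abbreviated from above.
prompt = "Sentence:\nI love advanced NLP!\nClassification:\n"
decoded = prompt + "Positive\n\n<|endoftext|><|endoftext|>"

completion = completion_only(decoded, prompt)
print(completion)
```

At the tensor level, the same effect is usually obtained by slicing off the prompt tokens before decoding and passing `skip_special_tokens=True` to `tokenizer.decode`.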

### Chat templates

Some models have been fine-tuned to operate as chat assistants. A chat is represented as a list of messages, each with a role and content, which a *chat template* turns into a single string using special tokens such as `<|im_start|>` and `<|im_end|>`. The string can also include a *system message* that provides instructions about how the model should behave.

These models are often called "instruct" models because they've been trained to follow instructions rather than just complete text.

In [8]:
from transformers import AutoTokenizer, AutoModelForCausalLM

# Instruction-tuned (chat) variant of the same model.
model_name = "HuggingFaceTB/SmolLM2-360M-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

In [9]:
messages = [{
    "role": "user", 
    "content": "What is the capital of France."
}]

input_text = tokenizer.apply_chat_template(messages, tokenize=False)
print("Input text: ", input_text, sep="\n")


Input text: 
<|im_start|>system
You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>
<|im_start|>user
What is the capital of France.<|im_end|>
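The printed string suggests what the template does: it wraps each message in `<|im_start|>role ... <|im_end|>` and prepends a default system message when none is given. The sketch below is a simplified stand-in inferred from that output, not the tokenizer's actual template; `render_chat` and `DEFAULT_SYSTEM` are hypothetical names. Its `add_generation_prompt` argument mirrors the option of the same name on `apply_chat_template`, which appends the assistant header so that generation starts inside the assistant turn.

```python
DEFAULT_SYSTEM = "You are a helpful AI assistant named SmolLM, trained by Hugging Face"


def render_chat(messages, add_generation_prompt=False):
    """Simplified stand-in for the chat template (inferred from the printed output)."""
    if not messages or messages[0]["role"] != "system":
        # No system message supplied: prepend the default one.
        messages = [{"role": "system", "content": DEFAULT_SYSTEM}] + list(messages)
    text = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        text += "<|im_start|>assistant\n"
    return text


messages = [{"role": "user", "content": "What is the capital of France."}]
rendered = render_chat(messages)
print(rendered)
print(render_chat(messages, add_generation_prompt=True))
```

The cells in this notebook omit the generation prompt, so the model has to produce the `<|im_start|>assistant` header itself before answering.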



#### Generate a response

In [10]:
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True)
print(tokenizer.decode(outputs[0]))

<|im_start|>system
You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>
<|im_start|>user
What is the capital of France.<|im_end|>
<|im_start|>assistant
The capital of France is Paris.<|im_end|>


### System prompts

Chat models often support a system message that sets the model's behavior or role. This message is typically prepended to the conversation and instructs the model how to respond throughout the interaction. For example, we can use the system prompt to have the model respond in French.

In [11]:
messages = [
    {
        "role": "system",
        "content": "You are an assistant that speaks in French."
    },
    {
        "role": "user", 
        "content": "What is the capital of France."
    }
]

input_text = tokenizer.apply_chat_template(messages, tokenize=False)
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True)
print(tokenizer.decode(outputs[0]))

<|im_start|>system
You are an assistant that speaks in French.<|im_end|>
<|im_start|>user
What is the capital of France.<|im_end|>
<|im_start|>assistant
Leparisien (Paris) et leleilous (Brest).<|im_end|>


### Instruction ("zero shot")

We can also use the chat format for zero-shot tasks. The instruction-tuned model may follow instructions more reliably than the base model, though its output formatting can still be inconsistent.

In [12]:
messages = [
    {
        "role": "user", 
        "content": ("Classify the sentence's sentiment as 'Positive' or 'Negative':\n" +
                    "Sentence: 'I love advanced NLP!'\n")
    }
]

input_text = tokenizer.apply_chat_template(messages, tokenize=False)
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, num_return_sequences=3)
for i in range(len(outputs)):
    print(f"----{i}----")
    print(tokenizer.decode(outputs[i]))

----0----
<|im_start|>system
You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>
<|im_start|>user
Classify the sentence's sentiment as 'Positive' or 'Negative':
Sentence: 'I love advanced NLP!'
<|im_end|>
<|im_start|>assistant
Positive<|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|>
----1----
<|im_start|>system
You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>
<|im_start|>user
Classify the sentence's sentiment as 'Positive' or 'Negative':
Sentence: 'I love advanced NLP!'
<|im_end|>
<|im_start|>assistant
Sentiment Analysis: Classify the sentiment as 'Positive', given that the sentence expresses sentiment as 'I love advanced NLP!'.<|im_end|>
----2----
<|im_start|>system
You are a helpful AI assistant named SmolLM, trained by

#### Approach 1: write a detailed instruction (in the user or system prompt)

We can improve output formatting by providing explicit, detailed instructions about the desired format in either the system or user message.

In [21]:
messages = [
    {
        "role": "system",
        "content": """You are an expert sentiment classifier.
Your task is to classify a sentence's sentiment as 'Positive' or 'Negative'.
The user will provide you with a sentence.
Format your output as:

Classification: Positive or Negative
"""
    },
    {
        "role": "user",
        "content": "I love advanced NLP!"
    }
]

input_text = tokenizer.apply_chat_template(messages, tokenize=False)
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, num_return_sequences=3)
for i in range(len(outputs)):
    print(f"----{i}----")
    print(tokenizer.decode(outputs[i]))

----0----
<|im_start|>system
You are an expert sentiment classifier.
Your task is to classify a sentence's sentiment as 'Positive' or 'Negative'.
The user will provide you with a sentence.
Format your output as:

Classification: Positive or Negative
<|im_end|>
<|im_start|>user
I love advanced NLP!<|im_end|>
<|im_start|>assistant
Classification: Positive<|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|><|im_end|>
----1----
<|im_start|>system
You are an expert sentiment classifier.
Your task is to classify a sentence's sentiment as 'Positive' or 'Negative'.
The user will provide you with a sentence.
Format your output as:

Classification: Positive or Negative
<|im_end|>
<|im_start|>user
I love

#### Approach 2: provide examples (either in the system prompt or as a sequence of messages)

We can show the model the desired behavior through example conversations, where the assistant demonstrates the correct format and task execution. This is analogous to the few-shot examples we saw earlier, but using the chat format.
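The alternating user and assistant turns can be generated from a list of labeled examples. This is a minimal sketch; `INSTRUCTION`, `user_turn`, and `build_few_shot_messages` are hypothetical helper names, and the message wording follows the prompts used in this section.

```python
INSTRUCTION = "Classify the sentence's sentiment as 'Positive' or 'Negative':"


def user_turn(sentence):
    """One user message: the instruction followed by the quoted sentence."""
    return {"role": "user", "content": f"{INSTRUCTION}\nSentence: '{sentence}'"}


def build_few_shot_messages(examples, sentence):
    """Each example becomes a user turn plus an assistant turn showing the format."""
    messages = []
    for text, label in examples:
        messages.append(user_turn(text))
        messages.append({"role": "assistant", "content": f"Classification: {label}"})
    messages.append(user_turn(sentence))  # the test input, left for the model to answer
    return messages


examples = [
    ("This is such a cool lecture!", "Positive"),
    ("I really don't like the last scene.", "Negative"),
]
messages = build_few_shot_messages(examples, "I love advanced NLP!")
for m in messages:
    print(m["role"], "|", m["content"].replace("\n", " / "))
```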

In [14]:
messages = [
    {
        "role": "user",
        "content": "Classify the sentence's sentiment as 'Positive' or 'Negative':\nSentence: 'This is such a cool lecture!'"
    },
    {
        "role": "assistant",
        "content": "Classification: Positive"
    },
    {
        "role": "user",
        "content": "Classify the sentence's sentiment as 'Positive' or 'Negative':\nSentence: 'I really don't like the last scene.'"
    },
    {
        "role": "assistant",
        "content": "Classification: Negative"
    },
    {
        "role": "user",
        "content": "Classify the sentence's sentiment as 'Positive' or 'Negative':\nSentence: 'I love advanced NLP!'"
    }
]

input_text = tokenizer.apply_chat_template(messages, tokenize=False)
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, num_return_sequences=3)
for i in range(len(outputs)):
    print(f"----{i}----")
    print(tokenizer.decode(outputs[i]))

----0----
<|im_start|>system
You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>
<|im_start|>user
Classify the sentence's sentiment as 'Positive' or 'Negative':
Sentence: 'This is such a cool lecture!'<|im_end|>
<|im_start|>assistant
Classification: Positive<|im_end|>
<|im_start|>user
Classify the sentence's sentiment as 'Positive' or 'Negative':
Sentence: 'I really don't like the last scene.'<|im_end|>
<|im_start|>assistant
Classification: Negative<|im_end|>
<|im_start|>user
Classify the sentence's sentiment as 'Positive' or 'Negative':
Sentence: 'I love advanced NLP!'<|im_end|>
<|im_start|>assistant
Classification: Positive<|im_end|>
----1----
<|im_start|>system
You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>
<|im_start|>user
Classify the sentence's sentiment as 'Positive' or 'Negative':
Sentence: 'This is such a cool lecture!'<|im_end|>
<|im_start|>assistant
Classification: Positive<|im_end|>
<|im_start|>user
Classify the sen

### Chain-of-thought with base model

Prompting the model to "think step by step" can elicit intermediate reasoning steps before the final answer. Even base models can exhibit this behavior when prompted appropriately, though the reasoning may be flawed.

In [15]:
# Switch back to the base (non-instruct) checkpoint.
model_name = "HuggingFaceTB/SmolLM2-360M"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

In [16]:
prompts = [
    """Q: On average Joe throws 25 punches per minute. 
A fight lasts 5 rounds of 3 minutes. 
How many punches did he throw?
A: Let's think step by step.""",
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs, 
        pad_token_id=tokenizer.eos_token_id,
        max_new_tokens=512,
        temperature=0.4,
        do_sample=True
    )
    print(tokenizer.decode(outputs[0]))
    print("====")

Q: On average Joe throws 25 punches per minute. 
A fight lasts 5 rounds of 3 minutes. 
How many punches did he throw?
A: Let's think step by step.
First, we need to find the number of rounds.
We know that 1 round = 3 minutes, so we can write this as:
1 round = 3 minutes
Now, we can find the number of minutes in a round.
We know that 1 minute = 60 seconds, so we can write this as:
1 minute = 60 seconds
Now, we can find the number of seconds in a minute.
We know that 1 second = 60 minutes, so we can write this as:
1 second = 60 minutes
Now, we can find the number of seconds in a round.
We know that 1 round = 3 minutes, so we can write this as:
1 round = 3 minutes
Now, we can find the number of seconds in a minute.
We know that 1 minute = 60 seconds, so we can write this as:
1 minute = 60 seconds
Now, we can find the number of seconds in a round.
We know that 1 round = 3 minutes, so we can write this as:
1 round = 3 minutes
Now, we can find the number of seconds in a minute.
We know that 

### Chain-of-thought with instruct model

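Instruction-tuned models will often write out intermediate steps when asked to solve a problem, even without an explicit "think step by step" cue, as they've typically been trained to respond this way. As the output below shows, the steps can look well-formed while the final answer is still wrong.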

In [17]:
# Load the instruction-tuned checkpoint again.
model_name = "HuggingFaceTB/SmolLM2-360M-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

In [18]:
messages = [
    {
        "role": "user", 
        "content": """Solve the problem:
On average Joe throws 25 punches per minute. 
A fight lasts 5 rounds of 3 minutes. 
How many punches did he throw?"""
    }
]

input_text = tokenizer.apply_chat_template(messages, tokenize=False)
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.4, do_sample=True)
print(tokenizer.decode(outputs[0]))

<|im_start|>system
You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>
<|im_start|>user
Solve the problem:
On average Joe throws 25 punches per minute. 
A fight lasts 5 rounds of 3 minutes. 
How many punches did he throw?<|im_end|>
<|im_start|>assistant
To solve this problem, we need to calculate the total number of punches Joe threw in the fight.

First, we need to find out how many rounds there are in the fight. Since the fight lasts 5 rounds of 3 minutes each, we can calculate the total number of minutes in the fight.

There are 3 minutes in each round, so the total number of minutes in 5 rounds is:
3 minutes x 5 rounds = 15 minutes

Now, we can calculate the total number of punches Joe threw in the fight:
Total number of punches = Total number of minutes in the fight
Total number of punches = 15 minutes

Therefore, Joe threw 15 punches in the fight.<|im_end|>
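The sampled response computes the fight length correctly (15 minutes) but then drops the multiplication by the punch rate. For reference, the calculation the problem calls for is short enough to write out; the variable names below are only illustrative.

```python
punches_per_minute = 25
rounds = 5
minutes_per_round = 3

total_minutes = rounds * minutes_per_round          # 5 * 3 = 15
total_punches = punches_per_minute * total_minutes  # 25 * 15 = 375
print(total_punches)
```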


### Program-aided reasoning

Instead of reasoning in natural language, we can prompt the model to write a program that computes the answer. Running that program delegates the arithmetic to the interpreter, which computes exactly; the model still has to translate the problem into code correctly.

In [19]:
messages = [
    {
        "role": "user", 
        "content": """Solve the problem by writing a Python program:
On average Joe throws 25 punches per minute. 
A fight lasts 5 rounds of 3 minutes. 
How many punches did he throw?"""
    }
]

input_text = tokenizer.apply_chat_template(messages, tokenize=False)
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.4, do_sample=True)
print(tokenizer.decode(outputs[0]))

<|im_start|>system
You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>
<|im_start|>user
Solve the problem by writing a Python program:
On average Joe throws 25 punches per minute. 
A fight lasts 5 rounds of 3 minutes. 
How many punches did he throw?<|im_end|>
<|im_start|>assistant
Here's a Python program that solves the problem:

```python
def calculate_punches(minutes_per_punch):
    punch_time = 5  # 5 rounds x 3 minutes per round = 15 minutes
    punch_per_minute = 25  # Joe throws 25 punches per minute
    total_punches = punch_time * punch_per_minute
    return total_punches

minutes_per_punch = 25
minutes_in_round = 15

total_punches = calculate_punches(minutes_per_punch)
print(f"Joe threw {total_punches} punches in one fight.")
```

This program takes in the average punching time (minutes per punch) and the number of rounds in a fight, and then calculates the total number of punches Joe threw. The result is then printed out.<|im_end|>
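The cell above only generates the program. A minimal sketch of the missing execution step is shown below, assuming the response contains one fenced Python block. The helpers `extract_python_block` and `run_program` are hypothetical, and calling `exec` on model output is unsafe outside a sandbox, so treat this as an illustration only.

```python
import contextlib
import io
import re

FENCE = "`" * 3  # built indirectly so this example can sit inside a fenced block

# The assistant's reply, copied from the sampled response above.
response = """Here's a Python program that solves the problem:

<FENCE>python
def calculate_punches(minutes_per_punch):
    punch_time = 5  # 5 rounds x 3 minutes per round = 15 minutes
    punch_per_minute = 25  # Joe throws 25 punches per minute
    total_punches = punch_time * punch_per_minute
    return total_punches

minutes_per_punch = 25
minutes_in_round = 15

total_punches = calculate_punches(minutes_per_punch)
print(f"Joe threw {total_punches} punches in one fight.")
<FENCE>
""".replace("<FENCE>", FENCE)


def extract_python_block(text):
    """Return the body of the first fenced python block, or None."""
    pattern = re.escape(FENCE) + r"python\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1) if match else None


def run_program(source):
    """Execute the source and return whatever it printed (unsafe without a sandbox)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


program = extract_python_block(response)
printed = run_program(program)
print(printed)
```

The program runs without error but prints 125 rather than 25 × 15 = 375, because it sets `punch_time = 5` instead of 15. The interpreter guarantees the arithmetic, not the logic of the generated program.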
