# Prompt stuffing example

This notebook demonstrates **prompt stuffing**: injecting a large block of context directly into the prompt and observing how it changes the model's answer.

We ask the **same question** in two scenarios:

1. Without any background context.
2. With a long article stuffed into the prompt.

You will see how the added context grounds the answer. Naive stuffing does not scale, however: the whole article is resent with every request and has to fit inside the model's context window, which is why more structured context engineering techniques (RAG, memory, state, etc.) exist.

## Setup: API key and client

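If the `OPENAI_API_KEY` environment variable is not already set, this cell prompts for your OpenAI API key. The key is read with `getpass`, so it is not echoed and is not stored in the notebook. We then create a client using the `openai` Python library and pick the model used in every call.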

In [1]:
import os
from getpass import getpass
from textwrap import dedent
from openai import OpenAI

# Ask for the API key interactively if not defined as env variable
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")

client = OpenAI()
MODEL = "gpt-4o-mini"

Enter your OpenAI API key: ··········


## Define the stuffed article and helper functions

The article below simulates an internal note about context engineering. We then define two helper functions:

* `ask_question` — asks the question with no background article.
* `ask_with_stuffed_context` — asks the same question but includes the article in the prompt (prompt stuffing).

In [10]:
ARTICLE = dedent(
    """
    [Article: Internal note on context engineering]

    Context engineering is the discipline of shaping and managing all the
    information that an AI model receives at inference time. Instead of
    treating a prompt as a single flat string, context engineering views
    the model's input as a deliberately assembled mix of complementary
    components: system instructions, user request, external knowledge,
    tools, memory, and state.

    The first challenge is relevance. Because the context window is
    limited, engineers must decide which pieces of information are
    genuinely useful for solving the current task and which are noise.
    Overstuffing the prompt with loosely related details can dilute the
    model's attention and lead to generic or confused answers.

    The second challenge is freshness. The model's parametric memory is
    frozen at training time, so external knowledge and memory pipelines
    must continuously feed updated information into the context window.
    This includes recent documents, user preferences, and live signals
    from tools or sensors.

    The third challenge is structure. Raw text is often messy; effective
    context engineering requires predictable patterns such as templates,
    sections, and schemas. Well-structured context makes it easier for
    the model to understand roles, goals, constraints, and how retrieved
    evidence should be used.

    In short, context engineering is about giving the model exactly the
    right information, in the right format, at the right moment—no more
    and no less.
    """
).strip()


def ask_question(question: str) -> str:
    """Send the question to the model with no background article."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a careful assistant. "
                    "If you do not have enough information, say "
                    "\"I don't know based on the context I have.\""
                ),
            },
            {
                "role": "user",
                "content": question,
            },
        ],
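        temperature=0.3,  # low temperature keeps answers more consistent between runs
    )
    # The SDK types message.content as optional, so fall back to an empty string.
    return response.choices[0].message.content or ""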


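def ask_with_stuffed_context(article: str, question: str) -> str:
    """Ask the question with the full article stuffed into the user message."""
    # Dedent the template BEFORE filling it in. ARTICLE is already dedented,
    # so interpolating it first would leave dedent() with no common
    # indentation to remove, and the prompt would keep its leading spaces.
    template = dedent(
        """
        I will give you an internal article as context. Use ONLY that
        article to answer the question. If something is not supported
        by the article, say you don't know.

        === Article begins ===
        {article}
        === Article ends ===

        Question:
        {question}
        """
    ).strip()
    stuffed_content = template.format(article=article, question=question)

    return ask_question(stuffed_content)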
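
The stuffing helper sends the entire article with every question, so prompt size grows linearly with the amount of stuffed text while the context window stays fixed. The sketch below works through that arithmetic. It rests on assumptions: the ratio of 4 characters per token is a rule of thumb rather than a real tokenizer count, and `estimate_tokens`, `max_documents_that_fit`, the sample document, and the 8,000-token budget are hypothetical names and values chosen only for illustration.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; 4 characters per token is an assumed heuristic."""
    return max(1, round(len(text) / chars_per_token))


def max_documents_that_fit(doc: str, question: str, budget_tokens: int) -> int:
    """Count how many documents of this size fit next to the question in the budget."""
    remaining = budget_tokens - estimate_tokens(question)
    if remaining <= 0:
        return 0
    return remaining // estimate_tokens(doc)


# Hypothetical inputs: a 2,000-character document and an 8,000-token budget.
sample_doc = "x" * 2000
sample_question = "What are the three main challenges?"

print(estimate_tokens(sample_doc))  # 500
print(max_documents_that_fit(sample_doc, sample_question, budget_tokens=8000))  # 15
```

Under these assumptions only about fifteen such documents fit before the budget is exhausted, and each of them is paid for again on every request. For real measurements, a tokenizer matched to the model would replace the character heuristic.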

## Run both scenarios

We now ask the same question twice:

1. Without any article.
2. With the article stuffed into the prompt.

Compare the outputs to see how the background context changes the model's answer.
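
Reading the two answers side by side works for a single question. A small keyword check can make the comparison repeatable. The sketch below is an illustration only: `challenges_mentioned` is a hypothetical helper, and the two sample strings stand in for model outputs rather than reproducing a recorded run.

```python
CHALLENGES = ("relevance", "freshness", "structure")


def challenges_mentioned(answer: str) -> list[str]:
    """Return the challenge names that appear in the answer, ignoring case."""
    lowered = answer.lower()
    return [name for name in CHALLENGES if name in lowered]


# Sample strings standing in for real model outputs (assumed for illustration).
no_context_answer = "I don't know based on the context I have."
stuffed_answer = "The challenges are **Relevance**, **Freshness**, and **Structure**."

print(challenges_mentioned(no_context_answer))  # []
print(challenges_mentioned(stuffed_answer))  # ['relevance', 'freshness', 'structure']
```

Substring matching is deliberately crude: it confirms that an answer names the challenges, not that it describes them correctly.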

In [11]:
question = (
    "According to the article, what are the three main challenges of "
    "context engineering, and how are they described?"
)

print("Scenario 1: No stuffed context\n")
answer1 = ask_question(question)
print(answer1)
print()

print("Scenario 2: Question with stuffed article\n")
answer2 = ask_with_stuffed_context(ARTICLE, question)
print(answer2)

Scenario 1: No stuffed context

I don't know based on the context I have.

Scenario 2: Question with stuffed article

The three main challenges of context engineering described in the article are:

1. **Relevance**: Engineers must determine which pieces of information are genuinely useful for solving the current task and which are noise. Overstuffing the prompt with loosely related details can dilute the model's attention and lead to generic or confused answers.

2. **Freshness**: Since the model's parametric memory is frozen at training time, there is a need for external knowledge and memory pipelines to continuously provide updated information into the context window. This includes recent documents, user preferences, and live signals from tools or sensors.

3. **Structure**: Effective context engineering requires predictable patterns such as templates, sections, and schemas, as raw text is often messy. Well-structured context helps the model understand roles, goals, constraints, and 