# GPT-5.2 Prompting Guide

This notebook covers prompting techniques specific to GPT-5.2, based on OpenAI's official cookbook.

## Key GPT-5.2 Traits

| Trait | Description |
|-------|-------------|
| Deliberate scaffolding | Constructs clearer intermediate plans by default |
| Lower verbosity | More concise, task-focused output |
| Stronger instruction adherence | Reduced drift from user intent |
| Conservative grounding | Favors correctness and explicit reasoning |

## Table of Contents

1. [Setup](#setup)
2. [Verbosity & Output Control](#verbosity)
3. [Scope Discipline](#scope)
4. [Long-Context Handling](#long-context)
5. [Hallucination Mitigation](#hallucination)
6. [Structured Extraction](#extraction)
7. [Tool-Calling Best Practices](#tools)
8. [Context Compaction](#compaction)
9. [Model Migration](#migration)

<a id='setup'></a>
## 1. Setup

In [1]:
import os
import getpass
import json
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

client = OpenAI()
print("OpenAI client initialized")

OpenAI client initialized


<a id='verbosity'></a>
## 2. Verbosity & Output Control

GPT-5.2 responds well to explicit output shape constraints. For enterprise/coding agents, define:
- Default response length
- Format for simple vs. complex answers
- Preference for lists/bullets over narrative

In [2]:
# Define a reusable verbosity specification
VERBOSITY_SPEC = """
<output_verbosity_spec>
- Default: 3-6 sentences or ≤5 bullets for typical answers.
- For yes/no questions: ≤2 sentences.
- For complex multi-step tasks:
  - 1 short overview paragraph
  - then ≤5 bullets: What changed, Where, Risks, Next steps, Open questions.
- Prefer compact bullets over long narrative paragraphs.
- Do not rephrase the user's request unless semantics change.
</output_verbosity_spec>
"""

print("Verbosity specification defined")

Verbosity specification defined


In [3]:
# Compare responses with and without verbosity spec
print("Comparing output with verbosity control")
print("-" * 60)

question = "Explain the differences between REST and GraphQL APIs."

# Without verbosity spec
response_default = client.responses.create(
    model="gpt-5.2",
    reasoning={"effort": "low"},
    input=[{"role": "user", "content": question}]
)

# With verbosity spec
response_controlled = client.responses.create(
    model="gpt-5.2",
    reasoning={"effort": "low"},
    input=[
        {"role": "developer", "content": VERBOSITY_SPEC},
        {"role": "user", "content": question}
    ]
)

print("DEFAULT OUTPUT:")
print(response_default.output_text)
print(f"\nTokens: {response_default.usage.output_tokens}")
print("\n" + "=" * 60)
print("\nWITH VERBOSITY SPEC:")
print(response_controlled.output_text)
print(f"\nTokens: {response_controlled.usage.output_tokens}")

Comparing output with verbosity control
------------------------------------------------------------
DEFAULT OUTPUT:
REST and GraphQL are two different approaches to designing web APIs. The main differences are about how you model data, fetch it, and evolve the API over time.

## 1) Interface and endpoints
- **REST:** Many endpoints, typically organized around resources (e.g., `GET /users/123`, `GET /users/123/orders`).
- **GraphQL:** Usually a **single endpoint** (e.g., `POST /graphql`) where the client specifies what it needs in the request body.

## 2) Data fetching: over-fetching vs under-fetching
- **REST:** Responses are fixed per endpoint. Clients often:
  - **Over-fetch** (get more fields than needed), or
  - **Under-fetch** (need multiple requests to assemble a view).
- **GraphQL:** Clients request **exactly the fields** they need in one query, which can reduce both over- and under-fetching.

Example:
- REST: might require `GET /user/123` and `GET /user/123/orders`
- GraphQL: 

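The token counts printed above give one signal; a rough structural check is another way to see whether a reply stays inside the limits in `VERBOSITY_SPEC`. The sketch below is a hypothetical helper (`check_verbosity` is not part of the OpenAI SDK), and its bullet and sentence heuristics are approximations that would need tuning for real output.

```python
import re

# Matches lines that start with a Markdown bullet ("-", "*") or numbering ("1.", "2)").
BULLET = re.compile(r"^([-*]|\d+[.)])\s+")

def check_verbosity(text: str, max_bullets: int = 5, max_sentences: int = 6) -> dict:
    """Roughly measure a reply against the default limits in VERBOSITY_SPEC."""
    bullets, prose = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        (bullets if BULLET.match(line) else prose).append(line)
    # Split the remaining prose on whitespace that follows ".", "!" or "?".
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", " ".join(prose)) if s]
    return {
        "bullets": len(bullets),
        "sentences": len(sentences),
        "within_spec": len(bullets) <= max_bullets and len(sentences) <= max_sentences,
    }

short_reply = "REST exposes many resource endpoints. GraphQL exposes one endpoint."
long_reply = "\n".join(f"- point {i}" for i in range(8))

print(check_verbosity(short_reply))
print(check_verbosity(long_reply))
```

In the notebook, the same function could be applied to `response_default.output_text` and `response_controlled.output_text` to compare the two replies.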
<a id='scope'></a>
## 3. Scope Discipline

Prevent GPT-5.2 from adding unintended features or embellishments, especially in frontend/design work.

In [4]:
SCOPE_SPEC = """
<design_and_scope_constraints>
- Implement EXACTLY and ONLY what the user requests.
- No extra features, components, or UX embellishments.
- Do NOT invent colors, shadows, animations, or new UI elements.
- If any instruction is ambiguous, choose the simplest valid interpretation.
</design_and_scope_constraints>
"""

print("Scope discipline example")
print("-" * 60)

task = "Create a Python function that adds two numbers."

response = client.responses.create(
    model="gpt-5.2",
    reasoning={"effort": "none"},
    input=[
        {"role": "developer", "content": SCOPE_SPEC},
        {"role": "user", "content": task}
    ]
)

print(response.output_text)

Scope discipline example
------------------------------------------------------------
```python
def add_two_numbers(a, b):
    return a + b
```

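One rough way to spot scope creep in generated Python is to parse the reply and list what it defines at module level. The sketch below uses the standard `ast` module; `top_level_definitions` and `exceeds_scope` are hypothetical helper names, and the code assumes the Markdown fences have already been stripped from the model's reply.

```python
import ast

def top_level_definitions(source: str) -> list[str]:
    """Return the names of functions and classes defined at module level."""
    tree = ast.parse(source)
    return [
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]

def exceeds_scope(source: str, expected: set[str]) -> list[str]:
    """List any top-level definitions that were not asked for."""
    return [name for name in top_level_definitions(source) if name not in expected]

# The reply shown above, with fences removed.
generated = "def add_two_numbers(a, b):\n    return a + b\n"
print(exceeds_scope(generated, {"add_two_numbers"}))  # prints []

# An illustrative reply that adds an unrequested function.
padded = generated + "\ndef subtract_two_numbers(a, b):\n    return a - b\n"
print(exceeds_scope(padded, {"add_two_numbers"}))  # prints ['subtract_two_numbers']
```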

<a id='long-context'></a>
## 4. Long-Context Handling

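For inputs longer than about 10K tokens, have the model outline the relevant sections first, restate the user's constraints, and anchor its claims to specific sections. The example below uses a short document for brevity; the same spec applies to longer inputs.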

In [5]:
LONG_CONTEXT_SPEC = """
<long_context_handling>
- For lengthy inputs (multi-chapter docs, long threads, multiple files):
  1. First, produce a short internal outline of key sections relevant to the request.
  2. Re-state user's constraints explicitly (date range, product, team, etc.).
  3. Anchor claims to specific sections rather than speaking generically.
- If the answer depends on fine details (dates, thresholds, clauses), quote them.
</long_context_handling>
"""

print("Long-context handling pattern")
print("-" * 60)

# Simulating a document analysis task
document = """
=== CONTRACT SUMMARY ===
Effective Date: January 1, 2025
Parties: TechCorp Inc. and DataServices LLC
Term: 24 months
Auto-Renewal: Yes, 12-month periods
Termination Notice: 90 days written notice required
Liability Cap: $500,000
Governing Law: State of Delaware
"""

response = client.responses.create(
    model="gpt-5.2",
    reasoning={"effort": "medium"},
    input=[
        {"role": "developer", "content": LONG_CONTEXT_SPEC},
        {"role": "user", "content": f"Document:\n{document}\n\nQuestion: When can we terminate without penalty?"}
    ]
)

print(response.output_text)

Long-context handling pattern
------------------------------------------------------------
Based on the **Contract Summary** you provided, the only termination condition stated is:

- **“Termination Notice: 90 days written notice required”**

No separate **termination fee/penalty** is mentioned in the summary, so **the only “without penalty” path we can safely infer from what you shared is terminating by giving the required 90 days’ written notice** (subject to anything else that may exist in the full contract).

### To avoid auto-renewal / terminate at the end of the initial term
- **Effective Date:** January 1, 2025  
- **Term:** 24 months → runs through **December 31, 2026**
- To end it at that point, you must deliver written notice **at least 90 days before 12/31/2026**, i.e. **by October 2, 2026**.

### For later renewals
Because it **auto-renews in 12-month periods**, you’d need to give notice **90 days before the end of any renewal period** (e.g., to end on **12/31/2027**, notic

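Since the guidance targets inputs above roughly 10K tokens, one option is to attach the long-context spec only when a document crosses that size. The sketch below is a minimal illustration: `estimate_tokens` uses a crude four-characters-per-token assumption rather than a real tokenizer, and `build_input` is a hypothetical helper.

```python
LONG_CONTEXT_THRESHOLD_TOKENS = 10_000  # threshold taken from the guidance in this section
CHARS_PER_TOKEN = 4  # assumption: rough average for English text, not a tokenizer

# Stand-in for the LONG_CONTEXT_SPEC string defined in this section.
LONG_CONTEXT_SPEC = "<long_context_handling>...</long_context_handling>"

def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return len(text) // CHARS_PER_TOKEN

def build_input(document: str, question: str, long_context_spec: str) -> list[dict]:
    """Build the message list, adding the spec only for long documents."""
    messages = []
    if estimate_tokens(document) > LONG_CONTEXT_THRESHOLD_TOKENS:
        messages.append({"role": "developer", "content": long_context_spec})
    messages.append(
        {"role": "user", "content": f"Document:\n{document}\n\nQuestion: {question}"}
    )
    return messages

short_messages = build_input("Term: 24 months", "How long is the term?", LONG_CONTEXT_SPEC)
long_messages = build_input("x" * 50_000, "How long is the term?", LONG_CONTEXT_SPEC)
print([m["role"] for m in short_messages])  # prints ['user']
print([m["role"] for m in long_messages])   # prints ['developer', 'user']
```

The returned list has the same shape as the `input` argument used in the cell above.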
<a id='hallucination'></a>
## 5. Hallucination Mitigation

Configure prompts to handle uncertain or underspecified queries safely.

In [6]:
UNCERTAINTY_SPEC = """
<uncertainty_and_ambiguity>
- If the question is ambiguous or underspecified:
  - Ask 1-3 precise clarifying questions, OR
  - Present 2-3 plausible interpretations with labeled assumptions.
- When external facts may have changed recently:
  - Answer in general terms and state that details may have changed.
- Never fabricate exact figures, line numbers, or external references when uncertain.
- When unsure, prefer language like "Based on the provided context..."
</uncertainty_and_ambiguity>
"""

# Optional high-risk self-check for legal/financial contexts (defined for reference; not passed to the call below)
HIGH_RISK_SPEC = """
<high_risk_self_check>
Before finalizing answers in legal, financial, compliance, or safety contexts:
- Re-scan your answer for unstated assumptions
- Check for specific numbers or claims not grounded in context
- Identify overly strong language ("always," "guaranteed," etc.)
- If found, soften or qualify them and explicitly state assumptions.
</high_risk_self_check>
"""

print("Hallucination mitigation example")
print("-" * 60)

response = client.responses.create(
    model="gpt-5.2",
    reasoning={"effort": "medium"},
    input=[
        {"role": "developer", "content": UNCERTAINTY_SPEC},
        {"role": "user", "content": "What's the exact revenue of Acme Corp last quarter?"}
    ]
)

print(response.output_text)

Hallucination mitigation example
------------------------------------------------------------
I can’t determine “Acme Corp’s exact revenue last quarter” from here without a specific source, because there are many companies that could be called “Acme Corp,” and revenue depends on the reporting entity and period.

1) Which Acme Corp do you mean (ticker/website/country)?  
2) What do you mean by “last quarter” (calendar Q4 2025, most recent fiscal quarter, or a specific quarter-end date)?  
3) Do you want the GAAP revenue figure from an earnings release/10‑Q, or another definition (e.g., net sales, operating revenue)?

If you share a link to the earnings report/filing (or paste the relevant table), I can extract the exact revenue number and cite where it appears.

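The call above passes only `UNCERTAINTY_SPEC`. A possible way to layer `HIGH_RISK_SPEC` on top for sensitive requests is sketched below. The `domain` argument and `build_developer_prompt` helper are hypothetical; how a request gets labeled with a domain is left outside this sketch, and the spec strings are abbreviated stand-ins for the ones defined in this section.

```python
from typing import Iterable, Optional

# Abbreviated stand-ins for the specs defined in this section.
UNCERTAINTY_SPEC = """
<uncertainty_and_ambiguity>
- Never fabricate exact figures when uncertain.
</uncertainty_and_ambiguity>
"""
HIGH_RISK_SPEC = """
<high_risk_self_check>
- Re-scan your answer for unstated assumptions.
</high_risk_self_check>
"""

# Domains named in HIGH_RISK_SPEC.
HIGH_RISK_DOMAINS = {"legal", "financial", "compliance", "safety"}

def build_developer_prompt(
    base_specs: Iterable[str], high_risk_spec: str, domain: Optional[str] = None
) -> str:
    """Join the base specs, appending the self-check only for high-risk domains."""
    specs = list(base_specs)
    if domain in HIGH_RISK_DOMAINS:
        specs.append(high_risk_spec)
    return "\n".join(spec.strip() for spec in specs)

print(build_developer_prompt([UNCERTAINTY_SPEC], HIGH_RISK_SPEC, domain="financial"))
```

The resulting string would go in the `developer` message in place of `UNCERTAINTY_SPEC` alone.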

<a id='extraction'></a>
## 6. Structured Extraction

For PDF/table extraction, provide explicit JSON schemas with required vs. optional field distinctions.

In [9]:
from pydantic import BaseModel, Field
from typing import Optional

class ContractExtraction(BaseModel):
    """Schema for contract data extraction"""
    party_name: str = Field(description="Name of the contracting party")
    jurisdiction: Optional[str] = Field(default=None, description="Legal jurisdiction if specified")
    effective_date: Optional[str] = Field(default=None, description="Contract effective date")
    termination_clause: Optional[str] = Field(default=None, description="Summary of termination terms")

print("Structured extraction example")
print("-" * 60)

contract_text = """
This Agreement is entered into by GlobalTech Solutions, effective March 15, 2025.
Either party may terminate with 60 days written notice.
"""

response = client.responses.parse(
    model="gpt-5.2",
    reasoning={"effort": "medium"},
    input=[
        {
            "role": "developer",
            "content": "Extract contract data. If a field is not present, set it to null."
        },
        {"role": "user", "content": f"Extract from:\n{contract_text}"}
    ],
    text_format=ContractExtraction
)

result = response.output_parsed
print(f"Party: {result.party_name}")
print(f"Effective Date: {result.effective_date}")
print(f"Jurisdiction: {result.jurisdiction}")
print(f"Termination: {result.termination_clause}")

Structured extraction example
------------------------------------------------------------
Party: GlobalTech Solutions
Effective Date: March 15, 2025
Jurisdiction: None
Termination: Either party may terminate with 60 days written notice.

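The schema above keeps `effective_date` as a free-form string, so downstream code may want to normalize it. The sketch below is one possible post-processing step using only the standard library. `normalize_date` and the list of accepted formats are assumptions for illustration; it returns `None` instead of guessing, in keeping with the null handling requested in the prompt.

```python
from datetime import date, datetime
from typing import Optional

# Assumption: the formats worth trying; extend as needed for your documents.
DATE_FORMATS = ("%B %d, %Y", "%Y-%m-%d")

def normalize_date(raw: Optional[str]) -> Optional[date]:
    """Parse an extracted date string; return None when absent or unparseable."""
    if not raw:
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue  # try the next format
    return None

print(normalize_date("March 15, 2025"))  # prints 2025-03-15
print(normalize_date(None))              # prints None
```

In the notebook this could be called as `normalize_date(result.effective_date)`. Note that `%B` matches English month names only under an English or default locale.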

<a id='tools'></a>
## 7. Tool-Calling Best Practices

GPT-5.2 tool usage guidelines:
- Prefer tools over internal knowledge for fresh/user-specific data
- Parallelize independent reads
- After write operations, restate what changed and where

In [24]:
TOOL_USAGE_SPEC = """
<tool_usage_rules>
- Prefer tools over internal knowledge when:
  - You need fresh or user-specific data (tickets, orders, configs, logs).
  - You reference specific IDs, URLs, or document titles.
- Parallelize independent reads when possible to reduce latency.
- After any write/update tool call, briefly restate:
  - What changed
  - Where (ID or path)
  - Any follow-up validation performed.
</tool_usage_rules>
"""

print("Tool-calling with GPT-5.2 (OpenAI function calling style)")
print("-" * 60)

# Note: each function tool must include a "name" key at the top level of the tool dictionary
tools = [
    {
        "type": "function",
        "name": "get_user_orders",  # Required top-level attribute
        "description": "Retrieve orders for a user by their ID",
        "parameters": {
            "type": "object",
            "properties": {
                "user_id": {
                    "type": "string",
                    "description": "The unique identifier for the user whose orders are to be retrieved"
                },
                "status": {
                    "type": "string",
                    "enum": ["pending", "shipped", "delivered"],
                    "description": "The status of orders to filter by. One of: pending, shipped, delivered."
                },
            },
            "required": ["user_id"]
        }
    }
]

response = client.responses.create(
    model="gpt-5.2",
    reasoning={"effort": "low"},
    tools=tools,
    input=[
        {"role": "developer", "content": TOOL_USAGE_SPEC},
        {"role": "user", "content": "What are the pending orders for user U12345?"}
    ]
)

# The model may answer directly or request a tool call. A tool request appears as an
# output item of type "function_call"; other items (such as reasoning) can appear
# alongside it, so select by type rather than by position.
tool_call = next((item for item in response.output if item.type == "function_call"), None)

if tool_call is None:
    print("No tool call requested. Direct answer:")
    print(response.output_text)
else:
    print("Parsed tool function call:")
    print(f"  Name: {tool_call.name}")
    print(f"  Call ID: {tool_call.call_id}")
    print(f"  Arguments: {tool_call.arguments}")

Tool-calling with GPT-5.2 (OpenAI function calling style)
------------------------------------------------------------
Parsed tool function call:
  Name: get_user_orders
  Call ID: call_RLSxNUN28ymoIkr0vmTUVRKJ
  Arguments: {"user_id":"U12345","status":"pending"}

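The cell above stops at parsing the tool call. A minimal sketch of the next step, running the function locally and packaging the result as a `function_call_output` item keyed by `call_id`, might look like the following. The `get_user_orders` body, its sample data, and the `TOOL_REGISTRY` and `run_tool_call` names are all made up for illustration; the simulated call object only mirrors the fields printed above.

```python
import json
from types import SimpleNamespace
from typing import Optional

def get_user_orders(user_id: str, status: Optional[str] = None) -> list[dict]:
    """Hypothetical stand-in for a real order lookup; the data is invented."""
    orders = [
        {"order_id": "O-1", "user_id": "U12345", "status": "pending"},
        {"order_id": "O-2", "user_id": "U12345", "status": "shipped"},
    ]
    return [
        order for order in orders
        if order["user_id"] == user_id and (status is None or order["status"] == status)
    ]

# Maps tool names from the `tools` list to local Python callables.
TOOL_REGISTRY = {"get_user_orders": get_user_orders}

def run_tool_call(tool_call) -> dict:
    """Execute a function_call item and wrap the result for the follow-up request."""
    handler = TOOL_REGISTRY[tool_call.name]
    arguments = json.loads(tool_call.arguments)  # arguments arrive as a JSON string
    result = handler(**arguments)
    return {
        "type": "function_call_output",
        "call_id": tool_call.call_id,
        "output": json.dumps(result),
    }

# Simulated tool call with the same fields as the parsed item (not a real API object).
fake_call = SimpleNamespace(
    name="get_user_orders",
    call_id="call_example",
    arguments='{"user_id":"U12345","status":"pending"}',
)
print(run_tool_call(fake_call))
```

The returned dictionary would typically be appended to the conversation input, along with the model's earlier output items, in a second `client.responses.create` call so the model can write its final answer.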

<a id='compaction'></a>
## 8. Context Compaction

For multi-step agent flows that approach the context limit, use the compaction endpoint to reduce token footprint while preserving essential information.

In [6]:
print("Context Compaction Example")
print("-" * 60)

# Step 1: Generate a response that we'll then compact
response = client.responses.create(
    model="gpt-5.2",
    input="write a very long poem about a dog."
)

print(f"Original response tokens: {response.usage.output_tokens}")

# Get the output as a dict for compaction
output_item = response.output[0].model_dump()

# Step 2: Compact the conversation
# Note: client.responses.compact() requires a recent openai SDK release; upgrade if the call below raises AttributeError
# The API endpoint is POST /v1/responses/compact
try:
    compacted_response = client.responses.compact(
        model="gpt-5.2",
        input=[
            {"role": "user", "content": "write a very long poem about a dog."},
            output_item
        ]
    )
    print("\nCompacted response:")
    print(json.dumps(compacted_response.model_dump(), indent=2))
except AttributeError:
    print("\nNote: client.responses.compact() not available in your SDK version.")
    print("Update with: pip install --upgrade openai")
    print("\nAlternatively, use the REST API directly:")
    print("POST https://api.openai.com/v1/responses/compact")

Context Compaction Example
------------------------------------------------------------
Original response tokens: 3388

Compacted response:
{
  "id": "resp_04d3722d907f4ab20169849e22bfdc819497e7c118202d9aeb",
  "created_at": 1770298924,
  "object": "response.compaction",
  "output": [
    {
      "id": "msg_04d3722d907f4ab20169849e22c71c8194916ca1ae2a09e9d9",
      "content": [
        {
          "annotations": null,
          "text": "write a very long poem about a dog.",
          "type": "input_text",
          "logprobs": null
        }
      ],
      "role": "user",
      "status": "completed",
      "type": "message"
    },
    {
      "id": "cmp_04d3722d907f4ab20169849e23757c8194aa2b9d37e21a61ad",
      "encrypted_content": "gAAAAABphJ4sT0QEPlhPLmgxgCO2IHQYmgB_PAqlPrRS9Lb9ISKVwTYR-h_FbGv7WmwoV9d4GQfzGrlHXrOLX9Etd_rIKyxwKA7FXTxWhFTGDcQOauBohYnW1Q-ZrahPf3F9vGybQtCvmnbR9U2B-M2cLMDiwbg7knUnUmVQ2NCCTqlFVZrdZ3QQFWG6G6wcOmCjiiuZ5H9wkdp8alVPmNa-KlcU5x76L9-quU4bJNS-5DRYrCNiXmVP1hpip96yu

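In an agent loop, compaction is usually triggered at a milestone rather than on every turn. The sketch below shows one possible trigger and a way to assemble the next request. Both helpers are hypothetical, the 80% threshold and the context limit are illustrative values rather than documented figures, and the sketch assumes the compacted `output` items are passed back unchanged as the start of the next `input`.

```python
def should_compact(used_tokens: int, context_limit: int, threshold: float = 0.8) -> bool:
    """Return True once usage reaches the chosen fraction of the context limit."""
    return used_tokens >= context_limit * threshold

def build_followup_input(compacted_output: list, user_message: str) -> list:
    """Start the next request from the compacted items, then add the new user turn."""
    return [*compacted_output, {"role": "user", "content": user_message}]

# Illustrative numbers only; substitute the real limit for your model.
print(should_compact(used_tokens=90_000, context_limit=100_000))  # prints True
print(should_compact(used_tokens=10_000, context_limit=100_000))  # prints False

# Opaque placeholder standing in for the items returned by the compaction call.
compacted_items = [{"placeholder": "compacted item"}]
print(build_followup_input(compacted_items, "Now write a shorter version."))
```

In the notebook, `used_tokens` could come from `response.usage.total_tokens` and `compacted_items` from `compacted_response.output`.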
<a id='migration'></a>
## 9. Model Migration Guide

When upgrading to GPT-5.2 from earlier models:

| Migrating from | Starting `reasoning_effort` | Notes |
|------|-----------------|-------|
| GPT-4o | `none` | Default to fast/low-deliberation |
| GPT-4.1 | `none` | Preserve snappy behavior |
| GPT-5 | Same as current setting | Maintain latency/quality profile |
| GPT-5.1 | Same as current setting | Adjust only after running evals |

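The table above can be encoded as a small lookup so that the starting effort is chosen consistently across services. This is a sketch under the table's assumptions; `starting_effort` is a hypothetical helper, and the lowercase model identifiers are used only as dictionary keys.

```python
from typing import Optional

# Sources with no prior reasoning setting start at "none" (per the table above).
DEFAULT_EFFORT_BY_SOURCE = {"gpt-4o": "none", "gpt-4.1": "none"}
# Sources that already have a reasoning setting carry it over unchanged.
CARRY_OVER_SOURCES = {"gpt-5", "gpt-5.1"}

def starting_effort(source_model: str, current_effort: Optional[str] = None) -> str:
    """Pick the initial reasoning effort for GPT-5.2 based on the prior model."""
    if source_model in DEFAULT_EFFORT_BY_SOURCE:
        return DEFAULT_EFFORT_BY_SOURCE[source_model]
    if source_model in CARRY_OVER_SOURCES:
        if current_effort is None:
            raise ValueError("current_effort is required when migrating from GPT-5 or GPT-5.1")
        return current_effort
    raise ValueError(f"No migration guidance for {source_model!r}")

print(starting_effort("gpt-4o"))              # prints none
print(starting_effort("gpt-5.1", "medium"))   # prints medium
```

The returned value would be passed as `reasoning={"effort": ...}` in the request.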
In [None]:
print("Migration Steps")
print("-" * 60)

migration_steps = """
1. Switch models without changing prompts
2. Explicitly set reasoning_effort to match prior model's profile
3. Run evaluation suite for baseline
4. Address regressions through targeted prompt tuning
5. Re-measure after each incremental change
"""

print(migration_steps)

# Example: Migrating from GPT-4o
print("\nExample: GPT-4o → GPT-5.2 migration")
print("-" * 60)

response = client.responses.create(
    model="gpt-5.2",
    reasoning={"effort": "none"},  # Match GPT-4o speed profile
    input=[{"role": "user", "content": "Classify this as spam or not spam: 'You won $1000000!'"}]
)

print(f"Result: {response.output_text}")
print(f"Total tokens: {response.usage.total_tokens}")

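Steps 3 to 5 of the migration list depend on having an evaluation to re-run. A minimal sketch of such a harness is shown below. The two classifiers are keyword stubs standing in for calls to the old and new model, and the two test cases are illustrative rather than a real evaluation set; `accuracy` and `regressions` are hypothetical helper names.

```python
from typing import Callable

Case = tuple[str, str]  # (input text, expected label)

def accuracy(classify: Callable[[str], str], cases: list[Case]) -> float:
    """Fraction of cases where the classifier returns the expected label."""
    correct = sum(1 for text, expected in cases if classify(text).strip().lower() == expected)
    return correct / len(cases)

def regressions(
    baseline: Callable[[str], str], candidate: Callable[[str], str], cases: list[Case]
) -> list[str]:
    """Inputs the baseline got right but the candidate gets wrong."""
    return [
        text
        for text, expected in cases
        if baseline(text).strip().lower() == expected
        and candidate(text).strip().lower() != expected
    ]

# Stubs standing in for model calls; replace with functions that call the API.
def baseline_model(text: str) -> str:
    return "spam" if "won" in text.lower() else "not spam"

def candidate_model(text: str) -> str:
    return "spam"  # deliberately over-eager, to show a regression

cases = [
    ("You won $1000000!", "spam"),
    ("Meeting moved to 3pm", "not spam"),
]

print(accuracy(baseline_model, cases))                      # prints 1.0
print(accuracy(candidate_model, cases))                     # prints 0.5
print(regressions(baseline_model, candidate_model, cases))  # prints ['Meeting moved to 3pm']
```

Re-running `regressions` after each prompt change gives a concrete list of cases to target, which matches the "re-measure after each incremental change" step.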
## Summary

### Key GPT-5.2 Prompting Techniques

1. **Verbosity Control** - Define explicit output shape constraints
2. **Scope Discipline** - Prevent feature creep with explicit boundaries
3. **Long-Context** - Force summarization and re-grounding for large inputs
4. **Hallucination Mitigation** - Handle uncertainty explicitly
5. **Structured Extraction** - Use JSON schemas with null handling
6. **Tool Usage** - Parallelize reads, verify writes
7. **Compaction** - Reduce context at milestones

### Quick Reference

| Technique | When to Use |
|-----------|-------------|
| `VERBOSITY_SPEC` | Enterprise agents, coding assistants |
| `SCOPE_SPEC` | Frontend/design work, code generation |
| `LONG_CONTEXT_SPEC` | Documents >10K tokens |
| `UNCERTAINTY_SPEC` | Ambiguous queries, recent facts |
| `HIGH_RISK_SPEC` | Legal, financial, compliance contexts |