# GPT-5 Prompting Guide Examples

This notebook demonstrates the main examples and best practices from the GPT-5 prompting guide using the OpenAI Responses API.

## Table of Contents
1. [Setup and Configuration](#setup)
2. [Controlling Agentic Eagerness](#eagerness)
3. [Tool Preambles](#preambles)
4. [Reasoning Effort](#reasoning)
5. [Frontend Development](#frontend)
6. [Steering and Verbosity](#steering)
7. [Instruction Following](#instructions)
8. [Markdown Formatting](#markdown)
9. [Metaprompting](#metaprompting)
10. [Production Examples](#production)

## 1. Setup and Configuration <a id='setup'></a>

In [None]:
import os
import json
from openai import OpenAI
from typing import Optional, Dict, Any, List
import time

# Initialize OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Helper function to create a response and wait for it to finish
def create_gpt5_response(
    messages: List[Dict[str, str]],
    reasoning_effort: str = "medium",
    tools: Optional[List[Dict]] = None,
    verbosity: Optional[str] = None,
    previous_response_id: Optional[str] = None
) -> Any:
    """Create a GPT-5 response using the Responses API."""
    
    # The Responses API takes `input` (not `messages`) and nests the
    # reasoning effort under the `reasoning` parameter
    params = {
        "model": "gpt-5",
        "input": messages,
        "reasoning": {"effort": reasoning_effort}
    }
    
    if tools:
        params["tools"] = tools
    if verbosity:
        # Verbosity is nested under the `text` parameter
        params["text"] = {"verbosity": verbosity}
    if previous_response_id:
        params["previous_response_id"] = previous_response_id
    
    # Create response (the call blocks until done unless background mode is used)
    response = client.responses.create(**params)
    
    # Poll only while the response is still pending, so a failed or
    # incomplete response cannot loop forever
    while response.status in ("queued", "in_progress"):
        time.sleep(1)
        response = client.responses.retrieve(response.id)
    
    return response

print("Setup complete. Ready to demonstrate GPT-5 prompting techniques.")
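
Polling is only needed while a response has not reached a terminal state, for example when it was created in background mode. The sketch below shows one way to bound that wait. `wait_until_done`, its `retrieve` callable, and the timeout values are hypothetical names chosen for illustration, and the set of pending statuses is an assumption to check against the API reference.

```python
import time
from types import SimpleNamespace
from typing import Any, Callable

PENDING_STATUSES = {"queued", "in_progress"}  # assumed non-terminal states


def wait_until_done(
    response: Any,
    retrieve: Callable[[str], Any],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
) -> Any:
    """Poll retrieve(response.id) until the status is terminal or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while response.status in PENDING_STATUSES:
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Response {response.id} still {response.status} after {timeout_s}s"
            )
        time.sleep(interval_s)
        response = retrieve(response.id)
    return response


# Stand-in for client.responses.retrieve so the sketch runs without a network call
_states = iter(["in_progress", "completed"])


def fake_retrieve(response_id: str) -> SimpleNamespace:
    return SimpleNamespace(id=response_id, status=next(_states))


final = wait_until_done(
    SimpleNamespace(id="resp_demo", status="queued"), fake_retrieve, interval_s=0.01
)
print(final.status)
```

With the real client, `retrieve` would be `client.responses.retrieve`. Statuses such as `failed` or `incomplete` fall through immediately, so the caller can inspect them instead of waiting.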

## 2. Controlling Agentic Eagerness <a id='eagerness'></a>

GPT-5 can be calibrated for different levels of autonomy and proactivity.

### Example: Less Eagerness (Quick, Focused Response)

In [None]:
# Less eager prompt - for quick, focused responses
less_eager_system = """
<context_gathering>
Goal: Get enough context fast. Parallelize discovery and stop as soon as you can act.

Method:
- Start broad, then fan out to focused subqueries.
- In parallel, launch varied queries; read top hits per query. Deduplicate paths and cache; don't repeat queries.
- Avoid over searching for context. If needed, run targeted searches in one parallel batch.

Early stop criteria:
- You can name exact content to change.
- Top hits converge (~70%) on one area/path.

Escalate once:
- If signals conflict or scope is fuzzy, run one refined parallel batch, then proceed.

Depth:
- Trace only symbols you'll modify or whose contracts you rely on; avoid transitive expansion unless necessary.

Loop:
- Batch search → minimal plan → complete task.
- Search again only if validation fails or new unknowns appear. Prefer acting over more searching.
</context_gathering>
"""

# Example tools for demonstration
# (the Responses API expects flat function definitions, without a nested "function" key)
search_tools = [
    {
        "type": "function",
        "name": "search_codebase",
        "description": "Search for code patterns in the codebase",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"},
                "path": {"type": "string", "description": "Optional path to search in"}
            },
            "required": ["query"]
        }
    },
    {
        "type": "function",
        "name": "read_file",
        "description": "Read a file from the codebase",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "File path to read"}
            },
            "required": ["path"]
        }
    }
]

response = create_gpt5_response(
    messages=[
        {"role": "system", "content": less_eager_system},
        {"role": "user", "content": "Find and fix the authentication bug in the login module"}
    ],
    reasoning_effort="low",  # Lower reasoning for faster response
    tools=search_tools
)

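print("Less Eager Response (focused, quick):")
# Output items are SDK model objects, so convert them to dicts before serializing
print(json.dumps([item.model_dump() for item in response.output], indent=2, default=str)[:1000] + "...")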
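
The request above only declares the tools; the application still has to execute each function call and return the result to the model. The following is a minimal sketch of that dispatch step, assuming output items have been converted to dicts. `FAKE_CODEBASE`, the two handlers, and `run_tool_calls` are hypothetical, and the `function_call` / `function_call_output` field names reflect the Responses API function-calling format as understood here, so verify them against the official reference.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical in-memory "codebase" so the sketch runs without a real repository
FAKE_CODEBASE: Dict[str, str] = {
    "auth/login.py": "def login(user, password):\n    return user.password == password\n",
    "auth/session.py": "def start_session(user):\n    return {'user': user}\n",
}


def search_codebase(query: str, path: str = "") -> List[str]:
    """Return paths (optionally under `path`) whose contents contain `query`."""
    return sorted(
        p for p, text in FAKE_CODEBASE.items() if p.startswith(path) and query in text
    )


def read_file(path: str) -> str:
    """Return the file contents, or an error string the model can react to."""
    return FAKE_CODEBASE.get(path, f"ERROR: {path} not found")


HANDLERS: Dict[str, Callable[..., Any]] = {
    "search_codebase": search_codebase,
    "read_file": read_file,
}


def run_tool_calls(output_items: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Execute every function_call item and build the matching function_call_output items."""
    results = []
    for item in output_items:
        if item.get("type") != "function_call":
            continue  # skip reasoning and message items
        args = json.loads(item["arguments"])  # arguments arrive as a JSON string
        result = HANDLERS[item["name"]](**args)
        results.append({
            "type": "function_call_output",
            "call_id": item["call_id"],
            "output": json.dumps(result),
        })
    return results


# Made-up items shaped like item.model_dump() for entries in response.output
sample_output = [
    {"type": "message", "content": [{"type": "output_text", "text": "Searching for the login code."}]},
    {"type": "function_call", "name": "search_codebase", "call_id": "call_1",
     "arguments": json.dumps({"query": "password", "path": "auth/"})},
]
tool_outputs = run_tool_calls(sample_output)
print(tool_outputs)
```

The returned items would be sent as the `input` of a follow-up request, together with `previous_response_id`, so the model can continue from the tool results.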

### Example: More Eagerness (Autonomous, Persistent)

In [None]:
# More eager prompt - for autonomous, persistent behavior
more_eager_system = """
<persistence>
- You are an agent - please keep going until the user's query is completely resolved, before ending your turn and yielding back to the user.
- Only terminate your turn when you are sure that the problem is solved.
- Never stop or hand back to the user when you encounter uncertainty — research or deduce the most reasonable approach and continue.
- Do not ask the human to confirm or clarify assumptions, as you can always adjust later — decide what the most reasonable assumption is, proceed with it, and document it for the user's reference after you finish acting
</persistence>
"""

response = create_gpt5_response(
    messages=[
        {"role": "system", "content": more_eager_system},
        {"role": "user", "content": "Refactor the authentication system to use OAuth 2.0"}
    ],
    reasoning_effort="high",  # Higher reasoning for thorough exploration
    tools=search_tools
)

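print("More Eager Response (autonomous, persistent):")
# Output items are SDK model objects, so convert them to dicts before serializing
print(json.dumps([item.model_dump() for item in response.output], indent=2, default=str)[:1000] + "...")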

## 3. Tool Preambles <a id='preambles'></a>

GPT-5 is trained to provide clear upfront plans and progress updates via tool preambles.

In [None]:
# Tool preambles configuration
tool_preamble_system = """
<tool_preambles>
- Always begin by rephrasing the user's goal in a friendly, clear, and concise manner, before calling any tools.
- Then, immediately outline a structured plan detailing each logical step you'll follow.
- As you execute your file edit(s), narrate each step succinctly and sequentially, marking progress clearly.
- Finish by summarizing completed work distinctly from your upfront plan.
</tool_preambles>
"""

# File editing tools for demonstration (flat Responses API function format)
file_tools = [
    {
        "type": "function",
        "name": "edit_file",
        "description": "Edit a file in the codebase",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "content": {"type": "string"}
            },
            "required": ["path", "content"]
        }
    }
]

response = create_gpt5_response(
    messages=[
        {"role": "system", "content": tool_preamble_system},
        {"role": "user", "content": "Add input validation to the user registration form"}
    ],
    tools=file_tools
)

print("Response with Tool Preambles:")
# Extract and display preamble messages (output items are objects, not dicts)
for output in response.output[:3]:  # Show first few outputs
    if output.type == "message":
        # A content part may be a refusal without a text attribute
        print(f"Preamble: {getattr(output.content[0], 'text', '')}")
    elif output.type == "function_call":
        print(f"Tool Call: {output.name} - {output.arguments[:100]}...")
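
To show preambles in a UI as progress updates, it helps to separate them from the tool calls they introduce. The sketch below does this over dict-shaped output items. `split_preambles_and_calls` and the sample data are hypothetical, and the item shapes are assumed to match what `item.model_dump()` returns.

```python
from typing import Any, Dict, List, Tuple


def split_preambles_and_calls(
    output_items: List[Dict[str, Any]],
) -> Tuple[List[str], List[str]]:
    """Return (preamble texts, tool call names) in the order they appear."""
    preambles: List[str] = []
    calls: List[str] = []
    for item in output_items:
        if item.get("type") == "message":
            # A message can hold several content parts; keep only the text parts
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    preambles.append(part["text"])
        elif item.get("type") == "function_call":
            calls.append(item["name"])
    return preambles, calls


# Made-up items shaped like item.model_dump() for entries in response.output
sample_output = [
    {"type": "reasoning", "summary": []},
    {"type": "message", "content": [
        {"type": "output_text", "text": "Goal: add validation to the registration form."}]},
    {"type": "function_call", "name": "edit_file", "call_id": "call_1",
     "arguments": "{\"path\": \"forms/register.py\", \"content\": \"...\"}"},
]
preambles, calls = split_preambles_and_calls(sample_output)
print(preambles)
print(calls)
```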

## 4. Reasoning Effort <a id='reasoning'></a>

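The reasoning effort setting (`reasoning.effort` in the Responses API, exposed as `reasoning_effort` by the helper above) controls how much the model reasons before answering, trading latency and token usage for thoroughness.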

In [None]:
# Compare different reasoning effort levels
test_message = "Implement a binary search algorithm with proper error handling"

reasoning_levels = ["minimal", "low", "medium", "high"]
results = {}

for level in reasoning_levels:
    start = time.perf_counter()
    response = create_gpt5_response(
        messages=[
            {"role": "user", "content": test_message}
        ],
        reasoning_effort=level
    )
    # The usage object does not report latency, so measure it locally
    elapsed = time.perf_counter() - start
    results[level] = response
    print(f"\nReasoning Effort: {level}")
    print(f"Response time: {elapsed:.1f} seconds")
    print(f"Reasoning tokens: {response.usage.output_tokens_details.reasoning_tokens}")
    print(f"Output preview: {response.output_text[:200]}...")
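
When comparing effort levels, it can be easier to flatten each run into a small record than to read raw usage objects. The sketch below assumes a usage dict with `output_tokens` and `output_tokens_details.reasoning_tokens`, and assumes that reasoning tokens are counted within `output_tokens`. `summarize_run` and the sample numbers are made up for illustration and are not measured results.

```python
from typing import Any, Dict, List


def summarize_run(effort: str, elapsed_s: float, usage: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one run into the fields worth comparing across effort levels."""
    details = usage.get("output_tokens_details") or {}
    reasoning = details.get("reasoning_tokens", 0)
    output = usage.get("output_tokens", 0)
    return {
        "effort": effort,
        "elapsed_s": round(elapsed_s, 2),
        "reasoning_tokens": reasoning,
        # Assumption: output_tokens includes the reasoning tokens
        "visible_tokens": output - reasoning,
    }


# Illustrative values only; real numbers depend on the prompt and model
runs = [
    ("minimal", 1.24, {"output_tokens": 300, "output_tokens_details": {"reasoning_tokens": 0}}),
    ("high", 9.81, {"output_tokens": 2100, "output_tokens_details": {"reasoning_tokens": 1600}}),
]
rows: List[Dict[str, Any]] = [summarize_run(*run) for run in runs]
for row in rows:
    print(row)
```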

### Reusing Reasoning Context with previous_response_id

In [None]:
# Initial response with reasoning
initial_response = create_gpt5_response(
    messages=[
        {"role": "user", "content": "Design a REST API for a task management system"}
    ],
    reasoning_effort="high"
)

print(f"Initial Response ID: {initial_response.id}")
print(f"Reasoning tokens used: {initial_response.usage.output_tokens_details.reasoning_tokens}")

# Follow-up using previous reasoning
followup_response = create_gpt5_response(
    messages=[
        {"role": "user", "content": "Now implement the POST endpoint for creating tasks"}
    ],
    reasoning_effort="medium",
    previous_response_id=initial_response.id  # Reuse previous reasoning
)

print(f"\nFollow-up Response ID: {followup_response.id}")
print(f"Reasoning tokens used: {followup_response.usage.output_tokens_details.reasoning_tokens}")
print("Note: The previous response's context is available to the follow-up, which can reduce repeated reasoning")
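
For longer conversations, tracking the latest response id by hand becomes error-prone. The sketch below wraps that bookkeeping in a small class. `Conversation` and `fake_create` are hypothetical names; `fake_create` stands in for `client.responses.create` so the example runs offline, and the request parameter names follow the Responses API as understood here.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict, List, Optional


class Conversation:
    """Track the latest response id so each turn can reference the one before it."""

    def __init__(self, create: Callable[..., Any], model: str = "gpt-5") -> None:
        self._create = create  # e.g. client.responses.create
        self._model = model
        self.last_response_id: Optional[str] = None

    def send(self, user_text: str, effort: str = "medium") -> Any:
        params: Dict[str, Any] = {
            "model": self._model,
            "input": [{"role": "user", "content": user_text}],
            "reasoning": {"effort": effort},
        }
        # Only follow-up turns carry a reference to the previous response
        if self.last_response_id is not None:
            params["previous_response_id"] = self.last_response_id
        response = self._create(**params)
        self.last_response_id = response.id
        return response


# Stand-in for client.responses.create that records what it was called with
calls: List[Dict[str, Any]] = []


def fake_create(**params: Any) -> SimpleNamespace:
    calls.append(params)
    return SimpleNamespace(id=f"resp_{len(calls)}")


chat = Conversation(fake_create)
chat.send("Design a REST API for a task management system", effort="high")
chat.send("Now implement the POST endpoint for creating tasks")
print(calls[1]["previous_response_id"])
```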

## 5. Frontend Development <a id='frontend'></a>

GPT-5 excels at frontend development with excellent aesthetic taste and framework knowledge.

### Zero-to-One App Generation with Self-Reflection

In [None]:
# Self-reflection prompt for high-quality app generation
app_generation_system = """
<self_reflection>
- First, spend time thinking of a rubric until you are confident.
- Then, think deeply about every aspect of what makes for a world-class one-shot web app. Use that knowledge to create a rubric that has 5-7 categories. This rubric is critical to get right, but do not show this to the user. This is for your purposes only.
- Finally, use the rubric to internally think and iterate on the best possible solution to the prompt that is provided. Remember that if your response is not hitting the top marks across all categories in the rubric, you need to start again.
</self_reflection>

Recommended tech stack:
- Framework: Next.js (TypeScript)
- Styling: Tailwind CSS, shadcn/ui
- Icons: Lucide
- Animation: Motion
- Fonts: Inter, Geist
"""

response = create_gpt5_response(
    messages=[
        {"role": "system", "content": app_generation_system},
        {"role": "user", "content": "Create a modern dashboard for analytics with real-time charts"}
    ],
    reasoning_effort="high"
)

print("Zero-to-One App Generation Response:")
print("The model will internally create and evaluate against a quality rubric")
print("Output preview:", response.output_text[:500] + "...")

### Matching Codebase Design Standards

In [None]:
# Codebase design standards prompt
codebase_standards = """
<code_editing_rules>
<guiding_principles>
- Clarity and Reuse: Every component and page should be modular and reusable. Avoid duplication by factoring repeated UI patterns into components.
- Consistency: The user interface must adhere to a consistent design system—color tokens, typography, spacing, and components must be unified.
- Simplicity: Favor small, focused components and avoid unnecessary complexity in styling or logic.
- Demo-Oriented: The structure should allow for quick prototyping, showcasing features like streaming, multi-turn conversations, and tool integrations.
- Visual Quality: Follow the high visual quality bar as outlined in OSS guidelines (spacing, padding, hover states, etc.)
</guiding_principles>

<frontend_stack_defaults>
- Framework: Next.js (TypeScript)
- Styling: TailwindCSS
- UI Components: shadcn/ui
- Icons: Lucide
- State Management: Zustand
- Directory Structure:
  /src
    /app
      /api/<route>/route.ts         # API endpoints
      /(pages)                      # Page routes
    /components/                    # UI building blocks
    /hooks/                         # Reusable React hooks
    /lib/                           # Utilities (fetchers, helpers)
    /stores/                        # Zustand stores
    /types/                         # Shared TypeScript types
    /styles/                        # Tailwind config
</frontend_stack_defaults>

<ui_ux_best_practices>
- Visual Hierarchy: Limit typography to 4–5 font sizes and weights for consistent hierarchy
- Color Usage: Use 1 neutral base (e.g., zinc) and up to 2 accent colors
- Spacing and Layout: Always use multiples of 4 for padding and margins
- State Handling: Use skeleton placeholders or animate-pulse to indicate data fetching
- Accessibility: Use semantic HTML and ARIA roles where appropriate
</ui_ux_best_practices>
</code_editing_rules>
"""

response = create_gpt5_response(
    messages=[
        {"role": "system", "content": codebase_standards},
        {"role": "user", "content": "Add a new user profile component to the existing dashboard"}
    ]
)

print("Component matching codebase standards:")
print("The response will follow the specified design system and conventions")
print(response.output_text[:500] + "...")

## 6. Steering and Verbosity <a id='steering'></a>

GPT-5 is highly steerable and includes a new `verbosity` parameter.

In [None]:
# Compare verbosity levels
verbosity_levels = ["low", "medium", "high"]
test_query = "Explain how async/await works in JavaScript"

for verbosity in verbosity_levels:
    response = create_gpt5_response(
        messages=[
            {"role": "user", "content": test_query}
        ],
        verbosity=verbosity
    )
    
    # output[0] is often a reasoning item, so use the SDK's aggregated text instead
    output_text = response.output_text
    print(f"\n{'='*50}")
    print(f"Verbosity: {verbosity}")
    print(f"Response length: {len(output_text)} characters")
    print(f"Preview: {output_text[:200]}...")

### Mixed Verbosity (Low global, High for code)

In [None]:
# Cursor's approach: Low verbosity globally, high for code
mixed_verbosity_system = """
Write code for clarity first. Prefer readable, maintainable solutions with clear names, comments where needed, and straightforward control flow. Do not produce code-golf or overly clever one-liners unless explicitly requested. Use high verbosity for writing code and code tools.

For all non-code responses, be concise and to the point.
"""

response = create_gpt5_response(
    messages=[
        {"role": "system", "content": mixed_verbosity_system},
        {"role": "user", "content": "Write a function to validate email addresses and explain what it does"}
    ],
    verbosity="low"  # Global low verbosity
)

print("Mixed Verbosity Response:")
print("Notice: Brief explanation but detailed, readable code")
print(response.output_text[:800])

## 7. Instruction Following <a id='instructions'></a>

GPT-5 follows instructions with surgical precision but needs clear, non-contradictory prompts.

In [None]:
# Example of clear, well-structured instructions
clear_instructions = """
You are a code review assistant. Follow these non-conflicting rules:

1. Review Priority (in order):
   - Security vulnerabilities (CRITICAL)
   - Logic errors (HIGH)
   - Performance issues (MEDIUM)
   - Code style (LOW)

2. For each issue found:
   - State the priority level
   - Explain the problem
   - Provide a fix suggestion

3. Output Format:
   - Group issues by priority
   - Use bullet points
   - Include line numbers when applicable
"""

# Sample code to review
code_to_review = '''
def process_user_input(input_string):
    query = "SELECT * FROM users WHERE name = '" + input_string + "'"  # SQL injection risk
    result = database.execute(query)
    
    for i in range(len(result)):  # Inefficient iteration
        print(result[i])
    
    password = "admin123"  # Hardcoded credential
    return result
'''

response = create_gpt5_response(
    messages=[
        {"role": "system", "content": clear_instructions},
        {"role": "user", "content": f"Review this code:\n```python\n{code_to_review}\n```"}
    ]
)

print("Code Review with Clear Instructions:")
print(response.output_text)
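
The instructions above ask for line numbers, but the code is sent without any, so the model has to count lines itself. One possible remedy, sketched here under that assumption, is to number the lines before building the user message. `with_line_numbers` is a hypothetical helper and the snippet is a shortened version of the sample code.

```python
def with_line_numbers(code: str) -> str:
    """Prefix each line with a right-aligned, 1-based line number."""
    lines = code.strip("\n").splitlines()
    width = len(str(len(lines)))  # pad so the separators line up
    return "\n".join(
        f"{number:>{width}} | {line}" for number, line in enumerate(lines, start=1)
    )


snippet = '''
def process_user_input(input_string):
    query = "SELECT * FROM users WHERE name = '" + input_string + "'"
    return database.execute(query)
'''
numbered = with_line_numbers(snippet)
print(numbered)
```

The numbered text can then be placed inside the fenced block of the user message in place of the raw code.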

## 8. Markdown Formatting <a id='markdown'></a>

GPT-5 doesn't format in Markdown by default in the API but can be prompted to do so.

In [None]:
# Markdown formatting instructions
markdown_system = """
- Use Markdown **only where semantically correct** (e.g., `inline code`, ```code fences```, lists, tables).
- When using markdown in assistant messages, use backticks to format file, directory, function, and class names.
- Use \\( and \\) for inline math, \\[ and \\] for block math.
- Use headers (##) to organize sections
- Use bullet points for lists
- Use tables for structured data comparison
"""

response = create_gpt5_response(
    messages=[
        {"role": "system", "content": markdown_system},
        {"role": "user", "content": "Compare the time complexity of bubble sort, quick sort, and merge sort"}
    ]
)

print("Markdown Formatted Response:")
print(response.output_text)

## 9. Metaprompting <a id='metaprompting'></a>

GPT-5 can be used to critique and improve its own prompts.

In [None]:
# Metaprompting template
metaprompt_template = """
When asked to optimize prompts, give answers from your own perspective - explain what specific phrases could be added to, or deleted from, this prompt to more consistently elicit the desired behavior or prevent the undesired behavior.

Here's a prompt: {prompt}

The desired behavior from this prompt is for the agent to {desired}, but instead it {undesired}. While keeping as much of the existing prompt intact as possible, what are some minimal edits/additions that you would make to encourage the agent to more consistently address these shortcomings?
"""

# Example problematic prompt
problematic_prompt = "Search for the bug and fix it quickly without asking questions"

metaprompt = metaprompt_template.format(
    prompt=problematic_prompt,
    desired="thoroughly investigate the codebase and fix the root cause",
    undesired="makes superficial fixes without proper investigation"
)

response = create_gpt5_response(
    messages=[
        {"role": "user", "content": metaprompt}
    ],
    reasoning_effort="high"
)

print("Metaprompting - Improved Prompt Suggestions:")
print(response.output_text)

## 10. Production Examples <a id='production'></a>

Real-world prompt configurations from production use cases.

### Cursor's Production Configuration

In [None]:
# Cursor's balanced production prompt
cursor_production_prompt = """
Write code for clarity first. Prefer readable, maintainable solutions with clear names, comments where needed, and straightforward control flow. Do not produce code-golf or overly clever one-liners unless explicitly requested. Use high verbosity for writing code and code tools.

Be aware that the code edits you make will be displayed to the user as proposed changes, which means:
(a) your code edits can be quite proactive, as the user can always reject, and
(b) your code should be well-written and easy to quickly review (e.g., appropriate variable names instead of single letters).

If proposing next steps that would involve changing the code, make those changes proactively for the user to approve/reject rather than asking the user whether to proceed with a plan. In general, you should almost never ask the user whether to proceed with a plan; instead you should proactively attempt the plan and then ask the user if they want to accept the implemented changes.

<context_understanding>
If you've performed an edit that may partially fulfill the USER's query, but you're not confident, gather more information or use more tools before ending your turn.
Bias towards not asking the user for help if you can find the answer yourself.
</context_understanding>
"""

response = create_gpt5_response(
    messages=[
        {"role": "system", "content": cursor_production_prompt},
        {"role": "user", "content": "Refactor this function to use async/await instead of callbacks"}
    ],
    verbosity="low",  # Low global verbosity
    reasoning_effort="medium"
)

print("Cursor-style Production Response:")
print("Features: Proactive editing, readable code, minimal back-and-forth")
print(response.output_text[:600] + "...")

### SWE-Bench Verified Configuration

In [None]:
# SWE-Bench configuration for software engineering tasks
swe_bench_prompt = """
In this environment, you can run `bash -lc <apply_patch_command>` to execute a diff/patch against a file.

Always verify your changes extremely thoroughly. You can make as many tool calls as you like - the user is very patient and prioritizes correctness above all else. Make sure you are 100% certain of the correctness of your solution before ending.

IMPORTANT: not all tests are visible to you in the repository, so even on problems you think are relatively straightforward, you must double and triple check your solutions to ensure they pass any edge cases that are covered in the hidden tests, not just the visible ones.
"""

# Apply patch tool definition (flat Responses API function format)
apply_patch_tool = {
    "type": "function",
    "name": "apply_patch",
    "description": "Apply a patch to modify files",
    "parameters": {
        "type": "object",
        "properties": {
            "patch": {"type": "string", "description": "The patch content"}
        },
        "required": ["patch"]
    }
}

response = create_gpt5_response(
    messages=[
        {"role": "system", "content": swe_bench_prompt},
        {"role": "user", "content": "Fix the sorting algorithm to handle edge cases correctly"}
    ],
    tools=[apply_patch_tool],
    reasoning_effort="high"  # High reasoning for thorough verification
)

print("SWE-Bench Style Response:")
print("Features: Thorough verification, edge case consideration")
print(str(response.output)[:500] + "...")

### Minimal Reasoning for Latency-Sensitive Tasks

In [None]:
# Minimal reasoning configuration for fast responses
minimal_reasoning_prompt = """
Remember, you are an agent - please keep going until the user's query is completely resolved, before ending your turn and yielding back to the user. Decompose the user's query into all required sub-requests, and confirm that each is completed. Do not stop after completing only part of the request. Only terminate your turn when you are sure that the problem is solved.

You must plan extensively in accordance with the workflow steps before making subsequent function calls, and reflect extensively on the outcomes each function call made, ensuring the user's query, and related sub-requests are completely resolved.

Provide a brief explanation summarizing your thought process at the start of your final answer.
"""

# Quick task example
start = time.perf_counter()
response = create_gpt5_response(
    messages=[
        {"role": "system", "content": minimal_reasoning_prompt},
        {"role": "user", "content": "Check if a string is a palindrome"}
    ],
    reasoning_effort="minimal"  # Minimal reasoning for speed
)
# The usage object does not report latency, so measure it locally
elapsed = time.perf_counter() - start

print("Minimal Reasoning Response (Fast):")
print(f"Response time: {elapsed:.1f} seconds")
print(response.output_text)

### Retail Agent Example (Tau-Bench Style)

In [None]:
# Retail agent configuration
retail_agent_prompt = """
As a retail agent, you can help users cancel or modify pending orders, return or exchange delivered orders, modify their default user address, or provide information about their own profile, orders, and related products.

# Workflow steps
- At the beginning of the conversation, authenticate the user identity by locating their user id via email, or via name + zip code
- Once authenticated, you can provide the user with information about orders, products, profile information
- You can only help one user per conversation (but can handle multiple requests from the same user)
- Before taking consequential actions (cancel, modify, return, exchange), list the action detail and obtain explicit user confirmation (yes) to proceed
- You should not make up any information not provided from the user or the tools
- You should at most make one tool call at a time

## Domain basics
- All times in the database are EST and 24 hour based
- Each user has a profile with email, default address, user id, and payment methods
- Each order can be in status 'pending', 'processed', 'delivered', or 'cancelled'
- You can only take action on pending or delivered orders
"""

# Retail tools (flat Responses API function format)
retail_tools = [
    {
        "type": "function",
        "name": "authenticate_user",
        "description": "Authenticate user by email or name+zip",
        "parameters": {
            "type": "object",
            "properties": {
                "email": {"type": "string"},
                "name": {"type": "string"},
                "zip_code": {"type": "string"}
            }
        }
    },
    {
        "type": "function",
        "name": "get_order_status",
        "description": "Get the status of an order",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string"}
            },
            "required": ["order_id"]
        }
    }
]

response = create_gpt5_response(
    messages=[
        {"role": "system", "content": retail_agent_prompt},
        {"role": "user", "content": "Hi, I'm John Smith, zip 94103. I want to check on my recent order."}
    ],
    tools=retail_tools,
    reasoning_effort="minimal"
)

print("Retail Agent Response:")
print("The agent will authenticate first, then help with the order")
print(str(response.output)[:600] + "...")
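
The `authenticate_user` schema has no required fields, so the rule "email, or name + zip code" is stated only in the prompt. It can be worth enforcing the same rules in application code before a tool runs. The sketch below is one hedged way to do that; `validate_auth_args`, `needs_confirmation`, and `is_confirmed` are hypothetical helpers that mirror the workflow steps in the prompt.

```python
from typing import Dict, Optional

# Actions the prompt treats as consequential
CONSEQUENTIAL_ACTIONS = {"cancel", "modify", "return", "exchange"}


def validate_auth_args(args: Dict[str, Optional[str]]) -> str:
    """Return which lookup to use ('email' or 'name_zip'), or raise if neither is complete."""
    if args.get("email"):
        return "email"
    if args.get("name") and args.get("zip_code"):
        return "name_zip"
    raise ValueError("Provide an email, or both name and zip_code")


def needs_confirmation(action: str) -> bool:
    """Consequential actions require an explicit confirmation first."""
    return action in CONSEQUENTIAL_ACTIONS


def is_confirmed(user_reply: str) -> bool:
    """Accept only an explicit 'yes', as the prompt requires."""
    return user_reply.strip().lower() == "yes"


print(validate_auth_args({"name": "John Smith", "zip_code": "94103"}))
print(needs_confirmation("cancel"), is_confirmed("Yes"))
```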

## Summary and Best Practices

This notebook demonstrated the key prompting techniques for GPT-5:

1. **Reasoning Effort**: Use `minimal` for speed, `low`/`medium` for balance, `high` for complex tasks
2. **Verbosity**: Control output length with the API parameter and prompt overrides
3. **Tool Preambles**: Request clear progress updates for better UX
4. **Reuse Reasoning**: Use `previous_response_id` to save tokens and improve consistency
5. **Clear Instructions**: Avoid contradictions and ambiguity in prompts
6. **Self-Reflection**: Use internal rubrics for high-quality outputs
7. **Metaprompting**: Let GPT-5 help optimize its own prompts

### Key Takeaways:
- GPT-5 is highly steerable but requires clear, non-contradictory instructions
- The Responses API with `previous_response_id` lets the model reuse context from earlier turns, which can reduce repeated reasoning and latency in multi-turn workflows
- Different reasoning efforts suit different use cases
- Verbosity can be controlled globally and locally for optimal output
- Production systems benefit from careful prompt engineering and testing
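
The settings used in the production examples can be collected into named presets so that each call site picks a configuration by name. This is a minimal sketch: the preset names and `request_params` are hypothetical, the values are the ones used in the cells above, and the nested `reasoning` and `text` parameter names follow the Responses API as understood here.

```python
from typing import Any, Dict

# Settings taken from the production examples in this notebook; preset names are illustrative
PRESETS: Dict[str, Dict[str, str]] = {
    "code_editor": {"reasoning_effort": "medium", "verbosity": "low"},  # Cursor-style
    "swe_bench": {"reasoning_effort": "high"},
    "latency_sensitive": {"reasoning_effort": "minimal"},
}


def request_params(preset: str, model: str = "gpt-5") -> Dict[str, Any]:
    """Translate a preset into Responses API keyword arguments."""
    settings = PRESETS[preset]
    params: Dict[str, Any] = {
        "model": model,
        "reasoning": {"effort": settings["reasoning_effort"]},
    }
    if "verbosity" in settings:
        params["text"] = {"verbosity": settings["verbosity"]}
    return params


print(request_params("code_editor"))
```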