# OpenAI Responses API V2 Client - Complete Guide

This notebook demonstrates the `OpenAIResponsesV2Client`, which implements the new OpenAI Responses API and returns rich `UnifiedResponse` objects.

## Key Features

- **Stateful Conversations**: Maintain conversation context via `previous_response_id`
- **Built-in Tools**: Web search, image generation, apply_patch
- **Rich Content Blocks**: TextContent, ReasoningContent, CitationContent, ImageContent, ToolCallContent
- **Multimodal Support**: Send and receive images
- **Structured Output**: Pydantic models and JSON schema support
- **Cost Tracking**: Token and image generation cost tracking
- **Agent Integration**: Works with AG2 agents in single-agent, two-agent, and group chats

## Requirements

AG2 requires `Python>=3.9`. Install the required packages:

In [None]:
%pip install "ag2[openai]" -q

## Setup

Set your OpenAI API key as an environment variable or pass it directly to the client.

In [None]:
import os

# Set your API key (or use environment variable OPENAI_API_KEY)
# os.environ["OPENAI_API_KEY"] = "sk-..."

---

# 1. Basic Usage

The `OpenAIResponsesV2Client` returns rich `UnifiedResponse` objects with typed content blocks.

In [2]:
from autogen.llm_clients.openai_resposnes_v2_client import OpenAIResponsesV2Client

# Create the V2 client
client = OpenAIResponsesV2Client()

# Make a simple request
response = client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "how are you? tell me about yourself? and what is a machine?"}]
})

# Access the response
print(f"Response ID: {response.id}")
print(f"Model: {response.model}")
print(f"Content: {response.messages[0].get_text()}")

Response ID: resp_0d04c8f4ca158d66006980752f2d70819c8e7bba42b01dd1f6
Model: gpt-4.1-2025-04-14
Content: I'm an AI language model created by OpenAI, here to help answer your questions, have conversations, and assist with a wide range of topics. I don't have feelings, but thank you for asking—I'm always ready to help!

**About me:**  
- I can understand and generate text, images, and more.
- My knowledge extends up to June 2024.
- I don't have personal experiences or emotions, but I can explain concepts, solve problems, and simulate conversation.

**What is a machine?**  
A **machine** is any device or system that uses energy to perform a specific task, often making work easier, faster, or more efficient. Machines can be as simple as a lever or wheel, or as complex as a computer or robot. They are usually made up of components like gears, levers, motors, or electronic circuits that work together to achieve a specific function.

*Examples of machines include:*  
- A car (converts fuel int

## Understanding UnifiedResponse Structure

The `UnifiedResponse` contains rich, typed content blocks:

In [3]:
from autogen.llm_clients.models.content_blocks import (
    TextContent,
    ReasoningContent,
    CitationContent,
    ImageContent,
    ToolCallContent,
    GenericContent,
)

# Inspect the response structure
print(f"Number of messages: {len(response.messages)}")
print(f"Usage: {response.usage}")
print(f"Cost: ${response.cost:.6f}")

# Iterate through content blocks
for msg in response.messages:
    print(f"\nRole: {msg.role}")
    for block in msg.content:
        if isinstance(block, TextContent):
            print(f"  Text: {block.text[:100]}..." if len(block.text) > 100 else f"  Text: {block.text}")
        elif isinstance(block, ReasoningContent):
            print(f"  Reasoning: {block.text[:100]}...")

Number of messages: 1
Usage: {'prompt_tokens': 22, 'completion_tokens': 231, 'total_tokens': 253, 'token_cost': 0.001892, 'image_cost': 0.0}
Cost: $0.001892

Role: assistant
  Text: I'm an AI language model created by OpenAI, here to help answer your questions, have conversations, ...
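
The block-by-block dispatch above can be wrapped in a small helper. The following is a minimal sketch that runs without an API call: `FakeTextContent`, `FakeReasoningContent`, and `FakeMessage` are hypothetical stand-ins for the real classes, which may carry more fields than the `text`, `role`, and `content` attributes used in this notebook.

```python
from dataclasses import dataclass, field


# Stand-in types for illustration only; the real classes are imported from
# autogen.llm_clients.models.content_blocks in the cell above.
@dataclass
class FakeTextContent:
    text: str


@dataclass
class FakeReasoningContent:
    text: str


@dataclass
class FakeMessage:
    role: str
    content: list = field(default_factory=list)


def preview(text: str, limit: int = 100) -> str:
    """Truncate text to `limit` characters, adding an ellipsis only when cut."""
    return text if len(text) <= limit else text[:limit] + "..."


def summarize(messages: list) -> list:
    """Return one (role, kind, preview) tuple per recognized content block."""
    rows = []
    for msg in messages:
        for block in msg.content:
            if isinstance(block, FakeTextContent):
                rows.append((msg.role, "text", preview(block.text)))
            elif isinstance(block, FakeReasoningContent):
                rows.append((msg.role, "reasoning", preview(block.text)))
    return rows


demo = [FakeMessage("assistant", [FakeReasoningContent("think"), FakeTextContent("x" * 120)])]
for row in summarize(demo):
    print(row)
```

With the real classes, the same loop should work by swapping the stand-ins for `TextContent` and `ReasoningContent`.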


---

# 2. Stateful Conversations

The Responses API is **stateful**: it keeps conversation context server-side, and each request links to the previous turn through `previous_response_id`.

In [4]:
# Create a new client for stateful conversation
stateful_client = OpenAIResponsesV2Client()

# First message
response1 = stateful_client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "My name is Alice. Remember this."}]
})
print(f"Response 1: {response1.messages[0].get_text()}")
print(f"Response ID: {response1.id}")

Response 1: Got it, Alice! I’ll remember your name. How can I help you today?
Response ID: resp_07cf7e90399d7a6c0069807553d864819e94e6e0e46ed47bbe


In [5]:
# Second message - the client automatically tracks state
response2 = stateful_client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "What is my name?"}]
})
print(f"Response 2: {response2.messages[0].get_text()}")
print("\nThe model remembered the context from the previous turn!")

Response 2: Your name is Alice. How can I assist you today?

The model remembered the context from the previous turn!


In [6]:
# Reset conversation state to start fresh
stateful_client.reset_conversation()

response3 = stateful_client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "What is my name?"}]
})
print(f"After reset: {response3.messages[0].get_text()}")
print("\nThe model no longer has context from previous conversation.")

After reset: You haven't told me your name yet! If you'd like to share it, I can use it in our conversation.

The model no longer has context from previous conversation.
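
To make the behavior above concrete, here is a toy model of client-side state tracking. It is a sketch under assumptions, not the AG2 implementation: `ResponseChain` and its methods are hypothetical names, and the only thing taken from the notebook is the `previous_response_id` parameter.

```python
from typing import Optional


class ResponseChain:
    """Toy model of client-side state tracking (hypothetical, not AG2 code)."""

    def __init__(self) -> None:
        self.previous_response_id: Optional[str] = None

    def prepare(self, params: dict) -> dict:
        """Copy params and attach the stored ID unless the caller set one."""
        out = dict(params)
        if self.previous_response_id and "previous_response_id" not in out:
            out["previous_response_id"] = self.previous_response_id
        return out

    def record(self, response_id: str) -> None:
        """Remember the ID of the latest response for the next turn."""
        self.previous_response_id = response_id

    def reset(self) -> None:
        """Forget the stored ID so the next request starts a new conversation."""
        self.previous_response_id = None


chain = ResponseChain()
first = chain.prepare({"model": "gpt-4.1"})   # no ID yet
chain.record("resp_1")
second = chain.prepare({"model": "gpt-4.1"})  # carries resp_1
chain.reset()
third = chain.prepare({"model": "gpt-4.1"})   # fresh conversation again
print(first, second, third, sep="\n")
```

The three prepared requests mirror the three cells above: the first turn has no link, the second links to the first, and the request after a reset links to nothing.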


## Manual State Control

You can also manually control the conversation state:

In [None]:
# Get current state
current_state = stateful_client._get_previous_response_id()
print(f"Current state: {current_state}")

# Manually set state to continue a specific conversation
stateful_client._set_previous_response_id("resp_07cf7e90399d7a6c0069807553d864819e94e6e0e46ed47bbe")

# Or pass previous_response_id directly in params
response = stateful_client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Continue from here"}],
    "previous_response_id": "resp_07cf7e90399d7a6c0069807553d864819e94e6e0e46ed47bbe"
})

# Show the state the client holds after the request
print(f"Updated state: {stateful_client._get_previous_response_id()}")


Current state: resp_07cf7e90399d7a6c00698076e79244819e91adf42a6bbb50d6
Updated state: resp_07cf7e90399d7a6c00698076f07280819ea8ea0382e7d2f7f0


---

# 3. Multimodal Support

Send images in your messages as URLs or as base64 data URIs.

In [None]:
# Create a multimodal message with an image URL
multimodal_message = OpenAIResponsesV2Client.create_multimodal_message(
    text="What do you see in this image?",
    images=["https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg"],
    role="user"
)

print("Multimodal message structure:")
print(multimodal_message)

In [None]:
# Send multimodal request
mm_client = OpenAIResponsesV2Client()

response = mm_client.create({
    "model": "gpt-4.1",  # Use a vision-capable model
    "messages": [multimodal_message]
})

print(f"Image description: {response.messages[0].get_text()}")

## Creating Image Content

You can also create image content directly:

In [None]:
# Create image content from URL
image_content = OpenAIResponsesV2Client.create_image_content(
    image_url="https://example.com/image.jpg"
)
print(f"Image content: {image_content}")

# Or from base64 data URI
# image_content = OpenAIResponsesV2Client.create_image_content(
#     data_uri="data:image/png;base64,iVBORw0KGgo..."
# )
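
For a local image, the data URI has to be built first. The helper below is a minimal sketch using only the standard library; `to_data_uri` is a hypothetical name, not part of the client, and the caller is assumed to know the correct MIME type.

```python
import base64


def to_data_uri(data: bytes, mime_type: str = "image/png") -> str:
    """Encode raw bytes as a base64 data URI of the given MIME type."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The first eight bytes of every PNG file (the PNG signature).
png_signature = b"\x89PNG\r\n\x1a\n"
print(to_data_uri(png_signature))
```

In practice the bytes would come from something like `Path("photo.png").read_bytes()`, and the resulting string would be passed as `data_uri`.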

---

# 4. Built-in Tools

The Responses API provides built-in tools that don't require function definitions.

## 4.1 Web Search

In [None]:
# Enable web search
search_client = OpenAIResponsesV2Client()

response = search_client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "What is the latest news about AI?"}],
    "built_in_tools": ["web_search"]
})

print(f"Response: {response.messages[0].get_text()[:500]}...")

In [None]:
# Extract citations from the response
citations = OpenAIResponsesV2Client.get_citations(response)

print(f"\nFound {len(citations)} citations:")
for citation in citations[:5]:  # Show first 5
    print(f"  - {citation.title}: {citation.url}")
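
Search results often cite the same page more than once. The sketch below removes duplicates by URL while keeping the original order. `FakeCitation` is a hypothetical stand-in with only the `title` and `url` attributes used above, and the sample data is made up.

```python
from dataclasses import dataclass


# Stand-in for illustration; real citation objects come from get_citations().
@dataclass(frozen=True)
class FakeCitation:
    title: str
    url: str


def unique_by_url(citations: list) -> list:
    """Keep the first citation for each URL, preserving order."""
    seen = set()
    unique = []
    for citation in citations:
        if citation.url not in seen:
            seen.add(citation.url)
            unique.append(citation)
    return unique


sample = [
    FakeCitation("Story A", "https://example.com/a"),
    FakeCitation("Story B", "https://example.com/b"),
    FakeCitation("Story A (repeat)", "https://example.com/a"),
]
for citation in unique_by_url(sample):
    print(f"  - {citation.title}: {citation.url}")
```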

## 4.2 Image Generation

In [None]:
# Enable image generation
image_client = OpenAIResponsesV2Client()

# Configure image output parameters
image_client.set_image_output_params(
    quality="high",
    size="1024x1024",
    output_format="png"
)

response = image_client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Generate an image of a sunset over mountains"}],
    "built_in_tools": ["image_generation"]
})

# Extract generated images
images = OpenAIResponsesV2Client.get_generated_images(response)
print(f"Generated {len(images)} image(s)")

if images:
    print(f"Image data URI (truncated): {images[0].data_uri[:100]}...")
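
To keep a generated image, the data URI can be decoded and written to disk. This is a minimal standard-library sketch that assumes `data_uri` has the usual `data:<mime>;base64,<payload>` shape; `decode_data_uri` and `save_data_uri` are hypothetical helper names, and the sample payload is only the 8-byte PNG signature, not a real image.

```python
import base64
import tempfile
from pathlib import Path


def decode_data_uri(data_uri: str) -> bytes:
    """Return the raw bytes carried by a base64 data URI."""
    header, sep, payload = data_uri.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    return base64.b64decode(payload)


def save_data_uri(data_uri: str, path: Path) -> int:
    """Write the decoded bytes to `path` and return the byte count."""
    raw = decode_data_uri(data_uri)
    path.write_bytes(raw)
    return len(raw)


sample = "data:image/png;base64,iVBORw0KGgo="
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "sunset.png"
    written = save_data_uri(sample, target)
    print(f"Wrote {written} bytes to {target.name}")
```

With a real response, `images[0].data_uri` would take the place of `sample`.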

In [None]:
# Check image generation costs
print(f"Image costs: ${image_client.get_image_costs():.4f}")
print(f"Total costs: ${image_client.get_total_costs():.4f}")

## 4.3 Structured Output

In [None]:
from pydantic import BaseModel

# Define a Pydantic model for structured output
class Person(BaseModel):
    name: str
    age: int
    occupation: str

# Request structured output
struct_client = OpenAIResponsesV2Client()

response = struct_client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Generate a fictional person's profile"}],
    "response_format": Person
})

# Get the parsed object
parsed = OpenAIResponsesV2Client.get_parsed_object(response)
if parsed:
    print(f"Name: {parsed.name}")
    print(f"Age: {parsed.age}")
    print(f"Occupation: {parsed.occupation}")

---

# 5. Cost Tracking

The V2 client tracks both token costs and image generation costs.

In [None]:
cost_client = OpenAIResponsesV2Client()

# Make several requests
for i in range(3):
    response = cost_client.create({
        "model": "gpt-4.1",
        "messages": [{"role": "user", "content": f"Count to {i+1}"}]
    })
    
    # Per-request cost
    usage = OpenAIResponsesV2Client.get_usage(response)
    print(f"Request {i+1}: {usage['total_tokens']} tokens, ${usage['cost']:.6f}")

In [None]:
# Get cumulative usage
cumulative = cost_client.get_cumulative_usage()
print("\nCumulative Usage:")
print(f"  Total prompt tokens: {cumulative['prompt_tokens']}")
print(f"  Total completion tokens: {cumulative['completion_tokens']}")
print(f"  Total tokens: {cumulative['total_tokens']}")
print(f"  Token cost: ${cumulative['token_cost']:.6f}")
print(f"  Image cost: ${cumulative['image_cost']:.6f}")
print(f"  Total cost: ${cumulative['total_cost']:.6f}")

In [None]:
# Reset cost tracking
cost_client.reset_all_costs()
print(f"After reset: ${cost_client.get_total_costs():.6f}")

In [None]:
# Set custom pricing for fine-tuned models
custom_client = OpenAIResponsesV2Client()
custom_client.set_custom_price(
    input_price_per_1k=0.003,
    output_price_per_1k=0.006
)

response = custom_client.create({
    "model": "ft:gpt-4.1:my-org:custom-model",
    "messages": [{"role": "user", "content": "Hello"}]
})

print(f"Cost with custom pricing: ${custom_client.cost(response):.6f}")
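
The arithmetic behind per-1k pricing can be checked by hand. The sketch below assumes cost is linear in token count, which the `input_price_per_1k` and `output_price_per_1k` parameter names suggest but this notebook does not confirm. `token_cost` is a hypothetical helper, and the prices of 0.002 and 0.008 per 1k tokens are assumptions chosen because they reproduce the `token_cost` value printed in section 1 (22 prompt tokens, 231 completion tokens).

```python
def token_cost(
    prompt_tokens: int,
    completion_tokens: int,
    input_price_per_1k: float,
    output_price_per_1k: float,
) -> float:
    """Linear token pricing: each side is billed per 1,000 tokens."""
    input_cost = (prompt_tokens / 1000) * input_price_per_1k
    output_cost = (completion_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# Token counts from the Usage line in section 1; prices are assumed.
cost = token_cost(22, 231, input_price_per_1k=0.002, output_price_per_1k=0.008)
print(f"${cost:.6f}")  # prints $0.001892

# The same token counts under the custom prices set above.
custom = token_cost(22, 231, input_price_per_1k=0.003, output_price_per_1k=0.006)
print(f"${custom:.6f}")  # prints $0.001452
```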

---

# 6. V1 Backward Compatibility

For code that expects the ChatCompletion format, use `create_v1_compatible()`.

In [None]:
v1_client = OpenAIResponsesV2Client()

# Get ChatCompletion-like response
response = v1_client.create_v1_compatible({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Hello!"}]
})

# Access like standard ChatCompletion
print(f"Type: {type(response).__name__}")
print(f"Content: {response.choices[0].message.content}")
print(f"Tokens: {response.usage.total_tokens}")
print(f"Cost: ${response.cost:.6f}")

---

# 7. Agent Integration

The V2 client integrates with AG2 agents for conversational AI workflows.

## 7.1 Single Agent

In [None]:
import autogen

# Configure LLM with Responses API
config_list = [
    {
        "model": "gpt-4.1",
        "api_type": "responses",  # Use Responses API
    }
]

llm_config = {"config_list": config_list}

# Create a single assistant agent
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
    system_message="You are a helpful AI assistant."
)

# Create a user proxy agent
user_proxy = autogen.UserProxyAgent(
    name="user",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
    code_execution_config=False,
)

# Start a conversation
user_proxy.initiate_chat(
    assistant,
    message="What is the capital of France?"
)

## 7.2 Two-Agent Chat

In [None]:
# Create two specialized agents
researcher = autogen.AssistantAgent(
    name="researcher",
    llm_config=llm_config,
    system_message="""You are a research assistant. Your job is to:
    1. Analyze questions thoroughly
    2. Provide detailed, factual information
    3. Cite sources when possible"""
)

critic = autogen.AssistantAgent(
    name="critic",
    llm_config=llm_config,
    system_message="""You are a critical reviewer. Your job is to:
    1. Review the researcher's findings
    2. Point out any gaps or inaccuracies
    3. Suggest improvements
    Say 'TERMINATE' when the research is satisfactory."""
)

# Two-agent collaboration
researcher.initiate_chat(
    critic,
    message="Research the benefits and drawbacks of renewable energy sources.",
    max_turns=4
)
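
The critic is told to say 'TERMINATE', but in the cell above only `max_turns` ends the chat. One way to stop on the keyword is a termination predicate. The function below is a sketch: its name and matching rule are choices made here, and it assumes messages arrive as dicts with a `content` key.

```python
def is_termination_msg(message: dict) -> bool:
    """Return True when a message's text ends with the TERMINATE keyword."""
    content = message.get("content")
    if not isinstance(content, str):
        return False  # e.g. None when a message carries no text
    # Ignore trailing whitespace and sentence punctuation.
    return content.rstrip().rstrip(".!").endswith("TERMINATE")


print(is_termination_msg({"content": "Looks complete. TERMINATE"}))
print(is_termination_msg({"content": "Please expand section 2."}))
```

AG2 agents accept such a callable through the `is_termination_msg` constructor argument. Because the check runs on the agent that receives the message, it would go on `researcher` here, so that the critic's final reply ends the exchange.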

## 7.3 Group Chat

In [None]:
# Create multiple specialized agents for group chat
planner = autogen.AssistantAgent(
    name="planner",
    llm_config=llm_config,
    system_message="""You are a project planner. Break down tasks into actionable steps.
    Focus on creating clear, organized plans."""
)

developer = autogen.AssistantAgent(
    name="developer",
    llm_config=llm_config,
    system_message="""You are a software developer. Implement solutions based on the plan.
    Write clean, well-documented code."""
)

reviewer = autogen.AssistantAgent(
    name="reviewer",
    llm_config=llm_config,
    system_message="""You are a code reviewer. Review implementations for:
    1. Correctness
    2. Best practices
    3. Potential improvements
    Say 'TERMINATE' when the solution is complete and reviewed."""
)

user_proxy_gc = autogen.UserProxyAgent(
    name="user",
    human_input_mode="NEVER",
    code_execution_config=False,
)

In [None]:
# Create group chat
group_chat = autogen.GroupChat(
    agents=[user_proxy_gc, planner, developer, reviewer],
    messages=[],
    max_round=10
)

manager = autogen.GroupChatManager(
    groupchat=group_chat,
    llm_config=llm_config
)

# Start the group chat
user_proxy_gc.initiate_chat(
    manager,
    message="Create a Python function that calculates the Fibonacci sequence up to n terms."
)

---

# 8. Advanced: Custom Function Tools

Combine built-in tools with custom function tools.

In [None]:
# Define custom tools
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    # Mock implementation
    return f"The weather in {city} is sunny, 72°F"

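def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    # WARNING: eval() executes arbitrary Python. It is used here for demo
    # purposes only; replace it before running model-generated arguments.
    try:
        result = eval(expression)
        return str(result)
    except Exception as e:
        return f"Error: {e}"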

# Define tool schemas
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate a math expression",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string", "description": "Math expression"}
                },
                "required": ["expression"]
            }
        }
    }
]

print("Tools defined successfully!")
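
Since `calculate` relies on `eval`, it should not be run on arguments a model produces. A safer variant walks the parsed syntax tree and allows only numeric literals and basic arithmetic. This is a sketch, not a hardened sandbox: `safe_calculate` is a hypothetical replacement, and exponentiation is left out on purpose because it can produce very large results.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _eval_node(node: ast.AST):
    """Recursively evaluate a whitelisted arithmetic syntax tree."""
    if isinstance(node, ast.Constant):
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")


def safe_calculate(expression: str) -> str:
    """Evaluate basic arithmetic without calling eval()."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_eval_node(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError, OverflowError) as e:
        return f"Error: {e}"


print(safe_calculate("25 * 4 + 10"))  # prints 110
print(safe_calculate("__import__('os').getcwd()"))  # prints Error: unsupported expression
```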

In [None]:
# Use custom tools with the V2 client
tools_client = OpenAIResponsesV2Client()

response = tools_client.create({
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "What's 25 * 4 + 10?"}],
    "tools": tools
})

# Check for tool calls
for msg in response.messages:
    for block in msg.content:
        if isinstance(block, ToolCallContent):
            print(f"Tool call: {block.name}({block.arguments})")
        elif isinstance(block, TextContent):
            print(f"Text: {block.text}")
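
The loop above only prints tool calls; executing them is left to the caller. A minimal dispatch sketch follows. It assumes each call exposes a tool name and arguments, as `block.name` and `block.arguments` do above. Whether `arguments` arrives as a JSON string or an already-parsed dict is not shown in this notebook, so the helper accepts both. `run_tool_call`, `TOOL_REGISTRY`, and the mock tool are hypothetical.

```python
import json


def get_weather(city: str) -> str:
    """Mock tool, redefined here so the example is self-contained."""
    return f"The weather in {city} is sunny"


# Map tool names (as declared in the schemas) to local callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(name: str, arguments, registry: dict) -> str:
    """Look up a tool by name and call it with the supplied arguments."""
    if name not in registry:
        return f"Error: unknown tool {name!r}"
    try:
        # Accept either a JSON string or an already-parsed mapping.
        kwargs = json.loads(arguments) if isinstance(arguments, str) else dict(arguments)
        return str(registry[name](**kwargs))
    except (json.JSONDecodeError, TypeError) as e:
        return f"Error: {e}"


print(run_tool_call("get_weather", '{"city": "Paris"}', TOOL_REGISTRY))
print(run_tool_call("get_stock", "{}", TOOL_REGISTRY))
```

Returning an error string instead of raising keeps the result easy to send back to the model as the tool output.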

---

# Summary

The `OpenAIResponsesV2Client` provides:

| Feature | Description |
|---------|-------------|
| **Stateful Conversations** | Automatic context tracking via `previous_response_id` |
| **Rich Content Blocks** | TextContent, ReasoningContent, CitationContent, ImageContent, ToolCallContent |
| **Built-in Tools** | Web search, image generation, apply_patch |
| **Multimodal Support** | Send and receive images |
| **Structured Output** | Pydantic models and JSON schemas |
| **Cost Tracking** | Token and image generation cost tracking |
| **V1 Compatibility** | `create_v1_compatible()` for ChatCompletion format |
| **Agent Integration** | Works with AG2 single-agent, two-agent, and group chats |

For more information, see the [AG2 documentation](https://docs.ag2.ai).