# Anthropic V2 Client with AG2 Agents

Author: [Priyanshu Deshmukh](https://github.com/priyansh4320)

This notebook demonstrates how to use the **Anthropic V2 Client** (`api_type: "anthropic_v2"`) with AG2's agent system. The V2 client provides:

- **Rich UnifiedResponse objects**: Typed content blocks (TextContent, ReasoningContent, ToolCallContent, etc.)
- **Structured Outputs**: Guaranteed schema-compliant JSON responses
- **Strict Tool Use**: Type-safe function calls with guaranteed schema validation
- **Vision Support**: Full multimodal capabilities with image input
- **Forward Compatibility**: GenericContent handles unknown future content types

## What is Anthropic V2 Client?

The V2 client implements both `ModelClient` and `ModelClientV2` protocols, returning rich `UnifiedResponse` objects with:

- **Typed content blocks**: `TextContent`, `ReasoningContent`, `ToolCallContent`, `CitationContent`
- **Structured outputs**: Native support for Pydantic models and JSON schemas
- **Strict tools**: Guaranteed schema validation for tool inputs
- **Rich metadata**: Full reasoning blocks, citations, and tool execution details
- **Type safety**: Pydantic validation for all response data
- **Cost tracking**: Automatic per-response cost calculation

## Installation

h
pip install ag2[anthropic]
```

## Setup

In [1]:
import os
import textwrap

from dotenv import load_dotenv
from pydantic import BaseModel

from autogen import AssistantAgent, UserProxyAgent
from autogen.io.run_response import Cost

load_dotenv()


# Helper function to extract total cost from ChatResult.cost dict
def get_total_cost(cost_dict):
    """Extract total cost from ChatResult.cost dict structure."""
    total = 0.0
    for usage_type in cost_dict.values():
        if isinstance(usage_type, dict):
            for model_usage in usage_type.values():
                if isinstance(model_usage, dict) and "cost" in model_usage:
                    total += model_usage["cost"]
    return total


# Helper function to extract cost from run response
def get_total_cost_from_run(run_response_cost):
    """Extract total cost from run response object."""
    if isinstance(run_response_cost, Cost):
        return run_response_cost.usage_including_cached_inference.total_cost
    return 0.0


print("✅ Environment configured")

python-dotenv could not parse statement starting at line 3
python-dotenv could not parse statement starting at line 4
python-dotenv could not parse statement starting at line 5
python-dotenv could not parse statement starting at line 8
python-dotenv could not parse statement starting at line 9
python-dotenv could not parse statement starting at line 10
python-dotenv could not parse statement starting at line 11
python-dotenv could not parse statement starting at line 12
python-dotenv could not parse statement starting at line 38


✅ Environment configured


# Part 1: Structured Outputs

Anthropic's structured outputs feature provides two powerful modes:

1. **JSON Outputs** (`response_format`): Get validated JSON responses matching a specific schema
2. **Strict Tool Use** (`strict: true`): Guaranteed schema validation for tool inputs

### Key Benefits

- **Always Valid**: No more `JSON.parse()` errors
- **Type Safe**: Guaranteed field types and required fields
- **Reliable**: No retries needed for schema violations
- **Dual Modes**: JSON for data extraction, strict tools for agentic workflows

### Requirements

- Claude Sonnet 4.5 (`claude-sonnet-4-5`) or Claude Opus 4.1 (`claude-opus-4-1`)
- Anthropic SDK >= 0.74.1
- Beta header: `structured-outputs-2025-11-13` (automatically applied by AG2)

## Example 1: JSON Structured Outputs with Pydantic Models

The most common use case is extracting structured data from unstructured text. We'll use Pydantic models to define our schema and get validated JSON responses.

### Use Case: Mathematical Reasoning

Let's create an agent that solves math problems and returns structured step-by-step reasoning.

In [None]:
# Define the structured output schema using Pydantic
class Step(BaseModel):
    """A single step in mathematical reasoning."""

    explanation: str
    output: str


class MathReasoning(BaseModel):
    """Structured output for mathematical problem solving."""

    steps: list[Step]
    final_answer: str

    def format(self) -> str:
        """Format the response for display."""
        steps_output = "\n".join(
            f"Step {i + 1}: {step.explanation}\n  Output: {step.output}" for i, step in enumerate(self.steps)
        )
        return f"{steps_output}\n\nFinal Answer: {self.final_answer}"


# Configure LLM with structured output using V2 client
llm_config = {
    "config_list": [
        {
            "model": "claude-sonnet-4-5",
            "api_key": os.getenv("ANTHROPIC_API_KEY"),
            "api_type": "anthropic_v2",  # <-- Use V2 client
            "response_format": MathReasoning,  # Enable structured outputs
        }
    ],
}

# Create agents
user_proxy = UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=0,
    code_execution_config=False,
)

math_assistant = AssistantAgent(
    name="MathAssistant",
    system_message="You are a math tutor. Solve problems step by step.",
    llm_config=llm_config,
)

print("✅ Example 1 configured: Math reasoning with structured outputs")

In [None]:
# Ask the assistant to solve a math problem
chat_result = user_proxy.run(
    math_assistant,
    message="Solve the equation: 3x + 7 = 22",
    max_turns=1,
)

chat_result.process()

### How It Works

1. **Schema Definition**: Pydantic models define the expected structure
2. **Beta API**: AG2 automatically uses `beta.messages.parse()` for Pydantic models
3. **Constrained Decoding**: Claude generates output that strictly follows the schema
4. **FormatterProtocol**: If your model has a `format()` method, it's automatically called

**Benefits**:
- ✅ No JSON parsing errors
- ✅ Guaranteed schema compliance
- ✅ Type-safe field access
- ✅ Custom formatting support

## Example 2: Strict Tool Use for Type-Safe Function Calls

Strict tool use ensures that Claude's tool inputs exactly match your schema. This is critical for production agentic systems where invalid parameters can break workflows.

### Use Case: Weather API with Validated Inputs

Without strict mode, Claude might return `"celsius"` as a string when you expect an enum, or `"2"` instead of `2`. Strict mode guarantees correct types.

In [None]:
# Define a tool function
def get_weather(location: str, unit: str = "celsius") -> str:
    """Get the weather for a location.

    Args:
        location: The city and state, e.g. San Francisco, CA
        unit: Temperature unit (celsius or fahrenheit)
    """
    # In a real application, this would call a weather API
    return f"Weather in {location}: 22°{unit.upper()[0]}, partly cloudy"


# Configure LLM with strict tool using V2 client
llm_config_strict = {
    "config_list": [
        {
            "model": "claude-sonnet-4-5",
            "api_key": os.getenv("ANTHROPIC_API_KEY"),
            "api_type": "anthropic_v2",  # <-- Use V2 client
        }
    ],
    "functions": [
        {
            "name": "get_weather",
            "description": "Get the weather for a location",
            "strict": True,  # Enable strict schema validation ✨
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "Temperature unit"},
                },
                "required": ["location"],
            },
        }
    ],
}

# Create agents
weather_assistant = AssistantAgent(
    name="WeatherAssistant",
    system_message="You help users get weather information. Use the get_weather function.",
    llm_config=llm_config_strict,
)

user_proxy_2 = UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
    code_execution_config=False,
)

# Register function on both agents
weather_assistant.register_function({"get_weather": get_weather})
user_proxy_2.register_function({"get_weather": get_weather})

print("✅ Example 2 configured: Strict tool use for weather queries")

In [None]:
# Query the weather
chat_result = user_proxy_2.initiate_chat(
    weather_assistant,
    message="What's the weather in Boston, MA?",
    max_turns=2,
)

# Verify tool call had strict typing
print("\n" + "=" * 60)
print("TOOL CALL VERIFICATION:")
print("=" * 60)

import json

for message in chat_result.chat_history:
    if message.get("tool_calls"):
        tool_call = message["tool_calls"][0]
        args = json.loads(tool_call["function"]["arguments"])
        print(f"Function: {tool_call['function']['name']}")
        print(f"Arguments: {args}")
        print(f"✅ location type: {type(args['location']).__name__}")
        if "unit" in args:
            print(f"✅ unit value: {args['unit']} (valid enum)")
        break

## Example 3: Combined JSON Outputs + Strict Tools

The most powerful pattern is combining both features: use strict tools for calculations/actions, then return structured JSON for the final result.

### Use Case: Math Calculator Agent

The agent uses strict tools to perform calculations (guaranteed correct types), then provides a structured summary of the work.

In [None]:
# Define calculator tool
def calculate(operation: str, a: float, b: float) -> float:
    """Perform a calculation.

    Args:
        operation: The operation to perform (add, subtract, multiply, divide)
        a: First number
        b: Second number
    """
    if operation == "add":
        return a + b
    elif operation == "subtract":
        return a - b
    elif operation == "multiply":
        return a * b
    elif operation == "divide":
        return a / b if b != 0 else 0
    return 0


# Result model for structured output
class CalculationResult(BaseModel):
    """Structured output for calculation results."""

    problem: str
    steps: list[str]
    result: float
    verification: str


# Configure with BOTH features using V2 client
llm_config_combined = {
    "config_list": [
        {
            "model": "claude-sonnet-4-5",
            "api_key": os.getenv("ANTHROPIC_API_KEY"),
            "api_type": "anthropic_v2",  # <-- Use V2 client
            "response_format": CalculationResult,  # 1. Structured JSON output
        }
    ],
    "functions": [
        {  # 2. Strict tool validation
            "name": "calculate",
            "description": "Perform arithmetic calculation",
            "strict": True,  # Enable strict mode
            "parameters": {
                "type": "object",
                "properties": {
                    "operation": {"type": "string", "enum": ["add", "subtract", "multiply", "divide"]},
                    "a": {"type": "number"},
                    "b": {"type": "number"},
                },
                "required": ["operation", "a", "b"],
            },
        }
    ],
}

# Create agents
calc_assistant = AssistantAgent(
    name="MathAssistant",
    system_message="You solve math problems using tools and provide structured results.",
    llm_config=llm_config_combined,
)

user_proxy_3 = UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=5,
    code_execution_config=False,
)

# Register function on both agents
calc_assistant.register_function({"calculate": calculate})
user_proxy_3.register_function({"calculate": calculate})

print("✅ Example 3 configured: Combined strict tools + structured output")

In [None]:
chat_result = user_proxy_3.run(
    calc_assistant,
    message="add 3 and 555",
    max_turns=2,
)
chat_result.process()

# Part 2: Vision and Image Input

The V2 client also supports full multimodal capabilities with image input.

## Example 4: Simple Image Description

Using formal image input format to reduce hallucination.

In [2]:
# Configure LLM to use V2 client for vision
llm_config_vision = {
    "config_list": [
        {
            "api_type": "anthropic_v2",  # <-- Key: use V2 client architecture
            "model": "claude-3-5-haiku-20241022",  # Vision-capable model
            "api_key": os.getenv("ANTHROPIC_API_KEY"),
        }
    ],
    "temperature": 0.3,
}

# Create vision assistant
vision_assistant = AssistantAgent(
    name="VisionBot",
    llm_config=llm_config_vision,
    system_message=textwrap.dedent("""
        You are an AI assistant with vision capabilities.
        You can analyze images and provide detailed, accurate descriptions.
    """).strip(),
)

# Create user proxy
user_proxy_vision = UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=0,
    code_execution_config=False,
)

# Test image URL
IMAGE_URL = "https://upload.wikimedia.org/wikipedia/commons/3/3b/BlkStdSchnauzer2.jpg"

print("✅ Vision assistant with V2 client created")

✅ Vision assistant with V2 client created


In [3]:
# Formal image input format (recommended)
message_with_image = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image in one sentence."},
        {"type": "image_url", "image_url": {"url": IMAGE_URL}},
    ],
}

# Initiate chat with image
chat_result = user_proxy_vision.initiate_chat(
    vision_assistant, message=message_with_image, max_turns=1, summary_method="last_msg"
)

print("\n=== Response ===")
print(chat_result.summary)
print(f"\nCost: ${get_total_cost(chat_result.cost):.4f}")

[33mUser[0m (to VisionBot):

Describe this image in one sentence.
<image>

--------------------------------------------------------------------------------
[33mVisionBot[0m (to User):

A black, shaggy-coated Schnauzer dog stands alertly in a grassy field with an upright, sturdy posture.

--------------------------------------------------------------------------------
[31m
>>>>>>>> TERMINATING RUN (dcec32e4-c50c-4d89-a384-7857bdf09444): Maximum turns (1) reached[0m

=== Response ===
A black, shaggy-coated Schnauzer dog stands alertly in a grassy field with an upright, sturdy posture.

Cost: $0.0007


## Example 5: Detailed Image Analysis

In [None]:
detailed_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Analyze this image in detail. What breed is this dog? What are its characteristics?"},
        {"type": "image_url", "image_url": {"url": IMAGE_URL}},
    ],
}

chat_result = user_proxy_vision.initiate_chat(
    vision_assistant,
    message=detailed_message,
    max_turns=1,
    clear_history=True,  # Start fresh conversation
)

print(chat_result.summary)
print(f"\nCost: ${get_total_cost(chat_result.cost):.4f}")

## Summary

### Key Benefits of Anthropic V2 Client

1. **Structured Outputs**: Guaranteed schema-compliant JSON responses
2. **Strict Tool Use**: Type-safe function calls with guaranteed validation
3. **Vision Support**: Full multimodal capabilities with image input
4. **Rich Response Data**: Access to typed content blocks (reasoning, citations, etc.)
5. **Cost Tracking**: Automatic per-response cost calculation
6. **Type Safety**: Pydantic validation for all response data
7. **Forward Compatible**: GenericContent handles unknown future types

### Usage Pattern

on
# Simple: Just change api_type
```python
llm_config = {
    "config_list": [{
        "api_type": "anthropic_v2",  # <-- That's it!
        "model": "claude-sonnet-4-5",
        "api_key": "...",
        "response_format": YourPydanticModel,  # Optional: for structured outputs
    }]
}

assistant = AssistantAgent(llm_config=llm_config)
# Everything works as before, but with rich UnifiedResponse internally
```

### When to Use V2 Client

- ✅ Need structured outputs with guaranteed schema compliance
- ✅ Want strict tool validation for type-safe function calls
- ✅ Working with vision/multimodal models
- ✅ Need access to reasoning blocks
- ✅ Require rich metadata and citations
- ✅ Building systems that need forward compatibility

### Migration from Standard Client

No code changes needed! Just update `api_type`:


# Before
{"model": "claude-3-5-sonnet-20241022", "api_key": "..."}

# After  
{"api_type": "anthropic_v2", "model": "claude-sonnet-4-5", "api_key": "..."}
```