# AG2 + Gemini V2 Client Example

Author: [Your Name]

This notebook demonstrates the **Gemini V2 Client** (`api_type: "gemini_v2"`), which implements the ModelClientV2 protocol and returns rich `UnifiedResponse` objects with typed content blocks.

## Key Features of Gemini V2 Client

- **Rich Content Support**: Access reasoning blocks, multimodal content (images, audio, video), and tool calls
- **Provider Agnostic**: Unified format compatible with other V2 clients (OpenAI, Anthropic, Bedrock)
- **Type Safety**: Typed content blocks with Pydantic validation
- **Direct Property Access**: Use `response.text`, `response.reasoning`, etc. instead of parsing nested structures
- **Thinking Config Support**: Full support for Gemini's thinking features (`thinking_budget`, `thinking_level`, `include_thoughts`)
- **Structured Outputs**: Support for Pydantic models and JSON schemas

## Installation

```sh
pip install ag2[gemini]
```

## Why Use V2 Client?

The V2 client (`api_type: "gemini_v2"`) provides several advantages over the V1 client (`api_type: "google"`):

1. **Rich Content Access**: Direct access to reasoning blocks, multimodal content, and citations
2. **Unified Interface**: Same response format across all providers (OpenAI, Anthropic, Gemini, Bedrock)
3. **Forward Compatible**: Handles new content types without code changes
4. **Better Developer Experience**: Type-safe content blocks with direct property access
5. **Full Thinking Support**: Complete support for Gemini 3 thinking features

Reference: [ModelClientV2 Migration Guide](https://github.com/microsoft/autogen/blob/main/autogen/llm_clients/MIGRATION_TO_V2.md)

In [7]:
import os

from dotenv import load_dotenv
from pydantic import BaseModel

from autogen import ConversableAgent, GroupChat, GroupChatManager, LLMConfig
from autogen.llm_clients import GeminiV2Client, UnifiedResponse

load_dotenv()

api_key = os.getenv("GEMINI_API_KEY")
if not api_key:
    raise RuntimeError("GEMINI_API_KEY is not set. Please set it in your environment or .env file.")

print("Libraries imported successfully!")

Libraries imported successfully!


## 1. Basic Usage: Direct Client Access

The Gemini V2 client can be used directly to get rich `UnifiedResponse` objects:

In [8]:
# Create Gemini V2 client directly
client = GeminiV2Client(api_key=api_key)

# Create a completion
response = client.create({
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Say 'Hello' in one word."}],
})

# Verify it's a UnifiedResponse
print(f"Response type: {type(response)}")
print(f"Is UnifiedResponse: {isinstance(response, UnifiedResponse)}")
print(f"Provider: {response.provider}")
print(f"Model: {response.model}")
print(f"Text: {response.text}")
print(f"Usage: {response.usage}")
print(f"Cost: ${response.cost:.6f}")

Response type: <class 'autogen.llm_clients.models.unified_response.UnifiedResponse'>
Is UnifiedResponse: True
Provider: gemini
Model: gemini-2.5-flash
Text: Hello
Usage: {'prompt_tokens': 9, 'completion_tokens': 1, 'total_tokens': 10}
Cost: $0.000005


## 2. Accessing Rich Content Blocks

The V2 client preserves all content types in structured blocks:

In [9]:
# Get response with rich content
response = client.create({
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Explain quantum computing in 2 sentences."}],
})

# Access text content directly
print("=== Text Content ===")
print(response.text)
print()

# Access individual messages
print("=== Messages ===")
for idx, message in enumerate(response.messages):
    print(f"Message {idx} (role: {message.role}):")
    for content_block in message.content:
        print(f"  - Type: {content_block.type}, Content: {str(content_block)[:100]}...")
    print()

# Get content by type
text_blocks = response.get_content_by_type("text")
print(f"Number of text blocks: {len(text_blocks)}")

=== Text Content ===
Quantum computing harnesses quantum-mechanical phenomena like superposition and entanglement to allow "qubits" to exist in multiple states simultaneously, unlike classical bits. This enables them to process vast amounts of information in parallel, tackling certain complex problems impossible for traditional computers.

=== Messages ===
Message 0 (role: UserRoleEnum.ASSISTANT):
  - Type: ContentType.TEXT, Content: type=<ContentType.TEXT: 'text'> extra={} text='Quantum computing harnesses quantum-mechanical phenom...

Number of text blocks: 1
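
To make the filtering step concrete, here is a minimal sketch of how a `get_content_by_type`-style lookup could behave. `TextBlock`, `ReasoningBlock`, and `blocks_of_type` are hypothetical stand-ins written with dataclasses; they are not AG2's actual content classes, which may differ in fields and validation.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """Hypothetical stand-in for a typed text block."""
    text: str
    type: str = "text"


@dataclass
class ReasoningBlock:
    """Hypothetical stand-in for a typed reasoning block."""
    reasoning: str
    type: str = "reasoning"


def blocks_of_type(blocks, wanted):
    """Return the blocks whose `type` field equals `wanted`, preserving order."""
    return [block for block in blocks if block.type == wanted]


content = [
    ReasoningBlock(reasoning="Compare qubits with classical bits."),
    TextBlock(text="Quantum computing uses superposition."),
    TextBlock(text="It speeds up some problems."),
]

text_blocks = blocks_of_type(content, "text")
joined_text = " ".join(block.text for block in text_blocks)
print(len(text_blocks), joined_text)
```

Filtering on a `type` tag rather than on the Python class is what would let unknown block types pass through untouched: code that asks only for `"text"` simply never sees them.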


## 3. Thinking Config Support (Gemini 3 Models)

The V2 client fully supports Gemini's thinking features. This is especially powerful with Gemini 3 models:

In [None]:
# Example: Using thinking_level with Gemini 3 Pro
prompt = """You are playing the 20 question game. You know that what you are looking for
is an aquatic mammal that doesn't live in the sea, is venomous and that's
smaller than a cat. What could that be and how could you make sure?
"""

response = client.create({
    "model": "gemini-3-pro-preview",
    "messages": [{"role": "user", "content": prompt}],
    "thinking_level": "High",  # Use thinking_level for Gemini 3 Pro
    "include_thoughts": True,  # Include thought summaries in response
})

print("=== Response with Thinking ===")
print(response.text)
print()

# Access reasoning blocks if present
reasoning_blocks = response.get_content_by_type("reasoning")
if reasoning_blocks:
    print("=== Reasoning Blocks ===")
    for reasoning in reasoning_blocks:
        print(f"Reasoning: {reasoning.reasoning[:200]}...")
else:
    print("No reasoning blocks found (thoughts may be included in text)")

In [None]:
# Example: Using thinking_budget with Gemini 2.5 Flash
response = client.create({
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": prompt}],
    "thinking_budget": 4096,  # Token budget for thinking
})

print("=== Response with Thinking Budget ===")
print(response.text)
print(f"\nTokens used: {response.usage.get('total_tokens', 0)}")
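
The two cells above use different knobs: `thinking_level` for Gemini 3 Pro and `thinking_budget` for Gemini 2.5 Flash. The sketch below shows one way to pick the right key from the model name. `thinking_params` is a hypothetical helper, and the `gemini-3` prefix check is an assumption for illustration, not a rule taken from the client.

```python
def thinking_params(model, level="High", budget=4096, include_thoughts=False):
    """Build the thinking-related request keys for a given model name."""
    params = {}
    if model.startswith("gemini-3"):
        # Assumption: Gemini 3 models take a qualitative thinking_level.
        params["thinking_level"] = level
    else:
        # Assumption: other models (e.g. Gemini 2.5) take a token budget instead.
        params["thinking_budget"] = budget
    if include_thoughts:
        params["include_thoughts"] = True
    return params


request = {
    "model": "gemini-3-pro-preview",
    "messages": [{"role": "user", "content": "Name a venomous aquatic mammal."}],
    **thinking_params("gemini-3-pro-preview", include_thoughts=True),
}
print(request["thinking_level"], request["include_thoughts"])
```

The resulting dictionary has the same shape as the ones passed to `client.create(...)` above, so the helper only centralizes the choice of key.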

## 4. Structured Outputs with Pydantic Models

The V2 client supports structured outputs using Pydantic models:

In [None]:
# Define a Pydantic model for structured output
class Answer(BaseModel):
    answer: str
    confidence: float
    reasoning: str | None = None


# Create client with response format
structured_client = GeminiV2Client(api_key=api_key, response_format=Answer)

# Get structured response
response = structured_client.create({
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "What is 2+2? Answer with confidence score."}],
})

print("=== Structured Response ===")
print(response.text)
print()

# The response text contains JSON matching the schema
import json

try:
    structured_data = json.loads(response.text)
    print("Parsed structured data:")
    print(json.dumps(structured_data, indent=2))
except json.JSONDecodeError:
    print("Response is not valid JSON")
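
Parsing with `json.loads` confirms only that the text is JSON, not that it matches the `Answer` schema. The sketch below adds a field check using the standard library alone. `ANSWER_FIELDS` and `parse_answer` are hypothetical names introduced here. Where Pydantic v2 is available, `Answer.model_validate_json(response.text)` would likely be the more direct route.

```python
import json

# Expected fields of the Answer model: name -> (accepted types, required?)
ANSWER_FIELDS = {
    "answer": ((str,), True),
    "confidence": ((int, float), True),
    "reasoning": ((str, type(None)), False),
}


def parse_answer(raw_text):
    """Parse JSON text and check it against ANSWER_FIELDS; raise ValueError on mismatch."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, (types, required) in ANSWER_FIELDS.items():
        if name not in data:
            if required:
                raise ValueError(f"missing field: {name}")
            continue
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, types):
            raise ValueError(f"wrong type for field: {name}")
    return data


# Hand-written sample text, not real model output.
parsed = parse_answer('{"answer": "4", "confidence": 0.99}')
print(parsed)
```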

## 5. Tool/Function Calling

The V2 client preserves tool calls as `ToolCallContent` blocks:

In [None]:
# Define a tool
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit",
                    },
                },
                "required": ["location"],
            },
        },
    }
]

# Create request with tools
response = client.create({
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "What's the weather in San Francisco?"}],
    "tools": tools,
})

print("=== Response with Tool Calls ===")
print(f"Text: {response.text}")
print()

# Access tool calls
tool_calls = response.messages[0].get_tool_calls()
if tool_calls:
    print("=== Tool Calls ===")
    for tool_call in tool_calls:
        print(f"Tool: {tool_call.name}")
        print(f"ID: {tool_call.id}")
        print(f"Arguments: {tool_call.arguments}")
        print()
else:
    print("No tool calls in response")
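
Once a tool call has been read from the response, it still has to be executed locally. The sketch below shows one possible dispatch step. `get_weather`, `TOOL_REGISTRY`, and `run_tool` are hypothetical, the weather data is canned, and the handling of string versus dict arguments is an assumption about how `tool_call.arguments` may arrive.

```python
import json


def get_weather(location, unit="celsius"):
    """Hypothetical local tool; returns canned data instead of calling a weather API."""
    return {"location": location, "temperature": 18, "unit": unit}


# Map tool names (as declared in the `tools` schema) to local callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool(name, arguments):
    """Look up a tool by name and call it with the model-supplied arguments."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    # Assumption: arguments may be a JSON string or an already-parsed dict.
    kwargs = json.loads(arguments) if isinstance(arguments, str) else dict(arguments)
    return TOOL_REGISTRY[name](**kwargs)


result = run_tool("get_weather", '{"location": "San Francisco, CA"}')
print(result)
```

With a real response, the call would look like `run_tool(tool_call.name, tool_call.arguments)` inside the loop above.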

## 6. Using with LLMConfig (Recommended for Agents)

The V2 client can be used with `LLMConfig` for agent integration:

In [3]:
# Configure LLM with V2 client
llm_config_v2 = LLMConfig(
    config_list=[
        {
            "api_type": "gemini_v2",  # Use gemini_v2 for V2 client
            "model": "gemini-2.5-flash",
            "api_key": api_key,
            "temperature": 0.7,
            "max_tokens": 500,
        }
    ]
)

# Create agent with V2 client
agent_v2 = ConversableAgent(
    name="assistant_v2",
    llm_config=llm_config_v2,
    system_message="You are a helpful assistant.",
)

# Use the agent
response = agent_v2.generate_reply(messages=[{"role": "user", "content": "Explain machine learning in one sentence."}])

print("=== Agent Response ===")
print(response)

>>>>>>>> USING AUTO REPLY...
=== Agent Response ===
Machine learning is a field where computers learn from data to identify patterns, make decisions, or predictions without


## 7. V1 vs V2 Comparison

Compare the V1 and V2 client responses:

In [None]:
# V1 Client (legacy)
llm_config_v1 = LLMConfig(
    config_list=[
        {
            "api_type": "google",  # V1 client
            "model": "gemini-2.5-flash",
            "api_key": api_key,
        }
    ]
)

agent_v1 = ConversableAgent(
    name="assistant_v1",
    llm_config=llm_config_v1,
)

# V2 Client
agent_v2 = ConversableAgent(
    name="assistant_v2",
    llm_config=llm_config_v2,
)

test_message = "What is the capital of France?"

print("=== V1 Client Response ===")
response_v1 = agent_v1.generate_reply(messages=[{"role": "user", "content": test_message}])
print(f"Type: {type(response_v1)}")
print(f"Response: {response_v1}")
print()

print("=== V2 Client Response ===")
response_v2 = agent_v2.generate_reply(messages=[{"role": "user", "content": test_message}])
print(f"Type: {type(response_v2)}")
print(f"Response: {response_v2}")
print()

print("Note: Both responses work with agents, but V2 provides richer content access when using the client directly.")

## 8. Group Chat with V2 Client

The V2 client works seamlessly in group chats:

In [None]:
# Create agents with V2 client
planner = ConversableAgent(
    name="planner",
    llm_config=LLMConfig(
        config_list=[
            {
                "api_type": "gemini_v2",
                "model": "gemini-2.5-flash",
                "api_key": api_key,
            }
        ]
    ),
    system_message="You are a planning assistant. Create detailed plans.",
    description="Creates plans",
)

reviewer = ConversableAgent(
    name="reviewer",
    llm_config=LLMConfig(
        config_list=[
            {
                "api_type": "gemini_v2",
                "model": "gemini-2.5-flash",
                "api_key": api_key,
            }
        ]
    ),
    system_message="You are a review assistant. Provide constructive feedback.",
    description="Reviews plans",
)

executor = ConversableAgent(
    name="executor",
    llm_config=LLMConfig(
        config_list=[
            {
                "api_type": "gemini_v2",
                "model": "gemini-2.5-flash",
                "api_key": api_key,
            }
        ]
    ),
    system_message="You are an execution assistant. Implement plans.",
    description="Executes plans",
)

# Create group chat
groupchat = GroupChat(
    agents=[planner, reviewer, executor],
    speaker_selection_method="auto",
    messages=[],
)

manager = GroupChatManager(
    name="manager",
    groupchat=groupchat,
    llm_config=llm_config_v2,
    # is_termination_msg=lambda x: "DONE" in (x.get("content", "") or "").upper(),
)

# Start conversation
chat_result = planner.initiate_chat(
    recipient=manager,
    message="Create a plan for organizing a small team meeting. Say DONE when finished.",
    max_turns=3,
)

print("=== Group Chat History ===")
for msg in chat_result.chat_history:
    # content can be None (e.g. for tool-call messages), so fall back to an empty string
    print(f"{msg.get('name', 'unknown')}: {(msg.get('content') or '')[:100]}...")
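
The three agents above repeat the same config entry. A small helper can build it once. `gemini_v2_entry` is a hypothetical name, and the keys it sets are the ones already used in this notebook.

```python
def gemini_v2_entry(api_key, model="gemini-2.5-flash", **overrides):
    """Return one config_list entry for the Gemini V2 client."""
    entry = {"api_type": "gemini_v2", "model": model, "api_key": api_key}
    # Optional per-agent settings such as temperature or max_tokens.
    entry.update(overrides)
    return entry


# Placeholder key for illustration; in the notebook it comes from GEMINI_API_KEY.
shared_entry = gemini_v2_entry("YOUR_API_KEY")
tuned_entry = gemini_v2_entry("YOUR_API_KEY", temperature=0.7, max_tokens=500)
print(shared_entry)
```

Each agent could then be given `LLMConfig(config_list=[gemini_v2_entry(api_key)])`, which keeps the model name in one place.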

## 9. Agent with Structured Outputs

Combine the V2 client with structured outputs by setting `response_format` in a single agent's config:

In [6]:
# Define structured output model
class Plan(BaseModel):
    title: str
    steps: list[str]
    estimated_time: str
    resources_needed: list[str]


# Create agent with structured output
structured_agent = ConversableAgent(
    name="structured_planner",
    llm_config=LLMConfig(
        config_list=[
            {
                "api_type": "gemini_v2",
                "model": "gemini-2.5-flash",
                "api_key": os.environ.get("GEMINI_API_KEY"),
                # "temperature": 0,
                "response_format": Plan,
            }
        ],
    ),
    system_message="You are a planning assistant. Always respond with structured plans.",
)

# Get structured response. generate_reply() returns the reply itself;
# run() returns a RunResponse object, which json.loads() cannot parse.
reply = structured_agent.generate_reply(
    messages=[{"role": "user", "content": "Create a plan for learning Python in 30 days."}]
)
# The reply may be a plain string or a message dict with a "content" key
reply_text = reply.get("content") if isinstance(reply, dict) else reply

print("=== Structured Plan ===")
print(reply_text)
print()

# Parse the structured response
import json

try:
    plan_data = json.loads(reply_text)
    print("=== Parsed Plan ===")
    print(f"Title: {plan_data.get('title')}")
    print(f"Steps: {len(plan_data.get('steps', []))}")
    print(f"Estimated Time: {plan_data.get('estimated_time')}")
except (json.JSONDecodeError, TypeError):
    print("Response is not valid JSON")

## 10. Cost and Usage Tracking

The V2 client provides detailed cost and usage information:

In [None]:
# Get response
response = client.create({
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Explain the theory of relativity."}],
    "max_tokens": 500,
})

# Access usage information
usage = GeminiV2Client.get_usage(response)
cost = client.cost(response)

print("=== Usage Information ===")
print(f"Prompt tokens: {usage['prompt_tokens']}")
print(f"Completion tokens: {usage['completion_tokens']}")
print(f"Total tokens: {usage['total_tokens']}")
print(f"Cost: ${cost:.6f}")
print(f"Model: {usage['model']}")
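
As a rough cross-check on the reported cost, the arithmetic can be reproduced from the token counts. `estimate_cost` is a hypothetical helper, and the per-million-token prices are assumptions for illustration; consult current pricing before relying on them. The usage dict is copied from the output in section 1.

```python
def estimate_cost(usage, input_price_per_million, output_price_per_million):
    """Estimate request cost in USD from token counts and per-million-token prices."""
    prompt_cost = usage["prompt_tokens"] * input_price_per_million / 1_000_000
    completion_cost = usage["completion_tokens"] * output_price_per_million / 1_000_000
    return prompt_cost + completion_cost


# Usage dict copied from the section 1 output.
sample_usage = {"prompt_tokens": 9, "completion_tokens": 1, "total_tokens": 10}

# Assumed prices in USD per million tokens (hypothetical values).
estimated = estimate_cost(
    sample_usage,
    input_price_per_million=0.30,
    output_price_per_million=2.50,
)
print(f"Estimated cost: ${estimated:.6f}")
```

Under these assumed prices the estimate is about 5.2e-6 USD, which rounds to the `$0.000005` shown in section 1.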

## 11. Backward Compatibility: create_v1_compatible()

The V2 client provides backward compatibility with V1 code:

In [None]:
# Get V1-compatible response
v1_response = client.create_v1_compatible({
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Hello!"}],
})

print("=== V1-Compatible Response ===")
print(f"Type: {type(v1_response)}")
print(f"Keys: {list(v1_response.keys())}")
print(f"Model: {v1_response.get('model')}")
print(f"Choices: {len(v1_response.get('choices', []))}")
print(f"Usage: {v1_response.get('usage')}")
print(f"Cost: {v1_response.get('cost')}")
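
Legacy code usually reads the reply text out of the `choices` list. The sketch below assumes each choice follows the OpenAI chat-completion layout (`choices[i]["message"]["content"]`). The notebook confirms only the top-level keys, so treat the inner layout as an assumption. `first_choice_text` and `sample_response` are hypothetical.

```python
def first_choice_text(v1_response):
    """Return the text of the first choice, or None if there are no choices."""
    choices = v1_response.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message") or {}
    return message.get("content")


# Hand-written example in the assumed shape; not real client output.
sample_response = {
    "model": "gemini-2.5-flash",
    "choices": [{"index": 0, "message": {"role": "assistant", "content": "Hello!"}}],
    "usage": {"prompt_tokens": 2, "completion_tokens": 2, "total_tokens": 4},
    "cost": 0.0,
}
print(first_choice_text(sample_response))
```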

## Summary

The Gemini V2 client (`api_type: "gemini_v2"`) provides:

- ✅ **Rich Content Access**: Direct access to reasoning blocks, multimodal content, and tool calls
- ✅ **Unified Interface**: Same response format across all providers
- ✅ **Full Thinking Support**: Complete support for Gemini 3 thinking features
- ✅ **Structured Outputs**: Support for Pydantic models and JSON schemas
- ✅ **Backward Compatible**: Works with existing agent code via `create_v1_compatible()`
- ✅ **Type Safe**: Typed content blocks with Pydantic validation

## Next Steps

- Explore multimodal content (images, audio, video)
- Try different Gemini models (2.5 Flash, 2.5 Pro, 3 Pro Preview)
- Experiment with thinking configurations
- Integrate with other V2 clients in group chats

In [1]:
import os

from pydantic import BaseModel

from autogen import ConversableAgent, LLMConfig


class Extra(BaseModel):
    notes: str


class Output(BaseModel):
    is_good: bool
    extra: dict[str, Extra]


llm_config = LLMConfig(
    config_list={
        "api_type": "gemini_v2",
        "model": "gemini-2.5-flash",
        "api_key": os.environ.get("GEMINI_API_KEY"),
        "temperature": 0,
        "response_format": Output,
    },
)

bot = ConversableAgent(
    name="bot",
    llm_config=llm_config,
    system_message="You are a smart assistant.",
)

response = bot.run(
    message="Think about the weather in paris, and return any information you find.",
    max_turns=1
)
result = response.process()

user (to bot):

Think about the weather in paris, and return any information you find.

--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...


ValueError: additionalProperties is not supported in the Gemini API.
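
The error most likely comes from the `extra: dict[str, Extra]` field: Pydantic describes an open-ended mapping with `additionalProperties` in its JSON Schema, which the error message says the Gemini API does not accept. The sketch below scans a schema dict for that key before a request is sent. `find_additional_properties` is a hypothetical helper, and `output_schema` is a hand-written approximation of the schema, not the exact output of `Output.model_json_schema()`.

```python
def find_additional_properties(schema, path="$"):
    """Return the path of every `additionalProperties` key in a JSON-schema dict."""
    hits = []
    if isinstance(schema, dict):
        for key, value in schema.items():
            child_path = f"{path}.{key}"
            if key == "additionalProperties":
                hits.append(child_path)
            # Recurse into nested schemas (properties, items, $defs, ...).
            hits.extend(find_additional_properties(value, child_path))
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            hits.extend(find_additional_properties(item, f"{path}[{index}]"))
    return hits


# Hand-written approximation of the schema for the Output model above.
output_schema = {
    "type": "object",
    "properties": {
        "is_good": {"type": "boolean"},
        "extra": {
            "type": "object",
            "additionalProperties": {"$ref": "#/$defs/Extra"},
        },
    },
    "required": ["is_good", "extra"],
}

offending_paths = find_additional_properties(output_schema)
print(offending_paths)
```

One possible workaround, untested here, is to model `extra` as a list of entries that each carry their own key field, so that the schema uses `items` instead of `additionalProperties`.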

In [2]:
! pip show google-genai

Name: google-genai
Version: 1.60.0
Summary: GenAI Python SDK
Home-page: https://github.com/googleapis/python-genai
Author: 
Author-email: Google LLC <googleapis-packages@google.com>
License-Expression: Apache-2.0
Location: /Users/priyanshu/Documents/GitHub/ag2/.venv/lib/python3.14/site-packages
Requires: anyio, distro, google-auth, httpx, pydantic, requests, sniffio, tenacity, typing-extensions, websockets
Required-by: google-cloud-aiplatform
