[FEATURE] Support for Gemini Built-in Tools (GoogleSearch, CodeExecution, etc.) #1049

@pshiko

Problem Statement

The Gemini API provides built-in tools such as GoogleSearch, CodeExecution, ComputerUse, UrlContext, and FileSearch, as documented at https://ai.google.dev/gemini-api/docs/google-search and https://ai.google.dev/api/caching#Tool. These tools are distinct from standard FunctionDeclaration-based tools and are defined in separate fields of the Gemini API's Tool type.

Currently, the Strands SDK's tool interface is designed around standard function declarations and cannot accommodate these Gemini-specific built-in tools. GeminiConfig has a params field for passing Gemini-specific parameters, but tools are handled through Strands' standard tool interface, so Gemini's built-in tools cannot be passed through the params mechanism.

This limitation prevents users from using Gemini's built-in capabilities, such as:

  • Real-time web search with grounding and citations (GoogleSearch)
  • Code execution for computational tasks (CodeExecution)
  • Computer control capabilities (ComputerUse)
  • URL-based context retrieval (UrlContext)
  • File search functionality (FileSearch)

Proposed Solution

Add support for Gemini's built-in tools by providing a mechanism to pass genai.types.Tool objects that contain non-FunctionDeclaration tools (such as GoogleSearch, CodeExecution, etc.) to the Gemini model provider.

The solution should:

  • Be Gemini-specific (not affect the core tool interface or other model providers)
  • Support passing genai.types.Tool objects directly to maintain type safety and compatibility with Gemini API
  • Allow combining standard function calling tools with Gemini built-in tools
  • Clearly distinguish between standard function tools and Gemini-specific built-in tools
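
To make the requirements above concrete, here is a minimal sketch of how a provider could combine both kinds of tools into one request payload. It uses plain dicts in the Gemini REST field naming (functionDeclarations, googleSearch) rather than genai.types.Tool objects so that it runs without any SDK installed. The helper name build_gemini_tools and the assumed shape of the Strands tool spec (name, description, inputSchema["json"]) are illustrative assumptions, not existing Strands API.

```python
from __future__ import annotations

from typing import Any

# Field the Gemini Tool type uses for standard function tools.
FUNCTION_FIELD = "functionDeclarations"


def build_gemini_tools(
    tool_specs: list[dict[str, Any]],
    gemini_tools: list[dict[str, Any]] | None = None,
) -> list[dict[str, Any]]:
    """Hypothetical helper: merge function tool specs with built-in tool entries."""
    tools: list[dict[str, Any]] = []

    # Standard function tools are grouped into a single Tool entry.
    if tool_specs:
        tools.append(
            {
                FUNCTION_FIELD: [
                    {
                        "name": spec["name"],
                        "description": spec["description"],
                        # Assumed spec shape: the JSON schema lives under inputSchema["json"].
                        "parameters": spec["inputSchema"]["json"],
                    }
                    for spec in tool_specs
                ]
            }
        )

    # Built-in tools pass through untouched, but must not carry function
    # declarations, which keeps the two kinds of tools clearly separated.
    for tool in gemini_tools or []:
        if FUNCTION_FIELD in tool:
            raise ValueError("gemini_tools must not contain function declarations")
        tools.append(tool)

    return tools


weather_spec = {
    "name": "get_weather",
    "description": "Look up the weather for a city.",
    "inputSchema": {"json": {"type": "object", "properties": {"city": {"type": "string"}}}},
}
tools = build_gemini_tools([weather_spec], [{"googleSearch": {}}])
```

Rejecting function declarations inside gemini_tools is one possible way to satisfy the "clearly distinguish" requirement; a real implementation might instead merge them or log a warning.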

Use Case

from strands import Agent
from strands.models.gemini import GeminiModel
from google.genai import types

# Create a model with Google Search enabled.
# Note: gemini_tools is the parameter proposed in this issue; it does not exist yet.
model = GeminiModel(
    model_id="gemini-2.5-flash",
    gemini_tools=[
        types.Tool(google_search=types.GoogleSearch())
    ]
)

agent = Agent(model=model)

# The agent can now answer questions with real-time web data and citations
response = agent("Who won Euro 2024?")
# The response will be grounded in recent web search results

Alternative Solutions

Alternative 1: Modify ToolSpec Interface

Approach: Extend the core ToolSpec type to accept genai.types.Tool objects.

Pros:

  • Unified interface for all tool types
  • Consistent across all model providers

Cons:

  • Very broad impact: Affects core tool interface used by all model providers
  • Breaking change risk: Could break existing implementations
  • Provider-specific coupling: Introduces Gemini-specific types into shared interface
  • Maintenance burden: Core tool interface would need to accommodate multiple provider-specific formats

Alternative 2: Magic String Transformation

Approach: Accept specially formatted FunctionDeclaration objects with reserved names (e.g., __gemini_google_search__) and transform them internally to Gemini built-in tools.

Pros:

  • No changes to existing interfaces
  • Works with current tool infrastructure

Cons:

  • Fragile: Relies on magic strings and naming conventions
  • Not future-proof: May not adapt well to new Gemini tool parameters or features
  • Collision risk: Could conflict with actual user-defined function names
  • Hidden behavior: Transformation logic is implicit and hard to discover
  • Poor developer experience: Requires documentation of special naming conventions
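
The collision risk listed above can be illustrated with a small sketch. The function split_tools and the second reserved name are hypothetical; only __gemini_google_search__ comes from the example earlier in this alternative. The built-in tool entries are plain dicts in the Gemini REST field naming.

```python
# Hypothetical mapping from reserved names to built-in tool entries.
RESERVED_NAMES = {
    "__gemini_google_search__": {"googleSearch": {}},
    "__gemini_code_execution__": {"codeExecution": {}},  # assumed name, for illustration
}


def split_tools(tool_names: list) -> tuple:
    """Separate ordinary function names from names treated as built-in tools."""
    function_names = []
    builtin_tools = []
    for name in tool_names:
        if name in RESERVED_NAMES:
            # Any tool with a reserved name is silently rewritten,
            # even if the user defined a real function with that name.
            builtin_tools.append(RESERVED_NAMES[name])
        else:
            function_names.append(name)
    return function_names, builtin_tools


function_names, builtin_tools = split_tools(["get_weather", "__gemini_google_search__"])
```

Nothing in the call site shows that the second name will be dropped from the function list and replaced by a server-side tool, which is the hidden behavior this alternative would introduce.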

Additional Context

Unlike standard function calling, where the SDK executes tools and can track their usage, Gemini's built-in tools are executed server-side on Google's infrastructure. This makes tool usage tracking and observability harder.

The built-in tools don't generate toolUse and toolResult blocks in the same way as standard function declarations. Instead, they:

  • Execute transparently on Google's servers
  • Return results directly in the response
  • May include grounding metadata (for GoogleSearch) but not explicit tool call traces

Potential Solutions:

  1. Document the difference in behavior between standard and built-in tools
  2. Parse grounding metadata where available (e.g., GoogleSearch includes groundingMetadata)
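
As a rough sketch of the second option, the snippet below extracts citations from a response candidate represented as a plain dict. It assumes the REST-style field names groundingMetadata, groundingChunks, groundingSupports, and groundingChunkIndices; the function name extract_citations and the sample data are illustrative only and not taken from a real response.

```python
from __future__ import annotations

from typing import Any


def extract_citations(candidate: dict[str, Any]) -> list[dict[str, Any]]:
    """Pair each grounded text segment with the web sources that support it."""
    metadata = candidate.get("groundingMetadata") or {}
    chunks = metadata.get("groundingChunks", [])
    citations = []

    for support in metadata.get("groundingSupports", []):
        text = support.get("segment", {}).get("text", "")
        sources = []
        for index in support.get("groundingChunkIndices", []):
            # Skip indices that do not point at a known chunk.
            if 0 <= index < len(chunks):
                web = chunks[index].get("web", {})
                sources.append({"title": web.get("title"), "uri": web.get("uri")})
        citations.append({"text": text, "sources": sources})

    return citations


# Made-up candidate, shaped like the assumed response format.
sample_candidate = {
    "groundingMetadata": {
        "groundingChunks": [
            {"web": {"title": "Example source", "uri": "https://example.com/a"}},
        ],
        "groundingSupports": [
            {"segment": {"text": "An example grounded claim."}, "groundingChunkIndices": [0]},
        ],
    }
}
citations = extract_citations(sample_candidate)
```

A provider could attach a list like this to the response metadata so that callers get some visibility into server-side search, even without toolUse and toolResult blocks.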

Metadata


Labels: enhancement (New feature or request)
