Issue Summary
When deploying agents to Vertex AI using ADK's `agent_engines.create()`, environment variables specified in
`AdkApp.env_vars` are not available when toolsets are initialised during the pickle serialisation process. This causes
pickled toolsets to retain configuration values from the deployment environment rather than reading from the runtime
environment.
Impact
This behaviour breaks environment-specific configuration patterns and forces developers to implement complex workarounds
using lazy loading and custom pickle serialisation to defer toolset creation until runtime.
Environment
- ADK Version: 1.9.0+
- google-cloud-aiplatform: 1.95.1+
- Python: 3.12
- Deployment Target: Vertex AI Agent Engines
Problem Description
Expected Behaviour
When using Pydantic `BaseSettings` for configuration with environment variables:
- Set the environment variable in `AdkApp.env_vars` for the runtime environment
- The toolset reads its configuration from environment variables when initialised
- The toolset uses runtime-specific values (e.g., production API URLs)
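As a minimal illustration of the expected pattern (standard pydantic-settings behaviour, independent of ADK):

```python
import os
from pydantic_settings import BaseSettings

class AppConfig(BaseSettings):
    API_BASE_URL: str = "http://localhost:8000"  # development default

# pydantic-settings reads the environment at instantiation time, so a value
# set before AppConfig() is constructed takes effect:
os.environ["API_BASE_URL"] = "https://production-api.example.com"
assert AppConfig().API_BASE_URL == "https://production-api.example.com"
```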
Actual Behaviour
- The toolset is initialised at agent definition time (during module import on the deployment machine)
- Pydantic `BaseSettings` reads from the deployment environment, not the runtime environment
- Configuration values are frozen into the pickled toolset object
- `AdkApp.env_vars` are set too late, after unpickling has already restored the frozen configuration
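The freezing effect can be reproduced with plain pickle and pydantic-settings, with no ADK involved (a minimal sketch):

```python
import os
import pickle
from pydantic_settings import BaseSettings

class Config(BaseSettings):
    SERVICE_URL: str = "http://localhost:8000"

config = Config()            # reads the "deployment" environment
blob = pickle.dumps(config)  # the resolved value is baked into the pickle

# Simulate the runtime environment being configured after pickling:
os.environ["SERVICE_URL"] = "https://production.example.com"
restored = pickle.loads(blob)
print(restored.SERVICE_URL)  # still http://localhost:8000
```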
Root Cause
ADK's deployment flow creates a timing mismatch:
- Deployment time (local/CI environment):
  - The agent module is imported
  - Toolsets are created with configuration read from the deployment environment
  - The agent hierarchy (including toolsets with embedded config) is pickled
  - The pickle is uploaded to GCS
- Runtime (Vertex AI):
  - The agent is unpickled, restoring toolsets with frozen configuration
  - Environment variables from `AdkApp.env_vars` are set
  - But it is too late: the toolsets already exist with deployment-time configuration
Illustration
```python
# config.py
from pydantic_settings import BaseSettings

class AppConfig(BaseSettings):
    API_BASE_URL: str = "http://localhost:8000"  # Development default

app_config = AppConfig()
```

```python
# agent.py
from google.adk.agents import Agent
from google.adk.tools.openapi_tool import OpenAPIToolset

from config import app_config

def create_openapi_toolset() -> OpenAPIToolset:
    spec_dict = {"openapi": "3.1.0", "paths": {}}  # minimal placeholder spec
    # Reads API_BASE_URL from the environment AT IMPORT TIME
    spec_dict["servers"] = [{
        "url": app_config.API_BASE_URL,  # Deployment environment value!
        "description": "API Server"
    }]
    return OpenAPIToolset(spec_dict=spec_dict)

my_agent = Agent(
    name="MyAgent",
    tools=[
        create_openapi_toolset(),  # Called during import on the deployment machine
    ],
)
```

```python
# deployment.py
from vertexai import agent_engines
from vertexai.preview import reasoning_engines

from agent import my_agent

# Environment variable set here, but TOO LATE
app = reasoning_engines.AdkApp(
    agent=my_agent,  # Already created with localhost:8000
    env_vars={
        "API_BASE_URL": "https://production-api.example.com"  # Ignored!
    }
)
remote_app = agent_engines.create(agent_engine=app)
```

Result: the deployed agent makes API calls to http://localhost:8000 instead of https://production-api.example.com.
Minimal Reproduction
```python
# 1. Create a simple config with an environment variable
from pydantic_settings import BaseSettings

class Config(BaseSettings):
    SERVICE_URL: str = "http://localhost:8000"

config = Config()

# 2. Create a toolset that uses the config at import time
from google.adk.agents import Agent
from google.adk.tools.openapi_tool import OpenAPIToolset

def create_toolset():
    spec = {
        "openapi": "3.1.0",
        "servers": [{"url": config.SERVICE_URL}],  # Read at import time
        "paths": {"/health": {"get": {"operationId": "health_check"}}}
    }
    return OpenAPIToolset(spec_dict=spec)

agent = Agent(
    name="TestAgent",
    tools=[create_toolset()]  # Toolset created with localhost:8000
)

# 3. Deploy with a different environment variable
from vertexai import agent_engines
from vertexai.preview import reasoning_engines

app = reasoning_engines.AdkApp(
    agent=agent,
    env_vars={"SERVICE_URL": "https://production.example.com"}  # Won't work!
)
remote_app = agent_engines.create(agent_engine=app)

# 4. When the agent runs on Vertex AI, it will still use http://localhost:8000
```

Current Workaround
Developers must implement lazy-loading wrappers with custom pickle behaviour:
```python
from pathlib import Path

from google.adk.tools.base_toolset import BaseToolset
from google.adk.tools.openapi_tool import OpenAPIToolset

from config import app_config

class LazyOpenAPIToolset(BaseToolset):
    def __init__(self, spec_path: Path, tool_filter: list[str]):
        self.spec_path = spec_path
        self.tool_filter = tool_filter
        self._toolset: OpenAPIToolset | None = None

    def _ensure_toolset(self) -> OpenAPIToolset:
        if self._toolset is None:
            # Read configuration from the RUNTIME environment
            spec_dict = load_spec(self.spec_path)  # helper that loads the OpenAPI spec
            spec_dict["servers"] = [{
                "url": app_config.API_BASE_URL  # Now reads from the Vertex AI environment
            }]
            self._toolset = OpenAPIToolset(spec_dict=spec_dict, tool_filter=self.tool_filter)
        return self._toolset

    async def get_tools(self, readonly_context=None):
        # BaseToolset.get_tools is async in ADK, so the wrapper must be too
        return await self._ensure_toolset().get_tools(readonly_context)

    async def close(self) -> None:
        if self._toolset is not None:
            await self._toolset.close()

    def __reduce__(self):
        # Only pickle the configuration, not the toolset instance
        return (self.__class__, (self.spec_path, self.tool_filter))
```
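Wiring the wrapper into an agent then looks like this (a sketch; the spec path and tool name are placeholders):

```python
from pathlib import Path

from google.adk.agents import Agent

agent = Agent(
    name="MyAgent",
    tools=[
        # Only spec_path and tool_filter are pickled; the real toolset is
        # built lazily on Vertex AI, after the runtime environment exists.
        LazyOpenAPIToolset(spec_path=Path("openapi.json"), tool_filter=["health_check"]),
    ],
)
```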
This workaround:
- Adds significant complexity
- Requires understanding of Python pickle internals
- Must be implemented for every environment-dependent toolset
- Is not documented in ADK guides
Suggested Solutions
Option 1: Lazy Toolset Creation in ADK
ADK could provide built-in lazy loading for toolsets:
```python
from google.adk.tools import lazy_toolset  # proposed API, does not exist today

def create_toolset():
    # Factory function called at runtime, not import time
    spec_dict = {"openapi": "3.1.0", "paths": {}}  # minimal placeholder spec
    spec_dict["servers"] = [{"url": app_config.API_BASE_URL}]
    return OpenAPIToolset(spec_dict=spec_dict)

agent = Agent(
    name="MyAgent",
    tools=[lazy_toolset(create_toolset)]  # Factory wrapped automatically
)
```
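A minimal sketch of how such a helper could be implemented against ADK's `BaseToolset` interface (the helper name and module are hypothetical):

```python
from google.adk.tools.base_toolset import BaseToolset

class _LazyToolset(BaseToolset):
    """Defers toolset construction until tools are first requested."""

    def __init__(self, factory):
        self._factory = factory
        self._toolset = None

    async def get_tools(self, readonly_context=None):
        if self._toolset is None:
            self._toolset = self._factory()  # runs at runtime, after env setup
        return await self._toolset.get_tools(readonly_context)

    async def close(self) -> None:
        if self._toolset is not None:
            await self._toolset.close()

    def __reduce__(self):
        # Pickle only the factory reference, never the realised toolset
        return (self.__class__, (self._factory,))

def lazy_toolset(factory):
    return _LazyToolset(factory)
```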
Option 2: Environment Variables Before Unpickling
Set `AdkApp.env_vars` before unpickling the agent, not after:
- Upload environment variables to GCS as separate metadata
- Set environment variables in Vertex AI runtime
- Unpickle agent with environment already configured
- Agent initialisation can read correct values
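In other words, the runtime bootstrap would do roughly the following (an illustrative sketch, not actual Agent Engine internals; ADK uses cloudpickle in practice):

```python
import io
import os
import pickle

def load_agent_with_env(pickle_bytes: bytes, env_vars: dict[str, str]):
    # Apply the user-supplied environment BEFORE unpickling, so module-level
    # config objects and __setstate__ hooks see the runtime values.
    os.environ.update(env_vars)
    # Only then restore the agent object graph.
    return pickle.load(io.BytesIO(pickle_bytes))
```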
Option 3: Configuration Injection
Allow toolsets to receive configuration after unpickling:
```python
app = reasoning_engines.AdkApp(
    agent=my_agent,
    env_vars={"API_BASE_URL": "https://production.example.com"},
    toolset_config={  # New parameter (proposed)
        "OpenAPIToolset": {
            "server_url": "https://production.example.com"
        }
    }
)
```

ADK would inject this configuration into matching toolsets after unpickling.
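On the toolset side, this could take the form of an opt-in hook that the runtime calls after restoring the agent (hypothetical protocol; `apply_runtime_config` does not exist today):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class RuntimeConfigurable(Protocol):
    def apply_runtime_config(self, config: dict) -> None: ...

def inject_toolset_config(toolsets, toolset_config: dict) -> None:
    # Called by the runtime after unpickling, before the agent serves traffic.
    for toolset in toolsets:
        cfg = toolset_config.get(type(toolset).__name__)
        if cfg is not None and isinstance(toolset, RuntimeConfigurable):
            toolset.apply_runtime_config(cfg)
```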
Option 4: Documentation and Best Practices
If the current behaviour is intentional, document it clearly:
- Explain the pickle timing issue in ADK deployment guides
- Provide official lazy-loading patterns
- Include warnings about import-time side effects
- Show examples of environment-specific configuration
Additional Context
This issue particularly affects:
- OpenAPI toolsets connecting to environment-specific APIs
- Database toolsets with different connection strings per environment
- Credential toolsets requiring different secrets across environments
- Any toolset using Pydantic `BaseSettings` for configuration
The problem is subtle and difficult to debug because:
- No errors are raised during deployment
- The agent appears to deploy successfully
- Failures only occur at runtime in Vertex AI
- Error messages point at the symptom rather than the configuration (e.g., "Connection refused" to localhost)
References
- ADK Documentation: https://cloud.google.com/vertex-ai/generative-ai/docs/agent-builder/agents
- Pydantic Settings: https://docs.pydantic.dev/latest/concepts/pydantic_settings/
- Python Pickle Protocol: https://docs.python.org/3/library/pickle.html#pickling-class-instances
Request
Could the ADK team:
- Confirm whether this behaviour is intentional
- Consider implementing one of the suggested solutions
- Provide official guidance on environment-specific configuration
- Document the pickle timing behaviour in deployment guides
Thank you for considering this issue.
If this behaviour is intentional, I am happy to raise a documentation PR and contribute a lazy-toolset pattern for users.