ADK Bug Report: Environment Variables Not Available During Toolset Initialisation #3208

@tommyhutcheson

Description

Issue Summary

When deploying agents to Vertex AI using ADK's agent_engines.create(), environment variables specified in
AdkApp.env_vars are not available when toolsets are initialised during the pickle serialisation process. This causes
pickled toolsets to retain configuration values from the deployment environment rather than reading from the runtime
environment.

Impact

This behaviour breaks environment-specific configuration patterns and forces developers to implement complex workarounds
using lazy loading and custom pickle serialisation to defer toolset creation until runtime.

Environment

  • ADK Version: 1.9.0+
  • google-cloud-aiplatform: 1.95.1+
  • Python: 3.12
  • Deployment Target: Vertex AI Agent Engines

Problem Description

Expected Behaviour

When using Pydantic BaseSettings for configuration with environment variables:

  1. Set environment variable in AdkApp.env_vars for the runtime environment
  2. Toolset reads configuration from environment variables when initialised
  3. Toolset uses runtime-specific values (e.g., production API URLs)
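
In isolation this pattern behaves as expected: Pydantic BaseSettings reads the environment at construction time, so a value set before instantiation overrides the default. A minimal sketch (the variable name is illustrative):

import os

from pydantic_settings import BaseSettings


class AppConfig(BaseSettings):
    API_BASE_URL: str = "http://localhost:8000"  # Development default


# Set before construction, so BaseSettings picks it up
os.environ["API_BASE_URL"] = "https://production-api.example.com"
assert AppConfig().API_BASE_URL == "https://production-api.example.com"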

Actual Behaviour

  1. Toolset is initialised at agent definition time (during module import on the deployment machine)
  2. Pydantic BaseSettings reads from deployment environment, not runtime environment
  3. Configuration values are frozen into the pickled toolset object
  4. AdkApp.env_vars are set too late—after unpickling has already restored the frozen configuration

Root Cause

ADK's deployment flow creates a timing mismatch:

  1. Deployment time (local/CI environment):

    • Agent module is imported
    • Toolsets are created with configuration read from deployment environment
    • Agent hierarchy (including toolsets with embedded config) is pickled
    • Pickle is uploaded to GCS
  2. Runtime (Vertex AI):

    • Agent is unpickled, restoring toolsets with frozen configuration
    • Environment variables from AdkApp.env_vars are set
    • But it's too late—toolsets already exist with deployment-time configuration
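
The mismatch is reproducible with plain pickle, independent of ADK. A minimal, self-contained sketch:

import os
import pickle


class Tool:
    def __init__(self) -> None:
        # Configuration is captured once, at construction time
        self.url = os.environ.get("API_BASE_URL", "http://localhost:8000")


# "Deployment": the variable is unset, so the default is baked into the pickle
blob = pickle.dumps(Tool())

# "Runtime": the variable is set only after pickling; unpickling bypasses
# __init__ and restores the stale attribute
os.environ["API_BASE_URL"] = "https://production-api.example.com"
print(pickle.loads(blob).url)  # http://localhost:8000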

Illustration

# config.py
from pydantic_settings import BaseSettings


class AppConfig(BaseSettings):
    API_BASE_URL: str = "http://localhost:8000"  # Development default


app_config = AppConfig()
# agent.py
from google.adk.agents import Agent
from google.adk.tools.openapi_tool import OpenAPIToolset

from config import app_config


def create_openapi_toolset() -> OpenAPIToolset:
    spec_dict = load_openapi_spec()  # base OpenAPI spec; loading helper elided
    # Reads API_BASE_URL from the environment AT IMPORT TIME
    spec_dict["servers"] = [{
        "url": app_config.API_BASE_URL,  # Deployment environment value!
        "description": "API Server"
    }]
    return OpenAPIToolset(spec_dict=spec_dict)


my_agent = Agent(
    name="MyAgent",
    tools=[
        create_openapi_toolset(),  # Called during import on deployment machine
    ],
)
# deployment.py
from vertexai import agent_engines
from vertexai.preview import reasoning_engines

from agent import my_agent

# Environment variable set here, but TOO LATE
app = reasoning_engines.AdkApp(
    agent=my_agent,  # Already created with localhost:8000
    env_vars={
        "API_BASE_URL": "https://production-api.example.com"  # Ignored!
    }
)

remote_app = agent_engines.create(agent_engine=app)

Result: The deployed agent makes API calls to http://localhost:8000 instead of
https://production-api.example.com.

Minimal Reproduction

# 1. Create a simple config with an environment variable
from pydantic_settings import BaseSettings


class Config(BaseSettings):
    SERVICE_URL: str = "http://localhost:8000"


config = Config()

# 2. Create a toolset that uses the config at import time
from google.adk.agents import Agent
from google.adk.tools.openapi_tool import OpenAPIToolset


def create_toolset():
    spec = {
        "openapi": "3.1.0",
        "info": {"title": "Test API", "version": "1.0.0"},
        "servers": [{"url": config.SERVICE_URL}],  # Read at import time
        "paths": {"/health": {"get": {"operationId": "health_check"}}}
    }
    return OpenAPIToolset(spec_dict=spec)


agent = Agent(
    name="TestAgent",
    tools=[create_toolset()]  # Toolset created with localhost:8000
)

# 3. Deploy with a different environment variable
from vertexai import agent_engines
from vertexai.preview import reasoning_engines

app = reasoning_engines.AdkApp(
    agent=agent,
    env_vars={"SERVICE_URL": "https://production.example.com"}  # Won't work!
)

remote_app = agent_engines.create(agent_engine=app)

# 4. When agent runs on Vertex AI, it will still use http://localhost:8000

Current Workaround

Developers must implement lazy-loading wrappers with custom pickle behaviour:

from pathlib import Path

from google.adk.tools.base_toolset import BaseToolset
from google.adk.tools.openapi_tool import OpenAPIToolset

from config import app_config


class LazyOpenAPIToolset(BaseToolset):
    def __init__(self, spec_path: Path, tool_filter: list[str]):
        super().__init__()
        self.spec_path = spec_path
        self.tool_filter = tool_filter
        self._toolset: OpenAPIToolset | None = None

    def _ensure_toolset(self) -> OpenAPIToolset:
        if self._toolset is None:
            # Read configuration from the RUNTIME environment
            spec_dict = load_spec(self.spec_path)  # spec-loading helper elided
            spec_dict["servers"] = [{
                "url": app_config.API_BASE_URL  # Now reads from the Vertex AI environment
            }]
            self._toolset = OpenAPIToolset(spec_dict=spec_dict, tool_filter=self.tool_filter)
        return self._toolset

    async def get_tools(self, readonly_context=None):
        return await self._ensure_toolset().get_tools(readonly_context)

    async def close(self) -> None:
        if self._toolset is not None:
            await self._toolset.close()

    def __reduce__(self):
        # Pickle only the constructor arguments, not the built toolset instance
        return (self.__class__, (self.spec_path, self.tool_filter))
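
A quick round-trip check of this workaround (spec.yaml and the tool filter are placeholders):

import pickle

lazy = LazyOpenAPIToolset(spec_path=Path("spec.yaml"), tool_filter=["health_check"])
blob = pickle.dumps(lazy)  # __reduce__ stores only (spec_path, tool_filter)

# Unpickling re-runs __init__, so _toolset is None again and the real
# OpenAPIToolset is rebuilt on first use, in the runtime environment
restored = pickle.loads(blob)
assert restored._toolset is None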

This workaround:

  • Adds significant complexity
  • Requires understanding of Python pickle internals
  • Must be implemented for every environment-dependent toolset
  • Is not documented in ADK guides

Suggested Solutions

Option 1: Lazy Toolset Creation in ADK

ADK could provide built-in lazy loading for toolsets:

from google.adk.tools import lazy_toolset  # proposed API; does not exist today


def create_toolset():
    # Factory function called at runtime, not import time
    spec_dict = load_openapi_spec()  # base spec; loading helper elided
    spec_dict["servers"] = [{"url": app_config.API_BASE_URL}]
    return OpenAPIToolset(spec_dict=spec_dict)


agent = Agent(
    name="MyAgent",
    tools=[lazy_toolset(create_toolset)]  # Factory wrapped automatically
)
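
Such a helper could be a thin deferred-construction wrapper. A sketch of what it might look like, assuming BaseToolset's async get_tools/close interface (lazy_toolset and _LazyToolset are proposed names, not an existing ADK API):

from typing import Callable

from google.adk.tools.base_toolset import BaseToolset


class _LazyToolset(BaseToolset):
    """Defers the factory call until the first get_tools() at runtime."""

    def __init__(self, factory: Callable[[], BaseToolset]):
        super().__init__()
        self._factory = factory
        self._inner: BaseToolset | None = None

    async def get_tools(self, readonly_context=None):
        if self._inner is None:
            self._inner = self._factory()  # runs in the runtime environment
        return await self._inner.get_tools(readonly_context)

    async def close(self) -> None:
        if self._inner is not None:
            await self._inner.close()

    def __reduce__(self):
        # Pickle only the factory reference; this requires the factory to be
        # a module-level function, since closures are not picklable
        return (self.__class__, (self._factory,))


def lazy_toolset(factory: Callable[[], BaseToolset]) -> BaseToolset:
    return _LazyToolset(factory)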

Option 2: Environment Variables Before Unpickling

Set AdkApp.env_vars before unpickling the agent, not after:

  1. Upload environment variables to GCS as separate metadata
  2. Set environment variables in Vertex AI runtime
  3. Unpickle agent with environment already configured
  4. Agent initialisation can read correct values
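
In pseudo-code, the runtime bootstrap would reorder to something like this (load_env_vars_metadata and load_agent_blob are hypothetical helpers, not actual ADK internals):

import os
import pickle

env_vars = load_env_vars_metadata()      # steps 1-2: read env_vars stored alongside the pickle
os.environ.update(env_vars)              # environment configured before unpickling
agent = pickle.loads(load_agent_blob())  # step 3: toolsets now initialise with correct values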

Option 3: Configuration Injection

Allow toolsets to receive configuration after unpickling:

app = reasoning_engines.AdkApp(
    agent=my_agent,
    env_vars={"API_BASE_URL": "https://production.example.com"},
    toolset_config={  # New parameter
        "OpenAPIToolset": {
            "server_url": "https://production.example.com"
        }
    }
)

ADK would inject this configuration into matching toolsets after unpickling.
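
One possible shape for the injection hook (a sketch; ConfigurableToolset and apply_config are hypothetical names, not existing ADK APIs):

from google.adk.tools.base_toolset import BaseToolset


class ConfigurableToolset(BaseToolset):
    """Hypothetical opt-in interface for post-unpickle configuration."""

    def apply_config(self, config: dict) -> None:
        raise NotImplementedError


# Dispatch ADK could run after unpickling, matching entries by class name
def inject_toolset_config(agent, toolset_config: dict) -> None:
    for toolset in agent.tools:
        cfg = toolset_config.get(type(toolset).__name__)
        if cfg is not None and isinstance(toolset, ConfigurableToolset):
            toolset.apply_config(cfg)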

Option 4: Documentation and Best Practices

If the current behaviour is intentional, document it clearly:

  • Explain the pickle timing issue in ADK deployment guides
  • Provide official lazy-loading patterns
  • Include warnings about import-time side effects
  • Show examples of environment-specific configuration

Additional Context

This issue particularly affects:

  • OpenAPI toolsets connecting to environment-specific APIs
  • Database toolsets with different connection strings per environment
  • Credential toolsets requiring different secrets across environments
  • Any toolset using Pydantic BaseSettings for configuration

The problem is subtle and difficult to debug because:

  1. No errors are raised during deployment
  2. The agent appears to deploy successfully
  3. Failures only occur at runtime in Vertex AI
  4. Error messages don't indicate configuration issues (e.g., "Connection refused to localhost")

Request

Could the ADK team:

  1. Confirm whether this behaviour is intentional
  2. Consider implementing one of the suggested solutions
  3. Provide official guidance on environment-specific configuration
  4. Document the pickle timing behaviour in deployment guides

Thank you for considering this issue.

If this behaviour turns out to be intentional, I am happy to raise a documentation PR providing a lazy-toolset pattern for users.

Labels

agent engine ([Component] This issue is related to Agent Engine deployment)
