ADK Bug Report: Environment Variables Not Available During Toolset Initialisation #3208

@tommyhutcheson

Description

Issue Summary

When deploying agents to Vertex AI using ADK's agent_engines.create(), environment variables specified in
AdkApp.env_vars are not available when toolsets are initialised during the pickle serialisation process. This causes
pickled toolsets to retain configuration values from the deployment environment rather than reading from the runtime
environment.

Impact

This behaviour breaks environment-specific configuration patterns and forces developers to implement complex workarounds
using lazy loading and custom pickle serialisation to defer toolset creation until runtime.

Environment

  • ADK Version: 1.9.0+
  • google-cloud-aiplatform: 1.95.1+
  • Python: 3.12
  • Deployment Target: Vertex AI Agent Engines

Problem Description

Expected Behaviour

When using Pydantic BaseSettings for configuration with environment variables:

  1. Set environment variable in AdkApp.env_vars for the runtime environment
  2. Toolset reads configuration from environment variables when initialised
  3. Toolset uses runtime-specific values (e.g., production API URLs)
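
In isolation this pattern behaves as expected: Pydantic BaseSettings reads the environment at construction time, so a value set before instantiation overrides the default. A minimal sketch (the variable name is illustrative):

import os

from pydantic_settings import BaseSettings


class AppConfig(BaseSettings):
    API_BASE_URL: str = "http://localhost:8000"  # Development default


# Set before construction, so BaseSettings picks it up
os.environ["API_BASE_URL"] = "https://production-api.example.com"
assert AppConfig().API_BASE_URL == "https://production-api.example.com"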

Actual Behaviour

  1. Toolset is initialised at agent definition time (during module import on the deployment machine)
  2. Pydantic BaseSettings reads from deployment environment, not runtime environment
  3. Configuration values are frozen into the pickled toolset object
  4. AdkApp.env_vars are set too late—after unpickling has already restored the frozen configuration

Root Cause

ADK's deployment flow creates a timing mismatch:

  1. Deployment time (local/CI environment):

    • Agent module is imported
    • Toolsets are created with configuration read from deployment environment
    • Agent hierarchy (including toolsets with embedded config) is pickled
    • Pickle is uploaded to GCS
  2. Runtime (Vertex AI):

    • Agent is unpickled, restoring toolsets with frozen configuration
    • Environment variables from AdkApp.env_vars are set
    • But it's too late—toolsets already exist with deployment-time configuration
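
The mismatch is reproducible with plain pickle, independent of ADK. A minimal, self-contained sketch:

import os
import pickle


class Tool:
    def __init__(self) -> None:
        # Configuration is captured once, at construction time
        self.url = os.environ.get("API_BASE_URL", "http://localhost:8000")


# "Deployment": the variable is unset, so the default is baked into the pickle
blob = pickle.dumps(Tool())

# "Runtime": the variable is set only after pickling; unpickling bypasses
# __init__ and restores the stale attribute
os.environ["API_BASE_URL"] = "https://production-api.example.com"
print(pickle.loads(blob).url)  # http://localhost:8000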

Illustration

# config.py
from pydantic_settings import BaseSettings


class AppConfig(BaseSettings):
    API_BASE_URL: str = "http://localhost:8000"  # Development default


app_config = AppConfig()
# agent.py
from google.adk.agents import Agent
from google.adk.tools.openapi_tool import OpenAPIToolset

from config import app_config


def create_openapi_toolset() -> OpenAPIToolset:
    spec_dict = load_openapi_spec()  # base OpenAPI spec; loading helper elided
    # Reads API_BASE_URL from the environment AT IMPORT TIME
    spec_dict["servers"] = [{
        "url": app_config.API_BASE_URL,  # Deployment environment value!
        "description": "API Server"
    }]
    return OpenAPIToolset(spec_dict=spec_dict)


my_agent = Agent(
    name="MyAgent",
    tools=[
        create_openapi_toolset(),  # Called during import on deployment machine
    ],
)
# deployment.py
from vertexai import agent_engines
from vertexai.preview import reasoning_engines

from agent import my_agent

# Environment variable set here, but TOO LATE
app = reasoning_engines.AdkApp(
    agent=my_agent,  # Already created with localhost:8000
    env_vars={
        "API_BASE_URL": "https://production-api.example.com"  # Ignored!
    }
)

remote_app = agent_engines.create(agent_engine=app)

Result: The deployed agent makes API calls to http://localhost:8000 instead of
https://production-api.example.com.

Minimal Reproduction

# 1. Create a simple config with an environment variable
from pydantic_settings import BaseSettings


class Config(BaseSettings):
    SERVICE_URL: str = "http://localhost:8000"


config = Config()

# 2. Create a toolset that uses the config at import time
from google.adk.agents import Agent
from google.adk.tools.openapi_tool import OpenAPIToolset


def create_toolset():
    spec = {
        "openapi": "3.1.0",
        "info": {"title": "Test API", "version": "1.0.0"},
        "servers": [{"url": config.SERVICE_URL}],  # Read at import time
        "paths": {"/health": {"get": {"operationId": "health_check"}}}
    }
    return OpenAPIToolset(spec_dict=spec)


agent = Agent(
    name="TestAgent",
    tools=[create_toolset()]  # Toolset created with localhost:8000
)

# 3. Deploy with a different environment variable
from vertexai import agent_engines
from vertexai.preview import reasoning_engines

app = reasoning_engines.AdkApp(
    agent=agent,
    env_vars={"SERVICE_URL": "https://production.example.com"}  # Won't work!
)

remote_app = agent_engines.create(agent_engine=app)

# 4. When agent runs on Vertex AI, it will still use http://localhost:8000

Current Workaround

Developers must implement lazy-loading wrappers with custom pickle behaviour:

from pathlib import Path

from google.adk.tools.base_toolset import BaseToolset
from google.adk.tools.openapi_tool import OpenAPIToolset

from config import app_config


class LazyOpenAPIToolset(BaseToolset):
    def __init__(self, spec_path: Path, tool_filter: list[str]):
        super().__init__()
        self.spec_path = spec_path
        self.tool_filter = tool_filter
        self._toolset: OpenAPIToolset | None = None

    def _ensure_toolset(self) -> OpenAPIToolset:
        if self._toolset is None:
            # Read configuration from the RUNTIME environment
            spec_dict = load_spec(self.spec_path)  # spec-loading helper elided
            spec_dict["servers"] = [{
                "url": app_config.API_BASE_URL  # Now reads from the Vertex AI environment
            }]
            self._toolset = OpenAPIToolset(spec_dict=spec_dict, tool_filter=self.tool_filter)
        return self._toolset

    async def get_tools(self, readonly_context=None):
        return await self._ensure_toolset().get_tools(readonly_context)

    async def close(self) -> None:
        if self._toolset is not None:
            await self._toolset.close()

    def __reduce__(self):
        # Pickle only the constructor arguments, not the built toolset instance
        return (self.__class__, (self.spec_path, self.tool_filter))
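
A quick round-trip check of this workaround (spec.yaml and the tool filter are placeholders):

import pickle

lazy = LazyOpenAPIToolset(spec_path=Path("spec.yaml"), tool_filter=["health_check"])
blob = pickle.dumps(lazy)  # __reduce__ stores only (spec_path, tool_filter)

# Unpickling re-runs __init__, so _toolset is None again and the real
# OpenAPIToolset is rebuilt on first use, in the runtime environment
restored = pickle.loads(blob)
assert restored._toolset is None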

This workaround:

  • Adds significant complexity
  • Requires understanding of Python pickle internals
  • Must be implemented for every environment-dependent toolset
  • Is not documented in ADK guides

Suggested Solutions

Option 1: Lazy Toolset Creation in ADK

ADK could provide built-in lazy loading for toolsets:

from google.adk.tools import lazy_toolset  # proposed API; does not exist today


def create_toolset():
    # Factory function called at runtime, not import time
    spec_dict = load_openapi_spec()  # base spec; loading helper elided
    spec_dict["servers"] = [{"url": app_config.API_BASE_URL}]
    return OpenAPIToolset(spec_dict=spec_dict)


agent = Agent(
    name="MyAgent",
    tools=[lazy_toolset(create_toolset)]  # Factory wrapped automatically
)
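
Such a helper could be a thin deferred-construction wrapper. A sketch of what it might look like, assuming BaseToolset's async get_tools/close interface (lazy_toolset and _LazyToolset are proposed names, not an existing ADK API):

from typing import Callable

from google.adk.tools.base_toolset import BaseToolset


class _LazyToolset(BaseToolset):
    """Defers the factory call until the first get_tools() at runtime."""

    def __init__(self, factory: Callable[[], BaseToolset]):
        super().__init__()
        self._factory = factory
        self._inner: BaseToolset | None = None

    async def get_tools(self, readonly_context=None):
        if self._inner is None:
            self._inner = self._factory()  # runs in the runtime environment
        return await self._inner.get_tools(readonly_context)

    async def close(self) -> None:
        if self._inner is not None:
            await self._inner.close()

    def __reduce__(self):
        # Pickle only the factory reference; this requires the factory to be
        # a module-level function, since closures are not picklable
        return (self.__class__, (self._factory,))


def lazy_toolset(factory: Callable[[], BaseToolset]) -> BaseToolset:
    return _LazyToolset(factory)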

Option 2: Environment Variables Before Unpickling

Set AdkApp.env_vars before unpickling the agent, not after:

  1. Upload environment variables to GCS as separate metadata
  2. Set environment variables in Vertex AI runtime
  3. Unpickle agent with environment already configured
  4. Agent initialisation can read correct values
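
In pseudo-code, the runtime bootstrap would reorder to something like this (load_env_vars_metadata and load_agent_blob are hypothetical helpers, not actual ADK internals):

import os
import pickle

env_vars = load_env_vars_metadata()      # steps 1-2: read env_vars stored alongside the pickle
os.environ.update(env_vars)              # environment configured before unpickling
agent = pickle.loads(load_agent_blob())  # step 3: toolsets now initialise with correct values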

Option 3: Configuration Injection

Allow toolsets to receive configuration after unpickling:

app = reasoning_engines.AdkApp(
    agent=my_agent,
    env_vars={"API_BASE_URL": "https://production.example.com"},
    toolset_config={  # New parameter
        "OpenAPIToolset": {
            "server_url": "https://production.example.com"
        }
    }
)

ADK would inject this configuration into matching toolsets after unpickling.
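
One possible shape for the injection hook (a sketch; ConfigurableToolset and apply_config are hypothetical names, not existing ADK APIs):

from google.adk.tools.base_toolset import BaseToolset


class ConfigurableToolset(BaseToolset):
    """Hypothetical opt-in interface for post-unpickle configuration."""

    def apply_config(self, config: dict) -> None:
        raise NotImplementedError


# Dispatch ADK could run after unpickling, matching entries by class name
def inject_toolset_config(agent, toolset_config: dict) -> None:
    for toolset in agent.tools:
        cfg = toolset_config.get(type(toolset).__name__)
        if cfg is not None and isinstance(toolset, ConfigurableToolset):
            toolset.apply_config(cfg)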

Option 4: Documentation and Best Practices

If the current behaviour is intentional, document it clearly:

  • Explain the pickle timing issue in ADK deployment guides
  • Provide official lazy-loading patterns
  • Include warnings about import-time side effects
  • Show examples of environment-specific configuration

Additional Context

This issue particularly affects:

  • OpenAPI toolsets connecting to environment-specific APIs
  • Database toolsets with different connection strings per environment
  • Credential toolsets requiring different secrets across environments
  • Any toolset using Pydantic BaseSettings for configuration

The problem is subtle and difficult to debug because:

  1. No errors are raised during deployment
  2. The agent appears to deploy successfully
  3. Failures only occur at runtime in Vertex AI
  4. Error messages don't indicate configuration issues (e.g., "Connection refused to localhost")

Request

Could the ADK team:

  1. Confirm whether this behaviour is intentional
  2. Consider implementing one of the suggested solutions
  3. Provide official guidance on environment-specific configuration
  4. Document the pickle timing behaviour in deployment guides

Thank you for considering this issue.

If this behaviour turns out to be intentional, I am happy to raise a documentation PR providing a lazy-toolset pattern for users.

Labels

agent engine ([Component] This issue is related to Agent Engine deployment)
