evals: ConversationTurn rejects extra fields from run_inference API response and missing turn_index #6785

@tottenjordan

Environment

  • google-cloud-aiplatform: 1.152.0
  • google-genai: 1.75.0
  • google-adk: 1.33.0
  • Python: 3.12
  • Region: us-central1

Description

When using the simulated evaluation pipeline (generate_conversation_scenarios → run_inference → evaluate), two bugs prevent end-to-end execution:

Bug 1: ConversationTurn rejects extra fields from API response

run_inference() returns agent turn data with fields (model_version, content, id, timestamp, author, actions, invocation_id, long_running_tool_ids, finish_reason, usage_metadata, avg_logprobs) that are not defined in ConversationTurn (which only has turn_index, turn_id, events).

Since google.genai._common.BaseModel sets extra='forbid', pydantic rejects these:

pydantic_core._pydantic_core.ValidationError: 44 validation errors for AgentData
turns.0.model_version
  Extra inputs are not permitted
turns.0.content
  Extra inputs are not permitted
...

The error originates at _evals_common.py:1880 in _process_multi_turn_agent_response:

return types.evals.AgentData(
    turns=resp_item,
    agents=agent_data_agents,
).model_dump(exclude_unset=True)

Note: The SDK has _remove_extra_fields() in google/genai/_common.py:317 that handles this for _from_response() calls, but _process_multi_turn_agent_response uses direct construction instead.
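
An alternative to changing the model config is to drop unknown keys before constructing AgentData, similar in spirit to what _remove_extra_fields() does for _from_response() calls. The sketch below is a hypothetical helper, not SDK code: the name strip_unknown_turn_fields is made up for illustration, KNOWN_TURN_FIELDS is taken from the three fields listed above, and it assumes each turn arrives as a plain dict.

```python
from typing import Any

# Fields that ConversationTurn defines, per the description above.
KNOWN_TURN_FIELDS = frozenset({"turn_index", "turn_id", "events"})


def strip_unknown_turn_fields(turns: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the turn dicts containing only known fields.

    Hypothetical helper for illustration; the input list is not modified.
    """
    return [
        {key: value for key, value in turn.items() if key in KNOWN_TURN_FIELDS}
        for turn in turns
    ]


# Made-up sample data shaped like the failing payload.
raw_turns = [
    {"turn_id": "t-0", "events": [], "model_version": "m", "author": "agent"},
]
cleaned = strip_unknown_turn_fields(raw_turns)
print(cleaned)
```

Like extra='ignore', this discards the extra data rather than preserving it, so it is only appropriate if nothing downstream needs those fields.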

Bug 2: turn_index not populated by run_inference

The raw turn dicts from the agent engine response don't include turn_index. _process_multi_turn_agent_response passes them directly to AgentData(turns=...) without adding turn_index. When the resulting data is sent to the evaluate API, it fails:

400 INVALID_ARGUMENT: Field: instance.agent_eval_data.turns[0].turn_index; Message: Required field is not set.
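
Until this is fixed, a local pre-flight check can surface the problem before the evaluate call returns a 400. This is a minimal sketch under the assumption that turns are plain dicts; turns_missing_index is a hypothetical name, and it only checks for the presence of the key, not its value.

```python
from typing import Any


def turns_missing_index(turns: list[Any]) -> list[int]:
    """Return positions of turn dicts that lack a `turn_index` key.

    Hypothetical pre-flight check; non-dict entries are skipped.
    """
    return [
        position
        for position, turn in enumerate(turns)
        if isinstance(turn, dict) and "turn_index" not in turn
    ]


# Made-up sample: the second turn has no turn_index.
sample_turns = [{"turn_index": 0, "events": []}, {"events": []}]
missing = turns_missing_index(sample_turns)
if missing:
    print(f"turns missing turn_index at positions: {missing}")
```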

Workaround

We patched both issues in our eval script:

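# Module paths inferred from the file locations cited in this report
from vertexai._genai import _evals_common
from vertexai._genai.types import evals as evals_types

# Bug 1: Set ConversationTurn to ignore unknown fields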
ct = evals_types.ConversationTurn
ct.model_config["extra"] = "ignore"
ct.__pydantic_complete__ = False
ct.model_rebuild(force=True)
evals_types.AgentData.__pydantic_complete__ = False
evals_types.AgentData.model_rebuild(force=True)

# Bug 2: Inject turn_index based on position
_orig_process = _evals_common._process_multi_turn_agent_response
def _patched_process(resp_item, agent_data_agents):
    if isinstance(resp_item, list):
        for i, turn in enumerate(resp_item):
            if isinstance(turn, dict) and "turn_index" not in turn:
                turn["turn_index"] = i
    return _orig_process(resp_item, agent_data_agents)
_evals_common._process_multi_turn_agent_response = _patched_process
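
Because the Bug 2 patch replaces a private module attribute for the rest of the process, it may be safer to scope it. The sketch below shows one way to do that with unittest.mock.patch.object from the standard library; fake_evals_common is a stand-in object so the example runs without the SDK, and in a real script the target would be the _evals_common module itself.

```python
from types import SimpleNamespace
from unittest import mock

# Stand-in for the `_evals_common` module (assumption, for illustration only).
fake_evals_common = SimpleNamespace(
    _process_multi_turn_agent_response=lambda resp_item, agent_data_agents: resp_item
)

_orig = fake_evals_common._process_multi_turn_agent_response


def _patched(resp_item, agent_data_agents):
    # Same logic as the workaround: fill in turn_index by position.
    if isinstance(resp_item, list):
        for i, turn in enumerate(resp_item):
            if isinstance(turn, dict) and "turn_index" not in turn:
                turn["turn_index"] = i
    return _orig(resp_item, agent_data_agents)


with mock.patch.object(
    fake_evals_common, "_process_multi_turn_agent_response", _patched
):
    # Inside the block the patched function is in effect.
    result = fake_evals_common._process_multi_turn_agent_response(
        [{"events": []}, {"events": []}], None
    )

# On exit the original attribute is restored.
restored = fake_evals_common._process_multi_turn_agent_response is _orig
```

Wrapping only the run_inference call in the with block keeps the patch from leaking into unrelated code.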

Steps to Reproduce

import vertexai
from vertexai import Client, types

vertexai.init(project="PROJECT", location="us-central1")
client = Client(project="PROJECT", location="us-central1")

# Generate scenarios (agent_info must already be defined; its construction is omitted here)
eval_dataset = client.evals.generate_conversation_scenarios(
    agent_info=agent_info,
    config={"count": 1, "generation_instruction": "Search for hotels"},
    allow_cross_region_model=True,
)

# This fails with Bug 1
eval_dataset_with_traces = client.evals.run_inference(
    agent="projects/PROJECT_NUM/locations/us-central1/reasoningEngines/ENGINE_ID",
    src=eval_dataset,
    config={"user_simulator_config": {"max_turn": 3, "model_name": "gemini-2.5-flash"}},
)

# If Bug 1 is patched, this fails with Bug 2
eval_result = client.evals.evaluate(
    dataset=eval_dataset_with_traces,
    metrics=[types.RubricMetric.FINAL_RESPONSE_QUALITY],
)

Suggested Fix

  1. Add the missing fields to ConversationTurn in vertexai/_genai/types/evals.py, or set extra='ignore' on the model
  2. In _process_multi_turn_agent_response, add turn["turn_index"] = i for each turn before constructing AgentData
