
fix: make _EvalMetricResultWithInvocation.expected_invocation Optional for conversation_scenario support#5215

Merged
ankursharmas merged 2 commits into google:main from ASRagab:fix/optional-expected-invocation
Apr 14, 2026

Conversation

@ASRagab
Contributor

@ASRagab ASRagab commented Apr 8, 2026

Summary

  • _EvalMetricResultWithInvocation.expected_invocation is typed as Invocation (required), but local_eval_service.py:285-287 intentionally sets it to None when eval_case.conversation is None (i.e., conversation_scenario user-simulation cases)
  • The public model EvalMetricResultPerInvocation in eval_metrics.py:323 already types this field as Optional[Invocation] = None
  • This mismatch causes a pydantic ValidationError during post-processing in _get_eval_metric_results_with_invocation, after all metrics have been computed

Changes

  • Make expected_invocation Optional[Invocation] = None in _EvalMetricResultWithInvocation
  • Guard the three attribute accesses in _print_details to handle None (fall back to actual_invocation.user_content for the prompt column, None for expected response/tool calls)
  • Both _convert_content_to_text and _convert_tool_calls_to_text already accept Optional parameters
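The guard described in the second bullet can be sketched in a few lines. This is not taken from the diff — a minimal stdlib illustration with a toy `Invocation` stand-in and a hypothetical `prompt_for_row` helper:

```python
from dataclasses import dataclass
from typing import Optional

# Toy stand-in for the ADK Invocation type (hypothetical, illustration only).
@dataclass
class Invocation:
    user_content: str
    final_response: Optional[str] = None

def prompt_for_row(expected: Optional[Invocation], actual: Invocation) -> str:
    # Guarded access: when expected_invocation is None (the
    # conversation_scenario path), fall back to the actual invocation's
    # user content for the prompt column.
    source = expected if expected is not None else actual
    return source.user_content

actual = Invocation(user_content="Hello")
print(prompt_for_row(None, actual))                              # "Hello" (fallback)
print(prompt_for_row(Invocation(user_content="Hi"), actual))     # "Hi"
```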

Testing Plan

Verified with a pytest-based evaluation using AgentEvaluator.evaluate() against an evalset containing conversation_scenario cases (LLM-backed user simulation, no explicit conversation arrays).

Before fix — crashes after ~33 minutes of metric computation during post-processing:

pydantic_core._pydantic_core.ValidationError: 1 validation error for _EvalMetricResultWithInvocation
expected_invocation
  Input should be a valid dictionary or instance of Invocation [type=model_type, input_value=None, input_type=NoneType]

.venv/lib/python3.11/site-packages/google/adk/evaluation/agent_evaluator.py:639: ValidationError

After fix — the ValidationError is eliminated. The None expected_invocation flows through correctly because:

  1. The field now accepts Optional[Invocation], matching the upstream EvalMetricResultPerInvocation model
  2. _print_details gracefully handles None by falling back to actual_invocation.user_content for the prompt column and passing None to _convert_content_to_text/_convert_tool_calls_to_text (both already accept Optional inputs)
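Point 1 can be sketched with a toy pydantic model (not the real ADK classes — `Invocation` and `MetricResult` here are stand-ins): with the `Optional[...] = None` annotation, pydantic accepts both an omitted field and an explicit None.

```python
from typing import Optional
from pydantic import BaseModel

# Toy stand-in for the ADK Invocation model (illustration only).
class Invocation(BaseModel):
    user_content: str = ""

# After the fix, the private model mirrors the public
# EvalMetricResultPerInvocation typing:
class MetricResult(BaseModel):
    expected_invocation: Optional[Invocation] = None

print(MetricResult().expected_invocation)                           # None
print(MetricResult(expected_invocation=None).expected_invocation)   # None
```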

Reproduction evalset (any evalset with conversation_scenario triggers this):

{
  "eval_set_id": "test",
  "eval_cases": [{
    "eval_id": "scenario_1",
    "conversation_scenario": {
      "starting_prompt": "Hello",
      "conversation_plan": "Ask the agent a question and accept the answer."
    },
    "session_input": {"app_name": "my_agent", "user_id": "user1", "state": {}}
  }]
}
import pytest
from google.adk.evaluation.agent_evaluator import AgentEvaluator

@pytest.mark.asyncio
async def test_scenario():
    await AgentEvaluator.evaluate("my_agent", "path/to/evalset.json", num_runs=1)

Fixes #5214

When using conversation_scenario for user simulation, expected_invocation
is None because conversations are dynamically generated. The public model
EvalMetricResultPerInvocation already types this as Optional[Invocation],
but the private _EvalMetricResultWithInvocation requires non-None, causing
a pydantic ValidationError during post-processing.

- Make expected_invocation Optional[Invocation] = None
- Guard attribute accesses in _print_details to handle None
- Fall back to actual_invocation.user_content for the prompt column

Fixes google#5214
@adk-bot adk-bot added the eval [Component] This issue is related to evaluation label Apr 8, 2026
@adk-bot
Collaborator

adk-bot commented Apr 8, 2026

Response from ADK Triaging Agent

Hello @ASRagab, thank you for submitting this pull request!

To help the reviewers, could you please add a testing plan section to your PR description explaining how you verified the fix? For example, did you run the evaluation with a conversation_scenario?

Including the logs or a screenshot showing that the ValidationError is gone after your change would also be very helpful.

You can find more details in our contribution guidelines. Thanks!

Comment thread src/google/adk/evaluation/agent_evaluator.py
@rohityan rohityan self-assigned this Apr 9, 2026
@ASRagab changed the title fix: make _EvalMetricResultWithInvocation.expected_invocation Optional for conversation_scenario support Apr 9, 2026
@ASRagab
Contributor Author

ASRagab commented Apr 9, 2026

Testing Evidence for PR #5215

Reproduction Script

A targeted script that exercises the exact codepath fixed by this PR:

  1. Constructs _EvalMetricResultWithInvocation with expected_invocation=None (the conversation_scenario path)
  2. Exercises the three guard paths in _print_details where .user_content, .final_response, and .intermediate_data are accessed on expected_invocation
  3. Verifies the non-None path still works (regression check)

Before (PyPI google-adk==1.28.0, unfixed)

============================================================
TEST 1: _EvalMetricResultWithInvocation(expected_invocation=None)
============================================================
  FAIL: ValidationError: 1 validation error for _EvalMetricResultWithInvocation
expected_invocation
  Input should be a valid dictionary or instance of Invocation [type=model_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.12/v/model_type

Pydantic rejects None because the field is typed as Invocation (non-optional).
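The rejection is reproducible with any required model-typed field. A minimal sketch using toy models (not the real ADK classes):

```python
from pydantic import BaseModel, ValidationError

# Toy stand-ins to reproduce the rejection (illustration only).
class Invocation(BaseModel):
    user_content: str = ""

class Result(BaseModel):
    expected_invocation: Invocation  # required, non-Optional: None is rejected

try:
    Result(expected_invocation=None)
except ValidationError as exc:
    # pydantic v2 reports "Input should be a valid dictionary or
    # instance of Invocation" with error type "model_type".
    print(len(exc.errors()), "validation error")
```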

After (local editable install from fix/optional-expected-invocation branch)

============================================================
TEST 1: _EvalMetricResultWithInvocation(expected_invocation=None)
============================================================
  PASS: constructed successfully
  expected_invocation is None: True

============================================================
TEST 2: Guard paths for None expected_invocation
============================================================
  PASS: prompt = 'Hello'
  PASS: expected_response = ''
  PASS: expected_tool_calls = ''

============================================================
TEST 3: _EvalMetricResultWithInvocation(expected_invocation=<Invocation>)
============================================================
  PASS: constructed with real invocation
  PASS: prompt = 'Hello'

============================================================
ALL TESTS PASSED
============================================================

What was verified

| Check | Result |
| --- | --- |
| `expected_invocation: Optional[Invocation] = None` (line 93) | None accepted without ValidationError |
| `_print_details` prompt fallback to `actual_invocation.user_content` | Works correctly |
| `_print_details` expected_response fallback to None | `_convert_content_to_text(None)` returns `""` |
| `_print_details` expected_tool_calls fallback to None | `_convert_tool_calls_to_text(None)` returns `""` |
| Non-None `expected_invocation` (regression) | Still works as before |

Context

This was tested using a conversation_scenario-based evalset from an agent project. The multi-turn evalset has 5 cases that all use conversation_scenario (no explicit conversation array), which is exactly the codepath where local_eval_service.py sets expected_invocation=None during post-processing.

@rohityan rohityan added the needs review [Status] The PR/issue is awaiting review from the maintainer label Apr 13, 2026
@rohityan
Collaborator

Hi @ASRagab, thank you for your contribution! We appreciate you taking the time to submit this pull request. Your PR has been received by the team and is currently under review. We will provide feedback as soon as we have an update to share.

@rohityan
Collaborator

Hi @wukath, can you please review this?

@wukath
Collaborator

wukath commented Apr 13, 2026

cc @ankursharmas if you could take a look

@ankursharmas ankursharmas merged commit a4c9387 into google:main Apr 14, 2026
14 checks passed


Development

Successfully merging this pull request may close these issues.

AgentEvaluator crashes with ValidationError when evaluating conversation_scenario eval cases
