Feat: Agent Evaluations print detailed results when all tests pass #3892
base: main
Conversation
Summary of Changes

Hello @quad2524, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the agent evaluation system with an opt-in mechanism to display detailed evaluation metrics for tests that pass. Previously, detailed results were only presented for failing tests. This change introduces a new parameter and a corresponding CLI flag, giving users greater control over the verbosity of evaluation output, which is particularly useful for in-depth analysis of successful agent performance.
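As a rough sketch of the programmatic opt-in (the parameter name is taken from this PR's description, and the call assumes the existing async `AgentEvaluator.evaluate` signature; the agent module and file path below are placeholders):

```python
import asyncio

from google.adk.evaluation.agent_evaluator import AgentEvaluator


async def main() -> None:
    # Opt in to detailed metrics for passing tests as well as failures.
    await AgentEvaluator.evaluate(
        agent_module="my_app.my_agent",  # hypothetical agent module
        eval_dataset_file_path_or_dir="my_eval_set.test.json",  # hypothetical path
        print_detailed_results=True,
        print_detailed_results_on_success=True,  # new opt-in from this PR
    )


asyncio.run(main())
```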
Code Review
This pull request introduces a new feature to print detailed evaluation results even for passing tests, controlled by the print_detailed_results_on_success argument. The changes are well-implemented across both the CLI command adk eval and the programmatic AgentEvaluator methods, ensuring consistent behavior. The logic correctly handles the new option. I've included a couple of suggestions to refactor the conditional logic for improved readability and maintainability.
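For illustration only (the function and variable names here are hypothetical, not the PR's actual code), the conditional under discussion presumably reduces to something like:

```python
def should_print_details(
    test_passed: bool,
    print_detailed_results: bool,
    print_detailed_results_on_success: bool,
) -> bool:
    # Failures keep the existing behavior; passing tests print the
    # detailed metrics table only when the new opt-in is set.
    return print_detailed_results and (
        not test_passed or print_detailed_results_on_success
    )
```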
Hi @quad2524, thank you for your contribution! We appreciate you taking the time to submit this pull request.
Force-pushed from 29bcb87 to 852da24
Link to Issue or Description of Change
1. Link to an existing issue (if applicable):
Testing Plan
I verified the changes using a reproduction script that tested both the default behavior and the new opt-in behavior.
Unit Tests:
Manual End-to-End (E2E) Tests:
I created a reproduction script `reproduce_issue.py` (which I have since cleaned up) that invoked `AgentEvaluator.evaluate` on an existing test fixture (`tests/integration/fixture/home_automation_agent/simple_test.test.json`).
- Ran with `print_detailed_results=True` (and the implied `print_detailed_results_on_success=False`). Verified that no detailed metrics table was printed for a passing test.
- Ran with `print_detailed_results=True` and `print_detailed_results_on_success=True`. Verified that the detailed metrics table was printed for the passing test.
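A minimal sketch of that reproduction flow, assuming the async `AgentEvaluator.evaluate` signature and the fixture path above (the agent module path is an assumption, and `print_detailed_results_on_success` is the kwarg this PR adds):

```python
import asyncio

from google.adk.evaluation.agent_evaluator import AgentEvaluator

FIXTURE = "tests/integration/fixture/home_automation_agent/simple_test.test.json"
AGENT = "tests.integration.fixture.home_automation_agent"  # assumed module path


async def main() -> None:
    # Default behavior: detailed metrics are printed only for failures,
    # so a passing test produces no table.
    await AgentEvaluator.evaluate(
        agent_module=AGENT,
        eval_dataset_file_path_or_dir=FIXTURE,
        print_detailed_results=True,
    )
    # New opt-in: the detailed metrics table is printed even on success.
    await AgentEvaluator.evaluate(
        agent_module=AGENT,
        eval_dataset_file_path_or_dir=FIXTURE,
        print_detailed_results=True,
        print_detailed_results_on_success=True,
    )


asyncio.run(main())
```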
CLI Verification:
I also verified the new CLI flag `--print_detailed_results_on_success` with the `adk eval` command.

Checklist
Additional context
This change introduces a backward-compatible optional argument `print_detailed_results_on_success` to `AgentEvaluator.evaluate` and `evaluate_eval_set`. It also adds the `--print_detailed_results_on_success` flag to the `adk eval` CLI command.
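On the CLI side, an invocation would presumably look something like this (the agent and eval-set paths are placeholders, and this assumes the new flag is a boolean switch alongside the existing `--print_detailed_results`):

```bash
adk eval path/to/my_agent path/to/my_eval_set.test.json \
    --print_detailed_results \
    --print_detailed_results_on_success
```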