
Conversation

@lizzij
Member

@lizzij lizzij commented Oct 28, 2025

This PR supersedes and fixes #3114. It includes all commits from the original PR and adds a new commit to fix formatting.

Closes #3114.

Link to Issue or Description of Change

Fixes #3109: the LiteLLM wrapper does not propagate finish_reason from LiteLLM responses.

Problem:
The finish_reason field from LiteLLM responses is not propagated to LlmResponse objects, so after_model_callback functions cannot detect completion conditions such as max_tokens truncation.

Solution:
Extract finish_reason from the LiteLLM response, map it to the types.FinishReason enum, and propagate it to LlmResponse. This allows after_model_callback functions to detect:

  • "length": max_tokens limit reached
  • "stop": natural completion
  • "tool_calls": tool invocations
  • "content_filter": filtered content

This enables implementing retry logic for incomplete responses, logging completion statistics, and handling different completion conditions appropriately.
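
For example, a minimal sketch of such a callback. The import paths and callback signature below are assumptions about the ADK API, not part of this PR; adjust for your version:

```python
from typing import Optional

from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmResponse
from google.genai import types


def check_finish_reason(
    callback_context: CallbackContext,
    llm_response: LlmResponse,
) -> Optional[LlmResponse]:
  """Example after_model_callback that inspects finish_reason."""
  if llm_response.finish_reason == types.FinishReason.MAX_TOKENS:
    # The response was truncated at max_tokens; retry or log here.
    print("Response truncated; consider retrying with a higher limit.")
  elif llm_response.finish_reason == types.FinishReason.SAFETY:
    # Content was filtered; handle accordingly.
    print("Response blocked by content filter.")
  return None  # Returning None keeps the model's original response.
```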

Testing Plan

Unit Tests:

✅ 53/53 tests in test_litellm.py pass
✅ All 4 new finish_reason tests pass
✅ No existing tests broken

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end.
  • Any dependent changes have been merged and published in downstream modules.

aperepel and others added 12 commits October 28, 2025 10:56
Fixes google#3109

This change ensures that the finish_reason field from LiteLLM responses
is properly propagated to LlmResponse objects, enabling callbacks to
detect completion conditions like max_tokens truncation.

Changes:
- Extract finish_reason from LiteLLM response in lite_llm.py
- Update tracing.py to handle both enum (Gemini) and string (LiteLLM)
  finish_reason values
- Add comprehensive unit tests for finish_reason propagation

The fix allows after_model_callback functions to properly detect:
- "length": max_tokens limit reached
- "stop": natural completion
- "tool_calls": tool invocations
- "content_filter": filtered content
- Use .name instead of .value for enum finish_reason (more robust for IntEnum)
- Extract first choice using walrus operator for better readability
- Consolidate tests using @pytest.mark.parametrize to reduce duplication
- Strengthen test assertions to verify response content

All 53 tests pass.
Addressing review comments

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Addressing review comments

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Address type safety issue where finish_reason can be either:
- types.FinishReason enum (from Gemini responses)
- str (from LiteLLM responses)

Updated LlmResponse.finish_reason type hint to:
Optional[Union[types.FinishReason, str]]

This ensures type checkers correctly validate the dual nature of this
field across different model providers.

All 53 tests pass.
- Map finish_reason strings to proper FinishReason enum values in lite_llm.py
  - 'length' -> FinishReason.MAX_TOKENS
  - 'stop' -> FinishReason.STOP
  - 'tool_calls'/'function_call' -> FinishReason.STOP
  - 'content_filter' -> FinishReason.SAFETY
  - unknown values -> FinishReason.OTHER

- Add clarifying comment in tracing.py for string fallback path

- Update test_litellm.py to verify enum mapping:
  - Assert finish_reason is FinishReason enum instance
  - Verify correct enum values for each finish_reason string
  - Add test for unknown finish_reason mapping to OTHER

Benefits:
- Type consistency with Gemini native responses
- Avoids runtime warnings from string finish_reason
- Enables proper instanceof checks in callbacks
- Better integration with ADK telemetry
Maps LiteLLM finish_reason string values to proper FinishReason enum
for type consistency with Gemini native responses.

Changes:
- Add _FINISH_REASON_MAPPING dictionary for string->enum conversion
  - "length" -> FinishReason.MAX_TOKENS
  - "stop" -> FinishReason.STOP
  - "tool_calls"/"function_call" -> FinishReason.STOP
  - "content_filter" -> FinishReason.SAFETY
  - Unknown values -> FinishReason.OTHER (fallback)

- Update finish_reason type hint to Optional[FinishReason] (no Union needed)

- Update telemetry tracing to use .name for enum serialization

- Add explanatory comments:
  - Why tool_calls maps to STOP (no TOOL_CALL enum exists)
  - Docstring clarifies mapping applies to all model providers

Tests:
- test_finish_reason_propagation: verifies enum mapping for all values
- test_finish_reason_unknown_maps_to_other: verifies fallback behavior

Benefits:
- Type consistency: finish_reason is always FinishReason enum
- No runtime warnings from mixed types
- Enables proper isinstance() checks in callbacks
- Dictionary mapping improves maintainability
- Better integration with ADK telemetry
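
For reference, a sketch of the mapping as described above; the authoritative version lives in lite_llm.py and may differ in naming or layout:

```python
from google.genai import types

# Illustrative reconstruction of the string-to-enum mapping described above.
_FINISH_REASON_MAPPING = {
    "length": types.FinishReason.MAX_TOKENS,
    "stop": types.FinishReason.STOP,
    # No TOOL_CALL enum exists, so tool calls map to STOP.
    "tool_calls": types.FinishReason.STOP,
    "function_call": types.FinishReason.STOP,
    "content_filter": types.FinishReason.SAFETY,
}

# Unknown strings fall back to OTHER.
reason = _FINISH_REASON_MAPPING.get("some_new_reason", types.FinishReason.OTHER)
```
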
- Simplify tracing.py by removing isinstance check (always enum now)
- Refactor test assertions to use dictionary mapping instead of if/elif
- Reduce code duplication and improve readability

Addresses Gemini Code Assist bot suggestions:
- tracing.py: Direct .name access since finish_reason is always enum
- test_litellm.py: Dictionary mapping for cleaner test assertions
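
Illustrative only (variable names are placeholders): once finish_reason is guaranteed to be an enum, telemetry can serialize it by name without an isinstance check:

```python
from google.genai import types

finish_reason = types.FinishReason.MAX_TOKENS
# Enum members serialize cleanly by name for span attributes and logs.
attribute_value = finish_reason.name  # "MAX_TOKENS"
```
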
Import and use the actual _FINISH_REASON_MAPPING from lite_llm instead of
duplicating it in tests. This ensures tests stay in sync with implementation
changes automatically.

Benefits:
- Single source of truth for finish_reason mappings
- Tests automatically reflect any future mapping changes
- Reduced code duplication
- Better maintainability

Addresses review comment:
google#3114 (review)
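
A hypothetical sketch of that pattern; the real tests in test_litellm.py exercise the full LiteLLM response path and may be structured differently:

```python
import pytest

from google.adk.models.lite_llm import _FINISH_REASON_MAPPING
from google.genai import types


@pytest.mark.parametrize(
    "litellm_reason, expected_enum", list(_FINISH_REASON_MAPPING.items())
)
def test_finish_reason_mapping_stays_in_sync(litellm_reason, expected_enum):
  # Reusing the imported mapping keeps assertions aligned with future changes.
  mapped = _FINISH_REASON_MAPPING.get(litellm_reason, types.FinishReason.OTHER)
  assert isinstance(mapped, types.FinishReason)
  assert mapped is expected_enum
```
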
The Union type is no longer needed since finish_reason is always
a FinishReason enum (never a string after our mapping).

Addresses review comment:
google#3114 (comment)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@gemini-code-assist

Summary of Changes

Hello @lizzij, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical gap in how LiteLLM responses handle completion reasons. By correctly propagating and standardizing the finish_reason field, it significantly enhances the ability of downstream functions to interpret model outputs. This improvement is crucial for implementing advanced logic like retry mechanisms for incomplete responses, detailed logging of completion statistics, and appropriate handling of diverse model termination conditions, ultimately leading to more reliable and intelligent LLM interactions.

Highlights

  • LiteLLM finish_reason Propagation: Ensures that the finish_reason field from LiteLLM responses is correctly extracted and mapped to the LlmResponse object's finish_reason attribute, using the types.FinishReason enum.
  • Enhanced Callback Functionality: Allows after_model_callback functions to accurately detect various completion conditions such as max_tokens truncation, natural completion, tool invocations, and content filtering, enabling more robust post-response logic.
  • Standardized finish_reason Handling: Introduces a mapping (_FINISH_REASON_MAPPING) to convert LiteLLM's string-based finish_reason values to the standardized types.FinishReason enum, ensuring consistency across different model providers and improving type safety.
  • Updated Telemetry and Documentation: Adjusts the telemetry tracing to correctly log the standardized finish_reason enum values and updates the LlmResponse docstring to clarify that finish_reason will always be a types.FinishReason enum.
  • Comprehensive Unit Testing: Adds new unit tests to validate the proper propagation and mapping of various finish_reason types, including length, stop, tool_calls, content_filter, and ensures unknown reasons default to FinishReason.OTHER.

@adk-bot adk-bot added the models [Component] Issues related to model support label Oct 28, 2025

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly propagates the finish_reason from LiteLLM responses to LlmResponse objects for non-streaming calls. It introduces a mapping from LiteLLM's string reasons to the FinishReason enum, handles unknown reasons gracefully, and updates telemetry and docstrings accordingly. The accompanying tests are thorough for the non-streaming case. However, the same logic for propagating finish_reason has not been applied to streaming responses in generate_content_async, which means the fix is incomplete. This should be addressed to ensure consistent behavior across both streaming and non-streaming modes.

@xuanyang15
Collaborator

Hi @lizzij, is this a duplicate of #3114? Could you push to the original author's branch so that your change updates the original PR instead of creating a new one?

@lizzij
Member Author

lizzij commented Oct 28, 2025

Hi @lizzij, is this a duplicate of #3114? Could you push to the original author's branch so that your change updates the original PR instead of creating a new one?

I don't have write access to google/adk-python and can't directly update the original author's branch.

@xuanyang15
Collaborator

@lizzij I see, thanks for helping update the change! Do you have a screenshot for the manual tests?

@lizzij
Member Author

lizzij commented Oct 28, 2025

@lizzij I see, thanks for helping update the change! Do you have a screenshot for the manual tests?

[Screenshot: manual test results, 2025-10-28 at 1:23:53 PM]

@GWeale GWeale self-requested a review October 28, 2025 21:34
@GWeale GWeale added the ready to pull [Status] This PR is ready to be imported back to Google label Oct 28, 2025
copybara-service bot pushed a commit that referenced this pull request Oct 30, 2025
Closes #3114.

Co-authored-by: Eliza Huang <heliza@google.com>
COPYBARA_INTEGRATE_REVIEW=#3319 from lizzij:fix/litellm-finish-reason-3109 da6ed0a
PiperOrigin-RevId: 825783229
@Jacksunwei Jacksunwei added this to the 10/27 ADK Week milestone Oct 30, 2025
@GWeale
Collaborator

GWeale commented Oct 30, 2025

Thank you for your contribution, added to the codebase!

@GWeale GWeale closed this Oct 30, 2025

Labels

models [Component] Issues related to model support
ready to pull [Status] This PR is ready to be imported back to Google

Development

Successfully merging this pull request may close these issues.

The LiteLLM wrapper does not propagate finish_reason from LiteLLM responses

6 participants