feat: Add finish_reason field to StreamingChunk #9536

Open
wants to merge 5 commits into
base: main

vblagoje
Member

@vblagoje vblagoje commented Jun 19, 2025

Why:

Introduces a migration path for standardizing finish_reason handling across all chat generators in Haystack. This PR adds a dedicated finish_reason field to the StreamingChunk class, replacing the current dictionary-based approach of reading finish_reason from meta. The dedicated field improves type safety and makes finish_reason logic self-documenting and easier to follow, while the migration timeline runs from Haystack 2.15 to 2.17.

What:

  • Phase 1 (Haystack 2.15): Added finish_reason: Optional[Union[FinishReason, str]] field to StreamingChunk class
  • Migration Foundation: Created FinishReason Literal type with OpenAI standard values ("stop", "length", "tool_calls", "content_filter")
  • Backward Compatibility: Updated OpenAIChatGenerator to populate both new field and legacy meta field during transition
  • Smart Fallback Logic: Modified utility functions to prefer new field but gracefully fall back to meta["finish_reason"]
  • Migration Timeline: Implemented deprecation warnings for meta access that explicitly name Haystack 2.17 as the removal version
  • Documentation: Created reno release note outlining the migration plan and timeline
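
To make the Phase 1 changes concrete, here is a minimal sketch of what the new field and the dual population could look like. It does not import Haystack: StreamingChunkSketch and make_final_chunk are hypothetical stand-ins invented for illustration, and only the FinishReason values and the field's type annotation come from the description above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Literal, Optional, Union

# The four OpenAI-style values named in the PR description.
FinishReason = Literal["stop", "length", "tool_calls", "content_filter"]


@dataclass
class StreamingChunkSketch:
    """Simplified stand-in for StreamingChunk; not the real Haystack class."""

    content: str
    meta: Dict[str, Any] = field(default_factory=dict)
    # New dedicated field; a plain str is still accepted alongside the Literal.
    finish_reason: Optional[Union[FinishReason, str]] = None


def make_final_chunk(content: str, reason: str) -> StreamingChunkSketch:
    """Hypothetical helper: fill both the new field and the legacy meta key."""
    return StreamingChunkSketch(
        content=content,
        meta={"finish_reason": reason},
        finish_reason=reason,
    )


chunk = make_final_chunk("", "stop")
print(chunk.finish_reason, chunk.meta["finish_reason"])
```

Writing the value to both places is what lets old and new consumers read the same chunk during the transition.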

How can it be used:

The migration plan enables immediate adoption while maintaining compatibility:

# New approach (works immediately with OpenAI generator)
if chunk.finish_reason == "stop":
    add_spacing()

# Legacy approach (works until Haystack 2.17 with warning)
if chunk.meta.get("finish_reason") == "stop":
    add_spacing()

How did you test it:

  • Migration Compatibility: Tested mixed scenarios with chunks from updated (OpenAI) and non-updated generators
  • Backward Compatibility: Verified existing code using meta["finish_reason"] continues working with deprecation warnings
  • Timeline Validation: Confirmed deprecation warnings explicitly mention "Haystack 2.17" removal
  • Fallback Logic: Validated utility functions handle both field access patterns during migration period
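
The timeline and backward-compatibility checks above could be expressed roughly as follows. This is a sketch under assumptions, not the test code from this PR: resolve_finish_reason is a hypothetical helper, and the warning message wording is invented; only the requirement that the warning mentions "Haystack 2.17" comes from the description.

```python
import warnings
from typing import Any, Dict, Optional


def resolve_finish_reason(finish_reason: Optional[str], meta: Dict[str, Any]) -> Optional[str]:
    """Hypothetical helper: prefer the new field, fall back to meta with a warning."""
    if finish_reason is not None:
        return finish_reason
    legacy = meta.get("finish_reason")
    if legacy is not None:
        # Assumed wording; the real message in Haystack may differ.
        warnings.warn(
            "Reading finish_reason from meta is deprecated and will be removed in Haystack 2.17.",
            DeprecationWarning,
            stacklevel=2,
        )
    return legacy


# Record warnings so both access patterns can be checked side by side.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    new_style = resolve_finish_reason("stop", {})
    legacy_style = resolve_finish_reason(None, {"finish_reason": "length"})

messages = [str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)]
print(new_style, legacy_style, len(messages))
```

Only the legacy path emits a warning, so code that already reads the new field stays silent.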

Notes for the reviewer:

Migration Strategy: This PR implements Phase 1 of a planned 3-phase migration:

  • Phase 1 (Haystack 2.15): Introduce new field, update OpenAI generator, maintain full backward compatibility
  • Phase 2 (2.15-2.17): Gradually update remaining generators (Anthropic, HuggingFace, Bedrock, etc.) to populate new field
  • Phase 3 (Haystack 2.17): Remove meta["finish_reason"] support and complete migration

Critical Migration Logic: The fallback in _convert_streaming_chunks_to_chat_message is essential: it handles chunks from both updated generators (which set chunk.finish_reason) and non-updated generators (which set only meta["finish_reason"]) during the two-version migration window. This avoids breaking changes while we systematically update all chat generators.
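
As an illustration of the mixed-generator case, the sketch below resolves a final finish reason from a list of chunks, whichever way each chunk carries it. ChunkSketch and final_finish_reason are hypothetical names; the actual logic inside _convert_streaming_chunks_to_chat_message may be structured differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ChunkSketch:
    """Simplified stand-in for a streaming chunk; not the real Haystack class."""

    content: str
    meta: Dict[str, Any] = field(default_factory=dict)
    finish_reason: Optional[str] = None


def final_finish_reason(chunks: List[ChunkSketch]) -> Optional[str]:
    """Return the last finish reason found, preferring the field over meta."""
    for chunk in reversed(chunks):
        reason = chunk.finish_reason
        if reason is None:
            # Non-updated generators only fill the legacy meta key.
            reason = chunk.meta.get("finish_reason")
        if reason is not None:
            return reason
    return None


# Chunks from an updated generator carry both the field and the meta key.
updated = [
    ChunkSketch("Hel"),
    ChunkSketch("lo", meta={"finish_reason": "stop"}, finish_reason="stop"),
]
# Chunks from a non-updated generator carry only the meta key.
legacy = [ChunkSketch("Hel"), ChunkSketch("lo", meta={"finish_reason": "length"})]

print(final_finish_reason(updated), final_finish_reason(legacy))
```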

@vblagoje vblagoje added the ignore-for-release-notes label (PRs with this flag won't be included in the release notes) Jun 19, 2025
@github-actions github-actions bot added the topic:tests and type:documentation (Improvements on the docs) labels Jun 19, 2025
@coveralls
Collaborator

coveralls commented Jun 19, 2025

Pull Request Test Coverage Report for Build 15758813118

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 12 unchanged lines in 4 files lost coverage.
  • Overall coverage decreased (-0.01%) to 90.174%

Files with Coverage Reduction          New Missed Lines   %
components/generators/utils.py         1                  74.0%
dataclasses/__init__.py                2                  23.08%
components/generators/chat/openai.py   3                  96.36%
dataclasses/streaming_chunk.py         6                  91.43%

Totals
Change from base Build 15728854337: -0.01%
Covered Lines: 11582
Relevant Lines: 12844

💛 - Coveralls

@vblagoje vblagoje removed the ignore-for-release-notes label (PRs with this flag won't be included in the release notes) Jun 19, 2025
@vblagoje vblagoje marked this pull request as ready for review June 19, 2025 13:42
@vblagoje vblagoje requested review from a team as code owners June 19, 2025 13:42
@vblagoje vblagoje requested review from dfokina and sjrl and removed request for a team June 19, 2025 13:42
@vblagoje vblagoje added this to the 2.15.0 milestone Jun 19, 2025
@vblagoje
Member Author

@sjrl please review and make sure that the PR and the migration plan make sense to you 🙏

@julian-risch julian-risch removed this from the 2.15.0 milestone Jun 20, 2025
@julian-risch
Member

The issue corresponding to this PR is already in the 2.15.0 milestone. That's the reason why I removed the PR from the milestone again. It's still planned to be part of the release.

Successfully merging this pull request may close these issues.

Add finish_reason field to StreamingChunk
3 participants