Fix/is last chunk 9085 #9094

baughmann · 2025-11-26T20:30:46Z

Summary

Fixes #9085 - is_last_chunk was never being set to True in StreamResponse objects in DSPy 3.0.4, breaking stream completion detection. I noticed this behaviour both with Qwen 3 Next 80b and with GPT OSS 120b with vLLM. Given that @chenmoneygithub fixed #8890 it would be really nice to get their two cents on this PR if possible. Not sure if the model they used behaved differently to these ones.

Background

I discovered that is_last_chunk was always False when using StreamListener with the built-in ReAct module. This made it impossible to reliably detect when streaming had completed for a given field. My original goal was straightforward: make sure is_last_chunk gets sent.

Why This Fix Couldn't Be More Surgical

The root cause is in _default_handle_stream_chunk():

Original code (line 295)

if token:
    return StreamResponse(...)

The method only returned a StreamResponse when token was non-empty. Since the final chunk is detected by seeing the next field's marker (or end of stream), the token buffer is often empty at that point - so no final chunk was ever emitted.

I considered several approaches:

Minimal fix: Only emit when self.stream_end=True - This would still end up not emitting the last chunk most of the time
Separate final chunk: Basically keep the existing if token: condition as-is, but add logic to track whether we've already emitted a final chunk and emit it if we haven't. This would've added more state to keep track of in the stream listener.
Current approach: Change condition to if token or self.stream_end: - Simpler but changes behavior

I went with option 3 because it ensures is_last_chunk=True is always emitted, makes the behavior predictable, and aligns with the existing finalize() method behavior added in #8890.

Behavioral Changes

This changes user-facing behavior in a significant way.

Before (v3.0.4):

chunks = [
    StreamResponse(chunk='To', is_last_chunk=False),
    StreamResponse(chunk=' get to', is_last_chunk=False),
    StreamResponse(chunk=' the other side', is_last_chunk=False),
]
# No chunk with is_last_chunk=True was emitted

After (this PR):

chunks = [
    StreamResponse(chunk='To', is_last_chunk=False),
    StreamResponse(chunk=' get to', is_last_chunk=False),
    StreamResponse(chunk=' the other side', is_last_chunk=False),
    StreamResponse(chunk='', is_last_chunk=True),  # New: empty final chunk
]

What's different:

Every field now gets a final chunk with chunk="" and is_last_chunk=True
Users will receive one additional StreamResponse per field
The empty string in chunk is intentional - it's an end-of-stream marker

Why I Think This Is Correct

Looking at the history:

Commit ab7ac0b9 (July 2025) initially added is_last_chunk with the same if token: condition
Commit a0b2155a (October 2025) added the finalize() method specifically to handle missing completion markers by emitting empty final chunks
Several tests were updated to expect all_chunks[-2] instead of all_chunks[-1] because of these empty chunks

This suggests empty final chunks were already the intended behavior (maybe?), but the implementation was incomplete. This PR makes it consistent across all streaming scenarios, not just when completion markers are missing.

Changes Made

Meat-and-potatoes

Modified dspy/streaming/streaming_listener.py::_default_handle_stream_chunk() (1 line change)
Changed condition from if token: to if token or self.stream_end:

Tests

Fixed test_stream_listener_returns_correct_chunk_chat_adapter_untokenized_stream to expect empty final chunks
Added 8 tests in tests/streaming/test_streaming_is_last_chunk.py (willing to append these to test_streaming.py if desired:
- Coverage for ChatAdapter, JSONAdapter, XMLAdapter
- Multiple listeners scenario
- allow_reuse=True and allow_reuse=False behavior
- Edge cases (few tokens, varied chunk sizes, order invariance)

Documentation

Enhanced StreamResponse docstring to document all 4 fields including is_last_chunk
Updated StreamListener docstring with usage examples
Added comprehensive section in streaming tutorial explaining when is_last_chunk=True occurs, why the final chunk is empty, and how to use it

Testing

All 41 streaming tests pass (36 passed, 3 skipped). The new tests ensure exactly one is_last_chunk=True per field per streaming session and consistent behavior across all adapters. I think I am unable to run some of the other tests locally, so I am hoping that CI picks up on this and gives me some more answers.

Questions for Reviewers

I'm specifically looking for feedback on:

Is this the right approach? The behavioral change affects all users. Are there backward compatibility concerns I should address?
Alternative solutions? Are there better ways to fix is_last_chunk that would be less invasive? For example:
- Emitting final chunk only from finalize() instead of in the main flow?
- Making empty final chunks opt-in via a flag?
- Different signaling mechanism entirely?
Documentation: Is the explanation of empty final chunks clear enough? Should we add migration guidance?
Edge cases: Are there scenarios where this change could break existing usage patterns that I haven't considered?

Compatibility Notes

I think the impact is minimal because:

The bug made the feature unusable - is_last_chunk was always False, so users couldn't rely on it anyway
Empty chunks are filterable - Users can if chunk.chunk: to skip them if needed
Predictable behavior - The guarantee of exactly one is_last_chunk=True per field makes completion detection reliable

That said, users who iterate over all chunks will see additional entries. If this is a concern, I'm open to exploring other approaches.

Thanks in advance, and please be gentle I'm new here!

Fixes stanfordnlp#9085 - Modified _default_handle_stream_chunk to always set is_last_chunk=self.stream_end - Ensures every StreamListener emits exactly one chunk with is_last_chunk=True - Updated test to expect empty final chunks with is_last_chunk=True - Added 8 comprehensive tests for is_last_chunk behavior across all adapters - Updated StreamResponse and StreamListener docstrings - Enhanced streaming tutorial documentation with is_last_chunk explanation

chenmoneygithub · 2025-11-26T20:36:06Z

duplicates #9089

Anyway, thanks for the PR! The fix is mostly correct, but it's incorrect to claim that the last chunk is always empty.

baughmann · 2025-11-26T20:39:36Z

@chenmoneygithub oops yep good catch thank you!

baughmann added 2 commits November 26, 2025 15:12

Merge remote-tracking branch 'origin/main' into fix/is-last-chunk-9085

27325cf

chenmoneygithub closed this Nov 26, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix/is last chunk 9085 #9094

Fix/is last chunk 9085 #9094

baughmann commented Nov 26, 2025 •

edited

Loading

Uh oh!

chenmoneygithub commented Nov 26, 2025

Uh oh!

baughmann commented Nov 26, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Fix/is last chunk 9085 #9094

Fix/is last chunk 9085 #9094

Conversation

baughmann commented Nov 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Background

Why This Fix Couldn't Be More Surgical

Original code (line 295)

Behavioral Changes

Before (v3.0.4):

After (this PR):

Why I Think This Is Correct

Changes Made

Meat-and-potatoes

Tests

Documentation

Testing

Questions for Reviewers

Compatibility Notes

Uh oh!

chenmoneygithub commented Nov 26, 2025

Uh oh!

baughmann commented Nov 26, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

baughmann commented Nov 26, 2025 •

edited

Loading