Fix OpenAiChatModel.stream() to only buffer tool calls. #6288

Closed
ericbottard wants to merge 1 commit into spring-projects:main from ericbottard:gh-5987

Conversation

@ericbottard
Member

Previously, the new implementation of OpenAiChatModel.stream() buffered
the entire response, which meant that the caller would not receive
any tokens until the entire response was received. This change modifies
the implementation to only buffer tool calls, allowing the caller to
receive tokens as they are generated.

Fix #5987

@ericbottard ericbottard requested a review from tzolov June 4, 2026 09:30
Previously, the new implementation of OpenAiChatModel.stream() buffered
the entire response, which meant that the caller would not receive
any tokens until the entire response was received. This change modifies
the implementation to only buffer tool calls, allowing the caller to
receive tokens as they are generated.

Fix spring-projects#5987

Signed-off-by: Eric Bottard <eric.bottard@broadcom.com>
@tzolov tzolov self-assigned this Jun 4, 2026
@tzolov tzolov added the bug, openai, and streaming labels Jun 4, 2026
@tzolov tzolov added this to the 2.0.0-RC1 milestone Jun 4, 2026

Flux<ChatResponse> flux = chatResponses
Flux<ChatResponse> observedResponses = chatResponses.doOnError(observation::error)
.doFinally(s -> observation.stop())
Member Author

@tzolov pay attention to this comment as well; maybe this needs changing.

tzolov added a commit that referenced this pull request Jun 4, 2026
Previously the entire response was buffered before emitting any tokens.
The new implementation uses bufferUntil(finish_reason=TOOL_CALLS) so only
tool-call chunks are held back for aggregation; all other tokens stream
immediately.

Also removes dead code and broken tests left over after the internal tool
execution loop was deleted from OpenAiChatModel.

Closes #5987

Signed-off-by: Eric Bottard <eric.bottard@broadcom.com>
Signed-off-by: Christian Tzolov <christian.tzolov@broadcom.com>

Co-authored-by: Christian Tzolov <christian.tzolov@broadcom.com>
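
To illustrate the buffering rule described in the commit message above, here is a minimal plain-Java sketch of bufferUntil-style grouping: a buffer is closed by the element that satisfies the predicate, and that element is kept as the last item of the buffer. It is not the Spring AI implementation and does not use Reactor. The `Chunk` record, the `group` and `mergeArguments` helpers, and the rule used to detect an in-progress tool call are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ToolCallBufferingSketch {

    /** Hypothetical, simplified stand-in for one streamed chunk (not a Spring AI type). */
    record Chunk(String text, String toolArgsFragment, String finishReason) {
        boolean isToolCallDelta() {
            return toolArgsFragment != null;
        }

        boolean finishesToolCalls() {
            return "tool_calls".equalsIgnoreCase(finishReason);
        }
    }

    /**
     * Groups chunks with bufferUntil-like semantics: the chunk that closes a
     * buffer is included as its last element. Outside a tool call every chunk
     * closes its own buffer, so plain text passes through one chunk at a time.
     */
    static List<List<Chunk>> group(List<Chunk> chunks) {
        List<List<Chunk>> buffers = new ArrayList<>();
        List<Chunk> current = new ArrayList<>();
        boolean insideToolCall = false;
        for (Chunk chunk : chunks) {
            current.add(chunk);
            if (chunk.isToolCallDelta()) {
                insideToolCall = true;   // start (or continue) holding chunks back
            }
            if (chunk.finishesToolCalls()) {
                insideToolCall = false;  // the tool call is complete
            }
            if (!insideToolCall) {
                buffers.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            buffers.add(current);        // flush whatever is left when the stream ends
        }
        return buffers;
    }

    /** Concatenates the argument fragments of a buffered tool call. */
    static String mergeArguments(List<Chunk> buffer) {
        StringBuilder merged = new StringBuilder();
        for (Chunk chunk : buffer) {
            if (chunk.isToolCallDelta()) {
                merged.append(chunk.toolArgsFragment());
            }
        }
        return merged.toString();
    }

    public static void main(String[] args) {
        List<Chunk> stream = List.of(
                new Chunk("Let me ", null, null),
                new Chunk("check.", null, null),
                new Chunk(null, "{\"city\":", null),
                new Chunk(null, "\"Paris\"}", null),
                new Chunk(null, null, "tool_calls"));
        for (List<Chunk> buffer : group(stream)) {
            System.out.println(buffer.size() + " chunk(s), tool arguments: '"
                    + mergeArguments(buffer) + "'");
        }
    }
}
```

With this input, the two text chunks each form their own single-element buffer and can be forwarded immediately, while the two argument fragments and the closing chunk are held together so the arguments can be merged into one complete JSON string.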
@tzolov
Contributor

tzolov commented Jun 4, 2026

Rebased, fixed tests, squashed, and merged at 12f5cbe.
Thanks @ericbottard!

@tzolov tzolov closed this Jun 4, 2026
tzolov added a commit that referenced this pull request Jun 4, 2026
Follows-up #6288

Signed-off-by: Christian Tzolov <christian.tzolov@broadcom.com>

Development

Successfully merging this pull request may close these issues.

OpenAiChatModel#stream buffers the entire response before emitting chunks

3 participants