Fix(components): conversational retrieval qa streaming regression#6089

Open
estebanjosse wants to merge 9 commits into FlowiseAI:main from estebanjosse:fix/conversational-retrieval-streaming-regression

Conversation

@estebanjosse
Contributor

Summary

This PR fixes a streaming regression in the Conversational Retrieval QA Chain introduced in 3.1.x, where responses were no longer streamed token-by-token and were instead returned only after completion.

In addition to restoring streaming, this PR refactors the streaming implementation to make it more maintainable, consistent with other chains, and significantly easier to test.


Root cause

The previous implementation relied on a manual JSON patch-based streaming mechanism using streamLog() and applyPatch to reconstruct streamed output.

This approach:

  • does not reliably propagate token-level streaming events
  • introduces complex state management
  • makes the streaming behavior difficult to reason about and test
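
To illustrate the fragility, here is a minimal sketch (not the actual Flowise code) of what consuming a `streamLog()`-style JSON patch stream looks like: every chunk is a patch operation whose path the consumer must recognise and apply against manually tracked state.

```typescript
// Sketch of JSON patch-based stream reconstruction. The op/path shapes
// are illustrative assumptions modelled on JSON Patch (RFC 6902).
type PatchOp = { op: "add" | "replace"; path: string; value: string };

// State that must be tracked manually across every patch chunk.
let finalOutput = "";

function applyStreamPatch(op: PatchOp): void {
  // Token appends arrive as "add" ops against a streamed-output path;
  // any other path must be recognised and routed separately.
  if (op.path.startsWith("/streamed_output")) {
    if (op.op === "add") finalOutput += op.value;
    else finalOutput = op.value;
  }
}

// Simulated patch stream: each op must be interpreted by the consumer.
const patches: PatchOp[] = [
  { op: "add", path: "/streamed_output/-", value: "Hello" },
  { op: "add", path: "/streamed_output/-", value: " world" },
];
patches.forEach(applyStreamPatch);

console.log(finalOutput); // "Hello world"
```

If a patch for an unexpected path arrives, or ops are reordered, the reconstructed output silently diverges, which is exactly the kind of state management this PR removes.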

Fix

This PR replaces the JSON patch-based streaming logic with a callback handler-based approach using CustomChainHandler, aligning the implementation with ConversationChain.

Key changes:

  • Remove streamLog() / JSON patch parsing (applyPatch)
  • Introduce callback-based token streaming via CustomChainHandler
  • Ensure proper propagation of streaming events to the frontend

This results in:

  • restored token-by-token streaming
  • simpler and more maintainable code
  • consistent streaming behavior across chains
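
The callback-handler pattern can be sketched as follows; the class and streamer shapes below are illustrative assumptions in the spirit of Flowise's `CustomChainHandler`, not the actual implementation.

```typescript
// A sink for streamed tokens (e.g. an SSE writer to the frontend).
interface TokenStreamer {
  streamToken(token: string): void;
}

// Minimal callback handler: the LLM invokes handleLLMNewToken for every
// generated token, so output reaches the client incrementally instead
// of only after the chain completes.
class SketchChainHandler {
  constructor(private streamer: TokenStreamer) {}

  handleLLMNewToken(token: string): void {
    this.streamer.streamToken(token);
  }
}

// Usage: collect streamed tokens to show token-by-token delivery.
const received: string[] = [];
const handler = new SketchChainHandler({
  streamToken: (t) => received.push(t),
});
["The", " answer", " is", " 42"].forEach((t) => handler.handleLLMNewToken(t));
console.log(received.join("")); // "The answer is 42"
```

The key difference from the patch-based approach is that there is no state to reconstruct: each token is forwarded the moment the model emits it.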

Refactoring

  • Simplified streaming logic by removing manual patch/state handling
  • Reworked chain composition using RunnableSequence and RunnableMap for better clarity and modularity
  • Cleaned up imports and removed unused code
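
The composition pattern mentioned above can be sketched with hypothetical stand-ins for LangChain's `RunnableSequence` and `RunnableMap` (these are not the library's actual signatures, just the shape of the idea):

```typescript
type Runnable<I, O> = (input: I) => Promise<O>;

// Run several runnables in parallel over the same input, collecting
// their outputs under named keys — the RunnableMap pattern.
function runnableMap<I>(
  steps: Record<string, Runnable<I, unknown>>
): Runnable<I, Record<string, unknown>> {
  return async (input: I) => {
    const entries = await Promise.all(
      Object.entries(steps).map(async ([k, fn]) => [k, await fn(input)] as const)
    );
    return Object.fromEntries(entries);
  };
}

// Pipe one runnable into the next — the RunnableSequence pattern.
function runnableSequence<A, B, C>(
  first: Runnable<A, B>,
  second: Runnable<B, C>
): Runnable<A, C> {
  return async (input: A) => second(await first(input));
}

// Example: fetch context and pass the question through in parallel,
// then format a prompt from both values.
const chain = runnableSequence(
  runnableMap({
    context: async (q: string) => `docs for: ${q}`,
    question: async (q: string) => q,
  }),
  async (vals: Record<string, unknown>) =>
    `Context: ${vals.context}\nQuestion: ${vals.question}`
);

chain("what is streaming?").then((prompt) => console.log(prompt));
```

Composing the chain this way keeps each step small and independently testable, which is what makes the new structure clearer than the previous monolithic streaming logic.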

Tests

  • ✅ Added a dedicated test harness (ConversationalRetrievalQAChain.test.ts)
  • ✅ Uses test doubles for:
    • streaming
    • memory
    • retriever
  • ✅ Verified streaming behavior with red → green test cycle
  • ✅ Covers:
    • token streaming
    • source document emission
    • correct output structure
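
A streaming test double in this style might look like the sketch below; the names are illustrative, not the actual harness in `ConversationalRetrievalQAChain.test.ts`.

```typescript
// Fake streamer that records everything sent to it, so tests can assert
// on observable streaming behavior instead of internal chain state.
class FakeStreamer {
  tokens: string[] = [];
  sourceDocuments: unknown[] = [];

  streamToken(token: string): void {
    this.tokens.push(token);
  }

  streamSourceDocuments(docs: unknown[]): void {
    this.sourceDocuments.push(...docs);
  }
}

// A test drives the chain with the fake in place of the real transport,
// then inspects what was streamed.
const streamer = new FakeStreamer();
streamer.streamToken("Hello");
streamer.streamToken(" world");
streamer.streamSourceDocuments([{ pageContent: "doc1" }]);

console.log(streamer.tokens.length, streamer.sourceDocuments.length); // 2 1
```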

Behavioral improvements

  • Ensures source documents are streamed only when appropriate
  • Prevents leakage of condensed question text in streamed tokens
  • Introduces a clear output contract via ConversationalRetrievalQAResult
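
One possible shape for such an output contract is sketched below; the exact fields of `ConversationalRetrievalQAResult` in this PR may differ.

```typescript
// Assumed contract: a final answer plus optional source documents that
// are only attached when the node is configured to return them.
interface ConversationalRetrievalQAResult {
  text: string;
  sourceDocuments?: unknown[];
}

function buildResult(
  text: string,
  docs?: unknown[]
): ConversationalRetrievalQAResult {
  // Omit the field entirely when there is nothing to attach, so
  // consumers can rely on its presence meaning "sources were returned".
  return docs && docs.length > 0 ? { text, sourceDocuments: docs } : { text };
}

const withDocs = buildResult("answer", [{ pageContent: "src" }]);
const withoutDocs = buildResult("answer");
console.log("sourceDocuments" in withDocs, "sourceDocuments" in withoutDocs); // true false
```

An explicit result type like this is what lets the tests assert on "correct output structure" rather than on incidental properties of the streaming path.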

Impact

  • Restores expected streaming behavior for Conversational Retrieval QA Chain
  • No impact on other chains or flows
  • Improves maintainability and testability of the node

Notes

This change removes reliance on streamLog() for streaming and aligns the implementation with the callback-based pattern already used in other parts of the codebase.

@estebanjosse
Contributor Author

Closes #6070


@gemini-code-assist (bot) left a comment


Code Review

This pull request refactors the ConversationalRetrievalQAChain to use answerChain.invoke instead of streamLog, implementing a CustomChainHandler for streaming and restructuring the internal chain to better handle source documents. A new test suite is also introduced to verify streaming and document retrieval. The reviewer identified a redundant call to serializeHistory when checking for chat history, suggesting a more efficient check using the history array's length.
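
The reviewer's suggestion can be sketched as follows; the helper names and message shape are assumptions for illustration, not the PR's actual code.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

// Hypothetical serializer, as might be used to build the prompt history.
function serializeHistory(history: ChatMessage[]): string {
  return history.map((m) => `${m.role}: ${m.content}`).join("\n");
}

// Before: serializes the entire array just to test for emptiness.
function hasChatHistoryOld(history: ChatMessage[]): boolean {
  return serializeHistory(history) !== "";
}

// After: a constant-time length check with the same semantics.
function hasChatHistory(history: ChatMessage[]): boolean {
  return history.length > 0;
}

console.log(
  hasChatHistoryOld([]),
  hasChatHistory([{ role: "user", content: "hi" }])
); // false true
```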

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>