Fix: restore token-by-token streaming for chains using createTextOnlyOutputParser#6086
HenryHengZJ wants to merge 1 commit into main from
Conversation
Code Review
This pull request refactors the createTextOnlyOutputParser utility to improve streaming support by replacing the RunnableLambda implementation with a dedicated TextOnlyOutputParser class. The logic for extracting text from message chunks has been moved to a helper function. A review comment suggests enhancing type safety by replacing the any type with a type guard during the filtering of content blocks.
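To illustrate the review suggestion, here is a minimal sketch of replacing `any` with a type guard when filtering content blocks. The `ContentBlock`/`TextBlock` shapes and the `isTextBlock` name are assumptions for illustration, not the actual Flowise types:

```typescript
// Hypothetical block shapes -- assumed for this sketch, not Flowise's real types.
type ContentBlock = { type: string; [key: string]: unknown };
type TextBlock = ContentBlock & { type: "text"; text: string };

// Type guard: narrows a ContentBlock to TextBlock without casting to `any`.
function isTextBlock(block: ContentBlock): block is TextBlock {
  return block.type === "text" && typeof (block as { text?: unknown }).text === "string";
}

const blocks: ContentBlock[] = [
  { type: "text", text: "hello" },
  { type: "image_url", image_url: "https://example.com/img.png" },
];

// Because isTextBlock is a type predicate, the filtered array is typed
// TextBlock[], so `.text` is accessible without `any`.
const textBlocks = blocks.filter(isTextBlock);
console.log(textBlocks.map((b) => b.text).join(" "));
```

The benefit over `as any` is that the compiler verifies every downstream access to `.text`, so a malformed block is rejected at the guard rather than surfacing as a runtime error.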
This is a fix coming from here.
I tested this PR locally and can confirm it resolves #6070 on my side. Streaming is now working again as expected in the Conversational Retrieval QA Chain (token-by-token output in the UI). For context, I initially investigated this issue and opened an alternative fix in #6089, but I only saw this PR afterwards. This approach also resolves the issue correctly 👍 I also added a test harness covering this behavior in my PR — I'd be happy to contribute/adapt those tests to this PR if useful.
Problem: `createTextOnlyOutputParser()` used a `RunnableLambda`, which accumulates all LLM tokens before yielding, breaking progressive streaming in any chain that relies on `streamLog()` (e.g. Conversational Retrieval QA Chain).

Fix: replace it with a `Runnable` subclass (`TextOnlyOutputParser`) that implements `_transform()` to yield each token as it arrives.
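The streaming idea behind the fix can be sketched as an async-generator transform that emits text for each incoming chunk instead of buffering the whole reply first. This is a minimal illustration only: the `MessageChunk` shape, `extractText`, and `transformToText` names are assumptions, not the actual Flowise/LangChain `_transform()` code:

```typescript
// Hypothetical chunk shape: content is either a plain string or an array of
// typed content blocks (assumed for this sketch).
interface MessageChunk {
  content: string | Array<{ type: string; text?: string }>;
}

// Helper mirroring the PR's idea of a dedicated text-extraction function.
function extractText(chunk: MessageChunk): string {
  if (typeof chunk.content === "string") return chunk.content;
  return chunk.content
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text)
    .join("");
}

// A _transform()-style async generator: each chunk is converted and yielded
// immediately, so downstream consumers (e.g. streamLog()) see progressive
// token-by-token output rather than one accumulated string at the end.
async function* transformToText(
  chunks: AsyncIterable<MessageChunk>
): AsyncGenerator<string> {
  for await (const chunk of chunks) {
    const text = extractText(chunk);
    if (text) yield text; // emit now -- no buffering of the full reply
  }
}
```

The contrast with the old behavior is that a `RunnableLambda` receives the fully-assembled input before its function runs, whereas a generator-based transform can interleave output with input, which is what restores streaming in the UI.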