How to have chunked answers from a long running tool call (like a streaming callback)? #8998
I have a long-running pipeline that can produce partial answers during its run (it includes a classical chat generator). It is invoked as a tool, since there are other tools and other pipelines wrapped as tools that the LLM needs to select from. This works, but answers coming from the pipeline called by the tool are not streamed, because the chat generator has no access to the streaming callback. That makes sense: at the top level there is no LLM, only a tool invoker, which accepts messages but no callback. How can I make this work?
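Until streaming through the tool invoker is supported natively, one workaround is to capture the callback in a closure when the pipeline is wrapped as a tool, so the inner generator can stream upward without the invoker ever passing the callback through. The sketch below is generic Python and does not use Haystack's actual API; `chat_generator`, `make_pipeline_tool`, and `pipeline_tool` are hypothetical names illustrating the pattern.

```python
from typing import Callable, List

def chat_generator(prompt: str, streaming_callback: Callable[[str], None]) -> str:
    # Stand-in for an LLM chat generator that emits chunks as it runs.
    chunks = ["Partial ", "answer ", "from ", "the ", "inner ", "pipeline."]
    for chunk in chunks:
        streaming_callback(chunk)  # stream each chunk to the top level
    return "".join(chunks)

def make_pipeline_tool(streaming_callback: Callable[[str], None]) -> Callable[[str], str]:
    # Wrap the long-running pipeline as a tool. The callback is captured
    # in the closure at construction time, so the tool invoker only ever
    # sees a plain (query) -> answer function.
    def pipeline_tool(query: str) -> str:
        return chat_generator(query, streaming_callback)
    return pipeline_tool

received: List[str] = []
tool = make_pipeline_tool(received.append)  # top-level streaming callback
result = tool("What is streaming?")
```

Here `received` accumulates chunks as they arrive, even though the tool's caller never handled a callback; the same idea applies if the inner pipeline forwards the captured callback to its chat generator component.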
Replies: 1 comment
Hello @nettnikl and thank you for your feedback. We have an open issue for streaming output from Tool calls in Agents and plan to start working on it this week: deepset-ai/haystack-experimental#219