How to have chunked answers from a long running tool call (like a streaming callback)? #8998
I have a long-running pipeline that can produce partial answers during its run (it includes a classical chat generator). It is invoked as a tool, since there are other tools and other pipelines wrapped as tools that the LLM needs to select from. This works, but answers coming from the pipeline called by the tool are not streamed, because the chat generator has no access to the streaming callback. That makes sense: at the top level there is no LLM, only a tool invoker, which accepts messages but no callback. How can I make this work?
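Until streaming through the tool invoker is supported natively, one workaround is to capture the callback in a closure when the pipeline is wrapped as a tool, so the inner generator can stream upward without the invoker ever passing the callback through. The sketch below is generic Python and does not use Haystack's actual API; `chat_generator`, `make_pipeline_tool`, and `pipeline_tool` are hypothetical names illustrating the pattern.

```python
from typing import Callable, List

def chat_generator(prompt: str, streaming_callback: Callable[[str], None]) -> str:
    # Stand-in for an LLM chat generator that emits chunks as it runs.
    chunks = ["Partial ", "answer ", "from ", "the ", "inner ", "pipeline."]
    for chunk in chunks:
        streaming_callback(chunk)  # stream each chunk to the top level
    return "".join(chunks)

def make_pipeline_tool(streaming_callback: Callable[[str], None]) -> Callable[[str], str]:
    # Wrap the long-running pipeline as a tool. The callback is captured
    # in the closure at construction time, so the tool invoker only ever
    # sees a plain (query) -> answer function.
    def pipeline_tool(query: str) -> str:
        return chat_generator(query, streaming_callback)
    return pipeline_tool

received: List[str] = []
tool = make_pipeline_tool(received.append)  # top-level streaming callback
result = tool("What is streaming?")
```

Here `received` accumulates chunks as they arrive, even though the tool's caller never handled a callback; the same idea applies if the inner pipeline forwards the captured callback to its chat generator component.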
Replies: 1 comment
Hello @nettnikl and thank you for your feedback. We have an open issue for streaming output from Tool calls in Agents and plan to start working on it this week: deepset-ai/haystack-experimental#219