This is a follow-up on issue #46. Is it possible to stream responses from the LLM in Q&A? For example, LiteLLM allows streaming responses by adding `stream=True`. Thanks!
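For reference, here is the kind of streaming I mean, as a minimal LiteLLM sketch (the model name and prompt are placeholders):

```python
import litellm

# With stream=True, litellm.completion returns an iterator of chunks
# instead of a single complete response.
response = litellm.completion(
    model="gpt-3.5-turbo",  # placeholder model
    messages=[{"role": "user", "content": "What is streaming?"}],
    stream=True,
)

for chunk in response:
    # Each chunk carries an incremental piece of the answer.
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```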
Hi there,
Thanks for your question. As discussed internally:
- **Current support:** Our `LiteLLMChat` wrapper does not support using `stream=True` to split a single LLM output into multiple messages over time. The current wrapper returns the full response once it has been generated (see the first sketch below).
- **Technical considerations:** Implementing streaming would require a significant redesign, specifically around how UDFs return results (for instance, allowing them to yield partial outputs). This isn't supported by the current engine.
- **Kafka and output connectors:** You can stream complete question-and-answer pairs to systems like Kafka by adjusting how the webserver handles responses (see the second sketch below), but splitting one answer into multiple parts is not currently implemented.
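To make the first point concrete, here is a minimal sketch of the non-streaming behavior, assuming a Pathway-style `@pw.udf` wrapper (the decorator usage and model name are illustrative assumptions, not the wrapper's actual internals):

```python
import litellm
import pathway as pw

# A UDF must return one complete value per input row, so the whole
# answer is materialized before the engine ever sees it; there is no
# way to yield partial chunks from inside the function.
@pw.udf
def answer(prompt: str) -> str:
    response = litellm.completion(
        model="gpt-3.5-turbo",  # placeholder model
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```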
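And a sketch of the Kafka option, assuming Pathway's `pw.io.kafka.write` connector (the table, topic name, and broker address are placeholders):

```python
import pathway as pw

# Hypothetical table of finished Q&A pairs (placeholder data).
answers = pw.debug.table_from_markdown(
    """
    question | answer
    Hi       | Hello!
    """
)

rdkafka_settings = {"bootstrap.servers": "localhost:9092"}  # placeholder broker

# Each complete pair is written to Kafka as one JSON message;
# there is no mechanism here to split one answer into chunks.
pw.io.kafka.write(
    answers,
    rdkafka_settings=rdkafka_settings,
    topic_name="qa-pairs",
    format="json",
)

pw.run()
```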