
[Feature Request] Streaming Conversational Responses #1727

Open
Bajocode opened this issue May 13, 2024 · 2 comments
@Bajocode

Description

OpenAI and the majority of LLM APIs support response streaming.

To get responses sooner, you can 'stream' the completion as it's being generated. This allows you to start printing or processing the beginning of the completion before the full completion is finished.
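
For reference, this is roughly what consuming a streamed completion looks like with the OpenAI Python SDK (a minimal sketch; the model name is illustrative):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# stream=True makes the API return an iterator of chunks instead of a
# single final response, so output can be rendered token by token.
stream = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Explain streaming in one line."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```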

Expected Behavior

Allow for conversation API response streaming (this is a built-in feature of the majority of LLM APIs).
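
As a sketch of how this could surface on the client side, here's what consuming the response might look like if the conversational search endpoint emitted server-sent events. The `stream` parameter and the SSE framing are purely hypothetical assumptions for illustration; only `q`, `query_by`, `conversation`, and `conversation_model_id` mirror the existing (non-streaming) API:

```python
import json
import requests

# Hypothetical sketch: Typesense 26.0 does not support this today.
# The stream=true parameter and the SSE "data:" framing are assumptions
# about what a streamed conversation response could look like.
resp = requests.get(
    "http://localhost:8108/collections/docs/documents/search",
    params={
        "q": "How do I enable analytics?",
        "query_by": "embedding",
        "conversation": "true",
        "conversation_model_id": "conv-model-1",  # placeholder id
        "stream": "true",  # hypothetical flag
    },
    headers={"X-TYPESENSE-API-KEY": "xyz"},
    stream=True,  # let requests yield the body incrementally
)

for line in resp.iter_lines():
    if line.startswith(b"data: "):
        payload = json.loads(line[len(b"data: "):])
        print(payload.get("message", ""), end="", flush=True)
```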

Actual Behavior

When you request a conversation response, the entire completion is generated before being sent back in a single response. If you're generating long completions, waiting for the response can take many seconds.

Metadata

Typesense Version: 26.0

@Bajocode Bajocode changed the title Streaming Conversational Responses [Feature Request] Streaming Conversational Responses May 18, 2024
@tommmyy

tommmyy commented May 30, 2024

Voting on the issue: For most use cases, showing just a loader while the response is being generated is not sufficient. It makes the conversational bot almost unsuitable for production use.

@lmatejka

Also voting for this issue. I presented our demo based on Typesense RAG last week and it was a little annoying for customers to wait for a long response. If they see a stream of tokens (which they're used to by now), it will look much better.
