Support streaming in dev-ui chat #590
Conversation
I'm at a conference so I'll have time to test it later, but I'll just throw in that, in case of a timeout or any other error from the LLM, are we sure that we don't get the history in the Dev UI desynchronized from the one stored in the actual chat memory? That's why the original synchronous variant of the …
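A minimal sketch of the desynchronization concern, in plain Java. `ChatMemory`, `StreamingChat`, and `simulateTimeout` are made-up stand-ins for illustration, not the actual quarkus-langchain4j types: the user message is typically committed to memory before any token arrives, so a mid-stream failure leaves memory holding a message the UI never got an answer for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch only -- ChatMemory and StreamingChat here are
// made-up stand-ins, not the real quarkus-langchain4j types.
class ChatMemory {
    final List<String> messages = new ArrayList<>();
}

class StreamingChat {
    // The user message is appended to memory before any token arrives;
    // if the stream then fails, memory keeps a message that the dev UI
    // never received a reply for.
    static void send(ChatMemory memory, String userMessage,
                     Consumer<String> onToken, boolean simulateTimeout) {
        memory.messages.add("USER: " + userMessage);
        StringBuilder reply = new StringBuilder();
        for (String token : new String[] {"Hello", " ", "world"}) {
            if (simulateTimeout) {
                // No rollback here: memory and UI are now out of sync.
                return;
            }
            onToken.accept(token);
            reply.append(token);
        }
        memory.messages.add("AI: " + reply);
    }
}
```

On a timeout, the memory above ends with a dangling `USER:` entry; either the service has to evict it or the UI has to replay it, which the synchronous variant avoided by committing both messages together.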
The memory gets modified when …
I plan to do a …
I'm trying to cause an error by setting a short timeout (I used ollama and set …), but the UI doesn't receive/show anything and the progress bar keeps going forever. Ideally the jsonrpc service should send an error and the page should discard the previous user message.
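One way the service could signal the failure is with a standard JSON-RPC 2.0 error object; this is only a sketch of that shape (the actual dev-ui wire format and error codes are assumptions here, not taken from the project):

```java
import java.util.Map;

// Sketch of a JSON-RPC 2.0 error response the streaming endpoint could
// emit instead of hanging. The field names follow the JSON-RPC 2.0 spec;
// -32000 is in the spec's reserved server-error range. This is NOT the
// actual dev-ui wire format, just an illustration.
class JsonRpcErrors {
    static Map<String, Object> errorFrame(Object id, String message) {
        return Map.of(
            "jsonrpc", "2.0",
            "id", id,
            "error", Map.of("code", -32000, "message", message));
    }
}
```

On receiving such a frame, the page could stop the progress bar and drop the pending user message, keeping the UI consistent with the chat memory.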
I updated the PR and added error handling in all possible ways I could imagine:
Even though the error is reaching the UI, as I can tell from the console, I somehow can't capture it in my JavaScript code.
@jmartisk: I checked what you do in …
I have not looked at this yet, but in the meantime, can you see if anything prints in: …
I'm now realizing this will also cause problems when Tools are used, not just RAG, no? Because the streaming response from the server will not contain tool-related messages. :(
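The tools concern above can be sketched in a few lines of plain Java (the message strings and `chatWithTool` are invented for illustration): with tool use, the model first emits a tool-execution request, the server runs the tool and feeds the result back, and only the final answer is streamed token by token. The intermediate messages land in memory but never in the token stream, so a UI history rebuilt from tokens alone diverges.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative only: shows why tool use breaks naive streaming. The
// intermediate tool messages go into memory but are never streamed.
class ToolAwareStreaming {
    static List<String> chatWithTool(Consumer<String> onToken) {
        List<String> memory = new ArrayList<>();
        memory.add("USER: what's 2+2?");
        // 1. Model replies with a tool-execution request (not streamed).
        memory.add("TOOL_REQUEST: calculator(2+2)");
        // 2. Server executes the tool and stores the result (not streamed).
        memory.add("TOOL_RESULT: 4");
        // 3. Only the final answer is streamed token by token.
        for (String token : new String[] {"2+2", " is ", "4"}) {
            onToken.accept(token);
        }
        memory.add("AI: 2+2 is 4");
        return memory;
    }
}
```

The memory ends up with four messages while the client saw only the three answer tokens, which is exactly the mismatch described above.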
Yeah, that sounds right
@jmartisk: the issue you had with errors being swallowed has been fixed on the Quarkus side of the house by @phillip-kruger and should be available in 3.11.1. Nothing additional is needed here. Now, regarding your comment about tools, the suggested solution is not clear to me. Is it something that is required for this PR, or something we could add later on?
It would be nice to have RAG and tools working properly with streaming, but if you don't want to do it, I can have a look. I have some heavier refactoring planned anyway, to be able to support images.
The pull request introduces a new option (streaming chat). The option is enabled when the model supports streaming and RAG is disabled (streaming chat can't work with RAG atm). When the user enables streaming chat, responses from the model are streamed to the dev UI chat screen.
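The enablement rule in the description reduces to a simple predicate; the class and method names below are illustrative, not taken from the actual dev-ui code:

```java
// Illustrative predicate for when the dev-ui chat offers streaming:
// the model must support streaming AND RAG must be disabled.
class DevUiChatOptions {
    static boolean streamingAvailable(boolean modelSupportsStreaming,
                                      boolean ragEnabled) {
        return modelSupportsStreaming && !ragEnabled;
    }
}
```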