Streaming API #37
There are some people already working on APIs, but it is on my list. I just need to do a little more research to figure out what the best, minimal stateless API would look like.
@bkutasi I have a (very) basic "stateless" API wrapper for exllama that might point you in the right direction: fast_api.py is just a FastAPI wrapper around the model and generate_simple functions. It takes the -d argument for the model directory, loads the model, and starts listening on port 7862 for POST requests. fastapi_request.py is an example script showing how to call the API from Python. This is just a quick implementation; I will be revisiting this code to work in some of the new improvements Turboderp made... after I get in a bit of Diablo 4 this week ^^;
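For readers who just want the general shape of such a wrapper, here is a minimal sketch of a stateless POST endpoint using only the Python standard library (the real fast_api.py uses FastAPI; the generate stub, JSON field names, and port handling here are illustrative assumptions, not the actual exllama code):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

def generate(prompt: str, max_new_tokens: int = 16) -> str:
    """Stand-in for the model's generate_simple call (assumption)."""
    return prompt + " ... [generated text]"

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and parse the JSON request body.
        length = int(self.headers.get("Content-Length", 0))
        req = json.loads(self.rfile.read(length))
        text = generate(req.get("prompt", ""), req.get("max_new_tokens", 16))
        body = json.dumps({"text": text}).encode()
        # Stateless: everything needed is in the request; nothing is saved.
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging for this sketch.
        pass

def serve(port: int = 7862) -> HTTPServer:
    """Bind on all interfaces and serve in a background thread."""
    server = HTTPServer(("0.0.0.0", port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Binding to 0.0.0.0 rather than 127.0.0.1 is what allows requests from other machines, which is relevant to the access issue discussed below in this thread.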
Your implementation looks great; I will try it out right away. I would love to see it merged into the main branch down the line (in some form).
@disarmyouwitha your FastAPI wrapper is working great, but the web interface is not sending generation requests if it's not accessed through localhost, even when listening on 0.0.0.0. Other requests are probably not being sent either, but the page itself loads.
@bkutasi Oh hm, I never noticed you had to enable issues. I have opened up the issues tab in my repo, so if you continue to have problems we can follow up there =] Are you accessing the GUI by opening the .html file directly, or by going to http://host:7862/chat? If you access it through the HTML file, it will always assume localhost.
If you access it through /chat, it should try to determine your host automatically (but maybe I was trying to be too clever and broke something). I have the FastAPI server running on a headless server and access the page that way. It may be worth mentioning that you will probably need to forward port 7862 to access it from another machine.
First of all, this is a terrific project.
I've been trying to integrate it with other apps, but the API is a little different from other implementations such as KoboldAI and its API, or text-generation-webui and its API examples.
With my limited knowledge I could get it to work (while the webapp is running) using the following script, albeit it's not the best:
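The script itself did not survive the copy of this thread. Purely as an illustration, a minimal client along those lines using only the standard library might look like this (the endpoint path, port, and JSON field names are assumptions, not the project's actual API):

```python
import json
import urllib.request

def build_request(host: str, port: int, prompt: str) -> urllib.request.Request:
    """Build a POST request for a hypothetical generation endpoint."""
    payload = json.dumps({"prompt": prompt, "max_new_tokens": 128}).encode()
    return urllib.request.Request(
        f"http://{host}:{port}/api/generate",  # endpoint name is an assumption
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate_remote(host: str, port: int, prompt: str) -> str:
    """Send the request and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(host, port, prompt)) as resp:
        return json.loads(resp.read())["text"]
```

Splitting request construction from sending makes the client easy to test without a running server, and easy to point at a remote host once port 7862 is forwarded.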
What do you think about the possibility of adding a streaming API endpoint at /api/stream that is not tied to the backend's user handling and message saving, and is "stateless" so it follows REST principles? Since exllama is one of the most performant backends, this would surely boost its popularity.
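Such a stateless streaming endpoint would typically emit tokens incrementally, for example as server-sent events. A small sketch of the token-by-token framing, with a stubbed generator standing in for the model's decoding loop (the token source and event format are assumptions):

```python
import json
from typing import Iterator

def fake_token_stream(prompt: str) -> Iterator[str]:
    """Stub for the model's incremental decoding loop (assumption)."""
    for word in ["Hello", ",", " world", "!"]:
        yield word

def sse_events(prompt: str) -> Iterator[str]:
    """Frame each token as a server-sent event, then signal completion.

    A stateless endpoint would write these chunks to the response as they
    are produced, keeping no per-user state between requests.
    """
    for token in fake_token_stream(prompt):
        yield f"data: {json.dumps({'token': token})}\n\n"
    yield "data: [DONE]\n\n"
```

Because each request carries its full prompt and the server retains nothing afterward, the endpoint stays stateless in the REST sense while still delivering tokens as they are generated.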