Replies: 1 comment
Here is a quick PoC / prototype I was playing with: Kami@62dc303. It appears to be working: connections are correctly re-used, which results in better performance. As mentioned above, this is a very quick, hacky prototype; getting it across the finish line would require a lot of work:
Feature Request / Improvement
Description
I deployed Open WebUI behind a reverse proxy (nginx). The same goes for Ollama (it's also deployed behind an nginx reverse proxy, which adds authentication and TLS).
I noticed that connections from Open WebUI to the Ollama HTTP API endpoint don't seem to be re-used (i.e. persistent TCP connections / keep-alive).
I originally thought it might be my proxy configuration or Ollama, but I tested it with my other proxied services and with Ollama directly, and it works fine:
See the `Connection: keep-alive` header returned by the reverse proxy, which indicates that long-lived connections are supported and working.

Here are logs from the Open WebUI reverse proxy where you can see connections are not re-used (different connection ID: a new connection is established for each outgoing request to Ollama):
After that, I started digging into the code and I noticed this pattern:
open-webui/backend/open_webui/routers/ollama.py
Line 105 in b72150c
open-webui/backend/open_webui/routers/ollama.py
Line 190 in b72150c
In short, the code always retrieves a new `aiohttp.ClientSession`. So even though `aiohttp` enables and supports keep-alive by default, it won't work, because the code always obtains a new, isolated session object.

I believe that, to make it work correctly, we would need to re-use the same `ClientSession` for all the outgoing requests. So something like:

This is similar to `requests.Session`: if you want to re-use the underlying TCP connections (to avoid the overhead of the TCP + TLS handshake, etc.), you need to use the same `requests.Session` object for all the outgoing requests.

Proposed Improvement
I propose to refactor the code to re-use the same (global?) `aiohttp.ClientSession` instance for all the outgoing requests to Ollama. This should result in lower latency and better overall performance and end-user experience.

Additionally, it may be good to expose some underlying `aiohttp` connector keep-alive related options via environment variables (e.g. `keepalive_timeout`, `limit`, `limit_per_host`).

P.S. I imagine a similar problem may exist with other external model connectors (e.g. OpenAPI), but I didn't dig in.
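To make the proposal above a bit more concrete, here is a minimal sketch of a lazily created, process-wide `aiohttp.ClientSession` whose connector is configured from environment variables. It is not the code from the prototype commit. The helper names (`connector_options`, `get_session`, `close_session`, `fetch_tags`) and the environment variable names are hypothetical, not existing Open WebUI settings. The fallback values are meant to mirror `aiohttp`'s documented connector defaults.

```python
import asyncio
import os
from typing import Mapping, Optional

import aiohttp


def connector_options(env: Mapping[str, str] = os.environ) -> dict:
    """Build TCPConnector keyword arguments from environment variables.

    The variable names are hypothetical; the fallbacks are intended to match
    aiohttp's defaults (15 s keep-alive, 100 connections, no per-host limit).
    """
    return {
        "keepalive_timeout": float(env.get("OLLAMA_KEEPALIVE_TIMEOUT", "15")),
        "limit": int(env.get("OLLAMA_CONN_LIMIT", "100")),
        "limit_per_host": int(env.get("OLLAMA_CONN_LIMIT_PER_HOST", "0")),
    }


_session: Optional[aiohttp.ClientSession] = None


async def get_session() -> aiohttp.ClientSession:
    """Return the shared session, creating it on first use.

    Every caller receives the same object, so idle connections kept in the
    connector's pool can be re-used instead of opening a new TCP/TLS
    connection for each request.
    """
    global _session
    if _session is None or _session.closed:
        connector = aiohttp.TCPConnector(**connector_options())
        _session = aiohttp.ClientSession(connector=connector)
    return _session


async def close_session() -> None:
    """Close the shared session, e.g. from an application shutdown hook."""
    global _session
    if _session is not None and not _session.closed:
        await _session.close()
    _session = None


async def fetch_tags(base_url: str) -> dict:
    """Example caller: list models via Ollama's /api/tags endpoint."""
    session = await get_session()
    async with session.get(f"{base_url}/api/tags") as resp:
        resp.raise_for_status()
        return await resp.json()
```

A session is tied to the event loop it was created in, so in a real application it would probably be created and closed in the app's startup/shutdown handling rather than through a module-level global. Per-request settings such as auth headers or timeouts can still be passed to the individual `session.get(...)` / `session.post(...)` calls.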