multiple users on the same llama.cpp server: is it possible? #24401
Unanswered
ChrisDelapierre
asked this question in
Q&A
Replies: 1 comment
When configuring the model in […], you need to account for the context size being divided equally between the slots: […]. So if you need 2 parallel requests to a model with […]. If all slots are occupied, a new request waits in the queue and is processed once a slot becomes available.
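To make the arithmetic concrete, here is a rough Python sketch of the behaviour described above. It is not llama.cpp code. It assumes the total context is the value passed to the server's `--ctx-size` option and the slot count is the value passed to `--parallel`, which is a guess at the options meant here. The function names and the numbers (8192, 4096, 5 requests) are made up for illustration, and the semaphore only mimics "wait until a slot is free".

```python
import asyncio


def per_slot_context(total_ctx: int, n_slots: int) -> int:
    """Tokens of context each slot gets, assuming an equal split."""
    if n_slots < 1:
        raise ValueError("n_slots must be at least 1")
    return total_ctx // n_slots


def required_total_context(per_request_ctx: int, n_slots: int) -> int:
    """Total context to configure so every slot still gets per_request_ctx."""
    return per_request_ctx * n_slots


async def simulate_queue(n_slots: int, n_requests: int, work_seconds: float = 0.01):
    """Toy model of slot queueing: at most n_slots requests run at once."""
    slots = asyncio.Semaphore(n_slots)  # stand-in for the server's slots
    in_flight = 0
    peak = 0
    finished = []

    async def handle(request_id: int) -> None:
        nonlocal in_flight, peak
        async with slots:  # waits here while all slots are occupied
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(work_seconds)  # pretend to generate tokens
            in_flight -= 1
            finished.append(request_id)

    await asyncio.gather(*(handle(i) for i in range(n_requests)))
    return peak, finished


if __name__ == "__main__":
    print(per_slot_context(8192, 2))        # context available to each of 2 slots
    print(required_total_context(4096, 2))  # total needed for 2 slots of 4096
    peak, finished = asyncio.run(simulate_queue(n_slots=2, n_requests=5))
    print(peak, len(finished))
```

So with two users you would size the total context at roughly twice what a single conversation needs, and a third simultaneous request simply waits for a free slot instead of failing.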
I am using the llama.cpp server connected to Open WebUI, which allows multiple users. How does llama.cpp handle the situation where multiple users send queries to the server at the same time via Open WebUI? How does it handle the context of each user?