Hang in OpenAI package with many threads -- not thread safe? #3043
Unanswered
pseudotensor asked this question in Potential Issue
-
If there are any debugging suggestions, the program is still up at the moment in this bad state.
-
I'm not sure I'm entirely clear on your setup, but asynchronous Python is not intended to be used with threads. If you mix threads and asynchronous code, you're very likely to encounter deadlocks in a number of scenarios.
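To make the kind of stall described here concrete, the following is a minimal sketch and not taken from the reporter's code. The names `answer` and `blocking_wait_inside_loop` are made up for illustration. It assumes only the standard library: a coroutine schedules work on its own event loop and then blocks the loop's thread waiting for it, which cannot finish until the loop is free again. A timeout is used so the example does not hang.

```python
import asyncio
import concurrent.futures


async def answer() -> int:
    # Trivial coroutine standing in for real async work.
    return 42


async def blocking_wait_inside_loop():
    loop = asyncio.get_running_loop()
    # Schedule answer() on the loop that is currently running this coroutine.
    future = asyncio.run_coroutine_threadsafe(answer(), loop)
    try:
        # Blocking here freezes the loop's own thread, so answer() cannot
        # start. Without the timeout this wait would never return.
        future.result(timeout=0.2)
        status = "completed"
    except concurrent.futures.TimeoutError:
        status = "stalled"
    # Awaiting instead of blocking hands control back to the loop,
    # which can then run answer() and deliver its result.
    value = await asyncio.wrap_future(future)
    return status, value
```

Calling `asyncio.run(blocking_wait_inside_loop())` should report the blocking wait as stalled and then return 42 from the awaited version. Whether anything like this is happening in the gradio setup is not established by this thread; it only shows why blocking calls on a loop thread are risky.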
-
I'm using the gradio package -> OpenAI API package -> network to vLLM. However, httpx hangs frequently, which is why I'm here.
The only possible explanation to me is that httpx or httpcore are not thread safe.
The setup is that gradio uses asyncio and I launch threads for each chatbot, like at https://gpt.h2o.ai . Typically the problem is with the most-used chatbot. There can be many users, and each user's chatbot has its own thread. Things work for a week or so under heavy use without any issue, but then at some point connections start timing out for some reason.
The key point is that once this starts, not a single user or connection can reach that particular vLLM IP. Requests to it just keep timing out from then on.
So there are many, many blocks of this kind, like the one above.
Once this is hit, I wonder why it's stuck forever, since every use of OpenAI is fresh. That is, I create a fresh client every time, so nothing is re-used on my end and I don't understand why things stay stuck.
Once this is happening, I check the threads in the process (I have a faulthandler attached) and see numerous threads like the one below. This is with zero use of the process by any user at this point (it's been isolated).
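For anyone wanting to capture a similar thread dump, here is a minimal sketch using the standard-library `faulthandler` module. The helper `dump_all_threads` and the `stuck_worker` thread are hypothetical names, there only so the dump has a blocked thread to show; this is not the reporter's actual instrumentation.

```python
import faulthandler
import tempfile
import threading


def dump_all_threads() -> str:
    """Return the current stack of every thread in this process as text."""
    # faulthandler writes directly to a file descriptor, so use a real file.
    with tempfile.TemporaryFile(mode="w+") as f:
        faulthandler.dump_traceback(file=f, all_threads=True)
        f.seek(0)
        return f.read()


def stuck_worker(started: threading.Event, release: threading.Event) -> None:
    # Hypothetical stand-in for a thread blocked inside a request.
    started.set()
    release.wait()


def demo() -> str:
    started, release = threading.Event(), threading.Event()
    worker = threading.Thread(target=stuck_worker, args=(started, release))
    worker.start()
    started.wait()  # make sure the worker is running before dumping
    try:
        return dump_all_threads()
    finally:
        release.set()  # unblock the worker so the demo exits cleanly
        worker.join()
```

The text returned by `demo()` lists each thread with its frames, most recent call first, so threads parked on the same lock or wait call stand out. On Unix, `faulthandler.register(signal.SIGUSR1, all_threads=True)` can produce the same dump from a live process when that signal is sent to it.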
It's as if prior timeouts are leaving everything stuck on the same synchronization during `handle_request` in httpx. I don't understand what could cause connections to a single IP out of many IPs (other chatbots are totally fine) to be blocked like this.
The only possible explanation to me is that httpx or httpcore are not thread safe.