Multiple Async calls to the api fail catastrophically #1195
Spawning 5,000–25,000 requests concurrently would likely hit many of the rate-limiting caps, like requests/sec and tokens/sec. Are you ramping up requests, or just starting thousands all at once?
You can use `with_raw_response` to check the rate-limit headers:

```python
response = await openai_client.embeddings.with_raw_response.create(
    model="ada-text-002-etc",
    input="Ground control to Major Tom",
)

# Get rate limit information
print(response.headers.get("x-ratelimit-remaining-requests"))
print(response.headers.get("x-ratelimit-remaining-tokens"))

embeddings = response.parse()  # get back the Embeddings object
```

If you're trying to simulate a production environment at load, I'd recommend ramping up requests or using something like locust. That's what we're using to load test OpenAI endpoints and models.
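The ramp-up advice above can be sketched with an `asyncio.Semaphore` that caps how many requests are in flight at once instead of launching thousands simultaneously. This is a hypothetical sketch: `fetch_embedding` is a stand-in for whatever call you make against the client.

```python
import asyncio

async def fetch_embedding(sem: asyncio.Semaphore, text: str) -> str:
    async with sem:  # at most `limit` requests run concurrently
        await asyncio.sleep(0.01)  # placeholder for the real API call
        return f"embedding:{text}"

async def main(texts: list[str], limit: int = 10) -> list[str]:
    sem = asyncio.Semaphore(limit)
    # all tasks are created up front, but the semaphore throttles them
    return await asyncio.gather(*(fetch_embedding(sem, t) for t in texts))

results = asyncio.run(main([f"doc{i}" for i in range(50)]))
```

Tuning `limit` against the `x-ratelimit-remaining-*` headers lets you stay under the caps without serializing everything.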
Can you share a repro script? Assuming this is not, in fact, just rate limits? (you might run with
I suspect this is just rate limits acting in a normal way, so I'm closing.
Confirm this is an issue with the Python library and not an underlying OpenAI API
Describe the bug
In a resume-writing program with multiple levels of async calls, launching async processing at even a relatively small scale causes the API calls to fail catastrophically.
Attempted the OpenAIClient and httpx.AsyncClient solutions which were suggested here and elsewhere:
#769
When called synchronously, the code processes 50 resumes sequentially with no problem, with perhaps 3 or 4 'Timeout' failures in aggregate that are successfully retried using exponential backoff. The average completion time per document is 50 seconds with a standard deviation of perhaps 10 seconds.
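For reference, the exponential-backoff retry described above can be sketched like this. `TimeoutError` is a stand-in for whatever timeout exception the client raises in our setup; the function name and parameters are illustrative, not the library's API.

```python
import random
import time

def with_backoff(fn, max_retries: int = 5, base_delay: float = 1.0):
    """Call fn(), retrying on timeout with an exponentially growing delay."""
    for attempt in range(max_retries):
        try:
            return fn()
        except TimeoutError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            # delay doubles each attempt, plus jitter to avoid
            # all failed requests retrying at the same moment
            time.sleep(base_delay * 2 ** attempt + random.random() * base_delay)
```

In the synchronous runs this absorbs the handful of timeouts; under `asyncio.gather` the timeout volume overwhelms it.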
When the same 50 documents are run simultaneously using asyncio:
await asyncio.gather(*tasks)
Several hundred to several thousand timeout errors occur in aggregate, and most of the time the run fails catastrophically: None is returned by the OpenAI API, and that failure cascades throughout the system.
Average completion time rises to 240 seconds with a standard deviation of perhaps 30 seconds.
I've confirmed that unique clients are created for each document:
OpenAIClient object at 0x7f9a57762fb0
OpenAIClient object at 0x7f9a5764f430
OpenAIClient object at 0x7f9a57249870
...
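One variable worth isolating is whether a single shared client behaves differently from one client per document, since connection pooling normally favors reuse. A minimal sketch of the shared-client pattern, with a dummy `Client` class standing in for the real async client:

```python
import asyncio

class Client:
    """Dummy stand-in for an async API client (illustrative only)."""
    async def complete(self, doc: str) -> str:
        await asyncio.sleep(0.01)  # placeholder for the network call
        return f"processed:{doc}"

async def process_all(docs: list[str]) -> list[str]:
    client = Client()  # one shared client, reused across all tasks
    return await asyncio.gather(*(client.complete(d) for d in docs))

results = asyncio.run(process_all([f"resume{i}" for i in range(5)]))
```

If the shared-client variant shows the same timeout explosion, that points back at rate limits rather than client lifecycle.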
Running with a clean new environment updated today:
python==3.10.13
openai==1.12.0
httpx==0.27.0
#769 seems to indicate that the problem was resolved in openai 1.3.8, but the issue persists for us.
To Reproduce
Code snippets
OS
Amazon Linux
Python version
3.10.13
Library version
openai 1.12.0