openai models do not complete any further requests after throwing an APITimeoutError #428

Closed
HerrIvan opened this issue Dec 11, 2023 · 5 comments · Fixed by #434

@HerrIvan
Contributor

Describe the issue as clearly as possible:

Once an OpenAI model throws an APITimeoutError, retrying does not work.
For instance:

  • Catching the exception and simply trying again doesn't work; it results in another timeout 100% of the time.
  • Creating a new instance of the OpenAI model and trying again gives the same problem.
  • Closing the client attribute (an instance of openai.AsyncOpenAI) and then creating a new instance of the OpenAI model gives the same problem.

However, closing the client and restarting it works: the connection recovers and often continues normally. The same "recovery" happens if you simply re-run the script.

Btw, in order to "restart" the client, I had to make the change below:

--- a/outlines/models/openai.py
+++ b/outlines/models/openai.py
@@ -126,9 +126,14 @@ class OpenAI:
         else:
             self.config = OpenAIConfig(model=model_name)
 
-        self.client = openai.AsyncOpenAI(
-            api_key=api_key, max_retries=max_retries, timeout=timeout
+        self.create_client = functools.partial(
+            openai.AsyncOpenAI,
+            api_key=api_key,
+            max_retries=max_retries,
+            timeout=timeout,
         )
+
+        self.client = self.create_client()

And do this in my code (quite hacky):

  self.model.client.close()
  self.model.client = self.model.create_client()
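
A minimal sketch of this recovery path as a retry loop. `FlakyClient`, `Model`, and `complete_with_restart` are hypothetical stand-ins so the example runs without network access; with the real SDK the exception to catch would be `openai.APITimeoutError`, and since `close()` on `openai.AsyncOpenAI` is a coroutine, it needs to be awaited.

```python
import asyncio


class FlakyClient:
    """Hypothetical stand-in client: a 'stuck' client always times out."""

    def __init__(self, stuck=False):
        self.stuck = stuck

    async def complete(self, prompt):
        if self.stuck:
            raise TimeoutError("simulated timeout")
        return f"ok: {prompt}"

    async def close(self):
        # Placeholder for closing the underlying connection pool.
        pass


class Model:
    """Hypothetical model wrapper that keeps a client factory around."""

    def __init__(self):
        self.create_client = FlakyClient
        # Start with a client that is already in the bad state.
        self.client = FlakyClient(stuck=True)


async def complete_with_restart(model, prompt, max_attempts=3):
    for attempt in range(max_attempts):
        try:
            return await model.client.complete(prompt)
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise
            # Close the old client and swap in a fresh one before retrying.
            await model.client.close()
            model.client = model.create_client()


result = asyncio.run(complete_with_restart(Model(), "hello"))
```

The first attempt hits the simulated timeout; the second runs on the fresh client.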

Maybe you have a more principled understanding of what's going on. There seems to be some issue with open connections.
Then again, maybe requests to OpenAI simply time out too often.

In any case, the issue is quite recurrent and, imo, it may have appeared after the commit that adapted the code to the openai SDK 1.0.0 (b60bb7a).

Maybe a quick way of dealing with this would be a restart_client method on the OpenAI class.
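
A rough sketch of what such a method could look like, assuming the class keeps a `create_client` factory as in the diff above. `FakeClient` and the stripped-down `Model` are hypothetical stand-ins, not the actual outlines code; in the real class the factory would wrap `openai.AsyncOpenAI`.

```python
import asyncio
import functools


class FakeClient:
    """Hypothetical stand-in for openai.AsyncOpenAI (no network access)."""

    def __init__(self, timeout=None):
        self.timeout = timeout
        self.closed = False

    async def close(self):
        # The real async client also exposes close() as a coroutine.
        self.closed = True


class Model:
    """Hypothetical, stripped-down version of the OpenAI model wrapper."""

    def __init__(self, timeout=None):
        # Keep the factory so a fresh client can be built later
        # with the same settings.
        self.create_client = functools.partial(FakeClient, timeout=timeout)
        self.client = self.create_client()

    async def restart_client(self):
        # Close the old client, then swap in a new one.
        await self.client.close()
        self.client = self.create_client()


async def demo():
    model = Model(timeout=30)
    old = model.client
    await model.restart_client()
    return old, model.client


old_client, new_client = asyncio.run(demo())
```

Keeping the factory on the instance means the restarted client gets the same api_key, max_retries, and timeout as the original.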

Steps/code to reproduce the bug:

Hard to reproduce the bug without spending $$$.

Expected result:

Either fewer timeouts are triggered (I have no clue what causes them) or, in case of a timeout, it is possible to restart the client and continue.

Error message:

No response

Outlines/Python version information:

outlines 0.0.13
python 3.10.13

Context for the issue:

I am getting a timeout at least once every 20 or 30 requests. Without a fix, there is no way to complete any sizeable batch job.

@HerrIvan HerrIvan added the bug label Dec 11, 2023
@rlouf
Member

rlouf commented Dec 14, 2023

This seems to be a problem with the OpenAI API. Check here for status: openai/openai-python#769. In the meantime we could create a client for each request. It's inefficient, but random timeouts are more annoying.
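
A minimal sketch of the client-per-request idea, not the actual change in #434. `FakeClient` and `request_with_fresh_client` are hypothetical names used so the example runs offline; the real factory would construct `openai.AsyncOpenAI`.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for openai.AsyncOpenAI (no network access)."""

    instances = 0  # counts how many clients were created

    def __init__(self):
        FakeClient.instances += 1
        self.closed = False

    async def complete(self, prompt):
        # Placeholder for the real API call.
        return f"echo: {prompt}"

    async def close(self):
        self.closed = True


async def request_with_fresh_client(prompt, client_factory=FakeClient):
    # Build a new client for this request only, so no connection
    # state is shared between requests.
    client = client_factory()
    try:
        return await client.complete(prompt)
    finally:
        # Always release the connection, even if the call fails.
        await client.close()


async def demo():
    return [await request_with_fresh_client(p) for p in ("a", "b", "c")]


results = asyncio.run(demo())
```

The cost is one connection setup per request, which is the inefficiency mentioned above.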

@rlouf
Member

rlouf commented Dec 14, 2023

This should be (temporarily) fixed by #434. Could you please confirm?

@HerrIvan
Contributor Author

I am on it.

@HerrIvan
Contributor Author

Great! This is what I wrote in the PR:

Very big improvement in performance. I cannot say that it recovers after a timeout, because it doesn't time out anymore. Workflows that previously had a 100% chance of hitting a timeout now complete.

@HerrIvan HerrIvan reopened this Dec 14, 2023
@HerrIvan
Contributor Author

I guess it's better to wait for the PR to be merged before closing this ...
