
Frequent request timed out error #3005

Closed
KalakondaKrish opened this issue Apr 17, 2023 · 38 comments

Comments

@KalakondaKrish

I am getting this error whenever a request takes longer than 60 seconds. I tried setting timeout=120 seconds in ChatOpenAI().

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised Timeout: Request timed out: HTTPSConnectionPool(host='api.openai.com', port=443): Read timed out. (read timeout=60).

What is the reason for this issue and how can I rectify it?

@homanp
Contributor

homanp commented Apr 17, 2023

+1, I'm seeing a lot of these with ChatOpenAI when retrievers are connected.

@AMK9978
Contributor

AMK9978 commented Apr 17, 2023

I couldn't reproduce the error right now, but if the request_timeout param is not being passed through, this is a bug.

@homanp
Contributor

homanp commented Apr 17, 2023

It seems to work again, so this was probably an OpenAI API issue?

@joybro
Contributor

joybro commented Apr 17, 2023

+1, I'm consistently encountering the same error today.

@mkhanplative

+1, seeing the same issue when using langchain only. Direct calls to the OpenAI API work fine.

@rafaelquintanilha

rafaelquintanilha commented Apr 17, 2023

gpt-4 is always timing out for me (gpt-3.5-turbo works fine). Increasing the request_timeout helps:

llm = ChatOpenAI(temperature=0, model_name=model, request_timeout=120)

@dtthanh1971

I have set it up to 20 seconds in openai.py:

def _create_retry_decorator(self) -> Callable[[Any], Any]:
    import openai

    min_seconds = 20
    max_seconds = 60
    # Wait 2^x * 1 second between retries, clamped to
    # [min_seconds, max_seconds]: 20 seconds at first, capped at 60
    return retry(
        reraise=True,
        stop=stop_after_attempt(self.max_retries),
        wait=wait_exponential(multiplier=1, min=min_seconds, max=max_seconds),
        retry=(
            retry_if_exception_type(openai.error.Timeout)
            | retry_if_exception_type(openai.error.APIError)
            | retry_if_exception_type(openai.error.APIConnectionError)
            | retry_if_exception_type(openai.error.RateLimitError)
            | retry_if_exception_type(openai.error.ServiceUnavailableError)
        ),
        before_sleep=before_sleep_log(logger, logging.WARNING),
    )
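For reference, the clamped exponential backoff that tenacity implements above can be sketched in plain stdlib Python, which makes the timing behaviour easier to see (retry_with_backoff is an illustrative name of mine, not langchain's or tenacity's API):

```python
import time

def retry_with_backoff(fn, max_retries=6, min_seconds=4.0, max_seconds=60.0,
                       retryable=(TimeoutError,)):
    """Retry fn on the given exceptions, sleeping 2**attempt seconds
    between attempts, clamped to [min_seconds, max_seconds]."""
    for attempt in range(max_retries):
        try:
            return fn()
        except retryable:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            wait = min(max_seconds, max(min_seconds, 2.0 ** attempt))
            time.sleep(wait)
```

With min_seconds=20 as in the snippet above, every retry waits at least 20 seconds, which is why the log lines report "Retrying ... in 20.0 seconds".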

But I still get a rate limit error:

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.._completion_with_retry in 20.0 seconds as it raised RateLimitError: Rate limit reached for default-gpt-3.5-turbo in organization org-oTVXM6oG3frz1CFRijB3heo9 on requests per min. Limit: 3 / min. Please try again in 20s. Contact support@openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method

Is the limit really only 3 requests per minute for a normal user?
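The error text above spells it out: without a payment method on file, the account is limited to 3 requests per minute. Until the quota is raised, a client-side throttle avoids tripping it; a minimal stdlib sketch (RateLimiter is an illustrative name, not an OpenAI or langchain API):

```python
import time

class RateLimiter:
    """Client-side throttle: allow at most max_calls per period seconds,
    sleeping locally instead of letting the server raise RateLimitError."""

    def __init__(self, max_calls=3, period=60.0):
        self.max_calls = max_calls
        self.period = period
        self.calls = []  # monotonic timestamps of recent calls

    def acquire(self):
        now = time.monotonic()
        # drop timestamps older than one period
        self.calls = [t for t in self.calls if now - t < self.period]
        if len(self.calls) >= self.max_calls:
            # sleep until the oldest call ages out of the window
            time.sleep(self.period - (now - self.calls[0]))
        self.calls.append(time.monotonic())
```

Calling limiter.acquire() immediately before each completion request keeps the client under the 3/min quota instead of burning retries on 429 responses.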

@mkhanplative

gpt-4 is always timing out for me (gpt-3.5-turbo works fine). Increasing the request_timeout helps:

llm = ChatOpenAI(temperature=0, model_name=model, request_timeout=120)

Increasing the timeout helped. Thanks for the tip, @rafaelquintanilha !

@dtthanh1971

USER_NAME = "Agent 007" # The name you want to use when interviewing the agent.
LLM = ChatOpenAI(max_tokens=1500, request_timeout=120) # Can be any LLM you want.

But it did not work in my case.

@neethanwu

+1, frequent timeouts with gpt-4. I increased the request_timeout, but it didn't help much. Direct OpenAI calls work as expected. Any workaround or potential root cause?

Usage: Refine summarization chain

@KalakondaKrish
Author

Increasing the request_timeout value helped. Thanks.

@achempak-polymer

Not sure if this should be marked as completed. It's probably still a "bug", since it happens more often than not when using gpt-4. Maybe request_timeout should default to 120 when model_name is "gpt-4".

@votkon

votkon commented Apr 27, 2023

Had this appear for some complex prompts today. Changed timeout to 120. It helped!

hwchase17 pushed a commit that referenced this issue May 1, 2023
With longer context and completions, gpt-3.5-turbo and, especially, gpt-4 will more often than not take > 60 seconds to respond.

Based on some other discussions, it seems like this is an increasingly
common problem, especially with summarization tasks.
- #3512
- #3005

OpenAI's max 600s timeout seems excessive, so I settled on 120, but I do
run into generations that take >240 seconds when using large prompts and
completions with GPT-4, so maybe 240 would be a better compromise?
@ColinTitahi

ColinTitahi commented May 9, 2023

This is driving me completely batty - hoping for any advice.
I'm running a Flask app on Azure; I can't replicate the issue locally, but this is preventing me from rolling it out.

Increasing the timeout just increases how long it takes for this error to be raised.
It appears to happen BEFORE I call chat.generate or an agent, even before I define the base LLM.

I know this may be more of an Azure thing, but any advice?

@sagardspeed2

Today I am getting the same error every time with model gpt-4-0314. I also set request_timeout to 240, and even then I still get the same error every time. My max_tokens limit is 2048.

Yesterday it was working well, but today it gives the same error every time, which is driving me crazy.

@Suprimepl

I have the same problem with gpt-4. My script worked well, but since yesterday it times out every time :).

@sagardspeed2

@Suprimepl, which model are you using in your script?

@Suprimepl

model="gpt-4",

@santialferez

The same problem is happening to me with model="gpt-3.5-turbo" and request_timeout=120.

@Django-Jiang

Much the same problem for me since midnight, using "gpt-4-0314". It worked well before I went to sleep, but most requests time out today.

@rcro19

rcro19 commented May 24, 2023

Getting this same error. The code seems to be fine, but the problem is exponentially worse when executing within AWS.

@Suprimepl

I think it's OpenAI's fault :/

@ColinTitahi

Still driving me batty. Looking at server config: gunicorn on Azure, gthread/gevent workers, worker and thread counts, gunicorn timeouts, Azure timeouts, etc. Could just be the size of the VM, but I shouldn't need a production-level server for testing with 5 users.
Getting to the point where I think I might just have to rewrite in Node.js.
Anyone have the magic configuration for gunicorn that works as well as the development Flask server?

@rafaelquintanilha

Still driving me batty. Looking at server config: gunicorn on Azure, gthread/gevent workers, worker and thread counts, gunicorn timeouts, Azure timeouts, etc. Could just be the size of the VM, but I shouldn't need a production-level server for testing with 5 users. Getting to the point where I think I might just have to rewrite in Node.js. Anyone have the magic configuration for gunicorn that works as well as the development Flask server?

It's common to have to increase the gunicorn timeout when running on prod, their default timeout is too short.

However, from a design perspective, calling LangChain may take an unpredictable amount of time, so a safer solution would be to implement some sort of queue system (for example using Celery). That way the processing happens in the background and you won't have timeout issues with gunicorn.

That said, you can try to increase the gunicorn timeout with something like gunicorn --timeout 300 [rest of command]

@ColinTitahi

Thanks @rafaelquintanilha my timeout was at 600. Will investigate Celery. My attempts at using gevent were unsuccessful - like crashing the container due to ignorance.

@zeke-john

Any updates on this issue?

@SinaArdehali

I still have this issue. Does anyone know a workaround?

@santialferez

Hi, for me the problem went away when I set request_timeout=600 (or more than 600; I think that is the default value in the latest versions of langchain). I think this problem is mainly a request-time issue.

@masa8

masa8 commented Jun 19, 2023

To ensure that retries are made until the timeout is reached, I think it would be better to make max_retries=12 the default setting, and if you change the max_seconds or multiplier settings, to set max_retries so that the retries fit within the timeout period.
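Whether a given max_retries fits inside a timeout budget is simple arithmetic; a small helper, assuming tenacity's wait_exponential semantics (wait = multiplier * 2**attempt, clamped to [min, max], attempts numbered from 1, and max_retries - 1 sleeps between max_retries attempts):

```python
def total_backoff_seconds(max_retries, multiplier=1.0, min_s=20.0, max_s=60.0):
    """Worst-case cumulative sleep across all retries for a
    tenacity-style wait_exponential(multiplier, min=min_s, max=max_s)
    combined with stop_after_attempt(max_retries)."""
    return sum(min(max_s, max(min_s, multiplier * 2.0 ** attempt))
               for attempt in range(1, max_retries))
```

For example, with langchain's old defaults (multiplier=1, min=4, max=10) six attempts sleep 4 + 4 + 8 + 10 + 10 = 36 seconds in total, so the retry budget one chooses should be checked against the request_timeout in the same way.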

@ArtificialIntelligence-Hub

I still have this issue. Does anyone know how to solve it?

@masa8

masa8 commented Nov 3, 2023

I'm using langchain==0.0.319, and adding request_timeout=120 to ChatOpenAI() seems to work well.
That is, sometimes OpenAI does not return a response within 120 seconds, which triggers a retry, and then I get a response within a few calls.

What is the reason for this issue?

Honestly I don't know, but my guess is that the OpenAI server side fails to return a response for various reasons:

  • There is a large number of requests at once
  • Perhaps there is a bug in the server-side program
  • Others...

How can I rectify it?

Since the root cause of the request timeout is on the server side, it is unavoidable on the client side. The server will sometimes not respond, and the client has the responsibility to handle that scenario. As long as a retry eventually gets a response, there should be no problem.

In my case, the LLM sometimes does not respond, so I set request_timeout to a smaller value to trigger a retry on purpose; the retry fires whenever the LLM does not respond, and as a result I get a proper response.

I hope this helps.
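The "short timeout plus deliberate retry" strategy described above can be sketched with the stdlib alone (call_with_deadline is an illustrative name, not a langchain API):

```python
import concurrent.futures

def call_with_deadline(fn, timeout=30.0, attempts=4):
    """Give each attempt a short deadline and retry, rather than waiting
    indefinitely for a stalled response.  Note that a timed-out attempt
    keeps running in its worker thread, which is why the pool is sized
    with one thread per attempt."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=attempts) as pool:
        for i in range(attempts):
            future = pool.submit(fn)
            try:
                return future.result(timeout=timeout)
            except concurrent.futures.TimeoutError:
                if i == attempts - 1:
                    raise  # out of attempts: surface the timeout
```

In langchain terms this is roughly what a small request_timeout combined with max_retries achieves, without editing library code.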

@snayan06

snayan06 commented Jan 5, 2024

Facing a similar issue with the Vertex AI Gemini Pro models and gevent: after 60 seconds the stuff chain times out, but when I use the simple worker type it works fine.

@votkon

votkon commented Jan 5, 2024

Facing a similar issue with the Vertex AI Gemini Pro models and gevent: after 60 seconds the stuff chain times out, but when I use the simple worker type it works fine.

Check if you have installed root HTTPS certificates for your venvs.

@snayan06

snayan06 commented Jan 6, 2024

#15222 (comment)
Actually, this was happening because of gevent and its compatibility with grpc. I'm now trying to figure out how to make it work with grpc, as there are a good number of issues I've seen where this combination does not work.

@eta1232002

Facing a similar issue with the Vertex AI Gemini Pro models and gevent: after 60 seconds the stuff chain times out, but when I use the simple worker type it works fine.

Could you please explain in detail how to resolve this issue, and what do you mean by "simple worker type"?

@eta1232002

#3005 (comment)

Could you please explain in detail how to resolve this issue, and what do you mean by "simple worker type"?

@snayan06

snayan06 commented Jan 8, 2024

#15222 (comment)

Basically it was an issue with gevent and grpc. After upgrading the library and doing monkey patching, it worked for me. By "simple worker type" I mean using the sync worker, not the gevent one.
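For anyone hitting the same gevent/grpc clash, the sync-worker fallback mentioned above is just a gunicorn flag; a sketch (app:app is a placeholder for the actual WSGI module):

```shell
# Fall back to gunicorn's default sync worker instead of gevent, so
# grpc-based clients (e.g. Vertex AI) are not affected by gevent's
# monkey patching; also raise the worker timeout for long LLM calls.
gunicorn --worker-class sync --workers 4 --timeout 300 app:app
```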

@eta1232002

eta1232002 commented Jan 8, 2024

#3005 (comment)

Could you please clarify what you mean by "LangChain code" in your reply?
