Connect Timeout confusion #832

Closed
victoraugustolls opened this issue Feb 28, 2020 · 29 comments
Labels
question Further information is requested

Comments

@victoraugustolls

victoraugustolls commented Feb 28, 2020

Hi!

I'm quite confused about the connect timeout. I've observed that some of my applications happen to throw a ConnectTimeout exception in httpx/backends/asyncio.py, in this part (lines 201, 202):

except asyncio.TimeoutError:
    raise ConnectTimeout()

What I find strange is that my applications are monitored with NewRelic and Microsoft Application Insights, and although they say a request took something like 5s, I can find the request in the origin and it took just a few milliseconds.

So my question is this: is there something in the origin application that can cause this timeout? Maybe some blocking code? Or incorrect AsyncClient creation?

Any help would be appreciated, thanks!

Note: my applications are built with FastAPI.

@florimondmanca florimondmanca added the question Further information is requested label Feb 28, 2020
@florimondmanca
Member

Interesting, thanks!

they say a request took something like 5s, I can find the request in the origin and it took just a few milliseconds.

Questions:

  • I'm not entirely sure what "the request in the origin" means exactly here — does it refer to an entire client.get() call?
  • Is the ConnectTimeout expected in your case, or is it a spurious error that's popping up and you're not sure why?

My two hypotheses so far are:

a) These monitoring backends have limited support for asyncio, or…
b) There's something (a synchronous operation?) blocking the event loop (perhaps causing the timeout in the first place?) that makes these backends unable to correctly pick up the start time of the request call.
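
For illustration, a minimal self-contained sketch of hypothesis (b), not taken from the app in question: a synchronous call that freezes the event loop can make an operation that only needs a few milliseconds trip an asyncio timeout, which is exactly the shape of a spurious ConnectTimeout.

import asyncio
import time

async def connect():
    # Simulates a "connect" that needs a few event-loop iterations (~10 ms total).
    for _ in range(10):
        await asyncio.sleep(0.001)
    return "connected"

async def blocker():
    await asyncio.sleep(0.005)
    time.sleep(2)  # a synchronous call (e.g. instrumentation) freezes the loop for 2 s

async def main():
    asyncio.ensure_future(blocker())
    try:
        print(await asyncio.wait_for(connect(), timeout=1.0))
    except asyncio.TimeoutError:
        # The 1 s deadline expired while the loop was frozen, even though
        # connect() only needed ~10 ms of actual work.
        print("spurious timeout")

asyncio.run(main())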

Would you be able to share:

  • A screenshot of the flame graph (assuming that's what NewRelic/Insights give you)?
  • A snippet of the code surrounding the request call? (ideally a minimal repro example, stripped down of any bits deemed irrelevant)

@victoraugustolls
Author

victoraugustolls commented Feb 28, 2020

Hi @florimondmanca! Thanks for the response. I should add that, after removing the newrelic dependency, I'm no longer seeing the ConnectTimeout error, so there was probably some blocking code there, but I'll wait and see if the exception happens again.

Is the ConnectTimeout expected in your case, or is it a spurious error that's popping up and you're not sure why?

Not expected :/

Right now what I'm seeing is this:
<class 'httpx.exceptions.ProtocolError'> : <ConnectionTerminated error_code:ErrorCodes.NO_ERROR, last_stream_id:1999, additional_data:None>

and

[Errno 104] Connection reset by peer

and

<class 'KeyError'> : HTTPConnection(origin=Origin(scheme='https' host='[FILTERED]' port=443))

@victoraugustolls
Author

For reference, here is the HTTPX wrapper class that I use:

from httpx import AsyncClient, Response, Timeout
from opencensus.trace import execution_context
from opencensus.trace.span import SpanKind
from opencensus.trace.status import Status
from opencensus.trace.tracer import Tracer

from .call import HTTPCall


class HTTPService(object):
    __slots__ = "_timeout", "_client", "_uri"

    _timeout: Timeout
    _client: AsyncClient
    _uri: str

    def __init__(
        self,
        *,
        timeout=Timeout(connect_timeout=20, read_timeout=20, write_timeout=20),
        client: AsyncClient,
        uri: str
    ):
        self._timeout = timeout
        self._client = client
        self._uri = uri

    async def call(
        self, call: HTTPCall, client: AsyncClient = None
    ) -> Response:
        if client is None:
            client = self._client

        return await self._call(call, client)

    async def _call(self, call: HTTPCall, client: AsyncClient) -> Response:
        tracer: Tracer = execution_context.get_opencensus_tracer()
        url = self._uri + call.endpoint

        span = tracer.start_span(call.name)

        span.span_kind = SpanKind.CLIENT
        span.add_attribute("http.method", call.method)
        span.add_attribute("http.url", url)
        span.add_attribute("http.host", self._uri)
        span.add_attribute("http.route", call.endpoint)

        try:
            r: Response = await client.request(
                method=call.method,
                url=url,
                auth=call.auth,
                headers=call.headers,
                params=call.params,
                json=call.body,
            )
        except Exception as e:
            span.set_status(Status.from_exception(e))
            span.add_attribute("error", str(e))
            raise
        else:
            span.add_attribute("http.status_code", r.status_code)
            for header in call.tracing_headers:
                span.add_attribute(header, r.headers[header])
            return r
        finally:
            tracer.end_span()

@victoraugustolls
Author

victoraugustolls commented Feb 28, 2020

One more observation: when I disable HTTP/2 in a client, the number of connect timeouts to one specific host is a lot higher than when it's enabled. When enabled I have almost no timeouts, but I do have the errors described above.

@victoraugustolls
Author

Idea: maybe I'm hitting #634 and the underlying connection can't be acquired because close is hanging? I know this specific host is built with .NET Core 2.2, which (being Microsoft) could result in the same effect as the linked issue.

@victoraugustolls
Author

victoraugustolls commented Feb 29, 2020

More news: creating the client with PoolLimits(soft_limit=0, hard_limit=None) basically made most of the errors disappear 🤔
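
For reference, a minimal sketch of that client construction (httpx 0.11-era API; as I understand it, soft_limit is the number of connections kept alive in the pool, so 0 disables keep-alive):

import httpx

# soft_limit=0: no connections are kept alive in the pool;
# hard_limit=None: no cap on the number of concurrent connections.
client = httpx.AsyncClient(
    pool_limits=httpx.PoolLimits(soft_limit=0, hard_limit=None),
)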

@freis

freis commented Feb 29, 2020

@victoraugustolls that wrapper approach seems nice. Could you share what you are doing in the call file?

@victoraugustolls
Author

@freis sure:

from typing import Optional, Tuple, List


class HTTPCall(object):
    __slots__ = (
        "name",
        "method",
        "endpoint",
        "auth",
        "headers",
        "params",
        "body",
        "tracing_headers",
    )

    name: str
    method: str
    endpoint: str
    auth: Optional[Tuple[str, str]]
    headers: Optional[dict]
    params: Optional[dict]
    body: Optional[dict]
    tracing_headers: List[str]

    def __init__(
        self,
        name: str,
        method: str,
        endpoint: str,
        auth: Optional[Tuple[str, str]] = None,
        headers: Optional[dict] = None,
        params: Optional[dict] = None,
        body: Optional[dict] = None,
        tracing_headers: Optional[List[str]] = None,
    ):
        if tracing_headers is None:
            tracing_headers = []

        self.name = name
        self.method = method
        self.endpoint = endpoint
        self.auth = auth
        self.headers = headers
        self.params = params
        self.body = body
        self.tracing_headers = tracing_headers
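
To make the two classes above concrete, a hypothetical usage sketch (the URI, endpoint, and names are made up):

from httpx import AsyncClient

async def example() -> None:
    client = AsyncClient()
    service = HTTPService(client=client, uri="https://api.example.com")
    call = HTTPCall(name="get-user", method="GET", endpoint="/users/42")
    # Traces a span around the request and returns the httpx Response.
    response = await service.call(call)
    print(response.status_code)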

@victoraugustolls
Author

@florimondmanca a new error has appeared 😢

<class 'httpx.exceptions.ConnectionClosed'>: can't handle event type ConnectionClosed when role=SERVER and state=SEND_RESPONSE

@freis

freis commented Mar 1, 2020

Yeah I got something similar using trio also

@victoraugustolls
Author

victoraugustolls commented Mar 1, 2020

A snippet of the code surrounding the request call? (ideally a minimal repro example, stripped down of any bits deemed irrelevant)

@florimondmanca I made a sample project to help find the issue!

https://github.com/victoraugustolls/httpx-timeout

@victoraugustolls
Author

victoraugustolls/httpx-timeout

Added an update with an endpoint for aiohttp and another for httpx. I only receive timeouts on the httpx one.

@florimondmanca
Member

@victoraugustolls Are you using the latest HTTPX release? Several issues you mentioned (ProtocolError, ConnectionClosed, etc.) remind me of past issues that should have been resolved in bug fix releases.

Are you still experiencing the spurious connect timeouts?

Also FYI — the HTTPService code looks okay. Though I see you’re still using a tracing library — are you sure the library correctly supports async execution?

@victoraugustolls
Author

victoraugustolls commented Mar 1, 2020

Are you using the latest HTTPX release?

Yes! 0.11.1

Are you still experiencing the spurious connect timeouts?

Yes I am, randomly

Though I see you’re still using a tracing library — are you sure the library correctly supports async execution?

I tried without any tracing and it had no effect. In the sample repository I removed all tracing, leaving just basic client usage, and I can reproduce the random ConnectTimeout error with stress tests, but not with aiohttp.

Also, the ConnectTimeout seems to happen only with one specific host. When I tried swapping my endpoint from async to sync and from httpx to requests, it worked flawlessly.

Another observation: if in my application I use the same client for all my HTTPServices (a total of 4), I receive the errors described above. If I use one client per HTTPService, the specific host (which is HTTP/2 enabled) gets a lot more ConnectTimeout errors and the other ones work flawlessly.

Can this be some kind of race condition inside the connection pool caused by the HTTP/2 implementation?
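
For clarity, the two configurations being compared look roughly like this (hypothetical sketch reusing the HTTPService class from earlier in the thread; the URIs stand in for the four real hosts):

from httpx import AsyncClient

uris = ["https://a.example", "https://b.example", "https://c.example", "https://d.example"]

# Configuration 1: a single client shared by all four HTTPServices.
shared = AsyncClient()
shared_services = [HTTPService(client=shared, uri=uri) for uri in uris]

# Configuration 2: one client per HTTPService.
per_client_services = [HTTPService(client=AsyncClient(), uri=uri) for uri in uris]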

@florimondmanca
Member

@victoraugustolls Are we OK to close this, now that we've tracked down a reproducible issue in #841?

@victoraugustolls
Author

I still receive timeouts with http2 disabled, but I think we can close this now and come back to it later. What do you think?

@tomchristie
Member

It'd be interesting to test httpcore against this too, since there's a bit more work in there now around e.g. ensuring that concurrent HTTP/2 requests end up on the same connection.

@victoraugustolls
Author

To help, I would gladly deploy the microservice where I see this behaviour with a version that uses httpcore, before it is officially released.

@victoraugustolls
Author

@florimondmanca one point is that in this issue I reported many types of errors, and described receiving ConnectTimeout errors that make no sense.

In #841 the errors are ReadTimeout and KeyError, which may or may not turn out to be the same problem.

@florimondmanca
Member

florimondmanca commented Mar 4, 2020

@victoraugustolls

Also, the ConnectTimeout seems to happen only with one specific host. When I tried swapping my endpoint from async to sync and from httpx to requests, it worked flawlessly.

This is a very interesting point. Should we perhaps look deeper into what we might be doing differently from Requests when establishing a connection in the sync case? Our sync implementation is currently backed by urllib3, so if you're experiencing timeouts in the sync case then there must be something Requests does that we don't — most likely a special edge case for certain kinds of hosts.

@victoraugustolls
Author

I can try making the endpoint sync again, but using the sync client instead of the requests library, to see what happens. As soon as I do this I'll post the result here!

@victoraugustolls
Author

victoraugustolls commented Mar 6, 2020

@florimondmanca @tomchristie using httpcore-interface here, the issues have been mitigated. Right now, when issuing a high volume of requests, I get a high number of Error: <class 'httpcore._exceptions.ConnectTimeout'> errors ONLY with http2 disabled on the client. It's relevant to say that with aiohttp I have no timeout errors. They seem to appear only on HTTP/1.1 with httpx.

@victoraugustolls
Author

victoraugustolls commented Mar 6, 2020

With HTTP/2 I only receive Error: <class 'h2.exceptions.TooManyStreamsError'> Max outbound streams is 100, 100 open, but that was expected, at least until this is merged.
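
Until that lands, a possible mitigation (my suggestion, not something httpx provides): cap in-flight requests below the advertised stream limit with a semaphore.

import asyncio
import httpx

semaphore = asyncio.Semaphore(100)  # h2's default max concurrent streams per connection

async def fetch(client: httpx.AsyncClient, url: str) -> httpx.Response:
    # Holding the semaphore keeps concurrent requests (and therefore
    # open HTTP/2 streams) at or below the limit.
    async with semaphore:
        return await client.get(url)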

@tomchristie
Member

They seem to appear only on HTTP/1.1 with httpx.

Right, and presumably then only because we have tighter defaults.

@victoraugustolls
Author

Right, and presumably then only because we have tighter defaults.

If you are referring to the timeout defaults, I'm using this Client and Session:

client = httpx.AsyncClient(
    http2=False,
    pool_limits=httpx.PoolLimits(soft_limit=10, hard_limit=50000)
)
session = aiohttp.ClientSession(
    timeout=aiohttp.ClientTimeout(5.0)
)
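
To also rule out timeout defaults in this comparison, httpx's timeout could be pinned explicitly to match the 5 s given to aiohttp. A sketch, assuming Timeout accepts a single default value in this httpx version:

import httpx

client = httpx.AsyncClient(
    http2=False,
    timeout=httpx.Timeout(5.0),  # assumption: one value covers connect/read/write
    pool_limits=httpx.PoolLimits(soft_limit=10, hard_limit=50000),
)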

@victoraugustolls
Author

victoraugustolls commented Mar 6, 2020

Also, I receive the following errors with HTTP/2 connections under stress:

Error: <class 'httpcore._exceptions.ProtocolError'> <ConnectionTerminated error_code:ErrorCodes.NO_ERROR, last_stream_id:1999, additional_data:None>
Error: <class 'h2.exceptions.ProtocolError'> Invalid input ConnectionInputs.RECV_HEADERS in state ConnectionState.CLOSED
Error: <class 'httpcore._exceptions.ProtocolError'> <ConnectionTerminated error_code:ErrorCodes.NO_ERROR, last_stream_id:1999, additional_data:None>
Error: <class 'h2.exceptions.ProtocolError'> Invalid input ConnectionInputs.RECV_HEADERS in state ConnectionState.CLOSED
Error: <class 'h2.exceptions.ProtocolError'> Invalid input ConnectionInputs.RECV_DATA in state ConnectionState.CLOSED

@florimondmanca
Member

florimondmanca commented Jun 20, 2020

Hi @victoraugustolls, is this still an issue as of 0.13? Curious whether any of the improvements from the past four months have helped reduce these spurious connect timeouts.

@victoraugustolls
Author

Hi @florimondmanca, sorry for the (very) late response. I just deployed a version of my API with the newest release and I must say it works flawlessly.

The only error I receive is this one: can't handle event type ConnectionClosed when role=SERVER and state=SEND_RESPONSE, but at this point I don't think this is a bug in httpx, rather a flaw in the remote server.

If you want, I can give more info about this error.

@tomchristie
Member

If you want, I can give more info about this error.

@victoraugustolls That'd be great, yup. Is there an easy, minimal way to reproduce it?
Even if it is a server error, it'd be helpful to dig into it and make sure we're raising an appropriate exception.
