Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asyncio CancelledError raised from SSLWantReadError handler in _call_sslobject_method #534

Closed
IlyaBizyaev opened this issue Mar 3, 2023 · 23 comments

Comments

@IlyaBizyaev
Copy link

In an async Python application, I get the following traceback:

Traceback (most recent call last):
  File "python3.10/site-packages/anyio/streams/tls.py", line 130, in _call_sslobject_method
    result = func(*args)
  File "/usr/lib/python3.10/ssl.py", line 917, in read
    v = self._sslobj.read(len)
ssl.SSLWantReadError: The operation did not complete (read) (_ssl.c:2548)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "python3.10/site-packages/httpcore/backends/asyncio.py", line 34, in read
    return await self._stream.receive(max_bytes=max_bytes)
  File "python3.10/site-packages/anyio/streams/tls.py", line 195, in receive
    data = await self._call_sslobject_method(self._ssl_object.read, max_bytes)
  File "python3.10/site-packages/anyio/streams/tls.py", line 137, in _call_sslobject_method
    data = await self.transport_stream.receive()
  File "python3.10/site-packages/anyio/_backends/_asyncio.py", line 1265, in receive
    await self._protocol.read_event.wait()
  File "/usr/lib/python3.10/asyncio/locks.py", line 214, in wait
    await fut
asyncio.exceptions.CancelledError

################# (below is the code calling anyio, left for posterity) ######################

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "python3.10/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
    yield
  File "python3.10/site-packages/httpcore/backends/asyncio.py", line 32, in read
    with anyio.fail_after(timeout):
  File "python3.10/site-packages/anyio/_core/_tasks.py", line 118, in __exit__
    raise TimeoutError
TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "python3.10/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "python3.10/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "python3.10/site-packages/httpcore/_async/connection_pool.py", line 253, in handle_async_request
    raise exc
  File "python3.10/site-packages/httpcore/_async/connection_pool.py", line 237, in handle_async_request
    response = await connection.handle_async_request(request)
  File "python3.10/site-packages/httpcore/_async/connection.py", line 90, in handle_async_request
    return await self._connection.handle_async_request(request)
  File "python3.10/site-packages/httpcore/_async/http2.py", line 144, in handle_async_request
    raise exc
  File "python3.10/site-packages/httpcore/_async/http2.py", line 112, in handle_async_request
    status, headers = await self._receive_response(
  File "python3.10/site-packages/httpcore/_async/http2.py", line 229, in _receive_response
    event = await self._receive_stream_event(request, stream_id)
  File "python3.10/site-packages/httpcore/_async/http2.py", line 260, in _receive_stream_event
    await self._receive_events(request, stream_id)
  File "python3.10/site-packages/httpcore/_async/http2.py", line 281, in _receive_events
    events = await self._read_incoming_data(request)
  File "python3.10/site-packages/httpcore/_async/http2.py", line 341, in _read_incoming_data
    raise exc
  File "python3.10/site-packages/httpcore/_async/http2.py", line 327, in _read_incoming_data
    data = await self._network_stream.read(self.READ_NUM_BYTES, timeout)
  File "python3.10/site-packages/httpcore/backends/asyncio.py", line 31, in read
    with map_exceptions(exc_map):
  File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc)
httpcore.ReadTimeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "python3.10/site-packages/telegram/request/_httpxrequest.py", line 199, in do_request
    res = await self._client.request(
  File "python3.10/site-packages/httpx/_client.py", line 1533, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "python3.10/site-packages/httpx/_client.py", line 1620, in send
    response = await self._send_handling_auth(
  File "python3.10/site-packages/httpx/_client.py", line 1648, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "python3.10/site-packages/httpx/_client.py", line 1685, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "python3.10/site-packages/httpx/_client.py", line 1722, in _send_single_request
    response = await transport.handle_async_request(request)
  File "python3.10/site-packages/httpx/_transports/default.py", line 352, in handle_async_request
    with map_httpcore_exceptions():
  File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "python3.10/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ReadTimeout

As far as I understand, SSLWantReadError should be handled inside anyio, but the CancelledError that gets thrown by the exception handler prevents that from happening. Also, it seems like there is some logic in place to catch CancelledError in other places, so I thought this might be a bug in anyio.

Relevant library versions:

anyio==3.6.2
httpcore==0.16.3
httpx==0.23.3
@agronholm
Copy link
Owner

agronholm commented Mar 3, 2023

This looks like:

  1. Socket stream is trying to read more data from a TLS stream
  2. The SSLObject says it needs to read more data by raising SSLWantReadError
  3. The TLS stream handles the error by trying to (asynchronously) read more data from the socket
  4. The read operation is cancelled

Am I mistaken?

@IlyaBizyaev
Copy link
Author

I read it the same, but I'm at loss as to why "The read operation is cancelled"... I'm new to Python's async stack, so I apologize if anyio is a wrong place for this issue, but I don't know how to start debugging a CancelledError thrown from underneath 4 levels of networking libraries :)

Is it generally an expected behavior that this low-level error can be thrown from anyio's code? Could anyio instead raise a higher-level exception with a more specific cause as to why the stream ended up in an invalid state?

@smurfix
Copy link
Collaborator

smurfix commented Mar 4, 2023

Well, apparently …

File "python3.10/site-packages/httpcore/backends/asyncio.py", line 32, in read
    with anyio.fail_after(timeout):

… this timeout expired. Thus anyio cancelled the operation, which is done by injecting a CancelledError into its stack frame. The fail_after context manager converted the CancelledError to a TimeoutError because, well, there was a timeout and the code told it to fail (as opposed to move_on_after which swallows and ignores the cancellation), and then the HTTPX error mapping converted that to an httpx.ReadTimeout which is what your application got. (Or at least I assume that it did get that.)

Yes, there's a whole bunch of unexperienced-programmer-confusing tracebacks attached to what should be a straightforward and "simple" exception, but ultimately there doesn't seem to be any real problem here.

@agronholm
Copy link
Owner

I think the confusion is that you thought SSLWantReadError was unhandled, but you probably missed this part: During handling of the above exception, another exception occurred:. So, while this was handled, the task was cancelled due to the timeout. So, nothing to see here.

@agronholm agronholm closed this as not planned Won't fix, can't repro, duplicate, stale Mar 4, 2023
@IlyaBizyaev
Copy link
Author

IlyaBizyaev commented Mar 4, 2023

I think the confusion is that you thought SSLWantReadError was unhandled, but you probably missed this part: During handling of the above exception, another exception occurred:.

No, that part I actually understood and mentioned in the question, but I would've never guessed that CancelledError represents a timeout, and the calling code does not mislabel it. Thank you, @smurfix, for the detailed explanation!

@smurfix
Copy link
Collaborator

smurfix commented Mar 4, 2023

I would've never guessed that CancelledError represents a timeout

Well, it doesn't. Not really. It represents, surprise, the fact that part of your code got cancelled. That might happen because of a timeout, or an exception in some other unrelated part of your code that propagated out of a taskgroup (which cancels all the other tasks in it, otherwise we'd violate Structured Concurrency guarantees), or ….

@IlyaBizyaev
Copy link
Author

IlyaBizyaev commented Mar 4, 2023

I guess what confused me the most was the inverse logic of the backtrace, which, due to Python's message (During handling of the above exception, another exception occurred), reads like CancelledError has caused the timeout, not vice versa. But I have now confirmed that response reads have indeed been timing out (not something I was expecting from <big company's API>!). Thanks again for the help.

@agronholm
Copy link
Owner

We often see read timeouts happening with big company APIs.

@TheTechromancer
Copy link

I don't know how to start debugging a CancelledError thrown from underneath 4 levels of networking libraries :)

This is my life right now, someone please kill me

@agronholm
Copy link
Owner

Are you still having issues? Have you tried v4.0.0rc1 yet?

@TheTechromancer
Copy link

Oh uh hey there! Thanks for the quick reply. I'm on 3.7.1, which is the current version installed by httpx. Basically yeah, I'm at a loss because I don't know which task was cancelled or where, only that a CancelledError seems to be bubbling up from somewhere in httpx, httpcore, or anyio, I'm not sure which.

@agronholm
Copy link
Owner

If it's caused by an AnyIO cancel scope, the cancel message will show which cancel scope it is (by object ID) in v4.0.0rc1 (which btw appears to be already compatible with httpcore).

@agronholm
Copy link
Owner

We had a lengthy conversation on the Trio Gitter room about tracking cancellation origin just a few days ago. Some interesting ideas came up. It will require some effort from both projects (AnyIO and Trio) in order to work.

@TheTechromancer
Copy link

Sweet, okay I'll give that a try. Yeah I know what you mean, I'm new to async and tracking down stray CancelledErrors has been a real mindfuck.

@smurfix
Copy link
Collaborator

smurfix commented Jul 27, 2023

… almost as bad as figuring out which of N shielded cancel scopes (those are pretty much required for async code in finally: blocks) prevents my program from terminating after I press ^C.

@agronholm
Copy link
Owner

Not to mention, it could also be a stray thread that's causing it. The 4.0 series should decrease the number of these headaches a bit by the means of bugfixes.

@TheTechromancer
Copy link

Yikes. The worst part of all this is I'm not even sure which gods we should be praying to. Who's fault is it? Who's gonna save us? CPython? AnyIO? Mojo? ¯_(ツ)_/¯

@TheTechromancer
Copy link

Anyway @agronholm for the sake of record, this is the traceback I'm dealing with. It appears unexpectedly out of httpx about 1 out of every 1000 web requests. It is the same every time.

I'd love to be able to handle only this specific CancelledError as a temporary solution, but there doesn't seem to be a way to determine which task was cancelled or where.

Traceback (most recent call last):
  File "/root/bbot/bbot/core/helpers/web.py", line 116, in request
    response = await client.request(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/httpx/_client.py", line 1530, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/httpx/_client.py", line 1617, in send
    response = await self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/httpx/_client.py", line 1645, in _send_handling_auth
    response = await self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/httpx/_client.py", line 1682, in _send_handling_redirects
    response = await self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/httpx/_client.py", line 1719, in _send_single_request
    response = await transport.handle_async_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 262, in handle_async_request
    raise exc
  File "/usr/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 245, in handle_async_request
    response = await connection.handle_async_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/httpcore/_async/connection.py", line 69, in handle_async_request
    stream = await self._connect(request)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/httpcore/_async/connection.py", line 117, in _connect
    stream = await self._network_backend.connect_tcp(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/httpcore/_backends/auto.py", line 31, in connect_tcp
    return await self._backend.connect_tcp(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/httpcore/_backends/anyio.py", line 114, in connect_tcp
    stream: anyio.abc.ByteStream = await anyio.connect_tcp(
                                   ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/anyio/_core/_sockets.py", line 221, in connect_tcp
    await event.wait()
  File "/usr/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 1778, in wait
    if await self._event.wait():
       ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/locks.py", line 213, in wait
    await fut
asyncio.exceptions.CancelledError

@agronholm
Copy link
Owner

Concurrency in programming is hard. We're trying to make it less painful (structured concurrency helps a lot), but we still have ways to go.

As far as your traceback goes, it's hard to say where this cancellation comes from, but it doesn't seem to come from AnyIO, as the exception would not fall out of a task. If you want to be sure, try with AnyIO 4.0 (you should see a message associated with the exception if, by some chance, an AnyIO exception did in fact fall out of a task). It's a good idea to switch anyway given the myriad cancellation related fixes that went into the release. Best of luck!

@smurfix
Copy link
Collaborator

smurfix commented Jul 28, 2023

This looks suspiciously like a bug I've found. If your app uses memory object streams, 4.0 should fix this.

@TheTechromancer
Copy link

TheTechromancer commented Jul 28, 2023

Btw, I was able to confirm this was an AnyIO bug, and that it's been fixed in 4.0.0rc1.

This was the code that triggered the CancelledError on anyio==3.7.1:

import httpx
import random
import asyncio

## STEP 1:
##   python -m http.server 5555

async def main():
    async def request():
        transport = httpx.AsyncHTTPTransport(retries=5) # <-----------------------
        async with httpx.AsyncClient(transport=transport, timeout=random.random() / 10) as client:
            try:
                return await client.get("http://127.0.0.1:5555")
            except httpx.TimeoutException:
                print("_", end="", flush=True)

    while 1:
        tasks = [asyncio.create_task(request()) for _ in range(100)]
        await asyncio.gather(*tasks)


asyncio.run(main())

@agronholm thanks again for your help.

@agronholm
Copy link
Owner

You really should not use asyncio.gather() ever again if you can possibly avoid id (and in this case you can). Use task groups instead.

@agronholm
Copy link
Owner

Also why are you creating an HTTP client for every single request? That's very inefficient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants