Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix failures when ASGI app rejects a connection during handshake #704

Merged
merged 1 commit into from
Dec 9, 2020

Conversation

adamhooper
Copy link
Contributor

Fixes #244. The unit test already existed, but it logged exceptions that
weren't being noticed.

  • websockets_impl: transfer_data_task is unset if the handshake fails, so we can't call recv().
  • wsproto_impl: h11.Reject is invalid during handshake; send RejectConnection instead.

@adamhooper
Copy link
Contributor Author

I could use some help on this unit-test failure. I've tested 10 times in the venv on localhost and there are no failures.

I also wonder: do other tests have un-awaited tasks? Is my only mistake the use of caplog? pytest -v -s indeed prints such a backtrace in tests/protocols/test_websocket.py::test_send_before_handshake[WSProtocol] on my machine.

@matteing
Copy link

Any updates? This is a blocker for us in prod.

Copy link
Member

@florimondmanca florimondmanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one, did some local testing against this and this looks pretty good. Left a suggestion to simplify the implementation in the websockets_impl case!

uvicorn/protocols/websockets/websockets_impl.py Outdated Show resolved Hide resolved
uvicorn/protocols/websockets/websockets_impl.py Outdated Show resolved Hide resolved
uvicorn/protocols/websockets/wsproto_impl.py Show resolved Hide resolved
@florimondmanca
Copy link
Member

@matteing Mind giving a look at the Windows error we're seeing…? Occurs in the test modified here, apparently there's a coroutine that's not awaited on Windows but I can't tell which coroutine that is exactly, i.e. is it due to the test server, or due to the ASGI application…? I don't have a Windows machine so debugging this is pretty iffy, and I don't want to rush things in case this could be a hint that there's something deeper we need to take care of here.

task: <Task pending name='Task-168' coro=<IocpProactor.accept.<locals>.accept_coro() running at c:\hostedtoolcache\windows\python\3.8.5\x64\lib\asyncio\windows_events.py:562> wait_for=<_OverlappedFuture cancelled>>

@adamhooper
Copy link
Contributor Author

@florimondmanca we're all having trouble reproducing this bug on Windows. But it seems to me this error is currently on Windows, and it has been for ages?

It looks to me like this patch merely reveals the bug -- it doesn't cause it.

@adamhooper
Copy link
Contributor Author

(My suggesion being: 1. Make unit tests pass somehow, so we can deploy this and finally fix #244; and then 2. Investigate the un-awaited coroutine on Windows, which is a different bug that already exists and shouldn't delay the fix to #244....)

@euri10
Copy link
Member

euri10 commented Nov 20, 2020

my 2 cents on the CI issue, I didn't follow the issue at stake here...

we dropped iocp support for windows a while ago and now we force uvicorn to run the SelectorEventLoop, which is not the default loop on windows > 3.8 iirc

here the issue (I think) is that the CI uses the default windows loop, so I'd just force it in the run_server that should fix the CI and be closer to what we're doing anyway in the windows case

adamhooper added a commit to adamhooper/uvicorn that referenced this pull request Dec 3, 2020
adamhooper added a commit to adamhooper/uvicorn that referenced this pull request Dec 3, 2020
@adamhooper
Copy link
Contributor Author

Thanks, @euri10! That made the test pass

@florimondmanca I added a new function, uvicorn.loops.asyncio._new_event_loop(). Is that okay? Is this pull request [gulp] mergeable?

@adamhooper
Copy link
Contributor Author

ping? Is there anything I can do to help this along?

@florimondmanca
Copy link
Member

Taking a look. :)

@@ -236,6 +235,10 @@ async def asgi_receive(self):
return {"type": "websocket.connect"}

await self.handshake_completed_event.wait()
if self.closed_event.is_set():
# Handshake has failed.
return {"type": "websocket.disconnect", "code": self.initial_response[0]}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That code is tricky… We actually need to consider two situations: A/ the handshake failed / got rejected, in that case .initial_response is available and we can use it like here, and B/ the connection was closed normally by the client sending a websocket.close event, and in that case .initial_response isn't available and we need to send the close code that was given by the client.

@florimondmanca
Copy link
Member

@adamhooper Sadly it seems the entire suite of websocket tests seems off. The run_server() helper is potentially plain broken, because it creates an starts tasks on a loop created in the main thread, but runs the loop in a different thread, which asyncio simply doesn't allow. So eventually, we need to switch all of those tests to using actual async/await code instead, with proper asyncio tasks for running the server in the background.

So I'd say — forget about the test suite for now. Let's have it work on manual tests. If that works, let's ship it. :shipit:

For now I was able to confirm that your closed_event.is_set() check makes the bug in the websockets case go away. Then I looked at the logs in the test cases and saw the one about h11.Response, and switching to RejectConnection resolves it.

So… How about a new pull request, that makes those two changes only?

As noted in #704 (comment) we'll need to introduce a new .close_code variable, set it when receiving the websocket.close event, and use that instead of .initial_response to derive the close code. I can provide some more pointers on the PR if necessary but hopefully you get the idea?

@florimondmanca
Copy link
Member

florimondmanca commented Dec 8, 2020

Here's the diff I got to, if that's helpful:

diff --git a/uvicorn/protocols/websockets/websockets_impl.py b/uvicorn/protocols/websockets/websockets_impl.py
index 407674c..9c26306 100644
--- a/uvicorn/protocols/websockets/websockets_impl.py
+++ b/uvicorn/protocols/websockets/websockets_impl.py
@@ -49,6 +49,7 @@ class WebSocketProtocol(websockets.WebSocketServerProtocol):
         self.closed_event = asyncio.Event()
         self.initial_response = None
         self.connect_sent = False
+        self.close_code = None
         self.accepted_subprotocol = None
         self.transfer_data_task = None
 
@@ -216,6 +217,7 @@ class WebSocketProtocol(websockets.WebSocketServerProtocol):
 
             elif message_type == "websocket.close":
                 code = message.get("code", 1000)
+                self.close_code = code
                 await self.close(code)
                 self.closed_event.set()
 
@@ -236,6 +238,18 @@ class WebSocketProtocol(websockets.WebSocketServerProtocol):
             return {"type": "websocket.connect"}
 
         await self.handshake_completed_event.wait()
+
+        if self.closed_event.is_set():
+            if self.initial_response is None:
+                # Client has disconnected.
+                assert self.close_code is not None
+                code = self.close_code
+            else:
+                # Handshake has failed, or the connection was rejected.
+                code = self.initial_response[0]
+
+            return {"type": "websocket.disconnect", "code": code}
+
         try:
             data = await self.recv()
         except websockets.ConnectionClosed as exc:
diff --git a/uvicorn/protocols/websockets/wsproto_impl.py b/uvicorn/protocols/websockets/wsproto_impl.py
index d712963..456af13 100644
--- a/uvicorn/protocols/websockets/wsproto_impl.py
+++ b/uvicorn/protocols/websockets/wsproto_impl.py
@@ -255,10 +255,8 @@ class WSProtocol(asyncio.Protocol):
                 )
                 self.handshake_complete = True
                 self.close_sent = True
-                msg = h11.Response(status_code=403, headers=[])
+                msg = events.RejectConnection(status_code=403, headers=[])
                 output = self.conn.send(msg)
-                msg = h11.EndOfMessage()
-                output += self.conn.send(msg)
                 self.transport.write(output)
                 self.transport.close()

[closes encode#244]

Previously, if the application tried to close the connection before the
handshake completed, uvicorn would log:

    TypeError: An asyncio.Future, a coroutine or an awaitable is required.

These TypeErrors happen in tests/protocols/test_websocket.py. pytest is
hiding them. In my initial pull request, I added `caplog` to
`test_websocket.py` to expose the bug:

    def test_close_connection(protocol_cls, caplog):
        caplog.set_level(logging.ERROR)
        # ...
        assert not [r.message for r in caplog.records]

... but this caused unrelated errors on Windows; so @florimondmanca
suggests leaving the test suite alone, in
encode#704 (comment)
@adamhooper
Copy link
Contributor Author

@florimondmanca I'm a bit confused as to what my role is here. I assume I am to copy/paste your diff into the pull request, reset to master....

... and here's the irony: out of curiosity, I added my caplog hack from my original commit, just to see what would happen. And your patch caused a big stack trace:

INFO     uvicorn.error:websockets_impl.py:194 ('127.0.0.1', 53910) - "WebSocket /" 403
ERROR    asyncio:base_events.py:1738 Exception in callback _SelectorSocketTransport._call_connection_lost(None)
handle: <Handle _SelectorSocketTransport._call_connection_lost(None)>
Traceback (most recent call last):
  File "/usr/lib64/python3.9/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/lib64/python3.9/asyncio/selector_events.py", line 978, in _call_connection_lost
    super()._call_connection_lost(exc)
  File "/usr/lib64/python3.9/asyncio/selector_events.py", line 736, in _call_connection_lost
    self._protocol.connection_lost(exc)
  File "/home/adam/src/upstream/uvicorn/uvicorn/protocols/websockets/websockets_impl.py", line 72, in connection_lost
    super().connection_lost(exc)
  File "/home/adam/src/upstream/uvicorn/venv/lib64/python3.9/site-packages/websockets/protocol.py", line 1367, in connection_lost
    self.abort_pings()
  File "/home/adam/src/upstream/uvicorn/venv/lib64/python3.9/site-packages/websockets/protocol.py", line 1307, in abort_pings
    exc = self.connection_closed_exc()
  File "/home/adam/src/upstream/uvicorn/venv/lib64/python3.9/site-packages/websockets/protocol.py", line 779, in connection_closed_exc
    exception = ConnectionClosedError(self.close_code, self.close_reason)
  File "/home/adam/src/upstream/uvicorn/venv/lib64/python3.9/site-packages/websockets/exceptions.py", line 134, in __init__
    super().__init__(code, reason)
  File "/home/adam/src/upstream/uvicorn/venv/lib64/python3.9/site-packages/websockets/exceptions.py", line 121, in __init__
    super().__init__(format_close(code, reason))
  File "/home/adam/src/upstream/uvicorn/venv/lib64/python3.9/site-packages/websockets/exceptions.py", line 93, in format_close
    if 3000 <= code < 4000:
TypeError: '<=' not supported between instances of 'int' and 'NoneType'

The fix is to nix the self.close_code = None. (WebSocketServerProtocol uses hasattr(), not is None.) So I committed that.

Also, I changed the logic so the websocket.disconnect event (which I guess is part of ASGI spec?) has code=1006 in this case. As I understand it, code here is meant to be a valid WebSocket close code, and 403 isn't one. ref: https://asgi.readthedocs.io/en/latest/specs/www.html#disconnect-receive-event-ws

All that to say: I've force-pushed a very brief diff. I also recommend adding caplog to every test ... but I agree with you that we should merge this post-haste to tackle my thousands of spurious stack traces per day :).

@florimondmanca
Copy link
Member

@adamhooper Thanks for following up, sorry if my comments looked patronizing — didn't mean to have it come across that way.

Your approach for .close_code sounds good to me!

So is this PR fixing the two main issues that are overflowing your logs? I don't have a workstation handy just now, but I can do an extra-checkup, then I guess we can go ahead and release this? :-)

@adamhooper
Copy link
Contributor Author

@florimondmanca Not patronizing at all! I'm all eagerness here.

Yes, I heartily expect this PR to fix #244, the spurious log messages in websockets_impl.

I don't have experience with the parallel bug in wsproto_impl, but my caplog change exposed it. It looks like that might be #596 and/or #811. My change makes tests pass, so I assume it's correct.

Copy link
Member

@florimondmanca florimondmanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was able to confirm this fixes the following two issues:

  • When running with --ws websockets and the ASGI app rejects the connection, a TypeError is raised -> Fixed
  • When running with --ws wsproto and the ASGI app rejects the connection, a LocalProtocolError is raised -> Fixed

@florimondmanca
Copy link
Member

@adamhooper Thanks! Seems like we can go ahead and get this in.

We'll get to fixing the test suite later — I'll log an issue about that to track it.

@florimondmanca florimondmanca changed the title Fix recv() errors when closing during handshake Fix failures when ASGI app rejects a connection during handshake Dec 9, 2020
@euri10
Copy link
Member

euri10 commented Dec 9, 2020

sorry to be late in the game but this feels like a wrong fix to me at least in the websocket case, if you read the comment here python-websockets/websockets#670 (comment)

a simpler and cleaner imho fix would look like this :

        await self.handshake_completed_event.wait()
        try:
            await self.ensure_open()
            data = await self.recv()
        except websockets.ConnectionClosed as exc:
            return {"type": "websocket.disconnect", "code": exc.code}

euri10 added a commit to euri10/uvicorn that referenced this pull request Dec 9, 2020
florimondmanca pushed a commit that referenced this pull request Dec 9, 2020
* Revert "Fix recv() errors when closed during handshake (#704)"

This reverts commit 8581b34

* Fix 204 by using ensure_open that raises the disconnect
adamhooper added a commit to CJWorkbench/cjworkbench that referenced this pull request Dec 14, 2020
See encode/uvicorn#704

Should silence 20,000 stack traces per day on production:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 154, in run_asgi
    result = await self.app(self.scope, self.asgi_receive, self.asgi_send)
  File "/usr/local/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.8/site-packages/uvicorn/middleware/asgi2.py", line 7, in __call__
    await instance(receive, send)
  File "/usr/local/lib/python3.8/site-packages/channels/sessions.py", line 183, in __call__
    return await self.inner(receive, self.send)
  File "/usr/local/lib/python3.8/site-packages/channels/middleware.py", line 41, in coroutine_call
    await inner_instance(receive, send)
  File "/usr/local/lib/python3.8/site-packages/channels/sessions.py", line 183, in __call__
    return await self.inner(receive, self.send)
  File "/usr/local/lib/python3.8/site-packages/channels/middleware.py", line 41, in coroutine_call
    await inner_instance(receive, send)
  File "/usr/local/lib/python3.8/site-packages/channels/consumer.py", line 58, in __call__
    await await_many_dispatch(
  File "/usr/local/lib/python3.8/site-packages/channels/utils.py", line 58, in await_many_dispatch
    await task
  File "/usr/local/lib/python3.8/site-packages/channels/utils.py", line 50, in await_many_dispatch
    result = task.result()
  File "/usr/local/lib/python3.8/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 240, in asgi_receive
    data = await self.recv()
  File "/usr/local/lib/python3.8/site-packages/websockets/protocol.py", line 492, in recv
    await asyncio.wait(
  File "/usr/local/lib/python3.8/asyncio/tasks.py", line 424, in wait
    fs = {ensure_future(f, loop=loop) for f in set(fs)}
  File "/usr/local/lib/python3.8/asyncio/tasks.py", line 424, in <setcomp>
    fs = {ensure_future(f, loop=loop) for f in set(fs)}
  File "/usr/local/lib/python3.8/asyncio/tasks.py", line 673, in ensure_future
    raise TypeError('An asyncio.Future, a coroutine or an awaitable is '
TypeError: An asyncio.Future, a coroutine or an awaitable is required
@adamhooper adamhooper deleted the close-during-handshake branch December 14, 2020 18:25
@kylebebak
Copy link
Contributor

kylebebak commented Nov 15, 2021

@euri10

We can't call WebSocketCommonProtocol.ensure_open before calling WebSocketCommonProtocol.recv, because there may be unread frames in the server's read queue that were sent by the client before the client sent the close frame.

WebSocketCommonProtocol.ensure_open raises an exception and prevents the server from reading these unread frames. Neither the client, nor application code using uvicorn, has any way of knowing that frames successfully sent by the client to the server's read queue, before any close frame was sent, will never be read by the server.

The spec is clear on this: https://datatracker.ietf.org/doc/html/rfc6455#section-1.4

After sending a control frame indicating the connection should be closed, a peer does not send any further data; after receiving a control frame indicating the connection should be closed, a peer discards any further data received.

The current behavior in uvicorn is rather "a peer discards any further data received, and any data previously received"


More detail in #1230

Also, python-websockets/websockets#670 was eventually closed without any changes being made; in other words, there was no bug in websockets, and the maintainer agrees that calling await self.ensure_open() in recv is incorrect

cr313 added a commit to cr313/py-project-uvicorn that referenced this pull request Apr 19, 2024
[closes #244]

Previously, if the application tried to close the connection before the
handshake completed, uvicorn would log:

    TypeError: An asyncio.Future, a coroutine or an awaitable is required.

These TypeErrors happen in tests/protocols/test_websocket.py. pytest is
hiding them. In my initial pull request, I added `caplog` to
`test_websocket.py` to expose the bug:

    def test_close_connection(protocol_cls, caplog):
        caplog.set_level(logging.ERROR)
        # ...
        assert not [r.message for r in caplog.records]

... but this caused unrelated errors on Windows; so @florimondmanca
suggests leaving the test suite alone, in
encode/uvicorn#704 (comment)
Kludex pushed a commit to Kludex/jik that referenced this pull request Aug 14, 2024
[closes #244]

Previously, if the application tried to close the connection before the
handshake completed, uvicorn would log:

    TypeError: An asyncio.Future, a coroutine or an awaitable is required.

These TypeErrors happen in tests/protocols/test_websocket.py. pytest is
hiding them. In my initial pull request, I added `caplog` to
`test_websocket.py` to expose the bug:

    def test_close_connection(protocol_cls, caplog):
        caplog.set_level(logging.ERROR)
        # ...
        assert not [r.message for r in caplog.records]

... but this caused unrelated errors on Windows; so @florimondmanca
suggests leaving the test suite alone, in
encode/uvicorn#704 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TypeError when connection is closed in connect()
5 participants