Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Websocket Server: Closing handshake failed in v10, v9.1 was ok #1072

Closed
chuckyblack opened this issue Oct 16, 2021 · 22 comments · Fixed by #1086
Closed

Websocket Server: Closing handshake failed in v10, v9.1 was ok #1072

chuckyblack opened this issue Oct 16, 2021 · 22 comments · Fixed by #1086

Comments

@chuckyblack
Copy link

Hello Augustin,
from version 10 occurs an error during closing browser tab connected to my websocket server.
It occurs only in approximately 30% cases.
But everything was ok in version 9.1.

Tested in Google Chrome 94.

Traceback:

closing handshake failed
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/websockets/legacy/server.py", line 232, in handler
    await self.close()
  File "/usr/local/lib/python3.9/site-packages/websockets/legacy/protocol.py", line 779, in close
    await asyncio.shield(self.close_connection_task)
  File "/usr/local/lib/python3.9/site-packages/websockets/legacy/protocol.py", line 1309, in close_connection
    self.transport.write_eof()
  File "/usr/local/lib/python3.9/asyncio/selector_events.py", line 972, in write_eof
    self._sock.shutdown(socket.SHUT_WR)
OSError: [Errno 107] Socket not connected

Websocket server simplified code:

import asyncio
import signal
import websockets
import sys


class Chat(object):
	async def process(self, websocket: websockets.WebSocketServerProtocol, path: str):
		async for event in websocket:
			print(event)
			sys.stdout.flush()

	__call__ = process


async def server(stop):
	async with websockets.serve(Chat(), "0.0.0.0", 8765):
		await stop


if __name__ == "__main__":
	loop = asyncio.get_event_loop()
	stop = loop.create_future()
	loop.add_signal_handler(signal.SIGTERM, stop.set_result, None)
	loop.run_until_complete(server(stop))

Can you help me, please?

Thanks a lot,
Jaromir

@aaugustin
Copy link
Member

Looks like a bug. Probably websockets should ignore exceptions there.

@aaugustin aaugustin added the bug label Oct 17, 2021
@meseta
Copy link

meseta commented Oct 23, 2021

Hey all, In case it helps, I got the same issue on FastAPI/Starlette/uvicorn which is using websockets v10.0. Seems to happen if the client kills the connection suddenly. The disconnect is detected (and TCPTransport shows closed=True before returning from the endpoint), but websockets tries to write_eof still

Minimal code:

from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

@app.websocket("/")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()

    try:
        while True:
            data = await websocket.receive_text()
            await websocket.send_text(f"Message text was: {data}")
    except WebSocketDisconnect:
        return

(executed through uvicorn uvicorn main:app)

Traceback:

ERROR:    closing handshake failed
Traceback (most recent call last):
  File "/home/yuan/.local/share/virtualenvs/gateway-tbJlYYrM/lib/python3.9/site-packages/websockets/legacy/server.py", line 232, in handler
    await self.close()
  File "/home/yuan/.local/share/virtualenvs/gateway-tbJlYYrM/lib/python3.9/site-packages/websockets/legacy/protocol.py", line 779, in close
    await asyncio.shield(self.close_connection_task)
  File "/home/yuan/.local/share/virtualenvs/gateway-tbJlYYrM/lib/python3.9/site-packages/websockets/legacy/protocol.py", line 1309, in close_connection
    self.transport.write_eof()
  File "uvloop/handles/stream.pyx", line 696, in uvloop.loop.UVStream.write_eof
  File "uvloop/handles/handle.pyx", line 159, in uvloop.loop.UVHandle._ensure_alive
RuntimeError: unable to perform operation on <TCPTransport closed=True reading=False 0xf561a0>; the handler is closed

Hope this helps!

@Catastropha
Copy link

Happens to me as well, when client disconnects suddenly.

@aaugustin
Copy link
Member

I tried to reproduce this error but I didn't manage. I didn't find any changes in this area between 9.1 and 10.0 either. So I have no idea why it's happening in 10.0 but didn't happen before :-(

This change could help. Would someone who hits the issue be able to test if it helps, please?

diff --git a/src/websockets/legacy/protocol.py b/src/websockets/legacy/protocol.py
index a31a5c7..9c0819c 100644
--- a/src/websockets/legacy/protocol.py
+++ b/src/websockets/legacy/protocol.py
@@ -1303,7 +1303,7 @@ class WebSocketCommonProtocol(asyncio.Protocol):
                     self.logger.debug("! timed out waiting for TCP close")

             # Half-close the TCP connection if possible (when there's no TLS).
-            if self.transport.can_write_eof():
+            if self.transport.can_write_eof() and not self.transport.is_closing():
                 if self.debug:
                     self.logger.debug("x half-closing TCP connection")
                 self.transport.write_eof()

@aaugustin
Copy link
Member

Is anyone able to test this patch? If you have a way to reproduce the issue, it should take only of few minutes to apply this change and check if the issue goes away 🙏

@Catastropha
Copy link

Catastropha commented Nov 3, 2021 via email

@ngnpope
Copy link

ngnpope commented Nov 5, 2021

@aaugustin I can confirm that your patch above fixes the issue.

Before
2021-11-05 11:54:06,881 [ERROR] websockets.server:175 run_asgi - Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 171, in run_asgi
    result = await self.app(self.scope, self.asgi_receive, self.asgi_send)
  File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 59, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/asgi2.py", line 17, in __call__
    await instance(receive, send)
  File "/usr/local/lib/python3.9/site-packages/sentry_sdk/integrations/asgi.py", line 100, in inner
    return await self._run_app(scope, lambda: self.app(scope)(receive, send))
  File "/usr/local/lib/python3.9/site-packages/sentry_sdk/integrations/asgi.py", line 152, in _run_app
    raise exc from None
  File "/usr/local/lib/python3.9/site-packages/sentry_sdk/integrations/asgi.py", line 149, in _run_app
    return await callback()
  File "/usr/local/lib/python3.9/site-packages/sentry_sdk/integrations/asgi.py", line 100, in <lambda>
    return await self._run_app(scope, lambda: self.app(scope)(receive, send))
  File "/usr/local/lib/python3.9/site-packages/channels/routing.py", line 54, in __call__
    return self.application_mapping[scope["type"]](scope)
  File "/usr/local/lib/python3.9/site-packages/channels/security/websocket.py", line 37, in __call__
    return self.application(scope)
  File "...", line 39, in __call__
    return self.inner(scope)
  File "/usr/local/lib/python3.9/site-packages/channels/sessions.py", line 47, in __call__
    return self.inner(dict(scope, cookies=cookies))
  File "/usr/local/lib/python3.9/site-packages/channels/sessions.py", line 145, in __call__
    return SessionMiddlewareInstance(scope, self)
  File "/usr/local/lib/python3.9/site-packages/channels/sessions.py", line 169, in __init__
    self.inner = self.middleware.inner(self.scope)
  File "/usr/local/lib/python3.9/site-packages/channels/middleware.py", line 31, in __call__
    inner_instance = self.inner(scope)
  File "/usr/local/lib/python3.9/site-packages/channels/routing.py", line 150, in __call__
    raise ValueError("No route found for path %r." % path)
ValueError: No route found for path 'api/stream/invalid/'.
2021-11-05 11:54:06,883 [INFO] websockets.server:633 handshake - connection open
2021-11-05 11:54:06,884 [ERROR] websockets.server:236 handler - closing handshake failed
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/websockets/legacy/server.py", line 232, in handler
    await self.close()
  File "/usr/local/lib/python3.9/site-packages/websockets/legacy/protocol.py", line 779, in close
    await asyncio.shield(self.close_connection_task)
  File "/usr/local/lib/python3.9/site-packages/websockets/legacy/protocol.py", line 1309, in close_connection
    self.transport.write_eof()
  File "uvloop/handles/stream.pyx", line 696, in uvloop.loop.UVStream.write_eof
  File "uvloop/handles/handle.pyx", line 159, in uvloop.loop.UVHandle._ensure_alive
RuntimeError: unable to perform operation on <TCPTransport closed=True reading=False 0x562533123c10>; the handler is closed
2021-11-05 11:54:06,887 [INFO] websockets.server:252 handler - connection closed
After
2021-11-05 11:55:21,095 [ERROR] websockets.server:175 run_asgi - Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/uvicorn/protocols/websockets/websockets_impl.py", line 171, in run_asgi
    result = await self.app(self.scope, self.asgi_receive, self.asgi_send)
  File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 59, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/asgi2.py", line 17, in __call__
    await instance(receive, send)
  File "/usr/local/lib/python3.9/site-packages/sentry_sdk/integrations/asgi.py", line 100, in inner
    return await self._run_app(scope, lambda: self.app(scope)(receive, send))
  File "/usr/local/lib/python3.9/site-packages/sentry_sdk/integrations/asgi.py", line 152, in _run_app
    raise exc from None
  File "/usr/local/lib/python3.9/site-packages/sentry_sdk/integrations/asgi.py", line 149, in _run_app
    return await callback()
  File "/usr/local/lib/python3.9/site-packages/sentry_sdk/integrations/asgi.py", line 100, in <lambda>
    return await self._run_app(scope, lambda: self.app(scope)(receive, send))
  File "/usr/local/lib/python3.9/site-packages/channels/routing.py", line 54, in __call__
    return self.application_mapping[scope["type"]](scope)
  File "/usr/local/lib/python3.9/site-packages/channels/security/websocket.py", line 37, in __call__
    return self.application(scope)
  File "...", line 39, in __call__
    return self.inner(scope)
  File "/usr/local/lib/python3.9/site-packages/channels/sessions.py", line 47, in __call__
    return self.inner(dict(scope, cookies=cookies))
  File "/usr/local/lib/python3.9/site-packages/channels/sessions.py", line 145, in __call__
    return SessionMiddlewareInstance(scope, self)
  File "/usr/local/lib/python3.9/site-packages/channels/sessions.py", line 169, in __init__
    self.inner = self.middleware.inner(self.scope)
  File "/usr/local/lib/python3.9/site-packages/channels/middleware.py", line 31, in __call__
    inner_instance = self.inner(scope)
  File "/usr/local/lib/python3.9/site-packages/channels/routing.py", line 150, in __call__
    raise ValueError("No route found for path %r." % path)
ValueError: No route found for path 'api/stream/invalid/'.
2021-11-05 11:55:21,100 [INFO] websockets.server:633 handshake - connection open
2021-11-05 11:55:21,100 [INFO] websockets.server:252 handler - connection closed
Versions
channels==2.4.0
gunicorn==20.0.4
sentry-sdk==1.3.1
uvicorn==0.14.0
uvloop==0.16.0
websockets==10.0

@DmytroLitvinov
Copy link

In case if someone is interested to compare changes from v9.1 to v10

@ngnpope
Copy link

ngnpope commented Nov 5, 2021

For some more background, this issue occurred for me when upgrading uvloop from 0.15.3 to 0.16.0 and websockets from 9.1 to 10.0. I've tested downgrading each of the packages independently and can confirm that the error is caused by the upgrade to websockets==10.0.

@aaugustin
Copy link
Member

Thanks a lot for your help! I'll commit the fix shortly.

@aaugustin
Copy link
Member

(I'd still love if someone had a way to reproduce on macOS because I swear I tried a dozen combos of clients x ways to break the connection and never reproduced this.)

@ngnpope
Copy link

ngnpope commented Nov 5, 2021

I'm not very familiar with the bit of code that I'm looking into right now so not sure how much help I can be.

In my example above I triggered an exception:

ValueError: No route found for path 'api/stream/invalid/'.

So it looks like we're seeing the following because the connection is terminated abnormally:

2021-11-05 11:54:06,884 [ERROR] websockets.server:236 handler - closing handshake failed

Not sure if that is helpful. I was able to trigger this behaviour easily with the following where the path isn't valid:

wscat --connect=ws://127.0.0.1:3000/api/stream/invalid/

@timvandermeij
Copy link

timvandermeij commented Nov 5, 2021

I also ran into this problem with websockets 10.0 in combination with uvloop and I can confirm that the change in #1072 (comment) fixes the problem on Debian Buster (and also that websockets 9.1 didn't have the issue). Hopefully this fix can be merged and released since I now had to downgrade to version 9.1 for the time being. Thank you for your efforts!

aaugustin added a commit that referenced this issue Nov 6, 2021
aaugustin added a commit that referenced this issue Nov 14, 2021
aaugustin added a commit that referenced this issue Nov 14, 2021
@aaugustin
Copy link
Member

Version 10.1, which includes the fix, is available on PyPI. Thanks everyone for your help!

@1007tejas
Copy link

aaugustin,

Maybe this will help.

behavior:
Clicking on a different tab of a web page running inside QT5 dumps and reloads the websockets javascript client closing the Transport endpoint.

websockets client js environment: running in Qt5/Pyside2 with included Chrome browser
websockets version 10.1
python3.8
os linuxmint

js: websocket client listens to events 'onopen','onclose','onmessage' and 'onerror'.

error output:
js: WebSocket connection is up

stack trace:

closing handshake failed
Traceback (most recent call last):
File "/home/jeff/.local/lib/python3.8/site-packages/websockets/legacy/server.py", line 239, in handler
await self.close()
File "/home/jeff/.local/lib/python3.8/site-packages/websockets/legacy/protocol.py", line 780, in close
await asyncio.shield(self.close_connection_task)
File "/home/jeff/.local/lib/python3.8/site-packages/websockets/legacy/protocol.py", line 1309, in close_connection
self.transport.write_eof()
File "/usr/lib/python3.8/asyncio/selector_events.py", line 964, in write_eof
self._sock.shutdown(socket.SHUT_WR)
OSError: [Errno 107] Transport endpoint is not connected

output:
js: WebSocket connection is up

observation:
time.sleep(2) before line 1309 worked, eliminating the OSError. It seems to indicate a slow update of status transport.is_closing() and perhaps others.

work around:
catch the OSError
protocol.py line 1309: self.transport.write_eof() enclosed in try/except
# except OSError as error :
# try:
# self.transport.write_eof()
# except OSError as error :
# if self.debug:
# self.logger.debug(f"! OSError: {error}")
# finally:
# pass

The Pyside2 environment probably affected the issue's timing, but hopefully sheds some light on the overall problem.

@ChronoDK
Copy link

I believe this error has resurfaced in 10.2.

10.1 is working fine, but 10.2 is giving me this error, when a connection is non-gracefully closed:

[ERROR] closing handshake failed
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/websockets/legacy/server.py", line 240, in handler
await self.close()
File "/usr/local/lib/python3.9/site-packages/websockets/legacy/protocol.py", line 779, in close
await asyncio.shield(self.close_connection_task)
File "/usr/local/lib/python3.9/site-packages/websockets/legacy/protocol.py", line 1312, in close_connection
self.transport.write_eof()
File "uvloop/handles/stream.pyx", line 696, in uvloop.loop.UVStream.write_eof
File "uvloop/handles/handle.pyx", line 159, in uvloop.loop.UVHandle._ensure_alive
RuntimeError: unable to perform operation on <TCPTransport closed=True reading=False 0x55ee6d824e80>; the handler is closed

@aaugustin aaugustin reopened this Mar 14, 2022
@ngnpope
Copy link

ngnpope commented Mar 14, 2022

I can confirm that this as resurfaced for me as of 5b3a6d2 in v10.2.

That change got rid of the not self.transport.is_closing() check in favour of catching OSError to solve #1117, but the exception raised in this case was RuntimeError. The two options are either to catch that also, or reinstate the check that fixed this.

@aaugustin
Copy link
Member

I will fix it. Really sorry :-(

@ngnpope
Copy link

ngnpope commented Apr 4, 2022

Hi @aaugustin

Was wondering if any progress had been made on this one? Would you like me to open a PR reinstating the not self.transport.is_closing() condition?

I also noticed some other changes in the main branch yesterday related to closing stuff. Is that related to this in any way?

Thanks 🙂

@aaugustin
Copy link
Member

I am not clear on how I want to handle this :-(

Yesterday's changes are related to the Sans-I/O implementation which isn't involved here.

@ngnpope
Copy link

ngnpope commented Apr 4, 2022

OK. Thanks for the update 😢

@aaugustin
Copy link
Member

I decided to keep 5b3a6d2, else #1117 would come back, and just catch RuntimeError to support uvloop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants