Description
When a SyncConsumer performs a blocking action like using the Django ORM, all other actively connected consumers (both SyncConsumer and AsyncConsumer) are blocked and subsequent connections from any consumer are also blocked until the blocking action is completed.
Here 'blocked' in the context of actively connected consumers means that Daphne acknowledges incoming frames from the client but no consumer code is triggered:
daphne.ws_protocol DEBUG WebSocket incoming frame on ['127.0.0.1', 50176]
For subsequent connections, 'blocked' means that Channels initiates the handshake and Daphne upgrades the connection to WebSocket, but no consumer code is triggered and the connection attempt times out about 5 seconds later:
django.channels.server INFO WebSocket HANDSHAKING /ws/chat/sync [127.0.0.1:50834]
daphne.http_protocol DEBUG Upgraded connection ['127.0.0.1', 50834] to WebSocket
daphne.ws_protocol DEBUG WebSocket closed for ['127.0.0.1', 50834]
django.channels.server INFO WebSocket DISCONNECT /ws/chat/sync [127.0.0.1:50834]
Expected behavior
My expectation is that only the thread for the SyncConsumer performing the blocking action should be blocked. Other connected consumers (both SyncConsumer and AsyncConsumer) should still be able to send and receive messages, and new consumers should be able to open connections. That expectation is based on this section of the documentation, which indicates that a SyncConsumer will run in a dedicated thread:
If you’re calling any part of Django’s ORM or other synchronous code, you should use a SyncConsumer, as this will run the whole consumer in a thread and stop your ORM queries blocking the entire server.
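To make the expectation concrete, here is a minimal stdlib-only model. It is not Channels code. `blocking_handler` is a hypothetical stand-in for a consumer running a blocking ORM call. The sketch compares two arrangements: handlers queued on one shared worker thread (an assumed setup that would produce the symptom described above, not something confirmed here) and handlers that each get a dedicated thread (what I expected from the documentation).

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_handler(seconds: float) -> str:
    """Hypothetical stand-in for a consumer making a blocking ORM call."""
    time.sleep(seconds)
    return threading.current_thread().name


def run_on_shared_worker(n: int, seconds: float) -> float:
    """Queue n handlers on a single worker thread; they run one after another."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=1) as pool:
        list(pool.map(blocking_handler, [seconds] * n))
    return time.monotonic() - start


def run_on_dedicated_threads(n: int, seconds: float) -> float:
    """Give each handler its own thread; the blocking calls overlap."""
    start = time.monotonic()
    threads = [
        threading.Thread(target=blocking_handler, args=(seconds,))
        for _ in range(n)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.monotonic() - start


if __name__ == "__main__":
    print(f"shared worker:     {run_on_shared_worker(3, 0.1):.2f}s")
    print(f"dedicated threads: {run_on_dedicated_threads(3, 0.1):.2f}s")
```

In the shared-worker case the total time grows with the number of handlers, which is what "one blocking consumer delays every other consumer" looks like from the outside. With dedicated threads the total stays close to the duration of a single call.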
Environment
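All tests were performed in Python 3.10.6 virtual environments with no packages added beyond those explicitly installed for the test. Three environments were tested: Django 4.2 + Channels 4.2, Django 4.1 + Channels 4.0, and Django 5.1 + Channels 4.2.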
Interestingly, the behavior differs slightly between the Django 4.1 + Channels 4.0 environment and the other two. In the Django 4.1 + Channels 4.0 environment, any actively connected AsyncConsumer can continue to send and receive messages while the SyncConsumer performs the blocking action (any other actively connected SyncConsumer is still blocked). In the other two environments, the blocking SyncConsumer blocks all other operations, including those of any actively connected AsyncConsumer. In all three environments, consumer connections attempted while the blocking action is in progress are blocked. This difference stood out because it appears to be a regression.
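The difference between the two behaviors can be modelled with plain asyncio. This is a sketch under assumptions, not a diagnosis of Channels: `heartbeat` is a hypothetical stand-in for an AsyncConsumer that should stay responsive, and `measure_max_gap` is a hypothetical helper that runs a blocking call either directly on the event loop or in a worker thread and reports the longest pause between heartbeat ticks.

```python
import asyncio
import time


async def heartbeat(ticks: list, interval: float, count: int) -> None:
    """Hypothetical async consumer: records a timestamp every `interval` seconds."""
    for _ in range(count):
        await asyncio.sleep(interval)
        ticks.append(time.monotonic())


async def measure_max_gap(block_in_loop: bool, block_seconds: float = 0.3) -> float:
    """Return the longest pause between heartbeat ticks during a blocking call."""
    ticks = [time.monotonic()]
    task = asyncio.create_task(heartbeat(ticks, 0.02, 10))
    await asyncio.sleep(0.05)  # let the heartbeat tick a couple of times first
    if block_in_loop:
        # Blocking call made on the event loop thread: nothing else can run.
        time.sleep(block_seconds)
    else:
        # Blocking call moved to a worker thread: the loop keeps running.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, time.sleep, block_seconds)
    await task
    return max(later - earlier for earlier, later in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    print(f"blocking on the loop: {asyncio.run(measure_max_gap(True)):.2f}s max gap")
    print(f"blocking in a thread: {asyncio.run(measure_max_gap(False)):.2f}s max gap")
```

When the blocking call runs in a worker thread, the heartbeat keeps ticking, which resembles what I see for AsyncConsumer in the Django 4.1 + Channels 4.0 environment. When the blocking call runs on the loop itself, the heartbeat stalls for the full duration, which resembles the other two environments.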
Steps to reproduce
Sample code is available in a public repo: https://github.com/JKasakyan/channels-ws-blocking-sample. It is a basic expansion of the Channels tutorial chat application that sets up two endpoints, one for a SyncConsumer and one for an AsyncConsumer. The consumer logic is identical. All messages are echoed back by the server, and a message containing 'sleep:' will run pg_sleep for the specified time (e.g. 'sleep: 60' will run pg_sleep(60) in the consumer). The requirements.txt in that repo is the Django 4.2 + Channels 4.2 environment referenced above.
Follow the steps in the Reproduction section of that repo.
Optionally, run the same application in the other two environments referenced above (Django 4.1 + Channels 4.0 or Django 5.1 + Channels 4.2) and confirm that the behavior is as described above.
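For readers who do not want to open the repo, the 'sleep:' message handling described in the first step can be sketched roughly as follows. The function names `parse_sleep_command` and `build_pg_sleep_query` are hypothetical and the repo's actual parsing may differ. The sketch only shows how a message such as 'sleep: 60' could be turned into a parameterized `pg_sleep` query, without touching a database.

```python
import math
from typing import List, Optional, Tuple


def parse_sleep_command(message: str) -> Optional[float]:
    """Return the sleep duration in seconds if the message contains 'sleep:'.

    Returns None when the marker is absent or the value after it is not a
    finite, non-negative number.
    """
    _, marker, rest = message.partition("sleep:")
    if not marker:
        return None
    parts = rest.split()
    if not parts:
        return None
    try:
        seconds = float(parts[0])
    except ValueError:
        return None
    if not math.isfinite(seconds) or seconds < 0:
        return None
    return seconds


def build_pg_sleep_query(seconds: float) -> Tuple[str, List[float]]:
    """Build a parameterized PostgreSQL query that blocks for `seconds`."""
    return "SELECT pg_sleep(%s)", [seconds]


if __name__ == "__main__":
    duration = parse_sleep_command("sleep: 60")
    if duration is not None:
        print(build_pg_sleep_query(duration))
```

In a consumer, the resulting SQL and parameters would be passed to a database cursor's `execute` call, which is the blocking action the rest of this report refers to.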
Use case
We have an application that uses Django + Channels for WebSockets, and some of its consumers use the Django ORM. When traffic is high on certain WebSocket endpoints that rely heavily on the Django ORM, we've observed that other active WebSocket connections become less responsive and new WebSocket connections fail more frequently. We believe the blocking behavior described in this post is the source of the issue. I can't imagine this is intended behavior, and the section of the documentation I highlighted earlier seems to describe different behavior. Any clarification would be greatly appreciated!