Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition in leaving rooms #74

Closed
wants to merge 1 commit into from
Closed

Fix race condition in leaving rooms #74

wants to merge 1 commit into from

Conversation

sharat87
Copy link
Contributor

Hi team!

This is a quick small PR to address an error that we are seeing more and more often recently. The error looks like this:

TypeError: Cannot read property 'size' of undefined
    at Adapter._del (/app/node_modules/socket.io-adapter/dist/index.js:67:37)
    at Adapter.delAll (/app/node_modules/socket.io-adapter/dist/index.js:83:18)
    at Socket.leaveAll (/app/node_modules/socket.io/dist/socket.js:190:22)
    at Socket._onclose (/app/node_modules/socket.io/dist/socket.js:334:14)
    at Client.onclose (/app/node_modules/socket.io/dist/client.js:245:20)
    at Socket.emit (events.js:412:35)
    at Socket.onClose (/app/node_modules/engine.io/lib/socket.js:348:12)
    at Object.onceWrapper (events.js:519:28)
    at WebSocket.emit (events.js:400:28)
    at WebSocket.onClose (/app/node_modules/engine.io/lib/transport.js:106:10)

It looks like we are checking the room exists in the rooms Map, and if it does, we proceed to do a few things. But when multiple sockets are removed from a room independently, then this is causing a race condition on this line:

https://github.com/socketio/socket.io-adapter/blob/master/lib/index.ts#L91

This PR addresses this by,

  1. Calling .get on the Map exactly once.
  2. Emitting the delete-room even if the deletion actually happened.

However, I'm a bit skeptical of this as well. While I believe this is the underlying reason for this error, I don't hard evidence to prove it, other than staring at the code really hard and coming up with a scenario where the error happens.

So, if you've already seen this error in the past, any advice would be much appreciated. I tried searching the repository for this error message but couldn't find anything relevant.

Thank you very much for your time, and of course, for your work on socket.io! 馃殌

darrachequesne pushed a commit that referenced this pull request Aug 28, 2021
This should fix the related issue:

```
TypeError: Cannot read property 'size' of undefined
    at Adapter._del (/app/node_modules/socket.io-adapter/dist/index.js:67:37)
    at Adapter.delAll (/app/node_modules/socket.io-adapter/dist/index.js:83:18)
    at Socket.leaveAll (/app/node_modules/socket.io/dist/socket.js:190:22)
    at Socket._onclose (/app/node_modules/socket.io/dist/socket.js:334:14)
    at Client.onclose (/app/node_modules/socket.io/dist/client.js:245:20)
    at Socket.emit (events.js:412:35)
    at Socket.onClose (/app/node_modules/engine.io/lib/socket.js:348:12)
    at Object.onceWrapper (events.js:519:28)
    at WebSocket.emit (events.js:400:28)
    at WebSocket.onClose (/app/node_modules/engine.io/lib/transport.js:106:10)
```

A test case was added, which reproduces the issue by adding a listener
to the "leave-room" event. This does not seem to be the case for the
user reporting the issue though, which may indicate that the root cause
is elsewhere.
@darrachequesne
Copy link
Member

Hi! Thanks for the pull request 馃憤

Merged as 912e13a, though I'm not sure how it could happen (the object being defined here and then undefined here 馃憖 )

@sharat87 sharat87 deleted the patch-1 branch August 28, 2021 08:40
@sharat87
Copy link
Contributor Author

Hey, thanks for merging! Super extra thanks for also pushing a release 馃帀

Yeah bugs due to race conditions have that peculiarity. They don't seem possible just from looking at the code. But we have situations where several multiple socket connections leave rooms at the same time, and this keeps showing up. We can't reproduce it predictably though.

I think the underlying reason is about multiple asynchronously invoked function calls, trying to update and read the same Map object.

sharat87 added a commit to appsmithorg/appsmith that referenced this pull request Aug 31, 2021
Fix potential race condition error that's killing RTS, when multiple several users leave a room at the same time, which isn't uncommon for our use case.

This is done by upgrading socket.io-adapter dependency to version 2.3.2, which includes fix from PR socketio/socket.io-adapter#74.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants