Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[gql_websocket_link]: Race condition resulting in multiple channels in a single link #227

Closed
shyndman opened this issue May 18, 2021 · 3 comments

Comments

@shyndman
Copy link

shyndman commented May 18, 2021

Hello there,

While debugging the project I'm working on (nhost/nhost-dart#9), I noticed that a websocket link instance was invoking its channel generator multiple times before it connected. Which means more sockets than expected...conceivably misbehaving sockets.

Here's how to reproduce:

  1. Create a new WebSocketLink
  2. Make two (or more!) requests in succession, without waiting for a response. In our case, both requests were made in the same method body, and both are subscribed to via Stream.listen and not await for.

Here are the logs associated with the above, annotated with what's happening (I added logging throughout the code to better understand the flow of things):

flutter: sending request #1
# Beginning to connect so we can satisfy request
flutter: _connect, channel is null  
flutter: invoking channel generator
# While awaiting the channel generator, execution returns to the next request
# (enqueued in microtask)
flutter: sending request #2
# !!! Oh no !!! We're starting to connect while we're already connecting
flutter: _connect, channel is null 
flutter: invoking channel generator # Second channel being generated
# Here we print identityHashCode of each channel immediately after creation,
# which confirms that they're distinct objects
flutter: _channel: 0x338ca006 
flutter: _channel: 0x215f5847 
# Now the last channel generated "wins" (gets written to the _channel field) and 
# gets the writes
flutter: writing request #1 to channel 0x215f5847
flutter: writing request #2 to channel 0x215f5847

Digging a little deeper, it appears the initialization logic doesn't take into account whether a channel is already being created. This check

if (_channel == null || _connectionStateController.value == closed) {
await _connect();
}
will be passed by any request that comes in before a channel is generated (which can be many more than two), and each of those requests will result in a new socket channel being created.

While I don't know the specifics of what happens after this point, it doesn't look stable. All those sockets are going to be sharing the same connection state stream and timers...and if any one of those streams hits its onDone handler, it will either trigger a reconnection on the "winning" socket, or just close it outright.

I've also been thinking through solutions, and I think that making an AsyncMemoizer responsible for channel creation would do the trick. The only thing to keep in mind is that you'd need to recreate the memoizer every time the socket disconnects, so that it can protect against the next request barrage.

Version: 0.3.0

Bonus. Screenshot of the supremely connected link:
Screen Shot 2021-05-18 at 1 42 20 AM

@agent3bood
Copy link
Collaborator

Hi @shyndman
Thanks for the details, by reading you description of the issue I think there might be a race condition in _connectionStateController, I will recheck it and provide a fix if needed.

@shyndman
Copy link
Author

shyndman commented May 18, 2021

It's actually simpler than that. The race condition exists because _channel being null leads to a _connect, and _channel takes time to be assigned. Example:

https://dartpad.dev/da5ae0734f2eed81ac682514b3dabc94?null_safety=true

@shyndman
Copy link
Author

shyndman commented May 18, 2021

Is there any reason you can't rely entirely on the current connection state to determine whether or not to connect? It looks like you're already managing state transitions appropriately in request(), so why the null check?

shyndman added a commit to nhost/nhost-dart that referenced this issue Jun 6, 2021
The latest version includes a fix for a race condition that allows
multiple websocket channels to open simultaneously, leading to undefined
behavior.

gql-dart/gql#227
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants