Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition in websocket stream write #25624

Merged
merged 1 commit into from Jul 8, 2016

Conversation

@tinco
Copy link
Contributor

@tinco tinco commented Jul 1, 2016

Summary

It is not safe to write to a socket from two threads at the same time. If multiple threads are spawned that perform ActionCable.server.broadcast there is a chance their calls to ActionCable::Connection::Stream will overlap and their data gets mingled and corrupted. This will usually result in an InvalidUTF8 exception in the browser, which will subsequently kill the websocket connection.

This patch simply puts a write lock around the socket so it can never be called concurrently. I don't expect this will not have any significant performance overhead.

I also put the lock around stream_send, though I'm not 100% certain this is also necessary.

Do you require tests for concurrency problems like this?

To reproduce the error, clone this repository: https://github.com/phusion/actioncable-slow-client, run it in Passenger using passenger start --min-instances=1 --max-pool-size=2 and visit localhost:3000. In the inspector you will see the connection dying every few frames.

p.s.: really sorry that we started performance tests so late, I actually started this on Tuesday thinking I'd make sure Passenger works correctly with Rails 5rc2 but we didn't figure out what was going on until today.

@rails-bot
Copy link

@rails-bot rails-bot commented Jul 1, 2016

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @kaspth (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@maclover7
Copy link
Member

@maclover7 maclover7 commented Jul 1, 2016

@rails-bot rails-bot assigned matthewd and unassigned kaspth Jul 1, 2016
@tinco
Copy link
Contributor Author

@tinco tinco commented Jul 4, 2016

Alright, so I was a little curious why this happens so I browsed through ruby/ruby's io.c to find out what IO#write does exactly, and it's a straight up single call to write(2). The write(2) manpage says that its atomic only for regular files, so not for sockets. There's a SO page here: Is it safe to issue blocking write calls on the same tcp socket from multiple threads? in which someone explains Linux tries to do atomic writes but fails in certain circumstances, and OSX and BSD don't guarantee anything.

I'm not sure why this particular test I did makes it break so reliably, but there you have it.

I guess it might be up to discussion whether Rails should allow ActionCable.server.transmit to be called from different threads, is that something you guys are opinionated on?

@tinco
Copy link
Contributor Author

@tinco tinco commented Jul 4, 2016

Just a small aside, on MRI if you set io.sync = true it will also make it have a mutex and be thread safe, but that behavior isn't documented so we can't know for sure JRuby or Rubinius etc do this.

@matthewd matthewd merged commit ce5f9bb into rails:master Jul 8, 2016
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
matthewd added a commit that referenced this pull request Jul 8, 2016
Fix race condition in websocket stream write
matthewd added a commit that referenced this pull request Jul 8, 2016
Fix race condition in websocket stream write
@matthewd
Copy link
Member

@matthewd matthewd commented Jul 8, 2016

Backported in 7340459

@raggi
Copy link
Contributor

@raggi raggi commented Jul 9, 2016

I'm not sure why this particular test I did makes it break so reliably, but there you have it.

It wrote more than a pipe buf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

6 participants
You can’t perform that action at this time.