Protect against cross origin websockets. #980

Merged: 10 commits merged into tornadoweb:master on May 26, 2014

rgbkrk
Contributor

@rgbkrk rgbkrk commented Jan 24, 2014

Since CORS headers don't exist in WebSockets land, we need to check origin and host. This adds origin and host checking within the WebSocketHandler.

The opinionated piece of this PR is that cross origin is disabled by default.

If a user wants different behavior, they'll need to override check_origin.
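As an illustration of such an override, here is a minimal sketch of an allow-list policy. The mixin name `AllowListOriginMixin` and its `allowed_origins` attribute are hypothetical, and the sketch assumes the hook receives the raw `Origin` header value and returns a boolean. It deliberately avoids importing Tornado so it can run on its own.

```python
from urllib.parse import urlsplit


class AllowListOriginMixin:
    """Hypothetical mixin: accept only origins whose host[:port] is listed.

    Meant to be placed before WebSocketHandler in a subclass's bases,
    assuming the handler calls check_origin(origin) and expects a bool.
    """

    # Hypothetical attribute; entries are compared in lowercase.
    allowed_origins = frozenset({"example.com", "localhost:8888"})

    def check_origin(self, origin):
        # Compare only the host[:port] part of the Origin header value.
        netloc = urlsplit(origin).netloc.lower()
        return netloc in self.allowed_origins


checker = AllowListOriginMixin()
print(checker.check_origin("http://example.com"))
print(checker.check_origin("http://evil.example.org"))
```

In a real application the mixin would be combined with the WebSocket handler class, and the allow-list would come from configuration rather than a class attribute.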

@rgbkrk
Contributor Author

rgbkrk commented Jan 24, 2014

Since websocket_connect doesn't set origin or host and is used in conjunction with WebSocketHandlers, this is definitely going to fail tests for the moment. I'll have to look deeper in the tests, but would love some feedback on this PR.

@bdarnell
Member

We should definitely make it easy for applications to check the origin header, and it should probably be turned on by default (I'm not 100% convinced here, but I was surprised when I learned that cross-origin websockets were allowed by default so it's probably better to be conservative). A request with no origin header should probably be allowed rather than rejected - browsers always send the origin, so a request without one is coming from a source where the origin header would provide no meaningful authentication anyway.

Because of proxies like nginx and haproxy, a tornado server doesn't necessarily know its correct origin (it may be listening on another port than the one the client connects to). We need some way to tell Tornado of its true origin. This could either be another option at the WebSocketHandler level, or perhaps we could put it on HTTPServer to make it a part of the incoming request (by analogy with the existing protocol option there).

You added the allow_cross_origin argument to _execute, but it's not really feasible for applications to pass in arguments here. This should probably be managed via a new overridable method, similar to allow_draft76. Maybe structure it as check_origin(self, origin) so applications can choose to allow selected non-same origins.

When comparing origins the default ports :80 and :443 are optional and should probably be either added or removed from both sides of the comparison for consistency.
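The port normalization suggested above could look roughly like the following sketch. The helper names `normalize_origin` and `same_origin` are hypothetical and not part of Tornado; the idea is only to fill in the scheme's default port on both sides before comparing.

```python
from urllib.parse import urlsplit

# Default ports implied by each scheme.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_origin(origin):
    """Return (scheme, host, port) with the scheme's default port filled in."""
    parts = urlsplit(origin)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # parts.port is None when the origin omits an explicit port.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return scheme, host, port


def same_origin(a, b):
    """Compare two origins after normalizing case and default ports."""
    return normalize_origin(a) == normalize_origin(b)


print(same_origin("http://example.com", "http://example.com:80"))
print(same_origin("http://example.com", "http://example.com:8080"))
```

Adding the port, rather than stripping it, keeps the comparison a plain tuple equality and avoids special-casing which ports may be dropped for which scheme.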

@rgbkrk
Contributor Author

rgbkrk commented Jan 24, 2014

@bdarnell - Thank you so much for the excellent feedback! I'm never sure how PRs out of nowhere will be received on a project.

> A request with no origin header should probably be allowed rather than rejected - browsers always send the origin, so a request without one is coming from a source where the origin header would provide no meaningful authentication anyway.

The check is a protection for end users in a browser, so that's completely sane. I'll change it to match that.

> Because of proxies like nginx and haproxy, a tornado server doesn't necessarily know its correct origin (it may be listening on another port than the one the client connects to).

The Host header is actually passed by the browser as well. I did some testing with nginx where the host name is different (tornado server running only on localhost, nginx forwarding to another interface). @minrk tested it with SSH tunnels and node-http-proxy, but again only with browsers.

> This should probably be managed via a new overridable method, similar to allow_draft76. Maybe structure it as check_origin(self, origin) so applications can choose to allow selected non-same origins.

Just the feedback I needed. Wasn't quite sure how to structure this. That's a really good idea. Disable by default but let users override how the checking is done, to include checking for particular origins?

> When comparing origins the default ports :80 and :443 are optional and should probably be either added or removed from both sides of the comparison for consistency.

In the server I'm using, which is a client-side application, the ports are actually high ports and do appear in both the Origin and Host headers. I can provide samples of this directly to you if necessary.


I submitted this on the road but I'll be back to it soon, including getting tests to pass (and adding some).

@rgbkrk
Contributor Author

rgbkrk commented Jan 25, 2014

While adding tests, I noticed how terrible of an interface I made by putting this in _execute. If they override it, they'd have to provide a default argument for it to do anything different.

from tornado.websocket import WebSocketHandler

class SomeWebSocketHandler(WebSocketHandler):

    def check_origin(self, allowed_origins=["localhost"]):
        super(SomeWebSocketHandler, self).check_origin(allowed_origins)

That's so ugly.

This is my first time hacking on Tornado, so I'm a bit lost on how to get to certain data and responses within the tests. In particular, I would want to verify the status code that was returned to the WebSocketClientConnection. Can you point me in the right direction so I can make better, more robust tests?

On a side note, I noticed that some of the tests aren't enabled (by way of the gen_test decorator). In particular, test_websocket_callbacks within websocket_test. If enabled, it's a failing test (and completely unrelated to this PR). Is this by design?

@bdarnell
Member

Good catch on the test_websocket_callbacks bug - that wasn't intentional. I'll fix that and I think I've got a trick to detect and prevent these in the future.

For testing the error codes, you should be able to catch the HTTPError raised by websocket_connect - see test_websocket_http_fail.

It's not always possible for an application to pass in all its allowed origins (unless perhaps we introduce some sort of wildcard matching, but that just invites complexity), so I'd turn the interface around a bit: _execute calls self.check_origin(origin) with the value extracted from the headers (and perhaps normalized? lowercase it, add a port if it's missing, etc), and then it's up to check_origin to decide whether to allow it. The default implementation would compute an expected origin from the request and compare to that, but applications are free to augment or replace that default.
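The turned-around interface described above can be sketched as follows. `FakeRequest`, `OriginCheckingHandler`, `handshake_status`, and `SubdomainHandler` are hypothetical stand-ins written for this illustration, not Tornado classes, and the 403 rejection status is an assumption. The sketch shows the handshake passing the `Origin` value to `check_origin`, a default that compares it with the `Host` header, and an application replacing that default.

```python
from urllib.parse import urlsplit


class FakeRequest:
    """Hypothetical stand-in for the incoming HTTP request."""

    def __init__(self, headers):
        # A plain dict; real header lookups would be case-insensitive.
        self.headers = headers


class OriginCheckingHandler:
    """Hypothetical stand-in for a WebSocket handler; not Tornado's class."""

    def __init__(self, request):
        self.request = request

    def check_origin(self, origin):
        # Default policy: the origin's host[:port] must equal the Host header.
        expected = self.request.headers.get("Host", "").lower()
        return urlsplit(origin).netloc.lower() == expected

    def handshake_status(self):
        origin = self.request.headers.get("Origin")
        # No Origin header: presumably not a browser, so nothing to enforce.
        if origin is not None and not self.check_origin(origin):
            return 403  # assumed rejection status
        return 101  # Switching Protocols


class SubdomainHandler(OriginCheckingHandler):
    """Application override: allow example.com and any of its subdomains."""

    def check_origin(self, origin):
        host = urlsplit(origin).hostname or ""
        return host == "example.com" or host.endswith(".example.com")


same = FakeRequest({"Host": "localhost:8888", "Origin": "http://localhost:8888"})
cross = FakeRequest({"Host": "localhost:8888", "Origin": "http://github.com"})
print(OriginCheckingHandler(same).handshake_status())
print(OriginCheckingHandler(cross).handshake_status())
```

Keeping the decision in one overridable method means an application can widen or narrow the policy without passing arguments into the handshake itself.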

@rgbkrk
Contributor Author

rgbkrk commented Jan 26, 2014

> For testing the error codes, you should be able to catch the HTTPError raised by websocket_connect - see test_websocket_http_fail.

Cool. That helped. Also made me realize I had the origin check in the wrong spot. Yay for tests.

> ...with the value extracted from the headers (and perhaps normalized? lowercase it, add a port if it's missing, etc)

We won't actually know the port since origin is in the context of the browser. I can try to reach a websocket hosted on localhost using my javascript console while on github.com - the origin will show http://github.com.


This is in a happy state now I think. Ready for comments or merging!

@@ -296,7 +296,7 @@ directory as your Python file, you could render this template with:
self.render("template.html", title="My title", items=items)

Tornado templates support *control statements* and *expressions*.
-Control statements are surronded by ``{%`` and ``%}``, e.g.,
+Control statements are surrounded by ``{%`` and ``%}``, e.g.,
Contributor Author

Whoops, this typo fix was meant to go in a separate PR. I can move this if necessary.

@rgbkrk
Contributor Author

rgbkrk commented Apr 27, 2014

@bdarnell, just realized this PR was still out and about. Would you still want to bring this in if I address your last few items?

@bdarnell
Member

Yes, I think if you address my last few comments it'll be ready to merge.

@rgbkrk
Contributor Author

rgbkrk commented May 8, 2014

I've addressed the last few comments and performed a rebase to make this easier to merge.


# If there was an origin header, check to make sure it matches
# according to check_origin. When the origin is None, we assume it
# came from a browser and that it can be passed on.
Member

You mean "we assume it did not come from a browser and therefore we do not need to enforce the same-origin policy", right?

Contributor Author

Yes, "did not come from a browser".

@bdarnell bdarnell merged commit 7fcbe30 into tornadoweb:master May 26, 2014
@bdarnell
Member

Merged with the handling of non-absolute origins removed.

@rgbkrk
Contributor Author

rgbkrk commented May 26, 2014

Awesome, thanks for working through it with me. Checking out 6403cb9 now.

@rgbkrk rgbkrk deleted the cross_origin branch May 26, 2014 02:32
a-pertsev added a commit to a-pertsev/tornado that referenced this pull request May 3, 2018