Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle too long URIs and too large header fields in HTTP::Request.from_io #8013

Merged

Conversation

@straight-shoota
Copy link
Member

commented Jul 30, 2019

This change removes HTTP::Request::BadRequest which was just a status type to signal an error in the request parser. Instead, Request.from_io directly returns an HTTP::Status which can designate a different status code than 400 Bad Request.

When a URI is too long (max size is configurable, defaults to 1MB), it returns 414 Request-URI too long.
When the header fields are to large (HTTP::MAX_HEADER_SIZE = 1MB), it returns 431 Request Header Fields Too Large.

Fixes #7838

@jhass
jhass approved these changes Jul 30, 2019
@straight-shoota straight-shoota force-pushed the straight-shoota:feature/http-path-length branch from 3701e66 to ff268b4 Jul 30, 2019
@straight-shoota straight-shoota added this to the 0.31.0 milestone Jul 30, 2019
@straight-shoota straight-shoota force-pushed the straight-shoota:feature/http-path-length branch from ff268b4 to a8db1e4 Jul 30, 2019
@RX14

This comment has been minimized.

Copy link
Member

commented Jul 30, 2019

1MiB is way too large.

@asterite

This comment has been minimized.

Copy link
Member

commented Jul 30, 2019

Go does the same, so it might be fine.

@RX14

This comment has been minimized.

Copy link
Member

commented Jul 30, 2019

No, we should make the limits configurable and leave the defaults as-is. Maybe doubled, but no larger.

It's so easy to DoS a server, and the current values work for any sane usecase.

If someone's already designed an insane API which you can't break, you should be able to raise the limits. But the defaults should be left as-is.

@asterite

This comment has been minimized.

Copy link
Member

commented Jul 30, 2019

So, if you hit www.google.com with such large request it works fine.

However, it might make sense to have a compromise and choose a smallish value but let the user configure it if they really need it.

@straight-shoota

This comment has been minimized.

Copy link
Member Author

commented Jul 30, 2019

I think it makes sense to look at what default values others are using.

Request line:

Request Header:

  • apache: 8K
  • nginx: 4 * 8K (8K per header field, 32K in total with request line)
  • golang: 1M (total)

This list can be expanded.

@asterite

This comment has been minimized.

Copy link
Member

commented Jul 30, 2019

I forgot to mention this in my previous review: HTTP::Server should have a configurable property that is passed to HTTP::Request.from_io, so that one can choose the maximum size.

@straight-shoota

This comment has been minimized.

Copy link
Member Author

commented Jul 30, 2019

@asterite Do you mean max header size or max request line size? Or both?

@asterite

This comment has been minimized.

Copy link
Member

commented Jul 30, 2019

Probably both.

@RX14

This comment has been minimized.

Copy link
Member

commented Jul 31, 2019

Going with the same as NGINX is OK with me. I'd be fine with copying just the 32k without the 8k per-header too. Realistically, most everything running crystal is gonna pass through NGINX on the way.

@straight-shoota

This comment has been minimized.

Copy link
Member Author

commented Jul 31, 2019

Realistically, most everything running crystal is gonna pass through NGINX on the way.

In this case Crystal wouldn't need to enforce any limits at all. In fact, it might even be counterproductive because increasing the limits in NGINX wouldn't have any effect unless they're increased in the app server as well.

I think that's why golang has such high limits: they expect their HTTP server to be exposed behind a reverse proxy. For this use case, you want high limits in order to accept everything coming from the frontend.
Only when the app server is exposed directly, it needs to enforce reasonable limits by itself to reduce DoS attack surface.

From that line of thought, you could say that an app server supposed to be exposed without a reverse proxy needs to take special precautions anyway. One of them would be to decrease the default limits.
I'm not sure which limits Caddy uses. The documentation says it's configurable (even depending on path) but not what they use as default value.

Given that Crystal and Go are used for writing the same kinds of software, I'd be inclined to follow Go's example here and use very high limits.
It's not safe by default, but honestly, you probably shouldn't hook up your HTTP::Server directly to the internet anyway. That's what reverse proxies are for.

@RX14

This comment has been minimized.

Copy link
Member

commented Jul 31, 2019

I don't follow that we should support DoSing the server by default because we assume that we're behind NGINX. While we're behind NGINX most of the time right now, I want to make sure that Crystal is robust with good defaults for graduating beyond that!

If we cannot agree to new limits, then perhaps we can all agree to make this configurable, then change limits later.

@straight-shoota

This comment has been minimized.

Copy link
Member Author

commented Aug 1, 2019

I want to make sure that Crystal is robust with good defaults for graduating beyond that!

The question is whether the stdlib can and should provide that.

But disussing limits and long-term goals is not the purpose of this PR. So I agree to leave this part of the discussion out.

@straight-shoota

This comment has been minimized.

Copy link
Member Author

commented Aug 1, 2019

I've added these limits as class properties of HTTP with the defaults:

  • max_request_line_size: 8 KB
  • max_headers_size: 16 KB (as before)

These properties are currently the only way to configure this, making them global values. I'm not sure whether we would need more fine-grained control, for example per HTTP::Server instance. I'd say this could be easily added later, should there be a demand.

I've also changed read_header_line to read only as much bytes that are available, otherwise a purposly desgined request could make it initially read up to 2 * max_header_size - 1 bytes.
This could potentially be enhanced by a {max_size, HTTP.max_single_header_size}.min to enforce a maximum size per single header (like 8KB in NGINX).

@straight-shoota straight-shoota force-pushed the straight-shoota:feature/http-path-length branch from 702634d to f0fc36b Aug 1, 2019
src/http/common.cr Outdated Show resolved Hide resolved
src/http/common.cr Outdated Show resolved Hide resolved
src/http/common.cr Outdated Show resolved Hide resolved
src/http/request.cr Outdated Show resolved Hide resolved
@asterite

This comment has been minimized.

Copy link
Member

commented Aug 1, 2019

I've added these limits as class properties of HTTP with the defaults

Thank you! But I was thinking about this as properties of an HTTP::Server. What do you think?

I know an app will usually have just one http server, but I usually try to avoid global variables. One reason is that it's easier to test (no need to remember unsetting the property after each test).

@RX14

This comment has been minimized.

Copy link
Member

commented Aug 1, 2019

I'd definitely find it far cleaner to have them on HTTP::Server - even if the code for passing it around is a bit messy.

@straight-shoota straight-shoota force-pushed the straight-shoota:feature/http-path-length branch from f0fc36b to faec19e Aug 1, 2019
@straight-shoota

This comment has been minimized.

Copy link
Member Author

commented Aug 1, 2019

I've moved configuration to HTTP::Server.

src/http/content.cr Outdated Show resolved Hide resolved
src/http/common.cr Outdated Show resolved Hide resolved

# EOF
break unless request

if request.is_a?(HTTP::Request::BadRequest)
response.respond_with_error("Bad Request", 400)
if request.is_a?(HTTP::Status)

This comment has been minimized.

Copy link
@Sija

Sija Aug 2, 2019

Contributor
Suggested change
if request.is_a?(HTTP::Status)
if status = request.as?(HTTP::Status)

This comment has been minimized.

Copy link
@straight-shoota

straight-shoota Aug 2, 2019

Author Member

I've tried that, unfortunately it doesn't work because it doesn't restrict the type of request to HTTP::Request.
I figure this should probably work, though because HTTP::Status runs into the branch and returns.

@RX14
RX14 approved these changes Aug 2, 2019
This change allows Request.from_io to directly return a HTTP::Status
which might designate a different status code than 400 Bad Request.
@straight-shoota straight-shoota force-pushed the straight-shoota:feature/http-path-length branch from f2acbbe to 6239126 Aug 6, 2019
@straight-shoota straight-shoota merged commit 92b4a3e into crystal-lang:master Aug 6, 2019
4 of 5 checks passed
4 of 5 checks passed
ci/circleci: test_linux32 Your tests failed on CircleCI
Details
ci/circleci: check_format Your tests passed on CircleCI!
Details
ci/circleci: test_darwin Your tests passed on CircleCI!
Details
ci/circleci: test_linux Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@straight-shoota straight-shoota deleted the straight-shoota:feature/http-path-length branch Aug 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.