Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sort select_best_encoding candidates by available_encodings order #1184

Closed
wants to merge 1 commit into from

Conversation

wjordan
Copy link
Contributor

@wjordan wjordan commented Jun 27, 2017

This PR makes a small change to Rack::Utils#select_best_encoding that allows extra server-side control over the encoding returned in the case of a tie (multiple encodings at the same qvalue).

The reason this extra control is necessary is because the Accept-Encoding header field (ref. rfc7231#5.3.4) uses HTTP content negotiation which contains a bit of ambiguity regarding tie-breakers at the same qvalue.

My use-case is the Chrome browser currently sending the following Accept-Encoding header for most requests over an HTTPS connection:

Accept-Encoding: gzip, deflate, br

Because br (the official identifier for Brotli Compressed Data Format) has a better compression ratio than gzip or deflate, I'd like to tell Rack to prefer sending br-encoded content above the others.

The current implementation of #select_best_encoding interprets these encoding preferences in sorted-order, e.g.: "I prefer gzip most, next deflate, and br least", so gzip is returned.

This PR interprets these encoding preferences equally, e.g.: "I prefer gzip, deflate, and br equally, so send me whatever the server decides is best". The tie-breaker would be determined by the order of the encodings provided to #select_best_encoding, so #select_best_encoding(%w(br gzip identity), request.accept_encoding) for the above request would return br instead of gzip, as desired.

@HoneyryderChuck
Copy link

you shouldn't need this IMO, you just need to pass the available encodings in order of preference to the first argument (see its usage in the deflater middleware). I'm assuming you're br-encoding them in a middleware of yours, as rack doesn't handle brotli by default.

@wjordan
Copy link
Contributor Author

wjordan commented Jun 27, 2017

@HoneyryderChuck refer to the following test I've included in this PR:

helper.call(%w(compress gzip identity), [["gzip", 1.0], ["compress", 1.0]]).must_equal "compress"

If what you're saying is true, this test should pass in the current implementation, but it does not.

@HoneyryderChuck
Copy link

@wjordan , got it finally. I guess I'm not sure what to say here. As you stated, the RFC doesn't cover this case, and all other documentation found leans toward "undefined behaviour".

However, I'd keep it as it is. The order in which the client sends the encodings reflects preference (or browser defaults... ). Google, which champions brotli, also responds with a gzipped response in such a scenario, as with all of the other examples I've found. Can you provide me with URLs which match your desired behaviour?

AFAIK you can always skip the deflater middleware (most applications encode at the load balancer / on deploy time) and insert your own which handles brotli only. It might not be what you're looking after though.

@wjordan
Copy link
Contributor Author

wjordan commented Jun 28, 2017

Can you provide me with URLs which match your desired behaviour?

Facebook:

$ curl -sSL facebook.com -H "Accept-Encoding: gzip, deflate, br" -I | grep Content-Encoding
Content-Encoding: br

Google Fonts API:

$ curl -sSL fonts.googleapis.com -X GET -I -H "Accept-Encoding: gzip, deflate, br" | grep Content-Encoding
Content-Encoding: br

Dropbox (static resources):

$ curl -sSL https://cfl.dropboxstatic.com/static/css/main-vflch2mBM.css -X GET -I -H "Accept-Encoding: gzip, deflate, br" | grep Content-Encoding
Content-Encoding: br

LinkedIn (static resources):

$ curl -sSL https://static.licdn.com/sc/h/br/58iglxc1u5jqbwn0wedk02q21 -X GET -I -H "Accept-Encoding: gzip, deflate, br" | grep Content-Encoding
Content-Encoding: br

@HoneyryderChuck
Copy link

Thx for the links, I think I got it now. I've also reasoned about why the google main page isn't brotli'ing, and that might be because when it comes to dynamic content, the compression rate is beaten by the compress time(?).

So I think that your proposal makes sense.

@jeremyevans
Copy link
Contributor

This change seems reasonable to me. Can you please rebase against master and add documentation?

@wjordan
Copy link
Contributor Author

wjordan commented Jan 24, 2020

Can you please rebase against master and add documentation?

What kind of documentation do you have in mind for a small change like this? The only thing CONTRIBUTING suggests is to "Document any external behavior in the README" but that seems like overkill for this tiny detail.

(edit: Is updating the CHANGELOG what you had in mind?)

@jeremyevans
Copy link
Contributor

I was thinking adding a method rdoc comment (one doesn't currently exist). Unfortunately, rack doesn't document all existing methods, but we should start moving in that direction.

@ioquatix
Copy link
Member

ioquatix commented Feb 5, 2020

I will merge this by hand. Thanks for your effort.

@ioquatix ioquatix closed this Feb 5, 2020
@ioquatix ioquatix added this to the v2.2.0 milestone Feb 5, 2020
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Mar 20, 2020
Update ruby-rack to 2.2.2.


## [2.2.2] - 2020-02-11

### Fixed

- Fix incorrect `Rack::Request#host` value. ([#1591](rack/rack#1591), [@ioquatix](https://github.com/ioquatix))
- Revert `Rack::Handler::Thin` implementation. ([#1583](rack/rack#1583), [@jeremyevans](https://github.com/jeremyevans))
- Double assignment is still needed to prevent an "unused variable" warning. ([#1589](rack/rack#1589), [@kamipo](https://github.com/kamipo))
- Fix to handle same_site option for session pool. ([#1587](rack/rack#1587), [@kamipo](https://github.com/kamipo))

## [2.2.1] - 2020-02-09

### Fixed

- Rework `Rack::Request#ip` to handle empty `forwarded_for`. ([#1577](rack/rack#1577), [@ioquatix](https://github.com/ioquatix))

## [2.2.0] - 2020-02-08

### SPEC Changes

- `rack.session` request environment entry must respond to `to_hash` and return unfrozen Hash. ([@jeremyevans](https://github.com/jeremyevans))
- Request environment cannot be frozen. ([@jeremyevans](https://github.com/jeremyevans))
- CGI values in the request environment with non-ASCII characters must use ASCII-8BIT encoding. ([@jeremyevans](https://github.com/jeremyevans))
- Improve SPEC/lint relating to SERVER_NAME, SERVER_PORT and HTTP_HOST. ([#1561](rack/rack#1561), [@ioquatix](https://github.com/ioquatix))

### Added

- `rackup` supports multiple `-r` options and will require all arguments. ([@jeremyevans](https://github.com/jeremyevans))
- `Server` supports an array of paths to require for the `:require` option. ([@khotta](https://github.com/khotta))
- `Files` supports multipart range requests. ([@fatkodima](https://github.com/fatkodima))
- `Multipart::UploadedFile` supports an IO-like object instead of using the filesystem, using `:filename` and `:io` options. ([@jeremyevans](https://github.com/jeremyevans))
- `Multipart::UploadedFile` supports keyword arguments `:path`, `:content_type`, and `:binary` in addition to positional arguments. ([@jeremyevans](https://github.com/jeremyevans))
- `Static` supports a `:cascade` option for calling the app if there is no matching file. ([@jeremyevans](https://github.com/jeremyevans))
- `Session::Abstract::SessionHash#dig`. ([@jeremyevans](https://github.com/jeremyevans))
- `Response.[]` and `MockResponse.[]` for creating instances using status, headers, and body. ([@ioquatix](https://github.com/ioquatix))
- Convenient cache and content type methods for `Rack::Response`. ([#1555](rack/rack#1555), [@ioquatix](https://github.com/ioquatix))

### Changed

- `Request#params` no longer rescues EOFError. ([@jeremyevans](https://github.com/jeremyevans))
- `Directory` uses a streaming approach, significantly improving time to first byte for large directories. ([@jeremyevans](https://github.com/jeremyevans))
- `Directory` no longer includes a Parent directory link in the root directory index. ([@jeremyevans](https://github.com/jeremyevans))
- `QueryParser#parse_nested_query` uses original backtrace when reraising exception with new class. ([@jeremyevans](https://github.com/jeremyevans))
- `ConditionalGet` follows RFC 7232 precedence if both If-None-Match and If-Modified-Since headers are provided. ([@jeremyevans](https://github.com/jeremyevans))
- `.ru` files supports the `frozen-string-literal` magic comment. ([@eregon](https://github.com/eregon))
- Rely on autoload to load constants instead of requiring internal files, make sure to require 'rack' and not just 'rack/...'. ([@jeremyevans](https://github.com/jeremyevans))
- `Etag` will continue sending ETag even if the response should not be cached. ([@henm](https://github.com/henm))
- `Request#host_with_port` no longer includes a colon for a missing or empty port. ([@AlexWayfer](https://github.com/AlexWayfer))
- All handlers uses keywords arguments instead of an options hash argument. ([@ioquatix](https://github.com/ioquatix))
- `Files` handling of range requests no longer return a body that supports `to_path`, to ensure range requests are handled correctly. ([@jeremyevans](https://github.com/jeremyevans))
- `Multipart::Generator` only includes `Content-Length` for files with paths, and `Content-Disposition` `filename` if the `UploadedFile` instance has one. ([@jeremyevans](https://github.com/jeremyevans))
- `Request#ssl?` is true for the `wss` scheme (secure websockets). ([@jeremyevans](https://github.com/jeremyevans))
- `Rack::HeaderHash` is memoized by default. ([#1549](rack/rack#1549), [@ioquatix](https://github.com/ioquatix))
- `Rack::Directory` allow directory traversal inside root directory. ([#1417](rack/rack#1417), [@ThomasSevestre](https://github.com/ThomasSevestre))
- Sort encodings by server preference. ([#1184](rack/rack#1184), [@ioquatix](https://github.com/ioquatix), [@wjordan](https://github.com/wjordan))
- Rework host/hostname/authority implementation in `Rack::Request`. `#host` and `#host_with_port` have been changed to correctly return IPv6 addresses formatted with square brackets, as defined by [RFC3986](https://tools.ietf.org/html/rfc3986#section-3.2.2). ([#1561](rack/rack#1561), [@ioquatix](https://github.com/ioquatix))
- `Rack::Builder` parsing options on first `#\` line is deprecated. ([#1574](rack/rack#1574), [@ioquatix](https://github.com/ioquatix))

### Removed

- `Directory#path` as it was not used and always returned nil. ([@jeremyevans](https://github.com/jeremyevans))
- `BodyProxy#each` as it was only needed to work around a bug in Ruby <1.9.3. ([@jeremyevans](https://github.com/jeremyevans))
- `URLMap::INFINITY` and `URLMap::NEGATIVE_INFINITY`, in favor of `Float::INFINITY`. ([@ch1c0t](https://github.com/ch1c0t))
- Deprecation of `Rack::File`. It will be deprecated again in rack 2.2 or 3.0. ([@rafaelfranca](https://github.com/rafaelfranca))
- Support for Ruby 2.2 as it is well past EOL. ([@ioquatix](https://github.com/ioquatix))
- Remove `Rack::Files#response_body` as the implementation was broken. ([#1153](rack/rack#1153), [@ioquatix](https://github.com/ioquatix))
- Remove `SERVER_ADDR` which was never part of the original SPEC. ([#1573](rack/rack#1573), [@ioquatix](https://github.com/ioquatix))

### Fixed

- `Directory` correctly handles root paths containing glob metacharacters. ([@jeremyevans](https://github.com/jeremyevans))
- `Cascade` uses a new response object for each call if initialized with no apps. ([@jeremyevans](https://github.com/jeremyevans))
- `BodyProxy` correctly delegates keyword arguments to the body object on Ruby 2.7+. ([@jeremyevans](https://github.com/jeremyevans))
- `BodyProxy#method` correctly handles methods delegated to the body object. ([@jeremyevans](https://github.com/jeremyevans))
- `Request#host` and `Request#host_with_port` handle IPv6 addresses correctly. ([@AlexWayfer](https://github.com/AlexWayfer))
- `Lint` checks when response hijacking that `rack.hijack` is called with a valid object. ([@jeremyevans](https://github.com/jeremyevans))
- `Response#write` correctly updates `Content-Length` if initialized with a body. ([@jeremyevans](https://github.com/jeremyevans))
- `CommonLogger` includes `SCRIPT_NAME` when logging. ([@Erol](https://github.com/Erol))
- `Utils.parse_nested_query` correctly handles empty queries, using an empty instance of the params class instead of a hash. ([@jeremyevans](https://github.com/jeremyevans))
- `Directory` correctly escapes paths in links. ([@yous](https://github.com/yous))
- `Request#delete_cookie` and related `Utils` methods handle `:domain` and `:path` options in same call. ([@jeremyevans](https://github.com/jeremyevans))
- `Request#delete_cookie` and related `Utils` methods do an exact match on `:domain` and `:path` options. ([@jeremyevans](https://github.com/jeremyevans))
- `Static` no longer adds headers when a gzipped file request has a 304 response. ([@chooh](https://github.com/chooh))
- `ContentLength` sets `Content-Length` response header even for bodies not responding to `to_ary`. ([@jeremyevans](https://github.com/jeremyevans))
- Thin handler supports options passed directly to `Thin::Controllers::Controller`. ([@jeremyevans](https://github.com/jeremyevans))
- WEBrick handler no longer ignores `:BindAddress` option. ([@jeremyevans](https://github.com/jeremyevans))
- `ShowExceptions` handles invalid POST data. ([@jeremyevans](https://github.com/jeremyevans))
- Basic authentication requires a password, even if the password is empty. ([@jeremyevans](https://github.com/jeremyevans))
- `Lint` checks response is array with 3 elements, per SPEC. ([@jeremyevans](https://github.com/jeremyevans))
- Support for using `:SSLEnable` option when using WEBrick handler. (Gregor Melhorn)
- Close response body after buffering it when buffering. ([@ioquatix](https://github.com/ioquatix))
- Only accept `;` as delimiter when parsing cookies. ([@mrageh](https://github.com/mrageh))
- `Utils::HeaderHash#clear` clears the name mapping as well. ([@raxoft](https://github.com/raxoft))
- Support for passing `nil` `Rack::Files.new`, which notably fixes Rails' current `ActiveStorage::FileServer` implementation. ([@ioquatix](https://github.com/ioquatix))

### Documentation

- CHANGELOG updates. ([@Aupajo](https://github.com/aupajo))
- Added [CONTRIBUTING](CONTRIBUTING.md). ([@dblock](https://github.com/dblock))

## [2.1.2] - 2020-01-27

- Fix multipart parser for some files to prevent denial of service ([@aiomaster](https://github.com/aiomaster))
- Fix `Rack::Builder#use` with keyword arguments ([@kamipo](https://github.com/kamipo))
- Skip deflating in Rack::Deflater if Content-Length is 0 ([@jeremyevans](https://github.com/jeremyevans))
- Remove `SessionHash#transform_keys`, no longer needed ([@pavel](https://github.com/pavel))
- Add to_hash to wrap Hash and Session classes ([@oleh-demyanyuk](https://github.com/oleh-demyanyuk))
- Handle case where session id key is requested but missing ([@jeremyevans](https://github.com/jeremyevans))

## [2.1.1] - 2020-01-12

- Remove `Rack::Chunked` from `Rack::Server` default middleware. ([#1475](rack/rack#1475), [@ioquatix](https://github.com/ioquatix))

## 2.1.0

_Note: There are many unreleased changes in Rack (`master` is around 300 commits ahead of `2-0-stable`), and below is not an exhaustive list. If you would like to help out and document some of the unreleased changes, PRs are welcome._

### Added

- Add support for `SameSite=None` cookie value. ([@hennikul](https://github.com/hennikul))
- Add trailer headers. ([@eileencodes](https://github.com/eileencodes))
- Add MIME Types for video streaming. ([@styd](https://github.com/styd))
- Add MIME Type for WASM. ([@buildrtech](https://github.com/buildrtech))
- Add `Early Hints(103)` to status codes. ([@egtra](https://github.com/egtra))
- Add `Too Early(425)` to status codes. ([@y-yagi]((https://github.com/y-yagi)))
- Add `Bandwidth Limit Exceeded(509)` to status codes. ([@CJKinni](https://github.com/CJKinni))
- Add method for custom `ip_filter`. ([@svcastaneda](https://github.com/svcastaneda))
- Add boot-time profiling capabilities to `rackup`. ([@tenderlove](https://github.com/tenderlove))
- Add multi mapping support for `X-Accel-Mappings` header. ([@yoshuki](https://github.com/yoshuki))
- Add `sync: false` option to `Rack::Deflater`. (Eric Wong)
- Add `Builder#freeze_app` to freeze application and all middleware instances. ([@jeremyevans](https://github.com/jeremyevans))
- Add API to extract cookies from `Rack::MockResponse`. ([@petercline](https://github.com/petercline))

### Changed

- Don't propagate nil values from middleware. ([@ioquatix](https://github.com/ioquatix))
- Lazily initialize the response body and only buffer it if required. ([@ioquatix](https://github.com/ioquatix))
- Fix deflater zlib buffer errors on empty body part. ([@felixbuenemann](https://github.com/felixbuenemann))
- Set `X-Accel-Redirect` to percent-encoded path. ([@diskkid](https://github.com/diskkid))
- Remove unnecessary buffer growing when parsing multipart. ([@tainoe](https://github.com/tainoe))
- Expand the root path in `Rack::Static` upon initialization. ([@rosenfeld](https://github.com/rosenfeld))
- Make `ShowExceptions` work with binary data. ([@axyjo](https://github.com/axyjo))
- Use buffer string when parsing multipart requests. ([@janko-m](https://github.com/janko-m))
- Support optional UTF-8 Byte Order Mark (BOM) in config.ru. ([@mikegee](https://github.com/mikegee))
- Handle `X-Forwarded-For` with optional port. ([@dpritchett](https://github.com/dpritchett))
- Use `Time#httpdate` format for Expires, as proposed by RFC 7231. ([@nanaya](https://github.com/nanaya))
- Make `Utils.status_code` raise an error when the status symbol is invalid instead of `500`. ([@adambutler](https://github.com/adambutler))
- Rename `Request::SCHEME_WHITELIST` to `Request::ALLOWED_SCHEMES`.
- Make `Multipart::Parser.get_filename` accept files with `+` in their name. ([@lucaskanashiro](https://github.com/lucaskanashiro))
- Add Falcon to the default handler fallbacks. ([@ioquatix](https://github.com/ioquatix))
- Update codebase to avoid string mutations in preparation for `frozen_string_literals`. ([@pat](https://github.com/pat))
- Change `MockRequest#env_for` to rely on the input optionally responding to `#size` instead of `#length`. ([@janko](https://github.com/janko))
- Rename `Rack::File` -> `Rack::Files` and add deprecation notice. ([@postmodern](https://github.com/postmodern)).

### Removed

- Remove `to_ary` from Response ([@tenderlove](https://github.com/tenderlove))
- Deprecate `Rack::Session::Memcache` in favor of `Rack::Session::Dalli` from dalli gem ([@fatkodima](https://github.com/fatkodima))

### Documentation

- Update broken example in `Session::Abstract::ID` documentation. ([tonytonyjan](https://github.com/tonytonyjan))
- Add Padrino to the list of frameworks implmenting Rack. ([@wikimatze](https://github.com/wikimatze))
- Remove Mongrel from the suggested server options in the help output. ([@tricknotes](https://github.com/tricknotes))
- Replace `HISTORY.md` and `NEWS.md` with `CHANGELOG.md`. ([@twitnithegirl](https://github.com/twitnithegirl))
- Backfill `CHANGELOG.md` from 2.0.1 to 2.0.7 releases. ([@drenmi](https://github.com/Drenmi))

## [2.0.8] - 2019-12-08

- [[CVE-2019-16782](https://nvd.nist.gov/vuln/detail/CVE-2019-16782)] Prevent timing attacks targeted at session ID lookup. BREAKING CHANGE: Session ID is now a SessionId instance instead of a String. ([@tenderlove](https://github.com/tenderlove), [@rafaelfranca](https://github.com/rafaelfranca))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants