Using Rack::ETag and Rack::Deflater together does not follow spec #577

Closed
3 participants

@mlangenberg

ETags come in two kinds: weak and strong validators. A weak ETag validator looks like this: ETag: W/"123456789".

According to the spec, if two responses have the same strong ETag, their bodies should be byte-for-byte identical.

Since md5(gzip(body)) != md5(body), a request with the Accept-Encoding: gzip header should have a different ETag. And that is for example why Nginx clears the ETag header from responses when compressing on the fly. Makes sense, right?
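To make the inequality concrete, here is a minimal sketch using only Ruby's standard library. It compresses a body in memory and compares the two digests. The variable names are illustrative and nothing here is Rack code.

```ruby
require "digest"
require "stringio"
require "zlib"

body = "OK"

# Gzip the body in memory; the StringIO stands in for the response stream.
io = StringIO.new("".b)
gz = Zlib::GzipWriter.new(io)
gz.write(body)
gz.close
compressed = io.string

plain_digest = Digest::MD5.hexdigest(body)
gzip_digest  = Digest::MD5.hexdigest(compressed)

# The two representations hash differently, so one strong ETag cannot cover both.
puts "identity: #{plain_digest}"
puts "gzip:     #{gzip_digest}"
puts "same digest? #{plain_digest == gzip_digest}"
```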

Back to Rack. In order to comply, we need to do the HTTP compression first, and generate the ETag over the compressed data.

 use Rack::ETag
 use Rack::Deflater
 run proc { |env| [200, {"Content-Type" => "text/plain"}, ["OK"]] }

The response will include a strong ETag validator. However, the gzip format has a mandatory MTIME field, which Rack::Deflater sets to Time.now, so the ETag will differ from one request to the next.
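The effect of the MTIME field can be reproduced outside Rack. The sketch below uses a hypothetical helper, gzip_with_mtime, and two arbitrary timestamps one second apart; it is not Rack::Deflater's code, only an illustration of why a digest over the compressed body changes.

```ruby
require "digest"
require "stringio"
require "zlib"

# Hypothetical helper: gzip +data+ in memory with an explicit MTIME (Unix seconds).
def gzip_with_mtime(data, mtime)
  io = StringIO.new("".b)
  gz = Zlib::GzipWriter.new(io)
  gz.mtime = mtime
  gz.write(data)
  gz.close
  io.string
end

first  = gzip_with_mtime("OK", 1_371_427_200)
second = gzip_with_mtime("OK", 1_371_427_201) # same body, one second later

# MTIME is a little-endian 32-bit integer stored at bytes 4..7 of the gzip header.
puts first.byteslice(4, 4).unpack1("V")
puts second.byteslice(4, 4).unpack1("V")

# Only those four bytes differ, yet the digests of the two streams no longer match.
puts Digest::MD5.hexdigest(first) == Digest::MD5.hexdigest(second)
```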

When using the Rack middleware the other way around:

 use Rack::Deflater
 use Rack::ETag
 run proc { |env| [200, {"Content-Type" => "text/plain"}, ["OK"]] }

We fail to comply with the spec, because a request with Accept-Encoding: gzip and one without will both get the same strong ETag validator in the response header.

Solution 1

Calculate the strong ETag from the compressed response body, but set the MTIME to 0 (Unix epoch), which the gzip format allows. It is good that Rack::Deflater uses the Last-Modified header for the MTIME when it is present, but that header is often not set for dynamic responses.

MTIME (Modification TIME)
This gives the most recent modification time of the original file being compressed. The time is in Unix format, i.e., seconds since 00:00:00 GMT, Jan. 1, 1970. (Note that this may cause problems for MS-DOS and other systems that use local rather than Universal time.) If the compressed data did not come from a file, MTIME is set to the time at which compression started. MTIME = 0 means no time stamp is available.

This requires a small change to Rack::Deflater.
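A rough sketch of the idea, outside Rack. The helpers gzip_mtime and deflate_body are hypothetical names, not Rack::Deflater's real methods, and the sketch assumes a Ruby whose Zlib::GzipWriter writes an explicitly assigned mtime of 0 as-is.

```ruby
require "digest"
require "stringio"
require "time"
require "zlib"

# Hypothetical helper: take the gzip MTIME from Last-Modified when present,
# otherwise fall back to 0, which the gzip format defines as "no time stamp".
def gzip_mtime(headers)
  last_modified = headers["Last-Modified"]
  last_modified ? Time.httpdate(last_modified).to_i : 0
end

# Hypothetical helper: gzip the body in memory using the MTIME chosen above.
def deflate_body(headers, body)
  io = StringIO.new("".b)
  gz = Zlib::GzipWriter.new(io)
  gz.mtime = gzip_mtime(headers)
  gz.write(body)
  gz.close
  io.string
end

headers = { "Content-Type" => "text/plain" } # no Last-Modified, as for a dynamic response
a = deflate_body(headers, "OK")
b = deflate_body(headers, "OK")

# With a fixed MTIME, a digest over the compressed body is stable across requests.
puts Digest::MD5.hexdigest(a) == Digest::MD5.hexdigest(b)
```

With this in place, an ETag computed over the compressed bytes only changes when the body (or its Last-Modified time) changes.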

Solution 2

In other situations Rack::ETag runs before the response body is compressed. This is the case when Rack::Deflater is mounted before Rack::ETag, or when Nginx is set up to do HTTP compression on the fly.

If we suspect that the body (or headers) are going to change after they have run through Rack::ETag, we should not add a strong ETag validator to the response. In that case it is better to use a weak ETag validator.

For instance, if Accept-Encoding is set on the request but Content-Encoding is not yet set on the response, we could simply do headers['ETag'] = %(W/"#{digest}").

Rack::Deflater will do its thing a moment later, and change the response body, but the ETag is still a perfectly valid Weak ETag Validator.
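As a sketch of that decision, assuming the body is already available as a single string. The helper etag_value is hypothetical and is not how Rack::ETag is actually structured; HTTP_ACCEPT_ENCODING is the Rack env key for the Accept-Encoding request header.

```ruby
require "digest"

# Hypothetical helper: return the ETag header value for a response body.
# If the client accepts an encoding but none has been applied yet, something
# downstream may still compress the body, so only a weak validator is emitted.
def etag_value(env, headers, body)
  digest = Digest::MD5.hexdigest(body)
  may_be_encoded_later =
    env.key?("HTTP_ACCEPT_ENCODING") && !headers.key?("Content-Encoding")
  may_be_encoded_later ? %(W/"#{digest}") : %("#{digest}")
end

env     = { "HTTP_ACCEPT_ENCODING" => "gzip" }
headers = { "Content-Type" => "text/plain" }

puts etag_value(env, headers, "OK") # weak: compression may still happen
puts etag_value({}, headers, "OK")  # strong: no Accept-Encoding on the request
```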

Any thoughts?

mlangenberg added some commits Jun 17, 2013
@mlangenberg mlangenberg Verify that Rack::Deflater uses Last-Modified header in gzip MTIME header. #577
0a4a603
@mlangenberg mlangenberg Set the gzip MTIME header to 0 when no Last-Modified header is available. #577

Often strong ETag validators are set on response headers. For instance with Rack::ETag, the ETag is a digest of the response body.
It is a waste of bandwidth to set the current time in the gzip header, because it causes all ETags to expire every second.
Also, most HTTP clients automatically inflate the response body. The MTIME of the gzip header is not used anyway.

The argument for generating the ETag based on the uncompressed body is invalid, unless weak ETag validators are being used.
a5f958f
@raggi
Official Rack repositories member
raggi commented Jul 4, 2013

Issue accepted. I will take some time to review this soon.

@bdarfler

How is this going? I really would love to use both.

@mlangenberg

Any updates? This is just a small two-line change.

@raggi raggi added this to the Rack 1.6 milestone Jul 12, 2014
@raggi
Official Rack repositories member
raggi commented Aug 3, 2014

I added weak etags in master.

@raggi raggi closed this Aug 3, 2014
@mlangenberg

Thanks for reviewing the issue.
So in master, from now on, all ETags are weak?

Next step is to stop Nginx from stripping weak ETags when compression is enabled.
update: Nginx already made the appropriate changes:

  • e491b26fa5 - Entity tags: downgrade strong etags to weak ones as needed.
  • af229f8cf9 - Entity tags: weak comparison for If-None-Match.
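For reference, the weak comparison used for If-None-Match can be sketched as follows. This is a simplified illustration of the comparison rule in the HTTP conditional request spec, not Nginx's or Rack's code; the helper names are made up, and the comma split is naive (it assumes entity tags that contain no commas).

```ruby
# Strip the weakness marker, leaving only the quoted opaque tag.
def opaque_tag(etag)
  etag.sub(/\AW\//, "")
end

# Weak comparison: the opaque tags match, whether or not either tag is weak.
def weak_match?(a, b)
  opaque_tag(a) == opaque_tag(b)
end

# Strong comparison: neither tag is weak and they match exactly.
def strong_match?(a, b)
  !a.start_with?("W/") && !b.start_with?("W/") && a == b
end

# Hypothetical helper: does the If-None-Match request header match the current ETag?
def not_modified?(if_none_match, current_etag)
  return true if if_none_match.strip == "*"
  if_none_match.split(",").map(&:strip).any? { |tag| weak_match?(tag, current_etag) }
end

puts not_modified?(%(W/"abc", "def"), %("abc"))
puts strong_match?(%(W/"abc"), %("abc"))
```

Under weak comparison a client holding W/"abc" still gets a 304 when the server's current tag is "abc", which is what makes downgrading to weak ETags cache-friendly.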
@raggi
Official Rack repositories member
raggi commented Aug 4, 2014

Correct, we only digest the bodies, don't cache or hash header information, so this was the smallest reasonable change I could come up with. It'll still invalidate caches for some users on the first request after upgrade, but at least it shouldn't cause major regressions to their caching strategies in most cases.

Thanks for reporting!
