
Revised fileserver Accept-Encoding and ETag #1432

Closed
wants to merge 5 commits into from

Conversation

rickb777

The parsing of Accept-Encoding was inadequate, not recognising word boundaries. The returned ETag was not using strong and weak variants correctly.

This PR addresses these issues and strengthens the related tests.
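
For illustration, token-aware parsing might look roughly like the sketch below (not this PR's exact code; acceptsEncoding is a hypothetical name). A plain substring test would wrongly match gzip inside a longer token such as x-gzip, so the header is split into tokens first.

package main

import "strings"

// acceptsEncoding reports whether the Accept-Encoding header value lists
// the given encoding as a complete token (hypothetical helper, sketch only).
func acceptsEncoding(header, encoding string) bool {
	for _, field := range strings.Split(header, ",") {
		token := strings.TrimSpace(field)
		// Strip a quality parameter such as "gzip;q=0.8" down to "gzip".
		// (A full implementation would treat q=0 as a refusal.)
		if i := strings.IndexByte(token, ';'); i >= 0 {
			token = strings.TrimSpace(token[:i])
		}
		if strings.EqualFold(token, encoding) {
			return true
		}
	}
	return false
}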

@CLAassistant

CLAassistant commented Feb 13, 2017

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ rickb777
❌ Rick


Rick does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

@rickb777
Author

I signed the CLA. It claims I didn't. Fooey.

@tpng

tpng commented Feb 14, 2017

It seems you need to add the email address used in the commit to your GitHub account:
rickb777@github.com
https://github.com/settings/emails

@mholt
Member

mholt commented Feb 14, 2017

Hm, yeah, the CLA tool doesn't work if your email isn't on your GitHub profile. It's fine though, I can override that check.

@mholt (Member) left a comment


Thanks for the PR! I just have a question that I inlined.

// calculateEtag produces a strong etag by default. Prefix the result with
// "W/" to convert this into a weak one.
// See https://tools.ietf.org/html/rfc7232#section-2.3
func calculateEtag(d os.FileInfo) string {
	return fmt.Sprintf(`"%x-%x"`, d.ModTime().Unix(), d.Size())
}
Member

This is a strong ETag, but we don't use the actual bytes of the file to produce it. Will that be a problem? (FWIW, I like not reading over the whole file for efficiency purposes.)

Author

This is a good question. Generally, I think you're right to be more concerned about the inefficiency of scanning the whole file to compute a hash. And the use of timestamp+length is not a complete guarantee that the file contents haven't changed.

But the bullish high-performance argument is that timestamp+length is good enough in all normal use-cases. This is what most people will be happy with, I think, and is what Nginx does. After all, strong etags lead to more aggressive caching, boosting perceived performance significantly.

Thinking about this a little more, the same argument suggests that the compressed artefacts should also have strong etags (based on timestamp+length).

If you think Caddy should be more accommodating, perhaps we can provide an option to force the etags to be weak, or to disable them completely.

And there's more work to be done on 304s.

So I am now thinking that maybe this needs a little more work to

a) use strong etags for all served resources by default
b) provide an option to use weak etags instead when required.
c) implement 304-not-modified checks on If-None-Match conditional requests (sketched below).

I'll take another look tonight.

[also of interest: PR191 https://github.com//issues/191]
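
For (c), the If-None-Match check might look roughly like this (a sketch, not this PR's code; checkETag is a hypothetical name, and it assumes net/http and strings are imported):

// checkETag answers an If-None-Match conditional request with 304 Not
// Modified when the client's cached etag still matches (sketch only).
func checkETag(w http.ResponseWriter, r *http.Request, etag string) bool {
	inm := r.Header.Get("If-None-Match")
	if inm == "" {
		return false
	}
	for _, candidate := range strings.Split(inm, ",") {
		candidate = strings.TrimSpace(candidate)
		// RFC 7232 section 3.2: If-None-Match uses weak comparison, so a
		// "W/" prefix on either etag is ignored.
		if candidate == "*" ||
			strings.TrimPrefix(candidate, "W/") == strings.TrimPrefix(etag, "W/") {
			w.Header().Set("ETag", etag)
			w.WriteHeader(http.StatusNotModified)
			return true
		}
	}
	return false
}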

Author

I just checked Nginx 1.10.1:

  • when Accept-Encoding:gzip is sent and gzip_static is on, the precompressed content is served if available.
  • when Accept-Encoding:gzip is missing or gzip_static is off or the precompressed content is not available, the uncompressed content is served.
  • the ETag is computed from the timestamp+length algorithm in both cases above.
  • strong ETags are always used.
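
In Go, that behaviour translates into roughly the following sketch (assumed names only: calculateEtag is the function from this PR, acceptsEncoding the parsing sketch above; assumes net/http and os are imported):

// serveStatic mirrors the gzip_static behaviour above: serve the
// precompressed sibling when the client accepts gzip and it exists,
// otherwise the plain file. Either way the ETag comes from the served
// file's timestamp+length, so it is strong in both cases.
func serveStatic(w http.ResponseWriter, r *http.Request, path string) {
	if acceptsEncoding(r.Header.Get("Accept-Encoding"), "gzip") {
		if info, err := os.Stat(path + ".gz"); err == nil {
			w.Header().Set("Content-Encoding", "gzip")
			w.Header().Set("ETag", calculateEtag(info))
			// A real implementation would also set Content-Type from the
			// original name, since ServeFile would infer it from ".gz".
			http.ServeFile(w, r, path+".gz")
			return
		}
	}
	if info, err := os.Stat(path); err == nil {
		w.Header().Set("ETag", calculateEtag(info))
	}
	http.ServeFile(w, r, path)
}

As a side note, recent versions of net/http already honour If-None-Match when the ETag response header is set before calling ServeFile, which would cover point (c) for files served this way.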

Member

> But the bullish high-performance argument is that timestamp+length is good enough in all normal use-cases.

Because if the contents of the file change, it is highly likely that the timestamp or length will change. I agree that's reasonable in pretty much every case I can think of.

> Thinking about this a little more, the same argument suggests that the compressed artefacts should also have strong etags (based on timestamp+length).

Sure, I don't see why the caching on those needs to be treated any differently? It's still a static file.

Let's try strong ETags then.
