
Revised fileserver Accept-Encoding and ETag #1432

Closed
wants to merge 5 commits into from

Conversation

rickb777

The parsing of Accept-Encoding was inadequate, not recognising word boundaries. The returned ETag was not using strong and weak variants correctly.

This PR addresses these issues and strengthens the related tests.
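
For illustration, token-aware parsing might look roughly like the sketch below (not this PR's exact code; acceptsEncoding is a hypothetical name). A plain substring test would wrongly match gzip inside a longer token such as x-gzip, so the header is split into tokens first.

package main

import "strings"

// acceptsEncoding reports whether the Accept-Encoding header value lists
// the given encoding as a complete token (hypothetical helper, sketch only).
func acceptsEncoding(header, encoding string) bool {
	for _, field := range strings.Split(header, ",") {
		token := strings.TrimSpace(field)
		// Strip a quality parameter such as "gzip;q=0.8" down to "gzip".
		// (A full implementation would treat q=0 as a refusal.)
		if i := strings.IndexByte(token, ';'); i >= 0 {
			token = strings.TrimSpace(token[:i])
		}
		if strings.EqualFold(token, encoding) {
			return true
		}
	}
	return false
}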

@CLAassistant

CLAassistant commented Feb 13, 2017

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ rickb777
❌ Rick


Rick does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

@rickb777
Author

I signed the CLA. It claims I didn't. Fooey.

@tpng

tpng commented Feb 14, 2017

It seems you need to add the email address used in the commit to your GitHub account:
rickb777@github.com
https://github.com/settings/emails

@mholt
Member

mholt commented Feb 14, 2017

Hm, yeah, the CLA tool doesn't work if your email isn't on your GitHub profile. It's fine though, I can override that check.

@mholt (Member) left a comment


Thanks for the PR! I just have a question that I inlined.

// calculateEtag produces a strong etag by default. Prefix the result with
// "W/" to convert this into a weak one.
// See https://tools.ietf.org/html/rfc7232#section-2.3
func calculateEtag(d os.FileInfo) string {
	return fmt.Sprintf(`"%x-%x"`, d.ModTime().Unix(), d.Size())
}
Member

This is a strong ETag, but we don't use the actual bytes of the file to produce it. Will that be a problem? (FWIW, I like not reading over the whole file for efficiency purposes.)

Author

This is a good question. Generally, I think you're right to be more concerned about the inefficiency of scanning the whole file to compute a hash. And the use of timestamp+length is not a complete guarantee that the file contents haven't changed.

But the bullish high-performance argument is that timestamp+length is good enough in all normal use-cases. This is what most people will be happy with, I think, and is what Nginx does. After all, strong etags lead to more aggressive caching, boosting perceived performance significantly.

Thinking about this a little more, the same argument suggests that the compressed artefacts should also have strong etags (based on timestamp+length).

If you think Caddy should be more accommodating, perhaps we can provide an option to force the etags to be weak, or to disable them completely.

And there's more work to be done on 304s.

So I am now thinking that maybe this needs a little more work to

a) use strong etags for all served resources by default
b) provide an option to use weak etags instead when required.
c) implement 304-not-modified checks on If-None-Match conditional requests (sketched below).

I'll take another look tonight.

[also of interest: PR191 https://github.com//issues/191]
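
For (c), the If-None-Match check might look roughly like this (a sketch, not this PR's code; checkETag is a hypothetical name, and it assumes net/http and strings are imported):

// checkETag answers an If-None-Match conditional request with 304 Not
// Modified when the client's cached etag still matches (sketch only).
func checkETag(w http.ResponseWriter, r *http.Request, etag string) bool {
	inm := r.Header.Get("If-None-Match")
	if inm == "" {
		return false
	}
	for _, candidate := range strings.Split(inm, ",") {
		candidate = strings.TrimSpace(candidate)
		// RFC 7232 section 3.2: If-None-Match uses weak comparison, so a
		// "W/" prefix on either etag is ignored.
		if candidate == "*" ||
			strings.TrimPrefix(candidate, "W/") == strings.TrimPrefix(etag, "W/") {
			w.Header().Set("ETag", etag)
			w.WriteHeader(http.StatusNotModified)
			return true
		}
	}
	return false
}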

Author

I just checked Nginx 1.10.1:

  • when Accept-Encoding:gzip is sent and gzip_static is on, the precompressed content is served if available.
  • when Accept-Encoding:gzip is missing or gzip_static is off or the precompressed content is not available, the uncompressed content is served.
  • the ETag is computed from the timestamp+length algorithm in both cases above.
  • strong ETags are always used.
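
In Go, that behaviour translates into roughly the following sketch (assumed names only: calculateEtag is the function from this PR, acceptsEncoding the parsing sketch above; assumes net/http and os are imported):

// serveStatic mirrors the gzip_static behaviour above: serve the
// precompressed sibling when the client accepts gzip and it exists,
// otherwise the plain file. Either way the ETag comes from the served
// file's timestamp+length, so it is strong in both cases.
func serveStatic(w http.ResponseWriter, r *http.Request, path string) {
	if acceptsEncoding(r.Header.Get("Accept-Encoding"), "gzip") {
		if info, err := os.Stat(path + ".gz"); err == nil {
			w.Header().Set("Content-Encoding", "gzip")
			w.Header().Set("ETag", calculateEtag(info))
			// A real implementation would also set Content-Type from the
			// original name, since ServeFile would infer it from ".gz".
			http.ServeFile(w, r, path+".gz")
			return
		}
	}
	if info, err := os.Stat(path); err == nil {
		w.Header().Set("ETag", calculateEtag(info))
	}
	http.ServeFile(w, r, path)
}

As a side note, recent versions of net/http already honour If-None-Match when the ETag response header is set before calling ServeFile, which would cover point (c) for files served this way.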

Member

> But the bullish high-performance argument is that timestamp+length is good enough in all normal use-cases.

Because if the contents of the file change, it is highly likely that the timestamp or length will change. I agree that's reasonable in pretty much every case I can think of.

> Thinking about this a little more, the same argument suggests that the compressed artefacts should also have strong etags (based on timestamp+length).

Sure, I don't see why the caching on those needs to be treated any differently? It's still a static file.

Let's try strong ETags then.
