Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature #1282 - Support serving statically compressed .gz and .br files #1289

Merged
merged 17 commits into from Dec 19, 2016

Conversation

wendigo
Copy link

@wendigo wendigo commented Dec 5, 2016

This change adds support for serving pre-compressed files (.gz for gzip encoding and .br for brotli).

How it works:

If file.x is requested and request's Accept-Encoding contains br, try to serve file.x.br instead if it exists. Vary: Content-Encoding header and Content-Encoding: br headers are then added to response.

If file.x is requested and request's Accept-Encoding contains gzip, try to serve file.x.gz instead if it exists. Vary: Content-Encoding header and Content-Encoding: gzip headers are then added to response.

This works for index files as well 😁

It also modifies gzip plugin so it no longer removes Accept-Encoding header from requests when gzip is enabled. This allows various other plugins in the chain (like fastcgi or proxy) to return compressed responses that will be omitted entirely by gzip plugin (backend can emit already compressed files).

Tip 1.: when manually testing gzip files with gzip -k filename
Tip 2.: when using git plugin on linux all files can be gzipped using then_long find ROOT -type f -exec gzip -kf {} ;

@wendigo
Copy link
Author

wendigo commented Dec 5, 2016

Btw. test cases in fileserver_test.go were broken.

@rickb777
Copy link

rickb777 commented Dec 5, 2016

Please don't merge this. The algorithm for matching Content-Encoding is inadequate and the response headers are not all set correctly (Vary is missing for example).

@wendigo
Copy link
Author

wendigo commented Dec 5, 2016

@Rick777: Vary is added (https://github.com/mholt/caddy/pull/1289/files#diff-8aa323f93179b21d27b52fe43c5e3193R141 Line 141).

Please clarify what do you mean by "algoritm for matching..." so I can fix that - it's pretty the same as in gzip plugin.

@wendigo
Copy link
Author

wendigo commented Dec 5, 2016

P.S. Appveyor is broken due to: remote: Daily bandwidth rate limit exceeded for IP 74.205.54.20

@wendigo wendigo changed the title Feature #1282 - Support pre-gzipped files Feature #1282 - Support serving statically compressed .gz and .br files Dec 5, 2016
@wendigo
Copy link
Author

wendigo commented Dec 7, 2016

Waiting for review & thoughts @mholt

@mholt
Copy link
Member

mholt commented Dec 8, 2016

I will look at it -- school is ending, gimme a few days, sorry. :)

Copy link
Member

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing this. LOVE the added tests.

@@ -25,6 +25,14 @@ func (l LengthFilter) ShouldCompress(w http.ResponseWriter) bool {
return l != 0 && int64(l) <= length
}

// SkipGzipped is ResponseFilter that will discard already gzipped files
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, it will skip compressing responses that have a non-empty Content-Encoding header; it could be any value...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've added brotli support after initialy adding gzip support only. Name is misleading. I will change that :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed this filter so it now checks if response is compressed

@@ -106,6 +106,8 @@ func gzipParse(c *caddy.Controller) ([]Config, error) {
config.RequestFilters = append(config.RequestFilters, DefaultExtFilter())
}

config.ResponseFilters = append(config.ResponseFilters, SkipGzipped{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is extremely important now that the Accept-Encoding header stays in the request, so add a test to ensure this response filter is always added.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test added

w.Header().Add("Vary", "Accept-Encoding")
w.Header().Set("Content-Encoding", encoding)

break encodings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you intend for this to be a labeled break? I guess I like that it's readable, but it's not a nested loop.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right - it's redundant.


stat, err := gf.Stat()
if err == nil {
d = stat
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is filename updated properly in this case, when d is re-assigned?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It don't need to. Filename is only needed so the http.ServeContent will return valid mime-type - it has to point to original filename, not compressed one :)
So: http.ServeContent(w, r, "index.html", compressedIndex.ModTime(), compressedIndexFile) is intended :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, indeed

@mholt
Copy link
Member

mholt commented Dec 11, 2016

@rickb777 Have your concerns been addressed, by having the Vary header added?

@wendigo
Copy link
Author

wendigo commented Dec 15, 2016

Any more thoughts @mholt ?

Copy link
Member

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can double-check my logic on this. We're almost there I think!

if strings.Contains(r.Header.Get("Accept-Encoding"), encoding) {
gf, err := fs.Root.Open(location + staticEncoding[encoding])
if err == nil {
defer gf.Close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this defer could potentially leak file descriptors. If a file takes a long time to download (network congestion or client leaves connection open or something), this handler may never terminate and the loop that we're in could open multiple files that never end up being used.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you suggest then?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see that now - I'm calling defer inside loop - I've fixed that :)


stat, err := gf.Stat()
if err == nil {
d = stat
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, indeed

@nhooyr
Copy link

nhooyr commented Dec 18, 2016

@mholt what do you think of gzipping files that can be gzipped (json/css/html/javascript) and storing them in memory and serving from there? This won't pollute the file system.

@wendigo
Copy link
Author

wendigo commented Dec 18, 2016

@nhooyr: I'm personally against - it will require synchronization between original and compressed image (in memory) - so invalidation of cache, memory hard limits and other things. There is also no big difference between serving file from memory and from disk - kernel will cache recently accessed files anyway.

@nhooyr
Copy link

nhooyr commented Dec 18, 2016

@wendigo I didn't mean that we should do it because of the minimal performance gains, it's just that it's a lot easier as you don't have to manually gzip the files manually and pollute your file system. You can just have caddy do it and update them whenever necessary.

@wendigo
Copy link
Author

wendigo commented Dec 18, 2016

@nhooyr you don't have to manually gzip files - a lot of javascript builders (like webpack) can output gzipped files for you. This change is about serving pre-compressed file if it exists, not compression per se.


if err == nil {
// Close previous file - release fd
f.Close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is closing f a second time problematic? It's already closed above with a defer.

Copy link
Author

@wendigo wendigo Dec 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've checked and in case that file was previously closed Close() will return error (ErrInvalid - invalid argument). As we don't check Close() return error calling it is no-op. My intention was to close file as soon as possible to release underlying file descriptor.

Copy link

@lhecker lhecker Dec 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wendigo @mholt You could also replace this line with the following:

defer func() { _ = f.Close() }()

That way no file handle is closed twice...
...as long as you remember to always close f before assigning it a new value, but considering how "dangerous" Go is anyways already I'd consider such a requirement quite normal. Calling Close() on already closed handles is not a no-op btw. 😅

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wendigo Did you mean this line? That would generate a EINVAL error. Otherwise I wouldn't know how the *File pointer would be nil.

Well either way it's nearly a no-op. In that case I'm sorry for "accusing" you. I know that my opinion doesn't "really" count since I'm neither maintainer nor contributor of this PR, but in the end I'd still say that the critique of the double-close is valid - on one hand because defer is on the heavy side of inbuilt operations and on the other because it appears to be cleanlier in my opinion if you reduce the amount of duplicate (apparent) work. Kinda like how you use bounded channels in Go instead of unbounded ones.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I'd rather have one too many calls to Close() than one too few. We can clean up things as needed later (as long as we have sufficient tests)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lhecker totally agree. By writing "no-op" I meant "leaving *File in the same, terminal state". As we are not checking Close() error anyway calling Close() multiple times has no side effects.

Copy link
Member

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating on this! The static file server is a really critical piece of code so I want it to be just right (and agreeable!). 😊

break
}

gf.Close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop is constructed a bit "deep" in my opinion. Can we do this instead, I find this much easier to follow (untested, please try it out if you like it!):

// see if there is a compressed version of the file available; if so, use it
for _, encoding := range staticEncodingPriority {
	if !strings.Contains(r.Header.Get("Accept-Encoding"), encoding) {
		continue
	}
	encodedFile, err := fs.Root.Open(location + staticEncoding[encoding])
	if err != nil {
		encodedFile.Close()
		continue
	}
	encodedFileInfo, err := encodedFile.Stat()
	if err != nil {
		encodedFile.Close()
		continue
	}
	
	f.Close() // release fd of previously-selected file
	f = encodedFile
	d = encodedFileInfo

	w.Header().Add("Vary", "Accept-Encoding")
	w.Header().Set("Content-Encoding", encoding)

	defer f.Close()
	break
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's easier to follow 👍 First encodedFile.Close() is not needed as Open resulted in error :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored also one extra loop - when iterating over IndexPages - same comments apply

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about the first Close(), thanks.


if err == nil {
// Close previous file - release fd
f.Close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I'd rather have one too many calls to Close() than one too few. We can clean up things as needed later (as long as we have sufficient tests)

var staticEncodingPriority = []string{
"br",
"gzip",
}
Copy link
Member

@mholt mholt Dec 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you comment on what these two variables (the map and the slice) do? godoc style (I know they're unexported, but it's not obvious what they're used for, since this isn't the compression module.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

wendigo and others added 5 commits December 19, 2016 08:22
…1278)

* Fix for #1276

* Use strconv.Format

* Use map[string]interface{} as variables

* One more file

* Always run all tests before commit

* Get rid of DocFlags
* Convert rwc field on FCGIClient from type io.ReadWriteCloser to net.Conn.
* Return HTTP 504 to the client when a timeout occurs.
* In Handler.ServeHTTP(), close the connection before returning an HTTP
502/504.
* Refactor tests and add coverage.
@mholt mholt merged commit 54c6300 into master Dec 19, 2016
@mholt
Copy link
Member

mholt commented Dec 19, 2016

Thank you very much for your work, Mateusz! :)

@wendigo
Copy link
Author

wendigo commented Dec 19, 2016

Thx! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants