Configuration options for Rack::Deflate #457

Merged
merged 2 commits into from Aug 3, 2014

Conversation

Projects
None yet
10 participants
@jakubpawlowicz
Contributor

jakubpawlowicz commented Nov 14, 2012

Rack::Deflater currently does not support any configuration options which makes it cumbersome to control.

This pull request adds support for the following options:

  • include - an array of content types enabled for compression
  • if - a lambda for choosing whether to deflate based on current execution scope (request, status, body, or headers)

Examples:
use Rack::Deflater, include: %w(text/json application/json)
use Rack::Deflater, if: lambda { |env, status, headers, body| body.length > 512 }

EDIT: description was update based on a PR discussion

@maletor

This comment has been minimized.

Show comment Hide comment
@maletor

maletor Nov 15, 2012

Was this created with the intention of skipping image files with Deflator? Because that would be an excellent use case according to https://developers.google.com/speed/docs/best-practices/payload#GzipCompression.

maletor commented Nov 15, 2012

Was this created with the intention of skipping image files with Deflator? Because that would be an excellent use case according to https://developers.google.com/speed/docs/best-practices/payload#GzipCompression.

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Nov 15, 2012

Contributor

Yes, that's one use case I can think of. In general it's a bad idea to compress requests smaller than TCP packet size as it only adds an overhead on both ends.

This threshold feature can always be configured on the web server level (nginx, Apache) but in case of services which do not proxy requests through a web server (e.g. Heroku Cedar) having such option in Rack::Deflater is the only option available.

Contributor

jakubpawlowicz commented Nov 15, 2012

Yes, that's one use case I can think of. In general it's a bad idea to compress requests smaller than TCP packet size as it only adds an overhead on both ends.

This threshold feature can always be configured on the web server level (nginx, Apache) but in case of services which do not proxy requests through a web server (e.g. Heroku Cedar) having such option in Rack::Deflater is the only option available.

@maletor

This comment has been minimized.

Show comment Hide comment
@maletor

maletor Nov 15, 2012

Definitely important. Would love to see this get merged!

maletor commented Nov 15, 2012

Definitely important. Would love to see this get merged!

@maletor

This comment has been minimized.

Show comment Hide comment
@maletor

maletor Nov 15, 2012

TCP packet size is different for everybody? But in general, seems to be about 64K?

maletor commented Nov 15, 2012

TCP packet size is different for everybody? But in general, seems to be about 64K?

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Nov 15, 2012

Contributor

64kB is the theoretical upper limit for TCP/IP. In reality it depends on the connection type and does not exceed ~1 kB - still there's no need to compress such small amounts of data.

Contributor

jakubpawlowicz commented Nov 15, 2012

64kB is the theoretical upper limit for TCP/IP. In reality it depends on the connection type and does not exceed ~1 kB - still there's no need to compress such small amounts of data.

@bpinto

This comment has been minimized.

Show comment Hide comment
@bpinto

bpinto Dec 21, 2012

Is this ready to be merged?

bpinto commented Dec 21, 2012

Is this ready to be merged?

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Dec 21, 2012

Contributor

Not yet as some new specs are failing under 1.8.7 / jruby / ree. It will be ready to merge within the next 24 hours.

Contributor

jakubpawlowicz commented Dec 21, 2012

Not yet as some new specs are failing under 1.8.7 / jruby / ree. It will be ready to merge within the next 24 hours.

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Dec 22, 2012

Contributor

@bpinto it's ready for merge. I've refactored the tests a bit more, turned off deflating to see if all relevant tests fail (they do!) and reverted to a working state. Should be great now!

Contributor

jakubpawlowicz commented Dec 22, 2012

@bpinto it's ready for merge. I've refactored the tests a bit more, turned off deflating to see if all relevant tests fail (they do!) and reverted to a working state. Should be great now!

lib/rack/deflater.rb
@app = app
+
+ @min_content_length = options[:min_content_length] || options['min_content_length']

This comment has been minimized.

Show comment Hide comment
@raggi

raggi Dec 30, 2012

Owner
  • Lets just support symbol option keys, not strings, consistent with other middleware options.
  • Maybe min_size or min_length? Just a little shorter?
@raggi

raggi Dec 30, 2012

Owner
  • Lets just support symbol option keys, not strings, consistent with other middleware options.
  • Maybe min_size or min_length? Just a little shorter?

This comment has been minimized.

Show comment Hide comment
@raggi

raggi Dec 30, 2012

Owner

Lets default min-content-length to 1kb. There can be some advantages on mobile networks for smaller entities, but that domain has additional solutions and complexities - shared dictionaries are better, lets hope for them in http2.

@raggi

raggi Dec 30, 2012

Owner

Lets default min-content-length to 1kb. There can be some advantages on mobile networks for smaller entities, but that domain has additional solutions and complexities - shared dictionaries are better, lets hope for them in http2.

lib/rack/deflater.rb
@@ -5,19 +5,20 @@
module Rack
class Deflater
- def initialize(app)
+ def initialize(app, options = {})

This comment has been minimized.

Show comment Hide comment
@raggi

raggi Dec 30, 2012

Owner

Lets document the options, using RDoc style. :-)

@raggi

raggi Dec 30, 2012

Owner

Lets document the options, using RDoc style. :-)

lib/rack/deflater.rb
+
+ # Skip if response body is too short
+ if @min_content_length &&
+ @min_content_length > headers['Content-Length'].to_i

This comment has been minimized.

Show comment Hide comment
@raggi

raggi Dec 30, 2012

Owner

If we default this to some value, we can skip the nil check.

@raggi

raggi Dec 30, 2012

Owner

If we default this to some value, we can skip the nil check.

lib/rack/deflater.rb
+ # Skip if :include is provided and evaluates to false
+ if @include &&
+ @include.kind_of?(Regexp) &&
+ !@include.match(env['PATH_INFO'])

This comment has been minimized.

Show comment Hide comment
@raggi

raggi Dec 30, 2012

Owner

If we use env['PATH_INFO'] =~ @include then it's fine as long as @include is nil, or a regex, or a string. If necessary that could be sanitized in the constructor. This will crunch down all the boolean logic here.

@raggi

raggi Dec 30, 2012

Owner

If we use env['PATH_INFO'] =~ @include then it's fine as long as @include is nil, or a regex, or a string. If necessary that could be sanitized in the constructor. This will crunch down all the boolean logic here.

This comment has been minimized.

Show comment Hide comment
@raggi

raggi Dec 30, 2012

Owner

In fact, if you use === instead of =~ (change operation order) then you naturally get lambda support in 1.9+ as well as string and regex matching.

@raggi

raggi Dec 30, 2012

Owner

In fact, if you use === instead of =~ (change operation order) then you naturally get lambda support in 1.9+ as well as string and regex matching.

lib/rack/deflater.rb
+ # Skip if :exclude is provided and evaluates to true
+ if @exclude &&
+ @exclude.kind_of?(Regexp) &&
+ @exclude.match(env['PATH_INFO'])

This comment has been minimized.

Show comment Hide comment
@raggi

raggi Dec 30, 2012

Owner

As for @include

@raggi

raggi Dec 30, 2012

Owner

As for @include

@raggi

This comment has been minimized.

Show comment Hide comment
@raggi

raggi Dec 30, 2012

Owner

In principle this looks great. I left you some line comments, thanks for your work so far!

Owner

raggi commented Dec 30, 2012

In principle this looks great. I left you some line comments, thanks for your work so far!

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Jan 9, 2013

Contributor

Thanks @raggi for all suggestions! I've just applied them so it should be more ready to merge.

Contributor

jakubpawlowicz commented Jan 9, 2013

Thanks @raggi for all suggestions! I've just applied them so it should be more ready to merge.

@raggi

This comment has been minimized.

Show comment Hide comment
@raggi

raggi Jan 22, 2013

Owner

This needs rebasing on top of rack master. I'll be happy to merge this in the next release, but I am out of time for 1.5.0.

Owner

raggi commented Jan 22, 2013

This needs rebasing on top of rack master. I'll be happy to merge this in the next release, but I am out of time for 1.5.0.

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Jan 22, 2013

Contributor

That's a pity. I'll merge it as soon as possible to make it into 1.6.

Contributor

jakubpawlowicz commented Jan 22, 2013

That's a pity. I'll merge it as soon as possible to make it into 1.6.

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Jan 22, 2013

Contributor

@raggi That should be it!

Contributor

jakubpawlowicz commented Jan 22, 2013

@raggi That should be it!

@nathanaeljones

This comment has been minimized.

Show comment Hide comment
@nathanaeljones

nathanaeljones Jan 30, 2013

To me, the safest default scenario is opt-in based on mime-type.

HTML5 Bootstrap includes server-side configuration to compress the following mime-types

text/html    
text/css
text/plain
text/x-component
application/javascript
application/json
application/xml
application/xhtml+xml
application/x-font-ttf
application/x-font-opentype
application/vnd.ms-fontobject
image/svg+xml
image/x-icon;

I would consider the following list instead:

# All html, text, css, and csv content should be compressed
text/plain
text/html
text/csv
text/css

# Only vector graphics and uncompressed bitmaps can benefit from compression.
#GIF, JPG, and PNG already use a lz* algorithm, and certain browsers can get confused.
image/x-icon
image/svg+xml
application/x-font-ttf
application/x-font-opentype
application/vnd.ms-fontobject


# All javascript should be compressed
text/javascript
application/ecmascript
application/json
application/javascript

# All xml should be compressed
text/xml
application/xml
application/xml-dtd
application/soap+xml
application/xhtml+xml
application/rdf+xml
application/rss+xml
application/atom+xml

If it's not too late, I would suggest mime-type evaluation instead of URL regexes. Providing a sane default set would be great; the current usage pattern is causing lots of issues with images and PDFs, since old browsers lie about Accept-Encoding.

Rack::Deflate has become very important, as Heroku's new Cedar stack removed automatic gzip support, and requires that step be moved to the application itself.

To me, the safest default scenario is opt-in based on mime-type.

HTML5 Bootstrap includes server-side configuration to compress the following mime-types

text/html    
text/css
text/plain
text/x-component
application/javascript
application/json
application/xml
application/xhtml+xml
application/x-font-ttf
application/x-font-opentype
application/vnd.ms-fontobject
image/svg+xml
image/x-icon;

I would consider the following list instead:

# All html, text, css, and csv content should be compressed
text/plain
text/html
text/csv
text/css

# Only vector graphics and uncompressed bitmaps can benefit from compression.
#GIF, JPG, and PNG already use a lz* algorithm, and certain browsers can get confused.
image/x-icon
image/svg+xml
application/x-font-ttf
application/x-font-opentype
application/vnd.ms-fontobject


# All javascript should be compressed
text/javascript
application/ecmascript
application/json
application/javascript

# All xml should be compressed
text/xml
application/xml
application/xml-dtd
application/soap+xml
application/xhtml+xml
application/rdf+xml
application/rss+xml
application/atom+xml

If it's not too late, I would suggest mime-type evaluation instead of URL regexes. Providing a sane default set would be great; the current usage pattern is causing lots of issues with images and PDFs, since old browsers lie about Accept-Encoding.

Rack::Deflate has become very important, as Heroku's new Cedar stack removed automatic gzip support, and requires that step be moved to the application itself.

@maletor

This comment has been minimized.

Show comment Hide comment
@maletor

maletor Jan 30, 2013

I definitely agree with you Nathanael.

On Wednesday, January 30, 2013, Nathanael Jones wrote:

To me, the safest default scenario is opt-in based on mime-type.

HTML5 Bootstrap includes server-side configuration to compress the
following mime-types

text/html
text/css
text/plain
text/x-component
application/javascript
application/json
application/xml
application/xhtml+xml
application/x-font-ttf
application/x-font-opentype
application/vnd.ms-fontobject
image/svg+xml
image/x-icon;

I would consider the following list instead:

All html, text, css, and csv content should be compressed

text/plain
text/html
text/csv
text/css

Only vector graphics and uncompressed bitmaps can benefit from compression.

#GIF, JPG, and PNG already use a lz* algorithm, and certain browsers can get confused.
image/x-icon
image/svg+xml
application/x-font-ttf
application/x-font-opentype
application/vnd.ms-fontobject

All javascript should be compressed

text/javascript
application/ecmascript
application/json
application/javascript

All xml should be compressed

text/xml
application/xml
application/xml-dtd
application/soap+xml
application/xhtml+xml
application/rdf+xml
application/rss+xml
application/atom+xml

If it's not too late, I would suggest mime-type evaluation instead of URL
regexes. Providing a sane default set would be great; the current usage
pattern is causing lots of issues with images.


Reply to this email directly or view it on GitHubhttps://github.com/rack/rack/pull/457#issuecomment-12893560.

maletor commented Jan 30, 2013

I definitely agree with you Nathanael.

On Wednesday, January 30, 2013, Nathanael Jones wrote:

To me, the safest default scenario is opt-in based on mime-type.

HTML5 Bootstrap includes server-side configuration to compress the
following mime-types

text/html
text/css
text/plain
text/x-component
application/javascript
application/json
application/xml
application/xhtml+xml
application/x-font-ttf
application/x-font-opentype
application/vnd.ms-fontobject
image/svg+xml
image/x-icon;

I would consider the following list instead:

All html, text, css, and csv content should be compressed

text/plain
text/html
text/csv
text/css

Only vector graphics and uncompressed bitmaps can benefit from compression.

#GIF, JPG, and PNG already use a lz* algorithm, and certain browsers can get confused.
image/x-icon
image/svg+xml
application/x-font-ttf
application/x-font-opentype
application/vnd.ms-fontobject

All javascript should be compressed

text/javascript
application/ecmascript
application/json
application/javascript

All xml should be compressed

text/xml
application/xml
application/xml-dtd
application/soap+xml
application/xhtml+xml
application/rdf+xml
application/rss+xml
application/atom+xml

If it's not too late, I would suggest mime-type evaluation instead of URL
regexes. Providing a sane default set would be great; the current usage
pattern is causing lots of issues with images.


Reply to this email directly or view it on GitHubhttps://github.com/rack/rack/pull/457#issuecomment-12893560.

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Jan 30, 2013

Contributor

@raggi - any ideas about mime-type based evaluation?

Contributor

jakubpawlowicz commented Jan 30, 2013

@raggi - any ideas about mime-type based evaluation?

@nathanaeljones

This comment has been minimized.

Show comment Hide comment
@nathanaeljones

nathanaeljones Feb 4, 2013

Should I submit a pull request?

Should I submit a pull request?

@nathanaeljones

This comment has been minimized.

Show comment Hide comment
@nathanaeljones

nathanaeljones Feb 8, 2013

This is what I'm currently using: https://gist.github.com/nathanaeljones/4739210

This is what I'm currently using: https://gist.github.com/nathanaeljones/4739210

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Mar 26, 2013

Contributor

Any ideas how to push it forward?

Contributor

jakubpawlowicz commented Mar 26, 2013

Any ideas how to push it forward?

@masterkain

This comment has been minimized.

Show comment Hide comment
@masterkain

masterkain Aug 2, 2013

Looks awesome to have; HAProxy 1.5dev19 dropped (temporarly) support for transparent gzip compression of chunked responses so we have to implement it at app levels, the more options, the better.

Looks awesome to have; HAProxy 1.5dev19 dropped (temporarly) support for transparent gzip compression of chunked responses so we have to implement it at app levels, the more options, the better.

lib/rack/deflater.rb
+ end
+
+ # Skip if response body is too short
+ if @min_length > headers['Content-Length'].to_i

This comment has been minimized.

Show comment Hide comment
@masterkain

masterkain Aug 2, 2013

I'm testing this but using Rainbows!, Rails 3.2.14, http 1.1 and I don't see any Content-Length header in this point, this thus fails and compression is skipped.

Rainbows! was started with the -N option to not insert any default middleware, so my config.ru is:

use Rack::ContentLength
use Rack::Chunked
run Myapp::Application
@masterkain

masterkain Aug 2, 2013

I'm testing this but using Rainbows!, Rails 3.2.14, http 1.1 and I don't see any Content-Length header in this point, this thus fails and compression is skipped.

Rainbows! was started with the -N option to not insert any default middleware, so my config.ru is:

use Rack::ContentLength
use Rack::Chunked
run Myapp::Application
@jjb

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Nov 24, 2013

@jakubpawlowicz -- this is an awesome patch. It has a lot of features all together, which is probably part of why it's taking so long to merge. Here are some thoughts on how it can be simplified and improved.

include/exclude

It seems like folks (including me) think that

  • whitelist is superior
  • mime-type/Content-Type is superior

I recommend:

  • remove the exclude option entirely (I can't think of a use case, tell me if I'm missing something)
  • reimplement include to operate on the Content-Type header instead of the URL.
  • thought: would be nice to allow the Content-Type list to accept regexes, to allow for text/* for example. but this could perhaps come in a subsequent pull request, to keep this one more simple.

skip_if

Maybe a better name for this to make it consistent with other DSLs would be unless. Also, IMO it's more simple to make it if. "skip/unless" make it feel like it's supposed to be a filter. "if" is more generic, allowing the user to simply put in a conditional as they see fit. (and indeed, even in the example you gave you have a !=, making your overall example a double negative).

min_content_length

tl;dr: this option should be removed entirely, because it can be trivially implemented with skip_if/unless/if

There are several problems with making a default value.

  • it's a behavior change, so it makes it more controversial to merge
  • you didn't explain how you chose 1024

There are 3 independent reasons I can think of to have a minimum value:

  1. the point at which compressing does not reduce the size, and in fact might increase it. this is an objective value that is relevant to all users. Some quick searching seems to indicate that the number is around 150 bytes.
  2. the point at which the size is smaller than a single packet anyway. to my understanding this varies between environments. Maybe others could speak to this more.
  3. the point at which the time-complexity of compression is not worth the reduction in data-transmission time. This is highly dependent on hardware and applications. Also, it can be balanced with the gzip compression level. setting the compression level to 5 results in a much different time-complexity/compression ratio than 9. so, a good solution to accommodate this tradeoff would also include adjusting the compression level based on content-type. I just checked -- rack::deflate uses zlib's default, which is 6.

To increase the chances of your patch being merged, I think the default should be removed. Instead, there could be a recommended minimum in documentation.

Furthermore, regarding point 3, this makes me think it would be nice to be able to adjust compression level as well. So after your patch is merged, I'd like to experiment with augmenting it so your compression_level option could be passed either an integer or a lambda, which would dictate if something is compressed at all, and with what compression level.

->(size){
case size
when 0..256
  nil
when 256..1024
  6
when 1025..Float::INFINITY
  9
end
}

And as I typed that out I realized: this could be achieved with skip_if -- so, you could rewrite skip_if to expect either a boolean or integer to be returned. When it's an integer, it indicates compression level.

jjb commented Nov 24, 2013

@jakubpawlowicz -- this is an awesome patch. It has a lot of features all together, which is probably part of why it's taking so long to merge. Here are some thoughts on how it can be simplified and improved.

include/exclude

It seems like folks (including me) think that

  • whitelist is superior
  • mime-type/Content-Type is superior

I recommend:

  • remove the exclude option entirely (I can't think of a use case, tell me if I'm missing something)
  • reimplement include to operate on the Content-Type header instead of the URL.
  • thought: would be nice to allow the Content-Type list to accept regexes, to allow for text/* for example. but this could perhaps come in a subsequent pull request, to keep this one more simple.

skip_if

Maybe a better name for this to make it consistent with other DSLs would be unless. Also, IMO it's more simple to make it if. "skip/unless" make it feel like it's supposed to be a filter. "if" is more generic, allowing the user to simply put in a conditional as they see fit. (and indeed, even in the example you gave you have a !=, making your overall example a double negative).

min_content_length

tl;dr: this option should be removed entirely, because it can be trivially implemented with skip_if/unless/if

There are several problems with making a default value.

  • it's a behavior change, so it makes it more controversial to merge
  • you didn't explain how you chose 1024

There are 3 independent reasons I can think of to have a minimum value:

  1. the point at which compressing does not reduce the size, and in fact might increase it. this is an objective value that is relevant to all users. Some quick searching seems to indicate that the number is around 150 bytes.
  2. the point at which the size is smaller than a single packet anyway. to my understanding this varies between environments. Maybe others could speak to this more.
  3. the point at which the time-complexity of compression is not worth the reduction in data-transmission time. This is highly dependent on hardware and applications. Also, it can be balanced with the gzip compression level. setting the compression level to 5 results in a much different time-complexity/compression ratio than 9. so, a good solution to accommodate this tradeoff would also include adjusting the compression level based on content-type. I just checked -- rack::deflate uses zlib's default, which is 6.

To increase the chances of your patch being merged, I think the default should be removed. Instead, there could be a recommended minimum in documentation.

Furthermore, regarding point 3, this makes me think it would be nice to be able to adjust compression level as well. So after your patch is merged, I'd like to experiment with augmenting it so your compression_level option could be passed either an integer or a lambda, which would dictate if something is compressed at all, and with what compression level.

->(size){
case size
when 0..256
  nil
when 256..1024
  6
when 1025..Float::INFINITY
  9
end
}

And as I typed that out I realized: this could be achieved with skip_if -- so, you could rewrite skip_if to expect either a boolean or integer to be returned. When it's an integer, it indicates compression level.

@jjb

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Nov 24, 2013

as soon as i posted, that, i realized that include can also be implemented with if. So now I'm a bit conflicted

  • on the one hand, only implementing if and nothing else will be much more simple overall code and allow for greater flexibility for the user
  • on the other hand, the point of offering the library is to make it easy to achieve domain-specific things, so options like include and min_content_length make a lot of sense.

I've seen other libraries offer specific options and then an all-powerful if on the side, so maybe that's fine for here. too. My gut says that going with only include and if is a good approach to start with.

jjb commented Nov 24, 2013

as soon as i posted, that, i realized that include can also be implemented with if. So now I'm a bit conflicted

  • on the one hand, only implementing if and nothing else will be much more simple overall code and allow for greater flexibility for the user
  • on the other hand, the point of offering the library is to make it easy to achieve domain-specific things, so options like include and min_content_length make a lot of sense.

I've seen other libraries offer specific options and then an all-powerful if on the side, so maybe that's fine for here. too. My gut says that going with only include and if is a good approach to start with.

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Nov 24, 2013

Contributor

@jjb appreciate your awesome feedback!

  • to be honest I don't remember where 1024 came from (it's been a year since I wrote it) but I like your definition via lambda - let's have it that way.
  • whitelisting via include should work well (so exclude will be gone)
  • min_content_length was a convenience method but your idea is better

If you have a spare moment can you let me know what do you think about @nathanaeljones comment: #457 (comment) ? We could have it as an option for include, e.g.

use Rack::Deflater, { include: Rack::Deflater::DEFAULTS }

Thoughts?

Contributor

jakubpawlowicz commented Nov 24, 2013

@jjb appreciate your awesome feedback!

  • to be honest I don't remember where 1024 came from (it's been a year since I wrote it) but I like your definition via lambda - let's have it that way.
  • whitelisting via include should work well (so exclude will be gone)
  • min_content_length was a convenience method but your idea is better

If you have a spare moment can you let me know what do you think about @nathanaeljones comment: #457 (comment) ? We could have it as an option for include, e.g.

use Rack::Deflater, { include: Rack::Deflater::DEFAULTS }

Thoughts?

@jjb

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Nov 25, 2013

@jakubpawlowicz looks good. ideally the list would be sourced from combining arrays from another project, like...

Rack::Mime::MIME_TYPES.select{|k,v| v.start_with? "text" }

MIME::Types[/^text/] # note that this blows up with mime-types 1.25, works with 2.0, haven't tried 1.25.1

source: http://stackoverflow.com/questions/16003824/get-all-video-mime-types-in-ruby

jjb commented Nov 25, 2013

@jakubpawlowicz looks good. ideally the list would be sourced from combining arrays from another project, like...

Rack::Mime::MIME_TYPES.select{|k,v| v.start_with? "text" }

MIME::Types[/^text/] # note that this blows up with mime-types 1.25, works with 2.0, haven't tried 1.25.1

source: http://stackoverflow.com/questions/16003824/get-all-video-mime-types-in-ruby

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Nov 26, 2013

Contributor

@jjb that's another good idea! Will wrap up all the changes in the coming days.

Contributor

jakubpawlowicz commented Nov 26, 2013

@jjb that's another good idea! Will wrap up all the changes in the coming days.

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

@jjb - that should be it!

I decided to use Rack::Mime::MIME_TYPES instead of MIME::Types to save on adding a dependency. So a regular expression can be passed and it is first matched against Content-Type header and then against list of known mime types. We should probably have some sanity checks first against missing Content-Type header too.

Once we are all set I will squeeze all the commits into two - one for refactored spec_deflater.rb and the other one for Deflater options.

Please let me know how it looks to you.

Contributor

jakubpawlowicz commented Dec 9, 2013

@jjb - that should be it!

I decided to use Rack::Mime::MIME_TYPES instead of MIME::Types to save on adding a dependency. So a regular expression can be passed and it is first matched against Content-Type header and then against list of known mime types. We should probably have some sanity checks first against missing Content-Type header too.

Once we are all set I will squeeze all the commits into two - one for refactored spec_deflater.rb and the other one for Deflater options.

Please let me know how it looks to you.

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

And regarding failing specs I noticed master fails on them too. It shouldn't be an excuse though...

Contributor

jakubpawlowicz commented Dec 9, 2013

And regarding failing specs I noticed master fails on them too. It shouldn't be an excuse though...

@jjb

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Dec 9, 2013

@jakubpawlowicz awesome. the discussion about mime types was regarding where to get a list of defaults. it looks like you are using it to make sure that the provided types are valid. my feeling is that this isn't very useful and you can exclude this entirely.

jjb commented Dec 9, 2013

@jakubpawlowicz awesome. the discussion about mime types was regarding where to get a list of defaults. it looks like you are using it to make sure that the provided types are valid. my feeling is that this isn't very useful and you can exclude this entirely.

lib/rack/deflater.rb
+ # Skip if :if lambda is given and evaluates to false
+ if @if &&
+ !@if.call(env, status, headers, body)
+ return false

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Dec 9, 2013

return false if @if && !@if.call(env, status, headers, body)
@jjb

jjb Dec 9, 2013

return false if @if && !@if.call(env, status, headers, body)

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Dec 9, 2013

(just my style opinion...)

@jjb

jjb Dec 9, 2013

(just my style opinion...)

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

Sure, it reads better.

@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

Sure, it reads better.

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

@jjb sorry my mistake! So let's make sure we are on the same page regarding :include:

  • it should be an array (assembled anyhow user wants)
  • we should default to all content types matching /^text/ - what about application/json?
  • if above then there's no need to make it default explicitly via use Rack::Deflater, { include: Rack::Deflater::DEFAULTS } as plain use Rack::Deflater will be enough
Contributor

jakubpawlowicz commented Dec 9, 2013

@jjb sorry my mistake! So let's make sure we are on the same page regarding :include:

  • it should be an array (assembled anyhow user wants)
  • we should default to all content types matching /^text/ - what about application/json?
  • if above then there's no need to make it default explicitly via use Rack::Deflater, { include: Rack::Deflater::DEFAULTS } as plain use Rack::Deflater will be enough
@jjb

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Dec 9, 2013

that is one way. but, the current deflater approach is to deflate everything. so, you could keep this default behavior, and only change the behavior if include is specified. I recommend this as it's more likely to be accepted by the maintainers and is just generally a more conservative approach.

So your code already does this :-D. so if you agree, you can just remove the check that it's in Rack::Mime::MIME_TYPES.values and IMO the code will be more simple.

(i think i overexplained this but just trying to be extra clear :-D)

jjb commented Dec 9, 2013

that is one way. but, the current deflater approach is to deflate everything. so, you could keep this default behavior, and only change the behavior if include is specified. I recommend this as it's more likely to be accepted by the maintainers and is just generally a more conservative approach.

So your code already does this :-D. so if you agree, you can just remove the check that it's in Rack::Mime::MIME_TYPES.values and IMO the code will be more simple.

(i think i overexplained this but just trying to be extra clear :-D)

lib/rack/deflater.rb
+ !(@include.match(headers['Content-Type']) && Rack::Mime::MIME_TYPES.values.include?(headers['Content-Type']))
+ return false
+ end
+

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Dec 9, 2013

return false if @include && !@include.match(headers['Content-Type'])
@jjb

jjb Dec 9, 2013

return false if @include && !@include.match(headers['Content-Type'])

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

Since @include should be an array it should rather read:

return false if @include && !@include.include?(headers['Content-Type'])

or to make it more readable

return false if @include && @include.index(headers['Content-Type']).nil?
@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

Since @include should be an array it should rather read:

return false if @include && !@include.include?(headers['Content-Type'])

or to make it more readable

return false if @include && @include.index(headers['Content-Type']).nil?

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Dec 9, 2013

sounds good. i find the first more readable, i'm not familiar with index and the required .nil? at the end seems complicated

@jjb

jjb Dec 9, 2013

sounds good. i find the first more readable, i'm not familiar with index and the required .nil? at the end seems complicated

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

as long as it works as expected I'm fine with the first too

@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

as long as it works as expected I'm fine with the first too

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

Cool, so we are on the same page. Will update the code shortly.

Contributor

jakubpawlowicz commented Dec 9, 2013

Cool, so we are on the same page. Will update the code shortly.

lib/rack/deflater.rb
@app = app
+
+ @if = options[:if]
+ @include = options[:include]

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Dec 9, 2013

the symbols are for the DSL but you could make more descriptive variables within the class

@condition = options[:if]
@compressible_types = options[:include]
@jjb

jjb Dec 9, 2013

the symbols are for the DSL but you could make more descriptive variables within the class

@condition = options[:if]
@compressible_types = options[:include]

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

👍 nice one!

@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

👍 nice one!

@jjb

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Dec 9, 2013

I made this SO question to try to find a nice list of types for us to recommended in the documentation http://stackoverflow.com/questions/20477558/where-can-i-find-a-list-of-textual-mime-types

jjb commented Dec 9, 2013

I made this SO question to try to find a nice list of types for us to recommended in the documentation http://stackoverflow.com/questions/20477558/where-can-i-find-a-list-of-textual-mime-types

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

@jjb here you go! Let's see what your SO question brings.

Contributor

jakubpawlowicz commented Dec 9, 2013

@jjb here you go! Let's see what your SO question brings.

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Dec 9, 2013

Contributor

@jjb So it's quashed into two commits, with better docs, and updated PR description

@raggi @nathanaeljones You may like it much more right now!

Contributor

jakubpawlowicz commented Dec 9, 2013

@jjb So it's quashed into two commits, with better docs, and updated PR description

@raggi @nathanaeljones You may like it much more right now!

lib/rack/deflater.rb
+ end
+
+ # Skip if @compressible_types are given and does not include request's content type
+ return false if @compressible_types && !@compressible_types.include?(headers['Content-Type'])

This comment has been minimized.

Show comment Hide comment
@jayschab

jayschab Jan 7, 2014

First thing, thanks for this code it is a big help. One issue I ran into was this line. When the Content-type has the optional parameter value (in my case the character encoding) the content-type check fails. See RFC-1341. I am using .split(";")[0] and it is working for my use cases.

@jayschab

jayschab Jan 7, 2014

First thing, thanks for this code it is a big help. One issue I ran into was this line. When the Content-type has the optional parameter value (in my case the character encoding) the content-type check fails. See RFC-1341. I am using .split(";")[0] and it is working for my use cases.

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Jan 7, 2014

Contributor

Thanks @jayschab for a valuable input. I'll add a test case and your solution and we should be all good. 👍

@jakubpawlowicz

jakubpawlowicz Jan 7, 2014

Contributor

Thanks @jayschab for a valuable input. I'll add a test case and your solution and we should be all good. 👍

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Jan 7, 2014

Contributor

@jayschab Solved in bd3723b

This comment has been minimized.

Show comment Hide comment
@jayschab

jayschab Jan 15, 2014

Found a bug in my suggested code in my testing. Content-Type isn't always set.

I ended up changing my version to this:

return false if @compressible_types && !(headers.has_key('Content-Type') && @compressible_types.include?(headers['Content-Type'][/[^;]*/]))

@jayschab

jayschab Jan 15, 2014

Found a bug in my suggested code in my testing. Content-Type isn't always set.

I ended up changing my version to this:

return false if @compressible_types && !(headers.has_key('Content-Type') && @compressible_types.include?(headers['Content-Type'][/[^;]*/]))

lib/rack/deflater.rb
+
+ # Skip if @compressible_types are given and does not include request's content type
+ content_type = headers['Content-Type'].split(';')[0]
+ return false if @compressible_types && !@compressible_types.include?(content_type)

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Jan 15, 2014

Contributor

@jayschab I can't see your comment anymore. Does it mean this way is fine?

@jakubpawlowicz

jakubpawlowicz Jan 15, 2014

Contributor

@jayschab I can't see your comment anymore. Does it mean this way is fine?

This comment has been minimized.

Show comment Hide comment
@jayschab

jayschab Jan 15, 2014

Sorry I didn't forgot I was commenting on an outdated diff (it is still there just not obvious in the GitHub UX #457 (comment)). Copied below:

Found a bug in my suggested code in my testing. Content-Type isn't always set.

I ended up changing my version to this:

return false if @compressible_types && !(headers.has_key('Content-Type') && @compressible_types.include?(headers['Content-Type'][/[^;]*/]))

@jayschab

jayschab Jan 15, 2014

Sorry I didn't forgot I was commenting on an outdated diff (it is still there just not obvious in the GitHub UX #457 (comment)). Copied below:

Found a bug in my suggested code in my testing. Content-Type isn't always set.

I ended up changing my version to this:

return false if @compressible_types && !(headers.has_key('Content-Type') && @compressible_types.include?(headers['Content-Type'][/[^;]*/]))

jakubpawlowicz added some commits Dec 9, 2013

Adds deflater options to control compression on per-request level.
* Adds :if option which should be given a lambda accepting env, status, headers, and body options.
* When :if evaluates to false a response body won't be compressed.
* Adds :include option which should be given an array of compressible content types.
* When :include don't include request's content type then response body won't be compressed.
+ end
+
+ # Skip if @compressible_types are given and does not include request's content type
+ return false if @compressible_types && !(headers.has_key?('Content-Type') && @compressible_types.include?(headers['Content-Type'][/[^;]*/]))

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Jan 15, 2014

Contributor

@jayschab Updated it per your suggestion.

Found it problematic to test scenario without the 'Content-Type' header as it gets set every time in specs.
Any ideas?

@jakubpawlowicz

jakubpawlowicz Jan 15, 2014

Contributor

@jayschab Updated it per your suggestion.

Found it problematic to test scenario without the 'Content-Type' header as it gets set every time in specs.
Any ideas?

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz May 14, 2014

Contributor

is there anything we can do to push it forward after 1.5 years in PR?

Contributor

jakubpawlowicz commented May 14, 2014

is there anything we can do to push it forward after 1.5 years in PR?

@felixbuenemann

This comment has been minimized.

Show comment Hide comment
@felixbuenemann

felixbuenemann May 21, 2014

This would be nice to have. I'm currently shuffling around middleware to keep dragonfly jobs from being compressed.

This would be nice to have. I'm currently shuffling around middleware to keep dragonfly jobs from being compressed.

raggi added a commit that referenced this pull request Aug 3, 2014

Merge pull request #457 from jakubpawlowicz/master
Configuration options for Rack::Deflate

@raggi raggi merged commit 62dcc83 into rack:master Aug 3, 2014

0 of 2 checks passed

default The Travis CI build failed
Details
continuous-integration/travis-ci The Travis CI build is in progress
Details
@jjb

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Aug 3, 2014

😹

jjb commented Aug 3, 2014

😹

@jakubpawlowicz

This comment has been minimized.

Show comment Hide comment
@jakubpawlowicz

jakubpawlowicz Aug 3, 2014

Contributor

Wicked! Thanks @raggi!

Contributor

jakubpawlowicz commented Aug 3, 2014

Wicked! Thanks @raggi!

@jjb

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Mar 19, 2015

in case someone finds this thread, here's what I'm using for mime-types:

include: Rack::Mime::MIME_TYPES.select{|k,v| v =~ /text|json|javascript/ }.values.uniq

jjb commented Mar 19, 2015

in case someone finds this thread, here's what I'm using for mime-types:

include: Rack::Mime::MIME_TYPES.select{|k,v| v =~ /text|json|javascript/ }.values.uniq
@campbecf

This comment has been minimized.

Show comment Hide comment
@campbecf

campbecf Oct 28, 2015

I had to do use Rack::Deflater, if: lambda { |env, status, headers, body| body.body.length > 512 } to get the example to work.

I had to do use Rack::Deflater, if: lambda { |env, status, headers, body| body.body.length > 512 } to get the example to work.

@jjb jjb referenced this pull request in romanbsd/heroku-deflater Mar 11, 2017

Closed

Rack::Deflater already has this functionality? #33

@jjb

This comment has been minimized.

Show comment Hide comment
@jjb

jjb Mar 11, 2017

@campbecf looks like the example got fixed here 🎉 : c987ffa

jjb commented Mar 11, 2017

@campbecf looks like the example got fixed here 🎉 : c987ffa

@jjb jjb referenced this pull request in romanbsd/heroku-deflater Apr 16, 2017

Open

leverage Rack::Deflater's capabilities #37

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment