Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

brotli support in gzip plugin #1557

Merged
merged 1 commit into from Apr 4, 2017
Merged

brotli support in gzip plugin #1557

merged 1 commit into from Apr 4, 2017

Conversation

myraid
Copy link
Contributor

@myraid myraid commented Mar 8, 2017

The requirements are outlined in
#1555

For brotli to work the "proxy.config.http.normalize_ae_gzip" configuration should be set to '0'
Should we control this flag manually or through the gzip plugin?

@bryancall @shukitchan can you review.

@bryancall bryancall self-requested a review March 8, 2017 21:09
@bryancall bryancall added this to the 7.2.0 milestone Mar 8, 2017
@bryancall
Copy link
Contributor

bryancall commented Mar 8, 2017

Added the requirements here:

Here are the requirements for supporting brotli encoding.

If the Accept-Encoding has brotli, then the normalize AE should pick up brotli. (Should we send both "br" and "gzip" in AE?
If the response is already encoded then do nothing. Should we have a configuration to force brotli encode the response?
If the response is unencoded and if Accept-Encoding has brotli then encode using brotli.

@maskit
Copy link
Member

maskit commented Mar 8, 2017

Having one plugin for multiple compression algorithms is fine with me. However, I'm not a big fun of adding it with "else if" statements, though we already did that for deflate. It looks like ifdef things and it's not so easy to read.

I think it should have an abstract parent class for all the algorithms and each algorithms should be implemented as subclasses.

@shukitchan
Copy link
Contributor

I will take a closer look in the coming days.
In the meanwhile, can you update the documentation for the gzip plugin as well for this change?

@zwoop zwoop self-requested a review March 9, 2017 03:40
@zwoop
Copy link
Contributor

zwoop commented Mar 9, 2017

I need to read this some more, but it feels like we need a configuration option here as well.

@myraid
Copy link
Contributor Author

myraid commented Mar 9, 2017

@bryancall @zwoop I am working on a enhancement to force brotli compression even if content-encoding header is present.
Do you think it will be a good idea to have a separate plugin for unzip and combine it with this plugin if there is a need for force encoding brotli?

@@ -95,18 +104,29 @@ gzip_data_alloc(int compression_type)
}
}

if(compression_type == COMPRESSION_TYPE_BROTLI) {
debug("gzip-transform: brotli compression. Create Brotli Encoder Instance.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please clang-format over the change. gmake clang-format.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@bryancall
Copy link
Contributor

It looks like brotli is on by default? It would be nice to have an option on what types of compression algorithms that you want to enable.

@myraid
Copy link
Contributor Author

myraid commented Mar 9, 2017

@bryancall Yes the default behavior is br. Are you suggesting that the option should dictate normalizing the AE? What will be the behavior when both gzip and br are selected?

@bryancall
Copy link
Contributor

@myraid
I can see cases where someone might not want to spend the CPU resources on Brotli compression, but doesn't mind if the origin does Broli encoding. The plugin would compress the response from the origin using certain algorithms.

The configuration option would be something like:
algorithms deflate,gzip,brotli

It would really be nice to enable Brotli on certain content types (e.g. text/javascript) that are static and not on types that would be more dynamic (e.g. text/html). Unfortunately the configuration doesn't easily lend itself to this.

error("brotli-transform: ERROR: output lengths don't match (%d, %ld)", data->downstream_length, data->bstrm.total_out);
}
debug("brotli-transform: Finished brotli");
// gzip_log_ratio(data->bstrm.total_in, data->downstream_length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we still do similar logging brotli? Or at least some useful data

@shukitchan
Copy link
Contributor

My feedbacks

  1. Perhaps we are overloading the "Accept-Encoding" header in the request. Let's say we create another header called "X-Normalized-Accept-Encoding". Its value will be the best algorithm we allow (through the configuration option) in the plugin chosen from the incoming "Accept-Encoding" of the request.

  2. We need to change the incoming "Vary" header from response (in read_response hook) and replace "Accept-Encoding" with "X-Normalized-Accept-Encoding". That will cause the cache to only cache according to X-Normalized-Accept-Encoding instead of "Accept-Encoding".

  3. Now we only force brotli encryption or gzip encryption (whichever X-Normalized-Accept-Encoding indicates) when the origin returns plain text.

  4. If we want to force brotli encryption for certain url patterns, we probably can use lua/header_write plugin to change the incoming "Accept-Encoding" to remove "gzip" and "deflate" and leave "br" value in before the gzip plugin runs. This feature can be baked into the gzip plugin later on with more configuration options

The pros is that

  1. We send the "Accept-Encoding" over to the origin and let origin decides what compression to use.
  2. We still only caches a limited amount of copy per URL. The worst case is one per compression algorithm we allow from the plugin configuration option.

The cons is that

  1. We should remove X-Normalized-Accept-Encoding in send_request hook.
  2. We should replace X-Normalized-Accept-Encoding with Accept-Encoding to the Vary response header in send_response hook.

Any thoughts? @zwoop @bryancall @myriad ?

@zwoop
Copy link
Contributor

zwoop commented Mar 14, 2017

See also ##776.

@myraid
Copy link
Contributor Author

myraid commented Mar 16, 2017

Added the supported-algorithms cofiguration.
Normalizing Accept-Encoding now supports both br and gzip together. (Accept-Encoding: br, gzip)
If the origin supports brotli compression then the plugin just sends the data downstream.
If origin does not support brotli, then origin can still fallback to gzip.

compression will be triggered for response based on Accept-Encoding and supported algorithms (brotli is checked before gzip). A transform is not even created if the there is no intersect between Accept-Encoding and supported algorithms.

Caching scenarios.

We only cache 2 copies

  • with 'Accept-Encoding: br, gzip' -- if content encoding header is present, just cache it. Otherwise encode the content and cache. The User-Agent might receive either br or gzip encoded response which is OK.

  • with 'Accept-Encoding: gzip' -- The User-Agent will always receive a gzip encoded response.

@shukitchan Accept-Encoding header will suffice. This is similar to what you have suggested, but without an additional header. you agree?

@shukitchan
Copy link
Contributor

@myraid your description seems to be good. i will spend some times tomorrow to review the code.

meanwhile, pls update this as well
https://github.com/apache/trafficserver/blob/master/doc/admin-guide/plugins/gzip.en.rst
and perhaps this as well
https://github.com/apache/trafficserver/blob/master/plugins/gzip/README

@shukitchan
Copy link
Contributor

two more points

  1. I think we need to maintain the existing behavior as much as possible. So perhaps the default for the new configuration option "supported-algorithms" should not be a no-op? Also it would be great to update the sample configuration file to show how the new option can be used.

  2. One problem with "Accept-Encoding: br, gzip" is that the origin will likely just return the gzip version almost all the time. And therefore we can't take advantage of br in almost all cases. Can we perhaps introduce an additional configuration option called "enforce-brotli" ? e.g. If I have "enforce-brotli /notthis/*.js" in the config file, then I will set normalize "Accept-Encoding" to "br" instead of "br, gzip" or "gzip" for the matched URL path.

P.S. I think enforce-brotli and supported-algorithms should be independent of each other.

@myraid
Copy link
Contributor Author

myraid commented Mar 17, 2017

@shukitchan My Comments inline

  1. I think we need to maintain the existing behavior as much as possible. So perhaps the default for the new configuration option "supported-algorithms" should not be a no-op? Also it would be great to update the sample configuration file to show how the new option can be used.

[myraid] I agree. the "supported-algorithms" configuration will be honored if present, otherwise compression could be defaulted to gzip.

  1. One problem with "Accept-Encoding: br, gzip" is that the origin will likely just return the gzip version almost all the time. And therefore we can't take advantage of br in almost all cases. Can we perhaps introduce an additional configuration option called "enforce-brotli" ? e.g. If I have "enforce-brotli /notthis/*.js" in the config file, then I will set normalize "Accept-Encoding" to "br" instead of "br, gzip" or "gzip" for the matched URL path.

[myraid] There is a configuration option "remove-accept-encoding" and if present will hide the Accept-Encoding header. Using this flag we can easily trick the origin to send unencoded data.
I agree that we should have the "enfore-brotli" flag to provide a config option to always enforce brotli compression even if content-encoding is present. e.g. If the origin sends gzip, then the plugin should still be able to inflate and compress back using brotli. I am working on introducing this option.

@shukitchan
Copy link
Contributor

wouldn't it be easier if "enforce-brotli" will override the "Accept-Encoding" to "br"?
So the origin will either return plain text or brotli encoded data. And the plugin only need to brotli-encode the plain text version if that's returned.

Another concern @bryancall made in an email with us is whether we should only do this brotli encoding transform only if response is cacheable. I wonder whether we should do it in this PR or have a custom version internally instead.

@myraid
Copy link
Contributor Author

myraid commented Mar 17, 2017

Yes we could enforce brotli encoding by overriding the Accept-Encoding to "br" if we are OK with origin responding with plain text. Infact this is cleaner approach.
Do we have to worry about the use case where origin sends gzip and the plugin needs to compress it to brotli? Most origins support gzip and is a huge bandwidth saver.

May be you are thinking of a separate unzip plugin. Ether ways I am fine. Infact we can keep this plugin clean without adding gunzip functionality.

@shukitchan I will work the changes you suggested.

@myraid
Copy link
Contributor Author

myraid commented Mar 18, 2017

@shukitchan @bryancall can you guys review.

@myraid
Copy link
Contributor Author

myraid commented Mar 22, 2017

@shukitchan @bryancall can you review the new code.
changed enforce_brotli to wildcard pattern matching for urls.
Now enforce_brotli will override some of the existing configuration for url that matched the pattern.

@shukitchan
Copy link
Contributor

[approve ci]

@shukitchan
Copy link
Contributor

it looks good to me.
@bryancall @zwoop can either of you guys take a look, too? thanks.

@atsci
Copy link

atsci commented Mar 23, 2017

@atsci
Copy link

atsci commented Apr 3, 2017

@atsci
Copy link

atsci commented Apr 3, 2017

@atsci
Copy link

atsci commented Apr 3, 2017

@atsci
Copy link

atsci commented Apr 3, 2017

@atsci
Copy link

atsci commented Apr 3, 2017

FreeBSD11 build failed! https://ci.trafficserver.apache.org/job/freebsd-github/1823/

@atsci
Copy link

atsci commented Apr 3, 2017

clang-analyzer build failed! https://ci.trafficserver.apache.org/job/clang-analyzer-github/385/

@atsci
Copy link

atsci commented Apr 3, 2017

Intel CC build failed! https://ci.trafficserver.apache.org/job/icc-github/253/

@bryancall
Copy link
Contributor

[approve ci]

@atsci
Copy link

atsci commented Apr 3, 2017

@atsci
Copy link

atsci commented Apr 3, 2017

@atsci
Copy link

atsci commented Apr 3, 2017

@atsci
Copy link

atsci commented Apr 3, 2017

FreeBSD11 build failed! https://ci.trafficserver.apache.org/job/freebsd-github/1824/

@atsci
Copy link

atsci commented Apr 3, 2017

clang-analyzer build failed! https://ci.trafficserver.apache.org/job/clang-analyzer-github/386/

@atsci
Copy link

atsci commented Apr 3, 2017

Intel CC build failed! https://ci.trafficserver.apache.org/job/icc-github/254/

@atsci
Copy link

atsci commented Apr 3, 2017

@bryancall
Copy link
Contributor

[approve ci]

@atsci
Copy link

atsci commented Apr 3, 2017

RAT check successful! https://ci.trafficserver.apache.org/job/RAT-github/144/

@atsci
Copy link

atsci commented Apr 3, 2017

@atsci
Copy link

atsci commented Apr 3, 2017

@atsci
Copy link

atsci commented Apr 3, 2017

Linux build successful! https://ci.trafficserver.apache.org/job/linux-github/1717/

@atsci
Copy link

atsci commented Apr 3, 2017

FreeBSD11 build successful! https://ci.trafficserver.apache.org/job/freebsd-github/1826/

@atsci
Copy link

atsci commented Apr 3, 2017

Intel CC build successful! https://ci.trafficserver.apache.org/job/icc-github/256/

@atsci
Copy link

atsci commented Apr 3, 2017

clang-analyzer build successful! https://ci.trafficserver.apache.org/job/clang-analyzer-github/388/

@bryancall bryancall merged commit 39d045e into apache:master Apr 4, 2017
@zwoop zwoop modified the milestones: 7.2.0, 8.0.0 Apr 25, 2017
@bryancall bryancall modified the milestones: 7.1.0, 8.0.0 May 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants