Support WebP #140

Closed
heynemann opened this Issue Dec 19, 2012 · 40 comments

9 participants
Owner

heynemann commented Dec 19, 2012

I was reading this article http://www.igvita.com/2012/12/18/deploying-new-image-formats-on-the-web/ and found myself thinking that it should be fairly easy for us to support the WebP format (currently supported by Chrome and Opera).

Do you guys think we should provide some flag that would make thumbor return the smallest possible image? The only downside is that browsers (as the article says) do NOT send proper Accept headers.

We could allow the users to specify what formats they want to support. Thumbor would generate the image in all the formats, then pick the smallest one.
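
The "generate every allowed format, then pick the smallest" idea could look roughly like the sketch below. This is only an illustration, not thumbor code: `pick_smallest` and the encoder callables are hypothetical names, and the stand-in encoders just return byte strings of different lengths instead of calling an imaging library.

```python
from typing import Callable, Dict, Tuple

# Hypothetical type: an encoder takes raw image data and returns encoded bytes.
Encoder = Callable[[bytes], bytes]


def pick_smallest(raw: bytes, encoders: Dict[str, Encoder]) -> Tuple[str, bytes]:
    """Encode `raw` with every allowed encoder and return the smallest result."""
    if not encoders:
        raise ValueError("at least one format must be allowed")
    results = {fmt: encode(raw) for fmt, encode in encoders.items()}
    best = min(results, key=lambda fmt: len(results[fmt]))
    return best, results[best]


if __name__ == "__main__":
    # Stand-in encoders for illustration only; real ones would call an imaging library.
    fake_encoders = {
        "jpeg": lambda data: data[: len(data) // 2],
        "webp": lambda data: data[: len(data) // 3],
    }
    fmt, body = pick_smallest(b"x" * 300, fake_encoders)
    print(fmt, len(body))
```

Note that every request pays for one encode per allowed format, which is the overhead the load tests mentioned in this thread would need to measure.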

What do you guys think?

Contributor

dhardy92 commented Dec 19, 2012

Why not, if it is toggleable of course. As said, there are also some risks on the caching side: our private caches would multiply entries for a single resource, and CDNs seem to be very weak and hard to control in this respect.
If this is done we should provide load test results so that the overhead is known.

This would definitely be useful in mobile app cases where we more closely control what we are requesting. So iOS can just ask for WebP to save mobile bandwidth and speed it all up. The web generally is a bit more tricky but would still be useful.

Owner

heynemann commented Jan 25, 2013

That's what I thought... ;)

Bernardo Heynemann
Developer @ globo.com

Member

rafaelcaricio commented Feb 6, 2013

Very interesting article, and I think it would be nice if Thumbor had such support.

Contributor

runa commented Feb 8, 2013

+1
Another article http://blog.chromium.org/2013/02/using-webp-to-improve-speed.html
In LATAM Chrome traffic is huge so I guess this will help a big % of our users.

How would you implement it, @heynemann ? Checking the Accept header and, if useless, sniffing the User-Agent?

Owner

heynemann commented Feb 8, 2013

Probably not sniffing the user agent as that's VERY dangerous. If we
identify it incorrectly, BOOM, all images go blank for that user agent.

Using the Accept header seems like the way to go. What do you guys think?

Cheers,

Member

rafaelcaricio commented Feb 8, 2013

It seems that looking only at the Accept header will work properly only if the end user is on the Opera browser. However, handling of the Accept header could be assigned to a layer (nginx | varnish | whatever) in front of the Thumbor instances: that layer would check which user agents can support webp and, if so, append webp to the request's Accept header. That way the Thumbor code stays safe and independent of whenever browsers decide to update their implementations to better follow the HTTP 1.1 specification and achieve a better way of telling web servers which image types they can actually accept.
What do you guys think?

Contributor

runa commented Feb 8, 2013

Yes I was also going to add that trying to detect from the User-Agent might be quite dangerous... But I think the Accept header support would be good.

I was just thinking how this might affect our CDN caches, we're using CloudFront and I don't think they support caching by different content-types in the Accept header... So I'm not sure what happens in this case, maybe also adding support for requesting the format in the url parameters might work well?

Or at least support both but add configuration for these so they can be configured in different cases depending on your CDN support.

If we use different urls we then could also request the different images in the browser without needing client support.

Contributor

wichert commented Feb 8, 2013

I suspect some browsers will just claim to accept image/* or even */*, so I would be careful with trusting the Accept header. If anything I would test for an explicit mention of WebP in Accept.
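
A check for an explicit mention of WebP, ignoring wildcards, might look like the sketch below. The function name `accepts_webp` is made up for illustration and is not part of thumbor; the treatment of `q=0` as a refusal follows the usual HTTP quality-value convention.

```python
def accepts_webp(accept_header: str) -> bool:
    """Return True only if the Accept header names image/webp explicitly.

    Wildcards such as image/* or */* are deliberately ignored, and an
    entry with q=0 counts as a refusal.
    """
    for entry in accept_header.split(","):
        media_type, *params = [part.strip() for part in entry.split(";")]
        if media_type.lower() != "image/webp":
            continue
        quality = 1.0
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    # An unparsable q-value is treated as "not acceptable".
                    quality = 0.0
        return quality > 0
    return False
```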

Member

rafaelcaricio commented Feb 8, 2013

@danharvey I was thinking about it, and the idea of explicitly requesting the image format in the URL would be nice.

@wichert This is exactly the point, and if we (as users of thumbor) want to work around the browsers' missing support for it, we could just put the logic of explicitly mentioning WebP in Accept in some layer on top of thumbor.

sboisson commented Feb 8, 2013

What about implementing a filter which always output WebP?

Owner

heynemann commented Feb 8, 2013

I think I'd rather use the Accept header and wait for Chrome to support it.

You can do the same as I'm doing and spread the word:
https://code.google.com/p/chromium/issues/detail?id=169182

Ask everyone you know to vote for this ticket and retweet me:
http://twitter.com/heynemann/status/299938408522985472

Cheers,
Bernardo Heynemann

Owner

heynemann commented Feb 8, 2013

Btw, I found it searching here:
https://groups.google.com/a/webmproject.org/forum/?fromgroups#!searchin/webp-discuss/detecting

It's a pretty active group.

Cheers,
Bernardo Heynemann

Member

rafaelcaricio commented Feb 8, 2013

@sboisson Maybe it's possible, but at the very least we need to modify this line https://github.com/globocom/thumbor/blob/master/thumbor/handlers/__init__.py#L153

Anyway, I think it's something that belongs to "thumbor core" to handle.

Owner

heynemann commented Feb 8, 2013

So you can implement it client side? I'd say it's very dangerous. Any bugs
in your browser detection and you get NO images.

As I said, I'd rather stick to the http spec and use a combination of:

  • Accept header to decide what to send, using a best-case heuristic.
    If we can send webp, we send it; otherwise we use whatever the
    previous image format was.
  • Vary: Accept to enable caches worldwide to cache the webp images.

Cheers,
Bernardo Heynemann
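
The combination of the Accept header and `Vary: Accept` described above can be sketched as follows. This is a simplified illustration under stated assumptions, not thumbor's handler code: `negotiate` is a hypothetical helper, it ignores q-values, and it only looks for an explicit `image/webp` entry.

```python
from typing import Dict, Tuple


def negotiate(accept_header: str, original_mime: str) -> Tuple[str, Dict[str, str]]:
    """Pick the response MIME type and the headers that go with it."""
    # Simplified check: look for an explicit image/webp entry, ignoring q-values.
    offered = [entry.split(";")[0].strip().lower() for entry in accept_header.split(",")]
    mime = "image/webp" if "image/webp" in offered else original_mime
    headers = {
        "Content-Type": mime,
        # The body depends on the request's Accept header, so tell caches
        # to key on it; otherwise a cache could hand WebP to a client
        # that never asked for it.
        "Vary": "Accept",
    }
    return mime, headers
```

`Vary: Accept` is sent on both variants, since the fallback response depends on the Accept header just as much as the WebP one does.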

Owner

heynemann commented Feb 8, 2013

Just so you guys know, this change is scheduled to be released with Chrome
27.

Even though this is a little far in the future, it is going to happen. ;)

Cheers,
Bernardo Heynemann

Member

rafaelcaricio commented Feb 8, 2013

@heynemann You can implement it on the client side, but I think the best choice would be to use a third-party server that redirects to the URL which explicitly requests the WebP image. So there is no need for client-side behaviour.

That still leaves the issue of CDNs not supporting caching by content type?

Owner

heynemann commented Feb 8, 2013

It seems to me that CloudFront does support it already:
http://news.ycombinator.com/item?id=2097326

DISCLAIMER: I haven't tested it.

What CDN are you using?

MaxCDN but will be moving to CloudFront. I'll have a go at testing both of these at some point.

Owner

heynemann commented Feb 8, 2013

Cool. I think that the CDN should have an option to include Vary headers.
Otherwise you are missing out on a lot of information (authenticated users,
different accepted formats, etc).

Cheers,

+1 for WebP support... 👍

A few thoughts on the deployment story...

First off, I think support for WebP in thumbor is somewhat orthogonal to how it is deployed. Said differently, I think thumbor could and definitely should support WebP, and the deployment part is really particular to your specific infrastructure and application.

With that out of the way, let's talk about deployment. There are a few different scenarios here, with different levels of complexity:

a) Native app - you control the client. In this case, we have WebP libraries both for Android and iOS, which you can embed in your app and not worry about the rest - simply serve WebP to your app, and enjoy all the benefits.

b) Web app, or any case where you do not control the client. This one gets more complicated...

First off, I wouldn't be that afraid of UA sniffing. Basically, you are simply asking "is this Chrome or Opera?", which is a very easy function to write if you have a scripting layer in your server (e.g. Varnish VCL, or even nginx rules). If the answer is yes, then you can serve WebP. In fact, this is exactly what Torbit and the PageSpeed optimization servers do...

Next question is, what about intermediate proxies? This is where it gets trickier. The safest way to do this is to customize the HTML of the page: run the UA check, and write out webp link in the markup. This means the HTML itself needs to be marked as Cache-Control: private - intermediaries won't cache it, but the client will. This allows the intermediaries to cache webp assets, since they have different URLs.

The strategy above is what mod_pagespeed / ngx_pagespeed and PageSpeed Service use to serve WebP images. If you're curious, you can see it in action on my blog: igvita.com.

A different strategy is to preserve the same URL between WebP and non-WebP assets, and then use UA sniffing to serve the right variant. The downside to this approach is that the images themselves have to be marked as Cache-Control: private, which doesn't play well with intermediate caches (which may be a problem for some).

So, in short:

  • native apps: clean deployment, easy, lots of benefits
  • web apps with many clients: use UA sniffing + either mark the HTML as non-cacheable and write WebP links, or use UA sniffing on the images themselves
Owner

heynemann commented Feb 21, 2013

Thanks a lot Ilya for shedding some light on the subject. I think we can add WebP support to thumbor in two different ways:

  • If the image request comes with a content-type that specifies webp or the .webp extension we serve webp, no matter what (support for this configurable in thumbor.conf).
  • If the image request is for a jpeg image and the accepts header specifies that webp is supported, we serve webp (also configurable in thumbor.conf).

This is as far as I think thumbor will go. As Ilya has eloquently said, the task of deciding whether to serve webp will most of the time end up at the edge (Varnish here at globo.com, maybe nginx or apache elsewhere).

What do you guys think?

I think there is a small overlap in the two scenarios you described:

  • The client can send an Accept header with image/webp. If such is present, you could transcode the image and serve WebP
  • If the request specifically asks for .webp as the filename extension then serve webp

Both of the above should probably be configurable (on/off). Then with these two mechanisms in place, you have all the right tools to adjust to different deployment types.

(Note that there is small caveat in case 2.. which is that the request asks for .webp, but what is the original format? :-) ... This may either require convention, like "all original images are .jpeg's or .png's", or some extra metadata in the URL itself (ex: image.jpeg.webp). Not saying the latter is pretty, but it would work)
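
The "extra metadata in the URL" convention (`image.jpeg.webp`) could be parsed along these lines. This is a sketch only: `split_forced_format` and `FORCED_EXTENSIONS` are hypothetical names, and requiring a second extension is just one possible way to tell a forced format apart from an original `.webp` file.

```python
from typing import Optional, Tuple

# Assumed set of "forced output" extensions; only .webp comes from the discussion.
FORCED_EXTENSIONS = {".webp": "webp"}


def split_forced_format(path: str) -> Tuple[str, Optional[str]]:
    """Split 'image.jpeg.webp' into ('image.jpeg', 'webp').

    Paths without a second, forcing extension are returned unchanged
    with None as the format, so 'image.webp' stays a plain source path.
    """
    for ext, fmt in FORCED_EXTENSIONS.items():
        if path.lower().endswith(ext):
            source = path[: -len(ext)]
            # Require that what remains still looks like a file with an extension.
            name = source.rsplit("/", 1)[-1]
            if "." in name.strip("."):
                return source, fmt
    return path, None
```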

Aren't these also two separate things: WebP support and image transcoding?

What if someone wanted to prefer to transcode everything to png? If that's
in the accept header or format extension it would be the same too.

With the accept header is it worth having a priority order Thumbor will go
through when looking at the client request header?

Even with older clients that don't support WebP there are benefits in
consistent encoding and compression levels.

I sincerely hope no-one would actually transcode everything to png.. that would be painful. :-)

But yeah, having a "force this format" option is a good thing to have.

True, it was more of an example: we're coupling the choice of transcoding
order with the WebP support discussion here.

Hey all. A slight tangent, but may be of interest: https://plus.google.com/events/crnuusj7lb8dshbjmb5gs33k5e0

Tuesday, March 5th @ 1PM PST -- it'll also be on YouTube afterwards. Let me know if there are particular questions or topics you want us to cover.

@heynemann heynemann added a commit that referenced this issue Jul 2, 2013

@heynemann heynemann Partial support to using WebP (#140). Added support for "format" filter.
Only PIL engine supports the webp format and some tests are failing for the other engines.
Work still needs to be done to level all the engines.
a8e2c42
Owner

heynemann commented Jul 2, 2013

Just to clarify that last commit: currently thumbor is not working with Pillow 2.1.0. We'll be working to fix that and use the latest possible version.

Other than that, webp support is not there yet, mainly due to ICC Profile support. We even realized Chrome does not read ICC Profiles (only canary does right now). WebP quality results with thumbor might vary depending on the original image.

We are working with @igrigorik and the awesome team at Google to fix these issues and fully support webp in thumbor the way it should be supported.

👍

Owner

heynemann commented Jul 4, 2013

Working with @cezarsa, we created a pull request for the Python Imaging Library that enables support for metadata in it (python-imaging/Pillow#271).

This, combined with the support of filter:format(webp) in thumbor as of the last release, will enable thumbor users to use webp properly. I think we may now close this ticket. Anyone want to add something?

Owner

heynemann commented Jul 5, 2013

The pull request has been merged. It will be in the next release of Pillow. As soon as they release it we can change the version in thumbor and be done with this. YAY for WebP Support. We'll start adoption here at globo.com. I'll get a blog post up about our experience with it.

Thanks a lot for everyone's help. You guys rock! Thanks @igrigorik for helping connect with the right people. It made a world of difference.

@heynemann heynemann closed this Jul 5, 2013

Awesome - keep me posted on your progress!

estahn commented Nov 19, 2013

When will webp be available through ppa?

estahn commented Dec 2, 2013

@heynemann I just clarified this with CloudFront Support. The Accept and Vary headers are not supported. We're currently evaluating other ideas for how to implement webp. Any ideas are welcome.

Hi Enrico,

At the moment the Accept/Vary header are not supported.

here you find the link to our documentation:
Content Negotiation

The only acceptable value for the Vary header is Accept-Encoding. CloudFront ignores other values.

http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/RequestAndResponseBehaviorCustomOrigin.html#ResponseCustomContentNegotiation

CloudFront forwards requests that have the Accept-Encoding field values "identity" and "gzip". For more information, see Serving Compressed Files.

http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/RequestAndResponseBehaviorCustomOrigin.html.

I have submitted that as a feature request to the cloudfront service team but I can not commit if the functionality will be added or when it will be added.

I kindly ask you to check our site with the new feature releases:
http://aws.amazon.com/about-aws/whats-new/

Best regards,

Vincenzo T.
Amazon Web Services

Owner

heynemann commented Dec 2, 2013

@estahn it's a little hard for us, then.

The reason we rely on the Accept header is that it's future-proof. Once the other browsers decide to support WebP, they'll report it via Accept and you won't need a deploy in order to reap the benefits (besides being the "right" way IMHO).

Do you have any suggestions on how to do this? The way we were doing it at globo.com before Chrome sent the Accept header is something like this:

a) Serve a draft jpg image with 10% quality;
b) Load a base64 webp image (1x1) and verify the width. If the browser reports the width as 1px, then it can process webp;
c) If it can process webp, we use a high-quality thumbor URL that forces the format to be webp (using the format filter);
d) Otherwise we load a high-quality jpeg image.

Do you think that works for your scenario?

estahn commented Dec 2, 2013

@heynemann I understand and really would like to use content negotiation.

In the workflow you describe, how do you differentiate between .jpg and .webp URLs? The URLs must be changed in the backend somehow based on some value, e.g. a cookie, because the URLs are signed and you can't just add the format filter, correct?

Options we came up with:

  • Content Negotiation (not an option with CloudFront atm)
  • Use client side detection via JS (e.g. webp.js) and rename URL's

The latter has the problem that URLs are signed and can't easily be changed. One of the options we thought of was to sign the URL only partially, e.g. leaving the extension unsigned. This would require changes in thumbor, which I would rather avoid. The solution we came up with will require a change in NGINX.

Example:

  • JPEG URL: .../test.jpg
  • WEBP URL: .../test.jpg.webp

The first URL is generated as usual. The client side detects somehow that WEBP is supported and adds .webp at the end of the URL. NGINX will rewrite paths ending in .webp and add the required content negotiation header.

Thoughts?

Enrico

Owner

heynemann commented Dec 2, 2013

The way we do it is by pre-generating both urls and using something like this:

<img src="http://thumbor-server/filters:quality(10)/some/image.jpg" data-jpeg="http://thumbor-server/some/image.jpg" data-webp="http://thumbor-server/filters:format(webp)/some/image.jpg" />

Then at runtime you just change src to either data-jpeg or data-webp.
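
Pre-generating the three URLs on the server could be sketched like this. It is an illustration only: `build_img_tag` is a hypothetical helper, the URL shapes simply mirror the ones in this comment, and URL signing (which a real thumbor deployment would normally apply to each URL) is left out.

```python
from html import escape


def build_img_tag(server: str, image_path: str) -> str:
    """Render an <img> with a low-quality src plus JPEG and WebP candidates."""
    base = server.rstrip("/")
    path = image_path.lstrip("/")
    draft = f"{base}/filters:quality(10)/{path}"   # 10% quality placeholder
    jpeg = f"{base}/{path}"                        # full-quality JPEG
    webp = f"{base}/filters:format(webp)/{path}"   # forced WebP via the format filter
    return (
        f'<img src="{escape(draft, quote=True)}" '
        f'data-jpeg="{escape(jpeg, quote=True)}" '
        f'data-webp="{escape(webp, quote=True)}" />'
    )
```

The client-side script then only has to copy `data-webp` or `data-jpeg` into `src`, so no URL needs to be rewritten (or re-signed) in the browser.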

Does that work for you?

estahn commented Dec 2, 2013

@heynemann Ok, that's what i thought. We considered this and it will work too. I don't know why we came up with the changes for NGINX.
