feat(DBW): add audit for detecting unoptimized images #1452

patrickhulce · 2017-01-12T00:13:29Z

Addresses one of several concerns around image bloat and is another step towards PSI parity (#909).

Gathers all same-origin and data URI images from the network records and runs them through canvas to determine their stripped-down JPEG and WebP sizes.

The audit will fail if any of these conditions is met:

* There is at least one JPEG or bitmap image that was larger than its canvas-encoded JPEG.
* There is at least one image that would have saved more than 50KB by using WebP.
* The savings of moving all images to WebP is greater than 100KB.
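
A minimal sketch of how the three conditions above could be combined. The function name, the shape of the `images` array, and the choice of 1024 bytes per KB are assumptions made for illustration; they are not taken from the PR's actual audit code.

```javascript
// Hypothetical thresholds based on the description above (assuming 1KB = 1024 bytes).
const WEBP_SINGLE_IMAGE_THRESHOLD = 50 * 1024;
const WEBP_TOTAL_THRESHOLD = 100 * 1024;

/**
 * Hypothetical helper: decides whether the audit should fail.
 * @param {!Array<{mimeType: string, originalSize: number, jpegSize: number, webpSize: number}>} images
 * @return {boolean} true when at least one failure condition is met
 */
function shouldFailAudit(images) {
  let totalWebpSavings = 0;
  let failed = false;

  for (const image of images) {
    // Condition 1: a JPEG or bitmap that is larger than its canvas-encoded JPEG.
    const isJpegOrBitmap = /image\/(jpeg|bmp)/.test(image.mimeType);
    if (isJpegOrBitmap && image.originalSize > image.jpegSize) failed = true;

    // Condition 2: a single image that would save more than 50KB as WebP.
    const webpSavings = Math.max(image.originalSize - image.webpSize, 0);
    if (webpSavings > WEBP_SINGLE_IMAGE_THRESHOLD) failed = true;

    totalWebpSavings += webpSavings;
  }

  // Condition 3: total WebP savings across all images exceed 100KB.
  return failed || totalWebpSavings > WEBP_TOTAL_THRESHOLD;
}
```

Negative savings are clamped to zero here so that an image which grows under WebP does not offset savings from other images; that clamping is also an assumption.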

patrickhulce · 2017-01-12T00:15:10Z

Some open questions:

* What's the right number for both of the byte cutoffs?
* What's the right number for the quality to use when rendering with canvas?
* Should we expand to support crossorigin in a future PR?
* Should we still report all of the WebP savings when it is small (current behavior)?

paulirish · 2017-01-12T04:04:43Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+    function getTypeStats(type, quality) {
+      const dataURI = canvas.toDataURL(type, quality);
+      const base64 = dataURI.slice(dataURI.indexOf(',') + 1);
+      return {base64: base64.length, binary: atob(base64).length};


instead of atob you can use canvasElem.toBlob and then just check the blob.size

Is there a reason to avoid atob? toBlob is async no?

It's sync: https://developers.google.com/web/updates/2016/03/canvas-toblob-in-chrome-50

🤔 that looks async to me

just wondering if there's a benefit I'm missing. Double nested callback just seems to add some awkwardness.

I'm sorry. You're totally right. They changed it after years in the making and I refused to believe it :) https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/toBlob

You could do a promise wrapper.

Generally, data urls add about 33% overhead to the filesize. If a page has a lot of images that might eat up some memory but I'm not sure it will be a huge issue. A benefit of passing a blob around would be that the caller can read it in many formats (binary string, data url, etc) using the FileReader API.

Haha no worries :) Yeah, I'm actually calculating the data URI size here on purpose (though I could first check whether the image was a data URI to begin with and skip the calculation if not), on the assumption that if you're shipping an image as a data URI, you'd likely send the optimized version as one as well. I suppose avoiding large data URIs should really be an audit by itself too though 😄
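
For reference, the promise wrapper suggested in this thread could look roughly like the sketch below. The name `canvasToBlob` is hypothetical, and this is not what the PR ended up using (it kept `toDataURL` plus `atob`); it only shows how the callback-based `toBlob` can be flattened to avoid nested callbacks.

```javascript
/**
 * Hypothetical helper: wraps the callback-based canvas.toBlob in a Promise.
 * Intended to run in the browser, where `canvas` is an HTMLCanvasElement.
 * @param {{toBlob: function(function(?Blob), string, number)}} canvas
 * @param {string} type MIME type such as 'image/jpeg' or 'image/webp'
 * @param {number} quality value between 0 and 1
 * @return {!Promise<!Blob>}
 */
function canvasToBlob(canvas, type, quality) {
  return new Promise((resolve, reject) => {
    canvas.toBlob(blob => {
      // toBlob passes null to the callback when the image cannot be created.
      if (blob) resolve(blob);
      else reject(new Error(`Unable to encode canvas as ${type}`));
    }, type, quality);
  });
}
```

A caller would then read the encoded size with something like `canvasToBlob(canvas, 'image/webp', 0.8).then(blob => blob.size)`.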

patrickhulce · 2017-01-12T18:08:44Z

Another open question is how to best present the information in the report. Here's what it looks like right now with JPEG savings being shown only if the image was JPEG or bitmap to begin with and could save by recompressing at q=0.8

brendankenny

looking good. Initial review pass

brendankenny · 2017-01-12T23:24:16Z

lighthouse-cli/test/fixtures/dobetterweb/dbw_tester.html

@@ -102,6 +102,10 @@
  </style>
 </template>

+<template id="unoptimized-images-tmpl">


are you going to do anything with this in the expectations file? Also, it is a very big addition to this file :)

should I check-in the image separately instead? And yes, haha, done :)

brendankenny · 2017-01-12T23:55:16Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+ */
+
+ /**
+  * @fileoverview Determines optimized jpeg/webp filesizes for all same-origin and dataURI images.


same origin makes sense (those are the images you have control over), but warning developers that a cross origin image is hurting them is also helpful. What does PSI do?

would also be useful if this had a description of how this is calculated, especially as we aren't claiming this is 100% foolproof, just an ongoing best effort. Could put the description on top of or in getOptimizedNumBytes, but might be more apparent here at the top.

Uses a canvas in the target browser to re-encode each image at 80% etc etc

I was also thinking we should go x-origin for this one. Images are such a big, wasteful part of apps. 60% or whatever. And a lot of people use images hosted off a CDN. That will be x-origin.

Yeah, my main concern was that we wouldn't be processing CDN images at all, and a CDN is what most people who make some effort use. The issue I ran into was that canvas couldn't deal with them, and adding crossorigin='anonymous' didn't necessarily fix it. Fine to investigate in a future PR?

Right, crossorigin='anonymous'. File a bug so we don't forget!

done! #1467

brendankenny · 2017-01-12T23:56:53Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+/* global document */
+
+/* istanbul ignore next */
+function getOptimizedNumBytes(url) {


should indicate in comment description that this runs in the browser

brendankenny · 2017-01-13T00:01:42Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+class OptimizedImages extends Gatherer {
+  /**
+   * @param {string} pageUrl
+   * @param {NetworkRecords} networkRecords


{!NetworkRecords}

brendankenny · 2017-01-13T00:02:24Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+  /**
+   * @param {string} pageUrl
+   * @param {NetworkRecords} networkRecords
+   * @return {!Array.<!{url: string, isBase64DataUri: boolean, mimeType: string, resourceSize: number}>}


FWIW, structural types aren't default nullable (and you don't need the .), so you can do @return {!Array<{url: string...}>} here

oh cool, done

brendankenny · 2017-01-13T00:03:40Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+  }
+
+  /**
+   * @param {Object} driver


{!Object}

brendankenny · 2017-01-13T00:09:46Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+
+  /**
+   * @param {Object} driver
+   * @param {!{url: string, isBase64DataUri: boolean, resourceSize: number}} networkRecord


can drop the !

brendankenny · 2017-01-13T00:09:56Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+  /**
+   * @param {Object} driver
+   * @param {!{url: string, isBase64DataUri: boolean, resourceSize: number}} networkRecord
+   * @return {!Promise.<!{originalSize: number, jpegSize: number, webpSize: number}>}


can drop the second ! and the .

brendankenny · 2017-01-13T00:25:02Z

lighthouse-core/audits/dobetterweb/uses-optimized-images.js

+        The following images could have smaller file sizes when compressed with
+        [WebP](https://developers.google.com/speed/webp/) or JPEG at 80 quality.
+        [Learn more about image optimization](https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/image-optimization).
+      `.trim(),


this still has newlines that only get ignored in HTML. We should probably use the same style of single string per line with + concat as in other audits anyways

yeah :/ where is the helpText used outside HTML?

ebidel · 2017-01-13T01:53:45Z

lighthouse-core/audits/dobetterweb/uses-optimized-images.js

+   * @return {{bytes: number, kb: number, percent: number}}
+   */
+  static computeSavings(image, type) {
+    const bytes = image.originalSize - image[type + 'Size'];


is there ever a risk for an increase?

Yes, but there are checks in the audit logic to see if this is a positive number before reporting
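
A sketch of the kind of guard described in this reply. Only the first line of `computeSavings` appears in the diff; the rounding of `kb` and `percent` and the `getReportableSavings` helper are assumptions for illustration.

```javascript
/**
 * @param {{originalSize: number, jpegSize: number, webpSize: number}} image
 * @param {string} type 'jpeg' or 'webp'
 * @return {{bytes: number, kb: number, percent: number}}
 */
function computeSavings(image, type) {
  const bytes = image.originalSize - image[type + 'Size'];
  // Rounding choices below are assumed, not taken from the PR.
  const kb = Math.round(bytes / 1024);
  const percent = Math.round(100 * bytes / image.originalSize);
  return {bytes, kb, percent};
}

/**
 * Hypothetical helper: returns savings only when re-encoding shrinks the image.
 * @return {?{bytes: number, kb: number, percent: number}}
 */
function getReportableSavings(image, type) {
  const savings = computeSavings(image, type);
  // A re-encoded image can be larger than the original; never report that as savings.
  return savings.bytes > 0 ? savings : null;
}
```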

ebidel · 2017-01-13T01:57:14Z

lighthouse-core/audits/dobetterweb/uses-optimized-images.js

+   * @return {string}
+   */
+  static getUrl(image) {
+    return image.isBase64DataUri ? image.url.slice(0, 80) + '...' : image.url;


could we use the brand new URL.getDisplayName to do the pretty name?

ebidel · 2017-01-13T01:58:31Z

lighthouse-core/config/default.json

-        "unused-css-rules": {
-          "expectedValue": false
-        }
+        "unused-css-rules": {},


the first brave soul to omit expectedValue :)

ebidel · 2017-01-13T01:59:45Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+ */
+
+ /**
+  * @fileoverview Determines optimized jpeg/webp filesizes for all same-origin and dataURI images.


I was also thinking we should go x-origin for this one. Images are such a big, wasteful part of apps. 60% or whatever. And a lot of people use images hosted off a CDN. That will be x-origin.

ebidel · 2017-01-13T02:01:07Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+        canvas.width = img.width;
+        context.drawImage(img, 0, 0);
+
+        const jpeg = getTypeStats('image/jpeg', 0.8);


Just a thought for @WeiweiAtGit. Might be interesting to be able to adjust the quality value and re-run with perf-x. Maybe make 0.8 a property that can be configured?

ebidel · 2017-01-13T03:25:51Z

lighthouse-core/audits/dobetterweb/uses-optimized-images.js

+    let debugString;
+    if (failedImages.length) {
+      const urls = failedImages.map(image => UsesOptimizedImages.getUrl(image));
+      debugString = `Lighthouse was unable to parse some of your images: ${urls.join(', ')}`;


This is failed requests for images? I might say something other than "parse".

No it's usually canvas failing because the image was bad/not something we could read. Decode probably makes more sense here?

Or "Lighthouse was unable to determine if some of your images could be optimized:.."

ebidel · 2017-01-13T16:32:49Z

lighthouse-core/audits/dobetterweb/uses-optimized-images.js

+    }
+
+    return UsesOptimizedImages.generateAuditResult({
+      displayValue, debugString,


can we newline just for clarity?

ebidel · 2017-01-13T16:34:40Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+      return {base64: base64.length, binary: atob(base64).length};
+    }
+
+    document.body.appendChild(canvas);


Haven't used canvas in a while. It needs to be in the DOM for this to work?

no that was me testing that it didn't, haha I'll remove

ebidel · 2017-01-13T16:39:46Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+    return driver.evaluateAsync(script).then(stats => {
+      const isBase64DataUri = networkRecord.isBase64DataUri;
+      return {
+        originalSize: isBase64DataUri ? networkRecord.url.length : networkRecord.resourceSize,


if it's a data url, should we use the full length (data:..;base64,....), as you have, or just the file content part?

Ah yes good catch
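
One way to measure only the file content part, mirroring how `getTypeStats` slices the re-encoded data URI after the first comma. The helper name is hypothetical and this is a sketch rather than the fix that landed.

```javascript
/**
 * Hypothetical helper: length of a data URI's payload, excluding the
 * "data:<mime>;base64," prefix.
 * @param {string} url
 * @return {number}
 */
function getDataUriPayloadLength(url) {
  const commaIndex = url.indexOf(',');
  // Without a comma there is no prefix to strip, so fall back to the full length.
  return commaIndex === -1 ? url.length : url.length - commaIndex - 1;
}
```

Comparing this value against the `base64` length of the canvas output keeps both sides of the comparison free of the prefix.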

ebidel · 2017-01-13T16:45:50Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+    return networkRecords.reduce((prev, record) => {
+      const isOptimizableImage = /image\/(png|bmp|jpeg)/.test(record._mimeType);
+      const isSameOrigin = URL.hostsMatch(pageUrl, record._url);
+      const isBase64DataUri = /^data.{2,40}base64/.test(record._url);


more specific: /^data:.{2,40};base64,/

Should include record._mimeType in the regex?

done, are there cases where a URL could begin with ^data: and not be a data URI? don't want to make it brittle/less readable with a bunch of whitespace matchers if it doesn't matter

Not that I'm aware of.
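
To illustrate the difference between the two patterns discussed above, here is a small sketch. The constant and function names are made up for the example, and the sample strings are illustrative rather than real network records.

```javascript
// Original pattern from the diff and the stricter one suggested in review.
const LOOSE_DATA_URI = /^data.{2,40}base64/;
const STRICT_DATA_URI = /^data:.{2,40};base64,/;

/**
 * Hypothetical helper using the stricter pattern.
 * @param {string} url
 * @return {boolean}
 */
function isBase64DataUri(url) {
  return STRICT_DATA_URI.test(url);
}

// A real base64 data URI matches both patterns.
const dataUri = 'data:image/png;base64,iVBORw0KGgo=';
// A path-like string that only the loose pattern accepts.
const lookalike = 'data-images/base64/foo.png';

console.log(LOOSE_DATA_URI.test(dataUri), isBase64DataUri(dataUri));
console.log(LOOSE_DATA_URI.test(lookalike), isBase64DataUri(lookalike));
```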

ebidel · 2017-01-13T16:59:07Z

> What's the right number for the quality to use when rendering with canvas?

IIRC, 0.8 is the default for both encodings when you use tools. So 👍 to what you have.

> Should we expand to support crossorigin in a future PR?

As mentioned, I think it would be more useful to include crossorigin images now.

> Should we still report all of the WebP savings when it is small (current behavior)?

An improvement is an improvement, but we can see how noisy it gets. Mapping total bytes transferred to a RUM metric like FMP time might be good in the future. That way, it's not just "you can save x bytes", but "users will see your site Xms faster".

ebidel · 2017-01-13T21:13:46Z

lighthouse-core/gather/gatherers/dobetterweb/optimized-images.js

+      return {base64: base64.length, binary: atob(base64).length};
+    }
+
+    img.addEventListener('load', () => {


Is there any value in using a img.addEventListener('error'.. to catch failed img requests? Is there something for failed image requests that I'm not seeing?

Well the underlying assumption here was that because we're picking which images to check based on the network requests and mimetype it would already be cached and 🤞 guaranteed to load, but I'll add an explicit error check
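
A rough sketch of what an explicit error check could look like, written as a promise so a failed load surfaces as a rejection. The function name is hypothetical, and the actual change in the follow-up commit may be structured differently.

```javascript
/**
 * Hypothetical helper: resolves once the image has loaded, rejects if it fails.
 * In the browser, `img` would be an HTMLImageElement (e.g. `new Image()`).
 * @param {{addEventListener: function(string, function()), src: string}} img
 * @param {string} url
 * @return {!Promise}
 */
function waitForImage(img, url) {
  return new Promise((resolve, reject) => {
    img.addEventListener('load', () => resolve(img));
    img.addEventListener('error', () => reject(new Error(`Failed to load ${url}`)));
    // Attach listeners before setting src so a cached image cannot fire first.
    img.src = url;
  });
}
```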

brendankenny

LGTM2
📣📏🖼

ebidel added the DoBetterWeb label Jan 12, 2017

paulirish reviewed Jan 12, 2017

ignore more small images

9bb0453

brendankenny requested changes Jan 13, 2017

ebidel suggested changes Jan 13, 2017

patrickhulce added 4 commits January 13, 2017 10:06

feedback

29d862d

Merge remote-tracking branch 'origin/master' into optimized-images

dc912dc

more feedback

cf311f5

fixes

0c5367d

ebidel reviewed Jan 13, 2017

add error check for img load

1ade32e

patrickhulce mentioned this pull request Jan 13, 2017

Gatherer: Add support for cross-origin image optimization checks #1467

Closed

ebidel approved these changes Jan 13, 2017

brendankenny approved these changes Jan 13, 2017

brendankenny merged commit d49244e into master Jan 13, 2017

brendankenny deleted the optimized-images branch January 13, 2017 23:54

brendankenny mentioned this pull request Jan 13, 2017

Audits Meta: Feature parity with PSI, MFT, DevTools #909

Closed

16 tasks
