-
Notifications
You must be signed in to change notification settings - Fork 9.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(DBW): add audit for detecting unoptimized images #1452
Conversation
Addresses one of several concerns around image bloat and another step towards PSI parity (#909). Gathers all sameorigin or data URI images off of the network and runs them through canvas to determine a stripped down JPEG and WebP size. Audit will fail if one of the conditions are met: * There is at least one JPEG or bitmap image that was larger than canvas encoded JPEG. * There is at least one image that would have saved more than 50KB by using WebP. * The savings of moving all images to WebP is greater than 100KB.
Some open questions:
|
function getTypeStats(type, quality) { | ||
const dataURI = canvas.toDataURL(type, quality); | ||
const base64 = dataURI.slice(dataURI.indexOf(',') + 1); | ||
return {base64: base64.length, binary: atob(base64).length}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of atob
you can use canvasElem.toBlob
and then just check the blob.size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to avoid atob
? toBlob
is async no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 that looks async to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just wondering if there's a benefit I'm missing. Double nested callback just seems to add some awkwardness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry. You're totally right. They changed it after years in the making and I refused to believe it :) https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/toBlob
You could do a promise wrapper.
Generally, data urls add about 33% overhead to the filesize. If a page has a lot of images that might eat up some memory but I'm not sure it will be a huge issue. A benefit of passing a blob around would be that the caller can read it in many formats (binary string, data url, etc) using the FileReader API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha no worries :) Yeah I'm actually purposefully calculating the data URI size here (though I could check first to see if it wasn't a data URI to begin with and skip it) so that I can compare it under the assumption if you're shipping an image as a data URI you'd likely send the optimized version as one as well. I suppose avoiding large data URI's should really be an audit by itself too though 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking good. Initial review pass
@@ -102,6 +102,10 @@ | |||
</style> | |||
</template> | |||
|
|||
<template id="unoptimized-images-tmpl"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you going to do anything with this in the expectations file? Also, it is a very big addition to this file :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should I check-in the image separately instead? And yes, haha, done :)
*/ | ||
|
||
/** | ||
* @fileoverview Determines optimized jpeg/webp filesizes for all same-origin and dataURI images. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same origin makes sense (those are the images you have control over), but warning developers that a cross origin image is hurting them is also helpful. What does PSI do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would also be useful if this had a description of how this is calculated, especially as we aren't claiming this is 100% foolproof, just an ongoing best effort. Could put the description on top of or in getOptimizedNumBytes
, but might be more apparent here at the top.
Uses a canvas in the target browser to re-encode each image at 80% etc etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking we should go x-origin for this one. Images are such a big, wasteful part of apps. 60% or whatever. And a lot of people use images hosted off a CDN. That will be x-origin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah my main concern was not processing CDN images at all which is most people who make some effort. The issue I ran into was canvas not dealing with them and adding crossorigin='anonymous'
didn't necessarily fix. Fine to investigate in future PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, crossorigin='anonymous'
. File a bug so we don't forget!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done! #1467
/* global document */ | ||
|
||
/* istanbul ignore next */ | ||
function getOptimizedNumBytes(url) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should indicate in comment description that this runs in the browser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
class OptimizedImages extends Gatherer { | ||
/** | ||
* @param {string} pageUrl | ||
* @param {NetworkRecords} networkRecords |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{!NetworkRecords}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
/** | ||
* @param {string} pageUrl | ||
* @param {NetworkRecords} networkRecords | ||
* @return {!Array.<!{url: string, isBase64DataUri: boolean, mimeType: string, resourceSize: number}>} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, structural types aren't default nullable (and you don't need the .
), so you can do @return {!Array<{url: string...}>}
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh cool, done
} | ||
|
||
/** | ||
* @param {Object} driver |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{!Object}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
/** | ||
* @param {Object} driver | ||
* @param {!{url: string, isBase64DataUri: boolean, resourceSize: number}} networkRecord |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can drop the !
/** | ||
* @param {Object} driver | ||
* @param {!{url: string, isBase64DataUri: boolean, resourceSize: number}} networkRecord | ||
* @return {!Promise.<!{originalSize: number, jpegSize: number, webpSize: number}>} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can drop the second !
and the .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
The following images could have smaller file sizes when compressed with | ||
[WebP](https://developers.google.com/speed/webp/) or JPEG at 80 quality. | ||
[Learn more about image optimization](https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/image-optimization). | ||
`.trim(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this still has newlines that only get ignored in HTML. We should probably use the same style of single string per line with + concat as in other audits anyways
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah :/ where is the helpText used outside HTML?
* @return {{bytes: number, kb: number, percent: number}} | ||
*/ | ||
static computeSavings(image, type) { | ||
const bytes = image.originalSize - image[type + 'Size']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there ever a risk for an increase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but there are checks in the audit logic to see if this is a positive number before reporting
* @return {string} | ||
*/ | ||
static getUrl(image) { | ||
return image.isBase64DataUri ? image.url.slice(0, 80) + '...' : image.url; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we use the brand new URL.getDisplayName
to do the pretty name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
"unused-css-rules": { | ||
"expectedValue": false | ||
} | ||
"unused-css-rules": {}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the first brave soul to omit expectedValue :)
*/ | ||
|
||
/** | ||
* @fileoverview Determines optimized jpeg/webp filesizes for all same-origin and dataURI images. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking we should go x-origin for this one. Images are such a big, wasteful part of apps. 60% or whatever. And a lot of people use images hosted off a CDN. That will be x-origin.
canvas.width = img.width; | ||
context.drawImage(img, 0, 0); | ||
|
||
const jpeg = getTypeStats('image/jpeg', 0.8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a thought for @WeiweiAtGit. Might be interesting to be able to adjust the quality value and re-run with perf-x. Maybe make 0.8
a property that can be configured?
let debugString; | ||
if (failedImages.length) { | ||
const urls = failedImages.map(image => UsesOptimizedImages.getUrl(image)); | ||
debugString = `Lighthouse was unable to parse some of your images: ${urls.join(', ')}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is failed requests for images? I may say something other than "parse".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it's usually canvas failing because the image was bad/not something we could read. Decode probably makes more sense here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or "Lighthouse was unable to determine if some of your images could be optimized:.."
} | ||
|
||
return UsesOptimizedImages.generateAuditResult({ | ||
displayValue, debugString, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we newline just for clarity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
return {base64: base64.length, binary: atob(base64).length}; | ||
} | ||
|
||
document.body.appendChild(canvas); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't used canvas in a while. It needs to be in the DOM for this to work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no that was me testing that it didn't, haha I'll remove
return driver.evaluateAsync(script).then(stats => { | ||
const isBase64DataUri = networkRecord.isBase64DataUri; | ||
return { | ||
originalSize: isBase64DataUri ? networkRecord.url.length : networkRecord.resourceSize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it's a data url, should we use the full length (data:..;base64,....
) as you have) or just the file content part?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes good catch
return networkRecords.reduce((prev, record) => { | ||
const isOptimizableImage = /image\/(png|bmp|jpeg)/.test(record._mimeType); | ||
const isSameOrigin = URL.hostsMatch(pageUrl, record._url); | ||
const isBase64DataUri = /^data.{2,40}base64/.test(record._url); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more specific: /^data:.{2,40};base64,/
Should include record._mimeType
in the regex?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, are there cases where a URL could begin with ^data:
and not be a data URI? don't want to make it brittle/less readable with a bunch of whitespace matchers if it doesn't matter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that I'm aware of.
IIRC, 0.8 is the default for both encodings when you use tools. So 👍 to what you have.
As mentioned, I think it would be more useful to include crossorigin images now.
An improvement is an improvement, but we can see how noisy it gets. Mapping total bytes transferred to a RUM like FMP time might be good in the future. That way, it's not just "you can save x bytes", but "users will see your site Xms faster". |
return {base64: base64.length, binary: atob(base64).length}; | ||
} | ||
|
||
img.addEventListener('load', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any value in using a img.addEventListener('error'..
to catch failed img requests? Is there something for failed image requests that I'm not seeing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well the underlying assumption here was that because we're picking which images to check based on the network requests and mimetype it would already be cached and 🤞 guaranteed to load, but I'll add an explicit error check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM2
📣📏🖼
Addresses one of several concerns around image bloat and another step towards PSI parity (#909).
Gathers all sameorigin or data URI images off of the network and runs them through canvas to determine a stripped down JPEG and WebP size.
Audit will fail if one of the conditions are met: