
new_audit(preconnect): add preconnect audit #4362

Merged: 20 commits from feature/preconnect into master on Apr 26, 2018
Conversation

@wardpeet (Collaborator) commented Jan 25, 2018

  • Filter out same-origin requests
  • Filter out all records requested by the main resource
  • Filter out all non-http(s) records (URLs with no origin)
  • Filter out already connected origins (connection time is 0)
  • Filter out all records that took place more than 10s after the main resource
  • Added unit tests
  • Added smoke tests

[screenshot of the audit output]
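
In rough code, the filtering described above might look like this (a minimal sketch, not the audit's final code; `networkRecords`, `mainResource`, `URL.getOrigin`, and the timing fields are the names used in the diffs below):

```js
// Minimal sketch of the filter chain described above.
// Assumes `networkRecords`, `mainResource`, and a `URL.getOrigin` helper in scope.
const PRECONNECT_SOCKET_MAX_IDLE = 10; // Chrome keeps an unused socket around ~10s

const candidates = networkRecords
  // same-origin requests don't need a preconnect hint
  .filter(record => URL.getOrigin(record.url) !== URL.getOrigin(mainResource.url))
  // skip the document itself and requests it initiated directly
  .filter(record => record !== mainResource && record.initiatorRequest() !== mainResource)
  // skip non-http(s) records, e.g. data: URIs, which have no origin
  .filter(record => !!URL.getOrigin(record.url))
  // skip origins that were already connected (connection time is 0)
  .filter(record => record.timing.connectEnd - record.timing.connectStart > 0)
  // skip requests starting >10s after the main resource; the socket would have idled out
  .filter(record => record._startTime - mainResource._endTime < PRECONNECT_SOCKET_MAX_IDLE);
```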

@paulirish (Member) left a review on Jan 31, 2018:

Details here: #3106 (comment)

basically let's do the first half of #4425 and then the two checkboxes below (exclude same origin, and depth===1). Then we're mostly good here.

@addyosmani addyosmani added this to the Sprint Once: March 12 - 23 milestone Mar 16, 2018
@wardpeet wardpeet changed the title WIP: new-audit(preconnect): Add preconnect audit new-audit(preconnect): Add preconnect audit Mar 22, 2018
@wardpeet wardpeet changed the title new-audit(preconnect): Add preconnect audit new-audit(preconnect): add preconnect audit Mar 22, 2018
const requestDelay = record._startTime - mainResource._endTime;
const recordOrigin = URL.getOrigin(record.url);

if (preconnectResults[recordOrigin]) {
wardpeet (author):

better way to find unique origins?

Member:

IMO use a Set for preconnectResults instead

Collaborator:

yeah I might flip this around a bit ward

  1. create a set of all origins
  2. filter out the bad origins
  3. for each origin, find the earliest record that meets the other criteria
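
A rough sketch of that three-step reordering (`meetsOtherCriteria` is a placeholder for the remaining filters, not an actual helper in the PR):

```js
// 1. create a set of all candidate origins
const origins = new Set(
  networkRecords
    .map(record => URL.getOrigin(record.url))
    // 2. filter out the bad origins: no origin at all, or same origin as the page
    .filter(origin => origin && origin !== URL.getOrigin(mainResource.url))
);

// 3. for each origin, find the earliest record that meets the other criteria
const results = [];
for (const origin of origins) {
  const earliest = networkRecords
    .filter(record => URL.getOrigin(record.url) === origin)
    .filter(record => meetsOtherCriteria(record)) // placeholder for the other checks
    .sort((a, b) => a._startTime - b._startTime)[0];
  if (earliest) results.push(earliest);
}
```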


// filter out all resources that are not loaded by the document
.filter(record => {
return record.initiatorRequest() !== mainResource && record !== mainResource;
// filter out urls that do not have an origin
Member:

seems like L62 would already have taken care of these.

// filter out urls that do not have an origin
}).filter(record => {
return !!URL.getOrigin(record.url);
// filter out all resources where origins are already resolved
Member:

since we're only displaying origins and not full request URLs, I feel like this is already taken care of by reduction down to unique origins below.

(The assumption being that connect is always a cost for new origins... which seems a reasonable assumption. right @patrickhulce ?)

Collaborator:

yeah but he's trying to find the first representative network record for the origin so you do need to filter out the network records that were already connected
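
For context, the "already connected" check from the PR description (connection time is 0) might look roughly like this; the audit's actual `hasAlreadyConnectedToOrigin` helper may differ:

```js
// Sketch: treat a record as already connected when Chrome reports no DNS or
// TCP connection time for it, i.e. an existing socket was reused.
function hasAlreadyConnectedToOrigin(record) {
  const timing = record.timing || {};
  return timing.dnsEnd - timing.dnsStart === 0 &&
         timing.connectEnd - timing.connectStart === 0;
}
```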

const Util = require('../report/v2/renderer/util');
const UnusedBytes = require('./byte-efficiency/byte-efficiency-audit');
const URL = require('../lib/url-shim');
const PRECONNECT_SOCKET_MAX_IDLE = 10;
Member:

Can you add some comments?

Preconnect establishes a "clean" socket. Chrome's socket manager will keep an unused socket
around for 10s. Meaning, the time delta between processing the preconnect and the actual request should be <10s, otherwise it's wasted.
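
Applied to the constant, that comment might end up looking something like this (illustrative):

```js
// Preconnect establishes a "clean" socket. Chrome's socket manager keeps an
// unused socket around for ~10s, so a preconnect only pays off if the request
// to that origin starts within that window; otherwise the socket is wasted.
const PRECONNECT_SOCKET_MAX_IDLE = 10; // seconds
```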

Member:

also I think we could actually use 12-15 for our threshold, tbh. we'd include slightly more connections and there's a decent chance in real loads, the preconnect would be used.

}

// make sure the requests are below the PRECONNECT_SOCKET_MAX_IDLE (10s) mark
if (Math.max(0, requestDelay) < PRECONNECT_SOCKET_MAX_IDLE) {
Member:

style wise it'd probably be better to check the inverse and return, like the other filters.

that way at the end we take everything that made it through the gauntlet and add to the set.
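
A sketch of that inversion, assuming it sits inside the per-record callback with `mainResource`, `URL.getOrigin`, and a `preconnectOrigins` set in scope (names are illustrative):

```js
networkRecords.forEach(record => {
  const requestDelay = record._startTime - mainResource._endTime;

  // bail out early, like the other filters: a preconnected socket idles out
  // after PRECONNECT_SOCKET_MAX_IDLE (10s), so later requests wouldn't reuse it
  if (Math.max(0, requestDelay) >= PRECONNECT_SOCKET_MAX_IDLE) {
    return;
  }

  // everything that made it through the gauntlet gets added to the set
  preconnectOrigins.add(URL.getOrigin(record.url));
});
```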

});

const headings = [
{key: 'url', itemType: 'url', text: 'URL'},
Member:

text: 'Origin'

Member:

based on your screenshot I'd say our url renderer is a little awkward in handling these origin-only URLs.

let's fix that. :)
we can check for a pathname of / and, if that's the case, not separate the two... probably want most of that in Util.parseURL and small adjustments in the _renderTextURL
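
The idea might be sketched like this (illustrative only; the real change would live in `Util.parseURL` and `_renderTextURL`):

```js
// When the pathname is just '/', treat the URL as origin-only and don't split
// it into a separate file + hostname for display.
function parseURLForDisplay(url) {
  const parsed = new URL(url);
  if (parsed.pathname === '/') {
    return {file: parsed.origin, hostname: ''};
  }
  return {file: parsed.pathname, hostname: parsed.hostname};
}
```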

@patrickhulce (Collaborator) left a review:

nice work! excited to get this one in 👍

' before an HTTP request is actually sent to the server. This will reduce multiple, ' +
`costly round trips to any origin. [Learn more](${learnMoreUrl}).`,
requiredArtifacts: ['devtoolsLogs'],
scoringMode: Audit.SCORING_MODES.NUMERIC,
Collaborator:

scoring display mode

*/
static get meta() {
return {
category: 'Performance',
Collaborator:

no more category :)

return !!URL.getOrigin(record.url);
// filter out all resources where origins are already resolved
}).filter(record => {
return !UsesRelPreconnectAudit.hasAlreadyConnectedToOrigin(record);
Collaborator:

might want to explicitly call out invalid timing information as well, i.e. if dnsStart/etc are not finite positive numbers

wardpeet (author):

call out? Do you mean log to sentry?

Collaborator:

oh sorry, call out just in a comment with an extra check :)

I meant, "let's also filter requests with invalid timing information"


});

const results = Object.values(preconnectResults).map(record => {
const wastedMs = record.timing.connectEnd - record.timing.dnsStart;
Collaborator:

we'll want to Math.min this with the delta between connectStart and main document resource end
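
In other words, something along these lines (a fragment assuming `record` and `mainResource` in scope, with `_startTime`/`_endTime` in seconds and `timing` in milliseconds, matching the diff later in the thread):

```js
// the raw connection cost for this origin
const connectionTime = record.timing.connectEnd - record.timing.dnsStart;
// the connection can only be moved back as far as the end of the main document
const timeBetweenMainResourceAndConnectStart =
  record._startTime * 1000 - mainResource._endTime * 1000 + record.timing.connectStart;
// cap the savings at whichever is smaller
const wastedMs = Math.min(connectionTime, timeBetweenMainResourceAndConnectStart);
```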


@wardpeet (author)

@paulirish @patrickhulce
I think I have all your review comments done, not sure if I implemented them like you guys wanted 🙄.

I'm not really sure about the displayUrl helpers though...

Also, smoke tests are failing on Travis as it's such a low number. Could this be a Docker issue? That host has these origins cached; should I just check for the presence of items then?

@patrickhulce (Collaborator) left a review:

review #️⃣2️⃣!

description: 'Avoid multiple, costly round trips to any origin',
informative: true,
helpText:
'Consider using<link rel="preconnect dns-prefetch"> to set up early connections ' +
Collaborator:

@paulirish may have opinions here too, but WDYT about?

Consider adding preconnect or dns-prefetch resource hints to establish early connections to important third-party origins. Learn more.

wardpeet (author) commented Mar 27, 2018:

fine by me, most of the time I just copy what you guys say 😄 but I think <link> tags shouldn't be in the description like you describe


const origins = networkRecords
.filter(record => {
// filter out all resources that have the same origin
Collaborator:

nit: move this comment down a line like the others :)

const results = [];
preconnectOrigins.forEach(origin => {
const records = networkRecords.filter(record => URL.getOrigin(record.url) === origin);
if (!records.length) {
Collaborator:

this seems impossible, you didn't run across this IRL did you?

seems like the same checks from above also belong in this spot though?


// Sometimes requests are made simultaneously and the connection has not been made yet;
// Chrome will try to connect for each network record, so we take the first record
let firstRecordOfOrigin;
Collaborator:

@paulirish is this an acceptable occasion for reduce in your 👀 ? :)

this works too

firstRecordOfOrigin._startTime * 1000 -
mainResource._endTime * 1000 +
firstRecordOfOrigin.timing.connectStart;
// calculate delta between connectionTime and timeToConnectionStart from main resource
Collaborator:

this comment is a bit confusing, didn't we already compute the delta and the line below is ensuring we just cap the connectionTime savings?

@@ -110,7 +110,7 @@ class CriticalRequestChainRenderer {
const {file, hostname} = Util.parseURL(segment.node.request.url);
const treevalEl = dom.find('.crc-node__tree-value', chainsEl);
dom.find('.crc-node__tree-file', treevalEl).textContent = `${file}`;
dom.find('.crc-node__tree-hostname', treevalEl).textContent = `(${hostname})`;
dom.find('.crc-node__tree-hostname', treevalEl).textContent = hostname ? `(${hostname})` : '';
Collaborator:

nice find :) if it doesn't have a hostname though should we be showing it? what type of request is that, data URI?

wardpeet (author):

no, it comes back from parseUrl where a request might be the root, so I don't return a hostname.

it actually looks weird, so I have to do the origin display inside the report and not in parseUrl
[screenshot]

wardpeet (author):

Updated to:
critical request chain:
[screenshot]

preconnect table:
[screenshot]

url: 'https://www.example.com/',
_endTime: 1,
};
const mainResourceRecord = {
Collaborator:

why do these two need to be different? :)

wardpeet (author):

good question 😛

assert.equal(extendedInfo.value.length, 2);
assert.deepStrictEqual(extendedInfo.value, [
{url: 'https://cdn.example.com', wastedMs: Util.formatMilliseconds(200)},
{url: 'https://othercdn.example.com', wastedMs: Util.formatMilliseconds(400)},
Collaborator:

would you mind explaining this result? seems like it should be min of 500 (600 - 100) and 300 (1000 * (1.2 - 1) + 100)

wardpeet (author):

looking at the code, I used connectStart where I should have used dnsStart 🤦‍♂️

const connectionTime = 600 - 100;
const timeBetweenMainResourceAndConnectStart =
  1.2 * 1000 -
  1 * 1000 +
  200;
// calculate delta between connectionTime and timeToConnectionStart from main resource
const wastedMs = Math.min(500, 400);

Collaborator:

ah yeah I missed that too :) seems like timeBetweenMainResourceAndDnsStart should fix it 👍 (or fallback to connect start I guess if dns start is invalid, is that a real case?)
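
With the fixture values discussed in this thread (dnsStart = 100, connectStart = 200, connectEnd = 600, record start at 1.2s, main resource end at 1s), the dnsStart-based version works out as follows (a sketch of the arithmetic, not the test code):

```js
const connectionTime = 600 - 100; // connectEnd - dnsStart = 500
const timeBetweenMainResourceAndDnsStart =
  1.2 * 1000 - 1 * 1000 + 100;    // record start - main resource end + dnsStart = 300
const wastedMs = Math.min(connectionTime, timeBetweenMainResourceAndDnsStart); // 300
```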

@patrickhulce (Collaborator) left a review:

few more comments sorry for the delay!

@@ -7,3 +7,6 @@
/* eslint-disable */

document.write('<script src="level-2.js?delay=500"></script>')

// load another origin
fetch('http://localhost:10503/preload.html', () => {});
Collaborator:

nit: do we even need the callback?

wardpeet (author):

not sure why I even added it as it returns a promise

const PRECONNECT_SOCKET_MAX_IDLE = 15;

const learnMoreUrl =
'https://developers.google.com/web/fundamentals/performance/resource-prioritization#preconnect';
Collaborator:

as @brendankenny pointed out to me we can throw this inline without violating our max-len since it's a URL 👍

.filter(record => {
return (
// filter out all resources that have the same origin
!URL.originsMatch(mainResource.url, record.url) &&
Collaborator:

IIRC new URL is actually pretty expensive and can take seconds in the worst case, so let's reorder for some cheaper ones first? maybe just sticking initiator request and hasAlreadyConnected first is enough

wardpeet (author):

should we use webinspector parsedurl instead?
mainResource.parsedURL.securityOrigin() === record.parsedURL.securityOrigin()

Collaborator:

yeah even better! 👍
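
Combining the two suggestions, the filter might start roughly like this (a partial sketch using names from the diff; the remaining checks such as origin presence, already-connected, and request delay would follow in the same chain):

```js
const candidates = networkRecords.filter(record => {
  return (
    // cheap checks first: skip the main document and anything it initiated directly
    record !== mainResource &&
    record.initiatorRequest() !== mainResource &&
    // skip origins Chrome had already connected to
    !UsesRelPreconnectAudit.hasAlreadyConnectedToOrigin(record) &&
    // compare origins via the already-parsed URL instead of constructing `new URL`
    mainResource.parsedURL.securityOrigin() !== record.parsedURL.securityOrigin()
  );
});
```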

// filter out urls that do not have an origin (data, ...)
!!URL.getOrigin(record.url) &&
// filter out all resources where origins are already resolved
!UsesRelPreconnectAudit.hasAlreadyConnectedToOrigin(record) &&
Collaborator:

seems like we'll need a corresponding filter to make sure the timing information is valid

static hasValidTiming(record) {
  return record.timing && record.timing.connectEnd > 0 && record.timing.connectStart > 0;
}

// filter out all resources where timing info was invalid
!UsesRelPreconnect...

in most cases it'll be the same, but we've seen some weird edge cases in other environments where timing is just nonsense negative numbers or undefined

maxWasted = Math.max(wastedMs, maxWasted);
results.push({
url: new URL(firstRecordOfOrigin.url).origin,
wastedMs: Util.formatMilliseconds(wastedMs),
Collaborator:

let's keep this a number and mark the wastedMs as a type: 'ms'

{key: 'wastedMs', itemType: 'ms', granularity: 1, text: 'Potential Savings'}
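
Putting the two halves of the suggestion together (illustrative; `firstRecordOfOrigin`, `results`, and the earlier "Origin" heading suggestion come from this review):

```js
// push the raw number; the renderer formats it based on the column's itemType
results.push({
  url: new URL(firstRecordOfOrigin.url).origin,
  wastedMs, // a number, not Util.formatMilliseconds(wastedMs)
});

const headings = [
  {key: 'url', itemType: 'url', text: 'Origin'},
  {key: 'wastedMs', itemType: 'ms', granularity: 1, text: 'Potential Savings'},
];
```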

@@ -96,8 +96,8 @@ class DetailsRenderer {
let title;
try {
const parsed = Util.parseURL(url);
displayedPath = parsed.file;
displayedHost = `(${parsed.hostname})`;
displayedPath = parsed.file === '/' ? parsed.origin :parsed.file;
Collaborator:

nit: space after colon : parsed.file

@GoogleChrome GoogleChrome deleted a comment from patrickhulce Apr 12, 2018
@wardpeet (author)

@patrickhulce @paulirish @brendankenny
fixed all comments

@patrickhulce (Collaborator) left a review:

LGTM % nits! nice work @wardpeet!

const preconnectOrigins = new Set(origins);
const results = [];
preconnectOrigins.forEach(origin => {
const records = networkRecords.filter(record => URL.getOrigin(record.url) === origin);
Collaborator:

I think we still want to double check that we're only looking at records with valid timing, also let's use parsedURL.securityOrigin to be consistent with above
could probably combine this with the forEach below too and do it in one pass :)
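
A sketch of that combined single pass, assuming the `preconnectOrigins` set from above and the `hasValidTiming` helper suggested earlier in the review:

```js
// group candidate records by security origin in one pass, keeping only records
// with valid timing, and remember the earliest record per origin
const earliestRecordByOrigin = new Map();
networkRecords.forEach(record => {
  const origin = record.parsedURL.securityOrigin();
  if (!preconnectOrigins.has(origin)) return;
  if (!UsesRelPreconnectAudit.hasValidTiming(record)) return;

  const current = earliestRecordByOrigin.get(origin);
  if (!current || record._startTime < current._startTime) {
    earliestRecordByOrigin.set(origin, record);
  }
});
```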

name = name.slice(0, MAX_LENGTH - 1 - (name.length - dotIndex)) +
// Show file extension
`${ELLIPSIS}${name.slice(dotIndex)}`;
name =
Collaborator:

think we can revert some of this diff noise?

@paulirish paulirish changed the title new-audit(preconnect): add preconnect audit new_audit(preconnect): add preconnect audit Apr 25, 2018
@paulirish (Member)

btw for whatever reason 2/3rds of my last review comments are collapsed but they are unaddressed so far. Probably because I commented on that specific commit, and there was one following it. shrug.

@paulirish (Member)

lgtm. ready to merge when green.

@paulirish paulirish merged commit 1176a63 into master Apr 26, 2018
@paulirish paulirish deleted the feature/preconnect branch April 26, 2018 22:04
@addyosmani (Member)

\o/ Thanks for the diligent reviews getting this audit firmed up, @paulirish, @patrickhulce, and @brendankenny. Big thanks for all your work getting this audit landed, @wardpeet. High-fives!
