
core(lantern): improve RTT estimates #4552

Merged 4 commits into master on Feb 22, 2018

Conversation

@patrickhulce (Collaborator):

changes

  • estimates the observed RTT of each origin and takes this into account when simulating
  • estimates the server response time of each origin using all available records, instead of just the records involved in the graph (prior behavior responsible for 1st bug in list below)
  • ignores observed connectionId from Chrome when simulating and creates its own set of connections instead
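The per-origin RTT estimation in the first bullet could be sketched roughly like this. This is a simplified illustration, not the actual NetworkAnalyzer code; the record fields (`url`, `timeToFirstByteMs`) and the min-TTFB heuristic are assumptions made for the example:

```javascript
// Hypothetical sketch: estimate RTT per origin from observed network records.
// NOT the real NetworkAnalyzer implementation; record shape is assumed.
function estimateRTTByOrigin(records) {
  const rttByOrigin = new Map();
  for (const record of records) {
    const origin = new URL(record.url).origin;
    // The smallest observed time-to-first-byte for an origin is a rough
    // upper bound on that origin's round-trip time.
    const current = rttByOrigin.get(origin);
    if (current === undefined || record.timeToFirstByteMs < current) {
      rttByOrigin.set(origin, record.timeToFirstByteMs);
    }
  }
  return rttByOrigin;
}
```

The key point of the change is that the estimate is keyed by origin, so the simulator can charge a different latency to each server it talks to.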

consequences

  • fixes a bug where the pessimistic estimate could be smaller than the optimistic estimate due to different connections being used
  • should substantially improve accuracy observed in LR when the connectionIds are all 0
  • reduces absolute estimate error by ~4% across the board (25% error -> 21% error, not 25% -> 24% 😉)
  • marginally increases rank correlation on FCP, marginally decreases rank correlation on FMP, no impact to rank correlation of TTI

@patrickhulce patrickhulce changed the title core(predictive-perf): improve RTT estimates core(lantern): improve RTT estimates Feb 15, 2018
@paulirish (Member) left a comment:

> ignores observed connectionId from Chrome when simulating and creates its own set of connections instead

In the non-LR case, we have the original data. So can we see how well our simulation did against observed connectionIds?


as an aside, think now's a good time to get type-checking enabled for lib/dependency-graph?

    });

    const rttSummaries = Array.from(NetworkAnalyzer.estimateRTTByOrigin(records).entries());
    const rttByOrigin = new Map(rttSummaries.map(item => [item[0], item[1].min]));
@paulirish (Member):

can we destructure item here and on 93/97? would help keeping track of what each obj is.
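The destructured form being asked for would look something like this (sample data invented for illustration):

```javascript
// Destructuring the [origin, summary] pairs makes each element's role
// explicit, compared to the opaque item[0]/item[1] indexing.
// Sample data below is invented for the example.
const rttSummaries = [
  ['https://example.com', {min: 50, avg: 80, median: 70}],
  ['https://cdn.example.com', {min: 30, avg: 45, median: 40}],
];
const rttByOrigin = new Map(
  rttSummaries.map(([origin, summary]) => [origin, summary.min])
);
```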

@patrickhulce (Collaborator, author):

yeah I cleaned this whole function up


    const rttSummaries = Array.from(NetworkAnalyzer.estimateRTTByOrigin(records).entries());
    const rttByOrigin = new Map(rttSummaries.map(item => [item[0], item[1].min]));
    const responseTimeSummaries = NetworkAnalyzer.estimateServerResponseTimeByOrigin(records, {
@paulirish (Member):

you can move this down to before line 96 and then structurally we have the RTT and response time guys as the two chunks here. maybe a comment or two, too?

@patrickhulce (Collaborator, author):

yeah I cleaned this whole function up

     * @param {!Node} dependencyGraph
     * @return {!Object}
     */
    static computeOptions(dependencyGraph) {
@paulirish (Member):

i appreciate the thought you put into the names in this file, but i think we can upgrade this one a bit.
computeRTTandSR ?

@patrickhulce (Collaborator, author):

sg done

    -const estimate = new LoadSimulator(graphs[key]).simulate();
     const lastLongTaskEnd = PredictivePerf.getLastLongTaskEndTime(estimate.nodeTiming);
    +const estimate = new LoadSimulator(graphs[key], options).simulate();
    +const longTaskThreshold = /optimistic/.test(key) ? 100 : 50;
@paulirish (Member):

startsWith?
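i.e. swapping the regex test for `String.prototype.startsWith`. A minimal sketch (the helper function name is made up for illustration):

```javascript
// Equivalent to /optimistic/.test(key) whenever 'optimistic' only ever
// appears as a prefix of the graph key, and it reads more directly.
// Hypothetical helper name, not from the PR.
function longTaskThresholdFor(key) {
  return key.startsWith('optimistic') ? 100 : 50;
}
```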

@patrickhulce (Collaborator, author):

done

    if (node.type === Node.TYPES.NETWORK) records.push(node.record);
    });

    const rttSummaries = Array.from(NetworkAnalyzer.estimateRTTByOrigin(records).entries());
@paulirish (Member):

later on you use this pattern, which i think is a lot more readable:

      for (const [origin, summary] of rttByOrigin.entries()) {
        rttByOrigin.set(origin, summary.min);
      }

does take more lines, but worth it IMO

@patrickhulce (Collaborator, author):

yeah cleaned this whole thing up

@patrickhulce (Collaborator, author):

> In the non-LR case, we have the original data. So can we see how well our simulation did against observed connectionIds?

yes there's even a test for that ;)

    it('should approximate well with either method', () => {
      return computedArtifacts.requestNetworkRecords(devtoolsLog).then(records => {
        const result = NetworkAnalyzer.estimateRTTByOrigin(records).get(NetworkAnalyzer.SUMMARY);
        const resultApprox = NetworkAnalyzer.estimateRTTByOrigin(records, {
          forceCoarseEstimates: true,
        }).get(NetworkAnalyzer.SUMMARY);
        assertCloseEnough(result.min, resultApprox.min, 20);
        assertCloseEnough(result.avg, resultApprox.avg, 30);
        assertCloseEnough(result.median, resultApprox.median, 30);
      });
    });

during simulation, though, this is a slightly different concern: essentially, the connection IDs that were used during the fast load aren't necessarily the ones you want to force when simulating a very small subset. i.e. we happened to pick 3 assets that all shared the same connection ID when loading alongside 100 other resources, but they obviously would've been put onto different connections when loaded alone. In this sense the simulator becomes a little more rational/browser-like.
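A rough sketch of what "creates its own set of connections" could look like: a pool that hands out simulated connections per origin up to a browser-like limit, ignoring observed connectionIds entirely. The class name, API, and the 6-connections-per-origin default are illustrative assumptions, not the actual Lighthouse simulator code:

```javascript
// Hypothetical per-origin connection pool for a load simulator.
// Chrome caps HTTP/1.1 at roughly 6 connections per origin, which is the
// assumption behind the default limit here.
class ConnectionPool {
  constructor(maxConnectionsPerOrigin = 6) {
    this.maxPerOrigin = maxConnectionsPerOrigin;
    this.byOrigin = new Map();
  }

  // Returns a free or newly created connection for `origin`,
  // or null if the origin is already at its connection limit.
  acquire(origin) {
    const connections = this.byOrigin.get(origin) || [];
    const free = connections.find(c => !c.inUse);
    if (free) {
      free.inUse = true;
      return free;
    }
    if (connections.length < this.maxPerOrigin) {
      const conn = {id: `${origin}:${connections.length}`, inUse: true};
      connections.push(conn);
      this.byOrigin.set(origin, connections);
      return conn;
    }
    return null; // request must wait for a connection to free up
  }

  release(conn) {
    conn.inUse = false;
  }
}
```

The payoff is that even when the observed connectionIds are useless (e.g. all 0 in LR), the simulator can still assign requests to connections the way a browser plausibly would.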

> as an aside, think now's a good time to get type-checking enabled for lib/dependency-graph?

yes, but the error list was ~200 lines long so I'll save that for a followup :)

@paulirish (Member):

nice cleanup! thxx


> In the non-LR case, we have the original data. So can we see how well our simulation did against observed connectionIds?

> yes there's even a test for that ;)

nice nice. looked at the 'should approximate well with either method' test and the results.

    min     diff: 11.084   result:  2.621   approx: 13.705
    avg     diff: 28.577   result: 10.712   approx: 39.29
    median  diff: 10.237   result:  3.666   approx: 13.903

the gap between the result and approx is pretty big for all of these. I would have expected <= 20% difference. i'm not sure how we should quantify whether these are reasonable approximations; what do you think?

@patrickhulce (Collaborator, author) commented Feb 22, 2018:

> the gap between the result and approx is pretty big for all of these. I would have expected <= 20% difference. i'm not sure how we should quantify whether these are reasonable approximations; what do you think?

a ~10ms gap is about as good as I could possibly hope for, given we're essentially trying to blindly guess what % of a ~100ms request was due to RTT and what was server response time

it was more impressive when run on traces that actually had some latency (40ms vs. 50ms isn't such a big deal), but all our existing ones were pretty slim. want me to add some beefy ones to the fixtures from the throttled dataset?

also note that it will skew toward overestimating crazy-low RTTs (~5ms is roughly the fastest 0.3% of connections 😛) and underestimating crazy-high RTTs, since the penalty for guessing too big gets really huge

@paulirish paulirish merged commit 8e32c19 into master Feb 22, 2018
@paulirish paulirish deleted the baller_lantern_rtt_estimates branch February 22, 2018 23:12