
core(bootup-time): better attribution #7059

Merged (4 commits, Feb 28, 2019)

Conversation

@patrickhulce (Collaborator) commented Jan 18, 2019

Summary
There are a couple of things going on here, all sparked by #7005. The bootup-time audit is somewhat inconsistent because it's pulling double duty as two different things:

  1. It flags the amount of time spent loading due to script execution.
  2. It flags the CPU time hogged by the responsible script URL.

These are different because a script can force a layout. It's not necessarily fair to call that layout time "script execution", since the browser is doing the work, but it's also not fair to ignore the cost of layouts forced by third-party scripts when saying "these are your problem scripts". Any ideas to solve this problem are welcome :)
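For example (a minimal sketch, not from the audit itself), a script can invalidate styles and then read a layout property, forcing the browser to do synchronous layout work that shows up outside "script evaluation":

// Minimal sketch of a script forcing layout (hypothetical, not from the audit).
// Writing a style invalidates layout; immediately reading offsetHeight forces
// the browser to recompute layout synchronously. In a trace, that time shows
// up as Layout work caused by this script, not as Script Evaluation.
const box = document.createElement('div');
document.body.appendChild(box);
box.style.width = '500px';            // invalidate layout
const height = box.offsetHeight;      // forced synchronous layout
console.log('forced layout, height =', height);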

Now on to the PR changes...

  1. Script executions that are the result of a setTimeout are now assigned to the script that installed the timer rather than to the code that executed during the timer. This solves part of Performance Audit attribution from third-party wrapping #7005 (see the sketch after this list).
  2. If we cannot find a URL to attribute the execution cost to, we put it into an "Other" bucket. While working on a solution to Performance Audit attribution from third-party wrapping #7005 for XHR/fetch callbacks, I found that because async stacks are off, a script execution with a clear URL frequently loses its stack trace and becomes URL-less. Previously, all such work was disappearing from our totals.
  3. The display value of the audit is now the sum of all script execution + script parse time. With the creation of the "Other" bucket, summing all execution time makes the total exactly the same as "Main thread breakdown".
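As a rough sketch of change 1 (file names hypothetical, not from the PR): work done inside a timer callback now rolls up to the script that called setTimeout rather than to whatever code happens to run inside the callback.

// vendor.js (hypothetical third-party wrapper): installs the timer.
window.deferredInit = fn => setTimeout(fn, 0);

// app.js (hypothetical first-party code): runs inside vendor.js's timer.
window.deferredInit(() => {
  // Under this PR, the TimerFire task containing this work is attributed to
  // vendor.js, the URL that scheduled the timeout, instead of being assigned
  // to app.js (or dropped entirely when the stack trace is missing).
  let total = 0;
  for (let i = 0; i < 1e6; i++) total += i; // stand-in for expensive setup
  console.log(total);
});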

Future things to address:

  • Turning on async stacks should enable us to better attribute the 'Other' category.
  • We aren't utilizing all of the URLs we could be for attribution. Leveraging frame data and scriptId could help here.
  • How do we handle legit inline script executions? Right now we prefer to find the first script file that executed, because nearly all executions have the main page as the first "attributable URL" (obviously everything started because the page asked for it one way or another). This also means, though, that for work where the page's inline script is legitimately to blame, we still blame the first script file that executed. i.e. this PR does not actually solve the specific repro example provided in Performance Audit attribution from third-party wrapping #7005.

Related Issues/PRs
#7005

@connorjclark (Collaborator)

Turning on async stacks should enable us to better attribute the 'Other' category.

Could you talk about why this is currently not enabled for LH? Were there performance concerns?

The next V8 release will do that by default. They claim zero cost, so perhaps the performance issues have been resolved?

@patrickhulce (Collaborator, Author)

Could you talk about why this is currently not enabled for LH? Were there performance concerns?

There were performance boogeyman concerns from enabling the debugger category. I never found statistically significant changes though.

Next v8 release will do that by default. They claim zero-cost, so perhaps performance issues have been resolved?

This is awesome news!!! Enables more accurate lantern estimates, better attribution, huzzah!!!! :D 🎉 🎉 🎉 🎉

@paulirish (Member)

In order to satisfy #7005 (and to generally improve bootup-time), I think we need to handle attribution a little differently. ⛵️ First, changes 2 and 3 above SGTM.. I only have a concern with the new timer attribution.

Example scenario:

jQuery(document).ready(function() {
  myapp.initialize();
});

jquery.js owns the event handler so it will get all the self-time attribution, even though the cost of this function is over in myapp.js. :/

Basically anytime the flame chart has a different color at the top vs the bottom, our current method leads to inaccurate results.

We have avoided turning on the JS sampling profiler 'till now, but I think that's the correct move here.


Here's Nic's test page with JS sampling off; see that all bottom-up time is associated with the start of the JS stack.
[screenshot: bottom-up view with JS sampling off]

And again, but with JS sampling on. Correct self-time association now:
[screenshot: bottom-up view with JS sampling on]

@patrickhulce (Collaborator, Author)

Hm, I'm not sure we're on the same page with that example @paulirish. myapp.js should still receive attribution in that case, because while the jquery.js file may contain the code that installed the handler, the attributable URL that installed the handler is still myapp.js.

If jquery.js installed its own jQuery(document).ready listeners during its script evaluation, then jquery.js would receive attribution. The only edge case that remains after this is when the jQuery(document).ready call occurs inline in the page, in which case jquery.js gets mistakenly blamed.

@patrickhulce (Collaborator, Author)

I'm definitely not against enabling the sampling profiler though if we can use DZL to prove it's OK :D better fidelity is better fidelity and I won't fight that!

@paulirish (Member)

Okay, slightly different example:

+------------------------------------------+
|       jQuery DCL handler (100ms)         |
+-------------------+----------------------+
|    jquery fns     |     myapp fns        |
+-------------------+----------------------+

I'd expect to see jquery.js get 50ms and myapp to get 50ms. Right?

@patrickhulce (Collaborator, Author)

Ah, I see what you're saying now.

Yeah, we don't split tasks at all right now. It's a bit messy to try to half-do self-time though, don't you think?

In

+------------------------------------------+
|               Myapp (100ms)              |
+-------------------+----------------------+
|    myapp fns      |     jquery fns       |
+-------------------+----------------------+

I want myapp to get full blame for all of this work. From #7005 these were my thoughts:

Two Main Options I see:

  • We attribute time to each script according to the aggregate selfTime of all its executions. i.e. if we have jQuery on the page, then jQuery would receive the attribution for all the users' calls to $.whatever. This is typically what a developer wants to see when identifying bottlenecks during a performance deep dive. However, it's probably not what we want to surface in Lighthouse: it somewhat presumes the calls to those functions are fixed and we're trying to have the largest overall impact by optimizing them, whereas we're trying to surface which scripts instigated the most work. Let's take a look at number 2.
  • We attribute time to each script according to the aggregate totalTime of all tasks it initiated. i.e. if we have jQuery on the page, then jQuery would receive attribution for its initial installation and any handlers, setTimeouts, or additional work it defers, but not for any calls to jQuery functions from other scripts. This is what we're trying to achieve, but there's currently a hole where we attribute to the first URL that executed in the timeout instead of the URL that scheduled the timeout (see the sketch below).
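A toy sketch of the difference between the two options (numbers and URLs hypothetical):

// One hypothetical 100ms handler that jquery.js installed, with 40ms of
// self-time in jquery.js functions and 60ms in myapp.js functions that
// jQuery called on other scripts' behalf.
const task = {
  attributableUrl: 'jquery.js',              // URL that installed the handler
  totalTime: 100,
  selfTimeByUrl: {'jquery.js': 40, 'myapp.js': 60},
};

// Option 1: attribute aggregate selfTime to each script's own URL.
console.log({...task.selfTimeByUrl});        // { 'jquery.js': 40, 'myapp.js': 60 }

// Option 2: attribute the whole task to the script that initiated it.
console.log({[task.attributableUrl]: task.totalTime});   // { 'jquery.js': 100 }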

@patrickhulce (Collaborator, Author) commented Feb 8, 2019

Basically, I'm saying it's really difficult to detect that a library installed its own handler while it was being initialized and is then letting other scripts piggyback on that same exact handler instead of registering new ones once it's invoked.

As a result, I'd rather sacrifice this case to still be able to attribute to the other instigators.
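For illustration, a minimal sketch of that piggybacking pattern (not jQuery's actual implementation): the library installs a single handler during its own evaluation, and later callers just push onto its queue, so from the trace the entire handler looks like the library's work.

// library.js (hypothetical): installs one DOMContentLoaded handler while
// being evaluated, and keeps a queue that other scripts can push onto.
const readyQueue = [];
document.addEventListener('DOMContentLoaded', () => {
  // Every queued callback runs inside this single handler, which the
  // library's URL installed, so the library gets blamed for all of it.
  for (const cb of readyQueue) cb();
});
function onReady(cb) {
  readyQueue.push(cb);
}

// myapp.js (hypothetical): piggybacks on the library's existing handler
// instead of registering its own listener.
onReady(() => console.log('myapp init'));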

@patrickhulce (Collaborator, Author)

@paulirish we settled on this being a good step forward, right? LGTY?

@paulirish (Member)

we settled on this being a good step forward, right? LGTY?

yes
