Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Audit attribution from third-party wrapping #7005

Open
nicjansma opened this Issue Jan 14, 2019 · 3 comments

Comments

Projects
None yet
3 participants
@nicjansma
Copy link

commented Jan 14, 2019

Feature request summary

We are a third-party analytics provider that provides Real User Monitoring (RUM) and other features such as JavaScript Error Tracking through a JavaScript library our customers include on their pages called Boomerang.

Lately, we've had several customers raise support issues over their concerns that our JavaScript library is causing excessive JavaScript CPU usage, according to Lighthouse/PSI reports.

Here's an example:

image

Upon investigation, we found that our Error Tracking plugin was causing Lighthouse to be confused about work being done in our customer's app. The plugin wraps several high-level functions, and the callbacks of, setTimeout(), addEventListener() and others. It does this to ensure that the customer's application can get detailed error messages (instead of the generic Script error. message) when errors are generated from scripts in other origins. More details are here, and other error tracking libraries do this wrapping as well.

Because Boomerang is wrapping the callbacks in setTimeout()/etc, any function in any file that calls setTimeout() will be attributed to Boomerang instead, because Boomerang's wrapped callback is at the "bottom" of the stack. As you can imagine, this accounts for a lot of the JavaScript CPU time on many sites. The actual work done in the wrapped function is minimal (but non-zero), but the time of the entire callback is attributed to Boomerang instead of the "real" work being done. Our understanding is the "URL" of the JavaScript boot-up time /etc tables are based on the function/URL at the bottom of the stack.

If we were to disable Error Tracking and re-load the same page, we can see the real attribution moves a lot to the Page and other scripts (as well as Boomerang), but the "cost" of Boomerang has decreased from 770ms to 127ms:

image

We were hoping there would be a way to change it so these wrapping functions don't get blamed for the entire callback cost. One idea would be to have a list of files or functions that could be excluded from "blame" if at the bottom of the stack, similar to library blackboxing.

We're not sure if that's feasible though:

  • We'd still want the non-wrapping code in Boomerang to be judged based on the work it does, of course.
  • We'd like for our customers's Lighthouse/PSI runs to have this exclusion applied without them having to make any changes or configure any lists (can this be built-into Lighthouse?)
  • Each of our customers load our third-party library from distinct URLs

The only other alternates we've come up with is better education (explaining the issue to the customer, which we're doing now), or disabling Boomerang's wrapping if Lighthouse is detected (which we'd prefer to not do).

I have a repro page here, which merely does a setTimeout() that busy-loops for 2 seconds:

image

image

I'm willing to work on this change, if there's agreement on a correct approach.

What is the motivation or use case for changing this?

To accurately reflect JavaScript CPU cost when third-party libraries are in the callstack.

How is this beneficial to Lighthouse?

More accurate JavaScript CPU attribution.

@patrickhulce

This comment has been minimized.

Copy link
Collaborator

commented Jan 14, 2019

Thanks so much for the detailed information and willingness to work with us @nicjansma !

I agree we should be trying to attribute these costs more accurately in Lighthouse core (and therefore PSI, etc) without needing the user to do any opting-in to a plugin/config/options.

Let's start with where the cost should be going and follow up with how we're not quite getting there yet.

There are a couple ways to look at this.

  1. We attribute time to each script according to the aggregate selfTime of all its executions. i.e. if we have jQuery on the page, then jQuery would receive the attribution for all the users' calls to $.whatever. This is typically what a developer wants to see when working on identifying bottlenecks during a performance deep dive. However, this is probably not what we want to surface in Lighthouse. This somewhat presumes that the calls to functions are fixed and we're trying to have the largest overall impact improving things, but we're trying to surface what scripts instigated the most work. Let's take a look at number 2.
  2. We attribute time to each script according to the aggregate totalTime of all tasks it initiated. i.e. if we have jQuery on the page, then jQuery would receive attribution for its initial installation and any handlers, setTimeouts, or additional work it defers, but not for any calls to jQuery functions from other scripts. This is what we're trying to achieve, but you've identified a hole where we are attributing to the first URL that executed in the timeout instead of the URL that scheduled the timeout.

Does that sound right to everyone?

Assuming it does, fixing this is a bit tricky for a first-time contribution.

We will need to bring the same logic we have for tracking timer installations over in page-dependency-graph.js and bring it into the way we compute attributeableURLs in main-thread-tasks.js

case 'TimerInstall':
// @ts-ignore - 'TimerInstall' event means timerId exists.
timers.set(evt.args.data.timerId, node);
stackTraceUrls.forEach(url => addDependencyOnUrl(node, url));
break;
case 'TimerFire': {
// @ts-ignore - 'TimerFire' event means timerId exists.
const installer = timers.get(evt.args.data.timerId);
if (!installer) break;
installer.addDependent(node);
break;
}

switch (task.event.name) {
/**
* Some trace events reference a specific script URL that triggered them.
* Use this URL as the higher precedence attributable URL.
* @see https://cs.chromium.org/chromium/src/third_party/blink/renderer/devtools/front_end/timeline/TimelineUIUtils.js?type=cs&q=_initEventStyles+-f:out+f:devtools&sq=package:chromium&g=0&l=678-744
*/
case 'v8.compile':
case 'EvaluateScript':
case 'FunctionCall':
taskURLs = [argsData.url].concat(stackFrameURLs);
break;
case 'v8.compileModule':
taskURLs = [task.event.args.fileName].concat(stackFrameURLs);
break;
default:
taskURLs = stackFrameURLs;
break;
}

@nicjansma

This comment has been minimized.

Copy link
Author

commented Feb 8, 2019

Totally agree on the above thoughts @patrickhulce and #7059 looks like a great start. It looks like that may solve our customer-facing issue (with boomerang on the page), though possibly not the specific repro case above, correct?

Anything I can look into to help?

@patrickhulce

This comment has been minimized.

Copy link
Collaborator

commented Feb 8, 2019

That is correct, #7059 should solve your customer-facing issue 👍

I think it just needs to be reviewed so no further action needed on your part!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.