-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
General purpose observer injection #14
Conversation
I'm still wondering if there is value in quantity over quality. Even lightweight, is it useful to know which properties are used most often? For some of the sensitive ones, like window.navigator, I could see there being more value in a more-expensive hook that gets the call stack and counts occurrences by script origin. |
I reduced the amount of noise in my local tests by renaming the WPT metric name to {
"call_stacks": {
"navigator.userAgent": []
},
"performance.clearMarks": 1,
"performance.clearMeasures": 1,
"performance.getEntriesByType": 5,
"performance.timing": 9,
"performance.timing.domInteractive": 1,
"performance.timing.domContentLoadedEventStart": 1,
"performance.timing.domContentLoadedEventEnd": 1,
"performance.timing.domComplete": 1,
"performance.timing.loadEventStart": 1,
"performance.timing.loadEventEnd": 1,
"Array.prototype.indexOf": 182,
"String.prototype.match": 50,
"String.prototype.substring": 20
} @pmeenan I assume there's not much we can do about those besides subtracting after the fact. |
@rviscomi we may be able to move the WPT metrics collection to be after running the custom metrics but that will open it up to a little risk of custom metrics breaking the WPT metrics (mostly marks/measures if a custom metric does anything destructive) |
For our private instance, how feasible would it be for WPT code to execute between toggling |
I moved things around a bit to minimize the amount of agent logic between the scripts being injected and the custom metrics. {
"call_stacks": {
"navigator.userAgent": []
},
"Array.prototype.indexOf": 182,
"String.prototype.match": 50,
"String.prototype.substring": 20
} Not sure where those uses are coming from but it's possible some of them are from the injected script itself. |
Added them to the props to trace 😄 const PROPERTIES_TO_TRACE = new Set([
'navigator.userAgent',
'String.prototype.substring',
'String.prototype.match',
'Array.prototype.indexOf'
]); {
"call_stacks": {
"navigator.userAgent": [],
"Array.prototype.indexOf": {
"Error\n at Array.get (<anonymous>:99:19)\n at WptAgentFlatten (<anonymous>:32:50)\n at WptAgentReportPerformanceTiming (<anonymous>:50:90)\n at PerformanceObserver.<anonymous> (<anonymous>:55:5)": 5,
"Error\n at Array.get (<anonymous>:99:19)\n at WptAgentFlatten (<anonymous>:16:75)\n at WptAgentReportPerformanceTiming (<anonymous>:50:90)\n at PerformanceObserver.<anonymous> (<anonymous>:55:5)": 1,
"Error\n at Array.get (<anonymous>:99:19)\n at WptAgentFlatten (<anonymous>:32:50)\n at WptAgentFlatten (<anonymous>:17:33)\n at WptAgentReportPerformanceTiming (<anonymous>:50:90)\n at PerformanceObserver.<anonymous> (<anonymous>:55:5)": 11,
"Error\n at Array.get (<anonymous>:99:19)\n at WptAgentFlatten (<anonymous>:16:75)\n at WptAgentFlatten (<anonymous>:17:33)\n at WptAgentReportPerformanceTiming (<anonymous>:50:90)\n at PerformanceObserver.<anonymous> (<anonymous>:55:5)": 165
},
"String.prototype.match": {
"Error\n at String.get (<anonymous>:99:19)\n at WptAgentFlatten (<anonymous>:40:22)\n at WptAgentReportPerformanceTiming (<anonymous>:50:90)\n at PerformanceObserver.<anonymous> (<anonymous>:55:5)": 9,
"Error\n at String.get (<anonymous>:99:19)\n at WptAgentFlatten (<anonymous>:40:22)\n at WptAgentFlatten (<anonymous>:17:33)\n at WptAgentReportPerformanceTiming (<anonymous>:50:90)\n at PerformanceObserver.<anonymous> (<anonymous>:55:5)": 41
},
"String.prototype.substring": {
"Error\n at String.get (<anonymous>:99:19)\n at WptAgentFlatten (<anonymous>:33:33)\n at WptAgentReportPerformanceTiming (<anonymous>:50:90)\n at PerformanceObserver.<anonymous> (<anonymous>:55:5)": 5,
"Error\n at String.get (<anonymous>:99:19)\n at WptAgentFlatten (<anonymous>:36:49)\n at WptAgentReportPerformanceTiming (<anonymous>:50:90)\n at PerformanceObserver.<anonymous> (<anonymous>:55:5)": 5,
"Error\n at String.get (<anonymous>:99:19)\n at WptAgentFlatten (<anonymous>:33:33)\n at WptAgentFlatten (<anonymous>:17:33)\n at WptAgentReportPerformanceTiming (<anonymous>:50:90)\n at PerformanceObserver.<anonymous> (<anonymous>:55:5)": 5,
"Error\n at String.get (<anonymous>:99:19)\n at WptAgentFlatten (<anonymous>:36:49)\n at WptAgentFlatten (<anonymous>:17:33)\n at WptAgentReportPerformanceTiming (<anonymous>:50:90)\n at PerformanceObserver.<anonymous> (<anonymous>:55:5)": 5
}
},
"Array.prototype.indexOf": 182,
"String.prototype.match": 50,
"String.prototype.substring": 20
} |
Ahh, looks like it is all from the injected performance observer that gets the LCP and CLS element details at the time of the event. We can probably update the injected script to either disable/re-enable measurement when it is handling an event or bypass the overrides to call the native methods directly. |
Could that execute after custom metrics? 🤔 |
No, it has to execute while the page is loading so it can get the state of the LCP element at the time the event fires. |
Tweaking the inject script to remove as many instrumented methods as possible (switching to comparisons insted of indexof, etc). I'm down to one substring call now. |
I wouldn't worry too much about that since this list of observers may grow in the future. I think we'll need a way to detect when a feature is used directly by WPT and ignore it. It'd add a lot of overhead but we can get call stacks for every feature and look for WPT keywords. WDYT? |
For now we're in good shape. Updated test {"call_stacks":{"navigator.userAgent":[]}} I prefer that we be careful about the methods that we instrument to make sure there is enough of a ROI/use case to measure them before we talk about adding more overhead to each event because adding call stacks to every string and array method access is going to add SIGNIFICANT overhead to some sites. I like what we did with navigator.userAgent because we have an explicit interest from the privacy analytics for seeing what specific scripts are fingerprinting. On the call-stacks side of things, I think we want to do some agent-side processing on those as well to pull out just the top-level origin of the URL of the calling script (maybe in addition to the full stack) so it is easy to query. |
Update observers.js
@pmeenan if you're still ok with the full stack in the results, I think we can add some shared functions on the SQL side to make processing easier. With that, does this PR LGTY? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
If you're watching requestIdleCallback, might be good to also watch isInputPending
Good suggestion, thanks. Added. |
Progress on:
HTTPArchive/almanac.httparchive.org#2881
HTTPArchive/almanac.httparchive.org#2890
HTTPArchive/almanac.httparchive.org#2891
Changes the observer behavior to be able to handle wildcards in the property pathname. For example,
navigator.*
will observe all properties on thenavigator
object. Special handling was added for__proto__
properties. Also adds the ability to capture the call stack for any particular properties.Tests
example.com
Results
It seems like the reset isn't being applied early enough. This might fix itself after the change is merged and the WPT agents consistently execute them in alphabetical order.
As expected,
navigator.userAgent
is not detected and the call stack is empty.https://codepen.io/rviscomi/pen/rNJMmgy
Results
Omitted the custom metric observations from these results for clarity, but they're the same as above.
nytimes.com
Results
Show custom metric results