This repository has been archived by the owner on Feb 29, 2020. It is now read-only.

Figure out perceived performance benchmarks #2005

Closed
Mardak opened this issue Jan 18, 2017 · 8 comments

Comments

@Mardak
Member

Mardak commented Jan 18, 2017

Split from #1977. Determine "perceived performance" benchmarks (talk to Philipp Sackl). Mainly page load to interactivity of highlights/top sites. Probably good to have something around search.

@dmose
Member

dmose commented Jan 23, 2017

https://www.instartlogic.com/blog/perceptual-speed-index-psi-measuring-above-fold-visual-performance-web-pages has relevant info, including a link to the VisualMetrics github repo, which has code to compute PSI from a video.

@digitarald might also have thoughts here...

@digitarald

WebPageTest (we have a private instance) gives you Speed Index but can also show user timings.

The gold standard for timings depends on knowing how important each aspect of the content is. I would recommend adding timings for important content manually; Steve Souders has some good references for this. Combine this with keeping the site interactive while streaming in the elements (WebPageTest shows CPU usage and when interactivity starts), as @Mardak said.

Happy to chat more about specifics if needed :)

@Mardak Mardak moved this from Unassigned to Milestone 1 in Land in Nightly / Graduate Mar 15, 2017
@tspurway tspurway added the P1 label Apr 17, 2017
@tspurway tspurway added this to the Xchamsiks (April 30) milestone Apr 17, 2017
@dmose dmose self-assigned this Apr 20, 2017
@dmose
Member

dmose commented Apr 26, 2017

In addition to the useful reading that @digitarald suggested, I also got a bunch out of reading LinkedIn's blog post.

After a bunch of research, ruminating, poking around, and discussion with Kate, I'm going to propose a set of user timings that I think are highest priority. In an ideal world, we'd do a bunch of user research around our existing implementation, but the SDK one is probably not the right starting point for performance stuff like this. So these measurements are in large part informed by what's easy to bootstrap/measure.

In addition to replacing about:newtab, activity stream also replaces about:home.
There are really three key performance contexts here:

about:newtab: always preloaded in a hidden context, and then made visible in a fresh tab.

about:home: by default, is the first page that comes up at browser start in a new tab. This is effectively a normal render.

about:home from the toolbar button (not shown by default): will make the current tab load activity stream. Tab-blanking followed by a normal render.

After that, I'll also include an outline of stuff that we are likely to want to look at as time goes on.

I'll break down the user-timings session into graduation issues soon, and then resolve this bug.

Here are the first user timings that I propose we include; they should be useful in development/profiling, synthetic regression testing, and Real User Monitoring (aka telemetry).

There is some perceptual stuff that I hope that we can get out of WebPageTest/Talos as well as performance.timing eventually, but I don't think that's the highest priority, since so much of that is focused on loading stuff over the network, and that's a set of issues we just don't have.

maybe/later

The workaround is having a "heartbeat" (a continuous 50 ms timer, which doesn't have much perf impact). Using the skew between expected time and actual time, you can tell whether the main thread was blocked. Make sure to also include page visibility as a factor, as that can throttle timers. Some sites have started using this technique + meaningful paint to guess TTI.

  • tti
    • performance.timing.timeToInteractive
      • syn/RUM: once bug 1299118 lands
  • painting/display
    • first meaningful paint: "the user feels that the primary content of the page is visible" / "biggest layout chunk painted"
      • syn/RUM: use performance.timing.firstMeaningfulPaint
        • set pref: dom.performance.first-meaningful-paint.enabled
        • after bug 1299117 lands
    • SI (progress of above-fold loading)
    • PSI (above-fold loading, but notices visual jitter/layout thrashing)
      • syn: WPT
      • rum: (none)
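The user timings listed above can be captured with the standard User Timing API (performance.mark / performance.measure). A minimal sketch; the mark and measure names below are hypothetical examples, not actual Activity Stream telemetry names:

```javascript
// Minimal user-timing helpers; the names used here are illustrative.
function markMilestone(name) {
  performance.mark(name);
}

// Records a measure from `startMark` to now and returns its duration in ms.
function measureSince(startMark, measureName) {
  performance.measure(measureName, startMark);
  const entries = performance.getEntriesByName(measureName, "measure");
  return entries[entries.length - 1].duration;
}

// e.g. markMilestone("as_render_start");
//      ...render...
//      measureSince("as_render_start", "as_display_done");
```

Measures recorded this way show up in the profiler and WebPageTest, and could be batched into telemetry pings for RUM.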

@dmose
Member

dmose commented Apr 27, 2017

I'm going to edit the above list in place as a checklist.

@digitarald

for each of (about:newtab, about:home)

I am unsure what the aforementioned about:newtab/about:home entries measure. Is it the delay to first paint after input (probably tab opening or tab blanking, depending on the interaction)? Maybe those can be tracked according to what kind of response is expected:

  • Time to tab blank for navigation within same tab
  • Time to tab opened and page starting to initialize

For page load metrics, the good old page load event will not provide much detail for SPA apps like AS. It is best to annotate when the components in the viewport load. To break it down even further, I would recommend these stages:

  • First non-blank paint: only if first paint is not already content but some placeholder
  • Hero Element: the first element to be painted that users interact with most, like the Top Sites panel
  • Visual Complete: all above-the-fold content is loaded, including images
  • Display Done: assuming anything loads below the fold, this marks when all components are fully rendered; otherwise Meaningful Paint is Display Done.

This assumes the page is usable after Hero Element, meaning that no lazy-loaded component causes noticeable jank (rule of thumb: frames longer than 50 ms):

  • Slice any processing you can into smaller chunks and process it using requestIdleCallback
  • Never have any processing that takes longer than 10 ms in response to an event; delegate the work of any input handler to requestIdleCallback.
  • Process data off the main thread using workers when possible
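The chunking advice above could be sketched roughly like this (processInChunks is a hypothetical helper; it falls back to setTimeout where requestIdleCallback is unavailable):

```javascript
// Fallback shim so the sketch also runs outside a browser (e.g. in Node).
const ric = typeof requestIdleCallback === "function"
  ? requestIdleCallback
  : (cb) => setTimeout(() => cb({ timeRemaining: () => 16 }), 0);

// Processes `items` in small chunks during idle time, yielding the main
// thread back between chunks so the page stays responsive.
function processInChunks(items, processItem, onDone) {
  let i = 0;
  function work(deadline) {
    // Only keep working while the browser reports idle time left.
    while (i < items.length && deadline.timeRemaining() > 0) {
      processItem(items[i++]);
    }
    if (i < items.length) {
      ric(work); // resume in the next idle period
    } else {
      onDone();
    }
  }
  ric(work);
}
```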

@dmose
Member

dmose commented Apr 28, 2017

In addition to replacing about:newtab, activity stream also replaces about:home.
To be clearer, there are really three key performance contexts here:

about:newtab: always preloaded in a hidden context, and then made visible in a fresh tab.

about:home: by default, is the first page that comes up at browser start in a new tab. This is effectively a normal render.

about:home from the toolbar button (not shown by default): will make the current tab load activity stream. Tab-blanking followed by a normal render.

The first two cases are clearly the most important. One thing all of these cases have in common is that none of them (in the system-addon implementation) will load any content directly from the network.

for each of (about:newtab, about:home)

I am unsure about what the aforementioned about:newtab/about:home measure,

Nothing; that was a header indicating that I want simple page-load proxy measures for each of the two loading contexts (I just realized the third one while writing this up, so I'll modify the plan :-).

For page load metrics, the good old page load event will not provide much detail for SPA apps like AS.

Agreed; I mostly wanted to collect that to see where it shows up in the sequence compared to other events. That's probably not a good enough reason to collect it, especially given that it'll show up in the profiler when we profile.

It is best to annotate when the components in the viewport load.

As currently written, I was intending to just look at when the top-level React component had rendered and painted (equivalent to Display Done). My suspicion is that given that none of this stuff will be network loaded, and we're offloading more things to other threads, we'll be fast enough for most users that Display Done may show us that we don't need to break it down further.

To break it down even further I would recommend 3 stages:

  • First non-blank paint: only if first paint is not already content but some placeholder

We'll add that if we end up needing a placeholder in the system add-on version.

  • Hero Element: the first element to be painted that users interact with most, like the Top Sites panel

Yeah, I just checked our data, and topsites is the one. I'm thinking the best way to do it is using http://stackoverflow.com/a/34999925 .
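The Stack Overflow technique referenced above boils down to scheduling a callback from inside requestAnimationFrame, so it fires after the frame containing the render has actually been painted. A rough sketch (markAfterNextPaint and the mark name are hypothetical):

```javascript
// Hypothetical helper: records a user-timing mark once the next frame has
// been painted. A setTimeout(0) scheduled from inside a requestAnimationFrame
// callback runs after that frame's paint completes.
function markAfterNextPaint(markName) {
  requestAnimationFrame(() => {
    setTimeout(() => performance.mark(markName), 0);
  });
}

// e.g. call markAfterNextPaint("topsites_first_painted") right after the
// Top Sites component renders.
```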

  • Visual Complete: All above the fold content is loaded, including images

We don't know exactly what content is going to be above the fold. For that reason, this seems like enough work that I'm inclined to put it off until we see the numbers we get back from Display Done and Hero Element (Top Sites Painted).

@digitarald: I presume it's hard to quantitatively monitor jank until we implement PerformanceFrameTiming. Or is there some other way to do that?

  • Never have any processing that can take longer than 10ms in response to an event. Delegate work of any input handler to requestIdleCallback.

So it sounds like we want all event handlers to have performance.mark bookending?

@digitarald

Yeah, I just checked our data, and topsites is the one. I'm thinking the best way to do it is using http://stackoverflow.com/a/34999925 .

A double rAF (requestAnimationFrame) might also be an option here and is used by FB et al.

@digitarald I presumably it's hard to quantitatively monitor jank until we implement PerformanceFrameTiming. Or is there some other way to do that?

The workaround is having a "heartbeat" (a continuous 50 ms timer, which doesn't have much perf impact). Using the skew between expected time and actual time, you can tell whether the main thread was blocked. Make sure to also include page visibility as a factor, as that can throttle timers. Some sites have started using this technique + meaningful paint to guess TTI.
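That heartbeat workaround could be sketched as follows (startHeartbeat is a hypothetical helper; a real version should also factor in document.visibilityState, since background tabs throttle timers):

```javascript
const HEARTBEAT_MS = 50;       // expected interval between ticks
const JANK_THRESHOLD_MS = 50;  // skew beyond this counts as a blocked frame

// Starts a 50 ms heartbeat; when a tick fires much later than expected,
// the main thread was blocked for roughly that skew. Returns a stop function.
function startHeartbeat(onJank) {
  let expected = performance.now() + HEARTBEAT_MS;
  const id = setInterval(() => {
    const now = performance.now();
    const skew = now - expected;
    if (skew > JANK_THRESHOLD_MS) {
      onJank(skew); // main thread was blocked for roughly `skew` ms
    }
    expected = now + HEARTBEAT_MS;
  }, HEARTBEAT_MS);
  return () => clearInterval(id);
}
```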

So it sounds like we want all event handlers to have performance.mark bookending?

That would be ideal. For more complex interactions you could even break it down into a R(A)IL measure to know how fast the first response paint happens and when the final state is painted.
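Bookending could look roughly like this (a hypothetical wrapper; the "as_handler_" naming scheme is made up for illustration):

```javascript
// Wraps an event handler so its run time shows up as a user-timing measure.
function instrumented(name, handler) {
  return function (...args) {
    const startMark = `as_handler_${name}_start`;
    performance.mark(startMark);
    try {
      return handler.apply(this, args);
    } finally {
      // Recorded even if the handler throws.
      performance.measure(`as_handler_${name}`, startMark);
    }
  };
}

// e.g. element.addEventListener("click", instrumented("topsite_click", onClick));
```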

@dmose
Member

dmose commented Jun 5, 2017

I've spun off a new bug to make plans for synthetic perf tests which references the discussion in this bug.

I believe all the critical stuff from this bug has now been filed (and annotated in #2005 (comment) ), so I'm closing this one.
