Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extension trace issues (occasional -1 on TTI) #753

Closed
2 of 5 tasks
brendankenny opened this issue Oct 6, 2016 · 1 comment
Closed
2 of 5 tasks

extension trace issues (occasional -1 on TTI) #753

brendankenny opened this issue Oct 6, 2016 · 1 comment

Comments

@brendankenny
Copy link
Member

brendankenny commented Oct 6, 2016

I spent a little while looking into the latest trace issues, which seem to be only occurring in the extension. Usual manifestation is a -1 on TTI due to an error inside it while calculating FMP, even though the exact same calculation of FMP on the same trace was successful in an earlier audit. news.google.com is a site that can often produce this. Information dump in case this takes me a little while to get back to or someone else wants to take over.

tl;dr, I'm currently thinking:

  • Sort in ts order for FMP calculation. This would make us more correct but also unable to calculate FMP in more cases in the extension, at least.
  • Fix anywhere else we're dependent on ordering of entries in traceEvents array
  • Fix Adopt firstMeaningfulPaint trace event #618 when m54 comes out (~end of October)
  • Find out why ordering of events from the same thread can be different when looking at ts vs tts and (if expected) which timestamp should win in those cases
  • Better surface debugStrings in the extension so it's clearer why -1s are popping up. They're in the pretty print version of the report for the CLI, so it would be nice to at least log.warn them somewhere (in lighthouse-background.js before populating the report, maybe?) so you don't have to set breakpoints just to see them on the extension side.

What I've found so far:
The proximate cause is weird trace event ordering. When we're trying to find what we consider to be navigation start in a trace, we look for the first navigationStart event after the TracingStartedInPage event. A typical example problem trace, highlighting the TracingStartedInPage and navigationStart events:

screen shot 2016-10-06 at 12 33 07

  • If we look at event.ts ("the tracing clock timestamp") in this example, both navigationStart events occur before the TracingStartedInPage event, so according to ts, there is no nav start event after tracing started.
  • If we look at event.tts ("the thread clock timestamp of the event"), there is a navigationStart event after the tracing started.

All three events are in the same thread, so I don't see how they could have a different ordering based on the clock used. Interestingly, after many many runs I haven't been able to recreate this situation in the CLI, so I'm wondering if this is a weird timestamp bug somehow due to tracing from an extension.

I've uploaded an example trace to a gist, which you can look at in timeline viewer.

Other issues at work here:

  • The reason that one FMP calculation is successful while the next fails is that catapult/traceviewer is changing the array order of the trace events out from under us when we pass the trace into it between the two FMP calls. Specifically, it appears to sort the events purely by ts, which is not the order before we pass it in to traceviewer (it appears to be mostly, but not completely, in tts order).

    The calculation code assumes that a navigationStart event will be found after TracingStartedInPage. As stated above, if in tts order, we find the navigationStart where we expect it. When traceviewer sorts in ts order, no navigationStart is after tracing start, so we fail.

    We could prevent traceviewer from changing things by only passing a copy of the trace in or we could cache the FMP calculation, but...

  • We're sensitive to ordering of events in the trace, which we shouldn't be. In this case, if we explicitly order by tts before calculating FMP (as we implicitly depend on in the first run through FMP), we can continue to calculate successfully. However, we use ts for basically all other calculations, so it feels like we should be explicitly ordering by ts, always fail at calculating FMP in this situation, and hope this can be fixed upstream.

  • When Chrome 54 hits stable, we'll have a totally new way of calculating FMP which won't be sensitive to this, so some of this will soon (~end of October) be irrelevant. However, we should be more certain on the timestamps we use, be insensitive to trace event ordering, and we'll still need to know which navigationStart event we care about for other calculations.

paulirish pushed a commit that referenced this issue Dec 2, 2016
* Use fMP from chrome timings instead of calculating it ourselves.
* Cleanup audit & driver
* Cleanup unused functions
* Fix fmp calculations
* Fix test records

fixes #618, #1010, #890 and a part of #753
pr: #1066
@paulirish
Copy link
Member

FMP is now coming from the trace event, thanks to #1066

andrewrota pushed a commit to andrewrota/lighthouse that referenced this issue Jan 13, 2017
* Use fMP from chrome timings instead of calculating it ourselves.
* Cleanup audit & driver
* Cleanup unused functions
* Fix fmp calculations
* Fix test records

fixes GoogleChrome#618, GoogleChrome#1010, GoogleChrome#890 and a part of GoogleChrome#753
pr: GoogleChrome#1066
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants