This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

Analyze start up time telemetry to steer perf team direction #15926

Closed
mcomella opened this issue Oct 15, 2020 · 5 comments
Labels
performance (Possible performance wins), wontfix

Comments

@mcomella
Contributor

mcomella commented Oct 15, 2020

Problem: we're optimizing start up from the app icon (with no tabs), but users may rarely open the app that way. We should figure out the most common ways users open the app.

In particular, the questions we want to answer with this analysis are (a rough sketch of the corresponding tally follows the list):

  • What is the most common way to open the app? By how much?
    • COLD/WARM/HOT/etc.
    • MAIN/VIEW/etc.
      • If MAIN is non-trivial, how often do users have tabs open when they do it?
  • How common is savedInstanceState? Can it be safely ignored?
  • How does savedInstanceState impact start up time?
  • Are there any (slow) outliers in start up time that we should be concerned about?
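A minimal sketch of what that tally could look like, assuming the start up events have already been exported to a flat table; the file name and the `startup_type`, `startup_source`, `has_saved_instance_state`, and `startup_duration_ms` columns are placeholders, not the real schema:

```python
# Sketch only: hypothetical flat export of start up events, one row per start up.
import pandas as pd

events = pd.read_parquet("startup_events.parquet")  # placeholder export path

# Q1: most common TYPE + SOURCE combinations, as a share of all start ups.
combo_share = (
    events.groupby(["startup_type", "startup_source"]).size()
    .sort_values(ascending=False) / len(events)
)
print(combo_share.head(10))

# Q2: how common is savedInstanceState?
print(events["has_saved_instance_state"].mean())

# Q3: how does savedInstanceState affect start up time?
print(events.groupby("has_saved_instance_state")["startup_duration_ms"].describe())

# Q4: are there slow outliers?
print(events["startup_duration_ms"].quantile([0.50, 0.95, 0.99]))
```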

┆Issue is synchronized with this Jira Task

@mcomella
Contributor Author

MarcLeclair and I spoke and decided we should talk to a data scientist to see if we're in over our heads on this analysis because it's getting complicated. In particular, here's what I sent about what we're trying to do and why it's hard:

We have a probe, app_opened_all_startup, that records a start up event and details about it (e.g. which entry point did the user use? Did the app have to load from disk or was it already in memory?). We’re trying to answer the question, “Where should we spend engineering time improving start up performance?”, which roughly equates to, “What are the most common ways and conditions under which users open the app?”

I think the analysis boils down to: for a start up event, what are the most common TYPE + SOURCE extras (the two main details we care about)?

However, as for why (I think) we need a sample: I’m not sure we care about all users – I think we want to improve performance for the average user. For example, there may be users who are automation robots – we don’t want their start ups influencing our results. There may be users who open the app a handful of times and don’t use it – we maybe don’t care about their results. There may be one user who opens the app 1m times one particular way and biases the data – we don’t want their results either.
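A hedged sketch of one way to keep a handful of clients (robots, or a single user with a million start ups) from dominating the tally; the `client_id` column and the cap value are assumptions, and a data scientist may well prefer a different approach:

```python
import pandas as pd

events = pd.read_parquet("startup_events.parquet")  # same hypothetical export as above

# Option A: count distinct clients per TYPE + SOURCE, so any one client
# contributes at most once to each combination.
clients_per_combo = (
    events.drop_duplicates(["client_id", "startup_type", "startup_source"])
    .groupby(["startup_type", "startup_source"]).size()
    .sort_values(ascending=False)
)
print(clients_per_combo.head(10))

# Option B: cap each client's contribution before aggregating raw events.
CAP = 100  # arbitrary threshold, for illustration only
capped = events.groupby("client_id").head(CAP)
print(
    capped.groupby(["startup_type", "startup_source"]).size()
    .sort_values(ascending=False).head(10)
)
```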

The data help channel suggested we file a data science request so we can speak with a data scientist.

@mcomella
Contributor Author

I have some WIP:

This other query might be helpful to validate some of our findings (just a simple total counter).

Someone from GLAM also mentioned that if we recorded all of these metrics in a single custom ping, it'd automatically be ingested by BigQuery into a single row, which would make analysis much easier (rather than this nested extra nonsense), and it'd automatically be imported into GLAM in the first iteration.
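For context on the "nested extra" point: Glean events land in BigQuery as an array of events, each carrying an array of key/value extras, so pulling out TYPE and SOURCE means unnesting twice, whereas metrics in a dedicated custom ping become plain columns. A rough sketch of the nested version, with a placeholder project/dataset/table and placeholder extra keys that would need to be confirmed against the real schema:

```python
from google.cloud import bigquery

# Placeholder table, event name, and extra keys; confirm against the real
# Fenix events table before running anything like this.
QUERY = """
SELECT
  (SELECT value FROM UNNEST(event.extra) WHERE key = 'type')   AS startup_type,
  (SELECT value FROM UNNEST(event.extra) WHERE key = 'source') AS startup_source,
  COUNT(*) AS n
FROM `my-project.my_dataset.events_table`,
  UNNEST(events) AS event
WHERE event.name = 'app_opened_all_startup'
GROUP BY startup_type, startup_source
ORDER BY n DESC
"""

client = bigquery.Client()
for row in client.query(QUERY).result():
    print(row.startup_type, row.startup_source, row.n)
```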

@liuche removed the needs:triage label Oct 22, 2020
@mcomella
Contributor Author

Waiting on data science (https://jira.mozilla.com/browse/DO-363) because we're in over our heads.

@stale

stale bot commented Apr 26, 2021

See: #17373 This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale added the wontfix label Apr 26, 2021
@mcomella
Contributor Author

Duplicate of #19085 and #18426.
