For #8803: add frameworkStart telemetry #9788

mcomella · 2020-04-07T22:13:08Z

This is intended to measure the "dex" time before we can execute code. This builds upon my learning of how we want to capture telemetry metrics in #9153.

@Dexterp37 Please review this from a Glean perspective.

@boek or @liuche Please review as a data steward.

@MarcLeclair Please review as the primary code reviewer!

Review requests:

Data review
Review from the Glean team
Code review

Pull Request checklist

Tests: This PR includes thorough tests or an explanation of why it does not
Screenshots: This PR includes screenshots or GIFs of the changes made or an explanation of why it does not
Accessibility: The code in this PR follows accessibility best practices or does not include any user facing features. In addition, it includes a screenshot of a successful accessibility scan to ensure no new defects are added to the product.

After merge

Milestone: Make sure issues finished by this pull request are added to the milestone of the version currently in development.

To download an APK when reviewing a PR:

click on Show All Checks,
click Details next to "Taskcluster (pull_request)" after it appears and then finishes with a green checkmark,
click on the "Fenix - assemble" task, then click "Run Artifacts".
the APK links should be on the left side of the screen, named for each CPU architecture

mcomella · 2020-04-07T22:20:43Z

Request for data collection review form

All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.

What questions will you answer with this data?

How long does the Android framework block for on various devices before giving us the ability to execute code?

Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements? Some example responses:

Understand how significant the time before we can execute code is so that we can choose whether or not we want to invest in optimizing the behavior.

What alternative methods did you consider to answer these questions? Why were they not sufficient?

We could measure locally on our reference devices but we wouldn't be able to determine 1) the variance across a wide range of devices or 2) how frequent and impactful outliers to local testing may be.

Can current instrumentation answer these questions?

No.

List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories found on the Mozilla wiki.

Note that the data steward reviewing your request will characterize your data collection based on the highest (and most sensitive) category.

Measurement Description	Data Collection Category	Tracking Bug #
framework_start: timespan	Category 1	#8803
framework_start_error: event	Category 1	#8803
clock_ticks_per_second: counter	Category 1	#8803

How long will this data be collected? Choose one of the following:

I want this data to be collected for 6 months initially (potentially renewable).

What populations will you measure?

All.

If this data collection is default on, what is the opt-out mechanism for users?

Standard telemetry opt-out.

Please provide a general description of how you will analyze this data.

Look at frameworkStart to understand how long it takes for the framework to start across all devices.
Watch frameworkStartError to determine if there are any implementation errors.
Look at clockTicksPerSecond to see whether this value changes across devices and, if so, potentially segment the frameworkStart data so that measurements with the same precision are grouped together.
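
One way to read the last point: the tick rate fixes the resolution of any tick-derived duration (100 ticks per second means 10 ms steps), so samples are most comparable within the same resolution. A minimal Kotlin sketch of that segmentation follows. The `Sample` record and both helper names are hypothetical and are not part of this PR or of any telemetry tooling.

```kotlin
// Hypothetical record for illustration; real data would come from the telemetry pipeline.
data class Sample(val frameworkStartNanos: Long, val clockTicksPerSecond: Long)

// Smallest duration step a tick-based clock can express, in nanoseconds.
fun resolutionNanos(clockTicksPerSecond: Long): Long {
    require(clockTicksPerSecond > 0) { "clockTicksPerSecond must be positive" }
    return 1_000_000_000L / clockTicksPerSecond
}

// Groups frameworkStart durations by the resolution of the clock that produced them.
fun segmentByResolution(samples: List<Sample>): Map<Long, List<Long>> =
    samples.groupBy(
        keySelector = { resolutionNanos(it.clockTicksPerSecond) },
        valueTransform = { it.frameworkStartNanos }
    )

fun main() {
    val samples = listOf(
        Sample(frameworkStartNanos = 250_000_000L, clockTicksPerSecond = 100L),
        Sample(frameworkStartNanos = 310_000_000L, clockTicksPerSecond = 100L),
        Sample(frameworkStartNanos = 274_000_000L, clockTicksPerSecond = 1_000L)
    )
    segmentByResolution(samples).forEach { (resolution, durations) ->
        println("resolution=${resolution}ns durations=$durations")
    }
}
```

Grouping by resolution rather than by raw tick rate keeps the bucket key in the same unit as the measurement itself.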

Where do you intend to share the results of your analysis?

Internally.

Is there a third-party tool (i.e. not Telemetry) that you are proposing to use for this data collection? If so:

No.

codecov-io · 2020-04-07T22:37:49Z

Codecov Report

Merging #9788 into master will increase coverage by 0.14%.
The diff coverage is 87.17%.

@@             Coverage Diff              @@
##             master    #9788      +/-   ##
============================================
+ Coverage     19.17%   19.31%   +0.14%     
- Complexity      521      536      +15     
============================================
  Files           336      339       +3     
  Lines         13729    13701      -28     
  Branches       1842     1830      -12     
============================================
+ Hits           2632     2647      +15     
+ Misses        10857    10816      -41     
+ Partials        240      238       -2

Impacted Files	Coverage Δ	Complexity Δ
...pp/src/main/java/org/mozilla/fenix/HomeActivity.kt	`10.06% <0.00%> (-0.07%)`	`10.00 <0.00> (ø)`
app/src/main/java/org/mozilla/fenix/perf/Stat.kt	`87.50% <87.50%> (ø)`	`5.00 <5.00> (?)`
...ain/java/org/mozilla/fenix/perf/StartupTimeline.kt	`62.50% <88.88%> (+62.50%)`	`2.00 <1.00> (+2.00)`
...rc/main/java/org/mozilla/fenix/FenixApplication.kt	`12.41% <100.00%> (+1.22%)`	`4.00 <1.00> (+1.00)`
...lla/fenix/perf/StartupFrameworkStartMeasurement.kt	`100.00% <100.00%> (ø)`	`9.00 <9.00> (?)`
.../fenix/settings/advanced/LocaleManagerExtension.kt	`62.50% <0.00%> (-8.34%)`	`0.00% <0.00%> (ø%)`
...nix/components/toolbar/BrowserToolbarController.kt	`61.04% <0.00%> (-3.66%)`	`0.00% <0.00%> (ø%)`
.../src/main/java/org/mozilla/fenix/utils/Settings.kt	`76.35% <0.00%> (-0.24%)`	`30.00% <0.00%> (ø%)`
...lla/fenix/components/toolbar/DefaultToolbarMenu.kt	`45.49% <0.00%> (-0.04%)`	`11.00% <0.00%> (-2.00%)`
... and 39 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update aef827e...b51198c.

Dexterp37 · 2020-04-08T12:40:39Z

> This is intended to measure the "dex" time before we can execute code.

I can tell you that time! (sorry, had to make the joke)

Dexterp37 · 2020-04-08T12:41:37Z

app/metrics.yaml

+    bugs:
+      - https://github.com/mozilla-mobile/fenix/issues/8803
+    data_reviews:
+      - https://github.com/mozilla-mobile/fenix/pull/9788#issuecomment-610648980


Note that you need to link to the data stewards' answers to the data review.

mcomella · 2020-04-09

I'll apply these changes after data review. :)

Dexterp37 · 2020-04-08T12:42:58Z

app/pings.yaml

+  bugs:
+    - https://github.com/mozilla-mobile/fenix/issues/8803
+  data_reviews:
+    - https://github.com/mozilla-mobile/fenix/pull/9788#issuecomment-610648980


And here :)

Dexterp37

r+ With the data-review request fields fixed :)

pocmo · 2020-04-14T15:15:14Z

app/src/main/java/org/mozilla/fenix/FenixApplication.kt

+    protected fun recordOnInit() {
+        // This gets called by more than one process. Ideally we'd only run this in the main process
+        // but the code to check which process we're in crashes because the Context isn't valid yet.
+        StartupTimeline.onApplicationInit()


nit: The comment in init warns that nothing should happen before this. However, recordOnInit() looks fairly harmless, and I wonder whether someone might add something in front of the call inside this method. Is there any reason we can't inline the call to StartupTimeline?

mcomella · 2020-04-15

Good point. I wanted to:

avoid duplicating the comment, "this gets called by more than one process..."

(I care about this less) make it easy for additional Application classes to call this method, if necessary

@pocmo Is MigratingFenixApplication going away soon (merged into FenixApplication)? If so, I'll inline with duplicated comment. If not, I'll add a comment to this method not to add anything above that line.
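
The ordering concern in this thread can be sketched as follows. This is a simplified, self-contained illustration, not the PR's code: `StartupTimelineSketch`, `ApplicationSketch`, and the event names are hypothetical stand-ins, and the real classes depend on the Android framework.

```kotlin
// Hypothetical stand-in for StartupTimeline; it only logs the order of events.
object StartupTimelineSketch {
    val events = mutableListOf<String>()

    fun onApplicationInit() {
        events += "applicationInit"
    }
}

// Hypothetical stand-in for an Application subclass.
open class ApplicationSketch {
    init {
        // Must remain the first statement so the timestamp is captured as early as possible.
        recordOnInit()
        StartupTimelineSketch.events += "otherInitWork"
    }

    protected fun recordOnInit() {
        // Do not add anything above this line: earlier work would delay the measurement.
        StartupTimelineSketch.onApplicationInit()
    }
}

fun main() {
    ApplicationSketch()
    println(StartupTimelineSketch.events)
}
```

Keeping the helper method leaves a single place for the multi-process caveat, at the cost of relying on a comment to stop later additions above the call. Inlining removes that risk but duplicates the comment wherever the call is repeated.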

MarcLeclair

The math checks out, and so does the calculation method. I think using telemetry is the best way to gather data about the Dex size, since we can't control every user's environment to track app creation.

liuche

data-review+

Data Review Form (to be filled by Data Stewards)

Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?

Yes, metrics.yaml gets generated into metrics.md

Is there a control mechanism that allows the user to turn the data collection on and off?

Yes, controlled by Fenix telemetry controls in settings

If the request is for permanent data collection, is there someone who will monitor the data over time?

6mo

Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Type 1, technical data collected about startup time and errors.

Is the data collection request for default-on or default-off?

Default on

Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?

No

Is the data collection covered by the existing Firefox privacy notice?

Yes

Does there need to be a check-in in the future to determine whether to renew the data? (Yes/No) (If yes, set a todo reminder or file a bug if appropriate)

Has 6mo expiry

Does the data collection use a third-party collection tool? If yes, escalate to legal.

No

mcomella · 2020-04-16T23:34:44Z

Unit tests failed with a failure I can't reproduce, although it is in the test that I added.
UI tests failed in SettingsAboutTest, which looks like the intermittent failure tracked in "[Bug] Possible intermittent UI test failure - SettingsAboutTest.verifyAboutFirefoxPreview" #9685. That failure can apparently happen if the app crashes.

This is the failure:

[task 2020-04-16T00:50:17.726Z] SUITE: org.mozilla.fenix.perf.StartupHomeActivityLifecycleObserverTest
[task 2020-04-16T00:50:17.726Z]   TEST: WHEN onStop is called THEN the metrics are set and the ping is submitted
[task 2020-04-16T00:50:17.726Z]   FAILURE
[task 2020-04-16T00:50:17.726Z] 
[task 2020-04-16T00:50:17.726Z] java.lang.AssertionError: Verification failed: number of calls happened not matching exact number of verification sequence
[task 2020-04-16T00:50:17.726Z] 
[task 2020-04-16T00:50:17.726Z] Matchers: 
[task 2020-04-16T00:50:17.726Z] StartupFrameworkStartMeasurement(frameworkStartMeasurement#1678).setExpensiveMetric())
[task 2020-04-16T00:50:17.726Z] PingType(startupTimeline#1679).submit(null()))
[task 2020-04-16T00:50:17.726Z] 
[task 2020-04-16T00:50:17.726Z] Calls:
[task 2020-04-16T00:50:17.726Z] 1) StartupFrameworkStartMeasurement(frameworkStartMeasurement#1678).setExpensiveMetric()

I'll rebase and see if it goes away.

We primarily want to determine if this is a problem area for us to investigate rather than a long term measurement to keep so we should set the expiration date accordingly. Furthermore, this code executes before crash reporting is init so it's ideal to remove it sooner rather than later.

mcomella force-pushed the 8803-telemetry-dex-ticks branch from 91a193b to ec574f9 April 7, 2020 22:22

mcomella requested review from Dexterp37 and boek and removed request for boek April 7, 2020 22:22

mcomella force-pushed the 8803-telemetry-dex-ticks branch from ec574f9 to 5d20d5a April 7, 2020 22:26

mcomella mentioned this pull request Apr 7, 2020: For #8803: add StartupTimelineMeasurements + PoC metric #9153 (Closed)

Dexterp37 suggested changes Apr 8, 2020

mcomella requested review from Dexterp37, boek and MarcLeclair April 9, 2020 23:21

mcomella force-pushed the 8803-telemetry-dex-ticks branch from 5d20d5a to a7aaf76 April 9, 2020 23:23

Dexterp37 approved these changes Apr 10, 2020

pocmo reviewed Apr 14, 2020

MarcLeclair reviewed Apr 14, 2020: app/src/main/java/org/mozilla/fenix/perf/StartupTimeline.kt (resolved)

MarcLeclair reviewed Apr 14, 2020: app/src/main/java/org/mozilla/fenix/perf/StartupFrameworkStartMeasurement.kt (resolved)

MarcLeclair approved these changes Apr 15, 2020

mcomella requested a review from liuche April 16, 2020 00:36

liuche approved these changes Apr 16, 2020

mcomella added 7 commits April 16, 2020 16:35

For mozilla-mobile#8803: add StartupTimeline ping type and framework_start metrics.

17d1ef3

For mozilla-mobile#8803: add Stat and test.

7eef0cd

We need to access the data in stat to get the process start time, so we can calculate the time from process start until application.init for the frameworkStart probe.
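
The calculation described above can be sketched as follows. This is a hedged illustration, not the contents of the PR's Stat.kt: all function names and the sample line are hypothetical. It relies on the documented layout of `/proc/<pid>/stat`, where field 22 (`starttime`) is the process start time in clock ticks since boot. On Android the tick rate would presumably come from `Os.sysconf(OsConstants._SC_CLK_TCK)` and the current time from `SystemClock.elapsedRealtimeNanos()`; the sketch takes both as plain parameters so that it runs on any JVM.

```kotlin
// Made-up stat contents for illustration; starttime (field 22) is 5000 ticks.
const val SAMPLE_STAT =
    "1234 (org.mozilla fenix) S 1 1234 1234 0 -1 4194560 100 0 0 0 5 3 0 0 20 0 12 0 5000 123456 789"

// Extracts field 22 (starttime, in clock ticks since boot) from /proc/<pid>/stat contents.
fun parseProcessStartTicks(statContents: String): Long {
    // Field 2 (comm) is wrapped in parentheses and may contain spaces,
    // so only split the text that follows the last ')'.
    val fields = statContents.substringAfterLast(')').trim().split(' ')
    // The remaining text starts at field 3 (state), so field 22 is at index 22 - 3 = 19.
    return fields[19].toLong()
}

// Converts clock ticks to nanoseconds for a given tick rate.
fun ticksToNanos(ticks: Long, ticksPerSecond: Long): Long {
    require(ticksPerSecond > 0) { "ticksPerSecond must be positive" }
    return ticks * (1_000_000_000L / ticksPerSecond)
}

// Time from process start until application init, both measured since boot.
fun frameworkStartNanos(statContents: String, ticksPerSecond: Long, appInitNanosSinceBoot: Long): Long =
    appInitNanosSinceBoot - ticksToNanos(parseProcessStartTicks(statContents), ticksPerSecond)

fun main() {
    // Assumed values: 100 ticks per second, application init 50.25 s after boot.
    val duration = frameworkStartNanos(SAMPLE_STAT, 100L, 50_250_000_000L)
    println("frameworkStart = ${duration / 1_000_000} ms")
}
```

Because the start time is only known to the nearest tick, the result cannot be more precise than one tick (10 ms at 100 ticks per second), which is why the tick rate is worth recording alongside the duration.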

For mozilla-mobile#8803: add StartupFrameworkStartMeasurement.

ac5f393

This class controls the central logic around the metrics we want to record.

For mozilla-mobile#8803: hook up frameworkStart metric.

5f103a7

For mozilla-mobile#8803 - review: Add clarifying comments to onAppInit capture methods.

a36eb94

For mozilla-mobile#8803 - post: update metrics & pings data review URL.

b51198c

mcomella force-pushed the 8803-telemetry-dex-ticks branch from 15cb5b1 to b51198c April 16, 2020 23:44

mcomella merged commit 909ee73 into mozilla-mobile:master Apr 17, 2020

mcomella deleted the 8803-telemetry-dex-ticks branch April 17, 2020 16:12

liuche mentioned this pull request Oct 6, 2020

Issue #14142 - Telemetry renewal to 08-01-2021 #15713

Merged

3 tasks

mcomella added the Feature:Performance label (used for data reviews to label metrics related to performance) Jun 21, 2021
