Skip to content
This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

Closes #19147: Move startup metrics to right after Glean initialization #19252

Merged
merged 1 commit into from
May 5, 2021

Conversation

rocketsroger
Copy link
Contributor

Pull Request checklist

  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Screenshots: This PR includes screenshots or GIFs of the changes made or an explanation of why it does not
  • Accessibility: The code in this PR follows accessibility best practices or does not include any user facing features. In addition, it includes a screenshot of a successful accessibility scan to ensure no new defects are added to the product.

To download an APK when reviewing a PR:

  1. click on Show All Checks,
  2. click Details next to "Taskcluster (pull_request)" after it appears and then finishes with a green checkmark,
  3. click on the "Fenix - assemble" task, then click "Run Artifacts".
  4. the APK links should be on the left side of the screen, named for each CPU architecture

@rocketsroger rocketsroger requested review from a team as code owners April 26, 2021 19:27
@rocketsroger rocketsroger added the needs:review PRs that need to be reviewed label Apr 26, 2021
@gabrielluong
Copy link
Member

Is there also a data eng/sci peer that can take a look at this since we are also updating the glean types for some metrics?

@gabrielluong gabrielluong self-requested a review April 27, 2021 01:52
@rocketsroger
Copy link
Contributor Author

Is there also a data eng/sci peer that can take a look at this since we are also updating the glean types for some metrics?

Sure, I'll add reviewer from data science.

@rocketsroger
Copy link
Contributor Author

@Dexterp37 please confirm that this change is what we still want for #12131. Thanks,

Copy link
Member

@badboy badboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking here while I wait for confirmation that just changing the type won't break anything.

This is now code that's executed earlier, is there any way to verify that it has a neglibile impact on startup?

Comment on lines 520 to 521
* limitations of the metrics ping. Events are only sent in a metrics ping if the user have made
* changes between each ping. However, in some cases we want current values to be sent even if
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure which "Events" you mean here. Can you expand?
As I see it the shared preferences are simply there to store preferences, making it the source of truth, so reading from them is the right thing to do. What am I missing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see now that is just copied from the old code.
Still wonder what this part means.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this was copied from the original function. I will update it so it makes more sense. Thanks,

@@ -1270,7 +1270,7 @@ metrics:

preferences:
show_search_suggestions:
type: string_list
type: boolean
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for your awareness: This will work, but land in a different column.
Old: metrics.string_list.preferences_show_search_suggestions
New: metrics.boolean.preferences_show_search_suggestions

That means comparing these across versions is a bit more involved, but not impossible.
(There should never be a client that has both set though)

(I'm still waiting for confirmation that what I said is actually true)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please confirm that this is OK. Since this workaround and because currently there are high number of startup metric pings that have null values, I would argue that comparing against data we've collected before is not very useful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree it sounds like it's worthwhile to make the change. You still may want to add in the description field "Prior to release X, this metric was of type string_list" so that if someone wanted to trace back through the (problematic) historical data they'd at least know how...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of change is not supported by our tooling, a similar change of a string to a datetime in the past has caused delays on our schema deploys: mozilla/mozilla-schema-generator#118

Type changes is typically frowned upon at a low level because it breaks schema evolution rules that allows for low operational overhead during ingestion time. It's possible to support changing types in the schema generator based on the behavior of moving the probe to a new namespace (which avoids a schema incompatible change). However, this will be confusing from a metadata perspective since we lose context on the type change.

I suggest expiring these probes and creating a new probe with the correct type, while clearly documenting the reason for deprecation so that tools like the glean dictionary can pick this up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. I will look into expiring the probes and create new ones with the correct type. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rocketsroger Can you ping me when you think you need a final review? Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gabrielluong Sure, this should be the final change but I'll ping you once I have the go ahead from the data science team. Thanks,

Comment on lines 516 to 522
* This function is called right after Glean is initialized. Part of this function depends on
* shared preferences to be updated so the correct value is sent with the metrics ping.
*
* The reason we're using shared preferences to track some of these values is due to the
* limitations of the metrics ping. Events are only sent in a metrics ping if the user have made
* changes between each ping. However, in some cases we want current values to be sent even if
* the user have not changed anything between pings.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why we gotta keep the double spaces when it's not utilized everywhere.

Suggested change
* This function is called right after Glean is initialized. Part of this function depends on
* shared preferences to be updated so the correct value is sent with the metrics ping.
*
* The reason we're using shared preferences to track some of these values is due to the
* limitations of the metrics ping. Events are only sent in a metrics ping if the user have made
* changes between each ping. However, in some cases we want current values to be sent even if
* the user have not changed anything between pings.
* This function is called right after Glean is initialized. Part of this function depends on
* shared preferences to be updated so the correct value is sent with the metrics ping.
*
* The reason we're using shared preferences to track some of these values is due to the
* limitations of the metrics ping. Events are only sent in a metrics ping if the user have made
* changes between each ping. However, in some cases we want current values to be sent even if
* the user have not changed anything between pings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll update this since I'll rewrite this. Not sure why originally this was double spaced.

@rocketsroger
Copy link
Contributor Author

This is now code that's executed earlier, is there any way to verify that it has a neglibile impact on startup?

Yes, I think this will have some impact on startup performance. However, this is currently the only workaround suggestion from the Glean team. Once the Glean investigation is over we will update this if there is a better solution.

@gabrielluong
Copy link
Member

Will take another look at this thoroughly tomorrow.

@gabrielluong
Copy link
Member

Sorry, didn't get to this today, will review it tomorrow!

@codecov-commenter
Copy link

codecov-commenter commented Apr 29, 2021

Codecov Report

Merging #19252 (9d95275) into master (5b5e7eb) will increase coverage by 0.01%.
The diff coverage is 57.57%.

❗ Current head 9d95275 differs from pull request most recent head c545a7b. Consider uploading reports for the commit c545a7b to get more accurate results
Impacted file tree graph

@@             Coverage Diff              @@
##             master   #19252      +/-   ##
============================================
+ Coverage     34.87%   34.88%   +0.01%     
+ Complexity     1618     1617       -1     
============================================
  Files           540      540              
  Lines         21816    21806      -10     
  Branches       3249     3249              
============================================
- Hits           7609     7608       -1     
+ Misses        13307    13300       -7     
+ Partials        900      898       -2     
Impacted Files Coverage Δ Complexity Δ
...la/fenix/components/metrics/GleanMetricsService.kt 19.88% <ø> (-5.91%) 2.00 <0.00> (-3.00)
...rc/main/java/org/mozilla/fenix/FenixApplication.kt 26.20% <57.14%> (+16.00%) 10.00 <2.00> (+2.00)
...ain/java/org/mozilla/fenix/components/Analytics.kt 66.66% <100.00%> (ø) 1.00 <0.00> (ø)
...rg/mozilla/fenix/settings/account/AccountUiView.kt 36.58% <0.00%> (+2.43%) 4.00% <0.00%> (ø%)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5b5e7eb...c545a7b. Read the comment docs.

Copy link
Member

@badboy badboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now!

Copy link
Member

@gabrielluong gabrielluong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check with the perf-team?

@@ -856,10 +846,7 @@ private val Event.wrapper: EventWrapper<*>?
}

class GleanMetricsService(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use a kDoc if we can come up with a nice description of what GleanMetricsService does

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll try to describe this service.

private fun setPreferenceMetrics(
settings: Settings
) {
// We purposefully make all of our preferences the string_list format to make data analysis
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine we should change this comment if not remove it entirely since they're boolean now. I think this comment shouldn't be applied to the with block either since it only applies to line 613-623.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Thanks I'll update

@rocketsroger
Copy link
Contributor Author

Do we need to check with the perf-team?

I'll notify the perf-team. This will affect the startup performance but unfortunately there's no other solutions.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants