Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good overall, I gave this a first pass and I'd like to do a second one :-)
Please have a read at our migration guide for porting products to Glean from service-telemetry.
This PR is additionally removing service-telemetry: we'd recommend not doing this or, at least, have a short overlapping period to do a thorough data validation :-) Happy to talk about this!
app/src/main/java/org/mozilla/focus/telemetry/TelemetryWrapper.kt
Outdated
Show resolved
Hide resolved
You might still want to advertise where to find data documentation in your README file :-)
That's probably because the first time it starts it doesn't have any recorded metrics in the 'metrics' ping.
What do you mean?
It should be the email of the person/group of people that should get notified in case of data quality problems or in case somebody looking at the data has questions. We usually recommend adding at least 2 email addresses: one mailing list contact (e..g. if the Focus team has a mailing list or something) and an individual's email address (yours, since you're implementing :) ). |
@jonalmeida chances are that you still need legacy telemetry around for a bit. If that's the case, I think we should probably add the legacy telemetry client id to the new pings (or at least to some of them) so that analysts can link historical data to the new data |
Thanks! That actually helps make this initial PR smaller and I can separate the two systems out without needing to delete and add code all around the place at the same time. Will leave more comments to yours above in a bit. 🙂 |
Sorry, it wasn't clear. This metrics.yaml was taken from Fenix with just some bare metrics that we'd want to use in Focus, so that we could test that the system works. Hence the "To Be Decided" placeholders for the DS reviews and email. 🙂
Thanks, I can add mine for now since I'm not aware of any mailing list for Focus that we can use today. |
Blerg, I forgot that in order to make service-telemetry functional, we need to backport this PR that vendors the library into Focus as well, otherwise crash on startup in a release build. |
Didn't have many conflicts when cherry-picking this and now both telemetry clients seem to be working together. @Dexterp37 I added the client ID from service-telemetry as part of the deletion-request which should satisfy #4898 I think. If you could take another peek at the last commit, that would be great. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend getting rid entirely of the MetricsService
abstraction, unless it's strictly required in the short term to connect to other services as well (Leanplum)!
I'd strongly recommend dropping the complexity that comes with this abstraction, as this has been a major pain point in Fenix (for the Fenix devs :-) )-
app/src/main/java/org/mozilla/focus/telemetry/GleanMetricsService.kt
Outdated
Show resolved
Hide resolved
} | ||
} | ||
|
||
override fun track(event: Event) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend not following this approach :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be an alternative? I'm open to other options, although I prefer not to have Glean generated code around the code base.
It tends to be a problem if there are build issues with the parser and code generation fails (it those cases it's easier to disable one file that contains all the generated code), mocking out these classes in tests add overhead.
Another reason that is not in our direct path, but worth noting, is that forks of our browsers tend to remove our telemetry in place of their own, so having that in one place would also make that easier.
See my comment above about avoiding the MetricController abstraction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be an alternative? I'm open to other options, although I prefer not to have Glean generated code around the code base.
The alternative is the recommended path: calling the Glean APIs where needed, instead of dispatching.
Note that the tone of this is, again, of a recommendation (albeit I strongly recommend doing that): your team owns Focus so it's ultimately your decision. However, the current approach has a series of documented shortcomings that bit us already in Fenix:
- A perf investigation found that
.track
is a performance problem. - A Fenix + Glean postmortem identified that this abstraction made it unsustainable to add new metrics at the pace required by Fenix.
- An investigation by our team with the help of some Fenix folks to identify a better option to streamline metric addition..
It tends to be a problem if there are build issues with the parser and code generation fails (it those cases it's easier to disable one file that contains all the generated code), mocking out these classes in tests add overhead.
Do you have a specific example in mind? If code generation fails, that's a bug that should be fixed by us rather than mocked/ignored. Moreover, Glean is pretty mature now and we don't expect these to happen frequently (they did not happen very frequently even early on FWIW :-) ).
Another reason that is not in our direct path, but worth noting, is that forks of our browsers tend to remove our telemetry in place of their own, so having that in one place would also make that easier.
While I can sympathise with this, we already perform our due diligence by making disabling data collection extremely simple. Folks who want to maintain a fork can still do so, committing a little time stripping Glean if they wish so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for detailed response that gives really good context!
I want to move forward with this patch, so I'll take your recommendation on that and if we can discuss more in the future if things need to change. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for detailed response that gives really good context!
My pleasure! It took a bit of bugzilla-archeology but now I have the context back as well :-)
I want to move forward with this patch, so I'll take your recommendation on that and if we can discuss more in the future if things need to change. :)
Thank you so much! We're more than happy to discuss changes going forward and help with any problem that might arise!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My pleasure! It took a bit of bugzilla-archeology but now I have the context back as well :-)
Will reference your comment in future discussions so the effort doesn't go to waste! 😄
It's worth noting, a replacement service for Leanplum might be coming in the future. |
396e77c
to
7046499
Compare
I'm really hoping and planning on Nimbus filling those requirements, and then it won't require distributing the telemetry between two systems. |
Even when using Leanplum in Fenix, the callsites for Leanplum events were just a few and could have just been simple calls at the call site :-) See the bug I referenced above. Again, that's your call, but I would advise to not go down that path again |
Don't forget data-review, please! |
We're tracking this with #4901 and shouldn't affect merging this PR; we do not release from |
ec84802
to
4b66617
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r+wc: let's make sure the tests pass before merging. Happy to help getting this unblocked as needed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pocmo This should be good to land if you want to take a peek as well. 🙂
Some notes:
metrics.md
file (replaced with https://dictionary.telemetry.mozilla.org/).adb shell am start -n org.mozilla.focus.debug/mozilla.telemetry.glean.debug.GleanDebugActivity --es sendPing metrics --ez logPings true --es debugViewTag test-metrics-ping
notification_emails
?If this looks good to the Glean folks as well, follow-up PRs will be code refactoring and creating metrics for the events that I've commented out.