This repository has been archived by the owner on Jan 12, 2023. It is now read-only.

Setup and initian glean #4891

Merged
merged 3 commits into from Jun 15, 2021

Conversation

jonalmeida
Contributor

Some notes:

  • Glean project setup issue.
  • Channel is based on Glean/Data's recommendation (here).
  • We no longer need to generate a metrics.md file (replaced with https://dictionary.telemetry.mozilla.org/).
  • Tested the activation with adb shell am start -n org.mozilla.focus.debug/mozilla.telemetry.glean.debug.GleanDebugActivity --es sendPing metrics --ez logPings true --es debugViewTag test-metrics-ping
    • One oddity is that the first time you run this, you only see the activation_id, the second time is when you see the other metrics. This is consistent behaviour, but I'm not sure if it's intentional.
  • Ignoring the DS requests for the sample metrics for now.
  • What email do we use for notification_emails?
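
To make the pieces of the test command above easier to see, here is a minimal sketch that assembles the same adb invocation as an argument list. The function name `gleanDebugCommand` and its parameter names are made up for this illustration; only the flags and the activity name come from the command quoted in the notes.

```kotlin
// Hypothetical helper: assembles the adb invocation for the Glean debug activity
// as an argument list. Only the flags and activity name come from the PR notes.
fun gleanDebugCommand(
    appId: String,
    ping: String,
    debugViewTag: String,
    logPings: Boolean = true,
): List<String> = listOf(
    "adb", "shell", "am", "start",
    "-n", "$appId/mozilla.telemetry.glean.debug.GleanDebugActivity",
    "--es", "sendPing", ping,                // string extra: name of the ping to submit
    "--ez", "logPings", logPings.toString(), // boolean extra: whether to log pings
    "--es", "debugViewTag", debugViewTag,    // string extra: tag attached to the debug pings
)
```

Calling `gleanDebugCommand("org.mozilla.focus.debug", "metrics", "test-metrics-ping")` and joining the result with spaces reproduces the command from the notes.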

If this looks good to the Glean folks as well, follow-up PRs will be code refactoring and creating metrics for the events that I've commented out.

@jonalmeida jonalmeida added the 🕵️‍♀️ needs review PRs that need to be reviewed label May 18, 2021

@Dexterp37 Dexterp37 left a comment


This looks good overall, I gave this a first pass and I'd like to do a second one :-)

Please have a read of our migration guide for porting products to Glean from service-telemetry.

This PR also removes service-telemetry: we'd recommend not doing this or, at least, keeping a short overlap period to do a thorough data validation :-) Happy to talk about this!

@Dexterp37

> Some notes:
>
> * We no longer need to generate a metrics.md file (replaced with https://dictionary.telemetry.mozilla.org/).

You might still want to advertise where to find data documentation in your README file :-)

> * Tested the activation with `adb shell am start -n org.mozilla.focus.debug/mozilla.telemetry.glean.debug.GleanDebugActivity --es sendPing metrics --ez logPings true --es debugViewTag test-metrics-ping`
>   * One oddity is that the first time you run this, you only see the activation_id, the second time is when you see the other metrics. This is consistent behaviour, but I'm not sure if it's intentional.

That's probably because the first time it starts it doesn't have any recorded metrics in the 'metrics' ping.

> * Ignoring the DS requests for the sample metrics for now.

What do you mean?

> * What email do we use for `notification_emails`?

It should be the email of the person/group of people that should get notified in case of data quality problems or in case somebody looking at the data has questions. We usually recommend adding at least 2 email addresses: one mailing list contact (e.g. if the Focus team has a mailing list or something) and an individual's email address (yours, since you're implementing :) ).

@Dexterp37

@jonalmeida chances are that you still need legacy telemetry around for a bit. If that's the case, I think we should probably add the legacy telemetry client id to the new pings (or at least to some of them) so that analysts can link historical data to the new data

@jonalmeida
Contributor Author

> @jonalmeida chances are that you still need legacy telemetry around for a bit. If that's the case, I think we should probably add the legacy telemetry client id to the new pings (or at least to some of them) so that analysts can link historical data to the new data

Thanks! That actually helps make this initial PR smaller, and I can separate the two systems out without needing to delete and add code all over the place at the same time.

Will leave more comments to yours above in a bit. 🙂

@jonalmeida
Contributor Author

> * Ignoring the DS requests for the sample metrics for now.
>
> What do you mean?

Sorry, it wasn't clear. This metrics.yaml was taken from Fenix with just some bare metrics that we'd want to use in Focus, so that we could test that the system works. Hence the "To Be Decided" placeholders for the DS reviews and email. 🙂

> * What email do we use for `notification_emails`?
>
> It should be the email of the person/group of people that should get notified in case of data quality problems or in case somebody looking at the data has questions. We usually recommend adding at least 2 email addresses: one mailing list contact (e.g. if the Focus team has a mailing list or something) and an individual's email address (yours, since you're implementing :) ).

Thanks, I can add mine for now since I'm not aware of any mailing list for Focus that we can use today.

@jonalmeida
Contributor Author

jonalmeida commented May 26, 2021

Blerg, I forgot that in order to make service-telemetry functional, we need to backport this PR that vendors the library into Focus as well; otherwise we crash on startup in a release build.

Didn't have many conflicts when cherry-picking this and now both telemetry clients seem to be working together.

@Dexterp37 I added the client ID from service-telemetry as part of the deletion-request which should satisfy #4898 I think.

If you could take another peek at the last commit, that would be great. :)
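
As a rough illustration of the linking idea discussed above (carrying the legacy client ID into the new telemetry so analysts can join old and new data), here is a minimal sketch. `UuidMetric`, `LegacyIds`, and `linkLegacyClientId` are hypothetical stand-ins written for this example; they are not the code from this PR or Glean's generated API, where the metric object would be generated from metrics.yaml.

```kotlin
import java.util.UUID

// Hypothetical stand-in for a generated UUID metric; not the real Glean type.
class UuidMetric {
    var value: UUID? = null
        private set

    fun set(id: UUID) {
        value = id
    }
}

// Hypothetical metric category; a real one would be generated from metrics.yaml.
object LegacyIds {
    val clientId = UuidMetric()
}

// Records the legacy client ID in the new telemetry if it parses as a UUID.
// Returns true when the ID was recorded, false when it was missing or malformed.
fun linkLegacyClientId(legacyClientId: String?): Boolean {
    if (legacyClientId == null) return false
    val parsed = try {
        UUID.fromString(legacyClientId)
    } catch (e: IllegalArgumentException) {
        return false // Malformed legacy ID: record nothing rather than a bogus value.
    }
    LegacyIds.clientId.set(parsed)
    return true
}
```

The sketch deliberately skips missing or malformed IDs instead of generating a fresh one, since a made-up ID would not match any historical record.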


@Dexterp37 Dexterp37 left a comment


I would recommend getting rid entirely of the MetricsService abstraction, unless it's strictly required in the short term to connect to other services as well (Leanplum)!
I'd strongly recommend dropping the complexity that comes with this abstraction, as this has been a major pain point in Fenix (for the Fenix devs :-) ).

}
}

override fun track(event: Event) {


I'd recommend not following this approach :-)

Contributor Author


What would be an alternative? I'm open to other options, although I prefer not to have Glean generated code around the code base.

It tends to be a problem if there are build issues with the parser and code generation fails (in those cases it's easier to disable one file that contains all the generated code), and mocking out these classes in tests adds overhead.

Another reason that is not in our direct path, but worth noting, is that forks of our browsers tend to remove our telemetry in place of their own, so having that in one place would also make that easier.

See my comment above about avoiding the MetricController abstraction.


> What would be an alternative? I'm open to other options, although I prefer not to have Glean generated code around the code base.

The alternative is the recommended path: calling the Glean APIs where needed, instead of dispatching.
Note that this is, again, only a recommendation (albeit a strong one): your team owns Focus, so it's ultimately your decision. However, the current approach has a series of documented shortcomings that already bit us in Fenix:

> It tends to be a problem if there are build issues with the parser and code generation fails (in those cases it's easier to disable one file that contains all the generated code), and mocking out these classes in tests adds overhead.

Do you have a specific example in mind? If code generation fails, that's a bug that should be fixed by us rather than mocked/ignored. Moreover, Glean is pretty mature now and we don't expect these to happen frequently (they did not happen very frequently even early on FWIW :-) ).

> Another reason that is not in our direct path, but worth noting, is that forks of our browsers tend to remove our telemetry in place of their own, so having that in one place would also make that easier.

While I can sympathise with this, we already perform our due diligence by making disabling data collection extremely simple. Folks who want to maintain a fork can still do so, spending a little time stripping out Glean if they so wish.
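
To make the two styles in this thread concrete, here is a minimal sketch, assuming nothing about Focus's real classes. `CounterMetric`, `BrowserSearch`, `Event`, `DispatchingMetricsService`, and `onSearchSubmitted` are hypothetical stand-ins written for this example, not Glean's generated code or API; in a real app the metric objects would be generated from metrics.yaml.

```kotlin
// Hypothetical stand-in for a generated counter metric; not the real Glean type.
class CounterMetric {
    var value: Int = 0
        private set

    fun add(amount: Int = 1) {
        value += amount
    }
}

// Hypothetical stand-in for an object generated from metrics.yaml.
object BrowserSearch {
    val searchCount = CounterMetric()
    val urlCount = CounterMetric()
}

// Style 1: dispatching. Call sites emit an Event and one service maps it to a metric.
sealed class Event {
    object SearchPerformed : Event()
    object UrlEntered : Event()
}

class DispatchingMetricsService {
    fun track(event: Event) {
        // Every new metric needs a new Event subclass and a new branch here.
        when (event) {
            Event.SearchPerformed -> BrowserSearch.searchCount.add()
            Event.UrlEntered -> BrowserSearch.urlCount.add()
        }
    }
}

// Style 2: direct calls. The call site records the metric itself.
fun onSearchSubmitted() {
    BrowserSearch.searchCount.add()
}
```

Both styles end up recording into the same metric object; the difference is that the dispatching style adds an `Event` type and a mapping layer that must be kept in sync with every metric, which is the complexity the review recommends dropping.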

Contributor Author


Thanks for the detailed response, that gives really good context!

I want to move forward with this patch, so I'll take your recommendation on that, and we can discuss more in the future if things need to change. :)


> Thanks for the detailed response, that gives really good context!

My pleasure! It took a bit of bugzilla-archeology but now I have the context back as well :-)

> I want to move forward with this patch, so I'll take your recommendation on that, and we can discuss more in the future if things need to change. :)

Thank you so much! We're more than happy to discuss changes going forward and help with any problem that might arise!

Contributor Author


> My pleasure! It took a bit of bugzilla-archeology but now I have the context back as well :-)

Will reference your comment in future discussions so the effort doesn't go to waste! 😄

@jonalmeida
Contributor Author

> I would recommend getting rid entirely of the MetricsService abstraction, unless it's strictly required in the short term to connect to other services as well (Leanplum)!

It's worth noting, a replacement service for Leanplum might be coming in the future.

@jonalmeida jonalmeida force-pushed the glean branch 2 times, most recently from 396e77c to 7046499 Compare May 26, 2021 15:53
@travis79
Member

> > I would recommend getting rid entirely of the MetricsService abstraction, unless it's strictly required in the short term to connect to other services as well (Leanplum)!
>
> It's worth noting, a replacement service for Leanplum might be coming in the future.

I'm really hoping and planning on Nimbus filling those requirements, and then it won't require distributing the telemetry between two systems.

@Dexterp37

> > I would recommend getting rid entirely of the MetricsService abstraction, unless it's strictly required in the short term to connect to other services as well (Leanplum)!
>
> It's worth noting, a replacement service for Leanplum might be coming in the future.

Even when using Leanplum in Fenix, there were only a few call sites for Leanplum events, and they could have been simple direct calls :-) See the bug I referenced above.

Again, that's your call, but I would advise against going down that path again.

@travis79
Member

Don't forget data-review, please!

@jonalmeida
Contributor Author

> Don't forget data-review, please!

We're tracking this in #4901, and it shouldn't affect merging this PR; we do not release from main for now.

@jonalmeida jonalmeida force-pushed the glean branch 2 times, most recently from ec84802 to 4b66617 Compare June 3, 2021 20:26
@jonalmeida jonalmeida requested a review from Dexterp37 June 3, 2021 20:36

@Dexterp37 Dexterp37 left a comment


r+wc: let's make sure the tests pass before merging. Happy to help get this unblocked as needed!

Contributor Author

@jonalmeida jonalmeida left a comment


@pocmo This should be good to land if you want to take a peek as well. 🙂

@jonalmeida jonalmeida added 🛬 needs landing PRs that are ready to land and removed 🕵️‍♀️ needs review PRs that need to be reviewed labels Jun 15, 2021
@mergify mergify bot merged commit 7d04568 into mozilla-mobile:main Jun 15, 2021
@jonalmeida jonalmeida deleted the glean branch June 15, 2021 14:27
Labels
🛬 needs landing PRs that are ready to land

5 participants