Skip to content
This repository has been archived by the owner on Jan 17, 2023. It is now read-only.

Implement metrics broker #1735

Closed
ghost opened this issue Nov 3, 2016 · 11 comments
Closed

Implement metrics broker #1735

ghost opened this issue Nov 3, 2016 · 11 comments
Assignees
Milestone

Comments

@ghost
Copy link

ghost commented Nov 3, 2016

There is an idea/dream/blueprint at #1662 of a new metrics function implemented in Test Pilot. We should implement it (as a new function, probably, to not mess with the current experiments).

At a minimum lets connect GA and Telemetry. For extra credit, sync up with Tim or Marina and let's start pinging Tim's pipeline too. Activity Stream code uses it, and they have some docs at https://github.com/mozilla/activity-stream/blob/master/data_dictionary.md and https://github.com/mozilla/activity-stream/blob/master/data_events.md . I can help with who to talk to on that team.

This issue is probably bigger than it seems because we're working with 3rd party systems.

@ghost ghost added this to the Sprint 14 milestone Nov 4, 2016
@ghost ghost assigned jaredhirsch Nov 7, 2016
@jaredhirsch
Copy link
Member

@wresuolc Taking this, since it's still up for grabs.

Does Test Pilot need to batch up packets, or can we just fire off events as they are received? If they need to be batched, what are the requirements? (Or is contacting the Telemetry and Tiles teams part of this bug?)

@jaredhirsch
Copy link
Member

Another question for myself or anybody: what data has turned out to be most valuable, so far?

Ping the experiment teams to find out what data led to product changes, and what data wasn't used for decision making.

Once we have that data, we can turn it into suggestions / examples in the metrics doc (#1662).

@ghost
Copy link
Author

ghost commented Nov 7, 2016

No batching, fire away. I've got an open thread w/ legal about your "who owns the GA account" concern

@ghost
Copy link
Author

ghost commented Nov 10, 2016

I've confirmed with legal that we're good to go on the GA plan here (allowing someone to put their credentials in the add-on).

Turns out the agreements we have with google are actually available to anyone. In the GA account you can go into the settings and adjust how much your data is smooshed together with other data, etc. We'll require that experiments have those settings adjusted to protect privacy.

If an experiment won't do that for some reason, we'll just leave the GA credentials blank and data won't be collected there.

@jaredhirsch
Copy link
Member

@wresuolc @lmorchard Based on the docs in #1662, I sketched out a super rough draft of a testpilot-metrics library. I'd really appreciate feedback, just to get the sense of whether this is headed in the right direction: https://github.com/6a68/testpilot-metrics/tree/wip-getting-started

One thing I'm not clear on is whether testpilot-metrics is actually going to ping the telemetry servers, or just fire an nsIObserver signal over to the test pilot addon.

@lmorchard
Copy link
Contributor

lmorchard commented Nov 15, 2016

One thing I'm not clear on is whether testpilot-metrics is actually going to ping the telemetry servers, or just fire an nsIObserver signal over to the test pilot addon.

I'd say ping the main addon with nsIObserver (SDK & bootstrapped & XUL) or BroadcastChannel (webextension). By going with the messaging channels, we tried to simplify the usage of TelemetryController.submitExternalPing for legacy add-ons - and to make it possible at all for webextensions.

There's also more that goes on in Telemetry than just pinging a server via HTTP. The common submission API we use with TelemetryController.submitExternalPing handles some throttling & retry logic, as well as folding in a blob of environment data for even more context in the ping. (And I'm not sure WebExtensions can assemble the equivalent of that env data)

@lmorchard
Copy link
Contributor

lmorchard commented Nov 15, 2016

Sounds like what we need here is reusable code that works for:

  1. Non-SDK add-ons (restartless, bootstrapped, XUL, I don't even know what all the terms are anymore)
  2. SDK add-ons (jetpack, jpm, etc)
  3. WebExtension add-ons
  4. Embedded WebExtensions? (basically WebExtension centers candy-coated in a legacy add-on shell)

And we need that code to offer a function that accepts a generic reporting data structure and maps it to specific pings sent to:

  1. Google Analytics
  2. Telemetry
  3. Onyx (maybe?)

Seems like GA & Onyx are simple (?) HTTP services. Telemetry is a bit more complicated. Also, will a single module on npm work for every add-on type we want to support?

@jaredhirsch
Copy link
Member

jaredhirsch commented Nov 15, 2016

wresuolc, lmorchard, and I met to discuss this issue earlier today. Notes are here: https://public.etherpad-mozilla.org/p/test-pilot-metrics-chat-20161115

Copying over the conclusions/decisions to make life easy:

  • experiments will send messages to TxP, and TxP docs will show how to do it (this is already done)
  • TxP will forward pings to telemetry and GA and ping centre
  • TxP will only support the 'event' GA hit type for v1
    • if experiments need deeper GA integration, they can do it themselves
  • new experiments will use the event-object unified ping type
    • for the next new experiment, we will coordinate with ping centre + telemetry teams to ensure everyone's on board with the proposed new standard
  • old experiments can stick with whatever format they are using
    • if ping centre has the bandwidth, they can add support for each existing experiment, as each has a metrics.md with schema

My todo list for this issue:

  • delete the mozilla/testpilot-metrics repo
  • land the metrics broker code in the test pilot addon
  • update metrics docs as needed to reflect the above
  • contact ping centre team to see if we can start sending over throwaway packets (answer: yes, it drops unknown packets)

@johngruen johngruen modified the milestone: Sprint 14 Nov 16, 2016
@jaredhirsch jaredhirsch modified the milestone: Sprint 14 Nov 21, 2016
@ghost ghost modified the milestones: Sprint 15, Sprint 14 Nov 23, 2016
@jaredhirsch
Copy link
Member

I've opened a PR against ping-centre to add support for the testpilot ping type: mozilla/ping-centre#23.

Currently blocked on landing that PR and ping-centre being published to npm (mozilla/ping-centre#22).

@jaredhirsch
Copy link
Member

Here's a WIP branch while we're waiting for ping-centre to hit npm: https://github.com/6a68/testpilot/tree/add-ping-centre

@jaredhirsch
Copy link
Member

Closed by #2151

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants