Determine constraints/what needs to be collected in our data pipeline #27
Comments
@nchapman can you help wrap up the requirements here? |
Target metrics
Other metrics (TBD)
Thoughts? @oyiptong @tspurway @nchapman @k88hudson @emtwo |
The GUID needs to persist over time. Initially it could be the FxA ID or the Test Pilot ID. Scroll position could always be tracked with 'click' metrics. A quick-and-dirty duration could be: start a timer on newtab, stop it on 'defocus', then send the ping. Basic navigation and interaction events should be tracked. On click, we will track the position of the clicked item and the scroll depth. How old are the items that users click on? Ages should be client-side time differences, in seconds. Check out the Mixpanel and GA data formats (use standard ping/metrics formats). Include the add-on version, experiment versions, cohort ID, etc. Performance metrics: load times, latencies, size of history, number of bookmarks, time spent in the browser (active usage hours). Add-on lifecycle events: install, uninstall, update. |
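The quick-and-dirty duration and the item-age measurement described above could be sketched roughly as follows. This is only an illustration in Python, not the add-on's code: `SessionTimer` and `item_age_seconds` are hypothetical names, and the injectable clock is an assumption made so the timer can be exercised without waiting.

```python
import time


class SessionTimer:
    """Tracks one newtab session: start on newtab load, stop on 'defocus'."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so the timer can be tested
        self._started_at = None

    def start(self):
        # Call when the newtab page opens.
        self._started_at = self._clock()

    def stop(self):
        # Call on 'defocus'; returns the session duration in seconds.
        if self._started_at is None:
            raise RuntimeError("stop() called before start()")
        duration = self._clock() - self._started_at
        self._started_at = None
        return duration


def item_age_seconds(item_timestamp, now):
    """Age of a clicked item as a client-side time difference, in seconds."""
    # Clamp at zero so a skewed client clock never yields a negative age.
    return max(0, int(now - item_timestamp))
```

The duration returned by `stop()` is what would go into the ping sent when the session ends.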
We are defining an Activity Stream session: a ping will be sent whenever the session ends (which will be when the tab loses focus). The ping will contain the following data:
Are we missing anything here @emtwo ? |
|
Based on the feedback of @oyiptong
Edit1: scratch
|
Do we think we'll want to do any analytics on the time of day of activations? I was keeping track of a start timestamp, wondering if it's worth keeping? |
The assumption is that the server-side will take note of the receipt time of a ping. The client doesn't need to keep track of the start time. |
@emtwo FYI, the Onyx endpoint for AS is live in stage: "https://onyx_tiles.stage.mozaws.net/v3/links/activity-stream". In production, it will be "https://tiles.services.mozilla.com/v3/links/activity-stream". The add-on can send pings to Onyx via HTTP POST. Onyx returns status code 200 with an empty response on success, and 400 if any error occurs. |
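For illustration, the POST contract described above might look like this from a client's point of view. It is a sketch under assumptions: the JSON content type and the helper names `build_ping_request` and `ping_accepted` are hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request

# Stage endpoint quoted in the comment above.
STAGE_URL = "https://onyx_tiles.stage.mozaws.net/v3/links/activity-stream"


def build_ping_request(payload, url=STAGE_URL):
    """Build (but do not send) an HTTP POST carrying the ping as JSON."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumption: the thread does not state the expected content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ping_accepted(status_code):
    """Onyx answers 200 (empty body) on success and 400 on any error."""
    return status_code == 200
```

A caller would pass the built request to `urllib.request.urlopen` and feed the response status to `ping_accepted`; the payload keys are whatever the final ping format defines.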
@oyiptong is the server aware of the timezone of the client? If not, then the server doesn't actually know what time of day a given user is browsing, right? Edit: sounds like we can infer the timezone on the server using GeoIP, so there is no need to send a client timestamp. |
I see what you mean. The local datetime. And yes, we obtain the country data on the server-side. |
Usually it doesn't make sense to save timestamps from clients, as we can't control how they set their local time. I will try to get the timezone from the geo data, which is in turn derived from the IP address. |
fix(addon): #27 Update data collection format.
The MaxMind GeoIP database we use has Timezone as a field (https://www.maxmind.com/en/geoip2-city#features) |
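As a rough sketch of how the time of day could be derived server-side, the function below converts the ping's receipt time (UTC) into the client's local hour. The function name is hypothetical, and it takes a ready-made `tzinfo`; in practice that object would presumably be built from the GeoIP timezone name, for example with `zoneinfo.ZoneInfo` on Python 3.9+.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def local_hour_of_day(received_at_utc: datetime, client_tz: tzinfo) -> int:
    """Hour of day (0-23) in the client's zone for a ping received at a UTC time."""
    if received_at_utc.tzinfo is None:
        # Refuse naive datetimes: astimezone() would assume server-local time.
        raise ValueError("received_at_utc must be timezone-aware")
    return received_at_utc.astimezone(client_tz).hour
```

For example, a ping received at 03:30 UTC from a client eight hours behind UTC falls in hour 19 of the previous local day.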
AFAIK, our servers are provisioned with the country DB. We'll need to ensure they are provisioned with the city DB. |
That's a good point - I will double-check with Travis that we have the proper license somewhere. |
We can capture the ping and just not count it when we do our aggregate queries. The information could still be useful. |
Moving the duration filtering to the query level also sounds viable. Plus, keeping the long-lived session pings could facilitate further investigation. OK, I will take this filter out of Infernyx. Thanks! |
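A minimal sketch of query-level filtering, assuming pings are stored unfiltered and long sessions are excluded only when aggregating. The `session_duration` key and the 30-minute cutoff are hypothetical; the thread does not specify either.

```python
MAX_SESSION_SECONDS = 30 * 60  # hypothetical cutoff, not taken from the thread


def mean_session_duration(pings, max_seconds=MAX_SESSION_SECONDS):
    """Average duration over pings at or below the cutoff.

    Long-lived sessions are skipped here but remain in `pings`, so they
    stay available for later investigation.
    """
    kept = [
        p["session_duration"]
        for p in pings
        if p["session_duration"] <= max_seconds
    ]
    if not kept:
        return None
    return sum(kept) / len(kept)
```

Because the filter lives in the aggregate rather than at ingestion, the cutoff can be changed later without losing data.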
Just confirm that the timezone is not available in the |
I just noticed there was a question about the |
As for the |
Also, is it possible to grab telemetry data from #19 and include it in this ping? @ncloudioj @emtwo @mzhilyaev |
I think it should be fine to add metrics from #19 to the ping. @mzhilyaev should be able to confirm. |
Yes, we can attach performance events or any metric computable from them to the ping |
@emtwo @tspurway FYI, based on the discussion we had today,
Note:
Question:
Edit 1: We've decided to include the perf metrics in the ping. |
Performance metrics to be included in the ping:
@oyiptong @mzhilyaev could you please add other metrics of interest to the list above? |
The load time metric is in milliseconds; the client default is null. |
Update: Add-on telemetry has been enabled since 1.0.4, and data processing and persistence look good so far. I will keep watching it as more metrics are brought into the pipeline. |
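To illustrate the unit and the default, here is a small sketch of how a load time might be placed in the ping. The field name `load_latency` and the helper are hypothetical; only the millisecond unit and the null default come from the comment above.

```python
import json


def build_perf_fields(load_time_seconds=None):
    """Return the load-time field in integer milliseconds, or None if unmeasured."""
    if load_time_seconds is None:
        # Serialized as JSON null, matching the client default.
        return {"load_latency": None}
    return {"load_latency": int(round(load_time_seconds * 1000))}
```

Passing the result through `json.dumps` turns an unmeasured load time into `null` in the payload.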
If we are collecting data in splice:
Notes:
@tspurway do you want to add some clarification here?