This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

For #4456: Adds total_uri_count to metrics core ping #6003

Merged
liuche merged 1 commit into mozilla-mobile:master on Oct 30, 2019

Contributor

@sblatz sblatz commented Oct 14, 2019


Pull Request checklist

  • Quality: This PR builds and passes detekt/ktlint checks (A pre-push hook is recommended)
  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Screenshots: This PR includes screenshots or GIFs of the changes made or an explanation of why it does not
  • Accessibility: The code in this PR follows accessibility best practices or does not include any user facing features

After merge

  • Milestone: Make sure issues finished by this pull request are added to the milestone of the version currently in development.

To download an APK when reviewing a PR:

  1. Click Show All Checks.
  2. Click Details next to "Taskcluster (pull_request)" once it appears and finishes with a green checkmark.
  3. Click the "Fenix - assemble" task, then click "Run Artifacts".
  4. The APK links should be on the left side of the screen, named for each CPU architecture.

@sblatz sblatz requested a review from boek October 14, 2019 15:01
@@ -307,6 +307,11 @@ class Settings private constructor(
default = 0
)

var totalUriCount by longPreference(
appContext.getPreferenceKey(R.string.pref_key_total_uri),
default = 0
Contributor Author

Is there a security risk here? We don't normally store telemetry data in prefs. 😬

Contributor

I don't think there's a security risk in this information here - number of urls opened is not PII or anything else.

Contributor

Do we ever reset this number? It's not clear to me that we do, and if we don't, this number will only be useful if we normalize it (e.g. 100 urls / days opened).
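
As a rough illustration of the normalization suggested in this comment, here is a minimal sketch. The helper name `urisPerDay` and the idea that the number of days the app was opened is tracked separately are assumptions made for the example; neither is part of this PR.

```kotlin
/**
 * Hypothetical helper, not part of Fenix: turns a lifetime URI count into a
 * per-day rate so that long-lived and recent installs can be compared.
 */
fun urisPerDay(totalUriCount: Long, daysOpened: Int): Double {
    require(totalUriCount >= 0) { "totalUriCount must not be negative" }
    // Without any recorded day there is nothing to normalize against.
    if (daysOpened <= 0) return 0.0
    return totalUriCount.toDouble() / daysOpened
}
```

With these assumptions, a lifetime count of 100 over 4 days of use gives a rate of 25 URIs per day, which is comparable across users in a way the raw lifetime total is not.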

@codecov-io

codecov-io commented Oct 14, 2019

Codecov Report

Merging #6003 into master will increase coverage by 0.03%.
The diff coverage is 71.42%.

@@             Coverage Diff              @@
##             master    #6003      +/-   ##
============================================
+ Coverage     14.49%   14.53%   +0.03%     
  Complexity      323      323              
============================================
  Files           272      272              
  Lines         11039    11046       +7     
  Branches       1593     1593              
============================================
+ Hits           1600     1605       +5     
- Misses         9311     9313       +2     
  Partials        128      128
Impacted Files                                           Coverage Δ               Complexity Δ
...rc/main/java/org/mozilla/fenix/FenixApplication.kt    3.77% <0%> (-0.04%)      2 <0> (ø)
...la/fenix/components/metrics/GleanMetricsService.kt    7.8% <100%> (+0.32%)     3 <0> (ø) ⬇️
.../src/main/java/org/mozilla/fenix/utils/Settings.kt    66.66% <100%> (+0.57%)   15 <0> (ø) ⬇️
...ava/org/mozilla/fenix/browser/UriOpenedObserver.kt    58.33% <50%> (-0.76%)    3 <0> (ø)

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d797358...ab062d2. Read the comment docs.

@sblatz
Contributor Author

sblatz commented Oct 14, 2019

Request for data collection review form

All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.

  1. What questions will you answer with this data?
  • How many URIs does a user navigate to?
  2. Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements?
  • We want to know how many sites a user visits in order to better understand engagement with the product.
  3. What alternative methods did you consider to answer these questions? Why were they not sufficient?
  • N/A (These are baseline metrics)
  4. Can current instrumentation answer these questions?
  • Currently we track this event in Glean every time a user enters a URI; however, it is not included in the core ping, so we cannot query it as usefully.
  5. List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories found on the Mozilla wiki.
  • All data is Category 2.
  6. How long will this data be collected?
  • Until 03/01/2020
  7. What populations will you measure?
  • All release, beta, and nightly users with telemetry enabled.
  8. Please provide a general description of how you will analyze this data.
  • Glean / Amplitude
  9. Where do you intend to share the results of your analysis?
  • Only on Glean, Amplitude and with mobile teams.

@sblatz sblatz added the needs:data-review label (PR is awaiting a data review) Oct 14, 2019
Contributor

@liuche liuche left a comment

Shouldn't we set this count to 0 in preferences, when sending the ping?
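
One way to read this suggestion, as a rough sketch: read the stored count when the ping is assembled and zero it in the same step. `UriCounter`, `recordUriOpened` and `takeForPing` are hypothetical names, and an in-memory field stands in for the stored preference; none of this is the code in the PR.

```kotlin
/** Hypothetical stand-in for the preference-backed counter; not Fenix code. */
class UriCounter {
    private var count: Long = 0

    /** Called once per URI the user opens. */
    @Synchronized
    fun recordUriOpened() {
        count++
    }

    /**
     * Returns the count accumulated since the last ping and resets it to 0,
     * so each ping reports only the URIs opened since the previous one.
     */
    @Synchronized
    fun takeForPing(): Long {
        val snapshot = count
        count = 0
        return snapshot
    }
}
```

Reading and resetting inside one synchronized method avoids losing an increment that arrives between the read and the reset.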

@sblatz
Contributor Author

sblatz commented Oct 16, 2019

@liuche:

> Do we ever reset this number? It's not clear to me that we do, and if we don't, this number will only be useful if we normalize it (e.g. 100 urls / days opened).

When should we reset this? Do we reset it currently in any way for the other metric? My understanding is that it's a counter that always grows. @boek

Looks like Chenxia's suggestion is to do this:

> Shouldn't we set this count to 0 in preferences, when sending the ping?

Are we wanting total_uri_count to be reset each session?

@liuche
Contributor

liuche commented Oct 16, 2019

Good question! A quick glance shows me 3 increments in that file: two of them are counters so we stop showing a hint after a certain number of times, which I agree with - those won't grow unbounded.

The other counter is "number of times PB has been opened". I really do think that this one is also suspect - just having a huge number doesn't really make sense, and it's hard to understand what we could actually do with that number! Like, would we do something different if we knew a user EVER opened PB 100x vs 10000x, as opposed to opening PB in a session 2x vs 10x?

I'd argue that "total number of urls opened ever" is a much less useful metric than "number of urls opened in a session".

@liuche
Contributor

liuche commented Oct 18, 2019

@baron-severin also a related total_uri_count

@liuche
Contributor

liuche commented Oct 18, 2019

@fbertsch from a data perspective, is there an easy way to look at total_uri_count for the lifetime of the app vs. per session? Do you have any opinions on how much information each of those provides? (Would we need to normalize lifetime uri count for instance, and if so is that easy to do?)

@severinrudie
Contributor

severinrudie commented Oct 19, 2019

> @baron-severin also a related total_uri_count

@liuche It looks like this is sent from the same place as the code I was looking at, so unless I've made a mistake, it will suffer from #6126 as well.

EDIT: and #3676 too

@fbertsch

@liuche almost certainly we want total_uri_count for the session. We can always aggregate the data on our end across a user's history if we want to get it over their lifetime, but the specific use-case is to determine if the user was active on a given day (total_uri_count > 5).
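
To illustrate the analysis-side aggregation described in this comment, here is a sketch under stated assumptions. The `SessionPing` type and its fields are invented for the example and are not the real ping schema; summing all sessions of a day before applying the threshold is also an assumption. The threshold of 5 comes from the comment itself.

```kotlin
import java.time.LocalDate

/** Hypothetical, simplified view of one ping; not the real core ping schema. */
data class SessionPing(val clientId: String, val day: LocalDate, val totalUriCount: Long)

/** Lifetime count per client, recovered by summing the per-session counts. */
fun lifetimeUriCounts(pings: List<SessionPing>): Map<String, Long> =
    pings.groupBy { it.clientId }
        .mapValues { (_, clientPings) -> clientPings.sumOf { it.totalUriCount } }

/** Days on which a client counts as active: more than `threshold` URIs that day. */
fun activeDays(pings: List<SessionPing>, clientId: String, threshold: Long = 5): Set<LocalDate> =
    pings.filter { it.clientId == clientId }
        .groupBy { it.day }
        .filterValues { dayPings -> dayPings.sumOf { it.totalUriCount } > threshold }
        .keys
```

The point of the sketch is that a per-session counter loses nothing: the lifetime total is still a simple sum, while the daily activity check becomes possible at all.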

@liuche
Contributor

liuche commented Oct 23, 2019

@sblatz I'd go with Frank's suggestion here! It sounds like it will give us the information that is useful for data analysis, and it can also be aggregated to give the total lifetime count.

@liuche liuche added the pr:needs-changes label (PRs that need some changes/fixes before they can land) Oct 25, 2019
@sblatz
Contributor Author

sblatz commented Oct 28, 2019

@liuche I am now zeroing this value out inside of setupApplication which should happen on each new "session": https://github.com/mozilla-mobile/fenix/pull/6003/files#diff-3c4353a6227c0da83801f46757491524R67
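
For readers without the linked diff at hand, a rough sketch of the per-session reset described here. `LongPreference`, `FakeSettings` and `onApplicationSetup` are hypothetical stand-ins, with a `MutableMap` playing the role of the preference store; the actual change lives in `setupApplication` and `Settings.totalUriCount`.

```kotlin
import kotlin.properties.ReadWriteProperty
import kotlin.reflect.KProperty

/** Hypothetical delegate mimicking a long preference; a map replaces the real store. */
class LongPreference(
    private val store: MutableMap<String, Long>,
    private val key: String,
    private val default: Long
) : ReadWriteProperty<Any?, Long> {
    override fun getValue(thisRef: Any?, property: KProperty<*>): Long =
        store[key] ?: default

    override fun setValue(thisRef: Any?, property: KProperty<*>, value: Long) {
        store[key] = value
    }
}

/** Hypothetical stand-in for Settings; only the counter is modelled. */
class FakeSettings(store: MutableMap<String, Long>) {
    var totalUriCount: Long by LongPreference(store, "pref_key_total_uri", default = 0)
}

/** Runs once per process start, so the persisted count restarts at 0 each "session". */
fun onApplicationSetup(settings: FakeSettings) {
    settings.totalUriCount = 0
}
```

In this sketch a "session" is simply one process lifetime: whatever was persisted by the previous run is overwritten before any new URI is counted.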

Contributor

@liuche liuche left a comment

Data Review Form (to be filled by Data Stewards)

  1. Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?

Yes, in metrics.yaml and the generated metrics.md

  2. Is there a control mechanism that allows the user to turn the data collection on and off? (Note, for data collection not needed for security purposes, Mozilla provides such a control mechanism) Provide details as to the control mechanism available.

Yes, coreping is also controlled by Fenix data controls

  3. If the request is for permanent data collection, is there someone who will monitor the data over time?

Has expiry

  4. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Type 2, number of url loads and reloads made by the user

  5. Is the data collection request for default-on or default-off?

Default on

  6. Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?

No, just a count of uris browsed.

  7. Is the data collection covered by the existing Firefox privacy notice?

Yes

  8. Does there need to be a check-in in the future to determine whether to renew the data? (Yes/No) (If yes, set a todo reminder or file a bug if appropriate)

Has expiry

  9. Does the data collection use a third-party collection tool?

No

@liuche liuche merged commit 8549b80 into mozilla-mobile:master Oct 30, 2019
@severinrudie severinrudie mentioned this pull request Nov 6, 2019
30 tasks
pocmo added a commit to pocmo/fenix that referenced this pull request Feb 25, 2020
Labels
needs:data-review PR is awaiting a data review pr:needs-changes PRs that need some changes/fixes before they can land
5 participants