This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

For #10434: Handle cases when proc/$pid/stat is not accessible. #10481

Merged: 1 commit into mozilla-mobile:master from the 10434 branch on Jun 3, 2020

@mcarare mcarare commented May 7, 2020

Pull Request checklist

  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Screenshots: This PR includes screenshots or GIFs of the changes made or an explanation of why it does not
  • Accessibility: The code in this PR follows accessibility best practices or does not include any user facing features. In addition, it includes a screenshot of a successful accessibility scan to ensure no new defects are added to the product.

After merge

  • Milestone: Make sure issues finished by this pull request are added to the milestone of the version currently in development.

To download an APK when reviewing a PR:

  1. click on Show All Checks,
  2. click Details next to "Taskcluster (pull_request)" after it appears and then finishes with a green checkmark,
  3. click on the "Fenix - assemble" task, then click "Run Artifacts".
  4. the APK links should be on the left side of the screen, named for each CPU architecture

@mcarare mcarare requested a review from mcomella May 7, 2020 12:02

mcarare commented May 7, 2020

@mcomella I wasn't sure how we would want to handle exceptions. I assumed that a value of 0 would be ignored when assessing this metric. Let me know if this should be handled differently. TY!

@mcomella mcomella left a comment

Thanks for addressing this! I think reporting 0 would be misleading in our data, so we should either record nothing or record a new error.

app/src/main/java/org/mozilla/fenix/perf/Stat.kt (outdated)
try {
stat.getProcessStartTimeTicks(Process.myPid())
} catch (e: NegativeArraySizeException) {
return 0

Rather than returning 0, I think it's preferable to report an error through a new telemetry probe.

That being said, if that's too much work, it's probably fine to catch the exception in setExpensiveMetric and record nothing if it's thrown. Just bump the metric version and change the docs to make it explicit this is happening.

@mcomella I added a telemetry probe, going to also post data review form and mark this PR with needs:data-review. Let me know if any more changes are needed. TY!
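
The approach agreed on in this thread (flag a read error instead of recording a misleading 0) can be sketched roughly as follows. This is a minimal illustration, not the code that landed: `FakeTelemetry` and `recordFrameworkStart` are hypothetical names, the sink is a plain class rather than the real telemetry API, and catching `FileNotFoundException` is an assumption based on the KDoc added later in this PR.

```kotlin
import java.io.FileNotFoundException

// Hypothetical telemetry sink used only for this sketch; it is not the real telemetry API.
class FakeTelemetry {
    var frameworkStartTicks: Long? = null
    var readError: Boolean = false
}

// Records the start time when it can be read. When the read fails, a separate
// error flag is set and no start time is recorded, so no misleading 0 is reported.
fun recordFrameworkStart(telemetry: FakeTelemetry, readStartTimeTicks: () -> Long) {
    try {
        telemetry.frameworkStartTicks = readStartTimeTicks()
    } catch (e: FileNotFoundException) {
        // Assumption: an inaccessible /proc/<pid>/stat surfaces as FileNotFoundException.
        telemetry.readError = true
    }
}
```

Keeping the error in its own field means analysis can count failed reads separately instead of filtering out zero values after the fact.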

@mcomella

Sorry it took so long to get to this 😞

@mcomella

Ideally, we could extend the tests to cover this case as well (I think Stat & friends have high code coverage).

mcarare commented May 13, 2020

> Sorry it took so long to get to this 😞

No worries. I know it happens. Thank you for the review!

@mcarare mcarare force-pushed the 10434 branch 2 times, most recently from 0d6a3e7 to 2725aa8 Compare May 13, 2020 07:27

mcarare commented May 13, 2020

Request for data collection review form

All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.

  1. What questions will you answer with this data?
  • How often will a custom OS block us from reading the stats needed for startup time evaluation?
  2. Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements? Some example responses:
  • To measure how often our startup time implementation is unsuitable for OS customizations.
  3. What alternative methods did you consider to answer these questions? Why were they not sufficient?
  • We could not gather data locally across the wide range of devices and OS customizations for which our startup measurement implementation is unsuitable.
  4. Can current instrumentation answer these questions?

No.

  5. List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories found on the Mozilla wiki.

Note that the data steward reviewing your request will characterize your data collection based on the highest (and most sensitive) category.

| Measurement Description | Data Collection Category | Tracking Bug # |
| --- | --- | --- |
| framework_start_error: event | Category 1 | #10434 |

  6. How long will this data be collected? Choose one of the following:
  • I want this data to be collected until 2020-07-15 (like the rest of the startup telemetry data).
  7. What populations will you measure?

All.

  8. If this data collection is default on, what is the opt-out mechanism for users?

Standard telemetry opt-out.

  9. Please provide a general description of how you will analyze this data.
  • Watch frameworkStartReadError to determine if there are recurrent read errors.
  10. Where do you intend to share the results of your analysis?

Internally.

  11. Is there a third-party tool (i.e. not Telemetry) that you are proposing to use for this data collection? If so:

No.

@mcarare mcarare added the needs:data-review PR is awaiting a data review label May 13, 2020

@mcomella mcomella left a comment

This looks good to me. If we want to avoid the churn, I'm happy for this to land without the nits too (after data review). Thanks, mcarare!

@@ -25,6 +25,9 @@ private const val FIELD_POS_STARTTIME = 21 // starttime naming matches field in
open class Stat {

@VisibleForTesting(otherwise = PRIVATE)
/**
* @throws [java.io.FileNotFoundException]
*/

nit: I think kdoc is conventionally above the annotation
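
The ordering this nit asks for can be sketched as below, with the KDoc block first and the annotation directly above the declaration. `StatSketch` and `readStatText` are hypothetical names, and `@Throws` stands in for `@VisibleForTesting` only so that the snippet compiles without the Android annotation library.

```kotlin
import java.io.File
import java.io.FileNotFoundException

open class StatSketch {
    /**
     * Returns the raw text of the given stat file.
     *
     * @throws [java.io.FileNotFoundException] if the file is missing or cannot be opened.
     */
    @Throws(FileNotFoundException::class) // annotation sits between the KDoc and the declaration
    open fun readStatText(statFile: File): String = statFile.readText()
}
```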

@@ -25,6 +25,9 @@ private const val FIELD_POS_STARTTIME = 21 // starttime naming matches field in
open class Stat {

@VisibleForTesting(otherwise = PRIVATE)
/**
* @throws [java.io.FileNotFoundException]
*/

nit: I'd add a kdoc to getFrameworkStartNanos too. It's more important than this method because it's the non-private method of this class
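
As a rough illustration of what documenting the public entry point might look like, the sketch below pairs KDoc with a simplified version of the underlying calculation. It is written under several assumptions: the function names are hypothetical, the zero-based index 21 is taken from the `FIELD_POS_STARTTIME` constant in the diff, the clock tick rate is passed in as a parameter instead of being read from the OS, and the parsing skips past the parenthesised command name because that field may contain spaces.

```kotlin
val FIELD_POS_STARTTIME = 21 // zero-based index of starttime in /proc/<pid>/stat
val NANOS_PER_SECOND = 1_000_000_000.0

/**
 * Extracts the process start time, in clock ticks, from the text of /proc/<pid>/stat.
 *
 * @throws IllegalArgumentException if the text has no parenthesised command name.
 */
fun parseStartTimeTicks(statText: String): Long {
    val commEnd = statText.lastIndexOf(')')
    require(commEnd >= 0) { "unexpected stat format" }
    // The command name (second field) may contain spaces, so split only what follows it.
    val fieldsAfterComm = statText.substring(commEnd + 1).trim().split(' ')
    // fieldsAfterComm[0] is the third field, i.e. zero-based index 2 of the full line.
    return fieldsAfterComm[FIELD_POS_STARTTIME - 2].toLong()
}

/**
 * Converts a duration in clock ticks to nanoseconds.
 *
 * @param clockTicksPerSecond the system tick rate; supplied by the caller in this sketch.
 */
fun ticksToNanos(ticks: Long, clockTicksPerSecond: Long): Long =
    (ticks * (NANOS_PER_SECOND / clockTicksPerSecond)).toLong()
```

Passing the tick rate as a parameter keeps the sketch free of platform calls and makes the conversion easy to unit test.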

@mcomella

@boek Please data review.

@mcarare mcarare assigned mcarare and boek and unassigned mcarare May 25, 2020
@codecov-commenter

Codecov Report

Merging #10481 into master will decrease coverage by 0.00%.
The diff coverage is 50.00%.

@@             Coverage Diff              @@
##             master   #10481      +/-   ##
============================================
- Coverage     19.63%   19.63%   -0.01%     
  Complexity      631      631              
============================================
  Files           363      363              
  Lines         14950    14953       +3     
  Branches       2017     2017              
============================================
  Hits           2936     2936              
- Misses        11737    11740       +3     
  Partials        277      277              
| Impacted Files | Coverage Δ | Complexity Δ |
| --- | --- | --- |
| app/src/main/java/org/mozilla/fenix/perf/Stat.kt | 87.50% <ø> (ø) | 5.00 <0.00> (ø) |
| ...lla/fenix/perf/StartupFrameworkStartMeasurement.kt | 90.00% <50.00%> (-10.00%) | 9.00 <0.00> (ø) |
| ...ix/home/sessioncontrol/SessionControlController.kt | 71.42% <0.00%> (-0.90%) | 0.00% <0.00%> (ø%) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7c7d860...1a68109.

@mcomella mcomella requested a review from liuche May 30, 2020 00:46
@mcomella

@liuche Please data review.

@liuche liuche left a comment

Data Review Form (to be filled by Data Stewards)

  1. Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?

Yes, documented usage in startup-timeline ping in metrics.md

  2. Is there a control mechanism that allows the user to turn the data collection on and off?

Yes, custom pings are controlled by the Fenix data settings

  3. If the request is for permanent data collection, is there someone who will monitor the data over time?

Expires in 7/2020

  4. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Type 1, measuring error frequency when blocked reading from /proc

  5. Is the data collection request for default-on or default-off?

default on

  6. Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?

No

  7. Is the data collection covered by the existing Firefox privacy notice?

yes

  8. Does there need to be a check-in in the future to determine whether to renew the data? (Yes/No) (If yes, set a todo reminder or file a bug if appropriate)

No, has expiry

  9. Does the data collection use a third-party collection tool? If yes, escalate to legal.

No

@liuche liuche merged commit 2090b11 into mozilla-mobile:master Jun 3, 2020
@mcarare mcarare deleted the 10434 branch June 3, 2020 10:49