New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add content histograms to longitudinal view #155
Conversation
Autolander could not find a bug number in your pull request title. All pull requests should be in the format of: Bug [number] - [description]. |
@vitillo, Please take a look |
Current coverage is 20.13% (diff: 93.75%)@@ master #155 diff @@
==========================================
Files 25 25
Lines 2323 2329 +6
Methods 2161 2163 +2
Messages 0 0
Branches 106 112 +6
==========================================
+ Hits 456 469 +13
+ Misses 1867 1860 -7
Partials 0 0
|
Reviewed 3 of 3 files at r1. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file):
I am not sure I understand why we would need to add (manually?) new keys/suffixes to the dataset schema if we want to automatically scan for new processes under Ideally we don't want to edit this job every time a new process type is added so it would be great if we can automate this part. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 32 at r1 (raw file):
s/HistFormat/HistogramFormat Comments from Reviewable |
Review status: all files reviewed at latest revision, 2 unresolved discussions. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file): Previously, vitillo (Roberto Agostino Vitillo) wrote…
Every histogram name needs to be registered in the schema generated by Comments from Reviewable |
Review status: all files reviewed at latest revision, 2 unresolved discussions. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file): Previously, harterrt (Ryan Harter) wrote…
@Dexterp37, is there a static list of valid processes somewhere? Comments from Reviewable |
Review status: all files reviewed at latest revision, 3 unresolved discussions. src/test/scala/com/mozilla/telemetry/LongitudinalTest.scala, line 186 at r1 (raw file):
@vitillo, if I understand your comment in Bug 1325653 [0], this field will not be available in the Heka message, correct? I'll have to refactor this change to read directly from the payload object. Please confirm. [0] https://bugzilla.mozilla.org/show_bug.cgi?id=1325653#c1 Comments from Reviewable |
Review status: all files reviewed at latest revision, 3 unresolved discussions. src/test/scala/com/mozilla/telemetry/LongitudinalTest.scala, line 186 at r1 (raw file): Previously, harterrt (Ryan Harter) wrote…
That was a more general comment about why things are what they are. I assumed you had already checked where exactly those fields sit in the payload; if not you can easily do that from the Hindsight UI. Comments from Reviewable |
Review status: all files reviewed at latest revision, 3 unresolved discussions. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file): Previously, vitillo (Roberto Agostino Vitillo) wrote…
The list of Telemetry supported processes is only available here: the ping section names are built from that by removing the "#" character. I might refactor this a bit as part of bug 1278556 that is supposed to land soon-ish. Comments from Reviewable |
Review status: all files reviewed at latest revision, 3 unresolved discussions. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file): Previously, Dexterp37 (Alessio Placitelli) wrote…
Hm, that's not very parser friendly though. What about storing the valid process types in a yaml or json file somewhere? src/test/scala/com/mozilla/telemetry/LongitudinalTest.scala, line 186 at r1 (raw file): Previously, vitillo (Roberto Agostino Vitillo) wrote…
See also the corresponding code that generates the data stored on S3. Comments from Reviewable |
Review status: all files reviewed at latest revision, 3 unresolved discussions. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file): Previously, vitillo (Roberto Agostino Vitillo) wrote…
I just filed bug 1328230 for that. Comments from Reviewable |
f750165
to
a733ec4
Compare
Review status: 0 of 3 files reviewed at latest revision, 3 unresolved discussions. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file): Previously, Dexterp37 (Alessio Placitelli) wrote…
@vitillo, can we continue with this PR and refactor once Bug 1328230 is complete? src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 32 at r1 (raw file): Previously, vitillo (Roberto Agostino Vitillo) wrote…
Done. src/test/scala/com/mozilla/telemetry/LongitudinalTest.scala, line 186 at r1 (raw file): Previously, vitillo (Roberto Agostino Vitillo) wrote…
Great. I refactored this change to read content histograms directly from the json payload. I ran the revised code over data from December 2016 and the results look good. The resulting data is in s3://telemetry-test-bucket/longitudinal/v20170103/. Comments from Reviewable |
Review status: 0 of 3 files reviewed at latest revision, 2 unresolved discussions. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file): Previously, harterrt (Ryan Harter) wrote…
@Dexterp37, do you have a rough idea when that Bug will be tackled and how long it might take? Comments from Reviewable |
Review status: 0 of 3 files reviewed at latest revision, 2 unresolved discussions. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file): Previously, vitillo (Roberto Agostino Vitillo) wrote…
@harterrt in the meantime you can assume that you are given a list with process names, e.g. List("content", "gpu"), instead of histogram locations. That will make it easier to adapt the code later on. Comments from Reviewable |
Review status: 0 of 3 files reviewed at latest revision, 2 unresolved discussions. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file): Previously, vitillo (Roberto Agostino Vitillo) wrote…
We're going to tackle that as soon as bug 1278556 lands, so probably in a week or so. Comments from Reviewable |
59198fb
to
684d33e
Compare
Review status: 0 of 3 files reviewed at latest revision, 2 unresolved discussions. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file): Previously, Dexterp37 (Alessio Placitelli) wrote…
@vitillo done. PTAL. I'm happy to refactor once bug 1328230 is closed, but I'd like to unblock this PR. src/test/scala/com/mozilla/telemetry/LongitudinalTest.scala, line 186 at r1 (raw file): Previously, harterrt (Ryan Harter) wrote…
Done. Comments from Reviewable |
Reviewed 1 of 3 files at r2, 2 of 2 files at r4. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file): Previously, harterrt (Ryan Harter) wrote…
OK, please open a Bug for the refactor. src/main/scala/com/mozilla/telemetry/views/Longitudinal.scala, line 805 at r4 (raw file):
Doesn't parse return a Comments from Reviewable |
Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed. src/main/scala/com/mozilla/telemetry/histograms/Histogram.scala, line 22 at r1 (raw file): Previously, vitillo (Roberto Agostino Vitillo) wrote…
Done: src/main/scala/com/mozilla/telemetry/views/Longitudinal.scala, line 805 at r4 (raw file): Previously, vitillo (Roberto Agostino Vitillo) wrote…
Done. Comments from Reviewable |
e7d036d
to
9d032d3
Compare
This change also makes it easy to add new histogram endpoints in the future. Just add a path and suffix to histogramLocations or keyedHistogramLocations. This whitelist updates both the parsing code and the dataset schema. I prototyped a version of this code which would remove the whitelist by automatically scanning for new keys under "payload.processes". However, we have to add these new keys/suffixes to the dataset schema.
9d032d3
to
b4e1ffb
Compare
Rebased to clear the merge conflict. Review status: 1 of 3 files reviewed at latest revision, 2 unresolved discussions. Comments from Reviewable |
@vitillo RFAL Review status: 1 of 3 files reviewed at latest revision, 2 unresolved discussions. Comments from Reviewable |
Reviewed 2 of 2 files at r5. Comments from Reviewable |
This change also makes it easy to add new histogram endpoints in the
future. Just add a path and suffix to histogramLocations or
keyedHistogramLocations. This whitelist updates both the parsing code
and the dataset schema.
I prototyped a version of this code which would remove the whitelist by
automatically scanning for new keys under "payload.processes". However,
we have to add these new keys/suffixes to the dataset schema.
This change is