Add case classes representing crash and main pings. #41
Codecov Report
@@            Coverage Diff            @@
##           master      #41     +/-   ##
=========================================
+ Coverage      60%   68.39%   +8.39%
=========================================
  Files           2        3       +1
  Lines         170      231      +61
  Branches        7       12       +5
=========================================
+ Hits          102      158      +56
- Misses         68       73       +5
Per bug 1334457: What about moving these case classes to moztelemetry, where we can import them into t-b-v, this repository, or (eventually) other projects? We could then start to move other case classes there as well.
    import org.json4s._
    import org.json4s.jackson.JsonMethods.parse

    package object pings {
Let's move this to the moztelemetry library as suggested by Frank.
While I like the principle, I think this would be a pain to handle in practice for two reasons:
1. Changes to a central ping declaration (think of a new non-optional field) will require changes in every place where that ping is used, across different repositories.
2. Different projects may need different parts of a ping. For example, in this patch I kept the definition of the pings intentionally loose. This made it easier to create test fixtures, and it likely keeps the memory footprint to a minimum (I'm not too sure about this, though).
That said, this is the second time I have spent time defining the structure of a crash ping as a case class, so I can see the benefits of moving this to moztelemetry.
@vitillo @fbertsch final thoughts?
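To make the "intentionally loose" point concrete, here is a minimal sketch. The Build and Meta shapes and the None defaults are hypothetical stand-ins for illustration, not the definitions in this patch.

```scala
object LooseDefinitionSketch {
  // Hypothetical, trimmed ping pieces: anything a consumer may not need is optional.
  final case class Build(architecture: Option[String] = None, version: Option[String] = None)
  final case class Meta(geoCountry: String, build: Option[Build] = None)

  // A test fixture only spells out the fields the test cares about.
  val fixture: Meta = Meta(geoCountry = "IT")

  // Optional sections are traversed with flatMap instead of null checks.
  def architecture(meta: Meta): Option[String] = meta.build.flatMap(_.architecture)

  def main(args: Array[String]): Unit = {
    println(architecture(fixture))
    println(architecture(Meta("IT", Some(Build(architecture = Some("x86-64"))))))
  }
}
```

The trade-off raised above still applies: a shared, stricter definition would force every consumer to supply (or at least parse) fields it never reads.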
    private def parseCrashPing(ping: CrashPing): Tuple1[Row] = {
      // Non-main crashes are already retrieved from main pings
      if(!ping.isMain()) return Tuple1(null)
How can a crash ping be a main ping?
I should rename that method to isMainCrash; the current name is a bit confusing with MainPing around.
      dimensions.build
    }

    private def parseCrashPing(ping: CrashPing): Tuple1[Row] = {
Can we move this functionality within the CrashPing class itself?
This function is very specific to ErrorAggregator, the value it returns is coupled to the output schema.
Right, but you could create a subclass or use the "pimp my library" pattern.
I swear I was going to propose that. I love to pimp my classes 😎
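For reference, a minimal sketch of the enrichment pattern suggested above. The CrashPing fields and the dimensions method are hypothetical stand-ins; the real parsing code builds a Row tied to the ErrorAggregator output schema.

```scala
object EnrichmentSketch {
  // Hypothetical, heavily trimmed stand-in for the generic ping definition.
  final case class CrashPing(processType: Option[String], country: String)

  // Aggregator-specific behaviour lives in an implicit wrapper, so the
  // generic case class stays free of any output-schema knowledge.
  implicit class ErrorAggregatorCrashPing(ping: CrashPing) {
    // Hypothetical stand-in for the real Row-building logic.
    def dimensions: Map[String, Option[String]] =
      Map("country" -> Some(ping.country), "process_type" -> ping.processType)
  }

  def main(args: Array[String]): Unit = {
    val ping = CrashPing(Some("main"), "IT")
    // The compiler wraps `ping` implicitly, so `dimensions` reads like a method on CrashPing.
    println(ping.dimensions)
  }
}
```

Only code that imports the implicit class sees the extra method, which keeps the coupling to the output schema local to the job.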
      Tuple1(RowBuilder.merge(dimensions, stats.build))
    }

    private def parseMainPing(ping: MainPing): Tuple1[Row] = {
Can we move this functionality within the MainPing class itself?
a8ab146 to 1eb966a
RFOL
    // Environment is omitted it's partially available under meta
    meta: Meta
    ){

Please remove the newline. We should add a linter to Travis to enforce a uniform style.
Activated scalastyle and fixed the problem
    meta: Meta
    ){

      def isMainCrash(): Boolean = {
Please add another method for content process crashes.
I'm not sure it's needed because we are only taking main crashes coming from crash pings. Everything else is filtered out to avoid double counting.
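A minimal sketch of the filtering described above, to show why a second predicate is not needed for counting. The processType field name and the "main" value are assumptions made for illustration, as is treating a missing process type as non-main.

```scala
object DoubleCountingSketch {
  // Hypothetical, trimmed ping shape; `processType` is an assumed field name.
  final case class CrashPing(processType: Option[String]) {
    // Assumption: only an explicit "main" process type counts as a main crash.
    def isMainCrash: Boolean = processType.contains("main")
  }

  // Only main crashes are counted from crash pings; other crash types are
  // skipped here because they are already counted from main pings.
  def countedCrashes(pings: Seq[CrashPing]): Int = pings.count(_.isMainCrash)

  def main(args: Array[String]): Unit = {
    val pings = Seq(CrashPing(Some("main")), CrashPing(Some("content")), CrashPing(None))
    println(countedCrashes(pings))
  }
}
```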
      }

      def timestamp(): Timestamp = {
        new Timestamp(this.meta.Timestamp / 1000000)
Please add a comment specifying the timestamp resolution.
I moved this method to pings.Meta and added a comment
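The division by 1000000 suggests that the raw value is in nanoseconds and that the result is the millisecond value java.sql.Timestamp expects. A minimal sketch under that assumption follows; Meta is trimmed to a single field and normalizedTimestamp is a hypothetical method name.

```scala
object TimestampSketch {
  // Trimmed stand-in for pings.Meta; the field name mirrors the snippet under review.
  final case class Meta(Timestamp: Long) {
    // Assumption: the raw value is in nanoseconds since the Unix epoch, while
    // java.sql.Timestamp takes milliseconds, hence the division by 1,000,000.
    def normalizedTimestamp(): java.sql.Timestamp =
      new java.sql.Timestamp(this.Timestamp / 1000000L)
  }

  def main(args: Array[String]): Unit = {
    val meta = Meta(1500000000000000000L)
    println(meta.normalizedTimestamp().getTime)
  }
}
```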
    // Environment omitted because it's mostly available under meta
    meta: Meta
    ){
      def getCountHistogramValue(histogram_name: String): Int = {
Please add an assert to verify that the desired histogram is in fact a count histogram. The same comment applies elsewhere in this class.
          case JInt(count) => count.toInt
          case _ => 0
        }
      } catch { case _: Throwable => 0 }
A missing histogram should not generate a value of 0 but something equivalent to "NA" (e.g. null), as the semantics are quite different. The same comment applies elsewhere in this class.
I changed these methods to return an Option
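A minimal sketch of the difference between "missing" and "zero" once the accessor returns an Option. The nested Map is a simplified stand-in for the json4s values the real class reads, and looking the count up under bucket "0" is an assumption about how count histograms are laid out.

```scala
object HistogramSketch {
  // Simplified stand-in: histogram name -> bucket label -> count.
  type Histograms = Map[String, Map[String, Int]]

  // None means "histogram not reported"; Some(0) means "reported, and the count is zero".
  // Assumption: a count histogram keeps its value in bucket "0".
  def getCountHistogramValue(histograms: Histograms, name: String): Option[Int] =
    histograms.get(name).flatMap(_.get("0"))

  def main(args: Array[String]): Unit = {
    val histograms: Histograms = Map("SOME_COUNT" -> Map("0" -> 0))
    println(getCountHistogramValue(histograms, "SOME_COUNT"))
    println(getCountHistogramValue(histograms, "MISSING_COUNT"))
  }
}
```

Callers that still want a number can decide explicitly with getOrElse, instead of having the default baked into the accessor.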
    case class OS(
      name: Option[String],
      version: Option[String])
The formatting of case classes is inconsistent. E.g. sometimes the closing parenthesis is right next to the last field while other times it isn't. Please use a linting tool.
I reformatted these classes following the Scala style guidelines. I also activated scalastyle, but unfortunately it doesn't provide any linting for this kind of problem.
      name: Option[String],
      version: Option[String])

    def messageToCrashPing(message: Message): CrashPing = {
Can we make this method a constructor?
      ping.extract[CrashPing]
    }

    def messageToMainPing(message: Message): MainPing = {
Can we make this method a constructor?
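One way to read the "constructor" request for messageToCrashPing and messageToMainPing is an extra apply on the companion object. A minimal sketch follows; Message here is a hypothetical flat bag of string fields rather than the real Heka message type, and the docType and geoCountry keys are illustrative.

```scala
object ConstructorSketch {
  // Hypothetical stand-in for a Heka message: a flat bag of string fields.
  final case class Message(fields: Map[String, String])

  final case class CrashPing(docType: String, country: Option[String])

  object CrashPing {
    // An extra `apply` overload makes `CrashPing(message)` read like a constructor
    // while the case class itself keeps its plain field-based constructor.
    def apply(message: Message): CrashPing =
      CrashPing(message.fields.getOrElse("docType", "unknown"), message.fields.get("geoCountry"))
  }

  def main(args: Array[String]): Unit = {
    println(CrashPing(Message(Map("docType" -> "crash", "geoCountry" -> "IT"))))
  }
}
```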
    dimensions("os_version") = meta.`environment.system`.map(_.os.version)
    dimensions("architecture") = meta.`environment.build`.flatMap(_.architecture)
    dimensions("country") = Some(meta.geoCountry)
    dimensions("experiment_id") = for {
This works for old-style telemetry experiments, but Quantum experiments are going to use Shield, which supports multiple experiments. @sunahsuh can comment on the right fields to use.
Talked to @maurodoglio offline, but for posterity, the docs for the new experiment block are here: https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/data/environment.html#experiments
      dimensions.build
    }

    class ParsableCrashPing(ping: CrashPing) {
I don't think this is a good name for this class.
Changed to ErrorAggregatorCrashPing. I couldn't think of anything shorter and meaningful.
3fa6704 to e4e92e2
I think I addressed all the issues. @vitillo this is RFAL!
vitillo left a comment
LGTM. Please verify that the aggregates match the ones we are currently generating.
e4e92e2 to f81ab09
Apparently this version of the job is not producing any data, neither in streaming nor in batch mode. I'm debugging it.
|
I'll |
This also increases the test coverage by adding some tests for the MainPing methods.
This gives us the ability to eventually return more than one Row per ping.
8313651 to 82d28ba
This PR introduces a few case classes representing crash and main pings.
These classes are then used to extract ping data out of HekaMessages.
Fixes #39