Skip to content
This repository was archived by the owner on Dec 6, 2019. It is now read-only.

Add case classes representing crash and main pings.#41

Merged
maurodoglio merged 17 commits intomasterfrom
use-case-classes
Jun 6, 2017
Merged

Add case classes representing crash and main pings.#41
maurodoglio merged 17 commits intomasterfrom
use-case-classes

Conversation

@maurodoglio
Copy link
Copy Markdown
Contributor

@maurodoglio maurodoglio commented May 11, 2017

This PR introduces a few case classes representing crash and main pings.
This classes are then used to extract ping data out of HekaMessages
Fixes #39

@maurodoglio maurodoglio changed the title This PR introduces a few case classes representing crash and main pings. This classes are then used to extract ping data out of HekaMessages This PR introduces a few case classes representing crash and main pings. May 11, 2017
@maurodoglio maurodoglio changed the title This PR introduces a few case classes representing crash and main pings. Add case classes representing crash and main pings. May 11, 2017
@maurodoglio maurodoglio requested a review from vitillo May 11, 2017 16:39
@codecov-io
Copy link
Copy Markdown

codecov-io commented May 11, 2017

Codecov Report

Merging #41 into master will increase coverage by 8.39%.
The diff coverage is 92.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #41      +/-   ##
==========================================
+ Coverage      60%   68.39%   +8.39%     
==========================================
  Files           2        3       +1     
  Lines         170      231      +61     
  Branches        7       12       +5     
==========================================
+ Hits          102      158      +56     
- Misses         68       73       +5
Impacted Files Coverage Δ
...ala/com/mozilla/telemetry/timeseries/package.scala 88.46% <100%> (+0.96%) ⬆️
...in/scala/com/mozilla/telemetry/pings/package.scala 85% <85%> (ø)
.../mozilla/telemetry/streaming/ErrorAggregator.scala 61.21% <97.01%> (+5.73%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d7fc30e...82d28ba. Read the comment docs.

@fbertsch
Copy link
Copy Markdown
Contributor

Per bug 1334457: What about moving these case classes to moztelemetry, where we can import them into either t-b-v, here, or (eventually) other projects? Could then start to move other case classes there as well.

import org.json4s._
import org.json4s.jackson.JsonMethods.parse

package object pings {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move this to the moztelemetry library as suggested by Frank.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I like the principle, I think this would be a pain to handle in practice for 2 reasons:
1- Changes to a central ping declaration (think of a new non-optional field) will require changes in every place where that ping is used, across different repositories.
2- Different projects may need different parts of a ping. For example in this patch I kept the definition of the pings intentionally loose. This made it easier to create test fixtures and it likely keeps the memory footprint at a minimum (not too sure about this though).
That said, this is the second time that I spend time defining the structure of a crash ping as a case class, I can see the benefits of moving this to moztelemetry.
@vitillo @fbertsch final thoughts?


private def parseCrashPing(ping: CrashPing): Tuple1[Row] = {
// Non-main crashes are already retrieved from main pings
if(!ping.isMain()) return Tuple1(null)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can a crash ping be a main ping?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should rename that method to isMainCrash, it's a bit confusing with MainPing around

dimensions.build
}

private def parseCrashPing(ping: CrashPing): Tuple1[Row] = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this functionality within the CrashPing class itself?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is very specific to ErrorAggregator, the value it returns is coupled to the output schema.

Copy link
Copy Markdown
Contributor

@vitillo vitillo May 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but you could create a subclass or use the "pimp my library" pattern.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I swear I was going to propose that. I love to pimp my classes 😎

Tuple1(RowBuilder.merge(dimensions, stats.build))
}

private def parseMainPing(ping: MainPing): Tuple1[Row] = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this functionality within the MainPing class itself?

@maurodoglio
Copy link
Copy Markdown
Contributor Author

RFOL

// Environment is omitted it's partially available under meta
meta: Meta
){

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the newline. We should add a linter to travis to have an uniform style.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Activated scalastyle and fixed the problem

meta: Meta
){

def isMainCrash(): Boolean = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add another method for content process crashes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's needed because we are only taking main crashes coming from crash pings. Everything else is filtered out to avoid double counting.

}

def timestamp(): Timestamp = {
new Timestamp(this.meta.Timestamp / 1000000)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment specifying the timestamp resolution.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this method to pings.Meta and added a comment

// Environment omitted because it's mostly available under meta
meta: Meta
){
def getCountHistogramValue(histogram_name: String): Int = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an assert to verify that the desired histogram is in fact a count histogram. The same comment applies elsewhere in this class.

case JInt(count) => count.toInt
case _ => 0
}
} catch { case _: Throwable => 0 }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A missing histogram should not generate a value of 0 but something equivalent to "NA" (e.g. null), as the semantics is quite different. The same comment applies elsewhere in this class.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed these methods to return an Option


case class OS(
name: Option[String],
version: Option[String])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The formatting of case classes is inconsistent. E.g. sometimes the closing parentheses is right next to the last field while other times it isn't. Please use a linting tool.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reformatted these classes following the scala style guidelines. I also activated scalastyle, but unfortunately it doesn't provide any linting for this kind of problems.

name: Option[String],
version: Option[String])

def messageToCrashPing(message: Message): CrashPing = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this method a constructor?

ping.extract[CrashPing]
}

def messageToMainPing(message: Message): MainPing = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this method a constructor?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

dimensions("os_version") = meta.`environment.system`.map(_.os.version)
dimensions("architecture") = meta.`environment.build`.flatMap(_.architecture)
dimensions("country") = Some(meta.geoCountry)
dimensions("experiment_id") = for {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works for old style telemetry experiments but Quantum experiments are going to use shield, which supports multiple experiments. @sunahsuh can comment on the right fields to use.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked to @maurodoglio offline but for posterity, the docs for the new experiment block is here: https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/data/environment.html#experiments

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

dimensions.build
}

class ParsableCrashPing(ping: CrashPing) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a good name for this class.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to ErrorAggregatorCrashPing. I couldn't think of anything shorter and meaningful.

@maurodoglio
Copy link
Copy Markdown
Contributor Author

I think I addressed all the issues. @vitillo this is RFAL!

Copy link
Copy Markdown
Contributor

@vitillo vitillo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please verify that the aggregates match the ones we are currently generating.

@maurodoglio
Copy link
Copy Markdown
Contributor Author

Apparently this version of the job is not producing any data, neither in streaming nor in batch mode. I'm debugging it.

@maurodoglio
Copy link
Copy Markdown
Contributor Author

I'll rebase --autosquash before merging

@maurodoglio maurodoglio self-assigned this Jun 6, 2017
@maurodoglio maurodoglio merged commit ba394a7 into master Jun 6, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants