Skip to content

Bug 1293776 initial pass on StatsdMetrics#3439

Merged
willkg merged 4 commits into
mozilla-services:masterfrom
willkg:1293776-crash-size
Aug 24, 2016
Merged

Bug 1293776 initial pass on StatsdMetrics#3439
willkg merged 4 commits into
mozilla-services:masterfrom
willkg:1293776-crash-size

Conversation

@willkg
Copy link
Copy Markdown
Collaborator

@willkg willkg commented Aug 22, 2016

This creates a new kind of external component called Metrics. It gets used in the breakpad collector after saving the crash to log data about the incoming crash at the point we've collected it.

r?

This creates a new kind of external component called Metrics. It gets
used in the breakpad collector after saving the crash to log data
about the incoming crash at the point we've collected it.
@willkg
Copy link
Copy Markdown
Collaborator Author

willkg commented Aug 22, 2016

This can land as is. After it lands, we'll want to make a configuration change to switch to the StatsdMetrics class so that it can log things to datadog.

@willkg
Copy link
Copy Markdown
Collaborator Author

willkg commented Aug 22, 2016

Mmm... It occurred to me that the placement of the metrics capture prevents us from capturing sizes for rejected crash reports. We probably want that, too. I'll fix it now.

@willkg
Copy link
Copy Markdown
Collaborator Author

willkg commented Aug 22, 2016

I'll have to track down that test bug, too. I've never seen that test before, so I'll have to go figure out how it gets set up.

willkg added 2 commits August 22, 2016 16:44
This adjusts the crash size report metrics such that they also capture
data about rejected crash reports. Further, it tweaks the data key so
it's easier to intuit what it is in datadog and other similar systems.
I think this fixes the integration test that Travis runs using the
Collector2015App that we don't use. That app re-arranges where
configuration is located in the collector app hierarchy of components,
so "config.metrics" gets handled differently.
@willkg
Copy link
Copy Markdown
Collaborator Author

willkg commented Aug 23, 2016

^^^ Travis is happy and I fixed the crash size metrics placement. So this is good to go.

To review:

  1. Run the tests if you like
  2. Read through the code and my pr comments

@peterbe Can you look at this tomorrow (Tuesday)? I'd love to land this Wednesday at the latest so we can collect data for the work week.

'metrics kicked up exception: %s',
str(exc),
exc_info=True
)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure this section is safe. Does it look safe to you? If it goes sideways, does it tell us enough information that we can do something useful?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might as well still putting the exc into the string. The exc_info=True will take care of that much better.
Also, on the paranoid side of things, what if the exc object doesn't respond well to being str() called? E.g. some unicode object in there that's trouble.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was surprised our logger doesn't have a .exception() method. I thought about looking into what logger actually is here, but then never got around to it.

Anyhow, these are good points. I'll nix the str(exc) bit.

size_key: crash_report_size
}
self.metrics.capture_stats(metrics_data)
except Exception as exc:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps only the self.metrics.capture_stats(metrics_data) line should be in the try:except:.
Or perhaps the self._get_content_length() should be included too?

However, we might be best to put this into stage for a couple of days/weeks and see if we ever get any exceptions on any of the lines other than self.metrics.capture_stats(metrics_data).

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thoughts here. I'll move it all into the try/except block.

@peterbe
Copy link
Copy Markdown
Contributor

peterbe commented Aug 23, 2016

There are some aspects of the whole functionality I don't understand but to understand it fully would be too big a time commitment and then you'd never have this into stage :)

r+

@peterbe
Copy link
Copy Markdown
Contributor

peterbe commented Aug 23, 2016

Not merging yet because of the question mark on the doc string capture_data.

* fix docstring
* move the rest of the metrics code into the try/except block
* reduce the work the except block does so as to reduce the likelihood of errors
  in the error handling
@willkg
Copy link
Copy Markdown
Collaborator Author

willkg commented Aug 24, 2016

-stage is free again, so I'm going to merge. Thank you for reviewing this!

@willkg willkg merged commit 27f5957 into mozilla-services:master Aug 24, 2016
@willkg willkg deleted the 1293776-crash-size branch October 25, 2016 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants