[Ready to land] Detecting suspicious/explosive crashes #1394

Merged
merged 12 commits into from Aug 27, 2013

Conversation

@shuhaowu
Contributor

@shuhaowu shuhaowu commented Aug 9, 2013

Still a work in progress, but this should convey the idea for the cron job. Feedback would be nice.

Right now the model is fairly "dumb". Smarter models could be developed but most of them are based on the existing math (and code framework) and should be fairly easy to add in.
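For readers who want a concrete picture of what a "dumb" model can look like, here is a minimal sketch of a simple spike check. It is not the code in this PR: the function name `is_explosive`, the `threshold` and `min_history` parameters, and the choice of a z-score against the preceding days are all assumptions made for illustration.

```python
from statistics import mean, stdev


def is_explosive(daily_counts, threshold=3.0, min_history=5):
    """Return True if the most recent daily count is far above the earlier ones.

    Hypothetical helper: compares the last value against the mean and sample
    standard deviation of all preceding values.
    """
    # Need enough earlier days to estimate a baseline at all.
    if len(daily_counts) < min_history + 1:
        return False

    history, latest = daily_counts[:-1], daily_counts[-1]
    baseline = mean(history)
    spread = stdev(history)

    # A perfectly flat history has no spread; treat any increase as a spike.
    if spread == 0:
        return latest > baseline

    # Flag the signature when the latest count is more than `threshold`
    # standard deviations above the baseline.
    return (latest - baseline) / spread > threshold


if __name__ == "__main__":
    print(is_explosive([10, 12, 11, 9, 10, 11, 80]))
    print(is_explosive([10, 12, 11, 9, 10, 11, 12]))
```

A smarter model would presumably replace only the baseline and spread estimates, which is why swapping models in later should be cheap.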

Todos:

  • Testing? What is the policy on this?
  • DB migration: created a new table. Haven't done this yet
  • Further model creation (ARIMA model next. Don't have parameters yet) [DROPPED FOR NOW]
  • Remove some debugging code: I left some in, and I'll take it out as I get closer to a final version.
  • Middleware
  • Front end code
  • Time series graph
  • Legal clarifications for the code
  • Middleware unittests
  • Webapp unittests
  • Middleware documentation

Note about code review: There is a blob of math somewhere in this PR (not sure what the final structure will look like). There are some comments describing it, with links to the original implementations/papers. As (if) I develop more complicated models, more math functions (blobs) will be added into the mix. While I don't have theoretical verification that I implemented all of this correctly, I did verify it experimentally to the best of my abilities (results are in docstrings/comments). If someone wants to mathematically prove it and tell me I'm wrong, please do so :).

Almost ready for merge. There are some optimizations that still need to be done, though those might be best tracked as Bugzilla bugs.

R?

@shuhaowu
Contributor Author

@shuhaowu shuhaowu commented Aug 20, 2013

One thing: do I need to test the view that renders the HTML page for the report?

@shuhaowu
Contributor Author

@shuhaowu shuhaowu commented Aug 20, 2013

Some stuff that we can refine (possibly at a later time), in no particular order.

  • Switch to using the reports_clean table to speed up the graph lookup, or cache the time series during the cron job.
  • Allow switching of date range in the web UI (backend is ready for this).
  • Allow filtering of products/versions on the web UI.
  • Compute explosiveness for particular products/versions (not sure how easy this would be).

These, with the possible exception of the first one, should probably be Bugzilla bugs.

df: Degrees of freedom.

Notes:
It is not exactly know if there is any problems with this
Contributor

@lonnen lonnen Aug 21, 2013

Did you follow up on this?

Contributor Author

@shuhaowu shuhaowu Aug 21, 2013

Yeah, everything about this has been resolved.

@lonnen
Contributor

@lonnen lonnen commented Aug 21, 2013

This PR has some meat on it. Most of the time we'd land this in stages: cron + database, middleware service, then UI. I left a few things to address, but we should get other sets of eyes on this as well.

It would be nice if @selenamarie could check the migration and give some advice about reports vs. reports_clean.
I'd also like to see someone take a second look over the django work. You may need to pester individuals directly in IRC to get their attention (or find them in person, since we're all in MV this week).

op.create_table(u'suspicious_crash_signatures',
sa.Column(u'id', sa.INTEGER()),
sa.Column(u'signature', sa.VARCHAR(255)),
sa.Column(u'date', sa.TIMESTAMP(timezone=True))
Contributor

@selenamarie selenamarie Aug 21, 2013

Let's call this report_date, because date is a reserved word and would require double quotes whenever it is used.


params = external_common.parse_arguments(filters, kwargs)

if not params.signature or not params.start_date:
Contributor

@peterbe peterbe Aug 26, 2013

Break this up into two if statements.
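A minimal sketch of what splitting the check could look like, so that each missing parameter produces its own error. The `MissingArgumentError` class and `check_required` function below are hypothetical stand-ins written for this example, not the project's actual error type or service code, and `SimpleNamespace` stands in for whatever `parse_arguments` returns.

```python
from types import SimpleNamespace


class MissingArgumentError(Exception):
    """Hypothetical error type that names the parameter that was not supplied."""

    def __init__(self, name):
        super().__init__(f"Mandatory parameter '{name}' is missing or empty")
        self.name = name


def check_required(params):
    """Validate required parameters one at a time (illustrative helper)."""
    # One check per parameter, so the error says exactly which one is missing.
    if not params.signature:
        raise MissingArgumentError("signature")
    if not params.start_date:
        raise MissingArgumentError("start_date")
    return params


if __name__ == "__main__":
    params = SimpleNamespace(signature="some_signature", start_date=None)
    try:
        check_required(params)
    except MissingArgumentError as exc:
        print(exc)
```

With a single combined condition the caller only learns that something was missing; with two statements the message can point at the specific argument.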

@peterbe
Contributor

@peterbe peterbe commented Aug 27, 2013

I noticed you got two test failures.

One is weird: the one about the private tempdir. I wouldn't be surprised if that just works the next time you try.

However, the other test you have there is failing badly. I'm sure you can figure it out.

Once you figure out the test and Jenkins passes: r+

@shuhaowu
Contributor Author

@shuhaowu shuhaowu commented Aug 27, 2013

I couldn't replicate either test failure locally. I changed the date casting, and it is running on Jenkins again; it seems to be working...

shuhaowu added 12 commits Aug 27, 2013
Not yet complete. Not really tested with real data.

Fixed some typos

Fixed small bug in buckify code
Corrected some simple mistakes.

Also added some integration tests.
Boom!

Went back to Evan's implementation.

Added docs, middleware tests

Rebased and got everything hooked up to work.

Fixed views and directory layout.

Nothing like coding at 11:00PM with Rockband in front of you.
Fixed window and added message if no crashes.

Added last view test.

Changed cache to 18 hours as the view is slow

We also don't need to recompute within essentially 24 hours.
For mware and cron job only.

Changed getting counts from reports_clean

Adjusted based on feedback.

Changes based on feedback

Fixed according to comments

Fix jenkins error

Fixed static URL
Unexposed api

Also fixed a small bug.
Changed crashes to datetime.

Updated with space
Should be using yesterday's data, as that is the latest day for which we have a full
day of data, whereas we do not have a full day of data for today.

Fixed last comment that was forgotten about..

Turns out some files were not added

I was editing the generated copy as opposed to the working copy.
To be reverted later when we are confident that this works.
Fixed things that came up in comments.

Additional fixes according to comments.

Date casting change..
shuhaowu added a commit that referenced this issue Aug 27, 2013
Boom! 

Man. Had to wait so long to make that joke.
@shuhaowu shuhaowu merged commit e3305ab into mozilla-services:master Aug 27, 2013
1 check passed
@shuhaowu shuhaowu deleted the suspicious branch Aug 27, 2013
4 participants