Bug 1249359 store correlations in es #3251
@adngdb I know the test is failing but I can't figure out what the error means. It works locally. Does it make any sense to you?
--source.crashstorage_class=socorro.analysis.correlations.correlations_app.LocallyCachedBotoS3CrashStorage

in your call to `socorro correlations ...`
"""
Wow, that is awesome!
@adngdb See the extra new commit.
That meant the scroll scan had to be open for a crazy long time. Instead now, it does this:
Now the scroll scan can close as soon as all crash IDs have been extracted, and we can return to the FTS machinery.
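To illustrate the ordering described above, here is a minimal sketch, not the code in this PR. It assumes each scan hit is a dict with a hypothetical `crash_id` key, and `process_crash` stands in for whatever slow per-crash work follows. The point is only that the scan iterator is fully drained before any slow work starts.

```python
from typing import Callable, Iterable, List


def collect_crash_ids(scan_hits: Iterable[dict]) -> List[str]:
    # Keep only the ID from each hit so the scan is consumed as fast as possible.
    # The "crash_id" key is a hypothetical hit layout, not the real mapping.
    return [hit["crash_id"] for hit in scan_hits]


def run(scan_hits: Iterable[dict], process_crash: Callable[[str], object]) -> list:
    # The scan is exhausted (and can be closed) once this line returns.
    crash_ids = collect_crash_ids(scan_hits)
    # Only now does the slow per-crash work begin.
    return [process_crash(crash_id) for crash_id in crash_ids]
```

Holding a list of IDs costs some memory, but it means the long-lived server-side cursor is no longer tied to how long each crash takes to process.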
@adngdb I just pushed another little change.
I.e. it takes, on average, 3.7 SECONDS! to write one of these crashes into a
Instead, I just removed this fancy gzip stuff. Now it takes AMAZINGLY less time to write and read:
Also, since we always have ujson installed, I used that. Sure, it uses up more space.
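As a rough sketch of the uncompressed approach (not the actual implementation in this PR), a local cache could write each crash as a plain JSON file. The function names and the one-file-per-crash layout are assumptions for illustration; the stdlib `json` fallback is only there so the sketch runs without `ujson`.

```python
import json
import os

try:
    import ujson as fast_json  # assumed to be installed, per the comment above
except ImportError:
    fast_json = json  # stdlib fallback so the sketch runs anywhere


def cache_path(cache_dir, crash_id):
    # Hypothetical layout: one uncompressed .json file per crash ID.
    return os.path.join(cache_dir, crash_id + ".json")


def save_crash(cache_dir, crash_id, crash):
    # Plain text write, no gzip step.
    with open(cache_path(cache_dir, crash_id), "w") as f:
        f.write(fast_json.dumps(crash))


def load_crash(cache_dir, crash_id):
    # Returns None on a cache miss so the caller can fall back to the real source.
    path = cache_path(cache_dir, crash_id)
    if not os.path.isfile(path):
        return None
    with open(path) as f:
        return fast_json.loads(f.read())
```

The trade-off is the one mentioned above: files are larger on disk, in exchange for skipping compression on every write and decompression on every read.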
'date',
'key',
'signature',
'count',
I think `count` should not be in there. It could change when you reload the job, if new crashes came in in the meantime, for example. It can happen if someone triggers the processing of a crash that has been throttled. Or say we had a problem with processing, and we want to re-run correlations a bit later when all crashes have made it into the database.
You're right. I've changed it.
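A small sketch of why `count` is a poor identity field. This assumes the field list is used to derive a stable document identity, which the thread does not state outright; `IDENTITY_FIELDS` and `document_id` are hypothetical names, and the hashing scheme is only an illustration.

```python
import hashlib

# Hypothetical identity fields; note that "count" is deliberately left out.
IDENTITY_FIELDS = ("date", "key", "signature")


def document_id(summary):
    # Build the ID only from fields that stay the same across re-runs.
    raw = "|".join(str(summary[field]) for field in IDENTITY_FIELDS)
    return hashlib.md5(raw.encode("utf-8")).hexdigest()
```

With this shape, re-running the job after more crashes arrive yields the same ID with an updated `count`, so the existing document is overwritten rather than duplicated.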
It seems to have worked great locally! r+ when the
edfdf11 to e9fe8e6
The way to run this is like this:
That'll start 5 threads that jointly download 2000 processed crashes and then generate the summaries, which get inserted into a local ES.
Note: I have my ssh tunnel set up so that localhost:9222 is the stage ES. That's where I get a day's worth of UUIDs from.
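For readers unfamiliar with the pattern, "N threads jointly downloading a list of crashes" can be sketched as below. This is not how the Socorro app wires up its threads; it uses Python 3's `concurrent.futures`, and `fetch` is a hypothetical callable standing in for the real crash storage lookup.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(crash_ids, fetch, workers=5):
    # The pool hands crash IDs to whichever of the `workers` threads is free;
    # pool.map returns results in the same order as the input IDs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, crash_ids))
```

Threads suit this case because the work is dominated by network waits rather than CPU.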