Fixes bug 901977 - Store raw crash data into elasticsearch. #1647
adngdb wants to merge 17 commits into
Instead of making the raw crash a branch of the processed crash, could you consider making a two-branch tree?
    raw_and_processed = {
        'raw_crash': raw_crash,
        'processed_crash': processed_crash
    }
Or would that be too disruptive to all the data that's already in ES?
One of my current initiatives is to unify the fragmented processed_crash format. Currently, PG, ES, and HB/FS each store the processed crash in a slightly different form. PG/HB/FS are all lossy: the new redaction methods and saving the JSON form of the processed crash in PG are both about making them all store exactly the same data.
If you add the 'raw_crash' key to the processed crash, you're making the ES processed crash different from the others. When we eventually document the processed_crash schema, we'll have to make an exception for ES and point out the difference.
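A minimal sketch of the two-branch layout suggested above, assuming a hypothetical helper named `build_es_document` and made-up sample crash values (neither comes from the Socorro code base). It only illustrates that the processed crash stays untouched while the raw crash sits beside it as a sibling:

```python
import json


def build_es_document(raw_crash, processed_crash):
    """Hypothetical helper: wrap both crashes as sibling branches.

    The processed crash is stored unchanged, so its schema stays the
    same as in the other storage systems.
    """
    return {
        'raw_crash': raw_crash,
        'processed_crash': processed_crash,
    }


# Made-up sample data, for illustration only.
raw_crash = {'ProductName': 'Firefox', 'Version': '25.0'}
processed_crash = {'signature': 'js::foo', 'product': 'Firefox'}

document = build_es_document(raw_crash, processed_crash)

# The processed crash gains no 'raw_crash' key of its own.
assert 'raw_crash' not in document['processed_crash']

print(json.dumps(document, sort_keys=True, indent=2))
```

With this shape, the processed_crash branch can be documented once for every storage system, and the raw crash does not need an ES-specific exception in the schema.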
Doing that would indeed imply changes to both advanced search and supersearch, as well as a full reindexing of our database. We might need to do the reindexing at some point anyway, especially since we will want to have that raw_crash field everywhere. Maybe it is worth putting in the effort now.
I'm a bit concerned that this change might break search for a little while, though. I'm not quite sure what the strategy for the existing data would be here. I expect that we will need to reprocess the last 6 months of crashes (but putting them in elasticsearch only; no need to reindex in postgres and hbase). Reprocessing will be needed because we don't have unredacted processed crashes in HBase yet, and we want PII data to be in elasticsearch.
I would be happy to discuss a strategy with you for reprocessing into elasticsearch only.
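To illustrate why the search code would have to change, here is a rough sketch, assuming the two-branch layout is adopted. The helper `namespaced_term_filter` and the sample signature are hypothetical and not part of the existing search implementations; the only point is that field names gain a branch prefix in the query:

```python
def namespaced_term_filter(field, value, branch='processed_crash'):
    """Hypothetical helper: build a term filter for a field that lives
    under one branch of the two-branch document."""
    # Fields inside a sub-object are addressed with a dotted path.
    return {'term': {'%s.%s' % (branch, field): value}}


# A search on the signature now targets 'processed_crash.signature'
# instead of the top-level 'signature' field.
query = namespaced_term_filter('signature', 'js::foo')
print(query)
```

Documents indexed under the old layout would not match such a query, which is why a full reindex is needed.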
Closing for the moment, will reopen when it is ready for review.
Conflicts: docs/middleware.rst
…remove-deprecated-middleware Fixes bug 891921 - Removed all files related to the old, obsolete middleware.
…sig-hist-doc Fixes bug 938410 - Fixed example in signature_history documentation.
…block Bug 939141 - Annotate the largest free VM block in the processed crash. r=ted
…riencing failure.
Fixes Bug 931147 - tagged logging of transaction failures with name of the resource experiencing failure
…6-non-plotted-graphs-on-topcrasher Bug789526 non plotted graphs on topcrasher
…ches. Updating backfill app.
…eds manual testing.
@twobraids r?