Metabug: Make minidump-processor good enough to replace mozilla's minidump-stackwalk #153
Comments
If the minidump has an exception in it, the exception stream has a I think you're correct though, since the existing code uses |
Things I don't know off the top of my head:
|
Initial schema sketch from when minidump-stackwalk was being developed
|
That was an old format that Socorro doesn't use anymore. I don't think there are other users of minidump-stackwalk, so my vote is that we can ditch it. If someone wants it back, I think they can write a wrapper that converts the JSON output to their pipe-delimited monstrosity format.
The input JSON file is the crash annotations sent with the crash report. We have a list of annotations at https://hg.mozilla.org/mozilla-central/file/tip/toolkit/crashreporter/CrashAnnotations.yaml but no schema. That's on my todo list because not having a schema is bonkers.
Socorro runs minidump-stackwalk in the processor and only uses the JSON output. There's a minidump-stackwalk has that pipe-delimited output format which we can ditch. minidump-stackwalk also generates a lot of logging output to stderr which is helpful for debugging. If I had my druthers, it'd be better for minidump-stackwalk to do its logging output to stdout/stderr in a conventional fashion and emit the JSON output into an output file specified in the command line arguments.
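To make the "logs to stderr, JSON to a file named on the command line" idea concrete, here is a minimal sketch. The program name, the `--output-file` and `--verbose` flags, and the placeholder result are all made up for illustration; this is not the actual minidump-stackwalk CLI.

```python
import argparse
import json
import logging
import sys


def build_parser():
    # Hypothetical flags for illustration; not the real minidump-stackwalk CLI.
    parser = argparse.ArgumentParser(prog="stackwalk-sketch")
    parser.add_argument("minidump", help="path to the minidump file")
    parser.add_argument("--output-file", required=True,
                        help="where to write the JSON result")
    parser.add_argument("--verbose", action="store_true",
                        help="emit debug logging to stderr")
    return parser


def run(argv):
    args = build_parser().parse_args(argv)
    # All logging goes to stderr, so it never mixes with the JSON result.
    logging.basicConfig(
        stream=sys.stderr,
        level=logging.DEBUG if args.verbose else logging.INFO,
        force=True,
    )
    log = logging.getLogger("stackwalk-sketch")
    log.info("processing %s", args.minidump)

    # Placeholder result; a real tool would walk the stacks here.
    result = {"placeholder": True, "minidump": args.minidump}

    # The JSON result goes to the file named on the command line.
    with open(args.output_file, "w", encoding="utf-8") as fp:
        json.dump(result, fp)
    log.info("wrote %s", args.output_file)
    return 0


if __name__ == "__main__":
    sys.exit(run(sys.argv[1:]))
```

With this split, the processor can read the JSON from a known path and treat everything on stderr as ordinary log output.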
minidump-stackwalk outputs JSON and the Socorro processor sticks that in the processed crash in the "json_dump" field. Then the Socorro processor runs a bunch of other processing rules and extracts data from the There's no schema for the processed crash, either. I've got that on my todo list, too.
minidump-stackwalk spends the majority of its time downloading SYM files and parsing them. To reduce that time spent, it caches the SYM files on disk in a wildly irritating way. It's still spending all the time parsing them, though. I have a bug for investigating how not caching SYM files on disk affects minidump-stackwalk performance, but I haven't done it, yet. I'm guessing it affects it, but not much.
Let's talk about multiple-url support... minidump-stackwalk supports looking at multiple places for SYM files. That's great, but it doesn't keep track of where it found the SYM file when it caches the SYM file on disk. So then when it goes to populate the url for the SYM file, it's wrong in a bunch of cases. This is complicated by the fact we have a "private bucket" of SYM files and the url for that is something like
Back to parsing... Symbolic parses SYM files (which takes a long time) and then generates a symcache file. If we cache those on disk, then we get to skip the downloading and parsing steps and it's much faster. That's what Eliot does currently and I think that's what Symbolicator does, too.
Given that it'd be faster for all these systems if they didn't have to parse SYM files, I've been tossing around changing Tecken so that it generates a symcache file for every SYM file uploaded and stores that in S3. So then we'd change things to try to download the symcache file and if it's there, use that. If it's not or it's in an older symcache format, download the SYM file, parse it, generate a symcache, and use that. symcache files are not small and I think they do change format periodically, so maybe it makes sense to just generate symcache files for often used files. That's a project for a different day.
I think for here, I'd cache the symbolic symcache file on disk because reusing that file where possible is a big win.
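The lookup order described above (disk cache, then a prebuilt symcache, then download and parse the SYM file) could be sketched roughly like this. Everything here is an assumption for illustration: the version constant, the cache file naming, and the three callables are hypothetical stand-ins, not Tecken, Eliot, or symbolic APIs.

```python
import os

# Assumed symcache format version; the real value would come from symbolic.
CURRENT_SYMCACHE_VERSION = 7


def load_symcache(cache_dir, debug_file, debug_id,
                  fetch_symcache, fetch_sym, build_symcache):
    """Return symcache bytes, or None if no symbols exist anywhere.

    Hypothetical callables supplied by the caller:
      fetch_symcache(debug_file, debug_id) -> (version, data) or None
      fetch_sym(debug_file, debug_id)      -> SYM bytes or None
      build_symcache(sym_bytes)            -> symcache bytes (the slow parse)
    """
    # The version is part of the file name, so a format bump misses the cache.
    name = f"{debug_file}-{debug_id}.v{CURRENT_SYMCACHE_VERSION}.symcache"
    path = os.path.join(cache_dir, name)

    # 1. Disk cache hit: skip both the download and the parse.
    if os.path.exists(path):
        with open(path, "rb") as fp:
            return fp.read()

    # 2. Prebuilt symcache, used only if it is in the current format.
    remote = fetch_symcache(debug_file, debug_id)
    if remote is not None and remote[0] == CURRENT_SYMCACHE_VERSION:
        data = remote[1]
    else:
        # 3. Fall back to downloading and parsing the SYM file.
        sym = fetch_sym(debug_file, debug_id)
        if sym is None:
            return None
        data = build_symcache(sym)

    # Write to a temp file and rename, so readers never see a partial file.
    os.makedirs(cache_dir, exist_ok=True)
    tmp = path + ".tmp"
    with open(tmp, "wb") as fp:
        fp.write(data)
    os.replace(tmp, path)
    return data
```

Keeping the format version in the cache file name means an older symcache on disk is simply ignored after an upgrade rather than needing an explicit migration.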
minidump-stackwalk cached files, but didn't do anything to maintain or clean up the cache. That was done by another process that runs on the Socorro processor nodes that I maintain. If caching symcache files on disk sounds arduous, then I'm game for thinking about other ideas, but I don't think I have time to rewrite Tecken upload for a while. |
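For a sense of what cache maintenance could look like, here is a minimal sketch of one possible policy: delete the least recently modified files until the cache fits a size budget. This is an assumption for illustration, not a description of what the existing Socorro cleanup process does; `prune_cache` and `max_bytes` are made-up names.

```python
import os


def prune_cache(cache_dir, max_bytes):
    """Delete the oldest files (by mtime) until the cache fits in max_bytes.

    Returns the list of removed paths. Hypothetical policy, for illustration.
    """
    entries = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            st = os.stat(path)
            entries.append((st.st_mtime, st.st_size, path))

    total = sum(size for _mtime, size, _path in entries)
    removed = []
    # Tuples sort by mtime first, so the oldest files are evicted first.
    for _mtime, size, path in sorted(entries):
        if total <= max_bytes:
            break
        os.remove(path)
        total -= size
        removed.append(path)
    return removed
```

Something like this could run from cron or inside the tool itself; either way the cache stops growing without bound.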
For the minidump-stackwalk JSON output schema, I think I want to spend some quality time documenting it officially in that repo. Then we can point to that as the "official schema". Towards that end, I took your schema and worked on it and created a WIP PR: mozilla-services/minidump-stackwalk#32 |
So much progress... so close........ |
Ok, after a bit of delay from illness and Sentry deciding to pivot to other problems, I am reasonably confident that rust-minidump's minidump-stackwalk is ~ready to be seriously tested as a replacement for Socorro's backend. I have published all the crates at version 0.9.0 to make it easy to grab the latest build and also to indicate we're close to a 1.0. Some useful links:
TODO:
I'm not really sure how to evaluate performance here. I could do some synthetic benchmarking but I feel like caching of sym files under a real workload must be such a huge part of the performance story that we should be testing in something resembling an actual production workload? |
With the 0.9.2 release published, I am tentatively declaring the rust-minidump side of the equation complete. I will now be focusing more on testing and hardening, and not feature development. I am hanging the issues I find off the socorro parity milestone so I don't have to manually manage the "dashboard". If you find any compatibility issues of your own, you can also file them against it (if github allows that, idk). |
The feature work is done, see the milestone dashboard for remaining bugfixes
Original metabug:
This is a metabug for the first milestone of Kill Breakpad
Replace mozilla/minidump-stackwalk with rust-minidump's minidump-processor
Subtasks
[ ] Replace breakpad-symbols with something based on symbolic #159 - Use symbolic for breakpad-symbols?
Needed for strict parity but not necessarily a blocker
[ ] Implement an exploitability heuristic like Breakpad #25 - Implement an exploitability heuristic
Probably lots more to do, will expand as we find them.