STACK WIN unwinding #240
|
Reminder: if anyone wants to test this out on some examples, I have instructions on how to use this with tecken here: https://github.com/luser/rust-minidump/tree/master/minidump-stackwalk#analyzing-firefox-minidumps |
|
Hmm the extra The reason I convinced myself I needed the -1 was that I was seeing a few cases where we were one-past-the-end of the record we were supposed to be using. Perhaps I'm still not computing/handling the length right? |
I think this is actually not unreasonable, because the return address on the stack is the instruction after the |
Hah, amazing! I'm sure you could grovel through blame and bugzilla for examples, but all the actual crash reports are assuredly lost to the mists of time. Given that this is all 32-bit Windows-specific, maybe it's not worth the effort? |
|
Ah right, having slept on it I realized of course that we don't adjust the |
As it turns out, most of my implementation was actually correct! The main issues seemed to stem from the breakpad symbol file being a nasty thing full of Lies. Most notably:

* The byte length of a STACK WIN entry may actually cover the whole function(?) and just expect later STACK WIN entries that overlap with it to mask it out. To handle this, we now truncate the previous STACK WIN entry to the starting address of the next one during parsing.
* The parameter_size field in FUNC entries is actually a trap, and the STACK WIN values are the ones that should be used. It's not 100% clear to me if the FUNC entries are *ever* reasonable, or if we should refuse to evaluate CFI if there is no STACK WIN entry to provide that value. For now I'm maximally permissive and will both use the FUNC value as a fallback, and further use 0 as a fallback if even that isn't present. (Defaulting to 0 is necessary when unwinding the context frame since there is no grand_callee yet, but it's not clear to me if that default should apply otherwise.)

I also realized that I should be using the callee's adjusted "instruction" value rather than the raw value of `$eip` to look up modules/symbols. This is more universally applied in a refactor in the next patch.

This also adds a few "basic" tests from the breakpad testsuite. Notably I have omitted the tests for all the weird corner-case heuristics that breakpad uses, like scanning from .raSearchStart. So far I haven't found an actual Firefox minidump (in my limited testing) that needs these, so I'll punt on them until proven otherwise.
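The parameter-size fallback described above could be sketched roughly as follows. This is only an illustration, not the code in this PR: `StackWinRecord`, `FuncRecord`, and `grand_callee_parameter_size` are hypothetical, simplified names, and the real symbol-file records carry many more fields.

```rust
/// Hypothetical, simplified stand-in for a parsed STACK WIN record.
#[derive(Debug, Clone, Copy)]
pub struct StackWinRecord {
    pub parameter_size: u32,
}

/// Hypothetical, simplified stand-in for a parsed FUNC record.
#[derive(Debug, Clone, Copy)]
pub struct FuncRecord {
    pub parameter_size: u32,
}

/// Choose the parameter size to feed into CFI evaluation.
///
/// Order of preference, as described in the commit message:
/// 1. the STACK WIN value, if a STACK WIN record exists;
/// 2. the FUNC value as a fallback;
/// 3. 0 if neither is available (e.g. when unwinding the context frame,
///    where there is no grand callee yet).
pub fn grand_callee_parameter_size(
    stack_win: Option<&StackWinRecord>,
    func: Option<&FuncRecord>,
) -> u32 {
    stack_win
        .map(|w| w.parameter_size)
        .or_else(|| func.map(|f| f.parameter_size))
        .unwrap_or(0)
}
```

Keeping the whole chain in one function makes it easy to tighten later, for example by refusing to evaluate CFI when no STACK WIN record is present, without touching the callers.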
CFI unwinding should be using callee.instruction instead of the pc/ip in the context. Rather than duplicate the logic to specially handle the context frame everywhere, just use the fact that we keep the right value in the callee and pass it down. To avoid having a million things passed down I also remove several other arguments that are just in the callee frame already. This adds some boilerplate for pulling out the callee validity, but I think it's fine.
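To illustrate why the adjusted instruction matters for lookups, here is a minimal sketch. It assumes the usual convention that the context frame's program counter is used as-is while recovered return addresses are moved back by one byte; `lookup_address` and `record_contains` are hypothetical helpers, not functions from rust-minidump.

```rust
/// Address to use when looking up modules, symbols, or CFI for a frame.
///
/// Assumption: the context frame's pc points at the instruction that was
/// executing, so it is used unchanged. For every other frame the pc is a
/// return address, i.e. the instruction *after* the call, so we step back
/// one byte to land inside the call instruction itself.
pub fn lookup_address(raw_pc: u64, is_context_frame: bool) -> u64 {
    if is_context_frame {
        raw_pc
    } else {
        raw_pc.saturating_sub(1)
    }
}

/// True if `addr` falls inside the half-open range `[start, start + len)`.
pub fn record_contains(start: u64, len: u64, addr: u64) -> bool {
    addr >= start && addr - start < len
}
```

If a call is the last instruction covered by a record, the raw return address sits one past the end of that record, and only the adjusted address matches it.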
also a little tweak to prevent register forwarding when using STACK WIN that I just absentmindedly added while reviewing the documentation
|
PR complete, I think? |
|
I'd say yes! I'm going to throw some particularly nasty minidumps at it to look at the results but the few I've already tried all looked very good already. |
|
Ok I got one (1) socorro crash report confidently verified to have the same unwind for all threads using rust-minidump, so I'm happy landing this now. Can iterate on this more later. (Verified using my new little script: https://github.com/Gankra/socc-pair/) |
This seems to get the implementation working well on some examples in socorro I tested.
Notable problems I found and fixed:
* The length of a STACK WIN record just covers the entirety of the function(?) and subsequent entries mask over it. So we have to truncate the lengths as we parse them to clean this up.
* Using `$eip` directly is incorrect, and you need to subtract 1? (Which is what adjust_instruction does, so really we should just be using that value instead of eip?)

Also I added in a quick and dirty, fancier version of address_seems_valid that actually looks up the function record for the module, but it's not quite right because it doesn't have a case for "it's in the module but we couldn't load any function records". I should probably delete the code; I only wrote it while I was desperately trying to figure out what was breaking.
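The length truncation mentioned above could look something like the following sketch. It is not the PR's actual parser code: `StackWinRecord` is a hypothetical, pared-down type, and sorting up front is an assumption made so the example does not depend on the order of records in the symbol file.

```rust
/// Hypothetical, pared-down STACK WIN record: just the covered range.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StackWinRecord {
    pub address: u64,
    pub size: u64,
}

/// Clip every record so that it ends no later than the start of the next one.
///
/// Records are sorted by start address first (an assumption of this sketch),
/// then each record's size is reduced if it would otherwise overlap its
/// successor. Records that share a start address leave the earlier one
/// with size 0.
pub fn truncate_overlaps(mut records: Vec<StackWinRecord>) -> Vec<StackWinRecord> {
    records.sort_by_key(|r| r.address);
    for i in 1..records.len() {
        let next_start = records[i].address;
        let prev = &mut records[i - 1];
        let max_size = next_start - prev.address;
        if prev.size > max_size {
            prev.size = max_size;
        }
    }
    records
}
```

After this pass no two records overlap, so a lookup by address can return at most one STACK WIN record.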
Surprisingly, I now don't have any examples which seem to need the accursed scanning or other hacks that breakpad is stuffed with for STACK WIN cfi!
TODO: