Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Make the Win-x64 CFI code more faithful #549

Merged
merged 3 commits into from
May 17, 2022
Merged

Conversation

Swatinem
Copy link
Member

This is trying to fix #546 by more faithfully implementing Win-x64 unwinding, specifically:
It restores all the registers that were saved, and adds support for FP-based unwinding.

I’m still struggling to make this code work correctly with RtlDispatchException which we somehow still can’t unwind through.

@Swatinem
Copy link
Member Author

Swatinem commented May 6, 2022

@loewenheim I double-checked the MachineFrame handling, and it does generate the same ASCII records/offsets as before.
So this should be strictly an improvement to before. I would still appreciate someone double-checking the SetFPRegister handling, as I couldn’t verify that in my own testing. CC @gabrielesvelto

@gabrielesvelto
Copy link
Contributor

I couldn't dig too deep but here's my results from testing on one of our crashes. The old stack trace looked like this:

 0  ntdll.dll!LdrpICallHandler + 0xf
    Found by: given as instruction pointer in context
 1  ntdll.dll!RtlpExecuteHandlerForException + 0xe
    Found by: call frame info
 2  ntdll.dll!RtlDispatchException + 0x243
    Found by: call frame info

The new one looks like this:

 0  ntdll.dll!LdrpICallHandler + 0xf
    Found by: given as instruction pointer in context
 1  ntdll.dll!RtlpExecuteHandlerForException + 0xe
    Found by: call frame info
 2  ntdll.dll!RtlDispatchException + 0x243
    Found by: call frame info
 3  ntdll.dll!KiUserExceptionDispatch + 0x2d
    Found by: call frame info
 4  KERNELBASE.dll!<unknown in kernelbase.dll> + 0x250
    Found by: stack scanning
 5  KERNELBASE.dll!RegOpenKeyExInternalW + 0x1c2
    Found by: call frame info
 6  KERNELBASE.dll!RegOpenKeyExW + 0x18
    Found by: call frame info
 7  SS3DevProps.dll + 0x12449
    Found by: call frame info
 8  ntdll.dll!SbSelectProcedure + 0x18e
    Found by: stack scanning

Visual studio reports this stack:

1 	ntdll.dll!LdrpICallHandler()
2	ntdll.dll!RtlpExecuteHandlerForException()
3 	ntdll.dll!RtlDispatchException()
4 	ntdll.dll!KiUserExceptionDispatch()
5 	ntdll.dll!LdrpDispatchUserCallTarget()
6 	MMDevAPI.dll!CDeviceEnumerator::OnDeviceStateChanged(unsigned short const *,unsigned long)
7 	MMDevAPI.dll!CLocalEndpointEnumerator::OnMediaNotification()
8 	MMDevAPI.dll!CMediaNotifications::OnMediaNotificationWorkerHandler()
9 	ntdll.dll!TppSimplepExecuteCallback()
10 	ntdll.dll!TppWorkerThread()
11 	kernel32.dll!BaseThreadInitThunk()
12	[Inline Frame] mozglue.dll!mozilla::interceptor::FuncHook<mozilla::interceptor::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyShared>,void (*)(int, void *, void *)>::operator()(int & aArgs, void * & aArgs, void * & aArgs)
13 	mozglue.dll!patched_BaseThreadInitThunk(int aIsInitialThread, void * aStartAddress, void * aThreadParam)

Note how we're walking RtlDispatchException() correctly but still going in the woods after KiUserExceptionDispatch(). I haven't had time to compare register contents but I'll try soon.

@Swatinem
Copy link
Member Author

@gabrielesvelto thanks for having a look at this!
This does look promising. I’m wondering what the unwind codes for KiUserExceptionDispatch are…
From the name, it looks a bit like some kind of kernel interface. It may as well be possible this has one of these machine frame unwind codes. I tried to keep those in line with whatever was generated for them before, but it might as well be the case that those were broken all along.

@gabrielesvelto
Copy link
Contributor

I’m wondering what the unwind codes for KiUserExceptionDispatch are…

That's a very good point. Before your patch the directives for KiUserExceptionDispatch were the following:

STACK CFI INIT a0bd0 5c .cfa: $rsp 8 + .ra: .cfa 8 - ^
STACK CFI a0bd0 .cfa: $rsp 1472 + $rsp: .cfa 24 - ^ .ra: .cfa 48 - ^

With your patch applied they become:

STACK CFI INIT a0bd0 5c .cfa: $rsp 1472 + $rsp: .cfa 24 - ^ .ra: cfa 48 - ^ $rbx: .cfa 1328 - ^ $rbp: .cfa 1312 - ^ $rsi: .cfa 1304 - ^ $rdi: .cfa 1296 - ^ $r12: .cfa 1256 - ^ $r13: .cfa 1248 - ^ $r14: .cfa 1240 - ^ $r15: .cfa 1232 - ^

So I had an idea, and I copy-pasted the old CFI directives for KiUserExceptionDispatch into the .sym file generated with your patch applied and here's the stack trace I got:

 0  ntdll.dll!LdrpICallHandler + 0xf
    Found by: given as instruction pointer in context
 1  ntdll.dll!RtlpExecuteHandlerForException + 0xe
    Found by: call frame info
 2  ntdll.dll!RtlDispatchException + 0x243
    Found by: call frame info
 3  ntdll.dll!KiUserExceptionDispatch + 0x2d
    Found by: call frame info
 4  ntdll.dll!LdrpDispatchUserCallTarget + 0xd
    Found by: call frame info
 5  MMDevAPI.dll!virtual long CDeviceEnumerator::OnDeviceStateChanged(unsigned short const *,unsigned long) + 0xef
    Found by: call frame info
 6  MMDevAPI.dll!CLocalEndpointEnumerator::OnMediaNotification + 0x31c
    Found by: call frame info
 7  MMDevAPI.dll!void CMediaNotifications::OnMediaNotificationWorkerHandler(struct _TP_CALLBACK_INSTANCE *) + 0x1f3
    Found by: call frame info
 8  ntdll.dll!TppSimplepExecuteCallback + 0x98
    Found by: call frame info
 9  ntdll.dll!TppWorkerThread + 0x689
    Found by: call frame info
10  kernel32.dll!BaseThreadInitThunk + 0x13
    Found by: call frame info
11  mozglue.dll!patched_BaseThreadInitThunk(int, void*, void*) [WindowsDllBlocklist.cpp:1c7f7adc90e2b4c8d64548938bb1499033c5be8f : 576 + 0x14]
    Found by: call frame info
12  ntdll.dll!RtlUserThreadStart + 0x20
    Found by: call frame info

Bam! Full stack! So, your changes are doing the right thing for RtlDispatchException but they are somehow regressing KiUserExceptionDispatch. If it helps the file this is generated from is this one and the associated PDB is this one.

@Swatinem
Copy link
Member Author

Oh, looks like I had a very hard to spot typo in there, lol. (notice the missing . for cfa here: .ra: cfa 48; well, I didn’t until now)

Lets try now ;-)

@codecov-commenter
Copy link

Codecov Report

Merging #549 (02aaacb) into master (b4597b2) will decrease coverage by 0.04%.
The diff coverage is 52.54%.

@@            Coverage Diff             @@
##           master     #549      +/-   ##
==========================================
- Coverage   66.87%   66.83%   -0.05%     
==========================================
  Files         101      101              
  Lines       18840    18884      +44     
==========================================
+ Hits        12600    12621      +21     
- Misses       6240     6263      +23     

@gabrielesvelto
Copy link
Contributor

Oh, looks like I had a very hard to spot typo in there, lol. (notice the missing . for cfa here: .ra: cfa 48; well, I didn’t until now)

Lets try now ;-)

It's working fine! There's differences in the contents of other registers but the stack trace is exactly what it's supposed to be. If I have some time I'll compare the register contents with what Visual Studio gets but in the meantime this looks ready to land.

@Swatinem Swatinem merged commit 2eddda3 into master May 17, 2022
@Swatinem Swatinem deleted the fix/cfi-win-fpunwind branch May 17, 2022 12:58
@jrmuizel
Copy link

Do you have a plans to do a release that include this change soon?

@Swatinem
Copy link
Member Author

Sorry for the delay, we had some automation issues. Anyhow, we managed to release 8.7.3.

@gabrielesvelto
Copy link
Contributor

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

symbolic-minidump mishandling UnwindOperation::SetFPRegister?
5 participants