The first sos command fails in live user mode debugging #8909

Closed
tvass83 opened this issue Sep 11, 2017 · 12 comments

@tvass83
Contributor

tvass83 commented Sep 11, 2017

  • It only happens in live debugging, not when working with dump files.
  • The command can be anything from sos.dll.
  • The same command succeeds the second time and from that point on, there are no errors.
0:074> .loadby sos clr 
0:074> !threads 
c0000005 Exception in C:\Windows\Microsoft.NET\Framework\v4.0.30319\sos.threads debugger extension. 
      PC: 0f0ab823  VA: 00000000  R/W: 0  Parameter: 00000004
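Since the same command succeeds on the second attempt, a script driving the debugger could simply retry once. The following is a minimal sketch, not a fix: `run_with_one_retry` and the `execute` callable are hypothetical names (any function that sends a command to the debugger and returns its text output would do), and the pattern is only inferred from the error text shown above.

```python
import re

# Assumption: a faulting extension command prints a line shaped like the
# "c0000005 Exception in ... debugger extension." message quoted above.
EXTENSION_FAULT = re.compile(r"Exception in .* debugger extension")


def run_with_one_retry(execute, command):
    """Run `command` through `execute`; retry once if the extension faulted.

    `execute` is a hypothetical callable that takes a command string and
    returns the debugger's text output.
    """
    output = execute(command)
    if EXTENSION_FAULT.search(output):
        # The first SOS command is the one reported to fail, so try again.
        output = execute(command)
    return output
```

This only hides the symptom on the scripting side; the underlying cause is untouched.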
@swgillespie
Contributor

I've experienced this too, when running debugger tests.

@mikem8361
Member

Are you running .NET Core? I ask because the desktop SOS was loaded:

C:\Windows\Microsoft.NET\Framework\v4.0.30319\sos.threads debugger extension.

@tvass83
Contributor Author

tvass83 commented Sep 12, 2017

@swgillespie, @mikem8361: ahh, you were right, sorry for the confusion. It works perfectly with .NET Core, as you can see in the example below.
Any ideas where I should file this bug for the full .NET Framework?

ModLoad: 00100000 0011e000   c:\Program Files (x86)\dotnet\dotnet.exe
ModLoad: 77730000 778a9000   C:\WINDOWS\SYSTEM32\ntdll.dll
ModLoad: 772c0000 773b0000   C:\WINDOWS\SYSTEM32\KERNEL32.DLL
ModLoad: 76ca0000 76e16000   C:\WINDOWS\SYSTEM32\KERNELBASE.dll
ModLoad: 74760000 747ce000   C:\WINDOWS\SYSTEM32\SYSFER.DLL
ModLoad: 6d1c0000 6d29c000   C:\WINDOWS\SYSTEM32\ucrtbase.dll
ModLoad: 54c00000 54c3f000   c:\Program Files (x86)\dotnet\host\fxr\2.0.0\hostfxr.dll
ModLoad: 54770000 547d5000   c:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App\2.0.0\hostpolicy.dll
ModLoad: 52cc0000 530dc000   c:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App\2.0.0\coreclr.dll
ModLoad: 76030000 760ab000   C:\WINDOWS\SYSTEM32\ADVAPI32.dll
ModLoad: 76bb0000 76c6e000   C:\WINDOWS\SYSTEM32\msvcrt.dll
ModLoad: 774e0000 77523000   C:\WINDOWS\SYSTEM32\sechost.dll
ModLoad: 76950000 769fc000   C:\WINDOWS\SYSTEM32\RPCRT4.dll
ModLoad: 74840000 7485e000   C:\WINDOWS\SYSTEM32\SspiCli.dll
ModLoad: 74830000 7483a000   C:\WINDOWS\SYSTEM32\CRYPTBASE.dll
ModLoad: 747d0000 74829000   C:\WINDOWS\SYSTEM32\bcryptPrimitives.dll
ModLoad: 77010000 770fa000   C:\WINDOWS\SYSTEM32\ole32.dll
ModLoad: 77100000 772ba000   C:\WINDOWS\SYSTEM32\combase.dll
ModLoad: 76590000 766dc000   C:\WINDOWS\SYSTEM32\GDI32.dll
ModLoad: 74a10000 74b50000   C:\WINDOWS\SYSTEM32\USER32.dll
ModLoad: 76f70000 77005000   C:\WINDOWS\SYSTEM32\OLEAUT32.dll
ModLoad: 76ec0000 76f04000   C:\WINDOWS\SYSTEM32\SHLWAPI.dll
ModLoad: 73d10000 73d18000   C:\WINDOWS\SYSTEM32\VERSION.dll
ModLoad: 73ed0000 73eeb000   C:\WINDOWS\SYSTEM32\bcrypt.dll
ModLoad: 76c70000 76c9b000   C:\WINDOWS\SYSTEM32\IMM32.DLL
ModLoad: 74b50000 74c70000   C:\WINDOWS\SYSTEM32\MSCTF.dll
ModLoad: 51c60000 52582000   c:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App\2.0.0\System.Private.CoreLib.dll
ModLoad: 773b0000 773bc000   C:\WINDOWS\SYSTEM32\kernel.appcore.dll
ModLoad: 06ae0000 06ae8000   d:\!vsprojs\TestSOS\bin\Debug\netcoreapp2.0\TestSOS.dll
ModLoad: 72fc0000 72fcc000   c:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App\2.0.0\System.Runtime.dll
ModLoad: 51a60000 51b54000   c:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App\2.0.0\clrjit.dll
ModLoad: 53920000 53944000   c:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App\2.0.0\System.Console.dll
ModLoad: 54be0000 54bf2000   c:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App\2.0.0\System.Threading.dll
ModLoad: 537f0000 53859000   c:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App\2.0.0\System.Runtime.Extensions.dll
(44f8.28bc): Break instruction exception - code 80000003 (first chance)
eax=7ec66000 ebx=00000000 ecx=777d0f50 edx=04400401 esi=777d0f50 edi=777d0f50
eip=7779ac30 esp=090bfd68 ebp=090bfd94 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
ntdll!DbgBreakPoint:
7779ac30 cc              int     3
0:006> .loadby sos coreclr
0:006> !threads
ThreadCount:      2
UnstartedThread:  0
BackgroundThread: 1
PendingThread:    0
DeadThread:       0
Hosted Runtime:   no
                                                                         Lock  
       ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt Exception
   4    1 22cc 046b9c10     21220 Preemptive  00000000:00000000 0463cee0 0     Ukn (Finalizer) 
   0    2 59f4 04647aa8     20020 Preemptive  06C240D8:00000000 0463cee0 1     Ukn 

@kevingosse
Contributor

kevingosse commented Sep 12, 2017

It only happens in live debugging, not when working with dump files.

I have actually observed the same issue with dump files (with .NET Framework), but only if the dump was taken on the same computer as the one I'm using to run WinDbg. It makes me think it could be an issue in the code path WinDbg uses when loading the local mscordacwks.dll (when opening a dump from another machine, the file is instead fetched from the symbol store).

@bendono
Contributor

bendono commented Sep 13, 2017

I have been hitting this issue on nearly a daily basis for several years now. Further details as well as the cause are included in #5193. It is a bug in dbgeng. However, that is not part of the CoreCLR stack, so the issue was worked around in dotnet/coreclr#3513. No such workaround (or a proper fix) has made it to the full CLR.

@mikem8361
Member

I'm not that familiar with the desktop process and where to file a bug. I'm including @lt72. He may be able to address that.

@tvass83
Contributor Author

tvass83 commented Sep 14, 2017

@bendono: you did a pretty good job there! Thanks for letting me know the details. It seems that I need to improve my search skills before submitting a duplicate issue. Lesson learned. :-)

@poizan42
Contributor

poizan42 commented Sep 18, 2017

I reported this to the WinDbg team back in February last year. It happens because dbgeng.dll fails to realise that the currently loaded SOS version is the correct one, so SOS gets reinitialized. Andy Luhrs said they had filed an issue internally. Here is my original report:

If I

  1. Make a dump of a process (doesn't matter whether it's x86 or x64) running my currently installed .NET version (4.6.1 on Windows 10)
  2. Load that dump into windbg
  3. Load SoS
  4. Run any SoS commands needing the DAC, such as !clrstack

then the command throws an exception the first time it's run, but works afterwards:
[Image: the exception is shown]
The exception here turns out to be an AV caused by sos!g_ExtControl being null, which is weird because the SoS commands in question start by calling the INIT_API macro, which contains a call to sos!ExtQuery, which either initializes this pointer or bails out with an error.

Upon further investigation it turns out that g_ExtControl is indeed initialized and stays valid until the IG_GET_CLR_DATA_INTERFACE ioctl is performed in LoadClrDebugDll. This causes dbgeng to attempt to find and load the version of SoS that belongs to the version of the CLR loaded in the memory dump.

This would normally be fine. In this case, however, the version of SoS to load is the same one that is already loaded and that issued the ioctl. This results in SoS being reinitialized, which in turn resets the interface pointers in sos!DebugExtensionInitialize (and leaks the references to the old ones).
[Image: stack trace of the exception]
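The sequence described above can be modelled in a few lines. This is only a toy simulation in Python, not SOS or dbgeng code: the `Engine` and `Extension` classes and all method names are hypothetical stand-ins for dbgeng.dll, sos.dll, `ExtQuery`, `LoadClrDebugDll` and `DebugExtensionInitialize`. It also assumes that the data-access interface is cached after the first request, which would explain why the second command succeeds.

```python
class Engine:
    """Stands in for dbgeng.dll (hypothetical model)."""

    def get_clr_data_interface(self, caller):
        # Models the IG_GET_CLR_DATA_INTERFACE request: the engine picks the
        # SOS matching the target CLR. Here that is the extension already
        # loaded, so its initializer runs a second time.
        caller.debug_extension_initialize()
        return object()  # placeholder for the data-access interface


class Extension:
    """Stands in for sos.dll (hypothetical model)."""

    def __init__(self):
        self.ext_control = None  # plays the role of sos!g_ExtControl
        self.dac = None          # assumed to be cached after the first load

    def debug_extension_initialize(self):
        # Re-initialization resets the interface pointers.
        self.ext_control = None

    def ext_query(self, client):
        # Models the ExtQuery call made by INIT_API at the start of a command.
        self.ext_control = client

    def load_clr_debug_dll(self, engine):
        if self.dac is None:
            self.dac = engine.get_clr_data_interface(self)

    def run_command(self, engine, client):
        self.ext_query(client)
        self.load_clr_debug_dll(engine)
        if self.ext_control is None:
            raise RuntimeError("null ext_control (the access violation)")
        return "ok"


def demo():
    """Run the same command twice and record each outcome."""
    engine, ext, client = Engine(), Extension(), object()
    outcomes = []
    for _ in range(2):
        try:
            outcomes.append(ext.run_command(engine, client))
        except RuntimeError:
            outcomes.append("failed")
    return outcomes
```

In this model `demo()` reports a failure for the first command and success for the second, matching the observed behaviour.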

The version of DbgEng.dll I have tested this with is 10.0.10586.0, and sos.dll is 4.6.1038.0

@mikem8361
Member

We will try to include a fix to the desktop SOS for the next update or immediately after. We will create a desktop tracking issue.

Since this is fixed in coreclr would you mind closing this issue?

@bendono
Contributor

bendono commented Sep 19, 2017

@mikem8361 CoreCLR fixes this by working around the problem in dbgeng.dll. I imagine a similar workaround could be done for the full desktop, but has there been any consideration of fixing the actual problem in dbgeng.dll? I would think that addressing the real problem would be a better fix than a workaround.
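For illustration, one conceivable shape of an extension-side workaround is to remember the interface pointer before asking the engine for the data-access interface and to restore it if the request wiped it. This is a sketch under assumptions, not what dotnet/coreclr#3513 actually does: the thread does not describe that change, and every class and method name below is hypothetical.

```python
class Engine:
    """Stands in for dbgeng.dll (hypothetical model)."""

    def get_clr_data_interface(self, caller):
        # The engine re-initializes the extension that is already loaded.
        caller.debug_extension_initialize()
        return object()


class Extension:
    """Stands in for an SOS-like extension (hypothetical model)."""

    def __init__(self):
        self.ext_control = None  # plays the role of g_ExtControl
        self.dac = None

    def debug_extension_initialize(self):
        self.ext_control = None

    def load_clr_debug_dll(self, engine):
        saved = self.ext_control  # remember the pointer before the request
        self.dac = engine.get_clr_data_interface(self)
        if self.ext_control is None:
            # Assumed workaround: restore what the re-initialization reset.
            self.ext_control = saved

    def run_command(self, engine, client):
        self.ext_control = client
        if self.dac is None:
            self.load_clr_debug_dll(engine)
        return "ok" if self.ext_control is not None else "failed"


def first_command_outcome():
    return Extension().run_command(Engine(), object())
```

With the guard in place the first command already succeeds in this model, but the engine still re-initializes the extension, which is why a fix in dbgeng.dll itself would be cleaner.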

@poizan42
Contributor

@bendono Andy Luhrs (Program Manager for Debugging Tools for Windows) told me they had filed the issue as a bug, so I assume that means they intend to fix it. However, that was back in February last year and the issue is still present in the recent "WinDbg Preview", so it would seem that they consider it quite low priority.

@tvass83
Contributor Author

tvass83 commented Sep 19, 2017

Thank you all for the clarification and the details provided. I'm closing this issue, as it can be considered fixed in coreclr.

@tvass83 tvass83 closed this as completed Sep 19, 2017
@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 20, 2020