Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOS sets the invalid parameter handler in the CRT on Windows in an unsafe way #4070

Closed
davidwrighton opened this issue Jul 13, 2023 · 7 comments
Assignees
Labels
bug Something isn't working sos
Milestone

Comments

@davidwrighton
Copy link
Member

Description

In https://github.com/dotnet/diagnostics/blob/a54655ea621579a60530d8cf9d516c72ea8f23c2/src/SOS/Strike/exts.cpp#L288C12-L288C12 SOS sets the invalid parameter handler to something that will throw an exception, etc.

There is a comment there that this is safe as SOS is compiled against the static CRT. Unfortunately, this isn't the case anymore, and we now run the SOS invalid parameter handler when CRT calls from windbg itself are used. (note: it appears that the debug build of SOS is linked against a static CRT, but the release build is not).

This should be fixed by either changing to link against the static CRT or by not setting process global state.

@davidwrighton davidwrighton added the bug Something isn't working label Jul 13, 2023
@davidwrighton
Copy link
Member Author

@mikem8361 @tommcdon

I'm not actually confident that we should be using the static CRT, but it seems quite bad that we are configuring all CRT invalid paramters to run through our logic. I hit this when windbg behaved incorrectly (and that has since been fixed in the live version of Windbg), but in general this seems like a bad thing to do.

@tommcdon tommcdon added this to the 8.0.0 milestone Jul 13, 2023
@tommcdon tommcdon added the sos label Jul 13, 2023
@hoyosjs
Copy link
Member

hoyosjs commented Jul 14, 2023

From a security and distribution perspective, we probably shouldn't try to statically link the CRT.

@mikem8361
Copy link
Member

I have a vague feeling that there was a reason for SOS to statically link to the CRT, but I don't remember it. It has been this way since the desktop Framework days (I know that "it has always been that way" isn't a valid reason). @leculver or @noahfalk do remember anything?

@leculver
Copy link
Contributor

leculver commented Jul 14, 2023

That decision was before my time (I think that's been there since early 2000s), but I do remember the reasons. The following is the history of why we made that decision, but not any defense of the behavior. :)

The original author(s) of SOS hooked this API for the following reasons:

  1. Debuggers were MUCH buggier back in the day, and it was a lot harder to get a fix for one. So the issue which @davidwrighton ran into in WinDbg might take a few weeks to find a working build while we might be under time constraints to solve an issue. In this case, the work around doesn't work anymore (probably due to better security), but historically this kind of workaround regularly saved us from a lot of headache.
  2. The original design of the dac was to regularly throw Access Violations. Something like !dumpstackobjects could throw hundreds of AVs as it attempted to interpret every pointer on the stack as an object. The work in .net 7/8 to avoid AVs on Linux/OSX (since we can't recover from those SIGSEGV as we can on Windows) likely cleaned up a lot of this behavior in .net core.
  3. Historically, distributing the CRT with SOS was a pain point. SOS was a monolithic debugging dll which didn't depend on anything so you could copy/paste it onto a machine without a CRT and it would work. Additionally, we had a lot more dll-hell related to CRTs back in the day, and something like Watson (which had to process crash dumps from thousands of CLR builds) would result in a machine requiring 100s of CRTs. (Or maybe it wasn't that bad, but my recollection was that it was pretty bad.) SOS is no longer a monolithic DLL, so changing this behavior now isn't as big of a deal from this standpoint.

I have no objections to changing that behavior, it's the right thing to do from a security perspective. Here's two caveats that might come back to bite you that might need to be mitigated:

  • SOS here is used for both .Net Core and Desktop CLR. AFAIK, the fixes to not throw AVs in the dac have not been backported to the desktop runtime, and doing so would be a challenge, to say the least. You may need to keep that behavior when debugging a desktop runtime. (Or maybe not, it would be a good thing to test and see what happens. This old "feature" might be broken everywhere.) Edit: Keep in mind this isn't about plain-old AV's but rather passing invalid parameters to OS features. It might be hard to find a deterministic place where this happens in the desktop DAC. Here I'm only saying: If you turn this call off, be sure to test it on Desktop CLR, and if there's no obvious issues we are probably fine...we can always fall back to the old installed SOS.
  • You may need to include the CRT in the big bag of dlls that SOS pulls down. For example, including it in the debugging gallery or nuget package to make sure it is available on any given computer. I'm not familiar with the state of the CRT though, maybe it's backwards (forwards?) compatible enough now that we don't have those kinds of issues. I haven't dug into that in years.

Hope that helps!

@mikem8361
Copy link
Member

@davidwrighton, this line and these lines in the SOS CMakeLists.txt seem to indicate that we do statically link the CRT into SOS.dll so the comment below on the invalid parameter hook is accurate.

   // Make sure we do not tear down the debugger when a security function fails
   // Since we link statically against CRT this will only affect the SOS module.
   _set_invalid_parameter_handler(_SOS_invalid_parameter);

I'd rather not change this if it isn't really necessary for all the reasons Lee outlined above unless for the security concerns that Juan pointed out.

@davidwrighton
Copy link
Member Author

@mikem8361 When I looked into this I found that for the build of sos.dll that I was using it was linked to the dynamic CRT. I know there are various comments and possibly attempts in the CMake stuff to use the static CRT, but I found that it wasn't actually doing so. I believe that with our current codebase we should link against the static CRT, but my issue is that we are not doing so.

@mikem8361
Copy link
Member

mikem8361 commented Oct 9, 2023

Thanks David. Yes, I confirmed via dumpbin /imports sos.dll on a release build shows the dynamically linked CRT imports. I will remove the invalid parameter hook. The CMakeLists.txt stuff is very confusing.

mikem8361 added a commit to mikem8361/diagnostics that referenced this issue Oct 9, 2023
SOS does get built dynamically linking to the C++ CRT in release so the invalid parameter handler that
gets installed affects all the code in the process.

Fixes issue: dotnet#4070
mikem8361 added a commit that referenced this issue Oct 10, 2023
SOS does get built dynamically linking to the C++ CRT in release so the
invalid parameter handler that gets installed affects all the code in
the process.

Fixes issue: #4070
@ghost ghost locked as resolved and limited conversation to collaborators Nov 9, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working sos
Projects
None yet
Development

No branches or pull requests

5 participants