[API Proposal]: new System.Diagnostics.StackTrace(System.Threading.Thread) #79463

jhudsoncedaron · 2022-12-09T18:14:49Z

Background and motivation

Enable applications to provide self diagnostics:

This has come up a few times when we've had a runaway thread in production (sometimes in our hosted environment, sometimes in the customer's environment). There doesn't seem to be a better way to get the running stacks than attaching a debugger, taking a dump, and walking the dump. We did look into doing exactly that, but we are unhappy with the half a gigabyte of temp space it allocates and with the performance of the operation. (At the moment performance is secondary, but if this becomes a standard debugging technique it won't be. Think about it: if there's a diagnostics button, support is going to push that button a lot, whether or not it makes sense to push it.)

I looked into how to do it, found StackFrameHelper, and discovered that it takes a Thread in its constructor. This looked ideal, so I tried it and found that it asserts that it is passed either the current thread or a thread that isn't running. The comments suggest that a suspended thread should be usable, but the actual code at the point of assertion doesn't check for a suspended thread.

Now at this point somebody's going to jump in and say that SuspendThread was banished for a reason, and they'd be right. I definitely don't want SuspendThread back as it was, either. However, consider this: the GC is able to suspend a thread, and it doesn't cause the issues that SuspendThread normally causes. I ran out of puff attempting to determine how this works; however, we know the GC isn't troubled by threads currently being in native code when it walks their stacks, nor is it troubled by deadlocks when it suspends them for a full mark/sweep GC. The stack walk itself is in native code and would not be troubled by a managed suspend, so we should be able to do this.
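For contrast with the cross-thread capture requested here, the following is a minimal sketch of what the public StackTrace constructors allow today: a thread can only capture its own stack. The class and method names (CurrentThreadStackDemo, CaptureOwnStack, CaptureOnWorker) are hypothetical and exist only for illustration; the only runtime calls used are the existing StackTrace(int, bool) constructor and Thread.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical helper: shows that a StackTrace reflects the calling thread only.
static class CurrentThreadStackDemo
{
    // Captures the stack of whichever thread calls this method.
    // NoInlining keeps this frame visible in the captured trace.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static string CaptureOwnStack() =>
        new StackTrace(skipFrames: 0, fNeedFileInfo: false).ToString();

    // Runs CaptureOwnStack on a separate thread and returns what that thread saw.
    // The worker has to cooperate; nothing here inspects it from the outside.
    public static string CaptureOnWorker()
    {
        string result = "";
        var worker = new Thread(() => result = CaptureOwnStack()) { Name = "demo-worker" };
        worker.Start();
        worker.Join();
        return result;
    }
}
```

The limitation is visible in CaptureOnWorker: the trace is only obtained because the worker itself runs the capture, which a runaway thread will not do.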

API Proposal

namespace System.Diagnostics;

public partial class StackTrace
{
    public StackTrace(System.Threading.Thread thread, int numFramesToSkip = 0);
}

API Usage

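    // Uses the proposed StackTrace(Thread) constructor; one "name: frames" entry per worker thread.
    IActionResult DebugGetWorkerStacks() =>
        Content(string.Join("\r\n", WorkerThreads.Select(w =>
            w.Name + ": " + string.Join<StackFrame>("\r\n", new StackTrace(w).GetFrames()))));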

Alternative Designs

I wouldn't have bothered with numFramesToSkip, except that the source code already has it: all the work goes into unlocking the ability in native code, and the argument is already passed to it.

Risks

Unless I'm very much mistaken there is no risk that comes into play unless somebody actually calls the function. I can see a bad enough bug in the implementation causing sporadic deadlocks, but such a bug would still only be triggered if somebody calls the function.

dotnet-issue-labeler · 2022-12-09T18:14:53Z

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

ghost · 2022-12-09T18:43:28Z

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	jhudsoncedaron
Assignees:	-
Labels:	`api-suggestion`, `area-System.Diagnostics`, `untriaged`
Milestone:	-

jander-msft · 2022-12-09T21:03:58Z

If you are open to using out-of-process tools rather than requiring an API, you can use:

dotnet-stack tool; this runs completely out-of-process and only requires the diagnostic event pipe to be available
dotnet-monitor tool's /stack route (currently experimental; this loads a profiler into the process to get the information)

jhudsoncedaron · 2022-12-09T21:15:17Z

@jander-msft : I'm fine with out of process tools; the problem I ran into was taking the dump of the entire process to do so. It's quite overweight.

jander-msft · 2022-12-09T21:20:59Z

Neither of these tools captures a dump of the process. They collect stack information directly from the runtime. However, you won't get as much data fidelity as with a dump, since they only report the stack frames (modules, method names, argument types) for each thread.

If you use dotnet-stack and have feedback, feel free to log issues at https://github.com/dotnet/diagnostics/issues
If you use dotnet-monitor and have feedback, feel free to log issues at https://github.com/dotnet/dotnet-monitor/issues

jhudsoncedaron · 2022-12-09T21:36:13Z

@jander-msft : Apparently these don't exist as libraries. (dotnet tool install isn't something that can be packaged up.)

If they were libraries I'd just do Process.Start(...) to a bundled binary that takes care of the serialization to standard output.
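A minimal sketch of that Process.Start approach, assuming a usable dotnet-stack binary has somehow been bundled next to the application (which is exactly the part still unresolved in this thread). The class name OutOfProcessStacks, its methods, and the tool path are hypothetical; the `report --process-id <pid>` arguments are the dotnet-stack command line.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

// Hypothetical wrapper around a bundled stack-reporting tool.
static class OutOfProcessStacks
{
    // Builds the start info separately so it can be inspected without launching anything.
    public static ProcessStartInfo BuildStartInfo(string toolPath, int targetPid)
    {
        var psi = new ProcessStartInfo(toolPath)
        {
            RedirectStandardOutput = true, // the tool writes the stacks to standard output
            UseShellExecute = false,       // required for output redirection
            CreateNoWindow = true,
        };
        psi.ArgumentList.Add("report");
        psi.ArgumentList.Add("--process-id");
        psi.ArgumentList.Add(targetPid.ToString(CultureInfo.InvariantCulture));
        return psi;
    }

    // Launches the tool and returns everything it printed.
    public static string Run(ProcessStartInfo psi)
    {
        using var process = Process.Start(psi)
            ?? throw new InvalidOperationException("Could not start the stack tool.");
        string output = process.StandardOutput.ReadToEnd(); // read before waiting to avoid a full pipe
        process.WaitForExit();
        return output;
    }
}
```

A caller would pass its own process ID, for example OutOfProcessStacks.Run(OutOfProcessStacks.BuildStartInfo(toolPath, Environment.ProcessId)), where toolPath points at wherever the binary was bundled.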

jander-msft · 2022-12-09T22:09:18Z

dotnet-stack has a direct download option (find "Direct download" in the Install section) that allows you to run a self-extracting framework-dependent executable.

At this time, dotnet-monitor is not offered as such a package (I'll look into how this tool can provide a similar acquisition experience in the future) and requires the .NET SDK to install it. There's probably a way you can install it on one machine using the .NET SDK, zip up the bits, and unzip it on the target machine, but that's a bit unnatural to do.

jhudsoncedaron · 2022-12-09T22:14:50Z

"framework-dependent executable" Can't use. :( I'll gladly withdraw this for a nuget package I can link against though.

epeshk · 2022-12-11T08:58:12Z

For example, Java has Thread.getAllStackTraces(). It is quite a useful API that can save a lot of time when ThreadPool starvation or deadlock issues appear.

Yes, external tools can be used, but these tools must be deployed on the server. And for self-contained apps, that means not only additional tools but also an entire runtime. Maybe an option to include dotnet-stack in the application bundle (as is done for createdump by default) would be useful?

Or it could be done manually once #53834 is done.

jhudsoncedaron · 2022-12-12T15:40:06Z

@epeshk : I have a local solution to #53834 that depends on 1) all exe targets using the exact same runtimeframework version and target RID, 2) all exe targets being built framework independent, and 3) *.deps.json files being generated. I can handle references to different mutually incompatible versions of the same nuget package.

jhudsoncedaron added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Dec 9, 2022

ghost added the untriaged New issue has not been triaged by the area owner label Dec 9, 2022

teo-tsirpanis added the area-System.Diagnostics label Dec 9, 2022

tommcdon added this to the 8.0.0 milestone Dec 13, 2022

ghost removed the untriaged New issue has not been triaged by the area owner label Dec 13, 2022

hoyosjs added the enhancement Product code improvement that does NOT require public API changes/additions label Jan 7, 2023

tommcdon modified the milestones: 8.0.0, 9.0.0 Jul 17, 2023
