Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: differentiate between "user" and "system" throws #51485

Closed
prattmic opened this issue Mar 4, 2022 · 11 comments
Closed

runtime: differentiate between "user" and "system" throws #51485

prattmic opened this issue Mar 4, 2022 · 11 comments
Assignees
Milestone

Comments

@prattmic
Copy link
Member

@prattmic prattmic commented Mar 4, 2022

"User" throws are throws due to some invariant broken by the application. "System" throws are due to some invariant broken by the runtime, environment, etc (i.e., not the fault of the application).

Differentiating these allows:

  • Simpler triage of potential source of issue (Go or user code).
  • We expect users to primarily consume only "user" throws. We could keep those simple, like panic, but add extra information to "system" throws.

cc @golang/runtime @aclements @mknyszek

@prattmic prattmic added this to the Go1.19 milestone Mar 4, 2022
@prattmic prattmic self-assigned this Mar 4, 2022
@beoran
Copy link

@beoran beoran commented Mar 5, 2022

It is already possible to distinguish between the types of panic using a type switch or reflection in the recover. How exactly is this different, and how would it look in code?

@aclements
Copy link
Member

@aclements aclements commented Mar 5, 2022

@beoran, this is referring specifically to runtime throws (aka, fatal panics), not general panics. These indicate a serious issue and halt the program immediately without running defers or giving any chance to recover. The problem is we use these for two distinct purposes: some indicate a problem with the application, like running out of memory or a racy map access that may have corrupted the map, while others indicate a bug in the runtime. The goal of differentiating these is mostly to help the runtime developers detect and debug runtime issues.

@beoran
Copy link

@beoran beoran commented Mar 6, 2022

Ok, thanks for explaining this. Since either kind of fatal error cannot be caught by recover, the only difference in outcome would be in the stack trace, I presume?

@aclements
Copy link
Member

@aclements aclements commented Mar 7, 2022

The stack trace would be slightly different, yes. That helps with triaging issues. For example, we have an internal crash monitoring dashboard and right now it's difficult to focus it on just the "system" throws we're most interested in (as the compiler/runtime team).

We can also print more information for system throws that's of interest to us, but would just clutter up user throws. For example, but default, tracebacks hide most runtime frames. We would probably want to include all runtime frames in system throws.

@gopherbot
Copy link

@gopherbot gopherbot commented Mar 7, 2022

Change https://go.dev/cl/390421 mentions this issue: runtime: record throw type in m.throwing

@gopherbot
Copy link

@gopherbot gopherbot commented Mar 7, 2022

Change https://go.dev/cl/390422 mentions this issue: runtime: always include runtime frames in system throws

@gopherbot
Copy link

@gopherbot gopherbot commented Mar 7, 2022

Change https://go.dev/cl/390420 mentions this issue: runtime: differentiate "user" and "system" throws

@mvdan
Copy link
Member

@mvdan mvdan commented Mar 7, 2022

Would this also fix #46995?

@prattmic
Copy link
Member Author

@prattmic prattmic commented Mar 7, 2022

My CLs above intentionally don't change the behavior of the user throws, like concurrent map writes. But with this split it should be simpler to clean up inconsistencies like #46995.

@gopherbot
Copy link

@gopherbot gopherbot commented Mar 8, 2022

Change https://go.dev/cl/390815 mentions this issue: runtime: simplify most user throw output

@gopherbot
Copy link

@gopherbot gopherbot commented Mar 8, 2022

Change https://go.dev/cl/390814 mentions this issue: runtime: use ABIInternal for most calls to sigtrampgo

gopherbot pushed a commit that referenced this issue Apr 26, 2022
sigtramp on openbsd-arm64 is teetering on the edge of the nosplit stack
limit. Add more headroom by calling sigtrampgo using ABIInternal, which
eliminates a 48-byte ABI wrapper frame.

openbsd-amd64 has slightly more space, but is also close to the limit,
so convert it as well.

Other operating systems don't have it as bad, but many have nearly
identical implementations of sigtramp, so I have converted them as well.

I've omitted darwin-arm64 and solaris, as those are quite different and
would benefit from not needing ifdef for both cases.

For #51485.

Change-Id: I70512645d4208b346a59d5e5d03836a45833b1d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/390814
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
gopherbot pushed a commit that referenced this issue Apr 28, 2022
"User" throws are throws due to some invariant broken by the application.
"System" throws are due to some invariant broken by the runtime,
environment, etc (i.e., not the fault of the application).

This CL sends "user" throws through the new fatal. Currently this
function is identical to throw, but with a different name to clearly
differentiate the throw type in the stack trace, and hopefully be a bit
more clear to users what it means.

This CL changes a few categories of throw to fatal:

1. Concurrent map read/write.
2. Deadlock detection.
3. Unlock of unlocked sync.Mutex.
4. Inconsistent results from syscall.AllThreadsSyscall.

"Thread exhaustion" and "out of memory" (usually address space full)
throws are additional throws that are arguably the fault of user code,
but I've left off for now because there is no specific invariant that
they have broken to get into these states.

For #51485

Change-Id: I713276a6c290fd34a6563e6e9ef378669d74ae32
Reviewed-on: https://go-review.googlesource.com/c/go/+/390420
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
gopherbot pushed a commit that referenced this issue Apr 28, 2022
This gives explicit names to the possible states of throwing (-1, 0, 1).

m.throwing is now one of:

throwTypeOff: not throwing, previously == 0
throwTypeUser: user throw, previously == -1
throwTypeRuntime: runtime throw, previously == 1

For runtime throws, we now always include frame metadata and system
goroutines regardless of GOTRACEBACK to aid in debugging the runtime.

For user throws, we no longer include frame metadata or runtime frames,
unless GOTRACEBACK=system or higher.

For #51485.

Change-Id: If252e2377a0b6385ce7756b937929be4273a56c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/390421
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
@prattmic prattmic self-assigned this Jun 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

No branches or pull requests

5 participants