Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: concurrent map iter+write fatal doesn't respect GOTRACEBACK #68019

Open
bradfitz opened this issue Jun 15, 2024 · 2 comments
Open

runtime: concurrent map iter+write fatal doesn't respect GOTRACEBACK #68019

bradfitz opened this issue Jun 15, 2024 · 2 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@bradfitz
Copy link
Contributor

bradfitz commented Jun 15, 2024

Go version

go version go1.22.4 linux/amd64

What did you do?

  • Introduce a data race on a map (repro: https://go.dev/play/p/rQ1QQ7PMZ86)
  • Don't have sufficient test coverage that runs concurrently to catch it in CI
  • Ship it to prod
  • Have over a million goroutines
  • Wait an hour....

What did you see happen?

... we hit the race on the map in prod after an hour and 💣

The Go runtime then did its fatal:

fatal error: concurrent map iteration and map write

But then we got 3 GB of GOTRACEBACK stacks from the Go runtime for 1M+ stacks, about 3 GB more than we needed to find the bug, overwhelming our logging system in the process, causing a secondary outage.

GOTRACEBACK docs say:

The failure prints stack traces for all goroutines if there is no current goroutine or the failure is internal to the run-time.

And arguably a map race is internal to the run-time insofar as maps are implemented in the runtime.

But the Go runtime know it's the user's fault; note the "but is used when user code is expected to be at fault for the failure" bit here:

// fatal triggers a fatal error that dumps a stack trace and exits.
//
// fatal is equivalent to throw, but is used when user code is expected to be
// at fault for the failure, such as racing map writes.
//
// fatal does not include runtime frames, system goroutines, or frame metadata
// (fp, sp, pc) in the stack trace unless GOTRACEBACK=system or higher.
//
//go:nosplit
func fatal(s string) {
        // Everything fatal does should be recursively nosplit so it
        // can be called even when it's unsafe to grow the stack.
        systemstack(func() {
                print("fatal error: ")
                printindented(s) // logically printpanicval(s), but avoids convTstring write barrier
                print("\n")
        })

        fatalthrow(throwTypeUser)
}

What did you expect to see?

I'd expect fatalthrow(throwTypeUser) to be treated as a normal panic with GOTRACEBACK=single respected.

Currently it's ignored:

bradfitz@book1pro ~ % GOTRACEBACK=single go run maprace.go 2>&1 | wc -l
     618
bradfitz@book1pro ~ % GOTRACEBACK=none go run maprace.go 2>&1 | wc -l
       2
bradfitz@book1pro ~ % GOTRACEBACK=system go run maprace.go 2>&1 | wc -l
    1684

(At least none works, but that's too quiet, hiding the actual problem).

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jun 15, 2024
@MikeMitchellWebDev
Copy link
Contributor

MikeMitchellWebDev commented Jun 15, 2024

It says the following in runtime1.go

// gotraceback returns the current traceback settings.
//
// If level is 0, suppress all tracebacks.
// If level is 1, show tracebacks, but exclude runtime frames.
// If level is 2, show tracebacks including runtime frames.
// If all is set, print all goroutine stacks. Otherwise, print just the current goroutine.
// If crash is set, crash (core dump, etc) after tracebacking.

@joedian joedian added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 25, 2024
@griesemer griesemer added this to the Go1.24 milestone Jun 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Development

No branches or pull requests

6 participants