New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider handling panics of goroutines #10
Comments
@peterbourgon cc. Would you be open to adding this to oklog/run? |
No, recovering from panics would definitely be incorrect in a tool like run.Group. With few exceptions, panics should crash your program. |
Thx for quick response @peterbourgon
Can you give example of those exception? I think I agree that panics should crash your app / microservice. However currently we have case like this: thanos-io/thanos#874 particularly, some nil pointer exception for no really reason caused by this function (not filling a Posting): https://github.com/improbable-eng/thanos/blob/master/pkg/store/bucket.go#L1357:29 The issue is, that sure, panics should crash, but it completely gives almost 0 debuggability for issues like this. We got a gRPC request which nukes the Go binary leaving us with nothing just meaningless stack trace. Only few people can repro this, on few certain queries that touches few certain blocks (: The optional logic for run.Group that recovers panics is something useful here, probably controled by some flag, right? Anyway, we are happy host some wrapper over I believe, that once such issue is fixed we can get our panics to crash app again. ;p |
net/http installs a panic handler with each request-serving goroutine, which I personally wouldn't do if I were designing the package from scratch, but makes reasonable sense. Other than that class of service, no other exception comes to mind.
Adding panic recovery to a run.Group actor yourself is trivial. package main
import (
"context"
"log"
"time"
"github.com/oklog/run"
)
func main() {
var g run.Group
{
ctx, cancel := context.WithCancel(context.Background())
g.Add(func() error {
return foo(ctx)
}, func(error) {
cancel()
})
}
log.Printf("exit: %v", g.Run())
}
func foo(ctx context.Context) error {
oneshot := time.After(time.Second)
for {
select {
case <-oneshot:
panic("kaboom")
case <-ctx.Done():
return ctx.Err()
}
}
}
ctx, cancel := context.WithCancel(context.Background())
g.Add(func() error {
+ defer func() {
+ if x := recover(); x != nil {
+ log.Printf("debug: %v", x)
+ }
+ }()
return foo(ctx)
}, func(error) {
cancel()
})
Is there a reason this won't work? |
It will and that's more or less what we are planning to do however we thought it would be nicer to add this kind of functionality into |
run.Group
spawns one or more goroutines which execute given functions. However, goroutine do not and cannot inherit panic handlers from parent goroutines so a panic() in one of the child goroutines will kill the whole program. Please consider adding a new parameter toGroup.Run()
which will add arecover()
and then return it as an error to the caller. Until then we are forced to add a wrapper function which will do that like here: thanos-io/thanos#932.The text was updated successfully, but these errors were encountered: