Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: clarify when calling recover stops a panic #34530

Open
go101 opened this issue Sep 25, 2019 · 19 comments

Comments

@go101
Copy link

commented Sep 25, 2019

What version of Go are you using (go version)?

go version go1.13 linux/amd64

Does this issue reproduce with the latest release?

Yes

What did you do?

package main

import "fmt"

func demo() {
	defer func() {
		defer func() {
			// recover panic 2
			fmt.Println("panic", recover(), "is recovered")
		}()

		// recover panic 1
		defer fmt.Println(" (done).")
		defer recover()
		defer fmt.Print("To recover panic 1 ...")

		defer fmt.Println("now, two active panics coexist")
		panic(2) // If this line is commented out, then the program will not crash.
	}()
	panic(1)
}

func main() {
	demo()
}

What did you expect to see?

Not crash.

What did you see instead?

Crashes.

@go101

This comment has been minimized.

Copy link
Author

commented Sep 25, 2019

If this is the intended behavior, then the descriptions in the spec doesn't cover this case.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Sep 25, 2019

@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 26, 2019

Tentatively milestoning for 1.14 on the theory that it may be addressed alongside #34481, but feel free to re-milestone if that is not realistic.

@bcmills bcmills added this to the Go1.14 milestone Sep 26, 2019
@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 26, 2019

@go101, personally I would expect this program to crash with panic 1.

Since both of the recover calls are within nested defer statements, I would expect them to only cover panics that originate within the call to the deferred function.

But certainly a clarification either way would be helpful.

@go101

This comment has been minimized.

Copy link
Author

commented Sep 26, 2019

By the explanations for recover mechanism in spec, the following program has no fundamental differences from the above one. But the following one exits without crashing.

package main

import "fmt"

func demo() {
	defer func() {
		// recover panic 1
		defer fmt.Println(" (done).")
		defer recover()
		defer fmt.Print("To recover panic 1 ...")
		
		defer func() {
			// recover panic 2
			fmt.Println("panic", recover(), "is recovered")
		}()

		defer fmt.Println("now, two active panics coexist")
		panic(2)
	}()
	panic(1)
}

func main() {
	demo()
}

My understanding by reading spec is a recover call will recover the panic in the caller of the caller, which must be deferred, of the recover call. Whether or not the recover call itself is deferred is not important.

@danscales

This comment has been minimized.

Copy link

commented Sep 26, 2019

Yes, the Go language spec is very vague and imprecise on panic/recover and I am hoping to fix it:

https://go-review.googlesource.com/c/go/+/189377

but there's not quite consensus yet.

I don't think there is any disagreement on the behavior of your example (unlike the abort behavior mentioned in #29226 ). A recover only recovers a panic if it is called directly in a defer function that is directly invoked as part of the panicking process of that panic. A recover does not apply and returns nil if it is not called directly in a defer function or if it is called from a defer that was not directly invoked by the panic sequence of the panic (i.e. is nested inside some other defer or function called by the defer).

Your second example seems to be a special little bug/quirk of doing 'defer recover()'. The detection mechanism in the implementation considers that recover is called directly in the containing function, so that recover does apply. Note that recover does not happen if you do

  defer func() {
    recover()
  }

I think the first thing in both these bugs are to get agreement on the spec - both specifying the current behavior that people depend on and/or would generally agree on, and also the behavior (like the abort behavior in #29226 ) that has been there for a while, but has never been specified and people might like to change.

@go101

This comment has been minimized.

Copy link
Author

commented Sep 26, 2019

Yes, this is a little quirk. There is not a way to verify whether or not the recovered value is really 1.
Whatever, I think the behaviors of the above two programs should be always consistent, for there are no fundamental differences between them.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Sep 27, 2019

The difference between the two programs has to do with the behavior of defer recover(). The question is simply whether recover is being called by a deferred function. When you write defer recover() in a normal function, it is not (a deferred call to recover is not a case in which recover is called by a deferred function). But when defer recover() occurs within a function that is itself deferred, then it is.

So, in the first program, the defer recover() sees the panic(2), but since it is not called from a deferred function, it does not stop the panic. That panic continues until the "recover panic 2" deferred function, which recovers it. And nothing recovers the panic(1), so the program crashes.

In the second program, the panic(2) is recovered. Then we run the defer recover() in a deferred function. In this case recover is called from a deferred function--the outer deferred function, not the defer recover() function--so that catches panic(1). And the program succeeds.

@ianlancetaylor ianlancetaylor changed the title runtime: sometimes recover calls fail to work spec: clarify when calling recover stops a panic Sep 27, 2019
@go101

This comment has been minimized.

Copy link
Author

commented Sep 27, 2019

@ianlancetaylor
Thanks for the clarification.

Your explanation makes me more believe that each goroutine maintains a panic stack. Only the top panic in the stack may be recovered. And to be recoverable, the top panic must not be created in runtime.Goexit() calls.

So my current understanding is like this:

Assume the frame depth of the entry call of a goroutine is 1, 
in the execution of the goroutine, when a function call ends 
and the frame depth becomes smaller than the size of the panic stack, 
then the top panic will collapse down (to replace the below panic) 
until the size of stack is the same as the frame depth. 

Each goroutine maintains a panic stack. A panic must be the top one in the
stack and must not be created in a runtime.Goexit call to be recoverable.
A recovered panic will be removed from the stack.

Is the understanding right?

[edit]: there may be some blank slots in the panic stack, or each panic will be associated with a depth property. The depth of a panic can decrease and will never be larger than the current execution frame depth.

@go101

This comment has been minimized.

Copy link
Author

commented Sep 27, 2019

@ianlancetaylor
By your above explanation, I think the current new panic explanations in https://go-review.googlesource.com/c/go/+/189377 still needs a little adjustment.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Sep 27, 2019

Sorry, I don't really understand your frame depth discussion.

I encourage you to comment on the CL where you think it needs adjustment.

@mdempsky

This comment has been minimized.

Copy link
Member

commented Sep 30, 2019

I think this is a really interesting test case. Thanks for reporting it, @go101.

On my phone, so I haven't looked at the Go spec wording or @danscales's suggestion recently, so I'm not sure the right behaviour. From memory though, I can see arguments either way.

@danscales

This comment has been minimized.

Copy link

commented Oct 2, 2019

Thanks for all the examples and discussion. I missed that there was a defer recover() even in the first example. @ianlancetaylor gives a very good explanation of the current behavior of defer recover(). Overall, the meaning of defer recover() isn't intuitively obvious and its actual behavior with the current implementation is confusing. So, we are thinking about banning the use of defer recover() (since it seems likely to be only useful for test code). Instead of defer recover(), you should always at least use defer func() { recover() }() (and of course, it is always recommended to look at the return code of recover()).

With respect to talking about (and spec'ing out) the behavior of panic/recover/recursive panics, it will be best not to use defer recover(), since that will just create more cases that tend to confuse the other issues.

@mdempsky

This comment has been minimized.

Copy link
Member

commented Oct 2, 2019

So, we are thinking about banning the use of defer recover() (since it seems likely to be only useful for test code).

Are we okay breaking Go 1 compat here? There's at least some code that uses defer recover(); e.g., https://github.com/billziss-gh/cgofuse/blob/master/fuse/host.go#L406

@bcmills

This comment has been minimized.

Copy link
Member

commented Oct 2, 2019

Are we okay breaking Go 1 compat here?

There are two changes we can make without breaking compatibility:

  1. We can make defer recover() a vet error everywhere, which would catch most uses during testing but not break uses in transitive dependencies.
  2. We can make defer recover() a compile-time error in any module whose go.mod file specifies go 1.14 or higher, since ~no such module exists yet.
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Oct 2, 2019

That use of defer recover() in fuse is a bug, right? It is in effect a no-op.

@mdempsky

This comment has been minimized.

Copy link
Member

commented Oct 2, 2019

That use of defer recover() in fuse is a bug, right? It is in effect a no-op.

I believe so.

@go101

This comment has been minimized.

Copy link
Author

commented Oct 3, 2019

Personally, I think the cost of either vet or ban defer recover() is higher than making a little more explanations in spec, not only for compatibility problems, but also for sometimes defer recover() is desired (just like the second example above, it is a have-to. [edit]: there are really some workarounds, such as splitting the outer defer function into two. So the main reason is still to keep compatibility.).

[edit 2]: the validity of the defer recover() in the first example depends on whether or not panic 2 will happen. So vet will make some false positive reports (though this is not a problem for vet).

@go101

This comment has been minimized.

Copy link
Author

commented Oct 3, 2019

On the other hand, for consistency, it is not a bad idea to ban discarding returns of built-in functions. Currently, only the returns of copy and recover can be discarded.

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.