New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: misleading panic on deferred func #29797

Open
dsymonds opened this Issue Jan 17, 2019 · 15 comments

Comments

Projects
None yet
6 participants
@dsymonds
Copy link
Member

dsymonds commented Jan 17, 2019

What version of Go are you using (go version)?

$ go version
go version devel +75f4aa86ba Thu Sep 27 22:02:08 2018 +0000 linux/amd64

What did you do?

https://play.golang.org/p/3aP7yqgWXdf

What did you expect to see?

The panic stack trace should point to where a nil pointer was dereferenced.

What did you see instead?

The panic stack trace pointed at an innocent line.

This is pretty similar to #14646 and #16011. The panic points at a line that is clearly quite capable of yielding that panic, so it led me on a wild goose chase. It took using gdb (gasp!) to figure out that the defer was involved. Consider this an argument for revisiting the "working as intended" judgment on #16011; it was a bug in my code, but the stack trace was very confusing.

If there was a way to indicate that the deferred function was being executed I would have spotted the issue earlier. The regular stack trace does not indicate that. gdb, however, shows runtime.jmpdefer as the immediate stack frame, which is how I figured it out.

@cespare

This comment has been minimized.

Copy link
Contributor

cespare commented Jan 17, 2019

In #16011, the line number in the panic points off the end of the function (it points at the line containing only }). In this case, it points at the last line of the function. Why is that? Is this one a bug? It seems like having the panic point off the end of the function at least clearly indicates that a defer was involved, even if it doesn't indicate which defer as @dsnet was asking for in #16011.

@randall77

This comment has been minimized.

Copy link
Contributor

randall77 commented Jan 18, 2019

The current behavior is that we return the line number of the return statement that triggered the defer, or the line number of the closing brace of the function if the defer was triggered by the implicit return at the end of the function.

So the line number seems right here, at least with respect to how we currently define "right".

@dsymonds

This comment has been minimized.

Copy link
Member Author

dsymonds commented Jan 18, 2019

I'd be less fussed about the specific line number if there were some other indication in the stack trace that deferred functions were running. In my original example, the panic is largely indistinguishable between the cases of the global something func var being nil and the t function argument being nil.

@randall77

This comment has been minimized.

Copy link
Contributor

randall77 commented Jan 18, 2019

We might be able to show jmpdefer on the stack in this situation. With GOTRACEBACK=crash it shows up.
Our we could put an explicit test in deferreturn and then that function would show up in the trace.

@alexbrainman

This comment has been minimized.

Copy link
Member

alexbrainman commented Jan 18, 2019

@dsymonds welcome to the club #25708. :-)

Alex

@randall77

This comment has been minimized.

Copy link
Contributor

randall77 commented Jan 18, 2019

But, if you point me to the return line, how would I know that the problem is in my defer that could be miles away? What happens if I have 2 defers in the function? Which of the 2 has the problem?
Alex

I'm opposed to just reporting the line number of the defer. If you do:

func f() {
    defer d()
    foo()
    return
}

Currently we report the line number of the return. That at least describes correctly how far execution has made it through f. If we report the line of the defer, it looks like foo hasn't executed yet, when in fact it has.

So I'm opposed to just changing the line number.

In the case of a defer of nil, we could report the jmpdefer frame, which would make it more obvious that the problem was in calling a defer at the end of the function.
In the case of a defer of something not nil, that defer's frame is on the stack, so hopefully it's clear what is going on - "there's no call to Close at that line! Oh, that's a return point, probably I deferred Close somewhere..."

Fundamentally the problem is there are 2 line numbers we'd like to report. Possibly we could print both line numbers?

package main

func main() {
	defer f()
	return
}

func f() {
	panic("foo")
}

Currently this prints:

panic: foo

goroutine 1 [running]:
main.f()
	/Users/khr/gowork/issue29797b.go:9 +0x39
main.main()
	/Users/khr/gowork/issue29797b.go:5 +0x3a

Maybe it could print:

panic: foo

goroutine 1 [running]:
main.f()
	/Users/khr/gowork/issue29797b.go:9 +0x39
main.main(), during return
	defer: /Users/khr/gowork/issue29797b.go:4 +0x??
	return: /Users/khr/gowork/issue29797b.go:5 +0x3a

(Needs a better layout, but that's the idea.)

@randall77

This comment has been minimized.

Copy link
Contributor

randall77 commented Jan 18, 2019

At runtime we'd just need to grab the parent pc to record in the defer structure. It shouldn't be terribly expensive, but it does mean defers are that much slower, when we have #14939 to fix.

@alexbrainman

This comment has been minimized.

Copy link
Member

alexbrainman commented Jan 18, 2019

Maybe it could print:

If you ask me, this new format might confuse people (and maybe some tools that parse stack output). Maybe just keep existing format, but print both return and defer traces?

panic: foo

goroutine 1 [running]:
main.f()
	/Users/khr/gowork/issue29797b.go:9 +0x39
main.main()
	/Users/khr/gowork/issue29797b.go:4 +0x??
main.main()
	/Users/khr/gowork/issue29797b.go:5 +0x3a

Also paging @aarzilli in case it affects delve in some way. @aarzilli do you have any bright suggestions on what to do here? Thank you.

Alex

@aarzilli

This comment has been minimized.

Copy link
Contributor

aarzilli commented Jan 18, 2019

At runtime we'd just need to grab the parent pc to record in the defer structure. It shouldn't be terribly expensive

what's the "parent pc"? the PC of the defer statement is already recorded in _defer. When you encounter runtime.jmpdefer in a traceback you could print that instead and the stacktrace would reflect better what's going on, like in @alexbrainman's example.

@aarzilli

This comment has been minimized.

Copy link
Contributor

aarzilli commented Jan 18, 2019

Although, now that I think about it, in that particular example:

package main

func main() {
	defer f()
	return
}

func f() {
	panic("foo")
}

neither runtime.jmpdefer nor runtime.deferreturn are left on the stack, so it would probably be much harder to do. But it would work for the OP.

@randall77

This comment has been minimized.

Copy link
Contributor

randall77 commented Jan 18, 2019

the PC of the defer statement is already recorded in _defer.

Indeed, so it wouldn't cost any extra at runtime.

neither runtime.jmpdefer nor runtime.deferreturn are left on the stack, so it would probably be much harder to do. But it would work for the OP.

Yes, the nil defer case is easier.
But in the normal case we could detect it because the call site is at the deferreturn point. We might need some extra mechanism to detect that, but it is feasible.

@aarzilli

This comment has been minimized.

Copy link
Contributor

aarzilli commented Jan 18, 2019

But in the normal case we could detect it because the call site is at the deferreturn point

Do you keep a list of those? If you do, I'd like to know because delve uses the disassembler for this (we need them for a different thing).
I don't know if it's worth the binary size increase for this.

@randall77

This comment has been minimized.

Copy link
Contributor

randall77 commented Jan 18, 2019

@aarzilli We don't, but we could keep track. We could keep a linked list in the G of all the frames that are currently in a deferreturn, and possibly what the pc of the defer call was.

@bcmills

This comment has been minimized.

Copy link
Member

bcmills commented Jan 29, 2019

@randall77, does this need to be addressed for 1.12, or should we milestone it to 1.13 or Unplanned?

@randall77

This comment has been minimized.

Copy link
Contributor

randall77 commented Jan 29, 2019

This issue has been present for many releases. Moving to 1.13.

@randall77 randall77 added this to the Go1.13 milestone Jan 29, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment