Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: mid-stack inlining vs. FuncForPC #29582

Closed
randall77 opened this issue Jan 5, 2019 · 5 comments

Comments

Projects
None yet
5 participants
@randall77
Copy link
Contributor

commented Jan 5, 2019

I've seen lots of instances of code like this:

pc, _, _, ok := runtime.Caller(skip)
if !ok { ... }
fmt.Printf("%s\n", runtime.FuncForPC(pc).Name())

(e.g. https://github.com/stretchr/testify/blob/ffdc059bfe9ce6a4e144ba849dbedead332c6053/mock/mock.go#L317)

This code won't work as expected when mid-stack inlining is enabled. If we have

func main() {
     f() // f will be inlined here
}
func f() {
     g()
}
func g() {
    pc, _, _, _ := runtime.Caller(1)
    fmt.Printf("%s\n", runtime.FuncForPC(pc).Name())
}

Then 1.12 prints main.main, not main.f as would be printed for 1.11.
This is because FuncForPC is specified as follows:

If pc represents multiple functions because of inlining, it returns the *Func describing the outermost function.

The pc returned by runtime.Caller is in main.main on an instruction from the body of main.f. Because of the spec worded as above, we get main.main as a result.

The current "fix" for this is to require everyone to move to using runtime.CallersFrames (see footnote 1) which handles inlined frames correctly. This is the long term solution, but requires users of runtime.Caller{,s} + runtime.FuncForPC to do something active to keep their code working for 1.12.

I propose instead to change how FuncForPC works so the above code does not need a fix. If we instead use this spec:

If pc represents multiple functions because of inlining, it returns the a *Func describing the innermost function, but with an entry of the outermost function.

(outermost -> innermost, plus some weasel words for what Entry() returns for an inlined function.)

This will fix the above code transparently. There are 2 wrinkles to this otherwise elegant solution:

  1. This changes the spec for an exported function.
  2. The runtime doesn't necessarily have a Func to return from FuncForPC for inlined functions.

I don't think number 1 is a huge problem. The only pcs that are exposed by the runtime are those returned by runtime.Caller{,s}. Most inlinings we do in 1.11 are leaf inlinings, which runtime.Caller{,s} can't observe. We also inline functions which call panic. Those runtime.Caller{,s} can observe, but would have to be called from within a deferred handler, which seems rare (and is ugly anyway because things like runtime.gopanic are on the stack).

Of course, someone could get a pc by other means, using unsafe, a profiling library, or who knows where else. Those users might see something unexpected.

I have a CL which will solve number 2. FuncForPC has enough information to construct a Func dynamically for the inlined body. It means FuncForPC might allocate, but otherwise it's not observable by the user (other than the weird "what does Entry() return" problem mentioned above).

footnote 1: The testify package fixed this problem by introducing //go:noinline directives. That doesn't seem to be a good approach in general.

@randall77 randall77 added this to the Go1.12 milestone Jan 5, 2019

@randall77 randall77 self-assigned this Jan 5, 2019

@gopherbot

This comment has been minimized.

Copy link

commented Jan 5, 2019

Change https://golang.org/cl/156364 mentions this issue: runtime: make FuncForPC return the innermost inlined frame

@FiloSottile

This comment has been minimized.

Copy link
Member

commented Jan 7, 2019

If this is to be included in 1.12, we need a decision ASAP.

/cc @rsc @ianlancetaylor @bradfitz

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Jan 7, 2019

I would like to get it in for 1.12 if it is clean and @randall77 agrees.

@randall77

This comment has been minimized.

Copy link
Contributor Author

commented Jan 7, 2019

One more tweak is needed to make runtime.FuncForPC work on the results runtime.Callers like it used to - we need to make the pcs point to the instruction immediately after the call instead of the call itself.
I moved the PC target as part of CL 152537, I'm working on a CL to change it back. Hopefully it will be all ready to go by the end of day. Can we wait until tomorrow at 11:30 for a final decision? I'd like to discuss this issue at the runtime/compiler meeting from 11-11:30.

@gopherbot

This comment has been minimized.

Copy link

commented Jan 7, 2019

Change https://golang.org/cl/156657 mentions this issue: runtime: store incremented PC in result of runtime.Callers

gopherbot pushed a commit that referenced this issue Jan 8, 2019

runtime: store incremented PC in result of runtime.Callers
In 1.11 we stored "return addresses" in the result of runtime.Callers.
I changed that behavior in CL 152537 to store an address in the call
instruction itself. This CL reverts that part of 152537.

The change in 152537 was made because we now store pcs of inline marks
in the result of runtime.Callers as well. This CL will now store the
address of the inline mark + 1 in the results of runtime.Callers, so
that the subsequent -1 done in CallersFrames will pick out the correct
inline mark instruction.

This CL means that the results of runtime.Callers can be passed to
runtime.FuncForPC as they were before. There are a bunch of packages
in the wild that take the results of runtime.Callers, subtract 1, and
then call FuncForPC. This CL keeps that pattern working as it did in
1.11.

The changes to runtime/pprof in this CL are exactly a revert of the
changes to that package in 152537 (except the locForPC comment).

Update #29582

Change-Id: I04d232000fb482f0f0ff6277f8d7b9c72e97eb48
Reviewed-on: https://go-review.googlesource.com/c/156657
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

@gopherbot gopherbot closed this in 956879d Jan 8, 2019

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Feb 26, 2019

log: fix remaining misuse of runtime.Callers/runtime.FuncForPC
Fixes cockroachdb#17770.

This commit fixes the last user of `runtime.Callers` that misused
the stdlib function by translating the PC values it returned directly
into symbolic information (see https://golang.org/pkg/runtime/#Callers).
Go's documentation warns that this is a recipe for disaster when mixed
with mid-stack inlining.

The other concern in cockroachdb#17770 was this comment:
cockroachdb#17770 (comment).
This was discussed in golang/go#29582 and
addressed in golang/go@956879d.

An alternative would be to use `runtime.Caller` here, but that would
force an allocation that was hard-earned in cockroachdb#29017. Instead, this commit
avoids any performance hit.

```
name                 old time/op    new time/op    delta
Header-4                267ns ± 1%     268ns ± 0%    ~     (p=0.584 n=10+10)
VDepthWithVModule-4     260ns ± 3%     255ns ± 1%  -1.87%  (p=0.018 n=10+9)

name                 old alloc/op   new alloc/op   delta
Header-4                0.00B          0.00B         ~     (all equal)
VDepthWithVModule-4     0.00B          0.00B         ~     (all equal)

name                 old allocs/op  new allocs/op  delta
Header-4                 0.00           0.00         ~     (all equal)
VDepthWithVModule-4      0.00           0.00         ~     (all equal)
```

Release note: None

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Feb 26, 2019

log: fix remaining misuse of runtime.Callers/runtime.FuncForPC
Fixes cockroachdb#17770.

This commit fixes the last user of `runtime.Callers` that misused
the stdlib function by translating the PC values it returned directly
into symbolic information (see https://golang.org/pkg/runtime/#Callers).
Go's documentation warns that this is a recipe for disaster when mixed
with mid-stack inlining.

The other concern in cockroachdb#17770 was this comment:
cockroachdb#17770 (comment).
This was discussed in golang/go#29582 and
addressed in golang/go@956879d.

An alternative would be to use `runtime.Caller` here, but that would
force an allocation that was hard-earned in cockroachdb#29017. Instead, this commit
avoids any performance hit.

```
name                 old time/op    new time/op    delta
Header-4                267ns ± 1%     268ns ± 0%    ~     (p=0.584 n=10+10)
VDepthWithVModule-4     260ns ± 3%     255ns ± 1%  -1.87%  (p=0.018 n=10+9)

name                 old alloc/op   new alloc/op   delta
Header-4                0.00B          0.00B         ~     (all equal)
VDepthWithVModule-4     0.00B          0.00B         ~     (all equal)

name                 old allocs/op  new allocs/op  delta
Header-4                 0.00           0.00         ~     (all equal)
VDepthWithVModule-4      0.00           0.00         ~     (all equal)
```

Release note: None

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Feb 26, 2019

log: fix remaining misuse of runtime.Callers/runtime.FuncForPC
Fixes cockroachdb#17770.

This commit fixes the last user of `runtime.Callers` that misused
the stdlib function by translating the PC values it returned directly
into symbolic information (see https://golang.org/pkg/runtime/#Callers).
Go's documentation warns that this is a recipe for disaster when mixed
with mid-stack inlining.

The other concern in cockroachdb#17770 was this comment:
cockroachdb#17770 (comment).
This was discussed in golang/go#29582 and
addressed in golang/go@956879d.

An alternative would be to use `runtime.Caller` here, but that would
force an allocation that was hard-earned in cockroachdb#29017. Instead, this commit
avoids any performance hit.

```
name                 old time/op    new time/op    delta
Header-4                267ns ± 1%     268ns ± 0%    ~     (p=0.584 n=10+10)
VDepthWithVModule-4     260ns ± 3%     255ns ± 1%  -1.87%  (p=0.018 n=10+9)

name                 old alloc/op   new alloc/op   delta
Header-4                0.00B          0.00B         ~     (all equal)
VDepthWithVModule-4     0.00B          0.00B         ~     (all equal)

name                 old allocs/op  new allocs/op  delta
Header-4                 0.00           0.00         ~     (all equal)
VDepthWithVModule-4      0.00           0.00         ~     (all equal)
```

Release note: None

craig bot pushed a commit to cockroachdb/cockroach that referenced this issue Feb 26, 2019

Merge #35014 #35199 #35203
35014: kv: introduce new "max outstanding size" setting to txnPipeliner r=nvanbenschoten a=nvanbenschoten

Fixes #32522.

This change creates a new cluster setting called:
```
kv.transaction.write_pipelining_max_outstanding_size
```

It limits the size in bytes that can be dedicated to tracking in-flight
writes that have been pipelined. Once this limit is hit, not more writes
will be pipelined by a transaction until some of the writes are proven
and removed from the outstanding write set.

This change once again illustrates the need for periodic proving of
outstanding writes. We touch upon that in the type definition's comment
and in #35009.

Release note: None

35199: log: fix remaining misuse of runtime.Callers/runtime.FuncForPC r=nvanbenschoten a=nvanbenschoten

Fixes #17770.

This commit fixes the last user of `runtime.Callers` that misused
the stdlib function by translating the PC values it returned directly
into symbolic information (see https://golang.org/pkg/runtime/#Callers) [1].
Go's documentation warns that this is a recipe for disaster when mixed
with mid-stack inlining.

The other concern in #17770 was this comment: #17770 (comment).
This was discussed in golang/go#29582 and addressed in golang/go@956879d.

An alternative would be to use `runtime.Caller` here, but that would
force an allocation that was hard-earned in #29017. Instead, this commit
avoids any performance hit.

```
name                 old time/op    new time/op    delta
Header-4                267ns ± 1%     268ns ± 0%    ~     (p=0.584 n=10+10)
VDepthWithVModule-4     260ns ± 3%     255ns ± 1%  -1.87%  (p=0.018 n=10+9)

name                 old alloc/op   new alloc/op   delta
Header-4                0.00B          0.00B         ~     (all equal)
VDepthWithVModule-4     0.00B          0.00B         ~     (all equal)

name                 old allocs/op  new allocs/op  delta
Header-4                 0.00           0.00         ~     (all equal)
VDepthWithVModule-4      0.00           0.00         ~     (all equal)
```

[1] I went through and verified that this was still correct.

Release note: None

35203: closedts: react faster to config changes r=danhhz a=tbg

This approximately halves the duration of

`make test PKG=./pkg/ccl/changefeedccl TESTS=/sinkless`,

from ~60s to ~30s.

Touches #34455.

Release note: None

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>

dt added a commit to dt/cockroach that referenced this issue Feb 27, 2019

Merge cockroachdb#35014 cockroachdb#35199 cockroachdb#35203
35014: kv: introduce new "max outstanding size" setting to txnPipeliner r=nvanbenschoten a=nvanbenschoten

Fixes cockroachdb#32522.

This change creates a new cluster setting called:
```
kv.transaction.write_pipelining_max_outstanding_size
```

It limits the size in bytes that can be dedicated to tracking in-flight
writes that have been pipelined. Once this limit is hit, not more writes
will be pipelined by a transaction until some of the writes are proven
and removed from the outstanding write set.

This change once again illustrates the need for periodic proving of
outstanding writes. We touch upon that in the type definition's comment
and in cockroachdb#35009.

Release note: None

35199: log: fix remaining misuse of runtime.Callers/runtime.FuncForPC r=nvanbenschoten a=nvanbenschoten

Fixes cockroachdb#17770.

This commit fixes the last user of `runtime.Callers` that misused
the stdlib function by translating the PC values it returned directly
into symbolic information (see https://golang.org/pkg/runtime/#Callers) [1].
Go's documentation warns that this is a recipe for disaster when mixed
with mid-stack inlining.

The other concern in cockroachdb#17770 was this comment: cockroachdb#17770 (comment).
This was discussed in golang/go#29582 and addressed in golang/go@956879d.

An alternative would be to use `runtime.Caller` here, but that would
force an allocation that was hard-earned in cockroachdb#29017. Instead, this commit
avoids any performance hit.

```
name                 old time/op    new time/op    delta
Header-4                267ns ± 1%     268ns ± 0%    ~     (p=0.584 n=10+10)
VDepthWithVModule-4     260ns ± 3%     255ns ± 1%  -1.87%  (p=0.018 n=10+9)

name                 old alloc/op   new alloc/op   delta
Header-4                0.00B          0.00B         ~     (all equal)
VDepthWithVModule-4     0.00B          0.00B         ~     (all equal)

name                 old allocs/op  new allocs/op  delta
Header-4                 0.00           0.00         ~     (all equal)
VDepthWithVModule-4      0.00           0.00         ~     (all equal)
```

[1] I went through and verified that this was still correct.

Release note: None

35203: closedts: react faster to config changes r=danhhz a=tbg

This approximately halves the duration of

`make test PKG=./pkg/ccl/changefeedccl TESTS=/sinkless`,

from ~60s to ~30s.

Touches cockroachdb#34455.

Release note: None

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>

danhhz pushed a commit to danhhz/cockroach that referenced this issue May 28, 2019

log: fix remaining misuse of runtime.Callers/runtime.FuncForPC
Fixes cockroachdb#17770.

This commit fixes the last user of `runtime.Callers` that misused
the stdlib function by translating the PC values it returned directly
into symbolic information (see https://golang.org/pkg/runtime/#Callers).
Go's documentation warns that this is a recipe for disaster when mixed
with mid-stack inlining.

The other concern in cockroachdb#17770 was this comment:
cockroachdb#17770 (comment).
This was discussed in golang/go#29582 and
addressed in golang/go@956879d.

An alternative would be to use `runtime.Caller` here, but that would
force an allocation that was hard-earned in cockroachdb#29017. Instead, this commit
avoids any performance hit.

```
name                 old time/op    new time/op    delta
Header-4                267ns ± 1%     268ns ± 0%    ~     (p=0.584 n=10+10)
VDepthWithVModule-4     260ns ± 3%     255ns ± 1%  -1.87%  (p=0.018 n=10+9)

name                 old alloc/op   new alloc/op   delta
Header-4                0.00B          0.00B         ~     (all equal)
VDepthWithVModule-4     0.00B          0.00B         ~     (all equal)

name                 old allocs/op  new allocs/op  delta
Header-4                 0.00           0.00         ~     (all equal)
VDepthWithVModule-4      0.00           0.00         ~     (all equal)
```

Release note: None

(sapling split of 6654c81)

danhhz pushed a commit to danhhz/cockroach that referenced this issue May 28, 2019

craig[bot]
Merge cockroachdb#35014 cockroachdb#35199 cockroachdb#35203
35014: kv: introduce new "max outstanding size" setting to txnPipeliner r=nvanbenschoten a=nvanbenschoten

Fixes cockroachdb#32522.

This change creates a new cluster setting called:
```
kv.transaction.write_pipelining_max_outstanding_size
```

It limits the size in bytes that can be dedicated to tracking in-flight
writes that have been pipelined. Once this limit is hit, not more writes
will be pipelined by a transaction until some of the writes are proven
and removed from the outstanding write set.

This change once again illustrates the need for periodic proving of
outstanding writes. We touch upon that in the type definition's comment
and in cockroachdb#35009.

Release note: None

35199: log: fix remaining misuse of runtime.Callers/runtime.FuncForPC r=nvanbenschoten a=nvanbenschoten

Fixes cockroachdb#17770.

This commit fixes the last user of `runtime.Callers` that misused
the stdlib function by translating the PC values it returned directly
into symbolic information (see https://golang.org/pkg/runtime/#Callers) [1].
Go's documentation warns that this is a recipe for disaster when mixed
with mid-stack inlining.

The other concern in cockroachdb#17770 was this comment: cockroachdb#17770 (comment).
This was discussed in golang/go#29582 and addressed in golang/go@956879d.

An alternative would be to use `runtime.Caller` here, but that would
force an allocation that was hard-earned in cockroachdb#29017. Instead, this commit
avoids any performance hit.

```
name                 old time/op    new time/op    delta
Header-4                267ns ± 1%     268ns ± 0%    ~     (p=0.584 n=10+10)
VDepthWithVModule-4     260ns ± 3%     255ns ± 1%  -1.87%  (p=0.018 n=10+9)

name                 old alloc/op   new alloc/op   delta
Header-4                0.00B          0.00B         ~     (all equal)
VDepthWithVModule-4     0.00B          0.00B         ~     (all equal)

name                 old allocs/op  new allocs/op  delta
Header-4                 0.00           0.00         ~     (all equal)
VDepthWithVModule-4      0.00           0.00         ~     (all equal)
```

[1] I went through and verified that this was still correct.

Release note: None

35203: closedts: react faster to config changes r=danhhz a=tbg

This approximately halves the duration of

`make test PKG=./pkg/ccl/changefeedccl TESTS=/sinkless`,

from ~60s to ~30s.

Touches cockroachdb#34455.

Release note: None

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>

(sapling split of 0aecab8)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.