-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: inline forwarding thunk functions #8421
Comments
I opened issue #4735 last year for the more detailed debug info, for the same purpose. |
I have also run across cases where it would be helpful to inline "forwarding" functions. For instance, consider this pattern: func F() { f() } // well-documented, exported function in x.go func f() // stub for asm implementation in x_amd64.go func f() { // pure-go implementation in x_other.go
// ...
} If this is a very fast function, having a double function-call overhead instead of single is significant. (The alternative is to declare and document F twice.) And if you want to write the pure-Go implementation of f in x.go, so that a test may compare the asm with the pure-go implementation, then that requires another level of forwarding (so triple function-call overhead): // x.go
func F() { f() }
func fgo() { /* ... */ } func f() // stub for asm implementation in x_amd64.go func f() { fgo() } // x_other.go (I also described this issue in this go-nuts thread.) |
There's a forthcoming proposal for mid-stack inlining in #19348 that I believe should address this. |
See also https://golang.org/cl/147361. |
As a followup to that CL I started sending CLs to inline the fast paths of some of the I also tried to apply the same approach to the |
@CAFxX if you mark throw as //go:noinline does that help with the mutual recursion/budget problem? |
@josharian I didn't think of that but alas it doesn't work:
I suppose this is because the check for recursion is done before |
Another thing I noticed, that is only partially related though, is that it seems that no DCE is done before inlining, so things like the following happen (I'm compiling on linux amd64, race mode is disabled). Notice all the "dead" src/math/bits/bits.go:283:6: can inline Len as: func(uint) int { if UintSize == 32 { }; return Len64(uint64(x)) } src/runtime/cgocall.go:176:14: inlining call to dolockOSThread func() { if GOARCH == "wasm" { }; _g_ := getg(); _g_.m.lockedg.set(_g_); _g_.lockedm.set(_g_.m) } src/runtime/stack.go:1277:24: inlining call to stackmapdata func(*stackmap, int32) bitvector { if stackDebug > 0 { }; return bitvector literal } src/runtime/malloc.go:1105:6: can inline nextSample as: func() int32 { if GOOS == "plan9" { }; return fastexprand(MemProfileRate) } src/go/token/position.go:452:15: inlining call to sync.(*RWMutex).RLock method(*sync.RWMutex) func() { if bool(false) { }; if atomic.AddInt32(&sync.rw.readerCount, int32(1)) < int32(0) { sync.runtime_SemacquireMutex(&sync.rw.readerSem, bool(false)) }; if bool(false) { } } src/runtime/type.go:269:20: inlining call to reflectOffsUnlock func() { if raceenabled { }; unlock(&reflectOffs.lock) } (longer output is here: https://gist.github.com/CAFxX/c901a4420216d154144b4ef4b469b2cf) Now, all of these were clearly within the inlining budget, but it's likely that many others would be within the budget if DCE would run before inlining. I'm not sure if there is already a bug for this or if I should file one. |
Yes. I'd need to double-check, but IIRC the recursion check is done by visitBottomUp, which is shared across a few use cases. But we could add a flag to it to break on //go:noinline when using for inlining analysis. This is (probably) a straightforward change. If you'd like, I can look into it. Or you are welcome to.
We do basic DCE before inlining. See e.g. https://go-review.googlesource.com/c/go/+/37499/ and other links from #19699. See also a related discussion in #16871. There are others. The general problem is that inlining happens early, and we don't have much analysis completed to work off of to do a more complete DCE. Ideas for cheaply increasing early DCE coverage are always welcome (cheap both in execution time and code complexity). |
Change https://golang.org/cl/154058 mentions this issue: |
@CAFxX I had a few minutes to kill so I whipped up CL 154058, if you'd like to patch that in and then experiment some with your runtime lock/unlock changes. And it should be relatively clear how to beef up the change if there are mechanisms other than go:noinline that you want to use to avoid a recursive walk. (See inl.go for how to write said mechanisms.) |
DO NOT SUBMIT [Needs tests, compilebench, better write-up] Updates golang#8421 For code: package p //go:noinline func f() { g() } func g() { f() } Before: $ go tool compile -m -m x.go x.go:4:6: cannot inline f: recursive x.go:6:6: cannot inline g: recursive After: $ go tool compile -m -m x.go x.go:6:6: can inline g as: func() { f() } x.go:4:6: cannot inline f: marked go:noinline x.go:4:13: inlining call to g func() { f() } Change-Id: I49b4095451f4f23ca2e0778fdc617d9cba8698b2
This should be fixed with mid-stack inlining. |
@randall77 awesome! Just tested out tip and indeed, the forwarding overhead I mentioned in #8421 (comment) (both the "double" and "triple" versions) are gone. |
DO NOT SUBMIT [Needs tests, compilebench, better write-up] Updates golang#8421 For code: package p //go:noinline func f() { g() } func g() { f() } Before: $ go tool compile -m -m x.go x.go:4:6: cannot inline f: recursive x.go:6:6: cannot inline g: recursive After: $ go tool compile -m -m x.go x.go:6:6: can inline g as: func() { f() } x.go:4:6: cannot inline f: marked go:noinline x.go:4:13: inlining call to g func() { f() } Change-Id: I49b4095451f4f23ca2e0778fdc617d9cba8698b2
The text was updated successfully, but these errors were encountered: