cmd/compile: escape analysis on closures #39511

Closed
twmb opened this issue Jun 10, 2020 · 7 comments

@twmb twmb commented Jun 10, 2020

What version of Go are you using (go version)?

$ go version
go version devel +e92be18fd8 Wed Jun 10 14:56:01 2020 +0000 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/twmb/.cache/go-build"
GOENV="/home/twmb/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/twmb/go/pkg/mod"
GOOS="linux"
GOPATH="/home/twmb/go"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/twmb/go/go.tip"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/twmb/go/go.tip/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/twmb/testing/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build930817391=/tmp/go-build -gno-record-gcc-switches"

What did you do?

package esc

import "testing"

func LoopEach(fn func(int) func(int)) {
	for i := 0; i < 100; i++ {
		inner := fn(i)
		for j := 0; j < 100; j++ {
			inner(j)
		}
	}
}

func BenchmarkLoopEach(b *testing.B) {
	for i := 0; i < b.N; i++ {
		var v int
		LoopEach(func(i int) func(int) {
			return func(j int) {
				v += i
				v += j
			}
		})
	}
}

func BenchmarkLoopEachSaved(b *testing.B) {
	for i := 0; i < b.N; i++ {
		var v int
		var oi int
		r := func(j int) {
			v += oi
			v += j
		}
		LoopEach(func(i int) func(int) {
			oi = i
			return r
		})
	}
}

What did you expect to see?

I expect BenchmarkLoopEach to perform the same as BenchmarkLoopEachSaved. Preferably, I would also like to see BenchmarkLoopEach not allocate.

What did you see instead?

$ go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: junk
BenchmarkLoopEach-12         	   32421	     54038 ns/op	    3208 B/op	     101 allocs/op
BenchmarkLoopEachSaved-12    	   52102	     22321 ns/op	      48 B/op	       3 allocs/op

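In the former, a new closure is allocated on every call to fn, that is, once per i iteration in LoopEach (100 times per benchmark iteration), while v is allocated once per benchmark iteration; together these account for the 101 allocs/op. In the latter, the closure, v, and oi are each allocated once per benchmark iteration, which accounts for the 3 allocs/op.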

go test -gcflags '-m -m' Output
./esc_test.go:19:5: BenchmarkLoopEach.func1.1 capturing by ref: v (addr=true assign=true width=8)
./esc_test.go:19:10: BenchmarkLoopEach.func1.1 capturing by value: i (addr=false assign=false width=8)
./esc_test.go:21:4: BenchmarkLoopEach.func1 capturing by ref: v (addr=true assign=true width=8)
./esc_test.go:31:4: BenchmarkLoopEachSaved.func1 capturing by ref: v (addr=true assign=true width=8)
./esc_test.go:31:9: BenchmarkLoopEachSaved.func1 capturing by ref: oi (addr=true assign=true width=8)
./esc_test.go:35:4: BenchmarkLoopEachSaved.func2 capturing by ref: oi (addr=true assign=true width=8)
./esc_test.go:36:11: BenchmarkLoopEachSaved.func2 capturing by value: r (addr=false assign=false width=8)
./esc_test.go:5:15: fn does not escape
./esc_test.go:18:11: func literal escapes to heap:
./esc_test.go:18:11:   flow: ~r1 = &{storage for func literal}:
./esc_test.go:18:11:     from func literal (spill) at ./esc_test.go:18:11
./esc_test.go:18:11:     from return func literal (return) at ./esc_test.go:18:4
./esc_test.go:16:7: v escapes to heap:
./esc_test.go:16:7:   flow: {storage for func literal} = &v: 
./esc_test.go:16:7:     from func literal (captured by a closure) at ./esc_test.go:18:11
./esc_test.go:16:7:     from v (reference) at ./esc_test.go:19:5
./esc_test.go:14:24: b does not escape
./esc_test.go:16:7: moved to heap: v
./esc_test.go:17:12: func literal does not escape
./esc_test.go:18:11: func literal escapes to heap
./esc_test.go:30:8: func literal escapes to heap:
./esc_test.go:30:8:   flow: r = &{storage for func literal}:
./esc_test.go:30:8:     from func literal (spill) at ./esc_test.go:30:8
./esc_test.go:30:8:     from r := func literal (assign) at ./esc_test.go:30:5
./esc_test.go:30:8:   flow: ~r1 = r:
./esc_test.go:30:8:     from return r (return) at ./esc_test.go:36:4
./esc_test.go:29:7: oi escapes to heap:
./esc_test.go:29:7:   flow: {storage for func literal} = &oi:
./esc_test.go:29:7:     from func literal (captured by a closure) at ./esc_test.go:30:8
./esc_test.go:29:7:     from oi (reference) at ./esc_test.go:31:9
./esc_test.go:28:7: v escapes to heap:
./esc_test.go:28:7:   flow: {storage for func literal} = &v: 
./esc_test.go:28:7:     from func literal (captured by a closure) at ./esc_test.go:30:8
./esc_test.go:28:7:     from v (reference) at ./esc_test.go:31:4
./esc_test.go:26:29: b does not escape
./esc_test.go:28:7: moved to heap: v
./esc_test.go:29:7: moved to heap: oi
./esc_test.go:30:8: func literal escapes to heap
./esc_test.go:34:12: func literal does not escape

@twmb twmb commented Jun 10, 2020

Potentially related to #18300

@ianlancetaylor ianlancetaylor commented Jun 10, 2020

@mdempsky mdempsky commented Jun 10, 2020

There's a general limitation that func f() *T { return &T{...} } has to heap allocate the T object (except when f is inlined). This is a particular instance of that where the T object is a function closure.

I don't think there's anything that can be done specially here, unfortunately.

@mdempsky mdempsky closed this Jun 10, 2020

@twmb twmb commented Jun 10, 2020

I know nothing about compiler optimization, but is it possible to lift the closure in the more-allocating benchmark so that it is declared like the one in the less-allocating benchmark?

@mdempsky mdempsky commented Jun 10, 2020

Theoretically, yes. But cmd/compile doesn't implement any optimizations like that currently.

This is #22081.

@twmb twmb commented Jun 10, 2020

Thanks! Interesting, I thought this would be more similar to for loop variable lifting that I remember @josharian mentioning in the past.

@mdempsky mdempsky commented Jun 10, 2020

You're right that it's conceptually similar. The key difference is that lifting variables across call frames has implications for calling conventions, whereas lifting them out of a loop only affects how that function itself is compiled.
