Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: better append of unmodified slices #37694

Open
sylr opened this issue Mar 5, 2020 · 7 comments
Open

cmd/compile: better append of unmodified slices #37694

sylr opened this issue Mar 5, 2020 · 7 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@sylr
Copy link

sylr commented Mar 5, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN="/Users/s.rabot/go/bin"
GOCACHE="/Users/s.rabot/Library/Caches/go-build"
GOENV="/Users/s.rabot/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/s.rabot/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/z0/rcywz4nn3jg6g96h67c_4xzdw31lvl/T/go-build757609694=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

package main

import (
	"testing"
)

func BenchmarkNoCap(b *testing.B) {
	for i := 0; i < b.N; i++ {
		var slice []string

		slice = append(slice, []string{"a", "q", "w"}...)
		slice = append(slice, []string{"z", "s", "x"}...)
	}
}

func BenchmarkCap(b *testing.B) {
	for i := 0; i < b.N; i++ {
		slice := make([]string, 0, 6)

		slice = append(slice, []string{"a", "q", "w"}...)
		slice = append(slice, []string{"z", "s", "x"}...)
	}
}

func BenchmarkCapNoMagicNumber(b *testing.B) {
	for i := 0; i < b.N; i++ {
		base := []string{"a", "q", "w"}
		fixed := []string{"z", "s", "x"}

		slice := make([]string, 0, len(base)+len(fixed))

		slice = append(slice, base...)
		slice = append(slice, fixed...)
	}
}

func BenchmarkCapNoMagicNumberFixedSlices(b *testing.B) {
	for i := 0; i < b.N; i++ {
		base := [...]string{"a", "q", "w"}
		fixed := [...]string{"z", "s", "x"}

		slice := make([]string, 0, len(base)+len(fixed))

		slice = append(slice, base[:]...)
		slice = append(slice, fixed[:]...)
	}
}

What did you expect to see?

I would have expected BenchmarkCapNoMagicNumber and BenchmarkCapNoMagicNumberFixedSlices to have the same throughput.

What did you see instead?

$ go test -bench . -count=3
goos: darwin
goarch: amd64
BenchmarkNoCap-12                          	 7771320	       142 ns/op
BenchmarkNoCap-12                          	 7877163	       142 ns/op
BenchmarkNoCap-12                          	 7857156	       143 ns/op

BenchmarkCap-12                            	59972187	        20.2 ns/op
BenchmarkCap-12                            	58082768	        23.4 ns/op
BenchmarkCap-12                            	44653009	        25.6 ns/op

BenchmarkCapNoMagicNumber-12               	15412473	        74.8 ns/op
BenchmarkCapNoMagicNumber-12               	14540749	        75.1 ns/op
BenchmarkCapNoMagicNumber-12               	15240964	        74.7 ns/op

BenchmarkCapNoMagicNumberFixedSlices-12    	64831053	        21.0 ns/op
BenchmarkCapNoMagicNumberFixedSlices-12    	64266398	        21.4 ns/op
BenchmarkCapNoMagicNumberFixedSlices-12    	60569865	        22.7 ns/op
@randall77 randall77 added this to the Unplanned milestone Mar 5, 2020
@randall77 randall77 changed the title Better inlining of slices ? cmd/compile: better append of sliced arrays Mar 5, 2020
@bcmills
Copy link
Contributor

bcmills commented Mar 5, 2020

@randall77, I think you've got the title backwards. It looks like the sliced array is outperforming the slice-literal version for some reason.

@randall77
Copy link
Contributor

Sorry, yes.

@bcmills bcmills changed the title cmd/compile: better append of sliced arrays cmd/compile: better append of unmodified slices Mar 5, 2020
@renthraysk
Copy link

renthraysk commented Mar 5, 2020

Slow routine allocates because len(base)+len(fixed) of slices isn't determined at compile time, whereas with arrays the expression is constant.

@josharian
Copy link
Contributor

Or to be more precise, during the escape analysis phase, len(base)+len(fixed) isn't known to be a constant. (Later on, SSA figures it out.)

@sylr
Copy link
Author

sylr commented Mar 6, 2020

So both are optimized to the extent of what they can be or is there room for improvement in the slice case ?

@josharian
Copy link
Contributor

It’s definitely possible for them to be improved. The question—which I can’t answer offhand—is whether it is possible to make a small, local improvement that would cover this case, or whether it would require an overhaul. (There are good independent reasons for an overhaul, but what is needed—eliminating one of the compiler’s intermediate representations—is a huge undertaking.)

@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 9, 2020
@bcmills
Copy link
Contributor

bcmills commented Mar 9, 2020

Is this a duplicate of #20533?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

7 participants