Disclaimer: This is not necessarily an issue; I'm opening this thread to provide information that I hope may be helpful. It contains a microbenchmark that is not representative of real-world performance, just a tiny subset of it. But there's something unusual about it, which is why I think there's a chance that reporting it might be helpful. Please close if it's not helpful and nothing needs to be done.
I had a little microbenchmark snippet I used previously to compare gc and GopherJS performance, and I decided to try it on the SSA backend of Go 1.7. I found a surprise where one amd64 computer behaves very differently from all the others I've tried it on, and I'm wondering whether it's caused by a bug somewhere.
What version of Go are you using (go version)?
go version go1.7 darwin/amd64
What operating system and processor architecture are you using (go env)?
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/Dmitri/Dropbox/Work/2013/GoLanding:/Users/Dmitri/Dropbox/Work/2013/GoLand"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/g3/yrvdj9f55ll7jy7l5ygz82yc0000gn/T/go-build084371455=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"
What did you do?
I ran the following program with and without SSA backend on two different computers (both have amd64 CPU architecture).
https://play.golang.org/p/aAM1SuV6U4
Computer A is a MacBook Pro (15-inch, Late 2011), running OS X 10.11.6, with 2.4 GHz Intel Core i7-2760QM CPU @ 2.40GHz x 8.
$ go build -gcflags="-ssa=0" -o /tmp/o && /tmp/o
approximating pi with 1000000000 iterations.
3.1415926545880506
total time taken is: 6.431564409s
$ go build -gcflags="-ssa=1" -o /tmp/o && /tmp/o
approximating pi with 1000000000 iterations.
3.1415926545880506
total time taken is: 6.420316364s
Computer B is a MacBook (Retina, 12-inch, Early 2016), running OS X 10.11.6, with 1.1 GHz Intel Core m3-6Y30 CPU @ 0.90GHz x 4.
$ go build -gcflags="-ssa=0" -o /tmp/o && /tmp/o
approximating pi with 1000000000 iterations.
3.1415926545880506
total time taken is: 2.564973583s
$ go build -gcflags="-ssa=1" -o /tmp/o && /tmp/o
approximating pi with 1000000000 iterations.
3.1415926545880506
total time taken is: 5.771555271s
(There is a variance of about ±5% between individual runs.)
| | Computer A | Computer B |
| --- | --- | --- |
| SSA=0 | 6.431564409s | 2.564973583s |
| SSA=1 | 6.420316364s | 5.771555271s |
What did you expect to see?
Given that the SSA backend generated code that performed roughly equally well on computer A, I expected it to have a similar result on computer B.
What did you see instead?
Instead, I saw that on computer B (but not on computer A) enabling SSA makes the program take more than twice as long to run.