cmd/compile: SSA performance inconsistency/regression difference across amd64 CPUs. #16982

@dmitshur

Disclaimer: This is not necessarily an issue; I'm opening this thread to provide information that I hope may be helpful. It contains a microbenchmark that covers only a tiny subset of behavior and is not representative of real-world performance. But there's something unusual about the results, which is why I think reporting it might be useful. Please close if it's not helpful and nothing needs to be done.

I had a little microbenchmark snippet that I previously used to compare gc and GopherJS performance, and I decided to try it on the SSA backend of Go 1.7. I found a surprise: one amd64 computer behaves very differently from all the others I've tried it on, and I'm wondering whether that's caused by a bug somewhere.

What version of Go are you using (go version)?

go version go1.7 darwin/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/Dmitri/Dropbox/Work/2013/GoLanding:/Users/Dmitri/Dropbox/Work/2013/GoLand"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/g3/yrvdj9f55ll7jy7l5ygz82yc0000gn/T/go-build084371455=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"

What did you do?

I ran the following program with and without the SSA backend on two different computers (both have the amd64 CPU architecture).

https://play.golang.org/p/aAM1SuV6U4
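
For readers who cannot open the link, here is a minimal sketch of a benchmark of this kind. It is an assumption, not a copy of the linked program: it sums the Leibniz series for pi in a tight floating-point loop, and the name `approximatePi` is hypothetical. Only the printed messages are modeled on the output shown below; the linked program remains the authoritative version.

```go
package main

import (
	"fmt"
	"time"
)

// approximatePi sums the first n terms of the Leibniz series
// 4 * (1 - 1/3 + 1/5 - 1/7 + ...).
// Hypothetical helper: the linked program may use a different formula.
func approximatePi(n int) float64 {
	sum := 0.0
	sign := 1.0
	denom := 1.0
	for i := 0; i < n; i++ {
		sum += sign / denom
		sign = -sign
		denom += 2
	}
	return 4 * sum
}

func main() {
	const iterations = 1000000000

	fmt.Printf("approximating pi with %d iterations.\n", iterations)
	start := time.Now()
	pi := approximatePi(iterations)
	fmt.Println(pi)
	fmt.Println("total time taken is:", time.Since(start))
}
```

The loop body is a handful of floating-point operations with a dependency on `sum`, so run time is dominated by how the compiler schedules those few instructions, which is the kind of code where a backend change can show up as a large difference.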

Computer A is a MacBook Pro (15-inch, Late 2011), running OS X 10.11.6, with 2.4 GHz Intel Core i7-2760QM CPU @ 2.40GHz x 8.

$ go build -gcflags="-ssa=0" -o /tmp/o && /tmp/o
approximating pi with 1000000000 iterations.
3.1415926545880506
total time taken is: 6.431564409s

$ go build -gcflags="-ssa=1" -o /tmp/o && /tmp/o
approximating pi with 1000000000 iterations.
3.1415926545880506
total time taken is: 6.420316364s

Computer B is a MacBook (Retina, 12-inch, Early 2016), running OS X 10.11.6, with 1.1 GHz Intel Core m3-6Y30 CPU @ 0.90GHz x 4.

$ go build -gcflags="-ssa=0" -o /tmp/o && /tmp/o
approximating pi with 1000000000 iterations.
3.1415926545880506
total time taken is: 2.564973583s

$ go build -gcflags="-ssa=1" -o /tmp/o && /tmp/o
approximating pi with 1000000000 iterations.
3.1415926545880506
total time taken is: 5.771555271s

(There is a variance of about ±5% between individual runs.)

Setting   Computer A      Computer B
SSA=0     6.431564409s    2.564973583s
SSA=1     6.420316364s    5.771555271s

What did you expect to see?

Given that the SSA backend generated code that performed roughly as well as the old backend on computer A, I expected a similar result on computer B.

What did you see instead?

Instead, I saw that on computer B (but not on computer A) enabling SSA reduces performance by a factor of more than two.
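
The size of the slowdown can be checked directly from the timings reported above. The following is a small sketch; the helper name `slowdown` is hypothetical, and the duration strings are copied from the runs in this report.

```go
package main

import (
	"fmt"
	"log"
	"time"
)

// slowdown returns how many times longer the SSA build ran than the
// non-SSA build. Both arguments are duration strings such as "2.5s".
func slowdown(ssaOff, ssaOn string) (float64, error) {
	off, err := time.ParseDuration(ssaOff)
	if err != nil {
		return 0, err
	}
	on, err := time.ParseDuration(ssaOn)
	if err != nil {
		return 0, err
	}
	return on.Seconds() / off.Seconds(), nil
}

func main() {
	// Timings copied from the runs reported in this issue.
	a, err := slowdown("6.431564409s", "6.420316364s")
	if err != nil {
		log.Fatal(err)
	}
	b, err := slowdown("2.564973583s", "5.771555271s")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("computer A: %.2fx\n", a) // prints "computer A: 1.00x"
	fmt.Printf("computer B: %.2fx\n", b) // prints "computer B: 2.25x"
}
```

A ratio of about 2.25 on computer B is far outside the roughly ±5% run-to-run variance, while computer A's ratio is within it.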

Labels: FrozenDueToAge, NeedsFix (the path to resolution is known, but the work has not been done)