Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: unrolled loop results in extra loads/stores (suboptimal schedule?) #33251

Open
jeremyfaller opened this issue Jul 24, 2019 · 4 comments

Comments

@jeremyfaller
Copy link
Contributor

@jeremyfaller jeremyfaller commented Jul 24, 2019

What version of Go are you using (go version)?

go version devel +688f3eacea Mon Jul 22 14:19:18 2019 -0400 darwin/amd64

Does this issue reproduce with the latest release?

reproduces with slightly old sync of HEAD

What operating system and processor architecture are you using (go env)?

go env Output

GO111MODULE=""
GOARCH="amd64"
GOBIN="/Users/jfaller/go/bin"
GOCACHE="/Users/jfaller/Library/Caches/go-build"
GOENV="/Users/jfaller/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/jfaller/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/jfaller/src/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/jfaller/src/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/kt/9h7gfgy955lgbm_cys108l2c002pwh/T/go-build625773973=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

While fooling around with compiler optimization passes, I found that unrolling a dot-product yields code that generates lots of extraneous loads/stores. I've been doing some digging in, and believe the schedule pass in the compiler generates a suboptimal schedule.

What did you expect to see?

I expected relatively straight-line LOAD/LOAD/MUL/ADD code.

What did you see instead?

The compiler generates a ton of extraneous load/stores.

prod.tar.gz

@randall77
Copy link
Contributor

@randall77 randall77 commented Jul 24, 2019

Yes, it looks like the scheduler is too aggressive moving loads earlier. Too many early loads means it has to immediately spill the loads, then restore them.

@randall77 randall77 added this to the Go1.14 milestone Jul 24, 2019
@randall77 randall77 changed the title cmd/compile: unrolled loop results in extra loads/stores (bad schedule?) cmd/compile: unrolled loop results in extra loads/stores (suboptimal schedule?) Jul 24, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@andybons
Copy link
Member

@andybons andybons commented Nov 3, 2020

@jeremyfaller can you outline the exact steps you took to generate the ssa.html file? Thanks :)

@jeremyfaller
Copy link
Contributor Author

@jeremyfaller jeremyfaller commented Nov 3, 2020

It's env variable. You give it the name of the function you want the SSA for:

https://dave.cheney.net/2020/06/19/how-to-dump-the-gossafunc-graph-for-a-method

So for this instance, I think it was GOSSAFUNC=$FUNC go tool compile XXX.go.

When I filed this, I did some looking at this, it looked relatively simple to me. This is from memory, and likely to be wrong, but I think there's a heap in the schedule pass that just needs different priorities for the load/stores. (Or the heap isn't there, and that's what I was going to add.) From there it was just expanding the testing, etc. Seemed like a pretty simple fix.

@jeremyfaller
Copy link
Contributor Author

@jeremyfaller jeremyfaller commented Nov 3, 2020

Yeah, looking at src/cmd/compile/internal/ssa/schedule.go there's a heap. I think I thought it was as simple as turning that ValHeap into using float32 for scores, and just encoding the linenumber in decimal points just gently encourage "earlier in the code" to "earlier to schedule." It probably wasn't the best fix, but it would start the discussion, anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants
You can’t perform that action at this time.