Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: store combine not working on ppc64 and arm64 with global variables #24242

rasky opened this issue Mar 4, 2018 · 4 comments


Copy link

@rasky rasky commented Mar 4, 2018

While migrating the code generation tests for load/store combiners to to the top-level testsuite (fad31e5), I had to disable some tests for arm64 and ppc64le because it looks like that store combining is failing in some situations.

For instance:

func store_le16_idx(b []byte, idx int) {
// amd64:`MOVW\s`
// arm64(DISABLED):`MOVH`,-`MOVB`
// ppc64le(DISABLED):`MOVH\s`
binary.LittleEndian.PutUint16(b[idx:], sink16)

It looks like the bug is related to the fact that the store is made from a global variable rather than a local variable (as was being previously tested).

@rasky rasky added the NeedsFix label Mar 4, 2018

This comment has been minimized.

Copy link
Member Author

@rasky rasky commented Mar 4, 2018


This comment has been minimized.

Copy link

@gopherbot gopherbot commented Mar 5, 2018

Change mentions this issue: cmd/compile/internal/ssa: improve store combine optimization on arm64


This comment has been minimized.

Copy link

@laboger laboger commented Mar 5, 2018

I noticed this regression in combining little endian stores while testing my current change for combining loads and stores in big endian order on ppc64le. I have a fix that I was going to include with the big endian change, but could make it separate if that is preferred.


This comment has been minimized.

Copy link

@gopherbot gopherbot commented Mar 5, 2018

Change mentions this issue: cmd/compile,test: combine byte loads and stores on ppc64le

@gopherbot gopherbot closed this in 0596256 Mar 6, 2018
gopherbot pushed a commit that referenced this issue May 8, 2018
CL 74410 added rules to combine consecutive byte loads and
stores when the byte order was little endian for ppc64le. This
is the corresponding change for bytes that are in big endian order.
These rules are all intended for a little endian target arch.

This adds new testcases in test/codegen/memcombine.go

Fixes #22496
Updates #24242

Benchmark improvement for encoding/binary:
name                      old time/op    new time/op    delta
ReadSlice1000Int32s-16      11.0µs ± 0%     9.0µs ± 0%  -17.47%  (p=0.029 n=4+4)
ReadStruct-16               2.47µs ± 1%    2.48µs ± 0%   +0.67%  (p=0.114 n=4+4)
ReadInts-16                  642ns ± 1%     630ns ± 1%   -2.02%  (p=0.029 n=4+4)
WriteInts-16                 654ns ± 0%     653ns ± 1%   -0.08%  (p=0.629 n=4+4)
WriteSlice1000Int32s-16     8.75µs ± 0%    8.20µs ± 0%   -6.19%  (p=0.029 n=4+4)
PutUint16-16                1.16ns ± 0%    0.93ns ± 0%  -19.83%  (p=0.029 n=4+4)
PutUint32-16                1.16ns ± 0%    0.93ns ± 0%  -19.83%  (p=0.029 n=4+4)
PutUint64-16                1.85ns ± 0%    0.93ns ± 0%  -49.73%  (p=0.029 n=4+4)
LittleEndianPutUint16-16    1.03ns ± 0%    0.93ns ± 0%   -9.71%  (p=0.029 n=4+4)
LittleEndianPutUint32-16    0.93ns ± 0%    0.93ns ± 0%     ~     (all equal)
LittleEndianPutUint64-16    0.93ns ± 0%    0.93ns ± 0%     ~     (all equal)
PutUvarint32-16             43.0ns ± 0%    43.1ns ± 0%   +0.12%  (p=0.429 n=4+4)
PutUvarint64-16              174ns ± 0%     175ns ± 0%   +0.29%  (p=0.429 n=4+4)

Updates made to functions in gcm.go to enable their matching. An existing
testcase prevents these functions from being replaced by those in encoding/binary
due to import dependencies.

Change-Id: Idb3bd1e6e7b12d86cd828fb29cb095848a3e485a
Run-TryBot: Lynn Boger <>
TryBot-Result: Gobot Gobot <>
Reviewed-by: Keith Randall <>
@golang golang locked and limited conversation to collaborators Mar 6, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.