Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: improve PPC64.rules for loads and stores as found in encoding/binary ByteOrder interface functions #22496

Closed
laboger opened this issue Oct 30, 2017 · 3 comments

Comments

Projects
None yet
5 participants
@laboger
Copy link
Contributor

commented Oct 30, 2017

Use PPC64.rules to improve performance of statements that do multiple single byte loads or stores on consecutive bytes, such as is found in the encoding/binary ByteOrder interface. Instead do a single multi-byte load or store where possible, such as lhz, lwz, ld, sth, stw, std.

This improves functions that are found in encoding/binary:
Uint16([]byte) uint16
Uint32([]byte) uint32
Uint64([]byte) uint64
PutUint16([]byte, uint16)
PutUint32([]byte, uint32)
PutUint64([]byte, uint64)

The initial change for this will handle little endian cases.

@laboger laboger changed the title cmd/compile: improve PPC64.rules for loads and stores as found in encoding/binary ByteOrder interface functionsntXX cmd/compile: improve PPC64.rules for loads and stores as found in encoding/binary ByteOrder interface functions Oct 30, 2017

@gopherbot

This comment has been minimized.

Copy link

commented Oct 30, 2017

Change https://golang.org/cl/74410 mentions this issue: cmd/compile: add rules to improve consecutive byte loads and stores on ppc64le

gopherbot pushed a commit that referenced this issue Nov 3, 2017

cmd/compile: add rules to improve consecutive byte loads and stores o…
…n ppc64le

This adds new rules to recognize consecutive byte loads and
stores and lowers them to loads and stores such as lhz, lwz, ld,
sth, stw, std. This change only covers the little endian cases
on little endian machines, such as is found in encoding/binary
UintXX or PutUintXX for little endian. Big endian will be done
later.

Updates were also made to binary_test.go to allow the benchmark
for Uint and PutUint to actually use those functions because
the way they were written, those functions were being
optimized out.

Testcases were also added to cmd/compile/internal/gc/asm_test.go.

Updates #22496

The following improvement can be found in golang.org/x/crypto

poly1305:

Benchmark64-16              142           114           -19.72%
Benchmark1K-16              1717          1424          -17.06%
Benchmark64Unaligned-16     142           113           -20.42%
Benchmark1KUnaligned-16     1721          1428          -17.02%

chacha20poly1305:

BenchmarkChacha20Poly1305Open_64-16     1012       885   -12.55%
BenchmarkChacha20Poly1305Seal_64-16     971        836   -13.90%
BenchmarkChacha20Poly1305Open_1350-16   11113      9539  -14.16%
BenchmarkChacha20Poly1305Seal_1350-16   11013      9392  -14.72%
BenchmarkChacha20Poly1305Open_8K-16     61074      53431 -12.51%
BenchmarkChacha20Poly1305Seal_8K-16     61214      54806 -10.47%

Other improvements of around 10% found in crypto/tls.

Results after updating encoding/binary/binary_test.go:

BenchmarkLittleEndianPutUint64-16     1.87      0.93      -50.27%
BenchmarkLittleEndianPutUint32-16     1.19      0.93      -21.85%
BenchmarkLittleEndianPutUint16-16     1.16      1.03      -11.21%

Change-Id: I7bbe2fbcbd11362d58662fecd907a0c07e6ca2fb
Reviewed-on: https://go-review.googlesource.com/74410
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>

laboger added a commit to powertechpreview/go that referenced this issue Feb 2, 2018

cmd/compile: rules to improve performance for consecutive byte loads
This is a backport to IBM's go 1.9 branch of golang issue golang#22496.

Improvements were seen in the std packages binary, crypto/tls,
and golang.org/x pkgs crypto/poly1305, and chacha.

laboger added a commit to powertechpreview/go that referenced this issue Feb 6, 2018

cmd/compile: rules to improve performance of consecutive byte loads
This is a backport of the change for golang issue golang#22496.

This change improves std pkgs binary, crypto/tls
golang.org/x/crypto packages crypto/poly1305 and chacha
by 5-20%.
@gopherbot

This comment has been minimized.

Copy link

commented Mar 2, 2018

Change https://golang.org/cl/98136 mentions this issue: cmd/compile: combine byte loads and stores on ppc64le

@andybons andybons added the NeedsFix label Mar 13, 2018

@andybons andybons added this to the Go1.11 milestone Mar 13, 2018

@bradfitz

This comment has been minimized.

Copy link
Member

commented May 7, 2018

Marking release-blocker (because it has a CL) and @randall77 as owner for review.

@gopherbot gopherbot closed this in 28edaf4 May 8, 2018

@golang golang locked and limited conversation to collaborators May 8, 2019

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.