runtime: performance improvements for memclr on ppc64x #17348

Closed
laboger opened this issue Oct 5, 2016 · 3 comments

@laboger
Contributor

commented Oct 5, 2016

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version devel +a0d83eb Wed Oct 5 12:44:29 2016 -0500 linux/ppc64le

What operating system and processor architecture are you using (go env)?

Ubuntu 16.04 ppc64le

While looking at the performance of the runtime benchmarks, I found that memclr could be improved on ppc64x.

@bradfitz

Member

commented Oct 5, 2016

This needs more details. What is the code for memclr now, and what should it be?

@laboger

Contributor Author

commented Oct 5, 2016

The file to be changed is memclr_ppc64x.s. The change will be similar to what is currently done in memmove_ppc64x.s, where loops are unrolled to improve performance.

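For example, when it can be determined that memory is being cleared in 32-byte chunks, a loop like this:

loop:
std r0, 0(r3)      # r0 holds zero; four 8-byte stores clear 32 bytes
std r0, 8(r3)
std r0, 16(r3)
std r0, 24(r3)
addi r3, r3, 32    # advance to the next 32-byte chunk
bdnz loop          # CTR is assumed to hold the number of chunks

performs much better than:

loop:
stdu r0, 8(r3)     # store 8 bytes and update r3 in a single instruction
bdnz loop          # CTR is assumed to hold the number of doublewords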
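
To illustrate the unrolling idea outside of assembly, here is a minimal Go sketch. It is not the code in memclr_ppc64x.s; the function names clearSimple and clearUnrolled are hypothetical, and the 32-byte chunk size simply mirrors the example above. Both functions clear a byte slice, one 8-byte word per iteration versus four words per iteration, so the unrolled version pays the loop overhead a quarter as often.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// clearSimple zeroes buf one 8-byte word per loop iteration,
// analogous in shape to the single-store loop (hypothetical helper).
func clearSimple(buf []byte) {
	for len(buf) >= 8 {
		binary.LittleEndian.PutUint64(buf, 0)
		buf = buf[8:]
	}
	// Clear the remaining 0..7 bytes one at a time.
	for i := range buf {
		buf[i] = 0
	}
}

// clearUnrolled zeroes buf 32 bytes per loop iteration using four
// 8-byte stores, analogous to the unrolled loop (hypothetical helper).
func clearUnrolled(buf []byte) {
	for len(buf) >= 32 {
		binary.LittleEndian.PutUint64(buf[0:], 0)
		binary.LittleEndian.PutUint64(buf[8:], 0)
		binary.LittleEndian.PutUint64(buf[16:], 0)
		binary.LittleEndian.PutUint64(buf[24:], 0)
		buf = buf[32:]
	}
	// Fall back to the simple loop for the tail shorter than 32 bytes.
	clearSimple(buf)
}

func main() {
	buf := make([]byte, 77)
	for i := range buf {
		buf[i] = 0xFF
	}
	clearUnrolled(buf)

	cleared := true
	for _, b := range buf {
		if b != 0 {
			cleared = false
		}
	}
	fmt.Println("all bytes cleared:", cleared)
}
```

In ordinary Go code there is no need to write this by hand: the compiler recognizes the idiomatic `for i := range buf { buf[i] = 0 }` loop and turns it into a call to the runtime's memclr routine, which is the code this issue is about.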

@gopherbot

commented Oct 5, 2016

CL https://golang.org/cl/30373 mentions this issue.

@gopherbot gopherbot closed this in 3107c91 Oct 5, 2016

ceseo added a commit to powertechpreview/go that referenced this issue Dec 1, 2016

runtime: memclr perf improvements on ppc64x
This updates runtime/memclr_ppc64x.s to improve performance,
by unrolling loops for larger clears.

Fixes golang#17348

benchmark                    old MB/s     new MB/s     speedup
BenchmarkMemclr/5-80         199.71       406.63       2.04x
BenchmarkMemclr/16-80        693.66       1817.41      2.62x
BenchmarkMemclr/64-80        2309.35      5793.34      2.51x
BenchmarkMemclr/256-80       5428.18      14765.81     2.72x
BenchmarkMemclr/4096-80      8611.65      27191.94     3.16x
BenchmarkMemclr/65536-80     8736.69      28604.23     3.27x
BenchmarkMemclr/1M-80        9304.94      27600.09     2.97x
BenchmarkMemclr/4M-80        8705.66      27589.64     3.17x
BenchmarkMemclr/8M-80        8575.74      23631.04     2.76x
BenchmarkMemclr/16M-80       8443.10      19240.68     2.28x
BenchmarkMemclr/64M-80       8390.40      9493.04      1.13x
BenchmarkGoMemclr/5-80       263.05       630.37       2.40x
BenchmarkGoMemclr/16-80      904.33       1148.49      1.27x
BenchmarkGoMemclr/64-80      2830.20      8756.70      3.09x
BenchmarkGoMemclr/256-80     6064.59      20299.46     3.35x

Change-Id: Ic76c9183c8b4129ba3df512ca8b0fe6bd424e088
Reviewed-on: https://go-review.googlesource.com/30373
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Reviewed-by: David Chase <drchase@google.com>

Backport of 3107c91
by Lynn Boger <laboger@linux.vnet.ibm.com>

@golang golang locked and limited conversation to collaborators Oct 5, 2017
