Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile,math: improve performance for some math functions for ppc64x #21390

Closed
laboger opened this issue Aug 10, 2017 · 7 comments

Comments

Projects
None yet
4 participants
@laboger
Copy link
Contributor

commented Aug 10, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go tip

What operating system and processor architecture are you using (go env)?

Ubuntu (or any) ppc64le

Making some improvements to several math functions by writing some in asm and implementing some as intrinsics.

I will be creating multiple CLs to do this, so instead of opening multiple issues, I would like to assign them all to this one.

The first change will be to implement the trunc, floor, and ceil functions as intrinsics.

@ALTree

This comment has been minimized.

Copy link
Member

commented Aug 10, 2017

This is (a subset of) #8913 (math, crypto, hash: Write power64 versions of assembler routines where applicable)

@gopherbot

This comment has been minimized.

Copy link

commented Aug 10, 2017

Change https://golang.org/cl/54654 mentions this issue: cmd/compile: intrinsics for trunc, floor, ceil on ppc64x

gopherbot pushed a commit that referenced this issue Aug 11, 2017

cmd/compile: intrinsics for trunc, floor, ceil on ppc64x
This implements trunc, floor, and ceil in the math package
as intrinsics on ppc64x.  Significant improvement mainly due
to avoiding call overhead of args and return value.

BenchmarkCeil-16                    5.95          0.69          -88.40%
BenchmarkFloor-16                   5.95          0.69          -88.40%
BenchmarkTrunc-16                   5.82          0.69          -88.14%

Updates #21390

Change-Id: I951e182694f6e0c431da79c577272b81fb0ebad0
Reviewed-on: https://go-review.googlesource.com/54654
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
@gopherbot

This comment has been minimized.

Copy link

commented Aug 29, 2017

Change https://golang.org/cl/59932 mentions this issue: cmd/compile, math/bits: intrinsics for rotates on ppc64le

@gopherbot

This comment has been minimized.

Copy link

commented Sep 12, 2017

Change https://golang.org/cl/63290 mentions this issue: cmd/compile,math: improve int<->float conversions on ppc64x

gopherbot pushed a commit that referenced this issue Sep 12, 2017

cmd/compile, math/bits: add rotate rules to PPC64.rules
This adds rules to match the code in math/bits RotateLeft,
RotateLeft32, and RotateLef64 to allow them to be inlined.

The rules are complicated because the code in these function
use different types, and the non-const version of these
shifts generate Mask and Carry instructions that become
subexpressions during the match process.

Also adds a testcase to asm_test.go.

Improvement in math/bits:

BenchmarkRotateLeft-16       1.57     1.32      -15.92%
BenchmarkRotateLeft32-16     1.60     1.37      -14.37%
BenchmarkRotateLeft64-16     1.57     1.32      -15.92%

Updates #21390

Change-Id: Ib6f17669ecc9cab54f18d690be27e2225ca654a4
Reviewed-on: https://go-review.googlesource.com/59932
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>

gopherbot pushed a commit that referenced this issue Sep 14, 2017

cmd/compile,math: improve int<->float conversions on ppc64x
The functions Float64bits and Float64frombits perform
poorly on ppc64x because the int<->float conversions
often result in load and store sequences to handle the
type change. This patch adds more rules to recognize
those sequences and use register to register moves
and avoid unnecessary loads and stores where possible.

There were some existing rules to improve these conversions,
but this provides additional improvements. Included here:

- New instruction FCFIDS to improve on conversion to 32 bit
- Rename Xf2i64 and Xi2f64 as MTVSRD, MFVSRD, to match the asm
- Add rules to lower some of the load/store sequences for
- Added new go asm to ppc64.s testcase.
conversions

Improvements:

BenchmarkAbs-16                2.16          0.93          -56.94%
BenchmarkCopysign-16           2.66          1.18          -55.64%
BenchmarkRound-16              4.82          2.69          -44.19%
BenchmarkSignbit-16            1.71          1.14          -33.33%
BenchmarkFrexp-16              11.4          7.94          -30.35%
BenchmarkLogb-16               10.4          7.34          -29.42%
BenchmarkLdexp-16              15.7          11.2          -28.66%
BenchmarkIlogb-16              10.2          7.32          -28.24%
BenchmarkPowInt-16             69.6          55.9          -19.68%
BenchmarkModf-16               10.1          8.19          -18.91%
BenchmarkLog2-16               17.4          14.3          -17.82%
BenchmarkCbrt-16               45.0          37.3          -17.11%
BenchmarkAtanh-16              57.6          48.3          -16.15%
BenchmarkRemainder-16          76.6          65.4          -14.62%
BenchmarkGamma-16              26.0          22.5          -13.46%
BenchmarkPowFrac-16            197           174           -11.68%
BenchmarkMod-16                112           99.8          -10.89%
BenchmarkAsinh-16              59.9          53.7          -10.35%
BenchmarkAcosh-16              44.8          40.3          -10.04%

Updates #21390

Change-Id: I56cc991fc2e55249d69518d4e1ba76cc23904e35
Reviewed-on: https://go-review.googlesource.com/63290
Reviewed-by: Michael Munday <mike.munday@ibm.com>
@gopherbot

This comment has been minimized.

Copy link

commented Sep 29, 2017

Change https://golang.org/cl/67130 mentions this issue: cmd/compile,math,cmd/internal/obj/ppc64: abs,copysign instrinsics on ppc64x

gopherbot pushed a commit that referenced this issue Oct 30, 2017

cmd/compile,cmd/internal/obj/ppc64: make math.Abs,math.Copysign instr…
…insics on ppc64x

This adds support for math Abs, Copysign to be instrinsics on ppc64x.

New instruction FCPSGN is added to generate fcpsgn. Some new
rules are added to improve the int<->float conversions that are
generated mainly due to the Float64bits and Float64frombits in
the math package. PPC64.rules is also modified as suggested
in the review for CL 63290.

Improvements:
benchmark                           old ns/op     new ns/op     delta
BenchmarkAbs-16                   1.12          0.69          -38.39%
BenchmarkCopysign-16              1.30          0.93          -28.46%
BenchmarkNextafter32-16           9.34          8.05          -13.81%
BenchmarkFrexp-16                 8.81          7.60          -13.73%

Others that used Copysign also saw smaller improvements.

I attempted to make this work using rules since that
seems to be preferred, but due to the use of Float64bits and
Float64frombits in these functions, several rules had to be added and
even then not all cases were matched. Using rules became too
complicated and seemed too fragile for these.

Updates #21390

Change-Id: Ia265da9a18355e08000818a4fba1a40e9e031995
Reviewed-on: https://go-review.googlesource.com/67130
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
@gopherbot

This comment has been minimized.

Copy link

commented Oct 30, 2017

Change https://golang.org/cl/74411 mentions this issue: math: improve dim, min, max, modf for ppc64x

gopherbot pushed a commit that referenced this issue Nov 2, 2017

math: implement asm modf for ppc64x
This change adds an asm implementations modf for ppc64x.

Improvements:

BenchmarkModf-16               7.48          6.26          -16.31%

Updates: #21390

Change-Id: I9c4f3213688e3e8842d050840dc04fc9c0bf6ce4
Reviewed-on: https://go-review.googlesource.com/74411
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
@laboger

This comment has been minimized.

Copy link
Contributor Author

commented Nov 2, 2017

I have merged all the math changes I had planned for go 1.10, so will close this one.

@laboger laboger closed this Nov 2, 2017

@golang golang locked and limited conversation to collaborators Nov 2, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.