
cmd/compile: optimize unsigned comparisons with zero/one #21439

Open
josharian opened this issue Aug 14, 2017 · 11 comments

Comments


@josharian josharian commented Aug 14, 2017

I started on a minor optimization effort earlier this summer that I won't have time to see through for 1.10. This issue is a snapshot of the work and a description of what to do next, in case anyone else wants to pick an architecture to work on and run with it.

Comparing unsigned ints with zero/one is special and (perhaps surprisingly) not uncommon. One example: Given unsigned x, x > 0 iff x != 0. This is helpful, because on many architectures, x == 0 and x != 0 are faster and/or shorter. For example, on amd64, we can use TEST instead of CMP, and arm64 has dedicated instructions for comparison with zero.
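The equivalence is easy to check directly. Here is a standalone Go sketch (not compiler code) verifying it exhaustively at one width:

```go
package main

import "fmt"

func main() {
	// For an unsigned integer x, x > 0 holds exactly when x != 0,
	// and x <= 0 holds exactly when x == 0.
	for i := 0; i <= 255; i++ {
		x := uint8(i)
		if (x > 0) != (x != 0) || (x <= 0) != (x == 0) {
			fmt.Println("mismatch at", x)
			return
		}
	}
	fmt.Println("equivalences hold for all uint8 values")
}
```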

Here are some generic rewrite rules that I believe to be sound:

(Greater64U x zero:(Const64 [0])) -> (Neq64 x zero)
(Greater32U x zero:(Const32 [0])) -> (Neq32 x zero)
(Greater16U x zero:(Const16 [0])) -> (Neq16 x zero)
(Greater8U  x zero:(Const8  [0])) -> (Neq8  x zero)

(Leq64U x zero:(Const64 [0])) -> (Eq64 x zero)
(Leq32U x zero:(Const32 [0])) -> (Eq32 x zero)
(Leq16U x zero:(Const16 [0])) -> (Eq16 x zero)
(Leq8U  x zero:(Const8  [0])) -> (Eq8  x zero)

(Geq64U _ (Const64 [0])) -> (ConstBool [1])
(Geq32U _ (Const32 [0])) -> (ConstBool [1])
(Geq16U _ (Const16 [0])) -> (ConstBool [1])
(Geq8U  _ (Const8  [0])) -> (ConstBool [1])

(Less64U _ (Const64 [0])) -> (ConstBool [0])
(Less32U _ (Const32 [0])) -> (ConstBool [0])
(Less16U _ (Const16 [0])) -> (ConstBool [0])
(Less8U  _ (Const8  [0])) -> (ConstBool [0])

(Less64U x (Const64 <t> [1])) -> (Eq64 x (Const64 <t> [0]))
(Less32U x (Const32 <t> [1])) -> (Eq32 x (Const32 <t> [0]))
(Less16U x (Const16 <t> [1])) -> (Eq16 x (Const16 <t> [0]))
(Less8U  x (Const8  <t> [1])) -> (Eq8  x (Const8  <t> [0]))

(Geq64U x (Const64 <t> [1])) -> (Neq64 x (Const64 <t> [0]))
(Geq32U x (Const32 <t> [1])) -> (Neq32 x (Const32 <t> [0]))
(Geq16U x (Const16 <t> [1])) -> (Neq16 x (Const16 <t> [0]))
(Geq8U  x (Const8  <t> [1])) -> (Neq8  x (Const8  <t> [0]))
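As a sanity check (again standalone Go, not compiler code), all six rule families above can be verified exhaustively at a single width; the right-hand side of each comparison mirrors the corresponding rewrite target:

```go
package main

import "fmt"

func main() {
	ok := true
	for i := 0; i <= 255; i++ {
		x := uint8(i)
		ok = ok &&
			(x > 0) == (x != 0) && // Greater*U x 0 -> Neq* x 0
			(x <= 0) == (x == 0) && // Leq*U x 0 -> Eq* x 0
			(x >= 0) == true && // Geq*U _ 0 -> ConstBool [1]
			(x < 0) == false && // Less*U _ 0 -> ConstBool [0]
			(x < 1) == (x == 0) && // Less*U x 1 -> Eq* x 0
			(x >= 1) == (x != 0) // Geq*U x 1 -> Neq* x 0
	}
	fmt.Println("all identities hold:", ok)
}
```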

Unfortunately, doing this at the generic level is probably not ideal, since (a) not all architectures handle eq/neq-with-zero specially, and (b) it might interfere with the prove pass.

So the remaining work here is to port these rules to arch-specific rules as applicable, and confirm/disconfirm that they are used and that they are worthwhile.

(Related aside: Why doesn't amd64.rules have rules like (SETNE (CMPQ x zero:(MOVQconst))) -> (SETNE (TESTQ x x))?)

cc @martisch @philhofer


@martisch martisch commented Aug 14, 2017

Interesting (and indeed not uncommon). I had a similar thought the other day about checking and ensuring that the Go gc compiler emits TEST instead of CMP where it is simpler/shorter, but I wasn't sure how much of that was already done.

I'm happy to have a look at this for amd64 and 386 (including the SETcc part).

BTW (this needs a separate issue): we should clear the SETcc target register with an XOR to avoid false dependencies, but we need to be careful not to dirty the flags after the TEST/CMP, and not to clear the register we are actually testing.


@philhofer philhofer commented Aug 14, 2017

I'm happy to take the arm/arm64 parts of this.


@stemar94 stemar94 commented Aug 15, 2017

These might also be related, but much less common:

(Less64 (Sub64 x y) (Const64 [0])) -> (Less64 x y)
(Less32 (Sub32 x y) (Const32 [0])) -> (Less32 x y)
(Less16 (Sub16 x y) (Const16 [0])) -> (Less16 x y)
(Less8 (Sub8 x y) (Const8 [0])) -> (Less8 x y)

(Leq64 (Sub64 x y) (Const64 [0])) -> (Leq64 x y)
(Leq32 (Sub32 x y) (Const32 [0])) -> (Leq32 x y)
(Leq16 (Sub16 x y) (Const16 [0])) -> (Leq16 x y)
(Leq8 (Sub8 x y) (Const8 [0])) -> (Leq8 x y)

(Neq64 (Sub64 x y) (Const64 [0])) -> (Neq64 x y)
(Neq32 (Sub32 x y) (Const32 [0])) -> (Neq32 x y)
(Neq16 (Sub16 x y) (Const16 [0])) -> (Neq16 x y)
(Neq8 (Sub8 x y) (Const8 [0])) -> (Neq8 x y)
(Eq64 (Sub64 x y) (Const64 [0])) -> (Eq64 x y)
(Eq32 (Sub32 x y) (Const32 [0])) -> (Eq32 x y)
(Eq16 (Sub16 x y) (Const16 [0])) -> (Eq16 x y)
(Eq8 (Sub8 x y) (Const8 [0])) -> (Eq8 x y)

(Geq64 (Sub64 x y) (Const64 [0])) -> (Geq64 x y)
(Geq32 (Sub32 x y) (Const32 [0])) -> (Geq32 x y)
(Geq16 (Sub16 x y) (Const16 [0])) -> (Geq16 x y)
(Geq8 (Sub8 x y) (Const8 [0])) -> (Geq8 x y)

(Greater64 (Sub64 x y) (Const64 [0])) -> (Greater64 x y)
(Greater32 (Sub32 x y) (Const32 [0])) -> (Greater32 x y)
(Greater16 (Sub16 x y) (Const16 [0])) -> (Greater16 x y)
(Greater8 (Sub8 x y) (Const8 [0])) -> (Greater8 x y)

I would have a CL ready for these.


@martisch martisch commented Aug 15, 2017

@stemar94 Since overflow in Go is defined behavior and the compiler is not allowed to optimize it away, I think these rules do not always apply:

uint8: ((0 - 1) > 0) != (0 > 1)
int8: ((-128 - 1) > 0) != (-128 > 1)

https://play.golang.org/p/2R2nEnvZej
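The playground program boils down to the following standalone sketch: the subtraction wraps, so the rewritten comparison disagrees with the original.

```go
package main

import "fmt"

func main() {
	// Unsigned counterexample: 0 - 1 wraps to 255, so (x-y) > 0 is true
	// even though x > y is false.
	var x, y uint8 = 0, 1
	fmt.Println((x-y) > 0, x > y) // true false

	// Signed counterexample: -128 - 1 wraps to 127, so (a-b) > 0 is true
	// even though a > b is false.
	var a, b int8 = -128, 1
	fmt.Println((a-b) > 0, a > b) // true false
}
```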


@benshi001 benshi001 commented Oct 12, 2017

I implemented comparison-with-zero optimizations using TST/TEQ/CMN on ARM32 in 1ec78d1.


@gopherbot gopherbot commented Jan 2, 2020

Change https://golang.org/cl/213058 mentions this issue: cmd/compile: optimize unsigned comparisons to 0/1 on amd64

gopherbot pushed a commit that referenced this issue Feb 28, 2020
Plus a bonus optimization I noticed while working on this.

There are no functions (besides the rewrite rules) whose text size
increases as a result of this change.

Updates #21439

The following per-package text size stats were generated by parsing the
output of compiling with -S and summing the function size reported on the
STEXT line. This gives a far more accurate picture of the impact
on generated code than merely looking at the object file size changes
or the resulting binary size changes. The latter are below, for reference.

file                                          before  after   Δ       %       
runtime.s                                     477257  476417  -840    -0.176% 
math.s                                        35985   35976   -9      -0.025% 
vendor/golang.org/x/net/dns/dnsmessage.s      87314   87232   -82     -0.094% 
debug/dwarf.s                                 108444  108432  -12     -0.011% 
regexp.s                                      64535   64467   -68     -0.105% 
internal/xcoff.s                              23175   22945   -230    -0.992% 
cmd/vendor/golang.org/x/arch/arm/armasm.s     45263   45260   -3      -0.007% 
cmd/vendor/golang.org/x/arch/arm64/arm64asm.s 118140  118135  -5      -0.004% 
cmd/internal/obj/arm64.s                      151502  151498  -4      -0.003% 
cmd/compile/internal/ssa.s                    6061483 6063120 +1637   +0.027% 
total                                         9321728 9322112 +384    +0.004% 

file      before    after     Δ       %       
go        15188916  15184820  -4096   -0.027% 
addr2line 4315984   4311888   -4096   -0.095% 
cgo       4836088   4831992   -4096   -0.085% 
compile   24506008  24493720  -12288  -0.050% 
doc       4680952   4676856   -4096   -0.088% 
link      6605336   6601240   -4096   -0.062% 
pprof     14776756  14772660  -4096   -0.028% 
total     135250956 135214092 -36864  -0.027% 

Change-Id: I1243a098a08db452f7d1eb0998e241c9b199e2b4
Reviewed-on: https://go-review.googlesource.com/c/go/+/213058
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>

@shawn-xdji shawn-xdji commented Feb 28, 2020

> I'm happy to take the arm/arm64 parts of this.

Hi @philhofer, are you still working on the arm64 part? Thanks.


@philhofer philhofer commented Feb 28, 2020


@gopherbot gopherbot commented Aug 4, 2020

Change https://golang.org/cl/246617 mentions this issue: cmd/compile: optimize unsigned comparisons to 0/1


@gopherbot gopherbot commented Aug 5, 2020

Change https://golang.org/cl/246857 mentions this issue: cmd/compile: optimize unsigned comparisons to 0/1 on arm64


@gopherbot gopherbot commented Aug 8, 2020

Change https://golang.org/cl/247557 mentions this issue: cmd/compile: optimize unsigned comparisons with 0/1 on wasm

gopherbot pushed a commit that referenced this issue Aug 17, 2020
There are some architecture-independent rules in #21439: an unsigned integer >= 0 is always true, and an unsigned integer < 0 is always false. This CL
adds these optimizations to the generic rules.

Updates #21439

Change-Id: Iec7e3040b761ecb1e60908f764815fdd9bc62495
Reviewed-on: https://go-review.googlesource.com/c/go/+/246617
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Aug 18, 2020
For an unsigned integer, it is useful to convert an order test against 0/1 into an equality test against 0. On arm64, which has compare-with-zero-and-branch instructions, this saves a comparison instruction followed by a conditional branch. For example, for

  if x > 0 { ... } else { ... }

the original version:
  CMP $0, R0
  BLS 9

the optimized version:
  CBZ R0, 8

Updates #21439

Change-Id: Id1de6f865f6aa72c5d45b29f7894818857288425
Reviewed-on: https://go-review.googlesource.com/c/go/+/246857
Reviewed-by: Keith Randall <khr@golang.org>
gopherbot pushed a commit that referenced this issue Aug 22, 2020
Updates #21439

Change-Id: I0fbcde6e0c2fc368fe686b271670f9d8be4a7900
Reviewed-on: https://go-review.googlesource.com/c/go/+/247557
Run-TryBot: Agniva De Sarker <agniva.quicksilver@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Richard Musiol <neelance@gmail.com>