Skip to content

Commit

Permalink
cmd/compile: optimize ARM code with CMN/TST/TEQ
Browse files Browse the repository at this point in the history
CMN/TST/TEQ were supported since ARMv4, which can be used to
simplify comparisons.

This patch implements the optimization and here are the benchmark
results.

1. A special test case got 18.21% improvement.
name                     old time/op    new time/op    delta
TSTTEQ-4                    806µs ± 1%     659µs ± 0%  -18.21%  (p=0.000 n=20+18)
(https://github.com/benshi001/ugo1/blob/master/tstteq_test.go)

2. There is no regression in the compilecmp benchmark.
name        old time/op       new time/op       delta
Template          2.31s ± 1%        2.30s ± 1%    ~     (p=0.661 n=10+9)
Unicode           1.32s ± 3%        1.32s ± 5%    ~     (p=0.280 n=10+10)
GoTypes           7.69s ± 1%        7.65s ± 0%  -0.52%  (p=0.027 n=10+8)
Compiler          36.5s ± 1%        36.4s ± 1%    ~     (p=0.546 n=9+9)
SSA               85.1s ± 2%        84.9s ± 1%    ~     (p=0.529 n=10+10)
Flate             1.43s ± 2%        1.43s ± 2%    ~     (p=0.661 n=10+9)
GoParser          1.81s ± 2%        1.81s ± 1%    ~     (p=0.796 n=10+10)
Reflect           5.10s ± 2%        5.09s ± 1%    ~     (p=0.853 n=10+10)
Tar               2.47s ± 1%        2.48s ± 1%    ~     (p=0.123 n=10+10)
XML               2.59s ± 1%        2.58s ± 1%    ~     (p=0.853 n=10+10)
[Geo mean]        4.78s             4.77s       -0.17%

name        old user-time/op  new user-time/op  delta
Template          2.72s ± 3%        2.73s ± 2%    ~     (p=0.928 n=10+10)
Unicode           1.58s ± 4%        1.60s ± 1%    ~     (p=0.087 n=10+9)
GoTypes           9.41s ± 2%        9.36s ± 1%    ~     (p=0.060 n=10+10)
Compiler          44.4s ± 2%        44.2s ± 2%    ~     (p=0.289 n=10+10)
SSA                110s ± 2%         110s ± 1%    ~     (p=0.739 n=10+10)
Flate             1.67s ± 2%        1.63s ± 3%    ~     (p=0.063 n=10+10)
GoParser          2.12s ± 1%        2.12s ± 2%    ~     (p=0.840 n=10+10)
Reflect           5.94s ± 1%        5.98s ± 1%    ~     (p=0.063 n=9+10)
Tar               3.01s ± 2%        3.02s ± 2%    ~     (p=0.584 n=10+10)
XML               3.04s ± 3%        3.02s ± 2%    ~     (p=0.696 n=10+10)
[Geo mean]        5.73s             5.72s       -0.20%

name        old text-bytes    new text-bytes    delta
HelloSize         579kB ± 0%        579kB ± 0%    ~     (all equal)

name        old data-bytes    new data-bytes    delta
HelloSize        5.46kB ± 0%       5.46kB ± 0%    ~     (all equal)

name        old bss-bytes     new bss-bytes     delta
HelloSize        72.8kB ± 0%       72.8kB ± 0%    ~     (all equal)

name        old exe-bytes     new exe-bytes     delta
HelloSize        1.03MB ± 0%       1.03MB ± 0%    ~     (all equal)

3. There is little change in the go1 benchmark (excluding the noise).
name                     old time/op    new time/op     delta
BinaryTree17-4              40.3s ± 1%      40.6s ± 1%  +0.80%  (p=0.000 n=30+30)
Fannkuch11-4                24.2s ± 1%      24.1s ± 0%    ~     (p=0.093 n=30+30)
FmtFprintfEmpty-4           834ns ± 0%      826ns ± 0%  -0.93%  (p=0.000 n=29+24)
FmtFprintfString-4         1.39µs ± 1%     1.36µs ± 0%  -2.02%  (p=0.000 n=30+30)
FmtFprintfInt-4            1.43µs ± 1%     1.44µs ± 1%    ~     (p=0.155 n=30+29)
FmtFprintfIntInt-4         2.09µs ± 0%     2.11µs ± 0%  +1.16%  (p=0.000 n=28+30)
FmtFprintfPrefixedInt-4    2.33µs ± 1%     2.36µs ± 0%  +1.25%  (p=0.000 n=30+30)
FmtFprintfFloat-4          4.27µs ± 1%     4.32µs ± 1%  +1.27%  (p=0.000 n=30+30)
FmtManyArgs-4              8.18µs ± 0%     8.14µs ± 0%  -0.46%  (p=0.000 n=25+27)
GobDecode-4                 101ms ± 1%      101ms ± 1%    ~     (p=0.182 n=29+29)
GobEncode-4                89.6ms ± 1%     87.8ms ± 2%  -2.02%  (p=0.000 n=30+29)
Gzip-4                      4.07s ± 1%      4.08s ± 1%    ~     (p=0.173 n=30+27)
Gunzip-4                    602ms ± 1%      600ms ± 1%  -0.29%  (p=0.000 n=29+28)
HTTPClientServer-4          679µs ± 4%      683µs ± 3%    ~     (p=0.197 n=30+30)
JSONEncode-4                241ms ± 1%      239ms ± 1%  -0.84%  (p=0.000 n=30+30)
JSONDecode-4                903ms ± 1%      882ms ± 1%  -2.33%  (p=0.000 n=30+30)
Mandelbrot200-4            41.8ms ± 0%     41.8ms ± 0%    ~     (p=0.719 n=30+30)
GoParse-4                  45.5ms ± 1%     45.8ms ± 1%  +0.52%  (p=0.000 n=30+30)
RegexpMatchEasy0_32-4      1.27µs ± 1%     1.27µs ± 0%  -0.60%  (p=0.000 n=30+30)
RegexpMatchEasy0_1K-4      7.77µs ± 6%     7.69µs ± 4%  -0.96%  (p=0.040 n=30+30)
RegexpMatchEasy1_32-4      1.29µs ± 1%     1.28µs ± 1%  -0.54%  (p=0.000 n=30+30)
RegexpMatchEasy1_1K-4      10.3µs ± 6%     10.2µs ± 3%    ~     (p=0.453 n=30+27)
RegexpMatchMedium_32-4     1.98µs ± 1%     2.00µs ± 1%  +0.85%  (p=0.000 n=30+29)
RegexpMatchMedium_1K-4      503µs ± 0%      503µs ± 1%    ~     (p=0.752 n=30+30)
RegexpMatchHard_32-4       27.1µs ± 1%     26.5µs ± 0%  -1.96%  (p=0.000 n=30+24)
RegexpMatchHard_1K-4        809µs ± 1%      799µs ± 1%  -1.29%  (p=0.000 n=29+30)
Revcomp-4                  67.3ms ± 2%     67.2ms ± 1%    ~     (p=0.265 n=29+29)
Template-4                  1.08s ± 1%      1.07s ± 0%  -1.39%  (p=0.000 n=30+22)
TimeParse-4                6.93µs ± 1%     6.96µs ± 1%  +0.40%  (p=0.005 n=30+30)
TimeFormat-4               13.3µs ± 0%     13.3µs ± 1%    ~     (p=0.734 n=30+30)
[Geo mean]                  709µs           707µs       -0.32%

name                     old speed      new speed       delta
GobDecode-4              7.59MB/s ± 1%   7.57MB/s ± 1%    ~     (p=0.145 n=29+29)
GobEncode-4              8.56MB/s ± 1%   8.74MB/s ± 1%  +2.07%  (p=0.000 n=30+29)
Gzip-4                   4.76MB/s ± 1%   4.75MB/s ± 1%  -0.25%  (p=0.037 n=30+30)
Gunzip-4                 32.2MB/s ± 1%   32.3MB/s ± 1%  +0.29%  (p=0.000 n=29+28)
JSONEncode-4             8.04MB/s ± 1%   8.11MB/s ± 1%  +0.85%  (p=0.000 n=30+30)
JSONDecode-4             2.15MB/s ± 1%   2.20MB/s ± 1%  +2.29%  (p=0.000 n=30+30)
GoParse-4                1.27MB/s ± 1%   1.26MB/s ± 1%  -0.73%  (p=0.000 n=30+30)
RegexpMatchEasy0_32-4    25.1MB/s ± 1%   25.3MB/s ± 0%  +0.61%  (p=0.000 n=30+30)
RegexpMatchEasy0_1K-4     131MB/s ± 6%    133MB/s ± 4%  +1.35%  (p=0.009 n=28+30)
RegexpMatchEasy1_32-4    24.9MB/s ± 1%   25.0MB/s ± 1%  +0.54%  (p=0.000 n=30+30)
RegexpMatchEasy1_1K-4    99.2MB/s ± 6%  100.2MB/s ± 3%    ~     (p=0.448 n=30+27)
RegexpMatchMedium_32-4    503kB/s ± 1%    500kB/s ± 0%  -0.66%  (p=0.002 n=30+24)
RegexpMatchMedium_1K-4   2.04MB/s ± 0%   2.04MB/s ± 1%    ~     (p=0.358 n=30+30)
RegexpMatchHard_32-4     1.18MB/s ± 1%   1.20MB/s ± 1%  +1.75%  (p=0.000 n=30+30)
RegexpMatchHard_1K-4     1.26MB/s ± 1%   1.28MB/s ± 1%  +1.42%  (p=0.000 n=30+30)
Revcomp-4                37.8MB/s ± 2%   37.8MB/s ± 1%    ~     (p=0.266 n=29+29)
Template-4               1.80MB/s ± 1%   1.82MB/s ± 1%  +1.46%  (p=0.000 n=30+30)
[Geo mean]               6.91MB/s        6.96MB/s       +0.70%

fixes #21583

Change-Id: I24065a80588ccae7de3ad732a3cfb0026cf7e214
Reviewed-on: https://go-review.googlesource.com/67490
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
  • Loading branch information
benshi001 authored and cherrymui committed Oct 11, 2017
1 parent 4477107 commit 1ec78d1
Show file tree
Hide file tree
Showing 5 changed files with 9,270 additions and 5,294 deletions.
12 changes: 6 additions & 6 deletions src/cmd/compile/internal/arm/ssa.go
Expand Up @@ -473,17 +473,17 @@ func ssaGenValue(s *gc.SSAGenState, v *ssa.Value) {
p := s.Prog(v.Op.Asm())
p.From.Type = obj.TYPE_REG
p.From.Reg = v.Args[0].Reg()
case ssa.OpARMCMPshiftLL:
case ssa.OpARMCMPshiftLL, ssa.OpARMCMNshiftLL, ssa.OpARMTSTshiftLL, ssa.OpARMTEQshiftLL:
genshift(s, v.Op.Asm(), v.Args[0].Reg(), v.Args[1].Reg(), 0, arm.SHIFT_LL, v.AuxInt)
case ssa.OpARMCMPshiftRL:
case ssa.OpARMCMPshiftRL, ssa.OpARMCMNshiftRL, ssa.OpARMTSTshiftRL, ssa.OpARMTEQshiftRL:
genshift(s, v.Op.Asm(), v.Args[0].Reg(), v.Args[1].Reg(), 0, arm.SHIFT_LR, v.AuxInt)
case ssa.OpARMCMPshiftRA:
case ssa.OpARMCMPshiftRA, ssa.OpARMCMNshiftRA, ssa.OpARMTSTshiftRA, ssa.OpARMTEQshiftRA:
genshift(s, v.Op.Asm(), v.Args[0].Reg(), v.Args[1].Reg(), 0, arm.SHIFT_AR, v.AuxInt)
case ssa.OpARMCMPshiftLLreg:
case ssa.OpARMCMPshiftLLreg, ssa.OpARMCMNshiftLLreg, ssa.OpARMTSTshiftLLreg, ssa.OpARMTEQshiftLLreg:
genregshift(s, v.Op.Asm(), v.Args[0].Reg(), v.Args[1].Reg(), v.Args[2].Reg(), 0, arm.SHIFT_LL)
case ssa.OpARMCMPshiftRLreg:
case ssa.OpARMCMPshiftRLreg, ssa.OpARMCMNshiftRLreg, ssa.OpARMTSTshiftRLreg, ssa.OpARMTEQshiftRLreg:
genregshift(s, v.Op.Asm(), v.Args[0].Reg(), v.Args[1].Reg(), v.Args[2].Reg(), 0, arm.SHIFT_LR)
case ssa.OpARMCMPshiftRAreg:
case ssa.OpARMCMPshiftRAreg, ssa.OpARMCMNshiftRAreg, ssa.OpARMTSTshiftRAreg, ssa.OpARMTEQshiftRAreg:
genregshift(s, v.Op.Asm(), v.Args[0].Reg(), v.Args[1].Reg(), v.Args[2].Reg(), 0, arm.SHIFT_AR)
case ssa.OpARMMOVWaddr:
p := s.Prog(arm.AMOVW)
Expand Down
136 changes: 136 additions & 0 deletions src/cmd/compile/internal/ssa/gen/ARM.rules
Expand Up @@ -531,6 +531,9 @@

(CMP x (MOVWconst [c])) -> (CMPconst [c] x)
(CMP (MOVWconst [c]) x) -> (InvertFlags (CMPconst [c] x))
(CMN x (MOVWconst [c])) -> (CMNconst [c] x)
(TST x (MOVWconst [c])) -> (TSTconst [c] x)
(TEQ x (MOVWconst [c])) -> (TEQconst [c] x)

// don't extend after proper load
// MOVWreg instruction is not emitted if src and dst registers are same, but it ensures the type.
Expand Down Expand Up @@ -637,6 +640,17 @@
(CMPconst (MOVWconst [x]) [y]) && int32(x)<int32(y) && uint32(x)>uint32(y) -> (FlagLT_UGT)
(CMPconst (MOVWconst [x]) [y]) && int32(x)>int32(y) && uint32(x)<uint32(y) -> (FlagGT_ULT)
(CMPconst (MOVWconst [x]) [y]) && int32(x)>int32(y) && uint32(x)>uint32(y) -> (FlagGT_UGT)
(CMNconst (MOVWconst [x]) [y]) && int32(x)==int32(-y) -> (FlagEQ)
(CMNconst (MOVWconst [x]) [y]) && int32(x)<int32(-y) && uint32(x)<uint32(-y) -> (FlagLT_ULT)
(CMNconst (MOVWconst [x]) [y]) && int32(x)<int32(-y) && uint32(x)>uint32(-y) -> (FlagLT_UGT)
(CMNconst (MOVWconst [x]) [y]) && int32(x)>int32(-y) && uint32(x)<uint32(-y) -> (FlagGT_ULT)
(CMNconst (MOVWconst [x]) [y]) && int32(x)>int32(-y) && uint32(x)>uint32(-y) -> (FlagGT_UGT)
(TSTconst (MOVWconst [x]) [y]) && int32(x&y)==0 -> (FlagEQ)
(TSTconst (MOVWconst [x]) [y]) && int32(x&y)<0 -> (FlagLT_UGT)
(TSTconst (MOVWconst [x]) [y]) && int32(x&y)>0 -> (FlagGT_UGT)
(TEQconst (MOVWconst [x]) [y]) && int32(x^y)==0 -> (FlagEQ)
(TEQconst (MOVWconst [x]) [y]) && int32(x^y)<0 -> (FlagLT_UGT)
(TEQconst (MOVWconst [x]) [y]) && int32(x^y)>0 -> (FlagGT_UGT)

// other known comparisons
(CMPconst (MOVBUreg _) [c]) && 0xff < c -> (FlagLT_ULT)
Expand Down Expand Up @@ -989,6 +1003,24 @@
(CMP (SRL y z) x) -> (InvertFlags (CMPshiftRLreg x y z))
(CMP x (SRA y z)) -> (CMPshiftRAreg x y z)
(CMP (SRA y z) x) -> (InvertFlags (CMPshiftRAreg x y z))
(TST x (SLLconst [c] y)) -> (TSTshiftLL x y [c])
(TST x (SRLconst [c] y)) -> (TSTshiftRL x y [c])
(TST x (SRAconst [c] y)) -> (TSTshiftRA x y [c])
(TST x (SLL y z)) -> (TSTshiftLLreg x y z)
(TST x (SRL y z)) -> (TSTshiftRLreg x y z)
(TST x (SRA y z)) -> (TSTshiftRAreg x y z)
(TEQ x (SLLconst [c] y)) -> (TEQshiftLL x y [c])
(TEQ x (SRLconst [c] y)) -> (TEQshiftRL x y [c])
(TEQ x (SRAconst [c] y)) -> (TEQshiftRA x y [c])
(TEQ x (SLL y z)) -> (TEQshiftLLreg x y z)
(TEQ x (SRL y z)) -> (TEQshiftRLreg x y z)
(TEQ x (SRA y z)) -> (TEQshiftRAreg x y z)
(CMN x (SLLconst [c] y)) -> (CMNshiftLL x y [c])
(CMN x (SRLconst [c] y)) -> (CMNshiftRL x y [c])
(CMN x (SRAconst [c] y)) -> (CMNshiftRA x y [c])
(CMN x (SLL y z)) -> (CMNshiftLLreg x y z)
(CMN x (SRL y z)) -> (CMNshiftRLreg x y z)
(CMN x (SRA y z)) -> (CMNshiftRAreg x y z)

// prefer *const ops to *shift ops
(ADDshiftLL (MOVWconst [c]) x [d]) -> (ADDconst [c] (SLLconst <x.Type> x [d]))
Expand Down Expand Up @@ -1031,6 +1063,15 @@
(CMPshiftLL (MOVWconst [c]) x [d]) -> (InvertFlags (CMPconst [c] (SLLconst <x.Type> x [d])))
(CMPshiftRL (MOVWconst [c]) x [d]) -> (InvertFlags (CMPconst [c] (SRLconst <x.Type> x [d])))
(CMPshiftRA (MOVWconst [c]) x [d]) -> (InvertFlags (CMPconst [c] (SRAconst <x.Type> x [d])))
(TSTshiftLL (MOVWconst [c]) x [d]) -> (TSTconst [c] (SLLconst <x.Type> x [d]))
(TSTshiftRL (MOVWconst [c]) x [d]) -> (TSTconst [c] (SRLconst <x.Type> x [d]))
(TSTshiftRA (MOVWconst [c]) x [d]) -> (TSTconst [c] (SRAconst <x.Type> x [d]))
(TEQshiftLL (MOVWconst [c]) x [d]) -> (TEQconst [c] (SLLconst <x.Type> x [d]))
(TEQshiftRL (MOVWconst [c]) x [d]) -> (TEQconst [c] (SRLconst <x.Type> x [d]))
(TEQshiftRA (MOVWconst [c]) x [d]) -> (TEQconst [c] (SRAconst <x.Type> x [d]))
(CMNshiftLL (MOVWconst [c]) x [d]) -> (CMNconst [c] (SLLconst <x.Type> x [d]))
(CMNshiftRL (MOVWconst [c]) x [d]) -> (CMNconst [c] (SRLconst <x.Type> x [d]))
(CMNshiftRA (MOVWconst [c]) x [d]) -> (CMNconst [c] (SRAconst <x.Type> x [d]))

(ADDshiftLLreg (MOVWconst [c]) x y) -> (ADDconst [c] (SLL <x.Type> x y))
(ADDshiftRLreg (MOVWconst [c]) x y) -> (ADDconst [c] (SRL <x.Type> x y))
Expand Down Expand Up @@ -1071,6 +1112,15 @@
(CMPshiftLLreg (MOVWconst [c]) x y) -> (InvertFlags (CMPconst [c] (SLL <x.Type> x y)))
(CMPshiftRLreg (MOVWconst [c]) x y) -> (InvertFlags (CMPconst [c] (SRL <x.Type> x y)))
(CMPshiftRAreg (MOVWconst [c]) x y) -> (InvertFlags (CMPconst [c] (SRA <x.Type> x y)))
(TSTshiftLLreg (MOVWconst [c]) x y) -> (TSTconst [c] (SLL <x.Type> x y))
(TSTshiftRLreg (MOVWconst [c]) x y) -> (TSTconst [c] (SRL <x.Type> x y))
(TSTshiftRAreg (MOVWconst [c]) x y) -> (TSTconst [c] (SRA <x.Type> x y))
(TEQshiftLLreg (MOVWconst [c]) x y) -> (TEQconst [c] (SLL <x.Type> x y))
(TEQshiftRLreg (MOVWconst [c]) x y) -> (TEQconst [c] (SRL <x.Type> x y))
(TEQshiftRAreg (MOVWconst [c]) x y) -> (TEQconst [c] (SRA <x.Type> x y))
(CMNshiftLLreg (MOVWconst [c]) x y) -> (CMNconst [c] (SLL <x.Type> x y))
(CMNshiftRLreg (MOVWconst [c]) x y) -> (CMNconst [c] (SRL <x.Type> x y))
(CMNshiftRAreg (MOVWconst [c]) x y) -> (CMNconst [c] (SRA <x.Type> x y))

// constant folding in *shift ops
(ADDshiftLL x (MOVWconst [c]) [d]) -> (ADDconst x [int64(uint32(c)<<uint64(d))])
Expand Down Expand Up @@ -1119,6 +1169,15 @@
(CMPshiftLL x (MOVWconst [c]) [d]) -> (CMPconst x [int64(uint32(c)<<uint64(d))])
(CMPshiftRL x (MOVWconst [c]) [d]) -> (CMPconst x [int64(uint32(c)>>uint64(d))])
(CMPshiftRA x (MOVWconst [c]) [d]) -> (CMPconst x [int64(int32(c)>>uint64(d))])
(TSTshiftLL x (MOVWconst [c]) [d]) -> (TSTconst x [int64(uint32(c)<<uint64(d))])
(TSTshiftRL x (MOVWconst [c]) [d]) -> (TSTconst x [int64(uint32(c)>>uint64(d))])
(TSTshiftRA x (MOVWconst [c]) [d]) -> (TSTconst x [int64(int32(c)>>uint64(d))])
(TEQshiftLL x (MOVWconst [c]) [d]) -> (TEQconst x [int64(uint32(c)<<uint64(d))])
(TEQshiftRL x (MOVWconst [c]) [d]) -> (TEQconst x [int64(uint32(c)>>uint64(d))])
(TEQshiftRA x (MOVWconst [c]) [d]) -> (TEQconst x [int64(int32(c)>>uint64(d))])
(CMNshiftLL x (MOVWconst [c]) [d]) -> (CMNconst x [int64(uint32(c)<<uint64(d))])
(CMNshiftRL x (MOVWconst [c]) [d]) -> (CMNconst x [int64(uint32(c)>>uint64(d))])
(CMNshiftRA x (MOVWconst [c]) [d]) -> (CMNconst x [int64(int32(c)>>uint64(d))])

(ADDshiftLLreg x y (MOVWconst [c])) -> (ADDshiftLL x y [c])
(ADDshiftRLreg x y (MOVWconst [c])) -> (ADDshiftRL x y [c])
Expand Down Expand Up @@ -1165,6 +1224,15 @@
(CMPshiftLLreg x y (MOVWconst [c])) -> (CMPshiftLL x y [c])
(CMPshiftRLreg x y (MOVWconst [c])) -> (CMPshiftRL x y [c])
(CMPshiftRAreg x y (MOVWconst [c])) -> (CMPshiftRA x y [c])
(TSTshiftLLreg x y (MOVWconst [c])) -> (TSTshiftLL x y [c])
(TSTshiftRLreg x y (MOVWconst [c])) -> (TSTshiftRL x y [c])
(TSTshiftRAreg x y (MOVWconst [c])) -> (TSTshiftRA x y [c])
(TEQshiftLLreg x y (MOVWconst [c])) -> (TEQshiftLL x y [c])
(TEQshiftRLreg x y (MOVWconst [c])) -> (TEQshiftRL x y [c])
(TEQshiftRAreg x y (MOVWconst [c])) -> (TEQshiftRA x y [c])
(CMNshiftLLreg x y (MOVWconst [c])) -> (CMNshiftLL x y [c])
(CMNshiftRLreg x y (MOVWconst [c])) -> (CMNshiftRL x y [c])
(CMNshiftRAreg x y (MOVWconst [c])) -> (CMNshiftRA x y [c])

// Generate rotates
(ADDshiftLL [c] (SRLconst x [32-c]) x) -> (SRRconst [32-c] x)
Expand Down Expand Up @@ -1294,3 +1362,71 @@
// bit extraction
(SRAconst (SLLconst x [c]) [d]) && objabi.GOARM==7 && uint64(d)>=uint64(c) && uint64(d)<=31 -> (BFX [(d-c)|(32-d)<<8] x)
(SRLconst (SLLconst x [c]) [d]) && objabi.GOARM==7 && uint64(d)>=uint64(c) && uint64(d)<=31 -> (BFXU [(d-c)|(32-d)<<8] x)

// comparison simplification
(CMP x (RSBconst [0] y)) -> (CMN x y)
(CMN x (RSBconst [0] y)) -> (CMP x y)
(EQ (CMPconst [0] (SUB x y)) yes no) -> (EQ (CMP x y) yes no)
(EQ (CMPconst [0] (SUBconst [c] x)) yes no) -> (EQ (CMPconst [c] x) yes no)
(EQ (CMPconst [0] (SUBshiftLL x y [c])) yes no) -> (EQ (CMPshiftLL x y [c]) yes no)
(EQ (CMPconst [0] (SUBshiftRL x y [c])) yes no) -> (EQ (CMPshiftRL x y [c]) yes no)
(EQ (CMPconst [0] (SUBshiftRA x y [c])) yes no) -> (EQ (CMPshiftRA x y [c]) yes no)
(EQ (CMPconst [0] (SUBshiftLLreg x y z)) yes no) -> (EQ (CMPshiftLLreg x y z) yes no)
(EQ (CMPconst [0] (SUBshiftRLreg x y z)) yes no) -> (EQ (CMPshiftRLreg x y z) yes no)
(EQ (CMPconst [0] (SUBshiftRAreg x y z)) yes no) -> (EQ (CMPshiftRAreg x y z) yes no)
(NE (CMPconst [0] (SUB x y)) yes no) -> (NE (CMP x y) yes no)
(NE (CMPconst [0] (SUBconst [c] x)) yes no) -> (NE (CMPconst [c] x) yes no)
(NE (CMPconst [0] (SUBshiftLL x y [c])) yes no) -> (NE (CMPshiftLL x y [c]) yes no)
(NE (CMPconst [0] (SUBshiftRL x y [c])) yes no) -> (NE (CMPshiftRL x y [c]) yes no)
(NE (CMPconst [0] (SUBshiftRA x y [c])) yes no) -> (NE (CMPshiftRA x y [c]) yes no)
(NE (CMPconst [0] (SUBshiftLLreg x y z)) yes no) -> (NE (CMPshiftLLreg x y z) yes no)
(NE (CMPconst [0] (SUBshiftRLreg x y z)) yes no) -> (NE (CMPshiftRLreg x y z) yes no)
(NE (CMPconst [0] (SUBshiftRAreg x y z)) yes no) -> (NE (CMPshiftRAreg x y z) yes no)
(EQ (CMPconst [0] (ADD x y)) yes no) -> (EQ (CMN x y) yes no)
(EQ (CMPconst [0] (ADDconst [c] x)) yes no) -> (EQ (CMNconst [c] x) yes no)
(EQ (CMPconst [0] (ADDshiftLL x y [c])) yes no) -> (EQ (CMNshiftLL x y [c]) yes no)
(EQ (CMPconst [0] (ADDshiftRL x y [c])) yes no) -> (EQ (CMNshiftRL x y [c]) yes no)
(EQ (CMPconst [0] (ADDshiftRA x y [c])) yes no) -> (EQ (CMNshiftRA x y [c]) yes no)
(EQ (CMPconst [0] (ADDshiftLLreg x y z)) yes no) -> (EQ (CMNshiftLLreg x y z) yes no)
(EQ (CMPconst [0] (ADDshiftRLreg x y z)) yes no) -> (EQ (CMNshiftRLreg x y z) yes no)
(EQ (CMPconst [0] (ADDshiftRAreg x y z)) yes no) -> (EQ (CMNshiftRAreg x y z) yes no)
(NE (CMPconst [0] (ADD x y)) yes no) -> (NE (CMN x y) yes no)
(NE (CMPconst [0] (ADDconst [c] x)) yes no) -> (NE (CMNconst [c] x) yes no)
(NE (CMPconst [0] (ADDshiftLL x y [c])) yes no) -> (NE (CMNshiftLL x y [c]) yes no)
(NE (CMPconst [0] (ADDshiftRL x y [c])) yes no) -> (NE (CMNshiftRL x y [c]) yes no)
(NE (CMPconst [0] (ADDshiftRA x y [c])) yes no) -> (NE (CMNshiftRA x y [c]) yes no)
(NE (CMPconst [0] (ADDshiftLLreg x y z)) yes no) -> (NE (CMNshiftLLreg x y z) yes no)
(NE (CMPconst [0] (ADDshiftRLreg x y z)) yes no) -> (NE (CMNshiftRLreg x y z) yes no)
(NE (CMPconst [0] (ADDshiftRAreg x y z)) yes no) -> (NE (CMNshiftRAreg x y z) yes no)
(EQ (CMPconst [0] (AND x y)) yes no) -> (EQ (TST x y) yes no)
(EQ (CMPconst [0] (ANDconst [c] x)) yes no) -> (EQ (TSTconst [c] x) yes no)
(EQ (CMPconst [0] (ANDshiftLL x y [c])) yes no) -> (EQ (TSTshiftLL x y [c]) yes no)
(EQ (CMPconst [0] (ANDshiftRL x y [c])) yes no) -> (EQ (TSTshiftRL x y [c]) yes no)
(EQ (CMPconst [0] (ANDshiftRA x y [c])) yes no) -> (EQ (TSTshiftRA x y [c]) yes no)
(EQ (CMPconst [0] (ANDshiftLLreg x y z)) yes no) -> (EQ (TSTshiftLLreg x y z) yes no)
(EQ (CMPconst [0] (ANDshiftRLreg x y z)) yes no) -> (EQ (TSTshiftRLreg x y z) yes no)
(EQ (CMPconst [0] (ANDshiftRAreg x y z)) yes no) -> (EQ (TSTshiftRAreg x y z) yes no)
(NE (CMPconst [0] (AND x y)) yes no) -> (NE (TST x y) yes no)
(NE (CMPconst [0] (ANDconst [c] x)) yes no) -> (NE (TSTconst [c] x) yes no)
(NE (CMPconst [0] (ANDshiftLL x y [c])) yes no) -> (NE (TSTshiftLL x y [c]) yes no)
(NE (CMPconst [0] (ANDshiftRL x y [c])) yes no) -> (NE (TSTshiftRL x y [c]) yes no)
(NE (CMPconst [0] (ANDshiftRA x y [c])) yes no) -> (NE (TSTshiftRA x y [c]) yes no)
(NE (CMPconst [0] (ANDshiftLLreg x y z)) yes no) -> (NE (TSTshiftLLreg x y z) yes no)
(NE (CMPconst [0] (ANDshiftRLreg x y z)) yes no) -> (NE (TSTshiftRLreg x y z) yes no)
(NE (CMPconst [0] (ANDshiftRAreg x y z)) yes no) -> (NE (TSTshiftRAreg x y z) yes no)
(EQ (CMPconst [0] (XOR x y)) yes no) -> (EQ (TEQ x y) yes no)
(EQ (CMPconst [0] (XORconst [c] x)) yes no) -> (EQ (TEQconst [c] x) yes no)
(EQ (CMPconst [0] (XORshiftLL x y [c])) yes no) -> (EQ (TEQshiftLL x y [c]) yes no)
(EQ (CMPconst [0] (XORshiftRL x y [c])) yes no) -> (EQ (TEQshiftRL x y [c]) yes no)
(EQ (CMPconst [0] (XORshiftRA x y [c])) yes no) -> (EQ (TEQshiftRA x y [c]) yes no)
(EQ (CMPconst [0] (XORshiftLLreg x y z)) yes no) -> (EQ (TEQshiftLLreg x y z) yes no)
(EQ (CMPconst [0] (XORshiftRLreg x y z)) yes no) -> (EQ (TEQshiftRLreg x y z) yes no)
(EQ (CMPconst [0] (XORshiftRAreg x y z)) yes no) -> (EQ (TEQshiftRAreg x y z) yes no)
(NE (CMPconst [0] (XOR x y)) yes no) -> (NE (TEQ x y) yes no)
(NE (CMPconst [0] (XORconst [c] x)) yes no) -> (NE (TEQconst [c] x) yes no)
(NE (CMPconst [0] (XORshiftLL x y [c])) yes no) -> (NE (TEQshiftLL x y [c]) yes no)
(NE (CMPconst [0] (XORshiftRL x y [c])) yes no) -> (NE (TEQshiftRL x y [c]) yes no)
(NE (CMPconst [0] (XORshiftRA x y [c])) yes no) -> (NE (TEQshiftRA x y [c]) yes no)
(NE (CMPconst [0] (XORshiftLLreg x y z)) yes no) -> (NE (TEQshiftLLreg x y z) yes no)
(NE (CMPconst [0] (XORshiftRLreg x y z)) yes no) -> (NE (TEQshiftRLreg x y z) yes no)
(NE (CMPconst [0] (XORshiftRAreg x y z)) yes no) -> (NE (TEQshiftRAreg x y z) yes no)
20 changes: 19 additions & 1 deletion src/cmd/compile/internal/ssa/gen/ARMOps.go
Expand Up @@ -314,7 +314,7 @@ func init() {
// comparisons
{name: "CMP", argLength: 2, reg: gp2flags, asm: "CMP", typ: "Flags"}, // arg0 compare to arg1
{name: "CMPconst", argLength: 1, reg: gp1flags, asm: "CMP", aux: "Int32", typ: "Flags"}, // arg0 compare to auxInt
{name: "CMN", argLength: 2, reg: gp2flags, asm: "CMN", typ: "Flags"}, // arg0 compare to -arg1
{name: "CMN", argLength: 2, reg: gp2flags, asm: "CMN", typ: "Flags", commutative: true}, // arg0 compare to -arg1
{name: "CMNconst", argLength: 1, reg: gp1flags, asm: "CMN", aux: "Int32", typ: "Flags"}, // arg0 compare to -auxInt
{name: "TST", argLength: 2, reg: gp2flags, asm: "TST", typ: "Flags", commutative: true}, // arg0 & arg1 compare to 0
{name: "TSTconst", argLength: 1, reg: gp1flags, asm: "TST", aux: "Int32", typ: "Flags"}, // arg0 & auxInt compare to 0
Expand All @@ -326,10 +326,28 @@ func init() {
{name: "CMPshiftLL", argLength: 2, reg: gp2flags, asm: "CMP", aux: "Int32", typ: "Flags"}, // arg0 compare to arg1<<auxInt
{name: "CMPshiftRL", argLength: 2, reg: gp2flags, asm: "CMP", aux: "Int32", typ: "Flags"}, // arg0 compare to arg1>>auxInt, unsigned shift
{name: "CMPshiftRA", argLength: 2, reg: gp2flags, asm: "CMP", aux: "Int32", typ: "Flags"}, // arg0 compare to arg1>>auxInt, signed shift
{name: "CMNshiftLL", argLength: 2, reg: gp2flags, asm: "CMN", aux: "Int32", typ: "Flags"}, // arg0 compare to -(arg1<<auxInt)
{name: "CMNshiftRL", argLength: 2, reg: gp2flags, asm: "CMN", aux: "Int32", typ: "Flags"}, // arg0 compare to -(arg1>>auxInt), unsigned shift
{name: "CMNshiftRA", argLength: 2, reg: gp2flags, asm: "CMN", aux: "Int32", typ: "Flags"}, // arg0 compare to -(arg1>>auxInt), signed shift
{name: "TSTshiftLL", argLength: 2, reg: gp2flags, asm: "TST", aux: "Int32", typ: "Flags"}, // arg0 & (arg1<<auxInt) compare to 0
{name: "TSTshiftRL", argLength: 2, reg: gp2flags, asm: "TST", aux: "Int32", typ: "Flags"}, // arg0 & (arg1>>auxInt) compare to 0, unsigned shift
{name: "TSTshiftRA", argLength: 2, reg: gp2flags, asm: "TST", aux: "Int32", typ: "Flags"}, // arg0 & (arg1>>auxInt) compare to 0, signed shift
{name: "TEQshiftLL", argLength: 2, reg: gp2flags, asm: "TEQ", aux: "Int32", typ: "Flags"}, // arg0 ^ (arg1<<auxInt) compare to 0
{name: "TEQshiftRL", argLength: 2, reg: gp2flags, asm: "TEQ", aux: "Int32", typ: "Flags"}, // arg0 ^ (arg1>>auxInt) compare to 0, unsigned shift
{name: "TEQshiftRA", argLength: 2, reg: gp2flags, asm: "TEQ", aux: "Int32", typ: "Flags"}, // arg0 ^ (arg1>>auxInt) compare to 0, signed shift

{name: "CMPshiftLLreg", argLength: 3, reg: gp3flags, asm: "CMP", typ: "Flags"}, // arg0 compare to arg1<<arg2
{name: "CMPshiftRLreg", argLength: 3, reg: gp3flags, asm: "CMP", typ: "Flags"}, // arg0 compare to arg1>>arg2, unsigned shift
{name: "CMPshiftRAreg", argLength: 3, reg: gp3flags, asm: "CMP", typ: "Flags"}, // arg0 compare to arg1>>arg2, signed shift
{name: "CMNshiftLLreg", argLength: 3, reg: gp3flags, asm: "CMN", typ: "Flags"}, // arg0 + (arg1<<arg2) compare to 0
{name: "CMNshiftRLreg", argLength: 3, reg: gp3flags, asm: "CMN", typ: "Flags"}, // arg0 + (arg1>>arg2) compare to 0, unsigned shift
{name: "CMNshiftRAreg", argLength: 3, reg: gp3flags, asm: "CMN", typ: "Flags"}, // arg0 + (arg1>>arg2) compare to 0, signed shift
{name: "TSTshiftLLreg", argLength: 3, reg: gp3flags, asm: "TST", typ: "Flags"}, // arg0 & (arg1<<arg2) compare to 0
{name: "TSTshiftRLreg", argLength: 3, reg: gp3flags, asm: "TST", typ: "Flags"}, // arg0 & (arg1>>arg2) compare to 0, unsigned shift
{name: "TSTshiftRAreg", argLength: 3, reg: gp3flags, asm: "TST", typ: "Flags"}, // arg0 & (arg1>>arg2) compare to 0, signed shift
{name: "TEQshiftLLreg", argLength: 3, reg: gp3flags, asm: "TEQ", typ: "Flags"}, // arg0 ^ (arg1<<arg2) compare to 0
{name: "TEQshiftRLreg", argLength: 3, reg: gp3flags, asm: "TEQ", typ: "Flags"}, // arg0 ^ (arg1>>arg2) compare to 0, unsigned shift
{name: "TEQshiftRAreg", argLength: 3, reg: gp3flags, asm: "TEQ", typ: "Flags"}, // arg0 ^ (arg1>>arg2) compare to 0, signed shift

{name: "CMPF0", argLength: 1, reg: fp1flags, asm: "CMPF", typ: "Flags"}, // arg0 compare to 0, float32
{name: "CMPD0", argLength: 1, reg: fp1flags, asm: "CMPD", typ: "Flags"}, // arg0 compare to 0, float64
Expand Down

0 comments on commit 1ec78d1

Please sign in to comment.