Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: restore walkinrange optimization (by moving to SSA) #30645

Open
josharian opened this issue Mar 6, 2019 · 9 comments

Comments

Projects
None yet
5 participants
@josharian
Copy link
Contributor

commented Mar 6, 2019

I’m on my phone, but I suspect that CL 165617 disabled the walkinrange optimization. This makes sense—order now rewrites away OANDAND and OOROR nodes, but those are what walkinrange operates on.

One symptom (I believe) is this:

https://go-review.googlesource.com/c/go/+/165617/1/test/checkbce.go#36

We have long wanted to move walkinrange to the SSA backend; this is a good motivation.

I’m marking this as a 1.13 release blocker, since IIRC walkinrange is a significant optimization for some hot code (usually parsing/formatting code looking for bytes in a range).

@josharian

This comment has been minimized.

Copy link
Contributor Author

commented Mar 6, 2019

@josharian

This comment has been minimized.

Copy link
Contributor Author

commented Mar 6, 2019

(This is also an indicator that this optimization needs an explicit asm test.)

@ALTree ALTree changed the title cmd/compile: restore walkingrange optimization (by moving to SSA) cmd/compile: restore walkinrange optimization (by moving to SSA) Mar 7, 2019

@ALTree

This comment has been minimized.

Copy link
Member

commented Mar 7, 2019

I suspect that CL 165617 disabled the walkinrange optimization

It appears that's the case. Given

package p

func f(c uint64) bool {
	return 0x10 <= c && c < 0x20
}

on 178a2c4 (a little before that CL):

RETURN l(4) tc(1)
RETURN-list
.   AS l(4) tc(1)
.   .   NAME-p.~r1 a(true) g(1) l(3) x(8) class(PPARAMOUT) bool
.   .   LT l(4) tc(1) bool
.   .   .   CONVNOP l(4) tc(1) uint64
.   .   .   .   SUB l(4) tc(1) uint64
.   .   .   .   .   NAME-p.c a(true) g(2) l(3) x(0) class(PPARAM) tc(1) used uint64
.   .   .   .   .   LITERAL-16 l(4) tc(1) uint64
.   .   .   LITERAL-16 l(4) tc(1) uint64

on a2ace8e (after):

AS l(4) tc(1)
.   NAME-p..autotmp_2 a(true) l(4) x(0) class(PAUTO) esc(N) tc(1) assigned used bool
.   LE l(4) tc(1) bool
.   .   LITERAL-16 l(4) tc(1) uint64
.   .   NAME-p.c a(true) g(2) l(3) x(0) class(PPARAM) tc(1) used uint64

IF l(4)
.   NAME-p..autotmp_2 a(true) l(4) x(0) class(PAUTO) esc(N) tc(1) assigned used bool
IF-body
.   AS l(4) tc(1)
.   .   NAME-p..autotmp_2 a(true) l(4) x(0) class(PAUTO) esc(N) tc(1) assigned used bool
.   .   LT l(4) tc(1) bool
.   .   .   NAME-p.c a(true) g(2) l(3) x(0) class(PPARAM) tc(1) used uint64
.   .   .   LITERAL-32 l(4) tc(1) uint64

RETURN l(4) tc(1)
RETURN-list
.   AS l(4) tc(1)
.   .   NAME-p.~r1 a(true) g(1) l(3) x(8) class(PPARAMOUT) bool
.   .   NAME-p..autotmp_2 a(true) l(4) x(0) class(PAUTO) esc(N) tc(1) assigned used bool

VARKILL tc(1)
.   NAME-p..autotmp_2 a(true) l(4) x(0) class(PAUTO) esc(N) tc(1) assigned used bool

And the generated code contains two CMPQ instead of one.

@mundaym

This comment has been minimized.

Copy link
Member

commented Mar 7, 2019

Not sure if this is exactly what you're after but I've been tinkering with some simple logical optimizations for control flow graphs (it's very rough code at the moment, I was just experimenting, but it does work):

https://go-review.googlesource.com/c/go/+/165998/

This can do optimizations such as this for disjunctions (and similar optimizations for conjunctions):

x == 1 || x == 2 -> unsigned(x - 1) <= 1
x == 4 || x == 6 -> (x | 2) == 6
x != 0 || y != 0 -> (x | y) != 0

It also has some optimizations to clean up some uses of IsSliceInBounds which maybe shouldn't have been generated in the first place:

x >= 0 && IsSliceInBounds(x, y) -> IsSliceInBounds(x, y)

I scheduled it after prove since I was a bit worried it might obscure some facts.

Unfortunately it is quite unpleasant adding new optimizations, especially if you want to add handle multiple bit widths. So I was thinking of extending the rewrite rules a bit so you could write something like this and have a new mergeSuccessor function handle all the gnarly stuff around assessing cost/mergability and value/phi updating:

// x < 0 || y < 0 -> (x|y) < 0
(If (Less(64|32|16|8) x (Const(64|32|16|8) [0])) (If (Less(64|32|16|8) y (Const(64|32|16|8) [0]) yes no) no)
        && mergeSuccessor(b, 0)
        -> (If (Less(64|32|16|8) (Or(64|32|16|8) x y) (Const(64|32|16|8) [0]) yes no)
@josharian

This comment has been minimized.

Copy link
Contributor Author

commented Mar 7, 2019

Not sure if this is exactly what you're after

Yes, indeed, something like this.

I scheduled it after prove since I was a bit worried it might obscure some facts.

Makes sense. The right place to document that is an entry in passOrder.

Unfortunately it is quite unpleasant adding new optimizations

Indeed. That’s why I wrote this optimization during walk the first time around. :)

I was thinking of extending the rewrite rules a bit

I’m AFK for a while (which is also why I haven’t replied to your comments on the unsafe.Unreachable issue), and it’s not quite obvious to me whether this would work in practice. It’d be lovely if it did. Perhaps we could also move the CMOV optimizations into such a form as well.

In a pinch we could also create a new kind of rewrite rule + rulegen + runner for this kind of optimization, one geared specifically to this kind of optimization.

josharian added a commit to josharian/go that referenced this issue May 7, 2019

unicode/utf8: optimize checking against an acceptRange
An acceptRange is used to check whether a byte falls in a range.
The existing code used the naive way: lo <= x <= hi.
Because bytes are unsigned, this can be rewritten as x-lo <= hi-lo.

The compiler does this optimization (modulo golang#30645),
but only when lo and hi are constants.
In this case, lo and hi are effectively constants,
but the compiler can't see that.

Change-Id: I9a4a87393915168bd59e28c8048e85e4c688aca5
@andybons

This comment has been minimized.

Copy link
Member

commented May 16, 2019

@josharian would you still like to do this for 1.13?

@josharian

This comment has been minimized.

Copy link
Contributor Author

commented May 16, 2019

I think this is important to do for 1.13, yes. It looked to me like maybe @mundaym was going to do it. I guess I'll take a look at it.

@gopherbot

This comment has been minimized.

Copy link

commented May 20, 2019

Change https://golang.org/cl/178197 mentions this issue: cmd/compile: optimize If blocks with ConstBool phi args

@josharian

This comment has been minimized.

Copy link
Contributor Author

commented May 27, 2019

I'm going to punt this to 1.14 after all. I finally have a working version, based very loosely on @mundaym's work. But it is definitely non-trivial, and Keith would (very reasonably) prefer to wait for 1.14 for CL 178197, on which my CL depends. I'll see about cleaning up and mailing my CL sooner rather than later, so it can be ready when the tree re-opens or in case of need.

@josharian josharian modified the milestones: Go1.13, Go1.14 May 27, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.