Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: partial cse of phi inputs? #37415

Open
josharian opened this issue Feb 24, 2020 · 3 comments
Open

cmd/compile: partial cse of phi inputs? #37415

josharian opened this issue Feb 24, 2020 · 3 comments

Comments

@josharian
Copy link
Contributor

@josharian josharian commented Feb 24, 2020

This is an optimization idea, one that is still very rough. I'm interested in feedback and ideas.

Consider the beginning of os/basename (on unix):

func basename(name string) string {
	i := len(name) - 1
	// Remove trailing slashes
	for ; i > 0 && name[i] == '/'; i-- {
		name = name[:i]
	}

After decompose built-in, part of the loop here has SSA form

b2: ← b1 b4
    v11 (23) = Phi <int> v10 v41 (i[int], name+8[int])
    v74 (34) = Phi <int> v54 v11 (name+8[int])
    v13 (+23) = Greater64 <bool> v11 v12
    v82 (34) = StringMake <string> v50 v74
If v13 → b7 b5 (23)

b1 is the entry block. v11 is i. v74 is len(name).

Consider v11's args in more detail.

v10 is Add64 <int> v26 v54 (i[int]). v41 is Add64 <int> v26 v11 (i[int]). (v26 is const -1.) They are the same except for the second arg.

So we could do a partial CSE and rewrite v11 as: Add64 <int> v26 vNEW (i[int]), where vNEW is Phi <int> v54 v11. Which is to say, vNEW is the same as v74!

So we end up with fewer phi values, and fewer Add64 values. We might even be able to eliminate a bounds check (v17) that was previously out of reach; I'm not sure.

Note that this transformation is I believe similar in effect to optimizing the code above to not modify name in the loop:

func basename(name string) string {
	i := len(name) - 1
	// Remove trailing slashes
	for ; i > 0 && name[i] == '/'; i-- {
	}
	name = name[:i]

In this optimized code we have only a single phi value (for i), and we calculate len(name) from it when we need it. The CSE'ing approach discussed above generates code that has only a single phi value (for len(name)) and we calculate i from it when we need it.

In general terms, the idea is to look for phi values whose arguments are identical except for a single argument, and then to CSE those arguments, replacing the varying argument with a new phi value.

cc @randall77 @dr2chase @cherrymui

@randall77
Copy link
Contributor

@randall77 randall77 commented Feb 24, 2020

Sounds reasonable. So we look for (Phi (Op x a) (Op y a) (Op z a)) and replace with (Op (Phi x y z) a)

That seems like it would almost always be a win. Maybe it isn't if the (Op * a) have other uses besides this Phi. Probably not a big deal, but we could condition on the Uses count.

@dr2chase
Copy link
Contributor

@dr2chase dr2chase commented Feb 24, 2020

Maybe it isn't if the (Op * a) have other uses besides this Phi.

Probably depends on whether the uses are at the same loop level, or an outer one. But we don't track uses like that (yet).

@cagedmantis cagedmantis added this to the Backlog milestone Feb 27, 2020
@josharian
Copy link
Contributor Author

@josharian josharian commented Mar 1, 2020

I have a prototype of this. Unfortunately, it interferes with loopbce as currently written. Next step is to understand the details and see how hard it would be to strengthen loopbce.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.