Skip to content

LSR should use expressions computed inside the loop to materialize values computed outside the loop #1238

@lattner

Description

@lattner
Bugzilla Link 866
Version 1.0
OS All
CC @efriedma-quic,@ekatz,@nlewycky

Extended Description

This is a special case of Bug 863. Consider this:

int %foo(int* %a, int %t, int %C) {
entry:
br label %cond_true

cond_true: ; preds = %cond_true, %entry
%x.0.0 = phi int [ 0, %entry ], [ %tmp9, %cond_true ] ; [#uses=3]
%t_addr.0.0 = phi int [ %t, %entry ], [ %tmp7, %cond_true ] ; [#uses=1]
%tmp2 = shl int %x.0.0, ubyte 2 ; <int*> [#uses=1]
%tmp3 = cast int %tmp2 to int ; [#uses=1]
%tmp5 = add int %t_addr.0.0, %t_addr.0.0 ; [#uses=1]
%tmp7 = add int %tmp5, %tmp3 ; [#uses=2]
%tmp9 = add int %x.0.0, 1 ; [#uses=2]
%tmp = setgt int %C, 39 ; [#uses=1]
br bool %tmp, label %bb12, label %cond_true

bb12: ; preds = %cond_true
ret int %tmp7
}

There are two uses of tmp7: the ret and the PHI. Because of the phi, the value of tmp7 has to be
computed inside the loop for the next iteration. However, LSR doesn't keep track of this and ends up
recomputing the value outside the loop, producing this code:

int %foo(int* %a, int %t, int %C) {
entry:
br label %cond_true

cond_true: ; preds = %cond_true, %entry
%iv. = phi uint [ 0, %entry ], [ %iv..inc, %cond_true ] ; [#uses=2]
%t_addr.0.0 = phi int [ %t, %entry ], [ %tmp.1, %cond_true ] ; [#uses=1]
%t_addr.0.0 = cast int %t_addr.0.0 to uint ; [#uses=2]
%tmp = setgt int %C, 39 ; [#uses=1]
%iv..inc = add uint %iv., 4 ; [#uses=2]
%tmp. = mul uint %t_addr.0.0, 2 ; [#uses=1]
%tmp.1 = add uint %iv., %tmp. ; [#uses=1]
%tmp.1 = cast uint %tmp.1 to int ; [#uses=1]
br bool %tmp, label %bb12, label %cond_true

bb12: ; preds = %cond_true
*** %tmp.2 = mul uint %t_addr.0.0, 2 ; [#uses=1]
*** %tmp.3 = add uint %iv..inc, %tmp.2 ; [#uses=1]
*** %tmp.4 = add uint %tmp.3, 4294967292 ; [#uses=1]
*** %tmp.4 = cast uint %tmp.4 to int ; [#uses=1]
ret int %tmp.4
}

Note that LSR is tricky: because it is going to reinsert the entire expression it uses the post-
incremented ivar value to avoid coallescing issues. It would have been much better for it to just realize
that it could "ret int %tmp %tmp.1" instead of all the ***'d code. Because it uses the the preinc value
inside the loop and the postinc value outside the loop, there is no way normal CSE will catch it, and we
don't. From this, we generate this nasty code:

_foo:
li r2, 0
LBB1_1: ;cond_true
mr r3, r4
slwi r4, r3, 1
cmpwi cr0, r5, 40
add r4, r2, r4
addi r2, r2, 4
blt cr0, LBB1_1 ;cond_true
LBB1_2: ;bb12
*** slwi r3, r3, 1
*** add r2, r2, r3
*** addi r3, r2, -4
blr

doh. mr r3, r2 would have been much faster, and would have probably eliminated a copy in the loop.

-Chris

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions