Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i8 bool accumulator not converted to i1 #57448

Closed
nikic opened this issue Aug 30, 2022 · 2 comments
Closed

i8 bool accumulator not converted to i1 #57448

nikic opened this issue Aug 30, 2022 · 2 comments

Comments

@nikic
Copy link
Contributor

nikic commented Aug 30, 2022

define zeroext i1 @test(ptr %ptr.base, i64 %len) {
start:
  %end = getelementptr inbounds i64, ptr %ptr.base, i64 %len
  %len.zero = icmp eq i64 %len, 0
  br i1 %len.zero, label %exit, label %loop

loop:
  %accum = phi i8 [ %accum.next, %loop ], [ 1, %start ]
  %ptr = phi ptr [ %ptr.next, %loop ], [ %ptr.base, %start ]
  %ptr.next = getelementptr inbounds i64, ptr %ptr, i64 1
  %accum.bool = icmp ne i8 %accum, 0
  %val = load i64, ptr %ptr, align 8
  %val.zero = icmp eq i64 %val, 0
  %and = and i1 %accum.bool, %val.zero
  %accum.next = zext i1 %and to i8
  %exit.cond = icmp eq ptr %ptr.next, %end
  br i1 %exit.cond, label %exit, label %loop

exit:
  %res = phi i1 [ true, %start ], [ %and, %loop ]
  ret i1 %res
}

This stays invariant under -O2. It should be possible to narrow %accum to i1, resulting in the following IR:

define zeroext i1 @test(ptr %ptr.base, i64 %len) {
start:
  %end = getelementptr inbounds i64, ptr %ptr.base, i64 %len
  %len.zero = icmp eq i64 %len, 0
  br i1 %len.zero, label %exit, label %loop

loop:
  %accum.bool = phi i1 [ %accum.next, %loop ], [ true, %start ]
  %ptr = phi ptr [ %ptr.next, %loop ], [ %ptr.base, %start ]
  %ptr.next = getelementptr inbounds i64, ptr %ptr, i64 1
  %val = load i64, ptr %ptr, align 8
  %val.zero = icmp eq i64 %val, 0
  %accum.next = and i1 %accum.bool, %val.zero
  %exit.cond = icmp eq ptr %ptr.next, %end
  br i1 %exit.cond, label %exit, label %loop

exit:
  %res = phi i1 [ true, %start ], [ %accum.next, %loop ]
  ret i1 %res
}
@nikic
Copy link
Contributor Author

nikic commented Aug 30, 2022

We'd want to push the icmp ne i8 %accum, 0 through the phi here, because that allows it to fold with the zext. But general transforms like foldOpIntoPhi() won't cover this, because we usually don't want to push an op over a loop backedge (easy to infinite loop in that case).

Not sure what an elegant way to fix this would be cc @rotateright

@nikic
Copy link
Contributor Author

nikic commented Sep 30, 2022

Candidate patch: https://reviews.llvm.org/D134954

@nikic nikic self-assigned this Sep 30, 2022
@nikic nikic closed this as completed in b20e34b Oct 4, 2022
nikic added a commit that referenced this issue Oct 4, 2022
Reapply with a fix for the case where an operand simplified back
to the original phi: We need to map this case to the new phi node.

-----

foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.

This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.

This fixes #57448, which
was my original motivation for the change.
nikic added a commit that referenced this issue Oct 5, 2022
The infinite loop seen on buildbots should be fixed by
1189770 (assuming there are not
multiple infinite combine loops...)

-----

foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.

This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.

This fixes #57448, which
was my original motivation for the change.

Differential Revision: https://reviews.llvm.org/D134954
nikic added a commit that referenced this issue Oct 7, 2022
Relative to the previous attempt, this adjusts simplification to
use the correct context instruction: We need to use the terminator
of the incoming block, not the original instruction.

-----

foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.

This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.

This fixes #57448, which
was my original motivation for the change.

Differential Revision: https://reviews.llvm.org/D134954
nikic added a commit that referenced this issue Oct 17, 2022
Relative to the previous attempt, this is rebased over the
InstSimplify fix in ac74e7a,
which addresses the miscompile reported in PR58401.

-----

foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.

This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.

This fixes #57448, which
was my original motivation for the change.

Differential Revision: https://reviews.llvm.org/D134954
sid8123 pushed a commit to sid8123/llvm-project that referenced this issue Oct 25, 2022
Relative to the previous attempt, this is rebased over the
InstSimplify fix in ac74e7a,
which addresses the miscompile reported in PR58401.

-----

foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.

This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.

This fixes llvm#57448, which
was my original motivation for the change.

Differential Revision: https://reviews.llvm.org/D134954
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants