i8 bool accumulator not converted to i1 #57448

nikic · 2022-08-30T14:56:53Z

define zeroext i1 @test(ptr %ptr.base, i64 %len) {
start:
  %end = getelementptr inbounds i64, ptr %ptr.base, i64 %len
  %len.zero = icmp eq i64 %len, 0
  br i1 %len.zero, label %exit, label %loop

loop:
  %accum = phi i8 [ %accum.next, %loop ], [ 1, %start ]
  %ptr = phi ptr [ %ptr.next, %loop ], [ %ptr.base, %start ]
  %ptr.next = getelementptr inbounds i64, ptr %ptr, i64 1
  %accum.bool = icmp ne i8 %accum, 0
  %val = load i64, ptr %ptr, align 8
  %val.zero = icmp eq i64 %val, 0
  %and = and i1 %accum.bool, %val.zero
  %accum.next = zext i1 %and to i8
  %exit.cond = icmp eq ptr %ptr.next, %end
  br i1 %exit.cond, label %exit, label %loop

exit:
  %res = phi i1 [ true, %start ], [ %and, %loop ]
  ret i1 %res
}

This stays invariant under -O2. It should be possible to narrow %accum to i1, resulting in the following IR:

define zeroext i1 @test(ptr %ptr.base, i64 %len) {
start:
  %end = getelementptr inbounds i64, ptr %ptr.base, i64 %len
  %len.zero = icmp eq i64 %len, 0
  br i1 %len.zero, label %exit, label %loop

loop:
  %accum.bool = phi i1 [ %accum.next, %loop ], [ true, %start ]
  %ptr = phi ptr [ %ptr.next, %loop ], [ %ptr.base, %start ]
  %ptr.next = getelementptr inbounds i64, ptr %ptr, i64 1
  %val = load i64, ptr %ptr, align 8
  %val.zero = icmp eq i64 %val, 0
  %accum.next = and i1 %accum.bool, %val.zero
  %exit.cond = icmp eq ptr %ptr.next, %end
  br i1 %exit.cond, label %exit, label %loop

exit:
  %res = phi i1 [ true, %start ], [ %accum.next, %loop ]
  ret i1 %res
}

The text was updated successfully, but these errors were encountered:

nikic · 2022-08-30T15:05:19Z

We'd want to push the icmp ne i8 %accum, 0 through the phi here, because that allows it to fold with the zext. But general transforms like foldOpIntoPhi() won't cover this, because we usually don't want to push an op over a loop backedge (easy to infinite loop in that case).

Not sure what an elegant way to fix this would be cc @rotateright

nikic · 2022-09-30T12:58:10Z

Candidate patch: https://reviews.llvm.org/D134954

Reapply with a fix for the case where an operand simplified back to the original phi: We need to map this case to the new phi node. ----- foldOpIntoPhi() currently only folds operations into the phi if all but one operands constant-fold. The two exceptions to this are freeze and select, where we allow more general simplification. This patch makes foldOpIntoPhi() generally simplification based and removes all the instruction-specific logic. We just try to simplify the instruction for each operand, and for the (potentially) one non-simplified operand, we move it into the new block with adjusted operands. This fixes #57448, which was my original motivation for the change.

The infinite loop seen on buildbots should be fixed by 1189770 (assuming there are not multiple infinite combine loops...) ----- foldOpIntoPhi() currently only folds operations into the phi if all but one operands constant-fold. The two exceptions to this are freeze and select, where we allow more general simplification. This patch makes foldOpIntoPhi() generally simplification based and removes all the instruction-specific logic. We just try to simplify the instruction for each operand, and for the (potentially) one non-simplified operand, we move it into the new block with adjusted operands. This fixes #57448, which was my original motivation for the change. Differential Revision: https://reviews.llvm.org/D134954

Relative to the previous attempt, this adjusts simplification to use the correct context instruction: We need to use the terminator of the incoming block, not the original instruction. ----- foldOpIntoPhi() currently only folds operations into the phi if all but one operands constant-fold. The two exceptions to this are freeze and select, where we allow more general simplification. This patch makes foldOpIntoPhi() generally simplification based and removes all the instruction-specific logic. We just try to simplify the instruction for each operand, and for the (potentially) one non-simplified operand, we move it into the new block with adjusted operands. This fixes #57448, which was my original motivation for the change. Differential Revision: https://reviews.llvm.org/D134954

Relative to the previous attempt, this is rebased over the InstSimplify fix in ac74e7a, which addresses the miscompile reported in PR58401. ----- foldOpIntoPhi() currently only folds operations into the phi if all but one operands constant-fold. The two exceptions to this are freeze and select, where we allow more general simplification. This patch makes foldOpIntoPhi() generally simplification based and removes all the instruction-specific logic. We just try to simplify the instruction for each operand, and for the (potentially) one non-simplified operand, we move it into the new block with adjusted operands. This fixes #57448, which was my original motivation for the change. Differential Revision: https://reviews.llvm.org/D134954

Relative to the previous attempt, this is rebased over the InstSimplify fix in ac74e7a, which addresses the miscompile reported in PR58401. ----- foldOpIntoPhi() currently only folds operations into the phi if all but one operands constant-fold. The two exceptions to this are freeze and select, where we allow more general simplification. This patch makes foldOpIntoPhi() generally simplification based and removes all the instruction-specific logic. We just try to simplify the instruction for each operand, and for the (potentially) one non-simplified operand, we move it into the new block with adjusted operands. This fixes llvm#57448, which was my original motivation for the change. Differential Revision: https://reviews.llvm.org/D134954

github-actions bot added the new issue label Aug 30, 2022

nikic mentioned this issue Aug 30, 2022

codegen regression for bool rust-lang/rust#101048

Closed

EugeneZelenko added llvm:codegen and removed new issue labels Aug 30, 2022

fhahn added the missed-optimization label Aug 31, 2022

nikic self-assigned this Sep 30, 2022

EugeneZelenko added llvm:instcombine and removed llvm:codegen labels Sep 30, 2022

nikic closed this as completed in b20e34b Oct 4, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

i8 bool accumulator not converted to i1 #57448

i8 bool accumulator not converted to i1 #57448

nikic commented Aug 30, 2022 •

edited by VoltrexKeyva

nikic commented Aug 30, 2022

nikic commented Sep 30, 2022

i8 bool accumulator not converted to i1 #57448

i8 bool accumulator not converted to i1 #57448

Comments

nikic commented Aug 30, 2022 • edited by VoltrexKeyva

nikic commented Aug 30, 2022

nikic commented Sep 30, 2022

nikic commented Aug 30, 2022 •

edited by VoltrexKeyva