Opt should know `x/2 <= x - x/2` is true for `udiv` #91527

scottmcm · 2024-05-08T19:51:34Z

Today, this doesn't optimize away https://llvm.godbolt.org/z/jPjo6M858

define noundef zeroext i1 @demo(i32 noundef %x) unnamed_addr #0 {
start:
  %_2 = udiv i32 %x, 2
  %_3 = sub i32 %x, %_2
  %_0 = icmp ule i32 %_2, %_3
  ret i1 %_0
}

but that could just be return true, as proven in https://alive2.llvm.org/ce/z/dxHhft.

Indeed, it's always true for any denominator greater than zero. Proof: https://alive2.llvm.org/ce/z/CmkN_m

This is important for "half but it needs to be rounded up" cases. x/2 <= x is already understood to be true, but when you need the bigger half instead, the optimizer is more confused. (I lost the original non-minimized example, sorry, but it was something like splitting an n-element slice into n/2 and n-n/2 slices and ending up with the bounds checking not optimizing out.)

Maybe start by getting sub nuw for this? Like

   %_2 = udiv i32 %x, 2
-  %_3 = sub i32 %x, %_2
+  %_3 = sub nuw i32 %x, %_2

Proof for that narrower part: https://alive2.llvm.org/ce/z/xoCDe5

The text was updated successfully, but these errors were encountered:

dtcxzyw · 2024-05-09T01:40:12Z

Emm, it seems like this pattern is rare. Do you encounter this in any real-world rust applications?

Maybe start by getting sub nuw for this? Like

Yeah, we can handle this pattern in ValueTracking like #85555.

scottmcm · 2024-05-09T18:08:07Z

Do you encounter this in any real-world rust applications?

Ah, here we go, I remade what I'd been trying to do: https://rust.godbolt.org/z/TczW1Ka1d

I was trying to get same-length mutable slices to the front and the back of a rust slice, so I could walk them together and mutate them both. (And yes I was intentionally allowing the skipping of the middle element for an odd-length slice.) Doing that in safe rust code means going through split_at_mut, and thus I ended up at

pub fn front_and_back_halves(x: &mut [i32]) -> (&mut [i32], &mut [i32]) {
    let n = x.len();
    let (a, b) = x.split_at_mut(n - n/2);
    (&mut a[..n/2], b)
}

expecting that I wouldn't need unsafe code because LLVM would handle it, but it turned out that the a[..n/2] bounds check doesn't optimize out:

define void @front_and_back_halves(ptr dead_on_unwind noalias nocapture noundef writable writeonly sret([32 x i8]) align 8 dereferenceable(32) %_0, ptr noalias noundef nonnull align 4 %x.0, i64 noundef %x.1) unnamed_addr #0 {
start:
  %_41 = lshr i64 %x.1, 1
  %_3 = sub i64 %x.1, %_41
  %_7.i = icmp ugt i64 %_41, %_3
  br i1 %_7.i, label %bb3.i, label %"_ZN106_$LT$core..ops..range..Range$LT$usize$GT$$u20$as$u20$core..slice..index..SliceIndex$LT$$u5b$T$u5d$$GT$$GT$9index_mut17h7c9c67143f1b9d5cE.exit"

bb3.i:                                            ; preds = %start
  tail call void @_ZN4core5slice5index24slice_end_index_len_fail17h07f57a356f69bc93E(i64 noundef %_41, i64 noundef %_3, ptr noalias noundef nonnull readonly align 8 dereferenceable(24) @alloc_43779ed79010680daba275a097b13dc0) #2, !noalias !3
  unreachable

"_ZN106_$LT$core..ops..range..Range$LT$usize$GT$$u20$as$u20$core..slice..index..SliceIndex$LT$$u5b$T$u5d$$GT$$GT$9index_mut17h7c9c67143f1b9d5cE.exit": ; preds = %start
  %0 = getelementptr inbounds i32, ptr %x.0, i64 %_3
  store ptr %x.0, ptr %_0, align 8
  %1 = getelementptr inbounds i8, ptr %_0, i64 8
  store i64 %_41, ptr %1, align 8
  %2 = getelementptr inbounds i8, ptr %_0, i64 16
  store ptr %0, ptr %2, align 8
  %3 = getelementptr inbounds i8, ptr %_0, i64 24
  store i64 %_41, ptr %3, align 8
  ret void
}

And it's not enough to just make the n - n/2 be sub nuw: https://rust.godbolt.org/z/PT3hnbe8o

So I agree that humans probably don't write this comparison much, but it easily happens in bounds checks.

(Similarly, split_at_mut(n/2) would also give n/2- and n - n/2-length slices.)

Out of curiosity, I also tried https://rust.godbolt.org/z/4o366q4s1

    let (a, b) = x.split_at_mut(n/2);
    (a, &mut b[n%2..])

But that seems much worse because despite still having a bounds check it also doesn't realize that the lengths of the two returned slices are the same. (As evidenced by writing out two different SSA values as the lengths of the returned slices. In the earlier example note that both lengths are store i64 %_41 because LLVM at least knows they're the same length, even if the bounds check didn't optimize out.)

Explorer09 · 2024-05-12T09:18:53Z

How about the x / 2 <= (x + 1) / 2 case? When it come to "half but rounded up", I would write the expression as this: ((x + 1) / 2)

scottmcm · 2024-05-13T05:49:30Z

@Explorer09 x.len() in rust can be usize::MAX (albeit only for slices of ZSTs), which would make doing it as (x + 1)/2 wrap.

That inspired me to try .split_at_mut(n/2 + n%2); too, but nope (https://rust.godbolt.org/z/Y96EzdcGh) that didn't optimize out either.

EDIT: Oh, but it looks like trunk can simplify that one https://llvm.godbolt.org/z/s5hPGbbnP -- amusingly, it simplifies it to n - n/2 🙃

Explorer09 · 2024-05-13T07:30:07Z

@scottmcm Good point. Thanks for in the info about the SIZE_MAX where (x + 1) / 2 won't work (because of overflow).
I stand corrected.

scottmcm · 2024-05-17T22:52:31Z

Actually, I was thinking about your message more and thought that I should try doing that (x + 1) / 2 approach in a larger type to see if it would work that way. After all, if I zext to a larger type where I can add nuw, it ought to be more obvious that x/2 <= (x+1)/2 -- definitely more obvious than n/2 <= n - n/2.

Sadly, no luck. While it would be legal (https://alive2.llvm.org/ce/z/cU7chM) to optimize the check away, it doesn't even though it figures out both add nuw and trunc nuw: https://llvm.godbolt.org/z/zMKW5PP7x

Explorer09 · 2024-05-18T13:40:36Z

@scottmcm
I think another pattern that might me worth mentioning is this: (x / 2) + (x % 2) (assuming x is an unsigned integer), or the bitwise equivalent (x >> 1) + (x & 1)

scottmcm · 2024-05-22T15:44:40Z

@Explorer09 yup, see #91527 (comment) for that one :)

github-actions bot added the new issue label May 8, 2024

EugeneZelenko added llvm:instcombine missed-optimization and removed new issue labels May 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Opt should know `x/2 <= x - x/2` is true for `udiv` #91527

Opt should know `x/2 <= x - x/2` is true for `udiv` #91527

scottmcm commented May 8, 2024

dtcxzyw commented May 9, 2024

scottmcm commented May 9, 2024 •

edited

Explorer09 commented May 12, 2024

scottmcm commented May 13, 2024 •

edited

Explorer09 commented May 13, 2024

scottmcm commented May 17, 2024 •

edited

Explorer09 commented May 18, 2024

scottmcm commented May 22, 2024

Opt should know x/2 <= x - x/2 is true for udiv #91527

Opt should know x/2 <= x - x/2 is true for udiv #91527

Comments

scottmcm commented May 8, 2024

dtcxzyw commented May 9, 2024

scottmcm commented May 9, 2024 • edited

Explorer09 commented May 12, 2024

scottmcm commented May 13, 2024 • edited

Explorer09 commented May 13, 2024

scottmcm commented May 17, 2024 • edited

Explorer09 commented May 18, 2024

scottmcm commented May 22, 2024

Opt should know `x/2 <= x - x/2` is true for `udiv` #91527

Opt should know `x/2 <= x - x/2` is true for `udiv` #91527

scottmcm commented May 9, 2024 •

edited

scottmcm commented May 13, 2024 •

edited

scottmcm commented May 17, 2024 •

edited