Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed peephole opt: sub nuw 15, x => xor x, 15 #86874

Open
Kmeakin opened this issue Mar 27, 2024 · 4 comments
Open

Missed peephole opt: sub nuw 15, x => xor x, 15 #86874

Kmeakin opened this issue Mar 27, 2024 · 4 comments

Comments

@Kmeakin
Copy link
Contributor

Kmeakin commented Mar 27, 2024

This transformation is beneficial because materializing the 15 argument to sub requires an extra instruction, but the 15 argument to xor can fit directly into the xor instruction

Rust example (https://godbolt.org/z/h7aTc5sca):

#![feature(unchecked_math)]

#[no_mangle]
pub fn src(x: u8) -> u8 {
    unsafe { 15_u8.unchecked_sub(x) }
}

#[no_mangle]
pub fn tgt(x: u8) -> u8 {
    x ^ 15
}

alive proof: https://alive2.llvm.org/ce/z/YKCPCA

define noundef i8 @src(i8 noundef %x) unnamed_addr #0 {
  %_0 = sub nuw i8 15, %x
  ret i8 %_0
}

define noundef i8 @tgt(i8 noundef %x) unnamed_addr #0 {
  %_0 = xor i8 %x, 15
  ret i8 %_0
}
@Kmeakin
Copy link
Contributor Author

Kmeakin commented Mar 27, 2024

This actually works for any constant that is one less than a power of two: https://alive2.llvm.org/ce/z/gHvLVR

define noundef i8 @src(i8 noundef %x, i8 noundef %c) {
  %self = add i8 %c, 1
  %a = tail call i8 @llvm.ctpop.i8(i8 %self)
  %_3 = icmp eq i8 %a, 1
  tail call void @llvm.assume(i1 %_3)
  %_0 = sub nuw i8 %c, %x
  ret i8 %_0
}

define noundef i8 @tgt(i8 noundef %x, i8 noundef %c) {
  %self = add i8 %c, 1
  %a = tail call i8 @llvm.ctpop.i8(i8 %self)
  %_3 = icmp eq i8 %a, 1
  tail call void @llvm.assume(i1 %_3)
  %_0 = xor i8 %c, %x
  ret i8 %_0
}

declare i8 @llvm.ctpop.i8(i8)

declare void @llvm.assume(i1 noundef)

@XChy
Copy link
Member

XChy commented Mar 28, 2024

Look reasonable, InstCombine always turns sub into xor with specific knownbits constraint, though in this case, we may lose some information.

@Kmeakin
Copy link
Contributor Author

Kmeakin commented Mar 28, 2024

The transform could be done during instruction selection if we're concerned about losing information

@nikic
Copy link
Contributor

nikic commented Apr 25, 2024

I think this should be done in DAGCombine, not InstCombine. We've recently encountered issues with sub -> xor canonicalization in cases where it cannot be reverted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants