@min(@ctz(x), y)
can become @ctz(x | (1 << y))
#90000
Proof for the tzcnt case: https://alive2.llvm.org/ce/z/zUH_Ny
----------------------------------------
define i16 @src_bounded_tzcnt(i16 %a0, i16 %a1) {
Entry:
%cmp = icmp ule i16 %a1, 15
assume i1 %cmp
%tz = cttz i16 %a0, 0
%r = umin i16 %tz, %a1
ret i16 %r
}
=>
define i16 @tgt_bounded_tzcnt(i16 %a0, i16 %a1) {
Entry:
%bit = shl i16 1, %a1
%or = or i16 %a0, %bit
%tz = cttz i16 %or, 0
ret i16 %tz
}
Transformation seems to be correct!
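
As a quick cross-check of the Alive2 result above, the tzcnt identity can also be brute-forced over every 16-bit input. This is only a sketch: the helper names `cttz`, `bounded_tzcnt`, `bounded_tzcnt_better` and `check_all` are made up for illustration, and `cttz` is modelled as returning the bit width for a zero input (matching the `cttz i16 %a0, 0` form in the proof).

```python
BITS = 16
MASK = (1 << BITS) - 1


def cttz(x: int) -> int:
    """Count trailing zeros of a 16-bit value; returns 16 when x == 0."""
    if x == 0:
        return BITS
    # x & -x isolates the lowest set bit; its bit_length minus one is its index.
    return (x & -x).bit_length() - 1


def bounded_tzcnt(x: int, y: int) -> int:
    """Source pattern: umin(cttz(x), y)."""
    return min(cttz(x), y)


def bounded_tzcnt_better(x: int, y: int) -> int:
    """Target pattern: cttz(x | (1 << y)), assuming 0 <= y <= 15."""
    return cttz((x | (1 << y)) & MASK)


def check_all() -> bool:
    """Compare both patterns for every 16-bit x and every y in 0..15."""
    return all(
        bounded_tzcnt(x, y) == bounded_tzcnt_better(x, y)
        for y in range(BITS)
        for x in range(1 << BITS)
    )
```

Calling `check_all()` walks about a million (x, y) pairs. The `y <= 15` restriction mirrors the `assume` in the proof; outside that range the shift would not fit in 16 bits.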
The lzcnt version has a typo, afaict it should be:
export fn bounded_lzcnt_better(x: u16) u8 {
    return @clz(x | ((1 << 15) >> y));
}
Proof: https://alive2.llvm.org/ce/z/yb4r54
----------------------------------------
define i16 @src_bounded_lzcnt(i16 %a0, i16 %a1) {
#0:
%cmp = icmp ule i16 %a1, 15
assume i1 %cmp
%tz = ctlz i16 %a0, 0
%r = umin i16 %tz, %a1
ret i16 %r
}
=>
define i16 @tgt_bounded_lzcnt(i16 %a0, i16 %a1) {
#0:
%bit = lshr i16 32768, %a1
%or = or i16 %a0, %bit
%tz = ctlz i16 %or, 0
ret i16 %tz
}
Transformation seems to be correct!
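
To make the off-by-one in the original lzcnt version concrete, here is a small sketch comparing the corrected mask `(1 << 15) >> y` with the originally posted `1 << 16 >> y`. The function names are hypothetical, and `ctlz` is modelled as returning 16 for a zero input.

```python
BITS = 16
MASK = (1 << BITS) - 1


def ctlz(x: int) -> int:
    """Count leading zeros of a 16-bit value; returns 16 when x == 0."""
    return BITS - x.bit_length()


def bounded_lzcnt(x: int, y: int) -> int:
    """Source pattern: umin(ctlz(x), y)."""
    return min(ctlz(x), y)


def bounded_lzcnt_fixed(x: int, y: int) -> int:
    """Corrected target: ctlz(x | ((1 << 15) >> y)), assuming 0 <= y <= 15."""
    return ctlz(x | ((1 << 15) >> y))


def bounded_lzcnt_typo(x: int, y: int) -> int:
    """Originally posted target: the mask bit sits one position too high."""
    return ctlz((x | ((1 << 16) >> y)) & MASK)


def check_fixed() -> bool:
    """Compare source and corrected target for every 16-bit x and y in 0..15."""
    return all(
        bounded_lzcnt(x, y) == bounded_lzcnt_fixed(x, y)
        for y in range(BITS)
        for x in range(1 << BITS)
    )
```

With `x = 0` and `y = 6`, the typo version ORs in 1024 (bit 10) and yields 5, while the source pattern and the corrected mask 512 (bit 9) both yield 6.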
Good catch on that one. And thanks for looking into this!
Did you see this in real world code or was this from fuzzing/testing?
For
For
When I saw a u16::cttz call turn into an OR+TZCNT I had the idea that you can fold a umin into that OR and tested whether LLVM knew about that yet. Specifically:
or edi, 65536
tzcnt ecx, edi
So yes, for me it was a theoretical optimization.
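
The reasoning here can be sketched in a few lines: a 16-bit `cttz` done with a 32-bit TZCNT needs bit 16 set so that a zero input yields 16, and once that OR exists a constant upper bound can ride along by setting a lower bit instead. The helper names below are illustrative only, and `x` is assumed to be a zero-extended 16-bit value.

```python
def tzcnt32(v: int) -> int:
    """Model of a 32-bit TZCNT: trailing zero count, 32 for a zero input."""
    v &= 0xFFFFFFFF
    if v == 0:
        return 32
    return (v & -v).bit_length() - 1


def cttz16_via_tzcnt32(x: int) -> int:
    """or edi, 65536 ; tzcnt ecx, edi -- bit 16 caps the result at 16."""
    return tzcnt32(x | 0x10000)


def bounded_via_tzcnt32(x: int, y: int) -> int:
    """Folding umin(cttz16(x), y) into the OR: bit y caps the result at y."""
    return tzcnt32(x | (1 << y))


def check(y: int = 6) -> bool:
    """For every 16-bit x, compare the umin form with the folded form."""
    return all(
        min(cttz16_via_tzcnt32(x), y) == bounded_via_tzcnt32(x, y)
        for x in range(1 << 16)
    )
```

For `y = 6` the folded mask is 64, which matches the `or edi, 64` seen in the `bounded_tzcnt_better` output quoted in this thread.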
Hi! This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:
If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.
@llvm/issue-subscribers-good-first-issue
Author: Niles Salter (Validark)
[Godbolt link](https://zig.godbolt.org/z/fzMo9jYPK)
const y = 6;
export fn bounded_tzcnt(x: u16) u8 {
    return @min(@ctz(x), y);
}
export fn bounded_tzcnt_better(x: u16) u8 {
    return @ctz(x | (1 << y));
}
export fn bounded_lzcnt(x: u16) u8 {
    return @min(@clz(x), y);
}
export fn bounded_lzcnt_better(x: u16) u8 {
    return @clz(x | (1 << 16 >> y));
}
bounded_tzcnt:
or edi, 65536
mov eax, 6
tzcnt ecx, edi
cmp cl, 6
cmovb eax, ecx
ret
bounded_tzcnt_better:
or edi, 64
tzcnt eax, edi
ret
bounded_lzcnt:
lzcnt cx, di
mov eax, 6
cmp cl, 6
cmovb eax, ecx
ret
bounded_lzcnt_better:
or edi, 1024
lzcnt ax, di
ret
Hi, I'd like to work on this one if it's still available.
Please read https://llvm.org/docs/InstCombineContributorGuide.html before submitting your first patch :)
Two questions occurred to me:
Another option would be to restrict the transformation only to cases with a constant second operand, which would resolve the above questions neatly.
Feel free to file another PR if you find that this pattern exists in some real-world applications. Unfortunately it doesn't exist in my benchmark :(
Yeah, we only fold
https://alive2.llvm.org/ce/z/on8IIK suggests
The new transformation folds `umin(cttz(x), c)` to `cttz(x | (1 << c))` and `umin(ctlz(x), c)` to `ctlz(x | ((1 << (bitwidth - 1)) >> c))`. The transformation is only implemented for constant `c` to not increase the number of instructions. The idea of the transformation is to set the c-th lowest (for `cttz`) or highest (for `ctlz`) bit in the operand. In this way, the `cttz` or `ctlz` instruction always returns at most `c`. Alive2 proofs: https://alive2.llvm.org/ce/z/xRZTE7
The new transformation folds `umin(cttz(x), c)` to `cttz(x | (1 << c))` and `umin(ctlz(x), c)` to `ctlz(x | ((1 << (bitwidth - 1)) >> c))`. The transformation is only implemented for constant `c` to not increase the number of instructions. The idea of the transformation is to set the c-th lowest (for `cttz`) or highest (for `ctlz`) bit in the operand. In this way, the `cttz` or `ctlz` instruction always returns at most `c`. Alive2 proofs: https://alive2.llvm.org/ce/z/7BQLBe
The new transformation folds `umin(cttz(x), c)` to `cttz(x | (1 << c))` and `umin(ctlz(x), c)` to `ctlz(x | ((1 << (bitwidth - 1)) >> c))`. The transformation is only implemented for constant `c` to not increase the number of instructions. The idea of the transformation is to set the c-th lowest (for `cttz`) or highest (for `ctlz`) bit in the operand. In this way, the `cttz` or `ctlz` instruction always returns at most `c`. Alive2 proofs: https://alive2.llvm.org/ce/z/y8Hdb8
Fixes llvm#90000
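
Because `c` is a constant, the OR mask can be computed ahead of time, so the folded form is a single `or` with an immediate followed by the count instruction. The sketch below shows that mask computation for an arbitrary bit width and checks the fold exhaustively at 8 bits. `umin_fold_mask` is a hypothetical helper, not the name used in the patch, and rejecting `c >= bitwidth` is an assumption made here to keep the shift in range rather than something confirmed by the commit message.

```python
def umin_fold_mask(intrinsic: str, bitwidth: int, c: int) -> int:
    """Constant to OR into the operand so the count is capped at c."""
    if not 0 <= c < bitwidth:
        raise ValueError("c must be in [0, bitwidth)")  # assumption, see lead-in
    if intrinsic == "cttz":
        return 1 << c                      # c-th lowest bit
    if intrinsic == "ctlz":
        return (1 << (bitwidth - 1)) >> c  # c-th highest bit
    raise ValueError("expected 'cttz' or 'ctlz'")


def count(intrinsic: str, bitwidth: int, x: int) -> int:
    """Reference cttz/ctlz that returns bitwidth for a zero input."""
    if intrinsic == "ctlz":
        return bitwidth - x.bit_length()
    if x == 0:
        return bitwidth
    return (x & -x).bit_length() - 1


def fold_holds(intrinsic: str, bitwidth: int) -> bool:
    """Check umin(count(x), c) == count(x | mask) for all x and all valid c."""
    return all(
        min(count(intrinsic, bitwidth, x), c)
        == count(intrinsic, bitwidth, x | umin_fold_mask(intrinsic, bitwidth, c))
        for c in range(bitwidth)
        for x in range(1 << bitwidth)
    )
```

For the 16-bit example in this issue with `c = 6`, the masks come out as 64 for `cttz` and 512 for `ctlz`.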