[Support] Investigate making KnownBits::mul optimal #86671
Comments
Hi! This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:
If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below. |
@llvm/issue-subscribers-good-first-issue Author: Simon Pilgrim (RKSimon)
https://github.com/llvm/llvm-project/blob/ffe41819e58365dfbe85a22556c0d9d284e746b9/llvm/unittests/Support/KnownBitsTest.cpp#L586-L591
Investigate if we can make the implementation optimal (checkOptimalityBinary) Similar to #84212 and #84213 |
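For context, "optimal" here can be read as: the result keeps every bit that has the same value in all products of inputs consistent with the operands' known bits. The following is a minimal standalone sketch of that definition, not the LLVM test helper itself. The `SmallKnownBits` struct and the `contains` and `exhaustiveMul` functions are hypothetical stand-ins for illustration, and the sketch assumes bit widths between 1 and 16.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::KnownBits, for small widths only.
struct SmallKnownBits {
  uint32_t Zero = 0; // bits known to be 0
  uint32_t One = 0;  // bits known to be 1
};

// True if V is consistent with K: no known-zero bit is set in V and
// every known-one bit is set in V.
inline bool contains(const SmallKnownBits &K, uint32_t V) {
  return (V & K.Zero) == 0 && (V & K.One) == K.One;
}

// Best possible known bits for LHS * RHS, found by enumerating every
// pair of consistent inputs. Assumes 1 <= BitWidth <= 16.
inline SmallKnownBits exhaustiveMul(const SmallKnownBits &LHS,
                                    const SmallKnownBits &RHS,
                                    unsigned BitWidth) {
  const uint32_t Mask = (1u << BitWidth) - 1;
  SmallKnownBits Res;
  Res.Zero = Mask; // start with everything known, then
  Res.One = Mask;  // drop bits that vary across products
  for (uint32_t A = 0; A <= Mask; ++A) {
    if (!contains(LHS, A))
      continue;
    for (uint32_t B = 0; B <= Mask; ++B) {
      if (!contains(RHS, B))
        continue;
      const uint32_t P = (A * B) & Mask;
      Res.Zero &= ~P; // a bit stays known-zero only if clear in every product
      Res.One &= P;   // a bit stays known-one only if set in every product
    }
  }
  return Res;
}
```

An implementation is optimal in this sense if it returns the same `Zero` and `One` masks as the enumeration for every pair of inputs. If no pair of values is consistent with the operands, the sketch leaves both masks fully set, which signals a conflict.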
Do we currently fail if we check with |
yes, a lot :)
we don't even get the |
CC @goldsteinn who might have some insight on this |
I looked at this briefly but don't know how to make it optimal. @jayfoad is better at this stuff xD I think cases like |
I don't know how to do it with a fixed-time algorithm. There are probably ways to do it if you don't mind iterating N times (for multiplying N-bit integers) but I always assumed that would not be acceptable. |
We iterate N times for shifts. My guess is also we could get a lot of early outs if the result became fully unknown early. |
I tried this implementation which tracks the lowest and highest possible value for each bit position of the result in turn, starting from bit 0:

    KnownBits KnownBits::mul(const KnownBits &LHS, const KnownBits &RHS,
                             bool NoUndefSelfMultiply) {
      unsigned BitWidth = LHS.getBitWidth();
      KnownBits Res(BitWidth);
      unsigned MinVal = 0;
      unsigned MaxVal = 0;
      for (unsigned I = 0; I != BitWidth; ++I) {
        MinVal += (LHS.getMinValue().trunc(I + 1) &
                   RHS.getMinValue().trunc(I + 1).reverseBits())
                      .popcount();
        MaxVal += (LHS.getMaxValue().trunc(I + 1) &
                   RHS.getMaxValue().trunc(I + 1).reverseBits())
                      .popcount();
        if (MinVal == MaxVal) {
          if (MinVal & 1)
            Res.One.setBit(I);
          else
            Res.Zero.setBit(I);
        }
        MinVal >>= 1;
        MaxVal >>= 1;
      }
      return Res;
    }

It is not optimal. The first case I found where it fails is:
(i.e. 5-or-7 times 5-or-7). The algorithm computes the range of bit 2 of the result as [2, 4] but it does not know that it is always 2 or 4, never 3. |
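The 5-or-7 case can be reproduced outside LLVM with a minimal sketch along the following lines. It re-expresses the column-range idea on plain 32-bit masks. `SmallKnownBits`, `columnCount` and `columnMul` are hypothetical names introduced for illustration, and the sketch assumes bit widths between 1 and 16.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::KnownBits, for small widths only.
struct SmallKnownBits {
  uint32_t Zero = 0; // bits known to be 0
  uint32_t One = 0;  // bits known to be 1
};

// Number of partial products A[J] * B[I - J] that are 1 in column I.
inline unsigned columnCount(uint32_t A, uint32_t B, unsigned I) {
  unsigned N = 0;
  for (unsigned J = 0; J <= I; ++J)
    N += ((A >> J) & 1u) & ((B >> (I - J)) & 1u);
  return N;
}

// Tracks a [MinVal, MaxVal] range for each column sum (including the
// carry from lower columns) and marks a result bit as known only when
// the range collapses to a single value. Assumes 1 <= BitWidth <= 16.
inline SmallKnownBits columnMul(const SmallKnownBits &LHS,
                                const SmallKnownBits &RHS,
                                unsigned BitWidth) {
  const uint32_t Mask = (1u << BitWidth) - 1;
  const uint32_t LMin = LHS.One & Mask, LMax = ~LHS.Zero & Mask;
  const uint32_t RMin = RHS.One & Mask, RMax = ~RHS.Zero & Mask;
  SmallKnownBits Res;
  unsigned MinVal = 0, MaxVal = 0;
  for (unsigned I = 0; I != BitWidth; ++I) {
    MinVal += columnCount(LMin, RMin, I);
    MaxVal += columnCount(LMax, RMax, I);
    if (MinVal == MaxVal) {
      if (MinVal & 1)
        Res.One |= 1u << I;
      else
        Res.Zero |= 1u << I;
    }
    MinVal >>= 1; // carry into the next column
    MaxVal >>= 1;
  }
  return Res;
}
```

With both operands set to 5-or-7 at width 8 (`Zero = 0xF8`, `One = 0x05`), this sketch reports only bit 0 as known one and bits 6 and 7 as known zero. Bit 2 stays unknown because the tracked range for that column is [2, 4], even though the three possible products 25, 35 and 49 all have bit 2 clear.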
What are the requirements for the implementation wrt performance? Is there a benchmark for |
@nikic has a compile-time benchmark. If that doesn't regress, any impl is fine |
Got it. Where do I find his benchmark? I assume it's https://github.com/nikic/llvm-compile-time-tracker? |
TBH I'd just close this issue. I don't think KnownBits mul can be both optimal and efficient at the same time. |
I still think this is worth somebody experimenting with, even if we just confirm the compile time cost is too high for the codegen benefits. The issue title asked for an investigation, and I don't think we have any confirmed conclusions yet, just an expectation. |