analysis for min/max/abs intrinsics #46259
Comments
I'm currently working on ConstantFolding and InstSimplify.
I'm working on computeKnownBits for abs.
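To make the computeKnownBits item concrete, here is a minimal sketch that brute-forces two bit-level facts about abs on 8-bit values: negation preserves the number of trailing zero bits, and the result has a clear sign bit for every input except INT_MIN. The helper names (`abs_i8`, `trailing_zeros`, `check_abs_known_bits`) are hypothetical and only model the arithmetic; this is not the LLVM implementation.

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def to_signed(x):
    """Interpret an 8-bit pattern as a two's-complement signed value."""
    x &= MASK
    return x - (1 << BITS) if x & SIGN else x


def abs_i8(x):
    """Model abs on i8 with wrapping semantics: abs(-128) stays 0x80."""
    return abs(to_signed(x)) & MASK


def trailing_zeros(x):
    """Count trailing zero bits of an 8-bit pattern (BITS for zero)."""
    x &= MASK
    if x == 0:
        return BITS
    return (x & -x).bit_length() - 1


def check_abs_known_bits():
    """Exhaustively check both properties over all 256 inputs."""
    for x in range(1 << BITS):
        r = abs_i8(x)
        # Negation keeps the trailing zero count, so abs does as well.
        if trailing_zeros(r) != trailing_zeros(x):
            return False
        # Apart from INT_MIN, the result is non-negative (sign bit clear).
        if x != SIGN and r & SIGN:
            return False
    return True
```

The INT_MIN exception is why the sign bit can only be treated as known zero when the INT_MIN case is excluded, for example when the intrinsic's flag marks that input as poison.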
Default + X86 costs have been committed.
SLP - basic abs/min/max vectorization works fine. I'm going to look at adding min/max intrinsic reduction support.
This was the last InstSimplify that was on my list to adapt: Constant folding should be complete too.
I think we have most of it covered now. I'm not as familiar with the last 3 on the list (LVI/CVP/SCCP, SCEV, IndVars). What needs to be done before we start producing intrinsics in place of cmp+select in instcombine transforms?
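For readers unfamiliar with the cmp+select form, the sketch below models why pattern matching it is more fragile than a single intrinsic: several different compare/select spellings all compute the same signed max. The list name `SMAX_SELECT_FORMS` and the function `all_forms_match_smax` are hypothetical, and Python integers in the i8 range stand in for IR values.

```python
# Each lambda models one cmp+select spelling of signed max on i8 values.
SMAX_SELECT_FORMS = [
    lambda a, b: a if a > b else b,   # select (icmp sgt a, b), a, b
    lambda a, b: a if a >= b else b,  # select (icmp sge a, b), a, b
    lambda a, b: b if a < b else a,   # select (icmp slt a, b), b, a
    lambda a, b: b if a <= b else a,  # select (icmp sle a, b), b, a
]


def all_forms_match_smax():
    """Check every spelling against max() over the full signed i8 range."""
    values = range(-128, 128)
    return all(
        form(a, b) == max(a, b)
        for form in SMAX_SELECT_FORMS
        for a in values
        for b in values
    )
```

A matcher for the cmp+select form has to recognize every such spelling, whereas the intrinsic form is a single call.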
The main thing that still needs to be done is recognizing min/max intrinsics in SCEV.
First try to get abs() canonicalized: As I mentioned there, I don't think we have our size cost models updated (but let me know if I'm misreading the code). We updated the throughput models here: That's what is used by the vectorizers, but that's probably not what is used by unrolling, inlining, and simplifycfg.
The basic cost model handling is in place as of: Still trying to get abs canonicalization in with:
Awesome! The only part left now is to canonicalize min/max... I took a quick look at that, here's what the InstCombine test diff looks like: https://gist.github.com/nikic/f69061736ea8e6b6c159b28b09991599 We're still missing quite a few folds, e.g. pushing nots across min/max and forming unsigned saturating math.
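As an illustration of the two kinds of folds named above, the following sketch exhaustively checks one identity of each kind on 8-bit values: `~smax(~a, ~b) == smin(a, b)`, and `umin(a, ~b) + b == uadd.sat(a, b)`. These particular identities are chosen as examples and are an assumption about what the missing folds look like; the helper names are hypothetical models, not LLVM APIs.

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def to_signed(x):
    """Interpret an 8-bit pattern as a two's-complement signed value."""
    x &= MASK
    return x - (1 << BITS) if x & SIGN else x


def bit_not(x):
    """Bitwise not, truncated to 8 bits."""
    return ~x & MASK


def smax(a, b):
    """Signed max on 8-bit patterns."""
    return a if to_signed(a) >= to_signed(b) else b


def smin(a, b):
    """Signed min on 8-bit patterns."""
    return a if to_signed(a) <= to_signed(b) else b


def uadd_sat(a, b):
    """Unsigned saturating add on 8-bit patterns."""
    return min(a + b, MASK)


def check_not_across_smax():
    """~smax(~a, ~b) == smin(a, b): not reverses the signed order."""
    return all(
        bit_not(smax(bit_not(a), bit_not(b))) == smin(a, b)
        for a in range(1 << BITS)
        for b in range(1 << BITS)
    )


def check_umin_forms_uadd_sat():
    """umin(a, ~b) + b == uadd.sat(a, b), with the add wrapping at 8 bits."""
    return all(
        (min(a, bit_not(b)) + b) & MASK == uadd_sat(a, b)
        for a in range(1 << BITS)
        for b in range(1 << BITS)
    )
```

The second identity holds because `~b` is `255 - b`: clamping `a` to that value before adding `b` caps the sum at 255 without wrapping.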
Over in bug 48816, we discovered that we need narrowing combines for abs to avoid regressions. I'll work on the related folds for min/max.
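A narrowing combine for abs can be sketched with one identity: taking abs of a sign-extended value in the wide type gives the same bits as taking abs in the narrow type (with wrapping at INT_MIN) and zero-extending the result. The sketch below checks this for i8 to i16. It is an assumed example of such a combine, not necessarily one of the folds from bug 48816, and all helper names are hypothetical.

```python
def to_signed(x, bits):
    """Interpret a bit pattern of the given width as two's complement."""
    mask = (1 << bits) - 1
    x &= mask
    return x - (1 << bits) if x & (1 << (bits - 1)) else x


def abs_bits(x, bits):
    """Model abs with wrapping semantics: abs(INT_MIN) stays INT_MIN."""
    return abs(to_signed(x, bits)) & ((1 << bits) - 1)


def sext_i8_to_i16(x):
    """Sign-extend an 8-bit pattern to 16 bits."""
    return to_signed(x, 8) & 0xFFFF


def zext_i8_to_i16(x):
    """Zero-extend an 8-bit pattern to 16 bits."""
    return x & 0xFF


def check_abs_narrowing():
    """abs(sext(x)) in i16 equals zext(abs(x)) computed in i8."""
    return all(
        abs_bits(sext_i8_to_i16(x), 16) == zext_i8_to_i16(abs_bits(x, 8))
        for x in range(256)
    )
```

The INT_MIN input is the interesting case: -128 becomes 128 in i16, and in i8 the wrapped result 0x80 zero-extends to the same 128.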
Those were:
This is likely the best view of current state: There are still missing parts of analysis/instcombines as noted. I think SLP is done (and the sooner we can switch to intrinsics, the better, because supporting cmp+select is still at risk for hard-to-spot bugs as shown in https://reviews.llvm.org/D99753 ). AFAIK, LoopVectorizer is partially done. I'll add some PhaseOrdering tests to increase visibility.
Added CVP handling for abs/min/max.
What about saturating intrinsics? The same pattern matching discrepancy seems to apply, no?
mentioned in issue #48244
Closing this as completed. Remaining issues should be reported separately. |
Extended Description
We recently added IR intrinsics for min/max/abs, and there are several areas that need to be updated to support and optimize these. Please comment here if you are or are planning to work on any part of this, so we do not duplicate effort.
Partial list: