Commit
[AArch64] Try to re-use extended operand for SETCC with vector ops.
Try to re-use an already extended operand for SetCC with vector operands feeding an extended select. Doing so avoids requiring another full extension of the SET_CC result when lowering the select.

This improves lowering for certain extend/cmp/select patterns. For example, with v16i8, this replaces 6 instructions for the extra extension with 4 separate selects.

This improves the generated code for loops like the one below in combination with D96522.

    int foo(uint8_t *p, int N) {
      unsigned long long sum = 0;
      for (int i = 0; i < N; i++, p++) {
        unsigned int v = *p;
        sum += (v < 127) ? v : 256 - v;
      }
      return sum;
    }

https://clang.godbolt.org/z/Wco866MjY

On the AArch64 cores I have access to, the patch improves performance of the vector loop by ~10%.

This could be generalized in follow-ups, but the initial version targets one of the more important cases in combination with D96522.

Alive2 modeling:
* sext EQ https://alive2.llvm.org/ce/z/5upBvb
* sext NE https://alive2.llvm.org/ce/z/zbEcJp
* zext EQ https://alive2.llvm.org/ce/z/_xMwof
* zext NE https://alive2.llvm.org/ce/z/5FwKfc
* zext unsigned predicate: https://alive2.llvm.org/ce/z/iEwLU3
* sext signed predicate: https://alive2.llvm.org/ce/z/aMBega

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D120481
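
The Alive2 links above only cover matched pairs: zext with an unsigned predicate, sext with a signed predicate, and EQ/NE with either extension. As a rough scalar illustration of why the pairing matters, the sketch below exhaustively compares an i8 compare against the same compare on the extended i32 values. The function names are hypothetical and this is not the code from the patch; it only models the per-lane arithmetic, under the assumption that each vector lane behaves like these scalars.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the LLVM patch. */

/* zext + unsigned predicate: count (a, b) pairs where "a <u b" on i8
   disagrees with "zext(a) <u zext(b)" on i32. */
static int mismatches_zext_ult(void) {
    int bad = 0;
    for (int x = 0; x < 256; x++)
        for (int y = 0; y < 256; y++) {
            uint8_t a = (uint8_t)x, b = (uint8_t)y;
            int narrow = a < b;
            int wide = (uint32_t)a < (uint32_t)b;
            bad += (narrow != wide);
        }
    return bad;
}

/* sext + signed predicate: count pairs where "a <s b" on i8 disagrees
   with "sext(a) <s sext(b)" on i32. */
static int mismatches_sext_slt(void) {
    int bad = 0;
    for (int x = -128; x < 128; x++)
        for (int y = -128; y < 128; y++) {
            int8_t a = (int8_t)x, b = (int8_t)y;
            int narrow = a < b;
            int wide = (int32_t)a < (int32_t)b;
            bad += (narrow != wide);
        }
    return bad;
}

/* Mismatched pairing: a signed i8 compare versus an unsigned compare on
   the zero-extended bit patterns. The two disagree whenever the sign
   bits of a and b differ, so the wide compare cannot stand in here. */
static int mismatches_zext_slt(void) {
    int bad = 0;
    for (int x = -128; x < 128; x++)
        for (int y = -128; y < 128; y++) {
            int8_t a = (int8_t)x, b = (int8_t)y;
            int narrow = a < b;
            int wide = (uint32_t)(uint8_t)a < (uint32_t)(uint8_t)b;
            bad += (narrow != wide);
        }
    return bad;
}
```

Calling these from a small driver should report no mismatches for the two matched pairings and a non-zero count for the mismatched one, for example a = -128, b = 1, where the signed i8 compare is true but the zero-extended compare (128 < 1) is false.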