[RISCV] Combine trunc (sra sext (x), zext (y)) to sra (x, smin (y, scalarsizeinbits(y) - 1)) #65728
…NFC. Add a series of pre-commit tests for a later patch that performs the trunc (sra sext(X), zext(Y)) -> sra (X, smin (Y, scalarsize(Y) - 1)) combine.
…alarsize(Y) - 1)) For i8/i16 element-wise vector arithmetic right shifts, the source value is first sign-extended to i32 and the shift amount is zero-extended to i32 before the vsra instruction is performed, and the result is then truncated back to the original element type. On RVV, the truncate is lowered into n levels of TRUNCATE_VECTOR_VL to satisfy RVV's SEW*2->SEW truncate restriction, and that pattern is later expanded into a series of "vsetvli" and "vnsrl" instructions. On RVV, we can instead use smin(Y, ScalarSizeInBits(Y)-1) as the shift amount for the vsra instruction, because only the low lg2(SEW) bits of the shift amount matter. For validation of the transformation, see the alive2 link: https://alive2.llvm.org/ce/z/wXLrLT
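The equivalence described above can be spot-checked by brute force for i8 elements. The following is a minimal sketch, not the patch's code: the function names are hypothetical, it models the two forms on scalars rather than vectors, and it assumes that `>>` on a negative signed value is an arithmetic shift (guaranteed since C++20, and the behaviour of mainstream compilers before that). Only shift amounts below 32 are compared, since a wider shift amount makes the i32 shift poison in LLVM IR and leaves nothing to compare against.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Wide form: trunc(sra(sext(x), zext(y))), computed in 32 bits.
// Only meaningful for y < 32.
std::int8_t wide_sra(std::int8_t x, std::uint8_t y) {
    std::int32_t wide_x = static_cast<std::int32_t>(x);    // sext
    std::uint32_t wide_y = static_cast<std::uint32_t>(y);  // zext
    return static_cast<std::int8_t>(wide_x >> wide_y);     // sra, then trunc
}

// Narrow form: sra(x, smin(y, 7)), with y read as a signed i8.
// The "& 7" models a shift that only uses the low lg2(SEW) = 3 bits.
std::int8_t narrow_sra(std::int8_t x, std::uint8_t y) {
    std::int8_t signed_y = static_cast<std::int8_t>(y);
    std::int8_t amount = std::min<std::int8_t>(signed_y, 7);  // smin
    return static_cast<std::int8_t>(x >> (amount & 7));
}

// Counts (x, y) pairs where the two forms disagree, over every i8 value
// of x and every shift amount for which the wide shift is defined.
int count_mismatches() {
    int mismatches = 0;
    for (int x = -128; x <= 127; ++x) {
        for (int y = 0; y < 32; ++y) {
            std::int8_t a = wide_sra(static_cast<std::int8_t>(x),
                                     static_cast<std::uint8_t>(y));
            std::int8_t b = narrow_sra(static_cast<std::int8_t>(x),
                                       static_cast<std::uint8_t>(y));
            if (a != b) ++mismatches;
        }
    }
    return mismatches;
}

// Aborts if any defined input pair disagrees.
void check_combine() { assert(count_mismatches() == 0); }
```

For shift amounts 0 to 7 both forms shift by the same amount; for 8 to 31 the wide form leaves only sign bits in the low byte, which matches a shift by 7 in the narrow form. The alive2 link remains the authoritative proof; this sketch only illustrates the i8 case.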
Yeah, I tried that one, but it is still unclickable. I even tried reopening the pull request to trigger that label, but it still doesn't work. BTW, thank you for helping me add reviewers.
No problem, I think you're right btw, it looks like only those with write access can request a review. I've flagged it in the discourse thread https://discourse.llvm.org/t/update-on-github-pull-requests/71540/105?u=lukel
Kindly ping.
SDValue N10 = N1.getOperand(0);

if (N00.getValueType().isVector() &&
    N00.getValueType() == N10.getValueType() && N->hasOneUse() &&
Why does N need to have a single use?
Yeah, I agree with you. There is no need to check hasOneUse for N here. Comment addressed.
LGTM
Thank you for taking the time to read this question.
I will merge this pull request after this patch passes all regression tests on my machine (I cannot rerun the failed workflow on buildkite).
Thank you folks, that's a big help for me.
…alarsizeinbits(y) - 1)) (llvm#65728) On RVV, when a C/C++ program performs an element-wise arithmetic right shift on i8 or i16 vectors, the value to be shifted is first sign-extended to i32 and the shift amount is zero-extended to i32 for the vsra.vv instruction, and the result is then truncated back to the original element type. This pattern is later expanded into a series of "vsetvli" and "vnsrl" instructions, because the RVV spec only supports 2 * SEW -> SEW truncates. For vectors, however, the shift amount can instead be computed as smin (Y, ScalarSizeInBits(Y) - 1), and for the vsra instruction only the low lg2(SEW) bits of the shift amount matter. - Alive2: https://alive2.llvm.org/ce/z/u3-Zdr - C++ Test cases : https://gcc.godbolt.org/z/q1qE7fbha
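As an illustration of the kind of source code that produces this pattern, here is a minimal sketch of an element-wise shift kernel. It is a hypothetical example, not one of the linked test cases, and `shift_elements` and `demo` are made-up names. It assumes a 32-bit `int`, so integer promotion plays the role of the sext/zext to i32 and the store back into `dst` plays the role of the truncate. Shift amounts of 32 or more would be undefined behaviour here, and `>>` on a negative value is assumed to be an arithmetic shift (guaranteed since C++20).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Element-wise arithmetic right shift on 8-bit data.
// src[i] is promoted (sign-extended) to int, amt[i] is promoted
// (zero-extended) to int, and the result is narrowed back to 8 bits.
void shift_elements(std::int8_t *dst, const std::int8_t *src,
                    const std::uint8_t *amt, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = static_cast<std::int8_t>(src[i] >> amt[i]);
    }
}

// Small usage example with hand-checked results.
void demo() {
    const std::int8_t src[4] = {-128, -1, 64, 127};
    const std::uint8_t amt[4] = {7, 3, 2, 9};
    std::int8_t dst[4] = {0, 0, 0, 0};
    shift_elements(dst, src, amt, 4);
    assert(dst[0] == -1);  // -128 >> 7
    assert(dst[1] == -1);  // -1 >> 3
    assert(dst[2] == 16);  // 64 >> 2
    assert(dst[3] == 0);   // 127 >> 9 leaves only the (clear) sign bit
}
```

Whether a given compiler invocation auto-vectorizes such a loop into the sext/zext/vsra/trunc sequence depends on the target flags and optimization level, which this sketch does not specify.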
…alarsize(Y) - 1) Like llvm#65728, for i8/i16 element-wise vector logical right shifts, the source value is first zero-extended to i32 and the shift amount is zero-extended to i32 before the vsrl instruction is performed, and the result is then truncated to get the final value. This is later expanded into a series of "vsetvli" and "vnsrl" instructions. On RVV, the vsrl instruction only treats the low lg2(SEW) bits as the shift amount, so the shift amount is calculated using umin(Y, scalarsize(Y) - 1).