[AMDGPU] Add calling convention check in call lowering. #86267
In llvm#81394, calls to amdgpu_gfx functions were allowed to use s0-s3 for inreg arguments. This causes a regression in an offline lit test, where we call an external compute library function from a compute shader. After that change, registers in the s0-s3 range are no longer live-in to the MBB containing the SI_CALL instruction. This seems to be caused by a missing Gfx CC check in `SITargetLowering::LowerCall`, where we insert a `CopyFromReg` from a call chain to either s48-s51 or s0-s3. Because the copy at the beginning of the MBB is now missing, SGPR0-SGPR3 are no longer implicitly live, so lowering a call that still uses SGPR0-SGPR3 also fails. We should therefore not insert the `CopyFromReg` into the chain either.
@llvm/pr-subscribers-backend-amdgpu
Author: Thomas Symalla (tsymalla)
Changes
In #81394, calls to amdgpu_gfx functions were allowed to use s0-s3 for inreg arguments. This causes a regression in an offline lit test, where we call an external compute library function from a compute shader. After that change, registers in the s0-s3 range are no longer live-in to the MBB containing the SI_CALL instruction. I hope that makes sense somehow.
Full diff: https://github.com/llvm/llvm-project/pull/86267.diff
1 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 7f0cff72c18661..d82038bb6c51c5 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -3684,7 +3684,7 @@ SDValue SITargetLowering::LowerCall(CallLoweringInfo &CLI,
if (!IsSibCall)
Chain = DAG.getCALLSEQ_START(Chain, 0, 0, DL);
- if (!IsSibCall || IsChainCallConv) {
+ if (!AMDGPU::isGraphics(CallConv) && (!IsSibCall || IsChainCallConv)) {
if (!Subtarget->enableFlatScratch()) {
SmallVector<SDValue, 4> CopyFromChains;
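To make the effect of the changed guard easier to see, here is a small standalone C++ sketch of the condition before and after the patch. This is not LLVM code: `CallInfo`, `insertsCopyBefore` and `insertsCopyAfter` are hypothetical names, and `IsGraphicsCC` is only a stand-in for the result of `AMDGPU::isGraphics(CallConv)`. The nested `enableFlatScratch()` check from the diff is deliberately left out.

```cpp
// Simplified model of the outer guard touched by this patch.
// All names are hypothetical stand-ins, not part of LLVM.
struct CallInfo {
  bool IsGraphicsCC;    // stand-in for AMDGPU::isGraphics(CallConv)
  bool IsSibCall;       // mirrors the IsSibCall flag in LowerCall
  bool IsChainCallConv; // mirrors the IsChainCallConv flag in LowerCall
};

// Guard before the patch: the calling convention is not consulted.
constexpr bool insertsCopyBefore(CallInfo C) {
  return !C.IsSibCall || C.IsChainCallConv;
}

// Guard after the patch: graphics calling conventions skip the block.
constexpr bool insertsCopyAfter(CallInfo C) {
  return !C.IsGraphicsCC && (!C.IsSibCall || C.IsChainCallConv);
}

// A regular (non-sibling) call with a graphics CC is the case that changes.
static_assert(insertsCopyBefore(CallInfo{true, false, false}),
              "old guard enters the block for graphics CCs");
static_assert(!insertsCopyAfter(CallInfo{true, false, false}),
              "new guard skips the block for graphics CCs");

// Non-graphics calls behave the same under both guards.
static_assert(insertsCopyBefore(CallInfo{false, false, false}) ==
                  insertsCopyAfter(CallInfo{false, false, false}),
              "non-graphics behaviour is unchanged");
```

Under this model, the only inputs whose result differs between the two guards are those with `IsGraphicsCC` set, which matches the intent stated in the description.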
Missing test. I'd expect us to always emit the necessary parameter copies, which would often be identity copies that fold away later.
Yeah, I am working towards a minimal example.
I'd expect that as well, but it seems the mentioned change causes the removal of these copies:
Working:
Broken:
=> sgpr2_sgpr3 is not in the live-ins.
Note: The implicits and implicit-defs disappear with the change, preventing the live register analysis from complaining.
Will revert original change.
Shouldn't it be any amdgpu_gfx function calling ccc with inreg/SGPR arguments?
Better to just finish the fix here?
Don't have time for that, unfortunately.
I insist you at least push the test which failed |