Skip to content

Conversation

@topperc
Copy link
Collaborator

@topperc topperc commented Nov 14, 2025

This replaces the 2 bool flags and the anonymous union. This also removes an implicit conversion from Register to unsigned and a call to MCRegister::id().

The ArgDescriptor constructor was always assigning the union through the MCRegister field even for stack offsets.

The change to SIMachineFunctionInfo.h fixes a case where getRegister was being called on an unset ArgDescriptor. Since it was only this case, it seemed cleaner to fix it at the caller. The other option would be to make getRegister() return MCRegister() for an unset ArgDescriptor.

This replaces the 2 bool flags and the anonymous union. This also
removes an implicit conversion from Register to unsigned and a
call to MCRegister::id().

The ArgDescriptor constructor was always assigning the union through
the MCRegister field even for stack offsets.

The change to SIMachineFunctionInfo.h fixes a case where getRegister
was being called on an unset ArgDescriptor. Since it was only this
case, it seemed cleaner to fix it at the caller. The other option would
be to make getRegister() return MCRegister() for an unset ArgDescriptor.
@llvmbot
Copy link
Member

llvmbot commented Nov 14, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Craig Topper (topperc)

Changes

This replaces the 2 bool flags and the anonymous union. This also removes an implicit conversion from Register to unsigned and a call to MCRegister::id().

The ArgDescriptor constructor was always assigning the union through the MCRegister field even for stack offsets.

The change to SIMachineFunctionInfo.h fixes a case where getRegister was being called on an unset ArgDescriptor. Since it was only this case, it seemed cleaner to fix it at the caller. The other option would be to make getRegister() return MCRegister() for an unset ArgDescriptor.


Full diff: https://github.com/llvm/llvm-project/pull/167992.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUArgumentUsageInfo.h (+19-23)
  • (modified) llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h (+3-1)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUArgumentUsageInfo.h b/llvm/lib/Target/AMDGPU/AMDGPUArgumentUsageInfo.h
index 8838a94a639eb..4c382a0046b64 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUArgumentUsageInfo.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUArgumentUsageInfo.h
@@ -13,6 +13,7 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/CodeGen/Register.h"
 #include "llvm/Pass.h"
+#include <variant>
 
 namespace llvm {
 
@@ -27,54 +28,49 @@ struct ArgDescriptor {
   friend struct AMDGPUFunctionArgInfo;
   friend class AMDGPUArgumentUsageInfo;
 
-  union {
-    MCRegister Reg;
-    unsigned StackOffset;
-  };
+  std::variant<std::monostate, MCRegister, unsigned> Val;
 
   // Bitmask to locate argument within the register.
   unsigned Mask;
 
-  bool IsStack : 1;
-  bool IsSet : 1;
-
 public:
-  ArgDescriptor(unsigned Val = 0, unsigned Mask = ~0u, bool IsStack = false,
-                bool IsSet = false)
-      : Reg(Val), Mask(Mask), IsStack(IsStack), IsSet(IsSet) {}
+  ArgDescriptor(unsigned Mask = ~0u) : Mask(Mask) {}
 
   static ArgDescriptor createRegister(Register Reg, unsigned Mask = ~0u) {
-    return ArgDescriptor(Reg, Mask, false, true);
+    ArgDescriptor Ret(Mask);
+    Ret.Val = Reg.asMCReg();
+    return Ret;
   }
 
   static ArgDescriptor createStack(unsigned Offset, unsigned Mask = ~0u) {
-    return ArgDescriptor(Offset, Mask, true, true);
+    ArgDescriptor Ret(Mask);
+    Ret.Val = Offset;
+    return Ret;
   }
 
   static ArgDescriptor createArg(const ArgDescriptor &Arg, unsigned Mask) {
-    return ArgDescriptor(Arg.Reg.id(), Mask, Arg.IsStack, Arg.IsSet);
+    // Copy the descriptor, then change the mask.
+    ArgDescriptor Ret(Arg);
+    Ret.Mask = Mask;
+    return Ret;
   }
 
-  bool isSet() const {
-    return IsSet;
-  }
+  bool isSet() const { return !std::holds_alternative<std::monostate>(Val); }
 
   explicit operator bool() const {
     return isSet();
   }
 
-  bool isRegister() const {
-    return !IsStack;
-  }
+  bool isRegister() const { return std::holds_alternative<MCRegister>(Val); }
 
   MCRegister getRegister() const {
-    assert(!IsStack);
-    return Reg;
+    assert(isRegister());
+    return std::get<MCRegister>(Val);
   }
 
   unsigned getStackOffset() const {
-    assert(IsStack);
-    return StackOffset;
+    assert(std::holds_alternative<unsigned>(Val));
+    return std::get<unsigned>(Val);
   }
 
   unsigned getMask() const {
diff --git a/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h b/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
index 019c3b79e5fe5..ca3c35067a923 100644
--- a/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
@@ -1014,7 +1014,9 @@ class SIMachineFunctionInfo final : public AMDGPUMachineFunction,
   void setNumWaveDispatchVGPRs(unsigned Count) { NumWaveDispatchVGPRs = Count; }
 
   Register getPrivateSegmentWaveByteOffsetSystemSGPR() const {
-    return ArgInfo.PrivateSegmentWaveByteOffset.getRegister();
+    if (ArgInfo.PrivateSegmentWaveByteOffset)
+      return ArgInfo.PrivateSegmentWaveByteOffset.getRegister();
+    return MCRegister();
   }
 
   /// Returns the physical register reserved for use as the resource

Copy link
Contributor

@s-barannikov s-barannikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, assuming this isn't performance critical code

@topperc
Copy link
Collaborator Author

topperc commented Nov 14, 2025

LGTM, assuming this isn't performance critical code

Is std::variant expensive?

@github-actions
Copy link

github-actions bot commented Nov 14, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@s-barannikov
Copy link
Contributor

LGTM, assuming this isn't performance critical code

Is std::variant expensive?

It can be, see e.g. #157229

@topperc
Copy link
Collaborator Author

topperc commented Nov 14, 2025

LGTM, assuming this isn't performance critical code

Is std::variant expensive?

It can be, see e.g. #157229

I don't use std::visit so hopefully that avoids some of the issues. I looked a little bit at the llvm IR for SISelLowering.cpp before and after this change. It didn't look too bad. It's a little different because the byte that tracks which type is active is between the variant and the mask in the struct layout. Previously the two bools were at the end.

@topperc topperc merged commit ebc0e07 into llvm:main Nov 14, 2025
10 checks passed
@topperc topperc deleted the pr/mcregister/amdgpu-argdescriptor branch November 14, 2025 17:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants