[SystemZ] Fix Operand Retrieval for Vector Reduction Intrinsic in `shouldExpandReduction` #88874

dominik-steenken · 2024-04-16T11:12:45Z

In the existing version, SystemZTTIImpl::shouldExpandReduction will create a cast error when handling vector reduction intrinsics that do not have the vector to reduce as their first operand, such as llvm.vector.reduce.fadd and llvm.vector.reduce.fmul. This commit fixes that problem by introducing a short loop to find the vector operand instead of assuming that it is the first operand.

dominik-steenken · 2024-04-16T11:13:15Z

@uweigand @JonPsson1 FYI

llvmbot · 2024-04-16T11:13:19Z

@llvm/pr-subscribers-backend-systemz

Author: Dominik Steenken (dominik-steenken)

Changes

In the existing version, SystemZTTIImpl::shouldExpandReduction will create a cast error when handling vector reduction intrinsics that do not have the vector to reduce as their first operand, such as llvm.vector.reduce.fadd and llvm.vector.reduce.fmul. This commit fixes that problem by introducing a short loop to find the vector operand instead of assuming that it is the first operand.

Full diff: https://github.com/llvm/llvm-project/pull/88874.diff

1 Files Affected:

(modified) llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp (+29-10)

diff --git a/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp b/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp
index 4c9e78c05dbcac..5da42d7cccd3c1 100644
--- a/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp
@@ -18,6 +18,7 @@
 #include "llvm/CodeGen/BasicTTIImpl.h"
 #include "llvm/CodeGen/CostTable.h"
 #include "llvm/CodeGen/TargetLowering.h"
+#include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/Intrinsics.h"
 #include "llvm/Support/Debug.h"
@@ -1323,25 +1324,43 @@ SystemZTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
 }
 
 bool SystemZTTIImpl::shouldExpandReduction(const IntrinsicInst *II) const {
-  // Always expand on Subtargets without vector instructions
+  // Always expand on Subtargets without vector instructions.
   if (!ST->hasVector())
     return true;
 
-  // Always expand for operands that do not fill one vector reg
-  auto *Type = cast<FixedVectorType>(II->getOperand(0)->getType());
-  unsigned NumElts = Type->getNumElements();
-  unsigned ScalarSize = Type->getScalarSizeInBits();
+  // Find the type of the vector operand of the intrinsic
+  // This assumes that each vector reduction intrinsic only
+  // has one vector operand.
+  FixedVectorType *VType = 0x0;
+  for (unsigned I = 0; I < II->getNumOperands(); ++I) {
+    auto *T = II->getOperand(I)->getType();
+    if (T->isVectorTy()) {
+      VType = cast<FixedVectorType>(T);
+      break;
+    }
+  }
+
+  // If we did not find a vector operand, do not continue.
+  if (VType == 0x0)
+    return true;
+
+  // If the vector operand is not a full vector, the reduction
+  // should be expanded.
+  unsigned NumElts = VType->getNumElements();
+  unsigned ScalarSize = VType->getScalarSizeInBits();
   unsigned MaxElts = SystemZ::VectorBits / ScalarSize;
   if (NumElts < MaxElts)
     return true;
 
-  // Otherwise
+  // Handling of full vector operands depends on the
+  // individual intrinsic.
   switch (II->getIntrinsicID()) {
-  // Do not expand vector.reduce.add
-  case Intrinsic::vector_reduce_add:
-    // Except for i64, since the performance benefit is dubious there
-    return ScalarSize >= 64;
   default:
     return true;
+  // Do not expand vector.reduce.add...
+  case Intrinsic::vector_reduce_add:
+    // ...unless the scalar size is i64 or larger, since the
+    // performance benefit is dubious there
+    return ScalarSize >= 64;
   }
 }

JonPsson1 · 2024-04-16T11:31:25Z

llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp

+  // Find the type of the vector operand of the intrinsic
+  // This assumes that each vector reduction intrinsic only
+  // has one vector operand.
+  FixedVectorType *VType = 0x0;


JonPsson1 · 2024-04-16T11:34:58Z

llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp

+      break;
+    }
+  }
+


It doesn't seem quite foolproof to me to make the assumption that all vector operands have the same type. Or is it?

If so, one option could be to keep looping and if VType is non-null (following operands) assert that VType == T.

I was going off the list of intrinsics here. According to that, all of the reduction intrinsics either have a single vector as an operand, or a scalar and a vector. So any vector found should be the only vector.
However, if we want to be robust towards future intrinsic additions, we might want to handle this diffferently? However, once there are more than one vector operand, how would we generically decide which one the vector that is supposed to be reduced is?

I think we should either explicitly handle each intrinsic by opcode (opc -> vector type), or else at least make sure that there is only one vector operand type. If something different emerges in the future, I guess then we would have to handle it by opcode.

I'm just thinking that, whether or not the operand vector is full is probably going to be a factor in many of the reductions. So it seems wrong to have it just in the branch for vector.reduce.add. I do see your point though about keeping things as simple as possible. I'm going to refactor things a bit to make things simpler, but kepp some of the extensibility in case we need to handle more intrinsics.

shouldExpandReduction() looks nice now.

I don't quite like getFirstOperandVectorType(), it's not robust against any future instrinsics we do not know about, and it doesn't have an assert that would discover any such future cases if they emerge.

We want to know the intrinsic type, and we know which intrinsic we are dealing with. So why make it unnecessarily general? No need to loop as we know what operand number we need to use. This could be done directly in shouldExpandReduction(), or if you want to have a function for it, you could pass the operand number to it.

JonPsson1 · 2024-04-16T11:35:58Z

llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp

  default:
    return true;
+  // Do not expand vector.reduce.add...
+  case Intrinsic::vector_reduce_add:


If this is the only reduction handled, would it simplify to just check against this near the top, and return true for anything else?

That would simplify it, however, the initial intent was that we would probably want to do similar things for other reduction intrinsics in the future, so i wanted it to be a bit more generic.

I think generally we might want to keep the code as simple as possible until we actually need something more complex. In this case it is slightly confusing to me to go through the switch and all, when there is actually only one intrinsic handled... But not sure if this is just my opinion...

This style of commenting is kind of nice but I find it perhaps a little unusual compared to https://llvm.org/docs/CodingStandards.html#id8. I guess it's up to you how you want to write it, but at the very least add the final period. I think typically the comment for a particular switch case should be in one place, probably all just below the 'case' line.

i will add the .

JonPsson1 · 2024-04-17T09:31:36Z

llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp

+    // ...unless the scalar size is i64 or larger,
+    // or the operand vector is not full, since the
+    // performance benefit is dubious in those cases
+    return (VType->getScalarSizeInBits() >= 64) || not isVectorFull(VType);
  }


isVectorFull: So the idea is that any vectors which are less than one vector should be expanded, but e.g. 1.5 vectors should not be..? Couldn't you then just check if the total size of VType is >= SystemZ::VectorBits?

not -> !

You're right, that would be easier. i was thinking in terms of element counts, but we don't really need that.

github-actions · 2024-04-17T11:14:43Z

✅ With the latest revision this PR passed the C/C++ code formatter.

JonPsson1 · 2024-04-18T11:28:48Z

llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp

+  assert(T->isVectorTy());
+  return cast<FixedVectorType>(T);
+}
+


Since the intrinsic is defined so that it must have a vector of integer values, we do not need to verify this. I think this should be the same for other intrinsics as well. Did you try to use some other kind of type and compile it with 'opt'? I think it will verify the module and report broken IR.

Given that we don't need to have an extra redundant check here, I think this method (at least for now) is a bit superfluous - probably easier to just get the type directly in one line below.

The vectors can also have float values, but that would not preclude the FixedVectorType. The idea behind the assert was that when you use this function to get the operand type for a given intrinsics vector operand, to check that you supplied the correct operand index, so we don't run into the same error that motivated this PR in the first place.

I think that is over-asserting as the operand index / type never changes and is trivially entered just once and has to be correct per the language spec.

The problem was that we handled some other intrinsic by accident, not that we messed up the operand index of the one we are handling, right?

Do you agree, or you think I am missing something here?

I think handling other intrinsics was intended (but primarily for future use), and the problem arose because some of them had the vector operand in a different place.

But this is not a hill i'm willing to die on. If you prefer to not have the assert and then move the cast out of the function and back into the case statement, i'll do that.

It's kind of interesting if it is needed or not, and maybe also a matter of style, perhaps. @uweigand maybe has an opinion as well?

I do believe this assert is redundant - it is fails, the immediately following cast would also fail, which would give us the same information. And without the assert, the function becomes so trivial that it might as well be inlined back in place?

correct, without the assert, the function can go.

uweigand

See inline comments. Otherwise looks good to me.

uweigand · 2024-04-18T14:11:13Z

llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp

+  assert(T->isVectorTy());
+  return cast<FixedVectorType>(T);
+}
+


I do believe this assert is redundant - it is fails, the immediately following cast would also fail, which would give us the same information. And without the assert, the function becomes so trivial that it might as well be inlined back in place?

uweigand · 2024-04-18T14:11:49Z

llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp

+    // ...unless the scalar size is i64 or larger,
+    // or the operand vector is not full, since the
+    // performance benefit is dubious in those cases.
+    return (VType->getScalarSizeInBits() >= 64) ||


We don't need the parentheses here.

Thanks, i've incorporated both of these comments.

In the existing version, SystemZTTIImpl::shouldExpandReduction will create a `cast` error when handling vector reduction intrinsics that do not have the vector to reduce as their first operand, such as `llvm.vector.reduce.fadd` and `llvm.vector.reduce.fmul`. This commit fixes that problem by moving the cast into the case statement that handles the specific intrinsic, where the vector operand position is well-known.

uweigand

LGTM now, thanks!

uweigand · 2024-04-19T11:14:21Z

Hi @dominik-steenken I just noticed that the commit message no longer matches what the code now actually does ... can you fix this before I merge it? Thanks!

dominik-steenken · 2024-04-19T11:23:20Z

I thought i did - i changed the reference to a loop from earlier to

This commit fixes that problem by moving the cast into the case
statement that handles the specific intrinsic, where the vector
operand position is well-known.

do you think that's not accurate?

uweigand · 2024-04-19T12:19:39Z

I thought i did - i changed the reference to a loop from earlier to
This commit fixes that problem by moving the cast into the case
statement that handles the specific intrinsic, where the vector
operand position is well-known.
do you think that's not accurate?

This is fine - but this is not what GitHub's "merge and squash" was suggesting as commit message. There I saw "This commit fixes that problem by introducing a short loop to find the vector operand instead of assuming that it is the first operand." instead, not sure why. Anyway, I've now merged using the correct message.

…ouldExpandReduction` (llvm#88874) In the existing version, SystemZTTIImpl::shouldExpandReduction will create a `cast` error when handling vector reduction intrinsics that do not have the vector to reduce as their first operand, such as `llvm.vector.reduce.fadd` and `llvm.vector.reduce.fmul`. This commit fixes that problem by moving the cast into the case statement that handles the specific intrinsic, where the vector operand position is well-known.

llvmbot added the backend:SystemZ label Apr 16, 2024

JonPsson1 reviewed Apr 16, 2024

View reviewed changes

JonPsson1 reviewed Apr 17, 2024

View reviewed changes

JonPsson1 reviewed Apr 18, 2024

View reviewed changes

uweigand reviewed Apr 18, 2024

View reviewed changes

dominik-steenken force-pushed the fix-expandreduction branch from 181b77b to da9a7b3 Compare April 19, 2024 08:41

uweigand approved these changes Apr 19, 2024

View reviewed changes

uweigand merged commit 5af9701 into llvm:main Apr 19, 2024
3 of 4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SystemZ] Fix Operand Retrieval for Vector Reduction Intrinsic in `shouldExpandReduction` #88874

[SystemZ] Fix Operand Retrieval for Vector Reduction Intrinsic in `shouldExpandReduction` #88874

dominik-steenken commented Apr 16, 2024

dominik-steenken commented Apr 16, 2024

llvmbot commented Apr 16, 2024

JonPsson1 Apr 16, 2024

dominik-steenken Apr 16, 2024

JonPsson1 Apr 16, 2024

dominik-steenken Apr 16, 2024

JonPsson1 Apr 16, 2024

dominik-steenken Apr 17, 2024

JonPsson1 Apr 17, 2024

JonPsson1 Apr 16, 2024

dominik-steenken Apr 16, 2024

JonPsson1 Apr 16, 2024

JonPsson1 Apr 18, 2024

dominik-steenken Apr 18, 2024

JonPsson1 Apr 17, 2024 •

edited

dominik-steenken Apr 17, 2024

github-actions bot commented Apr 17, 2024 •

edited

JonPsson1 Apr 18, 2024

dominik-steenken Apr 18, 2024

JonPsson1 Apr 18, 2024

dominik-steenken Apr 18, 2024 •

edited

JonPsson1 Apr 18, 2024

uweigand Apr 18, 2024

dominik-steenken Apr 19, 2024

uweigand left a comment

uweigand Apr 18, 2024

uweigand Apr 18, 2024

dominik-steenken Apr 19, 2024

uweigand left a comment

uweigand commented Apr 19, 2024

dominik-steenken commented Apr 19, 2024

uweigand commented Apr 19, 2024

[SystemZ] Fix Operand Retrieval for Vector Reduction Intrinsic in shouldExpandReduction #88874

[SystemZ] Fix Operand Retrieval for Vector Reduction Intrinsic in shouldExpandReduction #88874

Conversation

dominik-steenken commented Apr 16, 2024

dominik-steenken commented Apr 16, 2024

llvmbot commented Apr 16, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JonPsson1 Apr 17, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

github-actions bot commented Apr 17, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dominik-steenken Apr 18, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

uweigand left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

uweigand left a comment

Choose a reason for hiding this comment

uweigand commented Apr 19, 2024

dominik-steenken commented Apr 19, 2024

uweigand commented Apr 19, 2024

[SystemZ] Fix Operand Retrieval for Vector Reduction Intrinsic in `shouldExpandReduction` #88874

[SystemZ] Fix Operand Retrieval for Vector Reduction Intrinsic in `shouldExpandReduction` #88874

JonPsson1 Apr 17, 2024 •

edited

github-actions bot commented Apr 17, 2024 •

edited

dominik-steenken Apr 18, 2024 •

edited