Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GISEL][RISCV] IRTranslator for scalable vector load #80006

Merged
merged 9 commits into from
Mar 20, 2024

Conversation

jiahanxie353
Copy link
Contributor

@jiahanxie353 jiahanxie353 commented Jan 30, 2024

Add IRTranslator for scalable vector load instruction and include corresponding tests with alignment argument included, which can be smaller/equal/larger than element size or smaller/equal/larger than the minimum total vector size.

Copy link

github-actions bot commented Jan 30, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 30, 2024

@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-llvm-globalisel

Author: Jiahan Xie (jiahanxie353)

Changes

Hi @michaelmaitland @topperc , this patch aims at translating LLVM IR with scalable vector load into G_LOAD.


Full diff: https://github.com/llvm/llvm-project/pull/80006.diff

4 Files Affected:

  • (modified) llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp (+2-3)
  • (modified) llvm/lib/CodeGen/MachineVerifier.cpp (+1-1)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+4-3)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/vec-ld.ll (+7)
diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index dd38317c26bf..72a8b0b001d0 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -1363,9 +1363,8 @@ static bool isSwiftError(const Value *V) {
 
 bool IRTranslator::translateLoad(const User &U, MachineIRBuilder &MIRBuilder) {
   const LoadInst &LI = cast<LoadInst>(U);
-
-  unsigned StoreSize = DL->getTypeStoreSize(LI.getType());
-  if (StoreSize == 0)
+  TypeSize StoreSize = DL->getTypeStoreSize(LI.getType());
+  if (StoreSize.isZero())
     return true;
 
   ArrayRef<Register> Regs = getOrCreateVRegs(LI);
diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index b182000a3d70..6150cdf7d47f 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -1192,7 +1192,7 @@ void MachineVerifier::verifyPreISelGenericInstruction(const MachineInstr *MI) {
         if (MMO.getSizeInBits() >= ValTy.getSizeInBits())
           report("Generic extload must have a narrower memory type", MI);
       } else if (MI->getOpcode() == TargetOpcode::G_LOAD) {
-        if (MMO.getSize() > ValTy.getSizeInBytes())
+        if (TypeSize::isKnownGT(MMO.getMemoryType().getSizeInBytes(), ValTy.getSizeInBytes()))
           report("load memory size cannot exceed result size", MI);
       } else if (MI->getOpcode() == TargetOpcode::G_STORE) {
         if (ValTy.getSizeInBytes() < MMO.getSize())
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 82836346d883..ab9958111491 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -20532,11 +20532,12 @@ unsigned RISCVTargetLowering::getCustomCtpopCost(EVT VT,
 
 bool RISCVTargetLowering::fallBackToDAGISel(const Instruction &Inst) const {
 
-  // GISel support is in progress or complete for G_ADD, G_SUB, G_AND, G_OR, and
-  // G_XOR.
+  // GISel support is in progress or complete for G_ADD, G_SUB, G_AND, G_OR,
+  // G_XOR, and G_LOAD
   unsigned Op = Inst.getOpcode();
   if (Op == Instruction::Add || Op == Instruction::Sub ||
-      Op == Instruction::And || Op == Instruction::Or || Op == Instruction::Xor)
+      Op == Instruction::And || Op == Instruction::Or ||
+      Op == Instruction::Xor || Op == Instruction::Load)
     return false;
 
   if (Inst.getType()->isScalableTy())
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/vec-ld.ll b/llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/vec-ld.ll
new file mode 100644
index 000000000000..5f98c6a7066c
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/vec-ld.ll
@@ -0,0 +1,7 @@
+; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+; RUN: llc -mtriple=riscv32 -mattr=+v -global-isel -stop-after=irtranslator -verify-machineinstrs < %s | FileCheck  -check-prefixes=RV32I %s
+
+define void @vload_vint8m1(ptr %pa) {
+  %va = load <vscale x 8 x i8>, ptr %pa
+	ret void
+}

; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
; RUN: llc -mtriple=riscv32 -mattr=+v -global-isel -stop-after=irtranslator -verify-machineinstrs < %s | FileCheck -check-prefixes=RV32I %s

define void @vload_vint8m1(ptr %pa) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is missing test checks. Also, I think we need to test a more complete set of scalable vector types. Lastly, I think we need test checks for rv64 as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll add more tests with different types and to include rv64 if this simplest test case makes sense.
Right now, I'm wondering, when I use llvm/utils/update_mir_test_checks.py , the vec-ll.s file auto-generate align 8 at the end of %va = load <vscale x 8 x i8>, ptr %pa, so I appended it. Does it make sense to add align 8?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have tests that include the align argument and tests that do not include the align argument.

We probably need to add support for both aligned and unaligned load/store. It is probably okay to start off with only worrying about aligned loads/stores. I would have thought that align 8 and align 4 is aligned on rv32 and align 8 is aligned on rv64, but llvm/test/CodeGen/RISCV/rvv/unaligned-loads-stores.ll::aligned_load_nxv1i32_a4 seems to imply that align 4 is also aligned on rv64 -- @topperc are you able eto clarify this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alignment is relative to EEW not XLen.

If an element accessed by a vector memory instruction is not naturally aligned to the size of the element, either the element is transferred successfully or an address misaligned exception is raised on that element.

@@ -1240,7 +1240,7 @@ void MachineMemOperand::print(raw_ostream &OS, ModuleSlotTracker &MST,
<< "unknown-address";
}
MachineOperand::printOperandOffset(OS, getOffset());
if (getSize() > 0 && getAlign() != getSize())
if (getMemoryType().getSizeInBytes().getKnownMinValue() > 0 && getAlign() != getMemoryType().getSizeInBytes().getKnownMinValue())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very confident with this approach. I'm a bit worried if it would this breaks scalar cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this check is to emit an align operand in the case that the alignment is not the same size as the element size.

That means if the memory type is nxvMxsN that we want this guard to be true when N != Align * 8. N is the size in bytes of the element type. I think you are checking M here.

@@ -1240,7 +1240,7 @@ void MachineMemOperand::print(raw_ostream &OS, ModuleSlotTracker &MST,
<< "unknown-address";
}
MachineOperand::printOperandOffset(OS, getOffset());
if (getSize() > 0 && getAlign() != getSize())
if (getType().getElementCount().getKnownMinValue() > 0 && getAlign() != getType().getElementCount().getKnownMinValue())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (getType().getElementCount().getKnownMinValue() > 0 && getAlign() != getType().getElementCount().getKnownMinValue())
unsigned MinSize = getType().getSizeInBytes().getKnownMinValue();
if (MinSize > 0 && getAlign() != MinSize)

I think this should solve a large chunk of the test failures. The old code is saying if (the size of the memory reference is not equal to the alignment).

The code you have here is saying if (the minimum number of elements is not equal to the alignment).

However, we probably need to account for the size of the element as well. The tricky part here is that we only know the minimum size of the memory reference. In the case that the MinSize is not equal to the alignment, it can't hurt to emit an explicit align value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getSize() checks isValid() so it probably makes sense to change my suggestion to

Suggested change
if (getType().getElementCount().getKnownMinValue() > 0 && getAlign() != getType().getElementCount().getKnownMinValue())
uint64_t MinSize =
MemoryType.isValid() ? getType().getSizeInBytes().getKnownMinValue() : ~UINT64_C(0);
if (MinSize > 0 && getAlign() != MinSize)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And maybe to leave a TODO for getSize() to return a TypeSize?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getSize() checks isValid() so it probably makes sense to change my suggestion to

Yes totally! Actuall I was just trying to push up to use GitHub actions to check because for some reason when I tried to test the test_g_load.mir file locally, I got errors like:

*** Bad machine code: atomic load cannot use release ordering ***
- function:    test_load
- basic block: %bb.0  (0x560485132100)
- instruction: %5:_(s32) = G_LOAD %2:_(p0) :: (load acq_rel (s32))
LLVM ERROR: Found 5 machine code errors.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.

But GitHub action is no longer reporting that so I'm a bit confused.

The command I used in the terminal is to first change my build to: -DLLVM_TARGETS_TO_BUILD="RISCV;AArch64;ARM"; and then I used ./build-llc/bin/llc -mtriple=arm64 -run-pass=none -verify-machineinstrs llvm/test/MachineVerifier/test_g_load.mir (which is just a copy-and-paste from the test file itself) to test this file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it fail when you run ninja check-llvm?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the CHECK statements, this test is expecting bad machine code failures.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it can be useful to do ./build/bin/llvm-lit -v llvm/test/MachineVerifier/test_g_load.mir to see whether a specific test is failing/passing and why.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it can be useful to do ./build/bin/llvm-lit -v llvm/test/MachineVerifier/test_g_load.mir to see whether a specific test is failing/passing and why.

this command is passing!

Does it fail when you run ninja check-llvm?

I tried to run it but every time I use check-llvm it puts a lot of burden and my machine just got stuck there or even freezed.

If you look at the CHECK statements, this test is expecting bad machine code failures.

right, and I thought there would be a try-catch mechanism to catch the expected failures to report "there are expected failures" instead of actually raising them.

But anyhow, llvm-lit is working, thanks!

Copy link
Contributor

@michaelmaitland michaelmaitland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add some test cases where the llvm IR load instruction has an align argument? Align argument could be smaller/equal/larger than element size or smaller/equal/larger than minimum total vector size (element size * min elts).

Copy link
Contributor

@michaelmaitland michaelmaitland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we test load of vector of pointer type?

ret <vscale x 8 x i64> %va
}

define <vscale x 1 x i8> @vload_nx1i8_align2(ptr %pa) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we have the intended coverage on align.

Align argument could be smaller/equal/larger than element size or smaller/equal/larger than minimum total vector size

For i8 vector types we have test for align 2, align 8, align 32. Recall that align is in bytes but the type i8 is in bits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I'll fix it!

@jiahanxie353
Copy link
Contributor Author

I think now the test cases satisfy "align argument could be smaller/equal/larger than element size or smaller/equal/larger than minimum total vector size".

But I wasn't sure if I'm correctly loading a vector of pointers.

@@ -880,3 +880,25 @@ define <vscale x 2 x i64> @vload_nx2i64_align32(ptr %pa) {
ret <vscale x 2 x i64> %va
}

define <vscale x 8 x i64*> @vload_nx8i64_ptrs(ptr %pa) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vscale x 8 x ptr.

@jiahanxie353 jiahanxie353 marked this pull request as ready for review March 6, 2024 21:04
Copy link
Contributor

@michaelmaitland michaelmaitland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with minor comment.

llvm/lib/CodeGen/MachineOperand.cpp Outdated Show resolved Hide resolved
@@ -1198,7 +1198,8 @@ void MachineVerifier::verifyPreISelGenericInstruction(const MachineInstr *MI) {
if (MMO.getSizeInBits() >= ValTy.getSizeInBits())
report("Generic extload must have a narrower memory type", MI);
} else if (MI->getOpcode() == TargetOpcode::G_LOAD) {
if (MMO.getSize() > ValTy.getSizeInBytes())
if (TypeSize::isKnownGT(MMO.getType().getSizeInBytes(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verifier test for this would be good

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it true that verifier tests for this aims to test run-time attributes? I tried to write some tests, such as loading <vscale x 1 x i16> into <vscale x 1 x i8> but they were all caught as compile-time errors like type mismatch. So it seems like this test requires a clever way to get around, do you have any suggestion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff worked for me:

diff --git a/llvm/lib/CodeGen/MIRParser/MIParser.cpp b/llvm/lib/CodeGen/MIRParser/MIParser.cpp
index d41fc97cb806..41dbbd1b1413 100644
--- a/llvm/lib/CodeGen/MIRParser/MIParser.cpp
+++ b/llvm/lib/CodeGen/MIRParser/MIParser.cpp
@@ -3412,7 +3412,7 @@ bool MIParser::parseMachineMemoryOperand(MachineMemOperand *&Dest) {
     if (expectAndConsume(MIToken::rparen))
       return true;

-    Size = MemoryType.getSizeInBytes();
+    Size = MemoryType.getSizeInBytes().getKnownMinValue();
   }

   MachinePointerInfo Ptr = MachinePointerInfo();
diff --git a/llvm/test/MachineVerifier/test_g_load.mir b/llvm/test/MachineVerifier/test_g_load.mir
index 07c3c0a6b5a2..0ffdf8e8efed 100644
--- a/llvm/test/MachineVerifier/test_g_load.mir
+++ b/llvm/test/MachineVerifier/test_g_load.mir
@@ -26,4 +26,9 @@ body:             |
     ; CHECK: Bad machine code: atomic load cannot use release ordering
     %5:_(s32) = G_LOAD %2 :: (load acq_rel (s32))

+    ; CHECK: Bad machine code: load memory size cannot exceed result size
+    %6:_(<vscale x 2 x s8>) = G_LOAD %2 :: (load (<vscale x 1 x s32>))
+
+    ; CHECK: Bad machine code: load memory size cannot exceed result size
+    %7:_(<vscale x 1 x s8>) = G_LOAD %2 :: (load (<vscale x 2 x s8>))
 ...

Maybe @harviniriawan and @davemgreen can weigh in on the proposed change in parseMachineMemoryOperand because this PR seems like it may be relevant.

@michaelmaitland
Copy link
Contributor

Can you please update the PR description?

@jiahanxie353
Copy link
Contributor Author

Can you please update the PR description?

Sure, done!

Can we merge it? Seems like there is a failing build but it's totally irrelevant to this patch and I didn't touch that at all.

@michaelmaitland
Copy link
Contributor

Can you please update the PR description?

Sure, done!

Can we merge it? Seems like there is a failing build but it's totally irrelevant to this patch and I didn't touch that at all.

Can we wait for an opinion from the people who are working on or giving review to the scalable MMO patch I linked in my comment about the MachineVerifier tests? I think that we could probably add the changes I suggested in that comment thread. It would be nice to have those tests changes in this commit.

Copy link
Contributor

@michaelmaitland michaelmaitland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jiahanxie353 jiahanxie353 merged commit 4bf06be into llvm:main Mar 20, 2024
3 of 4 checks passed
chencha3 pushed a commit to chencha3/llvm-project that referenced this pull request Mar 23, 2024
Add IRTranslator for scalable vector load instruction and include
corresponding tests with alignment argument included, which can be
smaller/equal/larger than element size or smaller/equal/larger than the
minimum total vector size.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants