
[RISCV] Extract subregister if VLEN is known when lowering extract_subvector #65392

Closed
wants to merge 3 commits

Conversation

lukel97 (Contributor) commented Sep 5, 2023

If we know VLEN at compile time, then we can work out which subregister an index into a fixed-length vector lands in.
We can use this information when lowering extract_subvector to perform the vslidedown on a smaller subregister. This allows us to use a smaller LMUL, or, if the extract is aligned to a vector register, to avoid the slide altogether.
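
For illustration, here is a rough sketch of the subregister arithmetic described above (the helper and field names are hypothetical stand-ins, not the patch's code):

// With an exactly known VLEN, work out which VLEN-sized register of an LMUL
// group holds a fixed-length subvector starting at element OrigIdx, and what
// the element index becomes relative to that register.
struct SubRegExtract {
  unsigned SubRegIdx;   // which register of the LMUL group
  unsigned IdxInSubReg; // element offset within that register
};

SubRegExtract computeSubRegExtract(unsigned VLen, unsigned EltSizeInBits,
                                   unsigned OrigIdx) {
  unsigned EltsPerReg = VLen / EltSizeInBits; // e.g. 128 / 64 = 2 for i64
  return {OrigIdx / EltsPerReg, OrigIdx % EltsPerReg};
}

// Example: with VLEN=128, extracting v2i64 at index 2 from a v8i64 gives
// SubRegIdx=1 and IdxInSubReg=0, i.e. the extract is register-aligned and the
// slide can be dropped entirely.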

The logic here is a bit tangled with the scalable path: if people find this too unwieldy, I can separate it out and duplicate it for the fixed-length case.

This technique could be applied to extract_vector_elt, insert_vector_elt and insert_subvector too.

This is stacked upon #65391

This is partly a precommit for an upcoming patch, and partly to remove the
fixed length LMUL restriction similarly to what was done in
https://reviews.llvm.org/D158270, since it's no longer that relevant.
This patch refactors extract_subvector to lower to extract_subreg directly, and
to shortcut whenever the index is 0 when extracting a scalable vector. This
doesn't change any of the existing behaviour, but makes an upcoming patch that
extends the scalable path slightly easier to read.
…bvector

@@ -152,6 +152,11 @@ class RISCVSubtarget : public RISCVGenSubtargetInfo {
    unsigned VLen = getMaxRVVVectorSizeInBits();
    return VLen == 0 ? 65536 : VLen;
  }
  std::optional<unsigned> getRealKnownVLen() const {
Collaborator:
I'd suggest: getExactVLen().

  // which register of a LMUL group contains the specific subvector as we only
  // know the minimum register size. Therefore we must slide the vector group
  // down the full amount.
  if (SubVecVT.isFixedLengthVector() && !KnownVLen) {
Collaborator:
This isn't fully general.

Consider: 8 x i64 on V (an LMUL4 type), extract the second <2 x i64> quarter. We can still know this lies in the first LMUL2 sub-register. This doesn't allow us to slide less, but it does allow us to reduce the LMUL for the slide.
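
To make this concrete, a small sketch of the observation (hypothetical helper, not LLVM's API): even with only a minimum VLEN known, the elements being slid over are guaranteed to fit in the smallest register group whose minimum size covers them, so the slide can run at that smaller LMUL.

// Pick the smallest LMUL whose guaranteed size (LMUL * MinVLen bits) covers
// the first (Idx + NumSubElts) elements; the slide amount itself is unchanged.
unsigned smallestLMULForSlide(unsigned MinVLen, unsigned EltSizeInBits,
                              unsigned Idx, unsigned NumSubElts) {
  unsigned BitsNeeded = (Idx + NumSubElts) * EltSizeInBits;
  unsigned LMUL = 1;
  while (LMUL * MinVLen < BitsNeeded)
    LMUL *= 2;
  return LMUL;
}

// For the example above: MinVLen=128, i64 elements, extracting <2 x i64> at
// index 2 needs the first 4 elements = 256 bits, which fit in LMUL=2, so the
// slide can run at m2 instead of m4.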

  unsigned EffectiveIdx = OrigIdx;
  unsigned Vscale = *KnownVLen / RISCV::RVVBitsPerBlock;
  if (SubVecVT.isFixedLengthVector()) {
    assert(KnownVLen);
Collaborator:
This assert isn't needed. KnownVLen was already accessed on line 8670.

lukel97 (Author):
IIUC, * on a null optional is just UB; you need to use value() to have an exception thrown. This part is a bit hairy though: I'm avoiding value() so I don't have to define Vscale twice. Any ideas on a better way to structure this?

Collaborator:
I don't have any specific ideas, but I definitely wouldn't rely on UB. A library might put an assert in * rather than throwing an exception.
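
For reference, a minimal sketch of the distinction being discussed (the helper and constant names are stand-ins, not the code in this patch):

#include <cassert>
#include <optional>

constexpr unsigned RVVBitsPerBlock = 64; // stand-in for RISCV::RVVBitsPerBlock

// operator* on an empty std::optional is undefined behaviour, whereas value()
// throws std::bad_optional_access. One way to keep a single definition of
// Vscale without relying on either is a small helper that asserts before
// dereferencing and is only called on the path where the exact VLEN is known.
unsigned getVscale(std::optional<unsigned> KnownVLen) {
  assert(KnownVLen && "only reachable when the exact VLEN is known");
  return *KnownVLen / RVVBitsPerBlock;
}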

lukel97 added a commit to lukel97/llvm-project that referenced this pull request Sep 11, 2023
As noted in
llvm#65392 (comment), when
lowering an extract of a fixed length vector from another vector, we don't need
to perform the vslidedown on the full vector type. Instead we can extract the
smallest subregister that contains the subvector to be extracted and perform
the vslidedown with a smaller LMUL. E.g., with +Zvl128b:

v2i64 = extract_subvector nxv4i64, 2

is currently lowered as

vsetivli zero, 2, e64, m4, ta, ma
vslidedown.vi v8, v8, 2

This patch shrinks the vslidedown to LMUL=2:

vsetivli zero, 2, e64, m2, ta, ma
vslidedown.vi v8, v8, 2

Because we know that there's at least 128*2=256 bits in v8 at LMUL=2, and we
only need the first 256 bits to extract a v2i64 at index 2.

lowerEXTRACT_VECTOR_ELT already has this logic, so this extracts it out and
reuses it.

I've split this out into a separate PR rather than include it in llvm#65392, with
the hope that we'll be able to generalize it later.
lukel97 added a commit that referenced this pull request Sep 11, 2023
ZijunZhaoCCK pushed a commit to ZijunZhaoCCK/llvm-project that referenced this pull request Sep 19, 2023
…#65598)

lukel97 marked this pull request as draft October 4, 2023 20:29
lukel97 (Author) commented Oct 4, 2023

Marking this as a draft as it may need to be reworked after #65598, #66087, and related PRs to reduce LMUL across vslidedowns and vslideups have landed.

lukel97 closed this Nov 29, 2023
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Jan 30, 2024
…T_SUBVECTOR

This is a revival of llvm#65392. When we lower an extract_subvector, we first extract the subregister that the subvector is contained in and then do a vslidedown with LMUL=1. We can currently only do this for scalable vectors, though, because the index is scaled by vscale and thus we know what subregister the subvector lies in.

For fixed-length vectors, the index isn't scaled by vscale, so the subvector could lie in any arbitrary subregister, and we have to do a vslidedown with the full LMUL.

The exception to this is when we know the exact VLEN, in which case we can still work out the exact subregister and do the LMUL=1 vslidedown on it.

This patch handles this case by scaling the index by 1/vscale before computing the subregister, and by extending the LMUL=1 path to handle fixed-length vectors.
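
As a rough illustration of the index scaling described above (hypothetical names; the actual patch reuses the existing subregister decomposition for the scalable part):

// With the exact VLEN known, vscale = VLEN / 64 is a compile-time constant,
// so a fixed-length element index can be split into a "scalable" index for
// the subregister computation plus a small remainder for the LMUL=1 slide.
struct ScaledIndex {
  unsigned ScalableIdx; // index in scalable-element units
  unsigned RemIdx;      // leftover elements, folded into the LMUL=1 slide
};

ScaledIndex scaleIndexByInvVscale(unsigned OrigIdx, unsigned Vscale) {
  return {OrigIdx / Vscale, OrigIdx % Vscale};
}

// Example with VLEN=256 (vscale=4): extracting v2i64 at index 2 from a v8i64
// gives ScalableIdx=0 and RemIdx=2, so the subvector lives in the first
// register of the group and a single LMUL=1 vslidedown by 2 finishes the job.
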
lukel97 added a commit that referenced this pull request Feb 13, 2024
…T_SUBVECTOR (#79949)
