[Clang][AArch64] svldr_vnum/svstr_vnum should use cntsb iso vscale for the offset

The specification for LDR/STR says that:

  The ZA array vector is selected by the sum of the vector select register
  and immediate offset, modulo the number of bytes in a Streaming SVE
  vector. [..] This instruction does not require the PE to be in Streaming
  SVE mode

When the instruction is used outside of streaming mode, 'vscale' will result
in the wrong value being used for the offset because LLVM's code-generator
will emit the non-streaming 'RDVL/ADDVL' instead of the 'RDSVL/ADDSVL'
instructions which are used to get the Streaming-SVE vector length.
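
To make the mismatch concrete, here is a minimal sketch of the offset arithmetic before and after this change. The helper names and the example vector lengths (a 128-bit non-streaming VL and a 512-bit streaming SVL) are assumptions chosen for illustration; this models the arithmetic only and is not the Clang implementation. It assumes 'vscale' counts 128-bit granules of the non-streaming vector length when evaluated outside streaming mode, as described above.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; they model the byte offset
// added to the pointer, not the actual code generation in CGBuiltin.cpp.

// Old lowering: offset = vscale * (16 * vnum).
// Outside streaming mode, vscale reflects the NON-streaming vector length,
// so 16 * vscale is the non-streaming vector length in bytes.
constexpr std::uint64_t offsetFromVscale(std::uint64_t vlBits,
                                         std::uint64_t vnum) {
  const std::uint64_t vscale = vlBits / 128;
  return vscale * (16 * vnum);
}

// New lowering: offset = cntsb * vnum.
// cntsb is the STREAMING vector length in bytes, regardless of mode.
constexpr std::uint64_t offsetFromCntsb(std::uint64_t svlBits,
                                        std::uint64_t vnum) {
  const std::uint64_t svlBytes = svlBits / 8;
  return svlBytes * vnum;
}

// Assumed configuration: VL = 128 bits, SVL = 512 bits, vnum = 15.
static_assert(offsetFromVscale(128, 15) == 240,
              "old offset follows the non-streaming vector length");
static_assert(offsetFromCntsb(512, 15) == 960,
              "new offset follows the streaming vector length");

// When both vector lengths are equal, the two formulas agree.
static_assert(offsetFromVscale(512, 15) == offsetFromCntsb(512, 15),
              "no difference when VL == SVL");
```

With a vnum of 15, as used in the updated tests, the old constant multiplier was 16 * 15 = 240 applied to vscale, while the new multiplier is 15 applied to the streaming vector length in bytes.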

Reviewed By: bryanpkc

Differential Revision: https://reviews.llvm.org/D156121
sdesmalen-arm committed Jul 24, 2023
1 parent 0736200 commit a8cbd27
Showing 3 changed files with 8 additions and 8 deletions.
8 changes: 4 additions & 4 deletions clang/lib/CodeGen/CGBuiltin.cpp
@@ -9508,11 +9508,11 @@ Value *CodeGenFunction::EmitSMEZero(SVETypeFlags TypeFlags,
 Value *CodeGenFunction::EmitSMELdrStr(SVETypeFlags TypeFlags,
                                       SmallVectorImpl<Value *> &Ops,
                                       unsigned IntID) {
-  Function *Vscale = CGM.getIntrinsic(Intrinsic::vscale, Int64Ty);
-  llvm::Value *VscaleCall = Builder.CreateCall(Vscale, {}, "vscale");
+  Function *Cntsb = CGM.getIntrinsic(Intrinsic::aarch64_sme_cntsb);
+  llvm::Value *CntsbCall = Builder.CreateCall(Cntsb, {}, "svlb");
   llvm::Value *MulVL = Builder.CreateMul(
-      VscaleCall,
-      Builder.getInt64(16 * cast<llvm::ConstantInt>(Ops[1])->getZExtValue()),
+      CntsbCall,
+      Builder.getInt64(cast<llvm::ConstantInt>(Ops[1])->getZExtValue()),
       "mulvl");
   Ops[2] = Builder.CreateGEP(Int8Ty, Ops[2], MulVL);
   Ops[0] = EmitTileslice(Ops[1], Ops[0]);
4 changes: 2 additions & 2 deletions clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_ldr.c
@@ -18,8 +18,8 @@ void test_svldr_vnum_za(uint32_t slice_base, const void *ptr) {
 // CHECK-C-LABEL: @test_svldr_vnum_za_1(
 // CHECK-CXX-LABEL: @_Z20test_svldr_vnum_za_1jPKv(
 // CHECK-NEXT: entry:
-// CHECK-NEXT: [[VSCALE:%.*]] = tail call i64 @llvm.vscale.i64()
-// CHECK-NEXT: [[MULVL:%.*]] = mul nuw nsw i64 [[VSCALE]], 240
+// CHECK-NEXT: [[SVLB:%.*]] = tail call i64 @llvm.aarch64.sme.cntsb()
+// CHECK-NEXT: [[MULVL:%.*]] = mul i64 [[SVLB]], 15
 // CHECK-NEXT: [[TMP0:%.*]] = getelementptr i8, ptr [[PTR:%.*]], i64 [[MULVL]]
 // CHECK-NEXT: [[TILESLICE:%.*]] = add i32 [[SLICE_BASE:%.*]], 15
 // CHECK-NEXT: tail call void @llvm.aarch64.sme.ldr(i32 [[TILESLICE]], ptr [[TMP0]])
4 changes: 2 additions & 2 deletions clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_str.c
@@ -18,8 +18,8 @@ void test_svstr_vnum_za(uint32_t slice_base, void *ptr) {
 // CHECK-C-LABEL: @test_svstr_vnum_za_1(
 // CHECK-CXX-LABEL: @_Z20test_svstr_vnum_za_1jPv(
 // CHECK-NEXT: entry:
-// CHECK-NEXT: [[VSCALE:%.*]] = tail call i64 @llvm.vscale.i64()
-// CHECK-NEXT: [[MULVL:%.*]] = mul nuw nsw i64 [[VSCALE]], 240
+// CHECK-NEXT: [[SVLB:%.*]] = tail call i64 @llvm.aarch64.sme.cntsb()
+// CHECK-NEXT: [[MULVL:%.*]] = mul i64 [[SVLB]], 15
 // CHECK-NEXT: [[TMP0:%.*]] = getelementptr i8, ptr [[PTR:%.*]], i64 [[MULVL]]
 // CHECK-NEXT: [[TILESLICE:%.*]] = add i32 [[SLICE_BASE:%.*]], 15
 // CHECK-NEXT: tail call void @llvm.aarch64.sme.str(i32 [[TILESLICE]], ptr [[TMP0]])