[AArch64][SVE] Error in masked_gather_v2f16 when enabling SVE for 128-bit target #56412

@MattPD

I'm encountering an ICE (a segmentation fault during AArch64 instruction selection) after removing the NEON preference in useSVEForFixedLengthVectors in order to enable SVE code generation for a 128-bit vector register size.

Context:
There's currently a >= 256 vector size restriction in useSVEForFixedLengthVectors (in "llvm/lib/Target/AArch64/AArch64Subtarget.h").

bool useSVEForFixedLengthVectors() const {
  // Prefer NEON unless larger SVE registers are available.
  return hasSVE() && getMinSVEVectorSizeInBits() >= 256;
}

As an experiment, I relaxed it in two different ways, changing the return statement either to return hasSVE() && getMinSVEVectorSizeInBits() >= 128; or to return hasSVE();. Both changes had the same effect.
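To make the three variants concrete, here is a minimal sketch that models the predicate outside of LLVM. MockSubtarget and its fields are hypothetical stand-ins, not the real AArch64Subtarget; only the three return expressions are taken from the text above.

```cpp
// Hypothetical stand-in for AArch64Subtarget (assumption: only the two
// queries used by the predicate are modelled, and their values are set by hand).
struct MockSubtarget {
  bool HasSVE;
  unsigned MinSVEBits;

  bool hasSVE() const { return HasSVE; }
  unsigned getMinSVEVectorSizeInBits() const { return MinSVEBits; }

  // Original: prefer NEON unless larger SVE registers are available.
  bool useSVEOriginal() const {
    return hasSVE() && getMinSVEVectorSizeInBits() >= 256;
  }

  // Experiment 1: lower the threshold to 128 bits.
  bool useSVEAtLeast128() const {
    return hasSVE() && getMinSVEVectorSizeInBits() >= 128;
  }

  // Experiment 2: drop the size check entirely.
  bool useSVEAlways() const { return hasSVE(); }
};
```

In this model, a subtarget with SVE and a 128-bit minimum makes the original predicate false and both relaxed predicates true, which matches the observation that the two experiments behave alike for the 128-bit configuration.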

I then ran the AArch64 LIT tests with the modified compiler and hit an ICE in the following test:
https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/AArch64/sve-fixed-length-masked-gather.ll

Compilation using either clang or llc together with the assumed 128-bit SVE register size is sufficient to trigger the ICE:

clang -msve-vector-bits=128 -mcpu=neoverse-n2 sve-fixed-length-masked-gather.ll
llc -aarch64-sve-vector-bits-min=128 -mtriple=arm64-unknown-unknown -mcpu=neoverse-n2 sve-fixed-length-masked-gather.ll

This function alone is sufficient to trigger the ICE:

target triple = "aarch64-unknown-linux-gnu"

define void @masked_gather_v2f16(<2 x half>* %a, <2 x half*>* %b) vscale_range(2,0) #0 {
  %cval = load <2 x half>, <2 x half>* %a
  %ptrs = load <2 x half*>, <2 x half*>* %b
  %mask = fcmp oeq <2 x half> %cval, zeroinitializer
  %vals = call <2 x half> @llvm.masked.gather.v2f16(<2 x half*> %ptrs, i32 8, <2 x i1> %mask, <2 x half> undef)
  store <2 x half> %vals, <2 x half>* %a
  ret void
}

declare <2 x half> @llvm.masked.gather.v2f16(<2 x half*>, i32, <2 x i1>, <2 x half>)

attributes #0 = { "target-features"="+sve" }

Removing the vscale_range(2,0) attribute or changing it to vscale_range(1,0) has no impact (i.e., the ICE still occurs).
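For reference, the relationship between vscale_range and the minimum SVE register width can be sketched as below. The helper name is hypothetical. The sketch assumes only that SVE registers come in 128-bit granules and that the first vscale_range argument is the minimum vscale, with a maximum of 0 meaning unbounded.

```cpp
// SVE register width grows in 128-bit granules; vscale counts the granules.
constexpr unsigned SVEGranuleBits = 128;

// Hypothetical helper: minimum register width implied by vscale_range(Min, ...).
constexpr unsigned minSVEBitsForVScale(unsigned MinVScale) {
  return MinVScale * SVEGranuleBits;
}

// vscale_range(2,0) promises at least 256-bit registers,
// vscale_range(1,0) at least 128-bit registers.
static_assert(minSVEBitsForVScale(2) == 256, "vscale >= 2 means >= 256 bits");
static_assert(minSVEBitsForVScale(1) == 128, "vscale >= 1 means >= 128 bits");
```

So the attribute on the test function describes a 256-bit minimum, while the -msve-vector-bits=128 invocation shows up as -mvscale-min=1 -mvscale-max=1 in the cc1 command line of the stack dump.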

Here's the output from the ICE in question (note the cyclic pattern of function calls in SelectionDAG):

$ clang -msve-vector-bits=128 -mcpu=neoverse-n2 sve-fixed-length-masked-gather.ll
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /llvm_build/bin/clang-15 -cc1 -triple aarch64-unknown-linux-gnu -emit-obj -mrelax-all --mrelax-relocations -disable-free -clear-ast-before-backend -main-file-name sve-fixed-length-masked-gather.masked_gather_v2f16.only.ll -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=non-leaf -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu neoverse-n2 -target-feature +v8.5a -target-feature +crc -target-feature +lse -target-feature +rdm -target-feature +crypto -target-feature +dotprod -target-feature +fp-armv8 -target-feature +neon -target-feature +fullfp16 -target-feature +ras -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -target-feature +rcpc -target-feature +mte -target-feature +ssbs -target-feature +sb -target-feature +bf16 -target-feature +i8mm -target-feature +fp16fml -target-feature +sm4 -target-feature +sha3 -target-feature +sha2 -target-feature +aes -target-abi aapcs -mvscale-max=1 -mvscale-min=1 -fallow-half-arguments-and-returns -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -fcoverage-compilation-dir=/llvm_src -resource-dir /llvm_build/lib/clang/15.0.0 -fdebug-compilation-dir=/llvm_src -ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fcolor-diagnostics -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/sve-fixed-length-masked-gather-a81266.o -x ir sve-fixed-length-masked-gather.masked_gather_v2f16.only.ll
1.      Code generation
2.      Running pass 'Function Pass Manager' on module 'sve-fixed-length-masked-gather.masked_gather_v2f16.only.ll'.
3.      Running pass 'AArch64 Instruction Selection' on function '@masked_gather_v2f16'
  #0 0x0000ffff784b83bc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /llvm_src/llvm-project/llvm/lib/Support/Unix/Signals.inc:573:3
  #1 0x0000ffff784b69f8 llvm::sys::RunSignalHandlers() /llvm_src/llvm-project/llvm/lib/Support/Signals.cpp:104:18
  #2 0x0000ffff784b6ba4 SignalHandler(int) /llvm_src/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
  #3 0x0000ffff7d9216c0 (linux-vdso.so.1+0x6c0)
  #4 0x0000ffff7792dd60 llvm::DAGTypeLegalizer::AnalyzeNewNode(llvm::SDNode*) (.part.0) /llvm_src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:501:9
  #5 0x0000ffff7792dcf8 llvm::SDNode::getNodeId() const /llvm_src/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:713:34
  #6 0x0000ffff7792dcf8 llvm::DAGTypeLegalizer::AnalyzeNewValue(llvm::SDValue&) /llvm_src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:572:31
  #7 0x0000ffff7792de08 llvm::SDValue::getNode() const /llvm_src/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:151:36
  #8 0x0000ffff7792de08 llvm::DAGTypeLegalizer::AnalyzeNewNode(llvm::SDNode*) (.part.0) /llvm_src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:525:32
  #9 0x0000ffff7792dcf8 llvm::SDNode::getNodeId() const /llvm_src/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:713:34
 #10 0x0000ffff7792dcf8 llvm::DAGTypeLegalizer::AnalyzeNewValue(llvm::SDValue&) /llvm_src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:572:31
 #11 0x0000ffff7792de08 llvm::SDValue::getNode() const /llvm_src/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:151:36
 #12 0x0000ffff7792de08 llvm::DAGTypeLegalizer::AnalyzeNewNode(llvm::SDNode*) (.part.0) /llvm_src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:525:32
 #13 0x0000ffff7792dcf8 llvm::SDNode::getNodeId() const /llvm_src/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:713:34
 #14 0x0000ffff7792dcf8 llvm::DAGTypeLegalizer::AnalyzeNewValue(llvm::SDValue&) /llvm_src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:572:31
 #15 0x0000ffff7792de08 llvm::SDValue::getNode() const /llvm_src/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:151:36
. . .
#500 0x0000ffff7792de08 llvm::DAGTypeLegalizer::AnalyzeNewNode(llvm::SDNode*) (.part.0) /llvm_src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:525:32
#501 0x0000ffff7792dcf8 llvm::SDNode::getNodeId() const /llvm_src/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:713:34
#502 0x0000ffff7792dcf8 llvm::DAGTypeLegalizer::AnalyzeNewValue(llvm::SDValue&) /llvm_src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:572:31
#503 0x0000ffff7792de08 llvm::SDValue::getNode() const /llvm_src/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:151:36
#504 0x0000ffff7792de08 llvm::DAGTypeLegalizer::AnalyzeNewNode(llvm::SDNode*) (.part.0) /llvm_src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:525:32
#505 0x0000ffff7792dcf8 llvm::SDNode::getNodeId() const /llvm_src/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:713:34
#506 0x0000ffff7792dcf8 llvm::DAGTypeLegalizer::AnalyzeNewValue(llvm::SDValue&) /llvm_src/llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp:572:31
clang-15: error: unable to execute command: Segmentation fault (core dumped)
clang-15: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 15.0.0 (https://github.com/llvm/llvm-project.git ac3e26bcffa29d3519f87be678ad09431a6bf6f2)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /llvm_build/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.
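The repetition in the backtrace can be checked mechanically. The following is a small sketch, not part of LLVM, that finds the shortest repeating period in a list of frame names. The function name and the idea of feeding it symbol names copied from the trace are assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the smallest P > 0 such that Frames[I] == Frames[I + P] for every
// valid I, or 0 if no such P smaller than the sequence length exists.
std::size_t smallestPeriod(const std::vector<std::string> &Frames) {
  for (std::size_t P = 1; P < Frames.size(); ++P) {
    bool Repeats = true;
    for (std::size_t I = 0; I + P < Frames.size(); ++I) {
      if (Frames[I] != Frames[I + P]) {
        Repeats = false;
        break;
      }
    }
    if (Repeats)
      return P;
  }
  return 0;
}
```

Applied to the symbol names of frames #4 through #15 above (AnalyzeNewNode, getNodeId, AnalyzeNewValue, getNode, and so on), the period is 4 frames, i.e. AnalyzeNewNode and AnalyzeNewValue calling each other with one inlined accessor frame after each.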

Is SVE code generation for targets with 128-bit SVE registers meant to be supported? If so, is modifying useSVEForFixedLengthVectors as above the proper way to go about it, or are there remaining checks that would also need to change, e.g., this one in AArch64TargetLowering::useSVEForFixedLengthVectorVT?

// Ensure NEON MVTs only belong to a single register class.
if (VT.getFixedSizeInBits() <= 128)
  return false;
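To illustrate how that size check relates to the reproducer, here is a sketch that models the quoted fragment only. Both helper names are hypothetical, and the real useSVEForFixedLengthVectorVT contains further conditions that are not reproduced here.

```cpp
// Hypothetical model: total width of a fixed-length vector type.
constexpr unsigned fixedSizeInBits(unsigned NumElts, unsigned EltBits) {
  return NumElts * EltBits;
}

// True when the quoted check would fire (the real function would return
// false), i.e. when the vector is no wider than a 128-bit NEON register.
constexpr bool rejectedByNEONSizeCheck(unsigned NumElts, unsigned EltBits) {
  return fixedSizeInBits(NumElts, EltBits) <= 128;
}

static_assert(rejectedByNEONSizeCheck(2, 16), "<2 x half> is 32 bits");
static_assert(rejectedByNEONSizeCheck(8, 16), "<8 x half> is 128 bits");
static_assert(!rejectedByNEONSizeCheck(16, 16), "<16 x half> is 256 bits");
```

The <2 x half> type from masked_gather_v2f16 is 32 bits wide, so taken in isolation this fragment would reject it. Whether this path is actually reached in the crashing case is not established here.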

cc @paulwalker-arm @sdesmalen-arm @stevesuzuki-arm (in case it's relevant to halide/Halide#6781)
