Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] RISCV vector calling convention (2/2) #79096

Merged
merged 1 commit into from
Mar 30, 2024

Conversation

4vtomat
Copy link
Member

@4vtomat 4vtomat commented Jan 23, 2024

This commit handles vector arguments/return for function definition/call,
the new class RVVArgDispatcher is added for doing all vector register
assignment including mask types, data types as well as tuple types.
It precomputes the register number for each argument as per
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#standard-vector-calling-convention-variant
and it's passed to calling convention function to handle all vector arguments.

This patch contains 2 commits, the first one is the same as #78550 and would be removed later.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:RISC-V clang:codegen llvm:globalisel labels Jan 23, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 23, 2024

@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-llvm-globalisel

Author: Brandon Wu (4vtomat)

Changes
  • [clang][RISCV] Enable struct of homogeneous scalable vector as function argument
  • [RISCV] RISCV vector calling convention (2/2)

Patch is 135.00 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79096.diff

519 Files Affected:

  • (modified) clang/lib/CodeGen/CGCall.cpp (+75-86)
  • (modified) clang/lib/CodeGen/Targets/RISCV.cpp (+7-1)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vget.c (+730-1761)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vset.c (+730-1761)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg2ei16.c (+384-576)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg2ei32.c (+368-552)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg2ei64.c (+328-492)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg2ei8.c (+384-576)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg3ei16.c (+371-593)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg3ei32.c (+371-593)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg3ei64.c (+351-561)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg3ei8.c (+371-593)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg4ei16.c (+445-741)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg4ei32.c (+445-741)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg4ei64.c (+421-701)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg4ei8.c (+445-741)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg5ei16.c (+365-625)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg5ei32.c (+365-625)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg5ei64.c (+365-625)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg5ei8.c (+365-625)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg6ei16.c (+417-729)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg6ei32.c (+417-729)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg6ei64.c (+417-729)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg6ei8.c (+417-729)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg7ei16.c (+469-833)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg7ei32.c (+469-833)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg7ei64.c (+469-833)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg7ei8.c (+469-833)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg8ei16.c (+521-937)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg8ei32.c (+521-937)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg8ei64.c (+521-937)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsoxseg8ei8.c (+521-937)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg2e16.c (+120-180)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg2e32.c (+96-144)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg2e64.c (+72-108)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg2e8.c (+96-144)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg3e16.c (+120-192)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg3e32.c (+90-144)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg3e64.c (+60-96)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg3e8.c (+100-160)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg4e16.c (+144-240)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg4e32.c (+108-180)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg4e64.c (+72-120)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg4e8.c (+120-200)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg5e16.c (+126-216)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg5e32.c (+84-144)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg5e64.c (+42-72)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg5e8.c (+113-193)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg6e16.c (+144-252)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg6e32.c (+96-168)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg6e64.c (+48-84)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg6e8.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg7e16.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg7e32.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg7e64.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg7e8.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg8e16.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg8e32.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg8e64.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsseg8e8.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg2e16.c (+120-180)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg2e32.c (+96-144)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg2e64.c (+72-108)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg2e8.c (+96-144)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg3e16.c (+120-192)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg3e32.c (+90-144)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg3e64.c (+60-96)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg3e8.c (+100-160)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg4e16.c (+144-240)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg4e32.c (+108-180)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg4e64.c (+72-120)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg4e8.c (+120-200)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg5e16.c (+126-216)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg5e32.c (+84-144)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg5e64.c (+42-72)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg5e8.c (+113-193)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg6e16.c (+144-252)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg6e32.c (+96-168)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg6e64.c (+48-84)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg6e8.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg7e16.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg7e32.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg7e64.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg7e8.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg8e16.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg8e32.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg8e64.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vssseg8e8.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg2ei16.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg2ei32.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg2ei64.c (+328-492)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg2ei8.c (+384-576)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg3ei16.c (+371-593)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg3ei32.c (+371-593)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg3ei64.c (+351-561)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg3ei8.c (+371-593)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg4ei16.c (+445-741)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg4ei32.c (+445-741)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg4ei64.c (+421-701)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg4ei8.c (+445-741)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg5ei16.c (+365-625)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg5ei32.c (+365-625)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg5ei64.c (+365-625)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg5ei8.c (+365-625)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg6ei16.c (+417-729)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg6ei32.c (+417-729)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg6ei64.c (+417-729)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg6ei8.c (+417-729)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg7ei16.c (+469-833)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg7ei32.c (+469-833)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg7ei64.c (+469-833)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg7ei8.c (+469-833)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg8ei16.c (+521-937)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg8ei32.c (+521-937)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg8ei64.c (+521-937)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vsuxseg8ei8.c (+521-937)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vget.c (+730-1761)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vset.c (+730-1761)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg2ei16.c (+384-576)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg2ei32.c (+368-552)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg2ei64.c (+328-492)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg2ei8.c (+384-576)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg3ei16.c (+371-593)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg3ei32.c (+371-593)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg3ei64.c (+351-561)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg3ei8.c (+371-593)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg4ei16.c (+445-741)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg4ei32.c (+445-741)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg4ei64.c (+421-701)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg4ei8.c (+445-741)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg5ei16.c (+365-625)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg5ei32.c (+365-625)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg5ei64.c (+365-625)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg5ei8.c (+365-625)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg6ei16.c (+417-729)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg6ei32.c (+417-729)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg6ei64.c (+417-729)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg6ei8.c (+417-729)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg7ei16.c (+469-833)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg7ei32.c (+469-833)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg7ei64.c (+469-833)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg7ei8.c (+469-833)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg8ei16.c (+521-937)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg8ei32.c (+521-937)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg8ei64.c (+521-937)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsoxseg8ei8.c (+521-937)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg2e16.c (+120-180)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg2e32.c (+96-144)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg2e64.c (+72-108)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg2e8.c (+96-144)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg3e16.c (+120-192)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg3e32.c (+90-144)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg3e64.c (+60-96)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg3e8.c (+100-160)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg4e16.c (+144-240)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg4e32.c (+108-180)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg4e64.c (+72-120)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg4e8.c (+120-200)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg5e16.c (+126-216)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg5e32.c (+84-144)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg5e64.c (+42-72)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg5e8.c (+113-193)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg6e16.c (+144-252)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg6e32.c (+96-168)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg6e64.c (+48-84)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg6e8.c (+128-224)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg7e16.c (+162-288)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg7e32.c (+108-192)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg7e64.c (+54-96)
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg7e8.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg8e16.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg8e32.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg8e64.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vsseg8e8.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vssseg2e16.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vssseg2e32.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vssseg2e64.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vssseg2e8.c ()
  • (modified) clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/overloaded/vssseg3e16.c ()
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index acf6cbad1c74809..fbaa2aaa2064267 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -3206,6 +3206,25 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo &FI,
         }
       }
 
+      llvm::StructType *STy =
+          dyn_cast<llvm::StructType>(ArgI.getCoerceToType());
+      llvm::TypeSize StructSize;
+      llvm::TypeSize PtrElementSize;
+      if (ArgI.isDirect() && !ArgI.getCanBeFlattened() && STy &&
+          STy->getNumElements() > 1) {
+        StructSize = CGM.getDataLayout().getTypeAllocSize(STy);
+        PtrElementSize =
+            CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(Ty));
+        if (STy->containsHomogeneousScalableVectorTypes()) {
+          assert(StructSize == PtrElementSize &&
+                 "Only allow non-fractional movement of structure with"
+                 "homogeneous scalable vector type");
+
+          ArgVals.push_back(ParamValue::forDirect(AI));
+          break;
+        }
+      }
+
       Address Alloca = CreateMemTemp(Ty, getContext().getDeclAlign(Arg),
                                      Arg->getName());
 
@@ -3214,53 +3233,29 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo &FI,
 
       // Fast-isel and the optimizer generally like scalar values better than
       // FCAs, so we flatten them if this is safe to do for this argument.
-      llvm::StructType *STy = dyn_cast<llvm::StructType>(ArgI.getCoerceToType());
       if (ArgI.isDirect() && ArgI.getCanBeFlattened() && STy &&
           STy->getNumElements() > 1) {
-        llvm::TypeSize StructSize = CGM.getDataLayout().getTypeAllocSize(STy);
-        llvm::TypeSize PtrElementSize =
-            CGM.getDataLayout().getTypeAllocSize(Ptr.getElementType());
-        if (StructSize.isScalable()) {
-          assert(STy->containsHomogeneousScalableVectorTypes() &&
-                 "ABI only supports structure with homogeneous scalable vector "
-                 "type");
-          assert(StructSize == PtrElementSize &&
-                 "Only allow non-fractional movement of structure with"
-                 "homogeneous scalable vector type");
-          assert(STy->getNumElements() == NumIRArgs);
-
-          llvm::Value *LoadedStructValue = llvm::PoisonValue::get(STy);
-          for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {
-            auto *AI = Fn->getArg(FirstIRArg + i);
-            AI->setName(Arg->getName() + ".coerce" + Twine(i));
-            LoadedStructValue =
-                Builder.CreateInsertValue(LoadedStructValue, AI, i);
-          }
+        uint64_t SrcSize = StructSize.getFixedValue();
+        uint64_t DstSize = PtrElementSize.getFixedValue();
 
-          Builder.CreateStore(LoadedStructValue, Ptr);
+        Address AddrToStoreInto = Address::invalid();
+        if (SrcSize <= DstSize) {
+          AddrToStoreInto = Ptr.withElementType(STy);
         } else {
-          uint64_t SrcSize = StructSize.getFixedValue();
-          uint64_t DstSize = PtrElementSize.getFixedValue();
-
-          Address AddrToStoreInto = Address::invalid();
-          if (SrcSize <= DstSize) {
-            AddrToStoreInto = Ptr.withElementType(STy);
-          } else {
-            AddrToStoreInto =
-                CreateTempAlloca(STy, Alloca.getAlignment(), "coerce");
-          }
+          AddrToStoreInto =
+              CreateTempAlloca(STy, Alloca.getAlignment(), "coerce");
+        }
 
-          assert(STy->getNumElements() == NumIRArgs);
-          for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {
-            auto AI = Fn->getArg(FirstIRArg + i);
-            AI->setName(Arg->getName() + ".coerce" + Twine(i));
-            Address EltPtr = Builder.CreateStructGEP(AddrToStoreInto, i);
-            Builder.CreateStore(AI, EltPtr);
-          }
+        assert(STy->getNumElements() == NumIRArgs);
+        for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {
+          auto AI = Fn->getArg(FirstIRArg + i);
+          AI->setName(Arg->getName() + ".coerce" + Twine(i));
+          Address EltPtr = Builder.CreateStructGEP(AddrToStoreInto, i);
+          Builder.CreateStore(AI, EltPtr);
+        }
 
-          if (SrcSize > DstSize) {
-            Builder.CreateMemCpy(Ptr, AddrToStoreInto, DstSize);
-          }
+        if (SrcSize > DstSize) {
+          Builder.CreateMemCpy(Ptr, AddrToStoreInto, DstSize);
         }
       } else {
         // Simple case, just do a coerced store of the argument into the alloca.
@@ -5277,6 +5272,24 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
         break;
       }
 
+      llvm::StructType *STy =
+          dyn_cast<llvm::StructType>(ArgInfo.getCoerceToType());
+      llvm::Type *SrcTy = ConvertTypeForMem(I->Ty);
+      llvm::TypeSize SrcTypeSize;
+      llvm::TypeSize DstTypeSize;
+      if (STy && ArgInfo.isDirect() && !ArgInfo.getCanBeFlattened()) {
+        SrcTypeSize = CGM.getDataLayout().getTypeAllocSize(SrcTy);
+        DstTypeSize = CGM.getDataLayout().getTypeAllocSize(STy);
+        if (STy->containsHomogeneousScalableVectorTypes()) {
+          assert(SrcTypeSize == DstTypeSize &&
+                 "Only allow non-fractional movement of structure with "
+                 "homogeneous scalable vector type");
+
+          IRCallArgs[FirstIRArg] = I->getKnownRValue().getScalarVal();
+          break;
+        }
+      }
+
       // FIXME: Avoid the conversion through memory if possible.
       Address Src = Address::invalid();
       if (!I->isAggregate()) {
@@ -5292,54 +5305,30 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
 
       // Fast-isel and the optimizer generally like scalar values better than
       // FCAs, so we flatten them if this is safe to do for this argument.
-      llvm::StructType *STy =
-            dyn_cast<llvm::StructType>(ArgInfo.getCoerceToType());
       if (STy && ArgInfo.isDirect() && ArgInfo.getCanBeFlattened()) {
-        llvm::Type *SrcTy = Src.getElementType();
-        llvm::TypeSize SrcTypeSize =
-            CGM.getDataLayout().getTypeAllocSize(SrcTy);
-        llvm::TypeSize DstTypeSize = CGM.getDataLayout().getTypeAllocSize(STy);
-        if (SrcTypeSize.isScalable()) {
-          assert(STy->containsHomogeneousScalableVectorTypes() &&
-                 "ABI only supports structure with homogeneous scalable vector "
-                 "type");
-          assert(SrcTypeSize == DstTypeSize &&
-                 "Only allow non-fractional movement of structure with "
-                 "homogeneous scalable vector type");
-          assert(NumIRArgs == STy->getNumElements());
-
-          llvm::Value *StoredStructValue =
-              Builder.CreateLoad(Src, Src.getName() + ".tuple");
-          for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {
-            llvm::Value *Extract = Builder.CreateExtractValue(
-                StoredStructValue, i, Src.getName() + ".extract" + Twine(i));
-            IRCallArgs[FirstIRArg + i] = Extract;
-          }
+        uint64_t SrcSize = SrcTypeSize.getFixedValue();
+        uint64_t DstSize = DstTypeSize.getFixedValue();
+
+        // If the source type is smaller than the destination type of the
+        // coerce-to logic, copy the source value into a temp alloca the size
+        // of the destination type to allow loading all of it. The bits past
+        // the source value are left undef.
+        if (SrcSize < DstSize) {
+          Address TempAlloca = CreateTempAlloca(STy, Src.getAlignment(),
+                                                Src.getName() + ".coerce");
+          Builder.CreateMemCpy(TempAlloca, Src, SrcSize);
+          Src = TempAlloca;
         } else {
-          uint64_t SrcSize = SrcTypeSize.getFixedValue();
-          uint64_t DstSize = DstTypeSize.getFixedValue();
-
-          // If the source type is smaller than the destination type of the
-          // coerce-to logic, copy the source value into a temp alloca the size
-          // of the destination type to allow loading all of it. The bits past
-          // the source value are left undef.
-          if (SrcSize < DstSize) {
-            Address TempAlloca = CreateTempAlloca(STy, Src.getAlignment(),
-                                                  Src.getName() + ".coerce");
-            Builder.CreateMemCpy(TempAlloca, Src, SrcSize);
-            Src = TempAlloca;
-          } else {
-            Src = Src.withElementType(STy);
-          }
+          Src = Src.withElementType(STy);
+        }
 
-          assert(NumIRArgs == STy->getNumElements());
-          for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {
-            Address EltPtr = Builder.CreateStructGEP(Src, i);
-            llvm::Value *LI = Builder.CreateLoad(EltPtr);
-            if (ArgHasMaybeUndefAttr)
-              LI = Builder.CreateFreeze(LI);
-            IRCallArgs[FirstIRArg + i] = LI;
-          }
+        assert(NumIRArgs == STy->getNumElements());
+        for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {
+          Address EltPtr = Builder.CreateStructGEP(Src, i);
+          llvm::Value *LI = Builder.CreateLoad(EltPtr);
+          if (ArgHasMaybeUndefAttr)
+            LI = Builder.CreateFreeze(LI);
+          IRCallArgs[FirstIRArg + i] = LI;
         }
       } else {
         // In the simple case, just pass the coerced loaded value.
diff --git a/clang/lib/CodeGen/Targets/RISCV.cpp b/clang/lib/CodeGen/Targets/RISCV.cpp
index 0851d1993d0c0f5..245b111cef83bec 100644
--- a/clang/lib/CodeGen/Targets/RISCV.cpp
+++ b/clang/lib/CodeGen/Targets/RISCV.cpp
@@ -433,7 +433,13 @@ ABIArgInfo RISCVABIInfo::classifyArgumentType(QualType Ty, bool IsFixed,
         return getNaturalAlignIndirect(Ty, /*ByVal=*/false);
     }
 
-    return ABIArgInfo::getDirect();
+    ABIArgInfo Info = ABIArgInfo::getDirect();
+
+    // If it is tuple type, it can't be flattened.
+    if (llvm::StructType *STy = dyn_cast<llvm::StructType>(CGT.ConvertType(Ty)))
+      Info.setCanBeFlattened(!STy->containsHomogeneousScalableVectorTypes());
+
+    return Info;
   }
 
   if (const VectorType *VT = Ty->getAs<VectorType>())
diff --git a/clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vget.c b/clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vget.c
index 1f790fe38065ae5..a324cb72b67ccf6 100644
--- a/clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vget.c
+++ b/clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/non-policy/non-overloaded/vget.c
@@ -668,3291 +668,2260 @@ vuint64m4_t test_vget_v_u64m8_u64m4(vuint64m8_t src, size_t index) {
 }
 
 // CHECK-RV64-LABEL: define dso_local <vscale x 1 x half> @test_vget_v_f16mf4x2_f16mf4
-// CHECK-RV64-SAME: (<vscale x 1 x half> [[SRC_COERCE0:%.*]], <vscale x 1 x half> [[SRC_COERCE1:%.*]], i64 noundef [[INDEX:%.*]]) #[[ATTR0]] {
+// CHECK-RV64-SAME: ({ <vscale x 1 x half>, <vscale x 1 x half> } [[SRC:%.*]], i64 noundef [[INDEX:%.*]]) #[[ATTR0]] {
 // CHECK-RV64-NEXT:  entry:
-// CHECK-RV64-NEXT:    [[TMP0:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half> } poison, <vscale x 1 x half> [[SRC_COERCE0]], 0
-// CHECK-RV64-NEXT:    [[TMP1:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half> } [[TMP0]], <vscale x 1 x half> [[SRC_COERCE1]], 1
-// CHECK-RV64-NEXT:    [[TMP2:%.*]] = extractvalue { <vscale x 1 x half>, <vscale x 1 x half> } [[TMP1]], 0
-// CHECK-RV64-NEXT:    ret <vscale x 1 x half> [[TMP2]]
+// CHECK-RV64-NEXT:    [[TMP0:%.*]] = extractvalue { <vscale x 1 x half>, <vscale x 1 x half> } [[SRC]], 0
+// CHECK-RV64-NEXT:    ret <vscale x 1 x half> [[TMP0]]
 //
 vfloat16mf4_t test_vget_v_f16mf4x2_f16mf4(vfloat16mf4x2_t src, size_t index) {
   return __riscv_vget_v_f16mf4x2_f16mf4(src, 0);
 }
 
 // CHECK-RV64-LABEL: define dso_local <vscale x 1 x half> @test_vget_v_f16mf4x3_f16mf4
-// CHECK-RV64-SAME: (<vscale x 1 x half> [[SRC_COERCE0:%.*]], <vscale x 1 x half> [[SRC_COERCE1:%.*]], <vscale x 1 x half> [[SRC_COERCE2:%.*]], i64 noundef [[INDEX:%.*]]) #[[ATTR0]] {
+// CHECK-RV64-SAME: ({ <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[SRC:%.*]], i64 noundef [[INDEX:%.*]]) #[[ATTR0]] {
 // CHECK-RV64-NEXT:  entry:
-// CHECK-RV64-NEXT:    [[TMP0:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } poison, <vscale x 1 x half> [[SRC_COERCE0]], 0
-// CHECK-RV64-NEXT:    [[TMP1:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP0]], <vscale x 1 x half> [[SRC_COERCE1]], 1
-// CHECK-RV64-NEXT:    [[TMP2:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP1]], <vscale x 1 x half> [[SRC_COERCE2]], 2
-// CHECK-RV64-NEXT:    [[TMP3:%.*]] = extractvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP2]], 0
-// CHECK-RV64-NEXT:    ret <vscale x 1 x half> [[TMP3]]
+// CHECK-RV64-NEXT:    [[TMP0:%.*]] = extractvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[SRC]], 0
+// CHECK-RV64-NEXT:    ret <vscale x 1 x half> [[TMP0]]
 //
 vfloat16mf4_t test_vget_v_f16mf4x3_f16mf4(vfloat16mf4x3_t src, size_t index) {
   return __riscv_vget_v_f16mf4x3_f16mf4(src, 0);
 }
 
 // CHECK-RV64-LABEL: define dso_local <vscale x 1 x half> @test_vget_v_f16mf4x4_f16mf4
-// CHECK-RV64-SAME: (<vscale x 1 x half> [[SRC_COERCE0:%.*]], <vscale x 1 x half> [[SRC_COERCE1:%.*]], <vscale x 1 x half> [[SRC_COERCE2:%.*]], <vscale x 1 x half> [[SRC_COERCE3:%.*]], i64 noundef [[INDEX:%.*]]) #[[ATTR0]] {
+// CHECK-RV64-SAME: ({ <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[SRC:%.*]], i64 noundef [[INDEX:%.*]]) #[[ATTR0]] {
 // CHECK-RV64-NEXT:  entry:
-// CHECK-RV64-NEXT:    [[TMP0:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } poison, <vscale x 1 x half> [[SRC_COERCE0]], 0
-// CHECK-RV64-NEXT:    [[TMP1:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP0]], <vscale x 1 x half> [[SRC_COERCE1]], 1
-// CHECK-RV64-NEXT:    [[TMP2:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP1]], <vscale x 1 x half> [[SRC_COERCE2]], 2
-// CHECK-RV64-NEXT:    [[TMP3:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP2]], <vscale x 1 x half> [[SRC_COERCE3]], 3
-// CHECK-RV64-NEXT:    [[TMP4:%.*]] = extractvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP3]], 0
-// CHECK-RV64-NEXT:    ret <vscale x 1 x half> [[TMP4]]
+// CHECK-RV64-NEXT:    [[TMP0:%.*]] = extractvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[SRC]], 0
+// CHECK-RV64-NEXT:    ret <vscale x 1 x half> [[TMP0]]
 //
 vfloat16mf4_t test_vget_v_f16mf4x4_f16mf4(vfloat16mf4x4_t src, size_t index) {
   return __riscv_vget_v_f16mf4x4_f16mf4(src, 0);
 }
 
 // CHECK-RV64-LABEL: define dso_local <vscale x 1 x half> @test_vget_v_f16mf4x5_f16mf4
-// CHECK-RV64-SAME: (<vscale x 1 x half> [[SRC_COERCE0:%.*]], <vscale x 1 x half> [[SRC_COERCE1:%.*]], <vscale x 1 x half> [[SRC_COERCE2:%.*]], <vscale x 1 x half> [[SRC_COERCE3:%.*]], <vscale x 1 x half> [[SRC_COERCE4:%.*]], i64 noundef [[INDEX:%.*]]) #[[ATTR0]] {
+// CHECK-RV64-SAME: ({ <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[SRC:%.*]], i64 noundef [[INDEX:%.*]]) #[[ATTR0]] {
 // CHECK-RV64-NEXT:  entry:
-// CHECK-RV64-NEXT:    [[TMP0:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } poison, <vscale x 1 x half> [[SRC_COERCE0]], 0
-// CHECK-RV64-NEXT:    [[TMP1:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP0]], <vscale x 1 x half> [[SRC_COERCE1]], 1
-// CHECK-RV64-NEXT:    [[TMP2:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP1]], <vscale x 1 x half> [[SRC_COERCE2]], 2
-// CHECK-RV64-NEXT:    [[TMP3:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP2]], <vscale x 1 x half> [[SRC_COERCE3]], 3
-// CHECK-RV64-NEXT:    [[TMP4:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP3]], <vscale x 1 x half> [[SRC_COERCE4]], 4
-// CHECK-RV64-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP4]], 0
-// CHECK-RV64-NEXT:    ret <vscale x 1 x half> [[TMP5]]
+// CHECK-RV64-NEXT:    [[TMP0:%.*]] = extractvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[SRC]], 0
+// CHECK-RV64-NEXT:    ret <vscale x 1 x half> [[TMP0]]
 //
 vfloat16mf4_t test_vget_v_f16mf4x5_f16mf4(vfloat16mf4x5_t src, size_t index) {
   return __riscv_vget_v_f16mf4x5_f16mf4(src, 0);
 }
 
 // CHECK-RV64-LABEL: define dso_local <vscale x 1 x half> @test_vget_v_f16mf4x6_f16mf4
-// CHECK-RV64-SAME: (<vscale x 1 x half> [[SRC_COERCE0:%.*]], <vscale x 1 x half> [[SRC_COERCE1:%.*]], <vscale x 1 x half> [[SRC_COERCE2:%.*]], <vscale x 1 x half> [[SRC_COERCE3:%.*]], <vscale x 1 x half> [[SRC_COERCE4:%.*]], <vscale x 1 x half> [[SRC_COERCE5:%.*]], i64 noundef [[INDEX:%.*]]) #[[ATTR0]] {
+// CHECK-RV64-SAME: ({ <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[SRC:%.*]], i64 noundef [[INDEX:%.*]]) #[[ATTR0]] {
 // CHECK-RV64-NEXT:  entry:
-// CHECK-RV64-NEXT:    [[TMP0:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } poison, <vscale x 1 x half> [[SRC_COERCE0]], 0
-// CHECK-RV64-NEXT:    [[TMP1:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP0]], <vscale x 1 x half> [[SRC_COERCE1]], 1
-// CHECK-RV64-NEXT:    [[TMP2:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP1]], <vscale x 1 x half> [[SRC_COERCE2]], 2
-// CHECK-RV64-NEXT:    [[TMP3:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP2]], <vscale x 1 x half> [[SRC_COERCE3]], 3
-// CHECK-RV64-NEXT:    [[TMP4:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP3]], <vscale x 1 x half> [[SRC_COERCE4]], 4
-// CHECK-RV64-NEXT:    [[TMP5:%.*]] = insertvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP4]], <vscale x 1 x half> [[SRC_COERCE5]], 5
-// CHECK-RV64-NEXT:    [[TMP6:%.*]] = extractvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[TMP5]], 0
-// CHECK-RV64-NEXT:    ret <vscale x 1 x half> [[TMP6]]
+// CHECK-RV64-NEXT:    [[TMP0:%.*]] = extractvalue { <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half>, <vscale x 1 x half> } [[SRC]], 0
+// CHECK-RV64-NEXT:    ret <vscale x 1 x half> [[TMP0]]
 //
 vfloat16mf4_t test_vget_v_f16mf4x6_f16mf4(vfloat16mf4x6_t src, size_t index) {
   return __riscv_vget_v_f16mf4x6_f16mf4(src, 0);
 }
 
 // CHECK-RV64-LABEL: define dso_local <vscale x 1 x half> @test_vget_v_f16mf4x7_f16mf4
-// CHECK-RV64-SAME: (<vscale x 1 x half> [[SRC_COERCE0:%.*]], <vscale x 1 x half> [[SRC_COERCE1:%.*]], <vscale x 1 x half> [[SRC_COERCE2:%.*]], <vscale x 1 x half> [[SRC_COERCE3:%.*]], <vscale x 1 x half> [[SRC_COERCE4:%.*]], <vscale x 1 x half> [[SRC_COERCE5:%.*]], <vscale x 1 x half> [[SRC_COERCE6:%.*]], i64 noundef [[INDEX:%.*]]) #[[ATTR0]] {
-// CHECK-RV64-NEXT:  entry:
-// CHECK-RV64...
[truncated]

@4vtomat 4vtomat changed the title riscv vector cc part2 2 [RISCV] RISCV vector calling convention (2/2) Jan 23, 2024
llvm/lib/Target/RISCV/RISCVISelLowering.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVISelLowering.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVISelLowering.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVISelLowering.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVISelLowering.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVISelLowering.cpp Outdated Show resolved Hide resolved
std::vector<Type *> TypeList;
if (IsRet)
TypeList.push_back(MF.getFunction().getReturnType());
else if (CLI)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Braces

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As LLVM Coding Standards mentioned, the else if and for both contain only 1 statement and without comments, we don't need braces for it, do we?

llvm/lib/Target/RISCV/RISCVISelLowering.cpp Outdated Show resolved Hide resolved
@@ -20543,6 +20531,121 @@ unsigned RISCVTargetLowering::getMinimumJumpTableEntries() const {
return Subtarget.getMinimumJumpTableEntries();
}

void RVVArgDispatcher::constructArgInfos(Type *Ty) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be reinventing splitToValueTypes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit different since splitToValueTypes uses ComputeValueVTs to get value type(s) and it can't handle vector tuple type. Same situation in SelectionDAGISel::LowerArguments, they both use ComputeValueVTs for getting value types, that's why RVVArgDispatcher comes to help.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a "vector tuple type"? The DAG doesn't seem to rely on a similar helper

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By vector tuple type, I mean RVV tuple type, for example: vint32m2x2_t which needs 2 lmul2 vector registers. The llvm models it as struct {vint32m2_t, vint32m2_t} and it would be flattened by backend which makes it impossible to differentiate normal vector type and vector tuple type in AssignFn, that's why I added the helper to both GISel and normal selection DAG.

@4vtomat
Copy link
Member Author

4vtomat commented Feb 26, 2024

Rebase

@4vtomat
Copy link
Member Author

4vtomat commented Mar 1, 2024

ping

Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This commit handles vector arguments/return for function definition/call,
the new class RVVArgDispatcher is added for doing all vector register
assignment including mask types, data types as well as tuple types.
It precomputes the register number for each argument as per
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#standard-vector-calling-convention-variant
and it's passed to calling convention function to handle all vector arguments.

Depends on: llvm#78550
@4vtomat
Copy link
Member Author

4vtomat commented Mar 30, 2024

Rebase and squash.

@4vtomat 4vtomat merged commit 29e8bfc into llvm:main Mar 30, 2024
4 checks passed
@4vtomat 4vtomat deleted the riscv_vector_cc_part2_2 branch March 30, 2024 13:05
@newgre
Copy link

newgre commented Apr 11, 2024

Hi,

this commit has been identified as the root cause for a compiler crash with the following reproducer:

#pragma clang riscv intrinsic vector

unsigned long out_of_line_vsetvl() { return __builtin_rvv_vsetvlimax(2, 7); }

template <class Unused>
__rvv_float32mf2_t function1(__rvv_float32mf2_t f0_vf0) {
  const unsigned f1_vl0 = __builtin_rvv_vsetvlimax(2, 7);

  const __rvv_float32mf2_t f1_vf1 = __riscv_vfsgnjx_vv_f32mf2(f0_vf0, f0_vf0, f1_vl0);
  const __rvv_uint32mf2_t f1_vu0 = __riscv_vreinterpret_v_f32mf2_u32mf2(f1_vf1);
  const __rvv_uint32mf2_t f1_vu1 = __riscv_vmv_v_x_u32mf2(0, f1_vl0);
  const __rvv_float32mf2_t f1_vf2 = __riscv_vreinterpret_v_u32mf2_f32mf2(f1_vu1);
  const __rvv_bool64_t f1_vb0 = __riscv_vmflt_vv_f32mf2_b64(f1_vf2, f0_vf0, f1_vl0);
  const __rvv_uint32mf2_t f1_vu2 = __riscv_vmv_v_x_u32mf2(1, f1_vl0);
  const __rvv_uint32mf2_t f1_vu3 = __riscv_vmv_v_x_u32mf2(2, f1_vl0);
  const __rvv_uint32mf2_t f1_vu4 = __riscv_vmerge_vvm_u32mf2(f1_vu2, f1_vu3, f1_vb0, f1_vl0);
  const __rvv_uint32mf2_t f1_vu5 = __riscv_vadd_vv_u32mf2(f1_vu0, f1_vu4, f1_vl0);
  const __rvv_float32mf2_t f1_vf3 = __riscv_vreinterpret_v_u32mf2_f32mf2(f1_vu5);
  const __rvv_float32mf2_t f1_vf4 = __riscv_vfmv_v_f_f32mf2(3, f1_vl0);
  const __rvv_uint32mf2_t f1_vu6 = __riscv_vmv_v_x_u32mf2(0x7f800000, f1_vl0);
  const __rvv_float32mf2_t f1_vf5 = __riscv_vreinterpret_v_u32mf2_f32mf2(f1_vu6);
  const __rvv_uint32mf2_t f1_vu7 = __riscv_vreinterpret_v_f32mf2_u32mf2(f1_vf3);
  const __rvv_uint32mf2_t f1_vu8 = __riscv_vadd_vv_u32mf2(f1_vu7, f1_vu7, f1_vl0);
  const __rvv_uint32mf2_t f1_vu9 = __riscv_vsrl_vx_u32mf2(f1_vu8, 24, f1_vl0);
  const __rvv_int32mf2_t f1_vi0 = __riscv_vreinterpret_v_u32mf2_i32mf2(f1_vu9);
  const unsigned vl1 = out_of_line_vsetvl();
  const __rvv_bool64_t f1_vb1 = __riscv_vmslt_vx_i32mf2_b64(f1_vi0, 0xff, vl1);
  const __rvv_float32mf2_t f1_vf6 = __riscv_vmerge_vvm_f32mf2(f1_vf5, f1_vf4, f1_vb0, f1_vl0);
  const __rvv_float32mf2_t f1_vf7 = __riscv_vmerge_vvm_f32mf2(f1_vf6, f1_vf3, f1_vb1, f1_vl0);
  return f1_vf7;
}

void function0(const unsigned num, float* buf) {
  const unsigned f0_vl0 = __builtin_rvv_vsetvlimax(2, 7);
  const __rvv_float32mf2_t f0_vf0 = __riscv_vfmv_v_f_f32mf2(*buf, f0_vl0);
  const __rvv_float32mf2_t f0_vf1 = __riscv_vle32_v_f32mf2(buf, f0_vl0);
  const __rvv_float32mf2_t f0_vf2 = __riscv_vfmv_v_f_f32mf2(*buf, f0_vl0);
  const __rvv_bool64_t f0_vb0 = __riscv_vmfeq_vv_f32mf2_b64(f0_vf1, f0_vf0, f0_vl0);
  const __rvv_bool64_t f0_vb1 = __riscv_vmnot_m_b64(f0_vb0, f0_vl0);
  const int f0_i0 = __riscv_vfirst_m_b64(f0_vb1, f0_vl0);
  const bool f0_b0 = f0_i0 < 0;
  if (__builtin_expect(f0_b0, 0)) function1<void>(f0_vf2);
  out_of_line_vsetvl();
}

clang --target=riscv64-unknown-linux -march=rv64gcv -O1 -o repro.o -x c++ repro.c

#87897 refers to the same root cause.

Could we revert this back to green as a remediation?

@4vtomat
Copy link
Member Author

4vtomat commented Apr 12, 2024

The issue is handled in this PR: 87736, thanks!

@bgra8
Copy link
Contributor

bgra8 commented Apr 12, 2024

@4vtomat the LLVM dev policy mandates reverts instead of fixes forward to keep the tree healthy.

Please revert first to unblock all other teams that are hindered by this issue, investigate the issue offline, and then reapply the fixed patch.

@4vtomat
Copy link
Member Author

4vtomat commented Apr 12, 2024

@4vtomat the LLVM dev policy mandates reverts instead of fixes forward to keep the tree healthy.

Please revert first to unblock all other teams that are hindered by this issue, investigate the issue offline, and then reapply the fixed patch.

Got it, thanks~

4vtomat added a commit to 4vtomat/llvm-project that referenced this pull request Apr 12, 2024
This reverts commit 29e8bfc.
This patch didn't handle vector return type correctly.
4vtomat added a commit that referenced this pull request Apr 12, 2024
This reverts commit 29e8bfc.
This patch didn't handle vector return type correctly.
4vtomat added a commit to 4vtomat/llvm-project that referenced this pull request Apr 12, 2024
4vtomat added a commit to 4vtomat/llvm-project that referenced this pull request Apr 16, 2024
Return values are handled in a same way as function arguments.
One thing to mention is that if a type can be broken down into homogeneous
vector types, e.g. {<vscale x 4 x i32>, {<vscale x 4 x i32>, <vscale x 4 x i32>}},
it is considered as a vector tuple type and need to be handled by tuple
type rule.
4vtomat added a commit that referenced this pull request Apr 16, 2024
Bug fix: Handle RVV return type in calling convention correctly.
Return values are handled in a same way as function arguments.
One thing to mention is that if a type can be broken down into
homogeneous
vector types, e.g. {<vscale x 4 x i32>, {<vscale x 4 x i32>, <vscale x 4
x i32>}},
it is considered as a vector tuple type and need to be handled by tuple
type rule.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V clang:codegen clang Clang issues not falling into any other category llvm:globalisel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants