[AMDGPU] Add tests for vector rebroadcast. #91322

Merged: PeddleSpam merged 3 commits into llvm:main from the shuffle_splat branch on May 13, 2024

Conversation

PeddleSpam
Contributor

No description provided.

@PeddleSpam PeddleSpam marked this pull request as ready for review May 7, 2024 13:20

llvmbot commented May 7, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Leon Clark (PeddleSpam)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/91322.diff

1 Files Affected:

  • (added) llvm/test/CodeGen/AMDGPU/vector_rebroadcast.ll (+39)
diff --git a/llvm/test/CodeGen/AMDGPU/vector_rebroadcast.ll b/llvm/test/CodeGen/AMDGPU/vector_rebroadcast.ll
new file mode 100644
index 0000000000000..50c5dadfcbb15
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/vector_rebroadcast.ll
@@ -0,0 +1,39 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -verify-machineinstrs < %s | FileCheck -check-prefix=GFX9 %s
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -verify-machineinstrs < %s | FileCheck -check-prefix=GFX10 %s
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -verify-machineinstrs < %s | FileCheck -check-prefix=GFX11 %s
+
+define <4 x float> @rebroadcast_v4f32(ptr addrspace(1) %arg0) {
+; GFX9-LABEL: rebroadcast_v4f32:
+; GFX9:       ; %bb.0: ; %entry
+; GFX9-NEXT:  s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX9-NEXT:  global_load_dwordx4 v[0:3], v[0:1], off
+; GFX9-NEXT:  s_waitcnt vmcnt(0)
+; GFX9-NEXT:  v_mov_b32_e32 v0, v1
+; GFX9-NEXT:  v_mov_b32_e32 v2, v1
+; GFX9-NEXT:  v_mov_b32_e32 v3, v1
+; GFX9-NEXT:  s_setpc_b64 s[30:31]
+;
+; GFX10-LABEL: rebroadcast_v4f32:
+; GFX10:       ; %bb.0: ; %entry
+; GFX10-NEXT:  s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX10-NEXT:  global_load_dwordx4 v[0:3], v[0:1], off
+; GFX10-NEXT:  s_waitcnt vmcnt(0)
+; GFX10-NEXT:  v_mov_b32_e32 v0, v1
+; GFX10-NEXT:  v_mov_b32_e32 v2, v1
+; GFX10-NEXT:  v_mov_b32_e32 v3, v1
+; GFX10-NEXT:  s_setpc_b64 s[30:31]
+;
+; GFX11-LABEL: rebroadcast_v4f32:
+; GFX11:       ; %bb.0: ; %entry
+; GFX11-NEXT:  s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT:  global_load_b128 v[0:3], v[0:1], off
+; GFX11-NEXT:  s_waitcnt vmcnt(0)
+; GFX11-NEXT:  v_mov_b32_e32 v0, v1
+; GFX11-NEXT:  v_mov_b32_e32 v2, v1
+; GFX11-NEXT:  v_mov_b32_e32 v3, v1
+; GFX11-NEXT:  s_setpc_b64 s[30:31]
+entry:
+  %val0 = load <4 x float>, ptr addrspace(1) %arg0
+  %val1 = shufflevector <4 x float> %val0, <4 x float> undef, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
+  ret <4 x float> %val1
+}

Contributor

@arsenm arsenm left a comment

Can this be merged into an existing test? Should it test more vector sizes? It should probably link to the follow-up patch for context.

Review thread on llvm/test/CodeGen/AMDGPU/vector_rebroadcast.ll (outdated, resolved)
@PeddleSpam
Contributor Author

> Can this merge in with another test? Should it test more vector sizes? Probably should link to follow up patch context

I've added tests for more vector types/sizes. It's a lot to merge with another file but I can if you'd prefer.
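
For illustration only, an additional case of the kind discussed here might look like the sketch below. It is not taken from the merged patch: the function name `rebroadcast_v8i16` and the choice of `<8 x i16>` are assumptions made for this example. It mirrors the `<4 x float>` test in the diff above, splatting lane 1 across every result lane, and is meant to be appended to the same file so it reuses the existing RUN lines. CHECK lines are left out deliberately, since the exact instruction sequence for each target is not known here; they would normally be generated with `llvm/utils/update_llc_test_checks.py`.

```llvm
; Hypothetical extra case (name and vector type are assumptions,
; not part of the merged patch). CHECK lines intentionally omitted.
define <8 x i16> @rebroadcast_v8i16(ptr addrspace(1) %arg0) {
entry:
  ; Load the full vector from global memory (address space 1).
  %val0 = load <8 x i16>, ptr addrspace(1) %arg0
  ; Broadcast element 1 of the loaded vector into all eight lanes.
  %val1 = shufflevector <8 x i16> %val0, <8 x i16> undef, <8 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
  ret <8 x i16> %val1
}
```

The second operand is `undef` only to match the style of the existing test; every mask index is below 8, so only the first operand is ever read.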

@PeddleSpam PeddleSpam merged commit bd67986 into llvm:main May 13, 2024
4 checks passed
@PeddleSpam PeddleSpam deleted the shuffle_splat branch May 13, 2024 18:47