[CostModel][X86] check all reduction cost kinds using -cost-kind=all #132000

Merged: 1 commit into llvm:main on Mar 19, 2025


RKSimon
Collaborator

@RKSimon RKSimon commented Mar 19, 2025

No description provided.

@llvmbot
Member

llvmbot commented Mar 19, 2025

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)

Changes

Patch is 864.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/132000.diff

14 Files Affected:

  • (modified) llvm/test/Analysis/CostModel/X86/reduce-add.ll (+208-176)
  • (modified) llvm/test/Analysis/CostModel/X86/reduce-and.ll (+164-145)
  • (modified) llvm/test/Analysis/CostModel/X86/reduce-fadd.ll (+191-191)
  • (modified) llvm/test/Analysis/CostModel/X86/reduce-fmax.ll (+136-102)
  • (modified) llvm/test/Analysis/CostModel/X86/reduce-fmin.ll (+136-102)
  • (modified) llvm/test/Analysis/CostModel/X86/reduce-fmul.ll (+191-191)
  • (modified) llvm/test/Analysis/CostModel/X86/reduce-mul.ll (+182-182)
  • (modified) llvm/test/Analysis/CostModel/X86/reduce-or.ll (+164-145)
  • (modified) llvm/test/Analysis/CostModel/X86/reduce-smax.ll (+201-204)
  • (modified) llvm/test/Analysis/CostModel/X86/reduce-smin.ll (+201-204)
  • (modified) llvm/test/Analysis/CostModel/X86/reduce-umax.ll (+201-204)
  • (modified) llvm/test/Analysis/CostModel/X86/reduce-umin.ll (+201-204)
  • (modified) llvm/test/Analysis/CostModel/X86/reduce-xor.ll (+182-163)
  • (modified) llvm/test/Analysis/CostModel/X86/reduction.ll (+886-802)
diff --git a/llvm/test/Analysis/CostModel/X86/reduce-add.ll b/llvm/test/Analysis/CostModel/X86/reduce-add.ll
index c869d0e3032b9..9a717e7dbef73 100644
--- a/llvm/test/Analysis/CostModel/X86/reduce-add.ll
+++ b/llvm/test/Analysis/CostModel/X86/reduce-add.ll
@@ -1,55 +1,71 @@
 ; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+sse2 | FileCheck %s --check-prefixes=SSE
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+ssse3 | FileCheck %s --check-prefixes=SSE
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+sse4.2 | FileCheck %s --check-prefixes=SSE
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+avx | FileCheck %s --check-prefixes=AVX1
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+avx2 | FileCheck %s --check-prefixes=AVX2
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+avx512f | FileCheck %s --check-prefixes=AVX512,AVX512F
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+avx512f,+avx512bw | FileCheck %s --check-prefixes=AVX512,AVX512BW
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+avx512f,+avx512dq | FileCheck %s --check-prefixes=AVX512,AVX512DQ
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+sse2 | FileCheck %s --check-prefixes=SSE
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+ssse3 | FileCheck %s --check-prefixes=SSE
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+sse4.2 | FileCheck %s --check-prefixes=SSE
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+avx | FileCheck %s --check-prefixes=AVX1
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+avx2 | FileCheck %s --check-prefixes=AVX2
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+avx512f | FileCheck %s --check-prefixes=AVX512F
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+avx512f,+avx512bw | FileCheck %s --check-prefixes=AVX512BW
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+avx512f,+avx512dq | FileCheck %s --check-prefixes=AVX512DQ
 
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mcpu=slm | FileCheck %s --check-prefixes=SLM
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mcpu=slm | FileCheck %s --check-prefixes=SLM
 
 define i32 @reduce_i64(i32 %arg) {
 ; SSE-LABEL: 'reduce_i64'
-; SSE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
-; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
-; SSE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
-; SSE-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
-; SSE-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
-; SSE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; SSE-NEXT:  Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; SSE-NEXT:  Cost Model: Found costs of 2 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; SSE-NEXT:  Cost Model: Found costs of RThru:3 CodeSize:3 Lat:4 SizeLat:4 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; SSE-NEXT:  Cost Model: Found costs of RThru:5 CodeSize:5 Lat:8 SizeLat:8 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; SSE-NEXT:  Cost Model: Found costs of RThru:9 CodeSize:9 Lat:16 SizeLat:16 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; SSE-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
 ;
 ; AVX1-LABEL: 'reduce_i64'
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; AVX1-NEXT:  Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; AVX1-NEXT:  Cost Model: Found costs of 1 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; AVX1-NEXT:  Cost Model: Found costs of 3 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; AVX1-NEXT:  Cost Model: Found costs of RThru:7 CodeSize:8 Lat:5 SizeLat:9 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; AVX1-NEXT:  Cost Model: Found costs of RThru:15 CodeSize:18 Lat:9 SizeLat:21 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; AVX1-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
 ;
 ; AVX2-LABEL: 'reduce_i64'
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; AVX2-NEXT:  Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; AVX2-NEXT:  Cost Model: Found costs of 1 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; AVX2-NEXT:  Cost Model: Found costs of 3 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; AVX2-NEXT:  Cost Model: Found costs of RThru:4 CodeSize:4 Lat:4 SizeLat:5 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; AVX2-NEXT:  Cost Model: Found costs of RThru:6 CodeSize:6 Lat:6 SizeLat:9 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; AVX2-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
 ;
-; AVX512-LABEL: 'reduce_i64'
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; AVX512F-LABEL: 'reduce_i64'
+; AVX512F-NEXT:  Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; AVX512F-NEXT:  Cost Model: Found costs of 1 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; AVX512F-NEXT:  Cost Model: Found costs of 3 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; AVX512F-NEXT:  Cost Model: Found costs of RThru:7 CodeSize:7 Lat:9 SizeLat:8 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; AVX512F-NEXT:  Cost Model: Found costs of RThru:8 CodeSize:8 Lat:10 SizeLat:9 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; AVX512F-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
+;
+; AVX512BW-LABEL: 'reduce_i64'
+; AVX512BW-NEXT:  Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; AVX512BW-NEXT:  Cost Model: Found costs of 1 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; AVX512BW-NEXT:  Cost Model: Found costs of 3 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; AVX512BW-NEXT:  Cost Model: Found costs of RThru:7 CodeSize:7 Lat:9 SizeLat:7 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; AVX512BW-NEXT:  Cost Model: Found costs of RThru:8 CodeSize:8 Lat:10 SizeLat:8 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; AVX512BW-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
+;
+; AVX512DQ-LABEL: 'reduce_i64'
+; AVX512DQ-NEXT:  Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; AVX512DQ-NEXT:  Cost Model: Found costs of 1 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; AVX512DQ-NEXT:  Cost Model: Found costs of 3 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; AVX512DQ-NEXT:  Cost Model: Found costs of RThru:7 CodeSize:7 Lat:9 SizeLat:8 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; AVX512DQ-NEXT:  Cost Model: Found costs of RThru:8 CodeSize:8 Lat:10 SizeLat:9 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; AVX512DQ-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
 ;
 ; SLM-LABEL: 'reduce_i64'
-; SLM-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
-; SLM-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
-; SLM-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
-; SLM-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
-; SLM-NEXT:  Cost Model: Found an estimated cost of 33 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
-; SLM-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; SLM-NEXT:  Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; SLM-NEXT:  Cost Model: Found costs of 5 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; SLM-NEXT:  Cost Model: Found costs of RThru:9 CodeSize:6 Lat:7 SizeLat:7 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; SLM-NEXT:  Cost Model: Found costs of RThru:17 CodeSize:8 Lat:11 SizeLat:11 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; SLM-NEXT:  Cost Model: Found costs of RThru:33 CodeSize:12 Lat:19 SizeLat:19 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; SLM-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
 ;
   %V1  = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
   %V2  = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
@@ -61,44 +77,60 @@ define i32 @reduce_i64(i32 %arg) {
 
 define i32 @reduce_i32(i32 %arg) {
 ; SSE-LABEL: 'reduce_i32'
-; SSE-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
-; SSE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
-; SSE-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
-; SSE-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
-; SSE-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
-; SSE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; SSE-NEXT:  Cost Model: Found costs of 2 for: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
+; SSE-NEXT:  Cost Model: Found costs of 3 for: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
+; SSE-NEXT:  Cost Model: Found costs of 4 for: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
+; SSE-NEXT:  Cost Model: Found costs of 6 for: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
+; SSE-NEXT:  Cost Model: Found costs of 10 for: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
+; SSE-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
 ;
 ; AVX1-LABEL: 'reduce_i32'
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
-; AVX1-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; AVX1-NEXT:  Cost Model: Found costs of 2 for: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
+; AVX1-NEXT:  Cost Model: Found costs of 3 for: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
+; AVX1-NEXT:  Cost Model: Found costs of 5 for: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
+; AVX1-NEXT:  Cost Model: Found costs of RThru:9 CodeSize:10 Lat:7 SizeLat:11 for: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
+; AVX1-NEXT:  Cost Model: Found costs of RThru:17 CodeSize:20 Lat:11 SizeLat:23 for: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
+; AVX1-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
 ;
 ; AVX2-LABEL: 'reduce_i32'
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
-; AVX2-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; AVX2-NEXT:  Cost Model: Found costs of 2 for: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
+; AVX2-NEXT:  Cost Model: Found costs of 3 for: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
+; AVX2-NEXT:  Cost Model: Found costs of 5 for: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
+; AVX2-NEXT:  Cost Model: Found costs of RThru:6 CodeSize:6 Lat:6 SizeLat:7 for: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
+; AVX2-NEXT:  Cost Model: Found costs of RThru:8 CodeSize:8 Lat:8 SizeLat:11 for: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
+; AVX2-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
+;
+; AVX512F-LABEL: 'reduce_i32'
+; AVX512F-NEXT:  Cost Model: Found costs of 2 for: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
+; AVX512F-NEXT:  Cost Model: Found costs of 3 for: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
+; AVX512F-NEXT:  Cost Model: Found costs of 5 for: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
+; AVX512F-NEXT:  Cost Model: Found costs of RThru:9 CodeSize:9 Lat:13 SizeLat:10 for: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
+; AVX512F-NEXT:  Cost Model: Found costs of RThru:10 CodeSize:10 Lat:14 SizeLat:11 for: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
+; AVX512F-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
+;
+; AVX512BW-LABEL: 'reduce_i32'
+; AVX512BW-NEXT:  Cost Model: Found costs of 2 for: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
+; AVX512BW-NEXT:  Cost Model: Found costs of 3 for: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
+; AVX512BW-NEXT:  Cost Model: Found costs of 5 for: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
+; AVX512BW-NEXT:  Cost Model: Found costs of RThru:9 CodeSize:9 Lat:13 SizeLat:9 for: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
+; AVX512BW-NEXT:  Cost Model: Found costs of RThru:10 CodeSize:10 Lat:14 SizeLat:10 for: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
+; AVX512BW-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
 ;
-; AVX512-LABEL: 'reduce_i32'
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
-; AVX512-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; AVX512DQ-LABEL: 'reduce_i32'
+; AVX512DQ-NEXT:  Cost Model: Found costs of 2 for: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
+; AVX512DQ-NEXT:  Cost Model: Found costs of 3 for: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
+; AVX512DQ-NEXT:  Cost Model: Found costs of 5 for: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
+; AVX512DQ-NEXT:  Cost Model: Found costs of RThru:9 CodeSize:9 Lat:13 SizeLat:10 for: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
+; AVX512DQ-NEXT:  Cost Model: Found costs of RThru:10 CodeSize:10 Lat:14 SizeLat:11 for: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
+; AVX512DQ-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
 ;
 ; SLM-LABEL: 'reduce_i32'
-; SLM-NEXT:  Cost Model: Found an estimated cost of 2 for instruction:...
[truncated]
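
The updated CHECK lines in the diff above come in two shapes: a collapsed form (`Found costs of 2 for: ...`) and an expanded form listing `RThru`, `CodeSize`, `Lat` and `SizeLat` separately. The collapsed form appears to be printed when all four cost kinds agree. A minimal Python sketch for reading both shapes follows. The function name `parse_cost_line` is hypothetical and is not part of LLVM or its test utilities, and the sketch assumes every cost is a plain non-negative integer.

```python
import re
from typing import Dict, Optional

# Cost-kind labels as they appear in the CHECK lines of this diff.
COST_KINDS = ("RThru", "CodeSize", "Lat", "SizeLat")

# Matches the "-cost-kind=all" output in both its collapsed and expanded form.
# The older "Found an estimated cost of N for instruction:" wording is
# deliberately not matched.
_COST_LINE = re.compile(
    r"Cost Model: Found costs of (?P<costs>.+?) for: (?P<inst>.*)$"
)


def parse_cost_line(line: str) -> Optional[Dict[str, object]]:
    """Hypothetical helper: split one cost-model line into its four costs.

    Returns None when the line is not in the "-cost-kind=all" format.
    Assumes integer costs; anything else raises ValueError.
    """
    match = _COST_LINE.search(line)
    if match is None:
        return None
    costs_text = match.group("costs")
    if costs_text.isdigit():
        # Collapsed form: treat the single number as the value of every kind
        # (assumption based on the lines in this diff).
        costs = {kind: int(costs_text) for kind in COST_KINDS}
    else:
        # Expanded form: "RThru:3 CodeSize:3 Lat:4 SizeLat:4".
        costs = {}
        for field in costs_text.split():
            kind, _, value = field.partition(":")
            costs[kind] = int(value)
    return {"inst": match.group("inst"), "costs": costs}


if __name__ == "__main__":
    sample = (
        "; SSE-NEXT:  Cost Model: Found costs of RThru:3 CodeSize:3 Lat:4 "
        "SizeLat:4 for: %V4 = call i64 "
        "@llvm.vector.reduce.add.v4i64(<4 x i64> undef)"
    )
    print(parse_cost_line(sample))
```

Normalising both shapes into the same four-key dictionary makes it straightforward to compare, for instance, `Lat` against `RThru` across subtargets without special-casing the collapsed lines.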


⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 6ca1424fc1db255627f27eb6a50c7a837e3fecb3 cc8b52092d0ac372a93b78095102fe3540dce960 llvm/test/Analysis/CostModel/X86/reduce-add.ll llvm/test/Analysis/CostModel/X86/reduce-and.ll llvm/test/Analysis/CostModel/X86/reduce-fadd.ll llvm/test/Analysis/CostModel/X86/reduce-fmax.ll llvm/test/Analysis/CostModel/X86/reduce-fmin.ll llvm/test/Analysis/CostModel/X86/reduce-fmul.ll llvm/test/Analysis/CostModel/X86/reduce-mul.ll llvm/test/Analysis/CostModel/X86/reduce-or.ll llvm/test/Analysis/CostModel/X86/reduce-smax.ll llvm/test/Analysis/CostModel/X86/reduce-smin.ll llvm/test/Analysis/CostModel/X86/reduce-umax.ll llvm/test/Analysis/CostModel/X86/reduce-umin.ll llvm/test/Analysis/CostModel/X86/reduce-xor.ll llvm/test/Analysis/CostModel/X86/reduction.ll

The following files introduce new uses of undef:

  • llvm/test/Analysis/CostModel/X86/reduce-add.ll
  • llvm/test/Analysis/CostModel/X86/reduce-and.ll
  • llvm/test/Analysis/CostModel/X86/reduce-fmax.ll
  • llvm/test/Analysis/CostModel/X86/reduce-fmin.ll
  • llvm/test/Analysis/CostModel/X86/reduce-or.ll
  • llvm/test/Analysis/CostModel/X86/reduce-xor.ll
  • llvm/test/Analysis/CostModel/X86/reduction.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.
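
The check quoted in the bot comment above can be roughly approximated in Python by scanning the added lines of a diff with the same regular expression. This is only a sketch under stated assumptions: the function name `added_lines_with_undef` is hypothetical, and `git diff -S` works differently, since it reports files where the number of matches changes rather than inspecting individual added lines. A file flagged by this sketch is therefore not necessarily flagged by the bot.

```python
import re
from typing import Iterable, List

# Pattern copied from the `git diff -S` command in the bot comment.
UNDEF_PATTERN = re.compile(
    r"([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)"
)


def added_lines_with_undef(diff_lines: Iterable[str]) -> List[str]:
    """Hypothetical helper: return added diff lines that mention undef."""
    hits = []
    for line in diff_lines:
        # Only added lines count; "+++" is the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        # Append a newline so an `undef` at the very end of a stripped line
        # still has a following character, as it would in raw diff text.
        if UNDEF_PATTERN.search(line + "\n"):
            hits.append(line)
    return hits


if __name__ == "__main__":
    sample = [
        "+++ b/llvm/test/Analysis/CostModel/X86/reduce-add.ll",
        "-; SSE-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef",
        "+; SSE-NEXT:  Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef",
    ]
    for hit in added_lines_with_undef(sample):
        print(hit)
```

The character classes around `undef` keep identifiers such as `undefined` and preprocessor lines such as `#undef` from matching, which is why the pattern is reused verbatim rather than simplified.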

@RKSimon RKSimon merged commit 945ce96 into llvm:main Mar 19, 2025
9 of 13 checks passed
@RKSimon RKSimon deleted the x86-reduction-costs-all branch March 19, 2025 11:26