[CostModel][X86] check all reduction cost kinds using -cost-kind=all #132000
@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)

Changes

Patch is 864.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/132000.diff

14 Files Affected:
diff --git a/llvm/test/Analysis/CostModel/X86/reduce-add.ll b/llvm/test/Analysis/CostModel/X86/reduce-add.ll
index c869d0e3032b9..9a717e7dbef73 100644
--- a/llvm/test/Analysis/CostModel/X86/reduce-add.ll
+++ b/llvm/test/Analysis/CostModel/X86/reduce-add.ll
@@ -1,55 +1,71 @@
; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+sse2 | FileCheck %s --check-prefixes=SSE
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+ssse3 | FileCheck %s --check-prefixes=SSE
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+sse4.2 | FileCheck %s --check-prefixes=SSE
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+avx | FileCheck %s --check-prefixes=AVX1
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+avx2 | FileCheck %s --check-prefixes=AVX2
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+avx512f | FileCheck %s --check-prefixes=AVX512,AVX512F
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+avx512f,+avx512bw | FileCheck %s --check-prefixes=AVX512,AVX512BW
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mattr=+avx512f,+avx512dq | FileCheck %s --check-prefixes=AVX512,AVX512DQ
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+sse2 | FileCheck %s --check-prefixes=SSE
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+ssse3 | FileCheck %s --check-prefixes=SSE
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+sse4.2 | FileCheck %s --check-prefixes=SSE
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+avx | FileCheck %s --check-prefixes=AVX1
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+avx2 | FileCheck %s --check-prefixes=AVX2
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+avx512f | FileCheck %s --check-prefixes=AVX512F
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+avx512f,+avx512bw | FileCheck %s --check-prefixes=AVX512BW
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mattr=+avx512f,+avx512dq | FileCheck %s --check-prefixes=AVX512DQ
-; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -mcpu=slm | FileCheck %s --check-prefixes=SLM
+; RUN: opt < %s -passes="print<cost-model>" -mtriple=x86_64-apple-darwin 2>&1 -disable-output -cost-kind=all -mcpu=slm | FileCheck %s --check-prefixes=SLM
define i32 @reduce_i64(i32 %arg) {
; SSE-LABEL: 'reduce_i64'
-; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
-; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
-; SSE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
-; SSE-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
-; SSE-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
-; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; SSE-NEXT: Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; SSE-NEXT: Cost Model: Found costs of 2 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; SSE-NEXT: Cost Model: Found costs of RThru:3 CodeSize:3 Lat:4 SizeLat:4 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; SSE-NEXT: Cost Model: Found costs of RThru:5 CodeSize:5 Lat:8 SizeLat:8 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; SSE-NEXT: Cost Model: Found costs of RThru:9 CodeSize:9 Lat:16 SizeLat:16 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; SSE-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
;
; AVX1-LABEL: 'reduce_i64'
-; AVX1-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
-; AVX1-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
-; AVX1-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
-; AVX1-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
-; AVX1-NEXT: Cost Model: Found an estimated cost of 15 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
-; AVX1-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; AVX1-NEXT: Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; AVX1-NEXT: Cost Model: Found costs of 1 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; AVX1-NEXT: Cost Model: Found costs of 3 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; AVX1-NEXT: Cost Model: Found costs of RThru:7 CodeSize:8 Lat:5 SizeLat:9 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; AVX1-NEXT: Cost Model: Found costs of RThru:15 CodeSize:18 Lat:9 SizeLat:21 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; AVX1-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
;
; AVX2-LABEL: 'reduce_i64'
-; AVX2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
-; AVX2-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
-; AVX2-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
-; AVX2-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
-; AVX2-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
-; AVX2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; AVX2-NEXT: Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; AVX2-NEXT: Cost Model: Found costs of 1 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; AVX2-NEXT: Cost Model: Found costs of 3 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; AVX2-NEXT: Cost Model: Found costs of RThru:4 CodeSize:4 Lat:4 SizeLat:5 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; AVX2-NEXT: Cost Model: Found costs of RThru:6 CodeSize:6 Lat:6 SizeLat:9 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; AVX2-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
;
-; AVX512-LABEL: 'reduce_i64'
-; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
-; AVX512-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
-; AVX512-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
-; AVX512-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
-; AVX512-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
-; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; AVX512F-LABEL: 'reduce_i64'
+; AVX512F-NEXT: Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; AVX512F-NEXT: Cost Model: Found costs of 1 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; AVX512F-NEXT: Cost Model: Found costs of 3 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; AVX512F-NEXT: Cost Model: Found costs of RThru:7 CodeSize:7 Lat:9 SizeLat:8 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; AVX512F-NEXT: Cost Model: Found costs of RThru:8 CodeSize:8 Lat:10 SizeLat:9 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; AVX512F-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
+;
+; AVX512BW-LABEL: 'reduce_i64'
+; AVX512BW-NEXT: Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; AVX512BW-NEXT: Cost Model: Found costs of 1 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; AVX512BW-NEXT: Cost Model: Found costs of 3 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; AVX512BW-NEXT: Cost Model: Found costs of RThru:7 CodeSize:7 Lat:9 SizeLat:7 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; AVX512BW-NEXT: Cost Model: Found costs of RThru:8 CodeSize:8 Lat:10 SizeLat:8 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; AVX512BW-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
+;
+; AVX512DQ-LABEL: 'reduce_i64'
+; AVX512DQ-NEXT: Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; AVX512DQ-NEXT: Cost Model: Found costs of 1 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; AVX512DQ-NEXT: Cost Model: Found costs of 3 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; AVX512DQ-NEXT: Cost Model: Found costs of RThru:7 CodeSize:7 Lat:9 SizeLat:8 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; AVX512DQ-NEXT: Cost Model: Found costs of RThru:8 CodeSize:8 Lat:10 SizeLat:9 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; AVX512DQ-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
;
; SLM-LABEL: 'reduce_i64'
-; SLM-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
-; SLM-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
-; SLM-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
-; SLM-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
-; SLM-NEXT: Cost Model: Found an estimated cost of 33 for instruction: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
-; SLM-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; SLM-NEXT: Cost Model: Found costs of 0 for: %V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
+; SLM-NEXT: Cost Model: Found costs of 5 for: %V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
+; SLM-NEXT: Cost Model: Found costs of RThru:9 CodeSize:6 Lat:7 SizeLat:7 for: %V4 = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> undef)
+; SLM-NEXT: Cost Model: Found costs of RThru:17 CodeSize:8 Lat:11 SizeLat:11 for: %V8 = call i64 @llvm.vector.reduce.add.v8i64(<8 x i64> undef)
+; SLM-NEXT: Cost Model: Found costs of RThru:33 CodeSize:12 Lat:19 SizeLat:19 for: %V16 = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> undef)
+; SLM-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
;
%V1 = call i64 @llvm.vector.reduce.add.v1i64(<1 x i64> undef)
%V2 = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> undef)
@@ -61,44 +77,60 @@ define i32 @reduce_i64(i32 %arg) {
define i32 @reduce_i32(i32 %arg) {
; SSE-LABEL: 'reduce_i32'
-; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
-; SSE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
-; SSE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
-; SSE-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
-; SSE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
-; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; SSE-NEXT: Cost Model: Found costs of 2 for: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
+; SSE-NEXT: Cost Model: Found costs of 3 for: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
+; SSE-NEXT: Cost Model: Found costs of 4 for: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
+; SSE-NEXT: Cost Model: Found costs of 6 for: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
+; SSE-NEXT: Cost Model: Found costs of 10 for: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
+; SSE-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
;
; AVX1-LABEL: 'reduce_i32'
-; AVX1-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
-; AVX1-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
-; AVX1-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
-; AVX1-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
-; AVX1-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
-; AVX1-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; AVX1-NEXT: Cost Model: Found costs of 2 for: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
+; AVX1-NEXT: Cost Model: Found costs of 3 for: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
+; AVX1-NEXT: Cost Model: Found costs of 5 for: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
+; AVX1-NEXT: Cost Model: Found costs of RThru:9 CodeSize:10 Lat:7 SizeLat:11 for: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
+; AVX1-NEXT: Cost Model: Found costs of RThru:17 CodeSize:20 Lat:11 SizeLat:23 for: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
+; AVX1-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
;
; AVX2-LABEL: 'reduce_i32'
-; AVX2-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
-; AVX2-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
-; AVX2-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
-; AVX2-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
-; AVX2-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
-; AVX2-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; AVX2-NEXT: Cost Model: Found costs of 2 for: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
+; AVX2-NEXT: Cost Model: Found costs of 3 for: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
+; AVX2-NEXT: Cost Model: Found costs of 5 for: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
+; AVX2-NEXT: Cost Model: Found costs of RThru:6 CodeSize:6 Lat:6 SizeLat:7 for: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
+; AVX2-NEXT: Cost Model: Found costs of RThru:8 CodeSize:8 Lat:8 SizeLat:11 for: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
+; AVX2-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
+;
+; AVX512F-LABEL: 'reduce_i32'
+; AVX512F-NEXT: Cost Model: Found costs of 2 for: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
+; AVX512F-NEXT: Cost Model: Found costs of 3 for: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
+; AVX512F-NEXT: Cost Model: Found costs of 5 for: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
+; AVX512F-NEXT: Cost Model: Found costs of RThru:9 CodeSize:9 Lat:13 SizeLat:10 for: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
+; AVX512F-NEXT: Cost Model: Found costs of RThru:10 CodeSize:10 Lat:14 SizeLat:11 for: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
+; AVX512F-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
+;
+; AVX512BW-LABEL: 'reduce_i32'
+; AVX512BW-NEXT: Cost Model: Found costs of 2 for: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
+; AVX512BW-NEXT: Cost Model: Found costs of 3 for: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
+; AVX512BW-NEXT: Cost Model: Found costs of 5 for: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
+; AVX512BW-NEXT: Cost Model: Found costs of RThru:9 CodeSize:9 Lat:13 SizeLat:9 for: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
+; AVX512BW-NEXT: Cost Model: Found costs of RThru:10 CodeSize:10 Lat:14 SizeLat:10 for: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
+; AVX512BW-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
;
-; AVX512-LABEL: 'reduce_i32'
-; AVX512-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
-; AVX512-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
-; AVX512-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
-; AVX512-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
-; AVX512-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
-; AVX512-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
+; AVX512DQ-LABEL: 'reduce_i32'
+; AVX512DQ-NEXT: Cost Model: Found costs of 2 for: %V2 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> undef)
+; AVX512DQ-NEXT: Cost Model: Found costs of 3 for: %V4 = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> undef)
+; AVX512DQ-NEXT: Cost Model: Found costs of 5 for: %V8 = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> undef)
+; AVX512DQ-NEXT: Cost Model: Found costs of RThru:9 CodeSize:9 Lat:13 SizeLat:10 for: %V16 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> undef)
+; AVX512DQ-NEXT: Cost Model: Found costs of RThru:10 CodeSize:10 Lat:14 SizeLat:11 for: %V32 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> undef)
+; AVX512DQ-NEXT: Cost Model: Found costs of RThru:0 CodeSize:1 Lat:1 SizeLat:1 for: ret i32 undef
;
; SLM-LABEL: 'reduce_i32'
-; SLM-NEXT: Cost Model: Found an estimated cost of 2 for instruction:...
[truncated]
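As a rough aid for reading the updated CHECK lines, the following Python sketch parses the two line shapes that appear in the diff above: a single number (`Found costs of 2 for: ...`) and the per-kind form (`Found costs of RThru:3 CodeSize:3 Lat:4 SizeLat:4 for: ...`). The function name `parse_cost_line` is hypothetical, and treating a single number as "all four cost kinds share this value" is an assumption inferred from the diff, not something stated in the PR. It also assumes every cost is an integer.

```python
import re

# Line shapes assumed from the CHECK lines in the diff:
#   Cost Model: Found costs of 2 for: <instruction>
#   Cost Model: Found costs of RThru:3 CodeSize:3 Lat:4 SizeLat:4 for: <instruction>
COST_LINE = re.compile(r"Cost Model: Found costs of (?P<costs>.+?) for: (?P<inst>.+)$")
KINDS = ("RThru", "CodeSize", "Lat", "SizeLat")


def parse_cost_line(line):
    """Return (costs_by_kind, instruction), or None if the line does not match."""
    m = COST_LINE.search(line)
    if m is None:
        return None
    text = m.group("costs")
    if ":" in text:
        # Per-kind form: whitespace-separated "Kind:value" fields.
        costs = {}
        for field in text.split():
            kind, value = field.split(":")
            costs[kind] = int(value)
    else:
        # Assumption: a lone number means all four kinds have the same cost.
        costs = {kind: int(text) for kind in KINDS}
    return costs, m.group("inst")
```

Applied to the SSE `%V4` line of `reduce_i64`, this would give throughput and code size of 3 against latency and size-latency of 4, which makes it easier to spot where the cost kinds diverge for a given subtarget.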
You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 6ca1424fc1db255627f27eb6a50c7a837e3fecb3 cc8b52092d0ac372a93b78095102fe3540dce960 llvm/test/Analysis/CostModel/X86/reduce-add.ll llvm/test/Analysis/CostModel/X86/reduce-and.ll llvm/test/Analysis/CostModel/X86/reduce-fadd.ll llvm/test/Analysis/CostModel/X86/reduce-fmax.ll llvm/test/Analysis/CostModel/X86/reduce-fmin.ll llvm/test/Analysis/CostModel/X86/reduce-fmul.ll llvm/test/Analysis/CostModel/X86/reduce-mul.ll llvm/test/Analysis/CostModel/X86/reduce-or.ll llvm/test/Analysis/CostModel/X86/reduce-smax.ll llvm/test/Analysis/CostModel/X86/reduce-smin.ll llvm/test/Analysis/CostModel/X86/reduce-umax.ll llvm/test/Analysis/CostModel/X86/reduce-umin.ll llvm/test/Analysis/CostModel/X86/reduce-xor.ll llvm/test/Analysis/CostModel/X86/reduction.ll
The following files introduce new uses of undef:
Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. In tests, avoid using undef. For example, this is considered a bad practice:
define void @fn() {
...
br i1 undef, ...
}
Please use the following instead:
define void @fn(i1 %cond) {
...
br i1 %cond, ...
}
Please refer to the Undefined Behavior Manual for more information.
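To check a single test file for the pattern the bot looks for, without running `git diff`, the regex from the command above can be applied line by line. This is only a sketch: the helper name `lines_with_undef` is hypothetical, and a plain line scan is not equivalent to git's pickaxe search, which reports changes in the number of matches between two commits.

```python
import re

# Regex copied from the bot's `git diff -S` command.
UNDEF_RE = re.compile(r"([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)")


def lines_with_undef(text):
    """Return (line_number, line) pairs for lines matching the undef pattern."""
    hits = []
    # Keep line endings so that a trailing "undef" is still followed by a
    # character ("\n"), which the pattern requires.
    for number, line in enumerate(text.splitlines(keepends=True), start=1):
        if UNDEF_RE.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

Run over the "bad practice" function above, this flags the `br i1 undef, ...` line, while the `%cond` version produces no hits.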
No description provided.