Merging r322106:
------------------------------------------------------------------------
r322106 | abataev | 2018-01-09 11:08:22 -0800 (Tue, 09 Jan 2018) | 11 lines

[COST]Fix PR35865: Fix cost model evaluation for shuffle on X86.

Summary:
If the vector type is transformed into a single non-vector type, the compiler
may crash trying to get vector information about a non-vector type.

Reviewers: RKSimon, spatel, mkuper, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41862
------------------------------------------------------------------------

llvm-svn: 322680
zmodem committed Jan 17, 2018
1 parent 40aaf37 commit 95aeee7
Showing 2 changed files with 29 additions and 1 deletion.
3 changes: 2 additions & 1 deletion llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -754,7 +754,8 @@ int X86TTIImpl::getShuffleCost(TTI::ShuffleKind Kind, Type *Tp, int Index,
   // type remains the same.
   if (Kind == TTI::SK_PermuteSingleSrc && LT.first != 1) {
     MVT LegalVT = LT.second;
-    if (LegalVT.getVectorElementType().getSizeInBits() ==
+    if (LegalVT.isVector() &&
+        LegalVT.getVectorElementType().getSizeInBits() ==
             Tp->getVectorElementType()->getPrimitiveSizeInBits() &&
         LegalVT.getVectorNumElements() < Tp->getVectorNumElements()) {

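The guard added in this hunk relies on `&&` short-circuiting: the vector-only queries run only once the legalized type is known to be a vector. A minimal standalone sketch of that pattern follows. It does not use LLVM; `ToyVT` and `isSplitWithSameElementType` are hypothetical names invented for illustration, and the asserts only stand in for whatever failure the real vector queries hit on a scalar type.

```cpp
#include <cassert>

// Hypothetical stand-in for a legalized value type. This is NOT LLVM's MVT;
// it only mimics the idea that vector-only queries are invalid on scalars.
struct ToyVT {
  bool Vector;       // true if the type is a vector
  unsigned ElemBits; // element width in bits
  unsigned NumElems; // element count; meaningful only for vectors

  bool isVector() const { return Vector; }

  unsigned getVectorElementBits() const {
    assert(Vector && "vector query on a non-vector type");
    return ElemBits;
  }

  unsigned getVectorNumElements() const {
    assert(Vector && "vector query on a non-vector type");
    return NumElems;
  }
};

// Returns true when the legalized type is a vector with the same element
// width as the original type but fewer elements. The isVector() check comes
// first, so the vector-only accessors are never reached for a scalar type.
bool isSplitWithSameElementType(const ToyVT &Legal, unsigned OrigElemBits,
                                unsigned OrigNumElems) {
  return Legal.isVector() &&
         Legal.getVectorElementBits() == OrigElemBits &&
         Legal.getVectorNumElements() < OrigNumElems;
}
```

With a scalar `ToyVT{false, 32, 0}`, the function returns false without touching the asserting accessors. Dropping the `isVector()` check would trip the assertion instead, which is the toy analogue of the crash described in the summary.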
27 changes: 27 additions & 0 deletions llvm/test/Transforms/SLPVectorizer/X86/PR35865.ll
@@ -0,0 +1,27 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt -slp-vectorizer < %s -S -o - -mtriple=x86_64-apple-macosx10.10.0 -mcpu=core2 | FileCheck %s

define void @_Z10fooConvertPDv4_xS0_S0_PKS_() {
; CHECK-LABEL: @_Z10fooConvertPDv4_xS0_S0_PKS_(
; CHECK-NEXT: entry:
; CHECK-NEXT: [[TMP0:%.*]] = extractelement <16 x half> undef, i32 4
; CHECK-NEXT: [[CONV_I_4_I:%.*]] = fpext half [[TMP0]] to float
; CHECK-NEXT: [[TMP1:%.*]] = bitcast float [[CONV_I_4_I]] to i32
; CHECK-NEXT: [[VECINS_I_4_I:%.*]] = insertelement <8 x i32> undef, i32 [[TMP1]], i32 4
; CHECK-NEXT: [[TMP2:%.*]] = extractelement <16 x half> undef, i32 5
; CHECK-NEXT: [[CONV_I_5_I:%.*]] = fpext half [[TMP2]] to float
; CHECK-NEXT: [[TMP3:%.*]] = bitcast float [[CONV_I_5_I]] to i32
; CHECK-NEXT: [[VECINS_I_5_I:%.*]] = insertelement <8 x i32> [[VECINS_I_4_I]], i32 [[TMP3]], i32 5
; CHECK-NEXT: ret void
;
entry:
%0 = extractelement <16 x half> undef, i32 4
%conv.i.4.i = fpext half %0 to float
%1 = bitcast float %conv.i.4.i to i32
%vecins.i.4.i = insertelement <8 x i32> undef, i32 %1, i32 4
%2 = extractelement <16 x half> undef, i32 5
%conv.i.5.i = fpext half %2 to float
%3 = bitcast float %conv.i.5.i to i32
%vecins.i.5.i = insertelement <8 x i32> %vecins.i.4.i, i32 %3, i32 5
ret void
}
