Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't remove shufflevector if input might be poison #43530

Open
nunoplopes opened this issue Nov 30, 2019 · 8 comments
Open

Can't remove shufflevector if input might be poison #43530

nunoplopes opened this issue Nov 30, 2019 · 8 comments
Labels
bugzilla miscompilation

Comments

@nunoplopes
Copy link
Member

nunoplopes commented Nov 30, 2019

Bugzilla Link 44185
Version trunk
OS All
Blocks #47292
CC @aemerson,@efriedma-quic,@hfinkel,@zhengyang92,@RKSimon,@MattPD,@regehr,@rotateright,@yuanfang-chen

Extended Description

Test case from Transforms/InstCombine/insert-extract-shuffle.ll:

define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %xv = insertelement <4 x float> %q, float %x, i32 2
  %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef }
  ret <4 x float> %r
}
=>
define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %r = insertelement <4 x float> %y, float %x, i32 1
  ret <4 x float> %r
}
Transformation doesn't verify!
ERROR: Target is more poisonous than source

Example:
float %x = poison
<4 x float> %y = < poison, poison, poison, poison >
<4 x float> %q = < poison, poison, poison, poison >

Source:
<4 x float> %xv = < poison, poison, poison, poison >
<4 x float> %r = < poison, poison, poison, undef >

Target:
<4 x float> %r = < poison, poison, poison, poison >
Source value: < poison, poison, poison, undef >
Target value: < poison, poison, poison, poison >

This is with the semantics of LangRef, where an undef mask stops propagation of poison. This implies this kind of shufflevectors can't be removed.

@nunoplopes
Copy link
Member Author

nunoplopes commented Nov 30, 2019

Similar optimization in Transforms/InstCombine/shuffle_select.ll:

define <4 x i32> @add_undef_mask_elt(<4 x i32> %v) {
  %b = add <4 x i32> %v, { 11, 12, 13, 14 }
  %s = shufflevector <4 x i32> %b, <4 x i32> %v, <4 x i32> { 0, 5, undef, 7 }
  ret <4 x i32> %s
}
=>
define <4 x i32> @add_undef_mask_elt(<4 x i32> %v) {
  %s = add <4 x i32> %v, { 11, 0, undef, 0 }
  ret <4 x i32> %s
}
Transformation doesn't verify!
ERROR: Target is more poisonous than source

Example:
<4 x i32> %v = < poison, poison, poison, poison >

Source:
<4 x i32> %b = < poison, poison, poison, poison >
<4 x i32> %s = < poison, poison, undef, poison >

Target:
<4 x i32> %s = < poison, poison, poison, poison >
Source value: < poison, poison, undef, poison >
Target value: < poison, poison, poison, poison >

@hfinkel
Copy link
Collaborator

hfinkel commented Dec 1, 2019

This is with the semantics of LangRef, where an undef mask stops propagation of poison.

This seems unfortunate. Can you please remind me why we choose these semantics? It seems to me that (add undef, poison) should be poison, not undef, no?

@nunoplopes
Copy link
Member Author

nunoplopes commented Dec 2, 2019

This is with the semantics of LangRef, where an undef mask stops propagation of poison.

This seems unfortunate. Can you please remind me why we choose these
semantics? It seems to me that (add undef, poison) should be poison, not
undef, no?

See the discussion here: #43303

Short version, this semantics is needed if we want to support things introduction of broadcast:

%t = insertelement <4 x float> undef, float %arg, i32 1
%t4 = insertelement <4 x float> %t, float %arg, i32 1
%t5 = insertelement <4 x float> %t4, float %arg, i32 2
%t6 = insertelement <4 x float> %t5, float %arg, i32 3
ret <4 x float> %t6
=>
%t = insertelement <4 x float> undef, float %arg, i32 0
%t2 = shufflevector <4 x float> %t, <4 x float> undef, <4 x i32> <i32 undef, i32 0, i32 0, i32 0>
ret <4 x float> %t2

IMHO, the correct fix is to introduce a poison value and use that to initialize vectors. Also, a poison mask would give poison, so no more of these problems of choosing "undef or poison -> undef". Using shufflevector to stop poison isn't ideal, since then we can't remove it.
BTW, I've shown in #​43958 that these bugs are exploitable for end-to-end miscompilations.

@hfinkel
Copy link
Collaborator

hfinkel commented Dec 2, 2019

This is with the semantics of LangRef, where an undef mask stops propagation of poison.

This seems unfortunate. Can you please remind me why we choose these
semantics? It seems to me that (add undef, poison) should be poison, not
undef, no?

See the discussion here: #43303

Short version, this semantics is needed if we want to support things
introduction of broadcast:

%t = insertelement <4 x float> undef, float %arg, i32 1
%t4 = insertelement <4 x float> %t, float %arg, i32 1
%t5 = insertelement <4 x float> %t4, float %arg, i32 2
%t6 = insertelement <4 x float> %t5, float %arg, i32 3
ret <4 x float> %t6
=>
%t = insertelement <4 x float> undef, float %arg, i32 0
%t2 = shufflevector <4 x float> %t, <4 x float> undef, <4 x i32> <i32
undef, i32 0, i32 0, i32 0>
ret <4 x float> %t2

IMHO, the correct fix is to introduce a poison value and use that to
initialize vectors. Also, a poison mask would give poison, so no more of
these problems of choosing "undef or poison -> undef".

That makes sense to me. I expect that in many cases where we use an 'undef' constant, a poison constant would be a better fit. It's the "I need to put a value here, but the program shouldn't ever actually depend on it" value, and at least informally, that's how undef has often been used (as opposed to "the program can depend on it, and it's arbitrary, but that won't matter to the user" undef semantics).

Using shufflevector
to stop poison isn't ideal, since then we can't remove it.
BTW, I've shown in #​43958 that these bugs are exploitable for end-to-end
miscompilations.

@aemerson
Copy link
Contributor

aemerson commented Dec 30, 2020

It's the "I need to put a value here, but the program shouldn't ever actually depend on it" value, and at least informally, that's how undef has often been used (as opposed to "the program can depend on it, and it's arbitrary, but that won't matter to the user" undef semantics).

This is a nice succinct explanation of the difference between the two, that seems to be missing from the langref's explanation of the constants.

@aqjune
Copy link
Contributor

aqjune commented Nov 27, 2021

mentioned in issue #47292

@nunoplopes
Copy link
Member Author

nunoplopes commented May 4, 2022

Juneyoung's patches so they don't get lost:
https://reviews.llvm.org/D93818
https://reviews.llvm.org/D103874

@nunoplopes
Copy link
Member Author

nunoplopes commented Jun 9, 2022

FWIW, I've switched the semantics of shufflevector in Alive2 to yield poison elements when idx is OOB.
This fixed quite a few tests, but a few regressed as well:

  • Transforms/InstCombine/broadcast.ll
  • Transforms/InstCombine/insert-const-shuf.ll
  • Transforms/InstCombine/sub-of-negatible.ll
  • Transforms/SLPVectorizer/X86/blending-shuffle-inseltpoison.ll
  • Transforms/SLPVectorizer/X86/blending-shuffle.ll
  • Transforms/SLPVectorizer/X86/crash_lencod.ll
  • Transforms/SLPVectorizer/X86/hadd.ll
  • Transforms/SLPVectorizer/X86/insert-element-build-vector-const-undef.ll
  • Transforms/SLPVectorizer/X86/sitofp.ll
  • Transforms/VectorCombine/X86/load.ll

I've inspected all of them. They are all related with mixed insertelement starting with undef vectors. I suggest we restrict these optimizations to just poison vectors as undef vectors should be very rare nowadays.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla miscompilation
Projects
None yet
Development

No branches or pull requests

4 participants