Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InstSimplify unconditionally executed trapping constant expression in phi fold #49839

Closed
zhendongsu opened this issue May 26, 2021 · 28 comments
Closed
Assignees
Labels
bugzilla Issues migrated from bugzilla confirmed Verified by a second party ipo Interprocedural optimizations miscompilation

Comments

@zhendongsu
Copy link

zhendongsu commented May 26, 2021

Bugzilla Link 50495
Version trunk
OS All
CC @rotateright

Extended Description

[585] % clangtk -v
clang version 13.0.0 (https://github.com/llvm/llvm-project.git 5dd86aa)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /local/suz-local/opfuzz/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@MX32
Selected multilib: .;@m64
[586] %
[586] % clangtk -O2 small.c; ./.aout
bash: ./.aout: No such file or directory
[587] %
[587] %
[587] % clangtk -v
clang version 13.0.0 (https://github.com/llvm/llvm-project.git 5dd86aa)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /local/suz-local/opfuzz/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@MX32
Selected multilib: .;@m64
clang-13: warning: argument unused during compilation: '-I /usr/local/include/csmith' [-Wunused-command-line-argument]
[588] %
[588] % clangtk -O2 small.c; ./a.out
[589] %
[589] % clangtk -O3 small.c
[590] % ./a.out
Floating point exception
[591] %
[591] % cat small.c

int a, b, e[2], f;
static char c, d, *g = &c, **h = &g;
char i(char j, char k) { return k ? j % k : 0; }
int main() {
  int *l = &e[1];
  for (; d < 1; d++) {
    int *m = &a, *o = &e[1];
    **h = l == m;
    char ***n[2] = {&h, &h};
    for (; b < 4; ++b)
      if (f != 1)
        *o = i(1, *g);
  }
  return 0;
}
@rotateright
Copy link
Contributor

This is still visible. I see srem and constant expressions in the IR, so some pass is likely hoisting an op without checking that it is safe to speculate.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021
@llvmbot llvmbot added the confirmed Verified by a second party label Jan 26, 2022
@fhahn
Copy link
Contributor

fhahn commented Mar 21, 2022

Still reproduces: https://godbolt.org/z/fx5j9zh49

According to GodBolt this started happening between 11.0 and 12.0: https://godbolt.org/z/hPrEb6a5E

@fhahn
Copy link
Contributor

fhahn commented Mar 22, 2022

The culprit seems to be instsimplify: https://godbolt.org/z/9xW8d3a1e

For the input, it sinks the srem to the a constant expression in the phi node in the successor block, which will then be unconditionally executed after lowering in the backend. The transformation is invalid according to Alive2

@e = external global [2 x i32]
@a = external global i32

define <2 x i8> @main(i1 %c) {
entry:
  br i1 %c, label %then, label %exit

then:
  %srem = srem i8 1, zext (i1 icmp eq (i32* getelementptr inbounds ([2 x i32], [2 x i32]* @e, i64 0, i64 1), i32* @a) to i8)
  %ins = insertelement <2 x i8> zeroinitializer, i8 %srem, i64 0
  br label %exit

exit:
  %p = phi <2 x i8> [ zeroinitializer, %entry ], [ %ins, %then ]
  ret <2 x i8> %p
}

@fhahn fhahn changed the title wrong code at -O3 on x86_64-linux-gnu InstSimplify sinks trapping constant expression into phi node (wrong code at -O3 on x86_64-linux-gnu) Mar 22, 2022
@fhahn
Copy link
Contributor

fhahn commented Mar 22, 2022

I think instsimplify shouldn't move the code, but I'm tagging a couple of people for additional opinions about movement of constant expressions that may trap. @nikic @nunoplopes @preames @rotateright @LebedevRI

@rotateright
Copy link
Contributor

Thanks for tracking this down!
Yes, it seems likely that we should just bail out of the transform.
I didn't step through this example yet - can we use isSafeToSpeculativelyExecute on the path where the transform is attempted? That's what we use to guard against moving div/rem ops in other passes.

@nikic
Copy link
Contributor

nikic commented Mar 22, 2022

Hm, I'm not so sure here. Intuitively I would say this transform is correct, because phi values are materialized on the incoming edge, not in the phi block. LangRef says:

For the purposes of the SSA form, the use of each incoming value is deemed to occur on the edge from the corresponding predecessor block to the current block (but after any definition of an ‘invoke’ instruction’s return value on the same edge).

I would say it's a backend bug if the backend materializes a trapping constant expression in the phi block. It should happen on the edge (possibly a split critical edge).

But then alive2 says the transform is not valid, so possibly my understanding of the semantics is not correct.

@nikic
Copy link
Contributor

nikic commented Mar 22, 2022

Thinking about this a bit more, I think this transform must be valid, otherwise doing RAUW with a constant folding result would be illegal, which would be very bad. I think the alive2 modelling isn't correct in this instance, because it places the constexpr evaluation in the wrong block.

@fhahn
Copy link
Contributor

fhahn commented Mar 22, 2022

But then alive2 says the transform is not valid, so possibly my understanding of the semantics is not correct.

Yes that was my original thought as well, the incoming value should only execute when the edge is executed.

@LebedevRI
Copy link
Member

I agree that as per langref the backend is at fault here.
Filed AliveToolkit/alive2#781

@nunoplopes
Copy link
Member

What if a phi BB's predecessor has multiple successors?
There's no execution on an edge, only on BBs. Do we assume the execution happens in the predecessor on in an imaginary BB in-between?
e.g.:

br %cond, then, else

then:
phi [ div... ]

is this the same as:

%x = div ...
br %cond, then, else

then:
phi [ %x ]

or:

br %cond, then_0, else

then_0:
%x = div ...
br then

then:
phi [ %x ]

@nikic
Copy link
Contributor

nikic commented Mar 22, 2022

@nunoplopes The latter. If the edge is critical, it needs to be split.

@fhahn fhahn changed the title InstSimplify sinks trapping constant expression into phi node (wrong code at -O3 on x86_64-linux-gnu) Backend unconditionally executes incoming trapping constant expression from phi (wrong code at -O3 on x86_64-linux-gnu) Mar 22, 2022
@nunoplopes
Copy link
Member

@nunoplopes The latter. If the edge is critical, it needs to be split.

Thank you, makes sense! Let me try to implement that in Alive2..

@nikic
Copy link
Contributor

nikic commented Apr 12, 2022

I looked a bit closer into the original reproducer, and the problem here is in InstSimplify after all, just in a different place:

@g = extern_weak global i32

define i64 @test(i1 %c) {
entry:
 br i1 %c, label %if, label %join

if:
 br label %join

join:
 %phi = phi i64 [ poison, %if ], [ srem (i64 1, i64 ptrtoint (i32* @g to i64)) , %entry ]
 ret i64 %phi
}

; RUN: opt -S -instsimplify < %s

@g = extern_weak global i32

define i64 @test(i1 %c) {
entry:
 br i1 %c, label %if, label %join

if:                                               ; preds = %entry
 br label %join

join:                                             ; preds = %if, %entry
 ret i64 srem (i64 1, i64 ptrtoint (i32* @g to i64))
}

The phi with poison fold ends up executing the constant expression unconditionally, rather than only on one branch.

@nikic nikic self-assigned this Apr 12, 2022
@nikic nikic changed the title Backend unconditionally executes incoming trapping constant expression from phi (wrong code at -O3 on x86_64-linux-gnu) InstSimplify unconditionally executed trapping constant expression in phi fold Apr 12, 2022
@nikic nikic closed this as completed in 1d530b9 Apr 12, 2022
@zhendongsu
Copy link
Author

It looks like the original C test still fails:

[526] % clangtk -v
clang version 15.0.0 (https://github.com/llvm/llvm-project.git d941d597837d9e1405086f008c9bd6a71e7263c9)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /local/suz-local/opfuzz/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
Found CUDA installation: /usr/local/cuda, version 11.0
[527] % 
[527] % clangtk -O3 49839.c
[528] % ./a.out
Floating point exception
[529] % 
[529] % cat 49839.c 
int a, b, e[2], f;
static char c, d, *g = &c, **h = &g;
char i(char j, char k) { return k ? j % k : 0; }
int main() {
  int *l = &e[1];
  for (; d < 1; d++) {
    int *m = &a, *o = &e[1];
    **h = l == m;
    char ***n[2] = {&h, &h};
    for (; b < 4; ++b)
      if (f != 1)
        *o = i(1, *g);
  }
  return 0;
}

Compiler Explorer: https://godbolt.org/z/f5G5r16x5

@nikic
@fhahn

@nunoplopes nunoplopes reopened this Jun 12, 2022
@nunoplopes
Copy link
Member

nunoplopes commented Jun 12, 2022

According to alivecc, this is a bug in InstSimplify.
It seems that we have a constant expression with an srem in a phi node in one of the entries, and poison in other.
We copy the expression to both phi entries, and then we get UB on the other branch. I guess the fix is to avoid merging phi entries when there are side-effects.

Before:

%vector.body:
  %index = phi i32 [ 0, %vector.ph ], [ %index.next, %pred.srem.continue24 ]
  %11 = phi i32 [ %3, %vector.ph ], [ %28, %pred.srem.continue24 ]
  br i1 %7, label %pred.srem.if, label %pred.srem.continue

%pred.srem.continue:
  %14 = phi <4 x i8> [ poison, %vector.body ], [ %13, %pred.srem.if ]
  br i1 %8, label %pred.srem.if19, label %pred.srem.continue20

%pred.srem.if:
  %__constexpr_5 = gep inbounds ptr @e, 8 x i64 0, 4 x i64 1
  %__constexpr_4 = icmp eq ptr %__constexpr_5, @a
  %__constexpr_3 = zext i1 %__constexpr_4 to i8
  %12 = srem i8 1, %__constexpr_3
  %13 = insertelement <4 x i8> poison, i8 %12, i64 0
  br label %pred.srem.continue

%pred.srem.continue20:
  %17 = phi <4 x i8> [ %14, %pred.srem.continue ], [ %16, %pred.srem.if19 ]

After:

%vector.body:
  %index = phi i32 [ 0, %vector.ph ], [ %index.next, %pred.srem.continue24 ]
  %4 = phi i32 [ %3, %vector.ph ], [ %14, %pred.srem.continue24 ]
  %__constexpr_1 = gep inbounds ptr @e, 8 x i64 0, 4 x i64 1
  %__constexpr_0 = icmp eq ptr %__constexpr_1, @a
  br i1 %__constexpr_0, label %pred.srem.if, label %pred.srem.continue

%pred.srem.continue:
  %__constexpr_3 = gep inbounds ptr @e, 8 x i64 0, 4 x i64 1
  %__constexpr_2 = icmp eq ptr %__constexpr_3, @a
  br i1 %__constexpr_2, label %pred.srem.if19, label %pred.srem.continue_%pred.srem.continue20

%pred.srem.continue_%pred.srem.continue20:
  %__constexpr_34 = gep inbounds ptr @e, 8 x i64 0, 4 x i64 1
  %__constexpr_33 = icmp eq ptr %__constexpr_34, @a
  %__constexpr_32 = zext i1 %__constexpr_33 to i8       ; 0
  %__constexpr_31 = srem i8 1, %__constexpr_32          ; UB
  %__copy_1 = <4 x i8> { %__constexpr_31, poison, poison, poison }
  br label %pred.srem.continue20

%pred.srem.continue20:
  %5 = phi <4 x i8> [ %__copy_1, %pred.srem.continue_%pred.srem.continue20 ], [ %__copy_2, %pred.srem.if19 ]

We may need c-smith + alivecc to minimize this thing (@regehr).

@nunoplopes
Copy link
Member

The code looks good, I dunno where the bug might be:

// We cannot start executing a trapping constant expression on more control
// flow paths.
auto *CE = dyn_cast<ConstantExpr>(CommonValue);
if (CE && CE->canTrap())
return nullptr;

@nunoplopes
Copy link
Member

Reduced test case:

@e = external global [2 x i32]
@a = external global i32

define i8 @main() {
pred.srem.continue8:
  br i1 false, label %pred.srem.if9, label %pred.srem.continue10

pred.srem.if9:
  %sr = srem i8 1, zext (i1 icmp eq (ptr getelementptr inbounds ([2 x i32], ptr @e, i64 0, i64 1), ptr @a) to i8)
  %v = insertelement <4 x i8> zeroinitializer, i8 %sr, i64 3
  br label %pred.srem.continue10

pred.srem.continue10:
  %phi = phi <4 x i8> [ poison, %pred.srem.continue8 ], [ %v, %pred.srem.if9 ]
  %r = extractelement <4 x i8> %phi, i64 3
  br label %middle.block

middle.block:
  ret i8 %r
}

InstSimplify changes:

@@ -6,15 +6,11 @@
   br i1 false, label %pred.srem.if9, label %pred.srem.continue10

 pred.srem.if9:
-  %sr = srem i8 1, zext (i1 icmp eq (ptr getelementptr inbounds ([2 x i32], ptr @e, i64 0, i64 1), ptr @a) to i8)
-  %v = insertelement <4 x i8> zeroinitializer, i8 %sr, i64 3
   br label %pred.srem.continue10

 pred.srem.continue10:
-  %phi = phi <4 x i8> [ poison, %pred.srem.continue8 ], [ %v, %pred.srem.if9 ]
-  %r = extractelement <4 x i8> %phi, i64 3
   br label %middle.block

 middle.block:
-  ret i8 %r
+  ret i8 srem (i8 1, i8 zext (i1 icmp eq (ptr getelementptr inbounds ([2 x i32], ptr @e, i64 0, i64 1), ptr @a) to i8))
 }

@nikic
Copy link
Contributor

nikic commented Jun 13, 2022

This is because I only checked for ConstantExpr in

auto *CE = dyn_cast<ConstantExpr>(CommonValue);
, but this should be done for all Constant. In this case it's a trapping constant expression inside an aggregate constant.

@nikic nikic closed this as completed in 7e64a29 Jun 13, 2022
@nikic
Copy link
Contributor

nikic commented Jun 13, 2022

Okay, this was a bit trickier than expected, because the canTrap() API itself also didn't account for constant aggregates. The second InstSimplify issue is fixed with 7e64a29.

...however, testing the original C reproducer just now, it still fails, so there must be another issue somewhere.

@nikic nikic reopened this Jun 13, 2022
@zhendongsu
Copy link
Author

Perhaps #55999 is also related although it doesn't reproduce with 13.* and 14.*?

@nunoplopes
Copy link
Member

@zhendongsu it's hard to tell without looking at the IR. You can use clang -S -emit-llvm + llvm-reduce to reduce at LLVM IR level. Alive2 can then tell you which optimization is wrong.
Anyway, just saying that it would be really nice if you guys switched your infrastructure to report bugs in LLVM IR, not C, as the minimal repro in LLVM IR is often much smaller than the C equivalent.

@nunoplopes
Copy link
Member

It's SimplifyCFG now:

@e = external global [2 x i32]
@a = external global i32

define i32 @main(i1 %0) {
vector.body:
  br i1 %0, label %pred.srem.if, label %pred.srem.continue

pred.srem.if:
  br label %pred.srem.continue

pred.srem.continue:
  %1 = phi <4 x i8> [ zeroinitializer, %vector.body ], [ <i8 srem (i8 1, i8 zext (i1 icmp eq (ptr getelementptr inbounds ([2 x i32], ptr @e, i64 0, i64 1), ptr @a) to i8)), i8 poison, i8 poison, i8 poison>, %pred.srem.if ]
  ret i32 0
}

gets transformed to:

define i32 @main(i1 %0) {
vector.body:
  %spec.select = select i1 %0, <4 x i8> <i8 srem (i8 1, i8 zext (i1 icmp eq (ptr getelementptr inbounds ([2 x i32], ptr @e, i64 0, i64 1), ptr @a) to i8)), i8 poison, i8 poison, i8 poison>, <4 x i8> zeroinitializer
  ret i32 0
}

@nikic
Copy link
Contributor

nikic commented Jun 13, 2022

Yeah, we have quite a few places with the same basic issue, they're checking canTrap() on ConstantExpr rather than Constant.

@nikic
Copy link
Contributor

nikic commented Jun 13, 2022

Fixed some SimplifyCFG issues in 571c713, but the C reproducer is still broken.

I still see an incorrect usage in SelectionDAGIsel, will check if fixing that helps.

@nikic
Copy link
Contributor

nikic commented Jun 13, 2022

I fixed the SelectionDAG issue in b9a7dea, but it did not help.

@nunoplopes
Copy link
Member

Thanks @nikic!

There's still a problem in SimplifyCFG somehow:

@e = external global [2 x i32]
@a = external global i32

define i32 @main(i1 %0) {
vector.body:
  br i1 %0, label %pred.srem.if, label %pred.srem.continue

pred.srem.if:
  br label %pred.srem.continue

pred.srem.continue:
  %1 = phi <4 x i8> [ poison, %vector.body ], [ <i8 srem (i8 1, i8 zext (i1 icmp eq (ptr getelementptr inbounds ([2 x i32], ptr @e, i64 0, i64 1), ptr @a) to i8)), i8 poison, i8 poison, i8 poison>, %pred.srem.if ]
  br label %pred.srem.continue6

pred.srem.continue6:
  br i1 %0, label %pred.srem.if7, label %pred.srem.continue8

pred.srem.if7:
  %2 = insertelement <4 x i8> zeroinitializer, i8 0, i64 0
  br label %pred.srem.continue8

pred.srem.continue8:
  %3 = phi <4 x i8> [ %1, %pred.srem.continue6 ], [ zeroinitializer, %pred.srem.if7 ]
  ret i32 0
}

@nikic
Copy link
Contributor

nikic commented Jun 14, 2022

I've put up an RFC to address the root cause: https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179

@nunoplopes
Copy link
Member

@nikic fixed this already.

mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
…839)

Folding this case would result in the constant expression being
executed unconditionally, which may introduce a new trap.

Fixes llvm/llvm-project#49839.
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
Unfortunately, it's not just constant expressions that can trap,
we might also have a trapping constant expression nested inside
a constant aggregate.

Perform the check during phi folding on Constant rather than
ConstantExpr, and extend the Constant::mayTrap() implementation
to also recursive into ConstantAggregates, not just ConstantExprs.

Fixes llvm/llvm-project#49839.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla confirmed Verified by a second party ipo Interprocedural optimizations miscompilation
Projects
None yet
Development

No branches or pull requests

7 participants