Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loop unswitch and GVN interact badly and miscompile #31000

Closed
nunoplopes opened this issue Jan 16, 2017 · 20 comments
Closed

Loop unswitch and GVN interact badly and miscompile #31000

nunoplopes opened this issue Jan 16, 2017 · 20 comments
Labels

Comments

@nunoplopes
Copy link
Member

Bugzilla Link 31652
Version trunk
OS All
Blocks #27880
Attachments Reduced IR test case
CC @chandlerc,@majnemer,@aqjune,@regehr,@sanjoy

Extended Description

Loop unswitch and GVN when taken together can miscompile code due to disagreeing semantics of branch on undef.
In our opinion it's loop unswitch that is incorrect.

Test case in C:
static volatile int always_false = 0;

void maybe_init(int *p) {}

int f(int limit0) {
int maybe_undef_loc;
maybe_init(&maybe_undef_loc);

int limit = limit0;
int total = 0;
for (int i = 0; i < limit; i++) {
total++;
if (always_false) {
if (maybe_undef_loc != (limit0 + 10)) {
total++;
}
}
limit = limit0 + 10;
}
return total;
}

int printf(const char *, ...);

int main(int argc, char **argv) {
printf("f(10) = %d\n", f(10));
return 0;
}

I believe this test case has no UB even at C level. It should print "f(10) = 20".

I've attached a reduced IR test case.

Running LICM, the comparison of maybe_undef_loc gets hoisted:

$ opt -S bug.ll -loop-rotate -licm
...
loop.body.lr.ph:
%undef_loc = load i32, i32* %maybe_undef_loc, align 4
%add = add nsw i32 %limit0, 10
%cmp3 = icmp ne i32 %undef_loc, %add
%limit2 = add nsw i32 %limit0, 10
br label %loop.body
...

This is still ok. But then adding loop unswitch:

$ opt -S bug.ll -loop-rotate -licm -loop-unswitch
...
loop.body.lr.ph:
%undef_loc = load i32, i32* %maybe_undef_loc, align 4
%add = add nsw i32 %limit0, 10
%cmp3 = icmp ne i32 %undef_loc, %add
%limit2 = add nsw i32 %limit0, 10
br i1 %cmp3, label %loop.body.lr.ph.split.us, label %loop.body.lr.ph.loop.body.lr.ph.split_crit_edge
...

Loop unswitch introduce a branch on %cmp3, which will be undef once we inline maybe_init(). Hence loop unswitch assumes branch on undef is a non-deterministic jump (so not UB).
By running GVN after loop unswitch, we clearly see that GVN has a different perspective on the semantics of branch on undef. It produces this diff:

  • %limit2 = add nsw i32 %limit0, 10
    ...
  • %cmp = icmp slt i32 %i2, %limit2
  • %cmp = icmp slt i32 %i2, %undef_loc

I'm not including the context here, but to be able to justify this transformation, branch on undef has to be UB, not a non-deterministic jump as assumed by loop unswitch above.

Putting all things together:
$ opt -S bug.ll -loop-rotate -licm -loop-unswitch -gvn | opt -S -O2 | llc -o x.S && clang x.S -o x && ./x
f(10) = 1

This is wrong. Should be 20, not 1.

This example shows how Loop unswitch and GVN can produce a miscompilation end-to-end. The bug can be fixed by using the proposed freeze instruction in loop unswitch.

(Example by Sanjoy)

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 17, 2017

Are you sure it's actually GVN?
GVN uses a lot of the instruction simplification and CFG simplification utilities, my guess is it's really down there in one of them :)

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 17, 2017

Which would be much worse of course, because it would mean it affected more passes

@sanjoy
Copy link
Contributor

sanjoy commented Jan 17, 2017

Are you sure it's actually GVN?

The most aggressive thing we can say at this time is that the bug is in (GVN + LoopUnswitch). It is difficult to narrow it down further today without formalizing undef. :)

In the new poison formalization, the bug is not in GVN, but in loop unswitch.

GVN uses a lot of the instruction simplification and CFG simplification
utilities, my guess is it's really down there in one of them :)

In brief (at the risk of potentially sounding misleading): if, in today's undef formalization we had to decide the bug was in GVN (i.e. change the definition such that the bug was not in LoopUnswitch), then the blame would fall on GVN::propagateEquality.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 17, 2017

Except propagateEquality does not do anything with undef.
Only utilities it calls do.
Otherwise, the only undef handling it has is to treat values from unreachable blocks as undef.

@sanjoy
Copy link
Contributor

sanjoy commented Jan 17, 2017

Except propagateEquality does not do anything with undef.

It does not have to do anything specifically with undef. The bug is more about taking into account the possibility of some value being undef either at runtime, or discovered to be undef after optimization that happens later.

I believe GVN::propagateEquality does this transform:

if (x == y) {
use(x); // divides by x
}

=>

if (x == y) {
use(y); // (now) divides by y
}

Now if x was 1, and y was undef (at runtime or discovered to be undef later, during GVN it is a normal llvm::Value that you don't otherwise have information about), what did the transform just do?

One interpretation is that with x == 1 and y == undef, the original program,

if (1 == undef) {
use(1);
}

is undefined because it branches on 1 == undef == undef. Under this interpretation GVN::propagateEquality was correct since the program was undefined originally and any transform on it is correct by definition.

However, under this interpretation of branching on undef, loop unswitch is incorrect, since it effectively "hoists" branches "out of control flow":

for (;;) {
if (c0) {
if (c1) {
}
}
}

to

if (c1) {
for (;;) {
if (c0) {
}
}
} else {
// other version ...
}

Given the semantic "br undef" is UB, we've miscompiled the program if c0 was false and c1 was undef since we transformed a program that would never branch on undef (no UB) to one that does (has UB).

The other interpretation is that LoopUnswitch is correct, and branch on undef just picks one of its successors at random.

However, this means GVN::propagateEquality is incorrect, since if you have:

if (x == y) {
use(x); // divides by x
}

=>

if (x == y) {
use(y); // (now) divides by y
}

If x was 1 and y was undef, the initial program would (at random) either do a division by 1 or not. However, after the transform it would (at random) divide by undef or not. But dividing by undef is undefined behavior, and by doing the transform you've introduced UB in a correct program.

Only utilities it calls do.
Otherwise, the only undef handling it has is to treat values from
unreachable blocks as undef.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 17, 2017

First, "I believe this test case has no UB even at C level. It should print "f(10) = 20".
"
This is, IMHO, pretty clearly abuse of the standard :)

You are simply taking the address to avoid this clause:

"If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined."

This seems more a bug in the standard than the compiler ;)

Which means when i see programs like this, my main answer is "hey, that's super-interesting that you found this loophole. But i don't have interest ATM in trying to make the compiler do something sane here, you should just fix your variable".

I certainly realize your goal with the freeze proposal is to give meaning to the indeterminate value that maybe_undef_loc represents here.

But htere are plenty of cases where the compiler doesn't do a thing that the standard says because it is insane or broken.

Past that, if i had to choose, it's loop unswitch that's broken here.

I actually have pretty simple reasoning:

Any transformation that makes it so that
if (x==y)

requires extra work to try to handle "undef" or any other value, is broken.

Because it would be nonsensical.

That is, given if (x==y), it must always be legal to replace x with y inside any dominated part of the if condition.

If we've created semantics that make that untrue, we should fix them eventually
If transformations are currently doing things that make that untrue (like loop unswitch), we should fix them for now.

But i'm also of the mind that it's not worth losing sleep over this case at all.

IE i don't think it's really worth making loop unswitch try to prove the non-undefness of this variable just to fix this case.

@sanjoy
Copy link
Contributor

sanjoy commented Jan 17, 2017

First, "I believe this test case has no UB even at C level. It should print
"f(10) = 20".
"
This is, IMHO, pretty clearly abuse of the standard :)

You are simply taking the address to avoid this clause:

"If the lvalue designates an object of automatic storage duration that could
have been declared with the register storage class (never had its address
taken), and that object is uninitialized (not declared with an initializer
and no assignment to it has been performed prior to use), the behavior is
undefined."

Nope.

The maybe_init(&maybe_undef_loc) call is just to let me order optimizations in a way that I can trigger the micompile in an easily understood manner.

The program never actually reads from maybe_undef_loc; that read is guarded under a condition that is always false. So even if the you drop the clause you quoted, as long as taking the address of an uninitialized stack variable is well defined this is a miscompile.

That is why I have the second "opt -S -O2". That second run runs the inliner, and makes it "obvious" that maybe_undef_loc is uninitialized. But the we've already miscompiled the program.

Past that, if i had to choose, it's loop unswitch that's broken here.

That's what we went with.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 17, 2017

I'm kinda surprised, i would never have bet this is from real code.
In any case, i think loop unswitch should not unswitch variables unless it can prove they have defined values.

THis is what gcc does:

"
/* Unswitching on undefined values would introduce undefined
behavior that the original program might never exercise. */
if (ssa_undefined_value_p (use, true))
return NULL_TREE;"
"

Which in turn determines if it can prove it's initialized (by being an argument, a hard register variable, a returned reference, or has a defining statement).

(we could do better using IPA to see if it's always inited through IPA dataflow)

@sanjoy
Copy link
Contributor

sanjoy commented Jan 17, 2017

I'm kinda surprised, i would never have bet this is from real code.

It's not "real real code" (i.e. it is synthetic, not reduced from some larger miscompiling test case), but it isn't relying on some dusty corner case of the standard either. :)

In any case, i think loop unswitch should not unswitch variables unless it
can prove they have defined values.

That's kind of hard to do at the IR level -- unless you can know that a llvm::Value is a constant, or a some simple expression involving constants, it can always be undef/poison.

The "freeze" operator lets us "cast away" the potential undefined-ness in the cases where loop unswitch cannot prove the definedness of the value it is unswitching on, giving us an expression safe to switch on.

Which in turn determines if it can prove it's initialized (by being an
argument, a hard register variable, a returned reference, or has a defining
statement).

In LLVM IR both arguments and defining statements (e.g. loads, shifts, adds etc.) can result in poison and undef.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 17, 2017

"
In LLVM IR both arguments and defining statements (e.g. loads, shifts, adds etc.) can result in poison and undef.
"
Sure, and i'd argue that until it can definitely prove that such a thing does not occur, through symbolic evaluation, whatever, it can't do unswitching on it.

I realize this limits the usefulness of loop unswitching ATM. I don't know how else you'd ever fix this testcase right now. :)

(since freeze/etc is a larger change)

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 17, 2017

and as for "not a dusty corner caseæ argument, saying "well, the condition is false, so it has no undefined behavior", is technically true, but that's kind of pointless in a large way. In the example, there is no useful case for the variable to be uninitialized, IMHO, and any case in which it is a bug.

Do you have an example case where, lacking the condition, the program is both usefully using it and it's not a bug?

It is definitely a bug if the condition is not false, and while it's true that it would be nice to not introduce this bug to the case where it is always false, i still have a lot of trouble saying "well, it's super important we get this right".

(But i'll admit I feel this way about most cases where people have hidden undefined behavior like this :P. I think it would be just fine for the standard to say it's undefined if it ends up unitialized, whether or not it's ever evaluated)

@sanjoy
Copy link
Contributor

sanjoy commented Jan 17, 2017

"
In LLVM IR both arguments and defining statements (e.g. loads, shifts, adds
etc.) can result in poison and undef.
"
Sure, and i'd argue that until it can definitely prove that such a thing
does not occur, through symbolic evaluation, whatever, it can't do
unswitching on it.

I realize this limits the usefulness of loop unswitching ATM. I don't know
how else you'd ever fix this testcase right now. :)

Yes, there is no way we can truly fix the problem in today's IR.

However, I personally don't have the bandwidth at this time to take on pessimizing loop unswitch. :)

@sanjoy
Copy link
Contributor

sanjoy commented Jan 17, 2017

and as for "not a dusty corner caseæ argument, saying "well, the condition
is false, so it has no undefined behavior", is technically true, but that's
kind of pointless in a large way. In the example, there is no useful case
for the variable to be uninitialized, IMHO, and any case in which it is a
bug.

Do you have an example case where, lacking the condition, the program is
both usefully using it and it's not a bug?

I suppose you could have cases like:

SmallVector<int, 4> Vect; // 4 uninitialized ints
Vect.push_back(1);

// After inlining inlined body

if (Vect.size() > 2 && Vect[1] != 400) {
// Do something
}

Here Vect.size() > 2 is the always_false part, and Vect[1] is the maybe_undef bit.

Another example is how we use PatternMatch.h. One idiomatic use tends to be (with references desugared to pointers):

Value *V; // Sometimes people will set V to nullptr, but sometimes they won't
if (match(Expr, &V)) {
// Use V;
}
// V is not initialized down this path.

(But i'll admit I feel this way about most cases where people have hidden
undefined behavior like this :P. I think it would be just fine for the
standard to say it's undefined if it ends up unitialized, whether or not
it's ever evaluated)

IIUC you're saying all stack variables have to be initialized on exit from the function. If so, I don't think that is idiomatic in C++ but I will admit I'm only familiar with a handful of C++ codebases.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 17, 2017

and as for "not a dusty corner caseæ argument, saying "well, the condition
is false, so it has no undefined behavior", is technically true, but that's
kind of pointless in a large way. In the example, there is no useful case
for the variable to be uninitialized, IMHO, and any case in which it is a
bug.

Do you have an example case where, lacking the condition, the program is
both usefully using it and it's not a bug?

I suppose you could have cases like:

SmallVector<int, 4> Vect; // 4 uninitialized ints
Vect.push_back(1);

// After inlining inlined body

if (Vect.size() > 2 && Vect[1] != 400) {
// Do something
}

Here Vect.size() > 2 is the always_false part, and Vect[1] is the
maybe_undef bit.

Another example is how we use PatternMatch.h. One idiomatic use tends to be
(with references desugared to pointers):

Value *V; // Sometimes people will set V to nullptr, but sometimes they
won't
if (match(Expr, &V)) {
// Use V;
}
// V is not initialized down this path.

(But i'll admit I feel this way about most cases where people have hidden
undefined behavior like this :P. I think it would be just fine for the
standard to say it's undefined if it ends up unitialized, whether or not
it's ever evaluated)

IIUC you're saying all stack variables have to be initialized on exit from
the function. If so, I don't think that is idiomatic in C++ but I will
admit I'm only familiar with a handful of C++ codebases.

Well, actually, i'd just change the standard to definitely init them to zero, and then our only problem is proving that we don't have to init them to zero because it's overwritten by some other value :)

I doubt the performance consequences are really that high these days ...

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 17, 2017

"
In LLVM IR both arguments and defining statements (e.g. loads, shifts, adds
etc.) can result in poison and undef.
"
Sure, and i'd argue that until it can definitely prove that such a thing
does not occur, through symbolic evaluation, whatever, it can't do
unswitching on it.

I realize this limits the usefulness of loop unswitching ATM. I don't know
how else you'd ever fix this testcase right now. :)

Yes, there is no way we can truly fix the problem in today's IR.

However, I personally don't have the bandwidth at this time to take on
pessimizing loop unswitch. :)

Hey, it's your bug, if you want to leave it open, SGTM :)

@nunoplopes
Copy link
Member Author

Thanks Daniel for your feedback (and for the gcc ref)! At least we agree that loop unswitch has a problem :)

Freeze is indeed the best fix for loop unswitching that we could come up with.
Also, please realize that LLVM/clang themselves introduce a lot of undefs (e.g., bitfields, padding, etc), so a program doesn't need to have UB for this miscompilation to happen; we just didn't look close enough to find those cases.

BTW, MSVC has an easy strategy to deal with undef values. Since their pipeline is (mostly) fixed, after inlining and all the inter-proc analyses they know there won't be any more undefs flowing through function arguments or global variables. Therefore they can mark those variables as defined.
In LLVM we've always assumed that we can have at arbitrary order, but this makes things more complicated. We could perhaps have a similar notion of "only-local-transformations-from-now-on" in the new PM?

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
@aqjune
Copy link
Contributor

aqjune commented Apr 23, 2022

If this issue is addressed, we'll need to remove this snippet of code as well: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp#L2508

@aqjune
Copy link
Contributor

aqjune commented Apr 26, 2022

The old loop unswitch pass will be removed in https://reviews.llvm.org/D124376 .

@fhahn
Copy link
Contributor

fhahn commented Apr 28, 2022

The old loop unswitch pass will be removed in https://reviews.llvm.org/D124376 .

I think we can consider this issue closed, once D124376 lands.

@fhahn fhahn closed this as completed in fb4113e Apr 29, 2022
@aqjune
Copy link
Contributor

aqjune commented Apr 30, 2022

If this issue is addressed, we'll need to remove this snippet of code as well: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp#L2508

This is removed via 40a2e35

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants