-
Notifications
You must be signed in to change notification settings - Fork 10.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Loop unswitch and GVN interact badly and miscompile #31000
Comments
Are you sure it's actually GVN? |
Which would be much worse of course, because it would mean it affected more passes |
The most aggressive thing we can say at this time is that the bug is in (GVN + LoopUnswitch). It is difficult to narrow it down further today without formalizing undef. :) In the new poison formalization, the bug is not in GVN, but in loop unswitch.
In brief (at the risk of potentially sounding misleading): if, in today's undef formalization we had to decide the bug was in GVN (i.e. change the definition such that the bug was not in LoopUnswitch), then the blame would fall on GVN::propagateEquality. |
Except propagateEquality does not do anything with undef. |
It does not have to do anything specifically with undef. The bug is more about taking into account the possibility of some value being undef either at runtime, or discovered to be undef after optimization that happens later. I believe GVN::propagateEquality does this transform: if (x == y) { => if (x == y) { Now if x was 1, and y was undef (at runtime or discovered to be undef later, during GVN it is a normal llvm::Value that you don't otherwise have information about), what did the transform just do? One interpretation is that with x == 1 and y == undef, the original program, if (1 == undef) { is undefined because it branches on However, under this interpretation of branching on undef, loop unswitch is incorrect, since it effectively "hoists" branches "out of control flow": for (;;) { to if (c1) { Given the semantic "br undef" is UB, we've miscompiled the program if c0 was false and c1 was undef since we transformed a program that would never branch on undef (no UB) to one that does (has UB). The other interpretation is that LoopUnswitch is correct, and branch on undef just picks one of its successors at random. However, this means GVN::propagateEquality is incorrect, since if you have: if (x == y) { => if (x == y) { If x was 1 and y was undef, the initial program would (at random) either do a division by 1 or not. However, after the transform it would (at random) divide by undef or not. But dividing by undef is undefined behavior, and by doing the transform you've introduced UB in a correct program.
|
First, "I believe this test case has no UB even at C level. It should print "f(10) = 20". You are simply taking the address to avoid this clause: "If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined." This seems more a bug in the standard than the compiler ;) Which means when i see programs like this, my main answer is "hey, that's super-interesting that you found this loophole. But i don't have interest ATM in trying to make the compiler do something sane here, you should just fix your variable". I certainly realize your goal with the freeze proposal is to give meaning to the indeterminate value that maybe_undef_loc represents here. But htere are plenty of cases where the compiler doesn't do a thing that the standard says because it is insane or broken. Past that, if i had to choose, it's loop unswitch that's broken here. I actually have pretty simple reasoning: Any transformation that makes it so that requires extra work to try to handle "undef" or any other value, is broken. Because it would be nonsensical. That is, given if (x==y), it must always be legal to replace x with y inside any dominated part of the if condition. If we've created semantics that make that untrue, we should fix them eventually But i'm also of the mind that it's not worth losing sleep over this case at all. IE i don't think it's really worth making loop unswitch try to prove the non-undefness of this variable just to fix this case. |
Nope. The maybe_init(&maybe_undef_loc) call is just to let me order optimizations in a way that I can trigger the micompile in an easily understood manner. The program never actually reads from maybe_undef_loc; that read is guarded under a condition that is always false. So even if the you drop the clause you quoted, as long as taking the address of an uninitialized stack variable is well defined this is a miscompile. That is why I have the second "opt -S -O2". That second run runs the inliner, and makes it "obvious" that maybe_undef_loc is uninitialized. But the we've already miscompiled the program.
That's what we went with. |
I'm kinda surprised, i would never have bet this is from real code. THis is what gcc does: " Which in turn determines if it can prove it's initialized (by being an argument, a hard register variable, a returned reference, or has a defining statement). (we could do better using IPA to see if it's always inited through IPA dataflow) |
It's not "real real code" (i.e. it is synthetic, not reduced from some larger miscompiling test case), but it isn't relying on some dusty corner case of the standard either. :)
That's kind of hard to do at the IR level -- unless you can know that a llvm::Value is a constant, or a some simple expression involving constants, it can always be undef/poison. The "freeze" operator lets us "cast away" the potential undefined-ness in the cases where loop unswitch cannot prove the definedness of the value it is unswitching on, giving us an expression safe to switch on.
In LLVM IR both arguments and defining statements (e.g. loads, shifts, adds etc.) can result in poison and undef. |
" I realize this limits the usefulness of loop unswitching ATM. I don't know how else you'd ever fix this testcase right now. :) (since freeze/etc is a larger change) |
and as for "not a dusty corner caseæ argument, saying "well, the condition is false, so it has no undefined behavior", is technically true, but that's kind of pointless in a large way. In the example, there is no useful case for the variable to be uninitialized, IMHO, and any case in which it is a bug. Do you have an example case where, lacking the condition, the program is both usefully using it and it's not a bug? It is definitely a bug if the condition is not false, and while it's true that it would be nice to not introduce this bug to the case where it is always false, i still have a lot of trouble saying "well, it's super important we get this right". (But i'll admit I feel this way about most cases where people have hidden undefined behavior like this :P. I think it would be just fine for the standard to say it's undefined if it ends up unitialized, whether or not it's ever evaluated) |
Yes, there is no way we can truly fix the problem in today's IR. However, I personally don't have the bandwidth at this time to take on pessimizing loop unswitch. :) |
I suppose you could have cases like: SmallVector<int, 4> Vect; // 4 uninitialized ints // After inlining inlined body if (Vect.size() > 2 && Vect[1] != 400) { Here Another example is how we use PatternMatch.h. One idiomatic use tends to be (with references desugared to pointers): Value *V; // Sometimes people will set V to nullptr, but sometimes they won't
IIUC you're saying all stack variables have to be initialized on exit from the function. If so, I don't think that is idiomatic in C++ but I will admit I'm only familiar with a handful of C++ codebases. |
Well, actually, i'd just change the standard to definitely init them to zero, and then our only problem is proving that we don't have to init them to zero because it's overwritten by some other value :) I doubt the performance consequences are really that high these days ... |
Hey, it's your bug, if you want to leave it open, SGTM :) |
Thanks Daniel for your feedback (and for the gcc ref)! At least we agree that loop unswitch has a problem :) Freeze is indeed the best fix for loop unswitching that we could come up with. BTW, MSVC has an easy strategy to deal with undef values. Since their pipeline is (mostly) fixed, after inlining and all the inter-proc analyses they know there won't be any more undefs flowing through function arguments or global variables. Therefore they can mark those variables as defined. |
If this issue is addressed, we'll need to remove this snippet of code as well: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp#L2508 |
The old loop unswitch pass will be removed in https://reviews.llvm.org/D124376 . |
I think we can consider this issue closed, once D124376 lands. |
This is removed via 40a2e35 |
Extended Description
Loop unswitch and GVN when taken together can miscompile code due to disagreeing semantics of branch on undef.
In our opinion it's loop unswitch that is incorrect.
Test case in C:
static volatile int always_false = 0;
void maybe_init(int *p) {}
int f(int limit0) {
int maybe_undef_loc;
maybe_init(&maybe_undef_loc);
int limit = limit0;
int total = 0;
for (int i = 0; i < limit; i++) {
total++;
if (always_false) {
if (maybe_undef_loc != (limit0 + 10)) {
total++;
}
}
limit = limit0 + 10;
}
return total;
}
int printf(const char *, ...);
int main(int argc, char **argv) {
printf("f(10) = %d\n", f(10));
return 0;
}
I believe this test case has no UB even at C level. It should print "f(10) = 20".
I've attached a reduced IR test case.
Running LICM, the comparison of maybe_undef_loc gets hoisted:
$ opt -S bug.ll -loop-rotate -licm
...
loop.body.lr.ph:
%undef_loc = load i32, i32* %maybe_undef_loc, align 4
%add = add nsw i32 %limit0, 10
%cmp3 = icmp ne i32 %undef_loc, %add
%limit2 = add nsw i32 %limit0, 10
br label %loop.body
...
This is still ok. But then adding loop unswitch:
$ opt -S bug.ll -loop-rotate -licm -loop-unswitch
...
loop.body.lr.ph:
%undef_loc = load i32, i32* %maybe_undef_loc, align 4
%add = add nsw i32 %limit0, 10
%cmp3 = icmp ne i32 %undef_loc, %add
%limit2 = add nsw i32 %limit0, 10
br i1 %cmp3, label %loop.body.lr.ph.split.us, label %loop.body.lr.ph.loop.body.lr.ph.split_crit_edge
...
Loop unswitch introduce a branch on %cmp3, which will be undef once we inline maybe_init(). Hence loop unswitch assumes branch on undef is a non-deterministic jump (so not UB).
By running GVN after loop unswitch, we clearly see that GVN has a different perspective on the semantics of branch on undef. It produces this diff:
...
I'm not including the context here, but to be able to justify this transformation, branch on undef has to be UB, not a non-deterministic jump as assumed by loop unswitch above.
Putting all things together:
$ opt -S bug.ll -loop-rotate -licm -loop-unswitch -gvn | opt -S -O2 | llc -o x.S && clang x.S -o x && ./x
f(10) = 1
This is wrong. Should be 20, not 1.
This example shows how Loop unswitch and GVN can produce a miscompilation end-to-end. The bug can be fixed by using the proposed freeze instruction in loop unswitch.
(Example by Sanjoy)
The text was updated successfully, but these errors were encountered: