Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upSafe Rust code miscompilation due to a bug in LLVM's Global Value Numbering #45839
Comments
bstrie
added
A-LLVM
I-nominated
I-unsound 💥
T-compiler
labels
Nov 7, 2017
This comment has been minimized.
This comment has been minimized.
|
Conservatively tagging this as I-unsound, feel free to downgrade to I-wrong if necessary. |
This comment has been minimized.
This comment has been minimized.
|
We discussed this in the @rust-lang/compiler team meeting and we are thinking of closing this bug. It doesn't make sense to have a bug for every LLVM bug, and this one hasn't been observed "in the wild" for Rust code, only for artificial examples, so it's probably not adding much value. @RalfJung thanks for bringing it to our attention. If LLVM decides not to fix, we may want to re-open, though it's unclear if there is much we can practically do to prevent this sort of optimization. |
nikomatsakis
closed this
Nov 9, 2017
This comment has been minimized.
This comment has been minimized.
|
Fine for me.
I'd say this depends on their reasoning. If they somehow introduce a rule that this ought to be UB, we have a problem -- I don't see any good way to rule out such safe code. I would also be surprised if it was not possible to expand this miscompilation to trigger actual UB in safe Rust (notwithstanding the fact that miscompilations giving the wrong results in safe code aren't really less bad than UB in safe code). |
This comment has been minimized.
This comment has been minimized.
|
FWIW, we could fix this by replacing the #[bench]
fn current(b: &mut Bencher) {
let mut v = Vec::with_capacity(512);
let mut r = 0u32;
v.clear();
for _ in 0..512 {
r = r.wrapping_mul(1664525).wrapping_add(1013904223);
v.push(r);
}
b.iter(|| {
let v = black_box(&v);
let mut p = v.as_ptr();
let end = v.as_ptr().wrapping_offset(v.len() as isize);
let mut sum = 0;
while p != end {
sum += unsafe { *p };
p = p.wrapping_offset(1);
}
black_box(sum)
});
}goes from 23ns/iter up to 187ns/iter . Investigating the asm (playground), it looks like it inhibits auto-vectorisation on: pub trait WrappingOffsetExt {
fn wrapping_offset_ext(self, offset: isize) -> Self;
}
impl<T> WrappingOffsetExt for *const T {
fn wrapping_offset_ext(self, offset: isize) -> Self {
((self as isize).wrapping_add(offset * (::std::mem::size_of::<T>() as isize)) as *const T)
}
}
#[no_mangle]
pub fn via_wrapping_offset(v: &[u32]) -> u32 {
let mut p = v.as_ptr();
let end = v.as_ptr().wrapping_offset(v.len() as isize);
let mut sum = 0;
while p != end {
sum += unsafe { *p };
p = p.wrapping_offset(1);
}
sum
}
#[no_mangle]
pub fn via_integer(v: &[u32]) -> u32 {
let mut p = v.as_ptr();
let end = v.as_ptr().wrapping_offset_ext(v.len() as isize);
let mut sum = 0;
while p != end {
sum += unsafe { *p };
p = p.wrapping_offset_ext(1);
}
sum
}
fn main() {}Edit: Fixed lack of |
This comment has been minimized.
This comment has been minimized.
That's what I mentioned in my previous commit; however, your patch is slightly wrong: You need to multiply |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Not sure which commit that, I must have missed it, sorry.
D'oh, sorry. Numbers and conclusions stay exactly the same. @eddyb |
RalfJung
referenced this issue
Nov 10, 2017
Closed
Clarify semantics of `wrapping_offset` when leaving object bounds #45719
This comment has been minimized.
This comment has been minimized.
|
@eddyb thanks for the pointer to #25434. It seems that |
This comment has been minimized.
This comment has been minimized.
regehr
commented
Nov 13, 2017
|
Can I ask if you (Rust compiler) folks have looked over the emitted IR for this one super carefully to make sure it doesn't contain anything that gives LLVM an easy excuse to invoke UB? |
This comment has been minimized.
This comment has been minimized.
|
Obviously, the entire reason there is UB here is because LLVM's semantics are not clear enough. However, the 2 "known" exploits of this bug rely either on out-of-bounds |
This comment has been minimized.
This comment has been minimized.
regehr
commented
Nov 13, 2017
|
@arielb1 I was trying to politely ask if y'all are completely sure the Rust compiler isn't doing something stupid. |
This comment has been minimized.
This comment has been minimized.
|
@regehr Yes; I've looked over the IR. Also, there is a C version of the same bug reported here which doesn't appear to invoke any undefined behavior either. The underlying problem appears to replacing if (p == q) {
*p = 42;
}In LLVM, In C terms, the equality comparison may succeed when "one [operand] is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space" (6.5.9p6), but the dereference of the one-past-the-end pointer isn't valid despite being equal to a pointer which would have a valid dereference, because "If the result [of pointer arithmetic] points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated." (6.5.6p8). |
This comment has been minimized.
This comment has been minimized.
regehr
commented
Nov 13, 2017
|
Thanks @sunfishcode! This stuff is super difficult and I wanted some reassurance this is a hard bug and not an easy bug :) |
This comment has been minimized.
This comment has been minimized.
|
Another interesting question is if we can use this LVLM bug to trigger UB in safe Rust code. So far it seems we have a gadget to "replace" one value of a given type by another; if we could apply that selectively so that the during bounds checks, the index is replaced when checking but not when accessing, we would get UB. Not sure if that is possible though, because the Rust compiler is emitting both of these together. |
bstrie
referenced this issue
Nov 17, 2017
Open
Investigate how LLVM deals with huge stack frames #18072
This comment has been minimized.
This comment has been minimized.
|
The LLVM bug seems to shows a program which has UB in C. In In Rust, I.e. why is wrapping_offset safe on pointers when it's potentially undefined in every other systems language? |
This comment has been minimized.
This comment has been minimized.
regehr
commented
Jul 26, 2018
|
@ahmedcharles the result of (int)p has type int and it obeys integer rules, not pointer rules. adding 1 would only be UB if (int)p had evaluated to INT_MAX |
This comment has been minimized.
This comment has been minimized.
That's not correct. Whether the resulting pointer may be used to access memory is a different question, also see the updated documentation for this method. |
RalfJung commentedNov 7, 2017
•
edited
lib/src.rsin thefoocrate:src/main.rsin the main crate:This prints
b = 1, x = 7777. That is wrong; ifbis1, it must be the case that the then-branch in theq as *const _ == pconditional was taken, soris&mut x(sinceb2istrue), soxcannot still be7777.This is a bug in LLVM's GVN, see https://bugs.llvm.org/show_bug.cgi?id=35229 for the upstream report and some further analysis.
Original C test case by Gil Hur.