Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upUnwinding through `fn()` references across crates causes the behavior to depend on the optimization level #29485
Comments
This comment has been minimized.
This comment has been minimized.
|
Looks like a codegen bug related to the %X = type { i8, i8 }
declare void @foo()
define void @f(%X* noalias) {
%x = alloca %X*
store %X* %0, %X** %x
%2 = load %X*, %X** %x
%3 = getelementptr inbounds %X, %X* %2, i32 0, i32 0
store i8 1, i8* %3
call void @foo()
%4 = load %X*, %X** %x
%5 = getelementptr inbounds %X, %X* %4, i32 0, i32 0
store i8 0, i8* %5
ret void
} When optimized, we get: %X = type { i8, i8 }
declare void @foo()
define void @f(%X* noalias nocapture) {
%2 = getelementptr inbounds %X, %X* %0, i64 0, i32 0
tail call void @foo()
store i8 0, i8* %2, align 1
ret void
}And as we can see, the store of 1 has been optimized away. If we instead remove the %X = type { i8, i8 }
declare void @foo()
define void @f(%X* nocapture) {
%2 = getelementptr inbounds %X, %X* %0, i64 0, i32 0
store i8 1, i8* %2, align 1
tail call void @foo()
store i8 0, i8* %2, align 1
ret void
}cc @dotdash, perhaps we can't place |
alexcrichton
added
A-codegen
I-nominated
T-compiler
labels
Oct 31, 2015
This comment has been minimized.
This comment has been minimized.
|
I'm wrong. |
This comment has been minimized.
This comment has been minimized.
|
An LLVM IR There's a decent argument that LLVM's treatment of #include <iostream>
using namespace std;
void xx() {
throw 1;
}
void (*volatile g)() = xx;
void f(int* __restrict x) {
*x = 1;
g();
*x = 0;
}
int main() {
int x = 0;
try { f(&x); } catch(...) {}
std::cout << x;
return 0;
}gcc prints 1; clang prints 0. On the other hand, you're likely to run into resistance proposing any restriction on how alias analysis can reason about noalias pointers. |
This comment has been minimized.
This comment has been minimized.
|
This isn't just a
MemoryDependenceAnalysis in LLVM currently says that the second store here has a direct local dependence on the first store. There's a sense in which that is a correct answer, but DeadStoreElimination inteprets this answer to mean that the first store is dead. It's not immediately obvious which of these two should be fixed.
|
This comment has been minimized.
This comment has been minimized.
|
triage: P-high |
rust-highfive
added
P-high
and removed
I-nominated
labels
Nov 5, 2015
This comment has been minimized.
This comment has been minimized.
|
Is the problem here that LLVM is optimizing too heavily, or that we are providing incorrect annotations? Either way, it may be prudent to supply fewer annotations for the time being. That'd certainly be an easy patch to do. |
This comment has been minimized.
This comment has been minimized.
|
I asked sunfish to clarify on IRC, and he believes this is 100% an llvm bug. |
This comment has been minimized.
This comment has been minimized.
|
I've now filed an LLVM bug: https://llvm.org/bugs/show_bug.cgi?id=25422 |
This comment has been minimized.
This comment has been minimized.
|
We were discussing in the compiler team meeting. Do we expect LLVM to fix this in a short time frame? I would think probably not -- we were thinking therefore that it would be prudent to workaround in the time being. Would this mean removing the aliasing annotations and thus taking some speed hit? Are there other solutions? Other implications? |
This comment has been minimized.
This comment has been minimized.
No; fixing DSE is probably easy enough... but trusting noalias to work correctly alongside unwinding would probably require auditing every use of alias analysis in LLVM (for example, I'm pretty sure LICM is also broken). In any case, no matter how fast this is fixed, Rust can't unconditionally depend on any fix because we support linking against 3.7.
Yes.
No.
I don't think so. |
This comment has been minimized.
This comment has been minimized.
|
LICM is not obviously broken; see eg. isGuaranteedToExecute in LICM.cpp; it explicitly checks for instructions which may unwind before doing related things. The bug here does appear to arise from the difference between MemDep's way of viewing the world and DSE's. I don't know how conservative Rust wants to be here, but there is room for optimism that this may just be an isolated DSE bug (or a MemDep bug which only affects DSE). |
This comment has been minimized.
This comment has been minimized.
|
Won't removing |
This comment has been minimized.
This comment has been minimized.
|
Okay, LICM's is in fact correct... the definition of isGuaranteedToExecute is overly conservative in a way which happens to cover up this issue. That said, my non-comprehensive audit shows -memcpyopt, -mldst-motion, and -loop-idiom are broken. (Granted, -mldst-motion isn't part of the default pass pipeline.) A couple more testcases:
|
This comment has been minimized.
This comment has been minimized.
|
Wow. I revoke my optimism then. |
This comment has been minimized.
This comment has been minimized.
I am curious what effect it will have, I don't really have a good feeling On Thu, Nov 12, 2015 at 7:39 PM, Dan Gohman notifications@github.com
|
arielb1
added
the
A-LLVM
label
Nov 13, 2015
This comment has been minimized.
This comment has been minimized.
|
This looks rather unambiguously like an LLVM bug (though we may want to work around it). |
This comment has been minimized.
This comment has been minimized.
|
(a reasonable subtask for a relative newcomer to the project could be to evaluate the actual performance impact removing the |
Gankro
added
the
I-unsound 💥
label
Nov 22, 2015
This comment has been minimized.
This comment has been minimized.
|
For rust itself, the text segment in the |
This comment has been minimized.
This comment has been minimized.
|
Could we have concrete numbers ( |
This comment has been minimized.
This comment has been minimized.
|
Of course, 1-3% perf losses still suck - this might be a reason to make official rustc builds use |
This comment has been minimized.
This comment has been minimized.
|
@arielb1 I removed all diff --git a/src/librustc_trans/trans/attributes.rs b/src/librustc_trans/trans/attributes.rs
index 28dfa4e..32dd31d 100644
--- a/src/librustc_trans/trans/attributes.rs
+++ b/src/librustc_trans/trans/attributes.rs
@@ -10,7 +10,7 @@
//! Set and unset common attributes on LLVM values.
use libc::{c_uint, c_ulonglong};
-use llvm::{self, ValueRef, AttrHelper};
+use llvm::{self, ValueRef};
use middle::ty;
use middle::infer;
use session::config::NoDebugInfo;
@@ -117,7 +117,6 @@ pub fn from_fn_attrs(ccx: &CrateContext, attrs: &[ast::Attribute], llfn: ValueRe
llvm::ColdAttribute as u64)
}
} else if attr.check_name("allocator") {
- llvm::Attribute::NoAlias.apply_llfn(llvm::ReturnIndex as c_uint, llfn);
} else if attr.check_name("unwind") {
unwind(llfn, true);
}
@@ -190,24 +189,12 @@ pub fn from_fn_type<'a, 'tcx>(ccx: &CrateContext<'a, 'tcx>, fn_type: ty::Ty<'tcx
// invisible to the program. We also know it's nonnull as well
// as how many bytes we can dereference
attrs.arg(1, llvm::Attribute::StructRet)
- .arg(1, llvm::Attribute::NoAlias)
.arg(1, llvm::Attribute::NoCapture)
.arg(1, llvm::DereferenceableAttribute(llret_sz));
// Add one more since there's an outptr
idx += 1;
} else {
- // The `noalias` attribute on the return value is useful to a
- // function ptr caller.
- match ret_ty.sty {
- // `Box` pointer return values never alias because ownership
- // is transferred
- ty::TyBox(it) if common::type_is_sized(ccx.tcx(), it) => {
- attrs.ret(llvm::Attribute::NoAlias);
- }
- _ => {}
- }
-
// We can also mark the return value as `dereferenceable` in certain cases
match ret_ty.sty {
// These are not really pointers but pairs, (pointer, len)
@@ -233,8 +220,7 @@ pub fn from_fn_type<'a, 'tcx>(ccx: &CrateContext<'a, 'tcx>, fn_type: ty::Ty<'tcx
// For non-immediate arguments the callee gets its own copy of
// the value on the stack, so there are no aliases. It's also
// program-invisible so can't possibly capture
- attrs.arg(idx, llvm::Attribute::NoAlias)
- .arg(idx, llvm::Attribute::NoCapture)
+ attrs.arg(idx, llvm::Attribute::NoCapture)
.arg(idx, llvm::DereferenceableAttribute(llarg_sz));
}
@@ -244,8 +230,6 @@ pub fn from_fn_type<'a, 'tcx>(ccx: &CrateContext<'a, 'tcx>, fn_type: ty::Ty<'tcx
// `Box` pointer parameters never alias because ownership is transferred
ty::TyBox(inner) => {
- attrs.arg(idx, llvm::Attribute::NoAlias);
-
if common::type_is_sized(ccx.tcx(), inner) {
let llsz = machine::llsize_of_real(ccx, type_of::type_of(ccx, inner));
attrs.arg(idx, llvm::DereferenceableAttribute(llsz));
@@ -265,10 +249,6 @@ pub fn from_fn_type<'a, 'tcx>(ccx: &CrateContext<'a, 'tcx>, fn_type: ty::Ty<'tcx
// on memory dependencies rather than pointer equality
let interior_unsafe = mt.ty.type_contents(ccx.tcx()).interior_unsafe();
- if mt.mutbl == hir::MutMutable || !interior_unsafe {
- attrs.arg(idx, llvm::Attribute::NoAlias);
- }
-
if mt.mutbl == hir::MutImmutable && !interior_unsafe {
attrs.arg(idx, llvm::Attribute::ReadOnly);
}
diff --git a/src/librustc_trans/trans/foreign.rs b/src/librustc_trans/trans/foreign.rs
index 95e9e85..39858e9 100644
--- a/src/librustc_trans/trans/foreign.rs
+++ b/src/librustc_trans/trans/foreign.rs
@@ -363,8 +363,7 @@ pub fn trans_native_call<'blk, 'tcx>(bcx: Block<'blk, 'tcx>,
// The outptr can be noalias and nocapture because it's entirely
// invisible to the program. We also know it's nonnull as well
// as how many bytes we can dereference
- attrs.arg(1, llvm::Attribute::NoAlias)
- .arg(1, llvm::Attribute::NoCapture)
+ attrs.arg(1, llvm::Attribute::NoCapture)
.arg(1, llvm::DereferenceableAttribute(llret_sz));
};Sizes before:
Sizes after:
Ignoring the I'll gather new timings in the evening. It'll probably be output from |
This comment has been minimized.
This comment has been minimized.
|
More numbers. I went with the lilbcore
libstd
libsyntax
|
This comment has been minimized.
This comment has been minimized.
|
@dotdash great, thanks! This is sort of inline with my expectations. I also expect we'll see more impact on smaller examples. |
This comment has been minimized.
This comment has been minimized.
|
As an aside, at the work week I got a lot of great suggestions for benchmarks, I have to try and draw them up together into a central list now. |
nikomatsakis
self-assigned this
Jan 7, 2016
This comment has been minimized.
This comment has been minimized.
|
Assigning to myself for the purpose of gathering up a few benchmarks. |
This comment has been minimized.
This comment has been minimized.
|
Doesn't the above patch remove way more |
This comment has been minimized.
This comment has been minimized.
|
So I ran the benchmarks from https://github.com/cristicbz/rust-doom/tree/rustc-benchmark master:
Patched:
|
This comment has been minimized.
This comment has been minimized.
|
That is interesting. Looks like about a 5% slowdown. |
This comment has been minimized.
This comment has been minimized.
|
This patch also removes NoAlias for " Maybe it'd be interesting to see how big the impact of just disabling NoAlias for shared borrows is? EDIT: oops obviously I mixed up shared and mutable references, |
This comment has been minimized.
This comment has been minimized.
|
Time for some new numbers. This time only removing the diff --git a/src/librustc_trans/trans/attributes.rs b/src/librustc_trans/trans/attributes.rs
index 28dfa4e..95c3941 100644
--- a/src/librustc_trans/trans/attributes.rs
+++ b/src/librustc_trans/trans/attributes.rs
@@ -265,7 +265,7 @@ pub fn from_fn_type<'a, 'tcx>(ccx: &CrateContext<'a, 'tcx>, fn_type: ty::Ty<'tcx
// on memory dependencies rather than pointer equality
let interior_unsafe = mt.ty.type_contents(ccx.tcx()).interior_unsafe();
- if mt.mutbl == hir::MutMutable || !interior_unsafe {
+ if mt.mutbl != hir::MutMutable && !interior_unsafe {
attrs.arg(idx, llvm::Attribute::NoAlias);
}Current master (
Patched master:
So basically no difference in performance. I'll make an PR for this. |
This comment has been minimized.
This comment has been minimized.
|
Thanks for persevering, @dotdash! |
This comment has been minimized.
This comment has been minimized.
|
On Thu, Feb 04, 2016 at 02:48:48PM -0800, Björn Steinbrink wrote:
Great. |
dotdash
added a commit
to dotdash/rust
that referenced
this issue
Feb 10, 2016
dotdash
referenced this issue
Feb 10, 2016
Merged
Workaround LLVM optimizer bug by not marking &mut pointers as noalias #31545
dotdash
added a commit
to dotdash/rust
that referenced
this issue
Feb 10, 2016
bors
added a commit
that referenced
this issue
Feb 11, 2016
bors
closed this
in
#31545
Feb 12, 2016
This comment has been minimized.
This comment has been minimized.
|
#32155 could be a regression from this change. It's not that particular code regresses all the time, but it makes the optimization flaky, and it depends on the surrounding program. |
mahkoh commentedOct 30, 2015
In
a.rsIn
b.rsNB: Some pre-1.0 nightlies already behaved this way.