Currently, rustc emits lifetime intrinsics to declare the shortest possible lifetime for allocas. Unfortunately, this blocks some optimizations. @eddyb came up with the following example:
#![crate_type="lib"]
extern crate test;

#[derive(Copy)]
struct Big {
    large: [u64; 100000],
}

pub fn test_func() {
    let x = Big {
        large: [0; 100000],
    };
    test::black_box(x);
}
This currently results in the following optimized IR:
define void @_ZN9test_func20hef205289cff69060raaE() unnamed_addr #0 {
entry-block:
  %x = alloca %struct.Big, align 8
  %0 = bitcast %struct.Big* %x to i8*
  %arg = alloca %struct.Big, align 8
  call void @llvm.lifetime.start(i64 800000, i8* %0)
  call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 800000, i32 8, i1 false)
  %1 = bitcast %struct.Big* %arg to i8*
  call void @llvm.lifetime.start(i64 800000, i8* %1)
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %0, i64 800000, i32 8, i1 false)
  call void asm "", "r,~{dirflag},~{fpsr},~{flags}"(%struct.Big* %arg) #2, !noalias !0, !srcloc !3
  call void @llvm.lifetime.end(i64 800000, i8* %1) #2, !alias.scope !4, !noalias !0
  call void @llvm.lifetime.end(i64 800000, i8* %1)
  call void @llvm.lifetime.end(i64 800000, i8* %0)
  ret void
}
As you can see, there are still two allocas, one for x and one copy of it in %arg, which is used for the function call. Since x is otherwise unused, we could call memset directly on %arg and drop %x altogether. But the lifetime of %arg only starts after the memset call, so the optimization doesn't happen. Moving the call to llvm.lifetime.start up makes the optimization possible.
Now, the lifetime intrinsics only buy us anything if the ranges don't overlap. If the ranges overlap, we may as well make them all start at the same point. One way might be to insert start/end calls at positions that match up with scopes in the language, using an insertion marker like we do for allocas to insert the start calls and a cleanup scope for the end calls. This should also make things more robust than they currently are. We had a few misoptimization problems due to missing calls to llvm.lifetime.end. :-/
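To make the overlap point concrete, here is a rough sketch in present-day Rust. It uses std::hint::black_box rather than the test::black_box from the example above, and the function names are made up for illustration. In disjoint(), the two buffers sit in separate language scopes, so their lifetime ranges don't overlap and the backend is free to reuse one stack slot for both. In overlapping(), both buffers are live at once, so the markers can't save any stack space, and starting both ranges at the same point would cost nothing.

```rust
use std::hint::black_box;

// Hypothetical example functions; the names are illustrative only.

/// Two buffers in disjoint scopes: their lifetime ranges do not overlap,
/// so the backend may place both in the same stack slot.
pub fn disjoint() -> u64 {
    let first = {
        let a = [1u8; 4096];
        black_box(&a); // keep the buffer from being optimized away
        a[0] as u64
    };
    let second = {
        let b = [2u8; 4096];
        black_box(&b);
        b[0] as u64
    };
    first + second
}

/// Both buffers are live at the same time: the ranges overlap, so
/// precise start markers cannot reduce stack usage here.
pub fn overlapping() -> u64 {
    let a = [1u8; 4096];
    let b = [2u8; 4096];
    black_box(&a);
    black_box(&b);
    a[0] as u64 + b[0] as u64
}
```

Whether the two slots in disjoint() actually get merged depends on the rustc and LLVM versions and the optimization level, so comparing the emitted IR or the stack frame size of the two functions is the way to check.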