
regression: dead heap allocations aren't optimized out anymore #24194

Open
oli-obk opened this issue Apr 8, 2015 · 25 comments

oli-obk (Contributor) commented Apr 8, 2015

#22159 was closed after an LLVM update (#22526). The optimization used to work (I remember trying it out), but now it doesn't anymore.

fn main() {
    let _ = Box::new(42);
}

http://is.gd/Wekr7w

LLVM IR:

; ModuleID = 'rust_out.0.rs'
target datalayout = "e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: uwtable
define internal void @_ZN4main20h657e6a8d1dc11120eaaE() unnamed_addr #0 {
entry-block:
  %0 = tail call i8* @je_mallocx(i64 4, i32 0)
  %1 = icmp eq i8* %0, null
  br i1 %1, label %then-block-57-.i.i, label %"_ZN5boxed12Box$LT$T$GT$3new19h823444268625886800E.exit"

then-block-57-.i.i:                               ; preds = %entry-block
  tail call void @_ZN3oom20he7076b57c17ed7c6HYaE()
  unreachable

"_ZN5boxed12Box$LT$T$GT$3new19h823444268625886800E.exit": ; preds = %entry-block
  %2 = bitcast i8* %0 to i32*
  store i32 42, i32* %2, align 4
  %3 = icmp eq i8* %0, inttoptr (i64 2097865012304223517 to i8*)
  br i1 %3, label %"_ZN14Box$LT$i32$GT$8drop.86517h1e7c6ecb62969b35E.exit", label %cond.i

cond.i:                                           ; preds = %"_ZN5boxed12Box$LT$T$GT$3new19h823444268625886800E.exit"
  tail call void @je_sdallocx(i8* %0, i64 4, i32 0)
  br label %"_ZN14Box$LT$i32$GT$8drop.86517h1e7c6ecb62969b35E.exit"

"_ZN14Box$LT$i32$GT$8drop.86517h1e7c6ecb62969b35E.exit": ; preds = %"_ZN5boxed12Box$LT$T$GT$3new19h823444268625886800E.exit", %cond.i
  ret void
}

define i64 @main(i64, i8**) unnamed_addr #1 {
top:
  %2 = tail call i64 @_ZN2rt10lang_start20he050f8de3bcc02b7VRIE(i8* bitcast (void ()* @_ZN4main20h657e6a8d1dc11120eaaE to i8*), i64 %0, i8** %1)
  ret i64 %2
}

declare i64 @_ZN2rt10lang_start20he050f8de3bcc02b7VRIE(i8*, i64, i8**) unnamed_addr #1

declare noalias i8* @je_mallocx(i64, i32) unnamed_addr #1

; Function Attrs: cold noinline noreturn
declare void @_ZN3oom20he7076b57c17ed7c6HYaE() unnamed_addr #2

declare void @je_sdallocx(i8*, i64, i32) unnamed_addr #1

attributes #0 = { uwtable "split-stack" }
attributes #1 = { "split-stack" }
attributes #2 = { cold noinline noreturn "split-stack" }

oli-obk (Contributor, Author) commented Apr 8, 2015

Hmmm, I'm not so sure this is a regression. It still works for vectors: http://is.gd/yNAFF6. Maybe it never worked for boxes?

steveklabnik (Member) commented May 24, 2016

So, where do we go with this ticket? Was/is this a regression? Is it worth tracking?

dotdash (Contributor) commented May 24, 2016

@steveklabnik Probably a regression from when we moved from actual zeroing drop to the non-zero drop pattern. The check for that pattern basically says "leak this if the address matches the drop pattern", so LLVM cannot remove the allocation, because doing so would change semantics. Dynamic drop will fix this, and I think we should keep this open to track that, just to make sure.
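
For context, the comparison dotdash describes is visible in the IR above: the pointer returned by je_mallocx is compared against 2097865012304223517, i.e. 0x1d1d1d1d1d1d1d1d, before je_sdallocx is called. Conceptually the old drop glue behaved like the following hand-written Rust sketch (the constant name, function name, and use of std::alloc::dealloc are purely illustrative, not actual compiler output; it assumes a 64-bit target, matching the IR above):

// Illustrative sketch only; not the code rustc actually emitted.
const DROP_FILLER: usize = 0x1d1d_1d1d_1d1d_1d1d;

unsafe fn drop_box_i32(ptr: *mut i32) {
    // "Leak this if the address matches the drop pattern": the allocation is
    // only freed when the pointer is NOT the magic filler value. Because the
    // branch depends on the pointer's concrete address, LLVM cannot remove
    // the allocate/free pair without changing observable behaviour.
    if ptr as usize != DROP_FILLER {
        std::alloc::dealloc(ptr as *mut u8, std::alloc::Layout::new::<i32>());
    }
}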

wesleywiser (Member) commented Mar 4, 2017

This appears to be fixed (https://is.gd/kT0Gs9). Is this worth creating some kind of regression test for?

oli-obk (Contributor, Author) commented Mar 4, 2017

We could test it by adding a run-pass test that uses a custom allocator (one which doesn't allocate, but panics).
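
A minimal sketch of such a test, written against the #[global_allocator] mechanism that was stabilized later (all names here are illustrative). It swaps the "panic" idea for a counting allocator, since panicking inside the allocator itself tends to allocate; the assertion only holds when the elision actually fires, i.e. with optimizations enabled:

// Sketch of a run-pass style test: if the dead Box allocation is optimized
// out, the allocation counter does not change. Compile with -O.
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

static ALLOCS: AtomicUsize = AtomicUsize::new(0);

// Counting allocator: forwards to the system allocator but records how many
// heap allocations were requested.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCS.fetch_add(1, Ordering::SeqCst);
        System.alloc(layout)
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static A: CountingAlloc = CountingAlloc;

fn main() {
    let before = ALLOCS.load(Ordering::SeqCst);
    let _ = Box::new(42); // should be elided entirely under -O
    assert_eq!(ALLOCS.load(Ordering::SeqCst), before);
}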

bluss added the E-needstest label Mar 4, 2017

bluss (Contributor) commented Mar 4, 2017

Tentatively needstest, then. Maybe a src/test/codegen test?
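
A codegen test along those lines might look roughly like this, using the usual compiletest/FileCheck conventions; the function name and the exact allocator symbol to match are assumptions (the symbol names changed over time, see the later comments):

// compile-flags: -O
#![crate_type = "lib"]

// CHECK-LABEL: @dead_box_is_removed
// CHECK-NOT: __rust_alloc
// CHECK: ret void
#[no_mangle]
pub fn dead_box_is_removed() {
    let _ = Box::new(42);
}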

nagisa (Contributor) commented May 22, 2017

This seems to have stopped working again on nightly.

est31 (Contributor) commented May 22, 2017

I used @Mark-Simulacrum's bisection tool to find the regressing commit. It was the LLVM 4.0 upgrade, commit 0777c75.

nagisa self-assigned this Jun 1, 2017

nagisa (Contributor) commented Jun 1, 2017

Assigning myself just so it stays on my to-do list. Not immediately working on it, though. Feel free to take over if you want.

alexcrichton (Member) commented Jun 1, 2017

Historically this optimization was done by rust-lang/llvm@4daef48, but that commit was not carried forward to our current branch during the 4.0 upgrade because it no longer applied cleanly (IIRC).

nagisa (Contributor) commented Jun 3, 2017

Sadly, I couldn't make the patch above work. In fact, some testing suggests that the responsibility for eliding allocations has moved out of LLVM into clang.

Namely, code like this

#include <stdlib.h>
void wowzers(int y) {
    void *z = malloc(y);
    if(z != NULL) free(z);
}

when compiled with clang test.c -emit-llvm -S -O3 -fno-builtin (clang 4.0), still has the allocation calls in the produced IR:

; Function Attrs: nounwind sspstrong uwtable
define void @wowzers(i32) local_unnamed_addr #0 {
  %2 = zext i32 %0 to i64
  %3 = tail call noalias i8* @malloc(i64 %2) #2
  %4 = icmp eq i8* %3, null
  br i1 %4, label %6, label %5
; <label>:5:                                      ; preds = %1
  tail call void @free(i8* nonnull %3) #2
  br label %6
; <label>:6:                                      ; preds = %1, %5
  ret void
}

whereas if the special handling of the built-ins is not disabled (i.e. without -fno-builtin), it compiles to a ret void even with -O0.

This seems to suggest that, unless we do some serious patchwork on LLVM (and I'm pretty sure we don't want to do that), it now falls to rustc to optimise out heap allocations.

This could also very well be a bug. I have a test case on hand that does get optimised out on 3.9 but not on 4.0. I might do a bisection.

nagisa (Contributor) commented Jun 3, 2017

Okay, never mind. I got it to work.

nagisa added a commit to nagisa/rust that referenced this issue Jun 3, 2017

Upgrade LLVM
Includes rust-lang/llvm#80

Also adds tests and thus fixes rust-lang#24194

nagisa referenced this issue Jun 3, 2017: Upgrade LLVM #42410 (merged)

bors added a commit that referenced this issue Jun 10, 2017

Auto merge of #42410 - nagisa:llvmup, r=sanxiyn
Upgrade LLVM

Includes rust-lang/llvm#80

Also adds tests and thus fixes #24194

bors added a commit that referenced this issue Jun 14, 2017

Auto merge of #42410 - nagisa:llvmup, r=sanxiyn
Upgrade LLVM

Includes rust-lang/llvm#80

Also adds tests and thus fixes #24194

nagisa (Contributor) commented Jun 15, 2017

The LLVM upgrade reintroduces the optimisation for __rust_allocate, but the optimisation for __rust_allocate_zeroed was backed out due to weird UB-like behaviour with the bitvec iterators in rustc_data_structures.

Should investigate eventually. Assigning myself.

nagisa (Contributor) commented Jun 15, 2017

cc @arielb1 you were interested in this yesterday.

bors closed this in 8938269 Jun 16, 2017

nagisa reopened this Jun 16, 2017

alexcrichton (Member) commented Jul 23, 2017

I believe this is no longer a regression, so I'm removing the regression tag.

nox (Contributor) commented Apr 2, 2018

It's not very clear to me whether this is still actionable. The original snippet doesn't seem to be misoptimised anymore. Can someone produce a new snippet that demonstrates the continued existence of the issue? Should the issue be closed?

Cc @rust-lang/wg-codegen

nagisa (Contributor) commented Apr 3, 2018

nagisa (Contributor) commented Apr 3, 2018

nox (Contributor) commented Apr 3, 2018

@nagisa Any snippet demonstrating the issue?

nagisa (Contributor) commented Apr 3, 2018

#![feature(allocator_api)]
use std::heap::{Heap, Layout, Alloc};

pub unsafe fn alloc_zeroed_doesnt_optimise() {
    let _ = Heap.alloc_zeroed(Layout::from_size_align_unchecked(4, 8));
}

extern "C" {
    fn foo(x: *mut u8);
}

pub unsafe fn alloc_zeroed_should_optimise_rezeroing() {
    let a = Heap.alloc_zeroed(Layout::from_size_align_unchecked(16, 8)).unwrap();
    let slc = ::std::slice::from_raw_parts_mut(a, 16);
    for i in slc.iter_mut() {
        *i = 0;
    }
    foo(slc.as_mut_ptr());
}

shepmaster (Member) commented Jul 5, 2018

For searchability purposes, the functions are now called __rust_alloc_zeroed and __rust_dealloc.

shepmaster (Member) commented Jul 5, 2018

nikic (Contributor) commented Nov 5, 2018

It would be nice if we didn't have to patch LLVM to support custom allocation functions. Unfortunately, the last discussion on that topic didn't really go anywhere: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093625.html

oli-obk (Contributor, Author) commented Nov 5, 2018

The only way around that is to guarantee this optimization via MIR optimizations. But our MIR optimization story is nowhere near the level required for that.

arthurprs (Contributor) commented Nov 5, 2018

Couldn't this be generalized by having a "pure, no side-effects" unsafe attribute?
