Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustup-init 0.6.1 segfaults on x86_64 and armv7, 0.6.0 on OS X 10.10 #36023

Closed
brson opened this issue Aug 26, 2016 · 17 comments · Fixed by #36117
Closed

rustup-init 0.6.1 segfaults on x86_64 and armv7, 0.6.0 on OS X 10.10 #36023

brson opened this issue Aug 26, 2016 · 17 comments · Fixed by #36117
Labels
A-codegen Area: Code generation I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@brson
Copy link
Contributor

brson commented Aug 26, 2016

rustup reports of 0.6.1 crashes:

These were built with rustc 1.13.0-nightly (3c5a0fa45 2016-08-22)

rustup 0.6.0 crashed on OS X 10.10+. Fixed with a strategic #[inline(never)]

That was built with rustc 1.13.0-nightly (1576de0ce 2016-08-21)

Reducing these has been very difficult and I have nothing for them right now.

cc @nikomatsakis @eddyb @alexcrichton

@brson brson added A-codegen Area: Code generation I-nominated regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. labels Aug 26, 2016
@eddyb
Copy link
Member

eddyb commented Aug 26, 2016

@brson Do you know of a rustc nightly which doesn't result in crashes?

@brson
Copy link
Contributor Author

brson commented Aug 26, 2016

@eddyb looking for one now.

@eddyb
Copy link
Member

eddyb commented Aug 26, 2016

rust-lang-deprecated/rustup.sh#68 looks like the exact same issue AFAICT.

@brson brson added regression-from-stable-to-beta Performance or correctness regression from stable to beta. and removed regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. labels Aug 26, 2016
@brson
Copy link
Contributor Author

brson commented Aug 26, 2016

rustc 1.13.0-nightly (aef6971ca 2016-08-17) from 2016-08-18 is the first nightly that miscompiles. Curiously this is also the first 1.13 nightly. Unfortunately the bug is on beta too.

reproducing:

commit c8cc167a0ab725e25dc49eda1ea9d71196def5d7

cargo build --release
printf "12" > ~/.multirust/version && cat ~/.multirust/version
target/release/rustup-init

nightly results:

  • 2016-08-10, rustc 1.12.0-nightly (576f766 2016-08-09) - good
  • 2016-08-16, rustc 1.12.0-nightly (197be89 2016-08-15) - build error
  • 2016-08-17, rustc 1.12.0-nightly (1bf5fa3 2016-08-16) - build error
  • 2016-08-18, rustc 1.13.0-nightly (aef6971 2016-08-17) - bad
  • 2016-08-22, rustc 1.13.0-nightly (1576de0 2016-08-21) - bad
  • 2016-08-26, rustc 1.13.0-nightly (e07dd59 2016-08-25) - bad

stable/beta results:

  • rustc 1.11.0 (9b21dcd 2016-08-15) - good
  • rustc 1.12.0-beta.2 (389dad7 2016-08-24) - bad!
  • beta-2016-08-16, rustc 1.12.0-beta.1 (822166b 2016-08-16) - buld error

brson added a commit to brson/rustup.rs that referenced this issue Aug 27, 2016
@brson
Copy link
Contributor Author

brson commented Aug 27, 2016

Removing the two lifetime annotations from #35409 appears to make the crash go away.

@brson
Copy link
Contributor Author

brson commented Aug 27, 2016

I am trying to minimize a test case next.

@brson
Copy link
Contributor Author

brson commented Aug 28, 2016

Reduced case. This segfaults on x86_64-unknown-linux-gnu with rustc 1.13.0-nightly (e07dd59ea 2016-08-25), compiled with -O.

use std::ops::Deref;

fn main() {
    if env_var("FOOBAR").as_ref().map(Deref::deref).ok() == Some("yes") {
        panic!()
    }

    let env_home: Result<String, ()> = Ok("foo-bar-baz".to_string());
    let env_home = env_home.as_ref().map(Deref::deref).ok();

    if env_home == Some("") { panic!() }
}

#[inline(never)]
fn env_var(s: &str) -> Result<String, VarError> {
    Err(VarError::NotPresent)
}

pub enum VarError {
    NotPresent,
    NotUnicode(String),
}

@eddyb
Copy link
Member

eddyb commented Aug 28, 2016

Normalized to:

#[inline(never)]
fn err_none() -> Result<String, Option<String>> {
    Err(None)
}

#[inline(never)]
fn panic() -> ! {
    panic!()
}

#[inline(always)]
fn ok<T, E>(r: Result<T, E>) -> Option<T> {
    match r {
        Ok(x) => Some(x),
        Err(_) => None
    }
}

fn main() {
    {
        let a = err_none();
        let b = match a {
            Ok(ref x) => Ok(&**x),
            Err(ref e) => Err(e)
        };
        if let Some("a") = ok(b) {
            panic()
        }
    }

    {
        let a = err_none();
        let b = match a {
            Ok(ref x) => Ok(&**x),
            Err(ref e) => Err(e)
        };
        if let Some("a") = ok(b) {
           panic()
        }
    }
}

Backtrace is:

#0  je_rtree_get (rtree=<optimized out>, dependent=true, key=0) at /home/eddy/Projects/rust-2/src/liballoc_jemalloc/../jemalloc/include/jemalloc/internal/rtree.h:253
#1  je_chunk_lookup (dependent=true, ptr=0x0) at /home/eddy/Projects/rust-2/src/liballoc_jemalloc/../jemalloc/include/jemalloc/internal/chunk.h:91
#2  huge_node_get (ptr=0x0) at /home/eddy/Projects/rust-2/src/liballoc_jemalloc/../jemalloc/src/huge.c:11
#3  je_huge_dalloc (tsd=0x7ffff7ff1770, ptr=0x0, tcache=0x7ffff6a0c000) at /home/eddy/Projects/rust-2/src/liballoc_jemalloc/../jemalloc/src/huge.c:375
#4  0x000055555555b262 in alloc::heap::deallocate (align=1, ptr=<optimized out>, old_size=<optimized out>) at /home/eddy/Projects/rust-2/src/liballoc/heap.rs:113
#5  alloc::raw_vec::{{impl}}::drop<u8> (self=<optimized out>) at /home/eddy/Projects/rust-2/src/liballoc/raw_vec.rs:561
#6  test::main () at /home/eddy/Projects/rust-2/test.rs:21
#7  0x0000555555569997 in __rust_maybe_catch_panic ()
#8  0x000055555556241f in std::rt::lang_start::h5381d9388ae2d3b7 ()
#9  0x00007ffff7219770 in __libc_start_main () from /nix/store/qc1jm0rplpgkwcacnc03hs9w4c8v2m31-glibc-multi-2.23/lib/libc.so.6
#10 0x000055555555b059 in _start () at ../sysdeps/x86_64/start.S:108

Note the ptr=0x0 passed to je_huge_dalloc.

@eddyb
Copy link
Member

eddyb commented Aug 28, 2016

So far I've found that the drop glue for Result<String, Option<String>> optimizes to start with:

bb2.i:                                            ; preds = %bb3
  %4 = bitcast [3 x i64]* %3 to %"3.std::string::String"*
  %5 = getelementptr inbounds %"3.std::string::String", %"3.std::string::String"* %4, i64 0, i32 0, i32 0, i32 0, i32 0, i32 0
  %6 = load i8*, i8** %5, align 8, !alias.scope !2, !nonnull !9

That load is reading the data pointer of the String (which may be null in the Err case, where it represents Err(None)) and assuming it's nonnull.
While the null check which distinguishes between Err(None) and Err(Some(_)) is still there, it gets optimized out later, due to that !nonnull metadata on the first load.

EDIT: Those 3 instructions are for the Ok case, however, the IR ends up looking like:

bb3:
  %a = alloca %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>", align 8
  %a1 = alloca %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>", align 8
  %0 = bitcast %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>"* %a to i8*
  call void @llvm.lifetime.start(i64 32, i8* %0)
  call fastcc void @_ZN4test8err_none17h78cf98032df7f4e6E(%"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>"* noalias nocapture nonnull dereferenceable(32) %a)
  %1 = getelementptr inbounds %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>", %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>"* %a, i64 0, i32 0
  %2 = load i64, i64* %1, align 8, !range !1
  %switch = icmp eq i64 %2, 1
  %3 = getelementptr inbounds %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>", %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>"* %a, i64 0, i32 2
  br i1 %switch, label %bb3.i, label %bb2.i

bb2.i:                                            ; preds = %bb3
  %4 = getelementptr inbounds [3 x i64], [3 x i64]* %3, i64 0, i64 0
  %5 = load i64, i64* %4, align 8, !range !2, !alias.scope !3
  %6 = getelementptr inbounds %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>", %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>"* %a, i64 0, i32 2, i64 2
  %7 = load i64, i64* %6, align 8, !alias.scope !10
  br label %bb8

bb3.i:                                            ; preds = %bb3
  br label %bb8

The !nonnull has turned into a !range !2 (presumably "everything but 0").
Which is still fine, bb2.i is taken for Ok and bb3.i for Err, so you would think it'd work.

But SimplifyCFG comes in and shoves the load into the entry block:

bb3:
  %a = alloca %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>", align 8
  %a1 = alloca %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>", align 8
  %0 = bitcast %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>"* %a to i8*
  call void @llvm.lifetime.start(i64 32, i8* %0)
  call fastcc void @_ZN4test8err_none17h78cf98032df7f4e6E(%"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>"* noalias nocapture nonnull dereferenceable(32) %a)
  %1 = getelementptr inbounds %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>", %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>"* %a, i64 0, i32 0
  %2 = load i64, i64* %1, align 8, !range !1
  %switch = icmp eq i64 %2, 1
  %3 = getelementptr inbounds %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>", %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>"* %a, i64 0, i32 2
  %4 = getelementptr inbounds [3 x i64], [3 x i64]* %3, i64 0, i64 0

; This is the load for the Ok case specifically.
  %5 = load i64, i64* %4, align 8, !range !2, !alias.scope !3

  %6 = getelementptr inbounds %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>", %"2.std::result::Result<std::string::String, std::option::Option<std::string::String>>"* %a, i64 0, i32 2, i64 2
  %7 = load i64, i64* %6, align 8, !alias.scope !10
  %b.sroa.7.097 = select i1 %switch, i64 undef, i64 %7
  %tmp6.sroa.0.0 = select i1 %switch, i64 0, i64 %5
  %8 = inttoptr i64 %tmp6.sroa.0.0 to i8*
  %switch5tmp = icmp ne i64 %tmp6.sroa.0.0, 0
  %9 = icmp eq i64 %b.sroa.7.097, 1
  %or.cond = and i1 %switch5tmp, %9
  br i1 %or.cond, label %bb4.i.i.i.i.i, label %bb10

What does this mean?

  • LLVM has a nasty (class of) bug around metadata hints, not updating/removing them when needed
  • MIR trans and the lifetime intrinsics are coincidental factors, causing the crash, not the misoptimization
    • the assembly is pretty similar without lifetime intrinsics, the null check is still missing

@eddyb
Copy link
Member

eddyb commented Aug 28, 2016

Barebones reproduction:

#![feature(start)]

use std::ptr;

extern {
    fn abort() -> !;
}

struct S {
    ptr: &'static mut u8,
    cap: usize,
    len: usize
}

struct OS {
    ptr: *mut u8,
    cap: usize,
    len: usize
}

struct R {
    is_err: bool,
    buf: OS
}

#[inline(never)]
unsafe fn begin() -> R {
    R {
        is_err: true,
        buf: OS { ptr: ptr::null_mut(), len: 0, cap: 0 }
    }
}

unsafe fn end(r: &mut R) {
    if r.is_err {
        if r.buf.ptr != ptr::null_mut() {
            *r.buf.ptr = 0;
        }
    } else {
        *r.buf.ptr = 0;
    }
}

unsafe fn ok(r: [usize; 3]) -> (*const u8, usize) {
    let r: &(bool, (*const u8, usize)) = &*(&r as *const _ as *const _);
    if r.0 {
        (ptr::null(), std::mem::uninitialized())
    } else {
        r.1
    }
}

macro_rules! twice {
    ($x:expr) => {$x; $x}
}

#[start]
fn start(_: isize, _: *const *const u8) -> isize {
    twice!(unsafe {
        let mut r = begin();
        let a = if r.is_err {
            [1usize, &r.buf as *const _ as usize, 0]
        } else {
            let s: &S = &*(&r.buf as *const _ as *const S);
            [0usize, s.ptr as *const _ as usize, s.len]
        };

        let b = ok(a);
        if b.0 != ptr::null() && b.1 == 1 && *b.0 == 1 {
            abort()
        }

        end(&mut r);
    });
    0
}

@eddyb
Copy link
Member

eddyb commented Aug 28, 2016

Found a LLVM commit should've fixed this (but apparently didn't fully?): llvm-mirror/llvm@a90859a.

EDIT: Most likely FoldTwoEntryPHINode also needs that metadata-stripping treatment.

@eddyb
Copy link
Member

eddyb commented Aug 28, 2016

Filed LLVM bug report: https://llvm.org/bugs/show_bug.cgi?id=29163.

@arielb1
Copy link
Contributor

arielb1 commented Aug 28, 2016

But I thought that bad-metadata loads only result in undef, not in UB. That's how the last of these bugs were fixed.

@eddyb
Copy link
Member

eddyb commented Aug 28, 2016

@arielb1 Just undef is useless, the metadata is there to kill later redundant checks. Hopefully the fix is just to add one or two of those metadata stripping loops to that unhandled case.

@eddyb
Copy link
Member

eddyb commented Aug 29, 2016

Fix is llvm-mirror/llvm@f94d4a7. Recommend backport to beta, too.

@brson
Copy link
Contributor Author

brson commented Aug 29, 2016

Thanks a bunch @eddyb

bors added a commit that referenced this issue Aug 30, 2016
llvm: backport "[SimplifyCFG] Hoisting invalidates metadata".

Fixes #36023 by backporting @majnemer's LLVM patch fixing [the LLVM bug](https://llvm.org/bugs/show_bug.cgi?id=29163.) where SimplifyCFG hoisted instructions andkept their metadata (conditional `!nonnull` loads could kill a null check later if hoisted).

r? @alexcrichton
@sanxiyn
Copy link
Member

sanxiyn commented Nov 30, 2016

LLVM patch was merged to 3.9 stable branch in r288063 and will be included in 3.9.1 release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants