Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stack overflow in rust-brotli #36778

Closed
nikomatsakis opened this Issue Sep 27, 2016 · 15 comments

Comments

Projects
None yet
8 participants
@nikomatsakis
Copy link
Contributor

nikomatsakis commented Sep 27, 2016

This commit in rust-brotli:

dropbox/rust-brotli@e580fea

fixes a stack overflow that started occurring recently with nightly builds. Perhaps related to #35408?

@danielrh, do you always test against the most recent nightly, or do you only upgrade nightly periodically? I am trying to narrow down when this stack overflow may have started to occur.

@danielrh

This comment has been minimized.

Copy link

danielrh commented Sep 27, 2016

I got distracted for several months, so this test had not been running since early July

@TimNN

This comment has been minimized.

Copy link
Contributor

TimNN commented Sep 27, 2016

I get the stack overflow also onrustc 1.12.0-beta.6 (on OS X)

@TimNN

This comment has been minimized.

Copy link
Contributor

TimNN commented Sep 27, 2016

I can reproduce this as early as nightly-2016-03-20 with RUSTFLAGS="-Z orbit" (the nightly before, nightly-2016-03-20, does not support -Z orbit).

Also seems to be related to mir since on eg. nightly-2016-08-16 it reproduces with -Z orbit but not with -Z orbit=off.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor Author

nikomatsakis commented Sep 27, 2016

@TimNN thanks =) I am hoping that this proves to be a slightly more tractable use case to analyze and improve the stack usage problems we found in #35408. (Though current theory is that the best way to fix is general purpose MIR optimizations.)

cc @eddyb

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Sep 27, 2016

@khernyo

This comment has been minimized.

Copy link
Contributor

khernyo commented Sep 29, 2016

@khernyo

This comment has been minimized.

Copy link
Contributor

khernyo commented Oct 1, 2016

You can find a minimized version at https://github.com/khernyo/rust-issue-36778

Basically it boils down to the following code, but the stack overflow only manifests when it's run using cargo run:

fn main() {
    let child = std::thread::spawn(|| {
        std::cell::RefCell::new([0i8; 232500]);
    });
    child.join().unwrap();
}

The first rust commit which seems to causes the stack overflow is ee977e7

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Oct 1, 2016

Yeah that's MIR alright - if you double the array size it also breaks before the switch to MIR, right?

@khernyo

This comment has been minimized.

Copy link
Contributor

khernyo commented Oct 1, 2016

Yes, that's right.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor Author

nikomatsakis commented Oct 4, 2016

@khernyo thanks for minimizing that! =)

@nikomatsakis

This comment has been minimized.

Copy link
Contributor Author

nikomatsakis commented Oct 4, 2016

OK, so, maybe this clear to @eddyb, but I did a little bit of digging into the MIR for the case in question. It sort of seems to me it's doing the right thing -- am I missing something?

In particular, the MIR for the closure looks like (modified to only use the type [i8;22], sorry):

fn main::{{closure}}(_1: [closure@<anon>:2:36: 4:6]) -> () {
    let mut _0: ();                      // return pointer
    let mut _2: std::cell::RefCell<[i8; 22]>;
    let mut _3: [i8; 22];

    bb0: {
        StorageLive(_3);                 // scope 0 at <anon>:3:33: 3:42
        _3 = [const 0i8; const 22usize]; // scope 0 at <anon>:3:33: 3:42
        _2 = <std::cell::RefCell<T>><[i8; 22]>::new(_3) -> bb1; // scope 0 at <anon>:3:9: 3:43
    }

    bb1: {
        StorageDead(_3);                 // scope 0 at <anon>:3:33: 3:42
        _0 = ();                         // scope 0 at <anon>:2:39: 4:6
        return;                          // scope 0 at <anon>:2:36: 4:6
    }
}

The problem here is that _2 and _3 overlap, and hence we have two copies of the Big Stack Array. But this doesn't seem wrong -- _3 is not dead until after the call to RefCell::new returns, and _2 is where the result needs to be stored, right? (This is more-or-less the motivation for emplacement in a nutshell, it seems to me?)

@dotdash

This comment has been minimized.

Copy link
Contributor

dotdash commented Oct 4, 2016

The problem is not with the closure, but with RefCell::new(). Old trans only had a single alloca for the array itself, used to pass the data on to UnsafeCell::new(), but MIR trans has 3 such allocas. One to pass the data on to UnsafeCell::new() and two which are used to initialize the value variable. The first is %arg0 used as intermediate storage for incoming arguments, the second is %value itself. Old trans just used to incoming pointer to represent value, allocating no extra stack space.

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Oct 4, 2016

For the record, the MIR for RefCell::new:

// MIR for `<cell::RefCell<T>>::new`
// node_id = 18741
// pass_name = PreTrans
// disambiguator = after

fn <cell::RefCell<T>>::new(_1: T) -> cell::RefCell<T> {
    let mut _0: cell::RefCell<T>;        // return pointer
    scope 1 {
        let _2: T;                       // "value" in scope 1 at ../../src/libcore/cell.rs:463:22: 463:27
    }
    let mut _3: cell::UnsafeCell<T>;
    let mut _4: ();
    let mut _5: T;
    let mut _6: cell::Cell<usize>;

    bb0: {
        StorageLive(_2);                 // scope 0 at ../../src/libcore/cell.rs:463:22: 463:27
        _2 = _1;                         // scope 0 at ../../src/libcore/cell.rs:463:22: 463:27
        StorageLive(_3);                 // scope 1 at ../../src/libcore/cell.rs:465:20: 465:42
        StorageLive(_5);                 // scope 1 at ../../src/libcore/cell.rs:465:36: 465:41
        _5 = _2;                         // scope 1 at ../../src/libcore/cell.rs:465:36: 465:41
        _3 = <cell::UnsafeCell<T>><T>::new(_5) -> [return: bb2, unwind: bb1]; // scope 1 at ../../src/libcore/cell.rs:465:20: 465:42
    }

    bb1: {
        resume;                          // scope 0 at ../../src/libcore/cell.rs:463:5: 468:6
    }

    bb2: {
        StorageLive(_6);                 // scope 1 at ../../src/libcore/cell.rs:466:21: 466:38
        _6 = <cell::Cell<T>><usize>::new(cell::UNUSED) -> [return: bb4, unwind: bb3]; // scope 1 at ../../src/libcore/cell.rs:466:21: 466:38
    }

    bb3: {
        drop(_3) -> bb1;                 // scope 1 at ../../src/libcore/cell.rs:465:20: 465:42
    }

    bb4: {
        _0 = cell::RefCell::<T> { borrow: _6, value: _3 }; // scope 1 at ../../src/libcore/cell.rs:464:9: 467:10
        StorageDead(_6);                 // scope 1 at ../../src/libcore/cell.rs:466:21: 466:38
        StorageDead(_3);                 // scope 1 at ../../src/libcore/cell.rs:465:20: 465:42
        StorageDead(_5);                 // scope 1 at ../../src/libcore/cell.rs:465:36: 465:41
        StorageDead(_2);                 // scope 0 at ../../src/libcore/cell.rs:463:22: 463:27
        return;                          // scope 1 at ../../src/libcore/cell.rs:463:5: 468:6
    }
}
@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Oct 4, 2016

@nikomatsakis This is why I want MIR optimizations to always run.
I believe @pcwalton's MIR passes already help this kind of situation, although AFAICT we also need to propagate the _0.value destination to the <cell::UnsafeCell<T>><T>::new(_5) call.

@brson

This comment has been minimized.

Copy link
Contributor

brson commented Oct 6, 2016

Closing in favor of #35408

@brson brson closed this Oct 6, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.