Fix issue #52475: Make loop detector only consider reachable memory #52626

Merged
merged 15 commits into rust-lang:master from brunocodutra:issue-52475 on Sep 6, 2018

Conversation

@brunocodutra
Contributor

brunocodutra commented Jul 22, 2018

As suggested by @oli-obk, alloc_ids should be ignored by traversing all Allocations in interpreter memory at a given moment in time, beginning with the ByRef locals in the stack.

  • Generalize the implementation of Hash for EvalSnapshot to traverse Allocations
  • Generalize the implementation of PartialEq for EvalSnapshot to traverse Allocations
  • Commit regression tests

Fixes #52626
Fixes #52849
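The approach the PR description outlines can be sketched with toy types (`AllocId`, `Allocation`, and `Memory` below are simplified stand-ins, not rustc's actual definitions): hash the contents of every allocation reachable from the stack's ByRef locals in a deterministic traversal order, and never hash the numeric ids themselves, so id churn cannot change the snapshot.

```rust
use std::collections::{HashMap, HashSet};
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-ins for the interpreter's types; illustrative only.
type AllocId = u64;

struct Allocation {
    bytes: Vec<u8>,
    relocations: Vec<AllocId>, // AllocIds stored inside this allocation
}

struct Memory {
    allocs: HashMap<AllocId, Allocation>,
}

/// Hash everything reachable from `roots` (think: the ByRef locals on the
/// stack), hashing allocation *contents* in traversal order and never the
/// numeric AllocIds, so that id churn does not change the hash.
fn hash_reachable(mem: &Memory, roots: &[AllocId]) -> u64 {
    let mut hasher = DefaultHasher::new();
    let mut stack: Vec<AllocId> = roots.iter().rev().copied().collect();
    let mut seen = HashSet::new();
    while let Some(id) = stack.pop() {
        if !seen.insert(id) {
            continue; // already hashed: also terminates on cyclic memory
        }
        if let Some(alloc) = mem.allocs.get(&id) {
            alloc.bytes.hash(&mut hasher);
            // Recurse into pointed-to allocations in a deterministic order.
            stack.extend(alloc.relocations.iter().rev().copied());
        }
    }
    hasher.finish()
}

fn main() {
    // Different AllocIds, same reachable contents: same hash.
    let m1 = Memory { allocs: HashMap::from([(7, Allocation { bytes: vec![5], relocations: vec![] })]) };
    let m2 = Memory { allocs: HashMap::from([(42, Allocation { bytes: vec![5], relocations: vec![] })]) };
    assert_eq!(hash_reachable(&m1, &[7]), hash_reachable(&m2, &[42]));
    // A self-referential allocation does not loop the traversal.
    let m3 = Memory { allocs: HashMap::from([(1, Allocation { bytes: vec![0], relocations: vec![1] })]) };
    let _ = hash_reachable(&m3, &[1]);
    println!("ok");
}
```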

@rust-highfive
Collaborator

rust-highfive commented Jul 22, 2018

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @petrochenkov (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@oli-obk
Contributor

oli-obk commented Jul 22, 2018

I'll have a more detailed look later. Just wondering right now if we can reuse the StableHash stuff of incremental. Seems like we're trying to solve similar issues.

@brunocodutra
Contributor

brunocodutra commented Jul 22, 2018

Just wondering right now if we can reuse the StableHash stuff of incremental.

I'm not sure I follow, but reuse sounds good.

@oli-obk
Contributor

oli-obk commented Jul 22, 2018

Hmm, alloc_id_recursion_tracker should be removed... no one uses it.

So what I meant was reusing

impl<'a> HashStable<StableHashingContext<'a>> for mir::interpret::AllocId {

@ecstatic-morse

I think this would be clearer if we moved this subset of fields into its own struct within EvalContext. There's a commit that did this that I reverted in the original PR because those fields were public and I had to change every use of them.

@petrochenkov
Contributor

petrochenkov commented Jul 23, 2018

@brunocodutra
Contributor

brunocodutra commented Jul 23, 2018

@ecstatic-morse that's a great point; in fact, my idea was to crawl the transitive allocations in advance as part of constructing EvalContextRef, so that Hash and Eq are trivial.

EDIT: actually I think @oli-obk might be onto something better with HashStable.

@brunocodutra
Contributor

brunocodutra commented Jul 24, 2018

My understanding of the internals of MIRI is still flaky, so please bear with me if I'm talking nonsense, but after some testing, I'm growing skeptical that the exact example given in #52475 (which I reproduce below) can actually be detected, not because of AllocIds changing, but because the hidden local variables created by &5 seem to imply Frame::locals is an ever growing list. In fact, if I ignore locals that are not ByRef in my implementation, it does detect this case as an infinite loop.

#![feature(const_fn, const_let)]

const fn churn_alloc_id() -> usize {
    let mut x: &i32 = &5;
    loop {
        x = &5;
    }
    0
}

fn main() {
    let _ = [(); churn_alloc_id()];
}

Does that make sense?
If so, can anyone come up with an example where AllocIds keep changing, but the number of locals remain constant?


@oli-obk
Contributor

oli-obk commented Jul 24, 2018

the hidden local variables created by &5 seem to imply Frame::locals is an ever growing list.

Nope, rustc allocates a single variable and just reuses it.

If you look at the MIR of that function, you can see that it's even more extreme, because rustc generates a hidden static for the &5:

    let mut _0: usize;                   // return place
    scope 1 {
        let mut _1: &i32;                // "x" in scope 1 at src/main.rs:4:9: 4:14
    }
    scope 2 {
    }
    let mut _2: &i32;
    let mut _3: &i32;

    bb0: {                              
        StorageLive(_1);                 // bb0[0]: scope 0 at src/main.rs:4:9: 4:14
        StorageLive(_2);                 // bb0[1]: scope 0 at src/main.rs:4:23: 4:25
        _2 = promoted[1];                // bb0[2]: scope 0 at src/main.rs:4:23: 4:25
                                         // mir::Constant
                                         // + span: src/main.rs:4:23: 4:25
                                         // + ty: &i32
                                         // + literal: promoted[1]
        _1 = _2;                         // bb0[3]: scope 0 at src/main.rs:4:23: 4:25
        StorageDead(_2);                 // bb0[4]: scope 0 at src/main.rs:4:25: 4:26
        goto -> bb1;                     // bb0[5]: scope 1 at src/main.rs:5:5: 7:6
    }

    bb1: {                              
        StorageLive(_3);                 // bb1[0]: scope 1 at src/main.rs:6:13: 6:15
        _3 = promoted[0];                // bb1[1]: scope 1 at src/main.rs:6:13: 6:15
                                         // mir::Constant
                                         // + span: src/main.rs:6:13: 6:15
                                         // + ty: &i32
                                         // + literal: promoted[0]
        _1 = _3;                         // bb1[2]: scope 1 at src/main.rs:6:9: 6:15
        StorageDead(_3);                 // bb1[3]: scope 1 at src/main.rs:6:15: 6:16
        goto -> bb1;                     // bb1[4]: scope 1 at src/main.rs:5:5: 7:6
    }

If you change the code to

    let x = 5;
    let mut x: *const u32 = &x;
    loop {
        let y = 5;
        x = &y;
    }
    0

you get

    let mut _0: usize;                   // return place
    scope 1 {
        let _1: u32;                     // "x" in scope 1 at src/main.rs:4:9: 4:10
        scope 3 {
            let mut _2: *const u32;      // "x" in scope 3 at src/main.rs:5:9: 5:14
            scope 5 {
                let _6: u32;             // "y" in scope 5 at src/main.rs:7:13: 7:14
            }
            scope 6 {
            }
        }
        scope 4 {
        }
    }
    scope 2 {
    }
    let mut _3: *const u32;
    let mut _4: &u32;
    let mut _5: &u32;
    let mut _7: *const u32;
    let mut _8: &u32;
    let mut _9: &u32;

    bb0: {                              
        StorageLive(_1);                 // bb0[0]: scope 0 at src/main.rs:4:9: 4:10
        _1 = const 5u32;                 // bb0[1]: scope 0 at src/main.rs:4:13: 4:14
                                         // ty::Const
                                         // + ty: u32
                                         // + val: Value(ByVal(Bytes(5)))
                                         // mir::Constant
                                         // + span: src/main.rs:4:13: 4:14
                                         // + ty: u32
                                         // + literal: const 5u32
        StorageLive(_2);                 // bb0[2]: scope 1 at src/main.rs:5:9: 5:14
        StorageLive(_3);                 // bb0[3]: scope 1 at src/main.rs:5:29: 5:31
        StorageLive(_4);                 // bb0[4]: scope 1 at src/main.rs:5:29: 5:31
        StorageLive(_5);                 // bb0[5]: scope 1 at src/main.rs:5:29: 5:31
        _5 = &_1;                        // bb0[6]: scope 1 at src/main.rs:5:29: 5:31
        _4 = _5;                         // bb0[7]: scope 1 at src/main.rs:5:29: 5:31
        _3 = move _4 as *const u32 (Misc); // bb0[8]: scope 1 at src/main.rs:5:29: 5:31
        _2 = move _3;                    // bb0[9]: scope 1 at src/main.rs:5:29: 5:31
        StorageDead(_3);                 // bb0[10]: scope 1 at src/main.rs:5:30: 5:31
        StorageDead(_4);                 // bb0[11]: scope 1 at src/main.rs:5:30: 5:31
        StorageDead(_5);                 // bb0[12]: scope 1 at src/main.rs:5:31: 5:32
        goto -> bb1;                     // bb0[13]: scope 3 at src/main.rs:6:5: 9:6
    }

    bb1: {                              
        StorageLive(_6);                 // bb1[0]: scope 3 at src/main.rs:7:13: 7:14
        _6 = const 5u32;                 // bb1[1]: scope 3 at src/main.rs:7:17: 7:18
                                         // ty::Const
                                         // + ty: u32
                                         // + val: Value(ByVal(Bytes(5)))
                                         // mir::Constant
                                         // + span: src/main.rs:7:17: 7:18
                                         // + ty: u32
                                         // + literal: const 5u32
        StorageLive(_7);                 // bb1[2]: scope 5 at src/main.rs:8:13: 8:15
        StorageLive(_8);                 // bb1[3]: scope 5 at src/main.rs:8:13: 8:15
        StorageLive(_9);                 // bb1[4]: scope 5 at src/main.rs:8:13: 8:15
        _9 = &_6;                        // bb1[5]: scope 5 at src/main.rs:8:13: 8:15
        _8 = _9;                         // bb1[6]: scope 5 at src/main.rs:8:13: 8:15
        _7 = move _8 as *const u32 (Misc); // bb1[7]: scope 5 at src/main.rs:8:13: 8:15
        StorageDead(_8);                 // bb1[8]: scope 5 at src/main.rs:8:14: 8:15
        _2 = move _7;                    // bb1[9]: scope 5 at src/main.rs:8:9: 8:15
        StorageDead(_7);                 // bb1[10]: scope 5 at src/main.rs:8:14: 8:15
        StorageDead(_9);                 // bb1[11]: scope 5 at src/main.rs:8:15: 8:16
        StorageDead(_6);                 // bb1[12]: scope 3 at src/main.rs:9:5: 9:6
        goto -> bb1;                     // bb1[13]: scope 3 at src/main.rs:6:5: 9:6
    }

Here you can see that it keeps taking the address of _6 in bb1. There aren't infinitely many locals, just infinitely many allocations, because each StorageDead(_6) deletes the allocation and each StorageLive(_6) creates a new one.
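The effect described above can be reduced to a tiny simulation (all names here are illustrative, none of this is rustc code): each iteration "StorageDead"s the old allocation and "StorageLive"s a new one, so a detector keyed on raw AllocIds never sees a repeated state, while one keyed on allocation contents detects the loop on the second iteration.

```rust
use std::collections::HashSet;

/// Returns the first iteration at which a content-based snapshot repeats,
/// or None within `max_steps`; an id-based snapshot never repeats.
fn detect_loop(max_steps: u64) -> Option<u64> {
    let mut next_id: u64 = 0;
    let mut id_snapshots = HashSet::new();
    let mut content_snapshots = HashSet::new();
    for step in 0..max_steps {
        next_id += 1; // StorageLive(_6): a fresh AllocId every time
        let contents: Vec<u8> = 5u32.to_ne_bytes().to_vec(); // _6 = const 5u32
        assert!(id_snapshots.insert(next_id)); // raw ids never recur
        if !content_snapshots.insert(contents) {
            return Some(step); // same reachable contents seen before
        }
        // StorageDead(_6): the allocation (and its id) disappears here
    }
    None
}

fn main() {
    // The content-based snapshot repeats on the second pass through the loop.
    assert_eq!(detect_loop(10), Some(1));
    println!("loop detected at iteration {:?}", detect_loop(10));
}
```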


@brunocodutra
Contributor

brunocodutra commented Jul 24, 2018

@oli-obk
Contributor

oli-obk commented Jul 24, 2018

Most likely because ByVal can mean Value::Scalar(Scalar::Ptr(ptr)), which in turn can again include an AllocId.
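The point can be illustrated with simplified stand-ins for the interpreter's Value and Scalar (these definitions are toys, not the real ones): an AllocId can hide inside a by-value scalar, not only behind a ByRef local, so a traversal that only follows ByRef misses pointers held by value.

```rust
// Illustrative shapes loosely following the interpreter's Value/Scalar.
type AllocId = u64;

enum Scalar {
    Bits(u128),
    Ptr(AllocId),
}

enum Value {
    ByRef(AllocId),
    ByVal(Scalar),
    ByValPair(Scalar, Scalar),
}

/// Collect every AllocId directly reachable from a local's value, so a
/// snapshot can follow them into the allocations they name.
fn alloc_ids(v: &Value) -> Vec<AllocId> {
    fn scan(s: &Scalar, out: &mut Vec<AllocId>) {
        if let Scalar::Ptr(id) = s {
            out.push(*id);
        }
    }
    let mut out = Vec::new();
    match v {
        Value::ByRef(id) => out.push(*id),
        Value::ByVal(s) => scan(s, &mut out),
        Value::ByValPair(a, b) => {
            scan(a, &mut out);
            scan(b, &mut out);
        }
    }
    out
}

fn main() {
    // A by-value scalar that is itself a pointer still carries an AllocId.
    assert_eq!(alloc_ids(&Value::ByVal(Scalar::Ptr(3))), vec![3]);
    assert_eq!(alloc_ids(&Value::ByVal(Scalar::Bits(5))), vec![]);
    assert_eq!(alloc_ids(&Value::ByRef(9)), vec![9]);
    println!("ok");
}
```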

@brunocodutra
Contributor

brunocodutra commented Jul 24, 2018

Sorry, fat fingered the wrong button on my phone.

@brunocodutra brunocodutra reopened this Jul 24, 2018

@brunocodutra
Contributor

brunocodutra commented Jul 24, 2018

Most likely because ByVal can mean Value::Scalar(Scalar::Ptr(ptr)) which in turn can again include an AllocId

Makes total sense, thanks.

Edit: wait, I think I misunderstood this; isn't what you're saying exactly what line 172 covers?

@brunocodutra
Contributor

brunocodutra commented Jul 24, 2018

Oh I see now, ByRef vs ByVal, right, totally missed that.

@oli-obk

I think this would be clearer if we moved this subset of fields into its own struct within EvalContext.

I think we should definitely do this. The amount of handwritten equality code is scary. It looks like we'll accidentally break it in the future.
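The refactoring being discussed might look roughly like the sketch below (all field names are hypothetical, chosen only for illustration): gather the fields the loop detector compares into one struct, so PartialEq and Hash can be derived instead of handwritten, and a newly added field cannot be silently forgotten by the comparison code.

```rust
// Hypothetical snapshot struct: everything that defines "same interpreter
// state" lives here, and equality/hashing are derived, not handwritten.
#[derive(PartialEq, Eq, Hash, Clone)]
struct EvalSnapshot {
    stack_depth: usize,
    memory_fingerprint: u64,
    // ...every other field the detector should compare...
}

struct EvalContext {
    snapshot: EvalSnapshot, // compared wholesale by the loop detector
    steps_executed: u64,    // bookkeeping deliberately excluded
}

fn same_state(a: &EvalContext, b: &EvalContext) -> bool {
    a.snapshot == b.snapshot // one derived comparison, no field list to maintain
}

fn main() {
    let s = EvalSnapshot { stack_depth: 1, memory_fingerprint: 0xABCD };
    let a = EvalContext { snapshot: s.clone(), steps_executed: 10 };
    let b = EvalContext { snapshot: s, steps_executed: 999 };
    assert!(same_state(&a, &b)); // bookkeeping differences don't matter
    println!("ok");
}
```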

@brunocodutra
Contributor

brunocodutra commented Jul 30, 2018

The amount of handwritten equality code is scary

Indeed. I still haven't removed the [WIP] notice because I'm still working out a way to simplify this algorithm, which is proving harder than I first envisioned.

My current idea is to resolve all Allocations reachable through locals in the stack into some custom structure that would be trivial to compare; however, I'm kinda caught up on the distinction between the allocations map in TyCtxt and Memory. The implementation of HashStable uses the former, whereas the original infinite loop detection algorithm uses the latter. Which one should I use?

@brunocodutra
Contributor

brunocodutra commented Jul 30, 2018

Also, the implementation currently pushed is broken, because it doesn't actually clone the Allocations.

@oli-obk
Contributor

oli-obk commented Jul 30, 2018

The implementation of HashStable uses the former, whereas the original infinite loop detection algorithm uses the latter. Which one should I use?

Oh no. I forgot about this.

Most of the impls are on StableHashingContext instead of on a generic argument. I wonder how many of these impls actually need the context and could be generic instead.

If we can make the impls generic, then we can just use HashStable<Memory>.

@brunocodutra
Contributor

brunocodutra commented Jul 30, 2018

If we can make the impls generic, then we can just use HashStable<Memory>

I'm not sure this would be so straightforward: the allocation map seems to come from a global variable rather than from the hashing context; that is, StableHashingContext doesn't seem to require a way to resolve Allocations to be provided.

Based on that observation, my current implementation also resolves allocations using the global context, as opposed to Memory::alloc_map, but I wonder whether this is correct. What's the difference between these two mappings?

@oli-obk
Contributor

oli-obk commented Jul 30, 2018
The global context allocations are statics and constants, while the ones in the EvalContext are for computing the current static/constant.

cc @michaelwoerister: do you think we can tune the HashStable impls by making all the HashStable<StableHashingContext> impls be HashStable<T> with T: HashingContext, which exposes a bunch of useful functions

trait HashingContext {
    fn with_allocation<T>(&self, f: impl FnOnce(&Allocation) -> T) -> T;
}

that allow reusing that code?

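The generalization floated above might look like the following sketch, with toy types standing in for the real ones; note that the `id` parameter is an addition of this sketch (the trait quoted above elides it), and the struct names are hypothetical. Code written once against the trait then works with either the global allocation map or the interpreter's Memory.

```rust
use std::collections::HashMap;

type AllocId = u64;
type Allocation = Vec<u8>; // toy: real Allocations carry bytes + relocations

trait HashingContext {
    fn with_allocation<T>(&self, id: AllocId, f: impl FnOnce(&Allocation) -> T) -> T;
}

struct GlobalAllocMap(HashMap<AllocId, Allocation>); // statics and constants
struct Memory(HashMap<AllocId, Allocation>);         // the evaluation in progress

impl HashingContext for GlobalAllocMap {
    fn with_allocation<T>(&self, id: AllocId, f: impl FnOnce(&Allocation) -> T) -> T {
        f(&self.0[&id])
    }
}

impl HashingContext for Memory {
    fn with_allocation<T>(&self, id: AllocId, f: impl FnOnce(&Allocation) -> T) -> T {
        f(&self.0[&id])
    }
}

/// Written once against the trait, usable with either allocation source.
fn alloc_len(ctx: &impl HashingContext, id: AllocId) -> usize {
    ctx.with_allocation(id, |a| a.len())
}

fn main() {
    let global = GlobalAllocMap(HashMap::from([(1, vec![0u8; 4])]));
    let mem = Memory(HashMap::from([(1, vec![0u8; 8])]));
    assert_eq!(alloc_len(&global, 1), 4);
    assert_eq!(alloc_len(&mem, 1), 8);
    println!("ok");
}
```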

@brunocodutra
Contributor

brunocodutra commented Jul 30, 2018

The global context allocations are statics and constants, while the ones in the EvalContext are for computing the current static/constant.

I see, makes sense; it turns out we need to use both then.

This got me thinking: since the current infinite loop detection algorithm does not snapshot the static memory, I wonder whether it would produce false positives for loops that only change statics. I'll add a regression test for that too.

EDIT: I think I misread this the first time; what do you mean by current static/constant?

@brunocodutra
Contributor

brunocodutra commented Jul 30, 2018

I reset the branch to a state I'm reasonably confident is correct. Now I really need help understanding what exactly needs to be done moving forward with the implementation of Eq.

To summarize my understanding: all AllocIds need to be resolved to Allocations through Memory::get(AllocId), starting from Frame::locals. AllocIds can be found both in Scalar::Ptrs as well as in Allocation::relocations. This is a different process from that used to compute HashStable, because the latter currently resolves AllocIds to Allocations through the global TyCtxt::alloc_map.

Open questions:

  1. Is the implementation of Eq above correct?
  2. Even though HashStable relies on a different mapping from AllocId to Allocation, is it sufficient for the requirement that x == y implies hash_stable(x) == hash_stable(y), so that generalizing StableHashingContext to make it possible for Memory to implement it may be tracked by a separate issue?
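The equality summarized above can be sketched as a paired worklist traversal over toy types (the real implementation may differ): two memories compare equal when walks from corresponding roots find allocations with equal bytes, pairing AllocIds up as they are discovered. The `visited` map keeps that pairing consistent and also stops the walk if memory ever contains cycles.

```rust
use std::collections::HashMap;

type AllocId = u64;

struct Allocation {
    bytes: Vec<u8>,
    relocations: Vec<AllocId>, // AllocIds found inside this allocation
}

struct Memory(HashMap<AllocId, Allocation>);

fn reachable_eq(a: &Memory, b: &Memory, roots: &[(AllocId, AllocId)]) -> bool {
    let mut visited: HashMap<AllocId, AllocId> = HashMap::new();
    let mut work: Vec<(AllocId, AllocId)> = roots.to_vec();
    while let Some((ia, ib)) = work.pop() {
        if let Some(&paired) = visited.get(&ia) {
            if paired != ib {
                return false; // one id would have to map to two different ids
            }
            continue; // already checked: terminates on cyclic memory
        }
        visited.insert(ia, ib);
        match (a.0.get(&ia), b.0.get(&ib)) {
            (Some(x), Some(y)) if x.bytes == y.bytes
                && x.relocations.len() == y.relocations.len() =>
            {
                // Pair up the pointed-to allocations and keep walking.
                work.extend(x.relocations.iter().copied().zip(y.relocations.iter().copied()));
            }
            _ => return false,
        }
    }
    true
}

fn main() {
    // Same shape and contents, different AllocIds (each points at itself): equal.
    let m1 = Memory(HashMap::from([(1, Allocation { bytes: vec![5], relocations: vec![1] })]));
    let m2 = Memory(HashMap::from([(9, Allocation { bytes: vec![5], relocations: vec![9] })]));
    assert!(reachable_eq(&m1, &m2, &[(1, 9)])); // terminates despite the cycle
    // Different contents: not equal.
    let m3 = Memory(HashMap::from([(3, Allocation { bytes: vec![6], relocations: vec![] })]));
    assert!(!reachable_eq(&m1, &m3, &[(1, 3)]));
    println!("ok");
}
```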
@oli-obk
Contributor

oli-obk commented Sep 6, 2018

@bors r+

@bors
Contributor

bors commented Sep 6, 2018

📌 Commit 05cdf8d has been approved by oli-obk

@bors
Contributor

bors commented Sep 6, 2018

⌛️ Testing commit 05cdf8d with merge c318691...

bors added a commit that referenced this pull request Sep 6, 2018

Auto merge of #52626 - brunocodutra:issue-52475, r=oli-obk
Fix issue #52475: Make loop detector only consider reachable memory

As [suggested](#51702 (comment)) by @oli-obk, `alloc_id`s should be ignored by traversing all `Allocation`s in interpreter memory at a given moment in time, beginning with the `ByRef` locals in the stack.

- [x] Generalize the implementation of `Hash` for `EvalSnapshot` to traverse `Allocation`s
- [x] Generalize the implementation of `PartialEq` for `EvalSnapshot` to traverse `Allocation`s
- [x] Commit regression tests

Fixes #52626
Fixes #52849
@bors
Contributor

bors commented Sep 6, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: oli-obk
Pushing c318691 to master...

@bors bors merged commit 05cdf8d into rust-lang:master Sep 6, 2018

2 checks passed

continuous-integration/travis-ci/pr: The Travis CI build passed
homu: Test successful

@brunocodutra brunocodutra deleted the brunocodutra:issue-52475 branch Sep 6, 2018

@RalfJung
Member

RalfJung commented Sep 19, 2018

@brunocodutra @oli-obk I must be missing something... I am staring at this code and trying to understand how it avoids going in endless cycles when there are cycles in memory. However, I cannot see anything here that would stop copying when we reach an allocation that has already been copied. Has that been overlooked, or am I just missing the place where the magic is happening?
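For context, the standard guard against this is a visited map: the first time an AllocId is reached it is assigned a stable index, and any later reference to it reuses that index instead of descending again. A sketch with hypothetical simplified types (not the actual interpreter types) showing that this terminates even on a two-allocation cycle:

```rust
use std::collections::{BTreeMap, HashMap};

type AllocId = u64;

// Hypothetical allocation: some bytes plus outgoing pointers.
struct Allocation {
    bytes: Vec<u8>,
    relocations: Vec<AllocId>,
}

// Renumber reachable allocations by first-visit order. The `seen` map is
// what prevents both infinite recursion on cycles and repeated copying
// of shared allocations.
fn renumber(
    id: AllocId,
    memory: &BTreeMap<AllocId, Allocation>,
    seen: &mut HashMap<AllocId, usize>,
    out: &mut Vec<(Vec<u8>, Vec<usize>)>,
) -> usize {
    if let Some(&idx) = seen.get(&id) {
        return idx; // already snapshotted: reuse its index
    }
    let idx = out.len();
    seen.insert(id, idx);
    out.push((memory[&id].bytes.clone(), Vec::new()));
    let mut relocs = Vec::new();
    for &target in &memory[&id].relocations {
        relocs.push(renumber(target, memory, seen, out));
    }
    out[idx].1 = relocs;
    idx
}

fn main() {
    // A two-allocation cycle: 10 -> 20 -> 10.
    let mut memory = BTreeMap::new();
    memory.insert(10, Allocation { bytes: vec![1], relocations: vec![20] });
    memory.insert(20, Allocation { bytes: vec![2], relocations: vec![10] });

    let mut seen = HashMap::new();
    let mut out = Vec::new();
    renumber(10, &memory, &mut seen, &mut out);

    // Terminates, and each allocation appears exactly once.
    assert_eq!(out.len(), 2);
    assert_eq!(out[0].1, vec![1]); // alloc 10's pointer refers to index 1
    assert_eq!(out[1].1, vec![0]); // alloc 20 points back to index 0
    println!("ok");
}
```

The discussion below is about whether the PR as merged actually contains such a guard.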

@brunocodutra
Contributor

brunocodutra commented Sep 20, 2018

Does stable hashing avoid cycles? I thought it did not, and concluded one couldn't form cycles of allocations, at least not in const fn. Can you come up with a counterexample that hangs the compiler?

@RalfJung
Member

RalfJung commented Sep 20, 2018

How would stable hashing avoid cycles...? I don't understand how it would even begin to have the infrastructure and state for that.^^

    Can you come up with a counterexample that hangs the compiler?

I am sure this will be easy once we get a few more features stabilized. Still trying to get one with the current restricted set.

Another thing: why does EvalSnapshot::new copy everything with clone, and then later use a ridiculously expensive implementation of PartialEq::eq to do the comparison? Wouldn't it be much better to do the work of EvalSnapshot::snapshot just once, in EvalSnapshot::new?
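The alternative design being suggested is to compute the canonical, id-independent form once at capture time, so that later equality checks and hashing are cheap derived comparisons of precomputed data rather than a re-walk of live interpreter memory. A sketch under hypothetical simplified types:

```rust
// Hypothetical canonical state: whatever id-independent form the
// snapshot traversal produces (here just a vector of byte blobs).
#[derive(PartialEq, Eq, Hash)]
struct EvalSnapshot {
    canonical: Vec<Vec<u8>>,
}

impl EvalSnapshot {
    // Do the expensive canonicalization exactly once, at capture time...
    fn new(live_allocs: &[Vec<u8>]) -> Self {
        EvalSnapshot { canonical: live_allocs.to_vec() }
    }
}

fn main() {
    let state = vec![vec![1, 2], vec![3]];
    let a = EvalSnapshot::new(&state);
    let b = EvalSnapshot::new(&state);
    // ...so that later comparisons are cheap derived equality instead of
    // re-canonicalizing the live state on every loop-detector probe.
    assert!(a == b);
    println!("ok");
}
```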

@brunocodutra
Contributor

brunocodutra commented Sep 20, 2018

I refer to stable hashing because not only is it part of loop detection, it is also pretty much how the snapshot itself is implemented.

As for the other question, I can't think of a good reason besides me being obtuse. I'll revisit this PR later this evening with a fresh mind and a much better understanding of the language - this PR was the very first thing I ever wrote in Rust.

@RalfJung
Member

RalfJung commented Sep 20, 2018

    I refer to stable hashing because not only is it part of loop detection, it is also pretty much how the snapshot itself is implemented.

Sure, but it has nothing to do with detecting cycles.^^

And actually, it's not just cycles. Currently this duplicates allocations when they are referenced multiple times. So imagine

let x = 0;
let x = (&x, &x);
let x = (&x, &x);
let x = (&x, &x);
let x = (&x, &x);
let x = (&x, &x);
let x = (&x, &x);
let x = (&x, &x);
let x = (&x, &x);

This will duplicate the original allocation (just containing 0) 2^n times. So, there definitely needs to be a deduplication. I will open an issue.

    As for the other question, I can't think of a good reason besides me being obtuse. I'll revisit this PR later this evening with a fresh mind and a much better understanding of the language - this PR was the very first thing I ever wrote in Rust.

That's okay. :) I was just wondering if there is maybe something I missed. I am touching the snapshot stuff in #54380, and I will see if I can change it to snapshot earlier and not copy all of memory. I won't do any deduplication though.
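The blowup described above is easy to quantify in a toy model (hypothetical types, not rustc's): each `let x = (&x, &x);` adds one allocation holding two pointers to the previous one, so a naive deep copy visits the innermost allocation 2^n times, while a memoized copy visits each allocation exactly once.

```rust
use std::collections::{BTreeMap, HashSet};

type AllocId = u64;

// Each allocation holds just its outgoing pointers; bytes are irrelevant here.
fn build_chain(n: u64) -> (BTreeMap<AllocId, Vec<AllocId>>, AllocId) {
    let mut memory = BTreeMap::new();
    memory.insert(0, Vec::new()); // `let x = 0;`
    for i in 1..=n {
        memory.insert(i, vec![i - 1, i - 1]); // `let x = (&x, &x);`
    }
    (memory, n)
}

// Naive deep copy: counts every visit, shared allocations included.
fn visits_naive(memory: &BTreeMap<AllocId, Vec<AllocId>>, id: AllocId) -> u64 {
    1 + memory[&id].iter().map(|&t| visits_naive(memory, t)).sum::<u64>()
}

// Memoized copy: each allocation is visited at most once.
fn visits_memoized(
    memory: &BTreeMap<AllocId, Vec<AllocId>>,
    id: AllocId,
    seen: &mut HashSet<AllocId>,
) -> u64 {
    if !seen.insert(id) {
        return 0; // already copied: share, don't duplicate
    }
    let mut count = 1;
    for &t in &memory[&id].clone() {
        count += visits_memoized(memory, t, seen);
    }
    count
}

fn main() {
    let (memory, root) = build_chain(8);
    // Without deduplication: 2^9 - 1 = 511 visits for 9 allocations.
    assert_eq!(visits_naive(&memory, root), 511);
    // With deduplication: each of the 9 allocations is copied once.
    let mut seen = HashSet::new();
    assert_eq!(visits_memoized(&memory, root, &mut seen), 9);
    println!("ok");
}
```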

@RalfJung
Member

RalfJung commented Sep 20, 2018

Hm, actually stable hashing does something with tcx. So maybe it has sufficient state to deduplicate something? I don't see where and how this interacts with the snapshotting, TBH. At the very least, an extensive comment should be added, answering questions like "why is AllocIdSnapshot an Option?"

@RalfJung
Member

RalfJung commented Sep 20, 2018

I give up. I tried making a snapshot early, but I cannot comprehend what this code does, and I really should get actual work done.^^

The problem I ran into is that the snapshot isn't actually a snapshot. It's not a deep copy; it's full of references to... somewhere. This definitely needs lots of documentation; right now, the code is incomprehensible as far as I am concerned.

@RalfJung
Member

RalfJung commented Sep 20, 2018

I opened #54384 to track the lack of documentation. It would be great if you could take care of that @brunoabinader while the memory of this PR is still fresh in your mind :)

And sorry if I came across as too complaining. I wanted to get some other work done and the loop detector made that much more complicated, so I wanted to quickly fix it to be simpler, and got frustrated when that didn't work and I couldn't figure out why. I shouldn't have loaded that frustration on you, though. I am happy for your contributions!

@brunocodutra
Contributor

brunocodutra commented Sep 20, 2018