Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix closure upvar soundness bug in regionck #17869

Merged
merged 4 commits into from
Oct 17, 2014

Conversation

bkoropoff
Copy link
Contributor

This PR is based on #17784, which fixes closure soundness problems in borrowck. Only the last two commits are unique to this PR.

My understanding of regionck is still evolving, so I'm not sure if this is the right approach. Feedback is appreciated.

  • In link_reborrowed_region, we account for the ability of upvars to
    change their mutability due to later processing. A map of recursive
    region links we may want to establish in the future is maintained,
    with the links being established when the mutability of the borrow
    is adjusted.
  • When asked to establish a region link for an upvar, we link it to
    the region of the closure body. This creates the necessary
    constraint to stop unsound reborrows from the closure environment.

This partially (maybe completely) solves issue #17403. Remaining work:

  • This is only known to help with by-ref upvars. I have not looked at
    by-value upvars yet to see if they can cause problems.
  • The error diagnostics that result from failed region inference are
    pretty inscrutible.

@alexcrichton
Copy link
Member

@nikomatsakis, @pnkfelix, @pcwalton, would one of you be up for review?

@nikomatsakis
Copy link
Contributor

I'll take this.

@bkoropoff
Copy link
Contributor Author

Rebased, fixed tidy error, and cleaned up some comments and commit messages. Hopefully my thinking is more clear now.

@nikomatsakis
Copy link
Contributor

So, I read this over in some detail. I think it's not quite working as I envision. Here is a summary of how I was thinking it ought to work. The basic idea is to have mem-categorization model as closely as possible the desugared version.

I think the fundamental problem is that the code now attempts to classify move || as "copied upvars", like proc does, but that's wrong. This was ok for proc because it was always effectively FnOnce. But that's not the case for unboxed closures.

I think the result of mem-categorization needs to depend both on whether the closure is Fn/FnMut/FnOnce and whether the closure is escaping or not. Terminology note: I will call a ||
closure "non-escaping" and a move || closure "escaping". This is because a move|| closure is the kind that are intended to escape the stack frame that created them.

We use the Fn vs FnMut etc distinction as follows:

  1. In a once closure, we can ignore the deref and just consider escaping vs non-escaping.
  2. In a &closure, we want to model it as a &deref
  3. In a &mut closure, we want to model it as a &mut closure

We use the escaping vs non-escaping closure distinction as follows:

  1. Escaping (move): when capturing a variable of type T, you get a copied upvar of type T
  2. Non-escaping (non move): when capturing a variable of type T, you get a deref through a pointer of type &M T, where M is the inferred mutability. (For now, I am ignoring the idea of a non-escaping once closure, since we don't support those in the inference system yet.)

So if we have a non-escaping FnMut closure, and it infers (say) a mutable borrow, then we want &'a mut &'b mut T, where T is the type of the variable. Here 'a would be an anonymous, free lifetime bound on the closure body, and 'b would be the lifetime bound of the (non-escaping) closure.

For an escaping closure (move), then we would generate mem-categorize the upvar to &'a mut T, where 'a is an anonymous, free lifetime bound on the closure body.

Modulo #3696, this should be roughly what the desugared form would do.

Does this make sense?

@nikomatsakis
Copy link
Contributor

Sorry for taking a few days to get to this. I had to clear an hour or so where I could read and think uninterrupted, which proved hard! I'll try to respond much faster now that I've thought it over some (and/or catch me on IRC).

@bkoropoff
Copy link
Contributor Author

I think I understand. What I've essentially done is make regionck treat cat_upvar as if it included these implicit derefs. What you'd like to see is these implicit derefs reflected literally in the returned categorization, and then the correct behavior should fall out of that.

It seems easier to summarize with a table. Consider an upvar with type T, an anonymous free region in the closure body 'free, and a region of the closure expression 'cl. Let bk mean the inferred borrow kind. Please excuse the somewhat ad-hoc notation for dereferencing through a reference type.

move ref
Fn * &'free T ** &'free &'cl bk T
FnMut * &'free mut T ** &'free mut &'cl bk T
FnOnce T * &'cl bk T
stack N/A ** &'free mut &'cl bk T
proc T N/A

It seems like cat_upvar and cat_copied_upvar could be combined in this system since any interesting difference between them is captured in the surrounding derefs. Does this all seem reasonable?

What are your thoughts on the "maybe link" map I added? It seems intuitively incorrect to neglect to establish a region link because the upvar had not yet been inferred as mutable at the time the expression was processed. I've definitely seen it get exercised when compiling the standard libs. I haven't come up with an example yet where not using it causes things to "go wrong", though.

@bkoropoff bkoropoff changed the title Attempt to fix closure upvar soundness bug in regionck Fix closure upvar soundness bug in regionck Oct 12, 2014
@bkoropoff
Copy link
Contributor Author

I reworked the PR to mem-categorize upvars appropriately and fix the resulting fallout. I think this is generally in line with the strategy you envisioned. There are a few details that still need some attention:

Region of by-ref upvars

You had said that the by-ref upvar should have a region which is the lifetime bound of the closure. I tried making this change, but I think the existing behavior is actually correct. When the closure expression is processed, an inference variable is instantiated for each upvar, the link closure_region <= freevar_region is created, and freevar_region is stored in the upvar borrow map. Within the closure body, mentions of the upvar are categorized as a deref of a pointer with freevar_region. If there are two upvars, foo and bar, this is analagous to desugaring the closure to a structure like:

struct Closure<'foo,'bar> {
    foo: &'foo Foo,
    bar: &'bar Bar
}

If I change it to instead use closure_region when categorizing the upvar, the two regions are forced to be unified together, and the resulting analagous desugaring is unnecessarily restrictive:

struct Closure<'closure> {
    foo: &'closure Foo,
    bar: &'closure Bar
}

This actually caused a build failure in rustc itself in lookup_struct_fields in ty.rs:

        let struct_fields = cx.struct_fields.borrow();
        let mut results: SmallVector<&[field_ty]> = SmallVector::zero();
        each_super_struct(cx, did, |s| {
            match struct_fields.find(&s) {
                Some(fields) => results.push(fields.as_slice()),
                _ => {
                    cx.sess.bug(
                        format!("ID not mapped to struct fields: {}",
                                cx.map.node_to_string(did.node)).as_slice());
                }
            }
        });

        let len = results.as_slice().iter().map(|x| x.len()).sum();

struct_fields and results are both captured by the closure. The closure then pushes references with a lifetime bounded by that of the struct_fields borrow into results, causing the borrow of struct_fields to persist until the end of the block instead of just the lifetime of the stack closure. If the lifetimes of the two upvar borrow are forced to be unified, this causes the borrow of results to also persist to the end of the block, causing a borrow conflict at results.as_slice().

Generating an anonymous free region

I initially tried to create the anonymous free region as a BrAnon, but this required doing so within astconv and storing it in the closure type in order to ensure the region was unique with respect to other closure parameters. This ended up causing a lot of code churn in metadata encoding, etc. I opted instead to introduce a special BrEnv variant for this purpose, which had a much more limited impact. I'm not sure if this was the right decision.

@nikomatsakis
Copy link
Contributor

@bkoropoff this is looking really good. I think it'd be good to have tests for moving out of upvars in all the various combinations. In particular, move |&:| and friends should still not permit moving out.

@bkoropoff
Copy link
Contributor Author

I rebased, added tests for moving out of upvars, and addressed some of the review feedback. I kept these changes in separate "CR FIX" commits to make it clear what changed since last time. These will be rebased and squashed when the PR is ready to merge.

@nikomatsakis
Copy link
Contributor

This looks good. I'm ready to give r+.

@nikomatsakis
Copy link
Contributor

@bkoropoff will rebase to cleanup commit history, r+ when that is done

bors added a commit that referenced this pull request Oct 16, 2014
…sakis

This PR is based on #17784, which fixes closure soundness problems in borrowck.  Only the last two commits are unique to this PR.

My understanding of regionck is still evolving, so I'm not sure if this is the right approach.  Feedback is appreciated.

- In `link_reborrowed_region`, we account for the ability of upvars to
  change their mutability due to later processing.  A map of recursive
  region links we may want to establish in the future is maintained,
  with the links being established when the mutability of the borrow
  is adjusted.
- When asked to establish a region link for an upvar, we link it to
  the region of the closure body.  This creates the necessary
  constraint to stop unsound reborrows from the closure environment.

This partially (maybe completely) solves issue #17403.  Remaining work:

- This is only known to help with by-ref upvars.  I have not looked at
  by-value upvars yet to see if they can cause problems.
- The error diagnostics that result from failed region inference are
  pretty inscrutible.
- Unify the representations of `cat_upvar` and `cat_copied_upvar`
- In `link_reborrowed_region`, account for the ability of upvars to
  change their mutability due to later processing.  A map of recursive
  region links we may want to establish in the future is maintained,
  with the links being established when the kind of the borrow is
  adjusted.
- When categorizing upvars, add an explicit deref that represents the
  closure environment pointer for closures that do not take the
  environment by value.  The region for the implicit pointer is an
  anonymous free region type introduced for this purpose.  This
  creates the necessary constraint to prevent unsound reborrows from
  the environment.
- Add a note to categorizations to make it easier to tell when extra
  dereferences have been inserted by an upvar without having to
  perform deep pattern matching.
- Adjust borrowck to deal with the changes.  Where `cat_upvar` and
  `cat_copied_upvar` were previously treated differently, they are
  now both treated roughly like local variables within the closure
  body, as the explicit derefs now ensure proper behavior.  However,
  error diagnostics had to be changed to explicitly look through the
  extra dereferences to avoid producing confusing messages about
  references not present in the source code.

Closes issue rust-lang#17403.  Remaining work:

- The error diagnostics that result from failed region inference are
  pretty inscrutible and should be improved.

Code like the following is now rejected:

    let mut x = 0u;
    let f = || &mut x;
    let y = f();
    let z = f(); // multiple mutable references to the same location

This also breaks code that uses a similar construction even if it does
not go on to violate aliasability semantics.  Such code will need to
be reworked in some way, such as by using a capture-by-value closure
type.

[breaking-change]
The test was also renamed to be more descriptive.
@bkoropoff
Copy link
Contributor Author

Rebased to fix merge conflict

bors added a commit that referenced this pull request Oct 17, 2014
…sakis

This PR is based on #17784, which fixes closure soundness problems in borrowck.  Only the last two commits are unique to this PR.

My understanding of regionck is still evolving, so I'm not sure if this is the right approach.  Feedback is appreciated.

- In `link_reborrowed_region`, we account for the ability of upvars to
  change their mutability due to later processing.  A map of recursive
  region links we may want to establish in the future is maintained,
  with the links being established when the mutability of the borrow
  is adjusted.
- When asked to establish a region link for an upvar, we link it to
  the region of the closure body.  This creates the necessary
  constraint to stop unsound reborrows from the closure environment.

This partially (maybe completely) solves issue #17403.  Remaining work:

- This is only known to help with by-ref upvars.  I have not looked at
  by-value upvars yet to see if they can cause problems.
- The error diagnostics that result from failed region inference are
  pretty inscrutible.
@bors bors closed this Oct 17, 2014
@bors bors merged commit fdd69ac into rust-lang:master Oct 17, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants