This repository has been archived by the owner on Apr 5, 2024. It is now read-only.

closure_captures.md

Refactor closure_captures

Goal

Be able to express memory captured by a closure more precisely.

For example

  • || p.x: here we want to express that the field x is being captured instead of all of p.
  • || tuple.0: here we want to express that the 0th entry in tuple is being captured instead of the complete tuple.

Background

Consider the following code:

struct Point {
    x: i32,
    y: i32,
    name: String,
}

// hir_id_1 associated with definition of p.
let p = Point { x: 10, y: 20, name: format!("p") };
// hir_id_2 associated with definition of mirror_p.
let mut mirror_p = Point { x: -10, y: 20, name: format!("mp") };

// hir_id_3 associated with definition of m_x.
let m_x = &mirror_p.x;

// hir_id_4 associated with the expression p.x
let c1 = || println!("{}", p.x);
// hir_id_5 associated with the expression m_x
let c2 = || println!("{}", m_x);

let c3 = || {
    // hir_id_12 associated with the definition of p3.
    let p3 = Point{ x: 1, y: 2, name: format!("p3") };
    // hir_id_6 associated with the expression p3.x
    println!("{}", p3.x);
};

let c4 = || {
    // hir_id_8 associated with the expression mirror_p.x
    let c_nest = || { mirror_p.x -= 20 };
    // hir_id_7 associated with the expression p.name
    drop(p.name);
};

// hir_id_11 associated with definition of p2.
let mut p2 = Point { x: 10, y: 20, name: format!("p") };
let mut c5 = || {
    // hir_id_9 associated with the expression p2.x
    p2.x += 20;
};

let c6 = move || {
    // hir_id_10 associated with the expression p2.x
    p2.x += 20;
};

Root variable

In the above code, p is the root variable of p.x and p.name, m_x is a root variable on its own, and p3 is the root variable of p3.x.

Current state

upvars_mentioned provides the list of root variables that are completely or partially captured by the closure (or by any closure nested within it), along with a span pointing to their use within the closure.

More precisely, it maps each root variable that is defined outside the closure and mentioned within it to the span of an access within the closure.

upvars_mentioned[c1] =  { hir_id_1 : Span(hir_id_4) }
upvars_mentioned[c2] =  { hir_id_3 : Span(hir_id_5) }
upvars_mentioned[c3] =  {}
upvars_mentioned[c_nest] =  { hir_id_2: Span(hir_id_8) }
upvars_mentioned[c4] =  { hir_id_2: Span(hir_id_8), hir_id_1: Span(hir_id_7) }
upvars_mentioned[c5] =  { hir_id_11 : Span(hir_id_9) }
upvars_mentioned[c6] =  { hir_id_11 : Span(hir_id_10) }

TypeckTables stores types and various related information about a given scope.

Within it, closure_captures provides the list of root variables that are completely or partially captured by the closure, and maps each of them to an UpvarId. An UpvarId contains the closure definition id and the hir id of the root variable.

closure_captures[c1] =  { hir_id_1 : UpvarId(def_id_c1, hir_id_1) }
closure_captures[c2] =  { hir_id_3 : UpvarId(def_id_c2, hir_id_3) }
closure_captures[c3] =  {}
closure_captures[c_nest] =  { hir_id_2: UpvarId(def_id_c_nest, hir_id_2) }
closure_captures[c4] =  { hir_id_2: UpvarId(def_id_c_4, hir_id_2), hir_id_1: UpvarId(def_id_c_4, hir_id_1) }
closure_captures[c5] =  { hir_id_11 : UpvarId(def_id_c5, hir_id_11) }
closure_captures[c6] =  { hir_id_11 : UpvarId(def_id_c6, hir_id_11) }

For a given scope, upvar_capture_map maps every UpvarId to a capture kind, i.e. by value, or by mutable/immutable reference.

Assume that c1 ... c6 are within the same scope, then for that scope:

upvar_capture_map = {
    UpvarId(def_id_c1, hir_id_1) : ByRef(ImmBorrow),
    UpvarId(def_id_c2, hir_id_3) : ByRef(ImmBorrow),
    UpvarId(def_id_c_nest, hir_id_2) : ByRef(MutBorrow),
    UpvarId(def_id_c_4, hir_id_2) : ByRef(MutBorrow),
    UpvarId(def_id_c_4, hir_id_1) : ByValue,
    UpvarId(def_id_c5, hir_id_11) : ByRef(MutBorrow),
    UpvarId(def_id_c6, hir_id_11) : ByValue,
}

Place represents how memory is accessed.

E.g. for p3.x, we can express that the base root variable is p3 and that we get the value by accessing Field(x).

If the base of the Place is PlaceBase::Upvar, we know that the access is based on a captured variable.
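As a rough illustration of that shape, here is a minimal sketch of a place as a base plus a list of projections. The types below are simplified stand-ins invented for this example; they are not the compiler's actual definitions, and variable names are used where rustc would use ids.

```rust
// Simplified, hypothetical stand-ins for the compiler's types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
enum PlaceBase {
    /// A variable local to the body being analysed.
    Local(&'static str),
    /// A variable captured from an enclosing scope.
    Upvar(&'static str),
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
enum Projection {
    /// Access to a named or positional field, e.g. `.x` or `.0`.
    Field(&'static str),
    /// Dereference of a reference.
    Deref,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Place {
    base: PlaceBase,
    projections: Vec<Projection>,
}

impl Place {
    /// True when the access goes through a captured variable.
    fn is_upvar_access(&self) -> bool {
        matches!(self.base, PlaceBase::Upvar(_))
    }
}

/// `p3.x` inside `c3`: `p3` is defined inside the closure body.
fn p3_x() -> Place {
    Place {
        base: PlaceBase::Local("p3"),
        projections: vec![Projection::Field("x")],
    }
}

/// `p.x` inside `c1`: `p` is defined outside the closure.
fn p_x() -> Place {
    Place {
        base: PlaceBase::Upvar("p"),
        projections: vec![Projection::Field("x")],
    }
}
```

In this sketch, p3_x() and p_x() carry the same projection list and differ only in their base, which is the distinction the upvar check relies on.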

Generating this information

To generate upvars_mentioned, closure_captures and upvar_capture_map, we need to iterate over the CFG to see how variables are being captured and used.

rustc_hir::intravisit provides a visitor/walker that can traverse the tree in execution order. We can use it and override the visit_* methods to get a walker that meets our needs.

upvars_mentioned is generated in librustc_passes.

Here we implement the intravisit visitor so that it generates a list of local root variables by looking at PatKind::Binding (TODO: explain).

Once that is done, we walk the body again to look for

  • Path that represents a variable use (Res::Local). Path isn't something like a.b but just represents some variable use. If the variable isn't a local variable we add it to the list of upvars and map it to the Path span.
  • Expr that represents a Closure, i.e. handle the case of nested closures. We can query upvars_mentioned for the nested closure and include all the upvars returned, as part of captures for the current (enclosing) closure.
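The two passes above can be sketched on a toy expression tree. Everything in this example (the Expr enum, the use of names instead of HirIds, integers instead of spans) is a hypothetical simplification and not rustc's HIR. It also ignores scoping inside a body: every binding in the body counts as local for the whole body.

```rust
use std::collections::HashSet;

// Toy expression tree; a hypothetical stand-in for HIR.
enum Expr {
    /// `let <name> = ...;` introduces a local binding.
    Let(&'static str),
    /// A use of a variable, tagged with an integer standing in for its span.
    Use { name: &'static str, span: u32 },
    /// A nested closure with its own body.
    Closure(Vec<Expr>),
}

/// First pass: collect the bindings introduced directly in `body`.
fn collect_locals(body: &[Expr]) -> HashSet<&'static str> {
    body.iter()
        .filter_map(|e| match e {
            Expr::Let(name) => Some(*name),
            _ => None,
        })
        .collect()
}

/// Adds `name` as an upvar unless it is local or already recorded.
fn record(
    upvars: &mut Vec<(&'static str, u32)>,
    locals: &HashSet<&'static str>,
    name: &'static str,
    span: u32,
) {
    let already_seen = upvars.iter().any(|(n, _)| *n == name);
    if !locals.contains(name) && !already_seen {
        upvars.push((name, span));
    }
}

/// Second pass: every used variable that is not local is an upvar.
/// Returns (name, span of first mention), in order of first mention.
fn upvars_mentioned(body: &[Expr]) -> Vec<(&'static str, u32)> {
    let locals = collect_locals(body);
    let mut upvars = Vec::new();
    for expr in body {
        match expr {
            Expr::Let(_) => {}
            Expr::Use { name, span } => record(&mut upvars, &locals, *name, *span),
            // Nested closure: its upvars become upvars of the enclosing
            // closure, unless they are local to the enclosing body.
            Expr::Closure(inner) => {
                for (name, span) in upvars_mentioned(inner) {
                    record(&mut upvars, &locals, name, span);
                }
            }
        }
    }
    upvars
}
```

Feeding it a body shaped like c4 (a nested closure using mirror_p, followed by a use of p) yields mirror_p first and p second, mirroring upvars_mentioned[c4] above, while a body shaped like c3 yields nothing.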

closure_captures is generated in librustc_typeck.

This map for a given closure is very similar to upvars_mentioned. Instead of storing span we store the pair of (closure_def_id, root_var_hir_id).

We iterate over upvars_mentioned for a given closure_def_id to get captured root variable HirId and generate an UpvarId for them and build the map.
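A minimal sketch of that step, assuming integer ids in place of HirId, DefId and Span (the type aliases and the function name are illustrative, not rustc's API):

```rust
// Hypothetical, simplified ids; rustc uses HirId, DefId and Span.
type HirId = u32;
type DefId = u32;
type Span = u32;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct UpvarId {
    closure_def_id: DefId,
    var_hir_id: HirId,
}

/// Builds the closure_captures entries for one closure from its
/// upvars_mentioned entries, preserving their order and dropping the span.
fn closure_captures(
    closure_def_id: DefId,
    upvars_mentioned: &[(HirId, Span)],
) -> Vec<(HirId, UpvarId)> {
    upvars_mentioned
        .iter()
        .map(|&(var_hir_id, _span)| {
            (var_hir_id, UpvarId { closure_def_id, var_hir_id })
        })
        .collect()
}
```

For c4, whose upvars_mentioned entries are hir_id_2 followed by hir_id_1, this produces two UpvarIds in the same order, both carrying the def id of c4.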

upvar_capture_map is generated in librustc_typeck.

We implement another intravisit::Visitor to analyse closures within a scope. When we see a closure within the body of the scope, we first walk the body of the closure. Once that walk is done, we start the analysis of the closure. (We need to walk first in order to handle nested closures.)

To begin the analysis, we use hints such as mut for c5 and move for c6 to guess the initial borrow kind of each capture. Once the guesses are set, we create an ExprUseVisitor that visits each expression within the closure body. ExprUseVisitor calls into MemCategorization, which returns the access information about the expression as a Place.

ExprUseVisitor (https://github.com/rust-lang/rust/blob/master/src/librustc_typeck/expr_use_visitor.rs) invokes methods on the delegate when it finds a Place being consumed or borrowed. The delegate has the job of deciding what to do with that information. In this case, the delegate is the upvar analysis, which "upgrades" the kind of borrow needed depending on the accesses it sees (all reads == shared borrow, writes == mutable borrow, moves == take ownership). In our case the Delegate trait is implemented by InferBorrowKind.
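The "upgrade" rule can be sketched as taking a maximum over an ordered enum. The trait and struct below are simplified, hypothetical versions written for this example: the real Delegate trait and InferBorrowKind work on Places and have different signatures.

```rust
use std::collections::BTreeMap;

// Capture kinds ordered from weakest to strongest requirement, so that
// upgrading is just taking the maximum of the current and needed kind.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum CaptureKind {
    ImmBorrow,
    MutBorrow,
    ByValue,
}

/// Hypothetical, simplified callbacks made by a use-visitor.
trait Delegate {
    fn borrow(&mut self, var: &'static str, mutable: bool);
    fn consume(&mut self, var: &'static str);
}

/// Hypothetical stand-in for the upvar analysis.
struct InferKinds {
    kinds: BTreeMap<&'static str, CaptureKind>,
}

impl InferKinds {
    /// Never downgrades: a later read does not undo an earlier write.
    fn upgrade(&mut self, var: &'static str, needed: CaptureKind) {
        let slot = self.kinds.entry(var).or_insert(needed);
        *slot = (*slot).max(needed);
    }
}

impl Delegate for InferKinds {
    fn borrow(&mut self, var: &'static str, mutable: bool) {
        let needed = if mutable {
            CaptureKind::MutBorrow
        } else {
            CaptureKind::ImmBorrow
        };
        self.upgrade(var, needed);
    }

    fn consume(&mut self, var: &'static str) {
        self.upgrade(var, CaptureKind::ByValue);
    }
}

/// Replays the accesses seen while analysing `c4` from the Background code.
fn analyze_c4() -> BTreeMap<&'static str, CaptureKind> {
    let mut delegate = InferKinds { kinds: BTreeMap::new() };
    delegate.borrow("mirror_p", true); // mirror_p.x -= 20 (through c_nest)
    delegate.consume("p"); // drop(p.name)
    delegate.kinds
}
```

Deriving Ord on the enum keeps the rule in one place: the variant order is the upgrade order, so adding a new kind only requires placing it correctly in the enum.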

It's possible that, while processing the expressions of the closure body in ExprUseVisitor, we see another closure (i.e. one nested within the enclosing closure). Since we walked the body of the closure before we started the analysis, we have already processed all nested closures. We iterate over the captured variables of this nested closure and update the borrow kind of each of them for the enclosing closure. This last step is done in expr_use_visitor::walk_captures.

fn foo() {
    let a = String::new();

    let c1 = || {
        let c2 = || {
           drop(a);
        };
    };
}

The flow here would look like:

  1. Intravisit on foo sees the closure expression c1.
  2. Intravisit on c1 sees the closure expression c2.
  3. Intravisit the body of c2. Done visiting the body of c2.
  4. Analyse the closure c2 using ExprUseVisitor and generate its upvar_capture_map entries.
  5. c2 intravisit finishes. Done visiting the body of c1.
  6. Analyse the closure c1 using ExprUseVisitor and generate its upvar_capture_map entries.
  7. Encounter the closure expression for c2. Adjust the capture information for a based on c2.
  8. c1 intravisit finishes.
  9. foo intravisit finishes.

More nicely (here capture_kinds refers to upvar_capture_map for a particular closure):

| visit body of foo
| | visit body of c1
| | | visit closure c2
| | | | visit body of c2 (does nothing)
| | | | analyze body of c2, producing capture_kinds[c2]
| | analyze body of c1, reading capture_kinds[c2], producing capture_kinds[c1]
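The same ordering can be reproduced with a small recursive sketch. The Closure struct, the use of names as keys and the order log are assumptions made for this example only. It also assumes that every variable captured by a nested closure is defined outside the enclosing closure, which the real analysis does not assume.

```rust
use std::collections::BTreeMap;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum CaptureKind {
    ImmBorrow,
    MutBorrow,
    ByValue,
}

/// Toy closure: its direct accesses to outer variables plus nested closures.
struct Closure {
    name: &'static str,
    accesses: Vec<(&'static str, CaptureKind)>,
    nested: Vec<Closure>,
}

type Kinds = BTreeMap<&'static str, CaptureKind>;

fn upgrade(kinds: &mut Kinds, var: &'static str, needed: CaptureKind) {
    let slot = kinds.entry(var).or_insert(needed);
    *slot = (*slot).max(needed);
}

/// Walks the body first (which analyses nested closures), then analyses
/// `closure` itself, reading the results already produced for nested ones.
fn analyze(
    closure: &Closure,
    capture_kinds: &mut BTreeMap<&'static str, Kinds>,
    order: &mut Vec<&'static str>,
) {
    for inner in &closure.nested {
        analyze(inner, capture_kinds, order);
    }

    let mut kinds = Kinds::new();
    for &(var, needed) in &closure.accesses {
        upgrade(&mut kinds, var, needed);
    }
    // Fold in what each nested closure captures.
    for inner in &closure.nested {
        let inner_kinds = capture_kinds[inner.name].clone();
        for (var, needed) in inner_kinds {
            upgrade(&mut kinds, var, needed);
        }
    }

    order.push(closure.name);
    capture_kinds.insert(closure.name, kinds);
}

/// The `foo` example: c1 contains c2, and c2 moves `a` via `drop(a)`.
fn foo_example() -> Closure {
    let c2 = Closure {
        name: "c2",
        accesses: vec![("a", CaptureKind::ByValue)],
        nested: vec![],
    };
    Closure { name: "c1", accesses: vec![], nested: vec![c2] }
}
```

Running analyze on foo_example() analyses c2 before c1, and c1 ends up capturing a by value even though its own body never touches a directly.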

Place is generated in librustc_typeck.

Several methods exist that analyse different kinds of expressions and patterns to express how memory is being accessed.

Use cases

Legend:

  • U: upvars_mentioned
  • C: closure_captures
  • B: upvar_capture_map
  • U->C: Currently uses upvars_mentioned, but needs to start using closure_captures. This is most likely possible because type checking has already been done, either entirely or at least for the related scope.
  • var_hir_id: HirId of root variable of a capture
  • CaptureKind: If we have var_hir_id and access to any of the tables (i.e. access to closure_def_id), we can access the CaptureKind for the variable from upvar_capture_map using UpvarId(closure_def_id, var_hir_id).
  • AccessSpan: Span stored in upvars_mentioned that points to an access of the captured variable within the closure. We can assume that we have var_hir_id to access this. Since we have var_hir_id, we also have access to the definition span using the hir API.
  • Field: When we build the MIR, we desugar the closure as a structure where captures are represented by fields within the structure. We refer to these using indices. Field(var_hir_id) expresses that we need the var_hir_id of the ith capture, given some i.
  • Index: As explained above, when we build the MIR we start representing the closure as a structure where some of the fields represent the captured variables. To build this desugared structure, we need to map var_hir_id to an index. For closure_captures[c4], we have hir_id_2 at index 0 and hir_id_1 at index 1. (This is the order in which the captures appear within the closure.)
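The Field and Index entries are two directions of the same lookup over an ordered list of captures. A minimal sketch, assuming the captures of a closure are available as a slice of integer ids in capture order (the function names are made up for this example):

```rust
// Hypothetical, simplified id; rustc uses HirId.
type HirId = u32;

/// Index: maps a root variable to its field position in the desugared
/// closure structure.
fn capture_index(captures: &[HirId], var_hir_id: HirId) -> Option<usize> {
    captures.iter().position(|&id| id == var_hir_id)
}

/// Field: maps the ith field of the desugared closure structure back to
/// the root variable it captures.
fn captured_var(captures: &[HirId], field: usize) -> Option<HirId> {
    captures.get(field).copied()
}
```

With the captures of c4 given as [hir_id_2, hir_id_1], the first function returns 0 for hir_id_2 and 1 for hir_id_1, and the second function inverts that mapping.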

| Module | U/C/B | Purpose | Uses |
|---|---|---|---|
| typeck/closure.rs | U | Initialize Substs. TODO: #4 | var_hir_id |
| typeck/upvar.rs (1) | U | Initialize closure_captures and upvar_capture_map | var_hir_id |
| typeck/upvar.rs (2) | U->C, B | Ensure initial and final types of captured variables unify | var_hir_id, CaptureKind |
| expr_use_visitor | U->C, B | Handle captures in case of nested closures | var_hir_id, CaptureKind |
| mem_categorization | U | Check if the expr uses a captured variable | var_hir_id |
| passes/liveness.rs | U->C | Liveness checks, reports unused upvars | var_hir_id, AccessSpan |
| mir_build (1) | U->C | hir::Expr -> mir::Expr | var_hir_id |
| mir_build (2) | C | Desugar a capture and represent it as a field within a struct | var_hir_id, Index |
| mir_build (3) | C | Build [upvars_mutbls] | var_hir_id |
| librustc_middle/mir | U->C | Debug print | var_hir_id |
| borrow_check (1) | U->C | Debug message about the ith capture | Field(var_hir_id) |
| borrow_check (2) | C, B | Generate borrow check upvar | var_hir_id, CaptureKind |
| borrow_check (3) | U->C | Span within the closure for the ith capture | Field(AccessSpan) |
| interpret | C | Figure out which variable is being captured | var_hir_id, Field |
| region_errors | U->C | Report FnMut error | AccessSpan |
| pretty.rs | U->C? | Pretty print the names of the captured variables | var_hir_id |
| trait_selection | U->C? | Search for type that matches target return type | var_hir_id, AccessSpan |

TypeckTables also provides an API (upvar_capture) that takes an UpvarId and returns the CaptureKind.

We can assume that if upvars_mentioned or closure_captures wasn't called in the neighbourhood, then we somehow have access to the UpvarId. Most likely we are within the typeck module or have access to an UpvarRegion created in typeck/upvar.rs or have var_hir_id and the closure_def_id.

Proposed Solution

We can replace closure_captures and upvar_capture_map with a single map keyed by the captured place.

@nikomatsakis said:

FxIndexMap<hir::Place, CaptureInfo> where you have

struct CaptureInfo {
    /// the id of some use that caused us to pick the capture-mode below
    example_use_at: HirId,

    /// captured mode we selected
    capture_mode: Mode,
}
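To make the proposal concrete, here is a hedged sketch of how such a map could be filled in. It is not the actual design: FxIndexMap is replaced by a Vec of pairs to stay within the standard library while preserving insertion order, Place is reduced to a root id plus field names, Mode is assumed to be an ordered enum, and the record method is invented for this example.

```rust
// All types below are simplified, hypothetical stand-ins.
type HirId = u32;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Mode {
    ImmBorrow,
    MutBorrow,
    ByValue,
}

/// A captured place: the root variable plus the fields accessed on it.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Place {
    root: HirId,
    fields: Vec<&'static str>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CaptureInfo {
    /// the id of some use that caused us to pick the capture-mode below
    example_use_at: HirId,
    /// captured mode we selected
    capture_mode: Mode,
}

/// Insertion-ordered map from place to capture info (Vec-backed stand-in
/// for an index map).
#[derive(Debug, Default)]
struct CaptureMap {
    entries: Vec<(Place, CaptureInfo)>,
}

impl CaptureMap {
    /// Records a use of `place`. If the use needs a stronger mode than the
    /// one stored, both the mode and the example use are replaced.
    fn record(&mut self, place: Place, use_at: HirId, mode: Mode) {
        if let Some(i) = self.entries.iter().position(|(p, _)| *p == place) {
            let info = &mut self.entries[i].1;
            if mode > info.capture_mode {
                info.capture_mode = mode;
                info.example_use_at = use_at;
            }
        } else {
            let info = CaptureInfo { example_use_at: use_at, capture_mode: mode };
            self.entries.push((place, info));
        }
    }
}
```

Because the key is a place rather than a root variable, p.x and p.name get separate entries with separate modes, which is the precision the Goal section asks for. The position of an entry in the map can also serve as the field index used when desugaring the closure.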

Satisfiability

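| Module | ✅ / 😭 | How? |
|---|---|---|
| typeck/closure.rs | | |
| typeck/upvar.rs (1) | | |
| typeck/upvar.rs (2) | | |
| expr_use_visitor | | |
| mem_categorization | | |
| passes/liveness.rs | | |
| librustc_middle/mir | | |
| mir_build (1) | | |
| mir_build (2) | | |
| mir_build (3) | | |
| borrow_check (1) | | |
| borrow_check (2) | | |
| borrow_check (3) | | |
| interpret | | |
| region_errors | | |
| pretty.rs | | |
| trait_selection | | |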