Don't box captured variables that are not assigned after closure declaration #53367

@uniment

Using conditional logic and/or reassignments to initialize a variable before capturing it is a common pattern, and it accounts for most of the unnecessary boxes for which #15276 keeps being cited, so it is worth attempting a solution.

Consider the following example that defines two functionally equivalent inner functions (modified from Performance Tips):

julia> function abmult(r1::Int)
           if r1 < 0
               r1 = -r1
           end
           r2 = r1

           f1 = x -> x * r1
           f2 = x -> x * r2
           return f1, f2
       end
abmult (generic function with 1 method)

julia> abmult(1)
(var"#3#5"(Core.Box(1)), var"#4#6"{Int64}(1))

Notice that, throughout the lifetimes of closures f1 and f2, neither captured variable (r1 or r2) can ever take on a new value: no assignment follows the closures' declarations in their parent scope, and none occurs in their function bodies. f1 and f2 behave identically (except for performance), so the language semantics can clearly be satisfied without boxing r1 here.

As far as I can tell from experimentation, the current rule boxes a capture if any of the following holds:
a) the variable has more than one syntactical assignment in its parent scope,
b) the variable is syntactically assigned within a conditional in its parent scope,
c) the variable is defined only after the closure is declared, or
d) the variable is syntactically assigned within a closure that captures it.
However, conditions (a) and (b) box too aggressively in many cases, including this one.

To address this, I propose a different boxing rule: a capture should be boxed iff there is a syntactical assignment to it that is syntactically reachable after the instantiation of any closure that captures it. This is the boxing rule we are actually reaching for: the only reason to box a capture is that its value could change during its closures' lifetimes, and the currently implemented rule is merely an imperfect approximation of this.
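To illustrate the two sides of this rule, here is a minimal pair of examples. The function names `must_box` and `need_not_box` are made up for this illustration. In the first, an assignment is reachable after the closure is created, so the closure has to observe the new value and a box is genuinely required. In the second, every assignment happens before the closure is created, so under the proposed rule no box would be needed, even though the assignment sits inside a conditional.

```julia
# An assignment to x is reachable after f is created, so f must see
# the updated value. Boxing is required by the language semantics here.
function must_box()
    x = 1
    f = () -> x
    x = 2          # reachable after the closure's instantiation
    return f
end

# All assignments to x precede the creation of f. The value captured
# by f can never change, so the proposed rule would leave x unboxed.
function need_not_box(c::Bool)
    x = 1
    if c
        x = 2      # inside a conditional, but before the closure
    end
    f = () -> x
    return f
end

must_box()()            # 2: the closure sees the later assignment
need_not_box(true)()    # 2
need_not_box(false)()   # 1
```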

To make this proposal more concrete, consider the following procedure that aims to implement it. It is meant only as an illustration and is not optimized. I assume it operates on IR code, where each instruction is a single line, prior to insertion of boxes:

For a given variable x that is known to be captured by a closure f (and possibly closures g, h, etc.):

  1. If x is syntactically assigned to anywhere within f, g, h, etc., then box x and stop.
  2. If x is not syntactically assigned to anywhere within its parent scope, then x is a global; stop.
  3. Initialize an empty record of explored lines (a Set of line numbers).
  4. Call check_branch on the line on which f is instantiated in its parent scope.
  5. function check_branch(starting_line): starting at starting_line, proceed line by line:
    a. If the current line is already in the record of explored lines, stop exploring this path (return). This avoids infinite loops.
    b. Push the current line into the record of explored lines.
    c. If this line is an assignment to x, then box x and stop.
    d. If the current line is a conditional branch (goto _ if not _), explore both paths: call check_branch on the line the goto jumps to, then continue with the next line. If it is an unconditional goto, continue from its target instead.
    e. If the end of the parent scope has been reached, stop (return).
  6. If x is captured by additional closures g, h, etc., call check_branch on the line where each of those closures is instantiated.
  7. If all of the above finishes without boxing, then x is an unboxed capture.
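The reachability part of this procedure (steps 3 to 7) can be sketched in Julia as follows. This is only a sketch over a hypothetical toy IR, not Julia's actual lowered IR or the lowering pass: the types `Assign`, `MakeClosure`, `GotoIfNot`, `Goto`, `Return`, and `Other`, and the function `needs_box`, are all invented for illustration. It uses an explicit worklist instead of recursion, and it omits steps 1 and 2 (assignments inside a closure body, and globals).

```julia
# Hypothetical toy IR: one instruction per line (not Julia's real IR).
abstract type Instr end
struct Assign <: Instr; var::Symbol; end                  # `var = ...`
struct MakeClosure <: Instr
    name::Symbol
    captures::Vector{Symbol}
end
struct GotoIfNot <: Instr; target::Int; end               # conditional branch
struct Goto <: Instr; target::Int; end                    # unconditional jump
struct Return <: Instr end
struct Other <: Instr end                                 # anything else

# Returns true if an assignment to `x` is reachable from the creation
# of any closure that captures `x`.
function needs_box(code::Vector{Instr}, x::Symbol)
    explored = Set{Int}()
    # Start one search at every line that instantiates a capturing closure.
    worklist = findall(ins -> ins isa MakeClosure && x in ins.captures, code)
    while !isempty(worklist)
        line = pop!(worklist)
        # Skip lines past the end of the scope and lines already visited.
        (line > length(code) || line in explored) && continue
        push!(explored, line)
        ins = code[line]
        ins isa Assign && ins.var == x && return true
        if ins isa Goto
            push!(worklist, ins.target)
        elseif ins isa GotoIfNot
            push!(worklist, line + 1, ins.target)   # explore both paths
        elseif !(ins isa Return)
            push!(worklist, line + 1)
        end
    end
    return false
end

# A hand-written encoding of abmult from the example above.
abmult_ir = Instr[
    GotoIfNot(3),               # 1: if r1 < 0 (jump to 3 if not)
    Assign(:r1),                # 2:     r1 = -r1
    Assign(:r2),                # 3: r2 = r1
    MakeClosure(:f1, [:r1]),    # 4: f1 = x -> x * r1
    MakeClosure(:f2, [:r2]),    # 5: f2 = x -> x * r2
    Return(),                   # 6: return f1, f2
]

# A closure created inside a loop whose body reassigns the capture.
loop_ir = Instr[
    Assign(:x),                 # 1: x = ...   (loop body start)
    MakeClosure(:f, [:x]),      # 2: f = () -> x
    GotoIfNot(5),               # 3: leave the loop if the condition fails
    Goto(1),                    # 4: back edge
    Return(),                   # 5
]

needs_box(abmult_ir, :r1)   # false
needs_box(abmult_ir, :r2)   # false
needs_box(loop_ir, :x)      # true: the back edge reaches line 1
```

Because the set of explored lines is shared across all starting points, each line is processed at most once regardless of how many closures capture x.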

This should be $\mathcal O(n)$ in the number of instructions, since the record of explored lines ensures that each line is visited at most once.
