Towards a new closure representation (native code) #8984

Closed

mshinwell
Contributor

Background

There have been various discussions recently on the subject of the layout of closures. The current layout of these values means that they cannot be traversed by the GC without the use of a page table (or hacks such as in #1156). They also rely on the use of Infix_tag, the only mechanism that permits pointers into the middle of heap blocks.

We can gain benefits by removing reliance on a page table. Doing so produces a runtime performance improvement, typically of a few percent. It also fits in with the desire of the Multicore OCaml devs not to have a page table at all, stemming from the fact that maintaining such a table in a parallel environment is more complicated.

Removing reliance on Infix_tag opens the possibility of significant simplification to the GC code. The presence of infix pointers has apparently caused various bugs in the development versions of Multicore OCaml.

This patch

This patch changes the layout of closure values used by the native code compilers on both Closure and Flambda paths. @kayceesrk is working on another patch to do the same for the bytecode compiler. After two such patches, we would be in a position to remove Infix_tag and all associated code.

What I present here is code that should pass the compiler's testsuite, but has not been validated at scale, nor subjected to benchmarking. I think some discussion is in order before proceeding any further.

Proposed layout

I have chosen a simple layout. As an example, assume variables f, g and h binding mutually-defined closures (at the OCaml source level, mutually-recursive functions) with one, two and three arguments respectively. We say that f, g and h form a "set of closures". The code pointer of f is f_code (respectively for g and h); the union of the free variables of the functions (apart from variables f, g and h) is fv_0 through fv_n. The in-memory layout, of values with tag Closure_tag, is:

    -------------------------------------------------------
f = | f_code      | arity = 1 | g | h | fv_0 | ... | fv_n |
    -------------------------------------------------------

    ----------------------------------------------------------------
g = | caml_curry2 | arity = 2 | g_code | f | h | fv_0 | ... | fv_n |
    ----------------------------------------------------------------

    ----------------------------------------------------------------
h = | caml_curry3 | arity = 3 | h_code | f | g | fv_0 | ... | fv_n |
    ----------------------------------------------------------------
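
As a reading aid, the diagrams above can be modelled as a function from a set of closures to the field list of each block. This is only a sketch: the `field` type and the `layout` helper are hypothetical names introduced here, not the compiler's data structures, and the rule "two header fields for arity 1, three otherwise" is inferred from the diagrams, assuming every arity is at least 1.

```ocaml
(* Illustrative model only: [field] and [layout] are hypothetical names,
   not part of the compiler. *)
type field =
  | Code_pointer of string   (* e.g. "f_code" or "caml_curry2" *)
  | Arity of int
  | Sibling of string        (* pointer to another closure in the set *)
  | Free_var of string

(* [layout set fvs name] lists the fields of the closure block for [name].
   [set] gives each function's name and arity in declaration order; [fvs]
   is the union of the free variables of the whole set. *)
let layout (set : (string * int) list) (fvs : string list) (name : string) =
  let arity = List.assoc name set in
  let header =
    if arity = 1 then [ Code_pointer (name ^ "_code"); Arity 1 ]
    else
      [ Code_pointer (Printf.sprintf "caml_curry%d" arity);
        Arity arity;
        Code_pointer (name ^ "_code") ]
  in
  (* Every other closure in the set gets a slot; there is no self slot. *)
  let siblings =
    List.filter_map
      (fun (other, _) -> if other = name then None else Some (Sibling other))
      set
  in
  header @ siblings @ List.map (fun v -> Free_var v) fvs

let set = [ ("f", 1); ("g", 2); ("h", 3) ]
let f_fields = layout set [ "fv_0"; "fv_1" ] "f"
let g_fields = layout set [ "fv_0"; "fv_1" ] "g"
```

Here `f_fields` starts with the code pointer and arity and then lists `g` and `h`, while `g_fields` carries the extra direct code pointer before `f` and `h`, mirroring the first two diagrams.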

Statically-allocated closures, which have no free variables, do not require any runtime patching. Dynamically-allocated closures are allocated with placeholders in the closure slots (here pointing to f, g and h) and then immediately patched using caml_modify to tie the knots. There are no placeholders allocated that point at the same closure (so for example in f, there is no slot that itself points at f). Environments are explicitly de-shared.
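
The allocate-then-patch step for dynamically-allocated closures can be illustrated with a toy model. This is a sketch under stated assumptions: the `slot` and `closure` types are hypothetical stand-ins for heap words and blocks, and the plain array assignment stands in for the caml_modify call made by the real code.

```ocaml
(* Toy model: [slot] and [closure] are hypothetical stand-ins for heap
   words and blocks. *)
type slot =
  | Placeholder
  | Pointer of closure
and closure = { name : string; siblings : slot array }

(* Step 1: allocate every closure with placeholders in its sibling slots.
   There is one slot per *other* closure in the set, so no self-pointer. *)
let allocate names =
  let n = List.length names in
  List.map
    (fun name -> { name; siblings = Array.make (n - 1) Placeholder })
    names

(* Step 2: patch each placeholder so that it points at the corresponding
   sibling. The assignment below stands in for caml_modify. *)
let tie_knots closures =
  List.iter
    (fun c ->
      let others = List.filter (fun o -> o != c) closures in
      List.iteri (fun i o -> c.siblings.(i) <- Pointer o) others)
    closures

let closures = allocate [ "f"; "g"; "h" ]
let () = tie_knots closures
```

After `tie_knots`, every slot holds a pointer to a different closure of the same set and no placeholder remains.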

IR changes

The Clambda language has been modified to accommodate the new representation. We have drawn on experience from Flambda 2.0, which has taught us that a good way of handling the binding of multiple mutually-recursive functions is to have a binding construct in the IR that binds multiple names at once, eliminating in particular any notion of a "variable pointing at a set of closures". As such we provide:

  • Ulet_set_of_closures, which produces a mutually-recursive binding of closures to variables, around a body in the style of a normal let-expression;
  • Uselect_closure, which allows access via the given closure to others in the same set of closures. (For example, above, given f we could get to g and h). This is equivalent to the Move_within_set_of_closures construct in Flambda (which in Flambda 2.0 is called Select_closure).

Closures within a set are referenced by integer indices, starting at zero, matching up with the order in which the function declarations are found in the appropriate maps during compilation. All offset calculations are done statically at compile time.
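
To make the static offset calculation concrete, here is a minimal sketch. The constructor names come from the description above, but their argument shapes are assumptions made for illustration and do not reflect the real definitions in clambda.mli; `header_size` and `select_offset` are hypothetical helpers, and the header sizes are read off the layout diagrams (arities of at least 1 assumed).

```ocaml
(* Hypothetical, heavily simplified IR. The argument shapes are
   assumptions, not the definitions in clambda.mli. *)
type expr =
  | Uvar of string
  | Ulet_set_of_closures of (string * int) list * expr
      (* (bound variable, arity) per function, then the body *)
  | Uselect_closure of expr * int * int
      (* closure to start from, its index in the set, index of the target *)

(* Header size as read off the layout diagrams: code pointer and arity,
   plus the direct code pointer when a currying wrapper is in field 0. *)
let header_size arity = if arity = 1 then 2 else 3

(* Field offset, within the block of closure [from_], of the pointer to
   closure [target]. Only static information is used. *)
let select_offset ~arities ~from_ ~target =
  if from_ = target then invalid_arg "select_offset: no self slot";
  (* Siblings are stored in index order with the closure itself skipped. *)
  let position = if target < from_ then target else target - 1 in
  header_size (List.nth arities from_) + position

(* "Given f, get to g": f has index 0 and g has index 1. *)
let example =
  Ulet_set_of_closures
    ([ ("f", 1); ("g", 2); ("h", 3) ], Uselect_closure (Uvar "f", 0, 1))

let f_to_g = select_offset ~arities:[ 1; 2; 3 ] ~from_:0 ~target:1
```

With the example set, `f_to_g` is field 2 of the block for f, which is where the diagram places g.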

These constructs supersede Uclosure and Uoffset respectively. There are no changes to the Flambda language, although some as-yet-unpublished work by @xclerc already exists to replace the "set of closures" symbol-binding construct in the static term language with a construct similar to Ulet_set_of_closures.

Some possible improvements

  • We could perform a simple analysis to avoid having patched pointers to closures that are never called. For example, we could elide g within the closure of h if h never calls g.

  • We could track the free variables of functions on a per-function basis rather than on a per-set-of-closures basis.

  • We could have the closure blocks point at a shared environment block. This would need benchmarking. It would probably necessitate ensuring that the indirection through a closure to the environment block was subject to CSE.
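
The first improvement could be approximated by a very simple analysis. The sketch below assumes a hypothetical input that lists, for each function in the set, the names its body mentions; it is conservative in that any mention of a sibling (not only a call) keeps the slot. `needed_siblings` is an illustrative name, not an existing compiler function.

```ocaml
module S = Set.Make (String)

(* Hypothetical input: for each function in the set, the names that its
   body refers to (siblings, free variables, or anything else). *)
let bodies =
  [ ("f", [ "g"; "h"; "fv_0" ]); ("g", [ "f" ]); ("h", [ "f"; "h" ]) ]

(* Siblings that [name] actually mentions. Only these would need a
   patched slot in its closure block; a self-reference needs none. *)
let needed_siblings bodies name =
  let in_set = S.of_list (List.map fst bodies) in
  let referenced = S.of_list (List.assoc name bodies) in
  S.elements (S.remove name (S.inter in_set referenced))

let slots_for_h = needed_siblings bodies "h"
```

In this example h never mentions g, so `slots_for_h` contains only f and the slot for g could be elided from the closure of h.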

Reading the code

If you don't want to read the whole diff, which you probably do not, then reading the diffs of clambda.mli and cmmgen.ml would be a good start; followed by the diffs of closure.ml and/or flambda_to_clambda.ml.

One test case was removed from the testsuite as it no longer seemed relevant and produces a (reasonable) compile-time performance problem.

@jordwalke

Just curious - does this make the native representation of closures more similar or less similar to that of bytecode's form?

@sabine
Contributor

sabine commented May 16, 2020

What's the status on this?

From the point of view of compiling to WebAssembly, getting rid of infix pointers does make things simpler (no matter whether we ship a GC on WebAssembly local memory or whether we use the WebAssembly GC feature).

@mshinwell
Contributor Author

New closure representation merged from a different PR for 4.13.

@mshinwell mshinwell closed this Mar 8, 2021