Backend support for local allocations #478
Adds local allocations (Ialloc Alloc_local) to the backend, as well as new primitives for region boundaries (Ibeginregion/Iendregion). In Cmm, instead of explicit Ibeginregion/Iendregion, there is a new block-structured Cregion. Since tail calls are supposed to end a region before transferring control, Cmm also has two new constructs that end a region early: Capply with Apply_tail, and Ctail (for general code blocks, resulting from tail calls that have been inlined).
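As a rough illustration of how a block-structured region might be flattened into explicit begin/end markers, here is a toy lowering pass. The types `cmm` and `instr` and the function `lower` are hypothetical stand-ins invented for this sketch; they are not the compiler's actual `Cmm`, `Mach`, or `Selectgen` definitions, and the real pass handles far more cases.

```ocaml
(* Toy model only: hypothetical stand-ins, not the compiler's real types. *)
type apply_position = Apply_tail | Apply_nontail

type cmm =
  | Const of int
  | Apply of string * apply_position  (* call to a named function *)
  | Seq of cmm * cmm
  | Region of cmm                     (* stands in for Cregion *)
  | Tail of cmm                       (* stands in for Ctail *)

type instr =
  | Iconst of int
  | Icall of string
  | Ibeginregion
  | Iendregion

(* [lower ~in_region e] flattens [e] and reports whether the enclosing
   region was already ended early in tail position. The sketch assumes
   [Apply_tail] and [Tail] only occur in tail position of a region. *)
let rec lower ~in_region = function
  | Const n -> ([Iconst n], false)
  | Apply (f, Apply_nontail) -> ([Icall f], false)
  | Apply (f, Apply_tail) ->
    (* End the region before transferring control. *)
    if in_region then ([Iendregion; Icall f], true) else ([Icall f], false)
  | Seq (a, b) ->
    (* [a] is not in tail position, so it cannot end the region early. *)
    let ia, _ = lower ~in_region:false a in
    let ib, closed = lower ~in_region b in
    (ia @ ib, closed)
  | Region body ->
    let ib, closed = lower ~in_region:true body in
    let close = if closed then [] else [Iendregion] in
    ((Ibeginregion :: ib) @ close, false)
  | Tail body ->
    (* Inlined tail call: end the region, then evaluate the inlined body. *)
    let ib, _ = lower ~in_region:false body in
    if in_region then (Iendregion :: ib, true) else (ib, false)

(* Convenience wrapper for lowering a whole expression. *)
let instrs e = fst (lower ~in_region:false e)
```

In this model, `instrs (Region (Seq (Apply ("g", Apply_nontail), Apply ("f", Apply_tail))))` places the `Iendregion` before the call to `f` rather than after it, which is the property that lets tail-recursive loops run in constant local-stack space.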
(Cannot add a comment on the code outside of the diff.)
I think the current transformations over CFG values are @gretay-js This is probably worth adding to your document (The fact that the CI is green is moot, since the new |
My hope was that marking local allocations as coeffectful and Ibeginregion / Iendregion as effectful would be enough to ensure the right semantics - it is invalid to move a load across a possibly-aliasing store anyway.
I think I am mostly worried about ensuring that no |
The following is needed in
|
For grepping purposes, I think the existing line is:
|
Thanks, patch applied.
This PR adds support for local (stack) allocations to the backend, extracted from the branch at https://github.com/ocaml-flambda/ocaml-jst/tree/local-dev. To avoid creating a truly massive PR, this patch contains only the backend changes. (This also means that this PR contains no tests for local allocations. There is an extensive testsuite in the ocaml-jst repo, but the tests require front-end and middle-end support in order to run.)
The original PR series for local allocations is at https://github.com/janestreet/ocaml/pulls?q=is%3Apr.
Concretely, the main changes here are:
- Annotate allocations with a `Lambda.alloc_mode`. `Alloc_heap` allocations are normal GC allocations, while the new `Alloc_local` allocations allocate instead on a separate stack.
- At `Mach` and beyond, add two new primitives `Ibeginregion` and `Iendregion` that delimit regions (roughly, stackframes). All local allocations during a region are freed at the end of the region, by resetting the local stack pointer. `Ibeginregion` returns the current value of this pointer, while `Iendregion` resets it.

  (Currently, `Iendregion` does some additional tracking of stats, for development and testing. This will be removed before release.)
- At `Cmm` level, instead of explicit begin/end primitives we use a block-structured `Cregion e`. The job of placing `Ibeginregion`/`Iendregion` falls to `Selectgen`.
- There is special handling of tail calls: since tail recursive loops must run in constant space, tail calls in tail position of a region end the region before transferring control. This is denoted by an `apply_position` in `Cmm`: when it is `Apply_tail`, the `Iendregion` will be inserted before the call by `Selectgen`. Furthermore, there is a new `Ctail` construct in `Cmm` to handle the case where an `Apply_tail` call is inlined, which ends the region before the inlined body is evaluated.
- Finally, there is some refactoring of the generation of curry / apply stubs: instead of encoding tupled vs. curried in the sign of the arity, the stub is selected as a pair of a Lambda function kind and an integer. (This refactoring is there mainly to support the middle-end's need for both local and heap versions of the curry stubs.)
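
The save-and-reset discipline described for `Ibeginregion` and `Iendregion` can be pictured with a small bump-allocator model. This is only a sketch under stated assumptions: the record `local_stack` and the functions `begin_region`, `end_region`, `alloc_local`, and `with_region` are hypothetical names invented here, not the runtime's API, and the model ignores exceptions, stack growth, and the GC.

```ocaml
(* Toy model only: a bump-allocated local stack with region boundaries.
   All names are hypothetical and do not correspond to the real runtime. *)
type local_stack = {
  cells : int array;  (* backing store for local allocations *)
  mutable sp : int;   (* local stack pointer: index of the next free cell *)
}

let create size = { cells = Array.make size 0; sp = 0 }

(* Models Ibeginregion: return the current value of the pointer. *)
let begin_region st = st.sp

(* Models Iendregion: reset the pointer, freeing everything allocated
   since the matching begin_region. *)
let end_region st saved = st.sp <- saved

(* Models a local allocation: store the value and bump the pointer. *)
let alloc_local st v =
  if st.sp >= Array.length st.cells then failwith "local stack exhausted";
  let addr = st.sp in
  st.cells.(addr) <- v;
  st.sp <- addr + 1;
  addr

(* Run [f] inside a region (this sketch does not handle exceptions). *)
let with_region st f =
  let saved = begin_region st in
  let result = f () in
  end_region st saved;
  result

(* A loop whose body opens and closes a region on every iteration never
   needs more than two cells here, however many iterations it runs. *)
let () =
  let st = create 4 in
  for i = 1 to 1000 do
    with_region st (fun () ->
      ignore (alloc_local st i);
      ignore (alloc_local st (i + 1)))
  done;
  assert (st.sp = 0)
```

The point of the model is that freeing is a single pointer assignment, which is why ending a region before a tail call is enough to keep tail-recursive loops in constant space.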