Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile recursive bindings in Lambda #12596

Merged
merged 24 commits into from
Jan 29, 2024
Merged

Conversation

lthls
Copy link
Contributor

@lthls lthls commented Sep 22, 2023

This PR represents my full work on recursive bindings. It contains a full compilation scheme in Lambda, that supports all of the definitions currently allowed (except for a few corner cases that are now properly rejected, see #12585 and #12592).

I don't advise to review this PR directly now, as it is a bit complex and better reviewed by parts. I'll now get back to #12551, get it properly reviewed and merged, then open new PRs for each part (roughly one PR per commit, except the first three that are all in #12551). I'll also add a commit at some point to remove the existing backend support for non-function recursive definitions.
Update: Now that otehr PRs have been merged, this one is reviewable.

But if someone wants to check whether my work will break existing code, the branch associated to this PR can be used.

@gasche
Copy link
Member

gasche commented Dec 4, 2023

@lthls and myself are starting to review this PR, now that the previous preliminary PRs have been reviewed, iterated upon, reviewed again and merged.

@gasche
Copy link
Member

gasche commented Dec 4, 2023

This PR removes value letrecs from the Lambda intermediate representation (IR): value letrecs are compiled away during the transformation from Typedtree to Lambda, just like pattern-matching compilation.

This may in theory break the workflow of people who produce Lambda IR directly, without going through the OCaml frontend. The only project in this category we are aware of is Malfunction, an experimental prototype. Any affected project can "fix" its code -- if it produces value letrecs, which may or may not be the case of Malfunction -- by replacing a Letrec (bindings, body) Lambda node by a call to the builder function Value_rec_compiler.compile_letrec bindings body provided in this PR.

I consider that this change to an internal IR is acceptable. (This is one of the benefits of the PR, as it allows to remove code duplication between Cmmgen and Bytegen: Lambda is the last common IR so the compilation must happen before it if we want to reduce duplication.)

@gasche
Copy link
Member

gasche commented Dec 4, 2023

Note: Malfunction currently restricts let rec bodies to contain either immediate functions (lambda (...) body) or immediate lazy thunks (lazy ...): https://github.com/stedolan/malfunction/blob/4227093923bfa917565c5194c9ebd586415bdadf/src/malfunction_parser.ml#L105-L112 It will indeed have to call Value_rec_compiler, at least whenever a lazy thunk is used, but again doing so would be very easy.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A first batch of review comments for everything except value_rec_compiler, the meat of the PR.

@@ -312,8 +312,7 @@ type lambda =

and rec_binding = {
id : Ident.t;
rkind : Value_rec_types.recursive_binding_kind;
def : lambda;
def : lfunction;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could have a documentation comment that explains that the AST format changed in 5.2, and that now users are expected to use Value_rec_compiler if they want to produce Lambda code with non-immediate-function recursion.

lambda/printlambda.ml Show resolved Hide resolved
lambda/simplif.ml Outdated Show resolved Hide resolved
lambda/simplif.ml Show resolved Hide resolved
lambda/simplif.ml Outdated Show resolved Hide resolved
lambda/simplif.ml Outdated Show resolved Hide resolved
lambda/tmc.ml Outdated
let def =
match def with
| Lfunction lfun -> lfun
| _ -> assert false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could avoid this not-so-nice code by adding yet another parameter to binding_kind, to decide whether the internal term is a lambda or a lfunction.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A second batch of review, superficial comments on value_rec_compiler.ml.

lambda/value_rec_compiler.ml Outdated Show resolved Hide resolved
lambda/value_rec_compiler.ml Show resolved Hide resolved
lambda/value_rec_compiler.ml Outdated Show resolved Hide resolved
lambda/value_rec_compiler.ml Show resolved Hide resolved
lambda/value_rec_compiler.ml Show resolved Hide resolved
lambda/value_rec_compiler.ml Outdated Show resolved Hide resolved
lambda/value_rec_compiler.ml Outdated Show resolved Hide resolved
lambda/value_rec_compiler.ml Outdated Show resolved Hide resolved
lambda/value_rec_compiler.ml Show resolved Hide resolved
@gasche
Copy link
Member

gasche commented Dec 4, 2023

(My main request for this part of the code is: proper documentation for the compilation scheme. @lthls is working on a comment to describe this under my strict supervision: I want at least 42 lines!)

@gasche
Copy link
Member

gasche commented Dec 6, 2023

(Current state: I am waiting for @lthls to update the PR with the documentation we wrote together during our meeting, and some cleanups we discussed above.)

@lthls
Copy link
Contributor Author

lthls commented Dec 7, 2023

I've pushed the documentation update and a few small additional changes. I believe I've addressed all of @gasche's review comments.

@@ -153,7 +173,7 @@ let compute_static_size lam =

| Pctconst _ ->
(* These primitives are not special-cased by [Value_rec_check],
so we could return [Dynamic] too *)
so we should never end up here; but these are constants anyway *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyway.

lambda/value_rec_compiler.ml Show resolved Hide resolved
lambda/value_rec_compiler.ml Show resolved Hide resolved
{ aux }
and i_lifted = fun x -> i_context.aux x
]}
*)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We forgot to say a word about whether this transformation is performance-preserving. My understanding is that closure conversion introduces a slight overhead as there will be an extra indirection to access the free local variables (the block is in the closure, and then we index into that block, the extra indexing is a new cost) -- whenever the function is called later. One should of course point out that syntactic functions are unchanged and this is the most common case. You could also mention that avoiding backpatching for functions lets the optimizer do a better job.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a compilation scheme that avoids the extra indirection, by defining each free variable separately in the recursive block? My understanding is that this is not easy to do in general, because the free variables may depend on each other in a way that invalidates the naive strategy of simply lifting each of them as one recursive definition. For example:

let rec f =
  let v0 = ... f ... in
  let v1 = ... f ... v0 ... in
  fun x -> ... v0 ... v1 ...

cannot always be turned into the obvious

let rec f = fun x -> ... v0 ... v1 ...
and v0 = ... f ...
and v1 = ... f ... v0 ...

as the dependency of v1 over v0 may be stricter than what is allowed between recursive neighbors.

Maybe we could do something by remembering an ordering constraint between v0 and v1 for the code generation pass later. This sounds complex for a worry that is mostly theoretical, so I think that the current choice is reasonable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the approach taken in the dissection PR. See ocaml-flambda/flambda-backend#1814 for an example where it doesn't work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I would still welcome a short comment on the code-generation / performance impact of the transformation.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I haven't missed it and I'm working on it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that instead of turning fun ... -> ... x ... y ... into fun ... -> .... fv.x ... fv.y ..., we could compile it into fun ... -> let {x; y} = fv in ... x ... y .... Do you think that the code you currently generate (besides being easier to express with just a substitution) is better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually something we discussed for Flambda 2 (in the context of regular closure conversion, but the issue is the same).
With big environments, by reading them early you risk having many variables live at the same time, making register allocation less efficient. But if you have several occurrences of the same variable, the multiple loads will add some extra instructions (increasing code size and the number of runtime instructions).
Closure always uses the substitution version, as does Flambda 1. Flambda 2 used to do the version with early bindings, but we had to change it towards a substitution-like version because of some bad cases.
So I think the version with substitutions is less likely to be very bad, but there are probably a number of common cases where it's not as good as early bindings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! I am glad that I asked.

def : lfunction;
(* Generic recursive bindings have been removed from Lambda.
[Value_rec_compiler.compile_letrec] deals with transforming generic
definitions into basic Lambda code. *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"have been removed from Lambda": I would include OCaml version information, no need to have people guess when the change happened.

{ aux }
and i_lifted = fun x -> i_context.aux x
]}
*)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that instead of turning fun ... -> ... x ... y ... into fun ... -> .... fv.x ... fv.y ..., we could compile it into fun ... -> let {x; y} = fv in ... x ... y .... Do you think that the code you currently generate (besides being easier to express with just a substitution) is better?

@gasche
Copy link
Member

gasche commented Dec 11, 2023

On my side, while I broadly agree with the design, I haven't reviewed the two "compilation" functions yet, split_static_function and compile_letrec in value_rec_compiler.ml.
I need to review them (they are not too large, about 200 and 100 lines) before approving, but I have been swamped with other things last week and this week and haven't got around to it yet.

If there are any volunteers to help review them for implementation correctness, they are warmly welcome! (What they are supposed to do should be clearly documented in code comments.)

@gasche gasche added this to the 5.2 milestone Dec 13, 2023
@gasche gasche self-assigned this Dec 13, 2023
@gasche
Copy link
Member

gasche commented Dec 15, 2023

(For the record, I tried to find a volunteer reviewer among the Tarides people present at INRIA last Wednesday, but everyone looked the other way. cc @hhugo or @smuenzel who may be willing to look at this.)

I heard from @Octachron that the 5.2 milestone placed on this issue means that we get a few extra weeks for reviewing. I would be able to look at the rest of the code myself in this time frame, but would of course be delighted if someone else did the work before that.

@smuenzel
Copy link
Contributor

I may be able to take a look soon -- but it will take me some time to figure out these functions.

let dynamic_size () =
Misc.fatal_error "letrec: No size found for Static binding"

let join_sizes size1 size2 =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the name, I would expect join_sizes to not produce an error when size1 is the same as size2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment explaining some of issues around joining sizes.

| None -> []
| Some fail -> [0 (* ignored *), fail]
in
compute_and_join_sizes_switch env [sw.sw_consts; sw.sw_blocks; fail_case]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if this code can actually be executed. Value_rec_check.classify_expression says that Texp_match is Dynamic (same with other cases, such as Texp_ifthenelse), so wouldn't we expect static bindings to never contain these cases? It might be interesting to add tests to the testsuite that exercise all the cases.

Copy link
Member

@gasche gasche Dec 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My answer to your good question is three-fold:

  1. Branching can be introduced between the typedtree and the Lambda code, for example on non-exhaustive let bindings; we were reminded of this in Segmentation fault on non-exhaustive pattern-matching in let rec #12032, which discusses possible fixes, and the bugs were fixed by @lthls in Correct size for recursive values with branching #12059
  2. As you point out in your comment on join_sizes, it would make sense to accept more definitions, for example those that return the same (non-Unreachable) size in all cases, or those that use branches with unreachable cases in the typedtree already (by extending the classification). In general I think that the code in the present PR should be "reasonably general": not be too tightly coupled to the specific set of terms accepted by Rec_check (so that we don't always have to change both in lockstep), but not more complex for the sake of generality.
  3. The design goal was never to accept more recursive values, so the PR is not trying to extend Rec_check to benefit from the extra generality in compiler when there is some. I think this is the right choice, but it means that we will indeed end up with some gap between what the type-checker accepts and what the compiler would support, with examples that would work correctly but cannot be written down at the source level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code was initially written with eager computation of sizes for let bindings, so it could be reached in cases like:

let rec l =
  let v = match foo with
    ...
  in
  v :: l

With the lazy environments, this code has very likely become unreachable, but I think it's less work to keep the code as it is than to be sure it is actually unreachable.

@gasche
Copy link
Member

gasche commented Jan 10, 2024

( @smuenzel I wonder if you are planning to do a further review round on this, or not. Either way is fine of course. )

@smuenzel
Copy link
Contributor

yes, I will try to get to it next week

Copy link
Contributor

@Ekdohibs Ekdohibs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only reviewed value_rec_compiler.ml{,i}.

lambda/value_rec_compiler.ml Show resolved Hide resolved
| Unreachable, Reachable (lfun, handler) ->
Reachable (lfun, Ltrywith (body, exn_var, handler))
| Reachable _, Reachable _ ->
Misc.fatal_error "letrec: multiple functions"
Copy link
Contributor

@smuenzel smuenzel Jan 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the result of calling split_static_function on a non-static function? Then maybe we want to produce a similar error message as below ("letrec binding is not a static function")?

Or I guess what I'm saying is that I'm not sure I understand the difference between "multiple functions" and "not a static function"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like all occurrences of Misc.fatal_error, it is supposed to be unreachable unless there is a bug in the compiler.
Having a different message helps a little bit in the unlikely event that we actually have to debug this code; here I would know if I get this message that I got a term where two different tails are functions, while the "not a static function" message would tell me that I got a term whose size was computed as Function but I couldn't find the function.
This code should be unreachable because Value_rec_check wouldn't allow it, and join_sizes also has a small comment mentioning this particular issue (that we can't join two Function sizes with the current algorithm even if Value_rec_check allowed it).

Copy link
Contributor

@Ekdohibs Ekdohibs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For value_rec_compiler.ml{,i}.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thaks @smuenzel and @Ekdohibs for your reviews. This is good to go for me, as I have reviewed the design in depth and I can approve on behalf of @Ekdohibs for the compilation function.

@lthls you could update the Changes entry with two new reviewers.

@lthls
Copy link
Contributor Author

lthls commented Jan 24, 2024

I've updated the Changes entry. I've also checked the history, and I didn't see any individual commits that look worth preserving, so I would be in favour of squash-and-merge.

@gasche
Copy link
Member

gasche commented Jan 24, 2024

(Do you want to cleanup the history a bit?)

My plan is to wait a few days to leave other people the opportunity to comment, and the merge -- in trunk only.

@gasche gasche merged commit d6b7459 into ocaml:trunk Jan 29, 2024
11 checks passed
@gasche
Copy link
Member

gasche commented Jan 29, 2024

I went ahead and merged in trunk.

Do we want to cherry-pick in 5.2? We used the 5.2 milestone because this was "almost ready" for before the release was branched, but is it important to include this in 5.2 rather than 5.3, at the risk of having less times to spot regressions before the code is released to users?

@lthls do you have a particular reason to cherry-pick this in 5.2?

(My take: this does not add new features to the compiler, and it probably fixes a few bugs, but those are bugs that we had for a good while already and that users typically don't hit when writing their programs, so I am not sure that we are in a rush. I think we could cherry-pick or not, both choices could be justified.)

@lthls
Copy link
Contributor Author

lthls commented Jan 30, 2024

I agree we are not in a rush, but the PR is ready, has been reviewed properly, so I think it makes sense to include it in 5.2 if we can (if there is no consensus on that, we can delay without debate).

Regarding regressions, I tend to have the opposite view from yours: we're not very likely to spot regressions on trunk anyway, as not all the ecosystem is designed to work with non-released compilers, but the pre-releases are good for that so we should not deliberately delay this process for this PR. It's better to find issues while we still remember the details, in my opinion.

Finally, as a contributor to the flambda-backend compiler, this PR allows us to get rid of our buggy previous code so we will cherry-pick it, and it would be nice if this didn't make us diverge from upstream releases for too long.

@gasche
Copy link
Member

gasche commented Jan 30, 2024

That's an okay-enough justification for me if our release overlord @Octachron agrees.

(Regarding "extra testing in alpha/beta/rc versions": yes, if we are early enough in the testing process. If we add regressions in the middle or at the end of the process, we are more likely to cause stress or un-noticed issues.)

@lthls
Copy link
Contributor Author

lthls commented Jan 30, 2024

(Regarding "extra testing in alpha/beta/rc versions": yes, if we are early enough in the testing process. If we add regressions in the middle or at the end of the process, we are more likely to cause stress or un-noticed issues.)

Yes, if we somehow end up releasing an alpha without this PR I would argue for delaying until the next version too.

@Octachron
Copy link
Member

I am planning the first alpha for the end of the week. Cherry-picking this PR to 5.2 is fine with me (before this date).

gasche pushed a commit that referenced this pull request Jan 30, 2024
@gasche
Copy link
Member

gasche commented Jan 30, 2024

Ok, I cherry-picked into 5.2 at 45e6bd6.

@lthls
Copy link
Contributor Author

lthls commented Jan 30, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants