Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distinguished lambda constructors for ref variables #10090

Merged
merged 6 commits into from Jan 8, 2021

Conversation

Keryan-dev
Copy link
Contributor

This follows issue #9954 which makes notice of an undesired behavior of the simplify_lets optimisation pass on the lambda representation.

The patch consists of replacing the attribute Variable of the Llet constructor, whose role was to identify bindings related to ref values. Instead, we have the specialized constructors Lmutlet and Lmutvar used to distinguish such bindings and their subsequent use respectively.

This distinction prevents incorrect substitutions of mutable variables by simplify_lets and thus act as a fix for #9954.

@gasche
Copy link
Member

gasche commented Dec 17, 2020

Hi Keryan!

It is not so easy to assess this pull request. What I see right now in the patch is a lot of duplication. If I understand the issue correctly (one of the optimizations of Simplif is invalid on mutable bindings), it should be relatively easy to fix with a much smaller patch. So why did you decide to use this more invasive approach of separating the constructs, at the cost of duplication? There may be a good reason (@lthls hints in his issue report that this is something he would like to have), but the reason should be explained in the pull request. Otherwise we would prefer to have the non-invasive fix.

@gasche
Copy link
Member

gasche commented Dec 17, 2020

If I understand correctly, the problematic optimization translates

let y = x in e

into

e[x/y]

This is correct when y and x are usual variables, but fishy when either of them is actually an assignable (mutable variable). One worrying aspect of the current representation is that variable occurrences don't carry any metadata to tell whether they are mutable or not (we would have to use Lmutvar as in this proposal, or Lvar of mut_flag * variable); if the correctness of this optimization relied strongly on the mutability of x, we would be in trouble. (Some parts of Simplify carry information on which of the variables are mutable, but count does not, and it tries to guess what substitution would do, so it would be tricky to track this information there.)

However, when thinking about it, I believe that:

  1. let-mut x = y in e should never be substituted away. This is easy to implement by checking the binding kind today.
  2. For non-mutable bindings, let<kind> x = mut y in e can be correctly substituted away if the binding-kind information is correct, namely if it is Alias, or if it is StrictOpt and x does not occur in e. (It should not be Strict by definition.)

so it sounds like we should be able to fix the reported misoptimization by just disabling the substitution of mutable bindings.

@lthls
Copy link
Contributor

lthls commented Dec 17, 2020

@gasche I fully agree with your first point, but I don't understand why you think you can't have a Strict binding of the form let x = [mut] y in e. In particular, unless I'm mistaken let r = ref foo in let x = !r in e will be compiled using Strict bindings, so you end up after simplification with let<Variable> r = foo in let<Strict> x = r in e. I'll quote the comment in lambda.mli:

Strict: e may have side-effects; always evaluate e first
  (If e is a simple expression, e.g. a variable or constant,
   we may still substitute e'[x/e].)

The problem is that Lvar v is a simple expression if and only if v was not bound as Variable. Separating the representations of regular variables and mutable variables means we can check whether an expression is simple or not just by looking at the top constructor, as was originally intended (I assume).

By the way, I couldn't find any part of Simplif that carries mutability information. I'm curious about what part of the code would do this, so if you have a pointer to the code you were thinking about I'd appreciate if you shared it. It may prove useful to understand what the potential problems with mutable variables are.

I'll point out that I'm the one who suggested to @Keryan-dev that this might be a nice subject to work on, and we've already discussed the approach together. I haven't reviewed all of the code yet though (I plan to do this when I'm back from holidays).
I also hope that we'll be able to build on this PR to simplify a bit the conversions to the middle-end.

@Keryan-dev Your branch used to have the Lmutvar and Lmutlet changes in two separate commits. Do you think you could re-split the patch in two ? The Lmutlet part might prove more controversial, so being able to review each part independently may help.

@gasche
Copy link
Member

gasche commented Dec 18, 2020

Yes, sorry, let<strict> x = mut y can definitely happen, and should not be substituted. I see a remark in Lambda.mli that Strict bindings can then subtitute "simple" definitions, and this would be wrong here (mutable variables are not simple). So it does look like we indeed need to distinguish mutable variables from non-mutable variables -- which is a reasonable design in itself. (We could do it with Lvar of var | Lmutvar of var or with Lvar of mut_flag * var.)

Regarding "tracking mutable variables", the code I had in mind is in Closure, apologies for the misdirection.

fprintf ppf "@[<2>(let[mut]@ @[<hv 1>(@[<2>%a =%a@ %a@]"
Ident.print id value_kind k lam arg;
let expr = letbody body in
fprintf ppf ")@]@ %a)@]" lam expr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: here I think the duplication is cumbersome, and could be avoided with a refactoring of the code (an auxiliary function or something).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I addressed that in 4600d47.

lambda/simplif.ml Show resolved Hide resolved
@Keryan-dev
Copy link
Contributor Author

@lthls, I split the commit in two as requested. Note that Lmutvar on its own do not prevent the odd behavior of simplify_lets (unless we use on top of that the simpler fix that @gasche pointed out).

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made another round of review, trying to see where the code could be simplified now that we distinguish pure variables from mutable variables.

@@ -170,7 +170,7 @@ let lambda_const_int i : Lambda.structured_constant =

let rec close t env (lam : Lambda.lambda) : Flambda.t =
match lam with
| Lvar id ->
| Lvar id | Lmutvar id ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you should be able to simplify the code below: the Lvar case should use find_var, and the Lmutvar case should find_mutable_var directly.

@@ -861,7 +861,7 @@ let close_var env id =
let rec close ({ backend; fenv; cenv ; mutable_vars } as env) lam =
let module B = (val backend : Backend_intf.S) in
match lam with
| Lvar id ->
| Lvar id | Lmutvar id ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, if I read the code correctly, Lmutvar id will never have a value optimization in the store, so we could return Uvar id directly.

(In the future if we had a mutability flag on Uvar as well we could get rid of mutable_variables; but this would be a more invasive change to consider later.)

@@ -25,7 +25,7 @@ open Debuginfo.Scoped_location
exception Real_reference

let rec eliminate_ref id = function
Lvar v as lam ->
Lvar v | Lmutvar v as lam ->
if Ident.same v id then raise Real_reference else lam
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that eliminate_ref is only called with pure variables id, so Ident.same v id is impossible in the Lmutvar case. Could you simplify the code with Lmutvar _ as lam -> lam directly?

@@ -430,6 +434,8 @@ let simplify_lets lam =
count (bind_var bv v) l2;
(* If v is unused, l1 will be removed, so don't count its variables *)
if str = Strict || count_var v > 0 then count bv l1
| Lmutlet(_kind, v, _l1, l2) ->
count (bind_var bv v) l2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read the code correctly, Lmutlet will never be substituted away by the simplif function below, even if the count is <=1. So why are we counting them? I think that we should count pure variables, not mutable variables, so this could be just () and same for the Lmutvar case above.

Copy link
Contributor

@lthls lthls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed this PR, and it looks correct. I've got two small comments that were not addressed by your last commit, otherwise this looks ready to me.

begin match Ident.Map.find id l with
| id' -> Lmutvar id'
| exception Not_found ->
begin try Ident.Map.find id s with Not_found -> lam end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be an error for a mutable variable to appear in s. Could you leave a comment here so that we remember about it ? For this PR keeping the old behaviour is probably best.


let rec simplif = function
Lvar v as l ->
Lvar v | Lmutvar v as l ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mutable variables are never substituted, so you can fold the Lmutvar case with the Lconst case.

@gasche
Copy link
Member

gasche commented Jan 6, 2021

The code is looking relatively nice now, and my attempts to convince myself that the split was unnecessary have failed. Without new inputs, this looks like something that is going to get approved and merged shortly. If other people have qualms about separating mutable variables from variables at the Lambda level, they should speak now.

| Lletrec(idel, e2) ->
Lletrec(List.map (fun (v, e) -> (v, eliminate_ref id e)) idel,
eliminate_ref id e2)
| Lprim(Pfield 0, [Lvar v], _) when Ident.same v id ->
Lvar id
Lmutvar id
| Lprim(Psetfield(0, _, _), [Lvar v; e], _) when Ident.same v id ->
Lassign(id, eliminate_ref id e)
| Lprim(Poffsetref delta, [Lvar v], loc) when Ident.same v id ->
Copy link
Contributor

@lthls lthls Jan 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a subtle bug remaining here (on the line below, but I can't put a comment there directly): Lvar id needs to become Lmutvar id. I missed this in my earlier review...

Copy link
Contributor

@lthls lthls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to add a Changes entry, but otherwise this looks ready.

@mshinwell
Copy link
Contributor

I haven't reviewed the implementation, but this is a good change, which I wholeheartedly support.

@gasche
Copy link
Member

gasche commented Jan 8, 2021

Merging, then.

@gasche gasche merged commit 374ddf7 into ocaml:trunk Jan 8, 2021
@gasche
Copy link
Member

gasche commented Jan 8, 2021

Thanks @Keryan-dev for your first contribution :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants