Propagate the classification from Rec_check #12551
Conversation
Personally, I think it's nicer (so far as practicable) to finish detection of ill-formed programs before translation begins.
I agree, for two reasons:
I tend to agree about keeping the check during typing, but the hash table hack that I used in my first commit was unpleasant, so I'd rather not restore it. One possibility would be to put the check back in its previous form, returning a boolean, but export the classification function independently so that it can be called again during translation to Lambda. I'm a bit uncomfortable with the idea of doing a computation once and hoping that doing it again later would give the same result, though; this looks a bit too close to what we're doing at the moment, and we know it is causing issues.
For the current discussion: for me the obvious way to transfer information from type-checking to lambda would be to add the data to the typedtree -- instead of populating a table on the side. Have you considered it?

For the bigger picture: thanks! I am interested in helping this work move forward, but right now I am obsessed with the pattern-matching compiler so I will not review it right away.
Yes, I have. But I didn't find an obvious place to put the annotation (inside the pattern? the value bindings? the …
No problem. I can work on some of the other items from #12549 even if this is not merged.
To compile I would find it natural to have Rec_check compute some "information about the recursive nest", and then include that information as an extra argument to Texp_let. (This can be done before or after typing the body.) You could also split the information into each (typed) binding, and so generate a new …
I like what I see with this new approach, which seems much cleaner to me than your previous approach to this PR (what do you think?). See the inline comment below for a small irritating suggestion.
I find it difficult to have a good picture of the impact in terms of behavior difference caused by this patch. In theory (if the logic in rec_check and in the backends were perfectly consistent) it should not change anything. In practice it may introduce bugs when the frontend is wrong, it may remove bugs where one of the backends was wrong, and it may turn some compiler-produces-unsound-code situations into compiler-produces-statically-rejected-code situations. This seems overall positive, and I trust your intuition that the change makes the behavior better overall, but I could not gain this certainty myself by just looking at the diff.
lambda/lambda.ml
Outdated
@@ -297,7 +297,8 @@ type lambda =
   | Lfunction of lfunction
   | Llet of let_kind * value_kind * Ident.t * lambda * lambda
   | Lmutlet of value_kind * Ident.t * lambda * lambda
- | Lletrec of (Ident.t * lambda) list * lambda
+ | Lletrec of
+     (Ident.t * Typedtree.recursive_binding_kind * lambda) list * lambda
I would be in favor of making this a record: you are already touching all those places so this will not make your patch more invasive, and it is likely to help in the future.
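The suggested record could look like the following minimal sketch. The stub types stand in for the compiler's own Ident.t, lambda, and Typedtree kind, and are assumptions for illustration only; the field names id, rkind, and def are the ones that appear later in this PR:

```ocaml
(* Stub types standing in for the compiler's own (assumptions). *)
type ident = string
type lambda = Lconst of int | Lvar of ident
type recursive_binding_kind = Static | Dynamic

(* A record instead of the (Ident.t * kind * lambda) triple. *)
type rec_binding = {
  id : ident;                       (* bound variable *)
  rkind : recursive_binding_kind;   (* classification from Rec_check *)
  def : lambda;                     (* right-hand side *)
}

let b = { id = "x"; rkind = Static; def = Lconst 0 }
```

Adding a field to such a record later (say, a debugging location) would not disturb any code written with the `{ r with ... }` update form.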
I will try this suggestion. I think the only place where it could cause problems is in tmc.ml, but hopefully my GADT hack will prove robust enough to handle that.
I could probably have been a bit more explicit. With this patch, the only actual difference is that when …
      ctx, bindings

-and traverse_binding outer_ctx inner_ctx (var, def) =
+and traverse_binding :
+  type a. a binding_kind -> context -> context -> a -> a list =
Note: instead of a GADT, you could do this with a record:
type 'a decomposed_binding = {
var: Ident.t;
def: lambda;
recompose : Ident.t * lambda -> 'a;
}
val traverse_binding : context -> context -> 'a decomposed_binding -> 'a list
I am not sure that this would be better, but at least the types involved are a bit simpler. (Your choice.)
With that presentation you cannot really use your approach of reusing the same recursive kind in the no-transform case, and adding Static in the transform case -- the recompose function will do only one or the other. I think that always using the same recursive kind should be correct -- the TRMC transform of a non-recursive function should be two non-recursive functions.
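A minimal self-contained sketch of this record-based decomposition; the stub types replace the compiler's, and the `plain` helper is an assumption for illustration:

```ocaml
(* Stub types (assumptions for illustration). *)
type ident = string
type lambda = Lconst of int | Lvar of ident

(* A binding of some type 'a decomposed into the parts the traversal
   needs, plus a function to rebuild a value of the original type. *)
type 'a decomposed_binding = {
  var : ident;
  def : lambda;
  recompose : ident * lambda -> 'a;
}

(* A plain (ident * lambda) binding decomposes trivially: *)
let plain (var, def) : (ident * lambda) decomposed_binding =
  { var; def; recompose = (fun b -> b) }

let d = plain ("x", Lconst 1)
let rebuilt = d.recompose (d.var, d.def)
```

The point of the comment above is that `recompose` is fixed once per call site, so it cannot choose between "reuse the incoming kind" and "force Static" depending on whether the transform fired.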
Just to be clear: this is approved but I would like to give @lthls a chance to use a record type for recursive bindings before merging.
I have implemented the suggestion to use records, it does look a bit nicer. Ideally, after all the let-rec PRs are merged it should become just …
I wonder what you think about the record. I like it, but I think that you could do even better with a judicious use of if, or maybe even better with a map_rec_binding helper function.
There is always a tension between listing all the fields, which forces us to revisit the line on each modification to the record type (the product counterpart of exhaustive pattern matching, which is not in general a bad thing), and using with or other forms that silently keep non-mentioned fields unchanged. I think in many cases it is clear that we are just mapping on the code subterm, and in this case a new-field-agnostic form will be nicer for future changes.
lambda/lambda.ml
Outdated
       let id', l = bind id l in
-      ((id', clas, def) :: ids', l)
+      ({ id = id'; rkind; def } :: ids', l)
I would prefer a { binding with id = id' } approach, which requires no changes when adding new fields in the future.
lambda/lambda.ml
Outdated
@@ -840,7 +845,7 @@ let subst update_env ?(freshen_bound_variables = false) s input_lam =
       let id = try Ident.Map.find id l with Not_found -> id in
       Lifused (id, subst s l e)
   and subst_list s l li = List.map (subst s l) li
-  and subst_decl s l (id, clas, exp) = (id, clas, subst s l exp)
+  and subst_decl s l { id; rkind; def } = { id; rkind; def = subst s l def }
or
and subst_decl s l decl = { decl with def = subst s l decl.def }
lambda/lambda.ml
Outdated
@@ -885,7 +890,9 @@ let shallow_map f = function
   | Lmutlet (k, v, e1, e2) ->
       Lmutlet (k, v, f e1, f e2)
   | Lletrec (idel, e2) ->
-      Lletrec (List.map (fun (v, clas, e) -> (v, clas, f e)) idel, f e2)
+      Lletrec
+        (List.map (fun { id; rkind; def } -> { id; rkind; def = f def }) idel,
+         f e2)
Again I would use { r with def = f r.def } here.
lambda/simplif.ml
Outdated
        eliminate_ref id e2)
      let bindings =
        List.map (fun { id = v; rkind; def } ->
            { id = v; rkind; def = eliminate_ref id def })
I think that this is (fun r -> { r with def = eliminate_ref id r.def }), which avoids the subtle issue of not shadowing id when opening the record.
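The shadowing pitfall mentioned here can be shown with a small self-contained example (stub types and names are assumptions for illustration):

```ocaml
type binding = { id : string; def : int }

(* If we pattern-opened the record as { id; def }, the field id would
   shadow the function's own id parameter. Renaming the field on the
   pattern ({ id = v; def }) or using the record-update form keeps the
   outer id in scope: *)
let lengthen id b = { b with def = b.def + String.length id }

let r = lengthen "xy" { id = "a"; def = 1 }
```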
lambda/simplif.ml
Outdated
        simplif ~try_depth body)
      let bindings =
        List.map (fun { id; rkind; def } ->
            { id; rkind; def = simplif ~try_depth def })
... maybe there should be a map_rec_binding : (lambda -> lambda) -> rec_binding -> rec_binding function in Lambda?
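A sketch of such a helper, under stub types; only the name map_rec_binding comes from the suggestion above, everything else is an assumption:

```ocaml
(* Stub types (assumptions for illustration). *)
type lambda = Lconst of int
type rkind = Static | Dynamic
type rec_binding = { id : string; rkind : rkind; def : lambda }

(* Map a transformation over the code subterm only, silently keeping
   every other field (including any added later) unchanged: *)
let map_rec_binding f r = { r with def = f r.def }

let bump (Lconst n) = Lconst (n + 1)
let b = map_rec_binding bump { id = "x"; rkind = Static; def = Lconst 1 }
```

Call sites like the two Simplif snippets above would then shrink to `List.map (map_rec_binding (simplif ~try_depth)) bindings`.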
Or maybe we could rewrite all of these Simplif functions in terms of the existing Lambda iterators.
No opinion on the Changes entry, do as you prefer. If we start running late in the season we can always reword the entry. (My approach is to have an entry that describes the work done so far, and update the entry text if necessary as further PRs come in. But again, do as you want.)
Good for me: if you start feeling pressure to make it to 5.2, I can convince you that reviewing my pattern-matching PRs is in your own best interest. You would be a fine reviewer with #12534 for example.
(Squashed and) merged. Thanks!
This is the first step outlined in #12549. It propagates the classification computed during Rec_check to the places where recursive bindings are compiled. It is mostly equivalent to the previous code, with two differences:

- … Dynamic. This is mostly irrelevant, but for expressions such as let a = ... in let rec b = (fun x -> x) (Some a) in ... the current compilation scheme would have classified b as Dynamic in Rec_check, but simplifications would have turned the definition back into a block of known size. Since the compilation scheme for values of unknown size is actually better than the one for values of known size (when it is correct), this is actually an improvement.
- … Translcore in the second commit. This changes a few test outputs, as some illegal definitions now produce other warnings before the error, but the changes looked reasonable to me. If this proves controversial, I could go back to the approach I initially used and kept in the first commit, which is to use a global hash table (populated during typing and read during translation to lambda).

The vast majority of the changes are just mechanical changes to propagate the classifications properly through the middle-end.
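The example from the description can be written out as a complete program. Rec_check classifies b as Dynamic because its definition is a function application of unknown shape, even though simplification later reveals a block of known size; the concrete values here are assumptions for illustration:

```ocaml
let a = 1

(* Accepted by Rec_check: b does not occur in its own definition, and
   the right-hand side is classified Dynamic (unknown shape) because it
   is a function application rather than a syntactic block. *)
let rec b = (fun x -> x) (Some a)
```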