
Syntactic function arity #12236

Merged 10 commits on Jul 15, 2023

Conversation

ncik-roberts (Contributor)

This PR implements syntactic function arity as described in this RFC.

This is a big PR! To help everyone understand what's going on, I've written this (longish) introductory post. It starts by summarizing the related RFC, offers some changes to the design that came up during implementation, includes details to help a reviewer understand the changes, and ends with a few warts in the current implementation that seem unnecessary to solve now. Readers interested in language design but not implementation details can read just through the design details. The rest of this post is relevant only to someone providing a detailed review and may make the most sense read in parallel with the review.

The motivation is twofold:

  • Semantics: in the relatively rare cases where it makes a difference, the implemented semantics is both clearer and closer to programmer expectations than the current semantics.
  • Implementation: the implemented semantics allows us to simplify some tricky and fragile parts of the compiler.

The RFC has more detail, but I’ll summarize briefly. A function has a notion of “arity”: how many arguments it will accept before running effects for optional-argument defaults and pattern matching. (Pattern matching can have effects, e.g. lazy patterns.) Currently, the parser treats all functions as unary, and arity is determined only later, based on the typedtree. Effectful patterns interrupt arity, so let f (lazy x) y = x + y is treated as a unary function returning a unary function. This changes with this PR.

After this PR, arity is a syntactic notion, so let f (lazy x) y = x + y is a 2-ary function. f (lazy (assert false)) won’t raise, but f (lazy (assert false)) 3 will.
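For instance, here is a small sketch (hypothetical snippet) of the new behavior, using a lazy whose forcing prints:

```ocaml
(* Under syntactic arity, the lazy pattern is forced only once
   both parameters have been supplied. *)
let f (lazy x) y = x + y

let () =
  let l = lazy (print_endline "forced"; 1) in
  let g = f l in                 (* no output yet: f is 2-ary *)
  Printf.printf "%d\n" (g 2)    (* prints "forced", then "3" *)
```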

There is a related, but subtly different, notion of the “runtime arity” of a function. If a function application is fully saturated (i.e. the function has runtime arity n and is applied to n arguments), there is an optimization that makes application fast in native code. This PR makes syntactic arity almost always coincide with runtime arity. The one exception is when the syntactic arity is greater than the maximum allowed runtime arity (127), in which case the application optimization stops being useful.

Design Decisions and Clarifications to RFC

There are a few details implicit in or missing from the RFC that are worth spelling out here.

Optional argument defaults

The RFC describes how pattern matching works with syntactic arity, but doesn’t give a precise spec for optional argument defaults. The intuition is the same: pattern matches and optional-argument defaults are evaluated left to right, in parameter order.

The more general characterization is as follows:

Any active function parameter in the list of parameters between fun and -> (or between let function_name and =) is evaluated in left-to-right order before entering the body of a function (or before performing the match that is part of a function body). An active function parameter is any of the following:

  • Lazy pattern
  • A record pattern matching a mutable field
  • Partial pattern matches
  • Optional parameters with default values
  • Float array patterns, like [| 1.0; x |]. Matching against such a pattern can involve allocation, because x must refer to a boxed float but the array is flat. Allocation can run the GC. Running the GC can run finalizers, which run arbitrary code.
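To make the ordering concrete, here is a small sketch (hypothetical example) of the intended ordering under this PR, combining an optional default with a lazy pattern:

```ocaml
(* Under syntactic arity, g is 3-ary; no effect runs until all three
   parameters are supplied, and then effects run left to right. *)
let g ?(a = print_endline "default for a") () (lazy b) = ignore a; b

let () =
  let partial = g () in    (* nothing printed yet *)
  let r = partial (lazy (print_endline "forcing b"; 1)) in
  (* prints "default for a", then "forcing b" *)
  Printf.printf "%d\n" r
```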

Coercions and constraints

The parsetree spec given in the RFC mentions constraints, but not coercions. This should be a 2-ary function:

type a = [ `A ]
type ab = [ a | `B ]
type abc = [ ab | `C ]
let f () : ab -> ab :> a -> abc = function x -> x;;

stedolan tells me this was an unintentional omission. Coercions should be treated just like normal type constraints, not interrupting the arity of the declared function.

Class functions

Class functions still aren’t n-ary:

# class c ?(arg1 = print_endline "arg1") () ?(arg2 = print_endline "arg2") () = object(_) end;;
# let c' = new c ();;
arg1
# let c'' = c' ();;
arg2

I wanted to get input on this design before doing the same thing for class functions.

Avoiding unsoundness

This definition would be unsound:

# type (_, _) eq = Eq : ('a, 'a) eq
# let bad : type a. ?opt:(a, int -> int) eq -> unit -> a =
    fun ?opt:(Eq = assert false) () x -> x + 1

(* hypothetically, this would be given this type. but we don't actually do this. *)
val bad : ?opt:('a, int -> int) eq -> unit -> 'a
# let x : 'a. 'a = bad ()

In the current world, unsoundness is avoided by not pushing the evaluation of assert false past any GADT patterns. (So bad () would raise.) But this “pushing the evaluation” is the whole point of syntactic arity. So the approach I take is to stipulate that an n-ary function definition must unify with an n-ary arrow type:

# let bad : type a. ?opt:(a, int -> int) eq -> unit -> a =
      fun ?opt:(Eq = assert false) () x -> x + 1;;
Error: This expression has type ?opt:(a, int -> int) eq -> unit -> 'a -> 'b
       but an expression was expected of type
         ?opt:(a, int -> int) eq -> unit -> a
       Type 'a -> 'b is not compatible with type a

This approach has two warts. One, it rules out some previously-accepted (if weird) definitions:

# let spurious : type a. (a, int -> int) eq -> a = fun Eq x -> x + 1;;
Error: This expression has type (a, int -> int) eq -> 'a -> 'b
       but an expression was expected of type (a, int -> int) eq -> a
       Type 'a -> 'b is not compatible with type a

Two, it interacts with locally abstract types in a non-obvious way. The unification happens after unification variables have been substituted for any locally abstract type. So a definition might typecheck, but with 'a -> 'b substituted for what was written as a locally-abstract type:

# let odd (type a) ?opt:((Eq : (a, int -> int) eq) = assert false) () : a = function x -> x + 1;;
val odd : ?opt:('a -> 'b, int -> int) eq -> unit -> 'a -> 'b = <fun>

(This definition is ok because it will assert false if opt is omitted and it is applied up to the 'a argument.)

This type-checking behavior is consistent with how locally-abstract types behave in other situations, like recursion and the value restriction, and so is perhaps acceptable.

# let rec f (type a) (x : a) = f 3;;
val f : int -> 'a = <fun>
# let r (type a) = ref ([] : a list);;
val r : '_a list ref = {contents = []}

If this is a sticking point, I’m happy to discuss alternatives.

Question

This is a substantial change to a widely-used AST construct. How does the OCaml community deal with such changes that warrant (e.g.) updating all PPXes?

A guide to review

It’s likely best to review directory-by-directory. Here’s a quick summary of the per-directory changes:

  • parsing: Add the new function construct, replacing the two old constructs. The new construct includes multiple function parameters, newtype parameters, coercions/constraints on the function body, and the function body itself (whether that’s function cases or a standard expression body).
  • typing: Though the diff is large, this is semantically a refactor of existing functionality. The new function construct involves several language features (newtypes, coercions, constraints, function cases), and so this patch factors out pieces of type-checking code for those language features to be reused in the typing of the new construct.
  • lambda: Some tricky code is deleted. I get rid of push_defaults and the currying transformation, instead preserving the syntactic arity from type-checking. I add the translation of optional default arguments – this used to be done in type-checking, but this interacts poorly with syntactic arity.
  • utils: I add a helper function, List.chunks_of, that I found useful in enforcing the max runtime arity of a function.
  • tools: Changes to ocamlprof implied by the change in parsetree.
  • ocamldoc: Changes to ocamldoc implied by the change in typedtree.
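For reference, a List.chunks_of with this role might look like the following sketch (hypothetical implementation; the actual signature in utils/ may differ). chunks_of n xs splits xs into consecutive chunks of length at most n:

```ocaml
(* Hypothetical sketch of List.chunks_of: split a list into
   consecutive chunks of at most n elements each. *)
let chunks_of n xs =
  if n <= 0 then invalid_arg "chunks_of";
  let rec go acc chunk k = function
    | [] ->
        (* flush the final (possibly partial) chunk *)
        List.rev (if chunk = [] then acc else List.rev chunk :: acc)
    | x :: rest ->
        if k = n then go (List.rev chunk :: acc) [x] 1 rest
        else go acc (x :: chunk) (k + 1) rest
  in
  go [] [] 0 xs

(* chunks_of 2 [1; 2; 3] = [[1; 2]; [3]] *)
```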

There is more detail below.

The parsetree/typedtree changes imply some changes outside of the parsing/typing directories. The first few commits split up the parsing and typechecking changes, if you’re interested in narrowing in on the changes implied by one or the other.

Parsing

This is mostly a straightforward implementation of the RFC, with a few things to note:

  • fun (type a b c) -> e is still interpreted as nested Pexp_newtype’s, and not as a zero-argument Pexp_function.
  • I maintain a function parameter as arg_label * expression option * pattern to keep the diff smaller. This is instead of adopting the Arg_simple of pattern | Arg_labelled of string * pattern | Arg_optional of string * pattern * expression option distinction described in the RFC. This could easily be changed later.
  • I introduce a Pconstraint/Pcoerce variant instead of representing coercions/constraints as a pair of optional types. I found this type to be advantageous in type-checking, as it allows us to keep the code that handles coercions/constraints separate from the code that handles the absence of a coercion/constraint. It’s parallel to Pexp_constraint/Pexp_coerce, which is a nice bonus.
  • In fun x y -> function p1 -> e1 | p2 -> e2, the function should be parsed as Pfunction_cases and not as Pfunction_body $expr in order to get the arity right. The PR avoids shift-reduce conflicts here by creating a new class of expression, fun_expr, which can appear on the RHS of a lambda/function definition and which excludes function constructs. To force the interpretation of the RHS as Pfunction_body $expr, the programmer can wrap the function in parentheses.
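Concretely, a sketch of the two parses (hypothetical example):

```ocaml
(* The trailing [function] joins the enclosing [fun]'s parameters,
   giving one 3-ary function. *)
let f = fun x y -> function
  | Some z -> x + y + z
  | None -> x + y

(* Wrapping in parentheses forces a 2-ary function whose body
   evaluates to a (unary) function. *)
let g = fun x y -> (function
  | Some z -> x + y + z
  | None -> x + y)
```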

Typing

Typing code for four constructs is significantly refactored, in increasing order of invasiveness:

  • Newtypes, like (type a) …
  • Coercions/constraints
  • Cases (already used in typing matches and fun/function constructs)
  • Functions

A source of complexity common to all of these is the fact that typing the “argument” is no longer uniform. I address this by parameterizing the typing of these constructs over a function that takes care of typing the argument.

Let me make that clearer with an example. Take constraints. Previously the only constraint-like construct was Pexp_constraint (e, t), which represents things like (e : t). Constructs like let f x : t = function <cases> were desugared as let f x = (function <cases> : t). This desugaring is no longer possible, as the function body is parsed as Pfunction_cases, i.e. not an expression. So, I took the code that handled Pexp_constraint, bubbled it up to a top-level function, and parameterized it over the argument to the constraint; that allows the argument to be an expression (as in (e : t)) or “function cases” (as in let f x : t = function <cases>).

It is relatively straightforward to see how this technique was applied with Pexp_newtype and Pexp_constraint/Pexp_coerce. It is somewhat more difficult to see how this worked with type_cases, so I have a dedicated section on that below.

map_half_typed_cases and type_cases

This section describes the refactoring I did in the typing of function cases. It’s ultimately a no-op refactor, but the approach is more complex than the other pieces of this PR.

There used to be a single function, type_cases, which handled the typing for fun parameters, match cases, function cases, letop cases, and try cases. In this PR, I pull out a large amount of that code into map_half_typed_cases – the typing of fun parameters now calls that instead of type_cases. The “half” in “half typed cases” refers to the fact that the pattern, but not the body, is type-checked. (This terminology is used already in typecore.)

Take an example like fun x y -> function <cases>. You can think of map_half_typed_cases as the code in common to the typing of x, y, and <cases>. It’s the common prelude and postlude that we always want to run when typing a parameter, including (say) the typing of the pattern. But it’s now parameterized over what it means to type the body of the parameter. For x and y, that’s a recursive call to type_function to check the rest of the parameters and the body; for <cases>, that’s checking the expression bodies of each of the cases.

GitHub’s diff is actually very helpful here. It shows how type_cases was refactored into map_half_typed_cases. The big deleted hunk of code was just moved into the new type_cases. type_cases is now only called on things that are syntactically cases, so it does some extra things:

  • Type when-guards
  • Do the ambiguous binding check in the presence of when-guards
  • Check case expression bodies
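As a schematic (hypothetical names and types, much simpler than typecore's actual signatures), the shape of the refactoring is:

```ocaml
(* Toy model of the refactoring: the pattern-side work is shared,
   while "typing the body" is supplied by the caller. *)
type 'body case = { pat : string; body : 'body }

let map_half_typed_cases ~type_body cases =
  List.map
    (fun { pat; body } ->
       (* shared prelude: "type-check" the pattern *)
       let typed_pat = String.trim pat in
       (* caller-specific: what checking this body means here *)
       { pat = typed_pat; body = type_body body })
    cases
```

For fun parameters, type_body would be a recursive call that checks the remaining parameters; for match/function cases, it checks the case's expression body.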

Typing a function

This is now structured as a fold over the parameters. type_function does much of the same work as the old type_function, except now it also calls the parameterized versions of typing constraints, newtypes, and cases.

A guide to review: better diffs

Refactoring newtypes/coercions involved moving a fair amount of code. This bash function helps you see that, with color:

function my_diff () {
  # git diff doesn't work with process substitution
  tmp1=$(mktemp)
  tmp2=$(mktemp)
  merge_base=$(git merge-base trunk syntactic-function-arity)
  git show "$merge_base":typing/typecore.ml | sed -n "$1" > "$tmp1"
  git show syntactic-function-arity:typing/typecore.ml | sed -n "$2" > "$tmp2"
  git diff --patience --no-index -w -- "$tmp1" "$tmp2"
}
Refactored code and the diff command to view it:

  • type_newtype: my_diff '/Pexp_newtype/,/Pexp_pack/p' '/and type_newtype/,/and type_ident/p'
  • type_coerce: my_diff '/Pexp_coerce(sarg/,/Pexp_send/p' '/and type_coerce/,/and type_constraint/p'

Warts in the implementation

Methods

There is a small ugly function in translation to give methods the right arity.

Take the method m of this class as an example:

class c = object(self)
  method m x y = x + y
end

This method is parsed roughly as: Pexp_function ([ "self" ], Pexp_poly (Pexp_function ([ "x"; "y" ], … (* x + y *)))). But we’d like for m to have an arity of 3 (not because this can change the semantics of the program, but so that we get a runtime arity of 3 and can benefit from the native-code optimization). The approach I take is to recover the arity in translation to lambda, by fusing together function bodies when the inner one is a Texp_poly. This is possible because methods are the only place that Pexp_poly is introduced.

It seems more principled to have an AST node that encodes this more directly. (Perhaps Pexp_poly should take an extra argument, self?) But, I didn’t do this as part of this PR to avoid yet another invasive change.

Attributes

Some attributes continue to be dropped. E.g.:

fun x y -> function[@inline] z -> expr

Here, the inline attribute is dropped in translation as it’s attached to the Pfunction_cases piece of an n-ary function, and not a proper expression. (This function can’t be inlined!) This doesn’t change anything. Previously this attribute was silently dropped in the currying translation.

Attributes relevant to type-checking continue to be respected, e.g. this is OK:

fun x y -> function[@ocaml.warning "-8"] true -> expr

@goldfirere (Contributor)

goldfirere commented May 10, 2023

In thinking about this PR, I'm unsure how best to help with review. I'm quite happy to review this -- but I imagine someone with more experience in the type checker should also review, with me perhaps doing a higher-level pass? In the end, we know review is a bottleneck, and we wish to lower the burden of dealing with large patches like these. So: how can I best help here?

(Full disclosure: I helped choose this as a task for @ncik-roberts to work on, and I offered feedback on aspects of the implementation. I have not done any detailed code review here, though, and I have not contributed any of the code.)

@Octachron (Member)

Octachron commented May 10, 2023

I don't follow the motivation for the backward-compatibility-breaking example:

# type (_, _) eq = Eq : ('a, 'a) eq
# let bad : type a. ?opt:(a, int -> int) eq -> unit -> a =
    fun ?opt:(Eq = assert false) () x -> x + 1
 # let x : 'a. 'a = bad ()

Here the syntactic arity of bad is two (as the syntactic arity of ?opt:(a, int -> int) eq -> unit -> a aka the arity in the equation-less environment) and thus x should raise an assert false exception. Breaking valid user code doesn't strike me as a very attractive alternative to computing the correct arity.

@ncik-roberts (Contributor, Author)

ncik-roberts commented May 10, 2023

"Syntactic arity" as I describe in this PR is a property of the expression language, not the type language. So the syntactic arity of fun ?opt:(Eq = assert false) () x -> x + 1 is 3. (?opt, (), and x are the 3 syntactic params.)

The benefit of basing arity on the expression language is that effect ordering and runtime arity are completely predictable.

You're right that the current version of OCaml avoids unsoundness by computing an arity of 2 here, but that arity is not syntactically apparent. The number of arrows in the type can't be used to figure out the arity in general (this statement is true for both trunk and for this PR):

(* this has arity 2 in both trunk and this PR *)
let arity2 : type a. a -> a lazy_t -> bool -> a =
  fun x (lazy y) ->
    (fun b -> if b then x else y)

That's a clarification of what syntactic arity means but doesn't respond to your point. I'll mention that it's a small change to rewrite bad into something that type-checks and behaves properly by breaking up the arity:

# let good : type a. ?opt:(a, int -> int) eq -> unit -> a =
    fun ?opt:(Eq = assert false) () -> (fun x -> x + 1)
val good : ?opt:('a, int -> int) eq -> unit -> 'a = <fun>

Requiring this rewrite in these cases is certainly a trade-off against the benefits of arity. But "these cases" are limited to ones where the user is hiding an arrow in the type of the function being defined using a GADT pattern bound by that same function's parameters. Your judgement is helpful in gauging the impact of this. My feeling is that this would be very rare.

@gasche (Member)

gasche commented May 22, 2023

@Octachron asked me for my opinion on this specific matter (the unification with a function type to guard against unsoundness). Here is my understanding/opinion right now.

The unification can be understood as the fact that we are in fact type-checking an eta-expanded version of the program, which first takes several arguments (according to the syntactic arity of the function) and then applies the patterns to them. (This expansion is explained in the RFC document.)

(* before *)
let f ?opt:(p1 = def1) p2 p3 = body

(* after expansion *)
let f ?opt:arg1 arg2 arg3 =
  let p1 = match arg1 with
    | None -> assert false
    | Some x -> x
  in
  let p2 = arg2 in
  let p3 = arg3 in
  body

Another way to prevent unsoundness would be to perform the expansion and type-check the result normally. Is it equivalent to the approach that was implemented, with a unification after the fact? (This is not obvious to me in presence of Pexp_newtype abstractions etc.) Otherwise, would it maybe be possible to formulate the check in a way that would more clearly relate to the obviously-sound expansion?

Then there is the question of what we think of the expansion and the restriction it imposes on the typability of OCaml programs -- it rejects more programs. My current thoughts:

  1. The expansion is clearly reasonable in the presence of optional arguments, whose operational semantics is tricky right now and is greatly simplified by the expansion.
  2. The expansion is arguably justified in the presence of inexhaustive patterns, or more generally all patterns whose evaluation may also perform an effect (lazy patterns, or even mutable field reads).
  3. It is a harder sell for vanilla exhaustive patterns that have a type-checking impact.

The bad example that was discussed previously was an example of (1).

An example of (2) would be:

type (_, _) eqtest = Equal : ('a, 'a) eqtest | Nope : ('a, 'b) eqtest

let ex2 : type a . (a, unit -> unit) eqtest -> a = fun Equal () -> ()

An example of (3) was also given:

type (_, _) eq = Equal : ('a, 'a) eq
let ex3 : type a . (a, unit -> unit) eq -> a = fun Equal () -> ()

Personally I certainly agree that rejecting examples in category (1) is fair game, and I would agree that rejecting (2) is also reasonable, but I do find rejecting (3) more frustrating (I suspect this is the category that @Octachron finds problematic). I can see arguments going both ways:

  • in favor of rejecting (3): this makes the language more regular, no need to special-case (3) if we already perform the expansion for (1) and (2)
  • in favor of accepting (3): this is better for backward-compatibility, and there is a clear criterion for when to not perform the expansion (exhaustive patterns are kept as formal arguments in the eta-expansion, instead of being replaced by formal variables).

A weakness of the "accepting (3)" position is that the "clear criterion" very much depends on typing, we cannot tell if a pattern-matching is exhaustive or not just from the definition. The RFC as proposed is purely syntactic, which has the force of simplicity.

@ncik-roberts (Contributor, Author)

ncik-roberts commented May 22, 2023

Another way to prevent unsoundness would be to perform the expansion and type-check the result normally. Is it equivalent to the approach that was implemented, with a unification after the fact?

I would say that the implementation is extremely close to just doing the expansion, but to be exhaustive, I would add two caveats:

  1. The implemented approach uses the technology of type_cases to type the patterns (as before), whereas the expansion as literally written would use type_let.
  2. It's not obvious to me how to define the expansion over type annotations on a function body.

The caveats seem minor to me; I discuss them below. My response here doesn't help decide when we should do this expansion, but it does support the notion that the restriction on typability is a question about the expansion rather than about the current implementation.


type_cases vs. type_let

The current implementation is actually more like an expansion into:

(* before *)
let f ?opt:(p1 = def1) p2 p3 = body

(* after expansion *)
let f ?opt:arg1 arg2 arg3 =
  match
      match arg1 with
      | None -> def1
      | Some x -> x
  with
  | p1 ->
      match arg2 with
      | p2 ->
        match arg3 with
        | p3 -> body

This is for more obvious backward-compatibility. The old approach for typing funs used type_cases. I can't think of a case where the expansion you wrote (using let) and the expansion I wrote (using match) would change whether a program is accepted, but any difference here would have the potential to be a regression. And the choice of match vs. let is unrelated to soundness.

Type annotations on a function body

I mean things like:

let foo pat1 pat2 : ty = function
  | pat_case1 -> body1
  | pat_case2 -> body2

The difficulty is that there's no obvious place to annotate ty on the expanded program.

I suppose you could expand to:

let foo arg1 arg2 arg3 =
  let pat1 = arg1 in
  let pat2 = arg2 in
  ((match (arg3 : ty_arg) with
    | pat_case1 -> body1
    | pat_case2 -> body2)
  : ty_res)

where you additionally stipulate that ty_arg -> ty_res unifies with ty. I can't think of a way that literally doing this expansion would behave differently to what's implemented.

@goldfirere (Contributor)

I think this design point is a good one to debate. While I tend to lean toward simplicity as an important factor, it is annoying that simplicity and backward compatibility pull in opposite directions, and others are better suited than I in judging the weight of this backward incompatibility.

It strikes me, though -- and in conversation with @ncik-roberts -- that review on the correctness of this patch can be done even without a decision on this particular design point. Nick estimates that changing the behavior in @gasche's case (3) would just add complexity, meaning that time spent reviewing the patch as-is would all be well spent. I am happy to be this reviewer (as I offered above), but I wanted to check that my review would be seen as sufficiently independent here before putting the time in. What do you think?

@gasche (Member)

gasche commented May 22, 2023

I think that applying the expansion (or an equivalent type-checking transformation) unconditionally, at the cost of breaking typability in case (3) (as currently proposed in this PR) is the better choice here. Indeed, I find the degree of sophistication involved in accepting (3) unreasonable; my understanding is that we would have to type-check each pattern in fun p1 ... pn -> once, and:

  • if it is exhaustive, add its GADT equations in the environment and continue
  • if it is not exhaustive, delay it for later -- type-check it again after the -> and add its equations only at this point

I find this too complex, and it has a backtracking flavor that is unpleasant. (Also, checking whether a pattern is exhaustive is complex and we occasionally get it wrong. Having this complex check determine the desugaring / operational behavior of the code does not feel great.)

Unfortunately, I also have the impression that detecting this situation -- a function constructor is ill-typed because the current type is not known to be a function type but would be if we had not delayed the introduction of some GADT equations -- is also difficult, so I am pessimistic about the fact that we could write a targeted error message in this situation.

@goldfirere (Contributor) left a comment


This is a partial review -- I stopped in type_function. More next week.

Review threads: parsing/parsetree.mli (2), parsing/parser.mly (3), typing/typedtree.mli, typing/rec_check.ml, typing/typecore.ml (3)
@ncik-roberts force-pushed the syntactic-function-arity branch 2 times, most recently from 1cc4484 to 7258a15 on May 30, 2023
@ncik-roberts (Contributor, Author)

Unfortunately, I also have the impression that detecting this situation -- a function constructor is ill-typed because the current type is not known to be a function type but would be if we had not delayed the introduction of some GADT equations -- is also difficult, so I am pessimistic about the fact that we could write a targeted error message in this situation.

@gasche The current approach proceeds as follows for an n-ary function:

  1. Typecheck the function, eagerly introducing GADT equations.
  2. Then, outside the scope of the GADT equations, check that the function unifies with n nested arrow types.

So, if I am understanding you correctly, it seems to me that we can detect the situation you describe and write a targeted error message in the case that the unification in Step 2 fails.

Review thread: typing/rec_check.ml
@goldfirere (Contributor) left a comment


A little bit more review complete. More in an in-person meeting tomorrow.

Review threads: typing/rec_check.ml, typing/untypeast.ml
@goldfirere (Contributor) left a comment


Getting closer!

Review threads: typing/typecore.ml (3)
@goldfirere (Contributor) left a comment


OK. I have reviewed everything here save the files in lambda and in ocamldoc, saving those for someone more expert in those areas of the compiler.

Review threads: typing/typecore.ml (4)
@ncik-roberts (Contributor, Author)

@lpw25 volunteered to read the translcore.ml piece of this.

@lpw25 (Contributor) left a comment


Approving the parts in lambda/, which look good to me.

Review thread: lambda/translcore.ml
@goldfirere (Contributor) left a comment


This looks like it's in good shape, modulo the little concern about Merlin (which I believe @ncik-roberts is looking into). All the files have been reviewed for correctness. What's left for getting this merged?

@gasche (Member)

gasche commented Jul 13, 2023

I agree that it is probably time to merge this PR. Two questions:

  • is there a Changes entry? I cannot find it in the logs. (it should also credit the work on the RFC, and js-internal reviewers if applicable.)
  • do you want to clean up the commit history before merging? (If the answer is "no" I would be tempted to merge as-is instead of squashing, because there are really a lot of separate changes here. But rebasing/squashing/fixupping some things could be nice if that is your idea of a nice pastime.)

@Octachron (Member)

For the record, my final opinion on the backward-compatibility issue is that the change will be fine with a specialized and understandable error message for this exotic case. Moreover, I agree with @ncik-roberts that this can be done in a straightforward way on top of this PR.

For the remaining review, I don't know if @gasche has reviewed the rec_check changes?

@ncik-roberts (Contributor, Author)

I've just added the Changes entry. I will now look at these things:

  • Responding to @gasche 's review comments on rec_check.ml (thanks!)
  • Rebasing over the conflicting changes. (And perhaps also fixing up the commit history.)
  • Adding the improved error message that @Octachron mentions.

Comment on lines 308 to 310
[C] represents a type constraint or coercion placed immediately
before the arrow, e.g. [fun P1 ... Pn : t1 :> t2 -> ...]
A Member commented:


OCaml doesn't accept this syntax at present. Rather inconsistently, while we currently allow simple constraints on both let and fun forms:

let f x : t = e            (* valid *)
let f = fun x : t -> e     (* valid *)

we allow the let form with a simple or full subtyping constraint

let f x :> s = e            (* valid *)
let f x : t :> s = e        (* valid *)

but don't allow the corresponding constrained fun forms:

let f = fun x :> s -> e     (* invalid *)
let f = fun x : t :> s -> e (* invalid *)

It would be good to eventually allow these last two forms, too, but for the moment the comment needs to be updated.
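
For concreteness, the accepted forms above can be written as a self-contained snippet (the types here are invented for illustration; polymorphic variants provide a trivial subtyping relationship for the coercion forms):

```ocaml
(* Simple constraint: accepted on both the [let] and [fun] forms. *)
let f1 x : int = x + 1
let f2 = fun x : int -> x + 1

(* Coercion: [`A] is a subtype of [`A | `B], so both [let] forms are
   accepted, with or without the source type of the coercion. *)
let g1 () :> [ `A | `B ] = `A
let g2 () : [ `A ] :> [ `A | `B ] = `A

(* The corresponding [fun] forms are rejected by the parser at the time
   of this comment:
   let g3 = fun () :> [ `A | `B ] -> `A
   let g4 = fun () : [ `A ] :> [ `A | `B ] -> `A
*)
```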

Contributor Author

Thanks for catching this. I've updated the example to use a simple constraint with fun.

ncik-roberts added a commit to ncik-roberts/ocaml that referenced this pull request Jul 25, 2023
ncik-roberts added a commit to ncik-roberts/ocaml that referenced this pull request Aug 2, 2023
@Octachron Octachron added the parsetree-change Track changes to the parsetree that affect ppxs label Aug 3, 2023
ncik-roberts added a commit to ncik-roberts/ocaml that referenced this pull request Sep 9, 2023
ncik-roberts added a commit to ncik-roberts/flambda-backend that referenced this pull request Sep 11, 2023
Tests from:
  - ocaml/ocaml#12236 (and the corresponding updates to outputs found in ocaml/ocaml#12386 and ocaml/ocaml#12391)
  - ocaml/ocaml#12496 (not merged)
gasche added a commit that referenced this pull request Sep 15, 2023
…en-typing-constraints

Restore a call to `instance` that was dropped in #12236
eutro pushed a commit to eutro/ocaml that referenced this pull request Sep 17, 2023
@gpetiot
Contributor

gpetiot commented Nov 2, 2023

I don't think it was mentioned in the comments: why is the Pexp_newtype constructor not replaced with Pexp_function? Are there cases where we cannot represent it with the new one?
(For context, I'm applying this change in ocamlformat, and it would make the work easier if Pexp_newtype were removed.)

@ncik-roberts
Contributor Author

ncik-roberts commented Nov 2, 2023

Yes, this is a Pexp_newtype but not a Pexp_function, supposing a suitable S is in scope:

fun (type a) -> (module struct
  type t = a
end : S with type t = a)

There are many such examples; the RHS of the -> can be any expression.

I suppose it could be reasonable to parse these as Pexp_function, though it’s a bit odd as they aren’t functions. (But to be fair, this is a statement about runtime behavior, not syntax.) In any case it wasn’t part of this PR, which was complex enough as is.
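
To make the example above compilable, here is one way to fill in a hypothetical signature S (invented for illustration). The point is purely that the body of the fun (type a) -> ... is a packed first-class module rather than a function; the use site pins down the package type's otherwise-weak type variable:

```ocaml
module type S = sig
  type t
  val default : t option
end

(* [fun (type a) -> ...] binds only a type; its body is a packed
   module, so there is no value parameter list to collect into a
   [Pexp_function]. *)
let packed = fun (type a) -> (module struct
  type t = a
  let default = None
end : S with type t = a)

(* Using [packed] at a concrete type resolves its weak type variable. *)
let () =
  let module M = (val (packed : (module S with type t = int))) in
  assert (M.default = None)
```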

ncik-roberts added a commit to ocaml-flambda/flambda-backend that referenced this pull request Dec 28, 2023
* Newtypes

* Constraint/coercion

* Add map_half_typed_cases

* Implement type-checking/translation

This also promotes tests whose output changes.

* Add upstream tests

Tests from:
  - ocaml/ocaml#12236 (and the corresponding updates to outputs found in ocaml/ocaml#12386 and ocaml/ocaml#12391)
  - ocaml/ocaml#12496 (not merged)

* Fix ocamldoc

* Update chamelon minimizer

* Respond to requested changes to minimizer

* update new test brought in from rebase

* Fix bug in chunking code

* `make bootstrap`

* Add Ast_invariant check

* Fix type-directed disambiguation of optional arg defaults

* Minor comments from review

* Run syntactic-arity test, update output, and fix printing bug

* Remove unnecessary call to escape

* Backport changes from upstream to comparative alloc tests

* Avoid the confusing [Split_function_ty] module

* Comment [split_function_ty] better.

* [contains_gadt] as variant instead of bool

* Calculate is_final_val_param on the fly rather than precomputing indexes

* Note suboptimality

* Get typecore typechecking

* Finish resolving merge conflicts and run tests

* make bootstrap

* Add iteration on / mapping over locations and attributes

* Reduce diff and fix typo in comment:

* promote change to zero-alloc arg structure

* Undo unintentional formatting changes to chamelon

* Fix minimizer

* Minimize diff

* Fix bug with local-returning method

* Fix regression where polymorphic parameters weren't allowed to be used in same parameter list as GADTs

* Fix merge conflicts and make bootstrap

* Apply expected diff to zero-alloc test changed in this PR
@xavierleroy xavierleroy mentioned this pull request Feb 16, 2024
Successfully merging this pull request may close these issues.

Defaults to optional arguments suppress warning 68 in hard-to-predict ways