Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Local open in types #12044

Merged
merged 7 commits into from
Jun 24, 2023
Merged

Local open in types #12044

merged 7 commits into from
Jun 24, 2023

Conversation

johnyob
Copy link
Contributor

@johnyob johnyob commented Feb 26, 2023

Motivation

Related issue: #9484

Currently the AST permits Foo.(...) in patterns and expressions but not types. There are cases where such syntax could reduce the verbosity of types without requiring an open in the module.

Description

The PR extends Parsetree.core_type and Typedtree.core_type with a (P|T)typ_open constructor.
The parser is extended to permit these opens in any delimited types (parenthesised, constructor and class arguments, variants, objects, extensions).

(* Some examples *)
module F = struct
  type t = int
  type s = string
  type r = unit
end

type paren = F.(t * s * r)

type ('a, 'b, 'c) triple = 'a * 'b * 'c

type args = F.(t, s, r) triple

type obj = F.< x : t; y : s; z : r >

type var = F.[ `X of t | `Y of s | `Z of r ]

The outcome tree encoding of the type (e.g. paren) is:

# type paren = F.(t * s * r);;
type paren = F.t * F.s * F.r

Personally I'm not entirely happy with this result (the ideal output would be F.(t * s * r)), however, I'm not sure how to achieve this (so any pointers would be greatly appreciated!)

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks very reasonable to me, what I would expect. I don't think that there is a simple way to magically guess open on the way back when printing types, and I don't think it should be a requirement for this feature.

{ mk_ptyp_class ~loc:$sloc tys cid }
| mktyp( /* begin mktyp group */
mkrhs(mod_ext_longident) DOT delimited_type
{ Ptyp_open ($1, $3) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: I like menhir's ability to name things instead of using $3 to mean something, and I would encourage you to use it for new parsing rules. (Parts of the current parser were inherited from the previous ocamlyacc grammar which did not support this.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do -- I also prefer this style

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typing/tast_iterator.ml Outdated Show resolved Hide resolved
typing/tast_mapper.ml Outdated Show resolved Hide resolved
vs


type policy = Fixed | Extensible | Univars
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stuff above looks like an erroneous resolution of a merge conflict with the recent TyVarEnv refactoring. i would remove all of it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this was fixed in commit 2bc28b4. I can fix it, but I'm more than happy for the commits to be squashed when merging

@gasche
Copy link
Member

gasche commented Feb 27, 2023

I see a potential issue with the A.< (object type) > form, which is a lexer-level conflict with BER MetaOCaml's .< (code) >. syntax. Could we leave out the object-type open form for now, out of abundance of caution?

(If we believe that this creates a jarring inconsistency with the open form for bracketed polymorphic variants, then I think we could also leave those out. Almost all the value comes from the parenthesized form anyway.)

Relatedly to syntax bikeshedding: should we consider having the let Path in <type> form as well? I think that a reasonable answer is "no" (as in the current status of the PR), because (1) we also don't support this for patterns, and (2) there are no keyword constructions in the language of type expressions right now, so it would feel a bit out of place.

@nojb
Copy link
Contributor

nojb commented Feb 27, 2023

I don't want to be a wet cloth (and I am not strongly opposed to this PR), but going over the discussion in #9484, I confess I don't find the justification of the feature very convincing: this feature "could drastically reduce the verbosity of complex type signatures". The real-life example given in that PR is:

type ('a, 'b, 'c, 'r_type, 'w_type) pipe =
  ('a, 'b, [> Stream.Readable.kind] as 'r_ype) Stream.Readable.subtype ->
  (('b, 'c, [> Stream.Writable.kind] as 'w_type) Stream.Writable.subtype as 'w_stream) ->
  'w_stream

that becomes

type ('a, 'b, 'c, 'r_type, 'w_type) pipe = Stream.(
  Readable.(('a, 'b, [> kind] as 'r_type) subtype) ->
  Writable.(('b, 'c, [> kind] as 'w_type) subtype) as 'w_stream ->
  'w_stream
)

which does not seem to make such a big difference to me. And, local opens of types have the same downside as local open of values: they harm readability because you don't know if the names that appear inside the local open are shadowed or not.

Having said that, one may argue this feature on other grounds (eg regularity of the language), but I thought I would chime in with my 2c: personally, I don't see myself using this feature very often, if at all.

@gasche
Copy link
Member

gasche commented Feb 27, 2023

I think that the feature does make the language more regular (we get asked about it from users expecting it to just work, and the UX is better when people are not randomly surprised by stuff not working), and has similar tradeoffs to open constructors for other grammatical categories (structures, expressions, patterns). It is also fairly simple to implement, see the PR which is mostly boilerplate with zero new logic in the type-system.

@nojb
Copy link
Contributor

nojb commented Feb 27, 2023

I think that the feature does make the language more regular (...) It is also fairly simple to implement, see the PR which is mostly boilerplate with zero new logic in the type-system.

Agreed, this is good enough for me!

Comment on lines 552 to 553
iter_loc sub mod_ident;
sub.typ sub t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: two space indent to match the other cases in this function. Thanks!

Comment on lines 663 to 667
let path, new_env =
!type_open Asttypes.Fresh env loc mod_ident
in
let cty = transl_type new_env policy t in
ctyp (Ttyp_open (path, mod_ident, cty)) cty.ctyp_type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem.

Copy link
Contributor

@nojb nojb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went over the code, it looks quite correct. I don't have an opinion about object types, but I thought that the direction for the MetaOCaml patch was to use attribute syntax instead? (Perhaps I misremember the discussion.)

@lpw25
Copy link
Contributor

lpw25 commented Feb 27, 2023

I like the general feature, but this one just seems confusing:

type args = F.(t, s, r) triple

Could we perhaps get rid of the constructor case?

@@ -15,6 +15,7 @@

(* Typechecking of type expressions for the core language *)

open Asttypes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personal taste: I would rather avoid adding a global open and use Asttypes.override_flag below.

@johnyob
Copy link
Contributor Author

johnyob commented Mar 2, 2023

I see a potential issue with the A.< (object type) > form, which is a lexer-level conflict with BER MetaOCaml's .< (code) >. syntax. Could we leave out the object-type open form for now, out of abundance of caution?

(If we believe that this creates a jarring inconsistency with the open form for bracketed polymorphic variants, then I think we could also leave those out. Almost all the value comes from the parenthesized form anyway.)

Relatedly to syntax bikeshedding: should we consider having the let Path in <type> form as well? I think that a reasonable answer is "no" (as in the current status of the PR), because (1) we also don't support this for patterns, and (2) there are no keyword constructions in the language of type expressions right now, so it would feel a bit out of place.

I concur with this caution here, will remove 👍

@johnyob
Copy link
Contributor Author

johnyob commented Mar 2, 2023

I like the general feature, but this one just seems confusing:

type args = F.(t, s, r) triple

Could we perhaps get rid of the constructor case?

Personally I'm a fan of this syntax since I'd prefer it over F.((t, s, r) triple), but to avoid bike-shedding over syntax I can remove it in this PR

@goldfirere
Copy link
Contributor

Just coming across this and wanted to say that I've reached for this syntax before and it will be nice to have it. Thanks, @johnyob!

@gasche
Copy link
Member

gasche commented Mar 24, 2023

@johnyob I missed the fact that you had updated the PR by removing the more controversial examples of the syntax. Great, thanks! I need to do another review round and I hope that this would be good to merge.

@gasche gasche self-assigned this Mar 24, 2023
Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did have another look and I do have some comments again. Overall this is looking nice and, if/when @johnyob is available to do a rebase and answer the last comment round, I think we should move toward merging.

( @johnyob if you don't have time to work on this anymore, let us know and I could probably push a rebase myself. )

- variant types [ `A ]
- extension [%foo ...]
*)
delimited_type_without_object_type:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A short comment to explain why you are single-casing object types in the definition here could help. (The difference of course is that they do not support Foo. qualification.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I am okay with the current state, but I would actually slightly prefer to:

  1. Call this one delimited_type_supporting_local_open
  2. Moving the Ptyp_extension out of this group, so that we stop supporting M.[%foo] as a local open. I see no strong need for that syntax, and I can see hypothetical situations where we would want to define a different meaning for quantified expansions. (I'm not sure, but we regularly get bitten by having accepted too many things that we now want to give a different meaning to, so I would not support a syntax unless there is a clear need.)

| tys = actual_type_parameters
HASH
cid = mkrhs(clty_longident)
{ mktyp ~loc:$sloc (Ptyp_class (cid, tys)) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the two cases above moved out of the common mktyp( ... ) wrapper below? (My guess would be that they were different when you wanted to allow local-open on parameter tuples, but that you reverted this aspect and could in fact make the diff smaller and the code slightly more factorized by putting them back in the wrapper.)

@johnyob
Copy link
Contributor Author

johnyob commented Jun 15, 2023

( @johnyob if you don't have time to work on this anymore, let us know and I could probably push a rebase myself. )

I'm happy to work on addressing your comments and rebasing this weekend :)

@johnyob johnyob force-pushed the type-open-5.0 branch 2 times, most recently from 5301cad to 1836bf3 Compare June 17, 2023 20:29
@johnyob
Copy link
Contributor Author

johnyob commented Jun 17, 2023

@gasche I've rebased, hopefully this should be in good shape for merging :)

@gasche
Copy link
Member

gasche commented Jun 18, 2023

@johnyob could you add a Changes entry in the "Language features" section? Ideally having a short example of the new syntax would be nice.

@gasche
Copy link
Member

gasche commented Jun 18, 2023

(I was planning to merge and do this myself but I see that you are working on the CI results.)

@johnyob
Copy link
Contributor Author

johnyob commented Jun 18, 2023

make alldepend is a noop locally, but the CI is failing 🤔 Even running tools/ci/actions/check-alldepend.sh locally gives me a successful result

parsing/ast_helper.ml Outdated Show resolved Hide resolved
@Octachron
Copy link
Member

It looks like the dependency graph varies between the ARM64 and AMD backend, and that's the part causing the failure. You can discard the dependency change in asmcomp to make the linter happy (while I open an issue on this matter).

@gasche
Copy link
Member

gasche commented Jun 19, 2023

(I don't understand the CI situation regarding dependencies. Some explanations in an issue would help, or @Octachron is free to merge the current PR when he feels that the CI is in an acceptable state.)

@johnyob
Copy link
Contributor Author

johnyob commented Jun 19, 2023

(I don't understand the CI situation regarding dependencies. Some explanations in an issue would help, or @Octachron is free to merge the current PR when he feels that the CI is in an acceptable state.)

I must also admit that I am similarly perplexed about the CI situation 😅

@Octachron
Copy link
Member

I will push the fix for the dependency update. This is unrelated to the CI, this is a matter of the dependency graph being different when building the compiler on a M1.

@Octachron
Copy link
Member

I have pushed the dependency fix, and resynced the bootstrapped parser. I propose to do a squash merge once the CI is green (to avoid keeping the linter-related commits).

@gasche
Copy link
Member

gasche commented Jun 24, 2023

Thanks! I agree that squashing is fine.

@johnyob
Copy link
Contributor Author

johnyob commented Jun 24, 2023

Thanks for the help @Octachron! 🙏 Much appreciated :)

@gasche gasche merged commit 2e27032 into ocaml:trunk Jun 24, 2023
10 checks passed
@hyphenrf
Copy link

hyphenrf commented Jun 26, 2023

Hello, quick question regarding this PR.. Considering the following

type t = What of Foo.(a * b * c)
type u = Unboxed of Foo.a * Foo.b * Foo.c
       | Boxed of (Foo.a * Foo.b * Foo.c)

What is What equivalent to, Unboxed or Boxed? It seems to be the latter,

# module A = struct type t = int end
  type b = B of A.(t * t)
  ;;
module A : sig type t = int end
type b = B of (A.t * A.t)

and if that's the case... why? I'd be (often accidentally) paying runtime performance for syntactic convenience.. plus it's strictly less expressive like this.

@lpw25
Copy link
Contributor

lpw25 commented Jun 26, 2023

type t = What of Foo.(a * b * c)

If it can be done without butchering the parser, this should really just be an error.

@gasche
Copy link
Member

gasche commented Jun 26, 2023

If it can be done without butchering the parser, this should really just be an error.

cc @let-def / @jmid

@let-def
Copy link
Contributor

let-def commented Jun 30, 2023

Tricky issue, the syntax of types in constructor arguments is very "irregular".
It's possible to achieve that by slightly refactoring the type syntax:

  • factor out atomic_type_without_open from atomic_type
  • change constructor_arguments to not allow open syntax at the top

With this we can reject the open syntax if there is only one argument:

# type t = What of Foo.(a * b * c);;
Error: Syntax error
# type t = What of Foo.(a * b * c) * foo;;
Error: Unbound module Foo

This will also reject the open syntax for non-tuples:

# type t = What of Foo.(t);;
Error: Syntax error

From there, LRgrep can easily produce a nicer error message. And before it is production-ready, we can add a separate rule for explaining why this syntax is rejected.

The parser is not butchered too much:

--- a/parsing/parser.mly
+++ b/parsing/parser.mly
@@ -3164,9 +3164,12 @@ generalized_constructor_arguments:
 ;

 constructor_arguments:
-  | tys = inline_separated_nonempty_llist(STAR, atomic_type)
+  | atomic_type_without_open
     %prec below_HASH
-      { Pcstr_tuple tys }
+      { Pcstr_tuple [$1] }
+  | ty = atomic_type STAR tys = inline_separated_nonempty_llist(STAR, atomic_type)
+    %prec below_HASH
+      { Pcstr_tuple (ty :: tys) }
   | LBRACE label_declarations RBRACE
       { Pcstr_record $2 }
 ;
@@ -3468,7 +3471,7 @@ delimited_type:
     { $1 }
 ;

-atomic_type:
+atomic_type_without_open:
   | type_ = delimited_type
       { type_ }
   | mktyp( /* begin mktyp group */
@@ -3479,10 +3482,6 @@ atomic_type:
       HASH
       cid = mkrhs(clty_longident)
         { Ptyp_class (cid, tys) }
-    | mod_ident = mkrhs(mod_ext_longident)
-      DOT
-      type_ = delimited_type_supporting_local_open
-        { Ptyp_open (mod_ident, type_) }
     | QUOTE ident = ident
         { Ptyp_var ident }
     | UNDERSCORE
@@ -3491,6 +3490,17 @@ atomic_type:
   { $1 } /* end mktyp group */
 ;

+%inline atomic_type:
+  | atomic_type_without_open
+  | mktyp(
+      mod_ident = mkrhs(mod_ext_longident)
+      DOT
+      type_ = delimited_type_supporting_local_open
+        { Ptyp_open (mod_ident, type_) }
+    )
+  { $1 }
+;
+

Alternatively, we could check for this construction in the semantic action and warn if necessary, but for this we would need to keep track of more details of the concrete syntax, like parentheses (that could be done with reloc_typ in https://github.com/johnyob/ocaml/blob/547539831025f56569b547a3b564cf106d8d321c/parsing/parser.mly#L3421).
This is probably better if we want to reject, or warn, only about Foo.(a * b) syntax.

@gasche
Copy link
Member

gasche commented Jun 30, 2023

Weird, I am pretty sure that I tried exactly your proposal (or something very very close) and my attempt died by a thousand reduce/reduce conflicts. Thanks a lot! I will send is as a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parsetree-change Track changes to the parsetree for that affects ppxs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants