Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Private items in structures and signatures #2016

Closed
wants to merge 7 commits into from

Conversation

@trefis
Copy link
Contributor

commented Sep 3, 2018

This GPR implements the change described in https://blog.janestreet.com/plans-for-ocaml-408/#private-structure-items.

Some notes about the implementation:

  • the parser changes benefit from the switch to menhir: I could use an %inline rule! (One gets 11 shift/reduce conflict without it.)
  • when typing a structure/signature I made the choice of not including private items in the signature (read: Types.signature, the overloading of the name is unfortunate). One could have also chosen to include them but mark them as private.
    The chosen implementation has several benefits:
    • it is simpler; in particular, no change to Includemod is required.
    • the cmi files are smaller
@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Sep 3, 2018

So, I get a ton of errors from travis that seem unrelated to this PR: the location of most of the error messages in the testsuite seem to have changed but:

  1. this PR only adds files to the testsuite, but doesn't modify any existing one
  2. I agree with the diff travis gives me
  3. I do not see from the diff of this GPR how that change could happen

For example here I agree the line displayed is the 19th and not the 15th.
I don't know what's going on. @gasche ?

@gasche

This comment has been minimized.

Copy link
Member

commented Sep 3, 2018

This might be related to the Menhir bootstrap, but it's unclear how. (In theory, a menhir change that did not come with a bootstrap commit could change the locations.)

I would rather make the guess that some PR recently merged broke the CI in a subtle way that is related to a trunk rebase (the PR's branch was fine, but merging on trunk breaks the tests).

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Sep 4, 2018

I initially assumed so too, but I don't think so. I took the tip of trunk, built and ran the tests (everything was fine), then did make promote-menhir, and built and ran the testsuite again, everything was still fine.

Note that applying that diff (which removes the use of %inline) does fix the problem:

diff --git a/parsing/parser.mly b/parsing/parser.mly
index cff42b8c5..36eab64da 100644
--- a/parsing/parser.mly
+++ b/parsing/parser.mly
@@ -903,11 +903,6 @@ paren_module_expr:
       { unclosed "(" $loc($1) ")" $loc($5) }
 ;
 
-%inline private_item:
-  | { false }
-  | PRIVATE { true }
-;
-
 structure: extra_str(structure_nodoc) { $1 }
 structure_nodoc:
     seq_expr post_item_attributes structure_tail_nodoc
@@ -922,11 +917,16 @@ structure_tail_nodoc:
 ;
 
 structure_item:
-    private_item let_bindings
-      { val_of_let_bindings ~private_:$1 ~loc:$sloc $2 }
-  | private_item structure_item_with_ext
+    PRIVATE let_bindings
+      { val_of_let_bindings ~private_:true ~loc:$sloc $2 }
+  | PRIVATE structure_item_with_ext
       { let item, ext = $2 in
-        wrap_str_ext ~loc:$loc (mkstr ~private_:$1 ~loc:$loc item) ext }
+        wrap_str_ext ~loc:$loc (mkstr ~private_:true ~loc:$loc item) ext }
+  | let_bindings
+      { val_of_let_bindings ~private_:false ~loc:$sloc $1 }
+  | structure_item_with_ext
+      { let item, ext = $1 in
+        wrap_str_ext ~loc:$loc (mkstr ~private_:false ~loc:$loc item) ext }
   | item_extension post_item_attributes
       { let docs = symbol_docs $sloc in
         mkstr ~loc:$sloc (Pstr_extension ($1, (add_docs_attrs docs $2))) }
@@ -1048,9 +1048,12 @@ signature_nodoc:
   | signature_item signature_nodoc { text_sig $startpos($1) @ $1 :: $2 }
 ;
 signature_item:
-  | private_item signature_item_with_ext
+  | PRIVATE signature_item_with_ext
       { let item, ext = $2 in
-        wrap_sig_ext ~loc:$loc (mksig ~private_:$1 ~loc:$sloc item) ext }
+        wrap_sig_ext ~loc:$loc (mksig ~private_:true ~loc:$sloc item) ext }
+  | signature_item_with_ext
+      { let item, ext = $1 in
+        wrap_sig_ext ~loc:$loc (mksig ~private_:false ~loc:$sloc item) ext }
   | item_extension post_item_attributes
       { let docs = symbol_docs $sloc in
         mksig ~loc:$sloc (Psig_extension ($1, (add_docs_attrs docs $2))) }

I guess I just don't know what I'm doing ¯\_(ツ)_/¯

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Sep 4, 2018

I also don't understand how the printed location can change, but the highlighted part of the code remain unchanged.

@Armael any guess?

@gasche

This comment has been minimized.

Copy link
Member

commented Sep 4, 2018

I believe that the issue comes from nullable productions and the difference between $loc and $sloc. If your rule starts with an empty symbol, the $loc location will start at the end of the previous non-empty symbol, while the $sloc location will start at the beginning of the next non-empty symbol (so it is finer-grained). From a user perspective, $sloc is essentially always better, but the current Menhir parser still uses $loc in many places, by default, to match the locations produced by the Yacc parser.

If that guess is right, here is what you can do:

  • In a first commit, just turn all $loc into $sloc in the rules that interest you. Some locations may change, verify that you agree with the changes and commit that.
  • In a second commit, implement your change (with or without inline rules); then the locations should not change anymore.

We may want to think about turning all $loc into $sloc globally. I believe that this is always correct from a user point of view, but it could not be part of the Menhir PR as it would have broken our correctness testing check.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Sep 4, 2018

Alright, your guess sounds believable to me.
However:

  • In a first commit, just turn all $loc into $sloc in the rules that interest you. Some locations may change, verify that you agree with the changes and commit that.

    Seems incredibly tedious in my case

  • We may want to think about turning all $loc into $sloc globally. I believe that this is always correct from a user point of view

    That seems like a nicer approach but I don't why it would always be correct (I'd be happy to review a PR which provides some argument that it is).

So, I think I'm just going to avoid using %inline for now. And we can just add it back at a later point when the $loc/$sloc situation is resolved; or at any other point when cleaning up parser.mly.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Sep 4, 2018

PS: thanks!

@gasche

This comment has been minimized.

Copy link
Member

commented Sep 4, 2018

I don't understand what is tedious. If I read your PR patch correctly, you touch exactly two semantic actions using $loc instead of $sloc, so you would have a commit with those two changes plus testsuite update (use make promote). Are you just worried about the size of the testsuite diff?

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Sep 4, 2018

Are you just worried about the size of the testsuite diff?

Not really, I'd be happy to make a separate PR for just that change.

you touch exactly two semantic actions using $loc instead of $sloc

Hum, for some reason (simply looking at the locations changes in the testsuite) I assumed I would have to update all the rules used by the rules I directly modified, and the rules that they themselves use, etc.
I guess I'm probably wrong, I'll check.

@gasche

This comment has been minimized.

Copy link
Member

commented Sep 4, 2018

Unrelated, but an obvious question is why use private let, private module, private exception etc. instead of having the flag after the keyword (let private, module private, exception private etc.) as is the norm for existing uses of private or (non)rec, override, virtual, mutable and other flags.

@trefis trefis force-pushed the trefis:private-items branch from 6dbbfce to 399d124 Sep 4, 2018

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Sep 4, 2018

I guess I'm probably wrong, I'll check.

Apparently I was. Changing the 3 relevant $loc to $sloc caused absolutely no diff on trunk (which I guess makes sense because no empty symbol came into play) and made the testsuite diff disappear in my PR.
So I included the change in the commit update the parser.

Unrelated, but an obvious question is why use private let, private module, private exception etc. instead of having the qualified after the keyword (let private, module private, exception private etc.) as is the norm for existing uses of private or (non)rec, override, virtual, mutable and other flags.

Fair question!
The answer will be because I wasn't careful and for some reason assumed it was private method...

I'd be happy to flip that around.
PS: I vote for of having let private rec and module private rec over let rec private and module rec private.

@lpw25

This comment has been minimized.

Copy link
Contributor

commented Sep 4, 2018

I have a preference for private module and over module private. The reason is that to me the private aspect is more about the relationship between the definition and the surrounding scope than a property of the module itself.

At some point I would really like to add support for virtual modules -- analogous to virtual classes and a way to support simple mixins in the module system. Just like with private this requires allowing signature items to be annotated with virtual, and it would definitely be better for those annotations to appear in the same location syntacticly. However, with virtual there is a conflict that emphasises the difference I mentioned in the previous paragraph -- there is already class virtual as syntax for making a virtual class, so it has to be virtual class ... for making a virtual binding of a class.

Unfortunately, as you point out, the existing class system syntax for private and virtual is already inconsistent in this regard, using both class virtual and method virtual rather than virtual method. So I guess we're going to get some inconsistency whatever way we go in the long run (assuming we do add some form of virtual modules at some point).

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Sep 5, 2018

I'm trying to understand what this PR is about. I see only a reference to a paragraph in @lpw25 's blog, which has only examples of using this feature for type and module aliases in signatures,
But the PR seems to do more. Are such aliases also allowed in implementations? What do they mean there?

Looking at the blog, I'm not sure private is the right name, except for its being already reserved.
The idea seems to be to add transparent alias bindings in signatures. In particular, one can introduce a binding in the signature that does not appear in the implementation, which is certainly not the case neither with private methods nor private abbreviations and datatypes.
The closest thing I can think of is let x := ... in .. in Coq, i.e. a binding that reduces transparently.
Destructive substitutions in module constraints are also similar, so we could write:

type tmp := t list
module M := F(X)

hoping that we don't end up wanting to use this syntax for something else later.
Well, actually we could also do it the other way round, defining destructive substitutions as replacing a concrete binding by such an alias.

Independently of this, it would be nice to have a more comprehensive description of the syntax and semantics of this feature.

@bobot

This comment has been minimized.

Copy link
Contributor

commented Sep 5, 2018

So, for reference, it would allow to fix some cases of MR#6635, i.e. type alias added in .mli only for readability.

@gasche

This comment has been minimized.

Copy link
Member

commented Sep 5, 2018

Looking at the blog, I'm not sure private is the right name, except for its being already reserved.

My understanding is that private structure items appear in the implementation, but not in the interface (even the most general inferred interface). For this purpose, it seems to me that private makes perfect sense: the value is accessible only within the module, inside the abstraction boundary, it is private to this module in the usual OO sense.

private in signatures is more delicate and indeed it corresponds to substituted definitions: a private item is used to type-check the rest of the signature, but not included in the result. This is the same behavior as private implementation items, but the private intuition does not carry as well.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Sep 5, 2018

Might be worth mentioning #1506, and discussing both PRs together. #1506 was about allow opening arbitrary module expressions (evaluating to a structure) in structure. This allows encoding declarations that should not be exposed outside the current structure as:

open (struct
    type t = ....
end)

As @lpw25 pointed out, it might be better to directly express the intent of making some declarations local to the current structure (marking them as private), and this approach scales better to doing the same in signatures. Opening arbitrary module expressions can then itself be encoded by inventing a name for the bound module expression:

private module Foobar1234 = ...
open Foobar1234

Typically, one wouldn't do that when the module expression is directly a structure; one would simply mark all its items as private. The encoding above would be necessary for opening the result of a functor, or for opening a named module with a reduced signature.

So let me cc @yallop, @objmagic.

@trefis , @lpw25 : do you intend this PR to subsume #1506, or do you that both should be considered for inclusion?

@gasche

This comment has been minimized.

Copy link
Member

commented Sep 5, 2018

Personally I don't think that the two proposals are conflict. While they share some goals and results, they are fairly distinct language features and I think they would both have use-cases.

On one hand, open <module-expr> is not new syntax, but a generalization of existing syntax: it's a natural thing to have in the language, there are no conceptual downsides, the only criticism to be made is the implementation quality and maintenance cost. On the other hand, this new private is a new concept in the language that has to be explained, we can discuss the design and syntax, but there probably is an consensus that it is still valuable even in a language with open <module-expr>.

@Drup

This comment has been minimized.

Copy link
Contributor

commented Sep 5, 2018

@gasche On the other hand, there is really no need to have two implementations. Each version can be more or less encoded in the other. This is probably going to be one of the big point to discuss if we want both: which one to implement in term of the other.

I have not yet review the implementation, but from my discussion with @trefis and looking at generalized open, private items are, in a way, lower level (the implementation should do less work, especially after #1892), which might make them a decent choice for the primary implementation, and doing generalized open on top of them.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Sep 5, 2018

Agreed, I'm personally rather in favor of including both proposals -- it's not like the OCaml globally aimed at orthogonality in its feature set. But since the two proposals come roughly at the same time and can be encoded in each other to some extent, I thought it would be useful to remind everyone about the existence of the other PR to evaluate the usefulness of both (and perhaps to discuss some possible sharing between the two implementations).

@gasche

This comment has been minimized.

Copy link
Member

commented Sep 5, 2018

@Drup: I wouldn't expect an implementation approach translating one construction into the other, but rather the two implementations sharing common utilities to ensure that non-included items do not leak out in the final signature.

There is also no strong need for deciding an ordering a priori: general code can be reused, but less-general code can be factored out and reused as well by a subsequent PR.

I think it's mostly a matter of people discussing the design(s) and doing thorough implementation reviews in time. (@alainfrisch started doing a review for #1506 but the current status seems stalled, it is not clear to me why and who has to pour some work next.) For both PRs, they can be merged whenever they have been approved.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Sep 6, 2018

@gasche OK, the blog entry had no precise explanation about the behavior for structures.
While I can understand why you would use that name in structures (particularly for values), it still feels strange to use it in signatures, as it is explicitly stated that you can introduce such a binding in a signature even if it is not present in the implementation. Actually, this looks like two different features to me.
And this clearly overlaps with destructive substitution (same problems with avoiding escapement).

@lpw25

This comment has been minimized.

Copy link
Contributor

commented Sep 6, 2018

it still feels strange to use it in signatures

I can see what you mean, but at the same time for items allowed in both structures and signatures the feature really is the same in both places. For example,

private type tmp = t
type t = int

does the same thing in both contexts. So I think it would be strange to use different syntaxes for the two different contexts.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Sep 6, 2018

@lpw25
As I have stated already a few times, I don't think that private items are the right way to go, at least at the syntactic level:

  • while your blog post says that the goal is not to allow selective exports in compilation units (without a mli), it seems clear that as soon as you provide them they will be used for that
  • this is yet another meaning for private not much related with the previous ones (actually it is related to private methods in ocaml 1.00, but they are no longer in the language)
  • I thought we had an agreement for going the open struct way, which avoids extending the syntax
  • if we decide to extend the syntax, there should be more global planning, or the language is going to be even more confusing than it is now

This said, I do understand that the open struct way is not so nice for signatures, particularly when the name is just there to avoid shadowing, so a shorter notation could be good for types for instance.
As for the implementation, I have no strong preference. It might be fine to actually use this code to implement open struct. Or should I provide a third implementation with what I have in mind ? :)

Of course this is only my opinion.

By the way, maybe I should have started a discussion on you blog post. Would it be a good idea?
Hopefully we should be meeting soon anyway.

@lpw25

This comment has been minimized.

Copy link
Contributor

commented Sep 7, 2018

By the way, maybe I should have started a discussion on you blog post. Would it be a good idea?

Well, ideally the discussion should have started before the blog post, when I posted the same plans to the caml-devel list.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Sep 7, 2018

Sorry, it looks like I missed this one. I think each of those should be discussed first, at least for those which introduce language or typing changes.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Sep 7, 2018

@trefis, @lpw25 : can you elaborate on the meaning and use cases for the new feature on signature items (the blog post only mentioned structure items)?

Are type aliases, class types and module types the only meaningful items to support the private modifier?
Why would we allow for instance private val x: ... or private exception ...(the whole declaration is useless, if I understand correctly), private open (the private is useless), private type t = A | B?

Concerning structure items: is private ... strictly equivalent to open struct ... end, or are there cases where they behave slightly differently?

@trefis trefis force-pushed the trefis:private-items branch from 399d124 to 7587adc Sep 7, 2018

@trefis trefis force-pushed the trefis:private-items branch from 7587adc to f4fdcb4 Sep 7, 2018

@yallop

This comment has been minimized.

Copy link
Member

commented Sep 7, 2018

@gasche:

(@alainfrisch started doing a review for #1506 but the current status seems stalled, it is not clear to me why and who has to pour some work next.)

I believe @objmagic is planning to return to #1506 soon, to rebase the PR and incorporate feedback.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Sep 7, 2018

Concerning structure items: is private ... strictly equivalent to open struct ... end, or are there cases where they behave slightly differently?

As far as I could tell from the paper linked in #1506 they should indeed be equivalent.

It's been a while since I looked at the implementation of that feature however, so I might be missing something.


@trefis , @lpw25 : do you intend this PR to subsume #1506, or do you that both should be considered for inclusion?

I personally tend to agree with the answer given by @gasche: there is a decent amount of overlap (as discussed above), but some other features of #150 (e.g. open (M : S)) are not covered by this proposal and seem clearly desirable.

For the part where they overlap I believe the current proposal is, currently, cleaner in terms of its implementation and leads to better error messages (no synthesized names appears).
That said, if both PRs are to be merged, then, as was suggested by @alainfrisch and @Drup, the open struct ... end case could avoid introducing a fresh name and directly mark the items as private, reusing the machinery of this GPR.


while your blog post says that the goal is not to allow selective exports in compilation units (without a mli), it seems clear that as soon as you provide them they will be used for that

A few things to note here:

  • I'm not sure I agree: it's not clear to me why people would choose to use private over .mli files. For that purpose, .mli files are just way nicer in term of readability and documentation.
  • I don't think it's a problem if the feature starts to get used for that purpose. People can do whatever they want.
  • the same remark can be made about the open struct ... end.

Both @garrigue and @gasche seem to agree that this PR extends the syntax, while the open struct ... end construct doesn't.

I don't understand this.

Both propose to give meaning to a new combination of keywords that already exist.


this is yet another meaning for private not much related with the previous ones

That is a fair point.
Although some of the already existing meanings associated to private seem no more (or less) related between each other than with the one proposed here.

And fwiw, I think the meaning proposed here is probably more intuitive than this one for example.


Are type aliases, class types and module types the only meaningful items to support the private modifier?
Why would we allow for instance private val x: ... or private exception ... (the whole declaration is useless, if I understand correctly), private open (the private is useless), private type t = A | B?

I agree that private open, private exception, etc. are not interesting in signatures and we could forbid them.
At the same time, they are not harmful, they bring syntactic consistency and it would require writing more code to forbid them.

can you elaborate on the meaning and use cases for the new feature on signature items (the blog post only mentioned structure items)?

I believe that, apart from the more precise question I replied to above, Leo already answered that point.
But let me know if you disagree :)

@gasche

This comment has been minimized.

Copy link
Member

commented Sep 7, 2018

Both [proposals] propose to give meaning to a new combination of keywords that already exist.

open <module-path> and open <module-expr> are morally the same syntactic construct: the meaning is exactly the same in both cases, make all the names in the module argument visible in the current scope. Then open struct .. end is not a combination of keyword with its own meaning, it is obtained by combination of two existing constructs. On the other hand, private <str-item> really introduces a new construction not previously present in the language.

Another way to put it is: if someone encounters the new construction, (1) will they notice that it is a new feature and (2) will they feel a need to consult documentation to know what it does? I'm rather confident that, while most people would recognize open struct .. end as a new idiom that they haven't seen before, they probably couldn't tell whether it has been allowed for a long time and they never tried, or it is a new feature.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Sep 7, 2018

I see. Thanks for clarifying!

Note that I'm fairly sure every one encountering open struct in a signature, while they might not know whether it's a new feature or not, will probably feel the need to consult the documentation to know what it does; in particular regarding it's dynamic semantics, a topic which, unsurprisingly, is already covered in the OCaml 2017 paper.

So I'm not sure that argument quite holds :)

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Sep 7, 2018

can you elaborate on the meaning and use cases for the new feature on signature items (the blog post only mentioned structure items)?
I believe that, apart from the more precise question I replied to above, Leo already answered that point. But let me know if you disagree :)

Sorry, I cannot find any intended use cases for allowing private on signature items.

@bobot mentioned the use of type aliases to simplify writing the signature, without actually exposing the alias to external code. This is a very narrow use case, and possibly a bit puzzling for users (they see a type alias in the documentation of a library but cannot use it, and the compiler complains as if the alias did not exist at all). Are there other use cases?

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Sep 7, 2018

One example is shown in the blog post (and discussed in more details in section 3 of the open struct paper):

module type S = sig
  type t
  module Sub : sig
    private type outer = t
    type t
    val to_outer : t -> outer
  end
end

Of course, as Jacques noted, this can already be expressed using destructive substitution:

module type S = sig
  type t
  module Sub : sig
    type outer = t
    type t
    val to_outer : t -> outer
  end with type outer := t
end

It's just a bit lighter with private.


Btw, I do not agree with @bobot that the aliases in arg.mli should be made private.
Also, I haven't completely made up my mind about it yet, but I think private items should not appear in any documentation. In the example discussed above I wonder if the documentation should not print:

module type S = sig
  type t
  module Sub : sig
    type t
    val to_outer : t -> t
  end
end

with links the links pointing to the right t each type. (This does feel a bit weird, as I said I haven't completely made up my mind yet)

Anyway, that's a discussion for another day, on another bugtracker.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Oct 24, 2018

This PR was discussed today at the caml-dev meeting. Conclusion:

  • Restrict to signature items (not structure), and only type aliases (no concrete types, val declarations, etc).
  • Find another syntax than private.
@trefis trefis referenced this pull request Oct 26, 2018
@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Oct 26, 2018

Cf. #2122 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.