Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prefix the stdlib #1010

Merged
merged 6 commits into from Feb 12, 2018

Conversation

@diml
Copy link
Member

diml commented Jan 13, 2017

This patch changes the compilation scheme of the stdlib so that every module uses a long compilation unit name that:

  • is much less likely to clash with modules from other projects
  • doesn't systematically pollute the environment with many short names

Motivations

1. we can add new modules to the stdlib without guilt

With this patch, we can add new modules to the stdlib without worrying about clashing with a module from another project. #964 and #1002 will benefit immediately from this.

2. it's easier to use an alternative stdlib

Currently, when using an alternative standard library such as Base, you still get all the default modules and values from the stdlib in scope. We do terrible things in Base to at least deprecate them but it is not very satisfactory.

With this patch, all we need is an option to prevent the compiler from opening Stdlib (formely Pervasives) by default and we will get a clean initial environment. We'll still get the long names, but it is not as bad.

User visible changes

  • module X from the stdlib can be referred as Stdlib.X
  • element x from Pervasives can be referred as Stdlib.x
  • names of exceptions defined in the stdlib change in backtraces. For instance Pervasives.Exit becomes Stdlib__pervasives.Exit

Patch details

This patch uses the same technique we use for Jane Street libraries. The -o option is used to change the compilation unit name. For instance list.ml compiles to stdlib__list.cmo and we use module aliases to get back the short names.

The aliases from long names to short names live in the new Stdlib module added by this pull request. This module is now the one initially opened instead of Pervasives, so that we get all the short names in scope as before. This change requires a bootstrap.

stdlib/stdlib.ml looks like this:

module Arg = Arg
module Array = Array
...
module Pervasives = Pervasives
...

include Pervasives

The build system rewrites module FooBar = FooBar to module FooBar = Stdlib__fooBar and compiles the resulting file with -no-alias-dep -w -49.

Type printing

To avoid printing paths of the form Stdlib__map in error messages, this patch adds a simple heuristic to the type printer: given a path of the form <x>__<y>, if <x>.<y> is an alias for <x>__<y> then use <x>.<y> instead.

Contrary to the implementation -short-paths, this is simple and fast so it is enabled by default.

@diml diml added the stdlib label Jan 13, 2017

@c-cube

This comment has been minimized.

Copy link
Contributor

c-cube commented Jan 13, 2017

Note: #640 would also benefit (a IO module would be awesome)


let rec tree_of_path = function
| Pident id ->
Oide_ident (ident_name id)
| Pdot(Pident id, s, _pos) when Ident.same id ident_pervasive ->
| Pdot(Pident id, s, _pos) when is_pervasive_or_stdlib_ident id ->

This comment has been minimized.

@Drup

Drup Jan 13, 2017

Contributor

This doesn't catch Stdlib.Pervasives.foo. Is that a problem ?

This comment has been minimized.

@diml

diml Jan 16, 2017

Author Member

Indeed, I updated the code

@Drup

This comment has been minimized.

Copy link
Contributor

Drup commented Jan 13, 2017

I like this very much.

The fact that type errors can refer to Stdlib__* is, imho, quite a big problem. These modules shouldn't even really exist, we only need them because of limitations of module aliases (or pack). Also,-short-path is not enabled by default.

@sliquister

This comment has been minimized.

Copy link
Contributor

sliquister commented Jan 14, 2017

A tiny benefit of this change is that it allows the code generated by ppx rewriters to refer to the stdlib (say List.map) with less risk of shadowing, by calling Stdlib.List.map.
See https://github.com/whitequark/ppx_deriving/blob/master/src/ppx_deriving_runtime.ml for an example of not being able to do that now.

This is a bit non-backwards compability, because module type of Queue is a signature with an abtract type before this feature, but a concrete type type 'a t = 'a Queue.t after this feature.
Using module type of this way is bad idea because of how fragile it is, but some people likely do that.

@hcarty

This comment has been minimized.

Copy link
Contributor

hcarty commented Jan 14, 2017

Regarding the ppx_deriving improvements, this would also help with error messages when using something like ppx_deriving. The modules that are used to avoid shadowing can lead to some difficult to decipher error messages, particularly when not using short types.

@diml diml force-pushed the diml:prefix-the-stdlib branch from 1ff2e6f to 45d35df Jan 16, 2017

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Jan 16, 2017

I added a trivial heuristic to avoid printing the Stdlib__ names in error messages and in the toplevel. Now the only place they can appear is in backtraces. I also simplified the build to avoid the need for the Stdlib__ module.

@diml diml force-pushed the diml:prefix-the-stdlib branch from 45d35df to d2313e8 Jan 16, 2017

@Chris00

This comment has been minimized.

Copy link
Member

Chris00 commented Jan 16, 2017

Can it be a general rule? I.e. when X__Y and (an alias) X.Y refer to the same module, then favor X.Y in paths. This would be useful for user libraries and would encourage uniform naming of "inner" libraries.

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Jan 16, 2017

Yes, that's how I implemented it. I updated the description of this PR

@bobzhang

This comment has been minimized.

Copy link
Member

bobzhang commented Jan 16, 2017

Can the heuristic be made general to single underscore?

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Jan 16, 2017

I don't think it's desirable to do that for every underscore, what is your use-case?

@Drup

This comment has been minimized.

Copy link
Contributor

Drup commented Jan 16, 2017

@diml I like your solution to the alias problem and it's consistent with the behavior of short-path. Not sure if it's really documented though. The implementation looks correct, but it's unfortunate that it's completely distinct to the one for short-path.

else
loop (i + 1)
in
loop 0

This comment has been minimized.

@Drup

Drup Jan 16, 2017

Contributor

There is pretty much the same code in Printyp.penalty.

This comment has been minimized.

@diml

diml Jan 17, 2017

Author Member

Indeed, I factorized the code

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Jan 17, 2017

I updated the example in the manual using module aliases as a replacement for -pack to use double underscores and documented the heuristic.

@Octachron

This comment has been minimized.

Copy link
Contributor

Octachron commented Jan 17, 2017

I wonder if it would be worthwhile to generalize the heuristic to multiple level of nested sublibraries, i.e. try to turn Lib__Sublib1__…M into Lib.Sublib1….M?

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Jan 18, 2017

I'm not sure it's worthwhile as I've never seen such a hierarchy in an OCaml project. In any case since it's not really in the scope of this PR, I'd rather keep the simple heuristic here and we can discuss more complex ones latter

@Octachron

This comment has been minimized.

Copy link
Contributor

Octachron commented Jan 18, 2017

I'm not sure it's worthwhile as I've never seen such a hierarchy in an OCaml project.

The fact that such a hierarchy is quite hard to set up currently may well be the cause of this inexistence.
But I agree that such change is utterly not necessary in the scope of this PR.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

alainfrisch commented Jan 23, 2017

Changing the name of existing files in the stdlib (such as list.cmi) might break existing projects that rely on such filenames (for instance to build an explicit set of allowed interfaces for compiling addins). A less invasive approach would be to rely on aliases only for new modules, with aliases such as module Float = Stdlib_float added to Pervasives (which is already opened). This could also be used as a first step to check that all build systems and external tools out there work ok with the use of module aliases in the standard library.

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Jan 23, 2017

It seems cleaner to me to do the same for every module. Fixing such projects would be trivial

@dra27

This comment has been minimized.

Copy link
Contributor

dra27 commented Jan 23, 2017

Trivial, but experience in previous versions seems to suggest we should be avoiding barriers to upgrading as far as possible. A halfway-house where the new modules/values are always in Stdlib and old ones are only done this way if a configure-time option is given would at least allow a period of time (1 or 2 releases) to review the impact? We had a regression on missing files (which is a related thing to renamed files!) in 4.03.0.

I must confess that the need to alter so many test cases makes me lean towards wondering if the last two drawbacks - especially names in backtraces - should be addressed at once? An uncaught exception error message is already not great, and it starts to look even worse with Stdlib__ in it. Fragile or not, a brief look on GitHub suggests module type of SomeModuleFromStdLib is certainly used and removing that is not trivial if you've got to support multiple versions of OCaml. It feels like we should be ensuring that noone ever has to write Stdlib__ in anything.

I have a non-critical nitpick about the way stdlib__.ml is generated - it's a massive piece of code duplication. Why not make $(P) a sentinel value (i.e. illegal in both module + filename) and use STDLIB_MODULES to generate the entire file?

Chris00 added a commit to Chris00/lacaml that referenced this pull request Jan 24, 2017

Use Lacaml__ as a prefix for internal modules
In particular, for Lacaml__M modules that are aliased to submodules
Lacaml.M, the latter path should eventually be used by the type printer.
See ocaml/ocaml#1010

Fixes https://github.com/mmottl/lacaml

Chris00 added a commit to Chris00/lacaml that referenced this pull request Jan 24, 2017

Use Lacaml__ as a prefix for internal modules
In particular, for Lacaml__M modules that are aliased to submodules
Lacaml.M, the latter path should eventually be used by the type printer.
See ocaml/ocaml#1010

Fixes mmottl#25
@diml

This comment has been minimized.

Copy link
Member Author

diml commented Jan 24, 2017

I'll run some tests on opam so that we can get a clear idea of the breakage.

Regarding exception names, what we need is to do is change the module name used in extension constructor names. I can think of two options:

  • add a command line argument
  • add the same heuristic as for type printing: if in the initial environment (after interpreting the -open arguments) Foo.Bar is an alias for Foo__bar then use the former instead of the latter

The second option seems reasonable to me and in-line with the heuristic for type printing.

For the module type of problem, it seems to me that it would best if aliases were resolved before abstracting the signature. At least at Jane Street we have been bitten several times by the fact tbat adding an alias somewhere breaks a use of module type of somewhere else. @garrigue, what do you think?

About the last point, I agree that the duplication is unfortunate. However i think it should be the other way around: stdlib.ml is the source file that people should look at instead of the build system

@dra27

This comment has been minimized.

Copy link
Contributor

dra27 commented Jan 25, 2017

I agree that using the Foo__bar heuristic for exceptions seems a good (and natural) plan. Obviously doing alias resolution earlier deals with the problem completely, but if we don't end up doing that, is there the possibility of a warning along the lines of @lpw25's (caml-devel) suggestion of (module type of struct include Queue end)? Even if there's no possibility of a warning, that pattern feels better for the test cases to me than having Stdlib__ prefixes ended up in code. If in the future we come up with an even better way of doing this, it would be nice to be able to be able to say that no one should have ever used the prefixes anywhere at any time!

That StdlibModules file is used elsewhere - there'd also need to be a solution for the Camlinternal modules which aren't mentioned in stdlib.ml. I also don't particularly like either the unnecessary double-typing of each module name module List = List or the magic transformation of the second half (but I realise that it means that the file is always syntactically valid). It would also feel a little odd to be having the build system extract information automatically from stdlib.ml, at least to me.

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Jan 25, 2017

Maybe it's possible to implement the warning, although it would be simpler to just resolve aliases. I need to try again the module type of struct include ... in the testsuite.

Regarding StdlibModules, AFAICT it only used to expunge the toplevel. The duplication between stdlib/Makefile and stdlib/StdlibModules can certainly be avoided and I was planning to look at it after as it's not specific to this PR. Technically in stdlib/Makefile we could just use $(wildcard *.ml) to list the standard library modules. The only problem is the ordering for linking.

To me the current stdlib.ml file states the interface of the standard library, so it's valuable to have it as it is now. The magic transformation to add the Stdlib__ prefixes is only there so that this stays a detail of the build system rather than something the user should know about.

To build JS libraries we do things slightly differently: the special alias file is named lib__.ml-gen and is indeed generated by the build system. lib.ml is just a normal module, expect that it is not prefixed. In the case of the stdlib, since Pervasives has no dependency on other modules of the standard library, we can use stdlib.ml direcrtly as the alias module. I though it was better this way as it avoids the need for a Stdlib__ module.

@diml diml referenced this pull request Feb 12, 2018

Merged

Deprecate Pervasives #1605

@shindere

This comment has been minimized.

Copy link
Contributor

shindere commented Feb 13, 2018

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Feb 13, 2018

Ah, indeed, I forgot we are calling make recursively.

We should use Dune to build the compiler :)

@shindere

This comment has been minimized.

Copy link
Contributor

shindere commented Feb 13, 2018

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Feb 13, 2018

Dune?

The new name of Jbuilder: https://github.com/ocaml/dune

@shindere

This comment has been minimized.

Copy link
Contributor

shindere commented Feb 13, 2018

@trefis

This comment has been minimized.

Copy link
Contributor

trefis commented Feb 13, 2018

Following the merge of this GPR, merlin was very sad when working on the compiler codebase (I got a bunch of errors such as "This expression has type bool Stdlib.ref = bool Stdlib.Pervasives.ref but an expression was expected of type 'a ref").

I pushed 4a6e431 which fixes the issues for me. Anyone should feel free to revert if it breaks something on their side.

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Feb 14, 2018

If you get a lot of Reference to undefined ... when running the tests, try cleaning testsuite/tools. It should fix the issue

@hhugo

This comment has been minimized.

Copy link
Contributor

hhugo commented Feb 17, 2018

ocamlfind needs to be patched. I've opened https://gitlab.camlcity.org/gerd/lib-findlib/merge_requests/10

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Feb 19, 2018

Thanks!

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Feb 20, 2018

On my machine, building a fresh trunk ( 2b25681 ) then running

cd testsuite
make all

fails with the following tests failing:

List of failed tests:
    tests/typing-poly/'poly.ml' with 1 (expect)
    tests/typing-modules/'aliases.ml' with 1 (expect)
    tests/typing-gadts/'pr5689.ml' with 1 (expect)
    tests/messages/'precise_locations.ml' with 1 (expect)
    tests/typing-objects/'Tests.ml' with 1 (expect)
    tests/typing-gadts/'omega07.ml' with 1 (expect)
    tests/typing-gadts/'dynamic_frisch.ml' with 1 (expect)
    tests/typing-objects/'Exemples.ml' with 1 (expect)
    tests/tool-ocamldoc-html/Module_whitespace.ml
    tests/typing-misc/'pr7668_bad.ml' with 1 (expect)

I believe that this is caused by this PR, given that the proposed expect-diffs look as follows:

 [%%expect{|
-val partition_map :
-  ('a -> [< `Left of 'b | `Right of 'c ]) -> 'a list -> 'b list * 'c list =
-  <fun>
-Line _, characters 35-96:
-  ...................................partition_map (fun x -> if x then `Left ()
-  else `Right ()) xs
-Error: This expression has type unit list * unit list
-       but an expression was expected of type int list * int list
-       Type unit is not compatible with type int
+Line _:
+Error: Reference to undefined global `List'
 |}]
@shindere

This comment has been minimized.

Copy link
Contributor

shindere commented Feb 20, 2018

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Feb 20, 2018

I can reproduce the failures from a fresh git clone on my machine, so I don't think it is a dependency issue. Can someone else reproduce the failure from a fresh git clone?

@shindere

This comment has been minimized.

Copy link
Contributor

shindere commented Feb 20, 2018

@lpw25

This comment has been minimized.

Copy link
Contributor

lpw25 commented Feb 20, 2018

@gasche I had those errors the other day, and they were hard to get rid of. Eventually, I git cleaned and make distcleaned enough to fix the problem, but I don't know exactly what it was that was causing the issues.

One theory I had was that it was the install directory. Have you tried deleting the directory that you have configured it to install in (or configuring it to install in a different directory)?

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Feb 20, 2018

Tests must work without any installation action, and independently of the install directory. In particular, tests must work without running"make install". If this property is broken, then it is a bug that must be fixed. The bug may have been introduced somewhere outside this pull-request, however.

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Feb 20, 2018

I just tried and make install fixes the issue. The problem must come from test scripts relying on the installed environment, so it is probably independent of the present PR.

@shindere

This comment has been minimized.

Copy link
Contributor

shindere commented Feb 20, 2018

@shindere

This comment has been minimized.

Copy link
Contributor

shindere commented Feb 20, 2018

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Feb 20, 2018

Could it be that expect_test itself when testing programs uses an
installed version of the stdlib rather than the one that comes with the
sources?

I checked, and it's only calling Toploop.initialize_toplevel_env. In particular it is not calling Compmisc.init_path, so it the installed stdlib directory should in theory not be added to the search path.

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Feb 20, 2018

I can reliably reproduce the issue by building and installing an old trunk, then building the current trunk and running the tests from there. I don't know which change is responsible for this breakage, but in any case it should be fixed. (Given that it may be irrelevant to the present PR, I may create a devel- thread later in the day.)

@shindere

This comment has been minimized.

Copy link
Contributor

shindere commented Feb 20, 2018

@gasche now trying with a fresh clone but also as a different user that has
no opam installed, on a system where OCaml is not installed.

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Feb 20, 2018

I wrote an expect test that prints Config.load_path and it indeed contains the installed stdlib directory

@diml

This comment has been minimized.

Copy link
Member Author

diml commented Feb 20, 2018

This should be fixed after this: #1621

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.