Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose more compiler-libs internals in Toploop #6704

Open
vicuna opened this issue Dec 11, 2014 · 14 comments

Comments

Projects
None yet
1 participant
@vicuna
Copy link

commented Dec 11, 2014

Original bug ID: 6704
Reporter: @whitequark
Status: acknowledged (set by @damiendoligez on 2015-01-08T23:01:05Z)
Resolution: open
Priority: normal
Severity: feature
Category: toplevel
Related to: #6692 #7589
Monitored by: @nojb @diml @dbuenzli

Bug description

The toplevel is expunged, which means that even if I require compiler-libs.common, I get duplicate modules that are often unusable due to:

  1. exceptions having a different internal ID (this affects the entirety of Location),
  2. mutable cells having different identity.

Since 4.02 now includes module aliases, would it be possible to have Toploop re-export the entirety of compiler-libs that it expunges?

toploop.ml:
module Compiler_libs = struct
module Location = Location
...
end

This would greatly help those developing loadable modules that extend the toplevel, both the regular one and custom ones like utop (as utop is expunged in a similar way).

@vicuna

This comment has been minimized.

Copy link
Author

commented Dec 12, 2014

Comment author: @gasche

I didn't know about expunge before this PR, and while I vaguely understand the comment at the top of toplevel/expunge.ml (am I correct that the "List.map" there is a search-and-replace typo for "map"?), it says what the module does but not why it does it. Would you care to explain why?

My understanding of the "what" is as follows: the expunge.ml script removes the module that are part of the standard library from the symbol table of the produced bytecode executable (this is done in the "ocaml" target of the Makefile at the root of the distribution, including the list of stdlib modules from stdlib/StdlibModules).

(How do you know that utop is expunge? I quickly looked at its Makefile and found no trace of such a process.)

@vicuna

This comment has been minimized.

Copy link
Author

commented Dec 12, 2014

Comment author: @whitequark

Yes, it is a typo. The idea behind expunging is simple: the toplevel itself depends on compiler-libs, however we do not want many of the generic names from compiler-libs to pollute the module namespace. In other words we want to keep the module namespace almost same as in ocamlc (Toploop and Topdirs are also provided).

Expunging does this by removing the non-stdlib modules from the internal symbol table of the runtime, which is also used by Dynlink, and by Dynlink-like code that loads the newly compiled bytecode into the toplevel. The effect of this is that the toplevel code itself still refers to compiler-libs, however if you load compiler-libs explicitly, you have a "split brain" situation: Dynlink acts as if compiler-libs was never loaded, loads a new copy, and resolves all future references to it.

However, the toplevel itself leaks outside a few implementation details of compiler-libs, such as: Lexer.Error and Syntaxerr.Error (which would appear if you try to invoke !Toplevel.parse_toplevel_phrase); the many references to Outcometree, Path, Env and Types from Toploop; the behavior of Toplevel-internal Location module, unable to catch any exceptions registered with Location.register_error_of_exn; probably more.

I now actually wonder if expunging is needed at all, given that compiler-libs are not in default cmi search path anymore (I'm assuming they were).

utop is expunged by: https://github.com/diml/utop/blob/master/myocamlbuild.ml#L42-L85

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 8, 2015

Comment author: @damiendoligez

The point of expunging is to give a clean environment to the toplevel user, but it's probably a good idea to reexport compiler-libs under one module name.

typo: fixed in trunk (15771).

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 9, 2015

Comment author: @garrigue

Since module aliases are just aliases, you cannot use them after expunging: the referenced module is no longer there.
A better approach would be to build a packed version of the compiler libraries, and use it to build the toplevel. Then there would be no need to expunge to start with.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 9, 2015

Comment author: @whitequark

I really like the idea of removing expunging entirely.

@vicuna

This comment has been minimized.

Copy link
Author

commented Mar 4, 2015

Comment author: @whitequark

Another argument against expunging is that it fails if you try to load compiler-libs.toplevel in a rather unsound way: dbuenzli/rresult#5 (comment)

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 11, 2016

Comment author: @dbuenzli

FWIW I'm running into the same problem in another project that tries to programatically invoke toplevel directives for loading byte code in the toplevel.

My problem is 1) that whitequark mentions, basically trying to invoke Topdirs.dir_load might raise exceptions that you are neither able to catch nor to print. In my case I can't even apply horrible hacks like https://github.com/whitequark/ocaml-m17n/blob/39f694595f4fcebac20823c290e6184d7b58e1cc/src/m17n_util.ml#L33-L47 since I don't know how to programatically generate a Symtable.Error.

Note that Toploop.{parse_toplevel_phrase,preprocessor_phrase,excute_phrase} suffer from the same problems. So as exposed the current Toploop API is completely unusable from the toplevel itself.

The smallest improvement that could be made is to provide a

Toploop.catch : formatter -> ('a -> 'b) -> 'a -> 'b option

That catches and prints the errors all these functions may raise. I thought the current Topdir.print_exception_outcome would do the job but that's apparently not the case (Not sure what it does exactly).

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 17, 2017

Comment author: @dbuenzli

Also it seems that 4.04 introduced a new API in Env that allows to intercept cmi lookups.

It was suggested to me to combine this with odig to enable autoloading libraries (see b0-system/odig#10 (comment) for a rough prototype on utop-full) somehow it would be nice to expose that API in the Toploop API.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 9, 2019

Comment author: @nojb

What about "wrapping" compiler-libs and getting rid of expunge? Using the main Makefile terminology, I am thinking something along the following lines:

Compiler_libs
[ $(COMMON) modules ]
Bytecomp
[ $(BYTECOMP) modules ]
Optcomp
[ $(OPTCOMP) modules ]
Toplevel
[ $(TOPLEVEL) modules ]

It would break all users of compiler-libs; we could soften the blow by distributing both the wrapped and unwrapped versions for a while.

Opinions?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 9, 2019

Comment author: @diml

I'm in favour of wrapping the compiler-libs. That would be an improvement for all users of compiler-libs as well as they define a lot of common names such as Path, Ident, Misc, ...

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 9, 2019

Comment author: @gasche

This is precisely what https://github.com/janestreet/ocaml-compiler-libs does, right?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 9, 2019

Comment author: @nojb

Do we prefer a single top-level module ("Compiler_libs") or several ("Ocaml_common", "Ocaml_bytecomp", "Ocaml_optcomp", "Ocaml_toplevel") ?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 9, 2019

Comment author: @diml

@gasche, indeed. The annoying thing is that we need to do a weird shadowing trick to hide the toplevel names. If the compiler libs where prefixed right from the start, that would make things simpler.

@nojebar, one important property to enforce is that if you use a module from bytecomp but only specify a dependency on common, then you should get a failure at compilation time rather than at link time. This is a common pitfall for users. I believe this would be easier to enforce with several top-level names, so my preference is for several top-level names.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 11, 2019

Comment author: @nojb

#2218

@vicuna vicuna added the toplevel label Mar 14, 2019

This was referenced Mar 14, 2019

@vicuna vicuna added the feature-wish label Mar 20, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.