Refactor typing/env to separate the filesystem-related logic #2228
This PR, which builds on top of #2227, is again only a refactoring PR: it should not affect the semantics of the compiler (there are two small changes in behavior that I mention below; the rest should be strictly identical, only moving code and data around).
The PR came about as I was trying to understand the code of Env that is related to compilation units.
The present PR moves this persistent-files logic into a separate module, Persistent_env, that Env then uses internally. The last commit does the actual file split; all preceding commits are small changes that were necessary for it, either because they abstract away certain details of the Env module that are affected by the split, or because they disentangle the mess of dependencies within the module that makes the split difficult -- it took me 3 attempts to discover everything that could create a cyclic dependency.
Now on the two small behavior changes:
I just handled a somewhat-painful rebase on top of #2041 (now merged in trunk). Nothing much to declare (I wasn't sure whether
I haven't looked at the code in this PR in detail yet but:
Good point on the copyright; in the past I've just put my copyright when creating new files, but they were completely new.
Re. #2231: given that the present PR is non-urgent, I guess that the most reasonable thing is for me to wait until those recent PRs are settled to merge -- and deal with the rebase pains on my end. Grmbl...
There is a small change of behavior in this patch due to a different handling of weak dependencies (those with crco=None); in Env.check_consistency, only non-weak dependencies would get [Env.add_import] called, while the `toplevel/` implementations would also call [Env.add_import] on weak dependencies. After this patch, we systematically call [add_import] only on non-weak dependencies, even in `toplevel/`. ([Gabriel:] As far as I can see, the use of [add_import] in the toplevel never leads to a use of [Env.imports()] for producing a dependency list, as the toplevel does not produce cmi/cmo files; are they just no-ops?)
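The rule described in this commit message can be sketched as follows. This is only an illustration under stated assumptions, not the actual `Env` code: `imports`, `add_import` and `register_imports` are hypothetical stand-ins, and plain strings stand in for real digests.

```ocaml
(* Hypothetical stand-in for the set of imported units. *)
let imports : (string, unit) Hashtbl.t = Hashtbl.create 16

let add_import name = Hashtbl.replace imports name ()

(* A dependency is a unit name paired with an optional CRC;
   [None] marks a weak dependency. Strings stand in for digests. *)
let register_imports (crcs : (string * string option) list) =
  List.iter
    (fun (name, crco) ->
       match crco with
       | None -> ()                     (* weak dependency: not recorded *)
       | Some _crc -> add_import name)  (* non-weak: recorded as an import *)
    crcs

let () =
  register_imports [ ("Stdlib", Some "crc-1"); ("Weak_dep", None) ];
  assert (Hashtbl.mem imports "Stdlib");
  assert (not (Hashtbl.mem imports "Weak_dep"))
```

With this shape, every caller (compiler or toplevel) goes through the same filter, so weak dependencies are never recorded as imports.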
This small change of behavior simplifies the internal plumbing of env by avoiding the need to pass the 'current_unit_name' state to cmi-checking exceptions, which makes it possible to move the cmi/crc logic to a separate module in a future commit. We believe that the change does not actually reduce error message clarity, as the name of the offending unit appears in the location filename anyway (see how these exceptions are handled by Location.error_of_printer_file in the error printer).

Before:

    File "a.mli", line 1:
    Error: Unit A imports from B, which uses recursive types.
    The compilation flag -rectypes is required

After:

    File "a.mli", line 1:
    Error: Invalid import of B, which uses recursive types.
    The compilation flag -rectypes is required
…module Persistent_env is a new module that handles the relation between the type-checking state and the "persistent" typing information lying in .cmi files on the filesystem. In particular, it handles the collection and production of CRC information for the .cmi files being read from and written to the filesystem; the client modules (in our case, only Env) are in charge of turning the cmi files into higher-level information (components and signatures). Persistent_env exposes a type `'a t` of a persistent environment, which acts as a mutable store of `'a` values. There is no global state in the module itself: while Env (and thus the OCaml type-checker) uses a single global persistent environment, it should be possible to create several independent environments to represent, for example, several independent type-checking sessions.
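To illustrate the "mutable store with no global state" design, here is a minimal sketch. The module `Penv_sketch` and its functions `empty`, `register` and `find` are hypothetical names chosen for this example; they are not the actual `Persistent_env` interface, which also deals with CRCs and cmi files.

```ocaml
(* Hypothetical sketch: a parametrized mutable store with no global state. *)
module Penv_sketch : sig
  type 'a t
  val empty : unit -> 'a t
  val register : 'a t -> string -> 'a -> unit
  val find : 'a t -> string -> 'a option
end = struct
  (* All state lives inside the value, not in the module. *)
  type 'a t = { table : (string, 'a) Hashtbl.t }
  let empty () = { table = Hashtbl.create 17 }
  let register penv name v = Hashtbl.replace penv.table name v
  let find penv name = Hashtbl.find_opt penv.table name
end

let () =
  (* Two independent "sessions" that do not see each other's entries. *)
  let session1 : string Penv_sketch.t = Penv_sketch.empty () in
  let session2 : string Penv_sketch.t = Penv_sketch.empty () in
  Penv_sketch.register session1 "List" "signature of List";
  assert (Penv_sketch.find session1 "List" = Some "signature of List");
  assert (Penv_sketch.find session2 "List" = None)
```

A client such as Env would then hold one such value as its single global environment, while other tools remain free to create their own.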
The hope is that the (env => persistent_env) refactoring does not break reasonable user code; the fact that this test had to be updated is a bad sign. On the other hand, we believe that utop is unaffected by the change, which suggests that real-world toplevels are less likely to be affected.
The hope is that a tailor-made algebraic datatype is more readable / less confusing than using ('a option) directly -- one may confuse getting None when looking in a table with the Not_found case. (Suggested by Jérémie Dimino)
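The confusion mentioned above can be made concrete with a small sketch. The type `lookup_info`, its constructors, and the table are hypothetical names for illustration, and reading `Missing` as "the file was looked for and not found" is an assumption, not something stated in the commit message.

```ocaml
(* Hypothetical names; the payload type is left abstract as ['a]. *)
type 'a lookup_info =
  | Missing      (* assumed meaning: looked for, and known to be absent *)
  | Found of 'a

let table : (string, string lookup_info) Hashtbl.t = Hashtbl.create 16

(* With ['a option] as the stored value, [Some None] and [None] would be
   easy to mix up; a dedicated constructor keeps the three cases apart. *)
let describe name =
  match Hashtbl.find_opt table name with
  | None -> "never looked up"
  | Some Missing -> "looked up, known to be missing"
  | Some (Found v) -> "loaded: " ^ v

let () =
  Hashtbl.replace table "A" (Found "sig A");
  Hashtbl.replace table "B" Missing;
  assert (describe "A" = "loaded: sig A");
  assert (describe "B" = "looked up, known to be missing");
  assert (describe "C" = "never looked up")
```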
The debugger reimplements its own error-reporting logic without using the reporter-registration mechanism of the compiler, so it needs to be adapted after the split between `Env` and `Persistent_env` in ocaml#2228. (Interestingly, this forced me to expose the `Error of error` exception in the Persistent_signature, which was not the case before. It was probably a mistake to not expose an exception value that can be raised by (correctly-written) consumers of the module.)

I noticed the issue while inspecting a testsuite failure (ocaml#8544).

Before this patch:

```
$ cat tests/tool-debugger/find-artifacts/_ocamltest/tests/tool-debugger/find-artifacts/debuggee/ocamlc.byte/debuggee.byte.output
Loading program... done.
Breakpoint: 1
10 <|b|>print x;
Uncaught exception: Persistent_env.Error(_)
```

After:

```
$ cat tests/tool-debugger/find-artifacts/_ocamltest/tests/tool-debugger/find-artifacts/debuggee/ocamlc.byte/debuggee.byte.output
Loading program... done.
Breakpoint: 1
10 <|b|>print x;
Debugger [version 4.09.0+dev0-2019-01-18] environment error: The files /usr/local/lib/ocaml/stdlib.cmi and [...]_ocamltest/tests/tool-debugger/find-artifacts/debuggee/ocamlc.byte/out/blah.cmi make inconsistent assumptions over interface Stdlib
```