-
Notifications
You must be signed in to change notification settings - Fork 101
Refactor resolver #672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor resolver #672
Conversation
close_out oc | ||
|
||
let units_cache = Hashtbl.create 23 (* because. *) | ||
let save_page file page = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks wrong that Compilation_unit
is saving/loading pages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This module is now saving and loading .odoc
files. How should we call it ?
I used the name "compilation_unit" for Lang.Page | Lang.Compilation_unit
in the Resolver
module, the second case is named Module
. This will need to be changed too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation_unit
(and t = Page | Module
) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This module will continue to load both pages and modules. This type t
is like this:
type t =
| Page_content of Lang.Page.t
| Module_content of Lang.Compilation_unit.t
An other mismatch is that Lang.Compilation_unit.t
contains either a Module
or Pack
.
Perhaps we can say that a "compilation unit" is either a (compiled) page, a module or a pack and merge Lang.Page.t
into Lang.Compilation_unit.t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not particularly fussy about what we end up calling it, so long as we steer clear of uses of 'compilation unit' that are different from what the manual says they are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed to unambiguous names. This module is now Odoc_file
and this record is Page_content ... | Unit_content ...
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Unit" alone doesn't really mean much (except maybe the type). Either you put the whole thing ("Compilation unit") or choose another word.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We commonly use the word "unit" for "compilation unit" already. Compilation_unit_content of Lang.Compilation_unit.t
sounds unnecessary to me.
(** Lookup a module. First looks into [imports_map] then searches into the paths. *) | ||
let lookup_unit ~important_digests ~imports_map ap target_name = | ||
match StringMap.find target_name imports_map with | ||
| Odoc_model.Lang.Compilation_unit.Import.Unresolved (_, Some digest) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the imports ever anything other than Unresolved
after this patch? If not, can we get rid of Compilation_unit.Import.t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the loader
code, imports are indeed always Unresolved
. I'll simplify the type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've found that resolved imports are used by the link-deps
command, it was wrong to remove that code.
I've pushed a fix and a test case for the link-deps
command but performances have regressed a lot (15% slower on core_kernel).
A small part of this PR needs to be reverted to make this work as before.
The list of imports can be big and sometimes most of them are not necessary. Instead of resolving as much as possible for every units, we can mark as resolved only the ones that were used while compiling. The output of link-deps
shouldn't change. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced link-deps
can ever produce the correct set of dependencies. In any case, I'm also not convinced the logic you're talking about is sound. It's currently doing this:
List.iter odoctree.Odoc_model.Lang.Compilation_unit.imports
~f:(fun import ->
match import with
| Odoc_model.Lang.Compilation_unit.Import.Unresolved _ -> ()
| Odoc_model.Lang.Compilation_unit.Import.Resolved (root, _) ->
Hash_set.add deps root);
and iterating this function over all odoc files in a directory. The only thing this gains you over using simply unioning the output of compile-deps
is that it won't mention modules that couldn't be resolved at compile
time - ie, modules for which we're missing the cm{t,ti,i}
file. There's probably one minor difference - if we're making a module that's entirely aliases to other modules, and nothing else within the same directory dereferences those aliases, then we might resolve that at compile time and thus miss it if we don't make a note of resolved imports. This is pretty rare though, and I only know of one instance of this (ocaml-compiler-libs).
The second point is that this entirely ignores all the references in the odoc files, so if there are any references to modules that aren't a compilation dependency then this logic will miss those dependencies. Even if we were to resolve the first part of the reference at compile time (as was suggested in a recent meeting), we'd need to look recursively at the referenced module - e.g. if we had the reference Stdlib.List.t
and we resolved the Stdlib
part, we'd need to have a link dependency not just on stdlib.odoc
but also stdlib__list.odoc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. Ideally, the link-dependencies of a unit is everything else. The smallest reasonable set would be the transitive compile-deps, which the link-deps
command doesn't even compute.
Do you suggest removing this command ?
it won't mention modules that couldn't be resolved at compile time
With my proposal it will also remove modules that weren't used at compile time, which may not be what we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's likely worse, as it's much more likely we'll hit the problem of references above. At least with the 'union of compile-deps' scheme we're not producing fewer link deps than before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've re-implemented the efficient lookup for imports, the performance regression is gone. The link-deps
command is still here and has a testcase.
|
||
exception Fetch_failed of msg | ||
|
||
let build_resolver : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm very glad this thing has gone :-)
20ef746
to
dea0894
Compare
Do "lookup" and "resolve" in one operation to avoid opening each files twice. Remove unecessary caches and simplify the code a bit.
The result of these lookups is not used at all, except for the warnings that might be printed in case of error. Some modules have a lot of imports but need few lookups while resolving, especially the entry point of wrapped libraries.
Remove the separate loading of the root. Handle pages and modules from the same loading function. Remove unecessary caching.
Rename Odoc_odoc.Env into Odoc_odoc.Resolver. Remove the "builder" step.
This reverts commit 9549f36. Imports need to be resolved for the link-deps command. Now done in Compile rather than in Env. A test is added.
Imports are needed for the link-deps command. Some modules have a lot of imports declared but only a few are needed. Only the "Root" is needed for imports, which is written at the beginning of the files.
Previous commits were using "module" for Lang.Compilation_unit and "compilation unit" for .odoc files.
This PR does some code improvements:
Odoc_odoc.Env
toOdoc_odoc.Resolver
and simplify its API (remove thebuilder
part).odoc
files entirely. Remove the intermediate step of loading only the "root" header, which was only taken advantage of when looking up imports. (5% speed up ondune build @doc
in core_kernel)