-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refining dependencies #7498
Refining dependencies #7498
Conversation
a120bdf
to
c5bd0a5
Compare
Added some maps to get entry modules for lib closure in order to compare it to |
Signed-off-by: arozovyk <artemiy.rozovyk@ocamlpro.com>
Signed-off-by: arozovyk <artemiy.rozovyk@ocamlpro.com>
a2a007a
to
fcf1801
Compare
Signed-off-by: arozovyk <artemiy.rozovyk@ocamlpro.com>
947eea0
to
b2afbee
Compare
@arozovyk If you want to get quicker feedback from the CI, make a small PR (like docs or comments) so that the CI will run as you will lose the first-time contributor status here. |
Signed-off-by: arozovyk <artemiy.rozovyk@ocamlpro.com>
1ab1e6d
to
4b871d5
Compare
@Alizter |
@arozovyk The PR has to be merged which is why I suggested a small improvement like documentation. |
Signed-off-by: arozovyk <artemiy.rozovyk@ocamlpro.com>
2534b1c
to
eede49c
Compare
It appears that bootstrapping is broken, hence why most of the jobs fail. Try debugging why |
Signed-off-by: arozovyk <rozovyk.artem@gmail.com>
09d1814
to
ecfbb27
Compare
Yea, I'm breaking it from time to time :) But now it's "Install Deps on Unix" that's failing. I think it's due to "-open" flags. Just added some (well overdue) checks to try to fix it. |
Signed-off-by: arozovyk <artemiy.rozovyk@ocamlpro.com>
Signed-off-by: arozovyk <artemiy.rozovyk@ocamlpro.com>
f6f0282
to
2700e71
Compare
Signed-off-by: arozovyk <artemiy.rozovyk@ocamlpro.com>
Signed-off-by: arozovyk <artemiy.rozovyk@ocamlpro.com>
Signed-off-by: arozovyk <artemiy.rozovyk@ocamlpro.com>
dac8fa7
to
faf4471
Compare
So skipping filtering for unwrapped libs seems to have worked. Now I'm breaking a decent amonut of tests but it looks like much of it is due to the changes to |
Signed-off-by: arozovyk <artemiy.rozovyk@ocamlpro.com>
0246fd4
to
0693bc2
Compare
@arozovyk How is it going by the way? |
@Alizter Haven't been able to work much on it last week due to a conference, but I managed to fix the obvious mess-ups leading to insane memory usage, and now I'm testing it against a (not very large) monorepo. Currently it uses like 40-60% more memory but I have few ideas how to do things better algorithmically, might take some time though. I dont have the screenshot for the version @nojb tested, but it was something like 50G of memory used for the example above:) (Also, I'm missing some specific cases from what i see in the CI, but it should be easy to fix now). |
Are you saying that it used 60% more allocations? From the graph you just posted it appears it is even using less memory only 280MB whereas main is using 300MB. Or are the pictures swapped? |
I meant total allocations, yea. Hm, the pictures are in the right order, so I think I'm misunderstanding the result and the PR indeed has less memory usage on average (?) |
9cf41bd
to
3ba0cd5
Compare
01e97d5
to
900aa8b
Compare
Hello! What is the status of the PR? I ran a few rough benchmarks with LexiFi's codebase and compared it with However, I observed a slowdown in the build proper: for a full build (starting from nothing), the build time goes from ~1m55 to ~2m5 (roughly). The slowdown in zero builds (starting from a fully built tree) is much more marked: it goes from ~3s to ~15s. Independently of that, could you state precisely what is the optimization that is implemented by this PR? I did a few tests that I thought would trigger the optimization but there wasn't any difference as far as I could see, but perhaps I am missing something. |
Hello ! There are two reasons for that, fisrt, currently Dune keeps only a part of Secondly, in order to filter dependencies , we have to work with just the module names as strings given by Given a file alcotest-async.1.6.0/test/e2e/alcotest-lwt/failing/async_failure.ml
declared in Alcotest_lwt's interface. The only solution that I'm currently aware of is to try and find the corresponding instance of Moreover, as you know, depending on the value of The branch @nojb tested solves the aforementioned problem in a "silly" (but rather precise) way: computing the mapping of each lib (by its entry names) in Currently, I'm implementing a slightly less silly solution that changes the type of In the second case, ( In both cases, independently of the value Also, for the rebuild problem, even if we manage to find a viable solution using just I'll get back to you as soon as I have any new results. |
Would it make sense to only enable this optimization when |
Hello, sorry for the delay, I've been trying to understand things better before responding. I'm not sure I follow all the details in your response, so I thought I would spell out what I understand about this topic and we can see if it maches up with what this PR is doing. Currently, for every module in a "buildable" (library or executable), Dune declares that the Given this, the naïve approach would be to approximate, for each module being compiled, a subset of the libraries listed in the My questions are:
Sorry for the many questions, I am trying to understand what is going on :) Thanks! |
900aa8b
to
749e043
Compare
Hello, @nojb Also sorry for the delay, I've been working on the new implementation (described in my previous post) and testing it out to also get a better understanding of the performance side of things. Just to start answering your questions :
No, see details below.
The current code does need to compute transitive closures when
Sure,
Also, please observe the number of "-I" ( In other cases, the PR seems to also detect the libraries that are not needed and doesn't even compile them :
That being said, there are cases where the total build time seems to be slightly slower.
As for the "how it works": To answer your question about So how to narrow down this set of libraries? Intuitively, it should have been as easy as you describe it in your third paragraph. First, as I said in my previous post, you can't simply compare a Secondly, there is a quite common exception (also described in my prev. post) where you can't simply rely on Why closures? Rather that comparing a library entry module names with There are quite a few exceptions where it is still not enough, and I identified (and found a way to avoid filtering) the most of them (virtual libraries, anything related to Also for the perfomance, I'm sure there are few things that can be done to filter libraries faster, but it will also take time. Hope that clarifies some things. Cheers |
cc3bb7f
to
bdf5598
Compare
else | ||
new_list | ||
:: aux | ||
(List.fold_right new_list ~init:seen ~f:(fun a b -> Set.add b a)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I randomly happened to look at the PR, and found this spot. Any reason not to use a tailrec List.fold_left here?
Performance wise, it could also be the case that using a Hashtbl to collect "seen" objects would work better here (I've no idea whether this is a bottleneck!). You could also filter out already-seen values and add objects to the table with a single pass on the h
list.
Some update on this PR: unfortunately @arozovyk and I will no longer have time to work on this. Here is a summary of the current state, for people who might be interested to either triage this or resume work:
We also no longer have access to an industrial user to validate the usefulness of this work, and without such experimental validation it does not seem productive to keep working on this PR. |
@alainfrisch @nojb if you're interested with experimenting with this feature, you could refine it so that it can be experimentally enabled for your projects so that you can study the performance further and perhaps optimize it. I don't see an issue with merging an experimental feature as long as it doesn't impact the rest of dune. |
1 similar comment
@alainfrisch @nojb if you're interested with experimenting with this feature, you could refine it so that it can be experimentally enabled for your projects so that you can study the performance further and perhaps optimize it. I don't see an issue with merging an experimental feature as long as it doesn't impact the rest of dune. |
Just to clarify (and confirm our understanding), following @nojb experiments, we observed the following:
@lthls : is that right? In our case, most of of internal libraries depend on a single "core helper" library (which would act as libA above), so in practice, any change in any "intermediate library" (libB) trigger full recompilation of all modules in the "terminal" program (libC). But even if that program globally depends on all intermediate libraries, many of those libraries are only used by just a few modules in that program; and the goal would precisely be to avoid recompiling those who don't depend (even transitively) on a given library when it is modified. Given that a "zero-build" is already significantly slower with the PR than without (around 50%-100% longer), we fear that dropping the restriction mentioned by @lthls would make such "zero-build" even slower, to the point where we wouldn't use the refined semantics anyway. I don't think we will pick up work on this topic on our side, but otherwise, we'll drop a note here! FTR, we initially had in mind a even finer notion of dependencies, where dune would only recompile module whose actual dependencies (in terms of modules, not libraries) had been touched since the previous build. We agreed with OCamlPro to restrict it to libraries (i.e. only remember that a given module depends on such and such libraries, but not on individual modules in them); the goal was to keep the representation of the dependency graph reasonably compact, but apparently this was not enough. I wonder if the refinement (even the finer one described in the previous paragraph) wouldn't be better addressed without touching this graph, simply by short-circuiting a recompilation when we observe it is unnecessary. Typically, if dune wants to call ocamlopt to rebuild m.cmx, if it knows that the existing m.cmx had been compiled previously with the right command-line (same flags, etc), then it could just check the digests (for .cmi and .cmx files) listed in the m.cmx and look if any of the actual dependencies (recorded in the .cmx) is now invalid; if all dependencies are still valid, no need to recompile. This would mean opening a bunch of .cmi/.cmx files, but dune could maintain of cache of those digests. |
Note quite. In your description here, you say that libC depends on libA and libB. In this case, the current state of the PR works: we only need libA, so we can remove libB as a dependency. In the example you actually sent us, libC depends on libB (and not libA), but its modules actually depend on libA and rely on transitive dependencies to resolve that. So we detect that we only need a dependency on libA, but our current heuristic only allows to cover the actual dependencies through declared dependencies, and so our only solution is to keep libB, which is the only declared dependency that covers libA. |
I don't know if I am missing something, but I don't see any difference between adding
We compile once (with the current tip of this branch a427786):
next, we touch
|
Ok, this is unexpected. That means there is a bug somewhere in this PR. I'm not sure why we didn't cover this case properly, but at this point I don't think we'll invest the time to fix it. |
Closing until someone decides to take over this PR. |
This PR addresses the problem of unnecessary module recompilations occurring on library interface changes despite the fact that the module being recompiled does not directly depend on that library (only some other module(s) in the library/executable do).
The cause of this seems to be the fact that dune indiscriminately associates a
Dep.Fact.File_selector
(representing a set ofFiles.t
of a library) to every single module of executable/library compiled, when only a subset of modules does in fact depend on it. We can observe it inBuild_system.Exported.execute_rule_impl
where theAction.t
is run, returning a set ofDep.Fact.t
used to compute the rule digest.The rule responsible for this association is produced by
Command.run
inModule_compilation.build_cm
. One of its arguments (the 4th) containsCommand.Args.Hidden_deps (Lib_file_deps.deps requires)
,requires
being the list of library dependencies defined by compile info (eitherLib.Compile.direct_requires
orLib.Compile.requires_link
, depending onimplicit_transitive_deps
value). On rule execution, the command is expanded and the aforementioned list of library dependencies is registered asDep.Fact.File_selector
s for targets being produced.In this PR I attempt using
ocamldep
s output in order to filter the list of "requires" preventing them from being registered asDep.Fact.File_selector
for modules that do not need it.The solution is structured as following :
Ocamldep
module, instead of keeping only intra-library/executable modules, keep the fullocamldep
s output and write it toall-deps
file. Introduce new typeModule_dep.t = | Local of Module.t | External of External_name.t
. When parsingall-deps
file, returnLocal of Module.t
when the module is found (old behavior), otherwise returnExternal of External_name.t
. Modify theDep_graph.t
to contain both types of dependencies.Compilation_context.Includes.t
(where we makeCommand.Args.Hidden_deps
) add 2 arguments: module that is being compiled and dependency graph. (Not sure what is the proper way of doing that, here I just transformedCompilation_context.t.includes
into a partially applied function that waits for a module). Given the additional information, filter "requires" of typeLib_flags.L.t
that are local (as opposed to the ones coming from the "installed world") in order to only keep the "real" dependencies indicated byocamldep
.