Skip to content

feat: finer dependency analysis between libraries (#4572)#14021

Draft
robinbb wants to merge 14 commits intoocaml:mainfrom
robinbb:robinbb-issue-4572
Draft

feat: finer dependency analysis between libraries (#4572)#14021
robinbb wants to merge 14 commits intoocaml:mainfrom
robinbb:robinbb-issue-4572

Conversation

@robinbb
Copy link
Copy Markdown
Contributor

@robinbb robinbb commented Apr 3, 2026

Implements per-module library dependency filtering as proposed in #4572. When
library A depends on library B, Dune currently assumes every module in A
depends on every module in B — causing full recompilation of A whenever any
module in B changes. This PR uses ocamldep's per-module output to declare
cross-library file dependencies only for libraries a module actually
references.

This PR builds on prerequisite PRs:

Depends on #14030.

The changes in this PR (on top of the above):

  • A new Lib_index (in lib_file_deps.ml) maps library entry module names
    back to their libraries. It is computed once per stanza and stored in
    Compilation_context.t.
  • read_immediate_deps_raw_of reads raw ocamldep output (including
    cross-library references that were previously silently dropped), filters
    out stanza-internal names, and looks up the remainder in the index to
    determine which libraries are actually needed.
  • Conservative fallbacks ensure correctness: special module kinds, link-time
    generated modules, and unresolvable references fall back to all-library
    deps.
  • The nine baseline tests are updated to show the improved behaviour.

Resolves #4572

@robinbb robinbb force-pushed the robinbb-issue-4572 branch 2 times, most recently from 13c8e53 to 618ea7f Compare April 3, 2026 03:40
@rgrinberg
Copy link
Copy Markdown
Member

Thanks for working on this. I expect users are going to be quite pleased to have this feature. Some preliminary questions:

Includes.make no longer bundles Hidden_deps with -I flags. Include
flags remain shared across all modules in a stanza; file dependencies are
now declared per-module in build_cm.

Why are the include flags are needed if we don't expect the library to be used by the module?

For each module, read_immediate_deps_raw_of reads the raw ocamldep output
(including cross-library references that were previously silently dropped),
filters out stanza-internal names, and looks up the remainder in the index
to determine which libraries are actually needed.

Do we have tests for when an internal module name is shadowing a library?

Conservative fallbacks ensure correctness: alias and wrapped-compat modules
skip library deps entirely; special module kinds, link-time generated
modules, and unresolvable references fall back to all-library deps.

Is it possible to move these changes to a separate PR and rebase this work on top of that?

@robinbb robinbb force-pushed the robinbb-issue-4572 branch 3 times, most recently from 7373af6 to 1892da3 Compare April 3, 2026 22:31
@robinbb
Copy link
Copy Markdown
Contributor Author

robinbb commented Apr 3, 2026

Is it possible to move these changes to a separate PR and rebase this work on top of that?

@rgrinberg I have tried to do so with this PR #14030, depending on what you really meant by "these changes". :-)

@robinbb
Copy link
Copy Markdown
Contributor Author

robinbb commented Apr 3, 2026

Thanks for working on this. I expect users are going to be quite pleased to have this feature.

You're very welcome, and thank you for your prompt attention to it!

@robinbb robinbb force-pushed the robinbb-issue-4572 branch from 1892da3 to 76a7875 Compare April 3, 2026 22:47
@robinbb
Copy link
Copy Markdown
Contributor Author

robinbb commented Apr 3, 2026

Do we have tests for when an internal module name is shadowing a library?

I guess not. I have made #14031 for this. Please have a look.

@robinbb robinbb force-pushed the robinbb-issue-4572 branch 5 times, most recently from c2b7f96 to e22e456 Compare April 6, 2026 17:06
@nojb
Copy link
Copy Markdown
Collaborator

nojb commented Apr 6, 2026

Thanks for working on this patch @robinbb!

I tried this in our codebase by adding a new (unused) module to a library which is close to the root of the dependency tree, and doing dune build. Then I modified the unused module and redid dune build. At this point I expected for there not to be any recompilation at all, since the modified module is not used anywhere in the codebase. However, I observe some recompilation of modules (in other libraries) which do not depend on it. The libraries involved are all unwrapped.

Do you have any insight as to what may be causing the extra recompilation?

@Alizter Alizter requested review from Alizter and art-w April 6, 2026 17:22
@nojb
Copy link
Copy Markdown
Collaborator

nojb commented Apr 6, 2026

Do you have any insight as to what may be causing the extra recompilation?

I think one case where unexpected extra recompilation is triggered is when there a library consisting of a single module. If memory serves there is an optimization that avoids calling ocamldep in this case, but that may be defeating the optimization in this patch.

@Alizter
Copy link
Copy Markdown
Collaborator

Alizter commented Apr 6, 2026

Transparent module aliases hide library references from ocamldep when a module re-exports a library wrapper via module M = Mylib, ocamldep will miss the dependency on Mylib.

Here is the regression:


  $ cat > dune-project << EOF
  > (lang dune 3.0)
  > EOF
  $ mkdir lib && cat > lib/dune << EOF
  > (library (name mylib))
  > EOF
  $ cat > lib/mylib.ml << EOF
  > let v = 42
  > EOF
  $ cat > lib/mylib.mli << EOF
  > val v : int
  > EOF
  $ cat > dune << EOF
  > (executable (name main) (libraries mylib))
  > EOF
  $ cat > re.ml << EOF
  > module M = Mylib
  > EOF
  $ cat > re.mli << EOF
  > module M = Mylib
  > EOF
  $ cat > main.ml << EOF
  > let () = print_int Re.M.v
  > EOF

  $ dune build ./main.exe

Change mylib's interface:

  $ cat > lib/mylib.mli << EOF
  > val v : int
  > val w : string
  > EOF
  $ cat > lib/mylib.ml << EOF
  > let v = 42
  > let w = ""
  > EOF

The incremental build must succeed:

  $ dune build ./main.exe
  File "_none_", line 1:
  Error: Files .main.eobjs/native/dune__exe__Main.cmx and lib/mylib.cmxa
         make inconsistent assumptions over interface Mylib
  [1]

This probably also occurs for any module from Mylib not just the wrapper, so affects unwrapped, wrapped and singleton buildable stanzas. Before this change, the coarse glob would handle this unknowingly.

@nojb
Copy link
Copy Markdown
Collaborator

nojb commented Apr 6, 2026

Do you have any insight as to what may be causing the extra recompilation?

I think one case where unexpected extra recompilation is triggered is when there a library consisting of a single module. If memory serves there is an optimization that avoids calling ocamldep in this case, but that may be defeating the optimization in this patch.

Continuing to investigate this, an .mli in a library A containing only

val foo: Buffer.t

triggers the recompilation of the .cmi every time when a module in a library on which A depends on is changed. From a distance this looks like some missing logic to handle dependencies on stdlib modules.

@robinbb robinbb force-pushed the robinbb-issue-4572 branch from 3082bfa to 1fd630d Compare April 9, 2026 04:28
…issue ocaml#4572)

Add a new function `read_immediate_deps_raw_of` to the ocamldep module that
returns raw module names (as Module_name.Set.t) from ocamldep output, without
resolving them to Module.t values.

This function will be used to determine which external library modules a
module actually references, enabling finer-grained dependency tracking where
modules are only recompiled when libraries they actually use change.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
@robinbb robinbb force-pushed the robinbb-issue-4572 branch 9 times, most recently from 37f60f1 to 8a13c95 Compare April 9, 2026 07:06
the second run of dune.

$ dune build @package-cycle
Error: Dependency cycle between:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal of this error is strange, because it means we will be able to build packages together in a workspace that otherwise would have cycle errors and cause the (opam and dune) solvers to fail to find a solution.

I think we need to do cycle detection at the level of packages in order to avoid such situations. Before it wasn't needed because the dependencies were coarse enough to imply the sufficient conditions to avoid a cycle.

@rgrinberg any advice on where and how this detection should be done?

@Leonidas-from-XIV
Copy link
Copy Markdown
Collaborator

Can we also run a meaningful revdep check on this? This has the potential to affect a bunch of downstream packages.

Yes, we have a (manually triggered) workflow for this, so provided the branch is rebased on a reasonably recent version of main (I fixed some smaller issues in it recently) it should be possible to trigger this workflow by

For smaller package sets this takes about 10 minutes, bigger sets might take several hours I assume.

@robinbb robinbb force-pushed the robinbb-issue-4572 branch from 8a13c95 to 72009e7 Compare April 9, 2026 11:56
Change deps_of_entries from Lib.t list to (Lib.t * Module.t option) list.
Add deps_of_module (unused). Update all callers to pass (lib, None).
No behavioral change — the None path is functionally identical.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
@robinbb robinbb force-pushed the robinbb-issue-4572 branch from 72009e7 to 452bd20 Compare April 9, 2026 13:06
@robinbb
Copy link
Copy Markdown
Contributor Author

robinbb commented Apr 9, 2026

I am chasing down a CI failure in this branch. The failure is good, in the sense that it has caught legitimate regressions introduced by prior versions of this branch - ones that were not caught by the cram test suite.

Add Lib_index module to lib_file_deps.ml/mli. Not used anywhere yet.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
robinbb added 3 commits April 9, 2026 11:07
Add lib_index field and accessor to Compilation_context. Wired up
with Lib_index.create but not yet used in module_compilation.ml.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Split lib_cm_deps into lib_cm_deps_arg (static Command.Args.t) and
lib_cm_deps_dyn (Action_builder, no-op for now). Add can_filter
(always false). No behavioral change — lib_cm_deps_arg carries the
same Hidden_deps as before, lib_cm_deps_dyn is a no-op.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Add real can_filter computation (Melange excluded, dep_graph_dir check,
module kind check). The value is not used yet — lib_cm_deps_arg still
provides static deps for all modules. This tests whether merely computing
the can_filter conditions has side effects.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
@robinbb robinbb force-pushed the robinbb-issue-4572 branch from 1f2a0ce to 0e929f1 Compare April 9, 2026 18:35
Enable can_filter and add full filtering logic in lib_cm_deps_dyn
(Lib_index lookup, ocamldep refs, transitive deps, -open extraction).
Static lib_cm_deps_arg remains for ALL modules — never removed.

Both paths fire: modules get a superset of deps. The filtering
optimization does not activate (tests unchanged), but the dynamic
machinery runs for real. This tests whether the dyn_deps computation
itself causes CI failures.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
@robinbb robinbb force-pushed the robinbb-issue-4572 branch 3 times, most recently from 4ff661e to eecf3ac Compare April 10, 2026 06:14
@art-w
Copy link
Copy Markdown
Collaborator

art-w commented Apr 10, 2026

Very cool work indeed!

Is you new PR #14116 superseding this one?

Either way, I think it's worth spending more time to improve the test suite, because it's easy to accidentally rely on subtle behaviors that are more permissive than the general case. For example, the -opaque flag makes it possible to skip some recompilations on benign changes, even though these recompilations would normally be required.

Here's a small reproduction which should help debug the issue where a library re-exports one of its dependencies with a module alias. This cram test also shows the subtleties in the setup that could make it pass (when it shouldn't!)


First we need a library where we'll perform the changes:

$ mkdir impl
$ cat > impl/dune <<EOF
> (library (name impl))
> EOF
$ cat > impl/impl.ml <<EOF
> let foo = "initial build"
> EOF

Then another library, which exposes an alias to impl:

$ mkdir alias
$ cat > alias/dune <<EOF
> (library (name alias) (libraries impl))
> EOF
$ cat > alias/alias.ml <<EOF
> module Impl = Impl
> EOF

And a binary which depends on Alias to access Impl:

$ mkdir bin
$ cat > bin/dune <<EOF
> (executable (name main) (libraries alias))
> EOF
$ cat > bin/main.ml <<EOF
> let () = print_endline Alias.Impl.foo
> EOF

Compilation is different when the executable is defined in a single file, so the bug doesn't trigger unless we also add an empty unused file: (!!)

$ touch bin/unused.ml

The first build succeeds:

$ dune exec ./bin/main.exe
initial build

Now we introduce a "soft" update, such that impl.cmi isn't modified:

$ cat > impl/impl.ml <<EOF
> let foo = "second build, no change to cmi"
> EOF

In that specific case, we are allowed to not recompile bin/main.ml because the -opaque flag guarantees we are not dependent on the implementation of Impl: (the linker checks that the cmi digest hasn't changed)

$ dune exec ./bin/main.exe
second build, no change to cmi

But if the cmi of Impl changed, then skipping the recompilation of main.ml is wrong: (the linker check fails)

$ cat > impl/impl.ml <<EOF
> let new_value = 42
> let foo = "third build, forced a cmi update"
> EOF

$ dune exec ./bin/main.exe
File "_none_", line 1:
Error: Files bin/.main.eobjs/native/dune__exe__Main.cmx and impl/impl.cmxa
       make inconsistent assumptions over interface Impl
[1]

Regarding revdeps testing, one issue to keep in mind is that testing a clean build (and not an incremental build) is likely to hide the failures:

$ dune clean
$ dune exec ./bin/main.exe
third build, forced a cmi update

When can_filter is true, remove static Hidden_deps so dyn_deps becomes
the sole library dep provider. Modules that don't reference a library
are no longer recompiled when that library changes.

The can_filter condition excludes Melange, special module kinds, and
cases where the dep graph dir doesn't match the obj dir. For sandboxed
builds (dune build -p), can_filter is typically false due to the
dep_graph_dir mismatch, so static deps are preserved.

Test expectations updated from count=2 to count=0. The @package-cycle
structural cycle test is updated (empty modules produce no file deps).

Also includes -open flag extraction and -as-argument-for/-parameter
module name handling to ensure filtering doesn't drop implicit deps.

Resolves ocaml#4572

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
@robinbb robinbb force-pushed the robinbb-issue-4572 branch from 12e0764 to d6ae7e8 Compare April 10, 2026 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Finer dependency analysis between libraries

7 participants