Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refine the computation of atomic actions from a solver solution #5066

Closed
wants to merge 4 commits into from

Conversation

Armael
Copy link
Member

@Armael Armael commented Feb 23, 2022

Related: #4541
Solves the issue specifically mentionned at #4541 (comment)

This PR refines the computation of atomic actions (and more
importantly, their dependencies) computed by opam from a solver
solution. It improves it in the sense that it eliminates some cases
where opam would complain that a solution is cyclic, even though it
would be possible to execute it while preserving dependencies.

Background

(this is my limited comprehension of what happens from reading the code, I'm
happy to be corrected!)

When opam invokes a solver with a query for installing new packages, the solver
eventually produces a new "universe" indicating which versions specifically of
which packages can be installed to satisfy the query.

In order to then actually install the packages, opam needs to perform a bit more
work: from knowing 1) the dependency constraints of all packages; and 2) the set
of packages that need to be installed, it needs to deduce in which order to
install the new packages so as to obey dependencies.

What opam does is to compute a graph of the relevant packages, where an edge "A
-> B" in the graph means that action/package A depends on B. It then checks that
this is an acyclic graph, and the install plan can be extracted e.g. by doing a
topological sort of the packages in the graph.

However, the issue is that this graph is an approximation.

Strictly speaking, a given opam package does not have a set of dependencies:
dependencies can be specified as general boolean formulas. Considering that
these formulas are put in CNF, this means that dependencies are of the form:

A depends on (B1 | .... | Bn) &
             (C1 | .... | Cm) &
             ...

When considering that package A depends on a set of packages {B1, ..., Bn, C1
... Cm, ...}, this corresponds to approximating "A depends on B1 or B2" as
"A depends on B1 and B2".

Because of this approximation, it may happen that the resulting graph contains a
cycle even though it could be possible to install all the packages while
satisfying the dependencies (in their general form as boolean formulas).

This means that in full generality we should consider the hypergraph of
packages, where, for a package A, instead of edges A->B1, ..., A->Bn ("A depends
on B1" and ... and "A depends on Bn"), we have potato-shaped hyper-edges
A->B1...Bn, A->C1...Cm, etc ("A depends on B1 or ... or Bn" and "A depends on C1
or ... or Cm" and ...).

This gives us a representation that is faithful to the general form of
dependencies. However, it is now hard (NP complete, AFAICT) to find an acyclic
schedule in this hypergraph...

What this PR implements

This PR avoids solving the general problem of finding an acyclic schedule in the
general hypergraph of packages. Instead, it does compute the hypergraph (this is
basically trivial) but uses this representation to improve the approximation in
a simple case. For package A, where:

A depends on (B1 | .... | Bn) &
             (C1 | .... | Cm) &
             ....

if there is a Bi that is already installed in the universe (and stays
installed), then we can consider the clause satisfied, and avoid depending on
the other Bis.

For instance, with #4541 (comment), if you put the dependencies of dune.2.9.1 in CNF, there are clauses of the form:

(ocaml >= 4.08 | ocamlfind-secondary)

before this PR, these would result on dune depending on ocamlfind-secondary (even in
a switch with ocaml >= 4.08), eventually resulting in a cycle. With this PR,
these do not result in a dependency, avoiding the cycle.

Of course, in principle there are still cases where opam will report a cycle
that does not exist; but I would suggest to see whether they actually come up in
practice before implementing a more complex approach.

This PR refines the computation of atomic actions (and more
importantly, their dependencies) computed by opam from a solver
solution. It improves it in the sense that it eliminates some cases
where opam would complain that a solution is cyclic, even though it
would be possible to execute it while preserving dependencies.
@gasche
Copy link
Member

gasche commented Feb 23, 2022

What are the properties that we want to enforce on the selected installation order? I don't understand if we want:

  • strong requirement: all packages on which A depends are installed before A, or
  • weak requirement: the dependency formula of A is satisfied before A gets installed

One could imagine a package that depends on (B1 | B2), and checks during its installation the installation status of B1 or B2, and does something different when they are present or absent. (For example: the package foo needs a cooperative-concurrency library, it depends on (Lwt | Async) and will export ocamlfind subpackages foo.lwt and foo.async if the corresponding library is present.) For such a package, we would need the "strong requirement", and the current behavior is correct. Your proposed optimization is only valid for the "weak requirement" specification of an installation order.

@Armael
Copy link
Member Author

Armael commented Feb 23, 2022

Indeed, that's something I wondered as well, and I don't really know for sure.
Two comments:

  • what you describe seems to correspond to what happens with optional dependencies. But I'm not sure what's the exact relationship with or-formulas in non-optional dependencies and optional dependencies. At least in the surface language, they are two different things.
  • "a package that depends on (B1 | B2)" can correspond to a clause (after CNF transformation) of the following dependency constraint : (B1 & foo) | (B2 & bar)". If it is not possible to install bar in the current switch, then it seems wrong to depend on B2, even if it can be installed independently in the current switch. (This is what happens with dune 2.9.1) (but granted, the strategy applied in this PR is a bit less "cautious" and applies to possibly more scenarios)

@gasche
Copy link
Member

gasche commented Feb 24, 2022

What we are trying to do here is to determine a "dependency order" from two things: the original dependency formulas, and the satisfying assignment found by the solver. We need to be careful at whether the problem is well-posed:

  • strong robustness: does the proposed criterion return the same answer for logically-equivalent formulas? (by "formulas" here, do we mean the dependency formula of a single package, or of the whole universe at once?)
  • weak robustness: does the proposed criterion return the same answer for formulas that are the same truth value for the given assignment?

For example, if bar is never installable, then indeed (B1 & foo) | (B2 & bar) is logically equivalent to (B1 & foo), so a strongly robust criterion should not report B2 as a dependency. On the other hand, the original criterion (if B1 is "installed", then deps(B1 | B2) = deps(B2)) is only a consequence of weak robustness.

@gasche
Copy link
Member

gasche commented Feb 24, 2022

Looking at an example. I started from the following disjunctions found in opam-repository: (("lablgtk" & "conf-gtksourceview") | ("lablgtk3" & "lablgtk3-sourceview3")).

The CNF of this formula (note: reasoning modulo CNF conversion makes sense if we assume a "strongly robust" criterion) is as follows, with temporary variables x (for: gtk & sourceview), y(for: gtk3 & sourceview3):

x | !gtk | !sourceview
/\
!x | gtk
/\
!x | sourceview
/\
y | !gtk3 | !sourceview3
/\
!y | gtk3
/\
!y | sourceview3
/\
x | y

Let's assume we have a satisfying assignment that installs gtk and sourceview, and also installs gtk3 for unrelated reasons. So: x = gtk = sourceview = gtk3 = true, y = sourceview3 = false. How do we want to compute dependencies?

x | !gtk | !sourceview                 deps: x
/\
!x | gtk                               deps: gtk
/\
!x | sourceview                        deps: sourceview
/\
y | !gtk3 | !sourceview3               deps: []
/\
!y | gtk3                              deps: ?
/\
!y | sourceview3                       deps: []
/\
x | y                                  deps: x

Clearly the clause (!x | gtk) is valid because gtk is true, so gtk must be a dependency of the clause and thus of the package. The tricky case is (!y | gtk3), where two reasonings make sense:

  • if !y is already true, no need to depend on gtk3, the package is installable without it (weak requirement)
  • but gtk3 is true, so it may also be considered a dependency (strong requirement)

@gasche
Copy link
Member

gasche commented Feb 24, 2022

My gut feeling is that, as long as we cannot rule out the weird (Lwt | Async) case I mentioned, your proposal is not actually correct and that there is a cycle in "graphics.5.1.2 ocamlfind.1.9.1 ocamlfind-secondary.1.9.1 dune.2.9.1". I see two other ways out:

  • We could/should mark ocaml-secondary as installable only with OCaml < 4.08.
  • We could introduce a notion of "strong dependency" and "weak dependency", where the installation status of weak dependencies is guaranteed not to affect the build artifacts (as long as the dependency formula is satisfied, of course). But that sounds like a complex gun to shoot oneself in the foot.

@Armael
Copy link
Member Author

Armael commented Feb 24, 2022

Looking at an example. I started from the following disjunctions found in opam-repository: (("lablgtk" & "conf-gtksourceview") | ("lablgtk3" & "lablgtk3-sourceview3")).

I agree with the reasoning, though I don't know how you computed the CNF formula? I would have instead distributed & over | (IIRC that's what the cudf library does when normalizing this formula to CNF), giving:

(lablgtk | lablgtk3) &
(lablgtk | lablgtk3-sourceview3) &
(conf-gtksourceview | lablgtk3) &
(conf-gtksourceview | lablgtk3-sourceview3)

I'm not sure how your reasoning works on this one? (though, note that my PR only changes something in the case where one of the packages is already installed in the universe before the query)

@Armael
Copy link
Member Author

Armael commented Feb 24, 2022

My gut feeling is that, as long as we cannot rule out the weird (Lwt | Async) case I mentioned, your proposal is not actually correct

Right. I didn't think that we needed to account for that, but I think I agree with you now.

However, I do not agree that this is actually a cycle! For me, there being a cycle would mean that there is no way of co-installing the packages involved in the cycle. But e.g. for dune 2.9.1 it is possible: simply split the install query into two queries, e.g. installing everything but ocamlfind-secondary, then installing ocamlfind-secondary. This will rebuild dune as part of the second query.
If there exist a way of installing the packages by manually splitting the query, then ideally opam should be able to find such a schedule itself for the single big query.

So my proposal would be to amend this PR to:

  • still relax the cycle-checking code to avoid detecting the query in opam admin check --cycle --ignore-test-doc does not give accurate results #4541 (comment) as a cycle
  • make sure that dune gets re-installed as part of the query: the computed atomic actions would do: "install dune (and its other dependencies), install ocamlfind-secondary, rebuild dune (because it may rely on ocamlfind-secondary)". (with the current PR, the last rebuild does not happend)

@gasche
Copy link
Member

gasche commented Feb 24, 2022

But e.g. for dune 2.9.1 it is possible: simply split the install query into two queries, e.g. installing everything but ocamlfind-secondary, then installing ocamlfind-secondary. This will rebuild dune as part of the second query.

Ah! I thought vaguely about this but:

  • I wonder if we can bound the amount of extra work that occurs here, if we allow rebuilding? (Are there corner cases that are O(N^2) where N is the cycle size? worse?)
  • I convinced myself (too quickly maybe) that this did not actually work for two reasons:
    1. My mental model had "rebuild dune" as "uninstall dune; install dune", and I believe the first uninstallation step is not valid, it leaves the repository in a broken state.
    2. If our reasoning is that the "state" of a package may depend on all its dependencies, then does your strategy actually work? You install ocamlfind, then dune(ocamlfind), then ocamlfind-secondary(dune(ocamlfind)) and finally dune(ocamlfind-secondary(dune(ocamlfind)), ocamlfind). Note that, at this stage, you have an artifact for ocamlfind-secondary that was built against a stale artifact for its dependency (that is, dune(ocamlfind)), which may be different from the install artifact that is now in your switch. Again, if we reason that the state of a package may depend on its dependencies, we should rebuild ocamlfind-secondary now... and we in fact have a cycle.

@Armael
Copy link
Member Author

Armael commented Feb 24, 2022

I would like

opam install graphics.5.1.2 ocamlfind.1.9.1 ocamlfind-secondary.1.9.1 dune.2.9.1 

to do the same as

opam install graphics.5.1.2 ocamlfind.1.9.1 dune.2.9.1 && opam install  ocamlfind-secondary.1.9.1

(where the latter works with current opam).

Note that you write that ocamlfind-secondary depends on dune, but that's not the case: it depends on ocamlfind directly.

@gasche
Copy link
Member

gasche commented Feb 24, 2022

Right, I mixed up the dependencies in my example... But I think I could fix it. Let's start again:

A depends on (B|C), C depends on nothing, B depends on A. You suggest to build (C; A; B; A). But the build plan gives:

  • C()
  • A(C)
  • B(A(C))
  • A(C, B(A(C))

and at this point the version of B that we have is built against a stale version of A, and it should be rebuilt, and I think we have cycle.

Your "real-world cycle" is too large, and the package names are too long, but my impression is that something is wrong in the picture:

  • If we use the "strong requirement" view of dependencies (as our approximation for "on what artifacts may my own artifact depend"), we should rebuild B and there is a cycle (I think that with this view, cycles cannot be broken by rebuilding in the way you suggest, they really are cycles.)
  • If we use the "weak requirement" view of dependencies, we don't need to rebuild A.

The fact that opam install accepts ((C; A); (B; A)) today may suggest that opam itself is somehow inconsistent in the assumptions it makes about package dependencies.

(Granted, the idea that the build of B may be affected by the state of A, and thus depend on whether A itself saw (C) or (C,B) installed is kind of far-fetched. But it can happen in theory, right?)

Am I missing something?

@rjbou rjbou requested a review from AltGr February 24, 2022 14:53
@kit-ty-kate
Copy link
Member

@Armael thanks a lot for this PR! It seems to have some issues though:

  • https://github.com/ocaml/opam/runs/5320206023 shows some tests do not succeed as expected and a few weird things get removed from the action graph (e.g. the dependency on ocplib-endian in mirage-protocol)
  • The tests that I’ve just added doesn’t seem to show the new code is working expected

@rjbou rjbou added the PR: WIP Not for merge at this stage label Feb 25, 2022
@Armael
Copy link
Member Author

Armael commented Feb 25, 2022

I think you are right again.
This relies on something I wasn't initially sure about: rebuilds propagate transitively to all possible revdeps.
In other words, if A depends on B which depends on C, rebuilding C requires rebuilding B then A. (I tested this by creating an opam-repository with dummy packages A B and C).

Consequently:

  • this means that every possible dependency (even "weak" ones) involved while computing actions needs to be considered as a strong dependency; otherwise, we would just get a cycle between rebuilds afterwards. The idea behind this PR is thus fundamentally incorrect.
  • the fact that we are currently able to manually break the cycle with the dune 2.9.1 example is a bug/inconsistency in opam. Instead, when trying the second install invocation, opam should instead report a cycle between rebuild actions.

I'll thus move to close this PR. Not all hope is lost for the broader issue though, and the diagnostic of @kit-ty-kate in #4541 is correct: the culprit is opam admin check --cycles which does not detect this cycle.
I'll try to look at that and see how it can be fixed.

@Armael Armael closed this Feb 25, 2022
@AltGr
Copy link
Member

AltGr commented Mar 2, 2022

Wow, I am impressed and this is very interesting!
Unfortunately I concur with @gasche's original remark. In particular, I have always considered depopt: "foo" to be equivalent to depends: ["foo" | <nil>] (although it's not at all the way it's implemented; I'll give it some more thought) — meaning that you're not allowed to keep your package without foo support enabled as long as foo is installed. It was also an admitted constraint that any installation should be repeatable in a single action-graph application (avoiding complex bootstraps).

But it would probably remain sensible to exclude dependencies belonging to an unsatisfied conjunction (assuming DNF!!) from the action graph ? I don't think this would break any of our invariants 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR: WIP Not for merge at this stage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants