Best effort co-concretization (iterative algorithm) #28941
@becker33 @tgamblin e4s-pr-generate took just 2 mins and a single core using
One request, one idea
```python
# The algorithm is greedy, and it might decide to solve the "best"
# spec early in which case reuse is suboptimal. In this case the most
# recent version of libdwarf is selected and concretized to libelf@0.8.13
(['libdwarf@20111030^libelf@0.8.10',
  'libdwarf@20130207^libelf@0.8.12',
  'libdwarf@20130729'], 'libelf@0.8.12', 1),
```
Could we improve this by having both root literals (the literals we already use) and dependency literals (explicitly mentioned dependencies of roots) and minimizing the unsolved dependency literals as well? I think that would cause us to concretize the most specific specs first, which would increase our odds of a more minimal build.
Adding a comment, following a discussion with @becker33 yesterday. We can do something very similar to this by adding weights during the setup phase for the literals that are solved. Then we can minimize the sum of the weights for reuse (currently weights are uniform and equal to 1 for each literal spec).
We can experiment with different ways of computing the weights in a separate PR.
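The weighting idea can be sketched in clingo-style ASP. This is an illustrative fragment, not the actual rules in Spack's `concretize.lp`; it assumes a `literal_not_solved/1` atom as in the surrounding encoding, and the weight values shown are placeholders:

```
% Setup phase emits one weight fact per input literal
% (currently uniform: every literal gets weight 1).
literal_weight(0, 1).
literal_weight(1, 1).

% Minimize the weighted sum of unsolved literals instead of a flat count.
#minimize { W@100,ID : literal_not_solved(ID), literal_weight(ID, W) }.
```

With non-uniform weights, the solver would prefer leaving low-weight literals unsolved, which is one way to bias the greedy rounds toward reuse.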
@becker33 Pending unrelated errors in pipelines, I think this is done.
@becker33 Is this PR awaiting further review before being merged?
@becker33 It's a mystery to me how @eugeneswalker (or more likely GitHub on his behalf) could dismiss your review with a commit that doesn't exist anymore, but can you please ✔️ again? I rebased the PR to double check that everything is working correctly with #28673 in
What da!?! I definitely didn't mean to do this, didn't even know about this PR. If anyone knows how this could happen...
This change injects the reusable specs in the SpecBuilder and avoids retrieving them from either the DB or the buildcache (making the underlying abstractions less coupled). It also fixes a bug where we were "reusing" only the root specs and not the other nodes of the DAG.
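The decoupling described here can be illustrated with a small Python sketch. All names below (`SolverDriver`, `reusable_specs`, the string specs) are hypothetical stand-ins for illustration, not Spack's actual classes or API:

```python
# Hedged sketch: the solver driver receives reusable concrete specs as an
# argument instead of reaching into the DB or buildcache itself.
class SolverDriver:
    def solve(self, abstract_specs, reuse):
        # 'reuse' is just a list of concrete specs; the driver no longer
        # cares whether they came from the local store or a buildcache.
        return [f"{s} (reused)" if s in reuse else f"{s} (built)"
                for s in abstract_specs]

def reusable_specs(store, buildcache):
    # A single function gathers candidates from both sources, keeping the
    # driver decoupled from configuration and storage details.
    return list(store) + list(buildcache)

driver = SolverDriver()
reuse = reusable_specs(["zlib"], ["libelf"])
print(driver.solve(["zlib", "cmake"], reuse))
```

The point of the design is that the driver can be unit-tested by passing any list of concrete specs, with no database or buildcache in sight.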
Currently, environments can either be concretized fully together or fully separately. This works well for users who create environments for interoperable software and can use `concretizer:unify:true`. It does not allow environments with conflicting software to be concretized for maximal interoperability.

The primary use-case for this is facilities providing system software. Facilities provide multiple MPI implementations, but all software built against a given MPI ought to be interoperable. This PR adds a concretization option `concretizer:unify:when_possible`. When this option is used, Spack will concretize specs in the environment separately, but will optimize for minimal differences in overlapping packages.

* Add a level of indirection to root specs

  This commit introduces the "literal" atom, which comes with a few different "arities". The unary "literal" contains an integer that is the ID of a spec literal. Other "literals" contain information on the requests made by literal ID. For instance `zlib@1.2.11` generates the following facts:

      literal(0,"root","zlib").
      literal(0,"node","zlib").
      literal(0,"node_version_satisfies","zlib","1.2.11").

  This should help with solving large environments "together where possible", since later literals can now be solved together in batches.

* Add a mechanism to relax the number of literals being solved

* Modify `spack solve` to display the new criteria

  Since the new criteria sit above all the build criteria, we need to modify the way we display the output. Originally done by Greg in spack#27964 and cherry-picked to this branch by the co-author of the commit.

  Co-authored-by: Massimiliano Culpo <massimiliano.culpo@gmail.com>

* Inject reusable specs into the solve

  Instead of coupling the PyclingoDriver() object with spack.config, inject the concrete specs that can be reused. A method-level function takes care of reading from the store and the buildcache.

* `spack solve`: show output of multi-round solves

* Add tests for best-effort co-concretization

* Enforce having at least one literal being solved

Co-authored-by: Greg Becker <becker33@llnl.gov>
Use case description copied from #27964
Currently, environments can either be concretized fully together or fully separately. This works well for users who create environments for interoperable software and can use `concretization: together`. It does not allow environments with conflicting software to be concretized for maximal interoperability.
The primary use-case for this is facilities providing system software. Facilities provide multiple MPI implementations, but all software built against a given MPI ought to be interoperable.
This PR adds a concretization option `together_where_possible`. When this option is used, Spack will concretize specs in the environment separately, but will optimize for minimal differences in overlapping packages.
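As a concrete illustration, an environment could request this behavior in its `spack.yaml`. The spec list below is made up, and the option is shown under the `concretizer:unify:when_possible` spelling used in the squashed commit message above:

```yaml
# spack.yaml -- illustrative environment (spec names are made up)
spack:
  specs:
  - hdf5 ^mpich
  - hdf5 ^openmpi    # conflicts with the line above under full unification
  concretizer:
    unify: when_possible
```

With full unification this environment would fail to concretize; with separate concretization the two `hdf5` nodes could diverge arbitrarily. `when_possible` sits in between, minimizing the differences.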
Implementation differences
The algorithm used here is greedy, since specs are computed in multiple rounds where clingo concretizes together as many input specs as possible. The gist of the algorithm is:
https://github.com/alalazo/spack/blob/f6420eaf6eb5e10a3e4d0b0a1a6c20e4404f92c1/lib/spack/spack/solver/asp.py#L2092-L2135
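In outline, that loop behaves like the following Python sketch. `solve_round` and the spec strings are hypothetical stand-ins for one clingo solve, not Spack's actual API:

```python
# Hedged sketch of the greedy multi-round driver. solve_round(specs, reuse)
# stands in for one clingo solve: it concretizes as many of the given specs
# as it can and returns (solved_inputs, concrete_results).
def solve_in_rounds(input_specs, solve_round):
    reusable = []      # concrete specs from earlier rounds, fed back for reuse
    rounds = []        # one entry per solve round
    remaining = list(input_specs)
    while remaining:
        solved, concrete = solve_round(remaining, reuse=reusable)
        if not solved:  # every round must make progress
            raise RuntimeError("no literal was solved in this round")
        rounds.append(concrete)
        reusable.extend(concrete)
        remaining = [s for s in remaining if s not in solved]
    return rounds

# Toy stand-in: a round "solves" every spec sharing the first spec's dependency.
def fake_round(specs, reuse):
    tag = specs[0].split("^")[-1]
    solved = [s for s in specs if s.split("^")[-1] == tag]
    return solved, solved

print(solve_in_rounds(["a^mpich", "b^mpich", "c^openmpi"], fake_round))
# -> [['a^mpich', 'b^mpich'], ['c^openmpi']]
```

Each round feeds the previously concretized specs back in as reuse candidates, which is what keeps later rounds consistent with earlier ones.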
To relax the requirements on input specs and allow some of them to remain unsolved, we add a new indirection:
https://github.com/alalazo/spack/blob/f6420eaf6eb5e10a3e4d0b0a1a6c20e4404f92c1/lib/spack/spack/solver/asp.py#L1787-L1802
and we give clingo the choice to solve for the input literal or not. All the input specs computed in previous rounds are then reused in later rounds to ensure the environment is as contained as possible.
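In clingo-style ASP, the indirection amounts to something like the following sketch. These rules are illustrative, not the verbatim rules from `concretize.lp`:

```
% Each input spec is represented by literal facts keyed on an integer ID.
literal(0).
literal(0,"root","zlib").
literal(0,"node_version_satisfies","zlib","1.2.11").

% Choice: the solver may pick which input literals to solve this round.
{ literal_solved(ID) : literal(ID) }.
literal_not_solved(ID) :- literal(ID), not literal_solved(ID).

% Each round must solve at least one literal.
:- not literal_solved(ID) : literal(ID).

% Highest-priority objective: leave as few literals unsolved as possible.
#minimize { 1@100,ID : literal_not_solved(ID) }.
```

Because the minimization sits at the highest priority, clingo only leaves a literal unsolved when no model can satisfy all of them together.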
Comparison with #27964
When trying out the encoding in #27964 for production environments, we saw huge memory requests during the grounding phase.
The solve time can very likely be improved a lot by removing the symmetry of the new encoding with some artificial rules. The memory requests during the grounding phase, on the other hand, will necessarily grow linearly with the number of PSIDs needed to solve the environment. That number is not known a priori: it is capped by the number of specs in the environment (a bound that is computationally unfeasible) but in practice it is usually much smaller. For instance, e4s has 119 specs and needs 3 process spaces to concretize.

This PR instead takes a greedy approach and performs N solves, each with a single process space. Compared with the encoding in #27964 it might give a different (and "suboptimal", i.e. using more process spaces than strictly needed) result, but it is complete: if there is at least one solution, the algorithm will not return a false negative. On the bright side, due to its greedy nature, it is much faster when solving for environments (e4s is solved in 3 rounds in ~1 min.).