Modify backtracking algorithm and remove swapping #65

iangreenleaf · 2017-04-13T03:01:52Z

The Good

This pulls in fixes for several issues I've uncovered in the process, like #63 and CocoaPods/Resolver-Integration-Specs#10.

It also makes the backtracking algorithm predictable, so that you can assume certain guarantees about the resolution you'll get based on the order of items you feed in via sort_dependencies and search_for. This was the original reason I started looking into this.

The Bad

Performance is worse right now. The integration specs see about a 3-4X slowdown. I haven't gone deep on profiling, so I don't know if there are some easy Ruby optimizations that could speed this up or if this is just the reality of removing swapping. There may also ways to optimize the algorithm for common scenarios, though I picked off a couple of the most obvious ones.

Performance gets very, very bad if sort_dependencies doesn't do some work to optimize for conflict-heavy packages. The Bundler and CocoaPods indexes are fine here. A couple of the test indexes are slow because they sort funny (not such a big deal), but the Berkshelf index is also deeply slow due to this (probably a bigger deal).

The Ugly

To guarantee the "best" outcomes (i.e. that your original dependencies resolve to the highest valid versions), sort_dependencies needs to ensure that the original dependencies are sorted ahead of everything else on the list. The current indexes don't do this. Right now I have some code in Molinillo as a workaround, but I think that's probably not really the appropriate place to handle it—probably we should trust the inputs to know what they're doing. So this change would require updates to the packages that use Molinillo to avoid them running into issues.

The old version had bugs that manifested once swapping was removed. This version is not performant but should behave predictably.

When a conflict arises, it compiles a list of all dependencies in the chains that conflicted, and will jump to the first nearest state involving one of those dependencies (since attempting to resolve unrelated dependencies won't do any good to the conflict).

The old version had bugs that manifested once swapping was removed. This version is not performant but should behave predictably.

When a conflict arises, it compiles a list of all dependencies in the chains that conflicted, and will jump to the first nearest state involving one of those dependencies (since attempting to resolve unrelated dependencies won't do any good to the conflict).

The old version was depending on the parent_of construction, which doesn't include every node with an incoming path.

…ing" This reverts commit 639d4ae.

Conflict sets now include the immediate conflicts + the earliest responsible parent for each conflicting package. There's a nice explanation here: http://www.well-typed.com/blog/preview/backjumping/

…ncy sorting"" This reverts commit 10de81e.

Now enforces that the originally requested requirements are always resolved first, while everything else is allowed to be resorted. This makes it a breadth-first search at the top level and more like a directed depth-first search for all dependencies.

One big stable sort is a bit less messy than what was there before.

The new conflict-set-based backtracking logic needs to explore these conflict branches to accurately identify all responsible parents.

We can't ignore non-conflicting requirements, because `search_for` may only return specs matching that requirement, so we never explore all possible branches to complete the conflict set.

segiddins · 2017-04-13T04:23:48Z

This is pretty cool!

Not to be a total downer, but I don't think we'll be able to merge this with the test regressions and performance regressions -- it's definitely a goal that molinillo be able to solve whatever's thrown at it regardless of optimality of inputs, but I've been hoping to make incremental improvements in that regard, especially knowing that in the 'real world', pathological cases are often much larger than those in the tests here (and thus the performance hits to them are major).

iangreenleaf · 2017-04-13T05:07:40Z

Yeah, I figured that might be your take on it. That's fair. I'd be happy to keep it going if it was on track to get merged, but I don't have endless time to sink into this, so if you're not gung-ho on the idea I probably won't keep pushing on it.

segiddins · 2017-04-17T03:19:44Z

Sorry :(

iangreenleaf added 29 commits March 31, 2017 18:19

Much simpler backtracking

aa4d01d

The old version had bugs that manifested once swapping was removed. This version is not performant but should behave predictably.

Experimental change that controls some of the dependency sorting

639d4ae

Optimization: ignore other specs when one is activated

5a579dd

New specs for some new cases

4e970dd

Save circular dependency errors til resolution finishes

6cb4925

Rebase submodule to master and add a spec

bb2c6d3

Spec for backjumping bug

9015c1a

Much simpler backtracking

8a9d312

The old version had bugs that manifested once swapping was removed. This version is not performant but should behave predictably.

Get full set of requirements involved in conflict

b206d57

The old version was depending on the parent_of construction, which doesn't include every node with an incoming path.

Include leaf node in conflict set for backtracking

03e2a3c

Merge branch 'backjumping-bug' into backtracking

df1011c

Merge branch 'circular-deps' into backtracking

e2dbfb0

Revert "Experimental change that controls some of the dependency sort…

10de81e

…ing" This reverts commit 639d4ae.

Simplify conflict sets

a883c43

Conflict sets now include the immediate conflicts + the earliest responsible parent for each conflicting package. There's a nice explanation here: http://www.well-typed.com/blog/preview/backjumping/

Revert "Revert "Experimental change that controls some of the depende…

7ce027d

…ncy sorting"" This reverts commit 10de81e.

Conflict sets only include parents of failed requirements

d324a05

Optimization: resolve requirements for existing specs first

7426c6a

Clean up dep sorting code

c156d4c

One big stable sort is a bit less messy than what was there before.

Remove optimization around existing specs

dbf324d

The new conflict-set-based backtracking logic needs to explore these conflict branches to accurately identify all responsible parents.

Return to a simple conflict set construction

25542a5

We can't ignore non-conflicting requirements, because `search_for` may only return specs matching that requirement, so we never explore all possible branches to complete the conflict set.

Handle un-activated specs and top-level deps

ce90bfd

Skip certain specs on some indexes

ea8443a

Use sets to simplify conflict set management

0095865

Remove swapping functionality

487f47b

Use name_for helper

874f424

Show the conflict set in debug

23dcf19

iangreenleaf closed this Apr 13, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Modify backtracking algorithm and remove swapping #65

Modify backtracking algorithm and remove swapping #65

iangreenleaf commented Apr 13, 2017

segiddins commented Apr 13, 2017

iangreenleaf commented Apr 13, 2017

segiddins commented Apr 17, 2017

Modify backtracking algorithm and remove swapping #65

Modify backtracking algorithm and remove swapping #65

Conversation

iangreenleaf commented Apr 13, 2017

The Good

The Bad

The Ugly

segiddins commented Apr 13, 2017

iangreenleaf commented Apr 13, 2017

segiddins commented Apr 17, 2017