Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify backtracking algorithm and remove swapping #65

Closed
wants to merge 29 commits into from

Conversation

iangreenleaf
Copy link
Contributor

The Good

This pulls in fixes for several issues I've uncovered in the process, like #63 and CocoaPods/Resolver-Integration-Specs#10.

It also makes the backtracking algorithm predictable, so that you can assume certain guarantees about the resolution you'll get based on the order of items you feed in via sort_dependencies and search_for. This was the original reason I started looking into this.

The Bad

Performance is worse right now. The integration specs see about a 3-4X slowdown. I haven't gone deep on profiling, so I don't know if there are some easy Ruby optimizations that could speed this up or if this is just the reality of removing swapping. There may also ways to optimize the algorithm for common scenarios, though I picked off a couple of the most obvious ones.

Performance gets very, very bad if sort_dependencies doesn't do some work to optimize for conflict-heavy packages. The Bundler and CocoaPods indexes are fine here. A couple of the test indexes are slow because they sort funny (not such a big deal), but the Berkshelf index is also deeply slow due to this (probably a bigger deal).

The Ugly

To guarantee the "best" outcomes (i.e. that your original dependencies resolve to the highest valid versions), sort_dependencies needs to ensure that the original dependencies are sorted ahead of everything else on the list. The current indexes don't do this. Right now I have some code in Molinillo as a workaround, but I think that's probably not really the appropriate place to handle it—probably we should trust the inputs to know what they're doing. So this change would require updates to the packages that use Molinillo to avoid them running into issues.

The old version had bugs that manifested once swapping was removed. This
version is not performant but should behave predictably.
When a conflict arises, it compiles a list of all dependencies in the
chains that conflicted, and will jump to the first nearest state
involving one of those dependencies (since attempting to resolve
unrelated dependencies won't do any good to the conflict).
The old version had bugs that manifested once swapping was removed. This
version is not performant but should behave predictably.
When a conflict arises, it compiles a list of all dependencies in the
chains that conflicted, and will jump to the first nearest state
involving one of those dependencies (since attempting to resolve
unrelated dependencies won't do any good to the conflict).
The old version was depending on the parent_of construction, which
doesn't include every node with an incoming path.
Conflict sets now include the immediate conflicts + the earliest
responsible parent for each conflicting package.

There's a nice explanation here:
http://www.well-typed.com/blog/preview/backjumping/
Now enforces that the originally requested requirements are always
resolved first, while everything else is allowed to be resorted. This
makes it a breadth-first search at the top level and more like a
directed depth-first search for all dependencies.
One big stable sort is a bit less messy than what was there before.
The new conflict-set-based backtracking logic needs to explore these
conflict branches to accurately identify all responsible parents.
We can't ignore non-conflicting requirements, because `search_for` may
only return specs matching that requirement, so we never explore all
possible branches to complete the conflict set.
@segiddins
Copy link
Member

This is pretty cool!

Not to be a total downer, but I don't think we'll be able to merge this with the test regressions and performance regressions -- it's definitely a goal that molinillo be able to solve whatever's thrown at it regardless of optimality of inputs, but I've been hoping to make incremental improvements in that regard, especially knowing that in the 'real world', pathological cases are often much larger than those in the tests here (and thus the performance hits to them are major).

@iangreenleaf
Copy link
Contributor Author

Yeah, I figured that might be your take on it. That's fair. I'd be happy to keep it going if it was on track to get merged, but I don't have endless time to sink into this, so if you're not gung-ho on the idea I probably won't keep pushing on it.

@segiddins
Copy link
Member

Sorry :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants