Skip to content
This repository has been archived by the owner on May 31, 2018. It is now read-only.

use tsort for dependency resolution #408

Closed
rmarquis opened this issue Jan 30, 2016 · 12 comments
Closed

use tsort for dependency resolution #408

rmarquis opened this issue Jan 30, 2016 · 12 comments

Comments

@rmarquis
Copy link
Owner

Replace the current dependency resolution code with a shorter, cleaner code using tsort.
See https://ptpb.pw/MPsK/sh from Earnestly:

#!/bin/bash
# repo-cycle-detector - attempt to detect cycles in pacman repos

set +e
repo=$1

usage() {
    if ((!$#)); then
        printf 'usage: repo-cycle-detector repo\n' >&2
        exit 1
    fi
}

constrain_to() {
    repo=$1
    awk -v repo="^$repo" '$1 ~ repo { print $2 }'
}

generate() {
    local repo=$1
    # The format is formally package:dependency0:dependency1:..:dependency(n),
    # however as expac cannot be constrained to search a specific repo I will
    # instead use a sentinel value as a means to contrain the results to that
    # value.  The sentinel in this case will be based on the value of $repo.

    # https://github.com/falconindy/expac/issues/24
    expac -S '%r %n:%E' -l ':' < <(pacman -Sql "$repo") | constrain_to "$repo"
}

parse() {
    # We assume a package name cannot contain this FS.
    awk -F : '{ 
        p = $1

        # If we find that a package contains no dependency, make it a
        # dependency of itself to provide tsort a complete graph.
        if($2 == "")
            $2 = $1

        for(i = 1; i <= NF; ++i)
            printf "%s %s\n", p, $NF
    }'
}

usage "$@"
generate "$repo" | parse | tsort > /dev/null

For testing:

$ tsort <<!
heredoc> a b
heredoc> a b
heredoc> c b
heredoc> d a
heredoc> !
c
d
a
b

Handling split packages isn't that trivial, but is apparently possible.

@rmarquis rmarquis added the todo label Jan 30, 2016
@rmarquis rmarquis modified the milestones: 4.5.x - pacman 5, 5.x - new features, 4.5.x - maintenance Jan 30, 2016
@AladW
Copy link
Contributor

AladW commented Mar 7, 2016

If by handling split packages you mean not including them in the tsort graph (i.e., only include pkgbase), then it depends on what you use to generate the input.

For .SRCINFO, it's fairly easy since all make/depends to build the package must be global, per PKGBUILD(5), and split packages are separated by newlines. Example

For RPC, it's harder since everything is thrown in a bunch and it's possible to query split packages too. See aurutils/aurutils#27

@rmarquis
Copy link
Owner Author

rmarquis commented Mar 7, 2016

Ideally, I'd be able to use tsort for all packages (not only pkgbase) but take this additional parameter into account, since the whole dependency chain is solved through the RPC before any download. I really wouldn't like to parse .SRCINFO after download to actually compute the correct dependency chain again.

Alternatively, implement some push/pop capabilities to simplify the actual code, implement this into cower directly and drop the whole dependency code, or maybe even rewrite the whole stuff in python.

@AladW
Copy link
Contributor

AladW commented Mar 8, 2016

You could parse the downloaded JSON data both for Name (full graph) and PkgBase (download graph) and have your cake and eat it, too. That's how I fixed my issue above, anyway.

edit: This still needs more work it seems, see aurutils/aurutils#35

BTW, I'd probably open a cower issue to have at least recursive printing of dependencies, if you haven't done so already.

@rmarquis
Copy link
Owner Author

rmarquis commented Mar 8, 2016

Well, you'd still need to combine both graph in the end.
Currently, pacaur solves the "full" graph with a simple list append/list remove mechanism, get the corresponding pkgbase ordered list, then does some magic to combine both lists to ensure packages are only built once and in the right order. And I don't know how to do this last step with tsort. In fact, I'm not even sure both computed lists are that independent from each other, and using tsort here might actually break the whole stuff :/

I've been quite lazy recently and haven't took time to look deeper in cower dep solver at the moment, feel free to open a ticket about it. From what I recall, cower solver doesn't compute deps in the correct topological order, making it effectively useless for automated build.The new ponies version might have improved this particular aspect.

@AladW
Copy link
Contributor

AladW commented Mar 16, 2016

What also makes things harder is that pkgbase doesn't have to match any of the given pkgnames. I've added a cheap solution for aurqueue/aurchain by replacing pkgname in the download list with pkgbase. https://github.com/AladW/aurutils/blob/master/aurqueue#L45

However, I don't know if that helps with your use case, as I assume you need the split package order to avoid conflicts when installing split packages?

@rmarquis
Copy link
Owner Author

No, conflicts are actually directly handled by makepkg. What pacaur does is precomputing the conflicts to display them to the user before the main prompt, and later on the conflicts prompts are bypassed automatically with the --ask hidden parameter of makepkg. Conflicts order is thus not needed, as long as the dependency order (pkgname) is correct.

The current implementation requires quite a lot of computation, but on the bright side it actually handles the pkgname/pkgbase quite well, even if both are different. And to be honest, I'm not convinced anymore that using tsort would result in a cleaner implementation... :/

@rmarquis rmarquis modified the milestones: 4.6.x - new features, 5.0.x - later Mar 24, 2016
@rmarquis
Copy link
Owner Author

rmarquis commented Apr 9, 2016

Closing as "Wontfix". I don't see a real possible improvement here without a complete design overhaul, which might very well not be worth it anyway (see #443). Speed wise, there are others I/O bottlenecks (mostly expac calls) and gain would be marginal.

@rmarquis
Copy link
Owner Author

rmarquis commented May 27, 2016

Reopening. I've locally implemented tsorting instead of the brute force method. Currently in a very raw state, needs a lot of code cleanup and optimization.

Split packages support should be checked, and benchmarking should be done to see if this is actually worth having in common cases. It is expected that simple user cases will run slower, but complex dependency chains will be greatly sped up.

Currently, with providers and conflict checks disabled:

  • for plasma-git-meta (106 AUR deps): 35s vs 62s for 4.6.4
  • for yaourt-git (2 AUR deps): 1.8s vs 1.6s for 4.6.4

@rmarquis rmarquis reopened this May 27, 2016
@rmarquis
Copy link
Owner Author

rmarquis commented Jun 1, 2016

A complete solver around tsort might also be useful to help implement #173.

@rmarquis
Copy link
Owner Author

rmarquis commented Oct 1, 2016

Done in 2c2ba9a. Seems okay, but I kinda expect some breakage.

@rmarquis rmarquis closed this as completed Oct 1, 2016
@rmarquis
Copy link
Owner Author

rmarquis commented Oct 1, 2016

Reopening, some package won't install anymore (you can laugh AladW :P).

$ pacaur -S pacaur-git
:: Package pacaur-git not found in repositories, trying AUR...
:: resolving dependencies...
:: looking for inter-conflicts...
 there is nothing to do

The breakage is inconsistent. Some package have no issue while other fail. The code breaks in SortAurDeps() and exits in ConflictCheck() due to the empty deps variable.

@rmarquis rmarquis reopened this Oct 1, 2016
@rmarquis
Copy link
Owner Author

rmarquis commented Oct 1, 2016

Fixed in 992eff7.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

2 participants