[Question] libsolv for conda / number of rules #284
Comments
Thanks for starting this conversation! When you are saying "slow", could you provide more specific numbers? |
Sure -- in the worst case times (for the entire installation process) of 2 minutes have been observed. |
I can't tell you much about the speed of libsolv. However, if I try to upgrade all packages on my system the solver will create 68611 rules and solving takes 74 ms. The actual solving takes 5 ms, the rest is rule setup (i.e. provides lookup and the like).
|
@wolfv I have a feeling that libsolv would work just fine for your case. :) |
Hi guys, thanks for all the responses. It's encouraging to see that there is a community around this project! I will need to look into how to get started. It would be great to swap out the solver of conda, but I need to figure out how well the concepts map to each other ... I've pinged the conda core developers on the gitter channel (https://gitter.im/conda/conda if anyone is interested). Is there a good how-to-get-started doc for libsolv? Should one take a look at the dnf sources? If I am understanding correctly (from looking at the doc/ folder in this repository) then the basic idea would be to create a tool that turns a conda package file specification into a |
The |
@wolfv The libdnf sources are probably a good reference. |
Here's a simple example so that you can see how libsolv works:
|
Also have a look at the libsolv-bindings manpage: https://github.com/openSUSE/libsolv/blob/master/doc/libsolv-bindings.txt |
And also the python implementation of the demo solver: https://github.com/openSUSE/libsolv/blob/master/examples/pysolv Looks like conda uses its own way of matching dependencies, we would need to implement that in libsolv (we currently have rpm, deb, haiku). |
@mlschroe @Conan-Kudo thanks for your thoughts here. I'm one of the conda maintainers, and I want to explore this a bit more. My current concerns are:
A large question for me is how much work we'd have to do to customize conda for libsolv specifically, vs. how much conda can be written to formulate problems for arbitrary solvers, and thus be less rigid in the future. Another conda user recommended that we look at the clingo solver, which uses the CUDF spec for formulating problems to clingo. conda/conda#7808 (comment) How much is that approach possible with libsolv, either via CUDF, or via some similar idea, where conda would translate its current repodata and specs to some libsolv format, for feeding into libsolv behind some more generic interface in conda? |
Actually @mlschroe even created a small JSON parser inside libsolv so you can easily implement support for conda repositories. Btw, hawkey is deprecated and has nothing to do with your problem ;)
That's something I described in #286.
I don't think you would need to do much: you would need to implement code in libsolv to read the repository metadata and populate the pool with solvables from it. Then it's mostly like |
@ignatenkobrain awesome, thanks!
Glad to hear - reducing the number of things for me to research is helpful!
Yes, that looks like the crux of the problem. As you've pointed out in that issue, the problem is of arbitrary complexity, so simply making it all part of the name is not really viable. In conda-land, we represent that as a hash of a dictionary of variables that have affected the recipe somehow. It's less readily interpretable, but we have some tools to spit out what the hash represents.
If I understand you correctly, I need to look into the JSON parser in libsolv, or perhaps somehow write something that translates conda metadata into .solv files? What's the best reference on the .solv format? |
You don't need to use solv files at all, they are basically a caching mechanism so that you don't need to "translate" the repository metadata each time you call the package manager. The next step forward would be me implementing conda dependency matching with a new REL_CONDA dependency type. This basically means implementing what's in conda's models/match_spec.py file. |
Hi all, finally I've gotten around to playing with the libsolv Python bindings, and successfully created a small script that downloads conda Regarding build numbers and variants: currently this is indeed an issue. I think I'll implement a lookup-preprocessing step that calculates a new version number by appending the build number to the version string. That should overlay an ordering by build number. In the first step, the max build number will be searched, and all other build numbers will be prepended with I am just realizing that this will also require modifying all dependency relations ... so this is definitely not a nice solution. I don't know how hard it is to add this feature to libsolv? For the build variants, the preprocessing step will check if a matching build variant is available, and if yes, remove all non-matching ones. Anyways, here is the current state of the code. Besides this, the experience with libsolv is great! I would be very happy if you leave some comments on how to do this better. Cheers! |
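The preprocessing idea described above can be sketched in plain Python. This is only an illustration, not the script from the thread: `fold_build_numbers` is a hypothetical helper, and both the `.` separator and the zero-padding to the width of the largest build number are assumptions, since the comment does not show the exact format. As the comment itself notes, any dependency relation that refers to these versions would need the same rewriting.

```python
def fold_build_numbers(records):
    """Return (name, folded_version) pairs with the build number appended.

    Assumption: the "." separator and zero-padding to the width of the
    largest build number are illustrative choices, not the thread's format.
    """
    # First pass: find how many digits the largest build number needs.
    width = len(str(max(r["build_number"] for r in records)))
    # Second pass: append the padded build number to each version string.
    return [
        (r["name"], f'{r["version"]}.{r["build_number"]:0{width}d}')
        for r in records
    ]


# Made-up records, only for demonstration.
records = [
    {"name": "python", "version": "3.4", "build_number": 2},
    {"name": "python", "version": "3.4", "build_number": 12},
]
folded = fold_build_numbers(records)
print(folded)
```

With equal-width build numbers, two builds of the same version keep their relative order even under a plain string comparison, which is the "overlay" the comment is after.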
Stanzas like these can be done with |
Hi @Conan-Kudo, thanks for the heads-up. I got this stuff to work, the problem is that conda has a notion of build_number.
So the way it's implemented right now in my version, I think libsolv ends up thinking the requirement is I'll solve this by preprocessing. (Btw I just add multiple conditions manually in the add_deparray, is there a better way to do it? I don't know how to use |
It does not, it needs to be constructed in a way as I mentioned earlier. That said, if you're trying to retrieve a single solvable, you can do something like this to bind it to a single object by treating each condition with the Details about |
@wolfv Also note, libsolv needs to be built with |
@Conan-Kudo great. Got it to work (apparently the pip build ships with COMPLEX_DEPS enabled :) I managed to produce an installable environment file (meaning, if I pass this to conda to create a new environment, the current conda solver does not complain about conflicts). So this is already a pretty cool thing! What can version strings look like with libsolv? I had one version string that looks like Another question: for debugging, is there a way to figure out what the dependency tree looks like? E.g. if package D is required, can I figure out which package that is getting installed required that package? Like a reverse lookup? That would help a lot! |
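For the reverse-lookup question, one option that does not depend on libsolv at all is to build the index from the solved environment itself. The sketch below assumes the result is available as a plain dict mapping each installed package name to its dependency strings; `reverse_dependencies` and the sample data are hypothetical, and this is not a libsolv API.

```python
from collections import defaultdict


def reverse_dependencies(installed):
    """Map each package name to the installed packages that require it."""
    required_by = defaultdict(set)
    for pkg, deps in installed.items():
        for dep in deps:
            # Keep only the name; any version constraint after it is ignored.
            required_by[dep.split()[0]].add(pkg)
    return required_by


# Made-up solved environment: package name -> dependency strings.
installed = {
    "A": ["B >=1.0", "C"],
    "B": ["D"],
    "C": ["D 2.*"],
    "D": [],
}
required_by = reverse_dependencies(installed)
print(sorted(required_by["D"]))
```

Here `required_by["D"]` contains `B` and `C`, which answers "which installed package pulled in D?" for this toy environment.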
Change the |
libsolv has an implementation of JSON metadata in libsolvext: fae06e5 You could use that to speed things up? |
wait wait wait, you really need REL_CONDA support to make libsolv's dependency matching do the right thing. |
Yes, REL_CONDA needs to be implemented first. I'll need to cross-check the specification with PEP440 to see how far it matches. |
Conda's version ordering is PEP440 for the most part, but not completely. Here are the rules: https://github.com/conda/conda/blob/master/conda/models/version.py#L47-L155 |
@mlschroe how much work is it to implement this and could I be of any help there? |
For people who are still curious: I've gotten pretty far with this, over here: https://github.com/wolfv/mamba This is using libsolv from C++ which I found quite easy to do, after some adjusting to the new concepts. It solves environments almost like the current conda solver. Obviously there are some corner cases here and there (but I think I am also not setting the repository priority correctly for the default channel etc). In its current state it can then hand over the packages to install & remove to conda, and conda does the installing. I am using simd-json to parse the conda repository files, and then create all the packages in libsolv. This is pretty much sufficiently fast for my liking :) I've also started building libsolv on conda forge: conda-forge/staged-recipes#7969 I am also doing some hacking around to make the version numbers look like something that (seems) to work fine with libsolv. I guess, the correct way would be to add a CONDA version comparison in Cheers, Wolf |
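The "parse the repository file, then create the packages" step can be sketched roughly as follows. This is not mamba's code: it uses Python's standard `json` module instead of simd-json, `PackageRecord` is a hypothetical stand-in for a libsolv solvable, and the field names follow conda's `repodata.json` layout as I understand it (a top-level `packages` mapping keyed by filename). The sample entry is made up.

```python
import json
from dataclasses import dataclass, field


@dataclass
class PackageRecord:
    """Hypothetical stand-in for the data a solvable would be built from."""
    filename: str
    name: str
    version: str
    build: str
    build_number: int
    depends: list = field(default_factory=list)


def load_records(repodata_text):
    """Parse repodata.json text into a list of PackageRecord objects."""
    repodata = json.loads(repodata_text)
    return [
        PackageRecord(
            filename=filename,
            name=meta["name"],
            version=meta["version"],
            build=meta["build"],
            build_number=meta.get("build_number", 0),
            depends=list(meta.get("depends", [])),
        )
        for filename, meta in repodata.get("packages", {}).items()
    ]


# Made-up repodata with a single entry, only for demonstration.
sample = json.dumps({
    "packages": {
        "numpy-1.15.4-py37_0.tar.bz2": {
            "name": "numpy",
            "version": "1.15.4",
            "build": "py37_0",
            "build_number": 0,
            "depends": ["python >=3.7,<3.8.0a0"],
        }
    }
})
records = load_records(sample)
print(records[0].name, records[0].version, records[0].depends)
```

Each record would then be turned into one solvable in a libsolv repo, with its `depends` entries converted into dependency relations; that part is left out here.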
I think @mlschroe won't be against having a few conditions in CMakeLists.txt for windows stuff. |
(You probably already noticed, but conda's version comparison is already implemented. You need to enable CONDA support when configuring libsolv and then set the disttype to DISTTYPE_CONDA.) |
Actually, I just noticed an hour ago :) The first prototype is released under the name "mamba" here: https://anaconda.org/conda-forge/mamba I'll switch over to your version comparison as soon as possible! Thanks a lot! |
@mlschroe just FYI the source code for mamba is here: https://github.com/QuantStack/mamba I saw that you're still working on the conda support in libsolv. Just let me know if there is anything I can do to support you! |
Libsolv now supports a REL_CONDA relation type. The "name" part of the relation is the package name, the "evr" part is of the form "version" or "version build". The build part is currently ignored, but this will be easy to add. There's also currently no way to match the other components, i.e. "build_number", "track_features" etc. |
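To make the name/evr split concrete, here is a small sketch of how a conda dependency string could be cut into the two parts described above. It assumes whitespace-separated strings of the form `name`, `name version` or `name version build`, as they appear in the `depends` lists of conda repodata; `split_dependency` is a hypothetical helper, and building the actual REL_CONDA relation in libsolv from the result is not shown.

```python
def split_dependency(spec):
    """Split 'name [version [build]]' into (name, evr); evr is None if absent."""
    parts = spec.split()
    if not parts:
        raise ValueError("empty dependency string")
    if len(parts) > 3:
        raise ValueError(f"unexpected dependency string: {spec!r}")
    name = parts[0]
    # "version" or "version build", matching the two evr forms described above.
    evr = " ".join(parts[1:]) or None
    return name, evr


# Made-up dependency strings, only for demonstration.
examples = ["python", "python >=3.6,<3.7.0a0", "numpy 1.11.* py36_0"]
parsed = [split_dependency(s) for s in examples]
print(parsed)
```

A bare name yields no evr at all, so in that case a plain name dependency would presumably be enough and no relation is needed.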
This is great! For the track_features: I am not sure how this is supposed to work (to be honest) but I was able to replicate the ability to prefer a certain feature by adding all packages with a feature into a new ("fake") repo and give it a high priority if the feature was requested. This seems to do exactly what's necessary. For the build number: In my version I am currently adding it to the version, and normalize the version number, so that python 3.4 with build_number 12 becomes One thing that I haven't been able to solve (and not sure if this will be solvable easily) is wildcards in build strings. E.g. on conda-forge you can find: |
With REL_CONDA it'll be easy to support wildcards (they already work for the version comparison) Do you have an example of a package with dependencies that contain other component matches than 'name', 'version' and 'build'? I.e. a package that uses the '[component=xxx]' syntax in a dependency? |
I don't think I've ever seen that, I am not even sure it's legal. Maybe @msarahan knows more? |
We don't currently support the bracket syntax, but it would be a nice addition. We think of this syntax in terms of conditional dependencies - "install this if some condition is true". Maybe this would be used to add in package sets for optional functionality. |
Hi,
I am investigating whether it would be worth using the libsolv solver in the conda package manager.
For this I am wondering how many packages / rules you can expect to solve with libsolv in a reasonable timeframe? The current SAT solver of conda has the problem of being slow – but conda also produces a lot of SAT rules to be solved. If I am not mistaken, the last number I've seen was around 60'000...
Sorry if this channel is inappropriate to ask such a question.
Cheers,
Wolf