Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a pseudo-boolean SAT solver #568

Closed
asmeurer opened this issue Feb 17, 2014 · 4 comments
Closed

Use a pseudo-boolean SAT solver #568

asmeurer opened this issue Feb 17, 2014 · 4 comments
Assignees
Labels
locked [bot] locked due to inactivity priority-high [deprecated] use severity::* type::feature request for a new feature or capability

Comments

@asmeurer
Copy link
Contributor

The current SAT solver falls short if we want to be more intelligent about including more than just the latest version of every package (see #491 and related issues). An attempt was made using the existing framework at #549, but it is too slow. A pseudo-boolean SAT solver would let us specify a function to optimize, which would be translated into SAT conditions.

@asmeurer
Copy link
Contributor Author

Here's an update. I've delved a bit into the wonderful world of SAT solvers. I won't bore you with the details, but basically academics are terrible at writing software.

I have a branch pseudo_tests on my fork with a few test scripts. I also have a conda package for minisatp for Linux 64 bit. It doesn't seem easy to build for other platforms. But you can test it. For instance, I have a test output from my all-solutions branch for the currently broken conda install pandas=0.9 python=2 (from a Python 3 conda). ./sat_test.py will export a file, and minisatp all_solutions_clauses2.opb will solve it. You can check the solution with ./sat_test.py all 2 "<solution>" (the quotes are important). In this case, it gives:

dateutil-2.1-py27_2.tar.bz2
zlib-1.2.7-1.tar.bz2
nose-1.3.0-py27_0.tar.bz2
python-2.7.6-1.tar.bz2
pandas-0.9.1-np17py27_0.tar.bz2
openssl-1.0.1c-0.tar.bz2
sqlite-3.7.13-1.tar.bz2
scipy-0.13.2-np17py27_p1.tar.bz2
tk-8.5.13-1.tar.bz2
readline-6.2-2.tar.bz2
six-1.5.2-py27_0.tar.bz2
numpy-1.7.1-py27_p2.tar.bz2
mkl-rt-11.1-p0.tar.bz2

(it includes mkl, which is a separate issue; it's because internally conda is considering mkl to be newer than the non-mkl version)

minisatp is actually quite slow at solving this, but I think it's because it tries a poor initial value for the optimization function.

The optimization function that I am using works like this. Say 20 is the maximum number of versions of any package. Say that p1, p2, ... are the variables for one package (say Python) in order, so that p1 is the newest version, p2 is the next newest, and so on, and x1, x2, ... are variables for another, and so on. The function to maximize is 20*p1 + 19*p2 + 18*p3 + ... + 20*x1 + 19*x2 + ... (note that minisatp actually requires us to multiply this by -1 and minimize it). If we can get the newest version of each package, that will be the optimal value, because it will add 20. I think that knowing what the optimal value is, we can optimize this quicker, because we will be able to start there and work down if it doesn't work, rather than starting small and working our way up as minisatp does.

Now here is my plan for now:

  • The BDD algorithm seems to be the best, both in terms of speed and ease of implementation. In fact, this is the entire algorithm.
  • If this is all we are going to do with the optimization function, it occurs to me that it would be exactly equivalent to just have conda try all latest versions, then try all at the latest version except for one at the second latest version, then all at the latest version except for two at the second latest version or one at the third latest version, and so on (trying each stage all at once by putting all possibilities into the SAT solver).

This would be much easier to implement, and would be exactly as efficient. The downside is that we are stuck with that exact scheme, or something similar. With pseudo-boolean, we could play with different kinds of constraints or optimization functions in the future (or even now, if this one we are currently thinking of does not work as well as we expect it to). It would also make tweaking the scheme easier, because we would just have to tweak the optimization function, rather than the algorithm I described above.

But the upside is that we don't have to worry about implementing and testing a tricky logic algorithm.

Yesterday when I realized this my plan was to just implement this in conda, but today, when I saw how simple the BDD conversion is, I think I'll just try it.

@asmeurer asmeurer mentioned this issue Feb 20, 2014
4 tasks
@asmeurer
Copy link
Contributor Author

The paper at http://www.mancoosi.org/papers/ase10.pdf describes how to use pseudo-boolean constraints. I think it is better to use what they are doing rather than the ad-hoc thing I came up with.

The pseudo-boolean transformation algorithm is completely implemented in the logic branch. Testing it on my above idea reveals that it is too slow (it takes about 20 seconds just to convert the constraint into SAT conditions), and also, since the implemented BDD algorithm is recursive, it hits the recursion limit if there are more than 1000 or so terms. I haven't read the above paper fully yet, but it seems to be much simpler, and it will likely be much faster.

@asmeurer
Copy link
Contributor Author

asmeurer commented Mar 7, 2014

This is merged. There is still work to do (like using pseudo-boolean clauses to also find the minimal solution, and starting to look at what is already installed), but the most important part, looking at all versions and using constraints to find the optimal solution, is implemented.

@asmeurer asmeurer closed this as completed Mar 7, 2014
@github-actions
Copy link

github-actions bot commented Dec 7, 2021

Hi there, thank you for your contribution to Conda!

This issue has been automatically locked since it has not had recent activity after it was closed.

Please open a new issue if needed.

@github-actions github-actions bot added the locked [bot] locked due to inactivity label Dec 7, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 7, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
locked [bot] locked due to inactivity priority-high [deprecated] use severity::* type::feature request for a new feature or capability
Projects
None yet
Development

No branches or pull requests

1 participant