It is slow adding variables and clauses to CMS #445

Closed
msoos opened this issue Jan 3, 2018 · 25 comments

@msoos
Owner

msoos commented Jan 3, 2018

Please run:

perf record ./myprogram

where your program adds the clauses and then simply exits (i.e. does not run solve()). Then please run

perf report --call-graph --stdio

and paste that output here. It will not reveal anything about your instance. If you could describe what you are doing, that would also help, i.e. how you are adding variables, etc.

By the way, have you tried adding variables like: new_vars(10000) instead of calling new_var() every time? It's a LOT faster :)
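A minimal sketch of the difference (assuming the usual v5 header and C++ interface; the clause at the end is just a placeholder):

#include <cryptominisat5/cryptominisat.h>

int main() {
    CMSat::SATSolver solver;

    // Slow: one call, and one round of internal bookkeeping, per variable.
    // for (int i = 0; i < 10000; ++i) solver.new_var();

    // Fast: one call creates all 10000 variables at once.
    solver.new_vars(10000);

    solver.add_clause({CMSat::Lit(0, false), CMSat::Lit(1, true)}); // x0 OR NOT x1
    return 0;
}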

@horenmar
Contributor

horenmar commented Jan 4, 2018

We do new_var() because it has been enough so far, and computing the exact number of variables needed ahead of time can be a bit of a pain.

We could easily reserve about half of the variables ahead of time though.

Anyway, as I said, I can't get back to you until mid-January, but I promise I'll do it 😄

Also, stay tuned: I have an input that takes 4 minutes to insert into MiniSat (320M variables, 960M clauses), so I will try that then.

@msoos
Owner Author

msoos commented Jan 4, 2018

Thanks, looking forward! Yeah, don't worry about the new_var stuff, I realised it's not that important :)

@msoos
Owner Author

msoos commented Jan 27, 2018

Hi, could you please provide the output about the perf record/perf report? Thanks! :)

@horenmar
Contributor

Sorry, it is not forgotten, but I found a bug in the code that led to wrong results and I had to rerun all my experiments, so the server has been running non-stop for the last two weeks.

Next week I swear

@msoos
Owner Author

msoos commented Jan 29, 2018

Wow, 2 weeks, you are dedicated! Okay, no worries, looking forward!

Mate

@horenmar
Contributor

horenmar commented Feb 1, 2018

Well, the recording has been going for ~2 hours and the perf data takes up 5.5 GB.

I think I am going to kill it soon and assume that the trace is roughly representative 😄

@msoos
Owner Author

msoos commented Feb 1, 2018

Definitely! That's more than enough! 10 min would have been fine I think :)

@horenmar
Contributor

horenmar commented Feb 1, 2018

I let it run for a bit longer; after 6 hours it still wasn't finished, so I killed it. It took perf about 30 minutes to process the trace. Here is somewhat filtered output:

https://gist.github.com/horenmar/5f5ac9fe18e5b01d1cdbca5d69337eca

Note that the output is for CMSat release 5.0.1, built in the RelWithDebInfo configuration.

I also did some investigating based on the results, and I think I know what is happening. Our use case switches between adding variables and clauses frequently (basically, we create variables on demand, when the conversion logic first requires them), so these pieces of code lead to quadratic complexity:

DLL_PUBLIC bool SATSolver::add_clause(const vector< Lit >& lits)
{
    ...

    } else {
        data->solvers[0]->new_vars(data->vars_to_add); // <-- This will be ~10 for most calls
        data->vars_to_add = 0;
        ...
    }
}

void Solver::new_vars(size_t n)
{
    if (n == 0) { // <-- Often not true
        return;
    }

    check_switchoff_limits_newvar(n);
    Searcher::new_vars(n); // <-- This is the problem
    varReplacer->new_vars(n);

    if (conf.perform_occur_based_simp) {
        occsimplifier->new_vars(n);
    }

    if (compHandler) {
        compHandler->new_vars(n);
    }
    datasync->new_vars(n);
}


void Searcher::new_vars(size_t n)
{
    PropEngine::new_vars(n);

    activ_glue.resize(activ_glue.size() + n, 0); // <-- This here leads to O(n**2) complexity of adding variables
    // Most implementations of std::vector do not grow the capacity
    // geometrically inside vector::resize, because it is assumed that
    // the programmer passes the proper final size. Each small resize
    // can therefore reallocate and copy the whole vector.

    for(int i = n-1; i >= 0; i--) {
        insertVarOrder((int)nVars()-i-1);
    }
}

The comments in the code above point out the problem.

An extremely simple fix is to do this instead:

    for (size_t i = 0; i < n; ++i) { activ_glue.push_back(0); } // push_back grows capacity geometrically

A less naive approach is to check whether n is larger than, say, twice the current capacity; if so, call resize directly, and otherwise use vector::push_back, which grows the capacity geometrically. A sketch of this is below.
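A minimal sketch of that hybrid (illustrative only; grow_by is a made-up helper name, not CMS code):

#include <cstddef>
#include <vector>

template <typename T>
void grow_by(std::vector<T>& v, size_t n, const T& val) {
    if (n > 2 * v.capacity()) {
        v.resize(v.size() + n, val); // one big jump: a single reallocation is fine
    } else {
        for (size_t i = 0; i < n; ++i) {
            v.push_back(val); // small additions: amortized O(1) per element
        }
    }
}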

@msoos msoos self-assigned this Feb 4, 2018
@msoos msoos added the bug label Feb 4, 2018
@msoos
Owner Author

msoos commented Feb 4, 2018

Wow, this is definitely wrong and you are so right!! Ahh, what an idiot I am for not having thought of that.

Note that the resize() is still useful, because push_back() is an issue when one adds 5 million variables in one go (it happens...): calling it 5 million times is... expensive. So I will do a capacity check and an intelligent reserve() + resize(), roughly as sketched below. That should work both when someone adds 5 million variables in one go and when one grows by a small amount (e.g. 1-5) every time.
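A minimal sketch of that reserve() + resize() idea (illustrative only, not the actual commit; grow_to is a made-up helper name):

#include <algorithm>
#include <cstddef>
#include <vector>

template <typename T>
void grow_to(std::vector<T>& v, size_t n, const T& val) {
    const size_t need = v.size() + n;
    if (need > v.capacity()) {
        // At least double the capacity, so that repeated small grows
        // reallocate only O(log n) times in total.
        v.reserve(std::max(2 * v.capacity(), need));
    }
    v.resize(need, val); // need <= capacity, so no reallocation here
}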

Thank you so much, you are 100% right.

@horenmar
Contributor

horenmar commented Feb 5, 2018

Actually, I think the best solution would be to use the vector::insert(const_iterator, size_type, T const&) overload. Used like this, activ_glue.insert(activ_glue.end(), n, 0), it should keep the geometric growth, providing amortized O(1) complexity per inserted item, while also keeping the number of function calls low. The little demo below shows the effect.
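A standalone demo (my own throwaway code, not CMS; whether resize() itself grows geometrically is implementation-specific, which this makes visible). capacity() changes only when the vector reallocates, so counting capacity changes counts reallocations, i.e. full copies of the data:

#include <cstddef>
#include <cstdio>
#include <vector>

// Count how many times the vector reallocates while being grown in
// `batches` steps of `batch` elements each, using the supplied strategy.
template <typename GrowFn>
static std::size_t count_reallocs(std::size_t batches, std::size_t batch, GrowFn grow) {
    std::vector<unsigned> v;
    std::size_t reallocs = 0;
    std::size_t last_cap = v.capacity();
    for (std::size_t i = 0; i < batches; ++i) {
        grow(v, batch);
        if (v.capacity() != last_cap) {
            ++reallocs;
            last_cap = v.capacity();
        }
    }
    return reallocs;
}

int main() {
    const std::size_t batches = 1000 * 1000; // many calls...
    const std::size_t batch = 10;            // ...each adding ~10 variables
    const std::size_t by_resize = count_reallocs(batches, batch,
        [](std::vector<unsigned>& v, std::size_t n) { v.resize(v.size() + n, 0); });
    const std::size_t by_insert = count_reallocs(batches, batch,
        [](std::vector<unsigned>& v, std::size_t n) { v.insert(v.end(), n, 0); });
    std::printf("resize(): %zu reallocations, insert(): %zu reallocations\n",
                by_resize, by_insert);
    return 0;
}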

msoos added a commit that referenced this issue Feb 8, 2018
Thanks a LOT to Martin Hořeňovský for finding this, debugging it, and
recommending a great solution.

Related to issue #445
@msoos
Owner Author

msoos commented Feb 8, 2018

OK, I think the issue has been fixed. However, to close it, I need:

  • to create a test case. I will do this soon; I don't intend to close this before making sure this doesn't happen again
  • you to kindly get back to me on whether it fixes the issue. That'd be awesome :)

@horenmar
Contributor

horenmar commented Feb 12, 2018

Using the same instance as before now ends in ~3:30, but it is not quite the improvement I would like:

ERROR: memory manager can't handle the load. **PLEASE RECOMPILE WITH -DLARGEMEM=ON** size: 1073741817 needed: 9 newcapacity: 1073741823
ERROR: memory manager can't handle the load. **PLEASE RECOMPILE WITH -DLARGEMEM=ON** size: 1073741817 needed: 9 newcapacity: 1073741823

I am going to try with smaller instance later.


The smaller instance finishes in 38 seconds, which is much more reasonable compared to the ~520 seconds CMSat 5.0.1 needed. It still seems that CMSat does a lot of extra work compared to MiniSat: MiniSat finishes adding all the variables/clauses from the same instance in ~15 seconds, Glucose in ~17.

It should be noted that both MiniSat and Glucose are built together with our code using full LTO, while CMSat is built in RelWithDebInfo and used as a static library.

@msoos
Owner Author

msoos commented Feb 13, 2018

Wait... how many clauses are there? And how many variables? What I suspect is that the translation of your problem can be significantly improved -- which is a good thing, because it could allow a MASSIVE speedup if we fix it. I am sorry to say it, but if CMS runs out of memory on a standard build, then the first thing to think about is how to improve the problem encoding. I mean, we could most likely easily achieve a 10x speedup, but in such cases I have seen 100x too. What kind of encoding are you using? Do you really need all those variables? What forces you to have them all? Are there other ways of encoding the same constraints? I am really curious, because every time I have seen this out-of-memory error, it has been the case that the translation of the problem could be significantly improved.

Anyway, you can always just recompile with "-DLARGEMEM=ON" passed to cmake, but I strongly suggest trying to fix the problem translation. It can most likely be improved by a large factor.

@horenmar
Contributor

The final CNF contains 320M variables and 966M clauses.

As to the translation, it is a SAT formulation of the master-key problem as described here, in the 2nd chapter. The specific input is confidential, but it has ~2k keys and locks, the geometry is fairly long (~15 positions), and it contains no profiles. Any ideas for a more efficient encoding are welcome.

@msoos
Owner Author

msoos commented Feb 17, 2018

Ah, waiiiittt... you didn't put the "p cnf VARS CLAUSES" header at the top of the CNFs you sent me :) That way, the variable adding is slower :)

@horenmar
Contributor

  1. The measurements were done for use as a C++ library, without the DIMACS front-end.
  2. Those CNFs are for "CMS is slow on some instances" (#443).

@msoos
Owner Author

msoos commented Feb 17, 2018

Ah, I see :) I'll try to improve that too :) I see it takes about 40s on my machine; really long, you are absolutely right :)

@msoos
Owner Author

msoos commented Feb 18, 2018

Well, I am reading https://dspace.cvut.cz/bitstream/handle/10467/73952/F3-DP-2018-Horenovsky-Martin-main.pdf?sequence=-1&isAllowed=y now, so maybe I'll understand more about these instances. What I can say is that the search space is humongous, but in fact the actual solution is not hard to find. So it's a bit of a random thing whether the solver finds it or not. It only makes 33 conflicts/s because of the number of clauses and variables. I'll see how I could fix that :)

@msoos
Owner Author

msoos commented May 24, 2018

Ah, I think I managed to fix this one :) Can you please check? I found some issues with .resize() and related code that could have been the cause.

@horenmar
Contributor

horenmar commented Jun 5, 2018

I downloaded and compiled 5.6.0, and initialization is still significantly slower than e.g. MiniSat's, but the factor is reasonable.

            CMSat   MiniSat   slowdown
Problem 1   52s     23.5s     2.2x
Problem 2   15s     7s        2.1x
Problem 3   16s     6.8s      2.3x

As far as I am concerned, 93529f6 fixed it enough to close this issue.

@msoos
Owner Author

msoos commented Jun 6, 2018

Wow, thanks for running the experiment! I'm glad this solved some of the issue. A 2x slowdown is still not nice, though. Let me keep this open for the moment and try to fix it for 5.7 :) Thanks so much again!

@msoos msoos added enhancement and removed bug labels Jun 11, 2018
@msoos
Owner Author

msoos commented Jul 4, 2018

I think I might have to live with being 2x slower to add than MiniSat. It's not extreme, though not perfect, I have to admit.

Thank you so much for looking into this issue! It has really, really helped, and it actually helped me understand C++ and the .resize() operation a lot better. I'm a better programmer because of this :) Thanks again!

Closing :)

@msoos msoos closed this as completed Jul 4, 2018
@msoos
Owner Author

msoos commented Aug 28, 2018

So... with the conda developers we have really put some effort into this, and the results are really good. I suggest you take another go at this :) In case you are interested in the history, it's mostly documented in #513 and a bunch of commits over the past few days. All in all, the startup should be significantly faster when adding clauses and variables to a solver, from both the C++ and the Python interface. Check it out please :) And thanks again for the help you gave here -- it was invaluable for being able to do the further optimisations!

@mbargull
Contributor

Oh, I didn't realize you had already improved on this just recently, only a few months back. Removing an O(n²) inefficiency surely sounds like a really satisfying achievement, neat!

320M variables 960M clauses

Sounds like some problems that are "a little bit" bigger than the smaller ones we have in Conda 😆. But I'd also be interested in whether the latest improvements we made to gain a few 0.x seconds also help at the other end of the spectrum!

@msoos
Owner Author

msoos commented Aug 28, 2018

@mbargull Well, that's the thing that makes it hard to write SAT solvers -- the range is huge. Some people write problems that have 500 variables and are difficult (say, 10+ days) to solve, and some write problems that have 100M+ variables and can be solved in <100s. Then there are those problems that take <3s each, but there are hundreds of them. There is of course no single optimal data structure that spans all of these, so tradeoffs must be made.

Note that CMS actually runs stably on hard problems; I have personally had a few problems running for 14+ days without a hitch. This is not the case for many SAT solvers. Note that picosat and, generally, solvers written by Armin Biere are well-engineered, and I would expect them to run 50+ days without a hitch. Most of the competition winners, though, won't survive beyond 10h. The incentives are just not right, and there is also a lack of ownership. Winning is one thing. Maintaining a solver for 8 years is another.
