
Trying out the box cone API #63

Closed · stephane-caron opened this issue Jun 24, 2022 · 29 comments
stephane-caron commented Jun 24, 2022

This is not an issue, just my feedback while trying out the box cone API in quadratic programming.

Context: suggestion from @bodono at #36 (comment)

Following this comment, I end up with this code:

import numpy as np
from scipy import sparse

# data is already filled with a zero cone
cone["bl"] = lb
cone["bu"] = ub
k = lb.shape[0]
zero_row = sparse.csc_matrix((1, k))  # row for the scalar t of the box cone
data["A"] = sparse.vstack((data["A"], zero_row, -sparse.eye(k)), format="csc")
data["b"] = np.hstack((data["b"], 1.0, np.zeros(k)))  # t = 1, then s = x
cone["bsize"] = k + 1  # length of [t; s]

Empirically it yields the correct results.
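
For completeness, here is a minimal sketch of the full call that goes with the snippet above. It assumes data also holds the cost terms "P" and "c" and that cone holds the zero-cone size "z", and uses the legacy scs.solve interface of scs-python 3.x:

import scs

sol = scs.solve(data, cone)   # settings keyword arguments are optional
x = sol["x"]                  # primal solution
print(sol["info"]["status"])  # e.g. "solved"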

Curiosity about the API

The API surprised me in two ways while implementing this:

  • Why do we set "bl" and "bu" in cone? At first, I wrongly assumed they would go to data.
  • Why do we manually set cone["bsize"]? This is redundant, since SCS can figure it out from "bl" and "bu" when they are present.

Suggestions for the doc

note that the variable vector is stacked as [t;s], i.e., t comes first then s

This was useful to me but does not appear in Cones. It should probably be mentioned there for future travelers.

Now about cone["bsize"]. The doc says:

where bsize is total cone length.

What I understand is that bsize is the length of [t;s], but I wasn't sure at first. For zero ambiguity, the doc could e.g. mention that bsize is k + 1, referring to the formula to the left.


stephane-caron commented Jun 24, 2022

Performance of quadratic programming with the box cone versus positive cone

Here is a comparison of performance on my machine for:

  1. Using the box cone API
  2. Stacking inequalities in the positive cone

Code in the corresponding PR qpsolvers/qpsolvers#67

| Problem          | Box cone          | Positive cone     |
|------------------|-------------------|-------------------|
| Sparse ($n=500$) | 4.99 ms ± 3.42 µs | 6.33 ms ± 7.64 µs |
| Small dense      | 1.09 ms ± 9.7 µs  | 612 µs ± 5.36 µs  |

On the random dense benchmark the box API does not seem to scale as well though:

[Plot: solve times vs. problem size on the random dense benchmark]

Here $bl=-0.3$ and $bu=0.3$ for every coordinate. Here is another variation of the problem where $bl=-0.03$ and $bu=0.03$, so that box inequalities are more often active at optimal solutions:

[Plot: same benchmark with $bl=-0.03$ and $bu=0.03$]

From these observations only, it seems the box cone performs better than stacking inequalities in the positive cone on sparse problems, but worse on dense ones.

stephane-caron added a commit to qpsolvers/qpsolvers that referenced this issue Jun 24, 2022

bodono commented Jun 27, 2022

Hi @stephane-caron, thanks for the feedback, this is very useful!

A couple of comments:

  • The bl and bu entries go into the cone struct because they relate to the cone. The code is somewhat decomposed, in that the cone and the linear system don't need to know anything about each other, which makes it very easy to add support for new cones or new linear solvers.
  • The bsize entry does not need to be passed into the Python API; it is pulled out programmatically, and the number you are passing in is just ignored. It needs to be passed into the C API because C arrays are just pointers, and you need to track the length of the array separately.
  • Thanks for the feedback on the docs, I will try and make them clearer!

I'm a little surprised that the box cone is slower. Is it easy for me to run these examples? It could be that the cone projection (which is more expensive for the box cone because there's a 1d Newton method happening) is dominating for small random examples.

Overall the box cone should be slightly faster because the data does not need to be replicated as it does when just using the positive cone (no need to stack A twice). The other major advantage is that users often specify 'no bound on this variable' by passing in inf or -inf (or massive numbers like 10^10) as bounds. This significantly harms the performance of solvers using the positive cone, but is easily handled by the box cone.
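
To illustrate the last point with a hedged sketch (not from the thread): unbounded coordinates can be passed to the box cone directly as infinities, with no huge-valued rows left in the problem data.

import numpy as np

lb = np.array([-np.inf, -1.0, 0.0])  # no lower bound on the first coordinate
ub = np.array([np.inf, 1.0, 2.0])    # no upper bound on it either
cone["bl"] = lb
cone["bu"] = ub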


stephane-caron commented Jun 27, 2022

Thanks for your feedback! I've removed the extra bsize.

Is it easy for me to run these examples?

Sure, you should be able to just clone the branch and test locally:

git clone https://github.com/stephane-caron/qpsolvers.git -b feature/scs_box_cone
cd qpsolvers/examples
ln -s ../qpsolvers ./  # testing locally
ipython -i benchmark_dense_problem.py

All benchmark examples, except the model predictive control one, are set to compare "scs" and "scs_box" in this branch, so you can run ipython -i on any of them.

stephane-caron added a commit to qpsolvers/qpsolvers that referenced this issue Jun 27, 2022

bodono commented Jun 28, 2022

Thanks @stephane-caron, I was able to run and reproduce your results. I think the issue here is that these random problems are quite simple and nicely conditioned, and that they are dense.

stephane-caron commented:

Okay. Let me know if you know of other standard/representative problems (or, even better, a family of problems parameterized by e.g. size) that we could add to the QP benchmark.

@bodono Since you see no problem here, I believe we can move forward and make the box cone API the default way to use SCS in qpsolvers. Chime in if you think otherwise.


bodono commented Jul 5, 2022

The Maros-Meszaros test set of QPs is what most people use, and it is a pretty good one. It formulates QPs as:

[Screenshot: Maros-Meszaros QP problem formulation]


stephane-caron commented Jul 15, 2022

Thank you! I'll check it out in a separate PR.

I have another question: what is the best way to handle one bound and not the other, e.g. when bu is not set?

  • Setting it to None results in "Failed to parse cone field bu"
  • Not setting it in the dictionary results in "bu different dimension to bl"

For now I went for:

cone["bl"] = lb if lb is not None else np.full((n,), -np.inf)
cone["bu"] = ub if ub is not None else np.full((n,), +np.inf)

If that's proper we can merge https://github.com/stephane-caron/qpsolvers/pull/67/files to push the box cone API to qpsolvers.

stephane-caron added a commit to qpsolvers/qpsolvers that referenced this issue Jul 15, 2022

bodono commented Jul 25, 2022

Yes, you can do that. Or, probably easier, just use the standard linear (positive) cone when only one side of the box is specified.
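
A hedged sketch of that alternative, reusing the variable names from the snippets above: with only an upper bound ub, stack the corresponding rows into the positive cone "l" instead of the box cone.

import numpy as np
from scipy import sparse

n = ub.shape[0]
data["A"] = sparse.vstack((data["A"], sparse.eye(n)), format="csc")
data["b"] = np.hstack((data["b"], ub))  # b - A x = ub - x >= 0, i.e. x <= ub
cone["l"] = cone.get("l", 0) + n        # n new rows in the positive cone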

stephane-caron commented:

OK, thank you for your feedback in this issue 😃 The box cone API is now rolled out in qpsolvers v2.1.0.

stephane-caron commented:

The Maros-Meszaros testset of QPs is what most people use and is a pretty good test set. It formulates QPs as:

@bodono FYI, I've started working on it in a QP solvers benchmark, https://github.com/stephane-caron/qpsolvers_benchmark/blob/main/results/maros_meszaros.md, similar to the ones from OSQP and ProxQP. SCS 3.2.0 is part of the tested solvers, and if you'd like you can check preliminary results in that report (or run the benchmark on your machine).

The methodology is also an open work-in-progress, e.g. what's a good metric to compare overall solver primal accuracies? Looking for wisdom 😃


bodono commented Nov 1, 2022

How do I run the benchmark? I don't see the run_benchmark.py file.


bodono commented Nov 1, 2022

There are a lot of dependencies just to run SCS on this. I have already run into needing to install IPython, yaml, and now highspy, at which point I'm giving up. Do I really need all that? Can I just run one solver without having all the solvers installed?


stephane-caron commented Nov 2, 2022

How do I run the benchmark? I don't see the run_benchmark.py file.

Sorry, the README was outdated (the code is under development; things still change quite fast). You can do:

python benchmark.py run maros_meszaros --solver scs --settings default

Do I really need all that? Can I just run one solver without having to have all the solvers installed?

No we don't. I'll make them optional.

I don't see how you ran into a highspy import with the current main branch. Do you have a traceback?


bodono commented Nov 2, 2022

From my run yesterday:

=> python benchmark.py run maros_meszaros --solver SCS
[2022-11-01 17:00:40,684] [warning] Run ``pip install py-cpuinfo`` for more accurate CPU info (utils.py:34)
[2022-11-01 17:00:41,489] [info] Made 0 QP solver calls (test_set.py:140)
[2022-11-01 17:00:41,490] [info] Test set results written to /usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/results/maros_meszaros.csv (results.py:67)
Traceback (most recent call last):
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/benchmark.py", line 114, in <module>
    report.write(os.path.join(results_dir, f"{args.test_set}.md"))
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/qpsolvers_benchmark/report.py", line 82, in write
    {self.get_solvers_table()}
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/qpsolvers_benchmark/report.py", line 48, in get_solvers_table
    versions = get_solver_versions(self.test_set.solvers)
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/qpsolvers_benchmark/utils.py", line 93, in get_solver_versions
    versions = {
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/qpsolvers_benchmark/utils.py", line 94, in <dictcomp>
    solver: metadata.version(package)
  File "/usr/local/google/home/bodonoghue/miniconda3/envs/python39/lib/python3.9/importlib/metadata.py", line 551, in version
    return distribution(distribution_name).version
  File "/usr/local/google/home/bodonoghue/miniconda3/envs/python39/lib/python3.9/importlib/metadata.py", line 524, in distribution
    return Distribution.from_name(distribution_name)
  File "/usr/local/google/home/bodonoghue/miniconda3/envs/python39/lib/python3.9/importlib/metadata.py", line 187, in from_name
    raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: highspy

stephane-caron commented:

Thank you 😃 It's fixed now: qpsolvers/qpbenchmark#4


bodono commented Nov 3, 2022

I only have SCS and OSQP installed on this machine, so I only ran those two. Even still, I'm seeing some strange stuff about tolerances. It looks like OSQP's default settings are eps_abs=eps_rel=1e-4, but SCS is asked to solve to eps_abs=eps_rel=1e-7:

[2022-11-03 15:38:47,819] [info] Solving CONT-300 by osqp with default settings... (test_set.py:148)
-----------------------------------------------------------------
           OSQP v0.6.2  -  Operator Splitting QP Solver
              (c) Bartolomeo Stellato,  Goran Banjac
        University of Oxford  -  Stanford University 2021
-----------------------------------------------------------------
problem:  variables n = 90597, constraints m = 271492
          nnz(P) + nnz(A) = 653093
settings: linear system solver = qdldl,
          eps_abs = 1.0e-04, eps_rel = 1.0e-04,

[2022-11-03 15:38:29,029] [info] Solving AUG2D by scs with default settings... (test_set.py:148)
------------------------------------------------------------------
               SCS v3.2.1 - Splitting Conic Solver
        (c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------

problem:  variables n: 20200, constraints m: 30201
cones:    z: primal zero / dual free vars: 10000
          b: box cone vars: 20201
settings: eps_abs: 1.0e-07, eps_rel: 1.0e-07, eps_infeas: 1.0e-07

Obviously this is going to make SCS take longer and therefore look worse. You have to be very careful about this type of thing when comparing solvers. It's easy to make one solver look better than another by accidentally asking one to solve to 1000x the accuracy of the other.

The other thing to know is that OSQP does not constrain the duality gap, which can make it artificially faster, but in reality it's returning false solutions. I discuss this in a paper:

[Screenshots: excerpts from the paper on solvers that do not control the duality gap]

Another point is how to map back from the box cone formulation to the original problem when running SCS. This is a minor point, but I wonder if you are dividing by the t scalar variable at termination? I think it's typically better to do that. The box cone is {t * l <= s <= t * u} and we add the constraint that t = 1, but if at termination t != 1, then you might need to divide by whatever t is to get a solution that satisfies the conic constraint.
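
A hedged sketch of that recovery step, assuming the rows were stacked as in the snippet at the top of this thread (n_zero, the zero-cone size, and sol, the dictionary returned by SCS, are assumed names):

t = sol["s"][n_zero]        # slack of the extra zero row, i.e. the scalar t
if t > 0.0 and abs(t - 1.0) > 1e-9:
    x = sol["x"] / t        # rescale so the box constraint holds with t = 1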


stephane-caron commented Nov 3, 2022

You have to be very careful about this type of thing when comparing solvers. It's easy to make one solver look better than another by accidentally asking one to solve to 1000x the accuracy of the other.

This is a big issue; I'll address it with high priority, first in qpsolvers/qpsolvers#102, so that all solvers use their default tolerance settings (left to solver developers).

(The override of SCS's default tolerances comes from qpsolvers/qpsolvers#50 (comment); I'll find some other way to address this issue that keeps SCS's defaults.)

The follow-up concern for the benchmark is that some solvers could then appear faster by having laxer default tolerances. That is why benchmark results also include a shifted geometric mean of the primal residual at the returned solution, so that such solvers would appear both faster and less precise.
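
For reference, a minimal sketch of a shifted geometric mean as commonly used in solver benchmarks (the shift value below is an arbitrary placeholder, not the benchmark's actual setting):

import numpy as np

def shifted_geometric_mean(values: np.ndarray, shift: float = 10.0) -> float:
    """Geometric mean of (values + shift), minus the shift."""
    return np.exp(np.log(values + shift).mean()) - shift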

Another point is how to map back from the box cone formulation to the original problem when running SCS. This is a minor point, but I wonder if you are dividing by the t scalar variable at termination? I think it's typically better to do that. The box cone is {t * l <= s <= t * u} and we add the constraint that t = 1, but if at termination t != 1, then you might need to divide by whatever t is to get a solution that satisfies the conic constraint.

Interesting! Let's take a look at it → qpsolvers/qpsolvers#103


bodono commented Nov 3, 2022

Oh, I see, I thought this project set the defaults. You're saying the defaults come from the qpsolvers package, and are tuned in some sense for some other outcome.

I guess it's up to you whether the point of this project is to make a fair comparison of these solvers in a head-to-head with equalised settings (which is what I was assuming), or if it's more about tuning the qpsolvers package and the results are more indicative of their performance within that package.

I'm not sure what's going on with that primal residual because some of them seem enormous.


stephane-caron commented Nov 3, 2022

I guess it's up to you whether the point of this project is to make a fair comparison of these solvers in a head-to-head with equalised settings (which is what I was assuming)

The goal of the benchmark is to compare solvers through at least two lenses:

  • default: whatever settings ship with each solver
  • high_accuracy: all solvers' settings are such that eps_abs=1e-8 and eps_rel=0

It is totally open to other variants, e.g. having non-zero eps_rel for solvers that support it. Those lenses appear in the reports as different columns (I just updated this one to show what it looks like, obviously still WIP 😅).

or if it's more about tuning the qpsolvers package and the results are more indicative of their performance within that package.

Not really, qpsolvers should just be a gateway. It has deviated from that ideal, but it will come back on track.

I'm not sure what's going on with that primal residual because some of them seem enormous.

This could be due to how qpsolvers handles SOLVED_INACCURATE results. Currently it issues a warning and returns them like regular solutions (here). We could also decide that inaccurate results are no good and consider them as "no solution found" instead.
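
A minimal sketch of that stricter policy, assuming the info dictionary returned by SCS (status value 2 matches the SOLVED_INACCURATE warning shown further down this thread):

if info["status_val"] == 2:  # SCS status 2 is SOLVED_INACCURATE
    return None              # treat as "no solution found"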


stephane-caron commented Nov 7, 2022

This could be due to how qpsolvers handles SOLVED_INACCURATE results.

Confirmed. The problems that had cost/primal errors through the roof were:

$ python benchmark.py check_results results/maros_meszaros_dense.csv
[2022-11-07 15:03:59,608] [info] Loading existing results from results/maros_meszaros_dense.csv (results.py:59)
[2022-11-07 15:03:59,613] [info] Check out `results` for the full results data (benchmark.py:161)
[2022-11-07 15:03:59,614] [info] Check out `df` for results as a pandas DataFrame (benchmark.py:163)
Python 3.8.10 (default, Jun 22 2022, 20:18:18) 
Type 'copyright', 'credits' or 'license' for more information
IPython 8.0.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: df[df["solver"] == "scs"][df["found"]][df["cost_error"] > 1]
problem   solver  settings  runtime    found  cost_error   primal_error
DUALC1    scs     default   1.48001    True   8.41269e+23  2.79628e+11
DUALC2    scs     default   1.41959    True   1.31408e+21  8.04521e+12
HS76      scs     default   0.0799307  True   1.28279e+18  1.38347e+09
PRIMAL1   scs     default   7.38873    True   1.74335e+29  1.98123e+13
PRIMAL2   scs     default   10.5728    True   1.77982e+29  1.25733e+11
PRIMAL3   scs     default   18.3006    True   1.34623e+29  0
PRIMALC1  scs     default   2.50527    True   5.65706e+24  3.13875e+11
PRIMALC2  scs     default   2.23358    True   3.89248e+24  1.30071e+13
PRIMALC5  scs     default   2.86476    True   1.43503e+25  3.31257e+12
PRIMALC8  scs     default   5.19112    True   3.28301e+24  5.06913e+11
QCAPRI    scs     default   3.77299    True   1.58404e+29  1.09689e+12
QRECIPE   scs     default   1.34932    True   2.30308e+27  4.13912e+10
QSCAGR7   scs     default   1.07679    True   2.82285e+24  3.76232e+13
QSCTAP1   scs     default   4.8086     True   4.66901e+29  9.18115e+12
QSHARE1B  scs     default   1.80473    True   8.13167e+16  3.95843e+14
ZECEVIC2  scs     default   0.0537397  True   2.83422e+16  1.03975e+14

I checked them manually and they were all SOLVED_INACCURATE:

$ python benchmark.py check_problem maros_meszaros PRIMAL3
[2022-11-07 15:02:29,367] [info] Loading existing results from results/maros_meszaros.csv (results.py:59)
[2022-11-07 15:02:29,420] [info] Check out `problem` for the PRIMAL3 problem (benchmark.py:158)
Python 3.8.10 (default, Jun 22 2022, 20:18:18) 
Type 'copyright', 'credits' or 'license' for more information
IPython 8.0.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: x, runtime = problem.solve(solver="scs")
~/.local/lib/python3.8/site-packages/qpsolvers/solvers/scs_.py:263: UserWarning: SCS returned 2: SOLVED_INACCURATE

Now the benchmark is updated to consider SOLVED_INACCURATE results as "not found". This makes the primal and cost errors for SCS comparable to those of the other solvers.

(I'll re-run the full Maros-Meszaros as well.)

stephane-caron added a commit to qpsolvers/qpsolvers that referenced this issue Nov 8, 2022

bodono commented Nov 9, 2022

That makes sense. Are these still with 1e-7 accuracy? That's probably a bit out of the range where SCS is best. By default SCS uses 1e-4 for both eps_abs and eps_rel.

stephane-caron commented:

SCS is now at its 1e-4 defaults for both eps_abs and eps_rel. More generally:

qpsolvers should just be a gateway. It has deviated from that ideal, but it will come back on track.

It is now back on track, i.e. all defaults are left to the solvers.

I re-ran the Maros-Meszaros test set yesterday; here is the resulting report. The success rate for SCS 3.2.2 is around 33% with its default settings (and there is indeed a drop to 21% with the high_accuracy settings).


bodono commented Nov 10, 2022

Hmm, I still can't really tell what's happening in that report. Table 2 seems to suggest that basically all the solvers fail all the time, is that right? If so then something is wrong with the test suite, because it simply can't be the case that the solvers have 96%+ failure rates routinely like this:

[Screenshot: Table 2 of the report]

CVXOPT had a 100% failure rate at the default tolerance? That doesn't sound right to me.

Also I don't understand this table at all:

[Screenshot: results table from the report]

The right way to validate a solution is to enforce the contract that the solver says it will satisfy, not to come up with a new contract. For instance, SCS will return a solution when it satisfies these conditions:
[Screenshot: SCS termination conditions (primal residual, dual residual, and duality gap tolerances)]
If you want to verify a solution, then you should test that the solver returns something that satisfies the conditions it promises to satisfy. This may differ from solver to solver, so you'll probably have to write something that takes care of that (and ideally set the solver settings so that the conditions are similar across solvers, for fairness). Anything other than that is unfair, because you would be testing the solvers on something they have not promised to do.

The one exception to the above are solvers that don't enforce a duality gap: for those I would test whether they satisfy a similar duality gap condition as SCS does, and return a failure if they do not.

Also, I notice that the default tolerances are very loose (1000?) and not set relatively (e.g., an objective error should be relative to the true objective, etc.), so I would guess that many things are slipping through the cracks there. By the way, to test optimality you should always check the KKT conditions, which are necessary and sufficient for QP optimality. I think what you seem to be checking (primal residual and cost) is not sufficient, and not quite right anyway, because there are two costs: primal and dual.
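
For reference, the KKT conditions for a QP written as $\min_x \frac{1}{2} x^T P x + q^T x$ subject to $G x \leq h$ and $A x = b$ are:

$$
\begin{aligned}
P x + q + G^T \lambda + A^T \nu &= 0 && \text{(stationarity)} \\
G x \leq h, \quad A x &= b && \text{(primal feasibility)} \\
\lambda &\geq 0 && \text{(dual feasibility)} \\
\lambda_i (G x - h)_i &= 0 \;\; \forall i && \text{(complementary slackness)}
\end{aligned}
$$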

I have also run SCS on the Maros-Meszaros test set, and I seemed to get better results. Checking the paper, I actually used slightly different tolerances, eps_abs=1e-3 and eps_rel=1e-4, so it's a little different from yours. Attached is a csv file of my results for your comparison.

results.csv


bodono commented Nov 10, 2022

Looking at the instances you posted, I see for instance the first entry is DUALC1, which you have:

problem  solver  settings  runtime  found  cost_error   primal_error
DUALC1   scs     default   1.48001  True   8.41269e+23  2.79628e+11

but in the results spreadsheet I have:

name    solver   status   run_time              iter  obj_val            obj_opt    n  m    N     setup_time  solve_time  scale                scale_updates
DUALC1  SCS-3.0  optimal  0.002122163772583008  150   6154.912037803983  6155.2508  9  224  2025  0.330263    1.190224    0.04653712266416252  1

And I notice that the true objective value is listed as 6155.2508. My run of SCS finishes in 150 iterations and is close to the optimum with 6154.912037803983, but somehow your run has a cost_error of 8.41269e+23, which seems very high!

(FYI: in my csv, run_time is in seconds as determined by a separate Python timer, i.e., it doesn't rely on the solver's own timers, while setup_time and solve_time are both in milliseconds and come from the solver's own timer.)

stephane-caron added a commit to qpsolvers/qpbenchmark that referenced this issue Nov 10, 2022

stephane-caron commented Nov 10, 2022

Hmm, I still can't really tell what's happening in that report. Table 2 seems to suggest that basically all the solvers fail all the time, is that right? If so then something is wrong with the test suite, because it simply can't be the case that the solvers have 96%+ failure rates routinely like this

Oh my, sorry for this 🙈 Those are not failure rates: the table lists the percentage of instances where (1) the solver said it found the solution and (2) the solution passed tolerance checks. → Updated report

CVXOPT had a 100% failure rate at the default tolerance? That doesn't sound right to me.

So, for instance, CVXOPT is always right when it says it finds a solution with its default settings. But it only finds solutions 16% of the time.

stephane-caron commented:

The right way to validate a solution is to enforce the contract that the solver says it will satisfy, not to come up with a new contract. For instance, SCS will return a solution when it satisfies these conditions:

  1. The primal tolerance is not a new contract: it is exactly the primal residual check you mention.
  2. The dual and duality gap contracts that you mention are not currently implemented by the benchmark; rather, they are listed in the Limitations section of the README.

I'm totally in favor of implementing 2 down the line. The reason it won't be in this first release of the benchmark stems from a quick estimate of (1) the workload to wire dual multipliers through all backend solvers 😅 and (2) the time I can spend on this in the upcoming weeks.


stephane-caron commented Nov 10, 2022

Also, I notice that the default tolerances are very loose (1000?) and not set relatively (e.g, an objective error should be relative to the true objective etc.) so I would guess that many things are slipping through the cracks there.

Agreed. Those are just initial loose values to check that everything works. I will identify tighter values for the cost and primal tolerances before the initial release of the benchmark. → https://github.com/stephane-caron/qpsolvers_benchmark/issues/15

stephane-caron commented:

an objective error should be relative to the true objective

Yes 👍 For the Maros-Meszaros test set, all true objectives are $> 0$ (except perhaps HS51?), so we can use the relative variation instead:

$$ \text{relative cost error} := \frac{|f(x_\mathrm{solver}) - f(x^*)|}{|f(x^*)|} $$

This isn't entirely satisfactory though, because on some problems in different test sets $f(x^*)$ could be zero. 🤔
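
One common workaround, suggested here as an assumption rather than something settled in this thread, is to mix absolute and relative scaling so that the denominator never vanishes:

$$ \text{cost error} := \frac{|f(x_\mathrm{solver}) - f(x^*)|}{\max(1, |f(x^*)|)} $$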

I've added this point to the design discussion here: qpsolvers/qpbenchmark#18.

stephane-caron commented:

Thanks a lot @bodono for your feedback 😃

I will check your results and the performance on DUALC1 in qpsolvers/qpbenchmark#19.
