ENH: modularize presolve in linprog #11691 #12510
Conversation
Added corresponding tests
Revert tox.ini
Needed to pass standard PR checks
Add _presolve subfunctions to detect infeasible constraints, remove variables which are fixed by bounds, and remove variables which are fixed by row singletons. Add tests for these functions. Row singleton tests are not complete.
Add two tests
Included asv.conf.json and reverted .gitignore to original.
The code in _presolve() is split into parts, allowing the process to loop until no further reduction is obtained. In the redundancy removal, the initial zero-row test is removed; _presolve() now does that as a final step.
Removed variables, equations, inequalities & redundancy removal result.
Including redundancy-removal result
Preparing final PR
Resolved conflict in test__linprog_clean_inputs.py and asv.conf.json
Corrected errors, mainly in tests. These were written against an earlier version of the presolve routines.
This code was rather slow.
Same reason and same change as the previous commit.
…nfeasibility checks
These calls are still timing bottlenecks.
…esolve_infeasibile_..._constraints()
The benchmarks improve, but having dense intermediate matrices of the same shape as sparse A_eq and/or A_ub defies logic.
Actually a redo. Preferred parameter formats and comments explaining the tests.
Also suppressed efficiency warning in feasibility checks with sparse matrices
Resolved conflicts, small updates to _linprog.py and _linprog_util.py
The _remove_zero() function is not necessary now that _presolve() guarantees that there are no zero rows in the problem matrices. I modified this function to do a light check, but the warning makes the tests fail.
I forgot I modified
Getting into details, two test failures left, one that puzzles me:
Resolves SparseEfficiencyWarning
The TravisCI LINT test flags all the cases in test__presolve().py where I tried to neatly align matrix elements; too bad that has to go. The presolve step is not consistently slower. In several cases it is also faster. I haven't dug into it deeply, though. From the names of the tests which perform better, the faster cases may have to do with detecting infeasibility. The slower cases may have to do with looping. Checking for infeasible equality and inequality constraints is relatively heavy, and it happens each loop. I still have a couple of days before leaving for a three-week holiday. What would be most important to check first?
Aligning matrix elements does not fit PEP8 rules.
There was talk of relaxing that rule. Let me double check on that. Update: see gh-12367. I suppose you can wait on this.
# Remove variables which are fixed by singletons in A_eq
if not _presolve_complete(p_stat):
    lp, rev, p_stat = _presolve_remove_equation_row_singletons(lp, p_stat, tol)
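To illustrate what the row-singleton step in the diff above is doing, here is a minimal dense sketch of the idea: a row of A_eq with exactly one nonzero coefficient pins the corresponding variable to a single value, which can then be substituted out of the remaining rows. The function name and return format here are hypothetical, not scipy's actual implementation.

```python
import numpy as np

def remove_equation_row_singletons(A_eq, b_eq, tol=1e-9):
    """Sketch: fix variables determined by rows of A_eq with one nonzero.

    Returns the reduced (A_eq, b_eq) and a dict {column: fixed value}.
    Illustration only; ignores bound checks and repeated singletons.
    """
    fixed = {}
    keep_rows = []
    for i, row in enumerate(A_eq):
        nz = np.flatnonzero(np.abs(row) > tol)
        if len(nz) == 1:
            j = nz[0]
            # a_ij * x_j = b_i  =>  x_j = b_i / a_ij
            fixed[j] = b_eq[i] / row[j]
        else:
            keep_rows.append(i)
    # Substitute the fixed variables into the remaining rows.
    A = A_eq[keep_rows, :].astype(float).copy()
    b = b_eq[keep_rows].astype(float).copy()
    for j, xj in fixed.items():
        b -= A[:, j] * xj
        A[:, j] = 0.0
    return A, b, fixed
```

In the real code this would loop (fixing one variable may create new singletons), which is exactly what the modular _presolve() loop enables.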
Can we put all of these functions in a list and loop this? Something like:
for fun in fun_list:
    if not _presolve_complete(p_stat):
        lp, rev, p_stat = fun(lp, p_stat, tol)
        if rev is not None:
            revstack.append(rev)
if complete:
    x = np.zeros(lp.c.shape)
else:
    A, b, c, _, x0 = _get_Abc(lp, 0)
Have you eliminated the need for c0 entirely? If the option 'disp': True is set, will the value of the original objective function print correctly at each iteration? Will the correct value of the objective function be sent to any callback functions? If so, _get_Abc doesn't need to return it, _linprog_simplex etc. don't need to accept it as an argument, and we can remove the idea from the code.
return (_LPProblem(c, A_ub, b_ub, A_eq, b_eq, bounds, x0),
        c0, x, revstack, complete, status, message)
presolve_effect[4] = pps[4] - A_eq.shape[0]
return (lp._replace(A_eq=A_eq, b_eq=b_eq),
A_ub, b_ub don't need to be here?
I'll go through most of your comments and questions later. There are some files I did not intend to change. I'm attaching the slow benchmark results:
Re unintended changes: You can delete the file scipy/optimize/_shgo_lib/sobol_vec.gz if it is not tracked by Git. The error about it was probably caused by changes to the other files. If you check

The benchmark problems don't test specific things. They are just a large collection of real-world problems of varying difficulty from Netlib.
OK, I'll check that out. I have been looking at the most extreme cases of the benchmark. The two which report failure do not fail when I run these benchmarks separately, in Spyder. (I copy the classes, adjust the paths to the .npz files, instantiate them and call the timing method.) The third benchmark,

Regarding the last couple of cases, e.g.

I found some limited hints about the background of these benchmark data sets: via Google Scholar, David M. Gay, "Electronic mail distribution of linear programming test problems". Still, it seems a bit strange to me to benchmark code without knowing whether these benchmark problems are biased in some way. But perhaps the large number averages out a lot.
This is something I've found with ASV, too. Sometimes it reports failure when problems take too long to solve, and maybe that's what's going on here. How long did they take in Spyder?
OK. Yes, I see that most of these large "regressions" are in
is looping 8 times, but it's slowing down presolve by a factor of 24, which suggests that individual iterations are taking much longer.
so I think the longer presolve is OK. It would just help to know why it is happening.
The
are good because the error was reduced by a factor of 100. Those last few lines:
suggest that
I think they are intentionally biased toward being numerically challenging, but I'm not sure.

Other thoughts: There are a lot of big improvements in

I'm not concerned by any of the regressions in the

I'm looking through some of the
is totally fine because it's accompanied by:
It's OK to take longer now because it actually solves the problem. Same with

So overall these results look OK. I'd just like to better understand what is going on with the large number (45-50) of minor regressions, e.g.:

Is the time difference accounted for by an increase in presolve time?
@@ -783,8 +793,8 @@ def test_unbounded_no_nontrivial_constraints_1(self):
res = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds,
              method=self.method, options=self.options)
_assert_unbounded(res)
assert_equal(res.x[-1], np.inf)
assert_equal(res.message[:36], "The problem is (trivially) unbounded")
# assert_equal(res.x[-1], np.inf)  # unclear if this should be required
These were written because the problem should be solved in presolve.
status = 2
message = ("The problem is (trivially) infeasible since one "
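The "(trivially) infeasible" message in the diff above corresponds to one of the simplest presolve checks: a zero row in A_eq with a nonzero right-hand side can never be satisfied. A minimal sketch of that test (the function name is hypothetical, not scipy's):

```python
import numpy as np

def trivially_infeasible_equalities(A_eq, b_eq, tol=1e-9):
    """Sketch: detect equality rows that are trivially infeasible.

    A zero row of A_eq paired with a nonzero b_eq entry demands
    0 * x == b_i with b_i != 0, which no x can satisfy.
    """
    zero_rows = np.all(np.abs(A_eq) < tol, axis=1)
    return bool(np.any(zero_rows & (np.abs(b_eq) > tol)))
```

When this fires, presolve can set status = 2 immediately, without ever calling the main solver.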
It looks like these messages are gone?
These two directories are empty. Will see if this is all.
Directory is empty. Will see if that is sufficient.
Permissions of the previously uploaded file were wrong (644 rw-r--r-- instead of 755 rwxr-xr-x)
…olve
As far as I can see only sobol_vec.gz
Thanks for the explanation, I'm beginning to get a feel for how to look at these benchmarks. I thought all comparisons were times, but some are errors - I didn't know that. Going through a couple of topics before I take a three-week break :-)

I modified my local branch to get rid of the unintentional changes which were present in the PR. I also conformed to PEP8. I succeeded for all but one: the sobol_vec.gz file permissions. Uploading changes 755 to 644, and I don't see a way to change that. I see the CircleCI checks now pass.

The failing benchmarks take 22s (WOODW) and 49s (bgindy), but these numbers may be rather meaningless, since the master branch benchmarks also differ: 16s (WOODW) and 47s (bgindy). An option to limit the number of loops would be OK with me. There may be other possibilities as well, for example restricting presolve to limited functionality. We'll see once we get a clearer view of where extra presolve effort becomes a burden. I'll do these checks when I'm back:
Thanks for reviewing!
Thanks @janvle. Have a great break.
Reference issue
Third (last) step to implement suggestion #11570
What does this implement/fix?
This PR implements _presolve() in a modular way.
Additional information
Andersen & Andersen suggest a large number of presolve possibilities, not all of which were implemented in the code (yet). I have not implemented any additional steps, but by breaking _presolve() up into parts, I was able to implement the processing loop A&A suggest.
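The A&A-style processing loop can be sketched in a few lines: apply each reduction in turn, and repeat full passes until nothing changes (or a pass limit is hit). The names below are hypothetical stand-ins, not the PR's actual functions; a toy "reduction" on an integer demonstrates the fixed-point behavior.

```python
def run_presolve_loop(lp, reductions, max_passes=10):
    """Sketch of a presolve loop: each reduction takes a problem and
    returns (new_problem, changed); loop until a pass changes nothing."""
    for _ in range(max_passes):
        changed = False
        for reduce in reductions:
            lp, did_change = reduce(lp)
            changed = changed or did_change
        if not changed:
            break
    return lp

# Toy stand-in for a reduction: halve the "problem" while it is even.
def halve_even(n):
    return (n // 2, True) if n % 2 == 0 else (n, False)
```

Repeating passes matters because one reduction (e.g. fixing a variable) can expose new opportunities for another (e.g. a new row singleton), which a single pass would miss.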
The attached benchmark results show some large differences (from 10 times slower to 10 times faster). The slower benchmarks have in common that they involve sparse matrices. The cause is the modification of intermediate sparse matrices (which results in a SparseEfficiencyWarning, see the code comments). I don't know if this is important enough to address, and I also don't know how to go about fixing it.
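For context, the SparseEfficiencyWarning mentioned above is what scipy emits when an assignment changes the sparsity structure of a CSR/CSC matrix, which is an expensive operation. A small demonstration (assuming scipy is installed; this is background illustration, not code from the PR):

```python
import warnings
import numpy as np
from scipy.sparse import csr_matrix, SparseEfficiencyWarning

A = csr_matrix(np.eye(3))

# Writing a new nonzero into a CSR matrix changes its sparsity
# structure; scipy flags this with a SparseEfficiencyWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    A[0, 2] = 5.0

# Building or modifying in LIL format avoids the warning, since LIL
# is designed for incremental structure changes.
B = A.tolil()
B[1, 2] = 7.0
```

The usual remedies are to accumulate modifications in LIL (or COO) format and convert to CSR once at the end, or, as this PR does in places, to suppress the warning where the cost is accepted.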
presolve benchmark 01.txt
Because of the new loop, _presolve() is able to reduce problems more thoroughly than before. I have not yet added tests to cover this extra functionality.
_presolve() also returns an additional presolve_effect variable. Nothing is done with this data yet, but perhaps it is helpful for monitoring how presolve is working.