ENH: optimize.linprog; linprog duals #10
base: main
Conversation
@mdhaber Pinging you to make you aware of this. I'd like to have your eyes on it when you free up in a month or so

Looking forward to it!

@mckib2 it's tough to see what's really new here after merging upstream master. Maybe if you update your master branch it will look cleaner?
If I remember the theory correctly, my suggestion for testing would be to set up a primal and dual problem pair that has at least one of each constraint type active and check the following:

It would be really nice to write a tutorial that shows all this stuff, actually. Does that make sense or am I confusing the theory?
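For what it's worth, a minimal sketch of such a primal/dual pair check, covering only the `A_ub @ x <= b_ub, x >= 0` case; the problem data below are invented for illustration:

```python
import numpy as np
from scipy.optimize import linprog

# Primal:  min c @ x     s.t.  A_ub @ x <= b_ub,  x >= 0
c = np.array([-1.0, -2.0])
A_ub = np.array([[1.0, 1.0],
                 [1.0, 3.0]])
b_ub = np.array([4.0, 6.0])
primal = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2,
                 method='highs')

# Dual:    max b_ub @ y  s.t.  A_ub.T @ y <= c,  y <= 0
# (solved as a minimization of -b_ub @ y, since linprog minimizes)
dual = linprog(-b_ub, A_ub=A_ub.T, b_ub=c, bounds=[(None, 0)] * 2,
               method='highs')

# Strong duality: the optimal objective values agree,
# and the dual solution gives the shadow prices of the primal constraints.
print(primal.fun, -dual.fun)
print(dual.x)
```

The same pattern extends to equality constraints and finite bounds by adding the corresponding dual terms.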
Re: updating master, if that doesn't work, try changing the branch you're comparing this branch against to something else, then back to master again.

Yeah, I can see that the update worked when I open a new PR with the same branch, but I'm having trouble getting this PR to update. How do I change the branch for this PR? EDIT: I see the link now, it worked!

Yeah, that comes in handy as a maintainer. It doesn't always work, but I'm guessing that's when the author has screwed up and the commit numbers have actually changed.
Yes, this is much along the same lines that I was thinking. I was going to try to set up a simple test today but I got distracted by other projects. I am a little fuzzy on how to test Lagrangian values corresponding to the bounds, but I'm sure googling around will help with that. Depending on how much time you want to spend on this, feel free to take a crack at a test or two. It might be a couple days before I get something going -- no pressure either way
As much fun as that would be, I have a feeling I would get sucked in. Can't do that right now : (
I don't know that it's any different? Oops, I see I had a typo: I wrote
but meant
So just add epsilon to each constraint and solve the problem again. The difference in the objective function should be epsilon times the corresponding lambda.
I couldn't resist. To be honest, this is really all we need for a test. I'm already convinced that it's working fine. All the stuff about the dual problem is unnecessary. It would allow us to get better accuracy, but otherwise it's redundant - the code is already confirmed to be correct; at that point we'd just be illustrating the theory. It would be nice to refactor to have less copy-paste, but this is really all that's needed for testing:

```python
import numpy as np
from scipy.optimize import linprog

# this random problem generator is already in the test suite as `very_random_gen`
np.random.seed(0)
m_eq, m_ub, n = 10, 20, 50
c = np.random.rand(n) - 0.5
A_ub = np.random.rand(m_ub, n) - 0.5
b_ub = np.random.rand(m_ub) - 0.5
A_eq = np.random.rand(m_eq, n) - 0.5
b_eq = np.random.rand(m_eq) - 0.5
lb = -np.random.rand(n)
ub = np.random.rand(n)
lb[lb < -np.random.rand()] = -np.inf
ub[ub > np.random.rand()] = np.inf
bounds = np.vstack((lb, ub)).T

res = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds, method='highs')
f0 = res.fun

# naming convention: `dfdbub` is the partial derivative of `fun` w.r.t. `b_ub`
dfdbub = np.zeros_like(b_ub)
dfdbeq = np.zeros_like(b_eq)
dfdlb = np.zeros_like(lb)
dfdub = np.zeros_like(ub)
eps = 1e-6

for i in range(len(b_ub)):
    b_ub2 = b_ub.copy()
    # take relative epsilon steps; probably doesn't matter since scale is all near unity
    b_ub2[i] = b_ub2[i]*(1 + eps)
    res2 = linprog(c, A_ub, b_ub2, A_eq, b_eq, bounds, method='highs')
    dfdbub[i] = (res2.fun - f0)/(b_ub2[i]*eps)
np.allclose(dfdbub, res['lambda']['ineqlin'])

for i in range(len(b_eq)):
    b_eq2 = b_eq.copy()
    b_eq2[i] = b_eq2[i]*(1 + eps)
    res2 = linprog(c, A_ub, b_ub, A_eq, b_eq2, bounds, method='highs')
    dfdbeq[i] = (res2.fun - f0)/(b_eq2[i]*eps)
np.allclose(dfdbeq, res['lambda']['eqlin'])

for i in range(len(lb)):
    lb2 = lb.copy()
    lb2[i] = lb2[i]*(1 + eps)
    bounds2 = np.vstack((lb2, ub)).T
    res2 = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds2, method='highs')
    dfdlb[i] = (res2.fun - f0)/(lb2[i]*eps)
np.allclose(dfdlb, res['lambda']['lower'])

for i in range(len(ub)):
    ub2 = ub.copy()
    ub2[i] = ub2[i]*(1 + eps)
    bounds2 = np.vstack((lb, ub2)).T
    res2 = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds2, method='highs')
    dfdub[i] = (res2.fun - f0)/(ub2[i]*eps)
np.allclose(dfdub, res['lambda']['upper'])
```

Two things about API:
Nice! I'll try to do some refactoring and put this into a test case in
I'm actually surprised I did this -- I almost always rename
Anything wrong with just a plain
Looks like you're doing a finite difference derivative estimate? IIRC, there are some built into scipy we could call instead, like approx_fprime. Not sure if there's a way to make scipy do complex step for

Sure. I didn't think of it because we don't want/need a refined higher-order estimate that might take multiple evaluations. It just has to be close enough to compare with the information provided by HiGHS. We're not testing the accuracy of HiGHS, just that the output of HiGHS is getting to the right place / that we're interpreting it correctly.
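For reference, a rough sketch of what `approx_fprime` could look like here, using a toy LP rather than the PR's test problem:

```python
import numpy as np
from scipy.optimize import linprog, approx_fprime

# Toy LP: minimize c @ x subject to A_ub @ x <= b_ub, x >= 0.
c = np.array([-1.0, -2.0])
A_ub = np.array([[1.0, 1.0],
                 [1.0, 3.0]])
b_ub = np.array([4.0, 6.0])

def optimal_value(b):
    # Re-solve the LP with a perturbed right-hand side; approx_fprime
    # differentiates this scalar-valued function of b.
    return linprog(c, A_ub=A_ub, b_ub=b, bounds=[(0, None)] * 2,
                   method='highs').fun

# Forward-difference estimate of d(fun)/d(b_ub).
grad = approx_fprime(b_ub, optimal_value, 1e-6)
```

Since the optimal value of an LP is piecewise linear in `b_ub`, a plain forward step recovers the shadow prices essentially exactly as long as the optimal basis doesn't change within the step.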
Do they support indexing by key? I think we want both so it works like the rest of the

Regarding the name, let's draft with "sensitivity" and ask Julian if that sounds good to him. These partial derivatives are referred to as dual variables, shadow prices, and sensitivity; of those I think sensitivity is the most universal.
I needed to do central difference and choose a good step size to get a reasonable accuracy (expected from finite differences). You're right and I didn't even think it through -- HiGHS is not built to handle complex input so complex step will not work here.
I made the inner dict an
Sounds good to me, I've made the change. The test is in there after playing around with it for a little bit; please take a look when you have time to make sure I'm not doing something silly or hard to follow. The only thing left might be updating the docs?
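On the indexing question: `OptimizeResult` subclasses `dict` and forwards attribute access to the keys, so a nested `OptimizeResult` supports both styles. A quick illustration with made-up field values:

```python
from scipy.optimize import OptimizeResult

# Hypothetical nested result, mirroring the structure discussed here.
inner = OptimizeResult({'lower': [0.0], 'upper': [1.0]})
outer = OptimizeResult({'sensitivity': inner})

print(outer.sensitivity.lower)        # attribute-style access
print(outer['sensitivity']['upper'])  # dict-style access
```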
scipy/optimize/tests/test_linprog.py (outdated)

```python
lo = _kwargs[prop].copy()
hi[ii] += h
lo[ii] -= h
hi_kwargs = {k: (v if k != prop else hi) for k, v in _kwargs.items()}
```
@mdhaber You may or may not like what I've done here. There's probably a better way to do this -- this is just the first thing I thought of
I think I had something like this in mind, but it would be a lot simpler if you could do just forward step. Does the code I provided not run without error for you? That is just forward step.
I went straight for replacing the random program generator with very_random_gen(), but with everything else the same the tolerances were too large to pass and I couldn't find a step size that worked for everything. I'll check again tomorrow, I agree forward step would make this easier to read
I see. I think my original code didn't pass. I wasn't using `np.testing.assert_allclose`; it was just `np.allclose` : ) Well, let me think about it. It might be best just to change the tolerances.
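The distinction matters because `np.allclose` only returns a boolean, which a test can silently ignore, while `np.testing.assert_allclose` raises on mismatch. A small illustration:

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([1.0, 2.1])

# np.allclose just returns False -- a bare call in a test does nothing.
close = np.allclose(a, b)

# np.testing.assert_allclose raises AssertionError, so the failure is visible.
failed = False
try:
    np.testing.assert_allclose(a, b)
except AssertionError:
    failed = True
```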
Might have been an implementation issue -- I just did a quick forward step and it's working so I removed the central difference in favor of that
?! I tested it myself today, and it didn't look to me like it would pass with simple forward step! How are we flip-flopping?
Did you see my #12 though?
I did not see that PR -- I apologize! I'm surprised I didn't get an email about it, I'll have to see if there's a setting I need to toggle...
I'm happy to use whichever implementation you think is more maintainable -- I'm probably not going to mess around with it anymore other than integration if needed
So I think what's going on here is that I'm just getting lucky. I'm using naive implementations of numerical differentiation without regard for scaling of the variables, whereas you were trying to take that into account. For this problem, it appears the naive method wins out, but I like the approach in #12 better, so I think I'm going to pull that in
I also found that absolute steps were better, but still wasn't getting within default tolerance for me. Whatever!
scipy/optimize/_linprog.py (outdated)

```
@@ -606,6 +606,7 @@ def linprog(c, A_ub=None, b_ub=None, A_eq=None, b_eq=None,
     _check_result(sol['x'], sol['fun'], sol['status'], sol['slack'],
                   sol['con'], lp.bounds, tol, sol['message']))
     sol['success'] = sol['status'] == 0
     sol['sensitivity'] = OptimizeResult(sol['sensitivity'])
```
Originally I said that it should be a subclass of `OptimizeResult` because I thought `OptimizeResult` itself had some required fields, but it looks like it doesn't. The documentation states that `OptimizeResult` will have a bunch of fields that `linprog` already doesn't return, so I think we just need to adjust the documentation at some point to reflect the fact that `OptimizeResult` isn't guaranteed to have anything in particular.
Whoops -- forgot to subclass! Not sure if subclassing will do anything useful here if the docs for `OptimizeResult` should be changed anyway
scipy/optimize/_linprog_highs.py (outdated)

```
    'nit': res.get('simplex_nit', 0) or res.get('ipm_nit', 0),
    'crossover_nit': res.get('crossover_nit'),
}
sol = {
```
Has anything changed other than the addition of `sensitivity` and reformatting?

Update: hid whitespace changes. No.
scipy/optimize/_linprog_highs.py (outdated)

```
    'slack': slack,
    # TODO: Add/test dual info like:
    # 'lambda': res.get('lambda'),
    # 's': res.get('s'),
```
What was `s`?
`s` is the same as in `_highs_wrapper.pyx`. It looks like once upon a time I had planned on passing it straight through, but this can now be removed
More specifically, if `lambda` are the Lagrange multipliers associated with the constraints `Ax = b`, the `s` are the Lagrange multipliers associated with `x >= 0`. They get transformed into the `sensitivity` fields in `_highs_wrapper.pyx`
OK. That information ends up in `lower`/`upper`?
Sorry, I was incorrect before. s is returned by _highs_wrapper.pyx and _linprog_highs.py translates that into the lower/upper fields. So yes, that information makes it into lower/upper
Now that I'm looking at this again, it might be something to ask Julian whether I'm doing correctly. Each `s` value has a corresponding basis status value that tells whether the variable is at its lower or upper bound. There are other status values, but I just set `s` to 0 if it's not one of those two
Hmm. Yeah, good thing to check. Also ask what he thinks of the documentation and the name `sensitivity`, please.
I don't think references are needed in the documentation for this unless we think there is a really helpful general reference for LP sensitivity analysis.

Possible additional test:

Names: What a mess. There doesn't seem to be much agreement. I'm not sure what the best way is to clarify the distinction between e.g. the lower bounds (the constraints themselves) and the lower limit of a "range" (for which the dual variables are valid / the basis is the same). This search brought up more questions:
Wow! Thanks for looking around at all those, this is really good information.
I believe this is the same as complementary slackness, writing a test now for it.
What a mess, indeed. I'm personally fine with nesting -- it gives a little more semantic meaning to what those values are. I think we just need to decide on something that is intelligible and can be easily understood when you read the docs. I would also like to keep track of these mappings and put them in the docs somewhere; I think it's useful. MOSEK seems to me to be the most "principled" of the bunch, but for some reason I still like the way you've proposed with
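A sketch of such a complementary-slackness check on a toy problem (this uses the `ineqlin.marginals` field name that SciPy eventually settled on; in this PR the duals were still under discussion as `lambda`/`sensitivity`):

```python
import numpy as np
from scipy.optimize import linprog

# Complementary slackness: at optimality each inequality constraint is
# either tight (zero slack) or has a zero dual value, so slack * dual == 0.
c = np.array([-1.0, -2.0])
A_ub = np.array([[1.0, 1.0],
                 [1.0, 3.0],
                 [1.0, 0.0]])   # third constraint is slack at the optimum
b_ub = np.array([4.0, 6.0, 10.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2,
              method='highs')

# Elementwise product should vanish for every inequality constraint.
prod = res.slack * res.ineqlin.marginals
```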
@mdhaber I think the questions you pose might be good to run by Julian in an email. As for including ranging information in this PR, it looks like we'll have to update HiGHS in order to get it. SciPy is currently using HiGHS from the end of last year, before the ranging information was in their master. Do you want to include the upgrade in this PR, or only include sensitivities and add ranging information in a follow-up PR? EDIT: looking into how difficult this update is. If it's not bad, then I'll go for it
Updates:
I'm not entirely sure what I'm looking at with the ranging information, so I need to spend some time figuring out what that is and how we need to transform the information to make it useful for us. I did end up not nesting the sensitivity information, just because it was cumbersome to keep typing out the nested layer, but that can be easily changed if needed
```
if highs.getRanging(highsRanging) == HighsStatusOK:
    ranging = {
        'col_cost_up': {
            'val': [highsRanging.col_cost_up.value_[ii]
                    for ii in range(highsRanging.col_cost_up.value_.size())],
```
I need to re-evaluate this -- I think there's a better way to copy the contents of `std::vector` into a numpy array, or to do it in a way that doesn't involve a copy
```
@@ -653,6 +654,7 @@ def _highs_wrapper(

    # We might need an info object if we can look up the solution and a place to put solution
    cdef HighsInfo info = highs.getHighsInfo()  # it should always be safe to get the info object
    cdef HighsRanging highsRanging
```
All of these are stack allocated and copied into; it would be nice to create a constant ref (doesn't work in Cython, if I recall correctly). Maybe some ptr juggling would be good for these `HighsRanging`, `HighsSolution`, and `HighsBasis` structs
scipy/optimize/tests/test_linprog.py (outdated)

```python
c, A_ub, b_ub, A_eq, b_eq, bounds = very_random_gen(seed=0)
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=bounds, method=self.method, options=self.options)
print(res)
```
A dummy test to stop and look at the `HighsRanging` results
When testing ranging data the way that you are, make sure that the LP isn't primal or dual degenerate. If it's degenerate, then the dual information may not be valid for a positive change in a particular cost or bound value (call this value `data_`). To see whether this is the case, look at the `value_` entry in a particular `HighsRangingRecord`. Your epsilon must not be bigger than `|value_ - data_|`.
Thanks @jajhall. Tests passed, and we hadn't thought to look for ranging data yet, so it didn't make it into the test. We can add it. (Is it true that degeneracy is more common than one might expect, even for randomly generated problems?) Speaking of degeneracy, suppose we have a 2D problem. Three inequality constraints intersect at a point, and that point is the optimal solution. A given finite change in the RHS of any one of those constraints would cause different magnitude changes in the objective depending on whether the change is positive or negative. In such a case, I assume:

If that makes sense, does it sound like my understanding is correct? I ask because I saw in Mosek something about (I don't need it; I'm just asking so we can design the API such that changes can be made in a backwards-compatible way.) Also, we seem to be heading toward the name
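The scenario described above can be reproduced numerically. Here is a sketch on a toy problem (not from the PR's test suite) where three inequality constraints intersect at the optimum, so the two one-sided derivatives with respect to one RHS value differ:

```python
import numpy as np
from scipy.optimize import linprog

# Degenerate example: x1 <= 1, x2 <= 1, and x1 + x2 <= 2 all pass through
# the optimum (1, 1) of min -x1 - x2, so the sensitivity of the objective
# to b_ub[2] depends on the direction of the perturbation.
c = np.array([-1.0, -1.0])
A_ub = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [1.0, 1.0]])
b_ub = np.array([1.0, 1.0, 2.0])

def f(b3):
    b = b_ub.copy()
    b[2] = b3
    return linprog(c, A_ub=A_ub, b_ub=b, bounds=[(0, None)] * 2,
                   method='highs').fun

eps = 1e-6
fwd = (f(2.0 + eps) - f(2.0)) / eps   # relaxing: constraint stops binding
bwd = (f(2.0) - f(2.0 - eps)) / eps   # tightening: optimum gets cut off
```

Here `fwd` is 0 while `bwd` is -1, so whichever single value HiGHS reports can only agree with a finite step in one of the two directions.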
Thoughts and status:
Yes
It certainly isn't cheap. It requires the whole of the standard simplex tableau to be computed - one column at a time. One HiGHS option that I'm planning to introduce is the calculation of ranging information for a specific set of rows or a specific set of columns - in case users are only interested in certain variables or constraints and want to avoid the big overhead. Note that ranging information is only available when the model status is optimal.
…ging data; overhauled ranging testing
Updates:
In general, the code needs to be cleaned up; @mdhaber, any suggestions about code organization are welcome. Lots of code exists in

@jajhall If you get a chance, can you look at @mdhaber's earlier comment? I'd be interested to know how HiGHS handles this case as well
Ranging information cannot be given for both bounds of a boxed variable/constraint. If the variable/constraint is nonbasic, then the bound ranging information relates to the bound at which the variable/constraint is active.
Basic/nonbasic status for "bound" ranging does matter.
For cost ranging the basic/nonbasic status isn't quite so important. Cost ranging relates to the minimum and maximum values that the cost can take (with the current basis remaining optimal).
I think that your understanding is correct, but it's not expressed too clearly. In an example of this type, there are three possible bases at the optimal solution. This is because there are three constraints, so three basic variables, two of which are the original variables x1 and x2. The three slacks are all zero, so any one of them could complete the basis. [Or, equivalently, any two of the zero variables could be chosen to be nonbasic so that the values of the two original variables is given by the solution of the 2x2 system. The value of the basic slack is naturally zero since the three lines intersect at a point.] In your specific example, - with original variables x1 and x2 - here are observations on the ranging for the two optimal bases:
Subtly different...
So, yes, in the nonbasic cases (where ranging does not allow bounds to be changed by positive amount in one direction), "a finite step in the RHS in at least one direction (positive or negative) will go beyond the range provided by HiGHS". In the direction where ranging allows a positive change, the "the dual values provided by HiGHS will agree with changes in [that] direction"
This looks as if they have identified the shadow price as the solution leaves a degenerate vertex. To me this means performing simplex basis changes - which sounds expensive - but I'll think more about it.
Yes, makes sense.
Thanks @jajhall!
Reference issue

What does this implement/fix?

Adds a `marginals` dictionary with the following fields containing dual (Lagrangian) values:
- `lower`: corresponds to lower bounds
- `upper`: corresponds to upper bounds
- `eqlin`: corresponds to `A_eq`
- `ineqlin`: corresponds to `A_ub`

Additional information
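To illustrate the fields, here is a usage sketch written against the flat per-constraint-block layout that SciPy ultimately shipped (e.g. `res.eqlin.marginals`); the nested `marginals` dict described above was this PR's draft of the same information:

```python
import numpy as np
from scipy.optimize import linprog

# Toy problem: minimize x1 + x2 subject to x1 + 2*x2 == 1, x >= 0.
c = np.array([1.0, 1.0])
A_eq = np.array([[1.0, 2.0]])
b_eq = np.array([1.0])

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 2,
              method='highs')

print(res.eqlin.marginals)   # sensitivity of fun to b_eq
print(res.lower.marginals)   # sensitivity of fun to the lower bounds on x
```

Here the optimum is x = (0, 0.5): raising `b_eq` by eps raises the objective by eps/2, and raising the lower bound on x1 (which is active) raises it by its reduced cost.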