
ENH: optimize.linprog; linprog duals #10

Open · wants to merge 19 commits (base: main)

Conversation

@mckib2 (Owner) commented Jan 3, 2021

Reference issue

What does this implement/fix?

  • only for HiGHS linprog methods
  • returns a MATLAB-style marginals dictionary with the following fields containing dual (Lagrangian) values:
    • lower: corresponds to lower bounds
    • upper: corresponds to upper bounds
    • eqlin: corresponds to A_eq
    • ineqlin: corresponds to A_ub
  • Additionally returns ranging information (return structure still WIP)
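As a rough illustration of what the four marginal fields mean (this is not the PR's exact API -- the container name and types were still in flux at this point), each dual can be recovered numerically by perturbing the corresponding piece of a small LP and re-solving; the `marginals` dict below is a hypothetical mirror of the field names:

```python
import numpy as np
from scipy.optimize import linprog

# Tiny LP: minimize x1 + 2*x2  subject to  x1 + x2 == 1,  x >= 0.
# Optimum is x = (1, 0) with fun = 1.
c = [1, 2]
A_eq = [[1, 1]]
b_eq = [1]
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 2, method='highs')

eps = 1e-4

# eqlin marginal: d(fun)/d(b_eq), via a forward difference
res_b = linprog(c, A_eq=A_eq, b_eq=[1 + eps], bounds=[(0, None)] * 2,
                method='highs')
eqlin = (res_b.fun - res.fun) / eps

# lower marginal for x2: d(fun)/d(lb2) -- the reduced cost of the
# nonbasic variable x2
res_l = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None), (eps, None)],
                method='highs')
lower_x2 = (res_l.fun - res.fun) / eps

# hypothetical container mirroring the field names proposed in this PR
marginals = {'lower': np.array([0.0, lower_x2]),  # x1 is basic -> marginal 0
             'upper': np.zeros(2),                # no finite upper bounds
             'eqlin': np.array([eqlin]),
             'ineqlin': np.zeros(0)}              # no A_ub in this toy LP
```

For this problem both nonzero marginals come out to 1: raising b_eq by eps raises the optimum by eps, and raising the lower bound of x2 by eps costs eps through x2's reduced cost.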

Additional information

@mckib2 mckib2 changed the title Linprog duals ENH: optimize.linprog; Linprog duals Jan 3, 2021
@mckib2 mckib2 changed the title ENH: optimize.linprog; Linprog duals ENH: optimize.linprog; linprog duals Jan 3, 2021
@mckib2 (Owner, Author) commented Jan 3, 2021

@mdhaber Pinging you to make you aware of this. I'd like to have your eyes on it when you free up in a month or so

@mdhaber (Collaborator) commented Jan 4, 2021

Looking forward to it!

@mdhaber (Collaborator) commented Feb 15, 2021

@mckib2 it's tough to see what's really new here after merging upstream master. Maybe if you update your master branch it will look cleaner?

@mckib2 mckib2 marked this pull request as draft February 15, 2021 06:35
@mckib2 mckib2 marked this pull request as ready for review February 15, 2021 06:35
@mdhaber (Collaborator) commented Feb 15, 2021

If I remember theory correctly, my suggestion for testing would be to set up a primal and dual problem pair that has at least one of each constraint type active and check the following:

  • primal and dual problem objectives match
  • lambda values returned in the result structure of the primal agree with the sensitivities of the primal objective function with respect to small changes in primal problem constraints
  • lambda values returned in the result structure of the dual agree with the sensitivities of the dual objective function with respect to small changes in dual problem constraints
  • lambda values returned in the result structure of the primal agree with the dual problem decision variables
  • lambda values returned in the result structure of the dual agree with the primal problem decision variables

It would be really nice to write a tutorial that shows all this stuff, actually.

Does that make sense or am I confusing the theory?
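A concrete version of the first and last checks above (a sketch of the textbook identity on a made-up primal/dual pair, not the PR's actual test):

```python
import numpy as np
from scipy.optimize import linprog

# Primal: min c @ x  s.t.  A @ x <= b,  x >= 0
c = np.array([-1.0, -2.0])           # i.e. maximize x1 + 2*x2
A = np.array([[1.0, 1.0],
              [1.0, 3.0]])
b = np.array([4.0, 6.0])
primal = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2, method='highs')

# Dual of the form above: max -b @ y  s.t.  -A.T @ y <= c,  y >= 0,
# which linprog handles as:  min b @ y  s.t.  -A.T @ y <= c,  y >= 0
dual = linprog(b, A_ub=-A.T, b_ub=c, bounds=[(0, None)] * 2, method='highs')

# Strong duality: primal optimum equals -(dual minimum)
print(primal.fun, -dual.fun)         # both come out to about -5.0 here

# The dual decision variables y are the shadow prices of the primal
# inequality constraints (up to linprog's sign convention for marginals).
print(dual.x)                        # about [0.5, 0.5] for this problem
```

Here the primal optimum is x = (3, 1) with both inequalities active, so both dual variables are strictly positive, which is the "at least one of each constraint type active" situation described above.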

@mdhaber (Collaborator) commented Feb 15, 2021

Re: updating master, if that doesn't work, try changing the branch you're comparing this branch against to something else then back to master again.

@mckib2 (Owner, Author) commented Feb 15, 2021

Yeah, I can see that the update worked when I open a new PR with the same branch, but I'm having trouble getting this PR to update. How do I change the branch for this PR?

EDIT: I see the link now, it worked!

@mckib2 mckib2 changed the base branch from master to biased-urn February 15, 2021 06:40
@mckib2 mckib2 changed the base branch from biased-urn to master February 15, 2021 06:40
@mdhaber (Collaborator) commented Feb 15, 2021

Yeah that comes in handy as a maintainer. It doesn't always work but I'm guessing that's when the author has screwed up and the commit numbers have actually changed.

@mckib2 (Owner, Author) commented Feb 15, 2021

If I remember theory correctly, my suggestion for testing would be to set up a primal and dual problem pair that has at least one of each constraint type active and check the following:

  • primal and dual problem objectives match
  • lambda values returned in the result structure of the primal agree with the sensitivities of the primal objective function with respect to small changes in primal problem decision variables
  • lambda values returned in the result structure of the dual agree with the sensitivities of the dual objective function with respect to small changes in dual problem decision variables
  • lambda values returned in the result structure of the primal agree with the dual problem decision variables
  • lambda values returned in the result structure of the dual agree with the primal problem decision variables

It would be really nice to write a tutorial that shows all this stuff, actually.

Does that make sense or am I confusing the theory?

Yes, this is much along the same lines that I was thinking. I was going to try to set up a simple test today, but I got distracted by other projects. I am a little fuzzy on how to test Lagrangian values corresponding to the bounds, but I'm sure googling around will help with that.

Depending on how much time you want to spend on this, feel free to take a crack at a test or two. It might be a couple of days before I get something going -- no pressure either way.

@mdhaber (Collaborator) commented Feb 15, 2021

As much fun as that would be, I have a feeling I would get sucked in. Can't do that right now : (

I am a little fuzzy on how to test Lagrangian values corresponding to the bounds, but I'm sure googling around will help with that.

I don't know that it's any different? Oops, I see I had a typo: I wrote

sensitivities of the primal objective function with respect to small changes in primal problem decision variables

but meant

sensitivities of the primal objective function with respect to small changes in primal problem constraints

So just add epsilon to each constraint and solve the problem again. The difference in the objective function should be epsilon times the corresponding lambda.

@mdhaber (Collaborator) commented Feb 15, 2021

I couldn't resist. To be honest, this is really all we need for a test. I'm already convinced that it's working fine. All the stuff about the dual problem is unnecessary. It would allow us to get better accuracy, but otherwise it's redundant - the code is already confirmed to be correct; at that point we'd just be illustrating the theory.

It would be nice to refactor to have less copy-paste, but this is really all that's needed for testing:

import numpy as np
from scipy.optimize import linprog

# this random problem generator is already in the test suite as `very_random_gen`
np.random.seed(0)
m_eq, m_ub, n = 10, 20, 50
c = np.random.rand(n)-0.5
A_ub = np.random.rand(m_ub, n)-0.5
b_ub = np.random.rand(m_ub)-0.5
A_eq = np.random.rand(m_eq, n)-0.5
b_eq = np.random.rand(m_eq)-0.5
lb = -np.random.rand(n)
ub = np.random.rand(n)
lb[lb < -np.random.rand()] = -np.inf
ub[ub > np.random.rand()] = np.inf
bounds = np.vstack((lb, ub)).T
res = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds, method='highs')
f0 = res.fun

dfdbub = np.zeros_like(b_ub)  # naming convention: partial derivative of fun w.r.t. b_ub
dfdbeq = np.zeros_like(b_eq)
dfdlb = np.zeros_like(lb)
dfdub = np.zeros_like(ub)

eps = 1e-6

for i in range(len(b_ub)):
    b_ub2 = b_ub.copy()
    b_ub2[i] = b_ub2[i]*(1 + eps)  # relative epsilon steps; probably doesn't matter since the scale is near unity
    res2 = linprog(c, A_ub, b_ub2, A_eq, b_eq, bounds, method='highs')
    dfdbub[i] = (res2.fun - f0)/(b_ub[i]*eps)  # the actual step taken is b_ub[i]*eps

print(np.allclose(dfdbub, res['lambda']['ineqlin']))

for i in range(len(b_eq)):
    b_eq2 = b_eq.copy()
    b_eq2[i] = b_eq2[i]*(1 + eps)
    res2 = linprog(c, A_ub, b_ub, A_eq, b_eq2, bounds, method='highs')
    dfdbeq[i] = (res2.fun - f0)/(b_eq[i]*eps)

print(np.allclose(dfdbeq, res['lambda']['eqlin']))

for i in range(len(lb)):
    lb2 = lb.copy()
    lb2[i] = lb2[i]*(1 + eps)
    bounds2 = np.vstack((lb2, ub)).T
    res2 = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds2, method='highs')
    dfdlb[i] = (res2.fun - f0)/(lb[i]*eps)

print(np.allclose(dfdlb, res['lambda']['lower']))

for i in range(len(ub)):
    ub2 = ub.copy()
    ub2[i] = ub2[i]*(1 + eps)
    bounds2 = np.vstack((lb, ub2)).T
    res2 = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds2, method='highs')
    dfdub[i] = (res2.fun - f0)/(ub[i]*eps)

print(np.allclose(dfdub, res['lambda']['upper']))

Two things about API:

  • The name lambda is problematic because lambda is a reserved keyword in Python. As a result, res.lambda doesn't work and we have to use res['lambda']. Maybe sensitivity?
  • Along the same lines, I'm not sure a plain dict is what we want. It would be more consistent if it were like OptimizeResult in that it allows the fields to be accessed as attributes or as elements in a dict. Maybe it could be a subclass of OptimizeResult - a subclass so it doesn't have all the other fields.
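For reference, the dual access OptimizeResult provides comes essentially from being a dict subclass that forwards attribute access to item access; a minimal sketch of the idea (not SciPy's actual implementation):

```python
class Result(dict):
    """Minimal sketch of an OptimizeResult-style container:
    a dict whose keys are also readable/writable as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__


sens = Result(lower=[0.0], upper=[0.0], eqlin=[1.0], ineqlin=[])
assert sens.eqlin == sens['eqlin']   # both access styles work
```

Note that even with such a container, a field named lambda would still only be reachable as sens['lambda'], because sens.lambda is a syntax error -- which is the naming problem above.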

@mckib2 (Owner, Author) commented Feb 15, 2021

It would be nice to refactor to have less copy-paste, but this is really all that's needed for testing:

Nice! I'll try to do some refactoring and put this into a test case in test_linprog.py.

The name lambda is problematic because of the conflict with built-in lambda. As a result, res.lambda doesn't work and we have to use res['lambda']. Maybe sensitivity?

I'm actually surprised I did this -- I almost always rename lambda to lamda to avoid this clash. If sensitivity makes sense to the people who use this, then that sounds great to me.

Maybe it could be a subclass of OptimizeResult - subclass so it doesn't have all the other fields.

Anything wrong with just a plain NamedTuple?

@mckib2 (Owner, Author) commented Feb 15, 2021

dfdlb[i] = (res2.fun - f0)/(lb2[i]*eps)

Looks like you're doing a finite-difference derivative estimate? IIRC, there are some helpers built into SciPy we could call instead, like approx_fprime. Not sure if there's a way to make SciPy do complex step for approx_fprime, but that would be my go-to.
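For what it's worth, approx_fprime can compute the same forward-difference sensitivities if the solve is wrapped as a scalar function of the RHS; a sketch on a made-up LP (the problem and the wrapper function are illustrative only, not part of this PR):

```python
import numpy as np
from scipy.optimize import linprog, approx_fprime

c = np.array([-1.0, -2.0])           # maximize x1 + 2*x2
A_ub = np.array([[1.0, 1.0],
                 [1.0, 3.0]])
b_ub = np.array([4.0, 6.0])

def objective_given_rhs(b):
    # re-solve the LP with a perturbed RHS and return the optimal value
    res = linprog(c, A_ub=A_ub, b_ub=b, bounds=[(0, None)] * 2,
                  method='highs')
    return res.fun

# forward-difference gradient d(fun)/d(b_ub): the inequality marginals
grad = approx_fprime(b_ub, objective_given_rhs, 1e-6)
print(grad)   # approximately [-0.5, -0.5] for this problem
```

Each LP solve is exact up to solver tolerance, so even a crude forward difference recovers the shadow prices well away from degenerate points.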

@mdhaber (Collaborator) commented Feb 15, 2021

IIRC, there are some built into scipy

Sure. I didn't think of it because we don't want/need a refined higher order estimate that might take multiple evaluations. It just has to be close enough to compare with the information provided by HiGHS. We're not testing the accuracy of HiGHS, just that the output of HiGHS is getting to the right place / that we're interpreting it correctly.
I don't know if HiGHS supports complex values or if the function is analytic. Accuracy isn't important though.
I would say whatever makes the code look cleanest is best.

@mdhaber (Collaborator) commented Feb 15, 2021

Anything wrong with just a plain NamedTuple?

Do they support indexing by key? I think we want both so it works like the rest of the OptimizeResult it's a part of.

Regarding the name, let's draft with "sensitivity" and ask Julian if that sounds good to him. These partial derivatives are referred to as dual variables, shadow prices, and sensitivity; of those I think sensitivity is the most universal.
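A quick check of the namedtuple question (plain Python, nothing PR-specific): namedtuples support attribute and positional access but not key indexing, so they wouldn't match the rest of OptimizeResult:

```python
from collections import namedtuple
from scipy.optimize import OptimizeResult

Sens = namedtuple('Sens', ['lower', 'upper', 'eqlin', 'ineqlin'])
nt = Sens([0.0], [0.0], [1.0], [])

print(nt.eqlin)      # attribute access works
print(nt[2])         # positional indexing works
try:
    nt['eqlin']      # key indexing does not: tuple indices must be integers
except TypeError as err:
    print(err)

# OptimizeResult (a dict subclass) supports both styles
opt = OptimizeResult(lower=[0.0], eqlin=[1.0])
assert opt.eqlin == opt['eqlin']
```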

@mckib2 (Owner, Author) commented Feb 16, 2021

Sure. I didn't think of it because we don't want/need a refined higher order estimate that might take multiple evaluations.

I needed to do central differences and choose a good step size to get reasonable accuracy (as expected from finite differences). You're right, and I didn't even think it through -- HiGHS is not built to handle complex input, so complex step will not work here.

Do they support indexing by key? I think we want both so it works like the rest of the OptimizeResult it's a part of.

I made the inner dict an OptimizeResult for consistency, I like that.

Regarding the name, let's draft with "sensitivity" and ask Julian if that sounds good to him.

Sounds good to me, I've made the change to sensitivity.

The test is in there after playing around with it for a little bit, please take a look when you have time to make sure I'm not doing something silly or hard to follow. The only thing left might be updating the docs?

lo = _kwargs[prop].copy()
hi[ii] += h
lo[ii] -= h
hi_kwargs = {k: (v if k != prop else hi) for k, v in _kwargs.items()}
@mckib2 (Owner, Author):

@mdhaber You may or may not like what I've done here. There's probably a better way to do this -- this is just the first thing I thought of

@mdhaber (Collaborator):

I think I had something like this in mind, but it would be a lot simpler if you could do just a forward step. Does the code I provided not run without error for you? That is just a forward step.

@mckib2 (Owner, Author):

I went straight for replacing the random program generator with very_random_gen(), but with everything else the same, the differences were too large to pass at the default tolerances and I couldn't find a step size that worked for everything. I'll check again tomorrow. I agree a forward step would make this easier to read.

@mdhaber (Collaborator) Feb 16, 2021:

I see. I think my original code didn't pass. I wasn't using np.testing.assert_allclose; it was just np.allclose : ) Well, let me think about it. It might be best just to change the tolerances.

@mckib2 (Owner, Author):

Might have been an implementation issue -- I just did a quick forward step and it's working so I removed the central difference in favor of that

@mdhaber (Collaborator):

?! I tested it myself today, and it didn't look to me like it would pass with simple forward step! How are we flip-flopping?

Did you see my #12 though?

@mckib2 (Owner, Author):

I did not see that PR -- I apologize! I'm surprised I didn't get an email about it, I'll have to see if there's a setting I need to toggle...

I'm happy to use whichever implementation you think is more maintainable -- I'm probably not going to mess around with it anymore other than integration if needed

@mckib2 (Owner, Author) Feb 17, 2021:

So I think what's going on here is that I'm just getting lucky. I'm using naive implementations of numerical differentiation without regard for the scaling of the variables, whereas you were trying to take that into account. For this problem, it appears the naive method wins out, but I like the approach in #12 better, so I think I'm going to pull that in.

@mdhaber (Collaborator):

I also found that absolute steps were better, but still wasn't getting within default tolerance for me. Whatever!

@@ -606,6 +606,7 @@ def linprog(c, A_ub=None, b_ub=None, A_eq=None, b_eq=None,
_check_result(sol['x'], sol['fun'], sol['status'], sol['slack'],
sol['con'], lp.bounds, tol, sol['message']))
sol['success'] = sol['status'] == 0
sol['sensitivity'] = OptimizeResult(sol['sensitivity'])
@mdhaber (Collaborator):

Originally I said that it should be a subclass of OptimizeResult because I thought OptimizeResult itself had some required fields, but it looks like it doesn't. The documentation states that OptimizeResult will have a bunch of fields that linprog already doesn't return, so I think we just need to adjust the documentation at some point to reflect the fact that OptimizeResult isn't guaranteed to have anything in particular.

@mckib2 (Owner, Author):

Whoops -- forgot to subclass! Not sure if subclassing will do anything useful here if the docs for OptimizeResult should be changed anyway

'nit': res.get('simplex_nit', 0) or res.get('ipm_nit', 0),
'crossover_nit': res.get('crossover_nit'),
}
sol = {
@mdhaber (Collaborator) Feb 16, 2021:

has anything changed other than the addition of sensitivity and reformatting?
Update: hid whitespace changes. No.

'slack': slack,
# TODO: Add/test dual info like:
# 'lambda': res.get('lambda'),
# 's': res.get('s'),
@mdhaber (Collaborator):

What was s?

@mckib2 (Owner, Author):

s is the same as in _highs_wrapper.pyx. It looks like once upon a time I had planned on passing it straight through, but this can now be removed

@mckib2 (Owner, Author) Feb 17, 2021:

More specifically, if lambda are the Lagrange multipliers associated with the constraints Ax = b, then s are the Lagrange multipliers associated with x >= 0. They get transformed into the sensitivity fields in _highs_wrapper.pyx

@mdhaber (Collaborator):

OK. That information ends up in lower/upper?

@mckib2 (Owner, Author) Feb 17, 2021:

Sorry, I was incorrect before. s is returned by _highs_wrapper.pyx and _linprog_highs.py translates that into the lower/upper fields. So yes, that information makes it into lower/upper

@mckib2 (Owner, Author):

Now that I'm looking at this again, it might be something to ask Julian whether I'm doing correctly. Each s value has a corresponding basis status that tells whether the variable is at its lower or upper bound. There are other status values, but I just set the marginal to 0 if it's not one of those two

@mdhaber (Collaborator):

Hmm. Yeah good thing to check. Also ask what he thinks of the documentation and name sensitivity, please.

@mdhaber (Collaborator) commented Feb 19, 2021

first draft of docstring additions (might need a reference?)

I don't think references are needed in the documentation for this unless we think there is a really helpful general reference for LP sensitivity analysis

Possible additional test:
There's a neat identity in the MATLAB documentation about lambda that we could add as a test.

Names:
Looked into other software's conventions. Some relevant links and names of variables or functions:
Matlab - just lambda
Maple page 157 - confusing
CPLEX - organizes sensitivity into obj and rhs with reduced cost/shadow prices and ranges in each
gurobi - pi for shadow prices, RC for reduced costs, SARHSUp/SARHSLow/SAObjUp/SAObjLow for ranges.
MOSEK - leftprice, rightprice, leftrange, rightrange
LINDO - variable names unclear, this talks about a report
SAS - variable names unclear to me; this seems to be about reports and interactive features
XPRESS - XPRSrhssa, XPRSobjsa, dual
GLPK - "Marginal", "Activity", "Lower Bound", "Upper Bound"

What a mess. There doesn't seem to be much agreement.
"marginal" might be a good word for the duals themselves. I think it is more descriptive than "dual" or "price". Maybe within the sensitivity object we have marginals and ranges? Or is that too much nesting of OptimizeResults?

I'm not sure what the best way is to clarify the distinction between e.g. the lower bounds (the constraints themselves) and the lower limit of a "range" (for which the dual variables are valid / basis is the same).

This search brought up more questions:

  • Should we explicitly indicate somewhere which variables are in the basis? (In degenerate problems, basic variables can be equal to zero, so we can't just rely on the values of the decision variables.)
  • Does degeneracy have any implications for the accuracy/reliability of the results returned by HiGHS?
  • Do we want to have a separate variable for the "reduced costs", or can we rely on the user to know that the thing called "reduced cost" is zero for basic variables and, for nonbasic variables, equal to the allowable decrease on the objective coefficients?
  • Some solvers (e.g. MOSEK) seem to have different left/right dual values (that are always equal in the examples given). Would HiGHS ever report separate values to the left and right?

@mckib2 (Owner, Author) commented Feb 21, 2021

Wow! Thanks for looking around at all those, this is really good information.

There's a neat identity in the MATLAB documentation about lambda that we could add as a test.

I believe this is the same as complementary slackness; writing a test for it now.

What a mess. There doesn't seem to be much agreement.
"marginal" might be a good word for the duals themselves. I think it is more descriptive than "dual" or "price". Maybe within the sensitivity object we have marginals and ranges? Or is that too much nesting of OptimizeResults?

What a mess, indeed. I'm personally fine with nesting -- it gives a little more semantic meaning to what those values are. I think we just need to decide on something that is intelligible and can be easily understood when you read the docs. I would also like to keep track of these mappings and put them in the docs somewhere, I think it's useful. MOSEK seems to me to be the most "principled" of the bunch, but for some reason I still like the way you've proposed with sensitivity -> [marginals, ranges] (even though marginals introduces yet another name for the duals...)
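The complementary-slackness identity in question can be checked directly (a sketch on a made-up primal/dual pair, independent of this PR's API): at optimality, each inequality dual times its slack is zero, and each primal variable times its reduced cost is zero:

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([-1.0, -2.0])           # maximize x1 + 2*x2
A = np.array([[1.0, 1.0],
              [1.0, 3.0]])
b = np.array([4.0, 6.0])

# primal: min c @ x s.t. A @ x <= b, x >= 0
primal = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2, method='highs')
# dual (as a minimization): min b @ y s.t. -A.T @ y <= c, y >= 0
dual = linprog(b, A_ub=-A.T, b_ub=c, bounds=[(0, None)] * 2, method='highs')
x, y = primal.x, dual.x

# Complementary slackness:
#   y_i * (b - A x)_i == 0   (a dual is nonzero only for an active constraint)
#   x_j * (A.T y + c)_j == 0 (the reduced cost is zero for any nonzero variable)
print(y * (b - A @ x))       # ~[0, 0]
print(x * (A.T @ y + c))     # ~[0, 0]
```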

@mckib2 (Owner, Author) commented Feb 21, 2021

@mdhaber I think the questions you pose might be good to run by Julian in an email.

As for including ranging information in this PR, it looks like we'll have to update HiGHS in order to get it. SciPy is currently using a HiGHS snapshot from the end of last year, before the ranging information was in their master. Do you want to include the upgrade in this PR, or only include sensitivities here and add ranging information in a follow-up PR? EDIT: looking into how difficult this update is. If it's not bad, then I'll go for it.

@mckib2 (Owner, Author) commented Feb 21, 2021

Updates:

I'm not entirely sure what I'm looking at with the ranging information, so I need to spend some time figuring out what that is and how we need to transform the information to make it useful for us. I did end up not nesting the sensitivity information just because it was cumbersome to keep typing out the nested layer, but that can be easily changed if needed

if highs.getRanging(highsRanging) == HighsStatusOK:
    ranging = {
        'col_cost_up': {
            'val': [highsRanging.col_cost_up.value_[ii] for ii in range(highsRanging.col_cost_up.value_.size())],
@mckib2 (Owner, Author):

I need to re-evaluate this -- I think there's a better way to copy the contents of a std::vector into a numpy array, or to do it in a way that doesn't involve a copy

@@ -653,6 +654,7 @@ def _highs_wrapper(

# We might need an info object if we can look up the solution and a place to put solution
cdef HighsInfo info = highs.getHighsInfo() # it should always be safe to get the info object
cdef HighsRanging highsRanging
@mckib2 (Owner, Author):

All of these are stack allocated and copied into; would be nice to create a constant ref (doesn't work in Cython if I recall correctly). Maybe some ptr juggling would be good for these HighsRanging, HighsSolution, and HighsBasis structs

c, A_ub, b_ub, A_eq, b_eq, bounds = very_random_gen(seed=0)
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
bounds=bounds, method=self.method, options=self.options)
print(res)
@mckib2 (Owner, Author):

A dummy test to stop and look at the HighsRanging results

@jajhall commented Feb 22, 2021

When testing ranging data the way that you are, make sure that the LP isn't primal or dual degenerate. If it's degenerate, then the dual information may not be valid for a positive change in a particular cost or bound value (call this value data_). To see whether this is the case, look at the value_ entry in a particular HighsRangingRecord. Your epsilon must not be bigger than |value_-data_|.

@mdhaber (Collaborator) commented Feb 22, 2021

Thanks @jajhall. Tests passed, and we hadn't thought to look for ranging data yet, so it didn't make it into the test. We can add it. (I know that degeneracy is more common than one might expect -- does that hold even for randomly generated problems?)

Speaking of degeneracy, suppose we have a 2D problem. Three inequality constraints intersect at a point, and that point is the optimal solution.
[image: three inequality constraints intersecting at the optimal point]

A given finite change in the RHS of any one of those constraints would cause different magnitude changes in the objective depending on whether the change is positive or negative. In such a case, I assume:

  • the dual values provided by HiGHS will agree with changes in one of these two directions (depending on which constraint happens to be in the basis) and
  • a finite step in the RHS in at least one direction (positive or negative) will go beyond the range provided by HiGHS.

If that makes sense, does it sound like my understanding is correct?

I ask because I saw in Mosek something about leftprice and rightprice, which made me wonder if there could be a different shadow price depending on the direction of the change in a constraint RHS. In their example here, the left and right prices (sigma_1 and sigma_2, I think) are all equal, but I imagine they could be different in a case like the one above. If this sort of information is not available now, would there be any plan to add this sort of information to HiGHS in the future?

(I don't need it; I'm just asking so we can design the API such that changes can be made in a backwards-compatible way.)
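The 2-D picture above can be reproduced numerically (a sketch with a made-up degenerate LP, not tied to HiGHS ranging output): three constraints active at the optimum give different one-sided derivatives of the objective with respect to one RHS:

```python
import numpy as np
from scipy.optimize import linprog

# max x1 + x2  s.t.  x1 <= 1,  x2 <= 1,  x1 + x2 <= 2,  x >= 0
# All three inequalities are active at the optimum (1, 1): degenerate.
c = [-1.0, -1.0]
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 2.0])

def fun(b3):
    # optimal value as a function of the RHS of the third constraint
    bb = b.copy()
    bb[2] = b3
    return linprog(c, A_ub=A, b_ub=bb, bounds=[(0, None)] * 2,
                   method='highs').fun

eps = 1e-6
f0 = fun(2.0)
right = (fun(2.0 + eps) - f0) / eps  # relaxing x1 + x2 <= 2 doesn't help: ~0
left = (f0 - fun(2.0 - eps)) / eps   # tightening it hurts: ~-1
print(right, left)
```

Whichever single marginal HiGHS reports for this constraint can only agree with one of these two one-sided derivatives, which is exactly the situation described above.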

Also, we seem to be heading toward the name marginals for the shadow prices. Is that preferable to sensitivity to avoid confusion with the ranging data?

@mckib2 (Owner, Author) commented Mar 7, 2021

Thoughts and status:

  • I've reorganized the ranging dictionary into costs, bounds, and constraints
    • each field has its own up/down subfields
    • costs corresponds to SciPy coefficients c and HiGHS col_cost_[up|down]
    • bounds corresponds to SciPy bounds and HiGHS col_bnd_[up|down]
    • constraints corresponds to SciPy concat(b_ub, b_eq) and HiGHS row_bnd_[up|down]
    • I think yet another reorg is in order -- maybe to mirror the structure of marginals (upper/lower/ineqlin/eqlin) -- definitely to split constraints into the b_ub and b_eq components
  • To interpret the ranging information, column/row basis statuses are required and have been added to the ranging dictionary. To include this in the release of course would require exposing the HIGHS_BASIS_STATUS_* constants to end users. I'll see if I can mess around with it a little more to avoid having to expose the basis status information
  • Collecting and building ranging information is not a no-op, so I created another _linprog_highs option named ranging : bool to instruct the wrapper whether or not to return this information. marginals are currently always returned because they are already cheap to provide
  • I'm working on ranging tests -- I haven't run into any degeneracy problems yet with using random problems
  • As far as usability, the nested dictionaries can be cumbersome. Would a nested OptimizeResult-like object be more appealing? That would allow for "dot" accessors instead of dictionary keying

@mdhaber (Collaborator) commented Mar 7, 2021

Would a nested OptimizeResult-like object be more appealing?

Yes

@mckib2 mckib2 changed the base branch from master to biased-urn March 7, 2021 07:11
@mckib2 mckib2 changed the base branch from biased-urn to master March 7, 2021 07:11
@jajhall commented Mar 7, 2021

Collecting and building ranging information is not a no-op, so I created another _linprog_highs option named ranging : bool to instruct the wrapper whether or not to return this information. marginals are currently always returned because they are already cheap to provide

It certainly isn't cheap. It requires the whole of the standard simplex tableau to be computed - one column at a time.

One HiGHS option that I'm planning to introduce is the calculation of ranging information for a specific set of rows or a specific set of columns - in case users are only interested in certain variables or constraints and want to avoid the big overhead.

Note that ranging information is only available when the model status is optimal.

@mckib2 (Owner, Author) commented Mar 13, 2021

Updates:

  • Created RangingInfo and RangingFields classes that inherit from OptimizeResult
    • A RangingFields object exists for ineqlin, eqlin, costs, and bounds
      • For bounds, lower and upper apparently can't be separated from each other -- in the case of bounds[j][0] == bounds[j][1] (equality), both bounds change simultaneously
    • have val, fun, basis_in, basis_out (and fac) fields
  • column/row basis status are no longer necessary to interpret and use results
    • an additional fac sign is required for ranging.up.ineqlin in the case of basic row statuses to instruct the user how to convert upper bounds to equivalent lower bounds
  • ranging tests now complete (?)

In general, the code needs to be cleaned up; @mdhaber, any suggestions about code organization are welcome. Lots of code in _linprog_highs.py could probably be moved to _highs_wrapper.pyx. Also, please look at the RangingInfo docstring to see how we can document these structures and make them easy to understand.

@jajhall If you get a chance, can you look at @mdhaber's earlier comment? I'd be interested to know how HiGHS handles this case as well

@jajhall commented Mar 13, 2021

For bounds, lower and upper apparently can't be separated from each other -- in the case of bounds[j][0] == bounds[j][1] (equality), both bounds change simultaneously

Ranging information cannot be given for both bounds of a boxed variable/constraint. If the variable/constraint is nonbasic, then the bound ranging information relates to the bound at which the variable/constraint is active.

column/row basis status are no longer necessary to interpret and use results

Basic/nonbasic status for "bound" ranging does matter.

  • For a basic variable, the "bound" ranging information relates to the minimum and maximum values that the variable can take either side of its current value (with the current basis remaining optimal). Bounds only come into play as a limit on the value that a basic variable/constraint can reach.

  • For a nonbasic variable/constraint, the ranging information relates to the minimum and maximum values that can be taken by the bound at which the variable/constraint is active (with the current basis remaining optimal).

For cost ranging the basic/nonbasic status isn't quite so important. Cost ranging relates to the minimum and maximum values that the cost can take (with the current basis remaining optimal).
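As a toy illustration of that last point (pure Python by brute force, not HiGHS): when maximizing c1*x1 + x2 over the unit box, the vertex (1, 1) is optimal for any c1 in the cost range (0, inf); moving c1 within that range leaves the optimal basis unchanged, while stepping outside it changes the optimal vertex.

```python
def argmax_vertex(c1):
    """Optimal vertex of: maximize c1*x1 + 1.0*x2  s.t.  0 <= x1, x2 <= 1.
    For a box, the optimum is always attained at one of the four corners."""
    vertices = [(0, 0), (1, 0), (0, 1), (1, 1)]
    return max(vertices, key=lambda v: c1 * v[0] + 1.0 * v[1])


print(argmax_vertex(0.5))   # (1, 1) -- inside the cost range, basis unchanged
print(argmax_vertex(5.0))   # (1, 1) -- still inside
print(argmax_vertex(-0.5))  # (0, 1) -- outside the range, optimal basis changes
```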

jajhall commented Mar 13, 2021

> Speaking of degeneracy, suppose we have a 2D problem. Three inequality constraints intersect at a point, and that point is the optimal solution.
> A given finite change in the RHS of any one of those constraints would cause different magnitude changes in the objective depending on whether the change is positive or negative. In such a case, I assume:
>
>   • the dual values provided by HiGHS will agree with changes in one of these two directions (depending on which constraint happens to be in the basis) and
>   • a finite step in the RHS in at least one direction (positive or negative) will go beyond the range provided by HiGHS.
>
> If that makes sense, does it sound like my understanding is correct?

I think that your understanding is correct, but it's not expressed too clearly.

[image: sketch of three inequality constraints (purple, green, yellow) whose boundaries intersect at the optimal vertex]

In an example of this type, there are three possible bases at the optimal solution. This is because there are three constraints, so three basic variables, two of which are the original variables x1 and x2. The three slacks are all zero, so any one of them could complete the basis. [Or, equivalently, any two of the zero variables could be chosen to be nonbasic so that the values of the two original variables are given by the solution of the 2x2 system. The value of the basic slack is naturally zero since the three lines intersect at a point.]

In your specific example, - with original variables x1 and x2 - here are observations on the ranging for the two optimal bases:

  • Purple-Green nonbasic: If the purple constraint is p^Tx<=P, then P cannot be increased at all, otherwise the slack on yellow goes negative. P can be reduced until x2 is zeroed. If the green constraint is g^Tx<=G, then (as for purple) G cannot be increased at all, otherwise the slack on yellow goes negative. G can be reduced until x1 is zeroed. If the yellow constraint is y^Tx<=Y, then its (basic) slack can be increased until the first of x1 and x2 is zeroed. Its slack cannot be reduced at all since it is zero.

Subtly different...

  • Yellow-Green nonbasic: If the yellow constraint is y^Tx<=Y, then Y cannot be increased at all, otherwise the slack on purple goes negative. Y can be reduced until x2 is zeroed. However, if the green constraint is g^Tx<=G, then G can be increased until x2 is zeroed, but it cannot be decreased at all, otherwise the slack on purple goes negative. If the purple constraint is p^Tx<=P, then its (basic) slack can be increased until x2 is zeroed. Its slack cannot be reduced at all since it is zero.

So, yes, in the nonbasic cases (where ranging does not allow bounds to be changed by a positive amount in one direction), "a finite step in the RHS in at least one direction (positive or negative) will go beyond the range provided by HiGHS". In the direction where ranging allows a positive change, "the dual values provided by HiGHS will agree with changes in [that] direction".
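The directional effect can be checked numerically with a throwaway brute-force solver (a toy by vertex enumeration, not HiGHS). Take maximize x1 + x2 subject to x1 <= 1, x2 <= 1, x1 + x2 <= 2: all three constraints pass through the optimum (1, 1). Relaxing the third constraint's RHS leaves the objective unchanged (right shadow price 0), while tightening it reduces the objective at unit rate (left shadow price 1).

```python
from itertools import combinations


def solve_2d_lp(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, by enumerating the
    intersection points of constraint-boundary pairs and keeping the
    best feasible one. Only suitable for tiny 2D examples."""
    rows = A + [(-1.0, 0.0), (0.0, -1.0)]  # append x1 >= 0, x2 >= 0
    rhs = b + [0.0, 0.0]
    best = None
    for i, j in combinations(range(len(rows)), 2):
        (a11, a12), (a21, a22) = rows[i], rows[j]
        det = a11 * a22 - a12 * a21
        if abs(det) < 1e-12:
            continue  # parallel boundaries, no intersection
        x1 = (rhs[i] * a22 - a12 * rhs[j]) / det  # Cramer's rule
        x2 = (a11 * rhs[j] - rhs[i] * a21) / det
        if all(r[0] * x1 + r[1] * x2 <= v + 1e-9 for r, v in zip(rows, rhs)):
            val = c[0] * x1 + c[1] * x2
            best = val if best is None else max(best, val)
    return best


A = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # three constraints meeting at (1, 1)
c = (1.0, 1.0)
eps = 1e-3
f0 = solve_2d_lp(c, A, [1.0, 1.0, 2.0])
f_up = solve_2d_lp(c, A, [1.0, 1.0, 2.0 + eps])  # relax x1 + x2 <= 2
f_dn = solve_2d_lp(c, A, [1.0, 1.0, 2.0 - eps])  # tighten it
print((f_up - f0) / eps)  # right shadow price, ≈ 0
print((f0 - f_dn) / eps)  # left shadow price, ≈ 1
```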

> I ask because I saw in Mosek something about leftprice and rightprice, which made me wonder if there could be a different shadow price depending on the direction of the change in a constraint RHS. In their example here, the left and right prices (sigma_1 and sigma_2, I think) are all equal, but I imagine they could be different in a case like the one above. If this sort of information is not available now, would there be any plan to add this sort of information to HiGHS in the future?
>
> (I don't need it; I'm just asking so we can design the API such that changes can be made in a backwards-compatible way.)

This looks as if they have identified the shadow price as the solution leaves a degenerate vertex. To me this means performing simplex basis changes - which sounds expensive - but I'll think more about it.

> Also, we seem to be heading toward the name marginals for the shadow prices. Is that preferable to sensitivity to avoid confusion with the ranging data?

Yes, makes sense.

mdhaber (Collaborator) commented Mar 17, 2021

Thanks @jajhall!
