Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize._linprog() sparse matrix input doc update/convert sparse parameters to dense for simplex methods #14488

Closed
wants to merge 8 commits into from

Conversation

dcb2124
Copy link

@dcb2124 dcb2124 commented Jul 27, 2021

Reference issue

#14471

What does this implement/fix?

ENH: optimize._linprog() will now try to convert sparse matrix to dense for A_ub and A_eq parameters if method='simplex' or 'revised simplex', since they do not accept sparse matrices.

DOC: Added language to documentation for A_ub and A_eq parameters to reflect that they accept sparse matrix but will try to convert to dense if method='simplex' or 'revised simplex'.

for A_ub and A_eq parameters if method='simplex' or 'revised simplex',
since they do not accept sparse matrices.

DOC: Added language to documentation for A_ub and A_eq parameters to
reflect that they accept sparse matrix but will try to convert to dense
if method='simplex' or 'revised simplex'.
@dcb2124 dcb2124 requested a review from andyfaff as a code owner July 27, 2021 22:13

if 'simplex' in meth:
A_eq = A_eq.todense()
A_ub = A_ub.todense()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only work for sparse input, right? I suggest following _LPProblem or _parse_linprog - I can't remember which - to find out where the input validation is. Try to find other input validation on A_eq and A_ub and put this there, and only convert todense if issparse.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, the issue here is that, the way I've written it, if someone choose a simplex method, but submits a dense matrix/array, then this will raise an exception. So, it should check first that it needs to convert, and then only do so if issparse. We want the user to be able to submit either sparse or dense without making _linprog() angry.

_parse_linprog() calls check_sparse_inputs(), which at one point calls sparse.issparse(A_eq) to see if the inputs are sparse matrices. I think it would make sense to just call sparse.isspsarse directly because check_sparse_inputs() will throw an exception if the the sparsity constraint and the method are not compatible.

I'll push changes shortly to reflecting the above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify...

Exactly.

I think it would make sense to just call...

I'm not sure about that. It would be unusual for the code to convert the matrices to dense and then check whether they are sparse, right? They can't be sparse if our code has already changed them to dense.

Instead, we'd want to change that code in _check_sparse_inputs if it is no longer what we want to do.

Let's hold off on changes while we wait back for @mckib2's opinion, though. I had forgotten that this check was in there, and I didn't realize that the code currently raises an exception for sparse input. Producing an error and directing users to use one of the HiGHS methods in this case might be better thing than converting to dense.

scipy/optimize/_linprog.py Outdated Show resolved Hide resolved
@@ -207,13 +207,17 @@ def linprog(c, A_ub=None, b_ub=None, A_eq=None, b_eq=None,
The coefficients of the linear objective function to be minimized.
A_ub : 2-D array, optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
A_ub : 2-D array, optional
A_ub : 2-D array or sparse matrix, optional

Below (for A_eq), too.
Also, all the method-specific documentation is in _linprog_doc.py. Please edit this everywhere in there, too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have correctly added this in all the method-specific documentation, but if it's true for every method, we would want to have it here in the top-level linprog documentation, too.

Copy link
Author

@dcb2124 dcb2124 Jul 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My fault again. I misread what you were saying in this suggestion, and again just went through too fast and wasn't thinking

#14488 (comment)

Will correct if further commits necessary.

scipy/optimize/_linprog.py Outdated Show resolved Hide resolved
Copy link
Contributor

@mdhaber mdhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dcb2124! A few suggestions.

A_ub parameters of _linprog accept sparse input, convert sparse to
dense.

Corrected previous commit to check that input is sparse before trying to
convert to dense.
@dcb2124 dcb2124 requested a review from mdhaber July 28, 2021 00:51
Comment on lines 56 to 57
coefficients of a linear inequality constraint on ``x``. Converts
input to dense array.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
coefficients of a linear inequality constraint on ``x``. Converts
input to dense array.
coefficients of a linear inequality constraint on ``x``.

This part was only needed in the method-specific documentation for the simplex and revised simplex methods. For the other inputs, the plan was to leave them sparse, right? But again, let's wait for @mckib2's input on this now that we see that our plan conflicts a bit with a previous decision.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's my bad. I just went through an added it to all of them without thinking, I meant to just keep it to highs and interior point. But I will hold off on correcting that as you suggest.

Copy link
Contributor

@mdhaber mdhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very close. Let's hold off on corrections while we check on the opinions of @mckib2 @Kai-Striega as to whether we should retain the current behavior - raise an error, e.g.

ValueError: Method 'revised simplex' does not support sparse constraint matrices. Please consider using one of {'highs-ipm', 'highs-ds', 'highs'}.

or if we should convert sparse input to to dense for simplex and revised simplex.

@mckib2 @Kai-Striega The reason I was suggesting that we convert to dense is because in the top level documentation I wanted to be able to write:

A_eq : 2-D array or sparse matrix, optional

without the caveat that actually, some methods support sparse input and some don't. If you think we should keep the current error and not convert to dense, how would you suggest we document the accepted type(s) in the top-level linprog documentation?

@dcb2124
Copy link
Author

dcb2124 commented Jul 28, 2021

I just looked at the tests apparently they are failingbecause /optimize/tests/test_linprog.py calls test_sparse_constraints() which fails if there is it's a simplex method with sparse inputs that does not raise ValueError. Does this affect whether the change should be made about trying to convert to dense?

if self.method in {'simplex', 'revised simplex'}:
# simplex and revised simplex should raise error
with assert_raises(ValueError, match=f"Method '{self.method}' "
"does not support sparse constraint matrices."):
linprog(c=c, A_eq=A_eq, b_eq=b_eq, bounds=bounds,
method=self.method, options=self.options)

Comment on lines 594 to 596
A_eq = A_eq.todense()
if sps.issparse(A_ub):
A_ub = A_ub.todense()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize that todense returns a np.matrix rather than np.ndarray (gh-14494). If we go this route, let's make these toarray.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I thought that was a little weird, I haven't seen np.matrix used much

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very timely. I'm preparing a PR that removes all unreasonable uses of .todense from the codebase.

@mckib2
Copy link
Contributor

mckib2 commented Jul 29, 2021

Very close. Let's hold off on corrections while we check on the opinions of @mckib2 @Kai-Striega as to whether we should retain the current behavior - raise an error, e.g.

ValueError: Method 'revised simplex' does not support sparse constraint matrices. Please consider using one of {'highs-ipm', 'highs-ds', 'highs'}.

or if we should convert sparse input to to dense for simplex and revised simplex.

@mckib2 @Kai-Striega The reason I was suggesting that we convert to dense is because in the top level documentation I wanted to be able to write:

A_eq : 2-D array or sparse matrix, optional

without the caveat that actually, some methods support sparse input and some don't. If you think we should keep the current error and not convert to dense, how would you suggest we document the accepted type(s) in the top-level linprog documentation?

I suppose this is a breaking change for anyone relying on linprog to raise a ValueError when sparse matrices are passed using method='[simplex|revised simplex]'. I am supportive of the motion to make the interface more consistent but I still wonder how many people/projects this would impact -- it shouldn't be very many, but if you are relying on this behavior I suspect a chance of it not being an easy upgrade. In most cases simply switching to a HiGHS method should be sufficient. Is there precedent for this kind of change? If not, I say we announce it on the mailing list and then proceed if no one says anything

@perimosocordiae
Copy link
Member

I'd recommend raising a SparseEfficiencyWarning if either input is converted from sparse->dense. This would allow us to point users in the direction of more efficient methods, but can also be ignored by users if they know what they're doing.

@mdhaber
Copy link
Contributor

mdhaber commented Jul 29, 2021

Great suggestions.

@dcb2124 please proceed with the original plan by implementing the corrections we've discussed:

  • instead of raising an error in test_sparse_constraints, convert toarray if the array is sparse and one of the old simplex methods is being used
  • in top-level documentation, list "2D array or sparse matrix" for the allowed types, potentially adding to the explanation that a sparse matrix will be converted to an array if one of the old simplex methods is selected.
  • In the method-specific documentation of each method, list "2D array or sparse matrix" for the allowed types. In the method-specific documentation of the old simplex methods, mention that a sparse matrix will be converted to array.

As @perimosocordiae suggests, please raise a scipy.sparse.SparseEfficiencyWarning whenever the conversion is performed.

@perimosocordiae, do you know if toarray does any sort of memory check before proceeding?

@mckib2 Yes, we should mention this on the mailing list and mention it in the release notes. But I don't think the error was documented originally, and we are making the function work where it didn't before, so I'm guessing we are pretty safe to make this change.

@dcb2124
Copy link
Author

dcb2124 commented Jul 29, 2021

I think I'm gonna word the conversion part of the doc like this, "If method = 'simplex' or 'revised simplex', and input is sparse, converts to dense array." Just so it is perfectly clear.

@dcb2124
Copy link
Author

dcb2124 commented Jul 29, 2021

Also, the SparseEfficiencyWarning - should it direct to particular method, or should it just warn them that other methods may be more efficient?

@mdhaber
Copy link
Contributor

mdhaber commented Jul 29, 2021

I think I'm gonna word the conversion part of the doc like this

Sounds good.

should it direct to particular method, or should it just warn them that other methods may be more efficient?

I think the error suggests using the HiGHS methods. Please do that.

…eters

ENH: _linprog converts sparse matrix inputs to dense when method='simplex' or
'revised simplex' and gives SparseEfficiencyWarning.

DOC: Updates top level docs for _linprog to clarify that sparse
matrices are accepted for A_eq and A_ub parameters, and tries to convert
to dense array if a simplex method is used. In method-specific docs:
simplex methods note sparse is converted to dense; non-simplex note
sparse is accepted.

TST: Modified test_linprog::test_sparse_constraints to check that
SparseEfficiecnyWarning is given when sparse input is used with simplex
method; no longer checks for ValueError to be raised under these
conditions.
Comment on lines +599 to +601
conversion_message = '''Converting sparse input {0} to dense array
because a simplex method is used. May run out of memory if {0} is very
large; consider HiGHS method instead.'''.replace('\n', ' ')
Copy link
Contributor

@mdhaber mdhaber Jul 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically for multi-line string we surround in (), e.g.:

x = ("this is"
     "one string.")

I'm not sure whether PEP8 cares, but maybe check with flake8. Our PEP8 checks on CI are not very strict.

A_eq = A_eq.toarray()
if sps.issparse(A_ub):
warn(conversion_message.format('A_ub'), SparseEfficiencyWarning)
A_ub = A_eq.toarray()
Copy link
Contributor

@mdhaber mdhaber Jul 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
A_ub = A_eq.toarray()
A_ub = A_ub.toarray()

We need a unit tests to check for this sort of thing. Add a unit test for revised simplex and simplex that makes sure they correctly solve a problem given sparse input.

@@ -588,6 +594,19 @@ def linprog(c, A_ub=None, b_ub=None, A_eq=None, b_eq=None,
warning_message = "x0 is used only when method is 'revised simplex'. "
warn(warning_message, OptimizeWarning)

# convert sparse to dense array if a simplex method is used and warn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic looks correct now, but again, please do not put it here. You found that there is a function called _check_sparse_inputs, and it's purpose is very closely aligned with what you are doing:

Check the provided ``A_ub`` and ``A_eq`` matrices conform to the specified
optional sparsity variables.

It already has logic very similar to what you're adding here:

    if meth in dense_methods and sparse_constraint:
        raise ValueError(f"Method '{meth}' does not support sparse "
                         "constraint matrices. Please consider using one of "
                         f"{preferred_methods}.")

except that it raises an error instead of performing the conversion.

If we were to leave the existing code in _check_sparse_inputs, it would be unreachable because you would have already made A_eq and A_ub dense. Also, because _check_sparse_inputs exists, it makes sense to put related code inside that function rather than directly in linprog`.

So please remove this code from linprog and replace the part of _check_sparse_inputs that raises an error with your code that performs conversions and raises the warning.

# simplex and revised simplex should raise error
with assert_raises(ValueError, match=f"Method '{self.method}' "
"does not support sparse constraint matrices."):
with assert_warns(scipy.sparse.SparseEfficiencyWarning):
linprog(c=c, A_eq=A_eq, b_eq=b_eq, bounds=bounds,
Copy link
Contributor

@mdhaber mdhaber Jul 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To address the comment above about unit testing, you can just check that linprog correctly solves this problem.

Since the purpose of the test is changing from checking that an error is raised to checking for correctness, you might find that another example problem - e.g. one to which you already know the answer - would be more appropriate.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not very familiar with writing tests...would this be the way to do it? Thanks for your patience

    if self.method in {'simplex', 'revised simplex'}:
        #keep assert_warns check or no? 
        with assert_warns(scipy.sparse.SparseEfficiencyWarning):
            linprog(c=c, A_eq=A_eq, b_eq=b_eq, bounds=bounds,
                    method=self.method, options=self.options)

        #this is the correct way to check the problem is solved correctly? 
        res = linprog(c=c, A_eq=A_eq, b_eq=b_eq, bounds=bounds,
                      method=self.method, options=options)
        assert res.success

    else:
        # other methods do not warn
        options = {**self.options}
        if self.method in {'interior-point'}:
            options['sparse'] = True

        res = linprog(c=c, A_eq=A_eq, b_eq=b_eq, bounds=bounds,
                      method=self.method, options=options)
        assert res.success

Copy link
Contributor

@mdhaber mdhaber Aug 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#keep assert_warns check or no?

Yes, you want to check for the warning. You can condense things a little:

    if self.method in {'simplex', 'revised simplex'}:
        # Old simplex methods convert sparse to dense
        with assert_warns(scipy.sparse.SparseEfficiencyWarning):
            res = linprog(c=c, A_eq=A_eq, b_eq=b_eq, bounds=bounds,
                          method=self.method, options=self.options)
    else:
        options = {**self.options}
        if self.method in {'interior-point'}:
            options['sparse'] = True

        res = linprog(c=c, A_eq=A_eq, b_eq=b_eq, bounds=bounds,
                      method=self.method, options=options)

        assert res.success

#this is the correct way to check the problem is solved correctly?

I suggested that you add a check for a correct solution because there is a typo above A_ub = A_eq.toarray() that this test should catch (by failing). Does this test fail currently and pass when you fix the typo? If not (e.g. it passes with the typo) then it certainly needs to be strengthened. Even if it does (fail now and pass when you fix the typo), it probably should be strengthened because res.success does not ensure that that linprog solved the problem that you intended it to solve.

I'd suggest that the test check whether the solution res.x is the correct solution to the original problem. You can do that by changing the problem to one for which you already know the answer (e.g. it is published somewhere or you have some other an external reference) or you solve the problem with one of the solvers that you are not changing the behavior of (e.g. one of the HiGHS solvers. Since you are not changing their behavior, you can use them as a reference to test the behavior of the solvers you are changing - the old simplex solvers.)

@dcb2124
Copy link
Author

dcb2124 commented Aug 10, 2021 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants