Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symmetricity of SDP #31

Closed
karanveerm opened this issue Jan 1, 2015 · 21 comments
Closed

Symmetricity of SDP #31

karanveerm opened this issue Jan 1, 2015 · 21 comments

Comments

@karanveerm
Copy link

Unless we(SCS.jl) are doing something wrong, it seems possible that SCS can return semi-definite matrices that are not symmetric.

using Convex, SCS
set_default_solver(SCSSolver())

y = Variable((2, 2), :Semidefinite)
p = minimize(y[1, 1], y[1, 2] == 1)
solve!(p)
y.value
   0           1.0    
 -1.0         0.58682

Is there a reason why it is possible for SDP matrices to not be symmetric?
The README seems to say
positive semidefinite cone { X | min(eig(X)) >= 0, X = X^T }
in which case the matrix should be symmetric.

Here is the data in raw format:

A = [
 0.0 0.0 1.0 0.0;
 -1.0 0.0 0.0 0.0;
 0.0 -1.0 0.0 0.0;
 0.0 0.0 -1.0 0.0;
 0.0 0.0 0.0 -1.0]
c = [1.0,0.0,0.0,0.0]
b = [1.0,0.0,0.0,0.0,0.0]
f = 1
s = [2]
ssize = 1
@mlubin
Copy link
Contributor

mlubin commented Jan 1, 2015

Do the constraints need to be symmetric? e.g.
0.5*y[1, 2] + 0.5*y[2, 1] == 1 instead of y[1, 2] == 1

@SteveDiamond
Copy link
Member

I've always added a symmetry constraint, and SCS has worked fine. But I was looking through the SCS code, and something struck me as a little odd.

The code for projecting a matrix X onto the PSD cone actually projects (X + X')/2 onto the PSD cone. So it's not assuming that X is symmetric. But if X is not constrained to be symmetric, then the PSD cone is not self-dual. On R^{nxn}, the dual of symmetric PSD matrices is matrices Y such that Y+Y' is PSD. So I'm not sure what's going on.

Maybe it would be better if PSD cones had dimension n*(n+1)/2, i.e. only the lower triangular entries.

@bodono what do you think? I probably have misunderstood something.

@SteveDiamond
Copy link
Member

What CVXOPT does is assume that in

minimize        c'*x 
subject to      A*x + s = b 
                s in K

the rows of A corresponding to the upper triangular parts of PSD blocks in s are assumed to be the same as the rows corresponding to the lower triangular parts.

@bodono
Copy link
Member

bodono commented Jan 2, 2015

@SteveDiamond is right, basically the cone of symmetric positive semi-definite cones is not self-dual in R^{n x n}, so the Moreau decomposition of any point into a member of the PSD cone and the dual of the PSD cone will end up with a non-symmetric part. SCS projects the dual variables onto the cone, so it ends up with a primal variable that might not be symmetric. I noticed this very early on when I was implementing the PSD cone. The right way to fix it is to make the PSD cone explicitly a subset of R^{n x (n+1)/2}. The reason I don't do this is because I wanted to use the same input format as sedumi, which unfortunately uses n x n. There are two ways we could think about fixing this:

  1. Make scs.jl (and others) do as cvxpy does, and make the data matrix "symmetric" with respect to s.
  2. Allow two ways to specify the PSD constraints in SCS, one n x n, and one n x (n+1)/2. cvxpy, scs.jl, cvx, and others can use the second way for all PSD cones, but we're still compatible with Sedumi input formats. This would likely take the form of a new field in the cone struct.

Thoughts?

@mlubin
Copy link
Contributor

mlubin commented Jan 2, 2015

FWIW, Mosek, the only commercial SDP solver, enforces symmetry by only accepting the lower triangle of matrix coefficients: http://docs.mosek.com/7.1/pythonapi/Semidefinite_optimization.html. Input format and algorithmic implementation are two distinct issues. My vote would be for doing it the "right" way algorithmically so that it's impossible to create a nonsymmetric solution. You can always transform the user's input if they want to provide the problem in n x n format.

@madeleineudell
Copy link
Collaborator

first, a mathematical aside: @bodono, I was somewhat surprised/confused at first by your assertion that "the cone of symmetric positive semi-definite cones is *not *self-dual in R^{n x n}", so I'd like to give a very concrete example to help posterity understand what's going on. I'm going to call symmetric positive semi-definite SPSD in what follows, for clarity. The SPSD cone is self-dual in S^n, but not in R^{n x n}. Self-duality in S^n means that "Among all symmetric X and Y, Tr(X,Y) >= 0 for any Y PSD <=> X is PSD". Non-self-duality in R^{n x n} means "There is some asymmetric X for which Tr(X,Y) >= 0 for
any Y SPSD." In fact, @SteveDiamond mentioned above that any X for which (X+X')/2 is PSD will work. Let's see that explicitly: how about X = [10 1; -1, 10]. Then (writing <X,Y> for the trace inner product) <X,Y> = <[10 0; 0, 10], Y> + <[0 1; -1, 0], Y> = 10 Tr(Y) + 0 > 0 for any Y SPSD.

I agree with Miles that it's important to make it impossible to create a problem for which SCS returns an asymmetric solution. Is there a good way to detect asymmetric problems and to symmetrize them inside SCS? The cvxpy solution right now is to add constraints to enforce symmetry of PSD
variables, which is different from modifying the matrix A so that the algorithm will automatically enforce symmetry without adding extra constraints. But either of these, I think, should be done inside SCS
(perhaps with an optional flag to turn off the behavior) rather than by its calling libraries.

On Fri, Jan 2, 2015 at 7:50 AM, Miles Lubin notifications@github.com
wrote:

FWIW, Mosek, the only commercial SDP solver, enforces symmetry by only
accepting the lower triangle of matrix coefficients:
http://docs.mosek.com/7.1/pythonapi/Semidefinite_optimization.html. Input
format and algorithmic implementation are two distinct issues. My vote
would be for doing it the "right" way algorithmically so that it's
impossible to create a nonsymmetric solution. You can always transform the
user's input if they want to provide the problem in n x n format.


Reply to this email directly or view it on GitHub
#31 (comment).

Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell

@madeleineudell
Copy link
Collaborator

Here's a more concrete proposal. Suppose we have the (standard conic form)
problem

minimize c'*x
subject to Ax+b in K

and we know that x[i]==x[j] at any solution. Well then we can symmetrize A
and c with respect to the indices i and j without changing the set of
solutions. We just make the ith column of the A and the jth column of A
both equal to the average of the ith and jth column, and similarly for c.
Explicitly, let

A[:,i] <- (A[:,i] + A[:,j])/2
A[:,j] <- (A[:,i] + A[:,j])/2
c[i] <- (c[i]+c[j])/2
c[j] <- (c[i]+c[j])/2

and the set of solutions to the problem is preserved. But now there is
nothing breaking the symmetry between i and j in the problem. As far as I
understand the algorithm, that will imply that x[i]=x[j] at the solution
returned by SCS so long as the algorithm is initialized symmetrically.
(Others with more intuition, please correct me if that's wrong. I'm also
not sure if the algorithm would be stable wrt perturbations away from
symmetry.)

This kind of preprocessing could be imposed by any modeling layer that
knows that part of a variable is supposed to be in the SDP cone, either in
Convex.jl or in MathProgBase by exploiting our new very expressive conic
form.

But this columnwise procedure is a little unnatural, because the problem
statement really imposes symmetry on slack variables (corresponding to rows
of A), rather than variables (columns). I suppose you could do the same
thing for the rows of A, but I'm having trouble thinking of a row-wise
scheme that guarantees a symmetric solution x. (Because even if the rows
corresponding to SDP constraints on x are symmetric, we might have an
asymmetry between entries of x in the objective.)

On Fri, Jan 2, 2015 at 12:52 PM, Madeleine Udell madeleine.udell@gmail.com
wrote:

first, a mathematical aside: @bodono, I was somewhat surprised/confused at
first by your assertion that "the cone of symmetric positive
semi-definite cones is *not *self-dual in R^{n x n}", so I'd like to give
a very concrete example to help posterity understand what's going on. I'm
going to call symmetric positive semi-definite SPSD in what follows, for
clarity. The SPSD cone is self-dual in S^n, but not in R^{n x n}.
Self-duality in S^n means that "Among all symmetric X and Y, Tr(X,Y) >= 0
for any Y PSD <=> X is PSD". Non-self-duality in R^{n x n} means "There
is some asymmetric X for which Tr(X,Y) >= 0 for any Y SPSD." In fact,
@SteveDiamond mentioned above that any X for which (X+X')/2 is PSD will
work. Let's see that explicitly: how about X = [10 1; -1, 10]. Then
(writing <X,Y> for the trace inner product) <X,Y> = <[10 0; 0, 10], Y> + <[0
1; -1, 0], Y> = 10 Tr(Y) + 0 > 0 for any Y SPSD.

I agree with Miles that it's important to make it impossible to create a
problem for which SCS returns an asymmetric solution. Is there a good way
to detect asymmetric problems and to symmetrize them inside SCS? The
cvxpy solution right now is to add constraints to enforce symmetry of PSD
variables, which is different from modifying the matrix A so that the
algorithm will automatically enforce symmetry without adding extra
constraints. But either of these, I think, should be done inside SCS
(perhaps with an optional flag to turn off the behavior) rather than by its
calling libraries.

On Fri, Jan 2, 2015 at 7:50 AM, Miles Lubin notifications@github.com
wrote:

FWIW, Mosek, the only commercial SDP solver, enforces symmetry by only
accepting the lower triangle of matrix coefficients:
http://docs.mosek.com/7.1/pythonapi/Semidefinite_optimization.html.
Input format and algorithmic implementation are two distinct issues. My
vote would be for doing it the "right" way algorithmically so that it's
impossible to create a nonsymmetric solution. You can always transform the
user's input if they want to provide the problem in n x n format.


Reply to this email directly or view it on GitHub
#31 (comment).

Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell

Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell

@bodono
Copy link
Member

bodono commented Jan 4, 2015

I don't want to have any sort of data cleaning or preparation inside of SCS (besides the normalization, which can actually be done 'matrix-free' anyway). The reason being that the code treats the cones and the linear systems entirely separately, so that you can add any new cones you like, or swap in a new linear system solver very easily. In fact you can run SCS without having access to the data directly, just an oracle that supplies Ax and A'x for input x. So the options are to do nothing and just assume that the input data matrix A respects the symmetry (or enforces a symmetry constraint as cvxpy does), or change the input format to only supply the lower triangular part of the primal variable. I prefer the second option, but it is a breaking API change so is pretty drastic.

@mlubin
Copy link
Contributor

mlubin commented Jan 4, 2015

@bodono, data cleaning and preprocessing user input is quite a well accepted practice. User models typically have many variables that can easily be eliminated, constraints that are redundant, or coefficients that can be improved by logical implication. These tricks are especially useful for relaxations of integer programming-type problems, but still for LPs a speedup between 40% and 400% would not be surprising see here and here. I'd argue that these techniques are underutilized by conic solvers.
I understand the desire to keep messiness like this out of SCS. The question really is where it should go. We'll be able to manage the symmetrization preprocessing somewhere in the Julia stack, but this means that it won't be available for other users like cvxpy. Would you guess that there's a measurable performance impact by adding n^2/2 extra (very sparse) equality constraints for each PSD cone?

@madeleineudell
Copy link
Collaborator

@bodono, what problem exactly is SCS solving? looking at KV's example from the top of this thread,

minimize Y[1,1]
subject to -Y + S = 0
S >= 0,

SCS gives the answer Y = [0 1; -1 .586]. it's clear that there is no symmetric S such that -Y + S = 0, so it seems like SCS isn't solving the problem the documentation claims to solve, ie that S is in the SDP cone with SDP cone defined as { X | min(eig(X)) >= 0, X = X^T }.

Is it solving the problem with the asymmetric definition of SDP cone, ie SDP cone defined as { X | min(eig(X+X')) >= 0 }?

@bodono
Copy link
Member

bodono commented Jan 6, 2015

For what we're calling the primal problem it essentially uses the asymmetric version of the PSD cone, for the dual problem it's the symmetric definition. This doesn't really make sense from a user standpoint, so it should definitely change.

@madeleineudell
Copy link
Collaborator

It might be fine, so long as the documentation makes it clear that that's
what it's doing.

Actually, that would allow you to solve problems involving only the
asymmetric version of the PSD cone, or only the symmetric version, by
passing either the primal or dual of the user's problem to SCS as it is
implemented now. A user wishing to solve

minimize c'*x
subject to b-Ax in K

where K might include some symmetric PSD cones (and no asymmetric ones)
just needs to pass to SCS the problem

minimize -b'_y
subject to A'_y + c = 0
y in K^*

(so now K^* has some asymmetric PSD cones in it, and no symmetric ones) and
accept the dual solution to this problem as the primal solution to the
original problem.

Have I understood this correctly?

On Tue, Jan 6, 2015 at 6:59 AM, bodono notifications@github.com wrote:

For what we're calling the primal problem it essentially uses the
asymmetric version of the PSD cone, for the dual problem it's the symmetric
definition. This doesn't really make sense from a user standpoint, so it
should definitely change.


Reply to this email directly or view it on GitHub
#31 (comment).

Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell

@bodono
Copy link
Member

bodono commented Jan 7, 2015

Technically the PSD cone is R^{n x n} is a different cone to Sn, and I don't think it's really useful to use the R^{n x n} one. How difficult would it be for Convex.jl or cvxpy to use a vector of length n x (n+1)/2 instead of n^2 for PSD cone variables?

@SteveDiamond
Copy link
Member

It would be very easy for cvxpy to use a vector of length n x (n+1)/2
instead of n^2 for PSD cone variables.

On Wed, Jan 7, 2015 at 3:35 AM, bodono notifications@github.com wrote:

Technically the PSD cone is R^{n x n} is a different cone to Sn, and I
don't think it's really useful to use the R^{n x n} one. How difficult
would it be for Convex.jl or cvxpy to use a vector of length n x (n+1)/2
instead of n^2 for PSD cone variables?


Reply to this email directly or view it on GitHub
#31 (comment).

@madeleineudell
Copy link
Collaborator

Yes, I think that would work fine for us too.

Just to clarify: as far as I understand, this is only a change in the size
of the returned solution, not of the call. c, A, b and cones stay the same,
yes? In that case, I'm not sure why SCS couldn't just perform the
symmetrization (from a vector of length n(n+1)/2 to one of length n^2)
internally, to prevent a breaking API change.

On Wed, Jan 7, 2015 at 3:28 PM, Steven Diamond notifications@github.com
wrote:

It would be very easy for cvxpy to use a vector of length n x (n+1)/2
instead of n^2 for PSD cone variables.

On Wed, Jan 7, 2015 at 3:35 AM, bodono notifications@github.com wrote:

Technically the PSD cone is R^{n x n} is a different cone to Sn, and I
don't think it's really useful to use the R^{n x n} one. How difficult
would it be for Convex.jl or cvxpy to use a vector of length n x (n+1)/2
instead of n^2 for PSD cone variables?


Reply to this email directly or view it on GitHub
#31 (comment).


Reply to this email directly or view it on GitHub
#31 (comment).

Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell

@bodono
Copy link
Member

bodono commented Jan 8, 2015

A and b would have to change since the number of rows in A is equal to the sum of the sizes of all the cones, and we're making the PSD cones go down to size n(n+1)/2 instead of n^2. Any constraints on the upper triangular part of the semidefinite variable would have to be converted to constraints on the lower triangular part, so the structure would change too. Also cvxpy (and possibly cvx and Convex.jl?) would be able to drop the explicit S = S^T set of equality constraints for PSD variables, making the problems smaller again. The semidefinite variables solutions returned by SCS would just be the lower triangular part (stacked columnwise), so to get back the full matrix cvxpy, Convex.jl, cvx etc would have to fill out the upper triangular part.

@madeleineudell
Copy link
Collaborator

this raises an interesting point. right now we'd allow users to say

minimize y
subject to [x y; z w] in SemidefiniteCone

(where x, y, z, and w are variables). if we remove the upper triangular
part of A and b prior to giving the problem to SCS, then the problem is
unbounded.

the easiest way to deal with this is still to add extra equality
constraints (here, y==z), making the total number of rows of A and b to
represent semidefinite variables n^2, where n(n+1)/2 of them come from the
reduced size SDP constraint, and n(n-1)/2 come from the new equality
constraints. this is super easy to implement: take the rows l and u
corresponding to the lower and upper triangular versions of a transposed
element in the SDP constraint. replace u by u-l, and set the cone for that
row to the zero cone. it's also kind of nice that this means the sizes of
everything are preserved.

how does this extra constraint affect the dual variables? how would we
recover the dual to the constraint [x y; z w] in SemidefiniteCone
interpreted as a symmetric SDP constraint, rather than (as implemented) as
a lower triangular SDP constraint + equality constraint?

we could also do some kind of variable/constraint preprocessing to reduce
the number of introduced constraints by logical implications. i'm not sure
that would be worth the extra complexity.

---------- Forwarded message ----------
From: bodono notifications@github.com
Date: Thu, Jan 8, 2015 at 2:11 AM
Subject: Re: [scs] Symmetricity of SDP (#31)
To: cvxgrp/scs scs@noreply.github.com
Cc: Madeleine Udell madeleine.udell@gmail.com

A and b would have to change since the number of rows in A is equal to the
sum of the sizes of all the cones, and we're making the PSD cones go down
to size n(n+1)/2 instead of n^2. Any constraints on the upper triangular
part of the semidefinite variable would have to be converted to constraints
on the lower triangular part, so the structure would change too. Also cvxpy
(and possibly cvx and Convex.jl?) would be able to drop the explicit S = S^T
set of equality constraints for PSD variables, making the problems smaller
again. The semidefinite variables solutions returned by SCS would just be
the lower triangular part (stacked columnwise), so to get back the full
matrix cvxpy, Convex.jl, cvx etc would have to fill out the upper
triangular part.


Reply to this email directly or view it on GitHub
#31 (comment).

Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell

@bodono
Copy link
Member

bodono commented Jan 9, 2015

That problem is unbounded in either case, right? If you add z>=0 to the problem your point stands.

In that case you would have to, as you say, introduce the equality constraint z==y, and pass three variables (x,z,w) into the S_2 cone constraint. To get the dual variable to the original problem I think you would just do an affine combination of the dual variables from the equality constraint and the symmetric semidefinite dual variable. Interestingly this would end up with a non-symmetric dual variable if I'm doing it right. This seems like the correct approach from the parser, because the user will (generally) expect that [x y; z w] in SemidefiniteCone will enforce the constraint that z==y, and you guys will take care of that in the background.

@madeleineudell
Copy link
Collaborator

Yes, it looks like the equality constraint adds an skew symmetric part to
the dual variable: the new lagrangian has terms mu(y-z) + <lambda, [x y; z
w]> = <lambda + [0 mu; -mu 0], [x y; z w]>, so you'd probably want to
return lambda + [0 mu; -mu 0] as the optimal dual variable.

That's very strange. I'm worried we'd be violating our contract with users
by handing them an asymmetric optimal dual variable for a semidefinite
constraint. But in fact, it is the SDP constraint that is doing the work
to force y==z, and that forcing is asymmetric. So maybe it's fine.

On Fri, Jan 9, 2015 at 8:07 AM, bodono notifications@github.com wrote:

That problem is unbounded in either case, right? If you add z>=0 to the
problem your point stands.

In that case you would have to, as you say, introduce the equality
constraint z==y, and pass three variables (x,z,w) into the S_2 cone
constraint. To get the dual variable to the original problem I think
you would just do an affine combination of the dual variables from the
equality constraint and the symmetric semidefinite dual variable.
Interestingly this would end up with a non-symmetric dual variable if I'm
doing it right. This seems like the correct approach from the parser,
because the user will (generally) expect that [x y; z w] in
SemidefiniteCone will enforce the constraint that z==y, and you guys will
take care of that in the background.


Reply to this email directly or view it on GitHub
#31 (comment).

Madeleine Udell
PhD Candidate in Computational and Mathematical Engineering
Stanford University
www.stanford.edu/~udell

@karanveerm
Copy link
Author

FYI, both sedumi and sdpt3 also return skew symmetric duals.

@bodono
Copy link
Member

bodono commented Jan 11, 2015

I've pushed a new branch, symmetric_sdp, that has the interface change to specify semidefinite variables of size n * (n+1) / 2 (and no longer supporting n^2 sizes). It appears to work...
The matlab and python interfaces are also updated (actually very little needed to change there). If you want to test it out with your parsers @SteveDiamond @madeleineudell @karanveerm and give feedback on how it is, that would be very useful. If it all looks good I would like to put this into the v1.1.0 release of SCS, which I hope to push to master sometime in mid February.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants