Changes to user interface in CVXPY 1.0 #199

Open
SteveDiamond opened this Issue Jun 22, 2015 · 18 comments

8 participants

@SteveDiamond
Member

I'm working on cvxpy 1.0. There will be some changes to the user interface. These fall into three categories:

  1. Making cvxpy syntax more like NumPy.
    • The expr.value and constraint.dual_value will return NumPy 2D arrays instead of NumPy matrices.
    • The expr.size field will be replaced by expr.shape, and expr.size will return expr.shape[0]*expr.shape[1], like in NumPy.
    • The variable argument functions, like vstack, will take lists instead of multiple arguments, like in NumPy. People who are new to Python don't know about unpacking lists with *, so lists are better.
  2. Adding sets and treating variables and parameters that belong to sets in a unified way.
    • I haven't figured out how this will work yet, but it may involve changing how variables and parameters that belong to sets are created. For example, the syntax for creating positive parameters may change from Parameter("positive") to something that extends to nonnegativity constraints and positive variables.
  3. Better error messages and exceptions.
    • The error messages could be a lot better, and it might be nice to make more use of custom exception types.
    • I might add a flag you can set so DCP errors are triggered when you form an expression, rather than when you try to solve the problem. Perhaps this should even be the default.

The first category of changes will be really annoying, but I think they're necessary. The other two should be fairly easy to adjust to.
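For reference, the NumPy conventions that the first category mirrors look roughly like this. The sketch uses plain NumPy only, not cvxpy, and the array names are made up for illustration.

```python
import numpy as np

a = np.ones((2, 3))
b = np.zeros((1, 3))

# NumPy's vstack takes one sequence of arrays rather than varargs.
stacked = np.vstack([a, b])

# shape is a tuple of dimensions; size is the total number of entries.
print(stacked.shape)
print(stacked.size)

# The result is a plain ndarray, not a numpy.matrix.
print(type(stacked).__name__)
```

Under the proposal, expr.shape and expr.size would presumably follow the same pattern, and vstack would be called with a list of expressions.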

Post here if you have any other ideas for changes to the user interface or thoughts on the proposed changes.

To be clear, there will be much more exciting new features, but they won't require rewriting any code that uses cvxpy.

@SteveDiamond SteveDiamond added this to the 1.0 milestone Jun 22, 2015
@tachim
tachim commented Jul 20, 2015

#1 is a great idea. I was already doing that with a helper method everywhere I used cvxpy.Problem.

@SteveDiamond
Member

I've procrastinated way too long on the cvxpy 1.0 update. I'll aim to finish it this month.

Here are two proposals for how sets could work. I'm not sure which is better. @echu and @ajfriend I'd appreciate your thoughts on this. I'm also open to alternatives.

Proposal 1

Sets will be a separate type of class. Examples of sets include Pos for nonnegative reals, PosSemidef for positive semidefinite matrices, and Bool for booleans. There will be an Intersect class/function that gives the intersection of a list of sets.

When you specify a set's dimensions, this creates a variable that is an element of that set. For example, Pos(3,4) creates a 3-by-4 matrix variable constrained to have all positive entries. This means that to constrain an expression to lie in a set, you write

expr == Set(dim1, dim2, ...)

like in CVX. The standard Variable(m,n) constructor is equivalent to Real(m,n).

You cannot constrain an expression to lie in a set without the additional equality constraint. The downside of this is that you're always creating an extra variable. The benefit is that it clarifies the issue of dual values. The current contract is that all constraints have a dual value assigned when you solve the problem. If you can create constraints involving non-conic sets, for example x in Polyhedron, the constraint won't have a dual value. We'll have to distinguish between constraints with duals and constraints without duals.

The other function of sets is to specify the domains of variables and parameters. Creating a positive parameter, for example, would look like this

p = Parameter(m, n, domain=Pos)

Notice that we don't specify the dimensions of the set, since they can be inferred. To say a parameter belongs to multiple sets, you give the intersection of those sets as the domain. The parameter constructor will use the domain to set the parameter's DCP properties.

For symmetry, variables will also have a domain argument, though specifying a set as the domain is equivalent to constructing the variable from that set.

Proposal 2

An alternative approach is for Set(dim1, dim2, ...) to create a set object rather than a variable. You would get a variable constrained to lie in a set by writing Set(dim1, ...).elem() and create the constraint that an expression lies in a set by writing Set(dim1,...).contains(expr). You could also do arithmetic with sets, e.g., set3 = Set1(n) + Set2(n). There would be a method for getting parameters from a set, or parameters could have a domain argument like in proposal 1.

This approach makes more sense mathematically, since sets always have dimensions. It also makes constraints more accessible to the user, since they can create them directly. In proposal 1 there would be a distinction between constraints the user can create and internal constraints that cvxpy can create.

A downside is that only some constraints will have dual values, which might be confusing. Another downside is that set arithmetic might be confusing because people will try to combine expressions and sets. In proposal 1 you can do arithmetic with variables derived from sets, which gives most of the benefits of set arithmetic without the expressions vs. sets issue.
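For comparison, here is a minimal sketch of the proposal 2 interface, again with toy classes rather than real cvxpy code. The names Constraint, Variable, and PosSemidef and the domain attribute are assumptions for illustration; only the method names elem() and contains() come from the proposal text.

```python
class Constraint:
    """Hypothetical constraint stating that `expr` lies in `cvx_set`."""

    def __init__(self, expr, cvx_set):
        self.expr = expr
        self.cvx_set = cvx_set


class Variable:
    """Stand-in variable that remembers the set it was drawn from."""

    def __init__(self, shape, domain=None):
        self.shape = shape
        self.domain = domain


class PosSemidef:
    """Stand-in set object: it has dimensions but is not itself a variable."""

    def __init__(self, n):
        self.shape = (n, n)

    def elem(self):
        # A new variable constrained to lie in this set.
        return Variable(self.shape, domain=self)

    def contains(self, expr):
        # A constraint object the user can create directly.
        return Constraint(expr, self)


S = PosSemidef(3)
X = S.elem()
Y = Variable((3, 3))
con = S.contains(Y)

print(X.shape, X.domain is S)
print(type(con).__name__)
```

Here the set is a first-class object, so the same S can hand out variables and constrain existing expressions without creating an extra variable.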

@ajfriend
Contributor

This is really interesting! I hadn't thought about the difficulty in promising dual variables when you allow for general sets.

What's the goal you have in mind for adding sets? Is it adding features (like set arithmetic) or is it that it would clean up a lot of internal code and simplify function definitions? Allowing for general sets and set arithmetic seems tricky, and I don't think I have anything helpful to say.

I do see a benefit to unifying things so that you only have a single type of set and a single type of linear expression, which would both be used internally by CVXPY and by end-users. A third proposal for sets might be something like

  • define a few atomic set objects, given by the standard convex cones
  • don't allow for manipulation or definitions of new sets
  • the only way that sets interact with convex programs is by containing a linear expression, which would form a constraint: lin_exp.in(S) or S.contains(lin_exp)?
  • make linear expressions and sets available to users to define new functions via their epigraph (or to express constraints in convex programs)
  • refactor the current library of CVXPY functions using the new linear expressions and functions

The benefits would be that

  • hopefully function definitions in CVXPY would shorten a lot
  • defining new functions would be easier for users and for developers, and easier to debug
  • if users do want more complicated sets, they could be implemented as helper functions which return a list of constraints involving a given variable
  • because there is only a single class of linear expression and a single class of set, and both are accessible to the user, it would be easy to expose problem canonicalization to the user (if they're curious about how the problem is transformed) because the canonicalized problem would be another convex problem built from exactly the same elements they were originally working with
  • overall, it could be helpful to remove a lot of the DCP 'magic' that goes on behind the scenes

This proposal wouldn't accomplish as much as your other two, but maybe it would be a good starting point, and some of that other functionality could be added later.

Of course, this is just based on how I was thinking of the problem, so it might not mesh with what you were thinking or trying to accomplish, but hopefully it's helpful.

@SteveDiamond
Member

The main goal in adding sets is to have a unified approach to parameters that belong to special sets (nonnegative, positive semidefinite, etc.) and variables that belong to special sets. This will change the user interface, which is why it's part of the cvxpy 1.0 release.

A secondary goal is to sort out the huge mess that is constraints. Right now there's a distinction between the constraints the user can create (equality, inequality, and positive semidefinite cone constraints) and the constraints cvxpy can create internally. Even worse, there's a third level of constraints that's a half-baked LinOp version of equality and inequality constraints.

By adding sets in the style of proposal 2, I hope to reduce the three types of constraints to a single type, which the user can create.

I like your proposal too. At some point I want to get rid of LinOps and define the atoms more elegantly, maybe as partial optimization problems like you suggest. We'll need some way to export a simplified expression tree data structure (like the current LinOp data type) to pass to other libraries, but there's no reason to canonicalize into a special type of expression.

After thinking about it more, I don't think the dual variable issue is a big deal. If someone is sophisticated enough to know what a dual variable is, they can handle the idea that only cone constraints have them.

What do you think of the syntax for proposal 2? Here's how you create a variable constrained to be positive semidefinite:

X = PosSemidef(n).elem() # vs. X = Semidef(n) now.

and here's how you create a parameter with the same constraint:

P = PosSemidef(n).param() # not possible now.

I like expr.in(S) for creating constraints, but I'm worried that people will write expr in S and expect it to work. What do you think?

@argriffing

expr.in(S) -- I'm pretty sure this would be a python syntax error because of how special in is.
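This is easy to check without running any cvxpy code. The sketch below only asks the parser whether each spelling is valid; the helper name compiles is made up for the example, and expr and S are never evaluated.

```python
import keyword


def compiles(source):
    """Return True if `source` parses as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# `in` is a reserved word, so it cannot be used as an attribute name.
print(keyword.iskeyword("in"))
print(compiles("expr.in(S)"))

# Spellings that avoid the bare keyword parse fine.
print(compiles("expr.in_(S)"))
print(compiles("expr.isin(S)"))
```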

@SteveDiamond
Member

Oops, you're right. I just tested it out. Is there any way to make expr in S return something other than a boolean? I spent a while trying to get something like this to work, but it seemed that in returns a boolean no matter what.

@ajfriend
Contributor

Yeah, the in operator is a bummer in that it must return a boolean value, but I've never understood why it's different from the other python operators in that way.
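A small sketch of the difference, using toy classes (Constraint, Expr, and Set are hypothetical stand-ins, not cvxpy types). Comparison methods such as __le__ may return any object, but the result of __contains__ is converted to a bool by the in operator.

```python
class Constraint:
    """Hypothetical constraint object."""

    def __init__(self, kind, lhs, rhs):
        self.kind = kind
        self.lhs = lhs
        self.rhs = rhs


class Expr:
    def __le__(self, other):
        # Rich comparisons can return arbitrary objects.
        return Constraint("<=", self, other)


class Set:
    def __contains__(self, item):
        # This object is returned here, but `in` reduces it to its truth value.
        return Constraint("in", item, self)


expr, S = Expr(), Set()

print(type(expr <= 1).__name__)   # the Constraint survives
print(type(expr in S).__name__)   # only a bool comes back
```

So overriding __contains__ cannot make expr in S produce a constraint, while a named method can.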

The .elem() and .param() methods don't seem as intuitive to me as X = Semidef(n) or

X = Variable(n,n)
X.in(S)

but that's probably just a personal bias. But since X.in isn't an option, would a synonym for in work?
Some random possibilities off the top of my head:

  • expr._in_(S) (this would go against Python convention for public method names, but maybe that's ok)
  • expr._in(S)
  • expr.in_(S)
  • expr.within(S)
  • expr.inside(S)
  • expr.containedby(S)
  • expr.member(S)
  • expr.of(S)
@echu
Member
echu commented Sep 16, 2015

Steven, both proposals sound awesome. From what I can tell, I think you should implement proposal 2. If folks want syntactic sugar, I think you can build proposal 1 on top of proposal 2.

Also, might I suggest "CVXSet" as the type? (Are they always going to be convex? Or do you have in mind some nonconvex ones, e.g., {0,1}^n?) I'm worried that "Set" and "set" (the python builtin) are basically one shift key away from arcane bugs.


@argriffing

I'm worried that "Set" and "set" (the python builtin) are basically one shift key away from arcane bugs.

Yes, and before set was a python builtin it was spelled with the shift key
https://docs.python.org/2/library/sets.html#sets.Set

@dave31415

expr.isin(S) might be good or
expr.is_in(S) which perhaps is clearer

@dave31415

You could also just do expr in S if you modify the __contains__ method of S.

@SteveDiamond
Member

expr.isin is pretty good. It's not enough to modify the __contains__ method, because I want "expr in S" to return a constraint, but it can only return a Boolean.


@SteveDiamond
Member

I just taught a course on convex optimization using cvxpy, and it changed my perspective on cvxpy 1.0. The main issue is that in Python 2.7, NumPy ndarrays are too confusing for people coming from MATLAB. The A.dot(B) notation is embarrassing, and doesn't mesh with cvxpy's use of * for matrix multiplication. I'm not comfortable anymore with having expr.value return an ndarray instead of a matrix, at least in Python 2.7.

ndarrays are more tolerable in Python 3.5 due to the matrix multiplication operator. I could change cvxpy in 3.5 so that @ means matrix multiplication, * means elementwise and scalar multiplication, and expr.value returns an ndarray. But what do I do then with the Python 2.7 version? I'd really appreciate suggestions.
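To make the proposed Python 3.5 semantics concrete, here is how the two operators already behave on NumPy ndarrays. This is plain NumPy, not cvxpy, and the matrices are arbitrary examples.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

# On ndarrays, * is elementwise multiplication.
elementwise = A * B

# @ (PEP 465, Python 3.5+) is matrix multiplication, the same as A.dot(B).
matmul = A @ B

print(elementwise)
print(matmul)
print(np.array_equal(matmul, A.dot(B)))
```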

@dave31415

Yeah, it's annoying that you can't define binary operators easily for
associative operations in python.

There is the "Infix" class hack that you can find by googling. For example
http://code.activestate.com/recipes/384122-infix-operators/

You could also define a function called "mult" or something that would at
least allow you to do something as follows. You'd probably want to optimize
the order of operations to minimize computation which shouldn't be too
hard. This does feel like reinventing the wheel. Surely this has been done
before?

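import numpy as np

def mult(*args):
    # Non-optimized: multiplies left to right, ignoring the cheapest ordering.
    for i, a in enumerate(args):
        if i == 0:
            result = a
        else:
            result = np.dot(result, a)
    return result

def test_mult():
    M = np.ones((5, 9))
    x = np.arange(5)
    y = np.arange(9)
    # x sums to 10 and y sums to 36, so x.dot(M).dot(y) is 10 * 36.
    assert mult(x, M, y) == 360.0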


@dave31415

I guess numpy already has this (multi_dot). See below. There is also the
matrix class in numpy.

import numpy as np
from numpy.linalg import multi_dot

def Mult(*args):
    # multi_dot chooses the evaluation order of the chain itself.
    return multi_dot(args)

def test_mult():
    M = np.ones((5, 9))
    x = np.arange(5)
    y = np.arange(9)
    assert Mult(x, M, y) == 360.0


@SteveDiamond
Member

These are great suggestions, thanks! My main goal is to have code that looks like math, and my secondary goal is to make cvxpy more compatible with numpy. The question is whether these goals are compatible.

My current approach is to use the numpy matrix class in cvxpy, but apparently numpy people are strongly against using the matrix class. If I change cvxpy to return numpy ndarrays, then the difference between cvxpy syntax, where matrix multiplication is *, and numpy syntax gets very confusing for beginners. I'm willing to switch cvxpy to the Python 3.5 matrix multiplication syntax, but the syntax isn't compatible with Python 2.7.

So the question is if I switch to Python 3.5 syntax, what do I do with the Python 2.7 version? Your suggestions are an interesting alternative to the 3.5 syntax that I'll need to think about.

@mwytock
Contributor
mwytock commented Apr 5, 2016

Hi Steven,

My view is just to follow what NumPy does as closely as possible (e.g. multiplication, axis parameter, even broadcasting, ideally). Yes, there are many cons (A.dot(x) is ugly!), but I think it's the most pragmatic approach and has the benefit of defining clear semantics for every operator. The Matlab vs. NumPy syntax issues are well known and there are lots of resources, e.g.

http://mathesaurus.sourceforge.net/matlab-numpy.html

If CVXPY does something different, then the risk is that there is now a third syntax that is somewhere between NumPy and Matlab, with a new set of semantics.

I also think that the popularity of the numerical Python platform has eclipsed or will quickly eclipse Matlab, at least in many circles, so it will define the standard "expected behavior" for many people in years to come.

Cheers,
Matt


@bodono
Member
bodono commented Apr 19, 2016

I second the opinion to stick as closely to numpy as possible.

As an addition, you could also support the @ operator, if that's even possible:

http://legacy.python.org/dev/peps/pep-0465/
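On whether it's possible: PEP 465 lets any class opt in by defining __matmul__ (and __rmatmul__ for the reflected case). A minimal sketch with a toy expression class follows; Expr and _name are hypothetical and only record the operation as a string, so this says nothing about how cvxpy would actually implement it.

```python
def _name(obj):
    """Printable form of an operand: an Expr's name or the repr of a constant."""
    return obj.name if isinstance(obj, Expr) else repr(obj)


class Expr:
    """Toy expression node that records operations as text."""

    def __init__(self, name):
        self.name = name

    def __matmul__(self, other):
        # Called for `self @ other` (Python 3.5+).
        return Expr("({} @ {})".format(self.name, _name(other)))

    def __rmatmul__(self, other):
        # Called for `other @ self` when `other` does not handle @ itself.
        return Expr("({} @ {})".format(_name(other), self.name))

    def __mul__(self, other):
        # `*` stays available for elementwise or scalar multiplication.
        return Expr("({} * {})".format(self.name, _name(other)))


A, x = Expr("A"), Expr("x")

print((A @ x).name)
print((A * 2).name)
print((2.0 @ x).name)
```

On Python 2.7 the @ token itself is a syntax error, so code using it cannot be shared between the two versions.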
