In [1]:
# To disable `warnings.warn(ECOS_DEPRECATION_MSG, FutureWarning)`
import warnings
warnings.filterwarnings('ignore')

## 2. Disciplined Convex Programming

__Disciplined convex programming (DCP)__ is a system for constructing mathematical expressions with known curvature from a given library of base functions. CVXPY uses DCP to ensure that the specified optimization problems are convex.

### Expressions

__Expressions__ in CVXPY are formed from variables, parameters, numerical constants such as Python floats and Numpy matrices, the standard arithmetic operators `+`, `-`, `*`, `/`, `@`, and a library of functions.

In [2]:
import cvxpy as cp

# Create variables and parameters.
x, y = cp.Variable(), cp.Variable()
a, b = cp.Parameter(), cp.Parameter()

# Examples of CVXPY expressions.
3.69 + b / 3

Expression(CONSTANT, UNKNOWN, ())

In [3]:
x - 4 * a

Expression(AFFINE, UNKNOWN, ())

In [4]:
cp.sqrt(x) - cp.minimum(y, x - a)

Expression(UNKNOWN, UNKNOWN, ())

In [5]:
cp.maximum(2.66 - cp.sqrt(y), cp.square(x + 2 * y))

Expression(CONVEX, NONNEGATIVE, ())

Expressions can be scalars, vectors, or matrices.

- The dimensions of an expression are stored as `expr.shape`.
- The total number of entries is given by `expr.size`.
- The number of dimensions is given by `expr.ndim`.

CVXPY will raise an exception if an expression is used in a way that doesn't make sense given its dimensions, for example adding matrices of different size.

The semantics for how shapes behave under arithmetic operations are the same as for NumPy ndarrays (except some broadcasting is banned).

In [6]:
import numpy as np

X = cp.Variable((5, 4))
A = np.ones((3, 5))

# Use expr.shape to get the dimensions.
print(f"""
dimensions of X: {X.shape}
size of X: {X.size}
number of dimensions: {X.ndim}
dimensions of sum(X): {cp.sum(X).shape}
dimensions of A @ X: {(A @ X).shape}
""")

# ValueError raised for invalid dimensions.
try:
    A + X
except ValueError as e:
    print(e)


dimensions of X: (5, 4)
size of X: 20
number of dimensions: 2
dimensions of sum(X): ()
dimensions of A @ X: (3, 4)

Cannot broadcast dimensions  (3, 5) (5, 4)


CVXPY uses __DCP analysis__ to determine the __sign__ and __curvature__ of each expression.

### Sign

Each (sub)expression is flagged as __positive__ (__non-negative__), __negative__ (__non-positive__), __zero__, or __unknown__.


The signs of larger expressions are determined from the signs of their subexpressions. For example, the sign of the expression `expr1 * expr2` is

- Zero if either expression has sign zero.
- Positive if `expr1` and `expr2` have the same (known) sign.
- Negative if `expr1` and `expr2` have opposite (known) signs.
- Unknown if either expression has unknown sign.

The sign given to an expression is always correct.

- But DCP sign analysis may flag an expression as unknown sign when the sign could be figured out through more complex analysis.
- For instance, `x * x` is positive but has unknown sign by the rules above.

CVXPY determines the <u>sign of constants</u> by looking at their value.

- For scalar constants, this is straightforward.
- Vector and matrix constants with all positive (negative) entries are marked as positive (negative).
- Vector and matrix constants with both positive and negative entries are marked as unknown sign.

The __sign__ of an expression is stored as `expr.sign`.

In [7]:
x = cp.Variable()
a = cp.Parameter(nonpos=True)
c = np.array([1, -1]) # has no attribute 'sign'

print(f"""
sign of x: {x.sign}
sign of a: {a.sign}
sign of x * x: {(x * x).sign}
sign of square(x): {cp.square(x).sign}
sign of c * a: {(c * a).sign}
""")


sign of x: UNKNOWN
sign of a: NONPOSITIVE
sign of x * x: UNKNOWN
sign of square(x): NONNEGATIVE
sign of c * a: UNKNOWN



### Curvature

Each (sub)expression is flagged as one of the following __curvatures__ (with respect to its variables) using the curvature rules given below.

| __Curvature__ | __Meaning__ |
| ---           | ---         |
| constant      | $f(x)$ independent of $x$ |
| affine        | $f(\theta x + (1-\theta)y) = \theta f(x) + (1-\theta)f(y)$ for all $x,y,\theta\in [0,1]$ |
| convex        | $f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta)f(y)$ for all $x,y,\theta\in [0,1]$ |
| concave       | $f(\theta x + (1-\theta)y) \geq \theta f(x) + (1-\theta)f(y)$ for all $x,y,\theta\in [0,1]$ |
| unknown       | DCP analysis cannot determine the curvature |

As with sign analysis, the conclusion is always correct, but the simple analysis can flag expressions as unknown even when they are convex or concave. 

Note that <u>any constant expression is also affine</u>, and <u>any affine expression is convex and concave</u>.

### Curvature rules

DCP analysis is based on applying a general composition theorem from convex analysis to each (sub)expression.

1. $f(\text{expr}_1,\text{expr}_2,\dotsc,\text{expr}_n)$ is __convex__ if $f$ is a convex function and for each $\text{expr}_i$ one of the following conditions holds:
    - $f$ is increasing in argument $i$ and $\text{expr}_i$ is convex.
    - $f$ is decreasing in argument $i$ and $\text{expr}_i$ is concave.
    - $\text{expr}_i$ is affine or constant.

2. $f(\text{expr}_1,\text{expr}_2,\dotsc,\text{expr}_n)$ is __concave__ if $f$ is a concave function and for each $\text{expr}_i$ one of the following conditions holds:
    - $f$ is increasing in argument $i$ and $\text{expr}_i$ is concave.
    - $f$ is decreasing in argument $i$ and $\text{expr}_i$ is convex.
    - $\text{expr}_i$ is affine or constant.

3. $f(\text{expr}_1,\text{expr}_2,\dotsc,\text{expr}_n)$ is __affine__ if $f$ is an affine function and each $\text{expr}_i$ is affine.

4. If none of the three rules apply, the expression $f(\text{expr}_1,\text{expr}_2,\dotsc,\text{expr}_n)$ is marked as having __unknown__ curvature.

<u>Whether a function is increasing or decreasing in an argument may depend on the sign of the argument</u>. For instance, `cp.square` is increasing for positive arguments and decreasing for negative arguments.

The curvature of an expression is stored as `expr.curvature`:

In [8]:
x = cp.Variable()
a = cp.Parameter(nonneg=True)

print(f"""
curvature of x {x.curvature}
curvature of a: {a.curvature}
curvature of square(x): {cp.square(x).curvature}
curvature of sqrt(x): {cp.sqrt(x).curvature}
""")


curvature of x AFFINE
curvature of a: CONSTANT
curvature of square(x): CONVEX
curvature of sqrt(x): CONCAVE



### Infix operators

The infix operators `+`, `-`, `*`, `/` and matrix multiplication `@` are treated exactly like functions.

- The infix operators `+` and `-` are __affine__, so the rules above are used to flag the curvature.
- For example, `expr1 + expr2` is flagged as convex if `expr1` and `expr2` are convex.
- `expr1 * expr2`, `expr1 / expr2`, and `expr1 @ expr2` can only be DCP when one of the expressions is constant. The curvature rules above apply.
- For example, `expr1 / expr2` is convex when `expr1` is concave and `expr2` is negative and constant.

#### Example 1

DCP analysis breaks expressions down into subexpressions.

The tree visualization below shows how this works for the expression `2 * square(x) + 3`.

![Example 1](https://www.cvxpy.org/_images/example1.png)

#### Example 2

We'll walk through the application of the DCP rules to the expression `sqrt(1 + square(x))`.

![Example 2](https://www.cvxpy.org/_images/example2.png)

The variable `x` has affine curvature and unknown sign. The `square` function is convex and non-monotone for arguments of unknown sign. It can take the affine expression `x` as an argument; the result `square(x)` is convex.

The arithmetic operator `+` is affine and increasing, so the composition `1 + square(x)` is convex by the curvature rule for convex functions. The function `sqrt` is concave and increasing, which means it can only take a concave argument. Since `1 + square(x)` is convex, `sqrt(1 + square(x))` violates the DCP rules and cannot be verified as convex.

In fact, `sqrt(1 + square(x))` is a convex function of `x`, but the DCP rules are not able to verify convexity. If the expression is written as `norm(hstack(1, x), 2)`, the $L_2$ norm of the vector `[1, x]`, which has the same value as `sqrt(1 + square(x))`, then it will be certified as convex using the DCP rules.

In [9]:
print(f"""
sqrt(1 + square(x)) curvature: {cp.sqrt(1 + cp.square(x)).curvature}
norm(hstack([1, x]), 2) curvature: {cp.norm(cp.hstack([1, x]), 2).curvature}
""")


sqrt(1 + square(x)) curvature: QUASICONVEX
norm(hstack([1, x]), 2) curvature: CONVEX



### DCP problems

If a problem follows the DCP rules, it is guaranteed to be convex and solvable by CVXPY.

The DCP rules require that the problem __objective__ have one of two forms:

- Minimize (convex)
- Maximize (concave)

The only valid __constraints__ under the DCP rules are

- affine `==` affine
- convex `<=` concave
- concave `>=` convex

You can check that a problem, constraint, or objective satisfies the DCP rules by calling `object.is_dcp()`.

In [10]:
x = cp.Variable()
y = cp.Variable()

# DCP problems.
prob1 = cp.Problem(
    cp.Minimize(cp.square(x - y)),
    [x + y >= 0]
)
prob2 = cp.Problem(
    cp.Maximize(cp.sqrt(x - y)),
    [2*x - 3 == y, cp.square(x) <= 2]
)

# Non-DCP problems.

# A non-DCP objective.
objective = cp.Maximize(cp.square(x))
prob3 = cp.Problem(objective)

# A non-DCP constraint.
prob4 = cp.Problem(
    cp.Minimize(cp.square(x)),
    [cp.sqrt(x) <= 2]
)

print(f"""
prob1 is DCP: {prob1.is_dcp()}
prob2 is DCP: {prob2.is_dcp()}
prob3 is DCP: {prob3.is_dcp()}
prob4 is DCP: {prob4.is_dcp()}

Maximize(square(x)) is DCP: {objective.is_dcp()}
sqrt(x) <= 2 is DCP: {(cp.sqrt(x) <= 2).is_dcp()}
""")


prob1 is DCP: True
prob2 is DCP: True
prob3 is DCP: False
prob4 is DCP: False

Maximize(square(x)) is DCP: False
sqrt(x) <= 2 is DCP: False



CVXPY will raise an exception if you call `prob.solve()` on a non-DCP problem.

In [11]:
# A non-DCP problem.
prob = cp.Problem(cp.Minimize(cp.sqrt(x)))

try:
    prob.solve()
except Exception as e:
    print(e)

Problem does not follow DCP rules. Specifically:
The objective is not DCP, even though each sub-expression is.
You are trying to minimize a function that is concave.
However, the problem does follow DQCP rules. Consider calling solve() with `qcp=True`.


## Examples

### Least-squares

$$
\text{minimize} \qquad \|Ax-b\|_2^2.
$$

In [12]:
import numpy as np
import cvxpy as cp

# Generate data.
m = 20
n = 15

rng = np.random.default_rng(1)
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Define and solve the CVXPY problem.
x = cp.Variable(n)
cost = cp.sum_squares(A @ x - b)
prob = cp.Problem(cp.Minimize(cost))
prob.solve()

# Print result.
print(f"""\
The optimal value is {prob.value:f}
The optimal x is {x.value}
The norm of the residual is {cp.norm(A @ x - b, p=2).value:f}
""")

The optimal value is 3.011406
The optimal x is [ 0.78861924 -0.51550419  0.39471835 -0.13482039  0.10178291 -0.127348
  1.40952725  0.29569351 -0.0456772  -0.17465868  0.30421628  0.57538791
 -0.01717862  0.0345873   0.45944972]
The norm of the residual is 1.735340



### Linear program

A __linear program (LP)__ is an optimization problem with a linear objective and affine inequality constraints.
$$
\begin{align*}
\text{minimize} &\qquad c^Tx \\
\text{subject to} &\qquad Ax\preceq b,
\end{align*}
$$
where $A\in\mathbb{R}^{m\times n}$, $b\in\mathbb{R}^m$, and $c\in\mathbb{R}^n$ are problem data and $x\in\mathbb{R}^n$ is the optimization variable.

When we solve a LP, in addition to a primal solution $x^*$, we obtain a dual solution $\lambda_i^*\in\mathbb{R}^m$ corresponding to the inequality constraints.

In [13]:
import numpy as np
import cvxpy as cp

# Generate a random non-trivial linear program.
m = 15
n = 10

rng = np.random.default_rng(1)

s0 = rng.standard_normal(m)
# numpy.maximum gives elementwise maximum of array elements
s0_m = np.maximum(-s0, 0) # s0_m >= 0
s0_p = np.maximum(s0, 0)  # s0_p >= 0

x0 = rng.standard_normal(n)
A = rng.standard_normal((m, n))
b = A @ x0 + s0_p
c = -A.T @ s0_m

# Define and solve the CVXPY problem.
# A x <= b means A (x - x0) <= s0_p
# c^T x    means (-s0_m)^T (A x) 
x = cp.Variable(n)
prob = cp.Problem(
    cp.Minimize(c.T @ x),
    [A @ x <= b]
)
prob.solve()

# Print result.
print(f"""\
The optimal value is {prob.value:f}
A primal solution is {x.value}
A dual solution is {prob.constraints[0].dual_value}
""")

The optimal value is 3.437815
A primal solution is [ 0.65503724  0.12598733 -0.24064036 -0.80122806 -0.43705585 -0.01746409
 -0.32684225  1.47349566  0.94751969 -2.57172013]
A dual solution is [1.08119082e-10 3.28896991e-11 7.32567433e-11 1.30315723e+00
 1.75042457e-11 1.81799107e-10 5.36953235e-01 2.00378352e-11
 2.65461315e-11 7.96635278e-11 1.31820850e-10 5.38049480e-11
 7.36454087e-01 1.62909948e-01 4.82119313e-01]



### Quadratic program

A __quadratic program (QP)__ is an optimization problem with a quadratic objective and affine equality and inequality constraints.
$$
\begin{align*}
\text{minimize} &\qquad \frac{1}{2}x^T Px + q^Tx \\
\text{subject to} &\qquad Gx\preceq h \\
&\qquad Ax = b
\end{align*}
$$
where $P\in\mathbb{R}^{n\times n}_{\succeq 0}$, $q\in\mathbb{R}^n$, $G\in\mathbb{R}^{m\times n}$, $h\in\mathbb{R}^m$, $A\in\mathbb{R}^{p\times n}$, and $b\in\mathbb{R}^p$ are problem data and $x\in\mathbb{R}^n$ is the optimization variable​.

When we solve a QP, in addition to a primal solution $x^*$, we obtain a dual solution $\lambda_i^*\in\mathbb{R}^m$ corresponding to the inequality constraints.

In [14]:
import numpy as np
import cvxpy as cp

# Generate a random non-trivial quadratic program.
m = 15
n = 10
p = 5

rng = np.random.default_rng(3)

# Use Cholesky decomposition to make a positive semidefinite matrix
P = rng.standard_normal((n, n))
P = P.T @ P
q = rng.standard_normal(n)

G = rng.standard_normal((m, n))
h = G @ rng.standard_normal(n)

A = rng.standard_normal((p, n))
b = rng.standard_normal(p)

# Define and solve the CVXPY problem.
x = cp.Variable(n)
prob = cp.Problem(
    cp.Minimize((1/2) * cp.quad_form(x, P) + q.T @ x),
    [G @ x <= h, A @ x == b]
)
prob.solve()

# Print result.
print(f"""\
The optimal value is {prob.value:f}
A primal solution is {x.value}
A dual solution is {prob.constraints[0].dual_value}
""")

The optimal value is 33.439013
A primal solution is [-1.55613147 -1.34830305 -0.94253022  0.95936727 -0.37890983  2.3094063
  1.1557562   0.03372598  0.84450632  0.40878405]
A dual solution is [  0.           6.702611     0.          20.57720965   0.
  15.82002902   0.           0.           0.           0.
 131.79080557   0.           0.           0.          13.73678921]



### Second-order cone program

A __second-order cone program (SOCP)__ is an optimization problem of the form
$$
\begin{align*}
\text{minimize} &\qquad f^Tx \\
\text{subject to} &\qquad \|A_ix+b_i\|_2 \leq c_i^Tx+d_i, \quad 1\leq i\leq m \\
&\qquad Fx = g,
\end{align*}
$$
where $f\in\mathbb{R}^n$, $A_i\in\mathbb{R}^{n_i\times n}$, $b_i\in\mathbb{R}^{n_i}$, $c_i\in\mathbb{R}^n$, $d_i\in\mathbb{R}$, $F\in\mathbb{R}^{p\times n}$, and $g\in\mathbb{R}^p$ are problem data and $x\in\mathbb{R}^n$ is the optimization variable​.

When we solve a SOCP, in addition to a primal solution $x^*$, we obtain a dual solution $(\eta_i^*,\lambda_i^*)\in\mathbb{R}\times\mathbb{R}^{n_i}$ corresponding to each second-order cone constraint.

In [15]:
import numpy as np
import cvxpy as cp

# Generate a random feasible SOCP.
m = 3
n = 10
p = 5
n_i = 5

rng = np.random.default_rng(1)

f = rng.standard_normal(n)
x0 = rng.standard_normal(n)
A = []
b = []
c = []
d = []
for i in range(m):
    A.append(rng.standard_normal((n_i, n)))
    b.append(rng.standard_normal(n_i))
    c.append(rng.standard_normal(n))
    d.append(np.linalg.norm(A[i] @ x0 + b, 2) - c[i].T @ x0)
F = rng.standard_normal((p, n))
g = F @ x0

# Define and solve the CVXPY problem.
x = cp.Variable(n)
# We use cp.SOC(t, x) to create the SOC constraint ||x||_2 <= t.
soc_constraints = [
    cp.SOC(ci.T @ x + di, Ai @ x + bi) for Ai, bi, ci, di in zip(A, b, c, d)
]
prob = cp.Problem(
    cp.Minimize(f.T @ x),
    soc_constraints + [F @ x == g]
)
prob.solve()

# Print result.
print(f"""\
The optimal value is {prob.value:f}
A primal solution is {x.value}
""")
for i in range(m):
    print(f"SOC constraint {i} dual variable solution for (t, x) is {soc_constraints[i].dual_value}")

The optimal value is -1.583298
A primal solution is [-0.47194841  0.25981289 -0.67753554  0.68374499 -0.82365898  1.44204024
 -0.22784841 -0.29562695 -0.73023417 -0.34361817]

SOC constraint 0 dual variable solution for (t, x) is [array([0.40436056]), array([[-0.06346225],
       [-0.14688047],
       [ 0.2177872 ],
       [ 0.29901013],
       [-0.03267733]])]
SOC constraint 1 dual variable solution for (t, x) is [array([0.30254382]), array([[ 0.03090247],
       [ 0.22513544],
       [ 0.05651073],
       [-0.01654781],
       [ 0.19085214]])]
SOC constraint 2 dual variable solution for (t, x) is [array([0.13069577]), array([[-0.08922774],
       [-0.03769338],
       [-0.01941461],
       [-0.02506666],
       [ 0.08181527]])]


### Semidefinite program

A __semidefinite program (SDP)__ is an optimization problem of the form
$$
\begin{align*}
\text{minimize} &\qquad \operatorname{tr}(CX) \\
\text{subject to} &\qquad \operatorname{tr}(A_iX) = b_i, \quad 1\leq i\leq p \\
&\qquad X \succeq 0,
\end{align*}
$$
where $C,A_i\in\mathbb{R}^{n\times n}_\text{sym}$ and $b_i\in\mathbb{R}$ are problem data and $X\in\mathbb{R}^{n\times n}_\text{sym}$ is the optimization variable​, where
$$
\operatorname{tr}(CX) \equiv \sum_{i,j} C_{ij}X_{ij}
$$
is the form of a general real-valued linear function on $\mathbb{R}^{n\times n}_\text{sym}$.

In [16]:
import numpy as np
import cvxpy as cp

# Generate a random SDP.
n = 3
p = 3

rng = np.random.default_rng(1)

C = rng.standard_normal((n, n))
A = []
b = []
for i in range(p):
    A.append(rng.standard_normal((n, n)))
    b.append(rng.standard_normal())

# Define and solve the CVXPY problem.
# Create a symmetric matrix variable.
X = cp.Variable((n, n), symmetric=True)
# The operator >> denotes matrix inequality.
constraints = [
    cp.trace(A[i] @ X) == b[i] for i in range(p)
] + [X >> 0]
prob = cp.Problem(
    cp.Minimize(cp.trace(C @ X)),
    constraints
)
prob.solve()

# Print result.
print(f"""\
The optimal value is {prob.value:f}
A solution is {x.value}
""")

The optimal value is 0.972388
A solution is [-0.47194841  0.25981289 -0.67753554  0.68374499 -0.82365898  1.44204024
 -0.22784841 -0.29562695 -0.73023417 -0.34361817]



### Mixed-integer quadratic program

A __mixed-integer quadratic program (MIQP)__ is an optimization problem of the form
$$
\begin{align*}
\text{minimize} &\qquad x^TQx + q^Tx + r \\
\text{subject to} &\qquad x\in\mathcal{C} \\
&\qquad x\in\mathbb{Z}^n,
\end{align*}
$$
where $Q\in\mathbb{R}^{n\times n}_{\succeq 0}$, $q\in\mathbb{R}^n$, and $r\in\mathbb{R}$ are problem data, $x\in\mathbb{Z}^n$ is the optimization variable​, and $\mathcal{C}$ is some convex set.

CVXPY’s preferred open-source mixed-integer nonlinear solver is [SCIP](http://scip.zib.de).

    conda install -c conda-forge pyscipopt

In [17]:
import numpy as np
import cvxpy as cp

# Generate a random feasible SOCP.
m = 40
n = 25

# Generate a random problem
rng = np.random.default_rng(1)

A = rng.uniform(size=(m, n))
b = rng.standard_normal(m)

# Construct a CVXPY problem
x = cp.Variable(n, integer=True)
objective = cp.Minimize(cp.sum_squares(A @ x - b))
prob = cp.Problem(objective)
prob.solve(solver=cp.SCIP)

# Print result.
print(f"""\
Status: {prob.status}
The optimal value is {prob.value:f}
A solution is {x.value}
""")

Status: optimal
The optimal value is 27.993554
A solution is [ 0.00000000e+00  1.00000000e+00 -1.00000000e+00  0.00000000e+00
  0.00000000e+00  1.00000000e+00 -1.00000000e+00  1.00000000e+00
  0.00000000e+00  1.00000000e+00 -1.00000000e+00  0.00000000e+00
  1.00000000e+00  0.00000000e+00  1.00000000e+00 -1.00000000e+00
 -1.61728989e-07 -1.00000000e+00  0.00000000e+00  0.00000000e+00
 -1.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00
 -1.00000000e+00]

