# Convex Optimization in Julia

## Madeleine Udell | JuliaCon 2015

## Convex.jl team

* [Convex.jl](https://github.com/cvxgrp/Convex.jl): Madeleine Udell, Karanveer Mohan, David Zeng, Jenny Hong

## Collaborators/Inspiration:

* [cvx](http://www.cvxr.com): Michael Grant, Stephen Boyd
* [cvxpy](https://github.com/cvxgrp/cvxpy): Steven Diamond, Eric Chu, Stephen Boyd

# Convex optimization

## Convexity

* A **convex combination** of the points $x$ and $y$ is any point of the form 
$$
\theta x + (1-\theta)y
$$
for $\theta \in [0,1]$.

* A set $S$ is **convex** if, whenever $x \in S$ and $y \in S$, 
then $\theta x + (1-\theta)y \in S$ for all $\theta \in [0,1]$.

* A function $f$ is **convex** if for all $x$, $y$ in its domain and all $\theta \in [0,1]$,
$$
f(\theta x + (1-\theta)y ) \leq \theta f(x) + (1-\theta) f(y).
$$

    equivalently, 

    * $f$ has nonnegative (upward) curvature
    * the graph of $f$ never lies above its chords
    * $f'' \geq 0$ (if $f$ is twice differentiable)
    * the epigraph $\{(x,t) : f(x) \leq t\}$ is convex (which implies that the sublevel sets $\{x : f(x) \leq \alpha\}$ are convex)

<!--![chords](chord.png)-->

A function $f$ is convex if convex combinations of points only get better (lower). 
        
(We like to **minimize** functions.)
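
The defining inequality can be spot-checked numerically. The sketch below is plain Julia with no packages; `satisfies_convexity` is a hypothetical helper name introduced here, and passing the check on a finite grid is only evidence of convexity, not a proof.

```julia
# Test f(theta*x + (1-theta)*y) <= theta*f(x) + (1-theta)*f(y)
# for every pair of sample points and every sample weight.
# `satisfies_convexity` is a hypothetical helper, not part of any package.
function satisfies_convexity(f, xs; thetas=0:0.1:1, tol=1e-9)
    for x in xs, y in xs, theta in thetas
        lhs = f(theta * x + (1 - theta) * y)
        rhs = theta * f(x) + (1 - theta) * f(y)
        lhs <= rhs + tol || return false   # found a violating triple
    end
    return true
end

xs = -5:0.5:5
println(satisfies_convexity(x -> x^2, xs))  # x^2 passes on this grid
println(satisfies_convexity(sin, xs))       # sin fails, e.g. at x = 0, y = 3, theta = 0.5
```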

In [None]:
using Gadfly
f(x) = x*x
# Chord of f between a and b; NaN outside [a, b] so nothing is drawn there
chord(x; a=-3, b=4) = a <= x <= b ? (f(b)-f(a))/(b-a)*(x-a) + f(a) : NaN
p = plot([f,chord],-5,5)

## Convex optimization (nonlinear form)

$$
\begin{array}{ll} 
\mbox{minimize}  & f_0(x) \\
\mbox{subject to} & f_i(x) \leq 0, \quad i=1, \ldots, m_1\\
& h_i(x) = 0, \quad i=1, \ldots, m_2\\
\end{array}
$$

* variable $x\in \mathbf{R}^n$
* $f_i$ are all convex
* $h_i$ are all affine

In other words, a problem is convex if convex combinations of feasible points 

1. are still feasible
2. have objective values at least as good as the same combination of the endpoints' objective values
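
Both properties follow directly from the definitions. As a short derivation, take feasible points $x$ and $y$ and write $z = \theta x + (1-\theta)y$ with $\theta \in [0,1]$ (the symbol $z$ is notation introduced here):

$$
\begin{array}{ll}
f_i(z) \leq \theta f_i(x) + (1-\theta) f_i(y) \leq 0, & i=1, \ldots, m_1\\
h_i(z) = \theta h_i(x) + (1-\theta) h_i(y) = 0, & i=1, \ldots, m_2\\
f_0(z) \leq \theta f_0(x) + (1-\theta) f_0(y) \leq \max\{f_0(x), f_0(y)\} &
\end{array}
$$

The first line uses convexity of $f_i$, the second uses that $h_i$ is affine, and the third uses convexity of $f_0$.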

## Convex optimization (conic form)

$$
\begin{array}{ll} 
\mbox{minimize}  & c^T x \\
    \mbox{subject to} & Ax = b\\
    & x \in \mathcal K\\
\end{array}
$$

where $\mathcal K$ is a **convex cone**:
* $x \in \mathcal K$ iff $rx \in \mathcal K$ for any $r>0$
* $\mathcal K$ is a convex set

examples:
* positive orthant $\mathcal K = \{x: x_i \geq 0,~i=1,\ldots,n\}$
* second order cone $\mathcal K = \{(x,t): \|x\|_2 \leq t\}$
    * *aka* ice cream cone
* positive semidefinite (PSD) cone $\mathcal K = \{X: X = X^T,~ v^T X v \geq 0,~ \forall v \in \mathbf{R}^n\}$
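
Membership in each of these cones is easy to test numerically. The following is a minimal sketch assuming Julia 1.x and its `LinearAlgebra` standard library; the three function names are hypothetical helpers introduced for illustration, and the tolerance in the PSD test is an arbitrary choice.

```julia
using LinearAlgebra

# Positive orthant: every coordinate is nonnegative.
in_positive_orthant(x) = all(x .>= 0)

# Second order cone: the Euclidean norm of x is at most t.
in_second_order_cone(x, t) = norm(x) <= t

# PSD cone: symmetric with smallest eigenvalue nonnegative (up to a tolerance).
in_psd_cone(X; tol=1e-10) = issymmetric(X) && eigmin(Symmetric(X)) >= -tol

println(in_positive_orthant([1.0, 0.0, 2.0]))
println(in_second_order_cone([3.0, 4.0], 6.0))
println(in_psd_cone([2.0 -1.0; -1.0 2.0]))   # eigenvalues 1 and 3
println(in_psd_cone([1.0 2.0; 2.0 1.0]))     # eigenvalues -1 and 3
```

Scaling a member by any $r > 0$ keeps it in the cone, which can be checked the same way, for example `in_second_order_cone(2.5 * [3.0, 4.0], 2.5 * 6.0)`.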

## Why convex optimization?

* beautiful, nearly complete theory
    * duality, optimality conditions, ...
* effective algorithms, methods (in theory and practice)
    * get **global solution** (and optimality certificate)
    * polynomial complexity
* conceptual unification of many methods
* useful even for nonconvex problems
    * subroutine in larger algorithm (k-means, EM)
    * bounds/heuristics for hard problems
* lots of applications
    * machine learning, statistics, control, finance, signal and image processing, vision, networking, ...

# Convex.jl

* write problems in nonlinear form
* solve problems by calling (fast) conic form solvers

In [None]:
# Make the Convex.jl module available
using Convex, SCS
set_default_solver(SCSSolver(verbose=0))

# Generate random problem data
m = 4;  n = 5
A = randn(m, n); b = randn(m, 1)

# Create a (column vector) variable of size n.
x = Variable(n)

# The problem is to minimize ||Ax - b||^2 subject to x >= 0
# This can be done by: minimize(objective, constraints)
problem = minimize(sumsquares(A * x - b), 
                   x >= 0)

# Solve the problem by calling solve!
solve!(problem)

# Check the status of the problem
println("problem status is ", problem.status) # :Optimal, :Infeasible, :Unbounded etc.

# Get the optimum value
println("optimal value is ", problem.optval)
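
For a cross-check that does not depend on Convex.jl, the same nonnegative least squares problem can be attacked with projected gradient descent. This is a swapped-in technique, not what Convex.jl or SCS does internally. The sketch assumes Julia 1.x; `nnls_projected_gradient`, `A_demo`, and `b_demo` are hypothetical names, and the iteration count is an arbitrary choice.

```julia
using LinearAlgebra

# Projected gradient descent for: minimize ||A*x - b||^2 subject to x >= 0.
# `nnls_projected_gradient` is a hypothetical helper, not part of Convex.jl.
function nnls_projected_gradient(A, b; iters=2000)
    x = zeros(size(A, 2))
    step = 1 / (2 * opnorm(A)^2)          # 1/L, with L the Lipschitz constant of the gradient
    for _ in 1:iters
        gradient = 2 * A' * (A * x - b)   # gradient of ||A*x - b||^2
        x = max.(x - step * gradient, 0)  # gradient step, then project onto x >= 0
    end
    return x
end

# Separate demo data, so the notebook's A and b are left untouched.
A_demo = randn(4, 5); b_demo = randn(4)
x_pg = nnls_projected_gradient(A_demo, b_demo)
println("objective value: ", sum(abs2, A_demo * x_pg - b_demo))
```

With enough iterations the printed objective should approach the optimal value reported by the solver for the same data, up to solver tolerance.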

# Quick convex prototyping

## Variables

In [None]:
# Scalar variable
x = Variable()

In [None]:
# (Column) vector variable
y = Variable(4)

In [None]:
# Matrix variable
z = Variable(4, 2)

# Expressions

Convex.jl allows you to use a wide variety of [functions](http://convexjl.readthedocs.org/en/latest/operations.html) on variables and on expressions to form new expressions.

### Definition
* $f$ is **concave** $\iff$ $-f$ is convex
* $f$ is **affine** $\iff$ $f$ is both convex and concave

Recall a function $f$ is convex if convex combinations of points only get better (lower). (Because we like to *minimize* functions.)

In [None]:
x + 2x

In [None]:
x + y[1]

In [None]:
x+y

In [None]:
maximum(y)

In [None]:
maximum(abs(y))

In [None]:
minimum(y)

In [None]:
minimum(abs(y))
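
Of the expressions above, `minimum(abs(y))` is the one that should fall outside the rules: it is a concave function applied to a convex one, and the result is neither convex nor concave. A plain-Julia check on ordinary vectors (not Convex.jl variables) exhibits a counterexample to each property; `minabs` and `midpoint` are hypothetical helpers introduced here.

```julia
# min_i |y_i| evaluated on plain numeric vectors.
minabs(y) = minimum(abs.(y))
midpoint(u, v) = (u .+ v) ./ 2

# Convexity fails: the midpoint value exceeds the average of the endpoint values.
u1, v1 = [1.0, 0.0], [0.0, 1.0]
println(minabs(midpoint(u1, v1)) > (minabs(u1) + minabs(v1)) / 2)   # 0.5 > 0

# Concavity fails: the midpoint value is below the average of the endpoint values.
u2, v2 = [1.0, -1.0], [-1.0, 1.0]
println(minabs(midpoint(u2, v2)) < (minabs(u2) + minabs(v2)) / 2)   # 0 < 1
```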

# Constraints

A constraint is convex if convex combinations of feasible points are also feasible. Equivalently, feasible sets are convex sets.

In other words, convex constraints are of the form

* `convexExpr <= 0`
* `concaveExpr >= 0`
* `affineExpr == 0`

In [None]:
x <= 0

In [None]:
x^2 <= 0

In [None]:
x^2 <= sum(y)

In [None]:
x^2 >= 1
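
The last constraint, `x^2 >= 1`, has the form `convexExpr >= constant` and so does not match any of the three convex forms. A quick check on plain numbers shows its feasible set is not convex; `feasible` is a hypothetical helper introduced for this illustration.

```julia
# Feasibility test for the constraint x^2 >= 1 on plain numbers.
feasible(x) = x^2 >= 1

lo, hi = -1.0, 1.0
println(feasible(lo) && feasible(hi))   # both endpoints are feasible
println(feasible((lo + hi) / 2))        # their midpoint 0.0 is not
```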

# Problems

In [None]:
x = Variable()
y = Variable(4)
objective = 2*x + 1 - sqrt(sum(y))
constraint = x >= maximum(y)
p = minimize(objective, constraint)

In [None]:
# solve the problem
solve!(p)
p.status

In [None]:
x.value

In [None]:
# can evaluate expressions directly
evaluate(objective)

## Problem variants

* minimization problems 
    ````
    minimize(objective, constraints)
    ````
* maximization problems 
    ````
    maximize(objective, constraints)
    ````
* constraint satisfaction problems 
    ````
    satisfy(constraints)
    ````

# Examples: least squares and friends

# Convex problems

Convex.jl only solves convex optimization problems.

### How can you tell if a problem is convex?

Let's write our optimization problem as
$$
\begin{array}{ll} 
\mbox{minimize}  & f(x) \\
\mbox{subject to} & x \in \mathcal C.
\end{array}
$$

Recall a problem is convex if convex combinations of feasible points only get better:
that is, if $x \in \mathcal C$ and $y \in \mathcal C$, then
* $(x+y)/2 \in \mathcal C$
* $f((x+y)/2) \leq (f(x) + f(y))/2$

To verify convexity, Convex.jl simply checks that 

* objective is
    * `minimize(convexExpr)`
    * `maximize(concaveExpr)`
* constraints are
    * `convexExpr <= 0`
    * `concaveExpr >= 0`
    * `affineExpr == 0`

### How can you tell if an expression is convex?

Use **disciplined convex programming:** Infer convexity of expressions by induction.

* variables have known vexity (affine) and sign
* library of atoms with known vexity and sign (as function of their arguments)

More information at [dcp.stanford.edu](http://dcp.stanford.edu)

## Convexity inference
1. $f \circ g(x)$ is convex in $x$ if
    * $f$ is convex increasing and $g$ is convex 
    * $f$ is convex decreasing and $g$ is concave 

1. $f \circ g(x)$ is concave in $x$ if
    * $f$ is concave increasing and $g$ is concave 
    * $f$ is concave decreasing and $g$ is convex

For smooth functions, derivation via chain rule:
$$
(f \circ g)''(x) = f''(g(x))(g'(x))^2 + f'(g(x))g''(x)
$$
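
As a worked instance of the chain rule (the choice $f(u) = e^u$, $g(x) = x^2$ is an illustration added here, not taken from the talk): $f$ is convex and increasing, $g$ is convex, and every term of the second derivative is nonnegative:

$$
(f \circ g)''(x) = e^{x^2}(2x)^2 + e^{x^2} \cdot 2 = (4x^2 + 2)\,e^{x^2} \geq 0.
$$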

example: 
* `+` is convex and increasing in its arguments
* so `convexExpr + convexExpr` is convex

## Library of atoms

See [the docs](http://convexjl.readthedocs.org/en/latest/operations.html)

## DCP examples

In [None]:
# affine
x = Variable(4)
y = Variable(2)

In [None]:
2*maximum(x) + 4*sum(y) - sqrt(y[1] + x[1]) - 7 * minimum(x[2:4])

In [None]:
# not dcp compliant
sqrt(x) + x^2

In [None]:
# $f$ is convex increasing and $g$ is convex
square(pos(x))

In [None]:
# $f$ is convex decreasing and $g$ is concave 
invpos(sqrt(x))

In [None]:
# $f$ is concave increasing and $g$ is concave 
sqrt(sqrt(x))

# How does it work?

* construct abstract syntax tree
* parse to canonical form
* pass to solver

## Construct abstract syntax tree

(done automatically during parsing)

* for objective
* for each constraint

example:

    pos(sum(x)) + 1 = (:+, [(:pos, [(:sum, [x])]), 1])
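
Julia's own quoting mechanism gives a feel for this tree structure. The sketch below uses the built-in `Expr` type rather than Convex.jl's internal representation, so the leaf is the symbol `:x` and not a `Variable`; `totuple` is a hypothetical helper that assumes every node is a function call.

```julia
# Julia's own quoting builds a syntax tree for the same expression.
ex = :(pos(sum(x)) + 1)

# Convert an Expr of nested calls into (head, [children]) tuples.
# `totuple` is a hypothetical helper written for this illustration.
totuple(e::Expr) = (e.args[1], [totuple(arg) for arg in e.args[2:end]])
totuple(leaf) = leaf   # symbols and numbers are leaves

println(totuple(ex))
```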

## Parse to canonical form: introduce new variables

one atom per constraint
$$
\begin{array}{rcl}
v_1 &=& \mbox{sum}(x) \\
v_2 &=& \mbox{pos}(v_1) \\
v_3 &=& v_2 + 1 \\
\end{array}
$$

## Parse to canonical form: relax convex constraints

problem is convex $\implies$ relaxation is tight at solution

$$
\begin{array}{rcl}
v_1 &=& \mbox{sum}(x) \\
v_2 &\geq& \mbox{pos}(v_1) \\
v_3 &=& v_2 + 1 \\
\end{array}
$$
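
A sketch of why the relaxation is tight in this example, assuming the expression $v_3$ is being minimized (the example does not state an objective, so this is an assumption): $v_3 = v_2 + 1$ increases with $v_2$, so at the solution $v_2$ sits at its lower bound,

$$
v_2^\star = \min \{ v_2 : v_2 \geq \mbox{pos}(v_1) \} = \mbox{pos}(v_1) = \max(v_1, 0).
$$

The inequality therefore holds with equality at the optimum, and the relaxed problem has the same solution as the original.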

## Parse to canonical form: canonicalize

every atom has a canonical conic form

* $v_1 = \mbox{sum}(x)$ becomes $v_1 = 1^T x$
* $v_2 \geq \mbox{pos}(v_1)$ becomes 
    * $v_2 \in \mathcal K_+$
    * $v_2 - v_1 = v_4$
    * $v_4 \in \mathcal K_+$

$$
\begin{array}{rcl}
v_1 &=& 1^T x \\
v_2 - v_1 &=& v_4 \\
v_2 &\in& \mathcal K_+ \\
v_4 &\in& \mathcal K_+ \\
v_3 &=& v_2 + 1 \\
\end{array}
$$

In [None]:
## In code
x = Variable(n)   # x was redefined as Variable(4) above; A is m-by-n, so x needs length n
conic_problem(minimize(sumsquares(A*x-b), x>=0))

## Conic form

problem is now in standard conic form
$$
\begin{array}{ll} 
\mbox{minimize}  & c^T x \\
    \mbox{subject to} & Ax = b\\
    & x \in \mathcal K\\
\end{array}
$$

* $\mathcal K = $ positive orthant $\implies$ linear program
* $\mathcal K = $ second order cone $\implies$ SOCP
* $\mathcal K = $ semidefinite cone $\implies$ SDP
* $\mathcal K = $ exponential cone $\implies$ exponential cone program

## Pass to solver

call a `MathProgBase` solver suited for your problem class

* see the [list of Convex.jl operations](http://convexjl.readthedocs.org/en/latest/operations.html) to find which cones you're using
* see the [list of solvers](http://www.juliaopt.org/) for an up-to-date list of solvers and which cones they support

To solve a problem with a different solver, import the solver package and pass the solver to the `solve!` method, e.g.

    using Mosek
    solve!(p, MosekSolver())