## Differentiable Optimization Problems
- JuMP GSoC 2020

A convex conic optimization problem in its primal (P) and dual (D) forms:

$$
\begin{split}
\begin{array} {llcc}
\textbf{Primal Problem} & & \textbf{Dual Problem} & \\
\mbox{minimize} & c^T x  \quad \quad & \mbox{minimize} & b^T y  \\
\mbox{subject to} & A x + s = b  \quad \quad & \mbox{subject to} & A^T y + c = 0 \\
& s \in \mathcal{K} &  & y \in \mathcal{K}^*
\end{array}
\end{split}
$$

where
- $x \in \mathbb{R}^n$ is the primal variable, $y \in \mathbb{R}^m$ is the dual variable, and $s \in \mathbb{R}^m$ is the primal slack variable
- $\mathcal{K} \subseteq \mathbb{R}^m$ is a closed convex cone and $\mathcal{K}^* \subseteq \mathbb{R}^m$ is the corresponding dual cone
- $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, $c \in \mathbb{R}^n$ are the problem data

**Goal:** differentiate the solution $(x, s, y)$ with respect to perturbations $dA$, $db$, $dc$ in the problem data.

Reference article: [_Differentiating Through a Cone Program_](https://arxiv.org/abs/1904.09043) - Akshay Agrawal, Shane Barratt, Stephen Boyd, Enzo Busseti, Walaa M. Moursi, 2019
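
For orientation, the approach of the reference article can be condensed as follows. This is our reading of the paper, and the symbols $Q$, $z = (u, v, w)$, $\Pi$ and $M$ come from the paper rather than from this notebook, so treat it as a sketch and not as a description of what DiffOpt computes internally. A triple $(x, y, s)$ is optimal when

$$
A x + s = b, \quad A^T y + c = 0, \quad s \in \mathcal{K}, \quad y \in \mathcal{K}^*, \quad s^T y = 0.
$$

The problem data are collected in a skew-symmetric matrix, and a solution corresponds to a zero of a residual map built from it:

$$
Q =
\left[
\begin{array} {ccc}
0 & A^T & c \\
-A & 0 & b \\
-c^T & -b^T & 0
\end{array}
\right],
\qquad
\mathcal{N}(z, Q) = \left( (Q - I)\,\Pi + I \right) \frac{z}{|w|},
\qquad
(x, y, s) = \frac{\left( u,\ \Pi_{\mathcal{K}^*}(v),\ \Pi_{\mathcal{K}^*}(v) - v \right)}{w},
$$

where $\Pi$ is the projection onto $\mathbb{R}^n \times \mathcal{K}^* \times \mathbb{R}_+$. Applying the implicit function theorem to $\mathcal{N}(z, Q) = 0$, a perturbation $dQ$ (assembled from $dA$, $db$, $dc$ in the same block pattern as $Q$) gives

$$
dz = -M^{-1}\, dQ\, \Pi\!\left(\frac{z}{|w|}\right),
\qquad
M = \frac{(Q - I)\, D\Pi(z) + I}{w},
$$

and, assuming the normalization $w = 1$, the solution moves by

$$
dx = du - (dw)\, x, \qquad
dy = D\Pi_{\mathcal{K}^*}(v)\, dv - (dw)\, y, \qquad
ds = D\Pi_{\mathcal{K}^*}(v)\, dv - dv - (dw)\, s.
$$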

### Progress so far
- Created a DiffOpt wrapper around an MOI optimizer
- Able to differentiate a convex conic program with SDP and SOC constraints
- Able to differentiate a convex quadratic program

### Example program

$$ 
\begin{split}
\begin{array} {llcc}
\mbox{minimize}  &
\left\langle
\left[
\begin{array} {ccc}
        2 & 1 & 0  \\
        1 & 2 & 1  \\
        0 & 1 & 2
   \end{array}
   \right],
        X \right\rangle
        + x_0   &  &   \\
          \mbox{subject to}  &
          \left\langle
          \left[
          \begin{array} {ccc}
          1 & 0 & 0  \\
          0 & 1 & 0  \\
          0 & 0 & 1
          \end{array}
          \right],
          X \right\rangle
          + x_0 & =  & 1,   \\
            &
            \left\langle
            \left[
            \begin{array}{ccc}
            1 & 1 & 1  \\
            1 & 1 & 1  \\
            1 & 1 & 1
            \end{array}
            \right],
            X \right\rangle + x_1 + x_2
            & = & 1/2,  \\
            & (x_0, x_1, x_2) \in \mathbb{Q}^3, \text{ i.e. } x_0 \geq \sqrt{{x_1}^2 + {x_2}^2}, \\
            & X \in \mathbb{S}^3_{+}, \text{ i.e. } X \succeq 0
\end{array}
\end{split}
$$
where
$$
\mathbb{S}^n_{+} =
\left\lbrace
X \in \mathbb{S}^n: z^T X z \geq 0, \quad \forall z \in \mathbb{R}^n
\right\rbrace.
$$

> Source: MOSEK documentation example `sdo1`: https://docs.mosek.com/9.2/toolbox/tutorial-sdo-shared.html#example-sdo1

### Equivalent JuMP program

```julia
using SCS
using DiffOpt
using MathOptInterface

const MOI = MathOptInterface;
```

```julia
model = diff_optimizer(SCS.Optimizer)
MOI.set(model, MOI.Silent(), true)

# Closed-form expressions kept from the original notebook;
# they are not used in the rest of this cell.
δ = √(1 + (3*√2+2)*√(-116*√2+166) / 14) / 2
ε = √((1 - 2*(√2-1)*δ^2) / (2-√2))
y2 = 1 - ε*δ
y1 = 1 - √2*y2
obj = y1 + y2/2
k = -2*δ/ε
x2 = ((3-2obj)*(2+k^2)-4) / (4*(2+k^2)-4*√2)
α = √(3-2obj-4x2)/2
β = k*α

# 6 variables for the upper triangle of the symmetric 3×3 matrix X
# (ordered X11, X12, X22, X13, X23, X33) and 3 variables for (x_0, x_1, x_2)
X = MOI.add_variables(model, 6)
x = MOI.add_variables(model, 3)

vov = MOI.VectorOfVariables(X)

# X ⪰ 0
cX = MOI.add_constraint(
    model,
    MOI.VectorAffineFunction{Float64}(vov), MOI.PositiveSemidefiniteConeTriangle(3)
)

# (x_0, x_1, x_2) in the second-order cone
cx = MOI.add_constraint(
    model,
    MOI.VectorAffineFunction{Float64}(MOI.VectorOfVariables(x)), MOI.SecondOrderCone(3)
)

# tr(X) + x_0 = 1 (the diagonal sits at positions 1, 3 and 6 of the triangle)
c1 = MOI.add_constraint(
    model,
    MOI.VectorAffineFunction(
        MOI.VectorAffineTerm.(1:1, MOI.ScalarAffineTerm.([1., 1., 1., 1.], [X[1], X[3], X[end], x[1]])),
        [-1.0]
    ),
    MOI.Zeros(1)
)

# sum of all entries of X + x_1 + x_2 = 1/2 (off-diagonal entries count twice)
c2 = MOI.add_constraint(
    model,
    MOI.VectorAffineFunction(
        MOI.VectorAffineTerm.(1:1, MOI.ScalarAffineTerm.([1., 2, 1, 2, 2, 1, 1, 1], [X; x[2]; x[3]])),
        [-0.5]
    ),
    MOI.Zeros(1)
)

# objective: <C, X> + x_0 (X13 has coefficient 0 and is skipped)
objXidx = [1:3; 5:6]
objXcoefs = 2*ones(5)
MOI.set(
    model,
    MOI.ObjectiveFunction{MOI.ScalarAffineFunction{Float64}}(),
    MOI.ScalarAffineFunction(MOI.ScalarAffineTerm.([objXcoefs; 1.0], [X[objXidx]; x[1]]), 0.0)
)
MOI.set(model, MOI.ObjectiveSense(), MOI.MIN_SENSE)

sol = MOI.optimize!(model)

# fetch solution (note: `x` is rebound from variable indices to primal values)
x = sol.primal
s = sol.slack
y = sol.dual

println("x -> ", round.(x; digits=3))
println("s -> ", round.(s; digits=3))
println("y -> ", round.(y; digits=3))

# perturbations in the problem data; A is 11×9:
# 9 variables, and 2 equality + 3 SOC + 6 PSD-triangle rows
dA = ones(11, 9)
db = ones(11)
dc = ones(9)

# differentiate and get the gradients
dx, dy, ds = backward_conic!(model, dA, db, dc)

println("dx -> ", round.(dx; digits=3))
println("ds -> ", round.(ds; digits=3))
println("dy -> ", round.(dy; digits=3))

# find equivalent diffcp program here
#                https://github.com/AKS1996/jump-gsoc-2020/blob/master/diffcp_sdp_3_py.ipynb
```

```
x -> [0.217, -0.26, 0.311, 0.217, -0.26, 0.217, 0.254, 0.18, 0.18]
s -> [-0.0, 0.0, 0.254, 0.18, 0.18, 0.217, -0.368, 0.307, 0.311, -0.368, 0.217]
y -> [0.545, 0.322, 0.455, -0.322, -0.322, 1.133, 0.959, -0.455, 1.133, 0.959, 1.133]
dx -> [1.617, -0.557, -0.747, 1.6, -0.557, 1.617, -2.43, -1.701, -1.701]
ds -> [0.0, -0.0, -2.487, -1.758, -1.758, 1.56, -0.845, 2.206, -0.804, -0.845, 1.56]
dy -> [2.059, 9.71, 4.481, -3.169, -3.169, -5.228, -9.106, -9.106, -5.228, -9.106, -5.228]
```
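
As a quick sanity check, the printed primal solution can be plugged back into the example program. The sketch below uses only `LinearAlgebra` and the rounded numbers printed above, so residuals of the order of the rounding error ($10^{-3}$) are expected. It assumes that the first six entries of `x` are the upper triangle of $X$ in the order $X_{11}, X_{12}, X_{22}, X_{13}, X_{23}, X_{33}$ and that the first two entries of `y` are the multipliers of the two equality constraints; the names `xsol`, `ysol`, `res_trace` and so on are ours, not part of DiffOpt.

```julia
using LinearAlgebra

# Rounded values copied from the printed output of the notebook cell.
xsol = [0.217, -0.26, 0.311, 0.217, -0.26, 0.217, 0.254, 0.18, 0.18]
ysol = [0.545, 0.322]   # only the first two multipliers are used here

# Assumed layout: upper triangle of X column by column, then (x_0, x_1, x_2).
X11, X12, X22, X13, X23, X33 = xsol[1:6]
X = [X11 X12 X13;
     X12 X22 X23;
     X13 X23 X33]
x0, x1, x2 = xsol[7:9]

# Objective matrix of the example program.
C = [2.0 1.0 0.0;
     1.0 2.0 1.0;
     0.0 1.0 2.0]

objective  = dot(C, X) + x0            # <C, X> + x_0
res_trace  = tr(X) + x0 - 1.0          # first equality constraint
res_sum    = sum(X) + x1 + x2 - 0.5    # second equality constraint
soc_margin = x0 - hypot(x1, x2)        # >= 0 inside the second-order cone
min_eig    = eigmin(Symmetric(X))      # >= 0 for a PSD matrix
dual_value = ysol[1] + ysol[2] / 2     # right-hand sides 1 and 1/2 times their multipliers

println("objective        = ", round(objective; digits=3))
println("dual value       = ", round(dual_value; digits=3))
println("trace residual   = ", round(res_trace; digits=4))
println("sum residual     = ", round(res_sum; digits=4))
println("SOC margin       = ", round(soc_margin; digits=4))
println("min eigenvalue   = ", round(min_eig; digits=4))
```

The cone margins come out slightly negative only because the inputs are rounded to three digits. The primal objective and the value built from the first two multipliers agree to within that rounding, which is what strong duality suggests, up to the sign convention used for $y$.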


## What's next?

+ Add matrix builder code to `MatrixOptInterface.jl`
+ Add projections in `MathOptDistances.jl`
+ Support a sparse `A` matrix in computations and benchmark the run time
+ Documentation