## Differentiable Optimization Problems
- JuMP GSOC 2020

A convex conic optimization problem in its primal (P) and dual (D) forms:

$$
\begin{split}
\begin{array} {llcc}
\textbf{Primal Problem} & & \textbf{Dual Problem} & \\
\mbox{minimize} & c^T x  \quad \quad & \mbox{minimize} & b^T y  \\
\mbox{subject to} & A x + s = b  \quad \quad & \mbox{subject to} & A^T y + c = 0 \\
& s \in \mathbb{K} &  & y \in \mathbb{K}^*
\end{array}
\end{split}
$$

where
- $x \in \mathbb{R}^n$ is the primal variable, $y \in \mathbb{R}^m$ is the dual variable, and $s \in \mathbb{R}^m$ is the primal slack variable
- $\mathbb{K} \subseteq \mathbb{R}^m$ is a closed convex cone and $\mathbb{K}^* \subseteq \mathbb{R}^m$ is the corresponding dual cone
- $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, $c \in \mathbb{R}^n$ are the problem data

**Goal:** differentiate the program variables $x$, $s$, $y$ with respect to perturbations in the problem data, i.e. $dA$, $db$, $dc$.
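
As a sanity check on what differentiating the solution means, consider the special case $\mathbb{K} = \{0\}^m$ with a square, invertible $A$. This is an assumption made only for this sketch; it is not the general method of the reference article. Then $s = 0$, $x = A^{-1} b$, $y = -A^{-T} c$, and differentiating $Ax = b$ and $A^T y + c = 0$ gives $dx = A^{-1}(db - dA\,x)$ and $dy = -A^{-T}(dc + dA^T y)$. The Julia sketch below compares these formulas with finite differences; all names and numbers in it are illustrative.

```julia
using LinearAlgebra

# Illustrative data: K = {0}^2, so the only feasible point is x = A \ b
A = [2.0 1.0; 1.0 3.0]
b = [1.0, 2.0]
c = [1.0, -1.0]

# Closed-form primal and dual solutions for this special case
solve(A, b, c) = (A \ b, -(A' \ c))
x, y = solve(A, b, c)

# Arbitrary perturbation directions for the problem data
dA = [0.1 0.0; -0.2 0.3]
db = [0.5, -0.1]
dc = [0.2, 0.4]

# Analytic derivatives from differentiating A x = b and A' y + c = 0
dx = A \ (db - dA * x)
dy = -(A' \ (dc + dA' * y))

# Forward finite differences along the same direction
t = 1e-6
xt, yt = solve(A + t * dA, b + t * db, c + t * dc)
fd_dx = (xt - x) / t
fd_dy = (yt - y) / t

println("error in dx: ", norm(dx - fd_dx))
println("error in dy: ", norm(dy - fd_dy))
```

For general cones the solution is no longer an explicit linear solve, which is why the reference article differentiates through the optimality conditions instead.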

Reference article: [_Differentiating Through a Cone Program_](https://arxiv.org/abs/1904.09043) - Akshay Agrawal, Shane Barratt, Stephen Boyd, Enzo Busseti, Walaa M. Moursi, 2019

### Progress so far
- Created a `DiffOpt` wrapper around an MOI optimizer
- Able to differentiate a convex conic program with SDP and SOC constraints
- Able to differentiate a convex quadratic program

### Example program

$$ 
\begin{split}
\begin{array} {llcc}
\mbox{minimize}  &
\left\langle
\left[
\begin{array} {ccc}
        2 & 1 & 0  \\
        1 & 2 & 1  \\
        0 & 1 & 2
   \end{array}
   \right],
        X \right\rangle
        + x_0   &  &   \\
          \mbox{subject to}  &
          \left\langle
          \left[
          \begin{array} {ccc}
          1 & 0 & 0  \\
          0 & 1 & 0  \\
          0 & 0 & 1
          \end{array}
          \right],
          X \right\rangle
          + x_0 & =  & 1,   \\
            &
            \left\langle
            \left[
            \begin{array}{ccc}
            1 & 1 & 1  \\
            1 & 1 & 1  \\
            1 & 1 & 1
            \end{array}
            \right],
            X \right\rangle + x_1 + x_2
            & = & 1/2,  \\
            & (x_0, x_1, x_2) \in \mathbb{Q}^3, \text{ i.e. } x_0 \geq \sqrt{{x_1}^2 + {x_2}^2}, \\
            & X \in \mathbb{S}^3_{+}, \text{ i.e. } X \succeq 0
\end{array}
\end{split}
$$
where
$$
\mathbb{S}^n_{+} =
\left\lbrace
X \in \mathbb{S}^n: z^T X z \geq 0, \quad \forall z \in \mathbb{R}^n
\right\rbrace.
$$

> Taken from the MOSEK examples: https://docs.mosek.com/9.2/toolbox/tutorial-sdo-shared.html#example-sdo1
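
To connect this example to the standard form $(A, b, c, \mathbb{K})$, it may help to count dimensions. The count below is a sketch that assumes the symmetric matrix $X$ is represented by its 6 upper-triangular entries and that each equality constraint contributes one row in the zero cone; the ordering of the blocks inside $\mathbb{K}$ is solver-dependent and is not fixed here.

$$
\begin{aligned}
n &= \underbrace{6}_{\text{triangle of } X} + \underbrace{3}_{(x_0, x_1, x_2)} = 9, \\
m &= \underbrace{6}_{\mathbb{S}^3_{+}} + \underbrace{3}_{\mathbb{Q}^3} + \underbrace{2}_{\text{equalities}} = 11
\end{aligned}
$$

Under these assumptions $A \in \mathbb{R}^{11 \times 9}$, $b \in \mathbb{R}^{11}$ and $c \in \mathbb{R}^{9}$, which is consistent with the sizes of the perturbations `dA`, `db`, `dc` used for this example.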

### Equivalent JuMP program

using SCS
using DiffOpt
using MathOptInterface

const MOI = MathOptInterface;

# Wrap the SCS solver in a differentiable optimizer
model = diff_optimizer(SCS.Optimizer)

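# Closed-form expressions for the optimal solution, kept for reference only;
# none of these values is used in the model below
δ = √(1 + (3*√2+2)*√(-116*√2+166) / 14) / 2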
ε = √((1 - 2*(√2-1)*δ^2) / (2-√2))
y2 = 1 - ε*δ
y1 = 1 - √2*y2
obj = y1 + y2/2
k = -2*δ/ε
x2 = ((3-2obj)*(2+k^2)-4) / (4*(2+k^2)-4*√2)
α = √(3-2obj-4x2)/2
β = k*α

# X holds the 6 upper-triangular entries of the symmetric 3×3 matrix; x holds (x_0, x_1, x_2)
X = MOI.add_variables(model, 6)
x = MOI.add_variables(model, 3)

vov = MOI.VectorOfVariables(X)

# X ⪰ 0 (PSD cone in triangular form)
cX = MOI.add_constraint(
    model, 
    MOI.VectorAffineFunction{Float64}(vov), MOI.PositiveSemidefiniteConeTriangle(3)
)

# (x_0, x_1, x_2) in the second-order cone
cx = MOI.add_constraint(
    model, 
    MOI.VectorAffineFunction{Float64}(MOI.VectorOfVariables(x)), MOI.SecondOrderCone(3)
)

# ⟨I, X⟩ + x_0 = 1, written as (affine expression - 1) ∈ {0}
c1 = MOI.add_constraint(
    model, 
    MOI.VectorAffineFunction(
        MOI.VectorAffineTerm.(1:1, MOI.ScalarAffineTerm.([1., 1., 1., 1.], [X[1], X[3], X[end], x[1]])), 
        [-1.0]
    ), 
    MOI.Zeros(1)
)

# ⟨ones(3, 3), X⟩ + x_1 + x_2 = 1/2 (off-diagonal entries of the triangle get coefficient 2)
c2 = MOI.add_constraint(
    model, 
    MOI.VectorAffineFunction(
        MOI.VectorAffineTerm.(1:1, MOI.ScalarAffineTerm.([1., 2, 1, 2, 2, 1, 1, 1], [X; x[2]; x[3]])), 
        [-0.5]
    ), 
    MOI.Zeros(1)
)

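# Objective ⟨C, X⟩ + x_0: entry (1,3) of C is 0, so X[4] is skipped;
# off-diagonal entries count twice, so every remaining coefficient is 2
objXidx = [1:3; 5:6]
objXcoefs = 2*ones(5)
MOI.set(
    model,
    MOI.ObjectiveFunction{MOI.ScalarAffineFunction{Float64}}(),
    MOI.ScalarAffineFunction(MOI.ScalarAffineTerm.([objXcoefs; 1.0], [X[objXidx]; x[1]]), 0.0)
)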
MOI.set(model, MOI.ObjectiveSense(), MOI.MIN_SENSE)

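# Solve, then read the primal, slack, and dual vectors
# (note that `x` is rebound from variable indices to solution values)
sol = MOI.optimize!(model)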

x = sol.primal
s = sol.slack
y = sol.dual

println("x -> ", x)
println("s -> ", s)
println("y -> ", y)

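# Perturbations of the problem data: A is 11×9 (11 conic rows, 9 variables), b has 11 entries, c has 9
dA = ones(11, 9)
db = ones(11)
dc = ones(9)

# Resulting perturbations of the primal, dual, and slack solutions
dx, dy, ds = backward_conic!(model, dA, db, dc)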

println("dx -> ", dx)
println("ds -> ", ds)
println("dy -> ", dy)

# An equivalent diffcp program is available at
# https://github.com/AKS1996/jump-gsoc-2020/blob/master/diffcp_sdp_3_py.ipynb

x -> [0.217251238770051, -0.2599704147186366, 0.31108967252616243, 0.2172512383948794, -0.25997041471863536, 0.21725123877005162, 0.25440785083602735, 0.17989351643122645, 0.17989351643122645]
s -> [-5.8304524274385224e-18, 1.672022895212547e-17, 0.25440785071939953, 0.17989351643078233, 0.17989351643078233, 0.21725123843399946, -0.36765368641915924, 0.30723964791089897, 0.31108967225522005, -0.3676536864191592, 0.2172512384339987]
y -> [0.5447582105062065, 0.3219045549528856, 0.45524178943652027, -0.3219045563900619, -0.3219045563900619, 1.133337233849821, 0.9589717750123417, -0.4552417892860504, 1.1333372326447106, 0.958971775012335, 1.1333372338498173]
dx -> [1.6170286480279092, -0.5569023296651843, -0.7471794115241701, 1.6003051546584053, -0.5569023296652973, 1.6170286480279237, -2.429780390766397, -1.7013907607653123, -1.7013907607653111]
ds -> [1.468909818880985e-17, -4.212453285122734e-17, -2.486877848961781, -1.7584881909834902, -1.7584881909834902, 1.55993119439835, -0.8446762
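
The printed primal solution can be checked against the constraints of the example program. The sketch below copies the `x` vector from the output above and assumes its first six entries are the triangle of $X$ in the column-wise upper-triangle order $(X_{11}, X_{12}, X_{22}, X_{13}, X_{23}, X_{33})$ used by `MOI.PositiveSemidefiniteConeTriangle`, followed by $(x_0, x_1, x_2)$. The tolerance `tol` and the variable names are arbitrary choices for this illustration.

```julia
using LinearAlgebra

# Primal solution copied from the printed output
xsol = [0.217251238770051, -0.2599704147186366, 0.31108967252616243,
        0.2172512383948794, -0.25997041471863536, 0.21725123877005162,
        0.25440785083602735, 0.17989351643122645, 0.17989351643122645]

# Rebuild the symmetric matrix from its triangle (assumed column-wise order)
t = xsol[1:6]
Xmat = Symmetric([t[1] t[2] t[4];
                  t[2] t[3] t[5];
                  t[4] t[5] t[6]])
x0, x1, x2 = xsol[7:9]

tol = 1e-6                               # arbitrary feasibility tolerance
eq1 = tr(Xmat) + x0 - 1                  # residual of ⟨I, X⟩ + x_0 = 1
eq2 = sum(Xmat) + x1 + x2 - 0.5          # residual of ⟨ones, X⟩ + x_1 + x_2 = 1/2
soc_gap = x0 - hypot(x1, x2)             # nonnegative inside the second-order cone
min_eig = eigmin(Xmat)                   # nonnegative inside the PSD cone

println("equalities hold: ", abs(eq1) < tol && abs(eq2) < tol)
println("in SOC:          ", soc_gap > -tol)
println("in PSD cone:     ", min_eig > -tol)
```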

## What's next?

+ Add matrix builder code to `MatrixOptInterface.jl`
+ Add projections in `MathOptDistances.jl`
+ Support sparse `A` matrix in computations - Time benchmarking
+ Documentation
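
For the projections item, both cones in the example program have well-known closed-form Euclidean projections. The following Julia sketch shows the standard formulas for the second-order cone and the PSD cone. The function names `project_soc` and `project_psd` are hypothetical; this is not the `MathOptDistances.jl` API.

```julia
using LinearAlgebra

# Euclidean projection of v = (t, u) onto the second-order cone {(t, u) : t >= ||u||}
function project_soc(v::AbstractVector{<:Real})
    t, u = v[1], v[2:end]
    nu = norm(u)
    if nu <= t
        return float.(v)               # already inside the cone
    elseif nu <= -t
        return zeros(length(v))        # inside the polar cone: projects to the origin
    else
        α = (nu + t) / 2               # here nu > |t| >= 0, so dividing by nu is safe
        return vcat(α, (α / nu) .* u)
    end
end

# Frobenius-norm projection of a symmetric matrix onto the PSD cone:
# keep the eigenvectors and clip negative eigenvalues to zero
function project_psd(X::AbstractMatrix{<:Real})
    F = eigen(Symmetric(Matrix{Float64}(X)))
    return F.vectors * Diagonal(max.(F.values, 0.0)) * F.vectors'
end

println(project_soc([0.0, 3.0, 4.0]))        # [2.5, 1.5, 2.0]
println(project_psd([1.0 0.0; 0.0 -1.0]))
```

Dense eigendecomposition is the simplest choice for small matrices such as the 3×3 block in the example; larger problems may call for a different approach.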