# Problem 1. [60 points] Making Sandcastle

Grad student Alice is in Cancun for her end-of-the-year vacation. Relaxing on the beautiful white sand beach, she contemplates making a sandcastle while minimizing the effort needed to construct it. 

Alice approximates the sandy beach as part of a 2D plane. She decides to move the sand mass from $m$ known source locations $x_{i}^{\text{source}}\in\mathbb{R}^{2}$, $i=1,...,m$, to $n$ known destination locations $x_{j}^{\text{sandcastle}}\in\mathbb{R}^{2}$, $j=1,...,n$. 

Sand particles on the beach have nonuniform mass. In particular, the $i$th source location has a known mass $\alpha_i>0$, $i=1,...,m$. Alice's sandcastle design requires the $j$th destination location to have a known mass $\beta_j>0$, $j=1,...,n$. Conservation of mass requires $\sum_{i}\alpha_i = \sum_{j}\beta_{j}$. Without loss of generality, Alice normalizes the mass, i.e., sets $\sum_{i}\alpha_i = \sum_{j}\beta_{j}=1$. In other words, the known $\alpha_i$ denotes the fraction of the total mass at the $i$th source location. A similar interpretation holds for $\beta_{j}$.

Alice models the cost $C_{ij}$ of moving a **unit amount of sand** from the source location $x_{i}^{\text{source}}$ to the destination location $x_{j}^{\text{sandcastle}}$ as the squared Euclidean distance, i.e., $C_{ij} = \|x_{i}^{\text{source}} - x_{j}^{\text{sandcastle}}\|_{2}^{2}$. This defines a matrix $C\equiv [C_{ij}]\in\mathbb{R}^{m\times n}$. 
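As a minimal sketch of how the cost matrix $C$ could be assembled numerically, the snippet below uses NumPy broadcasting. The function name `squared_distance_cost` and the toy coordinates are made up for illustration; they are not part of the assignment data.

```python
import numpy as np

def squared_distance_cost(x_source, x_sandcastle):
    """Return C with C[i, j] = ||x_source[i] - x_sandcastle[j]||_2^2."""
    x_source = np.asarray(x_source, dtype=float)          # shape (m, 2)
    x_sandcastle = np.asarray(x_sandcastle, dtype=float)  # shape (n, 2)
    # Broadcasting: (m, 1, 2) - (1, n, 2) -> (m, n, 2) array of differences.
    diff = x_source[:, None, :] - x_sandcastle[None, :, :]
    return np.sum(diff**2, axis=-1)

# Toy data (hypothetical): m = 2 sources, n = 3 destinations.
xs = [[0.0, 0.0], [1.0, 0.0]]
xd = [[0.0, 1.0], [1.0, 1.0], [3.0, 4.0]]
C = squared_distance_cost(xs, xd)
print(C)
```

For these toy points, the entry `C[0, 2]` is $3^2 + 4^2 = 25$, the squared distance from the origin to $(3,4)$.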

## (a) [5 + 10 + (5 + 5) + 10 = 35 points] Formulation

(i) **Explain why** Alice's model for $C_{ij}$ is reasonable. 

(ii) If Alice decides to move $M_{ij}$ amount of mass from $x_{i}^{\text{source}}$ to $x_{j}^{\text{sandcastle}}$, then she incurs a cost $C_{ij}M_{ij}$ for that particular route. Taking the matrix $M\equiv [M_{ij}]\in\mathbb{R}^{m\times n}$ as the decision variable, **clearly write down the optimization problem** Alice needs to solve for making her sandcastle. The input parameters for the problem should be the matrix $C\in\mathbb{R}^{m\times n}$, the vector $\alpha\in\mathbb{R}^{m}_{++}$, and the vector $\beta\in\mathbb{R}^{n}_{++}$. Assume that the problem data already guarantees $\langle\boldsymbol{1}_{m},\alpha\rangle = \langle\boldsymbol{1}_{n},\beta\rangle = 1$, where $\boldsymbol{1}_{m},\boldsymbol{1}_{n}$ denote the vectors of all ones of size $m\times 1$ and $n\times 1$, respectively.

(iii) **Mathematically explain why** this is a convex optimization problem. **Mathematically argue what type** of convex optimization problem this is.

Side remark: Unlike the exercises in HW5 Problem 2, this problem has no analytical solution in terms of the problem data.

(iv) **Carefully argue the size of the optimization problem**, i.e., how many variables are to be solved for and how many constraints are there.

## Solution for part 1(a):

(i) Alice's model for $C_{ij}$ is reasonable because $C_{ij}$ is a monotonically increasing function of the Euclidean distance $\|x_{i}^{\text{source}} - x_{j}^{\text{sandcastle}}\|_{2}$. In other words, the cost of moving a unit amount of mass is large (resp. small) if the distance between the source and the destination is large (resp. small).

(ii) Because the total cost is $\sum_{i=1}^{m}\sum_{j=1}^{n}C_{ij}M_{ij} = \langle C, M\rangle$ (the Frobenius inner product), Alice's optimization problem becomes
\begin{align*}
&\underset{M\in\mathbb{R}^{m\times n}}{\min} \quad \langle C, M\rangle\\
&\text{subject to} \quad M \geq 0 \quad\text{(elementwise)},\\
&\qquad\qquad\quad M\boldsymbol{1}_{n}=\alpha,\\
&\qquad\qquad\quad M^{\top}\boldsymbol{1}_{m}=\beta.
\end{align*}
The first constraint says that along any source-to-sandcastle route, she needs to transport either zero or positive mass. The second and third constraints respectively specify the given distribution of mass at the source and at the sandcastle. 

(iii) The objective function is linear (hence convex) in the matrix variable $M$, i.e., linear in the entries of $M$. The set $\mathbb{R}^{m\times n}$ is affine, and hence a convex set. The equality constraints are affine in $M$, and the inequality constraints $-M_{ij}\leq 0$ are linear (thus convex) in $M$. Therefore, **this is an LP, which is indeed a convex optimization problem**.

(iv) Since the optimization is over $M\in\mathbb{R}^{m\times n}$, we have $mn$ unknown real decision variables to solve for. The elementwise inequalities give $mn$ closed halfspace constraints. In addition, there are $m+n$ equality (hyperplane) constraints. **So this is an LP in $mn$ variables with $mn + m + n$ constraints**. 

Side remark: The constraint polyhedron, parameterized by the vectors $\alpha,\beta$, is sometimes referred to as the transportation polyhedron $\mathcal{P}_{\text{transport}}(\alpha,\beta)$. Clearly, a minimizer is attained at one of the vertices of this polyhedron.
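One quick sanity check on the LP in part (ii) is that its feasible set is never empty: the product plan $M = \alpha\beta^{\top}$ always satisfies all three constraints, so $\langle C, \alpha\beta^{\top}\rangle$ is an upper bound on the optimal cost. The sketch below verifies this on synthetic data; the sizes, the random masses, and the placeholder cost matrix are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 5  # hypothetical small sizes

# Synthetic data: random positive masses normalized to sum to one.
alpha = rng.random(m) + 0.1
alpha /= alpha.sum()
beta = rng.random(n) + 0.1
beta /= beta.sum()
C = rng.random((m, n))  # placeholder nonnegative cost matrix

# Product plan: M[i, j] = alpha[i] * beta[j].
M = np.outer(alpha, beta)

feasible = bool(
    np.all(M >= 0)
    and np.allclose(M @ np.ones(n), alpha)    # row sums equal alpha
    and np.allclose(M.T @ np.ones(m), beta)   # column sums equal beta
)
upper_bound = np.sum(C * M)  # <C, M>, an upper bound on the optimal cost
print(feasible, upper_bound)
```

The product plan is generally not optimal; it only certifies feasibility and gives a baseline against which a solver's output can be compared.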




## (b) [25 points] Numerical solution

Fix $m=150$, $n=225$. Write code in MATLAB/Python/Julia to load the input files alpha.txt, beta.txt, x_source.txt, x_sandcastle.txt from the CANVAS Files section (folder: HW problems and solutions), and use cvx/cvxpy/Convex.jl in the same code to solve the optimization problem in part (a). Report the **numerically computed optimal value (minimized cost) and submit your code**. **Also report the computational time needed to solve the problem by cvx/cvxpy/Convex.jl** (only the computational time for solving, not for setting up the problem data).

Side remark: It is recommended (but not required) that in your code, you also check if your optimal solution obtained from using cvx/cvxpy/Convex.jl matches with linprog in MATLAB, or with scipy.optimize.linprog in Python.

## Solution for part 1(b):

To proceed with the numerical computation, we need to rewrite the LP in Part 1(a)(ii) in **standard vector LP form** (see e.g., Lec. 11, p. 4). For this purpose, let
$$c := {\rm{vec}}(C)\in\mathbb{R}^{mn}, \qquad p := {\rm{vec}}(M)\in\mathbb{R}^{mn}, \qquad A:=\begin{pmatrix}
\boldsymbol{1}_{n}^{\top}\otimes I_{m}\\
I_{n} \otimes \boldsymbol{1}_{m}^{\top}
\end{pmatrix}\in\mathbb{R}^{(m+n)\times mn}, \qquad b:=\begin{pmatrix}
\alpha\\
\beta
\end{pmatrix}\in\mathbb{R}^{m+n},$$
where ${\rm{vec}}$ denotes the vectorization, and $\otimes$ denotes the Kronecker product. We then rewrite the LP in Part 1(a)(ii) as
\begin{align*}
&\underset{p\in\mathbb{R}^{mn}}{\min} \quad \langle c, p\rangle\\
&\text{subject to} \quad p \geq 0 \quad\text{(elementwise)},\\
&\qquad\qquad\quad Ap = b.
\end{align*}
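The Kronecker-product structure of $A$ can be checked numerically. The sketch below assumes the column-stacking convention for ${\rm{vec}}$ (NumPy's `order="F"`), which is the convention under which $A\,{\rm{vec}}(M)$ reproduces $M\boldsymbol{1}_{n}$ and $M^{\top}\boldsymbol{1}_{m}$; the small sizes and the random test matrix are made up for illustration.

```python
import numpy as np

m, n = 3, 4  # hypothetical small sizes
rng = np.random.default_rng(1)
M = rng.random((m, n))

# Column-major (Fortran-order) vectorization, matching the usual vec(.) convention.
p = M.flatten(order="F")

A = np.vstack([
    np.kron(np.ones((1, n)), np.eye(m)),  # picks out the row sums    M 1_n
    np.kron(np.eye(n), np.ones((1, m))),  # picks out the column sums M^T 1_m
])

row_sums = M @ np.ones(n)
col_sums = M.T @ np.ones(m)
matches = np.allclose(A @ p, np.concatenate([row_sums, col_sums]))

# One equality is redundant because both marginals carry the same total mass.
rank_A = np.linalg.matrix_rank(A)
print(A.shape, matches, rank_A)
```

For $m=150$, $n=225$, the same construction gives a $375\times 33750$ matrix, so a sparse representation may be preferable in practice.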
Please see our sample MATLAB solution file ${\texttt{AM229HW6P1bSol.m}}$ posted in the CANVAS Files section inside the folder "HW Problems and Solutions". Using that code, we find the minimum values
$$\text{minimum}_{\text{cvx}} = 0.7293, \quad \text{minimum}_{\text{linprog}} = 0.7293,$$
and the corresponding computational times
$$t_{\text{cvx}} = 1.4973\;\text{s}, \quad t_{\text{linprog}} = 0.2725\;\text{s}.$$
As expected, the specialized linprog solver solves the LP much faster than the general-purpose primal-dual solver in cvx.
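For readers working in Python, the vectorized LP can be passed to `scipy.optimize.linprog` along the following lines. This is only a sketch on synthetic data: the random locations and masses stand in for the course files (which are not reproduced here), the sizes are deliberately small, and the numbers it prints are unrelated to the values reported above.

```python
import time
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
m, n = 15, 20  # small synthetic sizes; the assignment uses m = 150, n = 225

# Synthetic stand-ins for x_source, x_sandcastle, alpha, beta (NOT the course data).
x_source = rng.random((m, 2))
x_sandcastle = rng.random((n, 2)) + 1.0
alpha = rng.random(m) + 0.1
alpha /= alpha.sum()
beta = rng.random(n) + 0.1
beta /= beta.sum()

# Squared Euclidean cost matrix and its column-major vectorization c = vec(C).
C = ((x_source[:, None, :] - x_sandcastle[None, :, :]) ** 2).sum(axis=-1)
c = C.flatten(order="F")

# Equality constraints A p = b with p = vec(M).
A = np.vstack([
    np.kron(np.ones((1, n)), np.eye(m)),
    np.kron(np.eye(n), np.ones((1, m))),
])
b = np.concatenate([alpha, beta])

# Time only the solve, not the setup of the problem data.
t0 = time.perf_counter()
res = linprog(c, A_eq=A, b_eq=b, bounds=(0, None), method="highs")
solve_time = time.perf_counter() - t0

M_opt = res.x.reshape((m, n), order="F")  # undo the vectorization
print(res.status, res.fun, solve_time)
```

Reshaping with `order="F"` is needed so that `M_opt` is consistent with the column-major `vec` used to build `c` and `A`.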

# Problem 2. [15 + 15 + 10 = 40 points] Lagrange Dual Problem

Consider the primal convex optimization problem
\begin{align}
p^{*} = &\underset{x\in\mathbb{R}^{n}}{\min}\quad\frac{1}{2}x^{\top}P_{0}x + \langle q_0, x\rangle + r_0\\
&\text{subject to} \quad \frac{1}{2}x^{\top}P_{i}x + \langle q_i, x\rangle + r_i \leq 0, \quad\forall\,i=1,...,m,
\end{align}
where $P_0\in\mathbb{S}^{n}_{++}$, $P_{i}\in\mathbb{S}^{n}_{+}$ for all $i=1,...,m$, $q_i\in\mathbb{R}^{n}$ for all $i=0,1,...,m$, and $r_i\in\mathbb{R}$ for all $i=0,1,...,m$.

(a) Denote the Lagrange multiplier associated with the primal inequality constraints as $\lambda\in\mathbb{R}^{m}_{\geq 0}$. Let
\begin{align}
P(\lambda) := P_{0} + \sum_{i=1}^{m}\lambda_i P_i \succ 0,\quad q(\lambda) &:= q_0 + \sum_{i=1}^{m}\lambda_i q_i,\quad r(\lambda) := r_0 + \sum_{i=1}^{m}\lambda_i r_i.
\end{align}
**Prove that** the Lagrange dual problem associated with the primal problem is 
$$d^{*} = \underset{\lambda\in\mathbb{R}^{m}_{\geq 0}}{\min}\:\frac{1}{2}\left(q(\lambda)\right)^{\top}\left(P(\lambda)\right)^{-1}q(\lambda) - r(\lambda).$$

(b) Rewrite the Lagrange dual problem derived in part (a) in one of the standard forms: LP, QP, QCQP, SOCP, SDP. **Show all your calculations**.

(c) Specialize Slater's condition for this primal problem to **state a sufficient condition for strong duality** ($p^{*}=d^{*}$).

## Solution for part 2(a):
By definition, the Lagrangian is
$$L(x,\lambda) = \frac{1}{2}x^{\top}P(\lambda)x + \langle q(\lambda),x\rangle + r(\lambda),$$
which yields the Lagrange dual function
$$g(\lambda) := \underset{x\in\mathbb{R}^{n}}{\inf}L(x,\lambda) = -\frac{1}{2}\left(q(\lambda)\right)^{\top}\left(P(\lambda)\right)^{-1}q(\lambda) + r(\lambda),$$
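obtained by solving $\nabla_{x}L = P(\lambda)x + q(\lambda) = \boldsymbol{0}$ for the minimizer $x^{*}(\lambda) = -\left(P(\lambda)\right)^{-1}q(\lambda)$ (which exists and is unique since $P(\lambda)\succ 0$), and then substituting this minimizer back into $L$ to compute its minimum value. Therefore, the Lagrange dual problem is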
$$d^{*}=\underset{\lambda\in\mathbb{R}^{m}_{\geq 0}}{\sup}g(\lambda) =  - \underset{\lambda\in\mathbb{R}^{m}_{\geq 0}}{\inf}\left(-g(\lambda)\right) = -\underset{\lambda\in\mathbb{R}^{m}_{\geq 0}}{\inf}\:\bigg\{\frac{1}{2}\left(q(\lambda)\right)^{\top}\left(P(\lambda)\right)^{-1}q(\lambda) - r(\lambda)\bigg\}.$$
**Side remark:** The extra minus sign in front comes from the relation between the minimum and maximum values of the corresponding problems, as explained in Lec. 15, p. 12 (in blue color); see also the first few minutes of the Lec. 16 video. In other words, the minus sign is only relevant for relating the primal and dual values $p^{*}$ and $d^{*}$. If we are interested in the dual optimization problem itself, or in finding the dual optimizer $\lambda^{*}$, then we need to solve the problem $\underset{\lambda\in\mathbb{R}^{m}_{\geq 0}}{\inf}\:\bigg\{\frac{1}{2}\left(q(\lambda)\right)^{\top}\left(P(\lambda)\right)^{-1}q(\lambda) - r(\lambda)\bigg\}$.
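The closed form for $g(\lambda)$ can be sanity-checked numerically: since $P(\lambda)\succ 0$, the Lagrangian is a strictly convex quadratic in $x$ whose stationary point is $x = -\left(P(\lambda)\right)^{-1}q(\lambda)$, and no other $x$ should give a smaller value. The sketch below does this on randomly generated data; the helper `random_psd`, the dimensions, and the data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 4, 3  # hypothetical dimensions

def random_psd(k, shift=0.0):
    """Random symmetric PSD matrix; shift > 0 makes it positive definite."""
    B = rng.standard_normal((k, k))
    return B @ B.T + shift * np.eye(k)

# Synthetic problem data: P_0 positive definite, P_i positive semidefinite.
P = [random_psd(n, shift=1.0)] + [random_psd(n) for _ in range(m)]
q = [rng.standard_normal(n) for _ in range(m + 1)]
r = [rng.standard_normal() for _ in range(m + 1)]

lam = rng.random(m)  # an arbitrary multiplier with lam >= 0

P_lam = P[0] + sum(l * Pi for l, Pi in zip(lam, P[1:]))
q_lam = q[0] + sum(l * qi for l, qi in zip(lam, q[1:]))
r_lam = r[0] + sum(l * ri for l, ri in zip(lam, r[1:]))

def lagrangian(x):
    return 0.5 * x @ P_lam @ x + q_lam @ x + r_lam

# Stationary point of the Lagrangian and the closed-form dual function value.
x_star = -np.linalg.solve(P_lam, q_lam)
g_closed_form = -0.5 * q_lam @ np.linalg.solve(P_lam, q_lam) + r_lam

# The Lagrangian at perturbed points should never fall below g(lambda).
gap = min(
    lagrangian(x_star + rng.standard_normal(n)) - g_closed_form
    for _ in range(100)
)
print(lagrangian(x_star), g_closed_form, gap)
```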

## Solution for part 2(b):
We first rewrite the dual convex optimization problem in epigraph form (Lec. 12, p. 9):
\begin{align*}
&\underset{(\lambda,t)\in\mathbb{R}^{m}_{\geq 0} \times \mathbb{R}}{\min}\quad t\\
&\text{subject to}\quad \frac{1}{2}\left(q(\lambda)\right)^{\top}\left(P(\lambda)\right)^{-1}q(\lambda) - r(\lambda) \leq t.
\end{align*}
Next, multiplying the constraint by $2$ and using the Schur complement lemma (Lec. 9, p. 2-3) together with $P(\lambda)\succ 0$, we rewrite the above constraint as the LMI
$$\underbrace{F_{1} := \begin{pmatrix} P(\lambda) & q(\lambda)\\
\left(q(\lambda)\right)^{\top} & 2\left(t+r(\lambda)\right)
\end{pmatrix}}_{\in\mathbb{R}^{(n+1)\times(n+1)}} \succeq \boldsymbol{0}.$$
Therefore, letting $z:=(\lambda^{\top},t)^{\top}\in\mathbb{R}^{m+1}$, $c:=\begin{pmatrix}\boldsymbol{0}_{m\times 1}\\
1\end{pmatrix}$, $A := \left(I_m \quad \boldsymbol{0}_{m\times 1}\right)\in\mathbb{R}^{m\times (m+1)}$, $F_2(z):={\rm{diag}}\left((Az)_1, ..., (Az)_m\right)\in\mathbb{R}^{m\times m}$ (so that $F_2(z)\succeq\boldsymbol{0}$ encodes $\lambda\geq 0$), we can express the dual convex optimization problem in SDP standard form (Lec. 12, p. 1)
\begin{align*}
&\underset{z\in\mathbb{R}^{m+1}}{\min}\quad \langle c,z\rangle\\
&\text{subject to}\quad \underbrace{F(z) := {\rm{diag}}\left(F_1(z), F_2(z)\right)}_{\in\mathbb{R}^{(m+n+1)\times(m+n+1)}}\succeq \boldsymbol{0}.
\end{align*}
As usual, the LMI constraint $F(z)\succeq\boldsymbol{0}$ in the SDP above represents a spectrahedron (see Lec. 12, p. 1-2, also Lec. 7 last three pages).
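The Schur complement step can be checked numerically. The scalar constraint $\frac{1}{2}q^{\top}P^{-1}q - r \leq t$ is equivalent to $2(t+r) - q^{\top}P^{-1}q \geq 0$, which for $P\succ 0$ holds exactly when the block matrix with diagonal blocks $P$ and $2(t+r)$ and off-diagonal block $q$ is positive semidefinite. The sketch below tests this on random data standing in for $P(\lambda)$, $q(\lambda)$, $r(\lambda)$ at a fixed $\lambda$; the helper names and the data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
B = rng.standard_normal((n, n))
P = B @ B.T + np.eye(n)     # stands in for P(lambda), positive definite
q = rng.standard_normal(n)  # stands in for q(lambda)
r = rng.standard_normal()   # stands in for r(lambda)

# Smallest t satisfying (1/2) q^T P^{-1} q - r <= t.
t_min = 0.5 * q @ np.linalg.solve(P, q) - r

def F1(t):
    """Block matrix whose Schur complement w.r.t. P is 2(t + r) - q^T P^{-1} q."""
    corner = np.array([[2.0 * (t + r)]])
    return np.block([[P, q[:, None]], [q[None, :], corner]])

def is_psd(X, tol=1e-9):
    return bool(np.linalg.eigvalsh(X).min() >= -tol)

feasible_t = is_psd(F1(t_min + 0.1))    # t slightly above the threshold
infeasible_t = is_psd(F1(t_min - 0.1))  # t slightly below the threshold
print(feasible_t, infeasible_t)
```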

## Solution for part 2(c):
Following Lec. 14, p. 13, Slater's condition for this primal problem specializes to the following: **if $\exists\,x\in\mathbb{R}^{n}$ such that $\frac{1}{2}x^{\top}P_{i}x + \langle q_i, x\rangle + r_i < 0$ for all $i=1,...,m$, then $d^{*}=p^{*}$ (strong duality) holds.**
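A one-dimensional toy instance makes the statement concrete. Take $n=m=1$ with $P_0=1$, $q_0=0$, $r_0=0$, $P_1=0$, $q_1=-1$, $r_1=1$, i.e., minimize $\frac{1}{2}x^{2}$ subject to $1-x\leq 0$. This instance is made up for illustration; the grid search below is a crude stand-in for a dual solver, not a recommended method.

```python
# Toy instance (hypothetical): minimize (1/2) x^2 subject to 1 - x <= 0.

def constraint(x):
    return 1.0 - x

def dual_function(lam):
    # g(lam) = -(1/2) q(lam)^2 / P(lam) + r(lam),
    # with P(lam) = 1, q(lam) = -lam, r(lam) = lam.
    return -0.5 * lam**2 + lam

# Slater's condition: x = 2 is strictly feasible since 1 - 2 < 0.
slater_point = 2.0
strictly_feasible = constraint(slater_point) < 0

# Primal optimum: the constraint x >= 1 is active at the minimizer x = 1.
p_star = 0.5 * 1.0**2

# Coarse grid search over lam in [0, 5] to approximate the dual optimum.
grid = [k / 1000 for k in range(0, 5001)]
d_star = max(dual_function(lam) for lam in grid)

print(strictly_feasible, p_star, d_star)
```

Here the dual function $g(\lambda) = \lambda - \frac{1}{2}\lambda^{2}$ is maximized at $\lambda = 1$ with value $\frac{1}{2}$, which equals the primal optimal value, as strong duality predicts.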