# 18.065 Problem Set 4

Due Friday 4/7 at 1pm.

## Problem 1

A "convex set" $S$ is one such that for any two points $x,y$ in $S$, the straight line segment connecting them is also in $S$.  That is, $\alpha x + (1-\alpha)y \in S$ for all $\alpha \in [0,1]$.
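
As a concrete illustration of this definition, the following is a minimal numerical spot check, assuming the closed unit ball in $\mathbb{R}^3$ as the example set. The set and the helper names `inball` and `randball` are hypothetical choices, not part of the problem. It samples random pairs of points in the ball and tests that their convex combinations stay inside:

```julia
using LinearAlgebra, Random

# Hypothetical example set: the closed unit ball in ℝ³ (a standard convex set).
inball(x) = norm(x) ≤ 1

# Draw a random point inside the unit ball by rescaling a Gaussian direction.
randball(rng, n) = (v = randn(rng, n); rand(rng) * v / norm(v))

rng = MersenneTwister(18065)

# For 1000 random pairs x, y ∈ S and random α ∈ [0,1), test αx + (1-α)y ∈ S.
allinside = all(1:1000) do _
    x, y, α = randball(rng, 3), randball(rng, 3), rand(rng)
    inball(α * x + (1 - α) * y)
end
```

A sampled check like this can only fail to find a counterexample; it is not a substitute for the arguments asked for in (a) and (b).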

**(a)** If $f_i(x)$ is a [convex function](https://en.wikipedia.org/wiki/Convex_function) (as defined in e.g. lecture 16), explain why the constraint $f_i(x) \le 0$ defines a convex set of feasible points.

**(b)** Explain why the intersection of two convex sets is a convex set.  (Hence, the feasible set for many convex constraints is a convex set.)

## Problem 2

Suppose we are solving the $\ell^1$-regularized least-squares problem:
$$
\min_{x\in \mathbb{R}^n} \left( \Vert b - Ax \Vert_2^2 + \lambda \Vert x \Vert_1 \right)
$$
where $\lambda > 0$ is some regularization parameter and $\Vert x \Vert_1 = \sum_i |x_i|$ is the $\ell^1$ norm.

Similar to the examples in lecture 16, show that this can be converted into an equivalent [quadratic programming (QP)](https://en.wikipedia.org/wiki/Quadratic_programming) problem (a convex quadratic objective with affine constraints) by introducing one or more dummy variables (an "epigraph" formulation).   This replaces the non-differentiable $\ell^1$ problem with an equivalent *differentiable* problem with *differentiable constraints*.
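
To see the non-differentiability concretely, here is a small sketch that evaluates the objective above and compares one-sided slopes at a point where $x_1 = 0$. The values of `A`, `b`, `λ`, and the test point `x0` are arbitrary illustrative assumptions, and `objective` is a hypothetical helper name:

```julia
using LinearAlgebra

# Hypothetical small instance; A, b, λ are illustrative values only.
A = [1.0 2.0; 3.0 4.0; 5.0 6.0]
b = [1.0, 0.0, -1.0]
λ = 0.5

# The objective from the problem statement: ‖b - Ax‖₂² + λ‖x‖₁
objective(x) = norm(b - A * x)^2 + λ * norm(x, 1)

# One-sided difference quotients along e₁ at a point with x₁ = 0.
x0 = [0.0, 0.3]
e1 = [1.0, 0.0]
h = 1e-6
slope_right = (objective(x0 + h * e1) - objective(x0)) / h
slope_left  = (objective(x0) - objective(x0 - h * e1)) / h

# The smooth least-squares term contributes nearly the same slope on both
# sides, so the gap between the two quotients is about 2λ, from the |x₁| kink.
gap = slope_right - slope_left
```

The two slopes disagree by roughly $2\lambda$ no matter how small `h` is, which is the kink that the epigraph reformulation is meant to remove.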

## Problem 3

Let
$$
A(p) = A_0 + \underbrace{\begin{pmatrix} p_1 & & & \\ & p_2 & & \\ & & \ddots & \\ & & & p_m \end{pmatrix}}_{\text{diagm(p)}}
$$
where $A_0$ is some $m \times m$ matrix and $p \in \mathbb{R}^m$ are $m$ parameters.

Now, suppose we compute
$$
f(p) = g(\underbrace{A(p)^{-1} b}_{x(p)})
$$
where $g(x) = x^T G x$, $b \in \mathbb{R}^m$ is some vector, and $G = G^T$ is some symmetric $m \times m$ matrix.

**(a)** What is $\nabla f$?  Explain how you can compute *all m* components of $\nabla f$ by solving only *two* $m \times m$ systems followed by $\Theta(m)$ additional work in *total*.

**(b)** Implement your solution from (a) by filling in the function `∇f(p)` below, and check that it correctly predicts $df = f(p + dp) - f(p)$ (*approximately*) for a random small $dp$.
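
The kind of check described in (b) can be sketched on a toy function whose gradient is known in closed form. This is only an illustration of the finite-difference comparison $df \approx \nabla f^T dp$; the function `h`, its gradient `∇h`, and the points `p0`, `dp0` are hypothetical and are not the `f` of this problem:

```julia
using LinearAlgebra

# Hypothetical toy function with a known gradient (not the f of this problem).
h(p) = sum(abs2, p)       # h(p) = ‖p‖²
∇h(p) = 2p                # exact gradient of h

p0 = [0.3, -1.2, 0.7]
dp0 = [1.0, -2.0, 0.5] * 1e-8     # a small perturbation

dh = h(p0 + dp0) - h(p0)          # actual change in h
dh_pred = ∇h(p0)' * dp0           # first-order prediction ∇hᵀ dp

# Relative mismatch; it should be small but not zero, because of the
# neglected second-order term and floating-point roundoff.
relerr = abs(dh - dh_pred) / abs(dh_pred)
```

Agreement to several significant digits is typical of a correct gradient, whereas a wrong gradient usually shows an order-one relative error.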

```julia
using LinearAlgebra

# some A₀, b, G for m=5
m = 5
A₀ = randn(m,m)
b = randn(m)
G = randn(m,m); G = G' + G

# our functions
g(x) = x' * G * x
f(p) = g((A₀ + Diagonal(p)) \ b)

p = randn(m) # some random parameters
f(p) # make sure it gives a number out
```

```
-25.41770831865196
```

```julia
# your solution to (b):

function ∇f(p)
    ????
end

dp = randn(m) * 1e-8 # a random small dp
df = f(p + dp) - f(p)

# check: ∇f approximately predicts df
# ???
```

## Problem 4

In class, we considered steepest descent for $f(x) = \kappa x_1^2 + x_2^2$ in $\mathbb{R}^2$, and argued that for an arbitrary starting point $x = [x_1,x_2]$ and $\kappa \gg 1$, the steepest-descent step $x \leftarrow x - sz$ is *approximately* $sz \approx [x_1, x_2/\kappa]$.
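
The approximation above can be spot-checked numerically for a single step. The sketch below assumes that $z = \nabla f$ and that $s$ comes from an exact line search, which for a quadratic with Hessian $H$ gives $s = z^T z / z^T H z$. Those assumptions, and the values of `κ` and `x`, are illustrative and are not stated in this problem:

```julia
using LinearAlgebra

κ = 1000.0
x = [1.0, 2.0]                    # arbitrary starting point, both entries O(1)

H = Diagonal([2κ, 2.0])           # Hessian of f(x) = κx₁² + x₂²
z = H * x                         # gradient ∇f = [2κx₁, 2x₂]

# Assumption: exact line search, which for a quadratic gives s = zᵀz / zᵀHz.
s = dot(z, z) / dot(z, H * z)

step = s * z                      # the quantity called sz in the text
approx = [x[1], x[2] / κ]         # the claimed leading-order approximation
```

Comparing `step` with `approx` for increasing `κ` gives a feel for how quickly the approximation improves.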

**(a)** If $sz = [x_1, x_2/\kappa]$ exactly, then on the next step we would have the new $x \leftarrow x - sz = [0, (1-\frac{1}{\kappa}) x_2]$.   However, explain why a more careful calculation shows that the new $x - sz \approx [O(1/\kappa), (1-\frac{1}{\kappa}) x_2]$, i.e. the first component is proportional to $1/\kappa$ to leading order in $1/\kappa$.

**(b)** If you start with an $x = [\#/\kappa, x_2]$, i.e. where the first component is proportional to $1/\kappa$ and $\#$ is some number of the same order of magnitude as $x_2$, show that after one steepest-descent step (for $\kappa \gg 1$) the $x_1$ component is *still* roughly order $1/\kappa$ but of the opposite sign, and $x_2$ again subtracts a term roughly proportional to $x_2/\kappa$.

**(c)** Implement this steepest-descent process numerically for $\kappa = 100$ and a starting point $x_1 = 0.01234567$ and $x_2 = 0.8910$.  Plot $100x_1$ and $x_2$ for 100 iterations of steepest descent.  A more careful analysis would show a convergence proportional to $\left( \frac{\kappa - 1}{\kappa + 1} \right)^k$, where $k$ is the iteration number, following equation (4) in the Strang book — include this function for comparison on your plot.
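
For the comparison curve in (c), the reference factor can be tabulated as in the sketch below. The names `ks` and `envelope` are hypothetical, and any overall prefactor used to line the curve up with the iterates is left as a free choice:

```julia
# Reference decay factor ((κ - 1)/(κ + 1))^k for iterations k = 0, 1, …, 100.
κ = 100
ks = 0:100
envelope = [((κ - 1) / (κ + 1))^k for k in ks]
```

On a logarithmic vertical axis this curve is a straight line, which makes it easier to compare its slope with the decay of the computed iterates.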

## Problems 5, 6, … coming soon