## Constrained least squares

Let's start with the classical least-squares regression problem:

\begin{align}
\min_x \quad&||Ax-b||_2\\
\end{align}
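
As a point of reference (standard background, not part of the Convex.jl workflow in this notebook): squaring is monotone on nonnegative values, so minimizing the norm and minimizing its square give the same minimizer. Setting the gradient of the squared objective to zero yields the normal equations, which determine $x$ uniquely under the assumption that $A$ has full column rank:

\begin{align}
\arg\min_x ||Ax-b||_2 &= \arg\min_x ||Ax-b||_2^2\\
\nabla_x ||Ax-b||_2^2 = 2A^T(Ax-b) = 0 \quad&\Longrightarrow\quad A^TAx = A^Tb
\end{align}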

In this notebook, we'll see how to formulate the least squares problem using the [Convex.jl](https://github.com/JuliaOpt/Convex.jl) package and then add constraints on top of it. In contrast to previous classes, our focus will be on building these methods from the primitives of convex optimization rather than calling an off-the-shelf package.

In [2]:
using Distributions # for sampling from normal distribution
using Convex        # for solving least-squares problem
using ECOS          # open-source convex solver

In [3]:
# Fix random seed
srand(10);

We first fix $A$ and $x$ and then generate $b = Ax + \epsilon$ where $\epsilon$ is Gaussian random noise.

In [4]:
# rand() generates coefficients uniformly between 0 and 1
A = rand(1000,50); 

In [5]:
x = rand(50);

In [6]:
b = A*x + rand(Normal(0,100),1000);

Now we invoke the Convex.jl syntax to declare a set of optimization variables $\hat x$,

In [7]:
x̂ = Variable(50);

and declare the problem of minimizing $||A\hat x-b||_2$:

In [8]:
problem = minimize(norm(A*x̂-b,2)); 

Then solve it using ECOS:

In [9]:
solve!(problem,ECOSSolver(verbose=0))

And ask for the solution:

In [10]:
evaluate(x̂)

50x1 Array{Float64,2}:
  -0.711662
  12.3106  
  -8.18699 
 -16.5759  
  13.612   
   9.79773 
 -14.737   
   3.98307 
   0.15439 
   6.40962 
  -1.45227 
  -3.26792 
 -22.939   
   ⋮       
  -2.08691 
  -1.08507 
  19.656   
   0.745009
   1.84697 
  -6.47361 
   3.48504 
  39.3582  
 -21.1685  
 -16.3304  
   0.126439
  -7.60035 
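
A solver result like this one can be cross-checked against a direct solution, since the unconstrained problem needs no solver at all. The sketch below is only an illustration under stated assumptions: `A_small`, `b_small`, and `residual` are hypothetical names and toy data, not the notebook's `A` and `b`. It relies on Julia's built-in backslash operator, which returns the least-squares solution for a tall matrix, and compares it with the normal equations $A^TAx = A^Tb$.

```julia
# Hypothetical toy data (not the notebook's A and b): fit a line to three points.
A_small = [1.0 0.0;
           1.0 1.0;
           1.0 2.0]
b_small = [1.0, 2.0, 2.0]

# For a tall matrix, backslash returns the least-squares solution.
x_backslash = A_small \ b_small

# Normal equations: (A'A) x = A'b.
x_normal = (A_small' * A_small) \ (A_small' * b_small)

# Residual norm ||A*x - b||₂, written without `norm` so it needs no imports.
residual(x) = sqrt(sum(abs2, A_small * x - b_small))

println(x_backslash)
println(x_normal)
println(residual(x_backslash))
```

Applied to the notebook's data, `A \ b` should agree with `evaluate(x̂)` up to the solver's tolerance. The constrained variants later in the notebook have no such closed form, which is where a modeling package earns its keep.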

Let's turn this into a standalone function:

In [11]:
"""
    leastsquares(A,b)

Given A,b, returns the x which minimizes ||A*x-b||₂
"""
function leastsquares(A,b)
    numrow,numcol = size(A)
    x̂ = Variable(numcol)
    problem = minimize(norm(A*x̂-b,2))
    solve!(problem,ECOSSolver(verbose=0))
    return evaluate(x̂)
end

leastsquares (generic function with 1 method)

The documentation above the function now appears when you search for help.

In [12]:
?leastsquares

search: 

```
leastsquares(A,b)
```

Given A,b, returns the x which minimizes ||A*x-b||₂


In [13]:
x_ls = leastsquares(A,b);


We generated the true coefficients from the interval $[0,1)$. How many of the estimated coefficients are zero or below?

In [14]:
sum(x_ls .<= 0)

25

The [nonnegative least squares](https://en.wikipedia.org/wiki/Non-negative_least_squares) problem is
\begin{align}
\min_{x\ge 0} \quad&||Ax-b||_2\\
\end{align}
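
A standard way to characterize the solution of this problem is through its optimality conditions. They are stated here as background for the equivalent objective $\tfrac{1}{2}||Ax-b||_2^2$, and the multiplier $\lambda$ is notation introduced only for this note:

\begin{align}
x \ge 0, \qquad \lambda = A^T(Ax-b) \ge 0, \qquad \lambda_i x_i = 0 \quad \text{for all } i
\end{align}

In words: each coefficient is either strictly positive with zero gradient, exactly as in ordinary least squares, or pinned at zero because the objective would only increase if it moved into the feasible region.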

Does knowing that the coefficients are nonnegative give us a better solution? Let's check!

In [15]:
"""
    nonnegativeleastsquares(A,b)

Given A,b, returns the x which minimizes ||A*x-b||₂ subject to x ≥ 0
"""
function nonnegativeleastsquares(A,b)
    numrow,numcol = size(A)
    x̂ = Variable(numcol)
    problem = minimize(norm(A*x̂-b,2),
                       x̂ >= 0) # All we changed was this!!
    solve!(problem,ECOSSolver(verbose=0))
    return evaluate(x̂)
end

nonnegativeleastsquares (generic function with 1 method)

In [16]:
x_nnls = nonnegativeleastsquares(A,b);

In [17]:
sum(x_nnls .< 0) # all components are nonnegative

0
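
To see what the constraint does without a solver in the loop, here is a minimal sketch of projected gradient descent for the same problem. This is a swapped-in technique for illustration, not the interior-point method ECOS uses, and the names `nnls_projected_gradient`, `A_toy`, and `b_toy` are hypothetical. Each iteration takes a gradient step on $\tfrac{1}{2}||Ax-b||_2^2$ and then clips negative entries to zero.

```julia
# Projected gradient descent for min ½||A*x - b||₂² subject to x ≥ 0.
# Illustration only: this is not the algorithm ECOS uses.
function nnls_projected_gradient(A, b; iters=5000)
    x = zeros(size(A, 2))
    # sum(abs2, A) is the squared Frobenius norm, an upper bound on the
    # largest eigenvalue of A'A, so this step size is safely small.
    step = 1.0 / sum(abs2, A)
    for k in 1:iters
        g = A' * (A * x - b)                        # gradient of ½||A*x - b||₂²
        x = [max(v, 0.0) for v in (x - step * g)]   # project onto x ≥ 0
    end
    return x
end

# Hypothetical toy data whose unconstrained solution has a negative entry.
A_toy = [1.0 0.0;
         0.0 1.0;
         1.0 1.0]
b_toy = [1.0, -1.0, 0.5]

x_free = A_toy \ b_toy                             # ordinary least squares
x_pg   = nnls_projected_gradient(A_toy, b_toy)     # nonnegative least squares
println(x_free)
println(x_pg)
```

On this toy problem the unconstrained fit has a negative second coefficient, while the projected iteration holds it at zero and refits the first one. The conservative step size keeps the sketch short at the cost of slow convergence on larger, ill-conditioned problems.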

How can we compare the two solutions? An easy way is to measure each one's distance from the true coefficients $x$:

In [18]:
norm(x-x_ls,2)

89.86991822320907

In [19]:
norm(x-x_nnls,2)

17.88255684742728

A better way to compare the two solutions is to check their predictive power. How well do they predict a fresh $b$ when we resample the noise (here with standard deviation 10 rather than 100)?

In [20]:
# out of sample test
b2 = A*x + rand(Normal(0,10),1000);

In [21]:
norm(A*x_ls-b2,2)

832.7949480457804

In [22]:
norm(A*x_nnls-b2,2)

359.01072183471734
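
A rough yardstick for these numbers (a back-of-the-envelope calculation, not an output of the notebook): even the true $x$ cannot predict fresh noise. For $n = 1000$ independent noise terms with standard deviation $\sigma = 10$,

\begin{align}
\mathbb{E}\,||\epsilon||_2^2 = n\sigma^2 = 1000\cdot 10^2 = 10^5, \qquad \sqrt{10^5}\approx 316
\end{align}

so a residual norm near 316 is about the best any estimate can be expected to achieve on `b2`. The nonnegative solution (about 359) sits close to that floor, while the unconstrained one (about 833) is far from it.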

**Sanity check**: what would happen if we just used $\hat x = 0$?

In [23]:
norm(b2,2)

524.7639978985346

What if we simply used the original observations $b$ as the prediction?

In [24]:
norm(b-b2,2)

3256.2477400241655
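
This number is large for a simple reason (again a back-of-the-envelope calculation, writing $b_2$ for `b2`): the signal $Ax$ cancels in $b - b_2$, leaving the difference of two independent noise draws with standard deviations $\sigma_1 = 100$ and $\sigma_2 = 10$, so

\begin{align}
\mathbb{E}\,||b-b_2||_2^2 = n(\sigma_1^2+\sigma_2^2) = 1000\,(100^2+10^2) = 1.01\times 10^7, \qquad \sqrt{1.01\times 10^7}\approx 3178
\end{align}

which is close to the observed value and almost entirely due to the noise in the original $b$.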

>**\[Exercise\]**: Interval-constrained least squares.

> Recall that we generated the coefficients of $x$ from $[0,1)$. Does constraining $0 \le \hat x \le 1$ improve the performance of the regression?