# Nonnegative least squares
$\newcommand{\n}[1]{\left\|#1 \right\|}$ 
$\renewcommand{\a}{\alpha}             $ 
$\renewcommand{\b}{\beta}              $ 
$\renewcommand{\c}{\gamma}             $ 
$\renewcommand{\d}{\delta}             $ 
$\newcommand{\D}{\Delta}               $ 
$\newcommand{\la}{\lambda}             $ 
$\renewcommand{\t}{\tau}               $ 
$\newcommand{\s}{\sigma}               $ 
$\newcommand{\e}{\varepsilon}          $ 
$\renewcommand{\th}{\theta}            $ 
$\newcommand{\x}{\bar x}               $ 
$\newcommand{\R}{\mathbb R}            $ 
$\newcommand{\N}{\mathbb N}            $ 
$\newcommand{\Z}{\mathbb Z}            $ 
$\newcommand{\E}{\mathcal E}           $ 
$\newcommand{\lr}[1]{\left\langle #1\right\rangle}$
$\newcommand{\nf}[1]{\nabla f(#1)}     $
$\newcommand{\hx}{\hat x}               $
$\newcommand{\hy}{\hat y}               $
$\DeclareMathOperator{\prox}{prox}      $
$\DeclareMathOperator{\argmin}{argmin}  $
$\DeclareMathOperator{\dom}{dom}        $
$\DeclareMathOperator{\id}{Id}          $
$\DeclareMathOperator{\conv}{conv}      $

We are interested in the nonnegative least squares problem
\begin{align*}
\min_x &  \frac 1 2 ||Ax-b||^2\\
    \text{subj. to } & x \geq 0.
\end{align*}
Alternatively, we can write it as 
$$\min_x \frac 1 2 ||Ax-b||^2 + \delta_{\R^n_+}(x) =: f(Ax)+g(x)$$
or in a primal-dual form
$$\min_x \max_y g(x)+(Ax,y)-f^*(y),$$
where $f(x) = \frac 1 2 ||x-b||^2$, $f^*(y) = \frac 1 2 ||y||^2 + (b,y) = \frac 1 2 ||y+b||^2 -\frac{1}{2}||b||^2$, and $g(x) = \delta_{\R^n_+}(x)$ is the indicator function of the nonnegative orthant.
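
The expression for $f^*$ can be checked directly from the definition of the convex conjugate. The supremum is attained at $x = y+b$, which gives
\begin{align*}
f^*(y) &= \sup_x \left\{ (x,y) - \frac 1 2 ||x-b||^2 \right\}\\
       &= (y+b,y) - \frac 1 2 ||y||^2\\
       &= \frac 1 2 ||y||^2 + (b,y).
\end{align*}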

In order to apply the proximal gradient method or FISTA, we define $h(x) = f(Ax) = \frac 1 2 ||Ax-b||^2$. Then $\nabla h(x) = A^*(Ax-b)$.
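
As a minimal sketch of how $\nabla h$ and the projection onto $x \geq 0$ combine, the plain (non-accelerated) proximal gradient iteration can be written as follows. The names `A_demo`, `b_demo`, `grad_h`, and `h` are hypothetical, and the small random dense instance is only an illustration, not the data used in this notebook.

```python
import numpy as np

# Hypothetical small dense instance, used only for illustration.
rng = np.random.default_rng(0)
A_demo = rng.normal(size=(30, 10))
b_demo = rng.normal(size=30)

def h(x):
    # h(x) = 0.5*||A x - b||^2
    r = A_demo @ x - b_demo
    return 0.5 * r @ r

def grad_h(x):
    # Gradient of h: A^T (A x - b)
    return A_demo.T @ (A_demo @ x - b_demo)

# Step size 1/||A||^2, where ||A|| is the spectral norm of A.
step = 1.0 / np.linalg.norm(A_demo, 2) ** 2

x_demo = np.zeros(10)
values_demo = [h(x_demo)]
for _ in range(200):
    # Gradient step on h, then projection onto the nonnegative orthant (prox of g).
    x_demo = np.maximum(x_demo - step * grad_h(x_demo), 0.0)
    values_demo.append(h(x_demo))
```

With this step size the energy values in `values_demo` do not increase from one iteration to the next. FISTA adds an extrapolation step on top of the same two ingredients.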

In [9]:
import numpy as np
import numpy.linalg as LA
import scipy.sparse as sr
import scipy.sparse.linalg as la
import matplotlib.pyplot as plt
from scipy import  io
# local modules with the operators and algorithms used below
from opt_operators import *
from algorithms import *
from pd_algorithms import *
%reload_ext autoreload
%autoreload 2

Choose any of the problem instances below. The data are taken from the Matrix Market, http://math.nist.gov/MatrixMarket/

In [3]:
A = io.mmread("real_data/well1033.mtx")
#A = io.mmread("real_data/well1850.mtx")
#A = io.mmread("real_data/illc1033.mtx")
#A = io.mmread("real_data/illc1850.mtx")

m,n = A.shape

gen = 100
np.random.seed(gen)
b = np.random.normal(0,1,m)

Define the proximal operators, the energy $J$, and the gradient $\nabla h$ used by the proximal-gradient-type methods.

In [4]:
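def prox_g(x, rho):
    # projection onto the nonnegative orthant; rho is unused since g is an indicator
    return np.fmax(x, 0)

def prox_f_conj(y, rho):
    # prox of rho*f^*, where f^*(y) = 0.5*||y||^2 + (b,y)
    return (y - rho*b)/(1+rho)

def J(x,y):
    # primal energy 0.5*||Ax-b||^2; the argument y is not used
    t = A.dot(x)-b
    return 0.5*t.dot(t)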

# We need the following in order to apply accelerated primal-dual with strongly convex primal term.
def J1(y,x):
    return J(x,y)

#### for FISTA
def dh(x):
    return A.T.dot(A.dot(x)-b)

def F(x):
    return J(x,1)
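
The closed form used in `prox_f_conj` can be sanity-checked numerically. The sketch below assumes a small random vector `b_demo` (hypothetical, not the notebook's `b`) and verifies that the returned point satisfies the optimality condition $\rho\,(y+b) + (y - y_0) = 0$ of $\min_y \rho f^*(y) + \frac 1 2 ||y-y_0||^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
b_demo = rng.normal(size=5)   # hypothetical data vector
y_in = rng.normal(size=5)     # point at which the prox is evaluated
rho = 0.7

def f_conj(y):
    # f^*(y) = 0.5*||y||^2 + (b, y)
    return 0.5 * y @ y + b_demo @ y

def prox_f_conj_demo(y, rho):
    # Same formula as prox_f_conj, written for the demo vector b_demo.
    return (y - rho * b_demo) / (1 + rho)

def prox_objective(y):
    # Objective whose minimizer defines the prox of rho*f^* at y_in.
    return rho * f_conj(y) + 0.5 * np.sum((y - y_in) ** 2)

y_star = prox_f_conj_demo(y_in, rho)

# Gradient of the prox objective at y_star; it should vanish.
residual = rho * (y_star + b_demo) + (y_star - y_in)

# Random perturbations of y_star should not give a smaller objective.
perturbed = [prox_objective(y_star + 0.1 * rng.normal(size=5)) for _ in range(20)]
```

The same kind of check applies to `prox_g`: `np.fmax(x, 0)` is the Euclidean projection onto the nonnegative orthant, so it does not depend on the step size.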

Compute the matrix (spectral) norm, which is expensive in general, and the Frobenius norm, which is cheap.

In [5]:
# find the largest eigenvalue:
max_eig = np.real((la.eigs(A.dot(A.T), k=1)[0]))[0]
# find matrix norm
L = np.sqrt(max_eig)
1./L

# find Frobenius norm (this is very cheap)
L_F = LA.norm(A.todense())
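
The two norms are related by $||A|| \leq ||A||_F \leq \sqrt{\min(m,n)}\,||A||$, so the Frobenius norm gives a cheap bound on the spectral norm. A minimal sketch of this comparison, assuming a random sparse matrix `M` as a hypothetical stand-in for the Matrix Market data:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Hypothetical random sparse matrix standing in for the real data.
M = sp.random(200, 60, density=0.05, format="csr", random_state=0)

# Spectral norm = largest singular value (iterative solver, can be costly).
spec = spla.svds(M, k=1, return_singular_vectors=False)[0]

# Frobenius norm computed directly on the sparse matrix, without densifying it.
frob = spla.norm(M, "fro")

# Factor in the upper bound ||M||_F <= sqrt(min(m, n)) * ||M||.
rank_bound = np.sqrt(min(M.shape))
```

Here `svds` plays the same role as the `eigs` call above: the largest singular value of a matrix is the square root of the largest eigenvalue of $AA^*$.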

Define number of iterations, starting points, and initial step sizes

In [6]:
N = 500

x0 = np.zeros(n)
y0 = -b

# for PDA
tau = 1./L
sigma = 1./L

alpha = 1./L**2 # for FISTA
tau0 = np.sqrt(min(m,n))/L_F  # for PDAL

In [7]:
ans1 = pd(J, prox_g, prox_f_conj, A, x0, y0, sigma, tau, numb_iter=N)
ans2 = pd_linesearch_dual_is_square_norm(J, prox_g, -b, A, x0, y0, tau0, 1., numb_iter=N)
ans3 = pd_accelerated_primal(J1, prox_f_conj, prox_g, -A.T, y0, x0, tau, sigma, 0.5,   numb_iter=N)
ans4 = pd_linesearch_acceler_dual(J, prox_g, prox_f_conj, A, x0, y0, tau0, 1, 0.5,  numb_iter=N)
ans5 = fista(F, dh, prox_g, x0, alpha, numb_iter=N)

----- Primal-dual method -----
Time execution: 0.0845639705658
----- Primal-dual method with  linesearch. f^*(y)=0.5*||y-b||^2-----
Time execution: 0.101423978806
----- Accelerated primal-dual method (g(x) is strongly convex)-----
Time execution: 0.0756549835205
----- Accelerated primal-dual method with linesearch (f^* is strongly convex) -----
Time execution: 0.186081886292
---- FISTA----
Time execution: 0.0701711177826
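
For orientation, the basic primal-dual iteration for $\min_x \max_y g(x)+(Ax,y)-f^*(y)$ can be sketched as below. This is a minimal illustration under stated assumptions, not necessarily how `pd` from `pd_algorithms` is implemented: the instance `A_demo`, `b_demo` is a small random dense problem, all `*_demo` names are hypothetical, and `scipy.optimize.nnls` is used only as a reference solver.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical small dense instance, used only for illustration.
rng = np.random.default_rng(2)
A_demo = rng.normal(size=(30, 10))
b_demo = rng.normal(size=30)

L_demo = np.linalg.norm(A_demo, 2)      # spectral norm of A
tau_demo = sigma_demo = 1.0 / L_demo    # so that tau*sigma*||A||^2 = 1

x = np.zeros(10)
y = -b_demo
for _ in range(2000):
    # Primal step: prox of g (projection onto x >= 0).
    x_new = np.maximum(x - tau_demo * (A_demo.T @ y), 0.0)
    # Extrapolation of the primal variable.
    x_bar = 2 * x_new - x
    # Dual step: prox of sigma*f^*, i.e. (v - sigma*b)/(1 + sigma).
    v = y + sigma_demo * (A_demo @ x_bar)
    y = (v - sigma_demo * b_demo) / (1 + sigma_demo)
    x = x_new

energy = 0.5 * np.sum((A_demo @ x - b_demo) ** 2)

# Reference value from SciPy's active-set NNLS solver.
x_ref, rnorm = nnls(A_demo, b_demo)
energy_ref = 0.5 * rnorm ** 2
```

On this small instance `energy` should agree with `energy_ref` up to a small tolerance. The linesearch and accelerated variants used above change how the step sizes are chosen, not the two proximal steps themselves.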


In [8]:
t = min(ans1[0]+ans2[0]+ans3[0]+ans4[0]+ans5[0])
plt.plot(ans1[0]-t, 'b',)
plt.plot(ans2[0]-t, 'r',)
plt.plot(ans3[0]-t, 'k',)
plt.plot(ans4[0]-t, 'g',)
plt.plot(ans5[0]-t, 'm',)

plt.yscale('log')
plt.show()

To make nicer plots:

In [91]:
import matplotlib as mpl
mpl.rc('lines', linewidth=2)
mpl.rcParams.update(
    {'font.size': 13, 'font.family': 'STIXGeneral', 'mathtext.fontset': 'stix'})
mpl.rcParams['xtick.major.pad'] = 2
mpl.rcParams['ytick.major.pad'] = 2

t = min(ans1[0]+ans2[0]+ans3[0]+ans4[0]+ans5[0])
plt.plot(ans1[0]-t, 'b',label = 'PDA')
plt.plot(ans2[0]-t, 'r', label = 'PDAL')
plt.plot(ans3[0]-t, 'k',label = 'APDA')
plt.plot(ans4[0]-t, 'g', label = 'APDAL')
plt.plot(ans5[0]-t, 'm',label = 'FISTA')
plt.yscale('log')
#plt.xscale('log')
plt.xlabel('iterations')
plt.ylabel(r'$\phi(x)-\phi^*$')  # raw string so that \p is not read as an escape sequence

plt.legend()
plt.savefig('figures/nonneg-1.pdf')
plt.show()