In [1]:
%matplotlib notebook
from numpy import *
from matplotlib.pyplot import *
from numpy.linalg import qr 
from scipy.linalg import hilbert

In [2]:
def display_mat(msg,A):
    print(msg)
    display(A)
    print("")


<hr style="border-width:4px; border-color:coral"></hr>

# Homework 10 : Conditioning and stability of linear least squares

<hr style="border-width:4px; border-color:coral"></hr>

The least squares problem

\begin{equation}
A\mathbf x = \mathbf b
\end{equation}

where $A \in \mathbb R^{m \times n}$, $m \ge n$, has four associated "conditioning" problems, described in the table in Theorem 18.1 of TB (page 131). These are

1.  Sensitivity of $\mathbf y = A\mathbf x$ to the right-hand side vector $\mathbf b$,

2.  Sensitivity of the solution $\mathbf x$ to the right-hand side vector $\mathbf b$,

3.  Sensitivity of $\mathbf y = A\mathbf x$ to the coefficient matrix $A$, and

4.  Sensitivity of the solution $\mathbf x$ to the coefficient matrix $A$.
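
For reference, a sketch of that table as stated in TB, with $\kappa(A) = \Vert A \Vert \Vert A^+ \Vert$, $\theta = \cos^{-1}\left(\Vert \mathbf y \Vert / \Vert \mathbf b \Vert\right)$ and $\eta = \Vert A \Vert \Vert \mathbf x \Vert / \Vert \mathbf y \Vert$, all in the 2-norm. Rows give the perturbed input and columns give the output:

\begin{equation}
\begin{array}{c|cc}
 & \mathbf y & \mathbf x \\ \hline
\mathbf b & \dfrac{1}{\cos\theta} & \dfrac{\kappa(A)}{\eta\cos\theta} \\
A & \dfrac{\kappa(A)}{\cos\theta} & \kappa(A) + \dfrac{\kappa(A)^2\tan\theta}{\eta}
\end{array}
\end{equation}

The entries in the first row are exact, while those in the second row are upper bounds.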



## Problem 1

<hr style="border-width:4px; border-color:coral"></hr>


**Sensitivity of $\mathbf y$ to a perturbation in $\mathbf b$.**

In TB Lecture 12, the relative condition number is defined as 

\begin{equation}
\kappa = \sup_{\delta x}\left(\frac{\Vert \delta f \Vert}{\Vert f(x) \Vert}\bigg/ \frac{\Vert \delta x \Vert}{\Vert x \Vert}\right)
\end{equation}

#### Problem 1(a)

Arguing directly from this definition, establish the condition number of $\mathbf y$ with respect to perturbations in $\mathbf b$ given by TB Lecture 18

\begin{equation}
\kappa = \frac{1}{\cos \theta}
\end{equation}

**Hint:** The input "$x$" in this problem is $\mathbf b$ and the output (or model) "$f$" is $\mathbf y$.  Show geometrically that the supremum is attained with $P\delta b = \delta b$.  

#### Problem 1(b)
For $\theta = \pi/2$, the condition number is $\infty$.  Illustrate what this means by considering the least squares problem

\begin{equation}
\begin{bmatrix} 2 \\ 1 \end{bmatrix}
\begin{bmatrix} x \end{bmatrix} = 
\begin{bmatrix} -1 \\ 2 \end{bmatrix}
\end{equation}

Use the results in TB 11.11 and 11.12 (page 82) to determine the projection operator $P$ for this problem.  Then compute $\mathbf y = P\mathbf b$ and show that $P\mathbf b = 0$.  Find a perturbation $\delta \mathbf b$ so that $P\delta \mathbf b = \delta \mathbf b = \delta \mathbf y \ne 0$. Explain what a condition number $\kappa=\infty$ might mean here.  Illustrate your argument graphically. 
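
A minimal numerical sketch of this computation, assuming the rank-one projector formula $P = \mathbf a \mathbf a^T / (\mathbf a^T \mathbf a)$ for the single column $\mathbf a$ of $A$. The names `a`, `P`, `db` and `dy` are illustrative only and do not come from TB.

```python
import numpy as np

a = np.array([2.0, 1.0])      # the single column of A
b = np.array([-1.0, 2.0])     # right-hand side, orthogonal to a

# Orthogonal projector onto span{a}: P = a a^T / (a^T a)
P = np.outer(a, a) / (a @ a)

y = P @ b                     # projection of b onto range(A)
print("P =\n", P)
print("y = P b =", y)         # vanishes because a^T b = 0

# A perturbation along a is reproduced by P, so P db = db = dy
db = 1e-3 * a                 # illustrative perturbation size
dy = P @ db
print("dy =", dy)
```

Here $\Vert \mathbf y \Vert = 0$ while $\Vert \delta \mathbf y \Vert = \Vert \delta \mathbf b \Vert > 0$, so the relative change in $\mathbf y$ is unbounded no matter how small the perturbation is.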

#### Problem 1(c)

Now consider the problem

\begin{equation}
\begin{bmatrix} 2 \\ 1 \end{bmatrix}
\begin{bmatrix} x \end{bmatrix} = 
\begin{bmatrix} 2 \\ 1 \end{bmatrix}
\end{equation}

For this problem, show that $\kappa = 1$.  How does this problem differ qualitatively from the one in which $\kappa = \infty$?  
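
The two cases can be compared numerically. The sketch below assumes $\cos\theta = \Vert P\mathbf b \Vert / \Vert \mathbf b \Vert$ and builds $P$ from a reduced QR factorization; `kappa_y_b` is a hypothetical helper name, not something from TB or NumPy.

```python
import numpy as np

def kappa_y_b(A, b):
    """Return 1/cos(theta), where cos(theta) = ||P b|| / ||b|| (illustrative helper)."""
    Q, _ = np.linalg.qr(A)                 # reduced QR: columns of Q span range(A)
    y = Q @ (Q.T @ b)                      # y = P b
    cos_theta = np.linalg.norm(y) / np.linalg.norm(b)
    return np.inf if cos_theta == 0.0 else 1.0 / cos_theta

A = np.array([[2.0], [1.0]])
k_same = kappa_y_b(A, np.array([2.0, 1.0]))    # b in range(A): theta = 0
k_perp = kappa_y_b(A, np.array([-1.0, 2.0]))   # b orthogonal to range(A): theta = pi/2
print(k_same, k_perp)
```

In floating point the orthogonal case may return a very large finite number rather than `inf`, because $Q^T\mathbf b$ need not round to exactly zero.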



## Problem 2

<hr style="border-width:4px; border-color:coral"></hr>

Problem 18.1 in TB (page 136) 

##### 2(a)

In [3]:

A=array([[1,1],[1,1.0001],[1,1.0001]])
b=array([[2],[0.0001],[4.0001]])
A_inv = linalg.pinv(A)
P=A@A_inv

display_mat(" A = ",A)
display_mat("Pseudo inverse of matrix A = ",A_inv)
display_mat("P = ",P)

 A = 


array([[1.    , 1.    ],
       [1.    , 1.0001],
       [1.    , 1.0001]])


Pseudo inverse of matrix A = 


array([[10000.99999998, -4999.99999999, -4999.99999999],
       [-9999.99999998,  4999.99999999,  4999.99999999]])


P = 


array([[1.00000000e+00, 9.09494702e-13, 9.09494702e-13],
       [0.00000000e+00, 5.00000000e-01, 5.00000000e-01],
       [0.00000000e+00, 5.00000000e-01, 5.00000000e-01]])




##### 2(b)

In [4]:
#x1 = linalg.lstsq(A,b)
#display_mat(" x = ",x1[0])
x = A_inv@b
y = P@b

display_mat(" x = ",x)
display_mat(" y = ",y)

 x = 


array([[1.],
       [1.]])


 y = 


array([[2.    ],
       [2.0001],
       [2.0001]])
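
As a cross-check that avoids forming the pseudoinverse explicitly, the same $\mathbf x$ and $\mathbf y$ can be obtained from `numpy.linalg.lstsq`. This is a sketch under the assumption that $\mathbf b$ is passed as a 1-D array; it is not part of the original solution.

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 1.0001], [1.0, 1.0001]])
b = np.array([2.0, 0.0001, 4.0001])

# lstsq returns (solution, residuals, rank, singular values of A)
x, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)
y = A @ x                                  # projection of b onto range(A)

print("x =", x)
print("y =", y)
print("kappa(A) =", sv[0] / sv[-1])        # ratio of largest to smallest singular value
```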




##### 2(c)

In [5]:
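k = linalg.cond(A)    # 2-norm condition number kappa(A)

# Note: linalg.norm(A) defaults to the Frobenius norm for a matrix, whereas TB
# defines kappa(A) and eta in the 2-norm, i.e. linalg.norm(A,2).
K = linalg.norm(A)*linalg.norm(A_inv)    # Frobenius-norm estimate of kappa(A) (not used below)
theta = arccos(linalg.norm(y)/linalg.norm(b))    # angle between b and y = Pb
eta = (linalg.norm(A)*linalg.norm(x))/linalg.norm(y)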

display_mat(" K(A) = ",k)
display_mat(" theta = ",theta)
display_mat(" eta = ",eta)

 K(A) = 


42429.235416083044


 theta = 


0.684702873261185


 eta = 


1.000000000833278




##### 2(d)

In [6]:
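# 2(d): the four condition numbers from Theorem 18.1
K_by = 1/cos(theta)                    # y with respect to b
K_bx = k/(eta*cos(theta))              # x with respect to b
K_Ay = k/cos(theta)                    # y with respect to A
K_Ax = k + ((k**2)*tan(theta))/eta     # x with respect to A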
display_mat(" K_by = ",K_by)
display_mat(" K_bx = ",K_bx)
display_mat(" K_Ay = ",K_Ay)
display_mat(" K_Ax = ",K_Ax)

 K_by = 


1.290977236078942


 K_bx = 


54775.1770207547


 K_Ay = 


54775.17706639765


 K_Ax = 


1469883252.449082




##### 2(e)

## Problem 3

<hr style="border-width:4px; border-color:coral"></hr>

Show that if $(\lambda,\mathbf v)$ is an eigenvalue/eigenvector pair for matrix $A$, then $((\lambda-\mu)^{-1}, \mathbf v)$ is an eigenvalue/eigenvector pair for the matrix $(A - \mu I)^{-1}$. 

Why is this observation useful when using the power iteration to find an eigenvalue close to $\mu$?   

##### Solution
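If $\lambda$ is an eigenvalue of the matrix $A$ with corresponding eigenvector $\mathbf v \ne 0$, then
\begin{equation}
A \mathbf v = \lambda \mathbf v
\end{equation}
Subtracting $\mu \mathbf v$ from both sides gives
\begin{align}
A \mathbf v - \mu \mathbf v &= \lambda \mathbf v - \mu \mathbf v\\
(A-\mu I)\mathbf v &= (\lambda-\mu)\mathbf v
\end{align}
Assuming $\mu$ is not an eigenvalue of $A$, the matrix $A - \mu I$ is invertible and $\lambda - \mu \ne 0$. Multiplying both sides by $(A-\mu I)^{-1}$ and dividing by $\lambda - \mu$ gives
\begin{equation}
(A-\mu I)^{-1}\mathbf v = (\lambda-\mu)^{-1}\mathbf v
\end{equation}
so $((\lambda-\mu)^{-1}, \mathbf v)$ is an eigenvalue/eigenvector pair for $(A-\mu I)^{-1}$.

Power iteration converges to the eigenvector whose eigenvalue is largest in magnitude. The largest eigenvalue of $(A-\mu I)^{-1}$ in magnitude is $(\lambda-\mu)^{-1}$ for the eigenvalue $\lambda$ of $A$ closest to $\mu$, so power iteration applied to $(A-\mu I)^{-1}$ (inverse iteration) finds the eigenvalue of $A$ nearest $\mu$.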
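
The relation can be checked numerically. The sketch below uses a small symmetric matrix and a shift chosen purely for illustration, and assumes $\mu$ is not an eigenvalue of $A$. It first verifies the eigenpair relation, then runs power iteration on $(A-\mu I)^{-1}$ by solving a linear system at each step.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])    # illustrative symmetric test matrix
mu = 1.0                                   # illustrative shift, not an eigenvalue of A
I = np.eye(2)

lam, V = np.linalg.eigh(A)                 # reference eigenpairs of A
B = np.linalg.inv(A - mu * I)

# Each eigenvector v of A satisfies B v = v / (lambda - mu)
for lam_i, v_i in zip(lam, V.T):
    assert np.allclose(B @ v_i, v_i / (lam_i - mu))

# Power iteration on B, implemented by solving (A - mu I) w = v each step
v = np.array([1.0, 0.0])
for _ in range(50):
    w = np.linalg.solve(A - mu * I, v)
    v = w / np.linalg.norm(w)

lam_est = v @ A @ v                        # Rayleigh quotient of the final iterate
print(lam_est, lam)
```

The estimate `lam_est` should match the entry of `lam` closest to `mu`.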

## Problem 4

<hr style="border-width:4px; border-color:coral"></hr>

Exercise 29.1 (Lecture 29, TB page 223).  This is a five part problem that asks you to code an eigenvalue solver for a real, symmetric matrix using the shifted $QR$ algorithm.   Do your code in Python, using the Numpy `qr` algorithm where needed.  

The basic steps are : 

1.  Reduce your matrix $A$ to tridiagonal form.  You may use the hessenberg code we wrote in class. 

2.  Implement the unshifted $QR$ code (also done in class).  Use the NumPy routine `qr`.   Your iteration should stop when the off-diagonal elements are smaller (in absolute value) than  $\tau \approx 10^{-12}$.  

3.  Find all eigenvalues of a matrix $A$ using the "deflation" idea described in Algorithm 28.2. 

4. Introduce the Wilkinson shift, described in Lecture 29.   

#### Notes

* Your code should work for a real, symmetric matrix.

* Your code does not have to be efficient in the sense of optimizing the cost of matrix/vector multiplies and so on.  

* Apply your algorithm to the Hilbert matrix `scipy.linalg.hilbert`.  The entries of the $m \times m$ Hilbert matrix are given by 

\begin{equation}
H_{ij} = \frac{1}{i + j - 1}, \qquad i,j = 1,2,\dots,m
\end{equation}
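
A short sketch of this formula using NumPy broadcasting; `hilbert_matrix` is a hypothetical name used here for illustration, and its result should agree with `scipy.linalg.hilbert(m)`.

```python
import numpy as np

def hilbert_matrix(m):
    """Build the m x m Hilbert matrix H[i,j] = 1/(i + j - 1) with 1-based i, j."""
    i = np.arange(1, m + 1).reshape(-1, 1)   # column of row indices
    j = np.arange(1, m + 1).reshape(1, -1)   # row of column indices
    return 1.0 / (i + j - 1)

H3 = hilbert_matrix(3)
print(H3)
```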


In [7]:
def display_mat(msg,A):
    print(msg)
    fstr = {'float' : "{:>10.6f}".format}
    with printoptions(formatter=fstr):
        display(A)
    print("")

#### 1

In [8]:
def hessenberg(A): 
    m,n = A.shape 
    assert m == n, "A must be square" 
     
    H = A.copy() 
    Qt = eye(m) 
    for j in range(m-1): 
        x = H[j+1:,j:j+1]         
        I = eye(m-j-1) 
        s = 1 if x[0] >= 0 else -1     # sign function, with sign(0)  = 1
        v = s*linalg.norm(x,2)*I[:,0:1] + x

        vn = linalg.norm(v,2)
        if vn == 0:     # column is already zero below the diagonal; nothing to reflect
            continue
        v = v/vn
        F = I - 2*(v@v.T) 
        H[j+1:,j:] = F@H[j+1:,j:] 
        H[0:,j+1:] = H[0:,j+1:]@F   # Apply F to the right side of H.  
         
    return H 

In [9]:
# A = array(mat('1,7,3; 7,4,5; 3,5,0'),dtype='float') 
# # A = array(mat('1,2,-1,5; 3,7,4,3; 5,6,4,-1; 4,6,2,2'),dtype='float')
A = array([[3,1,0,1],[1,3,1,0],[0,1,3,1],[1,0,1,3]],dtype='float')    # plain nested list; np.mat is not available in NumPy 2.x
# display_mat("A = ",A) 
# display_mat("A.T = ",A.T)
# H = hessenberg(A) 
# display_mat("H = ",H) 

#### 2

In [11]:
def unshifted_𝑄𝑅(A,kmax = 100000, tol = 1e-12):
    m,n = A.shape
    Ak = A.copy()
    e = empty(kmax)
    Qbar = eye(m)
    for k in range(kmax):
        Q,R = qr(Ak)    # QR decomposition of the current iterate A^(k)
        Ak = R@Q        # similarity transform: R@Q = Q.T@Ak@Q
        Qbar = Qbar@Q   # accumulate the orthogonal factors
        e[k] = abs(diag(Ak,-1)).max()    # largest subdiagonal entry (A assumed tridiagonal)
        #print("{:5d} {:12.4e} ".format(k,e[k]))
        if e[k] < tol:
            break
    return Ak,Qbar
# lam,v = unshifted_𝑄𝑅(H)
# lam


##### 3

In [12]:
# ## shifted  𝑄𝑅  code
# def shifted_𝑄𝑅(A,kmax = 1000 ):
#             m,n = A.shape 
#             Ak = A.copy() 
#             mu = Ak[-1,-1]    # Raleigh shift 
#             I = eye(m)  
#             e = empty(kmax) 
#             Qbar = eye(m) 
#             for k in range(kmax): 
#                 Q,R = qr(Ak - mu*I)    # Decompostion of A^k 
#                 Ak = R@Q + mu*I 
#                 mu = Ak[-1,-1] 
#                 Qbar = Qbar@Q 
#                 e[k] = abs(Ak[1,0])     # Diagonal entry just below fir
#                 #print("{:5d} {:12.4e} ".format(k,e[k])) 
#                 if e[k] < 1e-12: 
#                         break 
#             return Ak,Qbar
# # D,Q = shifted_𝑄𝑅(H)
# # D

In [13]:
## shifted  𝑄𝑅  code


def shifted_𝑄𝑅(A,kmax = 10000, tol = 1e-12):
    m1,n = A.shape
    Ak = A.copy()
    lam = zeros((m1,1))    # eigenvalues, stored in the order they are found
    for k in range(kmax):
        m,n = Ak.shape
        if m == 1:
            lam[m1-1] = Ak[0,0]    # last remaining 1x1 block
            break

        mu = Ak[-1,-1]    # Rayleigh quotient shift
        Q,R = qr(Ak - mu*eye(m))    # QR decomposition of the shifted iterate
        Ak = R@Q + mu*eye(m)

        if abs(Ak[-1,-2]) < tol:
            print("{:5d} {:12.4e} ".format(k,Ak[-1,-2]))
            display_mat("A (after QR iteration) : ",Ak)
            lam[m1-m] = Ak[-1,-1]    # converged eigenvalue
            Ak = Ak[0:m-1,0:m-1]     # deflate: continue with the leading block
    return lam


In [14]:
# def shifted_𝑄𝑅(A,kmax = 10000 ):
#             m1,n = A.shape 
#             Ak = A.copy() 
#             mu = Ak[-1,-1]    # Raleigh shift 
#             e = empty(kmax) 
#             lam = zeros((m1,1)) 
#             for k in range(kmax):
#                 m,n = Ak.shape
#                 if m==1:
#                     lam[m] = Ak[-1,-1] 
#                 else:
#                     Q,R = qr(Ak - mu*eye(m) )   # Decompostion of A^k 
#                     Ak = R@Q + mu*eye(m) 
#                     mu = Ak[-1,-1] 
#                     if Ak[-1,-2] < 1e-16:
#                             print("{:5d} {:12.4e} ".format(k,Ak[-1,-2]))
#                             display_mat("A (after QR iteration) : ",Ak)
#                             #for i in range(0,m):
#                             lam[size(Ak,1)-1]= Ak[-1,-1]
#                             Ak = Ak[0:m-1,0:m-1]
                    
#             return lam


In [15]:

H1 = hilbert(2)
display_mat("Hilbert matrix : ", H1)
H = hessenberg(H1)
D,Q = unshifted_𝑄𝑅(H)
D

Hilbert matrix : 


array([[  1.000000,   0.500000],
       [  0.500000,   0.333333]])




array([[ 1.26759188e+00, -9.04515298e-14],
       [-9.05632337e-14,  6.57414541e-02]])

In [16]:
D1 = shifted_𝑄𝑅(H)
D1

    3  -1.2263e-22 
A (after QR iteration) : 


array([[  1.267592,   0.000000],
       [ -0.000000,   0.065741]])




array([[0.06574145],
       [1.26759188]])

In [17]:
D,Q = unshifted_𝑄𝑅(H)
diag(D)


array([1.26759188, 0.06574145])

In [18]:
linalg.norm(H1-H1.T)

0.0

In [19]:
# def WilkinsonShift( H,m=5 ):
#     # Calculate Wilkinson's shift for symmetric matrices: 
#     sigma = (H[m-2,m-2]-H[m-2,m-1])/2
# #     return H[m-2,m-1] - sign(sigma)*H[m-1,m-1]^2/(abs(sigma) + sqrt(sigma^2+H[m-1,m-1]^2))
# m=5
# H[m-1,m-1]
# mu = WilkinsonShift( H[m-2,m-2], H[m-1,m-1], H[m-2,m-1] )
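
A possible sketch of the Wilkinson shift from Lecture 29, assuming the input is a real symmetric tridiagonal matrix. With $a_{m-1}, a_m$ the last two diagonal entries and $b_{m-1}$ the last off-diagonal entry, the shift is $\mu = a_m - \mathrm{sign}(\delta)\, b_{m-1}^2 / \left(|\delta| + \sqrt{\delta^2 + b_{m-1}^2}\right)$ with $\delta = (a_{m-1} - a_m)/2$ and $\mathrm{sign}(0) = 1$. The names `wilkinson_shift` and `qr_wilkinson` are hypothetical, and the driver is one way to combine the shift with deflation, not the required solution.

```python
import numpy as np

def wilkinson_shift(T):
    """Eigenvalue of the trailing 2x2 block of T that is closest to T[-1,-1]."""
    a_prev, a_last, b = T[-2, -2], T[-1, -1], T[-1, -2]
    delta = 0.5 * (a_prev - a_last)
    sign = 1.0 if delta >= 0 else -1.0       # sign(0) is taken to be 1
    denom = abs(delta) + np.hypot(delta, b)
    if denom == 0.0:                         # block is already diagonal with equal entries
        return a_last
    return a_last - sign * b**2 / denom

def qr_wilkinson(T, tol=1e-12, kmax=10000):
    """Shifted QR with deflation for a symmetric tridiagonal T (illustrative driver)."""
    Ak = np.array(T, dtype=float)
    eigs = []
    for _ in range(kmax):
        m = Ak.shape[0]
        if m == 1:                           # last remaining 1x1 block
            eigs.append(Ak[0, 0])
            break
        mu = wilkinson_shift(Ak)
        Q, R = np.linalg.qr(Ak - mu * np.eye(m))
        Ak = R @ Q + mu * np.eye(m)
        if abs(Ak[-1, -2]) < tol:            # last off-diagonal entry has converged
            eigs.append(Ak[-1, -1])
            Ak = Ak[:-1, :-1]                # deflate to the leading block
    return np.sort(np.array(eigs))

T = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])              # illustrative tridiagonal test matrix
eigs = qr_wilkinson(T)
print(eigs)
```

The deflation test only inspects the last subdiagonal entry, so a full symmetric matrix should be reduced to tridiagonal form first.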

In [20]:
size(H1,1)

2