# DSCI 6001 5.4 Lab QR Factorization III

### Householder Factorization


## Rotation Strategies for QR Factorization


The Gram-Schmidt algorithm is not terribly efficient under heavy repeated application, however, and so **rotation strategies** are what are typically applied. These are either **Householder reflections** or **Givens rotations**. Both are typically employed in the "Francis" (QR) algorithm, which is the standard method for computing the eigenvalues of a square matrix.


In practice, the QR decomposition is usually put to work through a simple tweak:

Consider:

$A = QR$

Now reverse the order of multiplication. Since $R = Q^{-1}A$, we get:

$RQ = Q^{-1}AQ$

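It so happens, for reasons not covered in this course, that repeating this similarity transformation (factor the current matrix as $A_{k} = Q_{k}R_{k}$, then form $A_{k+1} = R_{k}Q_{k}$, starting from $A_{1} = A$) usually drives the product $R_{k}Q_{k}$ toward an upper triangular matrix, so that the eigenvalues can be read from the diagonal:

$A_{k+1} = R_{k}Q_{k} = (Q_{1}Q_{2}...Q_{k})^{-1}A(Q_{1}Q_{2}...Q_{k})$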
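The repeated similarity transformation described above can be sketched numerically. This is a minimal, unshifted version that assumes a small symmetric matrix whose eigenvalues have distinct magnitudes, so that the iterates converge. The helper name `qr_iteration`, the test matrix, and the iteration count are illustrative choices, and `np.linalg.qr` stands in for the Householder factorization developed later in this lab.

```python
import numpy as np

def qr_iteration(A, num_iters=200):
    """Unshifted QR iteration: factor A_k = Q_k R_k, then form A_{k+1} = R_k Q_k."""
    Ak = np.array(A, dtype=float)
    for _ in range(num_iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R.dot(Q)  # similar to the previous iterate: R Q = Q^T (Q R) Q
    return Ak

# Hypothetical symmetric test matrix with three distinct eigenvalues
A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 4.]])

Ak = qr_iteration(A)
approx_eigs = np.sort(np.diag(Ak))           # diagonal of the final iterate
true_eigs = np.sort(np.linalg.eigvalsh(A))   # reference eigenvalues from NumPy
print(approx_eigs)
print(true_eigs)
```

The two printed vectors should agree closely. Production eigenvalue solvers add shifts and a preliminary reduction step, neither of which is shown here.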

Each factorization in this iteration is typically computed with **Householder reflections** or **Givens rotations**. We will only cover Householder reflections today, because developing an intuition for Givens rotations takes more time than we have. You are encouraged to explore them on your own.

## Householder reflections

Householder reflections are a way of obtaining $Q$ implicitly without ever calculating the Gram-Schmidt (GS) basis directly. They are efficient to calculate using common matrix operations and map well onto vectorized (broadcast) array operations. Each reflection can be applied without explicitly forming a matrix product, and the method requires less storage and fewer operations than any of the GS algorithms.

### Householder reflections: Concepts

Householder transformations generalize reflection across a plane to $N$ dimensions, where the mirror is a hyperplane through the origin. These are matrices of the form:

$$ H_{u} = I - 2uu^{T}$$

where $I$ is the $N \times N$ identity matrix and $u$ is an $N$-dimensional unit vector.

H is symmetric: $(H_{u})^{T} = (I - 2uu^{T})^{T} = I^{T}-2(uu^{T})^{T} = I - 2uu^{T} = H_{u}$

H is also orthogonal, because $u^{T}u = 1$: $H_{u}^{T}H_{u} = (I - 2uu^{T})^{T}(I - 2uu^{T}) = I - 4uu^{T} + 4u(u^{T}u)u^{T} = I - 4uu^{T} + 4uu^{T} = I$
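Both properties are easy to confirm numerically. The sketch below builds $H_{u}$ from a random unit vector (the dimension 4 and the random seed are arbitrary choices) and checks symmetry, orthogonality, and the fact that $u$ itself is sent to $-u$.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(4)
u = u / np.linalg.norm(u)  # H_u is only a reflection when u is a unit vector

H = np.eye(4) - 2.0 * np.outer(u, u)

is_symmetric = np.allclose(H, H.T)
is_orthogonal = np.allclose(H.T.dot(H), np.eye(4))
flips_u = np.allclose(H.dot(u), -u)  # H u = u - 2 u (u^T u) = -u

print(is_symmetric, is_orthogonal, flips_u)
```

All three checks should come out true. If the normalization line is skipped, the orthogonality check fails, which is why $u$ must be a unit vector.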

When you apply $H_{u}$ to a target vector $\bf{y}$: 

$H_{u}{\bf{y}} = {\bf{y}}-2uu^{T}{\bf{y}}$
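The formula above means that $H_{u}$ never has to be built as a matrix: one inner product and one scaled subtraction are enough. The sketch below compares the explicit and the implicit application. The helper name `apply_householder` is an illustrative assumption, and it expects a 2-D array whose columns are the vectors to reflect.

```python
import numpy as np

def apply_householder(u, Y):
    """Return H_u Y = Y - 2 u (u^T Y) for a unit vector u and a 2-D array Y,
    without ever forming the matrix H_u."""
    return Y - 2.0 * np.outer(u, u.dot(Y))

u = np.array([1., 2., 2.]) / 3.0   # unit vector, since 1 + 4 + 4 = 9
Y = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])

explicit = (np.eye(3) - 2.0 * np.outer(u, u)).dot(Y)  # builds the 3x3 matrix first
implicit = apply_householder(u, Y)                    # only vector operations

print(np.allclose(explicit, implicit))
```

For an $N$-dimensional $u$, the implicit form costs on the order of $N$ operations per column instead of $N^{2}$, which is where much of the efficiency of the method comes from.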

This corresponds to reflecting ${\bf{y}}$ across the hyperplane through the origin perpendicular to ${\bf{u}}$ (a line in two dimensions), as shown in the figure below:

![householder](./imgs/householder.png)

Making use of the standard basis axes as a reference point we can choose

$$ u_{1} = \dfrac{{\bf{y}} \pm \|{\bf{y}}\|e_{1}}{\|{\bf{y}} \pm \|{\bf{y}}\|e_{1}\|}$$

Here ${\bf{y}}$ is the first column of $A$. Reflecting with this $u_{1}$ sends ${\bf{y}}$ onto the $e_{1}$ axis, to $\mp\|{\bf{y}}\|e_{1}$. Then we must construct the Householder matrix:

$$H_{u_{1}} = I - 2u_{1}u_{1}^{T}$$
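To see why this choice of $u_{1}$ lands ${\bf{y}}$ on the $e_{1}$ axis, the following short derivation may help. The symbols $w$ and $s$ are introduced only for this derivation: $s = \pm 1$ stands for the sign chosen in the numerator, and $y_{1}$ is the first component of ${\bf{y}}$.

$$ w = {\bf{y}} + s\|{\bf{y}}\|e_{1}, \qquad u_{1} = \dfrac{w}{\|w\|} $$

$$ \|w\|^{2} = 2\|{\bf{y}}\|\left(\|{\bf{y}}\| + s\,y_{1}\right), \qquad w^{T}{\bf{y}} = \|{\bf{y}}\|\left(\|{\bf{y}}\| + s\,y_{1}\right) = \tfrac{1}{2}\|w\|^{2} $$

$$ H_{u_{1}}{\bf{y}} = {\bf{y}} - 2\dfrac{w\,w^{T}{\bf{y}}}{\|w\|^{2}} = {\bf{y}} - w = -s\|{\bf{y}}\|e_{1} $$

Choosing $s$ to match the sign of $y_{1}$ keeps $\|w\|$ away from zero, which is the usual way to avoid cancellation error.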

This matrix is then applied to $A$ as a reflection:

$$ X_{1} = H_{u_{1}}A $$
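A quick numerical check of this first step, as a minimal sketch: the $3 \times 3$ matrix below is a made-up example, and the sign in the numerator is chosen to match the sign of the first entry of ${\bf{y}}$.

```python
import numpy as np

# Hypothetical example matrix; its first column has norm 6
A = np.array([[4., 1., 2.],
              [2., 3., 0.],
              [4., 5., 1.]])

y = A[:, 0]
e1 = np.array([1., 0., 0.])

# Take the "+" branch when y[0] >= 0 and the "-" branch otherwise
sign = 1.0 if y[0] >= 0 else -1.0
u1 = y + sign * np.linalg.norm(y) * e1
u1 = u1 / np.linalg.norm(u1)

H1 = np.eye(3) - 2.0 * np.outer(u1, u1)
X1 = H1.dot(A)

print(np.round(X1, 4))
```

The first column of `X1` should have zeros below its top entry, and that top entry should have magnitude equal to the norm of the original column. The other columns change too, but their norms are preserved because `H1` is orthogonal.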

The basic idea is that the Householder reflection sends ${\bf{y}}$, the first column of $A$, onto the $e_{1}$ axis by reflecting it across the hyperplane orthogonal to ${\bf{u}}$. The same reflection is applied to every column of the matrix, which amounts to an orthogonal change of coordinates. This sets the elements below the diagonal in the first column to $0$ and transforms the remaining $n-1$ columns. Then we take the next column, leave its first entry alone so that the work already done is preserved, and reflect the remaining subvector onto its own first axis, setting the elements below the diagonal to $0$ and transforming the remaining $n-2$ columns, and so on.

The Householder algorithm proceeds for an $m \times n$ matrix $A$ as follows:

set $Q = I_{m}$ and $R = A$

$for\ i\ in\ num\ columns:$

$\ \ \ \ \ {\bf{y}} = \text{column } i \text{ of } R \text{ with its first } i-1 \text{ entries set to } 0$

$\ \ \ \ \ u_{i} = \dfrac{{\bf{y}} \pm \|{\bf{y}}\|e_{i}}{\|{\bf{y}} \pm \|{\bf{y}}\|e_{i}\|}$

$\ \ \ \ \ H_{i} =  I - 2u_{i}u_{i}^{T}$

$\ \ \ \ \ R = H_{i}R$

$\ \ \ \ \ Q = QH_{i}$

Finally you end up with the product $Q = H_{1}H_{2}...H_{n}$ and the upper triangular matrix $R = H_{n}...H_{2}H_{1}A$, so that $A = QR$.

### Example:

Let's get the $QR$ decomposition of the two-column matrix 

$X = \begin{bmatrix}1. & 1.26\\1. & 1.82\\1.& 2.22 \end{bmatrix}$

We first choose a $u$ to take the first column of the matrix to the $e_{1}$ axis (the x-axis):

$u_{1} = \begin{bmatrix}1.\\1.\\1.\end{bmatrix}-\sqrt{3}\begin{bmatrix}1.\\0\\0\end{bmatrix}$

Then we need to normalize it:

$u_{1} = \dfrac{u_{1}}{\|u_{1}\|} =\dfrac{1}{1.5925} \begin{bmatrix}-0.7321\\1.\\1.\end{bmatrix}$

Now create $H_{u_{1}}$:

$H_{u_{1}} = I - 2u_{1}u_{1}^{T} = \begin{bmatrix}1. & 0 & 0\\0 & 1. & 0\\0 & 0 & 1.\end{bmatrix} - 2\begin{bmatrix} 0.21132487 & -0.28867513 & -0.28867513\\-0.28867513 & 0.39433757 & 0.39433757\\ -0.28867513 & 0.39433757 & 0.39433757\end{bmatrix}$

$H_{u_{1}} = \begin{bmatrix} 0.57735027 & 0.57735027 & 0.57735027 \\ 0.57735027 & 0.21132487 & -0.78867513\\ 0.57735027 & -0.78867513 & 0.21132487\end{bmatrix}$

And we'll start off the creation of Q by allowing the first $Q_{i}$ to be $H_{u_{1}}$:

$Q = IH_{u_{1}}$


The current state of the factorization can be taken with the matrix product of $H_{u_{1}}$ and $X$:

$R = H_{u_{1}}X = \begin{bmatrix} 1.7321 & 3.0600 \\ 0 & -0.6388 \\ 0 & -0.2388\end{bmatrix} $

Now we need to zero out the second column below the diagonal. Note that we don't want to lose the work we've done in the first row, so the next $u$ will not involve the first row:

$u_{2} = \begin{bmatrix}0\\-0.6388\\-0.2388\end{bmatrix}-0.6820\begin{bmatrix}0\\1.\\0\end{bmatrix}$

$u_{2} = \dfrac{u_{2}}{\|u_{2}\|} = \dfrac{1}{1.3422}\begin{bmatrix}0\\-1.3208\\-0.2388\end{bmatrix}$

$H_{u_{2}} = I - 2u_{2}u_{2}^{T} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -0.9367 & -0.3501\\ 0 & -0.3501 & 0.9367\end{bmatrix}$

$R = H_{u_{2}}H_{u_{1}}X = \begin{bmatrix} 1.7321 & 3.0600 \\ 0 & 0.6820 \\ 0 & 0\end{bmatrix} $

$Q = H_{u_{1}}H_{u_{2}} = \begin{bmatrix} 0.57735027 & -0.74295879 & 0.33864273 \\
  0.57735027 & 0.07820619 & -0.81274255 \\
  0.57735027 & 0.6647526 & 0.47409982 \end{bmatrix} $
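The hand computation can be cross-checked against NumPy's built-in factorization. This is a sketch under one assumption worth stating: `np.linalg.qr` may return columns of $Q$ (and the matching rows of $R$) with the opposite sign, so the comparison below is made on absolute values. The arrays `Q_text` and `R_text` are simply the numbers copied from the worked example.

```python
import numpy as np

X = np.array([[1., 1.26],
              [1., 1.82],
              [1., 2.22]])

# mode='complete' returns the full 3x3 Q and the 3x2 R, as in the example
Q_ref, R_ref = np.linalg.qr(X, mode='complete')

# Values copied from the worked example in the text
Q_text = np.array([[0.57735027, -0.74295879,  0.33864273],
                   [0.57735027,  0.07820619, -0.81274255],
                   [0.57735027,  0.6647526,   0.47409982]])
R_text = np.array([[1.7321, 3.0600],
                   [0.,     0.6820],
                   [0.,     0.    ]])

q_matches = np.allclose(np.abs(Q_ref), np.abs(Q_text), atol=1e-5)
r_matches = np.allclose(np.abs(R_ref), np.abs(R_text), atol=1e-3)
reconstructs = np.allclose(Q_text.dot(R_text), X, atol=1e-3)

print(q_matches, r_matches, reconstructs)
```

The looser tolerance on $R$ reflects the four-decimal rounding used in the text.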

## TASK:

Below is an example snippet of the above operations, given as a major hint. Below that is a code stub that you can fill out, or you can try a method of your own. 

In [40]:
import numpy as np
import numpy.linalg as LA

eps = 1.0E-10
Q = np.eye(3)

X = np.array([[1., 1.26],[1, 1.82],[1., 2.22]])
R = np.copy(X)

# we are making some shortcuts here, so as not to give everything away

u1 = X[:,0]-LA.norm(X[:,0])*np.array([1.,0.,0.])
u1 = u1/LA.norm(u1)

H1 = np.identity(3)-2.*np.outer(u1, u1)

print(np.outer(u1,u1))

print("h1")
print(H1)
print(' ')

Q = np.dot(Q, H1)
R = np.dot(H1, R)

#continuing on, with more hints

x = R[1:, 1] # now you ignore the top i rows (because they've already been solved)
e = np.zeros_like(x) # you want to do the reflection only on the elements that haven't been solved for yet
e[0] = np.copysign(np.linalg.norm(x), x[0]) # this is a useful step: giving the norm the sign of x[0]
                                            # avoids cancellation when u = x + e is formed

u = x+e
v = u / np.linalg.norm(u)

H_i = np.identity(3)
H_i[1:, 1:] -= 2.0 * np.outer(v, v)
R = np.dot(H_i, R)

Q = np.dot(Q, H_i)

# Here is a clean way to zero out round-off noise
low_values_indices = np.abs(R) < eps  # Where magnitudes are low (negative entries are kept)
R[low_values_indices] = 0  # All low values set to 0

print(Q)
print(' ')
print('and the factorization emerges')
print(R)

[[ 0.21132487 -0.28867513 -0.28867513]
 [-0.28867513  0.39433757  0.39433757]
 [-0.28867513  0.39433757  0.39433757]]
h1
[[ 0.57735027  0.57735027  0.57735027]
 [ 0.57735027  0.21132487 -0.78867513]
 [ 0.57735027 -0.78867513  0.21132487]]
 
[[ 0.57735027 -0.74295879  0.33864273]
 [ 0.57735027  0.07820619 -0.81274255]
 [ 0.57735027  0.6647526   0.47409982]]
 
and the factorization emerges
[[ 1.73205081  3.05995643]
 [ 0.          0.68195797]
 [ 0.          0.        ]]


In [56]:
import numpy as np

def householder_reflection(A):
    """Perform QR decomposition of matrix A using Householder reflection."""
    A = np.asarray(A, dtype=float)
    (rows, cols) = np.shape(A)

    # * Initialize orthogonal matrix Q and upper triangular matrix R.
    # * Q starts as the rows x rows identity
    # * R starts as a copy of A, so the input is left untouched
    eps = 1.0E-10
    Q = np.eye(rows)
    R = np.copy(A)

    # * iterate over each column that still has entries below the diagonal
    for j in range(min(rows - 1, cols)):
        # * pick out the subvector we're looking at, taken from the current R
        x = R[j:, j]
        norm_x = np.linalg.norm(x)
        if norm_x < eps:
            continue  # nothing to eliminate in this column

        # * build u = x + sign(x[0]) * ||x|| * e_1
        # * (matching the sign of x[0] avoids cancellation in the first component)
        e = np.zeros_like(x)
        e[0] = np.copysign(norm_x, x[0])
        u = x + e

        # * norm this u
        v = u / np.linalg.norm(u)

        # * build Householder reflection, acting only on rows and columns j onward
        H_j = np.identity(rows)
        H_j[j:, j:] -= 2.0 * np.outer(v, v)

        # * Apply this householder reflection to R (on the left)
        R = np.dot(H_j, R)
        # * Accumulate this householder reflection into Q (on the right)
        Q = np.dot(Q, H_j)

    # * zero out round-off noise; compare magnitudes so negative entries survive
    R[np.abs(R) < eps] = 0

    return (Q, R)

In [57]:
A = np.array([[-2, 0, 1 ],[1, -2, 1],[1, -1, 0]], dtype=float)
(Q,R) = householder_reflection(A)
print(np.allclose(Q.dot(R), A))            # Q times R reproduces A
print(np.allclose(Q.T.dot(Q), np.eye(3)))  # Q is orthogonal
print(np.allclose(R, np.triu(R)))          # R is upper triangular

True
True
True
