# QR Factorization

---

The __QR factorization__ of an $m\times n$ matrix $A$, where $m\geq n$, is

$$
A=QR, 
$$

where $Q$ is an __orthonormal matrix__ of size $m\times m$, that is,

$$
Q^TQ=Q Q^T=I,
$$

and $R$ is an $m\times n$ upper triangular matrix.

An orthonormal matrix is also called an __orthogonal matrix__.

For example,

\begin{align*}
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33} \\
a_{41} & a_{42} & a_{43} \\
a_{51} & a_{52} & a_{53}
\end{bmatrix}=
\begin{bmatrix}
q_{11} & q_{12} & q_{13} & q_{14} & q_{15} \\
q_{21} & q_{22} & q_{23} & q_{24} & q_{25} \\
q_{31} & q_{32} & q_{33} & q_{34} & q_{35} \\
q_{41} & q_{42} & q_{43} & q_{44} & q_{45} \\
q_{51} & q_{52} & q_{53} & q_{54} & q_{55}
\end{bmatrix}
\begin{bmatrix}
r_{11} & r_{12} & r_{13} \\
0 & r_{22} & r_{23} \\
0 & 0 & r_{33} \\
0 & 0 & 0 \\
0 & 0 & 0 
\end{bmatrix}. \tag{1}
\end{align*}

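Since the last two rows of $R$ in (1) are zero, the last two columns of $Q$ do not contribute to the product. Therefore, the relation (1) also defines the __economical QR factorization__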

\begin{align*}
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33} \\
a_{41} & a_{42} & a_{43} \\
a_{51} & a_{52} & a_{53}
\end{bmatrix}=
\begin{bmatrix}
q_{11} & q_{12} & q_{13} \\
q_{21} & q_{22} & q_{23} \\
q_{31} & q_{32} & q_{33} \\
q_{41} & q_{42} & q_{43} \\
q_{51} & q_{52} & q_{53}
\end{bmatrix}
\begin{bmatrix}
r_{11} & r_{12} & r_{13} \\
0 & r_{22} & r_{23} \\
0 & 0 & r_{33}
\end{bmatrix}. \tag{2}
\end{align*}
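
The same reduction can be sketched for general $m\geq n$ in block form. The symbols $Q_1$, $Q_2$ and $R_1$ are notation introduced here for illustration only:

$$
A=\begin{bmatrix} Q_1 & Q_2 \end{bmatrix}
\begin{bmatrix} R_1 \\ 0 \end{bmatrix}=Q_1R_1,
$$

where $Q_1$ is $m\times n$, $Q_2$ is $m\times (m-n)$, and $R_1$ is $n\times n$ upper triangular. The columns of $Q_1$ are orthonormal, $Q_1^TQ_1=I_n$, but for $m>n$ the matrix $Q_1$ is not square, so $Q_1Q_1^T\neq I_m$.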


Equating the columns in (2), starting from the first one, gives

\begin{align*}
t&=a_{:1}\\
r_{11}&=\|t\|_2 \\
q_{:1}&=t\frac{1}{r_{11}}\\
r_{12}&= q_{:1}^Ta_{:2} \\
t&=a_{:2}-q_{:1}r_{12} \\
r_{22}&=\|t\|_2 \\
q_{:2}&=t\frac{1}{r_{22}} \\
r_{13}&=q_{:1}^Ta_{:3} \\
r_{23}&=q_{:2}^Ta_{:3} \\
t&=a_{:3}-q_{:1}r_{13}-q_{:2}r_{23}\\
r_{33}&=\|t\|_2 \\
q_{:3}&=t\frac{1}{r_{33}}.
\end{align*}

Induction yields the __Gram-Schmidt orthogonalization procedure__.
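
Written out for a general column $k=1,\ldots,n$, the pattern of the equations above reads as follows (a sketch, assuming that the columns of $A$ are linearly independent so that $r_{kk}\neq 0$):

\begin{align*}
r_{ik}&=q_{:i}^Ta_{:k}, \qquad i=1,\ldots,k-1,\\
t&=a_{:k}-\sum_{i=1}^{k-1} q_{:i}r_{ik},\\
r_{kk}&=\|t\|_2, \\
q_{:k}&=t\frac{1}{r_{kk}}.
\end{align*}

For $k=1$ the sum is empty, so $t=a_{:1}$.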

In [1]:
using LinearAlgebra
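function GramSchmidtQR(A::Array)
    # Classical Gram-Schmidt: returns the economical factors Q (m×n) and R (n×n)
    m,n=size(A)
    R=zeros(n,n)
    Q=Array{Float64}(undef,m,n)
    # The first column is only normalized
    R[1,1]=norm(A[:,1])
    Q[:,1]=A[:,1]/R[1,1]
    for k=2:n
        # Projections of the k-th column onto the previous q's
        for i=1:k-1
            R[i,k]=Q[:,i]⋅A[:,k]
        end
        # Subtract the projections and normalize the remainder
        t=A[:,k]-sum([R[i,k]*Q[:,i] for i=1:k-1])
        R[k,k]=norm(t)
        Q[:,k]=t/R[k,k]
    end
    Q,R
end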

GramSchmidtQR (generic function with 1 method)

In [2]:
import Random
Random.seed!(123)
A=rand(8,5)

8×5 Array{Float64,2}:
 0.768448   0.26864   0.275819  0.20923   0.356221
 0.940515   0.108871  0.446568  0.918165  0.900925
 0.673959   0.163666  0.582318  0.614255  0.529253
 0.395453   0.473017  0.255981  0.802665  0.031831
 0.313244   0.865412  0.70586   0.555668  0.900681
 0.662555   0.617492  0.291978  0.940782  0.940299
 0.586022   0.285698  0.281066  0.48      0.621379
 0.0521332  0.463847  0.792931  0.790201  0.348173

In [3]:
Q,R=GramSchmidtQR(A)

([0.4459793428925313 -0.11247250760342886 … -0.5812166712836059 -0.41412595632067195; 0.5458410188985631 -0.35479752607940923 … 0.2965179575333956 0.30357335777811606; … ; 0.3401061260963262 -0.005962161430000556 … -0.07610095803321874 0.14145810224715058; 0.030256209503033652 0.43235726102632527 … 0.25137006145700375 -0.1283593711674988], [1.7230566559710467 0.8577805301619602 … 1.6688916514356933 1.6121226948829994; 0.0 1.012805983168232 … 0.760567613361574 0.6039879528877767; … ; 0.0 0.0 … 0.6864929839600884 -0.0027145072616664018; 0.0 0.0 … 0.0 0.6528889504004459])

In [4]:
Q

8×5 Array{Float64,2}:
 0.445979   -0.112473    -0.144567   -0.581217   -0.414126
 0.545841   -0.354798     0.210356    0.296518    0.303573
 0.391141   -0.169675     0.45213    -0.0982655  -0.123261
 0.229507    0.272659    -0.24854     0.435716   -0.699857
 0.181796    0.700501     0.0463297  -0.432191    0.268038
 0.384523    0.284018    -0.440047    0.344953    0.350741
 0.340106   -0.00596216  -0.0882082  -0.076101    0.141458
 0.0302562   0.432357     0.681975    0.25137    -0.128359

In [5]:
Q'*Q

5×5 Array{Float64,2}:
 1.0           1.75748e-16   1.92453e-16   3.41335e-16   1.11022e-16
 1.75748e-16   1.0          -1.1918e-16   -2.4266e-16   -2.22045e-16
 1.92453e-16  -1.1918e-16    1.0          -2.02216e-16  -2.77556e-16
 3.41335e-16  -2.4266e-16   -2.02216e-16   1.0          -6.10623e-16
 1.11022e-16  -2.22045e-16  -2.77556e-16  -6.10623e-16   1.0

In [6]:
R

5×5 Array{Float64,2}:
 1.72306  0.857781  1.01346   1.66889    1.61212
 0.0      1.01281   0.700064  0.760568   0.603988
 0.0      0.0       0.67391   0.349435   0.179984
 0.0      0.0       0.0       0.686493  -0.00271451
 0.0      0.0       0.0       0.0        0.652889

In [7]:
# Residual
A-Q*R

8×5 Array{Float64,2}:
 0.0   0.0           0.0           0.0           0.0
 0.0   2.77556e-17   5.55112e-17  -1.11022e-16   0.0
 0.0   0.0           0.0           0.0           0.0
 0.0   5.55112e-17   0.0           0.0          -2.77556e-17
 0.0   0.0          -1.11022e-16  -1.11022e-16   0.0
 0.0   0.0           0.0          -1.11022e-16   0.0
 0.0   0.0           0.0           0.0           0.0
 0.0  -5.55112e-17   0.0           0.0           0.0

The algorithm `GramSchmidtQR()` is numerically unstable (the computed columns of $Q$ can lose orthogonality when $A$ is ill-conditioned), so it is better to use the __modified Gram-Schmidt algorithm__, __Householder reflectors__, or __Givens rotations__ (see [Matrix Computations, Chapter 5][GVL13]).

[GVL13]: https://books.google.hr/books?id=X5YfsuCWpxMC&printsec=frontcover&hl=hr#v=onepage&q&f=false "G. H. Golub and C. F. Van Loan, 'Matrix Computations', 4th Edition, Johns Hopkins University Press, Baltimore, 2013"
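
As an illustration of the first alternative, here is a minimal sketch of the modified Gram-Schmidt algorithm. The name `ModifiedGramSchmidtQR` is introduced here and is not part of the original notebook. In exact arithmetic the result coincides with the classical variant; the difference is that each projection is subtracted from the remaining columns as soon as a new $q_{:k}$ is available, so later inner products use already updated columns.

```julia
using LinearAlgebra

# Hypothetical helper (not from the original notebook): modified Gram-Schmidt.
# Returns the economical factors Q (m×n) and R (n×n), assuming full column rank.
function ModifiedGramSchmidtQR(A::AbstractMatrix)
    m, n = size(A)
    Q = Matrix{Float64}(A)      # working copy, orthogonalized column by column
    R = zeros(n, n)
    for k = 1:n
        # Normalize the current (already updated) column
        R[k, k] = norm(Q[:, k])
        Q[:, k] = Q[:, k] / R[k, k]
        # Immediately remove the component along q_k from the remaining columns
        for j = k+1:n
            R[k, j] = Q[:, k] ⋅ Q[:, j]
            Q[:, j] = Q[:, j] - R[k, j] * Q[:, k]
        end
    end
    Q, R
end

A = [1.0 2.0; 3.0 4.0; 5.0 6.0]
Q, R = ModifiedGramSchmidtQR(A)
norm(Q' * Q - I), norm(A - Q * R)   # both should be of the order of machine precision
```

Comparing `norm(Q'*Q-I)` for this function and for `GramSchmidtQR()` on an ill-conditioned matrix is one way to observe the difference in practice.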

## Householder reflectors

The __QR factorization of a vector__ $x$ is

$$
H \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m 
\end{bmatrix}  =r,
$$

where

$$
H=I - \frac{2}{v^Tv}v v^T, \qquad  
v=\begin{bmatrix}
x_1\pm \|x\|_2 \\ x_2 \\ x_3 \\ \vdots \\ x_m
\end{bmatrix}.
$$ 

The __Householder reflector__ $H$ is a __symmetric__ and __orthogonal__ matrix (__prove it!__). Depending on the choice of sign in the definition of the vector $v$, we have

$$
r=\begin{bmatrix} \mp \|x\|_2 \\ 0 \\ \vdots \\ 0
\end{bmatrix}.
$$
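
The form of $r$ can be checked directly. As a short sketch, write $v=x\pm\|x\|_2e_1$, where $e_1$ (notation introduced here) is the first column of the identity matrix. Then

\begin{align*}
v^Tx&=\|x\|_2^2\pm x_1\|x\|_2,\\
v^Tv&=\|x\|_2^2\pm 2x_1\|x\|_2+\|x\|_2^2=2\big(\|x\|_2^2\pm x_1\|x\|_2\big),
\end{align*}

so $\frac{2\,v^Tx}{v^Tv}=1$ and

$$
Hx=x-\frac{2\,v^Tx}{v^Tv}v=x-v=\mp\|x\|_2e_1.
$$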

For the sake of numerical stability (to avoid cancellation when computing $v_1$), the standard choice is

$$
v_1=x_1+\mathop{\mathrm{sign}} (x_1) \|x\|_2.
$$

The matrix $H$ is __not explicitly computed__; instead, the product $Hx$ is computed using the formula

$$
Hx=x-\frac{2(v^Tx)}{v^Tv}v=x-\frac{2 (v\cdot x)}{v\cdot v}v,
$$

which requires only $O(m)$ operations (about $6m$ floating-point operations).
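
The following small check is a sketch and not part of the original notebook. It forms $H$ explicitly only to compare it with the implicit formula and to verify symmetry and orthogonality numerically. The vector `x` is chosen so that $\|x\|_2=6$, and the names `Hx_explicit` and `Hx_implicit` are introduced here.

```julia
using LinearAlgebra

x = [3.0, 1.0, 5.0, 1.0]                 # ‖x‖₂ = sqrt(9+1+25+1) = 6
v = copy(x)
v[1] = x[1] + sign(x[1]) * norm(x)       # standard sign choice (x[1] ≠ 0 here)

# Explicit reflector: needs O(m²) storage, used only for the comparison
H = I - (2 / (v ⋅ v)) * (v * v')

Hx_explicit = H * x
Hx_implicit = x - (2 * (v ⋅ x) / (v ⋅ v)) * v    # O(m) work, H is never formed

sym_err  = norm(H - H')                  # symmetry:      H = Hᵀ
orth_err = norm(H' * H - I)              # orthogonality: HᵀH = I
```

Both products equal $-\|x\|_2e_1=(-6,0,0,0)^T$ up to rounding, and `sym_err` and `orth_err` are at the level of machine precision.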

In [8]:
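function HouseholderVector(x::Array)
    # Computes the Householder vector v
    v=copy(x)
    # copysign() treats x[1]==0 as positive; sign(0)==0 would give v=x and Hx=-x
    v[1]=x[1]+copysign(norm(x),x[1])
    v
end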

HouseholderVector (generic function with 1 method)

In [9]:
x=rand(8)
v=HouseholderVector(x)
β=(2/(v⋅v))*(v⋅x)
x-β*v

8-element Array{Float64,1}:
 -1.2568320365216425
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0
  0.0

In [10]:
x

8-element Array{Float64,1}:
 0.57061318926682
 0.20399662276026764
 0.37498027020240765
 0.7597546535455133
 0.19178019953776593
 0.23454367560056988
 0.09766980331672781
 0.6270929965804597

In [11]:
norm(x)

1.2568320365216425

The QR factorization of a matrix is computed by recursively applying the QR factorization of a vector to its columns:

In [30]:
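function HouseholderQR(A₁::Array)
    # Computes Q and R such that A₁=Q*R
    A=copy(A₁)
    m,n=size(A)
    Q=Matrix{Float64}(I,m,m) # Identity matrix; accumulates Q'
    for k=1:n
        # Reflector which annihilates A[k+1:m,k]
        v=HouseholderVector(A[k:m,k])
        β=(2/(v⋅v))*v
        # Apply the reflector without forming H explicitly
        A[k:m,k:n]=A[k:m,k:n]-β*(v'*A[k:m,k:n])
        Q[k:m,:]=Q[k:m,:]-β*(v'*Q[k:m,:])
    end
    # Clear the rounding errors below the diagonal
    R=triu(A)
    Q',R
end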
    

HouseholderQR (generic function with 1 method)
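
In matrix terms, the loop above can be summarized as follows. This is a sketch in which $H_k$ and $\tilde H_k$ are notation introduced here: step $k$ multiplies from the left by

$$
H_k=\begin{bmatrix} I_{k-1} & 0 \\ 0 & \tilde H_k \end{bmatrix},
$$

where $\tilde H_k$ is the Householder reflector of size $(m-k+1)\times(m-k+1)$ built from the vector `v`. After $n$ steps,

$$
H_n\cdots H_2H_1A=R, \qquad Q=H_1H_2\cdots H_n,
$$

since each $H_k$ is symmetric and orthogonal. The variable `Q` inside the loop accumulates the product $H_n\cdots H_1=Q^T$, which is why the function returns `Q'`.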

In [13]:
A

8×5 Array{Float64,2}:
 0.768448   0.26864   0.275819  0.20923   0.356221
 0.940515   0.108871  0.446568  0.918165  0.900925
 0.673959   0.163666  0.582318  0.614255  0.529253
 0.395453   0.473017  0.255981  0.802665  0.031831
 0.313244   0.865412  0.70586   0.555668  0.900681
 0.662555   0.617492  0.291978  0.940782  0.940299
 0.586022   0.285698  0.281066  0.48      0.621379
 0.0521332  0.463847  0.792931  0.790201  0.348173

In [14]:
Q,R=HouseholderQR(A)

([-0.4459793428925316 -0.1124725076034288 … -0.1846088570751743 0.4453989018303586; -0.5458410188985632 -0.3547975260794096 … -0.28453074300651393 0.16172625875188423; … ; -0.3401061260963262 -0.005962161430000654 … 0.9116119148373945 0.0807265826627886; -0.030256209503033656 0.4323572610263252 … 0.08087671084130417 0.4727799552410986], [-1.7230566559710465 -0.8577805301619605 … -1.6688916514356935 -1.6121226948829999; 0.0 1.0128059831682323 … 0.7605676133615736 0.6039879528877765; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0])

In [15]:
Q'*A

8×5 Array{Float64,2}:
 -1.72306      -0.857781     -1.01346      -1.66889      -1.61212
 -1.16367e-16   1.01281       0.700064      0.760568      0.603988
  5.11766e-17   3.83323e-17  -0.67391      -0.349435     -0.179984
  8.45621e-17   8.59941e-17   2.78295e-17  -0.686493      0.00271451
  1.58233e-16   1.63517e-16   1.11777e-16   1.2178e-16   -0.652889
  1.5773e-16    2.4082e-16    1.82188e-16   1.02916e-16   2.22045e-16
  7.12178e-17   9.25087e-17   4.25446e-17   5.53655e-17   0.0
 -4.60282e-17  -5.58089e-17  -4.70489e-17  -2.54873e-16  -2.77556e-17

In [16]:
R

8×5 Array{Float64,2}:
 -1.72306  -0.857781  -1.01346   -1.66889   -1.61212
  0.0       1.01281    0.700064   0.760568   0.603988
  0.0       0.0       -0.67391   -0.349435  -0.179984
  0.0       0.0        0.0       -0.686493   0.00271451
  0.0       0.0        0.0        0.0       -0.652889
  0.0       0.0        0.0        0.0        0.0
  0.0       0.0        0.0        0.0        0.0
  0.0       0.0        0.0        0.0        0.0

The function `HouseholderQR()` is illustrative. Professional implementations have the following properties:

* computation is performed with block matrices (the usual block size is 32 or 64),
* the Householder vector is scaled as $\hat v=v/v_1$. Thus, $\hat v_1=1$, while the remaining elements of $\hat v$ are stored in the strict lower triangle of $A$,
* if the matrix $Q$ is required, the accumulation is done backwards using the stored vectors $\hat v$ (this reduces the operation count),
* there is an option of returning the economical factorization,
* there is an option of using __pivoting__: in each step, the column of the current submatrix with the largest norm is brought to the pivot position, so
$$
AP=QR.
$$
In this case,
$$
|R_{kk}|\geq |R_{k+1,k+1}|,
$$
which can be used to determine the __numerical rank__ of the matrix $A$.
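
As a rough sketch of how the decreasing diagonal of $R$ can be used, the following function counts the diagonal entries that are not negligible relative to $|R_{11}|$. The name `numericalrank` and the default tolerance `1e-10` are assumptions made for this illustration, not a standard API. The call `qr(A,Val(true))` is the form used in this notebook; newer Julia versions (1.7 and later) spell it `qr(A,ColumnNorm())`.

```julia
using LinearAlgebra

# Hypothetical helper: estimate the numerical rank from pivoted QR.
# The tolerance is an arbitrary choice for this illustration.
function numericalrank(A::AbstractMatrix; tol=1e-10)
    F = qr(A, Val(true))        # pivoted factorization, AP = QR
    d = abs.(diag(F.R))         # non-increasing because of pivoting
    count(d .> tol * d[1])      # entries not negligible relative to |R₁₁|
end

# The third column equals 2*(second column) - (first column), so the rank is 2
A = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0]
numericalrank(A)
```

For this matrix the function should report 2, since $|R_{33}|$ is at the level of rounding errors.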

In [17]:
# ?qr  # See the description

In [18]:
# Return the QR object
F=qr(A)

LinearAlgebra.QRCompactWY{Float64,Array{Float64,2}}
Q factor:
8×8 LinearAlgebra.QRCompactWYQ{Float64,Array{Float64,2}}:
 -0.445979   -0.112473     0.144567   …   0.160558  -0.184609    0.445399
 -0.545841   -0.354798    -0.210356      -0.494706  -0.284531    0.161726
 -0.391141   -0.169675    -0.45213        0.385539   0.0148903  -0.66339
 -0.229507    0.272659     0.24854       -0.295448   0.0184688  -0.209603
 -0.181796    0.700501    -0.0463297     -0.362887  -0.160778   -0.240692
 -0.384523    0.284018     0.440047   …   0.558011  -0.144818    0.0589447
 -0.340106   -0.00596216   0.0882082     -0.114703   0.911612    0.0807266
 -0.0302562   0.432357    -0.681975       0.193228   0.0808767   0.47278
R factor:
5×5 Array{Float64,2}:
 -1.72306  -0.857781  -1.01346   -1.66889   -1.61212
  0.0       1.01281    0.700064   0.760568   0.603988
  0.0       0.0       -0.67391   -0.349435  -0.179984
  0.0       0.0        0.0       -0.686493   0.00271451
  0.0       0.0        0.0        0.0       -0.652889

In [19]:
F.Q'*A

8×5 Array{Float64,2}:
 -1.72306      -0.857781     -1.01346      -1.66889      -1.61212
  0.0           1.01281       0.700064      0.760568      0.603988
 -2.22045e-16  -1.38778e-16  -0.67391      -0.349435     -0.179984
  0.0           2.77556e-16  -2.22045e-16  -0.686493      0.00271451
 -1.11022e-16   0.0           4.44089e-16  -2.22045e-16  -0.652889
  0.0           1.11022e-16   1.66533e-16   0.0           2.22045e-16
  1.11022e-16  -5.55112e-17   5.55112e-17   0.0           1.11022e-16
 -1.66533e-16   1.11022e-16   0.0           0.0           0.0

In [20]:
F.Q*F.R

8×5 Array{Float64,2}:
 0.768448   0.26864   0.275819  0.20923   0.356221
 0.940515   0.108871  0.446568  0.918165  0.900925
 0.673959   0.163666  0.582318  0.614255  0.529253
 0.395453   0.473017  0.255981  0.802665  0.031831
 0.313244   0.865412  0.70586   0.555668  0.900681
 0.662555   0.617492  0.291978  0.940782  0.940299
 0.586022   0.285698  0.281066  0.48      0.621379
 0.0521332  0.463847  0.792931  0.790201  0.348173

In [21]:
F=qr(A,Val(true))

QRPivoted{Float64,Array{Float64,2}}
Q factor:
8×8 LinearAlgebra.QRPackedQ{Float64,Array{Float64,2}}:
 -0.105181  -0.657376   -0.16259    …  -0.418105   -0.280647   -0.0682982
 -0.461568  -0.291448    0.0230722     -0.0183557  -0.262798    0.531458
 -0.30879   -0.242706   -0.109412       0.57546     0.1056     -0.496631
 -0.403505   0.200334   -0.681506       0.246128    0.0857566   0.252251
 -0.279338   0.0965824   0.643895       0.323668   -0.0771281   0.323665
 -0.472938   0.0225186   0.243739   …  -0.117048   -0.204861   -0.529297
 -0.241299  -0.252973    0.145506      -0.246011    0.885934    0.0728644
 -0.397239   0.556817   -0.0378098     -0.504117   -0.0295477  -0.111318
R factor:
5×5 Array{Float64,2}:
 -1.98923  -1.44558   -1.61412   -1.10689   -1.2363
  0.0      -0.937667  -0.473979   0.130204   0.0436452
  0.0       0.0        0.76965    0.350337   0.263875
  0.0       0.0        0.0       -0.629825  -0.177484
  0.0       0.0        0.0        0.0       -0.582983
permutation:

In [22]:
# Pivoting vector
F.p

5-element Array{Int64,1}:
 4
 1
 5
 2
 3

In [23]:
# Permutation matrix
F.P

5×5 Array{Float64,2}:
 0.0  1.0  0.0  0.0  0.0
 0.0  0.0  0.0  1.0  0.0
 0.0  0.0  0.0  0.0  1.0
 1.0  0.0  0.0  0.0  0.0
 0.0  0.0  1.0  0.0  0.0

In [24]:
# Residual using permutation matrix
F.Q*F.R-A*F.P

8×5 Array{Float64,2}:
 0.0          8.88178e-16  4.44089e-16  4.44089e-16  4.44089e-16
 0.0          4.44089e-16  2.22045e-16  1.11022e-16  1.11022e-16
 0.0          3.33067e-16  2.22045e-16  1.66533e-16  1.11022e-16
 0.0          2.77556e-16  2.22045e-16  2.77556e-16  5.55112e-17
 0.0          3.33067e-16  1.11022e-16  1.11022e-16  0.0
 1.11022e-16  3.33067e-16  2.22045e-16  1.11022e-16  2.22045e-16
 5.55112e-17  2.22045e-16  1.11022e-16  1.11022e-16  1.11022e-16
 1.11022e-16  2.22045e-16  1.11022e-16  1.66533e-16  2.22045e-16

In [25]:
# Residual using pivot vector
F.Q*F.R-A[:,F.p]

8×5 Array{Float64,2}:
 0.0          8.88178e-16  4.44089e-16  4.44089e-16  4.44089e-16
 0.0          4.44089e-16  2.22045e-16  1.11022e-16  1.11022e-16
 0.0          3.33067e-16  2.22045e-16  1.66533e-16  1.11022e-16
 0.0          2.77556e-16  2.22045e-16  2.77556e-16  5.55112e-17
 0.0          3.33067e-16  1.11022e-16  1.11022e-16  0.0
 1.11022e-16  3.33067e-16  2.22045e-16  1.11022e-16  2.22045e-16
 5.55112e-17  2.22045e-16  1.11022e-16  1.11022e-16  1.11022e-16
 1.11022e-16  2.22045e-16  1.11022e-16  1.66533e-16  2.22045e-16

## Speed

The number of floating-point operations needed to compute the QR factorization of an $n\times n$ matrix is $O\big(\frac{4}{3}n^3\big)$ to compute $R$ and an additional $O\big(\frac{4}{3}n^3\big)$ to compute $Q$.


In [26]:
n=512
A=rand(n,n);

In [27]:
@time qr(A);

  0.014455 seconds (7 allocations: 2.282 MiB)


In [28]:
@time qr(A,Val(true));

  0.056790 seconds (7 allocations: 2.141 MiB)


In [29]:
@time HouseholderQR(A);

  2.080790 seconds (10.56 k allocations: 3.354 GiB, 14.89% gc time)


## Accuracy

Let $\varepsilon$ denote the machine precision. The matrices $\hat Q$ and $\hat R$ computed by the Householder method satisfy

\begin{align*}
\hat Q^T\hat Q& =I+E, \qquad \|E \|_2\approx \varepsilon,\\ 
\| A-\hat Q\hat R\|_2& \approx \varepsilon\|A\|_2.
\end{align*}

Also, there exists an orthogonal matrix $Q$ for which

$$\| A- Q\hat R\|_2\approx \varepsilon\|A\|_2.
$$
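
These bounds can be observed numerically. The following is a minimal sketch using the built-in `qr()`; the matrix size and the variable names `orth_err` and `resid` are arbitrary choices made here.

```julia
using LinearAlgebra

A = rand(50, 20)
F = qr(A)
Q = Matrix(F.Q)     # economical Q, 50×20
R = F.R             # 20×20 upper triangular

orth_err = opnorm(Q' * Q - I)             # ‖Q̂ᵀQ̂ - I‖₂
resid    = opnorm(A - Q * R) / opnorm(A)  # ‖A - Q̂R̂‖₂ / ‖A‖₂
eps(Float64), orth_err, resid
```

Both `orth_err` and `resid` are expected to be small multiples of `eps(Float64)`, which is about $2.2\cdot 10^{-16}$.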