---
# Section 3.2: Orthogonal Matrices
---

## Inner-product notation

We will use the following notation for the **inner product** of vectors $x, y \in \mathbb{R}^n$:

$$
\langle x, y \rangle = \sum_{i=1}^n x_i y_i = x^T y = \|x\|_2 \|y\|_2 \cos\theta,
$$

where $0 \leq \theta \leq \pi$ is the **angle** between $x$ and $y$.

**Note:** $\|x\|_2 = \sqrt{\langle x, x \rangle}$.
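As a quick numerical illustration, the sketch below evaluates the inner product in three equivalent ways and recovers the angle $\theta$ from the cosine formula. The vectors `x` and `y` are arbitrary values chosen only for this example.

```julia
using LinearAlgebra

# Example vectors (arbitrary choices for illustration)
x = [1.0, 0.0]
y = [1.0, 1.0]

ip_sum = sum(x[i]*y[i] for i in eachindex(x))  # Σ xᵢ yᵢ
ip_mat = x'y                                   # xᵀ y
ip_dot = dot(x, y)                             # built-in inner product

# Angle between x and y, from ⟨x, y⟩ = ‖x‖₂ ‖y‖₂ cos θ
θ = acos(ip_dot/(norm(x)*norm(y)))

ip_sum == ip_mat == ip_dot, θ ≈ π/4
```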

---

## Orthogonal matrix definition

$Q \in \mathbb{R}^{n \times n}$ is **orthogonal** if the columns of $Q$ are:

1. **unit-length**: 
$$
\|q_i\|_2 = 1, \qquad \forall i,
$$
2. **mutually orthogonal**: 
$$
\langle q_i, q_j \rangle = 0, \qquad i \neq j
$$

This is equivalent to saying that

$$Q^T Q = I$$

which is equivalent to

$$Q^{-1} = Q^T.$$

The rows of $Q$ are also unit-length and mutually orthogonal since $QQ^T = I$.

---

## Exercise

1. Prove that the product of orthogonal matrices is orthogonal.

2. Prove that the transpose of an orthogonal matrix is orthogonal.

### Part 1

Suppose that $Q_i \in \mathbb{R}^{n \times n}$ is orthogonal for $i = 1,\ldots,k$. Let

$$
Q = Q_1 Q_2 \cdots Q_k.
$$

Now we want to show that $Q$ is orthogonal. To show this, we compute

\begin{align}
Q^T Q 
& = (Q_1 Q_2 \cdots Q_k)^T (Q_1 Q_2 \cdots Q_k) \\
& = (Q_k^T \cdots Q_2^T Q_1^T) (Q_1 Q_2 \cdots Q_k) \\
& = Q_k^T \cdots Q_2^T (Q_1^T Q_1) Q_2 \cdots Q_k \\
& = Q_k^T \cdots Q_2^T I Q_2 \cdots Q_k \\
& = Q_k^T \cdots Q_3^T (Q_2^T Q_2) Q_3 \cdots Q_k \\
& = Q_k^T \cdots Q_3^T Q_3 \cdots Q_k \\
& \quad \vdots \\
& = Q_k^T Q_k \\
& = I.
\end{align}

Therefore, $Q^T Q = I$, so $Q$ is orthogonal.

### Part 2

Suppose that $Q \in \mathbb{R}^{n \times n}$ is orthogonal. To show that $Q^T$ is orthogonal, we need to show that $(Q^T)^T (Q^T) = I$. So, we compute

$$
(Q^T)^T (Q^T) = Q Q^T = Q Q^{-1} = I.
$$

Therefore, $Q^T$ is also an orthogonal matrix.
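Both results can be checked numerically. The sketch below is only an illustration, not a proof. The helper `random_orthogonal` is a hypothetical name introduced here; it takes the $Q$ factor of a random matrix as a convenient source of orthogonal test matrices.

```julia
using LinearAlgebra

# Hypothetical helper: a random orthogonal matrix, taken from the Q factor of a QR factorization
random_orthogonal(n) = Matrix(qr(randn(n, n)).Q)

n = 4
Q1 = random_orthogonal(n)
Q2 = random_orthogonal(n)

P = Q1*Q2                           # product of orthogonal matrices

err_product   = norm(P'P - I)       # should be near machine precision
err_transpose = norm(Q1*Q1' - I)    # (Qᵀ)ᵀ(Qᵀ) = Q Qᵀ
```

Both errors should be a small multiple of `eps()`.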

---

> ## Theorem:
>
> If $Q \in \mathbb{R}^{n \times n}$ is orthogonal, then:
>
> 1. $\langle Qx, Qy \rangle = \langle x, y \rangle$
>
> 2. $\|Qx\|_2 = \|x\|_2$
>
> This theorem states that any **orthogonal transformation**, $x \mapsto Qx$, preserves angles and lengths.

---

## Exercise:

Prove the theorem.

### Part 1

Since $Q$ is an orthogonal matrix, we have

\begin{align}
\langle Qx, Qy \rangle 
&= (Qx)^T (Qy) \\
&= x^T Q^T Q y \\
&= x^T I y \\
&= x^T y \\
&= \langle x, y \rangle.
\end{align}

### Part 2

Using part 1, we have

\begin{align}
\| Q x \|_2 
&= \sqrt{ \langle Q x, Q x \rangle } \\
&= \sqrt{ \langle x, x \rangle } \\
&= \| x \|_2.
\end{align}
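A small numerical check of the theorem, under the assumption that the $Q$ factor returned by Julia's `qr` is a suitable orthogonal test matrix. The vectors are random, so the differences are expected to be at the level of rounding error rather than exactly zero.

```julia
using LinearAlgebra

n = 3
Q = Matrix(qr(randn(n, n)).Q)   # an orthogonal test matrix

x = randn(n)
y = randn(n)

# Inner products and lengths before and after the transformation
ip_diff   = dot(Q*x, Q*y) - dot(x, y)
norm_diff = norm(Q*x) - norm(x)
```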

---

## Rotation matrices

A $2 \times 2$ rotation matrix has the form

$$
Q = \begin{bmatrix}
\cos\theta & -\sin\theta \\
\sin\theta & \cos\theta
\end{bmatrix}.
$$

We can use rotation matrices to introduce zeros into vectors.


---

## Exercise

1. Prove that $2 \times 2$ rotation matrices are orthogonal.

2. Find a rotation matrix $Q$ such that
$$
Q^T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} y_1 \\ 0 \end{bmatrix}
$$
where $y_1 \geq 0$.

### Part 1

Let $Q$ be a $2 \times 2$ rotation matrix, as above. Then

$$
\begin{align}
Q^T Q
&= \begin{bmatrix}
\cos\theta & -\sin\theta \\
\sin\theta & \cos\theta
\end{bmatrix}^T
\begin{bmatrix}
\cos\theta & -\sin\theta \\
\sin\theta & \cos\theta
\end{bmatrix} \\
&= \begin{bmatrix}
\cos\theta & \sin\theta \\
-\sin\theta & \cos\theta
\end{bmatrix}
\begin{bmatrix}
\cos\theta & -\sin\theta \\
\sin\theta & \cos\theta
\end{bmatrix} \\
&= \begin{bmatrix}
\cos^2\theta + \sin^2\theta & -\cos\theta\sin\theta + \sin\theta\cos\theta \\
-\sin\theta\cos\theta + \cos\theta\sin\theta & \sin^2\theta + \cos^2\theta
\end{bmatrix} \\
&= \begin{bmatrix}
1 & 0 \\
0 & 1
\end{bmatrix} \\
&= I.
\end{align}
$$

Therefore, $Q$ is an orthogonal matrix.


### Part 2

Let

$$
Q = 
\begin{bmatrix}
c & -s \\ s & c
\end{bmatrix},
$$

where $c^2 + s^2 = 1$. Then

$$
Q^T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} y_1 \\ 0 \end{bmatrix}
$$

implies that

$$
\begin{bmatrix}
c & -s \\ s & c
\end{bmatrix}^T
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} y_1 \\ 0 \end{bmatrix}.
$$

Thus, 

$$
\begin{bmatrix}
c x_1 + s x_2 \\ -s x_1 + c x_2
\end{bmatrix} = \begin{bmatrix} y_1 \\ 0 \end{bmatrix}.
$$

Also, since $Q^T$ is orthogonal,

$$
\left\| \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \right \|_2 = \left\| \begin{bmatrix} y_1 \\ 0 \end{bmatrix} \right \|_2,
$$

which implies that $y_1 = \sqrt{x_1^2 + x_2^2}$ since $y_1 \ge 0$.

If $x = 0$, then $y_1 = \sqrt{x_1^2 + x_2^2} = 0$, and we would let $c = 1$ and $s = 0$.

Now, we assume that $x \ne 0$. Thus, $y_1 > 0$.

Multiply the first equation by $s$ and the second equation by $c$. Then,

\begin{align}
c s x_1 + s^2 x_2 &= s y_1 \\
-c s x_1 + c^2 x_2 &= 0.
\end{align}

Summing these equations gives us

$$
(s^2 + c^2) x_2 = s y_1.
$$

Since we want $s^2 + c^2 = 1$, we have $x_2 = s y_1$, so

$$
s = \frac{x_2}{y_1}.
$$

In a similar way, we find that

$$
c = \frac{x_1}{y_1}.
$$

---

Another way to find $s$ and $c$ is to rewrite

$$
\begin{bmatrix}
c x_1 + s x_2 \\ -s x_1 + c x_2
\end{bmatrix} = \begin{bmatrix} y_1 \\ 0 \end{bmatrix}
$$

as

$$
\begin{bmatrix}
x_1 & x_2 \\ x_2 & -x_1
\end{bmatrix} \begin{bmatrix} c \\ s \end{bmatrix} = \begin{bmatrix} y_1 \\ 0 \end{bmatrix}.
$$

Then, just solve the above system for $c$ and $s$ by multiplying both sides by the inverse of the coefficient matrix.

Since

$$
\begin{bmatrix}
a & b \\ c & d
\end{bmatrix}^{-1} = 
\frac{1}{ad - bc}
\begin{bmatrix}
d & -b \\ -c & a
\end{bmatrix}, 
$$

we have that

$$
\begin{bmatrix}
x_1 & x_2 \\ x_2 & -x_1
\end{bmatrix}^{-1} = 
\frac{1}{-x_1^2 - x_2^2}
\begin{bmatrix}
-x_1 & -x_2 \\ -x_2 & x_1
\end{bmatrix} = 
\frac{1}{x_1^2 + x_2^2}
\begin{bmatrix}
x_1 & x_2 \\ x_2 & -x_1
\end{bmatrix}.
$$


Therefore,

$$
\begin{bmatrix}
c \\ s
\end{bmatrix} = 
\frac{1}{x_1^2 + x_2^2}
\begin{bmatrix}
x_1 & x_2 \\ x_2 & -x_1
\end{bmatrix}
\begin{bmatrix}
y_1 \\ 0
\end{bmatrix} =
\frac{1}{y_1^2}
\begin{bmatrix}
x_1 y_1 \\ x_2 y_1
\end{bmatrix} =
\begin{bmatrix}
x_1/y_1 \\ x_2/y_1
\end{bmatrix}.
$$

In [None]:
using LinearAlgebra

In [None]:
x = randn(2)

In [None]:
y1 = norm(x)

In [None]:
c, s = x[1]/y1, x[2]/y1

In [None]:
c, s = x/norm(x)

In [None]:
c^2 + s^2

In [None]:
Q = [c -s; s c]

In [None]:
Q'x

---

## $QR$-decomposition of a $2 \times 2$ matrix $A$

Suppose 

$$
A = \begin{bmatrix}
a_{11} & a_{12} \\
a_{21} & a_{22}
\end{bmatrix}.
$$

Let $Q$ be the rotation matrix that introduces a zero in the first column of $A$:

$$
Q^T \begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix} = \begin{bmatrix} r_{11} \\ 0 \end{bmatrix}.
$$

Let

$$
\begin{bmatrix} r_{12} \\ r_{22} \end{bmatrix} = Q^T \begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix}.
$$

Let

$$
R = \begin{bmatrix}
r_{11} & r_{12} \\
0 & r_{22}
\end{bmatrix}.
$$

Then $Q^T A = R$, and since $Q$ is orthogonal,

$$
A = QR.
$$

---

## Exercise

Compute the $QR$-decomposition of

$$
A = \begin{bmatrix}
1 & 2 \\
1 & 3
\end{bmatrix}
$$

and check your answer using the `qr` function in Julia.

In [None]:
A = [1 2; 1 3.0]

In [None]:
function formQ(x)
    c, s = x/norm(x)
    Q = [c -s; s c]
end

In [None]:
Q = formQ(A[:,1])

In [None]:
R = Q'A

In [None]:
A - Q*R

In [None]:
F = qr(A)

In [None]:
A - F.Q*F.R

---

## Givens rotations

A **Givens rotation** matrix is

$$
Q = 
\begin{bmatrix}
1 \\
&\ddots\\
&&1\\
&&&c&&&&-s\\
&&&&1\\
&&&&&\ddots\\
&&&&&&1\\
&&&s&&&&c\\
&&&&&&&&1\\
&&&&&&&&&\ddots\\
&&&&&&&&&&1\\
\end{bmatrix},
$$

where $c = \cos\theta$ and $s = \sin\theta$. This matrix rotates the $(x_i,x_j)$ plane by an angle of $\theta$.

These matrices can be used to introduce zeros in general $n \times n$ matrices.
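The following sketch builds such a matrix explicitly and uses it to zero one entry of a vector. The helper name `givens_matrix` and the test vector are assumptions made for this illustration, and the sketch assumes the pair $(x_i, x_j)$ is not both zero. In practice one would apply the rotation to two rows directly instead of forming the full matrix.

```julia
using LinearAlgebra

# Hypothetical helper: n×n Givens rotation acting on the (i, j) coordinate plane
function givens_matrix(n, i, j, c, s)
    G = Matrix{Float64}(I, n, n)
    G[i, i] = c;  G[i, j] = -s
    G[j, i] = s;  G[j, j] = c
    return G
end

x = [3.0, 1.0, 4.0, 2.0]
i, j = 1, 3
r = hypot(x[i], x[j])          # length of the pair (x_i, x_j)
c, s = x[i]/r, x[j]/r

G = givens_matrix(length(x), i, j, c, s)
y = G'x                        # y[i] = r, y[j] = 0; all other entries are unchanged
```

Here $r = 5$, so `y` should equal `[5, 1, 0, 2]` up to rounding.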

---

## Exercise

Use Givens rotations to compute the $QR$-decomposition of

$$
A = \begin{bmatrix}
1 & 2 & 0 \\
0 & 1 & 3 \\
1 & 3 & 0
\end{bmatrix}.
$$

In [None]:
A = [ 1 2 0; 0 1 3; 1 3 0.0 ]

In [None]:
x = A[[1,3],1]

In [None]:
c, s = x/norm(x)

In [None]:
Q1 = [
    c 0 -s
    0 1 0
    s 0 c
]

In [None]:
A1 = Q1'A

In [None]:
x = A1[[2,3],2]

In [None]:
c, s = x/norm(x)

In [None]:
Q2 = [
    1 0 0
    0 c -s
    0 s c
]

In [None]:
A2 = Q2'A1

In [None]:
R = A2

In [None]:
Q = Q1*Q2

In [None]:
A - Q*R

In [None]:
qr(A)

In [None]:
UpperTriangular(R)

---

## Solving $Ax = b$ using $QR$

If $Ax = b$ and $A = QR$, then

$$
Q(Rx) = b.
$$

If we let $c = Rx$, then we have $Qc = b$, so $c = Q^{-1}b = Q^Tb$ since $Q$ is orthogonal.

Thus, we have the following algorithm for solving $Ax = b$:

1. Let $c = Q^Tb$.
2. Solve $Rx = c$ using backward substitution.
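The two steps above can be sketched as follows. The function `backsub` is a hypothetical helper written for this illustration (Julia's `\` on an upper-triangular matrix does the same job), and the system below is an arbitrary nonsingular example.

```julia
using LinearAlgebra

# Hypothetical helper: solve Rx = c for upper-triangular R by backward substitution
function backsub(R, c)
    n = length(c)
    x = zeros(n)
    for i = n:-1:1
        # subtract the already-computed unknowns, then divide by the diagonal entry
        x[i] = (c[i] - dot(R[i, i+1:n], x[i+1:n]))/R[i, i]
    end
    return x
end

A = [2.0 1.0; 1.0 3.0]      # arbitrary nonsingular example
b = [3.0, 5.0]

F = qr(A)
c = F.Q'b                   # step 1: c = Qᵀb
x = backsub(F.R, c)         # step 2: solve Rx = c
```

For this system the exact solution is $x = (0.8, 1.4)$.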

---

## Exercise

Use the $QR$-decomposition of $A$ to solve $Ax = b$.

$$
A = \begin{bmatrix}
1 & 2 \\
1 & 3
\end{bmatrix},
\qquad
b = \begin{bmatrix} 1 \\ 2 \end{bmatrix}.
$$

In [None]:
A = [1 2; 1 3.0]
b = [1, 2.0]

Q, R = qr(A)

In [None]:
c = Q'b

In [None]:
x = R\c

In [None]:
A*x - b

---

## Reflection matrices

Another way to create zeros in a matrix is by the [Householder reflection transformation](https://en.wikipedia.org/wiki/Householder_transformation):

$$
Q = I - 2uu^T, \qquad \|u\|_2 = 1.
$$

Let $L$ be the set of vectors $v$ that are orthogonal to the unit vector $u$,

$$
L = \left\{ v \in \mathbb{R}^n : u^T v = 0 \right\}.
$$

Then $L$ is a **hyperplane** containing the origin, and $Q$ reflects vectors $x$ across $L$.


---

## Properties of $Q = I - 2uu^T$

1. $Qu = -u$
2. If $v \in L$, then $Qv = v$.
3. $Q = Q^T$
4. $Q^TQ = I$
5. $Q^{-1} = Q$

---

## Exercise

Prove the above properties.

### Part 1

Since $\|u\|_2 = 1$, we have

\begin{align}
Q u
&= (I - 2 u u^T) u \\
&= u - 2 u (u^T u) \\
&= u - 2 u \|u\|_2^2 \\
&= u - 2 u \\
&= -u.
\end{align}

### Part 2

Since $v \in L$, we have that $u^T v = 0$. Thus,

\begin{align}
Q v
&= (I - 2 u u^T) v \\
&= v - 2 u (u^T v) \\
&= v - 2 u (0) \\
&= v.
\end{align}

### Part 3

We have that

\begin{align}
Q^T 
&= (I - 2 u u^T)^T \\
&= I^T - 2 (u u^T)^T \\
&= I - 2 (u^T)^T u^T \\
&= I - 2 u u^T \\
&= Q.
\end{align}

Therefore, $Q$ is symmetric.

### Part 4

Since $Q^T = Q$, we have

\begin{align}
Q^T Q
&= Q Q \\
&= (I - 2 u u^T)(I - 2 u u^T) \\
&= I - 2 u u^T - 2 u u^T + 4 u (u^T u) u^T \\
&= I - 2 u u^T - 2 u u^T + 4 u u^T \\ 
&= I.
\end{align}

Therefore, $Q$ is orthogonal.

### Part 5

Since $QQ = I$, the matrix $Q$ is its own inverse, so $Q^{-1} = Q$.
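The five properties can also be checked numerically. This is a sketch on random data, so each comparison is made up to rounding error. The vector `v` is placed in the hyperplane $L$ by removing the component of a random vector along $u$.

```julia
using LinearAlgebra

n = 4
u = normalize(randn(n))          # random unit vector
Q = I - 2*(u*u')                 # Householder reflector

# A vector in the hyperplane L: remove the component of w along u
w = randn(n)
v = w - dot(u, w)*u

checks = (
    Q*u ≈ -u,                    # property 1
    Q*v ≈ v,                     # property 2
    Q ≈ Q',                      # property 3
    norm(Q'Q - I) < 1e-12,       # property 4
    inv(Q) ≈ Q,                  # property 5
)
```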

---

## Reflecting $x$ to $y$

For a general nonzero vector $u$ (not necessarily of unit length), the **Householder reflector** is

$$
Q = I - \gamma uu^T, \qquad \gamma = \frac{2}{\|u\|_2^2}.
$$

If $x, y \in \mathbb{R}^n$ satisfy $x \neq y$ and $\|x\|_2 = \|y\|_2$, then the reflector $Q$ using 

$$u = x - y$$ 

satisfies

$$
Qx = y.
$$
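To see why this choice of $u$ works, the following short derivation may help. It assumes $u = x - y \neq 0$. Since $\|x\|_2 = \|y\|_2$,

$$
\|u\|_2^2 = \|x\|_2^2 - 2y^Tx + \|y\|_2^2 = 2\left(\|x\|_2^2 - y^Tx\right) = 2u^Tx,
$$

so $\gamma\, u^Tx = 1$, and therefore

$$
Qx = x - \gamma \left(u^Tx\right) u = x - u = y.
$$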



---

## Exercise

Test that $Qx = y$ on random vectors $x$ and $y$.

In [None]:
n = 4
x = randn(n)
y = randn(n)
L = 10*rand()
x *= L/norm(x)
y *= L/norm(y)

norm(x) ≈ norm(y)

In [None]:
u = x - y
γ = 2/dot(u,u)

Q = I - γ*(u*u')

In [None]:
Qmap(v) = v - (γ*dot(u,v))*u

In [None]:
[Q*x y]

In [None]:
norm(Q*x - y)

In [None]:
[Qmap(x) y]

In [None]:
norm(Qmap(x) - y)

---

## Creating zeros using reflectors

We want the reflector $Q$ that reflects $x$ to $y$, where

$$
x = 
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix},
\qquad
y = 
\begin{bmatrix} -\tau \\ 0 \\ \vdots \\ 0 \end{bmatrix},
\qquad
\tau = \mathrm{sign}(x_1)\|x\|_2.
$$

We define $u$ as

$$
u = \frac{x - y}{\tau + x_1} = 
\begin{bmatrix} 1 \\ x_2/(\tau + x_1) \\ \vdots \\ x_n/(\tau + x_1) \end{bmatrix}.
$$

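Note that we have divided by $\tau + x_1$ to ensure that $u_1 = 1$. Here we assume $x \neq 0$ and use the convention $\mathrm{sign}(0) = 1$, so that $\tau \neq 0$ and $\tau + x_1 \neq 0$.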

Since $\tau$ and $x_1$ have the same sign, the calculation $\tau + x_1$ avoids catastrophic cancellation.
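To illustrate the cancellation issue, the sketch below compares $\tau + x_1$ with the difference $\tau - x_1$ that would appear if, hypothetically, the opposite sign had been chosen for $y$. The test vector is an assumption chosen so that $\|x\|_2$ is extremely close to $|x_1|$. The reference value uses the identity $\tau - x_1 = (x_2^2 + \cdots + x_n^2)/(\tau + x_1)$, which avoids subtracting nearly equal numbers.

```julia
using LinearAlgebra

x = [1.0, 1e-8]              # x₁ dominates, so ‖x‖₂ is extremely close to |x₁|
τ = sign(x[1])*norm(x)

safe  = τ + x[1]             # same signs: no cancellation
risky = τ - x[1]             # opposite signs: nearly all significant digits cancel

# Reference value of τ - x₁, computed without subtracting nearly equal numbers
reference = x[2]^2/(τ + x[1])

relerr_risky = abs(risky - reference)/reference
```

The relative error of `risky` is of order one, whereas `safe` is accurate to machine precision.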

Letting 

$$
Q = I - \gamma uu^T, 
\qquad
\gamma = \frac{2}{\|u\|_2^2},
$$

we have

$$
Qx = \begin{bmatrix} -\tau \\ 0 \\ \vdots \\ 0 \end{bmatrix}.
$$



---

## Exercise

Prove that 
$$\gamma = \frac{\tau + x_1}{\tau}.$$

### Proof.

First note that

$$
(\tau + x_1) u = 
\begin{bmatrix} \tau + x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}.
$$


Now, taking the norm squared of both sides, we have

$$
\|(\tau + x_1) u \|_2^2 = (\tau + x_1)^2 + x_2^2 + \cdots + x_n^2.
$$

Thus,

$$
|\tau + x_1|^2 \|u\|_2^2 = \tau^2 + 2 \tau x_1 + x_1^2 + \cdots + x_n^2.
$$

Then we have

$$
(\tau + x_1)^2 \|u\|_2^2 = \tau^2 + 2 \tau x_1 + \|x\|_2^2.
$$

Since $\tau = \mathrm{sign}(x_1) \|x\|_2$, we have that $\tau^2 = \|x\|_2^2$. Therefore,

$$
(\tau + x_1)^2 \|u\|_2^2 = \tau^2 + 2 \tau x_1 + \tau^2.
$$

Thus,

$$
(\tau + x_1)^2 \|u\|_2^2 = 2 \tau( \tau + x_1 ).
$$

Since $\tau + x_1 \ne 0$, we can divide both sides by $\tau + x_1$, and we get

$$
(\tau + x_1) \|u\|_2^2 = 2 \tau.
$$

Rearranging, we have that

$$
\frac{\tau + x_1}{\tau} = \frac{2}{\|u\|_2^2}.
$$

Since $\gamma = 2/\|u\|_2^2$, we have that

$$
\frac{\tau + x_1}{\tau} = \gamma.
$$

Q.E.D.

---

## Exercise

Test the above method for generating $Q$ on a random vector $x$.

In [None]:
n = 5
x = randn(n)

In [None]:
τ = sign(x[1])*norm(x)

In [None]:
u = [1; x[2:end]/(τ + x[1])]

In [None]:
γ = (τ + x[1])/τ

In [None]:
2/dot(u,u)

In [None]:
Q = I - (γ*u)*u'

In [None]:
Q*x

In [None]:
Qmap(v) = v - (γ*dot(u,v))*u

In [None]:
Qmap(x)

---

## `house`

We can now write a function to compute the $u$ and $\gamma$ of the Householder reflector $Q = I - \gamma uu^T$, together with $\tau$:

```julia
u, γ, τ = house(x)
```

In [None]:
function house(x)
    u = copy(x)
    
    τ = norm(x)
    if τ == 0.0
        γ = 0.0
    else
        if x[1] < 0
            τ = -τ    # τ = sign(x[1])*norm(x)
        end
        γ = τ + x[1]  # γ temporarily stores τ + x[1]
        u[1] = 1.0    # u normalized to u[1] = 1
        u[2:end] /= γ # divide u[2:end] by τ + x[1]
        γ /= τ        # γ = (τ + x[1])/τ
    end
    
    return u, γ, τ
end

In [None]:
n = 5
x = randn(n)

u, γ, τ = house(x)

Q = I - γ*(u*u')

In [None]:
issymmetric(Q)

In [None]:
Q*Q

In [None]:
[x Q*x]

In [None]:
Qmap(x)

---

## `housetimes`

The way we computed $Qx$ in the above numerical example was inefficient. Note that

$$
Qx = \left(I - \gamma uu^T\right)x = x - \left[\gamma \left(u^T x\right)\right] u.
$$

In [None]:
housetimes(x::Vector, u, γ) = x - (γ*dot(u, x))*u

In [None]:
u, γ, τ = house(x)

housetimes(x, u, γ)

---

## Exercise

Count the number of flops:
1. To form $Q$ and compute $Qx$.
2. To compute $x - \left[\gamma \left(u^T x\right)\right] u$.

**Solution:**

1. $3n^2 + 2n$ flops
2. $4n + 1$ flops

### Part 1

Recall that $Q = I - \gamma u u^T$.

1. Computing $y = (-\gamma) u$ requires $n$ multiplications.
2. Computing $Q = y u^T$ requires $n^2$ multiplications.
3. Computing $Q = I + Q$ requires $n$ additions (along the diagonal).
4. Computing $Qx$ requires $2n^2$ operations.

So, in total we have $3n^2 + 2n$ flops for forming the $n \times n$ matrix $Q$ and computing matrix-vector multiplication $Qx$.

### Part 2

To compute $x - \left[\gamma \left(u^T x\right)\right] u$, we do the following.

1. Computing $\delta = u^T x$ requires $n$ multiplications and $n$ additions.
2. Computing $\mu = \gamma \delta$ requires one multiplication.
3. Computing $y = \mu u$ requires $n$ multiplications.
4. Computing $z = x - y$ requires $n$ subtractions.

So, in total we have $4n + 1$ flops for computing $x - \left[\gamma \left(u^T x\right)\right] u$.
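To get a feel for the gap between the two counts, the sketch below simply evaluates both formulas for a few sizes. The function names are hypothetical and the sizes are arbitrary.

```julia
# Flop-count formulas from the solution above
flops_explicit(n) = 3n^2 + 2n   # form Q, then compute Qx
flops_implicit(n) = 4n + 1      # compute x - [γ(uᵀx)]u directly

for n in (10, 100, 1000)
    println("n = ", n, ":  explicit = ", flops_explicit(n),
            ",  implicit = ", flops_implicit(n))
end
```

The explicit approach grows quadratically in $n$, while the implicit one grows linearly.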

---

In the algorithm for computing the $QR$ decomposition of a matrix $A$, we will need to compute $QB$ where $B$ is a matrix.

$$
QB = B - (\gamma u) \left(u^TB\right)
$$

In [None]:
housetimes(B::Matrix, u, γ) = B - (γ*u)*(u'*B)

In [None]:
methods(housetimes)

In [None]:
n = 5
B = rand(n,n)

In [None]:
u, γ, τ = house(B[:,1])

housetimes(B, u, γ)

---

## Exercise

Use Householder reflectors to numerically compute the $QR$-decomposition of

$$
A = \begin{bmatrix}
1 & 2 & 0 \\
0 & 1 & 3 \\
1 & 3 & 0
\end{bmatrix}.
$$

Check your answer using the `qr` function in Julia.

In [None]:
A = [1 2 0; 0 1 3; 1 3 0.0]
R = copy(A)

In [None]:
u, γ, τ = house(R[:,1])

R[1,1] = -τ
R[2:3,1] .= 0
R[:,2:3] = housetimes(R[:,2:3], u, γ)
R

In [None]:
u, γ, τ = house(R[2:3,2])

R[2,2] = -τ
R[3,2] = 0
R[2:3,3] = housetimes(R[2:3,3], u, γ)
R

In [None]:
qr(A)

---

## The $QR$ Decomposition Algorithm

$$A_0 = A$$

$$
A_1 = Q_1A = 
\left[\begin{array}{c|c}
-\tau_1 & a_1^T \\ \hline
0 & \hat{A}_1
\end{array}\right],
\qquad
Q_1 = I_n - \gamma_1 u_1 u_1^T
$$

$$
A_2 = Q_2Q_1A = 
\left[\begin{array}{c|c}
-\tau_1 & a_1^T \\ \hline
0 &
\begin{array}{c|c}
-\tau_2 & a_2^T \\ \hline
0 & \hat{A}_2
\end{array}
\end{array}\right],
\qquad
Q_2 = 
\left[\begin{array}{c|c}
1 & \\\hline
& I_{n-1} - \gamma_2 u_2 u_2^T
\end{array}\right]
$$

$$
A_3 = Q_3Q_2Q_1A = 
\left[\begin{array}{c|c}
-\tau_1 & a_1^T \\ \hline
0 &
\begin{array}{c|c}
-\tau_2 & a_2^T \\ \hline
0 & 
\begin{array}{c|c}
-\tau_3 & a_3^T \\ \hline
0 & \hat{A}_3
\end{array}
\end{array}
\end{array}\right],
\qquad
Q_3 = 
\left[\begin{array}{c|c}
I_2 & \\\hline
& I_{n-2} - \gamma_3 u_3 u_3^T
\end{array}\right]
$$

$$\vdots$$

$$
A_{n-1} = Q_{n-1} \cdots Q_1A = 
\left[\begin{array}{c|c}
-\tau_1 & a_1^T \\ \hline
0 &
\begin{array}{c|c}
-\tau_2 & a_2^T \\ \hline
0 & 
\begin{array}{c|c}
-\tau_3 & a_3^T \\ \hline
0 & 
\begin{array}{c|c}
\ddots & \ddots \\ \hline
0 & 
\begin{array}{c|c}
-\tau_{n-1} & a_{n-1}^T \\ \hline
0& \hat{A}_{n-1}
\end{array}
\end{array}
\end{array}
\end{array}
\end{array}\right] = R
$$

Since each $Q_k$ is symmetric and orthogonal, $Q_k^{-1} = Q_k$. We then let $Q = Q_1Q_2 \cdots Q_{n-1}$ and obtain $A = QR$.
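The sequence above can be sketched directly in code. This is a deliberately unoptimized illustration that forms every $n \times n$ matrix $Q_k$ explicitly, which is not how an efficient implementation would work. The function name `householder_qr_explicit` is hypothetical, and $u$ is left unscaled (the first entry is not normalized to $1$), which defines the same reflector.

```julia
using LinearAlgebra

# Unoptimized sketch: forms each n×n reflector Q_k explicitly
function householder_qr_explicit(A)
    n = size(A, 1)
    R = float(copy(A))
    Q = Matrix{Float64}(I, n, n)
    for k = 1:n-1
        x = R[k:n, k]
        nx = norm(x)
        nx == 0 && continue              # nothing to eliminate in this column
        τ = x[1] < 0 ? -nx : nx          # τ = sign(x₁)‖x‖₂, with sign(0) taken as +1
        u = copy(x); u[1] += τ           # u = x - y with y = [-τ, 0, …, 0]
        γ = 2/dot(u, u)
        Qk = Matrix{Float64}(I, n, n)    # Q_k = blockdiag(I_{k-1}, I - γuuᵀ)
        Qk[k:n, k:n] -= γ*(u*u')
        R = Qk*R                         # A_k = Q_k A_{k-1}
        Q = Q*Qk                         # accumulate Q = Q_1 Q_2 ⋯ Q_k
    end
    return Q, R
end

A = [1 2 0; 0 1 3; 1 3 0.0]
Q, R = householder_qr_explicit(A)
```

The entries of `R` below the diagonal are zero only up to rounding error, since they are computed rather than set explicitly.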

---

## Storing $u_i$'s and $\gamma_i$'s

Each $u_i$ is normalized so that

$$
u_i = \begin{bmatrix} 1\\*\\\vdots\\* \end{bmatrix}.
$$

Thus we do not need to store the first entry since it is always $1$.

The rest of the entries of $u_i$ can be stored where the zeros are created.

To store the $\gamma_i$'s, we create a separate vector

$$
\gamma = \begin{bmatrix} \gamma_1\\\vdots\\\gamma_{n-1} \end{bmatrix}.
$$
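The storage scheme can be illustrated on a single column. In the sketch below, the vector `x` is an arbitrary example and `packed` is a hypothetical name for the column as it would be stored: $-\tau$ in the diagonal position and the trailing entries of $u$ where the zeros would be.

```julia
using LinearAlgebra

x = [3.0, 0.0, 4.0]                 # arbitrary example column
τ = sign(x[1])*norm(x)              # here x[1] > 0, so τ = ‖x‖₂ = 5
u = [1.0; x[2:end]/(τ + x[1])]      # normalized so that u[1] = 1
γ = (τ + x[1])/τ

# Pack: -τ goes in the diagonal position, u[2:end] goes where the zeros would be
packed = [-τ; u[2:end]]

# Unpack: the leading 1 of u is implicit
u_recovered = [1.0; packed[2:end]]
Qx = x - (γ*dot(u_recovered, x))*u_recovered
```

Here `packed` is `[-5, 0, 0.5]`, and applying the recovered reflector to `x` gives `[-5, 0, 0]` up to rounding.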

---

## `myqr`

In [None]:
struct myQRfactorization
    V::Matrix{Float64}
    γ::Vector{Float64}
end

function myqr(A::Matrix{Float64})
    m, n = size(A)
    
    m == n || error("This QR decomposition algorithm requires a square input matrix.")
        
    V = copy(A)
    γ = zeros(n-1)
    for k = 1:n-1
        u, γ[k], τ = house(V[k:n,k])  # compute the Householder reflector I - γuu'
        V[k,k] = -τ                   # diagonal entries become -τ
        V[k+1:n,k] = u[2:end]         # store u's in the strictly lower-triangular part of V
        V[k:n,k+1:n] -= (γ[k]*u)*(u'*V[k:n,k+1:n]) # housetimes
    end
    
    myQRfactorization(V, γ)
end

In [None]:
n = 5
A = rand(n, n)

myF = myqr(A)

In [None]:
F = qr(A)

In [None]:
UpperTriangular(myF.V)

---

## Flop count of $QR$ Decomposition Algorithm

In each iteration, we need to compute `housetimes`:

1. $\left(I_n - \gamma_1 u_1 u_1^T\right) A_{0}[1:n,\ 2:n]$

2. $\left(I_{n-1} - \gamma_2 u_2 u_2^T\right) A_{1}[2:n,\ 3:n]$

3. $\left(I_{n-2} - \gamma_3 u_3 u_3^T\right) A_{2}[3:n,\ 4:n]$

$\qquad\vdots$

In iteration $k$, we compute

$$
\left(I_{n-k+1} - \gamma_k u_k u_k^T\right) A_{k-1}[k:n,\ k+1:n].
$$

This operation requires approximately $4(n - k + 1)^2$ flops if done efficiently.

We also need to compute $\gamma_k$ and $u_k$ from $A_{k-1}[k:n, k]$, but this is only $O(n)$.

Therefore, the $QR$ Decomposition Algorithm requires

$$
\sum_{k=1}^{n-1} 4(n-k+1)^2 \approx \int_1^{n-1} 4(n-x+1)^2 \, dx \approx \frac{4}{3}n^3.
$$

This does not include forming $Q$ though.
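As a sanity check on the approximation, the sketch below evaluates the sum exactly for a few sizes and compares it with $\frac43 n^3$. The function name `qr_flops` is hypothetical, and the count includes only the leading-order `housetimes` cost from the sum above.

```julia
# Leading-order flop count from the sum above, compared with (4/3)n³
qr_flops(n) = sum(4*(n - k + 1)^2 for k in 1:n-1)

for n in (10, 100, 1000)
    ratio = qr_flops(n)/(4n^3/3)
    println("n = ", n, ":  sum = ", qr_flops(n), ",  ratio to 4n³/3 = ", ratio)
end
```

The ratio approaches $1$ as $n$ grows, since the lower-order terms become negligible.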

---

## Forming $Q$

We can use the $u_i$'s and the $\gamma_i$'s to compute

$$QB = Q_1Q_2\cdots Q_{n-1} B$$

$$Q^T B = Q_{n-1}\cdots Q_2 Q_1 B$$

efficiently without forming $Q$.


We can form $Q$ by computing $QI_n$:

$$
\begin{align}
Q = QI_n &= Q_1 Q_2\cdots Q_{n-1} I_n\\
&= 
\left[\begin{array}{c}
I_{n} - \gamma_1 u_1 u_1^T
\end{array}\right]
\left[\begin{array}{c|c}
I_1 & \\\hline
& I_{n-1} - \gamma_2 u_2 u_2^T
\end{array}\right]
\cdots
\left[\begin{array}{c|c}
I_{n-2} & \\\hline
& I_{2} - \gamma_{n-1} u_{n-1} u_{n-1}^T
\end{array}\right] I_n
\end{align}
$$

Done efficiently (from right to left), this calculation requires an additional $\frac43n^3$ flops.

---

## `Qtimes` and `formQ`

In [None]:
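function Qtimes!(F::myQRfactorization, B::Matrix; T=false)
    n = size(F.V, 1)
    # Q'B applies Q_1 first (k = 1, 2, …); QB applies Q_{n-1} first (k = n-1, n-2, …)
    ks = T ? (1:n-1) : (n-1:-1:1)
    for k = ks
        γk = F.γ[k]
        uk = [1.0; F.V[k+1:n,k]]
        # Q_k acts on rows k:n of every column of B.
        # (When B = I and T = false, columns 1:k-1 of these rows are zero,
        # so restricting to B[k:n,k:n] gives the same result with fewer flops.)
        B[k:n,:] -= (γk*uk)*(uk'*B[k:n,:])
    end
    B
end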
Qtimes(F::myQRfactorization, B::Matrix) = Qtimes!(F, copy(B))

QTtimes!(F::myQRfactorization, B::Matrix) = Qtimes!(F, B, T=true)
QTtimes(F::myQRfactorization, B::Matrix) = QTtimes!(F, copy(B))

formQ(F::myQRfactorization) = Qtimes(F, Matrix{Float64}(I, size(F.V)))

In [None]:
methods(formQ)

In [None]:
n = 5
A = rand(n, n)
F = myqr(A)

Q = formQ(F)

In [None]:
Q'*A

In [None]:
QTtimes(F, A)

---

## Flop count summary for matrix factorizations

Let $A \in \mathbb{R}^{n \times n}$.

`cholesky(A)`: $\frac13n^3$ flops

`lu(A)`:  $\frac23n^3$ flops

`F = qr(A)` does not form $Q$: $\frac43n^3$ flops

`Q = F.Q*Matrix(I,n,n)` forms $Q$: $\frac83n^3$ flops in total (the $\frac43n^3$ for the factorization plus an additional $\frac43n^3$ to form $Q$)


---

## Flop count to solve $Ax = b$ by $QR$

**Algorithm:**

1. Compute the $QR$ Decomposition of $A$, but do not form $Q$.
2. $c = Q^Tb$
3. Use backward substitution to solve $Rx = c$

The cost of the $QR$ Decomposition is $\frac43n^3$.

The cost of computing $c = Q^Tb$ efficiently (using the $u_i$'s and the $\gamma_i$'s) is about $2n^2$ flops.

Backward substitution is $n^2$ flops.

Therefore, in total we have 

$$\frac43n^3 + O(n^2)$$ 

flops to solve $Ax = b$ by $QR$.

---

## `x = F\b`

In [None]:
function Qtimes!(F::myQRfactorization, b::Vector; T=false)
    n = size(F.V, 1)
    cols = T ? (1:n-1) : (n-1:-1:1)
    for k = cols
        γk = F.γ[k]
        uk = [1.0; F.V[k+1:n,k]]
        b[k:n] -= (γk*uk)*dot(uk, b[k:n])
    end
    b
end
Qtimes(F::myQRfactorization, b::Vector) = Qtimes!(F, copy(b))

QTtimes!(F::myQRfactorization, b::Vector) = Qtimes!(F, b, T=true)
QTtimes(F::myQRfactorization, b::Vector) = QTtimes!(F, copy(b))

In [None]:
methods(Qtimes)

In [None]:
n = 5
A = rand(n, n)
b = rand(n)

F = myqr(A)

c = QTtimes(F, b)

In [None]:
R = UpperTriangular(F.V)

In [None]:
x = R\c

In [None]:
b - A*x

In [None]:
b - A*(A\b)

---

In [None]:
import Base.\

\(F::myQRfactorization, b::Vector) = UpperTriangular(F.V)\QTtimes(F, b)

In [None]:
methods(\)

In [None]:
x = F\b

In [None]:
b - A*x

---

> ## $QR$ Decomposition Theorem
>
> Let $A \in \mathbb{R}^{n \times n}$. Then the following hold.
>
> 1. There exist $Q, R \in \mathbb{R}^{n \times n}$ such that $Q$ is orthogonal, $R$ is upper-triangular, and $A = QR$.
>
> 2. If $A$ is **nonsingular**, then $\exists$ **unique** $Q, R \in \mathbb{R}^{n \times n}$ such that $Q$ is orthogonal, $R$ is upper-triangular with **positive diagonal entries**, and $A = QR$.

### Proof.

1. Hint: Use Householder reflectors and induction on $n$.
2. Hint: Let $D$ be diagonal with $d_{ii} = \mathrm{sign}(r_{ii})$.

### Part 1

If $n = 1$ then $A$ is a $1 \times 1$ matrix. Thus, $A = [a_{11}]$. Let $Q = [1]$ and $R = [a_{11}]$. Then $Q$ is orthogonal, $R$ is upper-triangular, and $A = QR$.

Now suppose that all $k \times k$ matrices have a $QR$ decomposition, for some positive integer $k$.

Let $n = k+1$ and $A \in \mathbb{R}^{n \times n}$. We partition $A$ as

$$
A = 
\begin{bmatrix}
a_{11} & b^T \\ c & D
\end{bmatrix},
$$

where $D$ is a $k \times k$ matrix.

Let $Q_1$ be a Householder reflector for the first column of $A$. Then

$$
Q_1 A = 
\begin{bmatrix}
\hat{a}_{11} & \hat{b}^T \\ 0 & \hat{D}
\end{bmatrix}.
$$

Note that $\hat{D}$ is a $k \times k$ matrix, so, by our induction hypothesis, $\hat{D}$ has a $QR$ decomposition: $\hat{D} = \hat{Q} \hat{R}$. Thus,

$$
Q_1 A = 
\begin{bmatrix}
\hat{a}_{11} & \hat{b}^T \\ 0 & \hat{Q} \hat{R}
\end{bmatrix}.
$$

Let

$$
Q_2 = 
\begin{bmatrix}
1 & 0 \\ 0 & \hat{Q}^T
\end{bmatrix},
$$

and note that $Q_2$ is orthogonal. Then

$$
Q_2 Q_1 A = 
\begin{bmatrix}
1 & 0 \\ 0 & \hat{Q}^T
\end{bmatrix}
\begin{bmatrix}
\hat{a}_{11} & \hat{b}^T \\ 0 & \hat{Q} \hat{R}
\end{bmatrix} =
\begin{bmatrix}
\hat{a}_{11} & \hat{b}^T \\ 0 & \hat{Q}^T \hat{Q} \hat{R}
\end{bmatrix} =
\begin{bmatrix}
\hat{a}_{11} & \hat{b}^T \\ 0 & \hat{R}
\end{bmatrix}.
$$

Let $R = Q_2 Q_1 A$. Note that $R$ is upper-triangular. Let $Q = Q_1^T Q_2^T$. Note that $Q$ is orthogonal since it is the product of orthogonal matrices. Finally, we have that $A = Q_1^T Q_2^T R = Q R$.

### Part 2

First, we let $A$ be an $n \times n$ nonsingular matrix. Then, by part 1, the matrix $A$ has a $QR$ decomposition:  $A = QR$. Note that the diagonal entries of $R$ are nonzero since, otherwise, $\det(R) = 0$ and that would imply that $\det(A) = \det(Q) \det(R) = 0$, contradicting our assumption that $A$ is nonsingular.

Let $D$ be the diagonal matrix with diagonal entries $d_{ii} = \mathrm{sign}(r_{ii})$. Thus, $D$ has diagonal entries that are $\pm 1$, so $D^2 = I$. That implies that

$$
A = Q R = Q D^2 R = (Q D) (D R).
$$

Let $\hat{Q} = Q D$ and $\hat{R} = D R$. Then $\hat{Q}$ is orthogonal since $Q$ and $D$ are orthogonal. Also, $\hat{R}$ is upper-triangular with positive diagonal entries, and $A = \hat{Q} \hat{R}$.

Now suppose that $A = Q_1 R_1 = Q_2 R_2$, where $R_1$ and $R_2$ have positive diagonal entries. Then,

$$
A^T A = R_1^T Q_1^T Q_1 R_1 = R_1^T R_1
$$

and $A^T A = R_2^T R_2$ are both Cholesky decompositions of the symmetric positive definite matrix $A^T A$. But the Cholesky decomposition is unique, so $R_1 = R_2$. Then,

$$
Q_1 = A R_1^{-1} = A R_2^{-1} = Q_2.
$$
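The construction in Part 2 can be sketched numerically. Starting from the factors returned by Julia's `qr` (whose $R$ may have negative diagonal entries), the sign matrix $D$ produces factors with a positive diagonal. The comparison with the Cholesky factor of $A^TA$ mirrors the uniqueness argument. The names `Qhat`, `Rhat`, and `U` are chosen for this illustration, and the matrix is an arbitrary nonsingular example.

```julia
using LinearAlgebra

A = [1 2 0; 0 1 3; 1 3 0.0]      # nonsingular example (det(A) = -3)
F = qr(A)
Q, R = Matrix(F.Q), F.R

D = Diagonal(sign.(diag(R)))     # dᵢᵢ = sign(rᵢᵢ) = ±1, so D² = I
Qhat = Q*D
Rhat = D*R

# Uniqueness: Rhat should agree with the Cholesky factor of AᵀA
U = cholesky(Symmetric(A'A)).U
chol_diff = norm(U - Rhat)
```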

---

## Stability

Multiplication by rotators or reflectors is stable:

$$
\mathrm{fl}(QA) = Q(A + E)
$$

where $\frac{\|E\|_2}{\|A\|_2}$ is tiny (a modest multiple of the unit roundoff).

Also,

\begin{align}
\mathrm{fl}(Q_2 Q_1 A)
&= Q_2( Q_1(A + E_1) + E_2 ) \\
&= Q_2( Q_1(A + E_1) + Q_1 Q_1^T E_2 ) \\
&= Q_2 Q_1( A + E_1 + Q_1^T E_2 ) \\
&= Q_2 Q_1( A + E )
\end{align}

where $E = E_1 + Q_1^T E_2$.

Thus,

$$
\mathrm{fl}(Q_2Q_1A) = Q_2Q_1(A + E),
$$

where $\|E\|_2 = \left\|E_1 + Q_1^T E_2\right\|_2 \leq \|E_1\|_2 + \left\|Q_1^TE_2\right\|_2 = \|E_1\|_2 + \left\|E_2\right\|_2$. Since $\|E_1\|_2$ and $\|E_2\|_2$ are both tiny relative to $\|A\|_2$, so is $\|E\|_2$. Therefore, $\frac{\|E\|_2}{\|A\|_2}$ is tiny.
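A rough numerical illustration of this stability, using Julia's built-in `qr` (which is assumed here to be based on Householder reflectors through LAPACK). The sketch measures the relative backward error of the computed factorization and the departure of the computed $Q$ from orthogonality. It illustrates the claim on one random matrix and does not prove it.

```julia
using LinearAlgebra

n = 50
A = randn(n, n)

F = qr(A)
Q = Matrix(F.Q)

# Relative backward error of the computed factorization, in the 2-norm
backward_error = opnorm(A - Q*F.R)/opnorm(A)

# Departure of the computed Q from orthogonality
orth_error = opnorm(Q'Q - I)

backward_error, orth_error, eps()
```

Both errors are expected to be a modest multiple of `eps()`.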

---