In [1]:
using Pkg
Pkg.activate("../GenLinAlgProblems/")

using Statistics, LinearAlgebra, Random, Latexify
using GenLinAlgProblems

[32m[1m  Activating[22m[39m project at `C:\Users\jeff\NOTEBOOKS\elementary-linear-algebra\GenLinAlgProblems`


In [2]:
approx(A;d=3)=(x->round(x,digits=d)).(A);

<div style="float:center;width:100%;text-align: center;"><strong style="height:60px;color:darkred;font-size:40px;">Solving the Normal Equations</strong></div>

# 0. Generate Some Data

In [3]:
M=40; N=3

Random.seed!(24526)

A = 10*(rand(M,N ) .- 0.5)
b =  4*(rand(M)    .- 0.5)
data_dict = Dict( "b" => b )
for i ∈ 1:N
    data_dict["a_$i"] = A[:,i]
end;

# 1. Gaussian Elimination applied to $\mathbf{\left( A^t A \mid A^t b \right)}$

Given $A x = b$, we can decompose $b$ into two orthogonal vectors $b = b_{//} + b_\perp$,
where $b_{//} \in \mathscr{C}(A)$ and $b_\perp \in \mathscr{N}(A^t)$.

$\qquad\begin{align}
& A^t A x &=&\ A^t b \qquad\qquad & \text{normal equations} \\
& b_{//}  &=&\ A x                & \text{parallel component of } b \\
& b_\perp &=&\ b - b_{//}         & \text{orthogonal component of } b \\
\end{align}$
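
The normal equations follow from the decomposition itself. As a short sketch in the notation above (no new assumptions beyond $b_{//} = A x$):

$\qquad\begin{align}
& A^t b_\perp   &=&\ 0     \qquad\qquad & \text{since } b_\perp \in \mathscr{N}(A^t) \\
& A^t (b - A x) &=&\ 0                  & \text{substituting } b_\perp = b - b_{//} = b - A x \\
& A^t A x       &=&\ A^t b              & \text{rearranging} \\
\end{align}$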

In [4]:
latexify( [ latex("A^t A = "), approx(A'A), latex("A^t b ="), approx(A'b)']')

L"\begin{equation}
\left[
\begin{array}{cccc}
A^t A =  & \left[
\begin{array}{ccc}
278.471 & -31.409 & -46.007 \\
-31.409 & 342.112 & -18.326 \\
-46.007 & -18.326 & 450.713 \\
\end{array}
\right] & A^t b = & \left[
\begin{array}{c}
-12.753 \\
24.782 \\
9.983 \\
\end{array}
\right] \\
\end{array}
\right]
\end{equation}
"

In [5]:
AtA = A'A
Atb = A'b
x   = AtA \ Atb

b_parallel = A*x
b_perp     = b - b_parallel

println( "Solve the normal equation for x ≈ $(round.(x,digits=3))")
println("Check orthogonality: b_parallel^t b_perp = ", round(dot(b_parallel, b_perp), digits=10))

Solve the normal equation for x ≈ [-0.034, 0.07, 0.022]
Check orthogonality: b_parallel^t b_perp = 0.0


#### Orthogonal Projection onto $\mathbf{\mathscr{C}(A)}$

Assuming we have thrown out all free variable columns in $A$ (so that $A$ has full column rank and $A^t A$ is invertible),<br> we can solve for $b_{//} = A \left( A^t A \right)^{-1} A^t b$.

The **orthogonal projection matrix** onto $\mathscr{C}(A)$ is given by $P_{parallel} = A \left( A^t A \right)^{-1} A^t$.

**Remark:** Do not compute the inverse to compute $P_{parallel}$:<br>
$\quad$ Since $\left( A^t A \right)^{-1} A^t = X \Leftrightarrow (A^t A) X = A^t$
* First, solve $(A^t A) X = A^t$ for $X$
* Then $P_{parallel} = A X$
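
The two steps of the remark can be tried on a small example. The following is a minimal sketch, not part of the notebook's data: the matrix `A_demo`, its size, and the seed are hypothetical, and the explicit-inverse version is included only for comparison. It also checks a few properties an orthogonal projector should have.

```julia
using LinearAlgebra, Random

Random.seed!(1)                        # hypothetical seed, for reproducibility only
A_demo = randn(8, 3)                   # small tall matrix (assumed to have full column rank)

# Step 1: solve (AᵗA) X = Aᵗ for X instead of forming inv(AᵗA)
X_demo = (A_demo'A_demo) \ Matrix(A_demo')
# Step 2: the projection matrix is A X
P_demo = A_demo * X_demo

# Explicit-inverse version, for comparison only
P_inv = A_demo * inv(A_demo'A_demo) * A_demo'

@show P_demo ≈ P_inv                   # both routes agree on this well-conditioned example
@show P_demo ≈ P_demo'                 # an orthogonal projector is symmetric
@show P_demo * A_demo ≈ A_demo         # vectors already in the column space are unchanged
@show tr(P_demo) ≈ rank(A_demo);       # the trace of a projector equals its rank
```

On well-conditioned data the two versions agree; the solve-based form avoids the extra work and rounding error of forming the inverse.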

In [6]:
X = AtA \ A'
P_parallel = A*X
# P_parallel = A*inv(A'*A)*A'   # same matrix, but forms the inverse explicitly (avoid)
println( "P_parallel[1:5,1:5] ≈ ")
approx( P_parallel[1:5,1:5], d=4)

P_parallel[1:5,1:5] ≈ 


5×5 Matrix{Float64}:
  0.1007   0.0333   0.0325   0.0021  -0.0168
  0.0333   0.0689  -0.0617  -0.0566   0.0314
  0.0325  -0.0617   0.105    0.0736  -0.0522
  0.0021  -0.0566   0.0736   0.0571  -0.0371
 -0.0168   0.0314  -0.0522  -0.0371   0.0265

In [7]:
# check P_parallel ^2 ≈ P_parallel
@show ( P_parallel^2 ≈ P_parallel);

P_parallel ^ 2 ≈ P_parallel = true


> **Question:**
> * What happens when $A$ has more columns than rows?<br> (Or more generally, when $A$ does not have full column rank?)
> * What is the size of $A^t A$ compared to the size of $A$?
> * What is the condition number of $A^t A$ compared to the condition number of $A$?
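
One way to explore these questions numerically is sketched below. The matrices `A_tall` and `A_def` are hypothetical examples rather than the notebook's data, and the threshold `1e10` is an arbitrary assumption used only to flag "numerically singular".

```julia
using LinearAlgebra, Random

Random.seed!(2)                         # hypothetical seed, for reproducibility only
A_tall = randn(40, 3)                   # tall example matrix (assumed full column rank)
G_tall = A_tall'A_tall                  # Gram matrix AᵗA

@show size(A_tall) size(G_tall)         # compare the sizes of A and AᵗA
@show cond(A_tall) cond(G_tall)         # compare the 2-norm condition numbers
@show cond(G_tall) ≈ cond(A_tall)^2

# A matrix without full column rank: the third column repeats the first
A_def = hcat(A_tall[:, 1:2], A_tall[:, 1])
G_def = A_def'A_def

@show cond(A_def) > 1e10                # A_def is numerically rank deficient
@show cond(G_def) > 1e10;               # and so is its Gram matrix
```

When `cond(G_def)` is this large, solving `G_def \ (A_def'b)` is unreliable, which is one motivation for the QR and SVD approaches in the following sections.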

# 2. Using the $\mathbf{QR}$ Decomposition

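With the reduced factorization $A = Q R$, where $Q^t Q = I$ and $R$ is upper triangular (and invertible when $A$ has full column rank), the normal equations become $R x = Q^t b$.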

The **projection matrix** is $P_{parallel} = Q Q^t$.
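
Both statements can be checked by substitution. The sketch below assumes the reduced factorization $A = QR$ with $Q^t Q = I$ and an invertible $R$, that is, $A$ has full column rank:

$\qquad\begin{align}
& A^t A x       &=&\ A^t b     \qquad\qquad & \text{normal equations} \\
& R^t Q^t Q R x &=&\ R^t Q^t b              & \text{substituting } A = Q R \\
& R^t R x       &=&\ R^t Q^t b              & \text{since } Q^t Q = I \\
& R x           &=&\ Q^t b                  & \text{multiplying by } \left( R^t \right)^{-1} \\
\end{align}$

Similarly, $P_{parallel} = A \left( A^t A \right)^{-1} A^t = Q R \left( R^t R \right)^{-1} R^t Q^t = Q R R^{-1} \left( R^t \right)^{-1} R^t Q^t = Q Q^t$.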

In [8]:
Q,R = qr(A);
Q = Matrix(Q)   # thin Q (M×N) with orthonormal columns
println("R ≈")
latexify(approx(R, d=4))

R ≈


L"\begin{equation}
\left[
\begin{array}{ccc}
16.6874 & -1.8822 & -2.757 \\
0.0 & 18.4003 & -1.278 \\
0.0 & 0.0 & 21.0114 \\
\end{array}
\right]
\end{equation}
"

In [9]:
#x_qr = inv(R) * Q' * b   # thus, Q'b = R x_qr
x_qr = R \ Q'b
println( "Solve the normal equation for x ≈ $(round.(x,digits=3))")
println( "Solve the R x = Qᵗ b equation x ≈ $(round.(x_qr,digits=3))")

Solve the normal equation for x ≈ [-0.034, 0.07, 0.022]
Solve the R x = Qᵗ b equation x ≈ [-0.034, 0.07, 0.022]


In [10]:
P_parallel_qr = Q*Q'
@show (P_parallel ≈ P_parallel_qr);

P_parallel ≈ P_parallel_qr = true


# 3. Using the SVD

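The solution of the normal equation is $x = A^\dagger b$.<br>
The orthogonal projection matrix is $A A^\dagger = U_r U_r^t$, where $U_r$ holds the left singular vectors belonging to the $r = \operatorname{rank}(A)$ nonzero singular values. For the thin SVD of a full column rank matrix, $U = U_r$, so $A A^\dagger = U U^t$; the full square $U$ would give $U U^t = I$ instead.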
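
To see where these formulas come from, the pseudoinverse can be assembled from the thin SVD $A = U_r \Sigma_r V_r^t$ as $A^\dagger = V_r \Sigma_r^{-1} U_r^t$. The following is a minimal sketch: the matrix `A_demo` is a hypothetical example, and the relative threshold `1e-10` used to decide which singular values count as nonzero is an assumption, not the tolerance `pinv` uses internally.

```julia
using LinearAlgebra, Random

Random.seed!(3)                          # hypothetical seed, for reproducibility only
A_demo = randn(8, 3)                     # small tall matrix (assumed full column rank)

F = svd(A_demo)                          # thin SVD: F.U is 8×3, F.S has 3 entries, F.V is 3×3
r = count(>(1e-10 * F.S[1]), F.S)        # numerical rank (threshold is an assumption)
U_r, S_r, V_r = F.U[:, 1:r], F.S[1:r], F.V[:, 1:r]

# A† = V_r Σ_r⁻¹ U_rᵗ
Adagger_demo = V_r * Diagonal(1 ./ S_r) * U_r'

@show Adagger_demo ≈ pinv(A_demo)        # agrees with the built-in pseudoinverse
@show A_demo * Adagger_demo ≈ U_r * U_r';   # A A† is the projector U_r U_rᵗ
```

The factors $V_r$ and $\Sigma_r$ cancel in $A A^\dagger = U_r \Sigma_r V_r^t V_r \Sigma_r^{-1} U_r^t$, which leaves $U_r U_r^t$.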

In [11]:
Adagger = pinv(A)
println( "the pseudoinverse[:,1:5] is given by")
approx(Adagger[:,1:5], d=3)

the pseudoinverse[:,1:5] is given by


3×5 Matrix{Float64}:
 -0.0     0.006  -0.012  -0.008   0.005
  0.013  -0.003   0.012   0.007  -0.006
  0.01    0.011  -0.006  -0.008   0.003

In [12]:
x_svd = Adagger*b
println( "Solve the normal equation for x ≈ $(round.(x,digits=3))")
println( "Solve x = A† b using the SVD  x ≈ $(round.(x_svd,digits=3))")

Solve the normal equation for x ≈ [-0.034, 0.07, 0.022]
Solve x = A† b using the SVD  x ≈ [-0.034, 0.07, 0.022]


In [13]:
P_parallel_svd = A * Adagger
@show P_parallel ≈ P_parallel_svd;

P_parallel ≈ P_parallel_svd = true


In [14]:
U = svd(A).U   # thin U (M×N); equals U_r here because A has full column rank
@show P_parallel ≈ U*U';

P_parallel ≈ U * U' = true
