In [1]:
using LinearAlgebra, RowEchelon
using PyCall
sympy = pyimport("sympy")
sympy.init_printing(use_unicode=true)

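import LinearAlgebra.⋅
# dot product for sympy objects only; other argument types keep the LinearAlgebra definition
⋅(u::PyObject, v::PyObject) = u.dot(v)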

macro display_vec(expression)
   quote
       value = $expression
       print($(Meta.quot(expression)), " = "); display(value.T)
       value
   end
end
macro display_val(expression)
   quote
       value = $expression
       s = repr(value)
       print($(Meta.quot(expression)), " = ", chop(s, head=9, tail=0 ))
       value
   end
end
macro display(expression)
   quote
       value = $expression
       print($(Meta.quot(expression)), " = "); display(value)
       value
   end
end
macro displayln(expression)
   quote
       value = $expression
       print($(Meta.quot(expression)), " = "); display(value); println()
       value
   end
end;

<div style="float:center;width:100%;text-align:center;"><strong style="height:100px;color:darkred;font-size:40px;">Computing Projections and Projection Matrices</strong>
</div>

# 1. Dual Basis

Let $A$ be a matrix with columns $a_1, a_2, \dots, a_n$. The equation
$b = A x \Leftrightarrow b = x_1 a_1 + x_2 a_2 + \dots + x_n a_n$
expresses a vector $b$ as a linear combination of the columns of $A$.
The vector $x$ is the coordinate vector representing $b$ with respect to the $a_i$.

We want this coordinate vector to be unique, and will therefore consider invertible matrices $A$.

The customary notation for what we want to do is to use the names $V$ for $A$, and $U$ for the inverse of $A$.

**Example:**

In [2]:
V = sympy.Matrix( [ 1 -1 -3;
                    2 -1 -7;
                   -1  2  3] )
U = V.inv()
println( "Matrix V and its inverse U\n" )

@display V
@display U
println("\ncheck:")
@display V*U;

Matrix V and its inverse U

V = 

PyObject Matrix([
[ 1, -1, -3],
[ 2, -1, -7],
[-1,  2,  3]])

U = 

PyObject Matrix([
[11, -3, 4],
[ 1,  0, 1],
[ 3, -1, 1]])


check:
V * U = 

PyObject Matrix([
[1, 0, 0],
[0, 1, 0],
[0, 0, 1]])

In [3]:
b = sympy.Matrix([1,3,2])
x = U * b   # i.e., x = V.inverse() b
@display_vec b
println(" is represented by the coordinate vector")
@display_vec x
println(" with respect to the columns of V" )

b = 

PyObject Matrix([[1, 3, 2]])

 is represented by the coordinate vector
x = 

PyObject Matrix([[10, 3, 2]])

 with respect to the columns of V


In [4]:
println( "Let\'s check:  use the coordinate vector x to recover the original vector b" )
v₁,v₂,v₃ = V.col(0), V.col(1), V.col(2)

@display_vec (x[1] * v₁ + x[2] * v₂ + x[3] * v₃) ;

Let's check:  use the coordinate vector x to recover the original vector b
x[1] * v₁ + x[2] * v₂ + x[3] * v₃ = 

PyObject Matrix([[1, 3, 2]])

---
The columns of an invertible matrix $V$ of size $n \times n$ form a basis for $\mathbb{R}^n$.
So do the rows of the inverse matrix $U = V^{-1}$.<br>
This second basis (the **dual basis**) is rather special:<br>
since $U V = I$, the dot product of the $i^{th}$ row of $U$ with the $j^{th}$ column of $V$ is the entry $I_{ij}$ of the identity matrix.

> Denote the rows of $U$ by $u_i, i=1,2, \dots n$.
We have
$$
u_i \cdot v_j = \left\{ \begin{array}{ll} 1 \quad & \text{ if } i = j, \\ 0 \quad & \text{ otherwise.}\\ \end{array} \right.
$$

The $i^{th}$ vector $u_i$ is orthogonal to the vectors $v_1, v_2, \dots v_{i-1}, v_{i+1}, \dots v_n$.

**Remark** A source of confusion is the convention that vectors are viewed as **column vectors** when converted to a matrix. Viewed as vectors, there is no column or row convention: when printed, vectors usually appear on a single line (a row!):
* $v_i$ denotes the column vector with entries from the $i^{th}$ column of $V$
* $u_i$ denotes the column vector with entries from the $i^{th}$ row of $U$: thus, the row vector
  $u^t_i$ is the $1 \times n$ matrix equal to the $i^{th}$ row of $U$

In [5]:
foo = v₃ ⋅ v₃

PyObject 67

In [6]:
println( "Let\'s check by looking at the vector u_3  (the third row of U): ")
u₁,u₂,u₃ = U.row(0),U.row(1),U.row(2) # sympy's row(i) is 0-based and returns a 1×n matrix
@display u₃

@display_val v₁ ⋅ u₃; println()
@display_val v₂ ⋅ u₃; println()
@display_val v₃ ⋅ u₃;

Let's check by looking at the vector u_3  (the third row of U): 
u₃ = 

PyObject Matrix([[3, -1, 1]])

v₁ ⋅ u₃ = 0
v₂ ⋅ u₃ = 0
v₃ ⋅ u₃ = 1
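
The check above looks at $u_3$ only. A minimal sketch that tests all nine pairs $u_i \cdot v_j$ at once is given below. It assumes native Julia matrices with exact `Rational` entries in place of the sympy objects used in this notebook; the name `G` is introduced here for illustration only.

```julia
using LinearAlgebra

# Same V as in the example above, stored as exact rationals in a native Julia matrix.
# (Assumption of this sketch: plain Julia arrays replace the sympy objects.)
V = Rational{Int}[ 1 -1 -3;
                   2 -1 -7;
                  -1  2  3]
U = inv(V)    # the rows of U are the dual basis vectors u_i

# G[i, j] = u_i ⋅ v_j  (row i of U with column j of V)
G = [dot(U[i, :], V[:, j]) for i in 1:3, j in 1:3]

# Biorthogonality: G should be the 3×3 identity matrix
println(G == Matrix{Rational{Int}}(I, 3, 3))
```

Exact rational arithmetic keeps the comparison with the identity matrix free of rounding error.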

Let's revisit how we computed $x$ in $V x = b \Leftrightarrow x = U b$.
>By considering each row of $U$, we see that
$$
x_i = u_i \cdot b, \quad i = 1,2, \dots n.
$$

In [7]:
print( "Let\'s check: compute "); @display x[1]; println( "by using the dot product")
@display_val u₁ ⋅ b;

Let's check: compute x[1] = 

PyObject 10

by using the dot product
u₁ ⋅ b = 10

# 2. Oblique Projections

We now rewrite the equation $b = V x$ by using $x_i = u_i \cdot b$.

We find
$$
\begin{array}{ll}
b &=   x_1 v_1 + x_2 v_2 + \dots + x_n v_n \\
  &=   v_1 (u_1 \cdot b) + v_2 (u_2 \cdot b) + \dots + v_n (u_n \cdot b) \\
  &=   v_1 u^t_1 b + v_2 u^t_2 b + \dots + v_n u^t_n b, \\
\end{array}
$$
where we have written the vectors $u_i$ as **column vectors**
and used the fact that $u^t_i b = ( u_i \cdot b )$.

Note: another way to see this is to write
$$
I b = (V U) b = (v_1 u^t_1 + v_2 u^t_2 + \dots + v_n u^t_n ) b
$$
by partitioning the product $V U$ into the columns $v_i$ of $V$ and the rows $u^t_i$ of $U$.

> The interesting aspect of this rewrite is that we now have matrices
$$
P_i = v_i u^t_i, \quad i=1,2, \dots n
$$
with the property that $P_i b = x_i v_i$, namely the component of $b$ along $v_i$ in the decomposition of $b$ with respect to the columns of $V$ that we started with above!
$$
b  = P_1 b + P_2 b + \dots + P_n b, \quad \text{ where } P_i b  = x_i v_i
$$

**Example:**

In [8]:
function projection_matrix( V, U, j )
    V.col(j) * U.row(j)
end

projection_matrix (generic function with 1 method)

In [9]:
P₁ = projection_matrix( V, U, 0 )
P₂ = projection_matrix( V, U, 1 )
P₃ = projection_matrix( V, U, 2 );

In [10]:
println( "The P_i matrices are "); @displayln P₁; @displayln P₂; @displayln P₃

print( "We had decomposed "); @display_vec b; println(" as a linear combination of the columns of V\n" )
print( "In particular, the component x[1] v_1 = "); @display_vec x[1]*v₁
println("\nVerify this by computing the projection of b using P₁: " )
@display_vec  P₁*b;

The P_i matrices are 
P₁ = 

PyObject Matrix([
[ 11, -3,  4],
[ 22, -6,  8],
[-11,  3, -4]])


P₂ = 

PyObject Matrix([
[-1, 0, -1],
[-1, 0, -1],
[ 2, 0,  2]])


P₃ = 

PyObject Matrix([
[ -9,  3, -3],
[-21,  7, -7],
[  9, -3,  3]])


We had decomposed b = 

PyObject Matrix([[1, 3, 2]])

 as a linear combination of the columns of V

In particular, the component x[1] v_1 = x[1] * v₁ = 

PyObject Matrix([[10, 20, -10]])


Verify this by computing the projection of b using P₁: 
P₁ * b = 

PyObject Matrix([[10, 20, -10]])
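
The properties of the $P_i$ used so far can be checked together. The sketch below again assumes native Julia `Rational` matrices instead of sympy objects, and collects the three matrices in a vector `P` (a name chosen for this sketch).

```julia
using LinearAlgebra

V = Rational{Int}[1 -1 -3; 2 -1 -7; -1 2 3]
U = inv(V)
b = Rational{Int}[1, 3, 2]
x = U * b                                   # coordinates of b with respect to the columns of V

# P[i] = v_i u_iᵗ: column i of V times row i of U (an outer product)
P = [V[:, i] * transpose(U[i, :]) for i in 1:3]

println(sum(P) == V * U)                                            # P₁ + P₂ + P₃ = V U = I
println(all(P[i] * P[i] == P[i] for i in 1:3))                      # each P_i is idempotent
println(all(iszero(P[i] * P[j]) for i in 1:3, j in 1:3 if i != j))  # P_i P_j = 0 for i ≠ j
println(all(P[i] * b == x[i] * V[:, i] for i in 1:3))               # P_i b = x_i v_i
```

The mixed products vanish because $u_i \cdot v_j = 0$ for $i \ne j$.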

---
Two more things to notice:
* We can split a vector $b$ into any two (or more) components, say a component
in the plane $span\{ v_1, v_3 \}$ and a component in $span\{ v_2, v_4, \dots v_n\}$:
$$
b = (P_1+P_3)b  + (P_2 + P_4 + \dots + P_n) b
$$
The **projection matrices** $P_i$ have the property
$P^2_i = (v_i u^t_i) (v_i u^t_i) = v_i (u^t_i v_i) u^t_i = v_i (u_i \cdot v_i) u^t_i = v_i u^t_i = P_i$.
* The two vectors we get in such a split are not orthogonal in general.

**Example:**

In [11]:
println( "Let's split b into a component in the plane span{v_1,v_3} and a component along the line span{v_2}")
b₁₃  = (P₁+P₃)*b
b₂   = P₂ * b

@display_vec b; println(); @display_vec (P₁+P₃)*b; println();  @display_vec P₂*b
println( "\n\nThe angle between these two vectors is $( ( sympy.acos( (b₂ ⋅ b₁₃)/( b₂.norm() * b₁₃.norm() )) *180/π).round(2) ) degrees")

Let's split b into a component in the plane span{v_1,v_3} and a component along the line span{v_2}
b = 

PyObject Matrix([[1, 3, 2]])


(P₁ + P₃) * b = 

PyObject Matrix([[4, 6, -4]])


P₂ * b = 

PyObject Matrix([[-3, -3, 6]])



The angle between these two vectors is PyObject 153.02 degrees
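
The same split can be reproduced with native Julia arrays. This is a sketch under the assumption that `Rational` matrices stand in for the sympy objects; it also shows that the complementary matrix $I - P_2$ is idempotent but not symmetric, which is what makes the projection oblique. The name `θ` is introduced here only.

```julia
using LinearAlgebra

V = Rational{Int}[1 -1 -3; 2 -1 -7; -1 2 3]
U = inv(V)
b = Rational{Int}[1, 3, 2]

P₂  = V[:, 2] * transpose(U[2, :])   # projection onto span{v₂} along span{v₁, v₃}
P₁₃ = I - P₂                          # complementary projection, equal to P₁ + P₃

b₂, b₁₃ = P₂ * b, P₁₃ * b
θ = acosd(dot(b₂, b₁₃) / (norm(b₂) * norm(b₁₃)))   # angle between the pieces, in degrees

println(b₂ + b₁₃ == b)        # the two pieces add up to b
println(P₁₃ * P₁₃ == P₁₃)     # still a projection
println(P₁₃ == P₁₃')          # false: an oblique projection matrix is not symmetric
println(round(θ, digits=2))
```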


# 3. Orthogonal Projections

If $V$ is an orthogonal matrix, then $U = V^t$ and the **column vectors** $u_i = v_i$.
For this special case, the matrix $V$ and the $v_i$ vectors are usually denoted $Q$ and $q_i$ respectively.
The projection matrices then look like
$$
P_i = q_i q^t_i
$$

**Example:**

In [12]:
a = sympy.Matrix( [3; 6; -6])
@display_vec a
println( "\nThe orthogonal projection matrix onto the line span{a} is given by")

q = a / a.norm()

P = q * q.T
print( "a/norm(a) = "); @display_vec(q)
print( "P = q q^t = ")
P

a = 

PyObject Matrix([[3, 6, -6]])


The orthogonal projection matrix onto the line span{a} is given by
a/norm(a) = q = 

PyObject Matrix([[1/3, 2/3, -2/3]])

P = q q^t = 

PyObject Matrix([
[ 1/9,  2/9, -2/9],
[ 2/9,  4/9, -4/9],
[-2/9, -4/9,  4/9]])

A slightly different way to write this projection matrix is
$$
P = a \frac{1}{a \cdot a} a^t = a ( a^t a )^{-1} a^t
$$
which we recognize from the solution of the normal equation.

---
We could handle orthogonal projections onto subspaces in exactly the way we have described above:<br>
we need a basis for $\mathbb{R}^n$ that includes basis vectors for the subspace (with the remaining basis vectors orthogonal to it), set up the $n \times n$ matrix
from this complete set of basis vectors and compute its inverse...

The normal equation provides a shortcut: we only need the basis vectors for the subspace,
say $\{ v_1, v_2, \dots v_k\}$ where $k \lt n$, that we write into a matrix $A = ( v_1 \; v_2 \; \dots v_k )$
as columns. The projection matrix onto this subspace is
$$
P = A (A^t A)^{-1} A^t
$$

**Remark:** In addition to satisfying $P^2 = P$, orthogonal projection matrices are symmetric, i.e., $P^t = P$:

$$
P^t = \left( A \left( A^t A \right)^{-1} A^t \right)^t = A \left( A^t A \right)^{-1} A^t = P.
$$
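
The middle equality relies on the transpose rules $(XY)^t = Y^t X^t$ and $(X^{-1})^t = (X^t)^{-1}$, together with the symmetry of $A^t A$. Spelled out step by step, as a sketch:
$$
\begin{array}{ll}
P^t &= \left( A^t \right)^t \left( \left( A^t A \right)^{-1} \right)^t A^t \\
    &= A \left( \left( A^t A \right)^t \right)^{-1} A^t \\
    &= A \left( A^t A \right)^{-1} A^t = P.
\end{array}
$$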

**Example:**

In [13]:
println( "Consider the span of the following two column vectors")
a₁ = sympy.Matrix( [ 1;  2; -1])
a₂ = sympy.Matrix( [ 1; -1;  1])
A  = sympy.Matrix( [ 1  2 -1;  1 -1 1] ).T
@display_vec a₁; println()
@display_vec a₂; println()

@display A

println("Orthogonal projection matrix onto the plane span(a₁,a₂):")
A * (A.T * A).inv()*A.T # matrix should be computed as  A *( (A.T * A) \ A.T)

Consider the span of the following two column vectors
a₁ = 

PyObject Matrix([[1, 2, -1]])


a₂ = 

PyObject Matrix([[1, -1, 1]])


A = 

PyObject Matrix([
[ 1,  1],
[ 2, -1],
[-1,  1]])

Orthogonal projection matrix onto the plane span(a₁,a₂):


PyObject Matrix([
[13/14,  1/7, 3/14],
[  1/7,  5/7, -3/7],
[ 3/14, -3/7, 5/14]])
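
The comment in the cell above suggests solving with $A^t A$ rather than forming its inverse. A minimal sketch of that variant follows, assuming native Julia `Rational` matrices instead of sympy objects; it also checks the two defining properties of an orthogonal projection matrix.

```julia
using LinearAlgebra

A = Rational{Int}[ 1  1;
                   2 -1;
                  -1  1]

# Solve (AᵗA) X = Aᵗ for X instead of computing the inverse of AᵗA explicitly
P = A * ((A' * A) \ Matrix(A'))

println(P)
println(P * P == P)     # idempotent
println(P' == P)        # symmetric
println(P * A == A)     # vectors already in the plane are left unchanged
```

For a $3 \times 2$ matrix the difference between solving and inverting is negligible; the habit matters for larger or floating-point problems.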

> **Remark**: if the vectors are orthogonal, the normal equation formula simplifies, since $A^t A$ is diagonal:
$$
P = \frac{1}{a_1\cdot a_1} a_1 a^t_1 + \frac{1}{a_2\cdot a_2} a_2 a^t_2 \dots + \frac{1}{a_k\cdot a_k} a_k a^t_k
$$
i.e., setting $q_i = \frac{1}{|| a_i ||} a_i$, we have
$$
P = q_1 q^t_1 + q_2 q^t_2 + ... q_k q^t_k
$$
Orthogonal basis vectors are nice!

**Example:**

In [14]:
a₁ = sympy.Matrix([ 1;  -1; 1]); q₁ = a₁ / a₁.norm()
a₂ = sympy.Matrix([ 2;   1;-1]); q₂ = a₂ / a₂.norm()
A   = a₁.row_join( a₂ )
Q   = q₁.row_join( q₂ )

@display_vec a₁; println()
@display_vec a₂; println()

print("Verify a₁ and a₂ are orthogonal" ); @display_val a₁ ⋅ a₂; println("\n")

println("Q =")
display( Q ) 

println( "\nOrthogonal Projection Matrix onto span{a₁}")
P₁ = q₁ * q₁.T
display(P₁)

print( "Orthogonal Projection Matrix onto span{a₂}")
P₂ = q₂ * q₂.T
display( P₂ )

println( "\nOrthogonal Projection Matrix onto span{a₁,a₂} computed two ways:")
P₁₂ = A*(A.T* A).inv()*A.T
println("A (A' A).inverse A'")
display( P₁₂ )
println("P₁ + P₂")
display( P₁+P₂ )

a₁ = 

PyObject Matrix([[1, -1, 1]])


a₂ = 

PyObject Matrix([[2, 1, -1]])


Verify a₁ and a₂ are orthogonala₁ ⋅ a₂ = 0

Q =


PyObject Matrix([
[ sqrt(3)/3,  sqrt(6)/3],
[-sqrt(3)/3,  sqrt(6)/6],
[ sqrt(3)/3, -sqrt(6)/6]])


Orthogonal Projection Matrix onto span{a₁}


PyObject Matrix([
[ 1/3, -1/3,  1/3],
[-1/3,  1/3, -1/3],
[ 1/3, -1/3,  1/3]])

Orthogonal Projection Matrix onto span{a₂}

PyObject Matrix([
[ 2/3,  1/3, -1/3],
[ 1/3,  1/6, -1/6],
[-1/3, -1/6,  1/6]])


Orthogonal Projection Matrix onto span{a₁,a₂} computed two ways:
A (A' A).inverse A'


PyObject Matrix([
[1,    0,    0],
[0,  1/2, -1/2],
[0, -1/2,  1/2]])

P₁ + P₂


PyObject Matrix([
[1,    0,    0],
[0,  1/2, -1/2],
[0, -1/2,  1/2]])

# 4. Diagonalizable Matrices

Let's apply these notions to diagonalizable matrices:
Given a matrix $A$ of size $n \times n$, we can ask what effect the linear operator $y = A x$ has on $x$.

Suppose $A = V \Lambda V^{-1} = V \Lambda U$, where we again have set $U = V^{-1}$.
Partitioning the matrix $V$ into columns and the matrix $U$ into rows, this multiplies out to
$$
A = \lambda_1 v_1 u^t_1 + \lambda_2 v_2 u^t_2 + \dots + \lambda_n v_n u^t_n.
$$
We now recognize the outer products $v_i u^t_i$ as the projection matrices onto the eigenvectors $v_i$:
the effect of applying the matrix $A$ to the vector $x$ is to decompose $x$ along each of the eigenvectors,
and to scale each of these components by the associated eigenvalue.

When $A$ has a complete set of orthonormal eigenvectors, $A = Q \Lambda Q^t$, we have the special case
where $v_i = q_i, u_i = q_i$, and the above formula becomes
$$
A = \lambda_1 q_1 q^t_1 + \lambda_2 q_2 q^t_2 + \dots + \lambda_n q_n q^t_n,
$$
which is known as the spectral decomposition of $A$ (the spectral theorem guarantees it for every real symmetric matrix).
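
Both sums can be checked numerically. The sketch below uses Julia's `eigen` from `LinearAlgebra` in floating point; the matrices `S` and `M` are hypothetical examples chosen for this illustration and do not appear elsewhere in the notebook.

```julia
using LinearAlgebra

# Symmetric case: A = Σ λ_i q_i q_iᵗ  (S is a hypothetical example matrix)
S = [2.0 1.0 0.0;
     1.0 2.0 1.0;
     0.0 1.0 2.0]
λ, Q = eigen(Symmetric(S))            # orthonormal eigenvectors in the columns of Q
S_sum = sum(λ[i] * Q[:, i] * Q[:, i]' for i in 1:3)
println(S_sum ≈ S)

# General diagonalizable case: A = Σ λ_i v_i u_iᵗ with U = V⁻¹  (M is hypothetical too)
M = [4.0 1.0;
     2.0 3.0]                         # eigenvalues 2 and 5
μ, W = eigen(M)                       # eigenvectors in the columns of W
Winv = inv(W)                         # rows of Winv play the role of the u_i
M_sum = sum(μ[i] * W[:, i] * transpose(Winv[i, :]) for i in 1:2)
println(M_sum ≈ M)
```

The comparisons use `≈` because the eigenvalues and eigenvectors are computed in floating point.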

# 5. Application Examples
Which of these formulae should we choose?

##  5.1 A = QR

The Gram Schmidt algorithm uses the simplified orthogonal projection formulae onto orthonormal vectors:
$$
P_i v_k = \left(v_k \cdot q_i \right) q_i = \left( v_k \cdot \frac{w_i}{||w_i||} \right) \frac{w_i}{||w_i||} = \frac{v_k \cdot w_i}{w_i \cdot w_i} w_i
$$
to compute the orthogonal projection of $v_k$ onto each of the previously computed vectors $w_1, w_2, \dots w_{k-1}$,
adds up these projections to obtain the orthogonal projection of $v_k$ onto
$span \{ w_1, w_2, \dots w_{k-1} \}$, i.e.,
$$
v_{parallel} = P_1 v_k + P_2 v_k + \dots + P_{k-1} v_k
$$
and finally finds the orthogonal component
$$
w_k = v_{perpendicular} = v_k - v_{parallel}.
$$
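
A minimal sketch of these steps (classical Gram-Schmidt, without normalization and without any safeguard against linearly dependent input) could look as follows. The function name `gram_schmidt` and the input vectors are assumptions made for this illustration.

```julia
using LinearAlgebra

# Classical Gram-Schmidt: returns orthogonal (not normalized) vectors w_1, ..., w_n.
# Assumes the input vectors are linearly independent.
function gram_schmidt(vs)
    ws = Vector{Vector{Float64}}()
    for v in vs
        # v_par: sum of the projections of v onto each previously computed w
        v_par = zeros(length(v))
        for w in ws
            v_par += (dot(v, w) / dot(w, w)) * w
        end
        push!(ws, v - v_par)          # the orthogonal component becomes the next w
    end
    return ws
end

vs = [[1.0, 2.0, -1.0], [1.0, -1.0, 1.0], [0.0, 1.0, 1.0]]   # hypothetical input vectors
ws = gram_schmidt(vs)

# all pairwise dot products should vanish up to rounding
println(maximum(abs(dot(ws[i], ws[j])) for i in 1:3 for j in i+1:3))
```

Dividing each $w_i$ by its norm afterwards yields the orthonormal vectors $q_i$, i.e., the columns of $Q$ in $A = QR$.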

## 5.2 A Projection Problem (Spring 14)

In [15]:
print("Consider the following matrix")
A = [ 1 1 0 2 5;  -1 -1 0 -2 -5;   2 2 1 3 11;  1 1 2 0 9 ]
AI = sympy.Matrix( [ A 1I ] )
A  = sympy.Matrix(A)

Consider the following matrix

PyObject Matrix([
[ 1,  1, 0,  2,  5],
[-1, -1, 0, -2, -5],
[ 2,  2, 1,  3, 11],
[ 1,  1, 2,  0,  9]])

Problem: find the orthogonal projection matrices onto each of the four fundamental subspaces of this matrix.

Let's find basis vectors for the spaces:

In [16]:
print("Augment A by I and find the reduced row echelon form")
U,pivot_cols = AI.rref()   # note: from here on U no longer denotes the inverse of V
U

Augment A by I and find the reduced row echelon form

PyObject Matrix([
[1, 1, 0,  2, 0, 0, 13/2,  5, -5/2],
[0, 0, 1, -1, 0, 0,  7/2,  2, -1/2],
[0, 0, 0,  0, 1, 0, -3/2, -1,  1/2],
[0, 0, 0,  0, 0, 1,    1,  0,    0]])

We observe that the pivots in $A$ are in columns 1, 3 and 5:
* A basis for $\mathscr{C}(A)$ is given by columns 1, 3 and 5 of $A$
* A basis for $\mathscr{R}(A)$ is given by the pivot rows of the $A$ block in the reduced row echelon form $U$
* A basis for $\mathscr{N}(A^t)$ is given by the $I$ block of those rows of $U$ whose $A$ block is zero
* A basis for $\mathscr{N}(A)$ is given by the homogeneous solutions of $A x = 0$

In [17]:
col1 = A.col(0); col2 = A.col(2); col3 = A.col(4)
row1 = U.__getitem__((0,0:4)).T; row2 = U.__getitem__((1,0:4)).T; row3 = U.__getitem__((2,0:4)).T
nt   = U.__getitem__((3,5:8)).T
n1,n2 = A.nullspace()

println( ">>>>> Row Space basis = ")
@display_vec row1;
@display_vec row2;
@display_vec row3;

println("\n>>>>> Nullspace basis =")
@display_vec n1
@display_vec n2

println("\n>>>>> Column Space basis = ");
@display_vec col1
@display_vec col2
@display_vec col3

println( "\n>>>>> Null Space basis for A.transpose = ")
@display_vec nt; println()

>>>>> Row Space basis = 
row1 = 

PyObject Matrix([[1, 1, 0, 2, 0]])

row2 = 

PyObject Matrix([[0, 0, 1, -1, 0]])

row3 = 

PyObject Matrix([[0, 0, 0, 0, 1]])


>>>>> Nullspace basis =
n1 = 

PyObject Matrix([[-1, 1, 0, 0, 0]])

n2 = 

PyObject Matrix([[-2, 0, 1, 1, 0]])


>>>>> Column Space basis = 
col1 = 

PyObject Matrix([[1, -1, 2, 1]])

col2 = 

PyObject Matrix([[0, 0, 1, 2]])

col3 = 

PyObject Matrix([[5, -5, 11, 9]])


>>>>> Null Space basis for A.transpose = 
nt = 

PyObject Matrix([[1, 1, 0, 0]])




Note that these bases are not orthogonal: to produce orthogonal projection matrices onto these spaces,
we will use the equation $P = A ( A^t A )^{-1} A^t$, where $A$ is the matrix with the basis vectors written as its columns.
> Orthogonal Projections onto the row space $\mathscr{R}(A) = \mathscr{C}(A^t)$ and the null space $\mathscr{N}(A)$
* Note that these spaces are orthogonal complements, so the two projection matrices satisfy $P_r + P_n = I$: after we compute one of these, say $P_r$, we can obtain the other by using $P_n = I - P_r$.

> Orthogonal Projections onto the column space $\mathscr{C}(A) = \mathscr{R}(A^t)$ and the null space $\mathscr{N}(A^t)$
* Note that these spaces are also orthogonal complements, so the same observation holds.

> **Use the basis with fewer vectors, so that the matrix $A^t A$ is smaller in size!**

**Remark:**
* The orthogonal projection onto the span of a single vector $v$ is the orthogonal projection onto a line defined by that vector: we previously computed it in 2D and 3D by considering the projection of the unit vectors onto that line.
* A similar remark holds for the orthogonal projection onto a plane defined by the span of two linearly independent vectors $span\{ v_1, v_2 \}$ 

In [18]:
println( "Row space(A) and Null space(A) projection matrices: there are fewer vectors in the nullspace basis")
An = n1.row_join( n2 )          # both nullspace basis vectors are needed, written as columns
Pₙ = An * (An.T*An).inv()*An.T  # should be computed as An * ((An.T*An) \ An.T)
Pᵣ = sympy.eye(5)-Pₙ
@display Pₙ
@display Pᵣ;

Row space(A) and Null space(A) projection matrices: there are fewer vectors in the nullspace basis
Pₙ = 

PyObject Matrix([
[ 3/4, -1/4, -1/4, -1/4, 0],
[-1/4,  3/4, -1/4, -1/4, 0],
[-1/4, -1/4,  1/4,  1/4, 0],
[-1/4, -1/4,  1/4,  1/4, 0],
[   0,    0,    0,    0, 0]])

Pᵣ = 

PyObject Matrix([
[1/4, 1/4,  1/4,  1/4, 0],
[1/4, 1/4,  1/4,  1/4, 0],
[1/4, 1/4,  3/4, -1/4, 0],
[1/4, 1/4, -1/4,  3/4, 0],
[  0,   0,    0,    0, 1]])

In [19]:
println( "Col space(A) and Null space(A') projection matrices: there are fewer vectors in the nullspace basis of A'.")
Atn = nt                              # the single basis vector of N(A'), as a column

Pₜₙ = Atn * (Atn.T*Atn).inv()*Atn.T   # should be computed as Atn * ( (Atn.T*Atn) \ Atn.T )
Pc = sympy.eye(4)-Pₜₙ

@display Pc
@display Pₜₙ;

Col space(A) and Null space(A') projection matrices: there are fewer vectors in the nullspace basis of A'.
Pc = 

PyObject Matrix([
[ 1/2, -1/2, 0, 0],
[-1/2,  1/2, 0, 0],
[   0,    0, 1, 0],
[   0,    0, 0, 1]])

Pₜₙ = 

PyObject Matrix([
[1/2, 1/2, 0, 0],
[1/2, 1/2, 0, 0],
[  0,   0, 0, 0],
[  0,   0, 0, 0]])
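
As a cross-check, the column space projection can also be computed directly from the three pivot columns of $A$ and compared with the shortcut through $\mathscr{N}(A^t)$. The sketch assumes native Julia `Rational` arrays in place of sympy objects; the names `C`, `Pc_direct` and `Pc_short` are introduced here only.

```julia
using LinearAlgebra

C = Rational{Int}[ 1  0   5;
                  -1  0  -5;
                   2  1  11;
                   1  2   9]      # pivot columns 1, 3 and 5 of A
n = Rational{Int}[1, 1, 0, 0]     # basis vector of N(Aᵗ)

Pc_direct = C * ((C' * C) \ Matrix(C'))   # requires solving with a 3×3 matrix
Ptn       = (n * n') / dot(n, n)          # requires a single scalar division
Pc_short  = I - Ptn

println(Pc_direct == Pc_short)            # both routes give the same matrix
```

The shortcut only needs the scalar $n \cdot n$, which illustrates the advice to work with the smaller of the two bases.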

The rank of $A$ is the number of pivots: $rank(A) = 3$

**Remark:** Eigenvectors for the eigenvalue $\lambda = 0$ are non-zero solutions of $(A-\lambda I)x = 0$, i.e.,
non-zero vectors in the null space of a **square** matrix $A$. This matrix is not square. The null-space vectors
are, however, eigenvectors of $A^t A$, since $A$ and $A^t A$ have the same null space.
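
A short numerical check of these statements, as a sketch with native Julia integer arrays (the two null space vectors are copied from the basis computed earlier):

```julia
using LinearAlgebra

A = [ 1  1  0  2   5;
     -1 -1  0 -2  -5;
      2  2  1  3  11;
      1  1  2  0   9]
n1 = [-1, 1, 0, 0, 0]
n2 = [-2, 0, 1, 1, 0]

println(rank(A))                                          # number of pivots
println(iszero(A * n1), " ", iszero(A * n2))              # both vectors lie in N(A)
println(iszero(A' * A * n1), " ", iszero(A' * A * n2))    # so AᵗA n = 0 = 0·n
```

Since $A^t A$ is a square $5 \times 5$ matrix, the last line says that `n1` and `n2` are eigenvectors of $A^t A$ for the eigenvalue $0$.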