# 1. Dual Basis

Let $A$ be a matrix with columns $a_1, a_2, \dots a_n$. The equation
$b = A x \Leftrightarrow b = x_1 a_1 + x_2 a_2 \dots + x_n a_n$
expresses a vector $b$ as a linear combination of the columns of $A$.
The vector $x$ is the coordinate vector representing $b$ with respect to the $a_i$.

We want this coordinate vector to be unique, and will therefore consider invertible matrices $A$.

The customary notation for what we want to do uses the name $V$ for $A$, and $U$ for the inverse of $A$.

**Example:**

In [2]:
V = matrix(QQ, [[ 1, -1, -3],
                [ 2, -1, -7],
                [-1,  2,  3]])
U = V.inverse()
print( 'Matrix V and its inverse U' ); show( 'V = ', V, ',  U = ', U,  ',    V U = ', V*U)

Matrix V and its inverse U


In [3]:
b = vector(QQ, [1,3,2])
x = U * b   # i.e., x = V.inverse() * b
show('b = ', b, '  is represented by the coordinate vector ')
show('x = ', x, '  with respect to the columns of V')

In [6]:
print( 'Let\'s check:  use the coordinate vector x to recover the original vector b' )
v_1,v_2,v_3 = vector(V[:,0]), vector(V[:,1]), vector(V[:,2])

x[0] * v_1 + x[1] * v_2 + x[2] * v_3

Let's check:  use the coordinate vector x to recover the original vector b


(1, 3, 2)

---
The columns of an invertible matrix $V$ of size $n \times n$ form a basis for $\mathbb{R}^n$.
So do the rows of the inverse matrix $U = V^{-1}$.<br>
This second basis (the **dual basis**) is rather special:<br>
since $U V = I$, we see that the dot products of the rows of $U$ with the columns of $V$ are the entries of $I$.

> Denote the rows of $U$ by $u_i, i=1,2, \dots n$.
We have
$$
u_i \cdot v_j = \left\{ \begin{array}{ll} 1 \quad & \text{ if } i = j, \\ 0 \quad & \text{ otherwise.}\\ \end{array} \right.
$$

The $i^{th}$ vector $u_i$ is orthogonal to the vectors $v_1, v_2, \dots v_{i-1}, v_{i+1}, \dots v_n$.
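
The biorthogonality relation can also be checked outside of Sage. The following is a minimal sketch in plain Python, assuming the matrix $V$ from the example above; the inverse `U` is typed in by hand here, and the helper name `dot` is only illustrative.

```python
# V from the example above; U is its inverse, written out by hand for this sketch.
V = [[1, -1, -3],
     [2, -1, -7],
     [-1, 2, 3]]
U = [[11, -3, 4],
     [1, 0, 1],
     [3, -1, 1]]

def dot(a, c):
    """Dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, c))

n = len(V)
v = [[V[r][j] for r in range(n)] for j in range(n)]  # v[j] = column j of V
u = U                                                 # u[i] = row i of U

# gram[i][j] = u_i . v_j: expected to be 1 on the diagonal and 0 elsewhere
gram = [[dot(u[i], v[j]) for j in range(n)] for i in range(n)]
for row in gram:
    print(row)
```

If `U` really is the inverse of `V`, the printed rows are the rows of the identity matrix.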

**Remark** A source of confusion is the convention that vectors are viewed as **column vectors** when converted to a matrix. Viewed as vectors, there is no column or row convention: when printed, vectors usually appear on a single line (a row!):
* $v_i$ denotes the column vector with entries from the $i^{th}$ column of $V$
* $u_i$ denotes the column vector with entries from the $i^{th}$ row of $U$: thus, the row vector
  $u^t_i$ is the $1 \times n$ matrix equal to the $i^{th}$ row of $U$

In [8]:
print( 'Let\'s check by looking at the vector u_3  (the third row of U):')
u_1,u_2,u_3 = vector(U[0,:]),vector(U[1,:]),vector(U[2,:]) # vector() converts row and/or column vectors to vectors
print( 'u_3 =', u_3)
print()
print( 'v_1 dot u_3 = ', v_1 * u_3)
print( 'v_2 dot u_3 = ', v_2 * u_3)
print( 'v_3 dot u_3 = ', v_3 * u_3)

Let's check by looking at the vector u_3  (the third row of U):
u_3 = (3, -1, 1)

v_1 dot u_3 =  0
v_2 dot u_3 =  0
v_3 dot u_3 =  1


Let's revisit how we computed $x$ in $V x = b \Leftrightarrow x = U b$.
>By considering each row of $U$, we see that
$$
x_i = u_i \cdot b, \quad i = 1,2, \dots n.
$$

In [9]:
print( 'Let\'s check: compute x_1 = ', x[0], ' by using the dot product u_1 dot b')
show( 'u1 dot b =\t ', u_1 * b)

Let's check: compute x_1 =  10  by using the dot product u_1 dot b


# 2. Oblique Projections

We now rewrite the equation $b = V x$ by using $x_i = u_i \cdot b$:

We find
$$
\begin{array}{ll}
b &=   x_1 v_1 + x_2 v_2 + \dots + x_n v_n \\
  &=   v_1 (u_1 \cdot b) + v_2 (u_2 \cdot b) + \dots + v_n (u_n \cdot b) \\
  &=   v_1 u^t_1 b + v_2 u^t_2 b + \dots + v_n u^t_n b, \\
\end{array}
$$
where we have replaced the vectors $u_i$ with **column vectors** $u_i$
and used the fact that $u^t_i b = ( u_i \cdot b )$.

Note: another way to see this is to write
$$
I b = (V U) b = (v_1 u^t_1 + v_2 u^t_2 + \dots + v_n u^t_n ) b
$$
by partitioning the product $V U$ into the columns $v_i$ of $V$ and the rows $u^t_i$ of $U$.

> The interesting aspect of this rewrite is that we now have matrices
$$
P_i = v_i u^t_i, i=1,2, \dots n
$$
with the property that $P_i b = x_i v_i$, namely the component vector of the decomposition of $b$ onto the columns of $V$ that we started with above!
$$
b  = P_1 b + P_2 b + \dots P_n b, \quad \text{ where } P_i b  = x_i v_i
$$
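
As a cross-check of this decomposition, here is a small plain-Python sketch. It assumes the $V$ and $b$ of the running example, with the inverse `U` written out by hand; `outer` and `matvec` are hypothetical helper names, not Sage functions.

```python
# V and b from the example above; U is the inverse of V, written out by hand.
V = [[1, -1, -3],
     [2, -1, -7],
     [-1, 2, 3]]
U = [[11, -3, 4],
     [1, 0, 1],
     [3, -1, 1]]
b = [1, 3, 2]
n = len(V)

def dot(a, c):
    return sum(p * q for p, q in zip(a, c))

def outer(col, row):
    """Outer product col * row^t, returned as a list of rows."""
    return [[c * r for r in row] for c in col]

def matvec(M, x):
    return [dot(row, x) for row in M]

v = [[V[r][j] for r in range(n)] for j in range(n)]  # columns of V
u = U                                                 # rows of U

x = [dot(u[i], b) for i in range(n)]          # coordinates x_i = u_i . b
P = [outer(v[i], u[i]) for i in range(n)]     # P_i = v_i u_i^t
parts = [matvec(P[i], b) for i in range(n)]   # P_i b

for i in range(n):
    # each part should equal x_i v_i
    assert parts[i] == [x[i] * c for c in v[i]]

# the parts add back up to b
total = [sum(part[r] for part in parts) for r in range(n)]
print(x, parts, total)
```

Integer arithmetic is enough here because $\det V = 1$, so the inverse has integer entries.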

**Example:**

In [10]:
P_1 = V[:,0] * U[0,:]
P_2 = V[:,1] * U[1,:]
P_3 = V[:,2] * U[2,:]

print( "The P_i matrices are "); show('P1 = ', P_1, ',  P2 = ', P_2, ',  P3 = ', P_3)
print()
print( 'We had decomposed b = ', b, ' as a linear combination of the columns of V')
print( 'In particular, the component x_1 v_1 is ', x[0],v_1, ' = ', x[0]*v_1)
print()
show( 'P1 b = ', P_1*b)

The P_i matrices are 



We had decomposed b =  (1, 3, 2)  as a linear combination of the columns of V
In particular, the component x_1 v_1 is  10 (1, 2, -1)  =  (10, 20, -10)



---
Two more things to notice:
* we can split a vector $b$ into any two (or more) components, say a component
in the subspace $span\{ v_1, v_3 \}$ and a component in $span\{ v_2, v_4, \dots v_n\}$:
$$
b = (P_1+P_3)b  + (P_2 + P_4 + \dots + P_n) b
$$
The **projection matrices** $P_i$ have the property
$P^2_i = (v_i u^t_i) (v_i u^t_i) = v_i (u^t_i v_i) u^t_i = v_i (u_i \cdot v_i) u^t_i = v_i u^t_i = P_i$.
* The two vectors we get in such a split are not orthogonal in general.
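
A short derivation, using only $u_i \cdot v_j = 0$ for $i \ne j$, shows why a sum such as $P_1 + P_3$ is again a projection matrix:
$$
P_i P_j = v_i (u^t_i v_j) u^t_j = 0 \quad (i \ne j), \qquad \text{so} \qquad (P_1 + P_3)^2 = P_1^2 + P_1 P_3 + P_3 P_1 + P_3^2 = P_1 + P_3.
$$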

**Example:**

In [11]:
print( 'Let\'s split b into a component in the plane span{v_1,v_3} and a component along the line span{v_2}')
print( 'b = (P_1+P_3) b  + P_2 b')

b13  = (P_1+P_3)*b
b2   = P_2 * b
show( 'b = ', b, ',  (P1+P3) b  =', b13, ',  P2 b = ', b2 )
print()
print( 'The angle between these two vectors is ', \
       round(acos((b2*b13/(norm(b2)*norm(b13))).n())*180/pi.n(),2), 'degrees')

Let's split b into a component in the plane span{v_1,v_3} and a component along the line span{v_2}
b = (P_1+P_3) b  + P_2 b



The angle between these two vectors is  153.02 degrees


# 3. Orthogonal Projections

If $V$ is an orthogonal matrix, then $U = V^t$ and the **column vectors** $u_i = v_i$.
For this special case, the matrix $V$ and the $v_i$ vectors are usually denoted $Q$ and $q_i$ respectively.
The projection matrices then look like
$$
P_i = q_i q^t_i
$$

**Example:**

In [14]:
a = vector(QQ, [3,6,-6])
print( 'Let a = ', a)
print( 'The orthogonal projection matrix onto the line span{a} is given by')

q = matrix( a/norm(a) ).transpose()

P = q*q.transpose()
show( 'q = a/norm(a) = ', q, ',  P = q q^t = ', P )

Let a =  (3, 6, -6)
The orthogonal projection matrix onto the line span{a} is given by


A slightly different way to write this projection matrix is
$$
P = a \frac{1}{a \cdot a} a^t = a ( a^t a )^{-1} a^t
$$
which we recognize from the solution of the normal equations.
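
Worked out by hand for the vector $a = (3, 6, -6)$ used in the example (a check of the formula, not output copied from the notebook):
$$
a \cdot a = 9 + 36 + 36 = 81, \qquad
P = \frac{1}{81} a a^t
  = \frac{1}{81} \left( \begin{array}{rrr} 9 & 18 & -18 \\ 18 & 36 & -36 \\ -18 & -36 & 36 \end{array} \right)
  = \frac{1}{9} \left( \begin{array}{rrr} 1 & 2 & -2 \\ 2 & 4 & -4 \\ -2 & -4 & 4 \end{array} \right),
$$
and indeed $P a = \frac{1}{9} (27, 54, -54) = a$.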

---
We could handle orthogonal projections onto subspaces in exactly the way we have described above:<br>
we need a basis for $\mathbb{R}^n$ that includes basis vectors for the subspace (with the remaining basis vectors chosen orthogonal to the subspace, so that the projection is orthogonal), set up the $n \times n$ matrix
from this complete set of basis vectors and compute its inverse...

The normal equation provides a shortcut: we only need the basis vectors for the subspace,
say $\{ v_1, v_2, \dots v_k\}$ where $k \lt n$, that we write into a matrix $A = ( v_1 \; v_2 \; \dots v_k )$
as columns. The projection matrix onto this subspace is
$$
P = A (A^t A)^{-1} A^t
$$
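
The formula can be tried in exact arithmetic without Sage. The sketch below is plain Python using `fractions.Fraction`; it assumes a matrix with exactly two linearly independent columns, so that $A^t A$ is an invertible $2 \times 2$ matrix, and the helper names (`transpose`, `matmul`, `inverse_2x2`) are hypothetical.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Matrix product of two lists of rows."""
    return [[sum(a * c for a, c in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix (assumes a non-zero determinant)."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("columns are linearly dependent")
    return [[d / det, -b / det], [-c / det, a / det]]

# Columns a_1 = (1, 2, -1) and a_2 = (1, -1, 1) span a plane in R^3
A = [[1, 1],
     [2, -1],
     [-1, 1]]
At = transpose(A)

# P = A (A^t A)^{-1} A^t
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)
for row in P:
    print([str(entry) for entry in row])
```

A projection matrix built this way should satisfy $P^2 = P$ and $P A = A$, which makes a convenient sanity check.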

**Example:**

In [15]:
print( "Consider the span of the following two column vectors")
a_1 = matrix(QQ,3,1, [ 1,  2,-1])
a_2 = matrix(QQ,3,1, [ 1, -1, 1])
A   = matrix(QQ,[[ 1,  2,-1],[1,-1, 1]]).transpose()
show('a1 = ', a_1, ',  a2 = ', a_2, ',  A = ', A)

print('Orthogonal projection matrix onto the plane span(a_1,a_2):')
show( 'P = ', A* (A.transpose()*A).inverse()*A.transpose())

Consider the span of the following two column vectors


Orthogonal projection matrix onto the plane span(a_1,a_2):


>Remark: if the vectors are orthogonal, the normal equation formula simplifies, since $A^t A$ is diagonal:
$$
P = \frac{1}{a_1\cdot a_1} a_1 a^t_1 + \frac{1}{a_2\cdot a_2} a_2 a^t_2 \dots + \frac{1}{a_k\cdot a_k} a_k a^t_k
$$
i.e., setting $q_i = \frac{1}{|| a_i ||} a_i$, we have
$$
P = q_1 q^t_1 + q_2 q^t_2 + \dots + q_k q^t_k
$$
Orthogonal basis vectors are nice!
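
For instance, for the orthogonal pair $a_1 = (1, -1, 1)$, $a_2 = (2, 1, -1)$ (so $a_1 \cdot a_2 = 0$, $a_1 \cdot a_1 = 3$, $a_2 \cdot a_2 = 6$), a hand computation gives
$$
P = \frac{1}{3} \left( \begin{array}{rrr} 1 & -1 & 1 \\ -1 & 1 & -1 \\ 1 & -1 & 1 \end{array} \right)
  + \frac{1}{6} \left( \begin{array}{rrr} 4 & 2 & -2 \\ 2 & 1 & -1 \\ -2 & -1 & 1 \end{array} \right)
  = \left( \begin{array}{rrr} 1 & 0 & 0 \\ 0 & \frac{1}{2} & -\frac{1}{2} \\ 0 & -\frac{1}{2} & \frac{1}{2} \end{array} \right).
$$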

**Example:**

In [16]:
a_1 = matrix(QQ,3,1, [ 1,  -1, 1]); q_1 = a_1 / sqrt(3)
a_2 = matrix(QQ,3,1, [ 2,   1,-1]); q_2 = a_2 / sqrt(6)
A   = a_1.augment(a_2)
Q   = q_1.augment(q_2)

show( 'a_1 = ', a_1, '   a_2 = ', a_2,  ',   A = ', A,  ',   Q = ', Q  ) 

print( 'Orthogonal Projection Matrix onto span{a_1}')
show( q_1*q_1.transpose())

print( 'Orthogonal Projection Matrix onto span{a_2}')
show( q_2*q_2.transpose())

print( 'Orthogonal Projection Matrix onto span{a_1,a_2} computed two ways')
show( A*(A.transpose()*A).inverse()*A.transpose(), ' = ', q_1*q_1.transpose()+q_2*q_2.transpose())

Orthogonal Projection Matrix onto span{a_1}


Orthogonal Projection Matrix onto span{a_2}


Orthogonal Projection Matrix onto span{a_1,a_2} computed two ways


# 4. Diagonalizable Matrices

Let's apply these notions to diagonalizable matrices:
Given a matrix $A$ of size $n \times n$, we can ask what effect the linear operator $y = A x$ has on $x$.

Suppose $A = V \Lambda V^{-1} = V \Lambda U$, where we again have set $U = V^{-1}$.
Partitioning $V$ into columns and $U$ into rows, this multiplies out to
$$
A = \lambda_1 v_1 u^t_1 + \lambda_2 v_2 u^t_2 + \dots \lambda_n v_n u^t_n.
$$
We now recognize the outer products $v_i u^t_i$ as projection matrices onto the eigenvectors $v_i$:
the effect of applying the matrix $A$ to the vector $x$ is to decompose $x$ along each of the eigenvectors,
and to scale each of these components by the associated eigenvalue.
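
To illustrate, the sketch below builds a matrix from the sum $\sum_i \lambda_i v_i u^t_i$ and confirms that each $v_j$ is an eigenvector. It is plain Python and reuses the $V$ of Section 1 with its inverse `U` typed in by hand; the eigenvalues are made up for the illustration.

```python
# V from Section 1 and its inverse U (written out by hand).
V = [[1, -1, -3],
     [2, -1, -7],
     [-1, 2, 3]]
U = [[11, -3, 4],
     [1, 0, 1],
     [3, -1, 1]]
eigenvalues = [2, -1, 3]   # hypothetical values chosen for this sketch
n = len(V)

v = [[V[r][j] for r in range(n)] for j in range(n)]  # columns of V
u = U                                                 # rows of U

# A = sum_i lambda_i * v_i u_i^t, assembled entry by entry
A = [[sum(eigenvalues[i] * v[i][r] * u[i][c] for i in range(n))
      for c in range(n)]
     for r in range(n)]

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# A v_j should equal lambda_j v_j, because u_i . v_j vanishes unless i == j
images = [matvec(A, v[j]) for j in range(n)]
for j in range(n):
    print(images[j], "=", eigenvalues[j], "*", v[j])
```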

When $A$ has a complete set of orthonormal eigenvectors, i.e., $A = Q \Lambda Q^t$, we have the special case
where $v_i = q_i, u_i = q_i$, and the above formula becomes
$$
A = \lambda_1 q_1 q^t_1 + \lambda_2 q_2 q^t_2 + \dots + \lambda_n q_n q^t_n,
$$
which is known as the spectral decomposition of $A$ (the spectral theorem guarantees that it exists for every real symmetric matrix).

# 5. Application Examples
Which of these formulae should we choose?

##  5.1 A = QR

The Gram-Schmidt algorithm uses the simplified orthogonal projection formula onto orthonormal vectors,
$$
P_i v_k = \left(v_k \cdot q_i \right) q_i = \left( v_k \cdot \frac{w_i}{||w_i||} \right) \frac{w_i}{||w_i||} = \frac{v_k \cdot w_i}{w_i \cdot w_i} w_i
$$
to compute the orthogonal projection of $v_k$ onto each of the previously computed orthogonal vectors $w_1, w_2, \dots w_{k-1}$,
adds up these projections to obtain the orthogonal projection of $v_k$ onto
$span \{ w_1, w_2, \dots w_{k-1} \}$, i.e.,
$$
v_{parallel} = P_1 v_k + P_2 v_k + \dots + P_{k-1} v_k,
$$
and finally finds the orthogonal component
$$
w_k = v_{perpendicular} = v_k - v_{parallel}.
$$
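
A minimal sketch of these steps in plain Python follows. It is the classical, unnormalized variant (it returns the $w_k$ rather than the $q_k$), assumes the input vectors are linearly independent, and uses exact fractions; the function name `gram_schmidt` is illustrative.

```python
from fractions import Fraction

def dot(a, c):
    return sum(p * q for p, q in zip(a, c))

def gram_schmidt(vectors):
    """Return orthogonal (not normalized) vectors w_1, ..., w_m with the same span.

    Assumes the input vectors are linearly independent.
    """
    ws = []
    for vk in vectors:
        vk = [Fraction(c) for c in vk]
        # v_parallel: sum of the projections of v_k onto each earlier w_i
        parallel = [Fraction(0)] * len(vk)
        for w in ws:
            coeff = dot(vk, w) / dot(w, w)
            parallel = [p + coeff * c for p, c in zip(parallel, w)]
        # w_k = v_k - v_parallel is orthogonal to all earlier w_i
        ws.append([c - p for c, p in zip(vk, parallel)])
    return ws

ws = gram_schmidt([[1, 2, -1], [1, -1, 1], [0, 0, 1]])
for w in ws:
    print([str(c) for c in w])
```

Dividing each $w_k$ by its norm would give the orthonormal $q_k$; the square roots are left out here so that the arithmetic stays exact.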

## 5.2 A Projection Problem (Spring 14)

In [17]:
print('Consider the following matrix')
A = matrix( QQ, 4,5,  [ 1,1,0,2,5, -1,-1,0,-2,-5,   2,2,1,3,11,  1,1,2,0,9 ] ); show('A = ', A)

Consider the following matrix


Problem: find the orthogonal projection matrices onto each of the four fundamental subspaces of $A$.

Let's find basis vectors for the spaces:

In [18]:
print('Augment A by I and find a row echelon form')
U = A.augment(identity_matrix(QQ,4)).rref()
show(U)

Augment A by I and find a row echelon form


We observe that the pivots in $A$ are in columns 1,3 and 5:
* A basis for $\mathscr{C}(A)$ is given by these columns
* A basis for $\mathscr{R}(A)$ is given by the pivot rows in the row echelon form $U$
* A basis for $\mathscr{N}(A^t)$ is given by the non-pivot rows in the augmented matrix from row echelon form $U$
* A basis for $\mathscr{N}(A)$ is given by the homogeneous solutions of $A x = 0$

In [19]:
col1 = A[:,0]; col2 = A[:,2]; col3 = A[:,4]
row1 = U[0,0:5].transpose();row2 = U[1,0:5].transpose();row3 = U[2,0:5].transpose();
show('Column Space basis = ', col1,col2,col3,',   Row Space basis = ', row1,row2,row3 )
homogeneous_solution = 2 * A.right_kernel().basis_matrix().transpose()

n1 = homogeneous_solution[:,0];n2=homogeneous_solution[:,1]
nt = U[3,5:].transpose();

show('Null Space basis for A.t = ',nt, ',  Null Space basis for A = ', n1,n2)

Note that these bases are not orthogonal: to produce orthogonal projection matrices onto these spaces,
we will use the equation $P = A ( A^t A )^{-1} A^t$, where $A$ is the matrix with the basis vectors written as columns.
> Orthogonal Projections onto the row space $\mathscr{R}(A) = \mathscr{C}(A^t)$ and the null space $\mathscr{N}(A)$
* Note that these spaces are orthogonal complements, so the two projection matrices satisfy $P_r + P_n = I$: after we compute one of them, say $P_r$, we can obtain the other by using $P_n = I - P_r$.

> Orthogonal Projections onto the column space $\mathscr{C}(A) = \mathscr{R}(A^t)$ and the null space $\mathscr{N}(A^t)$
* Note that these spaces are also orthogonal, so the same observation holds.

> **Use the matrices $A^t A$ that are smaller in size!**
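
The complement trick can be sketched in plain Python for the matrix of this problem. The null-space basis below was derived by hand from the pivot structure (free variables $x_2$ and $x_4$) and is an assumption of this sketch; it need not match the basis Sage returns, but any basis of $\mathscr{N}(A)$ gives the same projection matrix. The helper names are hypothetical.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(a * c for a, c in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix (assumes a non-zero determinant)."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 1, 0, 2, 5],
     [-1, -1, 0, -2, -5],
     [2, 2, 1, 3, 11],
     [1, 1, 2, 0, 9]]

# Hand-derived basis of N(A), stored as the columns of N (5 x 2)
N = transpose([[-1, 1, 0, 0, 0],
               [-2, 0, 1, 1, 0]])
Nt = transpose(N)

# Only a 2x2 matrix N^t N has to be inverted for the null-space projection
Pn = matmul(matmul(N, inverse_2x2(matmul(Nt, N))), Nt)

# The row-space projection follows from P_r = I - P_n
Pr = [[(1 if r == c else 0) - Pn[r][c] for c in range(5)] for r in range(5)]

print(matmul(A, Pn) == [[0] * 5 for _ in range(4)])  # A annihilates N(A)
print(matmul(A, Pr) == A)                            # rows of A are fixed by P_r
```

Computing $P_r$ directly would require inverting a $3 \times 3$ matrix instead, which is the reason for starting from the smaller basis.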

**Remark:**
* The orthogonal projection onto the span of a single vector $v$ is the orthogonal projection onto a line defined by that vector: we previously computed it in 2D and 3D by considering the projection of the unit vectors onto that line.
* A similar remark holds for the orthogonal projection onto a plane defined by the span of two linearly independent vectors $span\{ v_1, v_2 \}$ 

In [20]:
print( 'Row space(A) and Null space(A) projection matrices: there are fewer vectors in the nullspace basis')
An = n1.augment(n2)   # N(A) is 2-dimensional: use both basis vectors as columns
Pn = An * (An.transpose()*An).inverse()*An.transpose()
Pr = identity_matrix(QQ, 5)-Pn
show(An, '->  Pn = ', Pn, ', Pr = I - Pn = ', Pr)

Row space(A) and Null space(A) projection matrices: there are fewer vectors in the nullspace basis


In [21]:
print( 'Col space(A) and Null space(A.t) projection matrices: there are fewer vectors in the nullspace basis of A.t')
Atn = nt   # N(A.t) is spanned by the single vector nt found above
Ptn = Atn * (Atn.transpose()*Atn).inverse()*Atn.transpose()
Pc = identity_matrix(QQ, 4)-Ptn   # these projections act on vectors with 4 entries
show(Atn, '->  Ptn = ', Ptn, ', Pc = I - Ptn = ', Pc)

Col space(A) and Null space(A.t) projection matrices: there are fewer vectors in the nullspace basis of A.t


The rank of $A$ is the number of pivots: $rank(A) = 3$

**Remark:** Eigenvectors for the eigenvalue $\lambda = 0$ are non-zero solutions of $(A-\lambda I)x = 0$, i.e.,
non-zero vectors in the null space of a **square** matrix $A$. This matrix is not square. The null-space vectors
are eigenvectors of $A^t A$, however, since $A$ and $A^t A$ have the same null space.