In [1]:
# Orthogonality
# Author: Zhang Su (Teaching Assistant)
# Using python3, numpy
# 2 June 2020

# Learning Outcome

By the end of this material, you should be able to:

+ Determine the orthogonality of two vectors,
+ Calculate the projection of a vector onto a (unit) vector,
+ Verify orthogonal and orthonormal sets,
+ Normalize an orthogonal set to an orthonormal set,
+ Carry out orthogonal decomposition in both a column-wise and a matrix-operation manner.

Note: 
1. If you accidentally double-click a text cell, the display changes to Markdown source code. To revert, click anywhere in that Markdown cell and then click **Run** in the top menu.
2. Sometimes the notebook may stop responding. That is caused by a failure of the Jupyter kernel. To repair it, click **Kernel** in the top menu, then click **Reconnect**.
3. Section Takeaways summarizes useful tips, e.g., Python pitfalls to avoid, if any.
4. Section Practice reflects the learning outcomes. You are expected to solve the exercises based on your understanding of the lecture notes along with the coding skills learned from this demo.

# Table of contents <a name="Table_of_Content"></a>
+ 6.2.1 [Orthogonality Definition](#Orthogonality_Definition) 
+ 6.2.2 [Orthogonal Projection](#Orthogonal_Projection) 
+ 6.2.3 [Orthogonal Sets](#Orthogonal_Sets) 
+ 6.2.4 [Orthonormal Sets](#Orthonormal_Sets) 
+ 6.2.5 [Orthogonal Decomposition](#Orthogonal_Decomposition) 
+ [Takeaways](#Takeaways)
+ [Practice](#Practice)

Let's import the libraries.

In [2]:
import numpy as np
from numpy import linalg as la

### Orthogonality Definition<a name="Orthogonality_Definition"></a>
[Return to Table of Content](#Table_of_Content)

**This part of code is for Section 6.2.1.**

To determine orthogonality, simply calculate the dot product of the two given vectors. A dot product of $0$ means that the two vectors are orthogonal.

The following two vectors are not orthogonal since their dot product does not equal $0$.

In [3]:
u = np.random.rand(3,1)
v = np.random.rand(3,1)

ortho_uv = np.dot(u.T,v)
print('The dot product of u and v =\n',ortho_uv)

The dot product of u and v =
 [[0.43534345]]
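
For contrast, here is a minimal sketch with a hand-picked orthogonal pair. The vectors `p` and `q` and the use of `np.isclose` are illustrative choices, not part of the original demo; comparing against zero with a tolerance is generally safer than `==` for floating-point results.

```python
import numpy as np

# Hypothetical column vectors chosen so that their dot product is exactly 0.
p = np.array([[1.0, 2.0, 3.0]]).T
q = np.array([[3.0, 0.0, -1.0]]).T

# np.dot returns a 1-by-1 array here; .item() extracts the scalar.
dot_pq = np.dot(p.T, q).item()
is_orthogonal = np.isclose(dot_pq, 0.0)

print('The dot product of p and q =', dot_pq)
print('Are p and q orthogonal?', is_orthogonal)
```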


### Orthogonal Projection<a name="Orthogonal_Projection"></a>
[Return to Table of Content](#Table_of_Content)

**This part of code is for Section 6.2.2.**

Given a basis vector $a$, the projection of $u$ onto $a$ is formulated as:

$$
\operatorname{proj}_a u = w_1 = \frac{u\cdot a}{\|a\|^2}a, \tag{1}
$$

or, if $a$ is already a unit vector:

$$
\operatorname{proj}_a u = w_1 = (u\cdot a)a. \tag{2}
$$

In [4]:
# Initialize u and a; don't forget the dtype!
u = np.array([[2, -1, 3]], dtype=float)
a = np.array([[4, -1, 2]], dtype=float)

# Normalize a
a_unit = a / la.norm(a)

# Eq. 1
w1 = u.dot(a.T) / (la.norm(a) ** 2) * a

# Eq. 2
w1_2 = u.dot(a_unit.T) * a_unit

print("When a is not a unit vector, the projected component is\n", w1)
print("\nWhen a is a unit vector, the projected component is\n", w1_2)

# Verify the equivalence
print("\nAre they equal?", np.allclose(w1, w1_2))

When a is not a unit vector, the projected component is
 [[ 2.85714286 -0.71428571  1.42857143]]

When a is a unit vector, the projected component is
 [[ 2.85714286 -0.71428571  1.42857143]]

Are they equal? True
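
The `dtype=float` reminder matters most when values are written back into an existing array. The sketch below shows one such pitfall, on the assumption that this is the kind of issue the comment warns about; the array names `int_vec` and `float_vec` are illustrative.

```python
import numpy as np

# Without dtype=float, NumPy infers an integer array from integer literals.
int_vec = np.array([[2, -1, 3]])
float_vec = np.array([[2, -1, 3]], dtype=float)

# Writing a fractional value into an integer array silently truncates it.
int_vec[0, 0] = 2.7
float_vec[0, 0] = 2.7

print(int_vec.dtype, int_vec)      # integer dtype, first entry stored as 2
print(float_vec.dtype, float_vec)  # float64, first entry stored as 2.7
```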


Once $w_1$, the vector component of $u$ along $a$, is obtained, we can further obtain the vector component of $u$ orthogonal to $a$ by:

$$
w_2 = u - w_1 \tag{3}
$$

$w_2$ is the *residual* when $a$ is used to approximate $u$. It is orthogonal to $a$, and therefore also to $w_1$, so the Pythagorean theorem holds:

$$
\|u\|^2 = \|w_1\|^2 + \|w_2\|^2. \tag{4}
$$

In [5]:
# Eq. 3
w2 = u - w1

print("The residual of the projection is: \n", w2)

# Eq. 4 is left for practice.

The residual of the projection is: 
 [[-0.85714286 -0.28571429  1.57142857]]
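
The claim that the residual is orthogonal to $a$ can be checked numerically. The following is a minimal sketch that recomputes `w1` and `w2` so that it runs on its own; `residual_dot_a` is a name introduced here for illustration.

```python
import numpy as np
from numpy import linalg as la

u = np.array([[2, -1, 3]], dtype=float)
a = np.array([[4, -1, 2]], dtype=float)

# Recompute the two components so that this cell is self-contained.
w1 = u.dot(a.T) / (la.norm(a) ** 2) * a   # component along a (Eq. 1)
w2 = u - w1                               # residual (Eq. 3)

# The residual should have a (numerically) zero dot product with a.
residual_dot_a = w2.dot(a.T).item()
print("w2 . a =", residual_dot_a)
print("Is w2 orthogonal to a?", np.isclose(residual_dot_a, 0.0))
```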


### Orthogonal Sets<a name="Orthogonal_Sets"></a>
[Return to Table of Content](#Table_of_Content)

**This part of code is for Section 6.2.3.**

An orthogonal set $\{u_1, u_2, \ldots, u_n\}$ is a collection of vectors satisfying $u_i\cdot u_j=0$, $i\neq j$.

We have two ways to verify the orthogonality of a set using Python, i.e., in a column-wise manner or using matrix multiplication.

In [6]:
U = np.array([[3,-1,-1/2],
             [1,2,-2],
             [1,1,7/2]]) # Make sure that there are double square brackets.

print('The initialized U values =\n',U)

The initialized U values =
 [[ 3.  -1.  -0.5]
 [ 1.   2.  -2. ]
 [ 1.   1.   3.5]]


#### Column-wise 

Following the orthogonality definition, we can separately calculate the dot product of two columns.

In [7]:
col1_col2 = U[:, 0].T.dot(U[:, 1])
col1_col3 = U[:, 0].T.dot(U[:, 2])
col2_col3 = U[:, 1].T.dot(U[:, 2])

print("The dot product of u1 and u2 is", col1_col2)
print("The dot product of u1 and u3 is", col1_col3)
print("The dot product of u2 and u3 is", col2_col3)

The dot product of u1 and u2 is 0.0
The dot product of u1 and u3 is 0.0
The dot product of u2 and u3 is 0.0
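
For a matrix with more columns, writing each pair by hand becomes tedious. As a sketch, the same column-wise check can loop over every pair with `itertools.combinations`; the dictionary `pairwise_dots` is an illustrative name, not part of the original demo.

```python
from itertools import combinations

import numpy as np

U = np.array([[3, -1, -1/2],
              [1, 2, -2],
              [1, 1, 7/2]])

# Visit every unordered pair of column indices (i, j) with i < j.
pairwise_dots = {}
for i, j in combinations(range(U.shape[1]), 2):
    pairwise_dots[(i, j)] = U[:, i].dot(U[:, j])
    print(f"The dot product of u{i + 1} and u{j + 1} is", pairwise_dots[(i, j)])

# The set is orthogonal if every pairwise dot product is (numerically) zero.
all_orthogonal = all(np.isclose(d, 0.0) for d in pairwise_dots.values())
print("Is the set orthogonal?", all_orthogonal)
```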


#### Matrix Multiplication

Alternatively, we can check all pairs in one go with a matrix operation.

We will use `np.dot()`. Note that to check the orthogonality of the columns, we should use `np.dot(U.T, U)` instead of `np.dot(U, U.T)`, because the latter checks the orthogonality of the rows.

In [8]:
orthogonality_check = np.dot(U.T, U)
orthogonality_check_2 = np.dot(U, U.T)

print('Orthogonality check for columns =\n',orthogonality_check)
print('Orthogonality check for rows =\n',orthogonality_check_2)

Orthogonality check for columns =
 [[11.   0.   0. ]
 [ 0.   6.   0. ]
 [ 0.   0.  16.5]]
Orthogonality check for rows =
 [[10.25  2.    0.25]
 [ 2.    9.   -4.  ]
 [ 0.25 -4.   14.25]]


In `orthogonality_check`, we actually computed
$$
U^TU=\begin{pmatrix}
\|u_1\|^2 & u_1^Tu_2 & u_1^Tu_3 \\
u_2^Tu_1 & \|u_2\|^2 & u_2^Tu_3 \\
u_3^Tu_1 & u_3^Tu_2 & \|u_3\|^2
\end{pmatrix}.
$$

We see that the output `orthogonality_check` is a diagonal matrix. The entry in the $i$-th row and $j$-th column is the dot product of the $i$-th and $j$-th columns of $U$, so the result is always a symmetric matrix. If all off-diagonal entries are zero, then the columns of $U$ are mutually orthogonal and form an orthogonal set.

If we treat a vector operation as the basic unit of work, then computing $U^TU$ has $\mathcal{O}(n^2)$ complexity, where $n$ denotes the number of columns. (For big O notation, please see [this link](https://en.wikipedia.org/wiki/Big_O_notation).) Indeed, $U$ has 3 columns, and we performed 6 dot products and 3 squared-norm computations, 9 computations in total, which equals the number of entries in $U^TU$.

`orthogonality_check_2`, on the other hand, shows that the rows of `U` are not an orthogonal set.
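
Reading the off-diagonal entries by eye does not scale, so the check can be automated. The helper below is a hypothetical sketch (the name `has_orthogonal_columns` is not from the lecture notes): it compares $M^TM$, often called the Gram matrix, with its own diagonal part.

```python
import numpy as np

def has_orthogonal_columns(M):
    """Return True if the columns of M are mutually orthogonal (hypothetical helper)."""
    gram = np.dot(M.T, M)
    # Keep only the diagonal of the Gram matrix and compare it with the full matrix.
    diagonal_part = np.diag(np.diag(gram))
    return bool(np.allclose(gram, diagonal_part))

U = np.array([[3, -1, -1/2],
              [1, 2, -2],
              [1, 1, 7/2]])

columns_ok = has_orthogonal_columns(U)    # checks the columns of U
rows_ok = has_orthogonal_columns(U.T)     # checks the rows of U
print('Columns orthogonal?', columns_ok)  # True
print('Rows orthogonal?', rows_ok)        # False
```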

### Orthonormal Sets<a name="Orthonormal_Sets"></a>
[Return to Table of Content](#Table_of_Content)

**This part of code is for Section 6.2.4.**

To make `U` orthonormal, we divide each column of `U` by its length. We can even do this "pythonically" in a single line of code with a list comprehension ([List Comprehensions](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions)).

It is worth noting that the result may contain extremely small values such as `5.55111512e-17`, which can safely be treated as `0`. To verify, we can use `np.allclose()` to check whether the resulting matrices are approximately equal to an identity matrix of the same size.

In [9]:
# Column-wise normalization
orthonormal_U = []
for column in U.T:
    unit_column = column / la.norm(column)
    orthonormal_U.append(unit_column)
orthonormal_U = np.column_stack(orthonormal_U)

# Column-wise normalization with Python List Comprehensions.
orthonormal_U_2 = np.column_stack([column/la.norm(column) for column in U.T])


print('Orthonormal S =\n',orthonormal_U)
print('Orthonormal S_2 =\n',orthonormal_U_2)
print('\n')
orthonormal_check = np.dot(orthonormal_U.T, orthonormal_U)
orthonormal_check_2 = np.dot(orthonormal_U_2.T, orthonormal_U_2)

print('Orthonormality check for S =\n',orthonormal_check)
print('Orthonormality check for S_2 =\n',orthonormal_check_2)
print('\n')

# Verify if the 3-by-3 UTU approximates a 3-by-3 identity matrix
print('Orthonormality holds?', np.allclose(orthonormal_check, np.eye(3)))
print('Orthonormality holds?', np.allclose(orthonormal_check_2, np.eye(3)))

Orthonormal S =
 [[ 0.90453403 -0.40824829 -0.12309149]
 [ 0.30151134  0.81649658 -0.49236596]
 [ 0.30151134  0.40824829  0.86164044]]
Orthonormal S_2 =
 [[ 0.90453403 -0.40824829 -0.12309149]
 [ 0.30151134  0.81649658 -0.49236596]
 [ 0.30151134  0.40824829  0.86164044]]


Orthonormality check for S =
 [[1.00000000e+00 7.68155897e-19 3.00799601e-17]
 [7.68155897e-19 1.00000000e+00 2.69677331e-17]
 [3.00799601e-17 2.69677331e-17 1.00000000e+00]]
Orthonormality check for S_2 =
 [[1.00000000e+00 7.68155897e-19 3.00799601e-17]
 [7.68155897e-19 1.00000000e+00 2.69677331e-17]
 [3.00799601e-17 2.69677331e-17 1.00000000e+00]]


Orthonormality holds? True
Orthonormality holds? True
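
A third option, sketched here as an alternative rather than as the method of the lecture notes, avoids the Python loop entirely: `la.norm` with `axis=0` returns one length per column, and broadcasting divides each column by its own length. The names `column_norms` and `orthonormal_U_3` are illustrative.

```python
import numpy as np
from numpy import linalg as la

U = np.array([[3, -1, -1/2],
              [1, 2, -2],
              [1, 1, 7/2]])

# One Euclidean length per column, shape (3,).
column_norms = la.norm(U, axis=0)

# Broadcasting divides every entry by the norm of its own column.
orthonormal_U_3 = U / column_norms

gram = orthonormal_U_3.T.dot(orthonormal_U_3)
print('Orthonormality holds?', np.allclose(gram, np.eye(3)))
```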


### Orthogonal Decomposition<a name="Orthogonal_Decomposition"></a>
[Return to Table of Content](#Table_of_Content)

**This part of code is for Section 6.2.5.**

Given a subspace $W$ spanned by an orthogonal basis $\{u_1, u_2, \ldots, u_p\}$ and an arbitrary vector $y$, the orthogonal projection of $y$ onto $W$ is formulated as:

$$
\hat{y} = \frac{y\cdot u_1}{u_1\cdot u_1}u_1 + \cdots + \frac{y\cdot u_p}{u_p\cdot u_p}u_p, \tag{5}
$$

if the basis is **not** an orthonormal basis, or

$$
\hat{y} = (y\cdot u_1)u_1 + \cdots + (y\cdot u_p)u_p, \tag{6}
$$

if the basis **is** an orthonormal basis.

After the projection $\hat{y}$ is obtained, the residual $z$ which is orthogonal to $W$ can be obtained by:

$$
z = y - \hat{y} \tag{7}
$$

By the Best Approximation Theorem, $\hat{y}$ is the closest point in $W$ to $y$:

$$
\|y-\hat{y}\|<\|y-u\| \quad \text{for every } u \in W \text{ with } u \neq \hat{y}, \tag{8}
$$

and the approximation error is $\|y-\hat{y}\| = \|z\|$. Let's see the example below.



![title](img/projection_problem.png)

We solve it in two equivalent ways. One works column by column, and the other uses the projection matrix.

In [10]:
# Initialization
u1, u2, y = np.array([[-7,1,4]]).T, np.array([[-1,1,-2]]).T, np.array([[-9,1,6]]).T

# Pay attention to how to stack the column vectors u1 and u2.
W = np.column_stack((u1, u2))

#### Column-wise

The column-wise method first calculates the coefficients of the projections of $y$ onto $u_1$ and $u_2$, and then combines them into the projection $\hat{y}$ of $y$ onto $W$, i.e.:

$$
\begin{align}
p_1 &= \frac{u_1\cdot y}{u_1\cdot u_1} \tag{9}\\
p_2 &= \frac{u_2\cdot y}{u_2\cdot u_2} \tag{10}\\
\hat{y} &= p_1 u_1 + p_2 u_2  \tag{11}\\
error &= \|y -  \hat{y}\| \tag{12}
\end{align}
$$

where $p_1$ and $p_2$ denote the scalar coefficients (weights) of the projections of $y$ onto $u_1$ and $u_2$, respectively.

In [11]:
orthogonality_check_of_W = np.dot(W.T, W)
orthogonality_check_of_u1_u2 = np.dot(u1.T, u2)

print('Orthogonality check for U =\n',orthogonality_check_of_W)
print('Orthogonality check for u1 and u2 =\n',orthogonality_check_of_u1_u2)

# Equations 9 and 10
p1 = np.dot(y.T, u1) / np.dot(u1.T,  u1) # Both operands are 1-by-1 arrays, so the result is 1-by-1 as well.
p2 = np.dot(y.T, u2) / np.dot(u2.T,  u2)

# Equation 11
y_hat = u1 * p1 + u2 * p2

# Equation 12
error = la.norm(y - y_hat)

print('The estimated y =\n',y_hat)
print('The error  =\n',error)

Orthogonality check for U =
 [[66  0]
 [ 0  6]]
Orthogonality check for u1 and u2 =
 [[0]]
The estimated y =
 [[-9.]
 [ 1.]
 [ 6.]]
The error  =
 1.7763568394002505e-15
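
Eq. 5 has one term per basis vector, so the two hand-written coefficients generalize naturally to a loop. The function below is a hypothetical sketch (`project_onto_orthogonal_basis` is not a NumPy function), and it assumes that the vectors passed in are mutually orthogonal.

```python
import numpy as np

def project_onto_orthogonal_basis(y, basis):
    """Project y onto span(basis) via Eq. 5; assumes the basis vectors are mutually orthogonal."""
    y_hat = np.zeros_like(y, dtype=float)
    for u in basis:
        # Scalar coefficient (y . u) / (u . u) for this basis vector.
        coefficient = np.dot(y.T, u).item() / np.dot(u.T, u).item()
        y_hat += coefficient * u
    return y_hat

u1 = np.array([[-7, 1, 4]]).T
u2 = np.array([[-1, 1, -2]]).T
y = np.array([[-9, 1, 6]]).T

y_hat_loop = project_onto_orthogonal_basis(y, [u1, u2])
print('The estimated y =\n', y_hat_loop)
```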


#### Projection Matrix

The projection matrix method first normalizes $u_1$ and $u_2$ to form an orthonormal set, and then calculates the projection matrix as $UU^T$, i.e.,

$$
\begin{align}
u_1' &= \frac{u_1}{\|u_1\|} \tag{13}\\
u_2' &= \frac{u_2}{\|u_2\|} \tag{14}\\
U &= [u_1' \;\; u_2'] \tag{15} \\
\hat{y} &= UU^Ty \tag{16} \\
error &= \|y - \hat{y}\| \tag{17}
\end{align}
$$

In [14]:
# Equations 13, 14 and 15, using a Python list comprehension
orthonormal_W = np.column_stack([column/la.norm(column) for column in W.T])

## Equations 13, 14 and 15, alternatively.
# orthonormal_W = []
# for column in W.T:
#     unit_column = column / la.norm(column)
#     orthonormal_W.append(unit_column)
# orthonormal_W = np.column_stack(orthonormal_W)

# Equation 16
projection_matrix = np.dot(orthonormal_W, orthonormal_W.T)
print(projection_matrix)
y_hat = np.dot(projection_matrix, y)
# Equation 17
error = la.norm(y - y_hat)

print('The orthonormal matrix W\' =\n',orthonormal_W)
print('\nCheck the orthonormality.', np.allclose(orthonormal_W.T.dot(orthonormal_W), np.eye(orthonormal_W.shape[1])))

print('\nThe estimated y =\n',y_hat)
print('\nThe error  =\n',error)

[[ 0.90909091 -0.27272727 -0.09090909]
 [-0.27272727  0.18181818 -0.27272727]
 [-0.09090909 -0.27272727  0.90909091]]
The orthonormal matrix W' =
 [[-0.86164044 -0.40824829]
 [ 0.12309149  0.40824829]
 [ 0.49236596 -0.81649658]]

Check the orthonormality. True

The estimated y =
 [[-9.]
 [ 1.]
 [ 6.]]

The error  =
 2.220446049250313e-16
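
In this example the error is zero up to rounding because $y = \frac{4}{3}u_1 - \frac{1}{3}u_2$ already lies in $W$. To see the Best Approximation Theorem at work, the sketch below uses a hypothetical vector `y_out` that is not in $W$ and compares the projection error with the distances to 1000 random points of $W$; the random seed and the sample size are arbitrary choices.

```python
import numpy as np
from numpy import linalg as la

u1 = np.array([[-7, 1, 4]]).T
u2 = np.array([[-1, 1, -2]]).T
W = np.column_stack((u1, u2))

# Normalize the columns and build the projection matrix (Eq. 13 to 16).
orthonormal_W = W / la.norm(W, axis=0)
projection_matrix = orthonormal_W.dot(orthonormal_W.T)

# Hypothetical vector that does not lie in W.
y_out = np.array([[1.0, 2.0, 3.0]]).T
y_out_hat = projection_matrix.dot(y_out)
z = y_out - y_out_hat
best_error = la.norm(z)

# Distances from y_out to 1000 random points of W (each column is one point).
rng = np.random.default_rng(0)
random_points = W.dot(rng.standard_normal((2, 1000)))
other_errors = la.norm(y_out - random_points, axis=0)

print('Error of the projection:', best_error)
print('Smallest error among random points:', other_errors.min())
print('Is z orthogonal to W?', np.allclose(W.T.dot(z), 0.0))
```

None of the random points should come closer to `y_out` than the projection does, and the residual `z` should be orthogonal to both basis vectors.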


### Takeaways<a name="Takeaways"></a>
[Return to Table of Content](#Table_of_Content)

1. Pay attention to how to stack column vectors to a matrix, i.e., using `np.column_stack((u1, u2, u3))` or `np.c_[u1, u2, u3]`.
2. [List comprehensions](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions) are sublimely elegant!
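
As a quick illustration of the first takeaway, the sketch below confirms that `np.column_stack` and `np.c_` give the same matrix for column vectors, and contrasts them with `np.vstack`, which stacks along rows instead. The variable names are illustrative.

```python
import numpy as np

u1 = np.array([[-7, 1, 4]]).T   # shape (3, 1)
u2 = np.array([[-1, 1, -2]]).T  # shape (3, 1)

stacked = np.column_stack((u1, u2))   # columns side by side
concatenated = np.c_[u1, u2]          # same result with index syntax
row_stacked = np.vstack((u1, u2))     # stacks along rows instead

print(stacked.shape, concatenated.shape, row_stacked.shape)  # (3, 2) (3, 2) (6, 1)
print(np.array_equal(stacked, concatenated))                 # True
```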

### Practice<a name="Practice"></a>
[Return to Table of Content](#Table_of_Content)

**This part can get you ready for the lab!**

Given $A=\begin{pmatrix}
6 & -6 & -6 \\
6 & -6 & 6 \\
6 & 6  & -6 \\
6 & 6  & 6 
\end{pmatrix}$

1. Generate a new matrix $U$ by normalizing each column of $A$ and verify the orthonormality.
2. Given a vector $y=(1,2,3,4)$, carry out the orthogonal decomposition of $y$ onto $Col\,\,U$ in two ways, i.e., following Eq. 9, 10, 11, 12 AND Eq. 13, 14, 15, 16, 17, respectively.
3. What's the error of the projection in 2?
4. Examine the Pythagoras Theorem between $z$, $u_i$, and $y$ following Eq. 4.
5. Find the closest point to $x=(1,0,0,2)$ in $Col\,\,U$.