# ORTHOGONAL MATRICES AND SUBSPACES

The word orthogonal appears everywhere in linear algebra.

It means **perpendicular**.

Its use extends far beyond the angle between two vectors.

Here are important extensions of that key idea:

**1-) Orthogonal Vectors $x$ and $y$:**

The test is $x^T y$ = $x_1 y_1$ + ... + $x_n y_n$ = 0

If $x$ and $y$ have complex components, change to $\bar{x}^T y$ = $\bar{x}_1 y_1$ + ... + $\bar{x}_n y_n$ = 0 
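The two tests above can be sketched in a few lines of plain Python. The helper name `inner` and the sample vectors are illustrative choices, not part of the original notes. The complex pair shows why the conjugate matters: without it the sum comes out as 2 instead of 0.

```python
def inner(x, y):
    """Conjugated inner product: the sum of conj(x_k) * y_k.

    Hypothetical helper for illustration. For real vectors the
    conjugate changes nothing, so this is the ordinary x^T y.
    """
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

# Real example: 1*2 + 2*1 + 2*(-2) = 0, so x and y are orthogonal.
x = [1, 2, 2]
y = [2, 1, -2]
real_test = inner(x, y)

# Complex example: u and w pass the test only when u is conjugated.
u = [1, 1j]
w = [1, -1j]
conjugated = inner(u, w)                          # 1*1 + (-i)(-i) = 1 - 1 = 0
unconjugated = sum(a * b for a, b in zip(u, w))   # 1*1 + (i)(-i)  = 1 + 1 = 2
```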

**2-) Orthogonal Basis for a Subspace:**

Every pair of distinct basis vectors has $v_i ^T v_j$ = 0 (for $i \neq j$)

**Orthonormal Basis:** Orthogonal basis of **unit vectors:** every $v_i ^T v_i$ = 1 (length 1)

From orthogonal to orthonormal, just divide every basis vector $v_i$ by its length $||v_i||$
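As a small sketch of that normalization step (the vectors and the helper names `normalize` and `dot` are assumptions made for illustration), an orthogonal pair of length-3 vectors becomes an orthonormal pair:

```python
import math

def dot(x, y):
    """Ordinary dot product x^T y (illustrative helper)."""
    return sum(a * b for a, b in zip(x, y))

def normalize(v):
    """Divide v by its length ||v|| so the result has length 1."""
    length = math.sqrt(dot(v, v))
    return [c / length for c in v]

# An orthogonal (not yet orthonormal) pair: both vectors have length 3.
v1 = [1, 2, 2]
v2 = [2, 1, -2]

q1, q2 = normalize(v1), normalize(v2)
```

Dividing by the length does not change the direction, so `q1` and `q2` stay orthogonal while each now satisfies $q^T q$ = 1.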

**3-) Orthogonal Subspaces R and N:**

Every vector in the space $R$ is orthogonal to every vector in $N$.

Notice Again! The row space and nullspace are orthogonal:

$$

\\

Ax=0 \:\:means\:each\:row \cdot x=0

\\

\begin{bmatrix}

Row\:1\:of\:A \\
...\\
...\\
Row\:m\:of\:A\\

\end{bmatrix}

*

\begin{bmatrix}

 \\
x\\
\\
\\

\end{bmatrix}

=

\begin{bmatrix}

0 \\
...\\
...\\
0\\

\end{bmatrix}
$$

Every row (and every combination of rows) is orthogonal to all $x$ in the nullspace.
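A concrete check may help here. The matrix `A` and the nullspace vector `x` below are made-up examples, not taken from the notes. Each row of `A` has zero dot product with `x`, and so does a combination of the rows:

```python
def dot(x, y):
    """Ordinary dot product x^T y (illustrative helper)."""
    return sum(a * b for a, b in zip(x, y))

# Example matrix chosen for illustration; x = (1, -2, 1) solves A x = 0.
A = [[1, 2, 3],
     [4, 5, 6]]
x = [1, -2, 1]

# One dot product per row: these are the components of A x.
row_products = [dot(row, x) for row in A]

# A combination of the rows, here 2*(row 1) - 3*(row 2), stays orthogonal to x.
combo = [2 * a - 3 * b for a, b in zip(A[0], A[1])]
combo_product = dot(combo, x)
```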

**4-) Tall Thin Matrices $Q$ with Orthonormal Columns: $Q^T Q$ = $I$.**

$$

Q^T Q

=

\begin{bmatrix}

--- & q^T _1 & --- \\
&...&\\
&...&\\
--- & q^T _n & --- \\

\end{bmatrix}

*

\begin{bmatrix}

 &  &  \\
q_1&...&q_n\\
&  &  \\
&  & \\

\end{bmatrix}

=

\begin{bmatrix}

 1 & 0 & 0 \\
 0 & 1 & 0 \\
 0 & 0 & 1 \\


\end{bmatrix}

=

I

$$

If this $Q$ multiplies any vector $x$, the length of the vector does not change:

$||Qx||$ = $||x||$ because $(Qx)^T (Qx)$ = $x^T Q^T Q x$ = $x^T x$

If m > n the $m$ rows cannot be orthogonal in $R^n$.

Tall thin matrices have $Q Q^T \neq I$
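The three claims in this item can be checked on a small example. The 3 by 2 matrix below is an assumed illustration (its columns are (1,2,2)/3 and (2,1,-2)/3), and the helpers are hypothetical names written for this sketch:

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Matrix product of two lists-of-rows (illustrative helper)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

# Tall thin Q (m = 3, n = 2) with orthonormal columns.
Q = [[1 / 3,  2 / 3],
     [2 / 3,  1 / 3],
     [2 / 3, -2 / 3]]
Qt = transpose(Q)

QtQ = matmul(Qt, Q)   # 2 by 2: the identity, up to rounding
QQt = matmul(Q, Qt)   # 3 by 3: not the identity

# Multiplying by Q preserves length: x = (3, -4) has length 5, and so does Qx.
x = [3.0, -4.0]
Qx = [dot_row[0] * x[0] + dot_row[1] * x[1] for dot_row in Q]
```

Here the top-left entry of `QQt` is 5/9 rather than 1, which is one way to see that $Q Q^T$ is not $I$ for this tall matrix.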

**5-) Orthogonal Matrices are square with orthonormal columns: $Q^T$ = $Q^{-1}$**

**For square matrices $Q^T Q=I$ leads to $QQ^T$= $I$**

For square matrices $Q$, the left inverse $Q^T$ is also a right inverse of $Q$.

The columns of this orthogonal $n$ by $n$ matrix are an orthonormal basis for $R^n$.

The rows of $Q$ are a (probably different) orthonormal basis for $R^n$.

The name orthogonal matrix should really be orthonormal matrix.
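A 2 by 2 rotation is a convenient square example. This is a sketch with an arbitrarily chosen angle, not an example from the notes. Both products give the identity, and the rows form a different orthonormal basis from the columns:

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Matrix product of two lists-of-rows (illustrative helper)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

theta = math.pi / 6                  # 30 degrees; any angle would do
c, s = math.cos(theta), math.sin(theta)

# Rotation matrix: columns (c, s) and (-s, c), rows (c, -s) and (s, c).
Q = [[c, -s],
     [s,  c]]
Qt = transpose(Q)

QtQ = matmul(Qt, Q)   # left inverse check:  Q^T Q = I
QQt = matmul(Q, Qt)   # right inverse check: Q Q^T = I
```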

We will now see examples of orthogonal **vectors**, **bases**, **subspaces** and **matrices**.

## ORTHOGONAL VECTORS

The test $x^T y$ = 0 connects to right triangles by $c^2 = a^2 + b^2$:

**Pythagoras Law for right triangles: $||x-y||^2$ = $||x||^2$ + $||y||^2$** 

The left side is $(x-y)^T (x-y)$.

This expands to $x^T x$ + $y^T y$ - $x^T y$ - $y^T x$.

When the last two terms are zero, we have the equation $||x-y||^2$ = $||x||^2$ + $||y||^2$:

$x$= (1,2,2) and $y$ = (2,1,-2) have $x^T y$=0

The hypotenuse is $x-y$ = (-1,1,4).

Then Pythagoras is 18 = 9 + 9.

Dot products $x^T y$ and $y^T x$ always equal $||x||\,||y||\cos\theta$, where $\theta$ is the angle between $x$ and $y$.

So in all cases we have the Law of Cosines $c^2 = a^2 + b^2 - 2ab\cos\theta$:

**LAW OF COSINES: $||x-y||^2$ = $||x||^2$ + $||y||^2$ - $2\,||x||\,||y||\cos\theta$**

Orthogonal vectors have $\cos\theta = 0$ and that last term disappears.
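Both laws can be checked numerically. The sketch below reuses the orthogonal pair from the text and adds a second pair, `p` and `q`, built with a known 60 degree angle between them. That second pair and the helper names are assumptions made for illustration.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm_sq(v):
    """Squared length ||v||^2 = v^T v."""
    return dot(v, v)

# Orthogonal pair from the text: Pythagoras reads 18 = 9 + 9.
x = [1, 2, 2]
y = [2, 1, -2]
diff = [a - b for a, b in zip(x, y)]                 # the hypotenuse x - y
pythagoras = (norm_sq(diff), norm_sq(x), norm_sq(y))

# Non-orthogonal pair: p has length 1, q has length 2, angle theta between them.
theta = math.pi / 3
p = [1.0, 0.0]
q = [2 * math.cos(theta), 2 * math.sin(theta)]

lhs = norm_sq([a - b for a, b in zip(p, q)])         # ||p - q||^2
rhs = norm_sq(p) + norm_sq(q) - 2 * 1 * 2 * math.cos(theta)
```

For the second pair both sides come out to 3 (up to rounding), since 1 + 4 - 2 = 3.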

## ORTHOGONAL BASIS

The standard basis is orthogonal (even orthonormal) in $R^n$:

Standard basis $i,j,k$ in $R^3$:

$$

i 

=

\begin{bmatrix}

 1 \\
 0  \\
 0  \\

\end{bmatrix}

j

=

\begin{bmatrix}

 0 \\
 1  \\
 0  \\

\end{bmatrix}

k

=

\begin{bmatrix}

 0 \\
 0  \\
 1  \\

\end{bmatrix}

$$

Here are three Hadamard matrices $H_2$, $H_4$, $H_8$ containing orthogonal bases of $R^2$, $R^4$, $R^8$.

$$

Hadamard\:Matrices \\

Orthogonal\:columns\:sizes\:2,\:4,\:8.

\\

\begin{bmatrix}

 1 & 1 \\
 1 & -1 \\
 

\end{bmatrix}

\begin{bmatrix}

 1 & 1 &  1 & 1 \\
 1 & -1 &  1 & -1 \\
 1 & 1 &  -1 & -1 \\
 1 & -1 &  -1 & 1 \\

\end{bmatrix}

\begin{bmatrix}

 H_4 & H_4 \\
 H_4 & -H_4 \\
 

\end{bmatrix}
$$

Are those orthogonal matrices?

No.

The columns have lengths $\sqrt{2}$, $\sqrt{4}$, $\sqrt{8}$.

If we divide by those lengths, we have the beginning of an infinite list: orthonormal bases in 2,4,8,16,32,... dimensions.

The **Hadamard Conjecture** proposes that there is a ±1 matrix with orthogonal columns whenever 4 divides $n$.

Wikipedia says that $n$ = 668 is the smallest of those sizes without a known Hadamard matrix.

The construction for $n$= 16,32,... follows the pattern above.
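The doubling pattern (often called Sylvester's construction) is easy to sketch in code. The function names `double` and `gram` are hypothetical helpers for this illustration. The check is that $H^T H$ equals $n$ times the identity, so the columns are orthogonal with length $\sqrt{n}$.

```python
def double(H):
    """Build the block matrix [[H, H], [H, -H]] from a square +1/-1 matrix H."""
    top = [row + row for row in H]
    bottom = [row + [-v for v in row] for row in H]
    return top + bottom

def gram(H):
    """Return H^T H: the dot products between every pair of columns."""
    cols = list(zip(*H))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

H2 = [[1, 1],
      [1, -1]]
H4 = double(H2)
H8 = double(H4)
H16 = double(H8)

# Orthogonal columns of length sqrt(16) = 4: the Gram matrix is 16 * I.
G16 = gram(H16)
```

Dividing `H16` by 4 would give an orthogonal matrix. This construction only reaches powers of 2; the other multiples of 4 in the Hadamard Conjecture need different methods.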

Here is the key fact:

**Every subspace of $R^n$ has an orthogonal basis.**

Think of a plane in three dimensional space $R^3$.

The plane has two independent vectors $a$ and $b$.

For an orthogonal basis, subtract away from b its component in the direction of $a$.

$$

Orthogonal\:Basis\:a\:and\:c: \\

c

= 

b

-

\frac{a^T b}{a^T a}

*

a


$$

The inner product $a^T c$ is $a^T b$ - $a^T b$ = 0.

This idea of orthogonalizing applies to any number of basis vectors: **a basis becomes an orthogonal basis.**
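Repeating the subtraction step for each new vector gives the Gram-Schmidt idea. The sketch below is one minimal way to write it. The function name `orthogonalize` and the three sample vectors are assumptions, and exact fractions are used so the dot products come out as exactly zero.

```python
from fractions import Fraction

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def orthogonalize(basis):
    """Gram-Schmidt without normalization.

    Assumes the input vectors are independent. From each new vector b,
    subtract its component (a^T b / a^T a) * a along every earlier
    orthogonal vector a.
    """
    result = []
    for b in basis:
        c = [Fraction(v) for v in b]
        for a in result:
            coeff = dot(a, b) / dot(a, a)
            c = [ci - coeff * ai for ci, ai in zip(c, a)]
        result.append(c)
    return result

# Three independent vectors in R^3 (illustrative choice).
basis = [[1, 1, 0],
         [1, 0, 1],
         [0, 1, 1]]
v1, v2, v3 = orthogonalize(basis)
```

With this input the result is (1, 1, 0), (1/2, -1/2, 1) and (-2/3, 2/3, 2/3). Dividing each by its length would then give an orthonormal basis.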



## ORTHOGONAL SUBSPACES

$$

\\

Ax=0 \:\:means\:each\:row \cdot x=0

\\

\begin{bmatrix}

Row\:1\:of\:A \\
...\\
...\\
Row\:m\:of\:A\\

\end{bmatrix}

*

\begin{bmatrix}

 \\
x\\
\\
\\

\end{bmatrix}

=

\begin{bmatrix}

0 \\
...\\
...\\
0\\

\end{bmatrix}
$$

The equation above looks at $Ax = 0$.

Every row of $A$ multiplies that nullspace vector $x$.

So each row (and every combination of rows) is orthogonal to $x$ in $N(A)$.

**The row space of A is orthogonal to the nullspace of A.**

$$
Ax

=

\begin{bmatrix}

Row\:1\:of\:A \\
...\\
...\\
Row\:m\:of\:A\\

\end{bmatrix}

*

\begin{bmatrix}

 \\
x\\
\\
\\

\end{bmatrix}

=

\begin{bmatrix}

0 \\
...\\
...\\
0\\

\end{bmatrix}

\\

A^T y

=

\begin{bmatrix}

(Column\:1)\:^T \\
...\\
...\\
(Column\:n)\:^T \\

\end{bmatrix}

*

\begin{bmatrix}

 \\
y\\
\\
\\

\end{bmatrix}

=

\begin{bmatrix}

0 \\
...\\
...\\
0\\

\end{bmatrix}

$$

From $A^T y=0$, the columns of A are all orthogonal to $y$.

Their combinations (the whole column space) will also be orthogonal to $y$.

**The column space of A is orthogonal to the nullspace of $A^T$.**
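The same kind of check works on the column side. The 3 by 2 matrix `A`, the vector `y`, and the coefficients below are made-up values for illustration. Here `y` solves $A^T y = 0$, and a vector `b` in the column space is orthogonal to it:

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# Example matrix chosen for illustration; its columns are (1,2,3) and (2,4,7).
A = [[1, 2],
     [2, 4],
     [3, 7]]
y = [2, -1, 0]          # picked so that A^T y = 0

# The components of A^T y are the dot products (column j)^T y.
columns = [list(col) for col in zip(*A)]
At_y = [dot(col, y) for col in columns]

# b = A * coeffs is a combination of the columns, so it lies in the column space.
coeffs = [5, -3]
b = [dot(row, coeffs) for row in A]
b_dot_y = dot(b, y)
```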

This produces the Big Picture of Linear Algebra.

Notice the dimensions $r$ and $n-r$ adding to $n$.

The whole space $R^n$ is accounted for.

Every vector $v$ in $R^n$ has a row space component $v_r$ and a nullspace component $v_n$ with $v$ = $v_r$ + $v_n$.

A row space basis ($r$ vectors) together with a nullspace basis ($n-r$ vectors) produces a basis for all of $R^n$ ($n$ vectors).
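To make the splitting $v$ = $v_r$ + $v_n$ concrete, here is a sketch for one assumed example: a 2 by 3 matrix of rank 2 whose nullspace is the line through (1, -2, 1). Because that nullspace is one-dimensional, $v_n$ is just the projection of $v$ onto that line. A larger nullspace would need a projection onto several basis vectors.

```python
from fractions import Fraction

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# Illustrative matrix with n = 3 and rank r = 2, so the nullspace has dimension 1.
A = [[1, 2, 3],
     [4, 5, 6]]
n_basis = [1, -2, 1]                 # A n_basis = 0

v = [3, 0, 0]

# Nullspace component: projection of v onto the nullspace line.
coeff = Fraction(dot(v, n_basis), dot(n_basis, n_basis))
v_n = [coeff * c for c in n_basis]

# Row space component: what is left after removing v_n.
v_r = [a - b for a, b in zip(v, v_n)]

# v_r is a combination of the rows: (-17/6)*(row 1) + (4/3)*(row 2).
row_combo = [Fraction(-17, 6) * a + Fraction(4, 3) * b for a, b in zip(A[0], A[1])]
```

Here $v_n$ = (1/2, -1, 1/2) and $v_r$ = (5/2, 1, -1/2). The two pieces are orthogonal and add back to $v$.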

![fig_1_6](./fig_1_6.png)

The figure above shows two pairs of orthogonal subspaces.

The dimensions add to n and add to m.

This is the Big Picture -- two subspaces in $R^n$ and two subspaces in $R^m$.

I will mention a big improvement.

It comes from the Singular Value Decomposition.

The SVD is the most important theorem in data science.

It finds orthonormal bases $v_1$,...,$v_r$ for the row space of A and $u_1$,..., $u_r$ for the column space of A.

Gram-Schmidt can do that.

The special bases from the SVD have the extra property that each pair ($v$ and $u$) is connected by A:



------------------------------------

Singular Vectors:

$$

Av_1

=

\sigma{_1}u_1

\:\:\:\:\:
Av_2

=

\sigma{_2}u_2
\:\:\:\:\:
...
\:\:\:\:\:
Av_r

=

\sigma{_r}u_r


$$

In the figure above, imagine the $v$'s on the left and the $u$'s on the right.

For the bases from the SVD, multiplying by A takes an orthogonal basis of $v$'s to an orthogonal basis of $u$'s.
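The relation $Av_i = \sigma_i u_i$ can be verified by hand for a small matrix. In the sketch below the 2 by 2 matrix and its singular vectors and values are stated as given, not computed, and they are an assumed example that does not come from this section. The code only confirms that they satisfy the relation and that each basis is orthonormal.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    """Matrix times vector: one dot product per row."""
    return [dot(row, x) for row in A]

A = [[3, 0],
     [4, 5]]

r2, r10 = math.sqrt(2), math.sqrt(10)

# Orthonormal basis v1, v2 for the row space (here all of R^2).
v1 = [1 / r2, 1 / r2]
v2 = [-1 / r2, 1 / r2]

# Orthonormal basis u1, u2 for the column space, with singular values.
u1 = [1 / r10, 3 / r10]
u2 = [-3 / r10, 1 / r10]
sigma1, sigma2 = math.sqrt(45), math.sqrt(5)

Av1 = matvec(A, v1)   # should equal sigma1 * u1
Av2 = matvec(A, v2)   # should equal sigma2 * u2
```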

## TALL THIN $Q$ with ORTHONORMAL COLUMNS: $Q^T Q = I$