---
title: 4.4 Orthogonal Projections and Orthogonal Subspaces
subject:  Orthogonality
subtitle: Splitting vectors into orthogonal pieces
short_title: 4.4 Orthogonal Projections and Subspaces
authors:
  - name: Nikolai Matni
    affiliations:
      - Dept. of Electrical and Systems Engineering
      - University of Pennsylvania
    email: nmatni@seas.upenn.edu
license: CC-BY-4.0
keywords: Orthogonal Matrix, QR Factorization
math:
  '\vv': '\mathbf{#1}'
  '\bm': '\begin{bmatrix}'
  '\em': '\end{bmatrix}'
  '\R': '\mathbb{R}'
---

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/nikolaimatni/ese-2030/HEAD?labpath=/03_Ch_4_Orthogonality/054-proj_subspace.ipynb)

{doc}`Lecture notes <../lecture_notes/Lecture 08 - Orthogonal Projections and Subspaces, Least Squares Problems and Solutions.pdf>`

## Reading

Material related to this page, as well as additional exercises, can be found in ALA 4.4.

## Learning Objectives

By the end of this page, you should know:
- orthogonal projection of vectors
- orthogonal subspaces
- relationship between dimensions of orthogonal subspaces
- orthogonality of the fundamental matrix subspaces and how they relate to a linear system

## Introduction

We extend the idea of orthogonality between two vectors to _orthogonality between subspaces_. Our starting point is the idea of _an orthogonal projection_ of a vector onto a subspace.

## Orthogonal Projection

Let $V$ be a (real) inner product space, and let $W \subset V$ be a finite-dimensional subspace of $V$. The results we present are fairly general, but it may be helpful to think of $W$ as a subspace of $V = \mathbb{R}^m$.

:::{prf:definition} Orthogonal to Subspace
:label: orth-to-subspace

A vector $\vv z \in V$ is said to be _orthogonal_ to the subspace $W \subset V$ if it is orthogonal to every vector in $W$, that is, $\langle \vv z, \vv w\rangle = 0$ for all $\vv w \in W$. We will write $\vv z \perp W$, pronounced $\vv z$ "perp" $W$, to indicate $\vv z$ is perpendicular (orthogonal) to $W$. 
:::


::::{prf:definition} Orthogonal Projection
:label: orth-proj

A related notion to [](#orth-to-subspace) is the _orthogonal projection_ of a vector $\vv v \in V$ onto a subspace $W$, which is the element $\vv w \in W$ that makes the difference $\vv z  = \vv v - \vv w$ orthogonal to $W$.
:::{figure}../figures/05-orth_proj.jpg
:label:orth_proj_fig
:alt:Orthogonal projection
:width: 400px
:align: center
:::
::::

Note from [](#orth-proj) that $\vv v$ can be decomposed as the sum of its orthogonal projection $\vv w \in W$ and a vector $\vv z \perp W$ that is orthogonal to $W$, i.e., $\vv v = \vv w + (\vv v - \vv w) = \vv w + \vv z$.

When we have access to an orthonormal basis for $W \subset V$, constructing the orthogonal projection of $\vv v \in V$ onto $W$ becomes quite simple as given below.

::::{prf:theorem}
:label:thm_orth_basis
Let $\vv u_1, \vv u_2, \ldots, \vv u_n$ be an orthonormal basis for the subspace $W \subset V$. Then, the orthogonal projection $\vv w \in W$ of $\vv v \in V$ onto $W$ is given by

\begin{equation}
\label{orth_basis_proj_eqn}
\vv w = c_1 \vv u_1 + c_2 \vv u_2 + \ldots + c_n \vv u_n \ \textrm{where} \ c_i = \langle \vv v, \vv u_i \rangle, \ i = 1,2,\ldots,n 
\end{equation}


:::{prf:proof} Proof of [](#thm_orth_basis)
:label: proof-thm_orth_basis
:class: dropdown

Since $\vv u_1, \vv u_2, \ldots, \vv u_n$ form a basis for $W$, we must have that $\vv w = c_1 \vv u_1 + c_2 \vv u_2 + \ldots + c_n \vv u_n$ for some $c_1, \ldots, c_n$.

If $\vv w$ is the orthogonal projection of $\vv v$ onto $W$, then by [definition](#orth-proj) we must have $\langle \vv v - \vv w, \vv q \rangle = 0$ for any $\vv q \in W$. So, let's pick $\vv q = \vv u_i$:
\begin{equation}
\begin{aligned}
0 = \langle \vv v - \vv w, \vv u_i \rangle &= \langle \vv v - c_1 \vv u_1 - \ldots - c_n \vv u_n, \vv u_i \rangle \\
&= \langle \vv v , \vv u_i \rangle - c_1 \langle \vv u_1, \vv u_i \rangle - c_2 \langle \vv u_2, \vv u_i \rangle - \ldots - c_i \langle \vv u_i, \vv u_i\rangle - \ldots - c_n \langle \vv u_n, \vv u_i\rangle \\
&= \langle \vv v, \vv u_i\rangle - c_i,
\end{aligned}
\end{equation}
where the last line follows from $\vv u_1, \vv u_2, \ldots, \vv u_n$ being an _orthonormal_ basis for $W$, so that $\langle \vv u_j, \vv u_i\rangle = 0$ for $j \neq i$ and $\langle \vv u_i, \vv u_i\rangle = 1$. Repeating for $i=1,2,\ldots,n$, we conclude that the coefficients $c_i = \langle \vv v, \vv u_i\rangle$ for $i=1,\ldots,n$ are uniquely prescribed by the orthogonality requirement, so the orthogonal projection is unique. Conversely, with these coefficients, $\vv v - \vv w$ is orthogonal to every $\vv u_i$ and hence, by linearity of the inner product, to every vector in $W$.
:::
::::

:::{prf:example} 
:label: ex_orth

Consider the plane $W \subset \mathbb{R}^3$ spanned by orthogonal (but not orthonormal!) vectors 
$$
\vv v_1 = \bm 1 \\ -2 \\ 1\em \ \textrm{and} \ \vv v_2 = \bm 1 \\ 1 \\ 1\em
$$
Let's compute the orthogonal projection of $\vv v = \bm 1 \\ 0 \\ 0\em$ onto $W = \textrm{span}(\vv v_1, \vv v_2)$. First, we normalize $\vv v_1$ and $\vv v_2$:
$$
\vv u_1 = \frac{\vv v_1}{\|\vv v_1\|} = \frac{1}{\sqrt{6}}\bm 1 \\ -2 \\ 1\em, \ \vv u_2 = \frac{\vv v_2}{\|\vv v_2\|} = \frac{1}{\sqrt{3}}\bm 1 \\ 1 \\ 1\em
$$
and then compute the projection $\vv w$ as
$$
\begin{aligned}
\vv w &= \langle \vv v, \vv u_1\rangle \vv u_1 +  \langle \vv v, \vv u_2\rangle \vv u_2 \\
&= \frac{1}{6}\bm 1 \\ -2 \\ 1\em + \frac{1}{3}\bm 1 \\ 1 \\ 1\em \\
&= \bm \frac{1}{2} \\ 0 \\ \frac{1}{2} \em
\end{aligned}
$$
:::
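
The computation in [](#ex_orth) can be reproduced numerically. The following is a minimal sketch in plain Python (standard library only); the helper names `dot`, `normalize`, and `project` are our own choices for illustration and do not come from any particular library.

```python
import math

def dot(x, y):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def normalize(x):
    """Scale x to unit length."""
    norm = math.sqrt(dot(x, x))
    return [a / norm for a in x]

def project(v, orthonormal_basis):
    """Orthogonal projection w = sum_i <v, u_i> u_i onto the span of the basis."""
    w = [0.0] * len(v)
    for u in orthonormal_basis:
        c = dot(v, u)  # coefficient c_i = <v, u_i>
        w = [wi + c * ui for wi, ui in zip(w, u)]
    return w

# Data from the example: orthogonal (but not orthonormal) spanning vectors.
v1 = [1.0, -2.0, 1.0]
v2 = [1.0, 1.0, 1.0]
v = [1.0, 0.0, 0.0]

basis = [normalize(v1), normalize(v2)]
w = project(v, basis)
z = [vi - wi for vi, wi in zip(v, w)]  # residual z = v - w

print(w)                        # approximately [0.5, 0.0, 0.5]
print(dot(z, v1), dot(z, v2))   # both approximately 0, so z is orthogonal to W
```

Note that `project` assumes its input basis is already orthonormal; passing $\vv v_1, \vv v_2$ directly without normalizing would give the wrong answer.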

We will see shortly that computing the orthogonal projection of a vector onto a subspace is exactly what solving a [least-squares](https://en.wikipedia.org/wiki/Least_squares) problem does, and that least squares lies at the heart of machine learning and data science.

However, before that, we will explore the idea of orthogonal subspaces, and see that they provide a deep and elegant connection between the four fundamental subspaces of a matrix $A$ and whether a linear system $A \vv x=\vv b$ has a solution.


:::{hint} Think
How can we write $\vv w = UU^{\top}\vv v$ using the vectors $\{\vv u_1, \vv u_2, \ldots, \vv u_n\}$ where ${U = \bm \vv u_1 & \vv u_2 & \ldots & \vv u_n\em}$?

<!-- $$ by Farhad
W &= \bm \vv u_1 & \vv u_2 & \ldots & \vv u_n\em \bm \vv u_1^{\top} \\ \vv u_2^{\top} \\ \vdots \\ \vv u_n^{\top}\em \vv v \\
&= \bm \vv u_1 & \vv u_2 & \ldots & \vv u_n\em
 \bm \vv u_1^{\top}\vv v \\ \vv u_2^{\top}\vv v \\ \vdots \\ \vv u_n^{\top}\vv v\em \\
&= \langle \vv u_1, \vv v\rangle \vv u_1 + \langle \vv u_2, \vv v\rangle \vv u_2 + \ldots + \langle \vv u_n, \vv v\rangle \vv u_n
$$ -->
:::
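
Once you have worked out the question above by hand, a numerical sanity check can be reassuring. The sketch below, in plain Python with hypothetical helper names (`dot`, `transpose`, `matvec`), compares $UU^{\top}\vv v$ against the sum formula of [](#thm_orth_basis) for the vectors of [](#ex_orth). It only confirms that the two agree for this one example and is not a substitute for the derivation.

```python
import math

def dot(x, y):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def transpose(M):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in M]

# Orthonormal basis and vector taken from the worked example.
s6, s3 = math.sqrt(6.0), math.sqrt(3.0)
u1 = [1 / s6, -2 / s6, 1 / s6]
u2 = [1 / s3, 1 / s3, 1 / s3]
v = [1.0, 0.0, 0.0]

U = transpose([u1, u2])                        # 3 x 2 matrix with columns u1, u2
w_matrix = matvec(U, matvec(transpose(U), v))  # U (U^T v)

# Sum form from the theorem: w = <v, u1> u1 + <v, u2> u2.
c1, c2 = dot(v, u1), dot(v, u2)
w_sum = [c1 * a + c2 * b for a, b in zip(u1, u2)]

print(all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(w_matrix, w_sum)))  # True
```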

## Orthogonal Subspaces


:::{prf:definition} Orthogonal Subspaces
:label: orth-subspace

Two subspaces $W, Z \subset V$ are _orthogonal_ if every vector in $W$ is orthogonal to every vector in $Z$, that is, if and only if $\langle \vv w, \vv z\rangle = 0 $ for all $\vv w \in W$ and all $\vv z \in Z$.
:::

:::{important}
One quick way to check [Orthogonal Subspaces](#orth-subspace) is to compare spanning vectors, such as bases, for $W$ and $Z$. If $W = \textrm{span}\{\vv w_1, \vv w_2, \ldots, \vv w_k\}$ and $Z = \textrm{span}\{\vv z_1, \vv z_2, \ldots, \vv z_l\}$, then $W$ and $Z$ are orthogonal if and only if $\langle \vv w_i, \vv z_j \rangle = 0$ for all $i = 1, \ldots, k$ and $j = 1, \ldots, l$.
:::

:::{prf:example} 
:label: ex_line_plane

If $V = \mathbb{R}^3$ and we are using the dot product, then the plane $W \subset \mathbb{R}^3$ defined by $2x - y + 3z = 0$ is orthogonal to the line $Z$ spanned by the normal vector $\vv n = \bm 2 \\ -1 \\ 3\em$. This is easy to check as any $\vv w = \bm x \\ y \\ z\em \in W$ satisfies $\vv n \cdot \vv w = 2x - y + 3z = 0.$
:::
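
The spanning-vector test can be sketched in a few lines of plain Python. The function name `subspaces_orthogonal` and the two spanning vectors chosen for the plane of [](#ex_line_plane) are our own assumptions for illustration; any two independent vectors satisfying $2x - y + 3z = 0$ would do.

```python
from itertools import product

def dot(x, y):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def subspaces_orthogonal(W_span, Z_span, tol=1e-12):
    """True if every spanning vector of W is orthogonal to every spanning vector of Z."""
    return all(abs(dot(w, z)) <= tol for w, z in product(W_span, Z_span))

# Plane 2x - y + 3z = 0: two hand-picked spanning vectors (each satisfies the equation).
W_span = [[1, 2, 0], [0, 3, 1]]
# Line spanned by the normal vector n.
Z_span = [[2, -1, 3]]

print(subspaces_orthogonal(W_span, Z_span))        # True
print(subspaces_orthogonal(W_span, [[1, 0, 0]]))   # False: (1, 0, 0) is not orthogonal to (1, 2, 0)
```

Only $k \cdot l$ inner products are needed, even though each subspace contains infinitely many vectors.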

An important geometric notion present in [](#ex_line_plane) is the _Orthogonal Complement_ that is defined below.

:::{prf:definition} Orthogonal Complement
:label: orth-compl
 The orthogonal complement $W^{\perp}$ of a subspace $W \subset V$ is defined as the set of all vectors that are orthogonal to $W$:
\begin{equation}
W^{\perp} = \{\vv v \in V \mid \langle \vv v, \vv w\rangle = 0 \ \textrm{for all} \ \vv w \in W\}.
\end{equation}
:::

The orthogonal complement to a line is given in the figure below, which is discussed in [](#ex_line_plane).
:::{figure}../figures/05-orth_line_plane.jpg
:label:orth_line_plane
:alt:Orthogonal subspaces
:width: 400px
:align: center
:::

:::{note} Properties of [Orthogonal complement](#orth-compl)

1. $W^{\perp}$ is also a subspace.
2. $W \cap W^{\perp} = \{\vv 0\}$, i.e., $W$ and $W^{\perp}$ are [transverse](https://en.wikipedia.org/wiki/Transversality_(mathematics)) and only intersect at the origin.

:::

:::{prf:example} 
:label: ex_complement

Consider the plane $W \subset \mathbb{R}^3$ defined by the equation $x + y - 2z = 0$. Then, $W^{\perp} = $span$\{\vv n\} = \{(\epsilon, \epsilon, -2\epsilon) | \epsilon \in \mathbb{R}\}$ is the line spanned by its defining normal $\vv n = (1, 1, -2)$.

If we consider instead the set $Z = $span$\{\vv n\}$, then $Z^{\perp} = W$, i.e., the orthogonal complement to the line $Z$ is the plane $W$. This also highlights that $Z^{\perp} = \left(W^{\perp}\right)^{\perp} = W$, i.e., taking the  orthogonal complement twice brings you back to where you started.
:::
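
For a plane in $\mathbb{R}^3$, one way to find a vector spanning the orthogonal complement is the cross product of two spanning vectors of the plane. The sketch below checks [](#ex_complement) this way. The spanning vectors `w1`, `w2` are hand-picked assumptions, and the cross product trick is specific to $\mathbb{R}^3$ with the dot product; it is not a general method.

```python
def dot(x, y):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def cross(a, b):
    """Cross product in R^3; the result is orthogonal to both a and b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Two hand-picked spanning vectors of the plane W: x + y - 2z = 0.
w1 = [1, -1, 0]
w2 = [2, 0, 1]
n = [1, 1, -2]  # defining normal of the plane

n_computed = cross(w1, w2)
print(n_computed)               # [-1, -1, 2], a multiple of n
print(cross(n_computed, n))     # [0, 0, 0], so n_computed is parallel to n
print(dot(n, w1), dot(n, w2))   # 0 0, so W is contained in the complement of span{n}
```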

::::{note}
Given a subspace $W \subset V$ and its orthogonal complement $W^{\perp}$, we can uniquely decompose any vector $\vv v \in V$ into $\vv v = \vv w + \vv z$, where $\vv w \in W$ and $\vv z \in W^{\perp}$. The geometric intuition is given below.
:::{figure}../figures/05-orth_decom.jpg
:label:orth_decom
:alt:Orthogonal decomposition
:width: 400px
:align: center
:::
::::

:::{note}
A useful consequence of the above, which we will use later when deriving the least squares problem solution, is that if $\vv v = \vv w + \vv z$ with $\vv w \in W$ and $\vv z \in W^{\perp}$, then $\|\vv v\|^2 = \|\vv w\|^2 + \|\vv z\|^2$: this is an immediate consequence of $\langle \vv w, \vv z\rangle = 0 $ and is essentially Pythagoras' theorem. 
:::
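
To spell out the Pythagorean identity, we can expand the squared norm using only the bilinearity and symmetry of the real inner product, together with $\langle \vv w, \vv z\rangle = 0$:

$$
\begin{aligned}
\|\vv v\|^2 &= \langle \vv w + \vv z, \vv w + \vv z\rangle \\
&= \langle \vv w, \vv w\rangle + 2\langle \vv w, \vv z\rangle + \langle \vv z, \vv z\rangle \\
&= \|\vv w\|^2 + \|\vv z\|^2.
\end{aligned}
$$

As a quick check with the numbers from [](#ex_orth): there $\vv v = (1, 0, 0)$, $\vv w = (\tfrac{1}{2}, 0, \tfrac{1}{2})$, and $\vv z = \vv v - \vv w = (\tfrac{1}{2}, 0, -\tfrac{1}{2})$, so $\|\vv v\|^2 = 1$ and $\|\vv w\|^2 + \|\vv z\|^2 = \tfrac{1}{2} + \tfrac{1}{2} = 1$.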

Another direct consequence of [](#orth_decom) is that a subspace and its orthogonal complement have complementary dimensions. 

:::{prf:proposition} 
:label:prop_comp
If $W \subset V$ is a subspace with dim$W = n$ and dim$V = m$, then dim$W^{\perp} = m - n$.
:::

In [](#ex_complement), $W \subset \mathbb{R}^3$ is a plane, with dim$W = 2$. Hence, we can conclude that dim$W^{\perp} = 1$, i.e., $W^{\perp}$ is a line, which is indeed what we saw previously.
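
The dimension count can also be sketched in code: the dimension of $W$ is the rank of any matrix whose rows span $W$, and [](#prop_comp) then gives dim$W^{\perp}$. The `rank` helper below is our own small implementation of Gaussian elimination with exact fractions, written for illustration only, and the spanning vectors (including a deliberately redundant third one) are hand-picked.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        # Eliminate column c from all rows below the pivot row.
        for i in range(r + 1, n_rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == n_rows:
            break
    return r

# Spanning vectors of the plane x + y - 2z = 0 in R^3 (the third is the sum of the first two).
W_span = [[1, -1, 0], [2, 0, 1], [3, -1, 1]]
m = 3                    # dim V for V = R^3
dim_W = rank(W_span)     # dimension of W
dim_W_perp = m - dim_W   # proposition: dim W^perp = m - dim W

print(dim_W, dim_W_perp)  # 2 1
```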

## Orthogonality of the Fundamental Matrix Subspaces

We previously introduced [the four fundamental subspaces](../01_Ch_2_Vector_Spaces_and_Bases/036-adj.ipynb#fund_thm) associated with an $m \times n$ matrix $A$: the column, null, row, and left null spaces. We also saw that the null and row spaces are subspaces with complementary dimensions in $\mathbb{R}^n$, and the left null space and column space are subspaces with complementary dimensions in $\mathbb{R}^m$. Moreover, these pairs are orthogonal complements of each other with respect to the standard dot product.

:::{prf:theorem}
:label:thm_orth_fund
Let $A \in \mathbb{R}^{m \times n}$ be an $m \times n$ matrix. Then,
$$
\textrm{Null}(A) = \textrm{Row}(A)^{\perp} = \textrm{Col}(A^{\top})^{\perp} \subset \mathbb{R}^n \\
\textrm{and} \\
\textrm{LNull}(A) = \textrm{Null}(A^{\top}) = \textrm{Col}(A)^{\perp} \subset \mathbb{R}^m
$$
:::
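
To see why $\textrm{Null}(A) = \textrm{Row}(A)^{\perp}$ is plausible, note that the entries of $A \vv x$ are exactly the dot products of the rows of $A$ with $\vv x$. The sketch below illustrates this on a small matrix of our own choosing, with a hand-picked null vector; it checks one example and is not a proof.

```python
def dot(x, y):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    """Each entry of A x is the dot product of a row of A with x."""
    return [dot(row, x) for row in A]

# A 2 x 3 matrix with independent rows, so Null(A) has dimension 3 - 2 = 1.
A = [[1, 2, 0],
     [0, 1, 1]]
x = [2, -1, 1]  # hand-picked vector in Null(A)

print(matvec(A, x))  # [0, 0]: x is orthogonal to each row of A

# An arbitrary vector in Row(A): the combination 3 * row1 - 2 * row2.
r = [3 * a - 2 * b for a, b in zip(A[0], A[1])]
print(r)          # [3, 4, -2]
print(dot(r, x))  # 0: x is orthogonal to this vector of Row(A) as well
```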

We will not go through the proof (although it is not hard), but instead focus on a very important practical consequence.

::::{prf:theorem} Fredholm Alternative
:label:thm_lin_sys
A linear system $A \vv x = \vv b$ has a solution if and only if $\vv b$ is orthogonal to LNull$(A)$.

:::{prf:proof} Proof of [](#thm_lin_sys)
:label: proof-thm_lin_sys
:class: dropdown

We know that $A \vv x = \vv b$ has a solution if and only if $\vv b \in \textrm{Col}(A)$, since $A \vv x$ is a linear combination of the columns of $A$.

From [](#thm_orth_fund) we know that $\textrm{LNull}(A) = \textrm{Col}(A)^{\perp}$, or equivalently, that $\textrm{Col}(A) = \textrm{LNull}(A)^{\perp} = \textrm{Null}(A^{\top})^{\perp}$. 

So, $A \vv x = \vv b$ has a solution if and only if $\vv b \in \textrm{Null}(A^{\top})^{\perp}$, or equivalently, $\langle \vv y, \vv b\rangle = 0$ for all $\vv y$ such that $A^{\top}\vv y = \vv 0$. Just to get a sense of why this is perfectly reasonable, let's assume that we can find a $\vv y \in \textrm{Null}(A^{\top})$ for which $\langle \vv y, \vv b\rangle \neq 0$. This implies that we have an inconsistent set of equations, which can be seen as follows.

Let $\vv x$ be any solution to $A \vv x = \vv b$, and take the inner product of both sides with $\vv y$:
\begin{equation}
\label{proof-thm_lin_sys-eqn}
\langle \vv y, A \vv x\rangle = \langle \vv y, \vv b\rangle.
\end{equation}
But since $\vv y \in \textrm{LNull}(A)$, we have $\vv y^{\top} A = \vv 0^{\top}$, so $\langle \vv y, A \vv x\rangle = \vv y^{\top} A \vv x = 0$ for any $\vv x$, meaning we must have $\langle \vv y, \vv b\rangle = 0$. However, we picked a special $\vv y$ such that $\langle \vv y, \vv b\rangle \neq 0$. This is a contradiction! Our two assumptions cannot both hold: either $A \vv x = \vv b$ has no solution, or $\langle \vv y, \vv b \rangle = 0$.

Another way of thinking about [](#proof-thm_lin_sys-eqn): since $\vv y^{\top} A = \vv 0^{\top}$, adding the equations of $A \vv x = \vv b$ together, weighted by the elements of $\vv y$, makes the left-hand sides cancel to zero. So, the only way for $A \vv x = \vv b$ to be compatible is if the same weighted combination of the right-hand sides, $\vv y^{\top} \vv b$, also equals 0.
:::
::::
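
The solvability test of [](#thm_lin_sys) can be sketched on a small example. The matrix `A`, the vector `y` spanning its left null space, and the function name `passes_fredholm_test` are all our own choices for illustration; in general one would first need to compute a basis of $\textrm{LNull}(A)$, which this sketch does not do.

```python
def dot(x, y):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def transpose(M):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in M]

def passes_fredholm_test(b, left_null_basis):
    """True if b is orthogonal to every basis vector of LNull(A)."""
    return all(dot(y, b) == 0 for y in left_null_basis)

# A 3 x 2 matrix of rank 2, so LNull(A) has dimension 3 - 2 = 1.
A = [[1, 0],
     [0, 1],
     [1, 1]]
y = [1, 1, -1]  # hand-picked: A^T y = 0, so y spans LNull(A)
print(matvec(transpose(A), y))  # [0, 0]

b_good = [1, 2, 3]  # y . b = 1 + 2 - 3 = 0
b_bad = [1, 2, 4]   # y . b = 1 + 2 - 4 = -1

print(passes_fredholm_test(b_good, [y]))  # True
print(matvec(A, [1, 2]) == b_good)        # True: x = (1, 2) solves A x = b_good
print(passes_fredholm_test(b_bad, [y]))   # False: A x = b_bad has no solution
```

For `b_bad`, the first two equations force $\vv x = (1, 2)$, but then the third equation reads $3 = 4$, which is the inconsistency that the nonzero value of $\vv y^{\top} \vv b$ detects.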


