---
title: Orthogonal and Orthonormal Bases
subject: Inner Products and Norms
subtitle: 
short_title: Orthogonal and Orthonormal Bases
authors:
  - name: Nikolai Matni
    affiliations:
      - Dept. of Electrical and Systems Engineering
      - University of Pennsylvania
    email: nmatni@seas.upenn.edu
license: CC-BY-4.0
keywords: Orthogonal Basis, Orthonormal Basis
math:
  '\vv': '\mathbf{#1}'
  '\bm': '\begin{bmatrix}'
  '\em': '\end{bmatrix}'
  '\R': '\mathbb{R}'
---

## Reading

Material related to this page, as well as additional exercises, can be found in ALA 4.1.

## Learning Objectives

By the end of this page, you should know:
- What is an orthogonal basis?
- What is an orthonormal basis?

# Orthogonal and Orthonormal Bases

Let $V$ be an [inner product space](#inner-product-space-defn) (as usual, we will assume that the scalars over which $V$ is defined are real valued). Remember that $\vv v, \vv w \in V$ are [orthogonal](#orthogonal-defn) if $\langle \vv v, \vv w\rangle = 0$. If $\vv v, \vv w \in \mathbb{R}^n$ and $\langle \vv v, \vv w\rangle = \vv v \cdot \vv w$ is the dot product, this simply means that $\vv v$ and $\vv w$ are perpendicular (meet at a right angle).

Orthogonal vectors are useful, because they point in completely different directions, making them particularly well-suited for defining bases. Orthogonal vectors give rise to the concept of an *orthogonal basis*.

:::{prf:definition} Orthogonal Basis
:label: orthogonal-basis-defn

A [basis](#basis_defn) $\vv{b_1}, ..., \vv{b_n}$ of an $n$-dimensional inner product space $V$ is called *orthogonal* if $\langle \vv{b_i}, \vv{b_j}\rangle = 0$ for all $i\neq j$. In this case, the vectors $\vv{b_1}, ..., \vv{b_n}$ are said to be *mutually orthogonal*, i.e., every pair of distinct vectors is [orthogonal](#orthogonal-defn).
:::

If each basis vector in an orthogonal basis is a unit vector (has norm equal to one), then they form a special type of orthogonal basis known as an *orthonormal basis.*

:::{prf:definition} Orthonormal Basis
:label: orthonormal-basis-defn

An [orthogonal basis](#orthogonal-basis-defn) $\vv{b_1}, ..., \vv{b_n}$ of an $n$-dimensional inner product space $V$ is called *orthonormal* if $\|\vv{b_i}\| = 1$ for each $i$. Here, $\|\vv v\| = \sqrt{\langle \vv v, \vv v\rangle}$ is the norm induced by the inner product.
:::

A simple way to construct an orthonormal basis from an orthogonal basis is to *normalize* each of its elements, that is, to replace each basis element $\vv{b_i}$ with its normalized counterpart $\frac{\vv{b_i}}{\|\vv{b_i}\|}$. As an exercise, can you formally verify that $\frac{\vv{b_1}}{\|\vv{b_1}\|}, ..., \frac{\vv{b_n}}{\|\vv{b_n}\|}$ is an orthonormal basis for $V$ if $\vv{b_1}, ..., \vv{b_n}$ is an orthogonal one? Can you explain why rescaling each vector does not affect the mutual orthogonality of this set?

```{note}
A very useful property of a collection of mutually orthogonal vectors is that they are automatically linearly independent. In particular, if $\vv{v_1}, ..., \vv{v_k}$ satisfy $\langle \vv{v_i}, \vv{v_j} \rangle = 0$ for all $i \neq j$ (and $\vv{v_i} \neq 0$ for all $i$), then they are [linearly independent](#lin_dep).

To see this, we take an arbitrary linear combination of the $\vv{v_i}$ and set it to $0$:

\begin{align*}
    c_1 \vv{v_1} + c_2\vv{v_2} + ... + c_k\vv{v_k} = \vv 0\label{expr:lincomboequation}
\end{align*}

Let's take the inner product of both sides of this equation with any $\vv{v_i}$:

\begin{align*}
0 = \langle \vv 0, \vv{v_i} \rangle &= \langle c_1\vv{v_1} + c_2\vv{v_2} + ... + c_k\vv{v_k}, \vv{v_i}\rangle\\
&= c_1\langle \vv{v_1}, \vv{v_i} \rangle + ... + c_i\langle \vv{v_i}, \vv{v_i} \rangle + ... + c_k\langle \vv{v_k}, \vv{v_i} \rangle \quad \text{(linearity of $\langle \cdot, \vv{v_i}\rangle$)}\\
&= c_i\langle \vv{v_i}, \vv{v_i}\rangle = c_i \|\vv{v_i}\|^2 \quad\text{(orthogonality)}
\end{align*}

Since $\vv{v_i} \neq 0$, $\|\vv{v_i}\|^2 > 0$, which means $c_i = 0$. We can repeat this game with all $\vv{v_i}$ for $i = 1, ..., k$, to conclude that [](#expr:lincomboequation) holds only if $c_1 = c_2 = ... = c_k = 0$. Hence, the mutually orthogonal collection $\vv{v_1}, ..., \vv{v_k}$ is linearly independent.

```
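As a small concrete instance of the argument in the note above (the two vectors are chosen here purely for illustration), take the following vectors in $\mathbb{R}^2$ with the dot product:

\begin{align*}
    \vv{v_1} = \bm 1 \\ 1 \em, \quad \vv{v_2} = \bm 1 \\ -1 \em, \quad \vv{v_1} \cdot \vv{v_2} = 1 - 1 = 0
\end{align*}

Suppose $c_1 \vv{v_1} + c_2 \vv{v_2} = \vv 0$. Taking the dot product of both sides with $\vv{v_1}$, and then with $\vv{v_2}$, gives

\begin{align*}
    0 &= c_1 (\vv{v_1} \cdot \vv{v_1}) + c_2 (\vv{v_2} \cdot \vv{v_1}) = 2c_1\\
    0 &= c_1 (\vv{v_1} \cdot \vv{v_2}) + c_2 (\vv{v_2} \cdot \vv{v_2}) = 2c_2
\end{align*}

so $c_1 = c_2 = 0$, and the two vectors are linearly independent.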

:::{prf:example} The standard basis for $\mathbb{R}^n$
:label: standard-basis-ex

A familiar example of an orthonormal basis for $\mathbb{R}^n$ equipped with the standard inner product is the collection of standard basis elements:

\begin{align*}
    \vv{e_1} = \bm 1 \\ 0 \\ \vdots \\ 0\em, \quad\vv{e_2} = \bm 0 \\ 1 \\ \vdots \\ 0\em,\quad ...,\quad \vv{e_n} = \bm 0 \\ 0 \\ \vdots \\ 1\em
\end{align*}

This is known as the *standard basis* of $\mathbb{R}^n$.
:::

:::{prf:example} Normalizing an orthogonal basis
:label: normalizing-ex

The vectors

\begin{align*}
    \vv{b_1} = \bm 1 \\2 \\ -1\em, \quad \vv{b_2} = \bm 0\\1\\2\em, \quad \vv{b_3} = \bm 5\\-2\\1\em
\end{align*}

are an orthogonal basis for $\mathbb{R}^3$. One easy way to check this is to confirm that $\vv{b_i} \cdot \vv{b_j} = 0$ for all $i\neq j$ (this is indeed true). Since mutually orthogonal nonzero vectors are linearly independent and $\text{dim} (\mathbb{R}^3) = 3$, the vectors $\vv{b_1}, \vv{b_2}, \vv{b_3}$ must be a basis.

To turn them from an orthogonal basis into an orthonormal basis, we simply divide every vector by its length to obtain

\begin{align*}
    \vv{v_1} = \frac{\vv{b_1}}{\|\vv{b_1}\|} = \frac{1}{\sqrt 6}\bm 1\\2\\-1\em, \quad \vv{v_2} = \frac{\vv{b_2}}{\|\vv{b_2}\|} = \frac{1}{\sqrt 5}\bm 0\\1\\2\em,
    \quad \vv{v_3} = \frac{\vv{b_3}}{\|\vv{b_3}\|} = \frac{1}{\sqrt{30}}\bm 5\\-2\\1\em
\end{align*}

This example highlights a more general principle, which is again quite useful: if $\vv{v_1}, ..., \vv{v_n}$ are nonzero and mutually orthogonal, then they form a basis for their [span](#ln_comb) $W = \text{span}\{ \vv{v_1}, ..., \vv{v_n} \} \subseteq V$, which is thus a [subspace](#sub_def) of dimension $\text{dim}(W) = n$. It then follows that if $\text{dim}(V) = n$, then $\vv{v_1}, ..., \vv{v_n}$ are an orthogonal basis for $V$ (this is precisely the observation we used in this example).
:::
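The normalization example above can also be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `dot`, `norm`, and `normalize` are introduced here for illustration and are not part of the course material.

```python
import math

def dot(v, w):
    """Standard dot product on R^n."""
    return sum(vi * wi for vi, wi in zip(v, w))

def norm(v):
    """Norm induced by the dot product."""
    return math.sqrt(dot(v, v))

def normalize(basis):
    """Divide each vector in the list by its own length."""
    return [[x / norm(b) for x in b] for b in basis]

# The orthogonal basis b_1, b_2, b_3 from the example above.
b = [[1, 2, -1], [0, 1, 2], [5, -2, 1]]

# Mutual orthogonality: every dot product b_i . b_j with i != j should vanish.
off_diagonal = [dot(b[i], b[j]) for i in range(3) for j in range(3) if i != j]

# After normalization, the matrix of pairwise dot products should be
# (up to floating-point rounding) the identity matrix.
u = normalize(b)
gram = [[dot(ui, uj) for uj in u] for ui in u]
```

Comparing `gram` with the identity matrix checks both conditions at once: the off-diagonal entries test orthogonality, and the diagonal entries test that each vector has unit length.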

# Working in Orthogonal Bases

So why do we care about orthogonal (or even better, orthonormal) bases? It turns out they make a lot of the computations that we've been doing so far MUCH easier.

We'll start with some important properties of computing a vector's coordinates with respect to an orthogonal basis.

:::{prf:theorem} Coordinates and Norm in an Orthonormal Basis
:label: coordinates-norm-orthonormal-thm

Let $\vv{u_1}, ..., \vv{u_n}$ be an orthonormal basis for an inner product space $V$. Then any $\vv v\in V$ is a linear combination

\begin{align*}
    \vv v = c_1 \vv{u_1} + ... + c_n \vv{u_n}
\end{align*}

in which its coordinates are given by

\begin{align*}
    c_i = \langle \vv v, \vv{u_i} \rangle,\quad i = 1, ..., n
\end{align*}

Moreover, its norm is given by the Pythagorean formula,

\begin{align*}
    \| \vv v \|^2 = c_1^2 + ... + c_n^2 = \sum_{i=1}^{n}{\langle \vv v, \vv{u_i}\rangle^2}
\end{align*}

**Proof.** The trick here is to exploit that 

\begin{align*}
    \langle \vv{u_i}, \vv{u_j} \rangle = \begin{cases} 0 \quad\text{if $i \neq j$}\\ 1 \quad\text{if $ i = j$}\end{cases}
\end{align*}

Let's compute

\begin{align*}
    \langle \vv v, \vv{u_i} \rangle &= \langle c_1\vv{u_1} + ... + c_n\vv{u_n}, \vv {u_i}\rangle\\
    &= c_1 \langle \vv{u_1}, \vv{u_i} \rangle + ... + c_i\langle \vv{u_i}, \vv{u_i}\rangle + ... + c_n\langle \vv{u_n}, \vv{u_i} \rangle\quad\text{(linearity of $\langle \cdot, \vv{u_i}\rangle$)}\\
    &= c_i \| \vv{u_i} \|^2\quad\text{(orthogonality)}\\
    &= c_i \quad\text{$(\| \vv{u_i} \| = 1)$}
\end{align*}

So we have $c_i = \langle \vv v, \vv{u_i}\rangle$. Now to compute the norm, we again use a similar trick:

\begin{align*}
    \|\vv v\|^2 = \langle \vv v, \vv v\rangle &= \left\langle \sum_{i=1}^{n}{c_i\vv{u_i}}, \sum_{j=1}^{n}{c_j\vv{u_j}} \right\rangle\\
    &= \sum_{i=1}^{n}{c_i\left\langle \vv{u_i}, \sum_{j=1}^{n}{c_j\vv{u_j}} \right\rangle}\quad\text{(linearity of $\left\langle \cdot, \sum_{j=1}^{n}{c_j\vv{u_j}} \right\rangle$)}\\
    &= \sum_{i=1}^{n}{\sum_{j=1}^{n}{c_ic_j\langle \vv{u_i}, \vv{u_j} \rangle}}\quad\text{(linearity of $\langle \vv{u_i}, \cdot \rangle$)}\\
    &= \sum_{i=1}^{n}{c_i^2\|\vv{u_i}\|^2}\quad\text{(orthogonality)}\\
    &= \sum_{i=1}^{n}{c_i^2}\quad\text{($\|\vv{u_i}\| = 1$)}
\end{align*}
:::
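To see the theorem at work on a small case, here is a hedged sketch in plain Python. The orthonormal basis of $\mathbb{R}^2$ and the vector `v` below are chosen only for illustration, and `dot` is a helper defined here rather than a library function.

```python
import math

def dot(v, w):
    """Standard dot product on R^n."""
    return sum(vi * wi for vi, wi in zip(v, w))

# An orthonormal basis of R^2 (the standard basis rotated by 45 degrees).
s = 1 / math.sqrt(2)
u1 = [s, s]
u2 = [s, -s]

v = [3.0, 1.0]

# Coordinates via inner products: c_i = <v, u_i>.
c = [dot(v, u1), dot(v, u2)]

# Reconstruct v = c_1 u_1 + c_2 u_2 to confirm these are the right coordinates.
reconstructed = [c[0] * u1[k] + c[1] * u2[k] for k in range(2)]

# Pythagorean formula: ||v||^2 should equal c_1^2 + c_2^2.
norm_sq_direct = dot(v, v)
norm_sq_coords = sum(ci ** 2 for ci in c)
```

No linear system is solved anywhere: each coordinate comes from a single dot product, which is the computational payoff of orthonormality.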

````{exercise}  Rewriting a vector in an orthonormal basis
:label: coordinates-orthonormal-ex

Rewrite $\vv v = \bm 1\\1\\1\em$ in terms of the orthonormal basis

\begin{align*}
     \vv{u_1} = \frac{1}{\sqrt 6}\bm 1\\2\\-1\em, \quad \vv{u_2} = \frac{1}{\sqrt 5}\bm 0\\1\\2\em,
    \quad \vv{u_3} = \frac{1}{\sqrt{30}}\bm 5\\-2\\1\em
\end{align*}

```{solution} coordinates-orthonormal-ex
:class: dropdown

As we [showed earlier](#coordinates-norm-orthonormal-thm), all we need to do is compute dot products!

\begin{align*}
    \vv v \cdot \vv{u_1} = \frac{2}{\sqrt 6}, \quad \vv v\cdot \vv{u_2} = \frac{3}{\sqrt 5}, \quad \vv v \cdot \vv{u_3} = \frac{4}{\sqrt{30}}
\end{align*}

to then write:

\begin{align*}
    \vv v = \frac{2}{\sqrt 6}\vv{u_1} + \frac{3}{\sqrt 5}\vv{u_2} + \frac{4}{\sqrt{30}}\vv{u_3}
\end{align*}

This is much simpler than solving the system of linear equations

\begin{align*}
    \bm \vv{u_1} & \vv{u_2} & \vv{u_3} \em \bm c_1\\c_2\\c_3 \em = \bm 1\\1\\1\em
\end{align*}

for the coordinates  $c_1, c_2, c_3$.
      
```
````

A very small change to the above allows us to extend these ideas to orthogonal, but not orthonormal, bases:

:::{prf:theorem} Coordinates and Norm in an Orthogonal Basis
:label: coordinates-norm-orthogonal-thm

If $\vv{v_1}, ..., \vv{v_n}$ are an orthogonal basis for an inner product space $V$, then any $\vv v\in V$ can be written

\begin{align*}
    \vv v = a_1\vv{v_1} + ... + a_n\vv{v_n} \quad\text{with $a_i = \frac{\langle \vv v, \vv{v_i}  \rangle}{\|\vv{v_i}\|^2}$}
\end{align*}

and its norm is given by

\begin{align*}
    \|\vv v\|^2 = a_1^2 \|\vv{v_1}\|^2 + ... + a_n^2 \|\vv{v_n}\|^2
\end{align*}

**Proof.** This is derived using our [theorem for orthonormal bases](#coordinates-norm-orthonormal-thm) by rescaling the $\vv{v_i}$ to $\frac{\vv{v_i}}{\|\vv{v_i}\|}$.

:::
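The rescaling step in the proof above can be spelled out as follows. This is a sketch; the symbols $\vv{u_i}$ and $c_i$ are introduced here only for this derivation. Set $\vv{u_i} = \frac{\vv{v_i}}{\|\vv{v_i}\|}$, so that $\vv{u_1}, ..., \vv{u_n}$ is an orthonormal basis. Applying the [theorem for orthonormal bases](#coordinates-norm-orthonormal-thm) with coordinates $c_i = \langle \vv v, \vv{u_i} \rangle = \frac{\langle \vv v, \vv{v_i} \rangle}{\|\vv{v_i}\|}$ gives

\begin{align*}
    \vv v = \sum_{i=1}^{n}{c_i \vv{u_i}} = \sum_{i=1}^{n}{\frac{c_i}{\|\vv{v_i}\|}\vv{v_i}} = \sum_{i=1}^{n}{\frac{\langle \vv v, \vv{v_i} \rangle}{\|\vv{v_i}\|^2}\vv{v_i}}
\end{align*}

so that $a_i = \frac{c_i}{\|\vv{v_i}\|}$, or equivalently $c_i = a_i \|\vv{v_i}\|$. Substituting into the Pythagorean formula then yields

\begin{align*}
    \|\vv v\|^2 = \sum_{i=1}^{n}{c_i^2} = \sum_{i=1}^{n}{a_i^2 \|\vv{v_i}\|^2}
\end{align*}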


:::{prf:example} Change of coordinates to an orthogonal basis of a function space
:label: coordinates-orthogonal-ex

Even though our focus in this class will mostly be on $\mathbb{R}^n$ (or vector spaces that "behave like" $\mathbb{R}^n$), all of these ideas apply to general inner product spaces, including function spaces.

As a simple example, let's consider the space of polynomials of degree $\leq 2$, denoted $P^{(2)}$, over $[0, 1]$, equipped with the integral inner product $\langle f, g \rangle = \int_{0}^{1}{f(x)g(x) \:dx}$.

The standard monomials ($1, x, x^2$) do NOT form an orthogonal basis:

\begin{align*}
    \langle 1, x\rangle = \frac 1 2, \quad \langle 1, x^2\rangle = \frac 1 3, \quad \langle x, x^2\rangle = \frac 1 4
\end{align*}

One orthogonal basis for $P^{(2)}$ is:

\begin{align*}
    p_1(x) = 1,\quad p_2(x) = x - \frac 1 2, \quad p_3(x) = x^2 - x + \frac 1 6
\end{align*}

For example, 

\begin{align*}
    \langle p_1, p_2 \rangle = \int_{0}^{1}{1\cdot (x - \frac 1 2) \: dx} &= \int_{0}^{1}{x \:dx} - \frac 1 2 \int_{0}^{1}{dx} \\
    &= \left.\frac{x^2}{2}\right\rvert_0^1 - \left.\frac{x}{2}\right\rvert_0^1\\
    &= \frac 1 2 - 0 - \frac 1 2 + 0 = 0
\end{align*}

With a little bit more calculus, you can check that $\langle p_1, p_3\rangle = \langle p_2, p_3\rangle = 0$ and that

\begin{align*}
    \|p_1\| = 1, \quad \|p_2\| = \frac {1}{2\sqrt 3}, \quad \|p_3\| =\frac{1}{6\sqrt 5}
\end{align*}

If we now want to compute the coordinates $c_1, c_2, c_3$ of a quadratic polynomial

\begin{align*}
    p(x) = c_1p_1(x) + c_2p_2(x) + c_3p_3(x)
\end{align*}

we simply compute some inner products:

\begin{align*}
c_1 = \frac{\langle p, p_1 \rangle}{\|p_1\|^2},\quad c_2 = \frac{\langle p, p_2\rangle}{\|p_2\|^2}, \quad c_3 = \frac{\langle p, p_3\rangle}{\|p_3\|^2}
\end{align*}

So for example, if $p(x) = x^2 + x + 1$, then

\begin{align*}
    c_1 = \frac{\int_{0}^{1}{(x^2 + x + 1)\cdot 1\:dx}}{1} = \frac{11}{6},\\
    \quad c_2 = \frac{\int_{0}^{1}{(x^2 + x + 1)(x - \frac 1 2)\:dx}}{\left(\frac{1}{12}\right)} = 2, \\
    \quad c_3 = \frac{\int_{0}^{1}{ (x^2 + x + 1)(x^2 - x + \frac 1 6) \:dx}}{(\frac{1}{180})} = 1
\end{align*}

so that $p(x) = x^2 + x + 1 = \frac{11}{6} + 2(x - \frac 1 2) + (x^2 - x + \frac 1 6)$.

While this may look very abstract, this is exactly the same mechanism underpinning things like the Discrete Fourier Transform, which re-expresses a signal in a (complex) orthonormal basis of a function space, where each basis element is a complex sinusoid.

:::
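The integrals in the polynomial example above can be checked exactly with a few lines of Python. This is a sketch under assumptions made here, not part of the course material: polynomials are stored as coefficient lists in increasing degree, and the helper `inner` uses the fact that $\int_0^1 x^i x^j \, dx = \frac{1}{i + j + 1}$.

```python
from fractions import Fraction

def inner(f, g):
    """Integral inner product <f, g> on [0, 1], computed exactly.

    f and g are coefficient lists in increasing degree, so the term
    a * x^i times b * x^j contributes a * b / (i + j + 1).
    """
    return sum(Fraction(a) * Fraction(b) / (i + j + 1)
               for i, a in enumerate(f)
               for j, b in enumerate(g))

# Orthogonal basis p_1, p_2, p_3 from the example above.
p1 = [1]                            # 1
p2 = [Fraction(-1, 2), 1]           # x - 1/2
p3 = [Fraction(1, 6), -1, 1]        # x^2 - x + 1/6
basis = [p1, p2, p3]

# The polynomial to expand: p(x) = x^2 + x + 1.
p = [1, 1, 1]

# Squared norms ||p_i||^2 and coordinates c_i = <p, p_i> / ||p_i||^2.
norms_sq = [inner(q, q) for q in basis]
coords = [inner(p, q) / inner(q, q) for q in basis]

print([str(n) for n in norms_sq])   # ['1', '1/12', '1/180']
print([str(c) for c in coords])     # ['11/6', '2', '1']
```

Using `Fraction` instead of floating point keeps every value exact, so the output can be compared directly with the hand computation.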