---
title: The Gram-Schmidt Process
subject: Orthogonality
subtitle: 
short_title: The Gram-Schmidt Process
authors:
  - name: Nikolai Matni
    affiliations:
      - Dept. of Electrical and Systems Engineering
      - University of Pennsylvania
    email: nmatni@seas.upenn.edu
license: CC-BY-4.0
keywords: Gram-Schmidt
math:
  '\vv': '\mathbf{#1}'
  '\bm': '\begin{bmatrix}'
  '\em': '\end{bmatrix}'
  '\R': '\mathbb{R}'
---

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/nikolaimatni/ese-2030/HEAD?labpath=/03_Orthogonality/052-gram_schmidt.ipynb)

{doc}`Lecture notes <../lecture_notes/Lecture 07 - Orthogonality, Gram-Schmidt, Orthogonal Matrices, and QR-Factorization.pdf>`

## Reading

Material related to this page, as well as additional exercises, can be found in ALA 4.2.

## Learning Objectives

By the end of this page, you should know:
- What is the Gram-Schmidt process for finding an orthogonal basis of a vector space?

# The Gram-Schmidt Process

Hopefully we've convinced you that orthogonal bases are useful, so now the natural question becomes: how do I compute one? That's where the famed Gram-Schmidt process (GSP) comes into play.

The idea behind the GSP is fairly straightforward: given an initial basis for a vector space, iteratively modify it until it is orthogonal. Let's start with a simple concrete example, and then introduce the general algorithm:

:::{prf:example} Finding an orthogonal basis
:label: computing-orthogonal-basis-ex

Let $W = \text{span}\{\vv{x_1}, \vv{x_2}\}$, where $\vv{x_1} = \bm 3\\6\\0 \em$ and $\vv{x_2} = \bm 1\\2\\2 \em$.

Since $\vv{x_1}$ and $\vv{x_2}$ are linearly independent (why?), they form a basis for the subspace $W \subseteq \mathbb{R}^3$, where $\text{dim}(W) = 2$.

However, $\vv{x_1}$ and $\vv{x_2}$ are not [orthogonal](#orthogonal-defn) because

\begin{align*}
    \langle \vv{x_1}, \vv{x_2}\rangle = 3(1) + 6(2) + 0(2) = 15.
\end{align*}

Let's use $\{\vv{v_1}, \vv{v_2} \}$ for our new basis, and set $\vv{v_1} = \vv{x_1}$. We need to find a vector $\vv{v_2}$ that is orthogonal to $\vv{v_1}$ such that $\text{span}\{\vv{v_1}, \vv{v_2}\} = W$.

Let's look at a picture first:

![Parallel and perpendicular components of $\vv{x_2}$ along $\vv{v_1}$](../figures/04-gram_schmidt.png)

From this picture, we observe that we can write $\vv{x_2} = \vv p  +\vv{v_2}$ where $\vv{p}$ is the component of $\vv{x_2}$ parallel with $\vv{x_1}$ and $\vv{v_2}$ is what's left over, i.e., the part of $\vv{x_2}$ that is orthogonal to $\vv{x_1} = \vv{v_1}$.

If $\vv p$ is parallel with $\vv{v_1}$, then we must have that $\vv{p} = c\vv{v_1}$ for some constant $c$, and therefore $\vv{v_2} = \vv{x_2} - \vv p = \vv{x_2} - c\vv{v_1}$.

Now from our [previous discussion](#coordinates-norm-orthogonal-thm), we know that $c = \frac{\langle \vv{x_2}, \vv{v_1}\rangle}{\|\vv{v_1}\|^2}$ (why?) but let's see a different way of computing $c$. We want $\langle \vv{v_2}, \vv{v_1} \rangle = 0$, so we must have:

\begin{align*}
    \langle \vv{v_2}, \vv{v_1} \rangle = \langle \vv{x_2} - c\vv{v_1}, \vv{v_1}\rangle = \langle \vv{x_2}, \vv{v_1} \rangle - c\|\vv{v_1}\|^2 = 0
\end{align*}

or equivalently, $c = \frac{\langle \vv{x_2}, \vv{v_1}\rangle}{\|\vv{v_1}\|^2}$. Therefore, $\vv{v_2} = \vv{x_2} - \frac{\langle \vv{x_2}, \vv{v_1} \rangle}{\|\vv{v_1}\|^2} \vv{v_1}$.

By construction, we have that $\langle \vv{v_1}, \vv{v_2}\rangle = 0$, and since $\vv{v_1} = \vv{x_1}$ and $\vv{v_2} = \vv{x_2} - c\vv{x_1}$, both $\vv{v_1}, \vv{v_2} \in W$. Because $\vv{v_1}$ and $\vv{v_2}$ are nonzero and orthogonal, they are linearly independent, and since they are contained in $W$ (a subspace of dimension $2$), they form a basis for $W$.

After actually plugging in our values of $\vv{x_1}$ and $\vv{x_2}$, you can verify that 

\begin{align*}
    \vv{v_1} = \bm 3\\6\\0 \em, \quad \vv{v_2} = \bm 0\\0\\2 \em
\end{align*}

is an orthogonal basis for $\text{span}\left\{\bm 3\\6\\0 \em, \bm 1\\2\\2 \em\right\}$.

:::
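As a quick numerical check of the example above, here is a minimal NumPy sketch that computes $c$ and $\vv{v_2}$ directly from the formulas. The variable names simply mirror the notation of the example; nothing here goes beyond the arithmetic already carried out by hand.

```python
import numpy as np

# Vectors from the example above.
x1 = np.array([3.0, 6.0, 0.0])
x2 = np.array([1.0, 2.0, 2.0])

v1 = x1
c = np.dot(x2, v1) / np.dot(v1, v1)  # <x2, v1> / ||v1||^2 = 15 / 45
v2 = x2 - c * v1                     # subtract the component of x2 parallel to v1

print("c  =", c)
print("v2 =", v2)                    # should match (0, 0, 2) up to rounding
print("<v1, v2> =", np.dot(v1, v2))  # should be zero up to rounding
```

Up to floating-point rounding, this should reproduce $c = 1/3$ and $\vv{v_2} = (0, 0, 2)$.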

The Gram-Schmidt process simply repeats [this process](#computing-orthogonal-basis-ex) over and over if there are more than two vectors, but the idea remains the same: at each step you subtract off the directions of the current vector that are parallel with previous ones.

:::{prf:definition} The Gram-Schmidt Process
:label: gram-schmidt-alg

Given a basis $\{\vv{x_1}, ..., \vv{x_p}\}$ for a nonzero subspace $W$ of $\mathbb{R}^n$, define

\begin{align*}
    \vv{v_1} &= \vv{x_1}\\
    \vv{v_2} &= \vv{x_2} - \frac{\langle \vv{x_2}, \vv{v_1} \rangle}{\langle \vv{v_1}, \vv{v_1}\rangle} \vv{v_1}\\
    \vv{v_3} &= \vv{x_3} - \frac{\langle \vv{x_3}, \vv{v_1} \rangle}{\langle \vv{v_1}, \vv{v_1}\rangle} \vv{v_1} - \frac{\langle \vv{x_3}, \vv{v_2} \rangle}{\langle \vv{v_2}, \vv{v_2}\rangle} \vv{v_2}\\
    \vdots\\
    \vv{v_p} &= \vv{x_p} - \frac{\langle \vv{x_p}, \vv{v_1} \rangle}{\langle \vv{v_1}, \vv{v_1}\rangle} \vv{v_1} - \frac{\langle \vv{x_p}, \vv{v_2} \rangle}{\langle \vv{v_2}, \vv{v_2}\rangle} \vv{v_2} - ... - \frac{\langle \vv{x_p}, \vv{v_{p-1}} \rangle}{\langle \vv{v_{p-1}}, \vv{v_{p-1}}\rangle} \vv{v_{p-1}}
\end{align*}

Then $\{ \vv{v_1}, ..., \vv{v_p} \}$ is an orthogonal basis for $W$. In addition,

\begin{align*}
    \text{span} \left\{ \vv{v_1}, ..., \vv{v_k} \right\} = \text{span} \left\{ \vv{x_1}, ..., \vv{x_k} \right\} \quad\text{for $1 \leq k \leq p$}.
\end{align*}

:::
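The definition above translates almost line by line into code. The sketch below is one possible implementation, not a numerically robust one: the helper name `gram_schmidt` and the three test vectors are our own choices for illustration, and we assume the input vectors are linearly independent. It also checks the nested span property by confirming that stacking $\vv{v_1}, ..., \vv{v_k}$ together with $\vv{x_1}, ..., \vv{x_k}$ gives a matrix of rank $k$.

```python
import numpy as np

def gram_schmidt(xs):
    """Hypothetical helper: return orthogonal v_1, ..., v_p from a basis x_1, ..., x_p."""
    vs = []
    for x in xs:
        x = np.asarray(x, dtype=float)
        v = x.copy()
        for w in vs:
            # Subtract the component of x along each previously computed v_j.
            v = v - (np.dot(x, w) / np.dot(w, w)) * w
        vs.append(v)
    return vs

# Illustrative basis of R^3 (an assumption for this sketch, not from the text).
xs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
vs = gram_schmidt(xs)

# Inner products between distinct v_i should vanish (up to rounding).
gram = np.array([[np.dot(a, b) for b in vs] for a in vs])
off_diag = gram - np.diag(np.diag(gram))
print("orthogonal:", np.allclose(off_diag, 0.0))

# span{v_1..v_k} = span{x_1..x_k}: stacking both sets should still have rank k.
for k in range(1, len(xs) + 1):
    stacked = np.vstack(vs[:k] + xs[:k])
    print("k =", k, "rank =", np.linalg.matrix_rank(stacked))
```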

````{exercise}  Finding an orthonormal basis for a subspace of $\mathbb{R}^4$
:label: gram_schmidt-ex1

Find an orthonormal basis, with respect to the standard [dot product](#dot-product-defn), for the subspace $W\subseteq \mathbb{R}^4$ consisting of all vectors that are [orthogonal](#orthogonal-defn) to the vector $\vv a = (1, 2, -1, -3)$. 

:::{hint} Click me for a hint!
:class: dropdown
First, try to find any basis for $W$! You can do this by solving the homogeneous system of equations

\begin{align*}
    \bm 1&2&-1&-3 \em \bm x_1\\x_2\\x_3\\x_4 \em = 0
\end{align*}
:::

```{solution} gram_schmidt-ex1
:class: dropdown

The first task is to find a basis for $W$. A vector $\vv x = (x_1, x_2, x_3, x_4)$ is orthogonal to $\vv a$ if and only if

\begin{align*}
    \vv x \cdot \vv a = x_1 + 2x_2 - x_3 - 3x_4 = 0
\end{align*}

Solving this in the usual way (i.e., writing $x_1$ in terms of $x_2$, $x_3$, $x_4$), we observe that the free variables are $x_2, x_3, x_4$, so that a (non-orthogonal) basis for the subspace is

\begin{align*}
    \vv{w_1} = \bm -2\\1\\0\\0\em, \quad \vv{w_2} = \bm 1\\0\\1\\0\em, \quad\vv{w_3} = \bm 3\\0\\0\\1\em
\end{align*}

because 

\begin{align*}
    \bm x_1 \\x_2\\x_3\\x_4\em = x_2 \bm -2\\1\\0\\0\em + x_3\bm 1\\0\\1\\0\em + x_4\bm 3\\0\\0\\1 \em
\end{align*}

Now we apply Gram-Schmidt to obtain an orthogonal basis: first we set $\vv{v_1} = \vv{w_1}$. To get $\vv{v_2}$, we compute:

\begin{align*}
    \vv{v_2}  = \vv{w_2} - \frac{\langle \vv{w_2}, \vv{v_1}\rangle}{\langle \vv{v_1}, \vv{v_1}\rangle} \vv{v_1} = \bm 1\\0\\1\\0 \em - \frac{-2}{5}\bm -2\\1\\0\\0 \em = \bm 1/5 \\ 2/5 \\ 1 \\ 0\em
\end{align*}

Finally, we compute $\vv{v_3}$:

\begin{align*}
    \vv{v_3} &= \vv{w_3} - \frac{\langle \vv{w_3} ,\vv{v_1}\rangle}{\langle \vv{v_1}, \vv{v_1}\rangle}\vv{v_1} - \frac{\langle \vv{w_3}, \vv{v_2}\rangle}{\langle \vv{v_2}, \vv{v_2}\rangle}\vv{v_2} \\
    &= \bm 3\\0\\0\\1\em - \frac{-6}{5}\bm -2\\1\\0\\0 \em - \frac{3/5}{6/5} \bm 1/5\\2/5\\1\\0 \em\\
    &= \bm 1/2\\1\\-1/2\\1 \em 
\end{align*}

To get our hands on an orthonormal basis, we simply normalize the $\vv{v_i}$ by dividing them by their norms. An orthonormal basis is given by $\vv{u_1}, \vv{u_2}, \vv{u_3}$, where

\begin{align*}
    \vv{u_1} = \frac{\vv{v_1}}{\|\vv{v_1}\|} = \boxed{\frac{1}{\sqrt 5} \bm -2\\1\\0\\0\em}\\
    \vv{u_2} = \frac{\vv{v_2}}{\|\vv{v_2}\|} = \boxed{\frac{1}{\sqrt{6/5}} \bm 1/5 \\ 2/5 \\ 1 \\ 0\em }\\
    \vv{u_3} = \frac{\vv{v_3}}{\|\vv{v_3}\|} = \boxed{\frac{1}{\sqrt{5/2}} \bm 1/2\\1\\-1/2\\1\em }
\end{align*}

```
````
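If you want to check your answer to the exercise numerically, one option is the short sketch below. It takes the orthogonal vectors from the solution, normalizes them, and tests two things: that each $\vv{u_i}$ is orthogonal to $\vv a$, and that the $\vv{u_i}$ are orthonormal. Storing the vectors as the rows of a matrix `U` is just a convenience we assume here.

```python
import numpy as np

a = np.array([1.0, 2.0, -1.0, -3.0])

# Orthogonal vectors v_1, v_2, v_3 from the solution above.
v1 = np.array([-2.0, 1.0, 0.0, 0.0])
v2 = np.array([1 / 5, 2 / 5, 1.0, 0.0])
v3 = np.array([1 / 2, 1.0, -1 / 2, 1.0])

# Rows of U are u_i = v_i / ||v_i||.
U = np.array([v / np.linalg.norm(v) for v in (v1, v2, v3)])

in_W = np.allclose(U @ a, 0.0)                 # every u_i is orthogonal to a
orthonormal = np.allclose(U @ U.T, np.eye(3))  # <u_i, u_j> = 1 if i = j, else 0

print("all u_i orthogonal to a:", in_W)
print("u_i orthonormal:", orthonormal)
```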

:::{important}
The orthogonal basis you obtain from the GSP does depend on the order of the vectors in the original basis; different orderings will produce different bases, but they will all span the same space as the original basis.
:::
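To see the dependence on ordering concretely, the sketch below runs the process on the two vectors from the first example in both orders. The helper `gram_schmidt` is a hypothetical function written for this illustration. The two runs should give different second vectors, while stacking all four resulting vectors should still have rank $2$, since both bases span the same plane $W$.

```python
import numpy as np

def gram_schmidt(xs):
    """Hypothetical helper: orthogonalize a list of linearly independent vectors."""
    vs = []
    for x in xs:
        x = np.asarray(x, dtype=float)
        v = x.copy()
        for w in vs:
            v = v - (np.dot(x, w) / np.dot(w, w)) * w
        vs.append(v)
    return vs

x1 = np.array([3.0, 6.0, 0.0])
x2 = np.array([1.0, 2.0, 2.0])

forward = gram_schmidt([x1, x2])  # starts from v_1 = x1
reverse = gram_schmidt([x2, x1])  # starts from v_1 = x2

print("order (x1, x2):", forward)
print("order (x2, x1):", reverse)

# Both results are orthogonal bases of the same 2-dimensional subspace.
print("rank of all four vectors:", np.linalg.matrix_rank(np.vstack(forward + reverse)))
```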

:::{important}
We know that every vector space has a basis. The GSP tells us something very important: given any basis for a finite-dimensional [inner product space](#inner-product-space-defn), we can always "orthogonalize" it. That is, every finite-dimensional inner product space has an orthonormal basis!
:::


```python
# The Gram-Schmidt process

import numpy as np

x1 = np.array([-2, 1, 0, 0])
x2 = np.array([1, 0, 1, 0])
x3 = np.array([3, 0, 0, 1])

x = [x1, x2, x3] # basis that is not orthogonal
v = [x1] # initialize the orthogonal basis
u = [x1/np.linalg.norm(x1)] # initialize the orthonormal basis

for i in range(1, len(x)):
    v_i = x[i]
    for j in range(i):
        # subtract the component of x[i] along the previously computed v[j]
        v_i = v_i - (np.dot(x[i], v[j])/np.dot(v[j], v[j]))*v[j]
    v.append(v_i)
    u.append(v_i/np.linalg.norm(v_i)) # normalize to unit length

print("Orthogonal basis (not orthonormal): \n", v)
print("Orthonormal basis: \n", u)
```

```
Orthogonal basis (not orthonormal): 
 [array([-2,  1,  0,  0]), array([0.2, 0.4, 1. , 0. ]), array([ 0.5,  1. , -0.5,  1. ])]
Orthonormal basis: 
 [array([-0.89442719,  0.4472136 ,  0.        ,  0.        ]), array([0.18257419, 0.36514837, 0.91287093, 0.        ]), array([ 0.31622777,  0.63245553, -0.31622777,  0.63245553])]
```


