# Orthogonal and orthonormal bases

You already know that any two vectors that do not lie on the same line form a basis of the plane. However, if you ask a random person to draw two coordinate axes on a checkered piece of paper, they will most likely draw two **perpendicular** lines.

This intuition reflects an important idea: the concept of **orthogonality** complements the notion of choosing a basis in a vector space in a very natural and powerful way.

The combination of **orthogonality** and **bases** is extremely fruitful and appears throughout mathematics and its applications. It plays a fundamental role in areas such as:
- dimensionality reduction in machine learning,
- Fourier analysis,
- signal processing,
- and many other areas of science and engineering.

The scope of applications of orthogonal bases is so vast that providing a complete overview would be difficult. Therefore, we will focus on understanding the **foundations** of these ideas and where they come from.

## Example: Constructing an Orthogonal and Orthonormal Basis

Consider a two-dimensional Euclidean space
$$
(V, \langle \cdot , \cdot \rangle).
$$
Choose arbitrary vectors $\vec v_1$ and $\vec v_2$ forming a basis of $V$.

Define a new vector
$$
\vec v_2' = \vec v_2 - \frac{\langle \vec v_1, \vec v_2 \rangle}{\langle \vec v_1, \vec v_1 \rangle}\,\vec v_1.
$$

This vector is of particular interest because it is **orthogonal** to $\vec v_1$.

### Orthogonality Check

Indeed,
$$
\begin{aligned}
\langle \vec v_1, \vec v_2' \rangle
&= \left\langle \vec v_1,\; \vec v_2 - \frac{\langle \vec v_1, \vec v_2 \rangle}{\langle \vec v_1, \vec v_1 \rangle}\vec v_1 \right\rangle \\
&= \langle \vec v_1, \vec v_2 \rangle
- \frac{\langle \vec v_1, \vec v_2 \rangle}{\langle \vec v_1, \vec v_1 \rangle}
\langle \vec v_1, \vec v_1 \rangle \\
&= 0.
\end{aligned}
$$

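Since $\{\vec v_1, \vec v_2\}$ is a basis and $\vec v_2'$ differs from $\vec v_2$ only by a multiple of $\vec v_1$, the vectors $\vec v_1$ and $\vec v_2'$ are linearly independent: if $a\vec v_1 + b\vec v_2' = \vec 0$, then expanding $\vec v_2'$ gives a linear combination of $\vec v_1$ and $\vec v_2$ in which the coefficient of $\vec v_2$ is $b$, so $b = 0$ and hence $a = 0$. Therefore,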
$$
\{\vec v_1, \vec v_2'\}
$$
is also a basis of $V$, now consisting of **orthogonal vectors**.

Such a basis is called an **orthogonal basis**.
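
As a quick sanity check, the construction can be tried on concrete numbers. The sketch below assumes $V=\mathbb{R}^2$ with the usual dot product; the vectors `v1`, `v2` and the helper `dot` are illustrative choices, not part of the derivation above.

```python
from fractions import Fraction

def dot(a, b):
    """Standard dot product, used here as the inner product (an assumption)."""
    return sum(x * y for x, y in zip(a, b))

# Illustrative basis of R^2 (any non-collinear pair would do).
v1 = (Fraction(2), Fraction(1))
v2 = (Fraction(1), Fraction(3))

# v2' = v2 - (<v1, v2> / <v1, v1>) v1
coeff = dot(v1, v2) / dot(v1, v1)
v2_prime = tuple(b - coeff * a for a, b in zip(v1, v2))

print([str(x) for x in v2_prime])  # ['-1', '2']
print(dot(v1, v2_prime))           # 0
```

Rational arithmetic is used so that the inner product comes out as exactly zero rather than a small rounding error.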

### Numerical Example

Suppose
$$
\langle \vec v_1, \vec v_1 \rangle = 9, \quad
\langle \vec v_2, \vec v_2 \rangle = 8, \quad
\langle \vec v_1, \vec v_2 \rangle = 6.
$$

Then
$$
\vec v_2' = \vec v_2 - \frac{6}{9}\vec v_1
= \vec v_2 - \frac{2}{3}\vec v_1.
$$

Thus,
$$
\{\vec v_1,\; \vec v_2 - \tfrac{2}{3}\vec v_1\}
$$
is an orthogonal basis.

### Lengths of the Orthogonal Vectors

The norms are:
$$
\|\vec v_1\| = \sqrt{\langle \vec v_1, \vec v_1 \rangle} = 3,
$$
and
$$
\begin{aligned}
\|\vec v_2'\|
&= \sqrt{\langle \vec v_2 - \tfrac{2}{3}\vec v_1,\; \vec v_2 - \tfrac{2}{3}\vec v_1 \rangle} \\
&= \sqrt{
\langle \vec v_2, \vec v_2 \rangle
- \tfrac{4}{3}\langle \vec v_1, \vec v_2 \rangle
+ \tfrac{4}{9}\langle \vec v_1, \vec v_1 \rangle
} \\
&= \sqrt{8 - 8 + 4} = 2.
\end{aligned}
$$
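
Note that this computation never used coordinates of $\vec v_1$ and $\vec v_2$, only the three given inner products. The following sketch makes that explicit: a vector is stored as its pair of coefficients with respect to $\{\vec v_1, \vec v_2\}$, and inner products are evaluated by bilinearity from the table of given values. The names `G` and `inner` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction

# Table of inner products <v_i, v_j> taken from the example above.
G = [[9, 6],
     [6, 8]]

def inner(a, b):
    """<a, b> for vectors given by coefficients w.r.t. {v1, v2} (bilinearity)."""
    return sum(a[i] * b[j] * G[i][j] for i in range(2) for j in range(2))

v1 = (Fraction(1), Fraction(0))            # v1  = 1*v1 + 0*v2
v2_prime = (Fraction(-2, 3), Fraction(1))  # v2' = -2/3*v1 + 1*v2

print(inner(v1, v2_prime))        # 0  -> orthogonal
print(inner(v1, v1))              # 9  -> norm 3
print(inner(v2_prime, v2_prime))  # 4  -> norm 2
```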


### Normalization

Dividing a vector by its norm produces a **unit vector** (length $1$). This process is called **normalization**.

For any vector $\vec w \neq 0$,
$$
\left\|\frac{\vec w}{\|\vec w\|}\right\| = 1.
$$

Normalize $\vec v_1$ and $\vec v_2'$:
$$
\vec e_1 = \frac{\vec v_1}{\|\vec v_1\|} = \frac{1}{3}\vec v_1,
$$
$$
\vec e_2 = \frac{\vec v_2'}{\|\vec v_2'\|}
= \frac{1}{2}\vec v_2'
= \frac{1}{2}\vec v_2 - \frac{1}{3}\vec v_1.
$$

### Result: Orthonormal Basis

The vectors $\vec e_1$ and $\vec e_2$ form a basis of $V$ that is both **orthogonal** and **normalized**:

$$
\langle \vec e_1, \vec e_1 \rangle = 1, \quad
\langle \vec e_2, \vec e_2 \rangle = 1, \quad
\langle \vec e_1, \vec e_2 \rangle = 0.
$$

These relations are exactly the same as those of the standard basis in $\mathbb{R}^2$ with the usual dot product.

This procedure is the two-dimensional case of the **Gram–Schmidt orthonormalization process**.
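
To see the whole procedure on actual arrows, one possible pair of vectors realizing the inner products of this example is $\vec v_1=(3,0)$ and $\vec v_2=(2,2)$ in $\mathbb{R}^2$ with the dot product. This choice is an assumption made only for illustration; the text does not fix any coordinates. A minimal sketch:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(w):
    """Divide a nonzero vector by its norm."""
    n = math.sqrt(dot(w, w))
    return tuple(x / n for x in w)

v1 = (3.0, 0.0)
v2 = (2.0, 2.0)
# These vectors reproduce the inner products 9, 8 and 6 of the example.
assert (dot(v1, v1), dot(v2, v2), dot(v1, v2)) == (9.0, 8.0, 6.0)

c = dot(v1, v2) / dot(v1, v1)                        # 2/3
v2_prime = tuple(b - c * a for a, b in zip(v1, v2))  # orthogonal to v1

e1 = normalize(v1)
e2 = normalize(v2_prime)
print(e1, e2)   # approximately (1, 0) and (0, 1)
```

For this particular choice, the orthonormal basis produced is (up to rounding) the standard basis of $\mathbb{R}^2$.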

## Geometric Side of the Story

All the algebraic manipulations introduced earlier may seem somewhat overcomplicated at first. However, when viewed from a **geometric perspective**, they become much more intuitive and natural.

Let us once again consider two vectors
$$
\vec v_1 \quad \text{and} \quad \vec v_2,
$$
but now interpret them as **arrows in the plane**.

From a geometric standpoint, the construction
$$
\vec v_2' = \vec v_2 - \frac{\langle \vec v_1, \vec v_2 \rangle}{\langle \vec v_1, \vec v_1 \rangle}\,\vec v_1
$$
has a very clear meaning.

The term
$$
\frac{\langle \vec v_1, \vec v_2 \rangle}{\langle \vec v_1, \vec v_1 \rangle}\,\vec v_1
$$
is the **projection of $\vec v_2$ onto $\vec v_1$**.

Therefore, $\vec v_2'$ is obtained by **subtracting the projection of $\vec v_2$ onto $\vec v_1$** from $\vec v_2$. What remains is precisely the component of $\vec v_2$ that is **perpendicular** to $\vec v_1$.

Geometrically:
- $\vec v_1$ defines a direction,
- the projection extracts the part of $\vec v_2$ aligned with that direction,
- subtracting it removes all parallel influence,
- leaving a vector orthogonal to $\vec v_1$.

This is why
$$
\langle \vec v_1, \vec v_2' \rangle = 0.
$$
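
The decomposition of $\vec v_2$ into a part parallel to $\vec v_1$ and a part perpendicular to it can be sketched as follows. The helper `project` and the sample vectors are hypothetical, and the inner product is assumed to be the standard dot product.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(v, onto):
    """Projection of v onto the line spanned by the nonzero vector `onto`."""
    c = dot(v, onto) / dot(onto, onto)
    return tuple(c * x for x in onto)

v1 = (4.0, 0.0)
v2 = (3.0, 2.0)

parallel = project(v2, v1)                                  # part of v2 along v1
perpendicular = tuple(a - b for a, b in zip(v2, parallel))  # this is v2'

print(parallel)                # (3.0, 0.0)
print(perpendicular)           # (0.0, 2.0)
print(dot(v1, perpendicular))  # 0.0
```

Adding `parallel` and `perpendicular` gives back `v2`, which is the picture of an arrow split into two perpendicular legs.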

Once this orthogonal vector is constructed, **normalization** simply rescales the arrows so that they have **unit length**, without changing their directions.

As a result, the Gram–Schmidt process can be understood geometrically as:
1. keeping the first direction,
2. removing parallel components from the second,
3. rescaling vectors to unit length.

This geometric interpretation explains why orthonormal bases are so natural, stable, and powerful in applications ranging from coordinate systems to projections, Fourier analysis, and machine learning.

## Full Geometric Interpretation in $\mathbb{R}^3$

Let us now extend the geometric intuition behind orthogonalization to three dimensions.

Assume we are working in a Euclidean space $(\mathbb{R}^3, \langle \cdot, \cdot \rangle)$ and are given three **linearly independent vectors**
$$
\vec v_1,\; \vec v_2,\; \vec v_3.
$$
Geometrically, these vectors span the entire three-dimensional space.

Our goal is to construct an **orthonormal basis**
$$
\{\vec e_1,\vec e_2,\vec e_3\}
$$
that spans the same space, using only geometric operations: **projection, subtraction, and normalization**.

---

### Step 1: First Direction

We begin with
$$
\vec e_1 = \frac{\vec v_1}{\|\vec v_1\|}.
$$

**Geometric meaning:**
We simply choose the direction of $\vec v_1$ and rescale it to have length 1.
This defines the first axis of our new coordinate system.

---

### Step 2: Removing the Parallel Component

Next, we construct a vector orthogonal to $\vec e_1$ from $\vec v_2$:
$$
\vec u_2 = \vec v_2 - \langle \vec v_2,\vec e_1\rangle\,\vec e_1.
$$

**Geometric meaning:**
- $\langle \vec v_2,\vec e_1\rangle\,\vec e_1$ is the **projection of $\vec v_2$ onto $\vec e_1$**
- subtracting it removes all motion along $\vec e_1$
- $\vec u_2$ lies entirely in the plane perpendicular to $\vec e_1$

Now normalize:
$$
\vec e_2 = \frac{\vec u_2}{\|\vec u_2\|}.
$$

We now have **two perpendicular unit vectors**, defining a plane.

---

### Step 3: Removing Components Along a Plane

The third vector requires removing *two* projections:
$$
\vec u_3
= \vec v_3
- \langle \vec v_3,\vec e_1\rangle\,\vec e_1
- \langle \vec v_3,\vec e_2\rangle\,\vec e_2.
$$

**Geometric meaning:**
- the first subtraction removes the component along $\vec e_1$
- the second removes the component lying in the $\vec e_2$ direction
- what remains is **perpendicular to the entire plane** spanned by $\vec e_1$ and $\vec e_2$

Normalize:
$$
\vec e_3 = \frac{\vec u_3}{\|\vec u_3\|}.
$$

---

### Final Result

The vectors
$$
\{\vec e_1,\vec e_2,\vec e_3\}
$$
satisfy
$$
\langle \vec e_i,\vec e_j\rangle =
\begin{cases}
1, & i=j, \\
0, & i\neq j,
\end{cases}
$$
meaning:
- all vectors have length 1
- all vectors are mutually perpendicular

Geometrically, this is exactly the same structure as the standard axes
$$
(1,0,0),\;(0,1,0),\;(0,0,1),
$$
but **aligned with the original data**.
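
The three steps above can be sketched in code as follows. This is only an illustration under stated assumptions: the inner product is the standard dot product, the input vectors are an arbitrary linearly independent triple, and the helper names (`sub`, `scale`, `unit`) are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(c, a):
    return tuple(c * x for x in a)

def unit(a):
    """Rescale a nonzero vector to length 1."""
    return scale(1.0 / math.sqrt(dot(a, a)), a)

# Illustrative linearly independent vectors.
v1, v2, v3 = (1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)

e1 = unit(v1)                               # Step 1: first direction
u2 = sub(v2, scale(dot(v2, e1), e1))        # Step 2: remove the part along e1
e2 = unit(u2)
u3 = sub(sub(v3, scale(dot(v3, e1), e1)),   # Step 3: remove the parts along
         scale(dot(v3, e2), e2))            #         e1 and e2
e3 = unit(u3)

# Table of inner products <e_i, e_j>; it should be the identity up to rounding.
for a in (e1, e2, e3):
    print([round(dot(a, b), 12) for b in (e1, e2, e3)])
```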

---

### Key Geometric Insight

In $\mathbb{R}^3$, Gram–Schmidt can be understood as:

1. Choose a direction
2. Project the second vector onto the plane perpendicular to that direction
3. Remove from the third vector everything lying in the plane spanned by the first two
4. Normalize at each step

Each step removes **unwanted geometric influence**, leaving only new independent directions.

---

### Why This Matters

This geometric process explains why orthonormal bases are:
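- numerically well behaved (an orthonormal change of coordinates preserves lengths, so it does not amplify rounding errors)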
- easy to interpret
- essential for projections, least squares, PCA, and Fourier methods

Many data-driven methods that “rotate” space into meaningful axes, PCA among them, rely on this geometry.

## Higher Dimensions and Orthonormal Bases

The idea of a basis consisting of **unit vectors that are mutually orthogonal** generalizes naturally to any finite dimension.

Let $(V,\langle \cdot,\cdot\rangle)$ be a **Euclidean space** with
$$
\dim(V)=n.
$$

---

### Orthogonal and Orthonormal Bases

A basis
$$
\{\vec e_1,\vec e_2,\dots,\vec e_n\}
$$
is called **orthogonal** if
$$
\langle \vec e_i,\vec e_j\rangle = 0
\quad \text{for all } i\neq j.
$$

It is called **orthonormal** if, in addition,
$$
\langle \vec e_i,\vec e_i\rangle = 1
\quad \text{for all } i.
$$

---

### Kronecker Delta Notation

The **Kronecker delta** is defined as
$$
\delta_{i,j} =
\begin{cases}
1, & i=j, \\
0, & i\neq j.
\end{cases}
$$

Using this notation, a basis is orthonormal if and only if
$$
\langle \vec e_i,\vec e_j\rangle = \delta_{i,j}
\quad \text{for all } i,j\in\{1,\dots,n\}.
$$

This compactly encodes both orthogonality and unit length.
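
The condition $\langle \vec e_i,\vec e_j\rangle = \delta_{i,j}$ translates directly into a check on all pairs of vectors. Below is a minimal sketch, assuming vectors in $\mathbb{R}^n$ with the standard dot product; the function names and the tolerance `tol` are illustrative choices. It tests the orthonormality condition only; recall that $n$ orthonormal vectors in an $n$-dimensional space automatically form a basis.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def kronecker_delta(i, j):
    return 1 if i == j else 0

def is_orthonormal(vectors, tol=1e-12):
    """True if <e_i, e_j> equals delta_{i,j} for all i, j, up to `tol`."""
    return all(
        abs(dot(ei, ej) - kronecker_delta(i, j)) <= tol
        for i, ei in enumerate(vectors)
        for j, ej in enumerate(vectors)
    )

print(is_orthonormal([(1, 0), (0, 1)]))            # True
print(is_orthonormal([(1, 0), (1, 1)]))            # False: not orthogonal
print(is_orthonormal([(2, 0), (0, 2)]))            # False: orthogonal, not unit length
print(is_orthonormal([(0.6, 0.8), (-0.8, 0.6)]))   # True
```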

---

## Gram–Schmidt Process (General Case)

Let
$$
\{\vec v_1,\vec v_2,\dots,\vec v_n\}
$$
be **any basis** of $V$.

We construct an orthonormal basis
$$
\{\vec e_1,\vec e_2,\dots,\vec e_n\}
$$
as follows.

---

### Step 1: Orthogonalization

Define vectors $\vec w_k$ recursively:

- First vector:
$$
\vec w_1 = \vec v_1.
$$

- For $k=2,\dots,n$:
$$
\vec w_k
= \vec v_k
- \sum_{j=1}^{k-1}
\operatorname{proj}_{\vec w_j}(\vec v_k),
$$
where
$$
\operatorname{proj}_{\vec w_j}(\vec v_k)
=
\frac{\langle \vec v_k,\vec w_j\rangle}
     {\langle \vec w_j,\vec w_j\rangle}
\,\vec w_j.
$$

The vectors $\vec w_1,\dots,\vec w_n$ are **mutually orthogonal**, and each of them is nonzero because $\vec v_1,\dots,\vec v_n$ are linearly independent.

---

### Step 2: Normalization

Normalize each vector:
$$
\vec e_k = \frac{\vec w_k}{\|\vec w_k\|}
\quad \text{for } k=1,\dots,n.
$$

Then
$$
\{\vec e_1,\vec e_2,\dots,\vec e_n\}
$$
is an **orthonormal basis** of $V$.
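
The two steps can be sketched in a few lines of code. This is a plain transcription of the formulas above (often called the classical variant), not a production routine; the function name `gram_schmidt` and the optional `inner` parameter are assumptions introduced for illustration, and the default inner product is the standard dot product.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors, inner=dot):
    """Return (ws, es): orthogonal vectors w_k and their normalizations e_k.

    Assumes `vectors` is linearly independent; otherwise some w_k is zero
    and a division below fails.
    """
    ws = []
    for v in vectors:
        w = list(v)
        for u in ws:
            c = inner(v, u) / inner(u, u)             # coefficient of proj_u(v)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        ws.append(w)
    es = [[x / math.sqrt(inner(w, w)) for x in w] for w in ws]
    return ws, es

ws, es = gram_schmidt([(2.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)])
print(es)   # [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

Passing a different `inner` function lets the same sketch run in any Euclidean space whose vectors are stored as coordinate lists.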

---

## Example in $\mathbb{R}^3$

Let
$$
\vec v_1=
\begin{pmatrix}1\\0\\1\end{pmatrix},
\quad
\vec v_2=
\begin{pmatrix}1\\-2\\0\end{pmatrix},
\quad
\vec v_3=
\begin{pmatrix}1\\-1\\1\end{pmatrix}.
$$

---

### Step 1

$$
\vec w_1=\vec v_1,
\quad
\vec e_1=\frac{1}{\sqrt{2}}
\begin{pmatrix}1\\0\\1\end{pmatrix}.
$$

---

### Step 2

$$
\vec w_2
=\vec v_2
-\frac{\langle \vec v_2,\vec w_1\rangle}
       {\langle \vec w_1,\vec w_1\rangle}
\vec w_1
=
\begin{pmatrix}
\frac12\\-2\\-\frac12
\end{pmatrix}.
$$

Normalize, using $\|\vec w_2\| = \sqrt{\tfrac14 + 4 + \tfrac14} = \tfrac{3}{\sqrt{2}}$:
$$
\vec e_2=
\frac{1}{3\sqrt{2}}
\begin{pmatrix}
1\\-4\\-1
\end{pmatrix}.
$$

---

### Step 3

$$
\vec w_3
=\vec v_3
-\operatorname{proj}_{\vec w_1}(\vec v_3)
-\operatorname{proj}_{\vec w_2}(\vec v_3)
=
\begin{pmatrix}
-\frac{2}{9}\\-\frac{1}{9}\\\frac{2}{9}
\end{pmatrix}.
$$

Normalize:
$$
\vec e_3=
\frac{1}{3}
\begin{pmatrix}
-2\\-1\\2
\end{pmatrix}.
$$

---

### Final Result

The vectors
$$
\{\vec e_1,\vec e_2,\vec e_3\}
$$
form an **orthonormal basis of $\mathbb{R}^3$**, satisfying
$$
\langle \vec e_i,\vec e_j\rangle=\delta_{i,j}.
$$
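
The intermediate vectors of this example can be double-checked with exact rational arithmetic. The sketch below is only a verification aid; the helper `remove_projection` is a hypothetical name, and the inner product is the standard dot product.

```python
from fractions import Fraction as F

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def remove_projection(target, v, w):
    """Return target - proj_w(v)."""
    c = dot(v, w) / dot(w, w)
    return [t - c * x for t, x in zip(target, w)]

v1 = [F(1), F(0), F(1)]
v2 = [F(1), F(-2), F(0)]
v3 = [F(1), F(-1), F(1)]

w1 = v1
w2 = remove_projection(v2, v2, w1)
w3 = remove_projection(remove_projection(v3, v3, w1), v3, w2)

print([str(x) for x in w2])                    # ['1/2', '-2', '-1/2']
print([str(x) for x in w3])                    # ['-2/9', '-1/9', '2/9']
print(dot(w1, w2), dot(w1, w3), dot(w2, w3))   # 0 0 0
print(dot(w1, w1), dot(w2, w2), dot(w3, w3))   # 2 9/2 1/9
```

The last line lists the squared norms, so the norms are $\sqrt{2}$, $3/\sqrt{2}$ and $1/3$.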

---

### Geometric Interpretation

At each step, Gram–Schmidt:
1. Removes components aligned with previous directions
2. Leaves only perpendicular components
3. Normalizes lengths to 1

This is exactly the geometric process of **constructing perpendicular coordinate axes aligned with given data**.

## Features of an Orthonormal Basis

The key advantage of choosing an **orthonormal basis** in a Euclidean space $V$ is that it makes $V$ behave exactly like $\mathbb{R}^n$ equipped with the **standard dot product**, where
$$
n = \dim(V).
$$

Let us see why this is the case.

---

### Inner Products in an Arbitrary Basis

Let
$$
\{\vec e_1, \vec e_2, \dots, \vec e_n\}
$$
be an arbitrary basis of $V$.

Any vectors $\vec a, \vec b \in V$ can be written as
$$
\vec a = a_1 \vec e_1 + a_2 \vec e_2 + \cdots + a_n \vec e_n,
\quad
\vec b = b_1 \vec e_1 + b_2 \vec e_2 + \cdots + b_n \vec e_n.
$$

Their inner product is then
$$
\langle \vec a, \vec b \rangle
=
\sum_{i=1}^n a_i b_i \langle \vec e_i, \vec e_i \rangle
+
\sum_{i<j} (a_i b_j + a_j b_i)\langle \vec e_i, \vec e_j \rangle.
$$

This expression becomes cumbersome as $n$ grows:
- There are $n(n-1)/2$ cross terms with $i<j$, so their number grows **quadratically** with $n$.
- For $n \ge 4$, the cross terms outnumber the $n$ diagonal terms.
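
To make the bookkeeping concrete, here is a small sketch that evaluates the formula above from coordinates and a table of the values $\langle \vec e_i, \vec e_j\rangle$. The table `gram` and both function names are hypothetical; the numbers describe an arbitrary non-orthogonal basis of a two-dimensional space.

```python
def inner_from_gram(a, b, gram):
    """<a, b> from coordinates a, b and the table gram[i][j] = <e_i, e_j>."""
    n = len(a)
    diagonal = sum(a[i] * b[i] * gram[i][i] for i in range(n))
    cross = sum((a[i] * b[j] + a[j] * b[i]) * gram[i][j]
                for i in range(n) for j in range(i + 1, n))
    return diagonal + cross

# Hypothetical non-orthogonal basis: <e1,e1> = 9, <e2,e2> = 8, <e1,e2> = 6.
gram = [[9, 6],
        [6, 8]]
print(inner_from_gram((1, 2), (3, -1), gram))   # 41

def cross_term_count(n):
    """Number of index pairs (i, j) with i < j."""
    return n * (n - 1) // 2

print([(n, cross_term_count(n)) for n in (2, 3, 4, 10)])
# [(2, 1), (3, 3), (4, 6), (10, 45)]
```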

---

### Simplification in an Orthonormal Basis

Now assume the basis is **orthonormal**, meaning
$$
\langle \vec e_i, \vec e_j \rangle = \delta_{i,j}.
$$

Then:
- All cross terms vanish ($\langle \vec e_i, \vec e_j \rangle = 0$ for $i \ne j$)
- All diagonal terms equal 1

Hence,
$$
\boxed{
\langle \vec a, \vec b \rangle
=
a_1 b_1 + a_2 b_2 + \cdots + a_n b_n
}
$$

This is exactly the **standard dot product in $\mathbb{R}^n$**.
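
This simplification can be observed numerically. The sketch below assumes $V=\mathbb{R}^3$ with the standard dot product and uses one particular orthonormal basis with rational entries, chosen only so that the arithmetic stays exact; the helper `combine` is a hypothetical name.

```python
from fractions import Fraction as F

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def combine(coords, basis):
    """Return coords[0]*basis[0] + ... + coords[n-1]*basis[n-1]."""
    return [sum(c * e[k] for c, e in zip(coords, basis))
            for k in range(len(basis[0]))]

# An orthonormal basis of R^3 with rational entries (illustrative choice).
basis = [[F(1, 3), F(2, 3), F(2, 3)],
         [F(2, 3), F(1, 3), F(-2, 3)],
         [F(2, 3), F(-2, 3), F(1, 3)]]

a_coords = [F(1), F(2), F(3)]
b_coords = [F(4), F(-1), F(2)]

a = combine(a_coords, basis)   # the actual vectors in R^3
b = combine(b_coords, basis)

print(dot(a, b))                 # 8, inner product of the vectors
print(dot(a_coords, b_coords))   # 8, dot product of their coordinates
```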

---

### Coordinate Representation

Writing vectors in an orthonormal basis is equivalent to working with column vectors in $\mathbb{R}^n$:
$$
\vec a \leftrightarrow
\begin{pmatrix}
a_1\\
a_2\\
\vdots\\
a_n
\end{pmatrix},
\quad
\vec b \leftrightarrow
\begin{pmatrix}
b_1\\
b_2\\
\vdots\\
b_n
\end{pmatrix}.
$$

Vector addition and scalar multiplication become
$$
\begin{pmatrix}
a_1\\
a_2\\
\vdots\\
a_n
\end{pmatrix}
+
\begin{pmatrix}
b_1\\
b_2\\
\vdots\\
b_n
\end{pmatrix}
=
\begin{pmatrix}
a_1+b_1\\
a_2+b_2\\
\vdots\\
a_n+b_n
\end{pmatrix},
$$

$$
\lambda
\begin{pmatrix}
a_1\\
a_2\\
\vdots\\
a_n
\end{pmatrix}
=
\begin{pmatrix}
\lambda a_1\\
\lambda a_2\\
\vdots\\
\lambda a_n
\end{pmatrix}.
$$

Thus, **working in an orthonormal basis is algebraically identical to working in $\mathbb{R}^n$**.

---

### Expansion via Projections

Let $\vec x \in V$ and let
$$
\{\vec e_1,\dots,\vec e_n\}
$$
be an orthonormal basis. Suppose
$$
\vec x = x_1 \vec e_1 + \cdots + x_n \vec e_n.
$$

Consider the sum
$$
\sum_{i=1}^n \langle \vec x, \vec e_i \rangle \vec e_i.
$$

Using orthonormality:
$$
\langle \vec x, \vec e_i \rangle
=
\left\langle
\sum_{j=1}^n x_j \vec e_j, \vec e_i
\right\rangle
=
x_i.
$$

Therefore,
$$
\boxed{
\vec x
=
\sum_{i=1}^n \langle \vec x, \vec e_i \rangle \vec e_i
}
$$
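
The expansion formula is easy to test on an example. The following sketch assumes $\mathbb{R}^3$ with the standard dot product; the orthonormal basis and the vector `x` are arbitrary illustrative choices with rational entries, so the comparison at the end is exact.

```python
from fractions import Fraction as F

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# An orthonormal basis of R^3 with rational entries (illustrative choice).
basis = [[F(1, 3), F(2, 3), F(2, 3)],
         [F(2, 3), F(1, 3), F(-2, 3)],
         [F(2, 3), F(-2, 3), F(1, 3)]]

x = [F(3), F(0), F(-6)]

# Coordinates are obtained directly as inner products <x, e_i>.
coords = [dot(x, e) for e in basis]

# Rebuild x as the sum of <x, e_i> e_i.
rebuilt = [sum(c * e[k] for c, e in zip(coords, basis)) for k in range(3)]

print([str(c) for c in coords])   # ['-3', '6', '0']
print(rebuilt == x)               # True
```

No linear system has to be solved to find the coordinates; with a non-orthogonal basis this shortcut would not be available.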

---

### Interpretation

- The coefficients $x_i = \langle \vec x, \vec e_i \rangle$ are the scalar **projections** of $\vec x$ onto the basis vectors.
- Coordinates in an orthonormal basis are obtained **directly via inner products**.
- This property underlies:
  - Fourier series
  - Principal Component Analysis (PCA)
  - QR decomposition
  - Signal processing
  - Machine learning embeddings

---

### Final Takeaway

Choosing an orthonormal basis:
- Eliminates cross terms
- Simplifies inner products
- Turns abstract Euclidean spaces into $\mathbb{R}^n$
- Makes geometry, computation, and interpretation transparent

## Conclusion

Let $(V,\langle \cdot,\cdot\rangle)$ be a Euclidean space with
$$
\dim(V)=n.
$$

### Normalization

The **normalization** of a nonzero vector $\vec v \in V$ is the unit vector
$$
\frac{1}{\|\vec v\|}\,\vec v.
$$

### Orthogonal and Orthonormal Bases

A basis
$$
\{\vec e_1,\vec e_2,\dots,\vec e_n\}
$$
is called **orthogonal** if
$$
\langle \vec e_i,\vec e_j\rangle = 0 \quad \text{for all } i\neq j.
$$

A basis
$$
\{\vec e_1,\vec e_2,\dots,\vec e_n\}
$$
is called **orthonormal** if
$$
\langle \vec e_i,\vec e_j\rangle = \delta_{i,j},
$$
where $\delta_{i,j}$ is the Kronecker delta.

### Coordinate Representation

If vectors in $V$ are written with respect to an orthonormal basis, then:
- The space $V$ can be identified with $\mathbb{R}^n$
- The inner product $\langle \cdot,\cdot\rangle$ becomes the standard dot product

### Gram–Schmidt Process

Given any basis of a Euclidean space, one can always construct an orthonormal basis using the **Gram–Schmidt process**.

### Expansion Formula

For any vector $\vec x\in V$ and any orthonormal basis
$$
\{\vec e_1,\vec e_2,\dots,\vec e_n\},
$$
the following identity holds:
$$
\boxed{
\vec x
=
\langle \vec x,\vec e_1\rangle\,\vec e_1
+\cdots+
\langle \vec x,\vec e_n\rangle\,\vec e_n
}
$$

This shows that the coordinates of $\vec x$ are precisely its projections onto the basis vectors.

### Final Insight

Orthonormal bases turn abstract Euclidean spaces into familiar coordinate spaces, making geometry, computation, and interpretation straightforward.