---
title: Orthogonal Matrices
subject:  Orthogonality
subtitle: 
short_title: Orthogonal Matrices
authors:
  - name: Nikolai Matni
    affiliations:
      - Dept. of Electrical and Systems Engineering
      - University of Pennsylvania
    email: nmatni@seas.upenn.edu
license: CC-BY-4.0
keywords: Orthogonal Matrix, QR Decomposition
math:
  '\vv': '\mathbf{#1}'
  '\bm': '\begin{bmatrix}'
  '\em': '\end{bmatrix}'
  '\R': '\mathbb{R}'
---

## Reading

Material related to this page, as well as additional exercises, can be found in ALA 4.3.

## Learning Objectives

By the end of this page, you should know:
- What is an orthogonal matrix?
- What is the QR decomposition of a square matrix?
- How can we use the QR decomposition to solve systems of equations of the form $A\vv{x} = \vv b$, with $A$ square?

# Orthogonal Matrices

Rotations and reflections play key roles in geometry, physics, robotics, quantum mechanics, airplanes, computer graphics, data science, and more. These transformations are encoded via *orthogonal matrices*, that is, matrices whose columns form an orthonormal basis for $\mathbb{R}^n$. They also play a central role in one of the most important methods of linear algebra, the *QR decomposition*.

We start with a definition.

:::{prf:definition} Orthogonal Matrix
:label: orthogonal-matrix-defn

A square matrix $Q$ is called *orthogonal* if it satisfies 

\begin{align*}
    QQ^{\top} = Q^{\top}Q = I.
\end{align*}

This means that $Q^{-1} = Q^{\top}$ (in fact, we could define orthogonal matrices this way instead), and that solving linear systems of the form $Q\vv x = \vv b$ is very easy: simply set $\vv x = Q^\top \vv b$!

Notice that $Q^\top Q = I$ implies that the columns of $Q$ are orthonormal. If $Q = [\vv{q_1}, ..., \vv{q_n}]$, then 

\begin{align*}
    (Q^\top Q)_{ij} = \vv{q_i}^\top \vv{q_j} = I_{ij} = \begin{cases} 1 \quad\text{if $i = j$}\\ 0\quad\text{if $i \neq j$}\end{cases}
\end{align*}

which is exactly the definition of an orthonormal collection of vectors. Further, since there are $n$ such vectors, they must form an [orthonormal basis](#orthonormal-basis-defn) for $\mathbb{R}^n$.
:::
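To make the definition concrete, here is a minimal numerical sketch using NumPy. The particular $3 \times 3$ matrix `Q` and right-hand side `b` are illustrative choices, not taken from the text. The sketch checks $Q^\top Q = QQ^\top = I$ up to floating-point error and then solves $Q\vv x = \vv b$ by setting $\vv x = Q^\top \vv b$.

```python
import numpy as np

# A hand-picked orthogonal matrix (an illustrative choice, not from the text):
# its columns are mutually perpendicular and each has unit length.
Q = np.array([[2.0, -2.0,  1.0],
              [1.0,  2.0,  2.0],
              [2.0,  1.0, -2.0]]) / 3.0

# Check Q^T Q = Q Q^T = I up to floating-point error.
I = np.eye(3)
is_orthogonal = bool(np.allclose(Q.T @ Q, I) and np.allclose(Q @ Q.T, I))
print("Q is orthogonal:", is_orthogonal)

# Solve Q x = b with a single matrix-vector product, x = Q^T b.
b = np.array([1.0, 2.0, 3.0])
x = Q.T @ b

# Compare against a general-purpose linear solver as a sanity check.
x_ref = np.linalg.solve(Q, b)
residual = np.linalg.norm(Q @ x - b)
print("x = Q^T b matches np.linalg.solve:", bool(np.allclose(x, x_ref)))
```

No elimination or factorization is needed here: the transpose already is the inverse, so solving the system costs only one matrix-vector product.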

Now, let's explore some of the consequences of this definition.

:::{prf:example} $2 \times 2$ orthogonal matrices
:label: orthogonal-matrices-ex1

A $2\times 2$ matrix $Q = \bm a&b\\c&d\em$ is orthogonal if and only if

\begin{align*}
 Q^\top Q = \bm a^2 + c^2 & ab + cd \\ ab + cd & b^2 + d^2\em = \bm 1 & 0\\ 0 & 1\em 
\end{align*}

or equivalently

\begin{align*}
    a^2 + c^2 = 1, \quad ab + cd = 0, \quad b^2 + d^2 = 1
\end{align*}

The first and last equations say that $\bm a\\ c \em$ and $\bm b\\ d \em$ lie on the unit circle in $\mathbb{R}^2$: a convenient and revealing way of writing this is by setting

\begin{align*}
    a = \cos \theta, \quad c= \sin \theta, \quad b = \cos \phi, \quad d = \sin \phi
\end{align*}

since $\cos^2 \theta + \sin^2\theta = 1$ for all $\theta \in \mathbb{R}$.

Our last condition is $0 = ab + cd = \cos\theta \cos \phi +\sin\theta \sin\phi = \cos(\theta - \phi)$. Now 
\begin{align*}
    \cos (\theta - \phi) = 0 &\iff \theta - \phi = \frac{\pi}{2} + 2 n \pi\quad \text{or} \quad \theta - \phi = -\frac{\pi}{2} + 2 n \pi \quad (n \in \mathbb{Z}) \\
    &\iff \phi = \theta \pm \frac{\pi}{2} \quad \text{(up to integer multiples of $2\pi$)}
\end{align*}

This means either:

* $b = -\sin\theta$ and $d = \cos\theta$ 

* or $b = \sin\theta$ and $d = -\cos \theta$

As a result, every $2\times 2$ orthogonal matrix has one of two possible forms:

\begin{align*}
    \bm \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \em \quad\text{or}\quad \bm \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \em
\end{align*}

where by convention, we restrict $\theta \in [0, 2\pi)$.

The columns of both matrices form an orthonormal basis for $\mathbb{R}^2$. The first is obtained by rotating the [standard basis](#basis_eg) $\vv{e_1}, \vv{e_2}$ through angle $\theta$, the second by first reflecting about the $x$-axis and then rotating.

![Orthogonal matrices in $\mathbb{R}^2$](../figures/04-orthogonal_matrix.png)

:::
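The two forms of a $2 \times 2$ orthogonal matrix can be checked numerically. The sketch below is one possible way to do this in NumPy. The helper names `rotation` and `reflection` and the angle `theta = 0.7` are hypothetical choices made for illustration.

```python
import numpy as np

def rotation(theta):
    """First form: rotate the plane through angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

def reflection(theta):
    """Second form: reflect about the x-axis, then rotate through theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c,  s],
                     [s, -c]])

theta = 0.7  # arbitrary angle, chosen only for illustration
R, F = rotation(theta), reflection(theta)

# Both forms satisfy Q^T Q = I.
print(bool(np.allclose(R.T @ R, np.eye(2))), bool(np.allclose(F.T @ F, np.eye(2))))

# The second form factors as (rotation by theta) times (reflection about the x-axis).
flip_x = np.array([[1.0,  0.0],
                   [0.0, -1.0]])
print(bool(np.allclose(F, R @ flip_x)))

# The first column of R is e_1 rotated through theta: (cos theta, sin theta).
e1 = np.array([1.0, 0.0])
print(R @ e1)
```

Because `flip_x` is applied to a vector before `R` in the product `R @ flip_x`, this factorization matches the description "first reflect about the $x$-axis, then rotate."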

If we think about the map $\vv x \mapsto Q \vv x$ defined by multiplication with an orthogonal matrix as rotating and/or reflecting the vector $\vv x$, then the following property should not be surprising:

:::{important}
The product of two orthogonal matrices is also orthogonal!
:::

Before grinding through some algebra, let's think about this through the lens of rotations and reflections. Multiplying $\vv x$ by a product of orthogonal matrices $Q_2Q_1$ is the same as first rotating/reflecting $\vv x$ by $Q_1$ to obtain $Q_1 \vv x$, and then rotating/reflecting $Q_1 \vv x$ by $Q_2$ to get $Q_2 Q_1 \vv x$. Now, a sequence of rotations and reflections is still ultimately a rotation and/or reflection, so we must have $Q_2 Q_1 \vv x = Q \vv x$ for some orthogonal $Q = Q_2 Q_1$.

Let's check that this intuition carries over in the math. Since $Q_1$ and $Q_2$ are orthogonal, we have that

\begin{align*}
    Q^\top_1 Q_1 = I = Q_2^\top Q_2.
\end{align*}

Let's check that $(Q_1Q_2)^\top (Q_1Q_2) = I$:

\begin{align*}
    (Q_1Q_2)^\top (Q_1Q_2) = Q_2^\top \underbrace{Q_1^\top Q_1}_{I}Q_2 = \underbrace{Q_2^\top Q_2}_I = I
\end{align*}

Since $Q_1Q_2$ is square, this gives $(Q_1 Q_2)^{-1} = (Q_1 Q_2)^\top$, so $Q_1Q_2$ is indeed orthogonal. Swapping the roles of $Q_1$ and $Q_2$ gives the same conclusion for $Q_2Q_1$.
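As a quick numerical sanity check of this calculation, the following sketch multiplies two orthogonal matrices and verifies that the product is again orthogonal. The specific matrices (a rotation through an arbitrary angle and the reflection that swaps the two coordinates) and the helper `is_orthogonal` are illustrative assumptions, not part of the text.

```python
import numpy as np

def is_orthogonal(M):
    """Return True if M^T M equals the identity up to floating-point error."""
    return bool(np.allclose(M.T @ M, np.eye(M.shape[0])))

# Q1: rotation through an arbitrary angle (illustrative choice).
t = 0.3
Q1 = np.array([[np.cos(t), -np.sin(t)],
               [np.sin(t),  np.cos(t)]])

# Q2: reflection that swaps the two coordinates (illustrative choice).
Q2 = np.array([[0.0, 1.0],
               [1.0, 0.0]])

print(is_orthogonal(Q1), is_orthogonal(Q2))

# The product is orthogonal in either order, even though the two
# products are different matrices in general.
P12 = Q1 @ Q2
P21 = Q2 @ Q1
print(is_orthogonal(P12), is_orthogonal(P21))
print("Q1 Q2 equals Q2 Q1:", bool(np.allclose(P12, P21)))
```

A check like this does not replace the algebraic argument, but it is a cheap way to catch a transposition or ordering mistake when composing transformations in code.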

:::{important}

This multiplicative property combined with the fact that the inverse of an orthogonal matrix is orthogonal (why?) says that the set of all orthogonal matrices (of dimension $n$) forms a *group* (under matrix multiplication). 

Group theory underlies much of modern physics and quantum mechanics and plays a central role in robotics. Although we will not spend too much time on groups in this class, you are sure to see them again in the future. 

The aforementioned *orthogonal group* in particular is central to rigid body mechanics, atomic structure and chemistry, and computer graphics, among many other applications.

:::

# The QR Factorization