# Positive definite matrices

Positive definite matrices are a special family of **symmetric matrices** that behave much like **positive numbers**.

In this topic, you will learn about their **geometric interpretation** and a practical way to **recognize** them.

They are the **final missing piece** needed to build the **Singular Value Decomposition (SVD)**. You will explore their most useful properties and even learn how to **define the square root of a matrix**.

## Definition

Let us begin with the definition. Consider a real **symmetric matrix** $A$ of size $n \times n$. It is called:

- **Positive definite (PD)** if all of its eigenvalues are **strictly positive**
- **Positive semidefinite (PSD)** if all of its eigenvalues are **non-negative**

For instance, take the following matrices:

$$
\begin{pmatrix}
3 & 1 \\
1 & 3
\end{pmatrix}
\qquad
\begin{pmatrix}
1 & -1 \\
-1 & 1
\end{pmatrix}
$$

The eigenvalues of the first matrix are $4$ and $2$, so it is **positive definite**.
The eigenvalues of the second matrix are $2$ and $0$, hence it is **positive semidefinite**.
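The definition can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `classify` and the tolerance `tol` are illustrative choices, not part of the definition.

```python
import numpy as np

# The two example matrices from the text.
A1 = np.array([[3.0, 1.0], [1.0, 3.0]])
A2 = np.array([[1.0, -1.0], [-1.0, 1.0]])

def classify(A, tol=1e-12):
    """Classify a symmetric matrix by the signs of its eigenvalues.

    `tol` is an assumed tolerance for treating tiny values as zero.
    """
    eigenvalues = np.linalg.eigvalsh(A)  # eigenvalues of a symmetric matrix
    if np.all(eigenvalues > tol):
        return "PD"
    if np.all(eigenvalues >= -tol):
        return "PSD"
    return "neither"

print(np.linalg.eigvalsh(A1), classify(A1))
print(np.linalg.eigvalsh(A2), classify(A2))
```

The tolerance matters in floating point: a zero eigenvalue is rarely computed as exactly zero, so the PD and PSD cases cannot be separated by an exact comparison.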

Symmetric matrices are already a special class, and PD and PSD matrices form an even more restricted family within it, with remarkable additional properties. But if this family is so restricted, why should we care about it? After all, symmetric matrices do not seem to appear very often at first glance.

Do not worry: by the end of this topic, you will discover that **any matrix** (not necessarily square) can be used to construct a **PSD matrix**. Moreover, the properties of this PSD matrix allow you to analyze the original matrix in a powerful way.

Since PD (PSD) matrices have **positive (non-negative) eigenvalues**, they behave in many ways like **positive (non-negative) numbers**. This analogy will naturally emerge when studying their properties.

For now, note that because PD and PSD matrices are symmetric, they admit a **spectral decomposition**:

$$
A = U D U^{T}
$$

Here:
- $U$ is an **orthogonal matrix**
- $D$ is a **diagonal matrix**

In this setting, the decomposition is especially meaningful because the diagonal entries of $D$ (the eigenvalues of $A$) are **positive** (or **non-negative**).
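As an illustration, the decomposition can be computed with NumPy's symmetric eigen-solver. This is a sketch under the assumption that NumPy is available; it uses the first example matrix.

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0]])

# eigh is NumPy's eigen-solver for symmetric matrices: it returns the
# eigenvalues in ascending order and orthonormal eigenvectors as columns.
eigenvalues, U = np.linalg.eigh(A)
D = np.diag(eigenvalues)

# U is orthogonal: U^T U = I.
assert np.allclose(U.T @ U, np.eye(2))
# The factors reproduce A.
assert np.allclose(U @ D @ U.T, A)
# All diagonal entries of D are positive, so A is PD.
assert np.all(eigenvalues > 0)
```

Note that `eigh` lists the eigenvalues in ascending order, so the order of the diagonal entries of $D$ (and of the columns of $U$) may differ from a decomposition written by hand. Both are valid spectral decompositions.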

In what follows, let $A$ denote a **positive definite (or positive semidefinite) matrix**, and let us take a closer look at the **geometry** of these matrices.

## Ellipses and the Spectral Decomposition

Let us push the geometric interpretation one step further.
Consider the quadratic form $q(v) = v^T A v$ associated with the matrix $A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}$ from the first example, and look at all vectors $v = (x, y)^T$ such that
$$
q(v) = 1.
$$
In coordinates, this means
$$
3x^2 + 2xy + 3y^2 = 1.
$$

If you recall some analytic geometry, you may recognize that this equation defines an **ellipse**.

However, the presence of the *cross term* $2xy$ makes the geometry difficult to interpret directly. This is precisely where **spectral decomposition** becomes powerful: it allows us to eliminate the cross term by rotating the coordinate system.


### Spectral Decomposition of $A$

Consider the spectral decomposition of $A$:
$$
A = U D U^T,
$$
with
$$
U = \frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & -1 \\
1 & \phantom{-}1
\end{pmatrix},
\qquad
D =
\begin{pmatrix}
4 & 0 \\
0 & 2
\end{pmatrix}.
$$

Using this decomposition, we can rewrite the quadratic form as
$$
\begin{aligned}
q(v)
&= v^T A v \\
&= v^T U D U^T v \\
&= (U^T v)^T D (U^T v).
\end{aligned}
$$

### Change of Coordinates

Let $v = (x,y)^T$. Then
$$
U^T v
=
\frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 1 \\
-1 & 1
\end{pmatrix}
\begin{pmatrix}
x \\
y
\end{pmatrix}
=
\frac{1}{\sqrt{2}}
\begin{pmatrix}
x + y \\
- x + y
\end{pmatrix}.
$$

Substituting into the quadratic form gives
$$
\begin{aligned}
q(v)
&= (U^T v)^T D (U^T v) \\
&=
\frac{1}{2}
\begin{pmatrix}
x+y & -x+y
\end{pmatrix}
\begin{pmatrix}
4 & 0 \\
0 & 2
\end{pmatrix}
\begin{pmatrix}
x+y \\
- x+y
\end{pmatrix} \\
&=
\frac{1}{2}
\begin{pmatrix}
x+y & -x+y
\end{pmatrix}
\begin{pmatrix}
4(x+y) \\
2(-x+y)
\end{pmatrix} \\
&=
\frac{1}{2}\bigl[4(x+y)^2 + 2(-x+y)^2\bigr] \\
&= 2(x+y)^2 + (-x+y)^2.
\end{aligned}
$$
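The derivation above can be sanity-checked numerically. Here is a small sketch, assuming NumPy; the function names `q_original` and `q_rotated` are introduced only for this illustration.

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0]])

def q_original(x, y):
    """Quadratic form v^T A v in the original coordinates."""
    v = np.array([x, y])
    return v @ A @ v

def q_rotated(x, y):
    """The same form after the change of coordinates: 2(x+y)^2 + (-x+y)^2."""
    return 2 * (x + y) ** 2 + (-x + y) ** 2

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
samples = rng.normal(size=(5, 2))
for x, y in samples:
    # Both expressions agree at every sampled point.
    assert np.isclose(q_original(x, y), q_rotated(x, y))
```

Agreement on random points is not a proof, but it is a quick way to catch a sign or coefficient slip in a hand computation.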

---

### Interpretation

The equation
$$
q(v) = 1
$$
now becomes
$$
2(x+y)^2 + (-x+y)^2 = 1.
$$

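In the rotated coordinates $x' = (x+y)/\sqrt{2}$ and $y' = (-x+y)/\sqrt{2}$ (the entries of $U^T v$), this reads $4x'^2 + 2y'^2 = 1$, with **no cross term**. The ellipse is expressed in coordinates aligned with its principal axes.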

This illustrates a general and powerful fact:

- Every **positive definite matrix** defines an ellipse (or, in higher dimensions, an ellipsoid).
- The **spectral decomposition removes cross terms** by rotating the coordinate system.
- The **columns of $U$ point in the directions of the principal axes** of the ellipse.
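- The **eigenvalues in $D$ determine the stretching** along those axes: the semi-axis in the direction of an eigenvector with eigenvalue $\lambda$ has length $1/\sqrt{\lambda}$.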

In short, spectral decomposition turns a complicated quadratic form into a simple, geometrically transparent one.
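To make the geometry concrete, the sketch below (assuming NumPy) computes the principal axes of the ellipse $v^T A v = 1$ for the example matrix and checks that the endpoints of the semi-axes lie on the ellipse. The variable names are illustrative.

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0]])
eigenvalues, U = np.linalg.eigh(A)

# Along the unit eigenvector u_i, q(t * u_i) = lambda_i * t^2, so the ellipse
# q(v) = 1 is reached at t = 1 / sqrt(lambda_i): the semi-axis length.
semi_axes = 1.0 / np.sqrt(eigenvalues)

for length, u in zip(semi_axes, U.T):  # rows of U.T are the columns of U
    point = length * u
    assert np.isclose(point @ A @ point, 1.0)

print(semi_axes)
```

A larger eigenvalue gives a shorter semi-axis: the form grows faster in that direction, so the level set $q(v) = 1$ is reached sooner.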

## How to Detect Positive Definiteness

You can recognize whether a matrix is **symmetric** or **diagonal** at a glance.
However, determining whether a matrix is **positive definite (PD)** or **positive semidefinite (PSD)** is less immediate. A direct approach would require computing all eigenvalues, which is often computationally expensive.

A widely used alternative is **Sylvester’s Criterion**.

---

## Sylvester’s Criterion

Let $A$ be a **symmetric** matrix of size $n \times n$.

For each integer $k = 1, 2, \dots, n$, consider the **top-left $k \times k$ principal minor** of $A$, that is, the determinant of the submatrix formed by the first $k$ rows and the first $k$ columns.

Sylvester’s criterion states:

> A symmetric matrix $A$ is **positive definite** if and only if **all** its top-left $k \times k$ principal minors are **strictly positive**.
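The criterion translates almost directly into code. The following is a minimal sketch, assuming NumPy; the function name `is_positive_definite_sylvester` is hypothetical, and floating-point determinants can be unreliable for nearly singular matrices, so this is an illustration rather than a robust test.

```python
import numpy as np

def is_positive_definite_sylvester(A):
    """Check Sylvester's criterion on a symmetric matrix A."""
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1] or not np.allclose(A, A.T):
        raise ValueError("Sylvester's criterion needs a symmetric matrix")
    n = A.shape[0]
    # Determinant of the top-left k x k block for k = 1, ..., n.
    minors = [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]
    return all(m > 0 for m in minors)

# Minors 3 and 9 - 1 = 8 are both positive.
print(is_positive_definite_sylvester([[3, 1], [1, 3]]))
# The second minor is 1 - 4 = -3, so the criterion fails.
print(is_positive_definite_sylvester([[1, 2], [2, 1]]))
```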

---

## Example 1

Consider the matrix
$$
A =
\begin{pmatrix}
3 & 1 \\
1 & 3
\end{pmatrix}.
$$

The top-left principal minors are:

- First order:
  $$
  \det(3) = 3 > 0
  $$

- Second order:
  $$
  \det
  \begin{pmatrix}
  3 & 1 \\
  1 & 3
  \end{pmatrix}
  = 9 - 1 = 8 > 0
  $$

Since **all** top-left principal minors are positive, the matrix is **positive definite**.

---

## Example 2

Now consider the larger symmetric matrix
$$
A =
\begin{pmatrix}
2 & 0 & -1 \\
0 & 3 & 0 \\
-1 & 0 & 2
\end{pmatrix}.
$$

Its top-left principal minors are:

- First order:
  $$
  \det(2) = 2 > 0
  $$

- Second order:
  $$
  \det
  \begin{pmatrix}
  2 & 0 \\
  0 & 3
  \end{pmatrix}
  = 6 > 0
  $$

- Third order:
  $$
  \det
  \begin{pmatrix}
  2 & 0 & -1 \\
  0 & 3 & 0 \\
  -1 & 0 & 2
  \end{pmatrix}
  = 9 > 0
  $$

All top-left principal minors are positive, so this matrix is also **positive definite**.

---

## Important Remarks

- **Sylvester’s criterion applies only to symmetric matrices.**
- It characterizes **positive definiteness**, not merely positive semidefiniteness.
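- For **PSD matrices**, the criterion does *not* apply directly: non-negative top-left minors are not enough, because some minors may be zero while the matrix still has a negative eigenvalue. (A symmetric matrix is PSD if and only if *all* of its principal minors, not only the top-left ones, are non-negative.)
- For small matrices worked by hand, this method is often far simpler than finding eigenvalues.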
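The remark about PSD matrices can be seen on a small example. The sketch below (assuming NumPy; the matrix is chosen here purely for illustration) has top-left minors that are all zero, yet one eigenvalue is negative.

```python
import numpy as np

# A symmetric matrix whose top-left minors are both zero ...
A = np.array([[0.0, 0.0], [0.0, -1.0]])
leading_minors = [np.linalg.det(A[:k, :k]) for k in (1, 2)]

# ... but which has a negative eigenvalue, so it is not PSD.
eigenvalues = np.linalg.eigvalsh(A)

print(leading_minors)
print(eigenvalues)
```

So relaxing "strictly positive" to "non-negative" in Sylvester's criterion does not give a valid test for positive semidefiniteness.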

---

### Summary

- Positive definiteness can be checked **without eigenvalues**.
- Compute determinants of the top-left $k \times k$ submatrices.
- If *all* are strictly positive, the matrix is **positive definite**.

This criterion is one of the most practical tools for recognizing PD matrices in both theory and applications.
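In numerical software, a common alternative to computing determinants is to attempt a Cholesky factorization, which succeeds exactly when a symmetric matrix is positive definite. This is a different technique from Sylvester's criterion, shown here only as a hedged sketch; it assumes NumPy, and the function name `is_positive_definite_cholesky` is hypothetical.

```python
import numpy as np

def is_positive_definite_cholesky(A):
    """Return True if the symmetric matrix A has a Cholesky factorization."""
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1] or not np.allclose(A, A.T):
        return False  # np.linalg.cholesky does not verify symmetry itself
    try:
        np.linalg.cholesky(A)  # raises LinAlgError if A is not PD
        return True
    except np.linalg.LinAlgError:
        return False

print(is_positive_definite_cholesky([[3, 1], [1, 3]]))
print(is_positive_definite_cholesky([[1, 2], [2, 1]]))
```

The explicit symmetry check is a design choice: `np.linalg.cholesky` reads only one triangle of its input, so a non-symmetric matrix could otherwise pass unnoticed.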

## Leveraging Positive Definite (PD) Matrices

Every real number $x$ satisfies
$$
x^2 \ge 0.
$$
A remarkably similar phenomenon occurs with matrices, and this is precisely why **positive definite (PD)** and **positive semidefinite (PSD)** matrices are so important.

## From Arbitrary Matrices to PD Matrices

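Let $B$ be any real matrix, not necessarily square. Then the matrix
$$
B^T B
$$
is symmetric and always **positive semidefinite**, because $v^T B^T B v = \|Bv\|^2 \ge 0$ for every vector $v$. It is **positive definite** exactly when the columns of $B$ are linearly independent; for a square $B$, this means that $B$ is invertible.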

This is the matrix analogue of squaring a real number: $B^T B$ plays the role of $x^2$.
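A quick numerical illustration, assuming NumPy: the matrix `B` below is random and deliberately rectangular (a case the identity $v^T B^T B v = \|Bv\|^2$ also covers), and the seed and sizes are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, only for reproducibility
B = rng.normal(size=(4, 3))      # deliberately rectangular
G = B.T @ B                      # a 3 x 3 matrix

# G is symmetric.
assert np.allclose(G, G.T)

# v^T (B^T B) v equals the squared length of B v, which cannot be negative.
v = rng.normal(size=3)
assert np.isclose(v @ G @ v, np.linalg.norm(B @ v) ** 2)

# Consequently, no eigenvalue of G is negative (up to round-off).
assert np.all(np.linalg.eigvalsh(G) >= -1e-12)
```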

## The Converse: Square Roots of PD Matrices

What is more surprising is that the converse also holds:

> If $A$ is a PD (or PSD) matrix, then there exists a square matrix $B$ such that
> $$
> B^T B = A.
> $$

In this sense, $B$ behaves like a **square root** of $A$.

However, this square root is **not unique**. There may be many matrices $C$ such that
$$
C^T C = A.
$$

## Orthogonal Freedom

Despite the non-uniqueness, all such square roots are closely related. This relationship is known as **orthogonal freedom**:

> If
> $$
> B^T B = C^T C,
> $$
> then there exists an **orthogonal matrix** $U$ such that
> $$
> B = U C.
> $$

This result may seem abstract, but it is fundamental. It lies at the heart of the **polar decomposition** and many other matrix factorizations.
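The sketch below illustrates both the non-uniqueness and the orthogonal relation on the example matrix. It assumes NumPy; the two particular square roots (a transposed Cholesky factor and a symmetric root built from eigenvalues) are choices made for this illustration.

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0]])

# First square root: the transposed Cholesky factor,
# since A = L L^T = (L^T)^T (L^T).
B = np.linalg.cholesky(A).T

# Second square root: a symmetric one built from the eigenvalues of A.
eigenvalues, V = np.linalg.eigh(A)
C = V @ np.diag(np.sqrt(eigenvalues)) @ V.T

assert np.allclose(B.T @ B, A) and np.allclose(C.T @ C, A)
assert not np.allclose(B, C)   # two genuinely different square roots

# Because C is invertible here, the orthogonal matrix is U = B C^{-1}.
U = B @ np.linalg.inv(C)
assert np.allclose(U.T @ U, np.eye(2))
assert np.allclose(U @ C, B)
```

The formula $U = B C^{-1}$ only works when $C$ is invertible; the statement itself also holds in the singular case, but $U$ must then be constructed differently.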

## The Principal Square Root

Consider the number $4$. It has two square roots: $2$ and $-2$.
Among them, only $2$ is **non-negative**.

An analogous concept exists for matrices.

For any **PSD matrix** $A$, there exists a **unique PSD matrix** $P$ such that
$$
P^2 = A.
$$
This matrix is called the **principal square root** of $A$ and is denoted by
$$
A^{1/2}.
$$

## Computing the Principal Square Root

If $A$ has a spectral decomposition
$$
A = U D U^T,
$$
where:
- $U$ is orthogonal,
- $D$ is diagonal with non-negative entries,

then define $D^{1/2}$ as the diagonal matrix whose entries are the square roots of those in $D$.

The principal square root is then
$$
A^{1/2} = U D^{1/2} U^T.
$$
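This recipe is short to implement. A minimal sketch, assuming NumPy and a symmetric PSD input; the function name `principal_sqrt` is hypothetical, and the clipping step is an added safeguard against round-off rather than part of the formula.

```python
import numpy as np

def principal_sqrt(A):
    """Principal square root of a symmetric PSD matrix via A = U D U^T."""
    eigenvalues, U = np.linalg.eigh(A)
    # Round-off can push a zero eigenvalue slightly below zero; clip it back.
    eigenvalues = np.clip(eigenvalues, 0.0, None)
    return U @ np.diag(np.sqrt(eigenvalues)) @ U.T

A = np.array([[3.0, 1.0], [1.0, 3.0]])
P = principal_sqrt(A)

assert np.allclose(P @ P, A)                # P squares to A
assert np.allclose(P, P.T)                  # P is symmetric
assert np.all(np.linalg.eigvalsh(P) >= 0)   # P is PSD
```

For this $A$, the eigenvalues $4$ and $2$ become $2$ and $\sqrt{2}$, so $A^{1/2}$ has diagonal entries $1 + \tfrac{\sqrt{2}}{2}$ and off-diagonal entries $1 - \tfrac{\sqrt{2}}{2}$.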

## Absolute Value of a Matrix

Just as the absolute value of a real number is defined by
$$
|x| = \sqrt{x^2},
$$
an analogous definition exists for matrices.

For any square matrix $M$,
$$
M^T M
$$
is PSD and therefore admits a square root. This allows us to define
$$
|M| = (M^T M)^{1/2}.
$$

The matrix $|M|$ captures the “magnitude” of $M$ while remaining PSD.
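As a small worked example, the sketch below computes $|M|$ for a $2 \times 2$ matrix chosen for illustration. It assumes NumPy, and the function name `matrix_abs` is hypothetical.

```python
import numpy as np

def matrix_abs(M):
    """|M| = (M^T M)^{1/2}, computed from the spectral decomposition of M^T M."""
    eigenvalues, V = np.linalg.eigh(M.T @ M)
    eigenvalues = np.clip(eigenvalues, 0.0, None)  # guard against round-off
    return V @ np.diag(np.sqrt(eigenvalues)) @ V.T

M = np.array([[0.0, -2.0], [3.0, 0.0]])
abs_M = matrix_abs(M)

# |M| is symmetric and PSD, and its square is M^T M.
assert np.allclose(abs_M, abs_M.T)
assert np.all(np.linalg.eigvalsh(abs_M) >= 0)
assert np.allclose(abs_M @ abs_M, M.T @ M)
```

Here $M^T M$ is the diagonal matrix with entries $9$ and $4$, so $|M|$ is the diagonal matrix with entries $3$ and $2$, even though $M$ itself is neither symmetric nor diagonal.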


## Polar Decomposition

The deepest connection between a matrix and its “absolute value” is given by the **polar decomposition**:

> For any square matrix $M$, there exists an orthogonal matrix $U$ such that
> $$
> M = U \, |M|.
> $$

This is the matrix analogue of writing a real number as its **sign** times its **absolute value**.
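For an invertible matrix, the orthogonal factor can be recovered directly as $U = M\,|M|^{-1}$. The sketch below assumes NumPy and an invertible example matrix chosen for illustration; for a singular $M$ the decomposition still exists, but this formula no longer applies.

```python
import numpy as np

M = np.array([[0.0, -2.0], [3.0, 0.0]])   # invertible example

# |M| = (M^T M)^{1/2} via the spectral decomposition of M^T M.
eigenvalues, V = np.linalg.eigh(M.T @ M)
abs_M = V @ np.diag(np.sqrt(eigenvalues)) @ V.T

# For invertible M, |M| is invertible too, so U = M |M|^{-1}.
U = M @ np.linalg.inv(abs_M)

assert np.allclose(U.T @ U, np.eye(2))   # U is orthogonal
assert np.allclose(U @ abs_M, M)         # M = U |M|
```

In this example $U$ is a rotation by $90^\circ$ and $|M|$ stretches the axes by $3$ and $2$: the "sign" carries the rotation, and the "absolute value" carries the scaling.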

## Summary

- $B^T B$ is always PSD (PD if $B$ is invertible).
- Every PD (PSD) matrix has a square root.
- The **principal square root** is the unique PSD square root.
- Spectral decomposition makes square roots easy to compute.
- Every square matrix admits a **polar decomposition**:
  $$
  M = U |M|.
  $$

These ideas form the final bridge toward the **singular value decomposition (SVD)**.