# Diagonalizing matrices

In this topic, we work through the **practical procedure of diagonalizing matrices**. The goal is to give you the tools and steps needed to carry out a diagonalization, both by hand and numerically.

Before addressing the procedure itself, we first develop the required **theoretical foundations**.

Please recall that a square matrix $A$ of dimension $n$ is **diagonalizable** if there exist:

- a **diagonal matrix** $D$, and
- an **invertible matrix** $P$

such that

$$
A = P D P^{-1}
$$

In this decomposition:
- the columns of $P$ are eigenvectors of $A$, and
- the diagonal entries of $D$ are the corresponding eigenvalues.


## The Surprising Connection with Eigenvectors

Let us first carefully analyze the definition of **diagonalizability**.

Write the columns of the matrix $P$ as
$v_1, \ldots, v_n$
and the diagonal entries of the matrix $D$ as
$d_1, \ldots, d_n$.

Starting from the diagonalization equation

$$
A = P D P^{-1},
$$

multiply both sides on the **right** by $P$. This yields

$$
A P = P D.
$$

Two matrices of the same size are equal exactly when their corresponding **columns** are equal. Therefore, we focus on the columns of each side. In fact, it is enough to consider the first column, since the same reasoning applies to all others.

### Column-wise interpretation

From the properties of matrix multiplication:

- The first column of $A P$ is simply the product of $A$ with the first column of $P$, namely
  $$
  A v_1.
  $$

- For the product $P D$, since $D$ is a **diagonal matrix**, the first column is
  $$
  d_1 v_1.
  $$

Thus, equating the first columns of $A P$ and $P D$, we obtain

$$
A v_1 = d_1 v_1.
$$

### Interpretation as an eigenvalue problem

But this equation should look familiar. It is precisely the **eigenvalue equation**. It tells us that:

- $v_1$ is an eigenvector of $A$, and
- $d_1$ is the corresponding eigenvalue.
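The column-by-column argument can be checked numerically. The sketch below is only an illustration: the particular $P$ and $D$ are assumptions chosen for this example, and $A$ is built from them as $P D P^{-1}$ so that it is diagonalizable by construction.

```python
import numpy as np

# Illustrative choice (an assumption for this sketch): pick P and D first,
# then define A = P D P^{-1}.
P = np.array([[1.0, -1.0],
              [2.0,  1.0]])
D = np.diag([5.0, 2.0])
A = P @ D @ np.linalg.inv(P)

for j in range(P.shape[1]):
    v_j = P[:, j]   # j-th column of P
    d_j = D[j, j]   # j-th diagonal entry of D
    # Column j of A P is A v_j, and column j of P D is d_j v_j.
    assert np.allclose((A @ P)[:, j], A @ v_j)
    assert np.allclose((P @ D)[:, j], d_j * v_j)
    # Equating them gives the eigenvalue equation A v_j = d_j v_j.
    assert np.allclose(A @ v_j, d_j * v_j)

columns_match = np.allclose(A @ P, P @ D)
```

Every column passes the same check, which is why the derivation only needs to look at the first one.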

### General conclusion

Applying the same reasoning to all columns, we arrive at the key result:

> The equation $A = P D P^{-1}$ holds with $P$ invertible and $D$ diagonal **if and only if** the columns of $P$ are linearly independent eigenvectors of $A$ and the diagonal entries of $D$ are the corresponding eigenvalues, in the same order.

### Practical consequence

Thanks to this result, constructing the matrices $P$ and $D$ becomes straightforward:

1. Compute all eigenvalues of $A$ and their corresponding eigenvectors.
2. Place the eigenvalues on the diagonal of $D$.
3. Place the corresponding eigenvectors as columns of $P$.

If $P$ is invertible (i.e., the eigenvectors are linearly independent), then $A$ is diagonalizable.
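A minimal numerical sketch of these three steps is shown below. The helper name `try_diagonalize` and the rank tolerance `tol` are hypothetical choices made for this illustration, and the test matrix is an arbitrary example; a rank test in floating point is a heuristic, not a proof of diagonalizability.

```python
import numpy as np

def try_diagonalize(A, tol=1e-8):
    """Return (P, D) with A ≈ P D P^{-1}, or None if P is numerically singular.

    Hypothetical helper for illustration; `tol` is an assumed rank tolerance.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # Step 1: eigenvalues and eigenvectors (eigenvectors are the columns).
    eigenvalues, eigenvectors = np.linalg.eig(A)
    # Step 2: eigenvalues on the diagonal of D.
    D = np.diag(eigenvalues)
    # Step 3: eigenvectors as the columns of P, in the same order.
    P = eigenvectors
    # P is invertible only if its columns are linearly independent.
    if np.linalg.matrix_rank(P, tol=tol) < n:
        return None
    return P, D

# Arbitrary example matrix with two distinct eigenvalues.
result = try_diagonalize([[4.0, 1.0], [2.0, 3.0]])
```

When the function returns a pair, `P @ D @ np.linalg.inv(P)` should reproduce the input up to rounding error.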


## How to Detect Diagonalizability

Not every matrix is diagonalizable. However, the deep connection between diagonalizability and **eigenvectors** allows us to precisely characterize when a matrix can be diagonalized.

Recall that each eigenvalue $\lambda$ of a matrix $A$ has an associated **eigenspace**: the subspace of all solutions of $(A - \lambda I)v = 0$, that is, the eigenvectors for $\lambda$ together with the zero vector. The **dimension** of this eigenspace is called the **geometric multiplicity** of the eigenvalue.


### Characterization Theorem

For an $n \times n$ matrix $A$, the following statements are equivalent:

**a)** $A$ is diagonalizable.

**b)** There exists a basis of $\mathbb{R}^n$ (or $\mathbb{C}^n$) consisting entirely of eigenvectors of $A$.

**c)** The sum of the geometric multiplicities of all eigenvalues of $A$ is equal to $n$.


### Practical criteria

The simplest situation in which diagonalizability is guaranteed is when an $n \times n$ matrix has **$n$ distinct eigenvalues**. In this case, the corresponding eigenvectors are automatically linearly independent, and $A$ is diagonalizable.

If a matrix does **not** have $n$ distinct eigenvalues, diagonalizability is still possible, but it must be verified explicitly. In this case, one must:

1. Compute all eigenvalues of $A$.
2. Determine the geometric multiplicity of each eigenvalue.
3. Check whether the sum of these geometric multiplicities is equal to $n$.
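The check above can be sketched in code using the rank-nullity theorem: the geometric multiplicity of $\lambda$ equals $n - \operatorname{rank}(A - \lambda I)$. The function name, the rounding used to merge numerically repeated eigenvalues (`decimals`), and the rank tolerance (`tol`) are all assumptions of this sketch, and the test matrix is an arbitrary example.

```python
import numpy as np

def geometric_multiplicities(A, decimals=6, tol=1e-8):
    """Map each distinct eigenvalue to n - rank(A - lambda*I).

    Hypothetical helper; `decimals` and `tol` are assumed tolerances.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # Round so that numerically repeated eigenvalues collapse to one value.
    distinct = np.unique(np.round(np.linalg.eigvals(A), decimals))
    result = {}
    for lam in distinct:
        rank = np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
        result[lam.item()] = int(n - rank)
    return result

# Arbitrary example: eigenvalue 1 (twice) and eigenvalue 3.
example = geometric_multiplicities(np.diag([1.0, 1.0, 3.0]))
is_diagonalizable = sum(example.values()) == 3
```

Rounding eigenvalues is a pragmatic shortcut: for badly conditioned matrices the computed eigenvalues can be less accurate than the chosen number of decimals, so an exact (symbolic) computation is safer when the entries are integers or fractions.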


### Summary

If an $n \times n$ matrix does not have $n$ distinct eigenvalues, then:

> The matrix is diagonalizable **if and only if** the sum of the geometric multiplicities of its eigenvalues equals $n$.

This criterion provides a complete and practical test for diagonalizability.


## Steps to Diagonalize a Matrix

Combining everything we have discovered, we now obtain a **straightforward and elegant procedure** for diagonalizing any square matrix (provided it is diagonalizable).


### Step-by-step procedure

1. **Find all eigenvalues** of the matrix $A$.

2. **Determine the eigenvectors** corresponding to each eigenvalue.

3. **Compute the dimension of each eigenspace**, that is, the geometric multiplicity of each eigenvalue.

4. **Add the geometric multiplicities** of all eigenvalues.

   - If the sum is exactly $n$, then $A$ is diagonalizable and you may proceed.
   - If the sum is less than $n$, then $A$ is **not** diagonalizable.

5. **Construct the matrix $P$**:

   - The columns of $P$ are the basis vectors of the eigenspaces found in step 3, giving $n$ linearly independent eigenvectors in total.
   - Choose an order and keep it consistent.

6. **Construct the matrix $D$**:

   - The diagonal entries of $D$ are the eigenvalues of $A$.
   - Each eigenvalue appears as many times as the number of corresponding eigenvectors placed in $P$, and in the same order.

### Final result

If the above steps succeed, the matrix $A$ admits the decomposition

$$
A = P D P^{-1}.
$$
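For matrices with integer or rational entries, the whole procedure can also be carried out exactly. The following sketch uses SymPy's `eigenvects` and `diagonalize` methods; the matrix is an arbitrary illustration, not one taken from the text.

```python
import sympy as sp

# Illustrative matrix chosen for this sketch.
A = sp.Matrix([[2, 0, 0],
               [1, 3, 0],
               [0, 0, 3]])
n = A.shape[0]

# Steps 1-3: eigenvects() returns tuples of the form
# (eigenvalue, algebraic multiplicity, basis of the eigenspace).
eigen_data = A.eigenvects()

# Step 4: add the geometric multiplicities (sizes of the eigenspace bases).
geometric_sum = sum(len(basis) for _, _, basis in eigen_data)

# Steps 5-6: build P and D only if the sum equals n.
if geometric_sum == n:
    P, D = A.diagonalize()   # exact matrices with A = P D P^{-1}
```

Because the arithmetic is exact, the identity can be tested with `P * D * P.inv() == A` instead of a floating-point tolerance.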

Let us now put this procedure into practice using concrete examples.


## A Simple Diagonalization

Let us start with a simple matrix:

$$
A =
\begin{pmatrix}
3 & 1 \\
2 & 4
\end{pmatrix}
$$

### Step 1: Find the eigenvalues

The first step is to compute the **characteristic polynomial**:

$$
\chi(\lambda) = \det(A - \lambda I)
$$

We obtain

$$
\chi(\lambda)
= \det
\begin{pmatrix}
3 - \lambda & 1 \\
2 & 4 - \lambda
\end{pmatrix}
= \lambda^2 - 7\lambda + 10.
$$

The roots of this polynomial are the eigenvalues. Solving

$$
\lambda^2 - 7\lambda + 10 = 0,
$$

we find

$$
\lambda_1 = 5, \qquad \lambda_2 = 2
$$

Since $A$ is a $2 \times 2$ matrix and has two distinct eigenvalues, it is **diagonalizable**.
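As a quick cross-check of the hand computation, NumPy can produce the characteristic polynomial coefficients and their roots. This is a sketch for verification only; note that `np.poly` applied to a square matrix returns the coefficients of $\det(\lambda I - A)$, which coincides with $\det(A - \lambda I)$ for a $2 \times 2$ matrix.

```python
import numpy as np

A = np.array([[3, 1],
              [2, 4]], dtype=float)

# Coefficients of the characteristic polynomial, highest degree first.
coeffs = np.poly(A)

# The eigenvalues are the roots of that polynomial.
eigenvalues = np.roots(coeffs)
```

The coefficients should agree with $\lambda^2 - 7\lambda + 10$ up to rounding error.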


### Step 2: Find the eigenvectors

#### Eigenvalue $\lambda = 5$

We solve

$$
(A - 5I)v = 0.
$$

That is,

$$
\begin{pmatrix}
-2 & 1 \\
2 & -1
\end{pmatrix}
\begin{pmatrix}
x \\
y
\end{pmatrix}
=
\begin{pmatrix}
0 \\
0
\end{pmatrix}.
$$

A possible solution is

$$
v_1 =
\begin{pmatrix}
1 \\
2
\end{pmatrix}.
$$

#### Eigenvalue $\lambda = 2$

Now we solve

$$
(A - 2I)v = 0,
$$

which gives

$$
\begin{pmatrix}
1 & 1 \\
2 & 2
\end{pmatrix}
\begin{pmatrix}
x \\
y
\end{pmatrix}
=
\begin{pmatrix}
0 \\
0
\end{pmatrix}.
$$

An easy solution is

$$
v_2 =
\begin{pmatrix}
-1 \\
1
\end{pmatrix}.
$$
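The two linear systems can also be solved exactly with SymPy's `nullspace` method, as sketched below. The basis vectors it returns may be scalar multiples of the hand-chosen $v_1$ and $v_2$, which is fine: any nonzero multiple of an eigenvector is again an eigenvector.

```python
import sympy as sp

A = sp.Matrix([[3, 1],
               [2, 4]])
I2 = sp.eye(2)

# nullspace() returns a list of basis vectors of ker(A - lambda*I).
basis_5 = (A - 5 * I2).nullspace()
basis_2 = (A - 2 * I2).nullspace()

# Each eigenspace here is one-dimensional.
w1 = basis_5[0]
w2 = basis_2[0]
```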

### Step 3: Construct $P$ and $D$

We now build the matrices $D$ and $P$.

- The diagonal matrix $D$ contains the eigenvalues in the chosen order:
  $$
  D =
  \begin{pmatrix}
  5 & 0 \\
  0 & 2
  \end{pmatrix}
  $$

- The matrix $P$ has the corresponding eigenvectors as columns:
  $$
  P =
  \begin{pmatrix}
  v_1 \mid v_2
  \end{pmatrix}
  =
  \begin{pmatrix}
  1 & -1 \\
  2 & 1
  \end{pmatrix}
  $$

### Final result

With these matrices, we have the diagonalization

$$
A = P D P^{-1}
$$

You are encouraged to verify this identity directly by computing $P^{-1}$ and checking the product.
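For reference, one way to carry out this check uses the standard formula for the inverse of a $2 \times 2$ matrix. Since $\det P = 1 \cdot 1 - (-1) \cdot 2 = 3$,

$$
P^{-1} = \frac{1}{3}
\begin{pmatrix}
1 & 1 \\
-2 & 1
\end{pmatrix},
\qquad
P D =
\begin{pmatrix}
5 & -2 \\
10 & 2
\end{pmatrix},
$$

and therefore

$$
P D P^{-1}
= \frac{1}{3}
\begin{pmatrix}
5 & -2 \\
10 & 2
\end{pmatrix}
\begin{pmatrix}
1 & 1 \\
-2 & 1
\end{pmatrix}
= \frac{1}{3}
\begin{pmatrix}
9 & 3 \\
6 & 12
\end{pmatrix}
=
\begin{pmatrix}
3 & 1 \\
2 & 4
\end{pmatrix}
= A.
$$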


In [5]:
import numpy as np

In [6]:
# Define the matrix A
A = np.array([[3, 1],
              [2, 4]], dtype=float)

A

array([[3., 1.],
       [2., 4.]])

In [3]:
# Step 1 & 2: Eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

eigenvalues, eigenvectors


(array([2., 5.]),
 array([[-0.70710678, -0.4472136 ],
        [ 0.70710678, -0.89442719]]))

In [4]:
# Construct D (diagonal matrix of eigenvalues)
D = np.diag(eigenvalues)

# Construct P (columns are eigenvectors)
P = eigenvectors

P, D

(array([[-0.70710678, -0.4472136 ],
        [ 0.70710678, -0.89442719]]),
 array([[2., 0.],
        [0., 5.]]))

In [7]:
# Compute P inverse
P_inv = np.linalg.inv(P)

# Verify diagonalization: A ≈ P D P^{-1}
A_reconstructed = P @ D @ P_inv

A_reconstructed


array([[3., 1.],
       [2., 4.]])

In [8]:
# Numerical verification (floating point tolerance)
np.allclose(A, A_reconstructed)

True

### Notes (important with NumPy)

- `np.linalg.eig` returns **normalized eigenvectors**. These may differ from hand-chosen eigenvectors, but they still produce a valid diagonalization.

- **Floating-point arithmetic** introduces small numerical errors. Therefore, use
  `np.allclose(A, A_reconstructed)`
  instead of the equality operator `==`.

- The **order of the eigenvalues is not guaranteed** (they are not necessarily sorted). The pairing is consistent, however: column `i` of the returned eigenvector matrix corresponds to entry `i` of the eigenvalue array.


As optional extensions, you can:

- Force a specific ordering of eigenvalues and eigenvectors
- Construct $P$ manually from the exact eigenvectors $(1, 2)^T$ and $(-1, 1)^T$
- Extend the procedure to larger matrices or to cases where the matrix is not diagonalizable
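The first two items might look like the sketch below. Sorting in descending order is just one possible convention, and the variable names are chosen for this illustration. The key point is that the same permutation must be applied to the eigenvalues and to the columns of the eigenvector matrix.

```python
import numpy as np

A = np.array([[3, 1],
              [2, 4]], dtype=float)

eigenvalues, eigenvectors = np.linalg.eig(A)

# Forced ordering: sort eigenvalues in descending order and
# permute the eigenvector columns with the same index array.
order = np.argsort(eigenvalues)[::-1]
eigenvalues_sorted = eigenvalues[order]
P_sorted = eigenvectors[:, order]
D_sorted = np.diag(eigenvalues_sorted)

# Manual construction from the exact eigenvectors (1, 2)^T and (-1, 1)^T.
P_manual = np.column_stack([[1.0, 2.0], [-1.0, 1.0]])
D_manual = np.diag([5.0, 2.0])

ok_sorted = np.allclose(A, P_sorted @ D_sorted @ np.linalg.inv(P_sorted))
ok_manual = np.allclose(A, P_manual @ D_manual @ np.linalg.inv(P_manual))
```

`np.argsort` works directly here because both eigenvalues are real; for complex eigenvalues a sort key (for example the real part or the modulus) would have to be chosen explicitly.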


## A More Challenging Diagonalization

Let us analyze the following matrix:

$$
A =
\begin{pmatrix}
5 & 0 & -3 \\
-3 & 2 & 3 \\
6 & 0 & -4
\end{pmatrix}
$$

### Step 1: Eigenvalues

We compute the **characteristic polynomial**

$$
\chi(\lambda) = \det(A - \lambda I)
$$

A direct calculation gives

$$
\chi(\lambda) = -\lambda^3 + 3\lambda^2 - 4
$$

The roots of this polynomial are

$$
\lambda = 2 \quad \text{and} \quad \lambda = -1
$$

At this point, we cannot yet conclude whether $A$ is diagonalizable. We must compute the eigenvectors and check their geometric multiplicities.
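The characteristic polynomial and its roots can be double-checked symbolically. The sketch below uses SymPy; `sp.roots` returns each root together with its multiplicity as a root of the polynomial (the algebraic multiplicity), which shows that $\lambda = 2$ is a repeated root.

```python
import sympy as sp

lam = sp.symbols('lambda')

A = sp.Matrix([[5, 0, -3],
               [-3, 2, 3],
               [6, 0, -4]])

# Characteristic polynomial det(A - lambda*I).
chi = (A - lam * sp.eye(3)).det()

# Factored form makes the repeated root visible.
chi_factored = sp.factor(chi)

# Dictionary {root: algebraic multiplicity}.
multiplicities = sp.roots(chi, lam)
```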

### Step 2: Eigenvectors for $\lambda = 2$

We solve the system

$$
(A - 2I)v = 0,
$$

which corresponds to

$$
\begin{pmatrix}
3 & 0 & -3 \\
-3 & 0 & 3 \\
6 & 0 & -6
\end{pmatrix}
\begin{pmatrix}
x \\
y \\
z
\end{pmatrix}
=
\begin{pmatrix}
0 \\
0 \\
0
\end{pmatrix}
$$

The solution set is a vector subspace given by

$$
\left\{
x
\begin{pmatrix}
1 \\
0 \\
1
\end{pmatrix}
+
y
\begin{pmatrix}
0 \\
1 \\
0
\end{pmatrix}
\;\middle|\;
x, y \in \mathbb{R}
\right\}
$$

A basis for this eigenspace is therefore

$$
v_1 =
\begin{pmatrix}
1 \\
0 \\
1
\end{pmatrix},
\qquad
v_2 =
\begin{pmatrix}
0 \\
1 \\
0
\end{pmatrix}
$$

Hence, the geometric multiplicity of the eigenvalue $\lambda = 2$ is $2$.

### Step 3: Eigenvectors for $\lambda = -1$

We now solve

$$
(A + I)v = 0,
$$

that is,

$$
\begin{pmatrix}
6 & 0 & -3 \\
-3 & 3 & 3 \\
6 & 0 & -3
\end{pmatrix}
\begin{pmatrix}
x \\
y \\
z
\end{pmatrix}
=
\begin{pmatrix}
0 \\
0 \\
0
\end{pmatrix}.
$$

One possible nonzero solution is

$$
v_3 =
\begin{pmatrix}
1 \\
-1 \\
2
\end{pmatrix}.
$$

Thus, the geometric multiplicity of $\lambda = -1$ is $1$.
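Both geometric multiplicities can be confirmed without solving the systems by hand, using the rank-nullity theorem: the dimension of the solution space of $(A - \lambda I)v = 0$ is $n - \operatorname{rank}(A - \lambda I)$. A short sketch, relying on NumPy's default rank tolerance:

```python
import numpy as np

A = np.array([[5, 0, -3],
              [-3, 2, 3],
              [6, 0, -4]], dtype=float)
n = A.shape[0]

# Geometric multiplicity = n - rank(A - lambda*I).
geo_mult_2 = n - np.linalg.matrix_rank(A - 2 * np.eye(n))
geo_mult_minus_1 = n - np.linalg.matrix_rank(A + np.eye(n))

total = geo_mult_2 + geo_mult_minus_1
```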

### Step 4: Constructing $P$ and $D$

Since the sum of the geometric multiplicities is

$$
2 + 1 = 3,
$$

which equals the dimension of the matrix, the matrix $A$ is **diagonalizable**.

We construct:

- The diagonal matrix $D$, whose diagonal entries are the eigenvalues (repeated according to their multiplicities):

$$
D =
\begin{pmatrix}
2 & 0 & 0 \\
0 & 2 & 0 \\
0 & 0 & -1
\end{pmatrix}
$$

- The matrix $P$, whose columns are the corresponding eigenvectors in the same order:

$$
P =
\begin{pmatrix}
v_1 \mid v_2 \mid v_3
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 1 \\
0 & 1 & -1 \\
1 & 0 & 2
\end{pmatrix}
$$

### Final result

With these matrices, the diagonalization of $A$ is given by

$$
A = P D P^{-1}
$$

You may verify this result by explicitly computing $P^{-1}$ and checking the matrix product.


In [7]:
# Define the matrix A
A = np.array([
    [5,  0, -3],
    [-3, 2,  3],
    [6,  0, -4]
], dtype=float)

A

array([[ 5.,  0., -3.],
       [-3.,  2.,  3.],
       [ 6.,  0., -4.]])

In [9]:
# Step 1: Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

eigenvalues, eigenvectors

(array([ 2.,  2., -1.]),
 array([[ 0.        ,  0.70710678,  0.40824829],
        [ 1.        ,  0.        , -0.40824829],
        [ 0.        ,  0.70710678,  0.81649658]]))

In [10]:
# Construct the diagonal matrix D
D = np.diag(eigenvalues)

# Construct the matrix P (columns are eigenvectors)
P = eigenvectors

P, D

(array([[ 0.        ,  0.70710678,  0.40824829],
        [ 1.        ,  0.        , -0.40824829],
        [ 0.        ,  0.70710678,  0.81649658]]),
 array([[ 2.,  0.,  0.],
        [ 0.,  2.,  0.],
        [ 0.,  0., -1.]]))

In [11]:
# Compute the inverse of P
P_inv = np.linalg.inv(P)

# Reconstruct A from the diagonalization
A_reconstructed = P @ D @ P_inv

A_reconstructed

array([[ 5.,  0., -3.],
       [-3.,  2.,  3.],
       [ 6.,  0., -4.]])

In [12]:
# Numerical verification (allowing for floating-point error)
np.allclose(A, A_reconstructed)

True

### Optional: Manual construction using the exact eigenvectors from the theory

This mirrors the hand-derived diagonalization.

In [15]:
# Manual eigenvectors (as derived analytically)
v1 = np.array([1, 0, 1], dtype=float)
v2 = np.array([0, 1, 0], dtype=float)
v3 = np.array([1, -1, 2], dtype=float)

# Construct P and D manually
P_manual = np.column_stack([v1, v2, v3])
D_manual = np.diag([2, 2, -1])

# Verify the diagonalization
A_manual_reconstructed = P_manual @ D_manual @ np.linalg.inv(P_manual)

P_manual, D_manual, A_manual_reconstructed


(array([[ 1.,  0.,  1.],
        [ 0.,  1., -1.],
        [ 1.,  0.,  2.]]),
 array([[ 2,  0,  0],
        [ 0,  2,  0],
        [ 0,  0, -1]]),
 array([[ 5.,  0., -3.],
        [-3.,  2.,  3.],
        [ 6.,  0., -4.]]))

In [14]:
# Final verification
np.allclose(A, A_manual_reconstructed)

True

## Not All Matrices Are Diagonalizable

Not every matrix is diagonalizable. Consider the following simple example:

$$
A =
\begin{pmatrix}
2 & 1 \\
0 & 2
\end{pmatrix}
$$


### Eigenvalue analysis

The matrix $A$ has a **single eigenvalue**:

$$
\lambda = 2,
$$

with algebraic multiplicity $2$. However, the **dimension of its eigenspace** is only $1$. In other words, the geometric multiplicity of the eigenvalue $2$ is strictly smaller than its algebraic multiplicity.

Therefore, the matrix $A$ is **not diagonalizable**.
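The multiplicities quoted above can be confirmed exactly with SymPy, as in the following sketch. `eigenvects` reports the algebraic multiplicity and a basis of the eigenspace, and `is_diagonalizable` performs the overall test.

```python
import sympy as sp

A = sp.Matrix([[2, 1],
               [0, 2]])

# eigenvects() returns tuples of the form
# (eigenvalue, algebraic multiplicity, basis of the eigenspace).
eigen_data = A.eigenvects()
eigenvalue, algebraic_mult, basis = eigen_data[0]

# The geometric multiplicity is the number of basis vectors.
geometric_mult = len(basis)

diagonalizable = A.is_diagonalizable()
```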

### Why diagonalization fails

Suppose, for the sake of argument, that $A$ were diagonalizable. Since $2$ is the only eigenvalue of $A$, both diagonal entries of $D$ would have to be $2$:

$$
D =
\begin{pmatrix}
2 & 0 \\
0 & 2
\end{pmatrix}
= 2I
$$

By the definition of diagonalizability, we would then have

$$
A = P D P^{-1}
$$

Substituting $D = 2I$, we obtain

$$
A = P (2I) P^{-1}
  = 2 P I P^{-1}
  = 2 P P^{-1}
  = 2I
$$

### Contradiction

But clearly,

$$
A \neq 2I.
$$

This contradiction shows that our assumption was false. Hence, the matrix $A$ **cannot be diagonalizable**.


### Key takeaway

A matrix is diagonalizable **only if** it has enough linearly independent eigenvectors to form a basis. When the geometric multiplicity of an eigenvalue is smaller than its algebraic multiplicity, diagonalization is impossible.
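One practical caveat when working numerically: `np.linalg.eig` does not raise an error for a non-diagonalizable matrix. It still returns a full matrix of eigenvector columns, but those columns are (nearly) parallel, so the resulting $P$ is singular or extremely ill-conditioned. The sketch below measures this with the condition number; the threshold `1e8` is an arbitrary assumption for illustration.

```python
import numpy as np

A = np.array([[2, 1],
              [0, 2]], dtype=float)

eigenvalues, P = np.linalg.eig(A)

# A huge (or infinite) condition number signals that the columns of P
# are numerically linearly dependent, so P^{-1} cannot be trusted.
condition_number = np.linalg.cond(P)
looks_defective = condition_number > 1e8   # assumed threshold
```

For this reason it is worth checking the rank or condition number of $P$ before inverting it.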


## Conclusion

A square matrix $A$ of dimension $n$ is **diagonalizable** if there exist a diagonal matrix $D$ and an invertible matrix $P$ such that

$$
A = P D P^{-1}.
$$

The diagonal entries of $D$ are the **eigenvalues** of $A$, while the columns of $P$ are the corresponding **eigenvectors**.

The matrix $A$ is diagonalizable **if and only if** there exists a basis of $\mathbb{R}^n$ (or $\mathbb{C}^n$) formed by eigenvectors of $A$. This condition is equivalent to requiring that the **sum of the geometric multiplicities** of the eigenvalues of $A$ is equal to $n$.

In particular, if $A$ has $n$ distinct eigenvalues, then $A$ is diagonalizable.

To diagonalize a matrix $A$ in practice:
1. Compute its eigenvalues.
2. Compute the eigenvectors associated with each eigenvalue.
3. Construct the matrix $D$ using the eigenvalues on the diagonal.
4. Construct the matrix $P$ using the corresponding eigenvectors as columns.

If these steps produce an invertible matrix $P$, then the diagonalization of $A$ is complete.


## Exercises

### The Characteristic Polynomial Is Not Enough

In your work as a data scientist, you are working with the following matrix:

$$
A =
\begin{pmatrix}
1 & 1 \\
0 & 2
\end{pmatrix}.
$$

In order to perform a **dimensionality reduction** process, you need to diagonalize the matrix.

**Question:**
Write the **sum of the entries on the diagonal of** $D$.


### Solution

Calculate the eigenvalues of the matrix $A$. Its characteristic polynomial is $\lambda^2 - \operatorname{tr}(A)\,\lambda + \det(A) = \lambda^2 - 3\lambda + 2$.

In [16]:
coeffs = [1, -3, 2]

roots = np.roots(coeffs)
roots

array([2., 1.])

In [17]:
import sympy as sp

# Define the symbol
x = sp.symbols('x')

# Define the characteristic polynomial: x^2 - 3x + 2
poly = x**2 - 3*x + 2

# Solve for x
solutions = sp.solve(poly, x)
solutions

[1, 2]

In [18]:
total = sum(solutions)
print(total)

3
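The same answer can be obtained without computing any eigenvalues. The diagonal entries of $D$ are the eigenvalues of $A$ counted with multiplicity, and their sum always equals the trace of $A$, because similar matrices have the same trace. A short sketch comparing the two quantities:

```python
import numpy as np

A = np.array([[1, 1],
              [0, 2]], dtype=float)

# Sum of the diagonal entries of D = sum of the eigenvalues of A.
eigenvalues = np.linalg.eigvals(A)
sum_of_D_diagonal = eigenvalues.sum()

# Shortcut: the trace of A.
trace_of_A = np.trace(A)
```

Here the trace is $1 + 2 = 3$, in agreement with the result above.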
