# Cholesky Decomposition

## Theoretical Foundation

The **Cholesky decomposition** is a factorization of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose. For real symmetric positive-definite matrices, this simplifies to:

$$A = LL^T$$

where $L$ is a lower triangular matrix with positive diagonal entries, and $L^T$ is its transpose.

### Positive-Definite Matrices

A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is **positive-definite** if:

$$\mathbf{x}^T A \mathbf{x} > 0 \quad \text{for all } \mathbf{x} \neq \mathbf{0}$$

Equivalently, all eigenvalues of $A$ are strictly positive.
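
The eigenvalue criterion translates directly into a numerical test. The following is a minimal sketch; the helper name `is_positive_definite` and its `tol` parameter are illustrative assumptions, not part of the rest of this notebook.

```python
import numpy as np

def is_positive_definite(A, tol=0.0):
    """Return True if A is symmetric and all its eigenvalues exceed tol."""
    A = np.asarray(A, dtype=float)
    # The definition above applies to symmetric matrices only
    if A.ndim != 2 or A.shape[0] != A.shape[1] or not np.allclose(A, A.T):
        return False
    # eigvalsh is the eigenvalue routine for symmetric/Hermitian input
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

A_pd = np.array([[2.0, -1.0], [-1.0, 2.0]])    # eigenvalues 1 and 3
A_indef = np.array([[1.0, 2.0], [2.0, 1.0]])   # eigenvalues -1 and 3

print(is_positive_definite(A_pd))
print(is_positive_definite(A_indef))
```

In practice, attempting the Cholesky factorization itself is usually a cheaper test than computing all eigenvalues, since the factorization fails exactly when a non-positive pivot appears.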

### Algorithm Derivation

Given a positive-definite matrix $A$, we compute $L$ element-by-element. Expanding $A = LL^T$:

$$a_{ij} = \sum_{k=1}^{\min(i,j)} l_{ik} l_{jk}$$

For the diagonal elements ($i = j$):

$$l_{ii} = \sqrt{a_{ii} - \sum_{k=1}^{i-1} l_{ik}^2}$$

For the off-diagonal elements ($i > j$):

$$l_{ij} = \frac{1}{l_{jj}} \left( a_{ij} - \sum_{k=1}^{j-1} l_{ik} l_{jk} \right)$$
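
To make the two formulas concrete, the sketch below applies them by hand to a small $3 \times 3$ matrix. The matrix and the variable names (`A_example`, `l11`, ...) are chosen here purely for illustration.

```python
import numpy as np

A_example = np.array([[4.0, 12.0, -16.0],
                      [12.0, 37.0, -43.0],
                      [-16.0, -43.0, 98.0]])

# Row 1: only a diagonal entry
l11 = np.sqrt(A_example[0, 0])

# Row 2: off-diagonal entry first, then the diagonal
l21 = A_example[1, 0] / l11
l22 = np.sqrt(A_example[1, 1] - l21**2)

# Row 3: two off-diagonal entries, then the diagonal
l31 = A_example[2, 0] / l11
l32 = (A_example[2, 1] - l31 * l21) / l22
l33 = np.sqrt(A_example[2, 2] - l31**2 - l32**2)

L_example = np.array([[l11, 0.0, 0.0],
                      [l21, l22, 0.0],
                      [l31, l32, l33]])

print(L_example)
print(np.allclose(L_example @ L_example.T, A_example))
```

Each entry depends only on entries to its left in the same row and on earlier rows, which is why the factor can be filled in row by row.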

### Computational Complexity

The Cholesky decomposition requires approximately $\frac{n^3}{3}$ floating-point operations, roughly half the $\frac{2n^3}{3}$ needed for LU decomposition. This makes it the preferred method for solving systems involving positive-definite matrices.
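
The leading term can be checked by counting operations in the element-wise formulas. The sketch below assumes one particular counting convention (each multiplication, addition or subtraction, division, and square root counts as one operation); the function name `cholesky_flops` is hypothetical.

```python
def cholesky_flops(n):
    """Count arithmetic operations in the element-wise Cholesky formulas."""
    flops = 0
    for i in range(n):
        for j in range(i + 1):
            # j products and j additions/subtractions for the sum,
            # plus one division (off-diagonal) or square root (diagonal)
            flops += 2 * j + 1
    return flops

for n in (10, 50, 200):
    ratio = cholesky_flops(n) / (n**3 / 3)
    print(f"n = {n:4d}  flops = {cholesky_flops(n):9d}  ratio to n^3/3 = {ratio:.4f}")
```

Under this convention the count equals $n(n+1)(2n+1)/6$, so the ratio to $n^3/3$ approaches 1 as $n$ grows.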

### Applications

1. **Linear Systems**: Solving $A\mathbf{x} = \mathbf{b}$ via forward and back substitution
2. **Monte Carlo Simulation**: Generating correlated random variables
3. **Optimization**: Computing the inverse and determinant of covariance matrices
4. **Kalman Filtering**: Numerically stable updates of covariance matrices
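
As an illustration of two of these applications, the sketch below draws correlated Gaussian samples and computes a log-determinant from a Cholesky factor. The covariance matrix `Sigma`, the sample count, and the seed are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target covariance: unit variances, correlation 0.8
Sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])
L_sigma = np.linalg.cholesky(Sigma)  # lower triangular factor

# If z has identity covariance, then L z has covariance L L^T = Sigma
z = rng.standard_normal((200_000, 2))
samples = z @ L_sigma.T

sample_cov = np.cov(samples, rowvar=False)
print(np.round(sample_cov, 2))

# det(Sigma) = prod(diag(L))^2, so log det(Sigma) = 2 * sum(log(diag(L)))
log_det = 2.0 * np.sum(np.log(np.diag(L_sigma)))
print(np.isclose(log_det, np.log(np.linalg.det(Sigma))))
```

Working with the log-determinant through the diagonal of $L$ avoids the overflow and underflow that a direct determinant can suffer for large matrices.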

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.linalg import cholesky as scipy_cholesky

np.random.seed(42)

## Implementation of Cholesky Decomposition

We implement the Cholesky-Banachiewicz algorithm, which computes $L$ row by row.

In [None]:
def cholesky_decomposition(A):
    """
    Compute the Cholesky decomposition of a positive-definite matrix.
    
    Parameters
    ----------
    A : ndarray
        Symmetric positive-definite matrix of shape (n, n)
    
    Returns
    -------
    L : ndarray
        Lower triangular matrix such that A = L @ L.T
    """
    n = A.shape[0]
    L = np.zeros((n, n))
    
    for i in range(n):
        for j in range(i + 1):
            if i == j:
                # Diagonal elements
                sum_sq = np.sum(L[i, :j] ** 2)
                val = A[i, i] - sum_sq
                if val <= 0:
                    raise ValueError("Matrix is not positive-definite")
                L[i, j] = np.sqrt(val)
            else:
                # Off-diagonal elements
                sum_prod = np.sum(L[i, :j] * L[j, :j])
                L[i, j] = (A[i, j] - sum_prod) / L[j, j]
    
    return L

## Generating a Positive-Definite Matrix

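One reliable way to generate a positive-definite matrix is $A = B^T B + \epsilon I$, where $B$ is any real matrix and $\epsilon > 0$. The `generate_positive_definite_matrix` function in this section instead builds $A = Q \Lambda Q^T$ from a random orthogonal matrix $Q$ and a prescribed set of positive eigenvalues $\Lambda$, which lets us control the condition number directly.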
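
For comparison, a minimal sketch of the $B^T B + \epsilon I$ construction follows. The size, seed, and value of `eps` are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_demo = 4
eps = 1e-3
B = rng.standard_normal((n_demo, n_demo))

# x^T (B^T B + eps I) x = ||B x||^2 + eps ||x||^2 > 0 for every x != 0
A_spd = B.T @ B + eps * np.eye(n_demo)

eigs = np.linalg.eigvalsh(A_spd)
print("Symmetric:", np.allclose(A_spd, A_spd.T))
print("Smallest eigenvalue:", eigs.min())
print("Condition number:", eigs.max() / eigs.min())
```

This construction guarantees positive-definiteness but gives no direct control over the condition number, which depends on the random draw of `B`.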

In [None]:
def generate_positive_definite_matrix(n, condition_number=10):
    """
    Generate a random symmetric positive-definite matrix.
    
    Parameters
    ----------
    n : int
        Size of the matrix
    condition_number : float
        Approximate condition number of the resulting matrix
    
    Returns
    -------
    A : ndarray
        Symmetric positive-definite matrix
    """
    # Generate random orthogonal matrix via QR decomposition
    B = np.random.randn(n, n)
    Q, _ = np.linalg.qr(B)
    
    # Create eigenvalues with desired condition number
    eigenvalues = np.linspace(1, condition_number, n)
    
    # Construct A = Q @ diag(eigenvalues) @ Q.T
    A = Q @ np.diag(eigenvalues) @ Q.T
    
    # Ensure symmetry (numerical precision)
    A = (A + A.T) / 2
    
    return A

# Generate test matrix
n = 5
A = generate_positive_definite_matrix(n)

print("Matrix A:")
print(np.array2string(A, precision=4, suppress_small=True))

## Computing and Verifying the Decomposition

In [None]:
# Compute Cholesky decomposition
L = cholesky_decomposition(A)

print("Lower triangular matrix L:")
print(np.array2string(L, precision=4, suppress_small=True))

# Verify: A = L @ L.T
A_reconstructed = L @ L.T
reconstruction_error = np.linalg.norm(A - A_reconstructed, 'fro')

print(f"\nReconstruction error ||A - LL^T||_F = {reconstruction_error:.2e}")

# Compare with SciPy
L_scipy = scipy_cholesky(A, lower=True)
scipy_difference = np.linalg.norm(L - L_scipy, 'fro')
print(f"Difference from SciPy: {scipy_difference:.2e}")

## Solving Linear Systems

Given $A\mathbf{x} = \mathbf{b}$ with $A = LL^T$, we solve:

1. $L\mathbf{y} = \mathbf{b}$ (forward substitution)
2. $L^T\mathbf{x} = \mathbf{y}$ (back substitution)
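
The same two steps can also be expressed with a library triangular solver. The sketch below uses `scipy.linalg.solve_triangular` on a small system chosen for illustration; the names `A_demo`, `b_demo`, and so on are not used elsewhere in this notebook.

```python
import numpy as np
from scipy.linalg import solve_triangular

A_demo = np.array([[4.0, 2.0],
                   [2.0, 3.0]])
b_demo = np.array([2.0, 1.0])

L_demo = np.linalg.cholesky(A_demo)

# Step 1: forward substitution, L y = b
y_demo = solve_triangular(L_demo, b_demo, lower=True)

# Step 2: back substitution, L^T x = y
x_demo = solve_triangular(L_demo.T, y_demo, lower=False)

print(x_demo)
print(np.allclose(A_demo @ x_demo, b_demo))
```

Each triangular solve costs on the order of $n^2$ operations, so once $L$ is known, additional right-hand sides are cheap compared with the factorization itself.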

In [None]:
def solve_cholesky(L, b):
    """
    Solve Ax = b using the Cholesky factor L where A = LL^T.
    
    Parameters
    ----------
    L : ndarray
        Lower triangular Cholesky factor
    b : ndarray
        Right-hand side vector
    
    Returns
    -------
    x : ndarray
        Solution vector
    """
    n = L.shape[0]
    
    # Forward substitution: Ly = b
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - np.dot(L[i, :i], y[:i])) / L[i, i]
    
    # Back substitution: L^T x = y
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - np.dot(L[i+1:, i], x[i+1:])) / L[i, i]
    
    return x

# Test linear system solving
b = np.random.randn(n)
x_cholesky = solve_cholesky(L, b)
x_direct = np.linalg.solve(A, b)

print(f"Solution via Cholesky: {x_cholesky}")
print(f"Solution via np.linalg.solve: {x_direct}")
print(f"Difference: {np.linalg.norm(x_cholesky - x_direct):.2e}")

## Numerical Stability Analysis

We analyze the numerical stability of Cholesky decomposition by examining how errors scale with matrix condition number.

In [None]:
# Test stability across different condition numbers
condition_numbers = np.logspace(0, 6, 20)
reconstruction_errors = []
solution_errors = []

test_size = 20

for kappa in condition_numbers:
    # Generate matrix with specified condition number
    A_test = generate_positive_definite_matrix(test_size, condition_number=kappa)
    
    # Cholesky decomposition
    L_test = cholesky_decomposition(A_test)
    
    # Reconstruction error
    recon_err = np.linalg.norm(A_test - L_test @ L_test.T, 'fro') / np.linalg.norm(A_test, 'fro')
    reconstruction_errors.append(recon_err)
    
    # Solution error
    x_true = np.random.randn(test_size)
    b_test = A_test @ x_true
    x_computed = solve_cholesky(L_test, b_test)
    sol_err = np.linalg.norm(x_computed - x_true) / np.linalg.norm(x_true)
    solution_errors.append(sol_err)

print("Stability analysis complete.")

## Visualization

We visualize:
1. The original matrix $A$
2. The lower triangular structure of the Cholesky factor $L$
3. Numerical stability as a function of condition number
4. The eigenvalues of $A$ compared with those of $LL^T$

In [None]:
fig = plt.figure(figsize=(14, 10))

# Generate a larger matrix for visualization
n_vis = 10
A_vis = generate_positive_definite_matrix(n_vis, condition_number=50)
L_vis = cholesky_decomposition(A_vis)

# Plot 1: Original matrix A
ax1 = fig.add_subplot(2, 2, 1)
im1 = ax1.imshow(A_vis, cmap='RdBu_r', aspect='equal')
ax1.set_title('Original Matrix $A$', fontsize=12)
ax1.set_xlabel('Column index')
ax1.set_ylabel('Row index')
plt.colorbar(im1, ax=ax1, shrink=0.8)

# Plot 2: Cholesky factor L
ax2 = fig.add_subplot(2, 2, 2)
im2 = ax2.imshow(L_vis, cmap='RdBu_r', aspect='equal')
ax2.set_title('Cholesky Factor $L$', fontsize=12)
ax2.set_xlabel('Column index')
ax2.set_ylabel('Row index')
plt.colorbar(im2, ax=ax2, shrink=0.8)

# Plot 3: Numerical stability - reconstruction and solution errors
ax3 = fig.add_subplot(2, 2, 3)
ax3.loglog(condition_numbers, reconstruction_errors, 'b-o', linewidth=2, 
           markersize=4, label='Reconstruction error')
ax3.loglog(condition_numbers, solution_errors, 'r-s', linewidth=2, 
           markersize=4, label='Solution error')
ax3.axhline(y=np.finfo(float).eps, color='k', linestyle='--', 
            label='Machine epsilon', alpha=0.7)
ax3.set_xlabel('Condition Number $\\kappa(A)$', fontsize=11)
ax3.set_ylabel('Relative Error', fontsize=11)
ax3.set_title('Numerical Stability Analysis', fontsize=12)
ax3.legend(loc='upper left', fontsize=9)
ax3.grid(True, alpha=0.3)

# Plot 4: Eigenvalue distribution comparison
ax4 = fig.add_subplot(2, 2, 4)
eigenvalues_A = np.linalg.eigvalsh(A_vis)
eigenvalues_LLT = np.linalg.eigvalsh(L_vis @ L_vis.T)

x_pos = np.arange(len(eigenvalues_A))
width = 0.35

bars1 = ax4.bar(x_pos - width/2, eigenvalues_A, width, label='$A$', alpha=0.8)
bars2 = ax4.bar(x_pos + width/2, eigenvalues_LLT, width, label='$LL^T$', alpha=0.8)

ax4.set_xlabel('Eigenvalue Index', fontsize=11)
ax4.set_ylabel('Eigenvalue', fontsize=11)
ax4.set_title('Eigenvalue Preservation: $A$ vs $LL^T$', fontsize=12)
ax4.legend(loc='upper left', fontsize=9)
ax4.set_xticks(x_pos)
ax4.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.savefig('plot.png', dpi=150, bbox_inches='tight')
plt.show()

print("\nFigure saved to 'plot.png'")

## Computational Complexity Comparison

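We time the custom and SciPy implementations across matrix sizes and compare them against the expected $O(n^3)$ scaling (about $n^3/3$ operations). The custom implementation runs its loops in Python, so its timings include substantial interpreter overhead.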
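
One way to turn timings into a scaling estimate is to fit a line to $\log(\text{time})$ against $\log(n)$; the slope approximates the exponent. The sketch below is illustrative only: the helper `estimate_scaling_exponent` is hypothetical, and the data are synthetic, generated from an exact cubic law rather than measured.

```python
import numpy as np

def estimate_scaling_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size)."""
    slope, _intercept = np.polyfit(np.log(sizes), np.log(times), 1)
    return slope

# Synthetic timings that follow t = c * n^3 exactly (not measurements)
demo_sizes = np.array([100, 200, 400, 800])
demo_times = 2e-9 * demo_sizes.astype(float) ** 3

print(f"Estimated exponent: {estimate_scaling_exponent(demo_sizes, demo_times):.2f}")
```

With measured timings, the fitted slope is often below 3 at small sizes, where fixed call overhead rather than arithmetic dominates the runtime.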

In [None]:
import time

sizes = [10, 20, 50, 100, 200, 500]
times_custom = []
times_scipy = []

for size in sizes:
    A_time = generate_positive_definite_matrix(size)
    
    # Time custom implementation
    start = time.perf_counter()
    _ = cholesky_decomposition(A_time)
    times_custom.append(time.perf_counter() - start)
    
    # Time SciPy implementation
    start = time.perf_counter()
    _ = scipy_cholesky(A_time, lower=True)
    times_scipy.append(time.perf_counter() - start)

print("\nTiming Results:")
print(f"{'Size':<8} {'Custom (s)':<15} {'SciPy (s)':<15} {'Speedup':<10}")
print("-" * 48)
for i, size in enumerate(sizes):
    speedup = times_custom[i] / times_scipy[i] if times_scipy[i] > 0 else float('inf')
    print(f"{size:<8} {times_custom[i]:<15.6f} {times_scipy[i]:<15.6f} {speedup:<10.1f}x")

## Conclusion

The Cholesky decomposition provides an efficient and numerically stable method for factorizing symmetric positive-definite matrices. Key takeaways:

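1. **Efficiency**: Requires about $n^3/3$ operations, roughly half that of LU decomposition
2. **Stability**: The relative reconstruction error stays near machine precision, while the solution error grows roughly linearly, not quadratically, with the condition number
3. **Uniqueness**: The factor $L$ with positive diagonal entries is unique for positive-definite matrices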
4. **Applications**: Essential for solving linear systems, optimization, and simulation

The implementation demonstrated here correctly computes the Cholesky factor with errors at machine precision level for well-conditioned matrices.