# Vector and Matrix Norms and Condition Numbers

This notebook provides an introduction to vector norms, matrix norms, and condition numbers, which are fundamental concepts in numerical linear algebra.

## Vector Norms

A vector norm is a function that assigns a non-negative length or size to a vector. Common vector norms include:

- **L1 norm (Manhattan norm)**: $\|x\|_1 = \sum_{i=1}^n |x_i|$
- **L2 norm (Euclidean norm)**: $\|x\|_2 = \sqrt{\sum_{i=1}^n |x_i|^2}$
- **L∞ norm (Maximum norm)**: $\|x\|_\infty = \max_{1 \leq i \leq n} |x_i|$

Every norm satisfies three properties, for all vectors $x$, $y$ and scalars $c$:
1. Positive definiteness: $\|x\| \geq 0$ and $\|x\| = 0$ iff $x = 0$
2. Homogeneity: $\|cx\| = |c| \|x\|$
3. Triangle inequality: $\|x + y\| \leq \|x\| + \|y\|$
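
The three properties can be spot-checked numerically. The sketch below draws random vectors and scalars and tests each property for the three norms listed above. The seed, dimension, sample count, and tolerance are arbitrary choices made for illustration. A passing check is evidence, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, used only for reproducibility
tol = 1e-12                     # slack for floating-point rounding

checks_passed = True
for p in (1, 2, np.inf):
    for _ in range(1000):
        x = rng.standard_normal(5)
        y = rng.standard_normal(5)
        c = rng.standard_normal()

        nx = np.linalg.norm(x, p)
        ny = np.linalg.norm(y, p)

        # A random x is nonzero, so its norm must be strictly positive
        positive = nx > 0
        # Homogeneity: ||c x|| = |c| ||x|| up to rounding
        homogeneous = abs(np.linalg.norm(c * x, p) - abs(c) * nx) <= tol * (1.0 + abs(c) * nx)
        # Triangle inequality: ||x + y|| <= ||x|| + ||y||
        triangle = np.linalg.norm(x + y, p) <= nx + ny + tol

        checks_passed = checks_passed and bool(positive and homogeneous and triangle)

zero_ok = all(np.linalg.norm(np.zeros(5), p) == 0.0 for p in (1, 2, np.inf))

print(f"All sampled checks passed: {checks_passed}")
print(f"Zero vector has norm 0 in every norm: {zero_ok}")
```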

In [5]:
import numpy as np

# Example vector
x = np.array([3, 4, -5])

# L1 norm
l1_norm = np.linalg.norm(x, 1)
print(f"L1 norm of {x}: {l1_norm}")

# L2 norm
l2_norm = np.linalg.norm(x, 2)
print(f"L2 norm of {x}: {l2_norm}")

# L-infinity norm
linf_norm = np.linalg.norm(x, np.inf)
print(f"L∞ norm of {x}: {linf_norm}")

# Manual calculation for verification
print(f"Manual L1: {np.sum(np.abs(x))}")
print(f"Manual L2: {np.sqrt(np.sum(x**2))}")
print(f"Manual L∞: {np.max(np.abs(x))}")

L1 norm of [ 3  4 -5]: 12.0
L2 norm of [ 3  4 -5]: 7.0710678118654755
L∞ norm of [ 3  4 -5]: 5.0
Manual L1: 12
Manual L2: 7.0710678118654755
Manual L∞: 5


## Matrix Norms

Matrix norms extend the concept of vector norms to matrices. Common matrix norms include:

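- **Induced norms**: Defined from a vector norm by $\|A\|_p = \max_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p}$, the largest factor by which $A$ can stretch a vector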
  - 1-norm: $\|A\|_1 = \max_j \sum_i |a_{ij}|$ (maximum column sum)
  - 2-norm: $\|A\|_2 = \sigma_1$ (largest singular value)
  - ∞-norm: $\|A\|_\infty = \max_i \sum_j |a_{ij}|$ (maximum row sum)

- **Frobenius norm**: $\|A\|_F = \sqrt{\sum_{i,j} |a_{ij}|^2}$ (square root of the sum of squared entries)
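
The singular-value characterizations can be checked directly. The sketch below uses a hypothetical matrix `M` (chosen only for illustration) and `np.linalg.svd` to confirm that the 2-norm equals the largest singular value. It also checks a standard identity the text does not state: the Frobenius norm equals the square root of the sum of squared singular values. The last part samples random vectors to illustrate that no vector is stretched by more than the 2-norm.

```python
import numpy as np

# Hypothetical example matrix, chosen only for illustration
M = np.array([[2.0, -1.0, 0.0],
              [1.0,  3.0, 1.0],
              [0.0, -2.0, 4.0]])

sigma = np.linalg.svd(M, compute_uv=False)  # singular values, largest first

two_norm = np.linalg.norm(M, 2)
fro_norm = np.linalg.norm(M, 'fro')
fro_from_sigma = np.sqrt(np.sum(sigma**2))

print(f"Largest singular value: {sigma[0]:.6f}")
print(f"2-norm:                 {two_norm:.6f}")
print(f"sqrt(sum sigma_i^2):    {fro_from_sigma:.6f}")
print(f"Frobenius norm:         {fro_norm:.6f}")

# An induced norm bounds the stretch of every vector: ||Mx|| <= ||M|| ||x||
rng = np.random.default_rng(1)  # arbitrary seed
X = rng.standard_normal((3, 10_000))  # each column is one sample vector
stretch = np.linalg.norm(M @ X, axis=0) / np.linalg.norm(X, axis=0)
print(f"Largest sampled stretch: {stretch.max():.6f}")
```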

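Note: The Frobenius norm is not an induced (operator) norm, since $\|I\|_F = \sqrt{n}$ while every induced norm gives $\|I\| = 1$. It is, however, unitarily invariant and submultiplicative.

Matrix norms satisfy the same three properties as vector norms. All four norms above are also submultiplicative: $\|AB\| \leq \|A\| \|B\|$. Not every matrix norm is; the entrywise maximum $\max_{i,j} |a_{ij}|$ is a counterexample.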
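
Submultiplicativity is easy to test for the four norms listed above. The sketch below uses two random matrices `P` and `Q` (hypothetical, with an arbitrary seed) and compares $\|PQ\|$ against $\|P\| \|Q\|$. For contrast, it also evaluates the entrywise maximum $\max_{i,j} |a_{ij}|$ on a matrix of ones, where the inequality fails.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
P = rng.standard_normal((4, 4))
Q = rng.standard_normal((4, 4))

results = {}
for name, p in [("1-norm", 1), ("2-norm", 2), ("inf-norm", np.inf), ("Frobenius", "fro")]:
    lhs = np.linalg.norm(P @ Q, p)
    rhs = np.linalg.norm(P, p) * np.linalg.norm(Q, p)
    results[name] = (lhs, rhs)
    print(f"{name:>9}: ||PQ|| = {lhs:8.4f} <= ||P|| ||Q|| = {rhs:8.4f}")


def max_abs(Z):
    """Entrywise maximum: a valid norm on matrices, but not submultiplicative."""
    return np.max(np.abs(Z))


J = np.ones((2, 2))  # J @ J has every entry equal to 2
print(f"max-abs of J @ J: {max_abs(J @ J)}")
print(f"max-abs(J) * max-abs(J): {max_abs(J) * max_abs(J)}")
```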

In [6]:
# Example matrix
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [-1, 0, 2]])

print("Matrix A:")
print(A)
print()

# Matrix norms
norm_1 = np.linalg.norm(A, 1)
norm_2 = np.linalg.norm(A, 2)
norm_inf = np.linalg.norm(A, np.inf)
norm_fro = np.linalg.norm(A, 'fro')

print(f"1-norm: {norm_1}")
print(f"2-norm: {norm_2}")
print(f"∞-norm: {norm_inf}")
print(f"Frobenius norm: {norm_fro}")

# Manual verification for 1-norm (max column sum)
col_sums = np.sum(np.abs(A), axis=0)
print(f"Manual 1-norm (max column sum): {np.max(col_sums)}")

# Manual verification for ∞-norm (max row sum)
row_sums = np.sum(np.abs(A), axis=1)
print(f"Manual ∞-norm (max row sum): {np.max(row_sums)}")

Matrix A:
[[ 1  2  3]
 [ 4  5  6]
 [-1  0  2]]

1-norm: 11.0
2-norm: 9.560659131630937
∞-norm: 15.0
Frobenius norm: 9.797958971132712
Manual 1-norm (max column sum): 11
Manual ∞-norm (max row sum): 15


## Applications and When to Use Different Norms

### Vector Norms

**L1 Norm ($\ell_1$)**:
- **Use when**: You want sparsity or robustness to outliers (in regression contexts)
- **Applications**: Compressed sensing, LASSO regression, signal processing where you want to promote sparse solutions
- **Advantages**: Promotes sparsity when used as a regularizer (e.g., LASSO); often more robust to outliers than L2 in regression settings
- **Example**: In machine learning, L1 regularization encourages feature selection
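
One way to see the robustness claim is a one-dimensional location estimate $c$. Minimizing $\sum_i |x_i - c|$ (an L1 criterion) gives the median, while minimizing $\sum_i (x_i - c)^2$ (an L2 criterion) gives the mean. The sketch below uses made-up data with one outlier and a brute-force grid search. Both are illustrative assumptions, not a recommended way to fit anything.

```python
import numpy as np

# Made-up data with a single large outlier
data = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Brute-force search over candidate locations c (grid step 0.01)
candidates = np.linspace(0.0, 100.0, 10_001)
residuals = data[None, :] - candidates[:, None]  # shape (candidates, data points)

l1_cost = np.sum(np.abs(residuals), axis=1)
l2_cost = np.sum(residuals**2, axis=1)

c_l1 = candidates[np.argmin(l1_cost)]
c_l2 = candidates[np.argmin(l2_cost)]

print(f"L1 minimizer: {c_l1:.2f} (median = {np.median(data):.2f})")
print(f"L2 minimizer: {c_l2:.2f} (mean   = {np.mean(data):.2f})")
```

The L1 estimate stays with the bulk of the data, while the L2 estimate is pulled toward the outlier.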

**L2 Norm ($\ell_2$)**:
- **Use when**: You want physical interpretations or smooth solutions
- **Applications**: Least squares problems, physics (energy, distance), Gaussian processes
- **Advantages**: The squared norm is differentiable everywhere (the norm itself is not differentiable at $x = 0$); corresponds to Euclidean distance
- **Example**: Standard distance measurement, error minimization in regression
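
Least squares is the canonical L2 application: it picks the coefficients that minimize the 2-norm of the residual. The sketch below fits a line to made-up data with `np.linalg.lstsq`. The names `t`, `y`, and `design` are hypothetical. It then checks that the residual is orthogonal to the columns of the design matrix, which characterizes the minimizer.

```python
import numpy as np

# Made-up data: fit y ~ c0 + c1 * t
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

design = np.column_stack([np.ones_like(t), t])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
residual = y - design @ coef

print(f"Coefficients: {coef}")
print(f"Residual 2-norm: {np.linalg.norm(residual):.4f}")

# At the minimizer the residual is orthogonal to the columns of the design matrix
orthogonality = design.T @ residual
print(f"design.T @ residual: {orthogonality}")

# Any other coefficient vector gives a residual at least as large in the 2-norm
perturbed = coef + np.array([0.05, -0.02])  # arbitrary nudge
perturbed_norm = np.linalg.norm(y - design @ perturbed)
print(f"Perturbed residual 2-norm: {perturbed_norm:.4f}")
```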

**L∞ Norm ($\ell_\infty$)**:
- **Use when**: You care about the worst-case scenario or maximum deviation
- **Applications**: Control theory, stability analysis, approximation theory
- **Advantages**: Measures the largest component, useful for bounds
- **Example**: In numerical analysis, checking convergence by maximum error
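
A convergence check by maximum error might look like the sketch below. It runs Jacobi iteration on a hypothetical diagonally dominant system and stops once the largest componentwise change between iterates drops below a tolerance. The matrix `S`, the right-hand side, and the tolerance are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical diagonally dominant system, so Jacobi iteration converges
S = np.array([[10.0, 2.0, 1.0],
              [1.0,  8.0, 2.0],
              [2.0,  1.0, 9.0]])
rhs = np.array([13.0, 11.0, 12.0])  # chosen so the exact solution is [1, 1, 1]

diag = np.diag(S)
off_diag = S - np.diag(diag)

x = np.zeros(3)
tol = 1e-10  # arbitrary stopping tolerance
for iteration in range(1, 201):
    x_new = (rhs - off_diag @ x) / diag
    max_change = np.linalg.norm(x_new - x, np.inf)  # largest componentwise change
    x = x_new
    if max_change < tol:
        break

max_error = np.linalg.norm(x - 1.0, np.inf)
print(f"Stopped after {iteration} iterations, max change {max_change:.2e}")
print(f"Solution: {x}")
print(f"Max error vs exact solution: {max_error:.2e}")
```

Stopping on the L∞ norm guarantees that every component has settled, not just the average.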

### Matrix Norms

**Induced Norms**:
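- **1-norm**: Bounds the output in the L1 vector norm, $\|Ax\|_1 \leq \|A\|_1 \|x\|_1$; cheap to compute from column sums
- **∞-norm**: Bounds the output in the L∞ vector norm, $\|Ax\|_\infty \leq \|A\|_\infty \|x\|_\infty$; cheap to compute from row sums
- **2-norm**: Most commonly used in analysis; equals the largest singular value, so computing it requires an SVD or an estimate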

**Frobenius Norm**:
- **Use when**: You want a measure similar to the L2 vector norm for matrices
- **Applications**: Matrix approximation, low-rank approximations, total energy
- **Advantages**: Easy to compute, relates to the sum of squared elements
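
For low-rank approximation, the Eckart-Young theorem says the truncated SVD is the best rank-$k$ approximation in the Frobenius norm, with error equal to the square root of the sum of the discarded squared singular values. The sketch below checks the error formula on a random matrix `G` (hypothetical, with an arbitrary seed and rank). It does not verify optimality.

```python
import numpy as np

rng = np.random.default_rng(3)   # arbitrary seed
G = rng.standard_normal((6, 4))  # hypothetical data matrix

U, s, Vt = np.linalg.svd(G, full_matrices=False)

k = 2  # target rank, an arbitrary choice
G_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # keep the k largest singular triplets

fro_error = np.linalg.norm(G - G_k, 'fro')
predicted = np.sqrt(np.sum(s[k:]**2))
energy_kept = np.sum(s[:k]**2) / np.sum(s**2)

print(f"Rank of approximation:    {np.linalg.matrix_rank(G_k)}")
print(f"Frobenius error:          {fro_error:.6f}")
print(f"sqrt(sum of dropped s^2): {predicted:.6f}")
print(f"Fraction of energy kept:  {energy_kept:.4f}")
```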

### Choosing Norms in Practice

- **For error analysis**: Use the norm that matches your error tolerance criteria
- **For algorithm stability**: Use the 2-norm, which ties directly to singular values and the condition number $\kappa_2$
- **For sparse solutions**: Use L1 norm (as a regularizer)
- **For worst-case bounds**: Use L∞ norm
- **For computational efficiency**: Choose norms that are easy to compute for your problem

In [9]:
# Practical example: Different norms give different results
import numpy as np

# Example vectors with outliers
v1 = np.array([1, 2, 3, 4, 5])  # No outliers
v2 = np.array([1, 2, 3, 4, 100])  # One large outlier

print("Vector 1:", v1)
print("Vector 2:", v2)
print()

# Compare norms
for vec, name in [(v1, "Vector 1"), (v2, "Vector 2")]:
    l1 = np.linalg.norm(vec, 1)
    l2 = np.linalg.norm(vec, 2)
    l_inf = np.linalg.norm(vec, np.inf)
    print(f"{name}:")
    print(f"  L1 norm: {l1:.2f}")
    print(f"  L2 norm: {l2:.2f}")
    print(f"  L∞ norm: {l_inf:.2f}")
    print()

print("Notice how:")
print("- L1 norm changes least in relative terms (about 7x): each component contributes linearly")
print("- L2 norm grows more (about 13.5x): squaring lets the outlier dominate the sum")
print("- L∞ norm grows most (20x): it is the largest component, so it equals the outlier")
print()

# Matrix norm example
print("Matrix norm applications:")
A = np.array([[1, 0.01], [0.01, 1]])
B = np.array([[1, 0], [0, 0.001]])  # Ill-conditioned
 
print("Matrix A (well-conditioned):")
print(A)
print(f"2-norm condition number: {np.linalg.cond(A):.2f}")
print()

print("Matrix B (ill-conditioned):")
print(B)
print(f"2-norm condition number: {np.linalg.cond(B):.2f}")
print()

print("The 2-norm condition number helps identify matrices that amplify errors in solutions.")

Vector 1: [1 2 3 4 5]
Vector 2: [  1   2   3   4 100]

Vector 1:
  L1 norm: 15.00
  L2 norm: 7.42
  L∞ norm: 5.00

Vector 2:
  L1 norm: 110.00
  L2 norm: 100.15
  L∞ norm: 100.00

Notice how:
- L1 norm changes least in relative terms (about 7x): each component contributes linearly
- L2 norm grows more (about 13.5x): squaring lets the outlier dominate the sum
- L∞ norm grows most (20x): it is the largest component, so it equals the outlier

Matrix norm applications:
Matrix A (well-conditioned):
[[1.   0.01]
 [0.01 1.  ]]
2-norm condition number: 1.02

Matrix B (ill-conditioned):
[[1.    0.   ]
 [0.    0.001]]
2-norm condition number: 1000.00

The 2-norm condition number helps identify matrices that amplify errors in solutions.


## Condition Numbers

The condition number of an invertible matrix $A$ with respect to a norm is defined as:

$\kappa(A) = \|A\| \, \|A^{-1}\|$

It measures how sensitive the solution of $Ax = b$ is to perturbations in $A$ and $b$.

- **Well-conditioned**: $\kappa(A)$ close to 1 (for any induced norm, $\kappa(A) \geq 1$)
- **Ill-conditioned**: $\kappa(A) \gg 1$

For the 2-norm, $\kappa_2(A) = \frac{\sigma_1}{\sigma_n}$ (ratio of largest to smallest singular value). The condition number depends on the chosen norm (by default, many libraries including `np.linalg.cond` use the 2-norm unless otherwise specified).
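
The norm dependence is easy to see by computing $\|A\| \, \|A^{-1}\|$ directly. The sketch below does this for the 3×3 Hilbert matrix in four norms and compares each value with `np.linalg.cond(A, p)`. Forming the explicit inverse is done here only to mirror the definition. The variable names are illustrative.

```python
import numpy as np

# 3x3 Hilbert matrix, a standard ill-conditioned example
H3 = np.array([[1 / (i + j + 1) for j in range(3)] for i in range(3)])
H3_inv = np.linalg.inv(H3)

cond_pairs = {}
for name, p in [("1-norm", 1), ("2-norm", 2), ("inf-norm", np.inf), ("Frobenius", "fro")]:
    from_definition = np.linalg.norm(H3, p) * np.linalg.norm(H3_inv, p)
    from_numpy = np.linalg.cond(H3, p)
    cond_pairs[name] = (from_definition, from_numpy)
    print(f"{name:>9}: definition = {from_definition:10.4f}, np.linalg.cond = {from_numpy:10.4f}")

# For the 2-norm, the same value is the ratio of extreme singular values
sigma = np.linalg.svd(H3, compute_uv=False)
sigma_ratio = sigma[0] / sigma[-1]
print(f"sigma_1 / sigma_n = {sigma_ratio:.4f}")
```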

For a perturbation $\delta b$ of the right-hand side alone, the relative error in the solution satisfies $\frac{\|\delta x\|}{\|x\|} \leq \kappa(A) \frac{\|\delta b\|}{\|b\|}$. Perturbations of $A$ obey a similar bound to first order; the exact form depends on the perturbation model and the norm.
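
The right-hand-side case follows in two steps: $\delta x = A^{-1} \delta b$ gives $\|\delta x\| \leq \|A^{-1}\| \, \|\delta b\|$, and $b = Ax$ gives $\|b\| \leq \|A\| \, \|x\|$. Dividing the first by $\|x\|$ and using the second shows the amplification factor is at most $\kappa(A)$. The sketch below samples random perturbations of $b$ for the 3×3 Hilbert matrix and records the largest amplification seen. The seed, sample count, and perturbation size are arbitrary assumptions.

```python
import numpy as np

H3 = np.array([[1 / (i + j + 1) for j in range(3)] for i in range(3)])
kappa = np.linalg.cond(H3)  # 2-norm condition number

x_true = np.ones(3)
b = H3 @ x_true

rng = np.random.default_rng(4)  # arbitrary seed
worst = 0.0
for _ in range(2000):
    delta_b = 1e-3 * rng.standard_normal(3)  # random perturbation, size chosen arbitrarily
    x_pert = np.linalg.solve(H3, b + delta_b)
    rel_err_x = np.linalg.norm(x_pert - x_true) / np.linalg.norm(x_true)
    rel_err_b = np.linalg.norm(delta_b) / np.linalg.norm(b)
    worst = max(worst, rel_err_x / rel_err_b)

print(f"Condition number:              {kappa:.2f}")
print(f"Largest sampled amplification: {worst:.2f}")
```

The sampled amplification stays below the condition number, which is a worst-case bound over all right-hand sides and perturbation directions.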

In [7]:
# Condition number examples

# Well-conditioned matrix (identity)
I = np.eye(3)
cond_I = np.linalg.cond(I)
print(f"Condition number of identity matrix: {cond_I}")
print()

# Well-conditioned matrix (small off-diagonal coupling)
B = np.array([[1, 0.1],
              [0.1, 1]])
cond_B = np.linalg.cond(B)
print(f"Condition number of matrix B: {cond_B}")
print("Matrix B:")
print(B)
print()

# Ill-conditioned matrix (Hilbert matrix)
H = np.array([[1, 1/2, 1/3],
              [1/2, 1/3, 1/4],
              [1/3, 1/4, 1/5]])
cond_H = np.linalg.cond(H)
print(f"Condition number of Hilbert matrix: {cond_H}")
print("Hilbert matrix:")
print(H)
print()

# Demonstrate sensitivity to perturbations
x_exact = np.array([1, 1, 1])
b = H @ x_exact
print(f"Exact solution: {x_exact}")
print(f"Right-hand side b: {b}")

# Add small perturbation to b
delta_b = np.array([0.001, 0, 0])
b_pert = b + delta_b

# Solve perturbed system and compare relative errors
x_pert = np.linalg.solve(H, b_pert)
rel_err_x = np.linalg.norm(x_pert - x_exact) / np.linalg.norm(x_exact)
rel_err_b = np.linalg.norm(delta_b) / np.linalg.norm(b)
print(f"Perturbed solution: {x_pert}")
print(f"Relative error in solution: {rel_err_x}")
print(f"Relative error in RHS: {rel_err_b}")
print(f"Amplification factor: {rel_err_x / rel_err_b}")

Condition number of identity matrix: 1.0

Condition number of matrix B: 1.2222222222222225
Matrix B:
[[1.  0.1]
 [0.1 1. ]]

Condition number of Hilbert matrix: 524.0567775860644
Hilbert matrix:
[[1.         0.5        0.33333333]
 [0.5        0.33333333 0.25      ]
 [0.33333333 0.25       0.2       ]]

Exact solution: [1 1 1]
Right-hand side b: [1.83333333 1.08333333 0.78333333]
Perturbed solution: [1.009 0.964 1.03 ]
Relative error in solution: 0.027549954627900785
Relative error in RHS: 0.00044072396956813445
Amplification factor: 62.51067908763164


## Summary

- **Vector norms** measure the size of a vector, each emphasizing a different feature: total magnitude (L1), Euclidean length (L2), or largest component (L∞)
- **Matrix norms** extend this concept to matrices and are crucial for analyzing linear transformations
- **Condition numbers** quantify how errors in the input propagate to errors in the output of linear systems
- **The choice of norm** depends on the problem: sparsity (L1), least squares and smooth solutions (L2), worst-case bounds (L∞)

Understanding these concepts is essential for:
- Analyzing the stability of numerical algorithms
- Choosing appropriate solution methods for linear systems
- Interpreting the accuracy of computed solutions
- Selecting norms that match your error criteria or computational goals

