# 02: Condition Numbers

**Module 1.2: Linear Systems & Least Squares**

## Learning Objectives

By the end of this notebook, you will:
1. Understand what condition numbers measure
2. Compute condition numbers using SVD
3. See why $\kappa(X^\top X) = \kappa(X)^2$ is dangerous
4. Diagnose ill-conditioned problems in genomics

## Resources
- Solomon, *Numerical Algorithms*, §4.3
- Cohen, *Practical Linear Algebra*, Chapter 14

In [None]:
import torch
import numpy as np
import matplotlib.pyplot as plt

torch.manual_seed(42)
plt.rcParams['figure.figsize'] = (10, 6)

---
## 1. What is a Condition Number?

The **condition number** $\kappa(A)$ measures how sensitive the solution of $Ax = b$ is to small relative changes in $A$ or $b$.

$$\kappa(A) = \|A\| \cdot \|A^{-1}\| = \frac{\sigma_{\max}}{\sigma_{\min}}$$

where $\sigma_{\max}$ and $\sigma_{\min}$ are the largest and smallest singular values of $A$, and $\|\cdot\|$ is the spectral (2-)norm.

### Interpretation

| $\kappa(A)$ | Status | Meaning |
|-------------|--------|--------|
| $\approx 1$ | Well-conditioned | Small input changes → small output changes |
| $\gg 1$ (e.g. $10^6$) | Ill-conditioned | Small input changes → large output changes |
| $\infty$ | Singular | No unique solution |
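
As a quick check of the definition, the sketch below computes $\kappa(A)$ three ways for a small matrix: from the spectral norms of $A$ and $A^{-1}$, from the singular values, and with NumPy's built-in `np.linalg.cond`. The matrix `A` is an arbitrary illustration, and NumPy is used instead of `torch` only for brevity.

```python
import numpy as np

# Arbitrary invertible 2x2 example (an assumption for illustration)
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

# Definition via spectral norms: kappa = ||A||_2 * ||A^{-1}||_2
kappa_norms = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)

# Equivalent form via singular values: sigma_max / sigma_min
sigma = np.linalg.svd(A, compute_uv=False)  # returned in descending order
kappa_svd = sigma[0] / sigma[-1]

# np.linalg.cond uses the 2-norm by default
kappa_builtin = np.linalg.cond(A)

print(f"via norms:      {kappa_norms:.6f}")
print(f"via SVD:        {kappa_svd:.6f}")
print(f"np.linalg.cond: {kappa_builtin:.6f}")
```

All three values should agree up to rounding. In practice the SVD form is preferred, since forming $A^{-1}$ explicitly is itself unreliable when $A$ is ill-conditioned.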

In [None]:
def compute_condition_number(A):
    """Compute condition number using SVD."""
    U, S, Vh = torch.linalg.svd(A)
    
    sigma_max = S[0].item()
    sigma_min = S[-1].item()
    
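    # Relative test: treat A as singular when sigma_min is at rounding level
    # compared with sigma_max (an absolute cutoff would misjudge rescaled matrices)
    if sigma_min <= sigma_max * torch.finfo(A.dtype).eps: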
        kappa = float('inf')
    else:
        kappa = sigma_max / sigma_min
    
    return kappa, S

# Well-conditioned matrix
A_good = torch.tensor([[1., 0.], [0., 1.]], dtype=torch.float64)  # Identity
kappa, S = compute_condition_number(A_good)
print(f"Identity matrix:")
print(f"  Singular values: {S.tolist()}")
print(f"  Condition number: {kappa:.2f}")
print(f"  Status: Well-conditioned ✓")

print()

# Ill-conditioned matrix
A_bad = torch.tensor([[1., 1.], [1., 1.0001]], dtype=torch.float64)
kappa, S = compute_condition_number(A_bad)
print(f"Nearly singular matrix:")
print(f"  Singular values: {S.tolist()}")
print(f"  Condition number: {kappa:.0f}")
print(f"  Status: Ill-conditioned ✗")

---
## 2. Geometric Intuition

The condition number describes how much a matrix "stretches" space:

- $\sigma_{\max}$: Maximum stretch
- $\sigma_{\min}$: Minimum stretch
- $\kappa = \sigma_{\max}/\sigma_{\min}$: Ratio of stretches

A high condition number means the matrix stretches much more in some directions than others.
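
The "stretch" interpretation can be checked numerically. The sketch below samples unit vectors $v$ on the circle, measures $\|Av\|$ for each, and compares the largest and smallest stretch with the singular values. The matrix `A` and the number of samples are arbitrary choices for illustration.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 0.5]])  # arbitrary example matrix

# Sample many unit vectors on the circle and measure how much A stretches each
theta = np.linspace(0.0, 2.0 * np.pi, 20001)
unit_vectors = np.vstack([np.cos(theta), np.sin(theta)])  # shape (2, N)
stretch = np.linalg.norm(A @ unit_vectors, axis=0)        # ||A v|| for each v

sigma = np.linalg.svd(A, compute_uv=False)

print(f"max stretch ≈ {stretch.max():.4f}, sigma_max = {sigma[0]:.4f}")
print(f"min stretch ≈ {stretch.min():.4f}, sigma_min = {sigma[-1]:.4f}")
print(f"stretch ratio ≈ {stretch.max() / stretch.min():.4f}, "
      f"kappa = {sigma[0] / sigma[-1]:.4f}")
```

Note that this matrix is not diagonal, so the directions of maximum and minimum stretch are not the coordinate axes; the singular values still give the semi-axes of the resulting ellipse.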

In [None]:
# Visualize how matrices transform the unit circle
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

# Unit circle
theta = np.linspace(0, 2*np.pi, 100)
circle = np.array([np.cos(theta), np.sin(theta)])

matrices = [
    (np.array([[1, 0], [0, 1]]), "Identity\nκ = 1"),
    (np.array([[2, 0], [0, 0.5]]), "Moderate stretch\nκ = 4"),
    (np.array([[10, 0], [0, 0.1]]), "Extreme stretch\nκ = 100")
]

for ax, (A, title) in zip(axes, matrices):
    # Transform circle
    ellipse = A @ circle
    
    ax.plot(circle[0], circle[1], 'b--', alpha=0.5, label='Unit circle')
    ax.plot(ellipse[0], ellipse[1], 'r-', linewidth=2, label='Transformed')
    ax.set_xlim(-12, 12)
    ax.set_ylim(-12, 12)
    ax.set_aspect('equal')
    ax.set_title(title)
    ax.legend()
    ax.grid(True, alpha=0.3)
    ax.axhline(y=0, color='k', linewidth=0.5)
    ax.axvline(x=0, color='k', linewidth=0.5)

plt.tight_layout()
plt.show()

---
## 3. Effect on Solution Accuracy

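In double precision (about 16 significant decimal digits), a common rule of thumb is that solving $Ax = b$ loses roughly $\log_{10} \kappa(A)$ digits of accuracy: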

| $\kappa(A)$ | Digits lost | Reliable digits |
|-------------|-------------|----------------|
| $10^0$ | 0 | 16 |
| $10^4$ | 4 | 12 |
| $10^8$ | 8 | 8 |
| $10^{12}$ | 12 | 4 |
| $10^{16}$ | 16 | 0 (garbage) |
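
The rule of thumb follows from the perturbation bound $\frac{\|\delta x\|}{\|x\|} \le \kappa(A)\,\frac{\|\delta b\|}{\|b\|}$. The sketch below constructs the worst case for the nearly singular $2 \times 2$ matrix used earlier in this notebook: $b$ points along the most-stretched output direction and the perturbation $\delta b$ along the least-stretched one. The perturbation size `eps` is an arbitrary assumption.

```python
import numpy as np

# The nearly singular matrix from Section 1
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])

U, sigma, Vt = np.linalg.svd(A)
kappa = sigma[0] / sigma[-1]

# Worst case: b along the first left singular vector,
# perturbation along the last left singular vector
b = U[:, 0]
eps = 1e-6                      # relative size of the perturbation (assumed)
delta_b = eps * U[:, -1]

x = np.linalg.solve(A, b)
x_perturbed = np.linalg.solve(A, b + delta_b)

rel_input_change = np.linalg.norm(delta_b) / np.linalg.norm(b)
rel_output_change = np.linalg.norm(x_perturbed - x) / np.linalg.norm(x)
amplification = rel_output_change / rel_input_change

print(f"kappa(A)      = {kappa:.3e}")
print(f"amplification = {amplification:.3e}")
print(f"digits lost   ≈ {np.log10(amplification):.1f}")
```

In this worst case the amplification matches $\kappa(A)$; for a typical right-hand side and perturbation it is smaller, which is why the table gives an upper estimate, not an exact count.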

In [None]:
def demonstrate_condition_effect(kappa_target):
    """Show how condition number affects solution accuracy."""
    # Create matrix with specified condition number
    n = 10
    U, _ = torch.linalg.qr(torch.randn(n, n, dtype=torch.float64))
    V, _ = torch.linalg.qr(torch.randn(n, n, dtype=torch.float64))
    
    # Singular values from 1 to 1/kappa
    S = torch.logspace(0, -np.log10(kappa_target), n, dtype=torch.float64)
    
    A = U @ torch.diag(S) @ V.T
    
    # True solution
    x_true = torch.randn(n, dtype=torch.float64)
    b = A @ x_true
    
    # Solve
    x_computed = torch.linalg.solve(A, b)
    
    # Error
    relative_error = torch.norm(x_computed - x_true) / torch.norm(x_true)
    
    kappa_actual, _ = compute_condition_number(A)
    
    return kappa_actual, relative_error.item()

print("Effect of condition number on solution accuracy:")
print(f"{'κ(A)':<15} {'Relative Error':<20} {'Accurate Digits':<15}")
print("-" * 50)

for kappa_target in [1e2, 1e4, 1e6, 1e8, 1e10, 1e12]:
    kappa, error = demonstrate_condition_effect(kappa_target)
    digits = max(0, -np.log10(error + 1e-16))
    print(f"{kappa:<15.2e} {error:<20.2e} {digits:<15.1f}")

---
## 4. The Squared Condition Number Problem

For least squares, the normal equations involve $X^\top X$:

$$\kappa(X^\top X) = \kappa(X)^2$$

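This holds (in the 2-norm, for $X$ with full column rank) because the singular values of $X^\top X$ are the squares of the singular values of $X$: if $X = U \Sigma V^\top$ is the thin SVD, then $X^\top X = V \Sigma^2 V^\top$. A design matrix with $\kappa(X) = 10^4$ therefore gives normal equations with $\kappa(X^\top X) = 10^8$, roughly doubling the number of digits lost.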
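
To see the practical consequence, the sketch below solves the same noise-free least-squares problem two ways: through the normal equations, and through a QR factorization of $X$. The problem size, the seed, and the target condition number are assumptions chosen for illustration, and NumPy is used for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 5
kappa_target = 1e6   # assumed condition number for this illustration

# Build X = U diag(s) V^T with singular values from 1 down to 1/kappa_target
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -np.log10(kappa_target), n)
X = U @ np.diag(s) @ V.T

x_true = rng.standard_normal(n)
y = X @ x_true   # noise-free right-hand side, so x_true is the exact solution

# Normal equations: works with X^T X, whose condition number is kappa(X)^2
x_normal = np.linalg.solve(X.T @ X, X.T @ y)

# QR: solve R x = Q^T y, which only involves kappa(X)
Q, R = np.linalg.qr(X)
x_qr = np.linalg.solve(R, Q.T @ y)

def rel_err(x):
    """Relative error of x against the known solution."""
    return np.linalg.norm(x - x_true) / np.linalg.norm(x_true)

print(f"kappa(X)     = {np.linalg.cond(X):.2e}")
print(f"kappa(X^T X) = {np.linalg.cond(X.T @ X):.2e}")
print(f"relative error, normal equations: {rel_err(x_normal):.2e}")
print(f"relative error, QR:               {rel_err(x_qr):.2e}")
```

The QR route should retain several more accurate digits than the normal equations, because it never forms $X^\top X$.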

In [None]:
# Demonstrate κ(X'X) = κ(X)²
print("Condition number squaring:")
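# Header kept outside the f-string: a backslash inside an f-string expression
# is a SyntaxError before Python 3.12
header_xtx = "κ(X'X)"
print(f"{'κ(X)':<15} {header_xtx:<20} {'κ(X)²':<15} {'Match?':<10}")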
print("-" * 60)

for kappa_x in [10, 100, 1000, 10000]:
    # Create X with known condition number
    m, n = 50, 5
    U, _ = torch.linalg.qr(torch.randn(m, n, dtype=torch.float64))
    V, _ = torch.linalg.qr(torch.randn(n, n, dtype=torch.float64))
    S = torch.logspace(0, -np.log10(kappa_x), n, dtype=torch.float64)
    X = U @ torch.diag(S) @ V.T
    
    # Compute condition numbers
    kappa_X, _ = compute_condition_number(X)
    kappa_XtX, _ = compute_condition_number(X.T @ X)
    
    match = "✓" if abs(kappa_XtX - kappa_X**2) / kappa_X**2 < 0.01 else "✗"
    print(f"{kappa_X:<15.0f} {kappa_XtX:<20.0f} {kappa_X**2:<15.0f} {match:<10}")

---
## 5. Diagnosing Ill-Conditioning in Genomics

Common causes of ill-conditioned design matrices:

1. **Collinear covariates**: Batch ≈ Treatment
2. **Scaling issues**: Covariates on very different scales
3. **Imbalanced groups**: 99 controls, 1 treatment
4. **Near-redundant columns**: Age and BirthYear
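
As a concrete instance of the last item, the sketch below builds a design matrix with an intercept, Age, and BirthYear. The sample size, the age range, and the single collection year are hypothetical. Because BirthYear equals the collection year minus Age, the third column is an exact linear combination of the first two.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 20
collection_year = 2024   # hypothetical: all samples collected in the same year

age = rng.integers(20, 81, size=n_samples).astype(float)
birth_year = collection_year - age   # exact linear function of age
intercept = np.ones(n_samples)

# Intercept, Age, BirthYear: BirthYear = 2024 * Intercept - Age
X_redundant = np.column_stack([intercept, age, birth_year])
# Dropping the redundant column removes the exact dependency
X_reduced = np.column_stack([intercept, age])

kappa_redundant = np.linalg.cond(X_redundant)
kappa_reduced = np.linalg.cond(X_reduced)

print(f"kappa with Age and BirthYear: {kappa_redundant:.2e}")
print(f"kappa with Age only:          {kappa_reduced:.2e}")
```

The first condition number should be at rounding level (very large or infinite), while dropping BirthYear brings it down by many orders of magnitude. The reduced matrix is still not perfectly conditioned, since Age is uncentered and on a different scale from the intercept.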

In [None]:
def diagnose_design_matrix(X, names=None):
    """Diagnose potential issues with a design matrix."""
    m, n = X.shape
    
    print(f"Design matrix: {m} samples × {n} covariates")
    
    # SVD
    U, S, Vh = torch.linalg.svd(X, full_matrices=False)
    
    # Condition number
    kappa = (S[0] / S[-1]).item() if S[-1] > 1e-15 else float('inf')
    
    print(f"\nSingular values: {S.numpy().round(4)}")
    print(f"Condition number: {kappa:.2e}")
    
    # Status
    if kappa < 100:
        print("Status: ✓ Well-conditioned")
    elif kappa < 1e6:
        print("Status: ⚠️ Moderate - check your design")
    else:
        print("Status: ✗ Ill-conditioned - results unreliable!")
    
    # Check for near-zero singular values
    effective_rank = (S > S[0] * 1e-10).sum().item()
    if effective_rank < n:
        print(f"\n⚠️ Effective rank: {effective_rank}/{n}")
        print("   Some covariates are nearly collinear!")
    
    # Correlation between columns
    print("\nColumn correlations (|r| > 0.9 flagged):")
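    X_centered = X - X.mean(dim=0)
    # Scale each centered column to unit norm; constant columns (e.g. the intercept) become ~0
    X_norm = X_centered / (X_centered.norm(dim=0) + 1e-10)
    # Columns already have unit norm, so their Gram matrix is the correlation matrix
    # (dividing by m here would shrink every |r| below the 0.9 threshold)
    corr = X_norm.T @ X_norm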
    
    for i in range(n):
        for j in range(i+1, n):
            r = corr[i, j].item()
            if abs(r) > 0.9:
                name_i = names[i] if names else f"Col {i}"
                name_j = names[j] if names else f"Col {j}"
                print(f"   ⚠️ {name_i} ↔ {name_j}: r = {r:.3f}")

# Example: Problematic design
print("Example: Confounded experiment")
print("="*50)

# Batch perfectly correlated with treatment
X_bad = torch.tensor([
    [1., 0., 0.],
    [1., 0., 0.],
    [1., 0., 0.],
    [1., 1., 1.],  # Treatment = Batch!
    [1., 1., 1.],
    [1., 1., 1.],
], dtype=torch.float64)

diagnose_design_matrix(X_bad, names=['Intercept', 'Treatment', 'Batch'])

In [None]:
# Good design: Treatment and Batch not confounded
print("\nExample: Proper experimental design")
print("="*50)

X_good = torch.tensor([
    [1., 0., 0.],  # Control, Batch A
    [1., 0., 0.],
    [1., 1., 0.],  # Treatment, Batch A
    [1., 1., 0.],
    [1., 0., 1.],  # Control, Batch B
    [1., 0., 1.],
    [1., 1., 1.],  # Treatment, Batch B
    [1., 1., 1.],
], dtype=torch.float64)

diagnose_design_matrix(X_good, names=['Intercept', 'Treatment', 'Batch'])

---
## 6. Rules of Thumb

| $\kappa(A)$ | Status | Action |
|-------------|--------|--------|
| $< 100$ | ✓ Good | Proceed normally |
| $100$ to $10^6$ | ⚠️ Moderate | Check design, consider centering/scaling |
| $10^6$ to $10^{12}$ | ✗ Bad | Regularization required |
| $> 10^{12}$ | ✗✗ Very bad | Redesign experiment or model |
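
To illustrate why regularization helps, the sketch below adds a ridge penalty $\lambda I$ to $X^\top X$ and compares condition numbers. The shifted matrix has eigenvalues $\sigma_i^2 + \lambda$, so $\kappa(X^\top X + \lambda I) = (\sigma_{\max}^2 + \lambda)/(\sigma_{\min}^2 + \lambda)$. The matrix, the seed, and the value of `lam` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 5

# Ill-conditioned X with singular values from 1 down to 1e-5 (assumed)
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -5, n)
X = U @ np.diag(s) @ V.T

lam = 1e-4   # hypothetical ridge penalty; in practice it must be tuned

gram = X.T @ X
gram_ridge = gram + lam * np.eye(n)

kappa_gram = np.linalg.cond(gram)
kappa_ridge = np.linalg.cond(gram_ridge)
# Closed form from the shifted eigenvalues
kappa_ridge_formula = (s[0] ** 2 + lam) / (s[-1] ** 2 + lam)

print(f"kappa(X^T X)           = {kappa_gram:.2e}")
print(f"kappa(X^T X + lam I)   = {kappa_ridge:.2e}")
print(f"closed-form prediction = {kappa_ridge_formula:.2e}")
```

With these values the condition number drops from about $10^{10}$ to about $10^4$. The price is bias: the ridge solution no longer equals the least-squares solution, so $\lambda$ trades accuracy of the fit against numerical stability.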

---
## Exercises

### Exercise 1: Compute Condition Numbers
Create matrices with condition numbers of approximately 10, 1000, and 1000000. Verify with SVD.

### Exercise 2: Scaling Effect
Create a design matrix where one column is Age (20-80) and another is Gene Expression (0.001-0.01). Check condition number before and after standardizing.

### Exercise 3: Real Design Matrix
Simulate a design matrix for: 50 samples, 2 batches, 2 treatments, 2 sexes (balanced design). Check if it's well-conditioned.

In [None]:
# Your solutions here


---
## Summary

| Concept | Key Point |
|---------|----------|
| Condition number | $\kappa(A) = \sigma_{\max}/\sigma_{\min}$ |
| Well-conditioned | $\kappa \approx 1$, stable solutions |
| Ill-conditioned | $\kappa \gg 1$, unreliable solutions |
| Squaring problem | $\kappa(X^\top X) = \kappa(X)^2$ |
| Remedy | Solve least squares with QR instead of the normal equations |

## Next: 03_qr_decomposition.ipynb