# Day 34 — "Condition Number & Numerical Stability: Why Some Networks Are Hard to Train"

The condition number κ measures how unevenly a matrix scales different directions. A large κ makes gradient-based optimization slow and makes results sensitive to the learning rate and to numerical noise.


In [1]:
# Ensure repo root is on sys.path for local imports
import sys
from pathlib import Path

repo_root = Path.cwd()
if not (repo_root / "days").exists():
    for parent in Path.cwd().resolve().parents:
        if (parent / "days").exists():
            repo_root = parent
            break

sys.path.insert(0, str(repo_root))
print(f"Using repo root: {repo_root}")


Using repo root: /media/abdul-aziz/sdb7/masters_research/math_course_dlcv


## 1. Core Intuition

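Ill-conditioned geometry creates narrow ravines: curvature is high across the ravine and low along it. No single learning rate suits both, so gradient descent overshoots in the steep directions and stalls in the flat ones.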
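
The ravine picture can be sketched on a two-dimensional quadratic. This is a minimal illustration, not the course code: the matrix `H`, the helper `run`, the starting point, and the step counts are all assumptions chosen for the example. For f(x) = ½ xᵀHx, each coordinate is multiplied by (1 − lr·λ) per step, so the steep direction diverges once lr > 2/λ_max, while the flat direction barely moves at any stable learning rate.

```python
import numpy as np

# Hypothetical quadratic f(x) = 0.5 * x^T H x with curvatures 1.0 (steep) and 1e-3 (flat)
H = np.diag([1.0, 1e-3])

def run(lr, steps=50):
    """Plain gradient descent from x0 = [1, 1]; the gradient of f is H @ x."""
    x = np.ones(2)
    for _ in range(steps):
        x = x - lr * (H @ x)
    return x

x_safe = run(lr=1.0)   # stable: steep coordinate is solved, flat one stalls near 1
x_large = run(lr=2.5)  # lr > 2 / lambda_max = 2: steep coordinate blows up

print("lr=1.0:", x_safe)
print("lr=2.5:", x_large)
```

The learning rate is capped by the steepest direction, but progress is limited by the flattest one; the gap between the two is exactly what κ measures.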


## 2. Condition Number

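κ(A) = σ_max(A) / σ_min(A), where σ_max and σ_min are the largest and smallest singular values of A. By definition κ ≥ 1, and κ is infinite for a singular matrix.

Small κ (close to 1) → stable. Large κ → sensitive to noise: when solving Ax = b, a relative error in b can be amplified by up to a factor of κ in x.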
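
As a rough check of the definition, the sketch below computes κ from the singular values and compares it with `np.linalg.cond`, which uses the 2-norm by default. The helper name `kappa` and the perturbation vectors are illustrative assumptions. The perturbation is deliberately aligned with the weakest direction, so the relative error in the solution of Ax = b is amplified by the full factor κ.

```python
import numpy as np

def kappa(A):
    """2-norm condition number: largest singular value over smallest."""
    s = np.linalg.svd(A, compute_uv=False)  # singular values in descending order
    return s[0] / s[-1]

A = np.array([[1.0, 0.0], [0.0, 1e-3]])
k = kappa(A)

# Sensitivity of x = A^{-1} b to a small perturbation of b
b = np.array([1.0, 0.0])
db = np.array([0.0, 1e-6])          # perturbation along the weakly scaled direction
x = np.linalg.solve(A, b)
x_pert = np.linalg.solve(A, b + db)

rel_in = np.linalg.norm(db) / np.linalg.norm(b)
rel_out = np.linalg.norm(x_pert - x) / np.linalg.norm(x)
amplification = rel_out / rel_in

print("kappa:", k)
print("error amplification:", amplification)
```

A perturbation along the strongly scaled direction would be amplified far less; κ is a worst-case bound, not a typical value.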


## 3. SVD Connection

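The condition number is determined entirely by the ratio of the largest to the smallest singular value. Rank only counts the nonzero singular values, so two full-rank matrices can have very different κ.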
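
To make the distinction between rank and singular-value spread concrete, here is a small sketch that builds two matrices from the same orthogonal factors but different singular values. The helper `from_singular_values`, the random seed, and the chosen spectra are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random orthogonal factors from QR; they rotate but do not stretch
U, _ = np.linalg.qr(rng.standard_normal((3, 3)))
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))

def from_singular_values(s):
    """Assemble U @ diag(s) @ V^T, a matrix whose singular values are s."""
    return U @ np.diag(s) @ V.T

A_tight = from_singular_values([1.0, 0.5, 0.25])   # narrow spread
A_wide = from_singular_values([1.0, 1e-2, 1e-4])   # wide spread

rank_tight = np.linalg.matrix_rank(A_tight)
rank_wide = np.linalg.matrix_rank(A_wide)
cond_tight = np.linalg.cond(A_tight)
cond_wide = np.linalg.cond(A_wide)

print("ranks:", rank_tight, rank_wide)
print("condition numbers:", cond_tight, cond_wide)
```

Both matrices are full rank, yet their condition numbers differ by several orders of magnitude because only the singular values changed.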


## 4. Python — Conditioning Demo

`days/day34/code/condition_number.py` compares well-conditioned and ill-conditioned matrices.


In [2]:
import numpy as np
from days.day34.code.condition_number import condition_number, run_gd

A_good = np.eye(2)
A_bad = np.array([[1.0, 0.0], [0.0, 1e-3]])

print("Cond(A_good):", condition_number(A_good))
print("Cond(A_bad):", condition_number(A_bad))
print("GD good:", run_gd(A_good))
print("GD bad:", run_gd(A_bad))


Cond(A_good): 1.0
Cond(A_bad): 1000.0
GD good: [9.53674316e-07 9.53674316e-07]
GD bad: [9.53674316e-07 9.90047358e-01]
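
The source of `run_gd` is not shown here, so the following is only a guess at what it might do. The printed value 9.53674316e-07 equals 2⁻²⁰, which is what plain gradient descent on f(x) = ½ xᵀAx would give from x₀ = [1, 1] with learning rate 0.5 after 20 steps. The function name `gd_on_quadratic` and all of its defaults are assumptions chosen to match that output, not the course implementation.

```python
import numpy as np

def gd_on_quadratic(A, lr=0.5, steps=20):
    """Gradient descent on f(x) = 0.5 * x^T A x, starting from a vector of ones.

    Assumed setup: the gradient is A @ x and the minimizer is x = 0.
    """
    x = np.ones(A.shape[0])
    for _ in range(steps):
        x = x - lr * (A @ x)
    return x

x_good = gd_on_quadratic(np.eye(2))
x_bad = gd_on_quadratic(np.array([[1.0, 0.0], [0.0, 1e-3]]))

print("GD good (sketch):", x_good)
print("GD bad (sketch):", x_bad)
```

Under these assumptions each coordinate shrinks by (1 − lr·λᵢ) per step: the coordinate with λ = 1 halves every step, while the coordinate with λ = 1e-3 shrinks by only 0.05% per step and is still near 0.99 after 20 steps.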


## 5. Visualization — GD Paths & Condition Numbers

`days/day34/code/visualizations.py` plots gradient descent paths and compares κ(A).


In [3]:
from days.day34.code.visualizations import plot_conditioning_paths, plot_singular_spread

RUN_FIGURES = False

if RUN_FIGURES:
    plot_conditioning_paths()
    plot_singular_spread()
else:
    print("Set RUN_FIGURES = True to regenerate Day 34 figures inside days/day34/outputs/.")


Set RUN_FIGURES = True to regenerate Day 34 figures inside days/day34/outputs/.


## 6. Why Conditioning Matters in DL

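- Poorly scaled weights distort gradients: some directions are amplified while others are shrunk.
- Condition numbers can compound across layers: κ(AB) ≤ κ(A)·κ(B), so the worst case grows multiplicatively with depth.
- Normalization, residual connections, and adaptive optimizers help improve conditioning.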
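
The effect of depth on conditioning can be checked numerically. For invertible matrices the bound κ(AB) ≤ κ(A)·κ(B) holds, and it is tight when the layers stretch the same directions. The sketch below uses a hypothetical linear "layer" `W` with κ = 2 stacked five times, plus a pair of random matrices to check the bound; all names, sizes, and the seed are illustrative assumptions.

```python
import numpy as np

# Worst case: every layer stretches the same directions, so kappa doubles per layer
W = np.diag([1.0, 0.5])          # hypothetical layer with kappa = 2
product = np.eye(2)
conds = []
for depth in range(1, 6):
    product = W @ product
    conds.append(np.linalg.cond(product))

print("kappa by depth:", conds)

# General case: the product's kappa never exceeds the product of the kappas
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
cond_AB = np.linalg.cond(A @ B)
bound = np.linalg.cond(A) * np.linalg.cond(B)

print("kappa(AB):", cond_AB, "<= bound:", bound)
```

In the aligned case κ grows as 2^depth; for generic matrices the product is usually better conditioned than the bound suggests, but the bound still shows how quickly the worst case can degrade.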


## 7. Mini Exercises

1. Compare matrices with the same rank but different κ.
2. Observe GD convergence for different learning rates.
3. Track condition number growth across layers.


## 8. Key Takeaways

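- The condition number quantifies how difficult the geometry of a problem is for gradient descent.
- Large κ → slow learning that is sensitive to the learning rate and to noise.
- Many deep learning techniques (normalization, residual connections, adaptive optimizers) can be understood as ways to improve conditioning.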
