# Day 44 — "Why Saddle Points Dominate & Why Stochastic Gradient Descent Works"

In high-dimensional loss landscapes, most critical points are saddle points rather than local minima. The gradient noise in SGD helps the optimizer escape them.


In [1]:
# Ensure repo root is on sys.path for local imports
import sys
from pathlib import Path

repo_root = Path.cwd()
if not (repo_root / "days").exists():
    for parent in Path.cwd().resolve().parents:
        if (parent / "days").exists():
            repo_root = parent
            break

sys.path.insert(0, str(repo_root))
print(f"Using repo root: {repo_root}")


Using repo root: /media/abdul-aziz/sdb7/masters_research/math_course_dlcv


## 1. Core Intuition

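A saddle point is a critical point with mixed curvature: the Hessian has both positive and negative eigenvalues, so the loss curves upward in some directions and downward in others. Near a saddle the gradient is close to zero, so deterministic gradient descent makes little progress. The noise in SGD perturbs the iterate along negative-curvature directions, and once it has a component there, the gradient itself pushes it away.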
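To make the mixed-curvature picture concrete, here is a minimal sketch using f(x, y) = x^2 - y^2, the function from Section 4. The names `grad`, `hessian`, and `near_saddle` are illustrative and are not taken from the repo code.

```python
import numpy as np

def grad(p):
    """Gradient of f(x, y) = x**2 - y**2."""
    x, y = p
    return np.array([2.0 * x, -2.0 * y])

# The Hessian of f is constant: curvature +2 along x, -2 along y.
hessian = np.array([[2.0, 0.0],
                    [0.0, -2.0]])
eigenvalues = np.linalg.eigvalsh(hessian)  # returned in ascending order

# Close to the origin the gradient is tiny, so a gradient step barely moves.
near_saddle = np.array([1e-3, 1e-3])
grad_norm = np.linalg.norm(grad(near_saddle))

print("Hessian eigenvalues:", eigenvalues)
print("Gradient norm near the origin:", grad_norm)
```

One negative and one positive eigenvalue is what makes the origin a saddle rather than a minimum or a maximum.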


## 2. Saddle Dominance

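At a critical point in n dimensions, a local minimum requires all n Hessian eigenvalues to be positive. As a rough heuristic, if each eigenvalue's sign were an independent fair coin flip, that would happen with probability 2^-n. Real Hessians do not satisfy this independence assumption, but the conclusion carries over qualitatively: as n grows, mixed-sign Hessians (saddles) become overwhelmingly more likely than all-positive ones.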
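The claim can be checked empirically on a toy model. The sketch below draws random symmetric Gaussian matrices as a stand-in for Hessians and counts how often all eigenvalues are positive. The random-matrix model is an assumption made for illustration; Hessians of real loss functions are not distributed this way. The function name `fraction_all_positive` is hypothetical.

```python
import numpy as np

def fraction_all_positive(dim, n_samples=2000, seed=0):
    """Estimate how often a random symmetric matrix is positive definite."""
    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(n_samples):
        a = rng.standard_normal((dim, dim))
        h = (a + a.T) / 2.0  # symmetrize so eigenvalues are real
        # eigvalsh sorts ascending, so the first entry is the smallest.
        if np.linalg.eigvalsh(h)[0] > 0:
            count += 1
    return count / n_samples

fractions = {dim: fraction_all_positive(dim) for dim in (1, 2, 4, 8)}
for dim, frac in fractions.items():
    print(f"dim={dim}: fraction with all-positive eigenvalues = {frac:.4f}")
```

In this model the fraction of "minimum-like" matrices drops quickly with dimension, which is the sense in which saddles dominate.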


## 3. SGD = Gradient + Noise

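Each SGD update uses a mini-batch gradient. With uniformly sampled batches, this equals the full-batch gradient plus a zero-mean noise term whose magnitude shrinks as the batch grows. The noise is structured: its covariance depends on the data and on the current parameters. It is this noise that pushes the iterate off the flat region around a saddle and lets it explore.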
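The decomposition into gradient plus noise can be sketched on a synthetic least-squares problem. Everything here (the data, `batch_gradient`, `noise_stats`, the batch sizes) is made up for illustration and is not part of the repo code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem (illustrative only).
n_samples, n_features = 1024, 5
X = rng.standard_normal((n_samples, n_features))
y = X @ rng.standard_normal(n_features) + 0.1 * rng.standard_normal(n_samples)
w = np.zeros(n_features)  # point at which gradients are evaluated

def batch_gradient(idx):
    """Gradient of 0.5 * mean((X w - y)**2) over the rows in idx."""
    residual = X[idx] @ w - y[idx]
    return X[idx].T @ residual / len(idx)

full_grad = batch_gradient(np.arange(n_samples))

def noise_stats(batch_size, n_draws=2000):
    """Return (norm of the mean noise, RMS noise norm) over random batches."""
    noise = np.empty((n_draws, n_features))
    for i in range(n_draws):
        idx = rng.choice(n_samples, size=batch_size, replace=False)
        noise[i] = batch_gradient(idx) - full_grad  # mini-batch noise
    mean_noise = np.linalg.norm(noise.mean(axis=0))
    rms_noise = np.sqrt((noise ** 2).sum(axis=1).mean())
    return mean_noise, rms_noise

stats = {b: noise_stats(b) for b in (8, 64)}
for b, (mean_noise, rms_noise) in stats.items():
    print(f"batch={b}: |mean noise|={mean_noise:.4f}, RMS noise={rms_noise:.4f}")
```

The averaged noise should be close to zero relative to its typical size, and the typical size should be smaller for the larger batch.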


## 4. Python — GD vs SGD Near a Saddle

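`days/day44/code/sgd_saddle.py` compares deterministic GD and noisy SGD on f(x,y)=x^2-y^2, which has a saddle at the origin: a minimum along x and a maximum along y. The gradient is (2x, -2y), so any nonzero y component grows under plain GD as well. Because the start point (1, 1) used here already has such a component, both methods leave the saddle along y; GD only gets stuck when it starts with y = 0.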


In [2]:
from days.day44.code.sgd_saddle import run_gd, run_sgd
import numpy as np

x0 = np.array([1.0, 1.0])
print("GD final:", run_gd(x0)[-1])
print("SGD final:", run_sgd(x0)[-1])


GD final: [1.42724769e-05 9.10043815e+03]
SGD final: [1.56953126e-02 9.07149444e+03]
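Both runs above end far from the origin along y, so this start point does not show GD stalling. The sketch below is a stand-alone reimplementation, not the code in `sgd_saddle.py`. The learning rate 0.1, the 50 steps, and the Gaussian noise model are assumptions; the first two were chosen because they are consistent with the printed GD result (0.8^50 ≈ 1.43e-05 and 1.2^50 ≈ 9.10e+03). It adds a start point on the x axis, where the difference between GD and noisy GD becomes visible.

```python
import numpy as np

def grad(p):
    """Gradient of f(x, y) = x**2 - y**2."""
    x, y = p
    return np.array([2.0 * x, -2.0 * y])

def descend(p0, lr=0.1, steps=50, noise_std=0.0, seed=0):
    """Gradient descent with optional Gaussian gradient noise (assumed model)."""
    rng = np.random.default_rng(seed)
    path = [np.asarray(p0, dtype=float)]
    for _ in range(steps):
        g = grad(path[-1]) + noise_std * rng.standard_normal(2)
        path.append(path[-1] - lr * g)
    return np.array(path)

gd_from_1_1 = descend([1.0, 1.0])[-1]                     # y0 != 0: GD escapes
gd_on_axis = descend([1.0, 0.0])[-1]                      # y0 == 0: GD converges to the saddle
noisy_on_axis = descend([1.0, 0.0], noise_std=0.01)[-1]   # noise kicks y off zero

print("GD from (1, 1):      ", gd_from_1_1)
print("GD from (1, 0):      ", gd_on_axis)
print("Noisy GD from (1, 0):", noisy_on_axis)
```

Starting exactly on the x axis, plain GD keeps y at zero and slides into the saddle, while even small noise gives y a nonzero value that then grows by a factor of 1.2 per step.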


## 5. Visualization — GD vs SGD Path

`days/day44/code/visualizations.py` plots trajectories for GD and SGD near the saddle.


In [3]:
from days.day44.code.visualizations import plot_gd_vs_sgd

RUN_FIGURES = False

if RUN_FIGURES:
    plot_gd_vs_sgd()
else:
    print("Set RUN_FIGURES = True to regenerate Day 44 figures inside days/day44/outputs/.")


Set RUN_FIGURES = True to regenerate Day 44 figures inside days/day44/outputs/.


## 6. Mini Exercises

1. Increase batch size (reduce noise) and observe slower escape.
2. Add artificial noise to GD to mimic SGD.
3. Compare momentum vs vanilla SGD.
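A possible starting point for these exercises is sketched below. It is not the repo's implementation, and it rests on several assumptions: gradient noise is modeled as Gaussian with standard deviation `base_noise / sqrt(batch_size)`, momentum is the heavy-ball form with coefficient `beta`, and "escape" means |y| exceeding a threshold after starting at (1, 0). All names are hypothetical.

```python
import numpy as np

def grad(p):
    """Gradient of f(x, y) = x**2 - y**2."""
    x, y = p
    return np.array([2.0 * x, -2.0 * y])

def steps_to_escape(batch_size=1, beta=0.0, lr=0.1, base_noise=0.01,
                    threshold=1.0, max_steps=1000, seed=0):
    """Count steps until |y| > threshold, starting on the x axis at (1, 0)."""
    rng = np.random.default_rng(seed)
    p = np.array([1.0, 0.0])
    v = np.zeros(2)
    # Assumed noise model: std shrinks like 1 / sqrt(batch_size).
    noise_std = base_noise / np.sqrt(batch_size)
    for step in range(1, max_steps + 1):
        g = grad(p) + noise_std * rng.standard_normal(2)
        v = beta * v - lr * g  # heavy-ball momentum; beta=0 is plain noisy GD
        p = p + v
        if abs(p[1]) > threshold:
            return step
    return max_steps

def mean_escape(**kwargs):
    """Average escape time over 20 seeds."""
    return np.mean([steps_to_escape(seed=s, **kwargs) for s in range(20)])

small_batch = mean_escape(batch_size=1)
large_batch = mean_escape(batch_size=10_000)
with_momentum = mean_escape(batch_size=1, beta=0.9)

print("mean steps, batch_size=1:      ", small_batch)
print("mean steps, batch_size=10000:  ", large_batch)
print("mean steps, momentum beta=0.9: ", with_momentum)
```

Under these assumptions, less noise should mean a later escape, and momentum should shorten the escape because it amplifies growth along the negative-curvature direction.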


## 7. Key Takeaways

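- In high dimensions, saddle points far outnumber local minima among critical points.
- GD slows down near saddles, where gradients are close to zero, and converges to the saddle if it starts exactly on the stable directions (y = 0 for the example above).
- SGD noise perturbs the iterate into negative-curvature directions, which enables escape; it is also often associated with better generalization.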
