# Real-World Use Case: Navigating the Loss Landscape

**Scenario**: Training large models is like hiking down a mountain in the fog. Sometimes you get stuck in a valley (a local minimum), on a flat plateau, or at a saddle point.
**Goal**: Compare **SGD** (Blind Hiker) vs **Momentum/Adam** (Heavy Ball) on a difficult surface: a saddle point.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# 1. The Landscape (Saddle Point)
def loss_function(x, y):
    return x**2 - y**2 # Saddle shape

# 2. Visualization (Conceptual)
# This cell only draws the landscape; it does not run an optimizer.
#
# Near a saddle, plain gradient descent settles quickly along the steep
# direction but crawls along the other one, where the gradient is tiny.
# RMSProp/Adam scale the step size separately for each parameter (X and Y).

x = np.linspace(-2, 2, 20)
y = np.linspace(-2, 2, 20)
X, Y = np.meshgrid(x, y)
Z = loss_function(X, Y)

plt.figure(figsize=(8, 6))
plt.contour(X, Y, Z, levels=20)
plt.title("The Saddle Point Problem")
plt.xlabel("Parameter 1")
plt.ylabel("Parameter 2")
plt.plot([0], [0], 'rx', markersize=10, label='Saddle Point')
plt.legend()
plt.show()
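
The plot shows the surface but not how an optimizer moves on it. The following is a minimal sketch of such a simulation, not a definitive benchmark. It assumes the same loss `x**2 - y**2`, a start point just off the X axis, and hypothetical settings (`lr=0.05`, `beta=0.9`, and an escape threshold of `|y| > 1`) chosen only for illustration. Because this toy loss is unbounded below along Y, "escaping" here simply means moving away from the saddle along Y.

```python
import numpy as np

def grad(p):
    """Gradient of the saddle loss f(x, y) = x**2 - y**2."""
    x, y = p
    return np.array([2.0 * x, -2.0 * y])

def steps_to_escape(start, lr=0.05, beta=0.0, max_steps=1000):
    """Count updates until |y| > 1, i.e. the point has left the saddle region.

    beta=0 gives plain gradient descent; beta>0 gives heavy-ball momentum.
    The threshold and hyperparameters are illustrative assumptions.
    """
    p = np.array(start, dtype=float)
    v = np.zeros_like(p)
    for step in range(1, max_steps + 1):
        v = beta * v - lr * grad(p)  # velocity keeps a fraction of past updates
        p = p + v
        if abs(p[1]) > 1.0:
            return step
    return max_steps

start = (-1.5, 1e-3)  # far out along X, almost exactly on the ridge in Y
gd_steps = steps_to_escape(start, beta=0.0)
momentum_steps = steps_to_escape(start, beta=0.9)
print(f"plain GD: {gd_steps} steps, momentum: {momentum_steps} steps")
```

With these settings the momentum run should leave the saddle region in noticeably fewer updates than plain gradient descent, because its velocity compounds the small gradients along Y instead of applying each one in isolation.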

## Conclusion
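In deep learning, **Saddle Points** are generally thought to be far more common than poor Local Minima, especially in high-dimensional loss surfaces.
*   In the plot above, the center is a minimum if you look along X, but a maximum if you look along Y.
*   Standard Gradient Descent stalls here: the gradient is exactly 0 at the saddle and tiny nearby, so each update barely moves the parameters.
*   **Momentum** lets the ball roll through the flat spot because it carries accumulated speed from coming down the hill.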
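
The first two points above can be checked numerically. This is a small sketch, assuming the same loss `x**2 - y**2` as in the plot; the Hessian is written out by hand and the learning rate `lr=0.1` is an arbitrary illustrative choice.

```python
import numpy as np

def grad(x, y):
    """Gradient of f(x, y) = x**2 - y**2."""
    return np.array([2.0 * x, -2.0 * y])

# Hessian of f, constant everywhere: d2f/dx2 = 2, d2f/dy2 = -2, no cross terms.
hessian = np.array([[2.0, 0.0],
                    [0.0, -2.0]])
eigenvalues = np.linalg.eigvalsh(hessian)  # returned in ascending order

# One plain gradient descent step taken exactly at the saddle.
lr = 0.1  # illustrative assumption
point = np.array([0.0, 0.0])
next_point = point - lr * grad(*point)

print("Hessian eigenvalues:", eigenvalues)
print("Gradient norm at saddle:", np.linalg.norm(grad(0.0, 0.0)))
print("Distance moved in one step:", np.linalg.norm(next_point - point))
```

One positive and one negative eigenvalue is what makes the origin a saddle rather than a minimum or maximum, and the zero gradient is why an update taken exactly there does not move.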