# Content


### The Core Idea: Slope on a Mountain

Imagine you are standing on the side of a mountain. A friend asks, "What's the slope here?"

You can't give a single answer. The slope depends on the **direction** you are facing.
*   If you face directly uphill, the slope is steep and positive.
*   If you face directly downhill, the slope is steep and negative.
*   If you face sideways, along a path of constant elevation, the slope is zero.
*   If you face any other direction, you get some other value for the slope.

Multivariable calculus gives us the tools to answer this question precisely.
*   The **Gradient (∇f)** tells you which way is "directly uphill" and how steep it is.
*   The **Directional Derivative (Dᵤf)** tells you the slope in any specific direction you choose.

---

### The Gradient

#### Theory
The **gradient** of a multivariable function, `f(x, y)`, is a **vector** that packages the function's partial derivatives together. It is denoted by `∇f` (read "del f" or "grad f").

This vector has a direct geometric meaning: **wherever it is nonzero, it points in the direction of the greatest rate of increase (steepest ascent) of the function at a given point.**

*   **Formula:** For a function `f(x, y)`, the gradient is:
    **∇f(x, y) = < ∂f/∂x , ∂f/∂y >**
    *   `∂f/∂x`: The partial derivative of `f` with respect to `x` (treat `y` as a constant).
    *   `∂f/∂y`: The partial derivative of `f` with respect to `y` (treat `x` as a constant).

The **magnitude** or length of the gradient vector, `||∇f||`, is also important: it tells you the *value* of the slope in that steepest direction.
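When symbolic partial derivatives are inconvenient, the same vector can be approximated numerically. The sketch below is one way to do that, assuming a central-difference step `h`. The helper `numerical_gradient` and the function `example_surface` are hypothetical names chosen for illustration, not part of any library.

```python
import math

def numerical_gradient(f, x, y, h=1e-6):
    """Approximate <∂f/∂x, ∂f/∂y> at (x, y) with central differences."""
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)  # vary x, hold y fixed
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)  # vary y, hold x fixed
    return (df_dx, df_dy)

def example_surface(x, y):
    # Hypothetical example function, chosen only for illustration
    return x**2 + 3 * x * y

grad = numerical_gradient(example_surface, 1.0, 2.0)
steepest_slope = math.sqrt(grad[0]**2 + grad[1]**2)

print(grad)            # close to (8.0, 3.0), since ∇f = <2x + 3y, 3x>
print(steepest_slope)  # close to sqrt(73)
```

Holding one variable fixed while nudging the other mirrors the "treat it as a constant" rule in the formula above.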

#### Calculation Example
Let's find the gradient of the function `f(x, y) = x³y²` at the point `P(2, 1)`.

1.  **Find the partial derivatives:**
    *   `∂f/∂x`: Treat `y²` as a constant. The derivative of `x³` is `3x²`.
        *   `∂f/∂x = 3x²y²`
    *   `∂f/∂y`: Treat `x³` as a constant. The derivative of `y²` is `2y`.
        *   `∂f/∂y = 2x³y`

2.  **Assemble the gradient vector:**
    *   `∇f(x, y) = < 3x²y² , 2x³y >`

3.  **Evaluate the gradient at the point P(2, 1):**
    *   `∇f(2, 1) = < 3(2)²(1)² , 2(2)³(1) > = < 3(4)(1) , 2(8)(1) > = <12, 16>`

**Interpretation:**
*   At the point (2, 1) on the surface defined by `f(x, y)`, the direction of steepest ascent is given by the vector `<12, 16>`.
*   The slope in this steepest direction is the magnitude of the gradient: `||∇f(2,1)|| = √(12² + 16²) = √(144 + 256) = √400 = 20`.
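
The arithmetic above can be checked in a few lines of Python. This is a minimal sketch that hard-codes the partial derivatives from step 1 and then cross-checks them with central differences. The helper name `grad_f` and the step size `h` are assumptions made for the example.

```python
import math

def f(x, y):
    return x**3 * y**2

def grad_f(x, y):
    """Gradient of f, using the partial derivatives derived by hand."""
    return (3 * x**2 * y**2, 2 * x**3 * y)

gx, gy = grad_f(2, 1)
magnitude = math.sqrt(gx**2 + gy**2)
print((gx, gy))    # (12, 16)
print(magnitude)   # 20.0

# Independent cross-check with central differences (h is an arbitrary small step)
h = 1e-6
fd_x = (f(2 + h, 1) - f(2 - h, 1)) / (2 * h)
fd_y = (f(2, 1 + h) - f(2, 1 - h)) / (2 * h)
print(fd_x, fd_y)  # approximately 12 and 16
```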

#### Real-Life Usage
*   **Machine Learning (Gradient Descent):** This is the core algorithm for training most machine learning models. The "loss" is a function of the model's many parameters. The algorithm calculates the gradient of the loss function. To minimize the loss, it repeatedly takes a small step in the direction **opposite** to the gradient (the direction of steepest *descent*), thereby updating the model's parameters to make it more accurate.
*   **Meteorology:** Weather maps show isobars (lines of constant pressure). The pressure-gradient force acts along `-∇P`, the direction of the fastest pressure drop, pushing air from high toward low pressure. Tightly packed isobars mean a large `||∇P||` and therefore stronger winds.
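
The gradient descent loop described above can be sketched on a toy problem. The two-parameter `loss` function, the starting point, and the `learning_rate` below are all assumptions chosen for illustration. Real models have far more parameters and usually compute gradients automatically.

```python
def loss(w1, w2):
    # Hypothetical bowl-shaped loss with its minimum at (3, -1)
    return (w1 - 3) ** 2 + (w2 + 1) ** 2

def loss_gradient(w1, w2):
    # Partial derivatives of the loss above
    return (2 * (w1 - 3), 2 * (w2 + 1))

w1, w2 = 0.0, 0.0      # arbitrary starting parameters
learning_rate = 0.1    # step size (an assumed value)

for step in range(100):
    g1, g2 = loss_gradient(w1, w2)
    # Step opposite to the gradient: the direction of steepest descent
    w1 -= learning_rate * g1
    w2 -= learning_rate * g2

print(w1, w2)          # approaches (3, -1)
print(loss(w1, w2))    # approaches 0
```

Flipping the sign of the update (adding instead of subtracting) would climb the loss surface instead, which is why the minus sign matters.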

---

### Gradient and Graphs / Gradient and Contour Maps

These two concepts are best understood together as they provide the geometric interpretation of the gradient.

#### Gradient and Graphs (3D View)
If you visualize the function `f(x, y)` as a 3D surface, the gradient vector `∇f(a,b)` at a point `(a,b)` can be thought of as a vector in the xy-plane that points in the direction you would have to walk to go straight uphill on the surface.

#### Gradient and Contour Maps (2D View)
This is the more powerful visualization. A **contour map** shows the function's level curves: lines along which the value `f(x, y)` is constant. For our mountain analogy, these are lines of constant elevation.

**Key Property:** The gradient vector `∇f` at any point is always **perpendicular (orthogonal)** to the contour line that passes through that point.

**Why?** The contour line represents the direction of **zero change** in the function's value. The gradient represents the direction of **maximum change**. It makes intuitive sense that the direction of maximum change must be perpendicular to the direction of no change.


*Image: The red arrows are gradient vectors. Notice how each one is perpendicular to the blue contour line it starts on.*
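
The perpendicularity property can also be checked numerically. The sketch below reuses `f(x, y) = x³y²` and `∇f(2, 1) = <12, 16>` from the gradient example, rotates the gradient by 90 degrees to get a direction along the contour, and compares how much `f` changes for a small step each way. The step length is an assumed value.

```python
import math

def f(x, y):
    return x**3 * y**2

px, py = 2.0, 1.0
gx, gy = 12.0, 16.0                 # ∇f(2, 1), from the worked example

# Rotate the gradient by 90 degrees to get a direction along the contour
tx, ty = -gy, gx
norm = math.hypot(tx, ty)           # the rotated vector has the same length as the gradient
tx, ty = tx / norm, ty / norm       # unit tangent to the contour
ux, uy = gx / norm, gy / norm       # unit vector along the gradient

dot = gx * tx + gy * ty             # zero up to rounding: the vectors are perpendicular

step = 1e-3                         # small step length (an assumed value)
change_along_contour = f(px + step * tx, py + step * ty) - f(px, py)
change_along_gradient = f(px + step * ux, py + step * uy) - f(px, py)

print(dot)
print(change_along_contour)   # tiny: second order in the step size
print(change_along_gradient)  # roughly 20 * step
```

The function barely changes along the contour direction, while the same step along the gradient changes it by about the gradient magnitude times the step.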

---

### Directional Derivative and Slope

#### Theory
The gradient tells us the slope in the steepest direction, but what about all the other directions? That's what the **directional derivative** is for.

The directional derivative of `f` at a point `(x, y)` in the direction of a **unit vector** `u` is the slope of the surface at that point in that specific direction. It is denoted `Dᵤf`.

**Crucial Point:** The direction must be specified by a **unit vector `u`** (a vector with a magnitude of 1). If you have a direction vector `v` that is not a unit vector, you must first normalize it: `u = v / ||v||`.

The formal definition is a limit, `Dᵤf(x, y) = lim(h→0) [f(x + h·u₁, y + h·u₂) − f(x, y)] / h` with `u = <u₁, u₂>`, but for a differentiable function the practical calculation is much simpler and involves only the gradient.

*   **Formula:** **Dᵤf(x, y) = ∇f(x, y) ⋅ u**
    *   The directional derivative is the **dot product** of the gradient and the direction unit vector.
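
To see that the dot-product formula agrees with the limit-based definition, the sketch below compares the two for `f(x, y) = x³y²` at `(2, 1)`, using the gradient `<12, 16>` found earlier. The direction `<1, 1>` and the step sizes `h` are arbitrary choices made for illustration.

```python
import math

def f(x, y):
    return x**3 * y**2

px, py = 2.0, 1.0
gx, gy = 12.0, 16.0                   # ∇f(2, 1), from the gradient example

vx, vy = 1.0, 1.0                     # an arbitrary, non-unit direction
length = math.hypot(vx, vy)
ux, uy = vx / length, vy / length     # normalize first

dot_product_value = gx * ux + gy * uy   # ∇f ⋅ u

# Difference quotient from the limit definition, for shrinking h
for h in (1e-1, 1e-3, 1e-5):
    quotient = (f(px + h * ux, py + h * uy) - f(px, py)) / h
    print(h, quotient)

print(dot_product_value)   # 28 / sqrt(2), about 19.80
```

As `h` shrinks, the difference quotient settles toward the dot-product value, which is what the limit definition predicts.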

#### Calculation Example
Let's continue with our function `f(x, y) = x³y²` at the point `P(2, 1)`, where we found `∇f(2, 1) = <12, 16>`.

We want to find the slope of the surface at this point in the direction of the vector **v = <3, 4>**.

1.  **Normalize the direction vector `v` to get the unit vector `u`:**
    *   `||v|| = √(3² + 4²) = √(9 + 16) = √25 = 5`
    *   `u = v / ||v|| = <3, 4> / 5 = <3/5, 4/5>`

2.  **Calculate the dot product:**
    *   `Dᵤf(2, 1) = ∇f(2, 1) ⋅ u`
    *   `Dᵤf(2, 1) = <12, 16> ⋅ <3/5, 4/5>`
    *   `Dᵤf(2, 1) = (12 * 3/5) + (16 * 4/5) = 36/5 + 64/5 = 100/5 = 20`

**Interpretation:**
*   The slope of the surface at the point (2, 1) in the direction of `<3, 4>` is exactly **20**.

**Wait, this is the same as the magnitude of the gradient!** Why? Because the direction vector we chose, `<3, 4>`, happens to point in the exact same direction as our gradient vector, `<12, 16>` (since `<12, 16> = 4 * <3, 4>`). We calculated the slope in the direction of steepest ascent, which is, by definition, the magnitude of the gradient.

**Let's try a different direction:** Find the slope in the direction of **v = <1, 0>** (the positive x-direction).
1.  `v = <1, 0>` is already a unit vector, so `u = <1, 0>`.
2.  `Dᵤf(2, 1) = <12, 16> ⋅ <1, 0> = (12*1) + (16*0) = 12`.
    *   **Interpretation:** The slope in the pure x-direction is 12. Notice that this is exactly the value of the partial derivative `∂f/∂x` at that point! The directional derivative is a generalization of the partial derivative.
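
Going back to the mountain analogy, sweeping the direction through a full circle shows how the slope varies with the direction you face. The sketch below evaluates `∇f(2, 1) ⋅ u` at whole-degree angles. The one-degree resolution and the variable names are assumptions made for the example.

```python
import math

gx, gy = 12.0, 16.0   # ∇f(2, 1), from the worked example

# Directional derivative for unit vectors at whole-degree angles 0..359
slopes = {}
for degrees in range(360):
    theta = math.radians(degrees)
    ux, uy = math.cos(theta), math.sin(theta)   # already unit length
    slopes[degrees] = gx * ux + gy * uy

steepest_angle = max(slopes, key=slopes.get)
steepest_slope = slopes[steepest_angle]
gradient_angle = math.degrees(math.atan2(gy, gx))

print(steepest_angle, steepest_slope)   # angle nearest the gradient, slope just under 20
print(gradient_angle)                   # about 53.13 degrees
print(min(slopes.values()))             # just above -20, facing directly downhill
```

The largest slope occurs at the sampled angle closest to the gradient direction, and its value approaches `||∇f|| = 20`.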

---

### Python Code Illustration

The code below uses `sympy` for symbolic differentiation to find the gradient, then `numpy` and `matplotlib` to evaluate it and visualize the concepts. It works with a different function, `f(x, y) = x² sin(y)`, at the point `(1, π/2)`.


In [None]:
import sympy as sp
import numpy as np
import matplotlib.pyplot as plt

# --- 1. Gradient Calculation using SymPy ---
# Define symbolic variables
x, y = sp.symbols('x y')
# Define a function
f = x**2 * sp.sin(y)

# Calculate partial derivatives
df_dx = sp.diff(f, x)
df_dy = sp.diff(f, y)

# Create the gradient vector
gradient = [df_dx, df_dy]

# Lambdify the symbolic expressions into fast, numerical Python functions
f_np = sp.lambdify((x, y), f, 'numpy')
grad_np = [sp.lambdify((x, y), comp, 'numpy') for comp in gradient]

# Point of interest
point = (1, np.pi/2)

# Evaluate the gradient at the point
grad_at_point = [comp(*point) for comp in grad_np]
grad_magnitude = np.linalg.norm(grad_at_point)

print("--- Gradient Calculation ---")
print(f"Function f(x, y) = {f}")
print(f"Gradient ∇f = {gradient}")
print(f"\nAt point {point}:")
print(f"  ∇f({point[0]}, {point[1]:.2f}) = <{grad_at_point[0]:.2f}, {grad_at_point[1]:.2f}>")
print(f"  Magnitude ||∇f|| (steepest slope): {grad_magnitude:.4f}")
print("-" * 50)


# --- 2. Directional Derivative Calculation ---
# Find the slope in the direction of vector v = <-1, 2>
v = np.array([-1, 2])
# Normalize v to get the unit vector u
u = v / np.linalg.norm(v)

# Calculate the dot product: Dᵤf = ∇f ⋅ u
directional_derivative = np.dot(grad_at_point, u)

print("\n--- Directional Derivative Calculation ---")
print(f"Direction vector v = {v}")
print(f"Unit vector u = <{u[0]:.4f}, {u[1]:.4f}>")
print(f"Directional derivative Dᵤf = ∇f ⋅ u = {directional_derivative:.4f}")
print(f"Interpretation: The slope of the surface at the point in this direction is {directional_derivative:.4f}.")
print("-" * 50)


# --- 3. Visualization: Gradient and Contour Map ---
print("\n--- Visualization ---")
# Create a grid of points for plotting
x_vals = np.linspace(-2, 2, 40)
y_vals = np.linspace(0, np.pi, 40)
X, Y = np.meshgrid(x_vals, y_vals)
Z = f_np(X, Y)

fig = plt.figure(figsize=(12, 5))

# 3D Surface Plot
ax1 = fig.add_subplot(1, 2, 1, projection='3d')
ax1.plot_surface(X, Y, Z, cmap='viridis', alpha=0.8)
ax1.set_title('3D Surface Plot of f(x, y)')
ax1.set_xlabel('x')
ax1.set_ylabel('y')

# 2D Contour Plot
ax2 = fig.add_subplot(1, 2, 2)
contours = ax2.contour(X, Y, Z, 20, cmap='viridis')
ax2.clabel(contours, inline=True, fontsize=8)
ax2.set_title('Contour Map with Gradient Vector')
ax2.set_xlabel('x')
ax2.set_ylabel('y')
ax2.set_aspect('equal')

# Plot the gradient vector at our point
ax2.quiver(point[0], point[1], grad_at_point[0], grad_at_point[1], 
           color='red', scale=15, label='Gradient ∇f')
ax2.plot(point[0], point[1], 'ro') # Mark the point
ax2.legend()
plt.tight_layout()
plt.show()

print("Notice how the red gradient vector is perpendicular to the contour line at the point.")
