# Optimization and Geometric Analysis of Functions in Two Variables

In this activity, we will work with multivariable differential calculus using `sympy` and `matplotlib` to:
- Calculate partial derivatives, the gradient, and the Hessian matrix of a two-variable function.
- Identify and classify critical points by finding where the partial derivatives are zero.
- Visualize the results with a 3D surface plot and a contour plot.

In [None]:
import sympy as sp
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

## 1. Function Definition

We define a symbolic function of two variables, $g(x, y)$, that has a clearly identifiable critical point. In this case:

$$
g(x, y) = x^2 + 3y^2 - 4x + 2y + 1
$$

In [None]:
# Add the src directory to the Python path
import sys
sys.path.append('../src')

from plotting_utils import plot_3d_surface_and_contour

In [None]:
# Define the symbolic variables
x, y = sp.symbols('x y')

# Define the multivariable function
g = x**2 + 3*y**2 - 4*x + 2*y + 1

g

## 2. Partial Derivatives, Gradient, and Hessian Matrix

### 1. Partial Derivatives

The **partial derivatives** of a two-variable function $ f(x, y) $ are defined as the rate of change of the function with respect to one of its variables, while holding the other constant:

- Partial derivative with respect to $ x $:
  $$
  \frac{\partial f}{\partial x}(x, y) = \lim_{h \to 0} \frac{f(x+h, y) - f(x, y)}{h}
  $$

- Partial derivative with respect to $ y $:
  $$
  \frac{\partial f}{\partial y}(x, y) = \lim_{h \to 0} \frac{f(x, y+h) - f(x, y)}{h}
  $$
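
As a quick check of these limit definitions, the sketch below builds the two difference quotients symbolically for the function $g(x, y)$ used in this activity and takes $h \to 0$. The names `quotient_x`, `quotient_y`, `limit_x`, and `limit_y` are illustrative and are not used elsewhere in the notebook.

```python
import sympy as sp

x, y, h = sp.symbols('x y h')
g = x**2 + 3*y**2 - 4*x + 2*y + 1

# Difference quotients taken directly from the limit definitions
quotient_x = (g.subs(x, x + h) - g) / h
quotient_y = (g.subs(y, y + h) - g) / h

# Letting h -> 0 should recover the partial derivatives
limit_x = sp.limit(sp.expand(quotient_x), h, 0)
limit_y = sp.limit(sp.expand(quotient_y), h, 0)

print("Limit of the x-quotient:", limit_x)
print("Limit of the y-quotient:", limit_y)

# Compare with sympy's built-in differentiation
print("Matches sp.diff in x:", sp.simplify(limit_x - sp.diff(g, x)) == 0)
print("Matches sp.diff in y:", sp.simplify(limit_y - sp.diff(g, y)) == 0)
```

Both limits should agree with what `sp.diff` returns, which is why the rest of the activity can rely on `sp.diff` instead of evaluating limits by hand.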

---

### 2. Gradient $ \nabla f $

The **gradient** of a scalar function $ f(x, y) $ is a **vector** that contains all of its partial derivatives. It points in the direction of the **greatest rate of increase** of the function:

$$
\nabla f(x, y) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right)
$$

- The gradient is **perpendicular** to the level curves of $ f $.
- In optimization, $ \nabla f = 0 $ indicates the **critical points**.
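
The two geometric properties above can be checked numerically. The following is a minimal sketch for $g(x, y)$ at an arbitrarily chosen sample point $(1, 1)$: it scans unit directions to find the one with the largest directional derivative, and it builds the tangent to the level curve from implicit differentiation ($dy/dx = -g_x / g_y$). The sample point and all variable names are assumptions made for illustration.

```python
import numpy as np
import sympy as sp

x, y = sp.symbols('x y')
g = x**2 + 3*y**2 - 4*x + 2*y + 1
grad_g = sp.Matrix([sp.diff(g, x), sp.diff(g, y)])

# Evaluate the gradient at a sample point (chosen only for illustration)
sample_point = {x: 1, y: 1}
grad_at_point = np.array([float(c) for c in grad_g.subs(sample_point)])

# Directional derivative along unit vectors u(theta) = (cos(theta), sin(theta))
thetas = np.linspace(0, 2 * np.pi, 3601)
directions = np.column_stack([np.cos(thetas), np.sin(thetas)])
directional_derivs = directions @ grad_at_point

# The steepest direction found by the scan vs. the normalized gradient
best_direction = directions[np.argmax(directional_derivs)]
unit_gradient = grad_at_point / np.linalg.norm(grad_at_point)

# Tangent to the level curve through the sample point, from dy/dx = -g_x / g_y
slope = float((-sp.diff(g, x) / sp.diff(g, y)).subs(sample_point))
tangent = np.array([1.0, slope])

print("Steepest direction (scan):", best_direction)
print("Normalized gradient:      ", unit_gradient)
print("Gradient . tangent:       ", grad_at_point @ tangent)
```

The scanned direction should agree with the normalized gradient up to the angular resolution of the scan, and the dot product with the tangent should be zero up to rounding.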

---

### 3. Hessian Matrix $ H(f) $

The **Hessian matrix** is a square matrix containing all the **second-order partial derivatives** of a function. It describes the **curvature** of $ f(x, y) $:

$$
H(f(x, y)) =
\begin{bmatrix}
\frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\
\frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2}
\end{bmatrix}
$$

First, we will compute each of these quantities for the function $g(x,y)$.

In [None]:
# Partial derivatives
dg_dx = sp.diff(g, x)
dg_dy = sp.diff(g, y)

# Gradient
grad_g = sp.Matrix([dg_dx, dg_dy])

# Hessian Matrix
hess_g = sp.hessian(g, (x, y))

print("Partial derivative with respect to x:\n∂g/∂x:", dg_dx)
print("\nPartial derivative with respect to y:\n∂g/∂y:", dg_dy)
print("\nGradient of g(x,y):")
display(grad_g)
print("\nHessian Matrix of g(x,y):")
display(hess_g)

Using the `solve()` function from `sympy`, we can find the critical points of $g(x,y)$ by solving the system $\nabla g = 0$.

In [None]:
critical_points = sp.solve([dg_dx, dg_dy], (x, y))
critical_points  # For this linear system, solve() returns a dict {variable: value}

## 3. Critical Point Classification

We evaluate the Hessian matrix at the critical point and analyze the signs of its eigenvalues to classify the point:
- **All eigenvalues are positive**: local minimum
- **All eigenvalues are negative**: local maximum
- **Mixed (positive and negative) eigenvalues**: saddle point
- **At least one zero eigenvalue and no sign change among the others**: the test is inconclusive
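
The eigenvalue rules can be wrapped in a small helper. The sketch below is one possible way to do it, not part of the notebook's `src` code: `classify_critical_points` and `f_example` are hypothetical names, and the helper assumes the function has isolated, real critical points. When a zero eigenvalue prevents a decision, it reports the point as inconclusive. It uses `sp.solve(..., dict=True)` so that every solution comes back as a dict, even when there are several.

```python
import sympy as sp

def classify_critical_points(f, variables):
    """Classify each critical point of f by the signs of the Hessian eigenvalues."""
    gradient = [sp.diff(f, v) for v in variables]
    hessian = sp.hessian(f, variables)
    results = []
    # dict=True makes solve() return a list of dicts, one per solution
    for point in sp.solve(gradient, variables, dict=True):
        eigenvalues = list(hessian.subs(point).eigenvals().keys())
        if all(ev > 0 for ev in eigenvalues):
            label = "local minimum"
        elif all(ev < 0 for ev in eigenvalues):
            label = "local maximum"
        elif any(ev > 0 for ev in eigenvalues) and any(ev < 0 for ev in eigenvalues):
            label = "saddle point"
        else:
            # A zero eigenvalue without a sign change: the test cannot decide
            label = "inconclusive"
        results.append((point, label))
    return results

x, y = sp.symbols('x y', real=True)

# Example function with two critical points (chosen for illustration)
f_example = x**3 - 3*x + y**2
results = classify_critical_points(f_example, (x, y))

for point, label in results:
    print(point, "->", label)
```

For this example function the Hessian is diagonal with entries $6x$ and $2$, so the point with $x = 1$ has two positive eigenvalues and the point with $x = -1$ has one of each sign.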

In [None]:
# Substitute the critical points into the Hessian matrix
H_at_crit = hess_g.subs(critical_points)

# Returns a dict {keys: eigenvalues, vals: algebraic multiplicity}
eigenvals = H_at_crit.eigenvals()

print("Hessian Matrix at the critical point:")
display(H_at_crit)
print("\nEigenvalues of the Hessian Matrix:", eigenvals)

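In this case, both eigenvalues (2 and 6) are positive, so the critical point $(2, -\tfrac{1}{3})$ is a **local minimum**. Because the Hessian of $g$ is constant and positive definite, $g$ is convex and this point is also its global minimum.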

## 4. Visualization

To visualize the function, we will plot the surface of $g(x, y)$ and a contour map, marking the critical point on both graphs.

In [None]:
# Plot the results
plot_3d_surface_and_contour(g, critical_points)

## Why is multivariable calculus important in Machine Learning?

- It underlies training algorithms such as gradient descent: the gradient points in the direction of steepest increase, so the model's parameters are adjusted in the opposite direction to reduce the loss function.
- The Hessian matrix is fundamental in more advanced optimization methods (like Newton-Raphson), as it provides information about the curvature of the function.
- Classifying a function's critical points helps distinguish minima and maxima from saddle points, which matters because an optimizer that stalls at a saddle point has not converged to a useful solution.
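
To connect these points back to the function studied in this activity, the sketch below treats $g(x, y)$ as a toy loss and minimizes it in two ways: plain gradient descent and a single Newton step. The starting point, learning rate, step count, and the function names `gradient_descent` and `newton_step` are all assumptions chosen for illustration, not a recommended training setup.

```python
import numpy as np
import sympy as sp

x, y = sp.symbols('x y')
g = x**2 + 3*y**2 - 4*x + 2*y + 1

# Turn the symbolic gradient and Hessian into numeric functions
grad_fn = sp.lambdify((x, y), [sp.diff(g, x), sp.diff(g, y)], 'numpy')
hess_fn = sp.lambdify((x, y), sp.hessian(g, (x, y)), 'numpy')

def gradient_descent(start, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient."""
    point = np.array(start, dtype=float)
    for _ in range(steps):
        point = point - learning_rate * np.array(grad_fn(*point), dtype=float)
    return point

def newton_step(start):
    """One Newton update: solve H * delta = gradient, then subtract delta."""
    point = np.array(start, dtype=float)
    gradient = np.array(grad_fn(*point), dtype=float)
    hessian = np.array(hess_fn(*point), dtype=float)
    return point - np.linalg.solve(hessian, gradient)

start = (5.0, 5.0)  # arbitrary starting point
gd_result = gradient_descent(start)
newton_result = newton_step(start)

print("Gradient descent:", gd_result)
print("One Newton step: ", newton_result)
```

Both results should land near the critical point $(2, -\tfrac{1}{3})$ found symbolically. Because $g$ is quadratic, its Hessian is constant and a single Newton step reaches the minimum, while gradient descent needs many small steps. For general loss functions neither behavior is guaranteed.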