# Project 02: Dual-Descent-Viz - Constrained Optimization

**Author:** Davi Bonetto  

## 1. Theoretical Foundations

### 1.1 The Primal Problem
Standard gradient descent seeks a local minimum of an unconstrained function $f(x)$. In practice (including in deep learning), however, the variables are often constrained.

We aim to minimize a function $f(x)$ subject to an inequality constraint $g(x) \leq 0$:

$$ \min_x f(x) \quad \text{subject to} \quad g(x) \leq 0 $$

### 1.2 The Lagrangian & Dual Ascent
We convert this constrained problem into an unconstrained "Min-Max" game using the **Lagrangian Function**:

$$ \mathcal{L}(x, \lambda) = f(x) + \lambda g(x) $$

Here, $\lambda$ (Lambda) is the **Lagrange Multiplier**. It acts as a "price" or "penalty" for violating the constraint.

**The Logic:**
1.  **Primal Descent (minimize over $x$):** The agent tries to minimize the total cost (objective + penalty). If $\lambda$ is high, the penalty term dominates and the agent is pushed to reduce $g(x)$.
    $$ x_{t+1} = x_t - \alpha \nabla_x \mathcal{L} $$

2.  **Dual Ascent (maximize over $\lambda$):** The system tries to maximize the penalty. If the agent violates the constraint ($g(x) > 0$), $\lambda$ grows at every step, making the violation increasingly expensive. If the constraint is strictly satisfied ($g(x) < 0$), $\lambda$ shrinks and is clipped at 0 by the ReLU.
    $$ \lambda_{t+1} = \text{ReLU}(\lambda_t + \beta \nabla_\lambda \mathcal{L}) = \text{ReLU}(\lambda_t + \beta g(x_t)) $$
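
To make the two updates concrete, here is a minimal sketch on a one-dimensional toy problem that is not part of this project: minimize $(x - 2)^2$ subject to $x - 1 \leq 0$. The helper `primal_dual_step`, the `toy_*` functions, and the step sizes are all illustrative assumptions. Working the KKT conditions by hand gives $x^* = 1$ and $\lambda^* = 2$, so the iterates should settle near those values.

```python
def primal_dual_step(x, lam, grad_f, g, grad_g, alpha, beta):
    """One simultaneous primal-descent / dual-ascent update (hypothetical helper)."""
    grad_L_x = grad_f(x) + lam * grad_g(x)   # gradient of the Lagrangian w.r.t. x
    x_new = x - alpha * grad_L_x             # primal descent
    lam_new = max(0.0, lam + beta * g(x))    # dual ascent followed by ReLU projection
    return x_new, lam_new

# Toy problem (an assumption for illustration): f(x) = (x - 2)^2, g(x) = x - 1
def toy_grad_f(x):
    return 2.0 * (x - 2.0)

def toy_g(x):
    return x - 1.0

def toy_grad_g(x):
    return 1.0

x_t, lam_t = 3.0, 0.0  # start infeasible, with a zero multiplier
for _ in range(2000):
    x_t, lam_t = primal_dual_step(x_t, lam_t, toy_grad_f, toy_g, toy_grad_g,
                                  alpha=0.05, beta=0.05)

print(f"x = {x_t:.4f}, lambda = {lam_t:.4f}")
```

The multiplier here ends up strictly positive because the constraint is active at the solution; for a problem whose unconstrained minimum is already feasible, the same loop would drive $\lambda$ to 0.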

### 1.3 KKT Conditions
At the optimal point $(x^*, \lambda^*)$, the Karush-Kuhn-Tucker (KKT) conditions hold:
- **Stationarity:** $\nabla f(x^*) + \lambda^* \nabla g(x^*) = 0$ (Forces balance).
- **Primal Feasibility:** $g(x^*) \leq 0$.
- **Dual Feasibility:** $\lambda^* \geq 0$.
- **Complementary Slackness:** $\lambda^* g(x^*) = 0$.

## 2. The Problem Setup

We optimize a 2D agent $(x, y)$.

**Objective ($f$):** A non-convex "saddle" function. Left unconstrained, the agent would slide down toward $-\infty$ along the $y$-axis, but the constraint below stops it.
$$ f(x, y) = x^2 - y^2 $$

**Constraint ($g$):** The agent must stay inside the Unit Circle.
$$ g(x, y) = x^2 + y^2 - 1 \leq 0 $$

Without the constraint, $f$ has no minimum: it decreases without bound as $|y| \to \infty$ along $x = 0$. Restricted to the unit disk, the minima are at $(0, 1)$ and $(0, -1)$, where $f = -1$.
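
As a sanity check, the KKT conditions from Section 1.3 can be evaluated at these candidate points. The sketch below is illustrative only: `kkt_report` is a hypothetical helper, the gradients are hard-coded for this specific $f$ and $g$, and the multipliers are obtained by hand from the stationarity equation ($\lambda = 1$ at $(0, \pm 1)$, $\lambda = -1$ at $(1, 0)$).

```python
import numpy as np

def kkt_report(point, lam, tol=1e-9):
    """Evaluate the four KKT conditions for f = x^2 - y^2, g = x^2 + y^2 - 1."""
    x, y = point
    grad_f = np.array([2 * x, -2 * y])
    grad_g = np.array([2 * x, 2 * y])
    g_val = x**2 + y**2 - 1
    return {
        "stationarity": bool(np.linalg.norm(grad_f + lam * grad_g) <= tol),
        "primal_feasible": bool(g_val <= tol),
        "dual_feasible": bool(lam >= -tol),
        "comp_slackness": bool(abs(lam * g_val) <= tol),
    }

# (point, multiplier) pairs; multipliers solved by hand from stationarity
candidates = [((0.0, 1.0), 1.0), ((0.0, -1.0), 1.0), ((1.0, 0.0), -1.0)]
for point, lam in candidates:
    print(point, lam, kkt_report(point, lam))
```

The points $(0, \pm 1)$ pass all four checks with $\lambda = 1$. The point $(1, 0)$ satisfies stationarity only with a negative multiplier, so it fails dual feasibility; it is in fact the constrained maximum of $f$ on the circle.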

In [None]:
import numpy as np
import plotly.graph_objects as go

# --- 2.1 Define Functions ---

def f(x, y):
    """Objective: Saddle Surface"""
    return x**2 - y**2

def grad_f(x, y):
    """Gradient of f w.r.t [x, y]"""
    df_dx = 2 * x
    df_dy = -2 * y
    return np.array([df_dx, df_dy])

def g(x, y):
    """Constraint: Inside Unit Circle (x^2 + y^2 - 1 <= 0)"""
    return x**2 + y**2 - 1

def grad_g(x, y):
    """Gradient of g w.r.t [x, y]"""
    dg_dx = 2 * x
    dg_dy = 2 * y
    return np.array([dg_dx, dg_dy])

## 3. The 'DualDescent' Engine

We calculate the gradients of the Lagrangian with respect to the primal variables and the multiplier:
$$ \nabla_x \mathcal{L} = \nabla f(x) + \lambda \nabla g(x), \qquad \nabla_\lambda \mathcal{L} = g(x) $$
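
Before relying on a hand-derived gradient inside an optimizer, it can help to compare it against a finite-difference estimate. The following sketch does this for the primal gradient of the Lagrangian on this notebook's saddle and circle; the function names, the test point, and the multiplier value `0.7` are arbitrary choices made for illustration.

```python
import numpy as np

def lagrangian(p, lam):
    """L(x, y, lambda) = f + lambda * g for the saddle objective and circle constraint."""
    x, y = p
    return (x**2 - y**2) + lam * (x**2 + y**2 - 1)

def grad_lagrangian(p, lam):
    """Analytic primal gradient: grad f + lambda * grad g."""
    x, y = p
    return np.array([2 * x, -2 * y]) + lam * np.array([2 * x, 2 * y])

def numerical_grad(p, lam, eps=1e-6):
    """Central finite-difference estimate of the primal gradient."""
    p = np.asarray(p, dtype=float)
    grad = np.zeros_like(p)
    for i in range(p.size):
        step = np.zeros_like(p)
        step[i] = eps
        grad[i] = (lagrangian(p + step, lam) - lagrangian(p - step, lam)) / (2 * eps)
    return grad

p_test, lam_test = np.array([1.5, 0.1]), 0.7  # arbitrary test point and multiplier
print("analytic: ", grad_lagrangian(p_test, lam_test))
print("numerical:", numerical_grad(p_test, lam_test))
```

The two vectors should agree to within floating-point error, since a central difference is exact for quadratics apart from rounding.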

In [None]:
class DualDescentOptimizer:
    def __init__(self, learning_rate_primal=0.05, learning_rate_dual=0.1):
        self.lr_x = learning_rate_primal
        self.lr_lambda = learning_rate_dual
        self.history = []
        
    def optimize(self, start_x, start_y, iterations=200):
        # Initialize variables
        x, y = start_x, start_y
        lam = 0.0 # Initial Lagrange Multiplier (usually 0)
        
        self.history = []
        
        for i in range(iterations):
            # 1. Compute Gradients
            gf = grad_f(x, y)
            gg = grad_g(x, y)
            constraint_val = g(x, y)
            
            # 2. Lagrangian Gradient w.r.t x (Primal step)
            # L = f + lambda * g
            # dL/dx = df/dx + lambda * dg/dx
            grad_L_x = gf + lam * gg
            
            # 3. Update primal variables (Gradient Descent)
            # x_new = x - lr * grad_L_x
            x_new = x - self.lr_x * grad_L_x[0]
            y_new = y - self.lr_x * grad_L_x[1]
            
            # 4. Update dual variable (Gradient Ascent)
            # lambda needs to increase if constraint is violated (g > 0)
            # Then we project onto lambda >= 0 (dual feasibility, one of the KKT conditions)
            lam_new = lam + self.lr_lambda * constraint_val
            lam_new = max(0.0, lam_new)
            
            # Store history
            z_val = f(x, y)
            self.history.append({
                'step': i,
                'x': x, 'y': y, 'z': z_val,
                'lambda': lam,
                'constraint': constraint_val
            })
            
            # Update state
            x, y, lam = x_new, y_new, lam_new
            
        return self.history

# --- Run Simulation ---
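# Start OUTSIDE the circle at (1.5, 0.1), where the objective x^2 - y^2 is 2.24.
# The small nonzero y matters: at y = 0 the y-gradient vanishes and the agent would stay on the x-axis.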
optimizer = DualDescentOptimizer(learning_rate_primal=0.02, learning_rate_dual=0.1)
history = optimizer.optimize(start_x=1.5, start_y=0.1, iterations=150)

print(f"Final Position: ({history[-1]['x']:.3f}, {history[-1]['y']:.3f})")
print(f"Final Constraint Value: {history[-1]['constraint']:.3f} (Should be <= 0)")
print(f"Final Lambda: {history[-1]['lambda']:.3f}")

## 4. Interactive Visualization

We visualize the agent "surfing" the saddle surface while running into the invisible cylindrical wall of the constraint.

In [None]:
# Prepare Data for Plotting
traj_x = [h['x'] for h in history]
traj_y = [h['y'] for h in history]
traj_z = [h['z'] for h in history]

# 1. Surface Data (The Saddle)
x_range = np.linspace(-2, 2, 50)
y_range = np.linspace(-2, 2, 50)
X, Y = np.meshgrid(x_range, y_range)
Z = f(X, Y)

# 2. Constraint Data (The Cylinder Wall x^2 + y^2 = 1)
theta = np.linspace(0, 2*np.pi, 50)
z_cyl = np.linspace(-4, 4, 10)
THETA, Z_CYL = np.meshgrid(theta, z_cyl)
X_CYL = np.cos(THETA)
Y_CYL = np.sin(THETA)

# --- Plotly Figure ---
fig = go.Figure()

# A. The Saddle Surface
fig.add_trace(go.Surface(z=Z, x=X, y=Y, colorscale='Viridis', opacity=0.7, name='Objective'))

# B. The Constraint Wall
# This could also be built with Mesh3d or plain lines; a Surface is the simplest option here.
fig.add_trace(go.Surface(z=Z_CYL, x=X_CYL, y=Y_CYL, showscale=False, opacity=0.3, colorscale='Greys', name='Constraint'))

# C. The Trajectory
fig.add_trace(go.Scatter3d(
    x=traj_x, y=traj_y, z=traj_z,
    mode='lines+markers',
    marker=dict(size=4, color=list(range(len(traj_x))), colorscale='Bluered', showscale=False),
    line=dict(color='black', width=3),
    name='Agent Path'
))

# D. Start and End points
fig.add_trace(go.Scatter3d(
    x=[traj_x[0]], y=[traj_y[0]], z=[traj_z[0]],
    mode='markers', marker=dict(size=8, color='green'), name='Start'
))
fig.add_trace(go.Scatter3d(
    x=[traj_x[-1]], y=[traj_y[-1]], z=[traj_z[-1]],
    mode='markers', marker=dict(size=8, color='red'), name='End'
))

fig.update_layout(
    title="Dual Descent: Saddle Point Optimization with Constraints",
    scene=dict(
        xaxis_title='X',
        yaxis_title='Y',
        zaxis_title='f(x,y)',
        aspectmode='cube'
    ),
    width=900,
    height=700
)

fig.show()