# M2 Coding Assignment: Challenging Optimization Scenarios

In this notebook, you will apply what you've learned by implementing gradient descent to find the minima of two challenging test functions. You will experiment with different learning rates and use gradient clipping to keep update steps bounded when gradients become very large. This exercise demonstrates the nuances and challenges of optimization in more realistic scenarios.

## Functions under consideration

### Rosenbrock Function

**Formula:**

$ f_1(x, y) = (a - x)^2 + b(y - x^2)^2 $

where $ a = 1 $ and $ b = 100 $. With these values, the global minimum is at $ (x, y) = (1, 1) $, where $ f_1 = 0 $.

**Characteristics:**

* The Rosenbrock function is known for its narrow, curved valley, which makes it challenging for gradient descent to converge quickly to the global minimum.
* It is often used to test optimization algorithms due to its sensitivity to parameter tuning, such as learning rate and gradient clipping.
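
One way to gain confidence in a hand-derived gradient before using it in gradient descent is to compare it against a finite-difference estimate. The sketch below does this for the Rosenbrock function with $ a = 1 $ and $ b = 100 $. The helper `numerical_grad`, the step size `h`, and the test point $ (-1.2, 1) $ are illustrative assumptions, not part of the assignment.

```python
import numpy as np

def rosenbrock(x, y, a=1, b=100):
    return (a - x)**2 + b * (y - x**2)**2

def grad_rosenbrock(x, y, a=1, b=100):
    # Partial derivatives of f_1 with respect to x and y
    df_dx = -2 * (a - x) - 4 * b * x * (y - x**2)
    df_dy = 2 * b * (y - x**2)
    return np.array([df_dx, df_dy])

def numerical_grad(f, x, y, h=1e-6):
    # Central differences; h is an illustrative step size (assumption)
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return np.array([df_dx, df_dy])

point = (-1.2, 1.0)  # arbitrary test point away from the minimum
analytic = grad_rosenbrock(*point)
numeric = numerical_grad(rosenbrock, *point)
at_minimum = grad_rosenbrock(1.0, 1.0)

print("analytic:", analytic)
print("numeric: ", numeric)
print("gradient at (1, 1):", at_minimum)
```

The two estimates should agree to several digits, and the gradient at $ (1, 1) $ should vanish, which is consistent with a minimum there. The large magnitude of the gradient at the test point also hints at why step sizes need care on this function.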

### Himmelblau's Function

**Formula:**

$ f_2(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2 $

**Characteristics:**

* Himmelblau's function has four local minima, all with the same value $ f_2 = 0 $, as well as saddle points and a local maximum, making it a good example for studying how gradient descent behaves when the optimization landscape is more complex.
* This function helps demonstrate how the initial starting point can drastically influence the outcome of gradient descent.
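
To see the dependence on the starting point without the widgets, the following minimal sketch runs plain gradient descent (no clipping) on Himmelblau's function from four starting points. The helper `descend`, the learning rate of 0.01, the 1000 iterations, and the starting points themselves are assumptions chosen for illustration.

```python
import numpy as np

def himmelblau(x, y):
    return (x**2 + y - 11)**2 + (x + y**2 - 7)**2

def grad_himmelblau(x, y):
    df_dx = 4 * x * (x**2 + y - 11) + 2 * (x + y**2 - 7)
    df_dy = 2 * (x**2 + y - 11) + 4 * y * (x + y**2 - 7)
    return np.array([df_dx, df_dy])

def descend(start, learning_rate=0.01, num_iterations=1000):
    # Plain gradient descent; returns only the final point
    point = np.array(start, dtype=float)
    for _ in range(num_iterations):
        point = point - learning_rate * grad_himmelblau(*point)
    return point

# One starting point per quadrant (illustrative choice)
starts = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
ends = [descend(s) for s in starts]

for s, e in zip(starts, ends):
    print(s, "->", np.round(e, 3), f"f = {himmelblau(*e):.2e}")
```

Comparing the printed end points shows which minimum each run settles in. Starting points that are close to one another can still end at different minima.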

## How to Use This Notebook
1. Select a Function: Use the dropdown menu labeled "Function" to choose between the Rosenbrock and Himmelblau functions. This will change the landscape on which gradient descent is performed.

2. Adjust the Learning Rate: The learning rate controls the size of the steps taken during each iteration of gradient descent. A value too high might cause the algorithm to overshoot the minimum, while a value too low may result in slow convergence. Use the slider to set a learning rate between 0.0001 and 0.1.

3. Set the Initial Point: Use the sliders labeled "Initial X" and "Initial Y" to set the starting coordinates for gradient descent. For both the Rosenbrock and Himmelblau functions, the initial point can significantly impact the path taken during optimization.

4. Specify the Number of Iterations: Use the "Iterations" slider to determine how many steps of gradient descent will be performed. More iterations allow the algorithm to refine its approach but can increase computation time.

5. Enable Gradient Clipping: Gradient clipping prevents excessively large gradient updates, which can stabilize the optimization process. Set the "Clip Threshold" slider to a value between 0 (no clipping) and 20. Each component of the gradient is then limited to the range from -threshold to +threshold before the update step.

6. Visualize the Path: The notebook displays a 3D surface plot and a contour plot of the selected function. The path taken by gradient descent is drawn on both as a blue line with markers, and the known minima are marked in red on the contour plot. This visualization helps you understand how the algorithm converges to a minimum, how it navigates through valleys or multiple minima, and how parameter choices affect the process.
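
The clipping in this notebook clamps each gradient component separately, which can change the direction of the update. A common alternative, not used in the notebook, rescales the whole gradient vector when its norm exceeds the threshold. The sketch below contrasts the two. The function names `clip_by_value` and `clip_by_norm` are hypothetical helpers introduced here for illustration.

```python
import numpy as np

def clip_by_value(grad, threshold):
    # Clamp each component independently to [-threshold, threshold]
    return np.clip(grad, -threshold, threshold)

def clip_by_norm(grad, threshold):
    # Rescale the vector so its Euclidean norm is at most threshold
    norm = np.linalg.norm(grad)
    if norm > threshold:
        return grad * (threshold / norm)
    return grad

# Rosenbrock gradient at (-1.2, 1.0) with a = 1, b = 100
grad = np.array([-215.6, -88.0])

by_value = clip_by_value(grad, 10.0)  # both components saturate at -10
by_norm = clip_by_norm(grad, 10.0)    # same direction as grad, length 10

print("by value:", by_value)
print("by norm: ", by_norm)
```

With value clipping, both components end up equal, so the step points along the diagonal rather than along the original gradient. Norm clipping keeps the direction and only shortens the step.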

## Exploration Goals
* Experiment with Different Learning Rates: Observe how increasing or decreasing the learning rate impacts the convergence speed and accuracy.
* Test Different Initial Points: Understand how the starting location influences the path and final solution, especially for the Himmelblau function with its multiple minima.
* Use Gradient Clipping: See how clipping the gradient can prevent overshooting and make the optimization process more stable.
* Analyze Iteration Counts: Find the balance between a sufficient number of iterations for convergence and efficiency.

By adjusting these parameters and observing the results, you will gain a deeper understanding of how gradient descent works in practice and how to tune it for different optimization scenarios. Happy experimenting!
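
The effect of the learning rate can also be checked numerically. The sketch below is a simplified, non-interactive stand-in for the widget workflow: it runs 200 unclipped steps on Himmelblau's function from $ (1, 1) $ for three learning rates and prints the final function value. The specific learning rates, the iteration count, and the helper name `run` are assumptions for illustration.

```python
import numpy as np

def himmelblau(x, y):
    return (x**2 + y - 11)**2 + (x + y**2 - 7)**2

def grad_himmelblau(x, y):
    df_dx = 4 * x * (x**2 + y - 11) + 2 * (x + y**2 - 7)
    df_dy = 2 * (x**2 + y - 11) + 4 * y * (x + y**2 - 7)
    return np.array([df_dx, df_dy])

def run(learning_rate, start=(1.0, 1.0), num_iterations=200):
    # Plain gradient descent without clipping; returns the final point
    point = np.array(start, dtype=float)
    for _ in range(num_iterations):
        point = point - learning_rate * grad_himmelblau(*point)
    return point

final_values = {}
for lr in [0.0001, 0.001, 0.01]:
    end = run(lr)
    final_values[lr] = himmelblau(*end)
    print(f"lr={lr:<7} end={np.round(end, 3)} f={final_values[lr]:.3e}")
```

With the smallest rate, the iterate is still far from any minimum after 200 steps, while a rate of 0.01 is expected to converge closely from this starting point. Repeating the sweep with a different function or starting point may change which rates work well.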



In [None]:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older Matplotlib versions
import ipywidgets as widgets
from IPython.display import display

# Define the Rosenbrock function and its gradient
def rosenbrock(x, y, a=1, b=100):
    return (a - x)**2 + b * (y - x**2)**2

def grad_rosenbrock(x, y, a=1, b=100):
    df_dx = -2 * (a - x) - 4 * b * x * (y - x**2)
    df_dy = 2 * b * (y - x**2)
    return np.array([df_dx, df_dy])

# Define the Himmelblau function and its gradient
def himmelblau(x, y):
    return (x**2 + y - 11)**2 + (x + y**2 - 7)**2

def grad_himmelblau(x, y):
    df_dx = 4 * x * (x**2 + y - 11) + 2 * (x + y**2 - 7)
    df_dy = 2 * (x**2 + y - 11) + 4 * y * (x + y**2 - 7)
    return np.array([df_dx, df_dy])

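# Gradient descent function with optional gradient clipping
def gradient_descent(learning_rate, initial_point, num_iterations, clip_threshold, gradient):
    """Run gradient descent and return the visited points as an array of shape (num_iterations + 1, 2).

    A clip_threshold of 0 disables clipping; otherwise each gradient
    component is clamped to [-clip_threshold, clip_threshold].
    """
    x, y = initial_point
    history = [(x, y)]

    for _ in range(num_iterations):
        grad = gradient(x, y)

        # Apply component-wise gradient clipping if a threshold is specified
        if clip_threshold:
            grad = np.clip(grad, -clip_threshold, clip_threshold)

        # Step against the (possibly clipped) gradient
        x -= learning_rate * grad[0]
        y -= learning_rate * grad[1]

        history.append((x, y))

    return np.array(history)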

# Function to update the plot based on widget values
def update_plot(function_choice, learning_rate, initial_x, initial_y, num_iterations, clip_threshold):
    initial_point = (initial_x, initial_y)

    # Select function and gradient based on user input
    if function_choice == 'Rosenbrock':
        function = rosenbrock
        gradient = grad_rosenbrock
        func_name = 'Rosenbrock Function'
        global_minima = [(1, 1)]
    else:
        function = himmelblau
        gradient = grad_himmelblau
        func_name = "Himmelblau's Function"
        global_minima = [(3, 2), (-2.805, 3.131), (-3.779, -3.283), (3.584, -1.848)]

    # Perform gradient descent and collect points
    points = gradient_descent(learning_rate, initial_point, num_iterations, clip_threshold, gradient)
    xs = points[:, 0]
    ys = points[:, 1]
    zs = [function(x, y) for x, y in points]

    # Generate mesh grid for plotting the function surface and contours
    X, Y = np.linspace(-5, 5, 400), np.linspace(-5, 5, 400)
    X, Y = np.meshgrid(X, Y)
    Z = function(X, Y)

    # Create the figure for 3D surface plot and contour plot
    fig = plt.figure(figsize=(12, 6))

    # 3D surface plot
    ax1 = fig.add_subplot(121, projection='3d')
    ax1.plot_surface(X, Y, Z, cmap='coolwarm', alpha=0.7)
    ax1.plot(xs, ys, zs, color='b', marker='o', markersize=5)
    ax1.set_xlabel('X')
    ax1.set_ylabel('Y')
    ax1.set_zlabel('Z')
    ax1.set_title(f'Gradient Descent Path on {func_name}')

    # Contour plot
    ax2 = fig.add_subplot(122)
    ax2.contour(X, Y, Z, levels=50)
    ax2.plot(xs, ys, color='b', marker='o', markersize=5)
    # Mark all known minima with a single scatter call so the legend has one entry
    min_xs, min_ys = zip(*global_minima)
    ax2.scatter(min_xs, min_ys, color='r', label='Global Minima')

    ax2.set_xlabel('X')
    ax2.set_ylabel('Y')
    ax2.set_title(f'Contour Plot with Optimization Path on {func_name}')
    ax2.legend(loc='upper right')

    plt.tight_layout()
    plt.show()

# Interactive widgets for parameter adjustment
function_choice_dropdown = widgets.Dropdown(
    options=['Rosenbrock', 'Himmelblau'],
    value='Himmelblau',
    description='Function:'
)
learning_rate_slider = widgets.FloatSlider(value=0.01, min=0.0001, max=0.1, step=0.0001, description='Learning Rate:')
initial_x_slider = widgets.FloatSlider(value=1, min=-3, max=3, step=0.1, description='Initial X:')
initial_y_slider = widgets.FloatSlider(value=1, min=-3, max=3, step=0.1, description='Initial Y:')
num_iterations_slider = widgets.IntSlider(value=100, min=100, max=5000, step=100, description='Iterations:')
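# The long description needs extra width, otherwise ipywidgets truncates the label
clip_threshold_slider = widgets.FloatSlider(value=10, min=0, max=20, step=0.1, description='Clip Threshold (0 = No Clipping):', style={'description_width': 'initial'})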

# Combine all widgets into a UI container
ui = widgets.VBox([
    function_choice_dropdown,
    learning_rate_slider,
    initial_x_slider,
    initial_y_slider,
    num_iterations_slider,
    clip_threshold_slider
])

# Link the widget values to the update_plot function to update the visualization dynamically
out = widgets.interactive_output(update_plot, {
    'function_choice': function_choice_dropdown,
    'learning_rate': learning_rate_slider,
    'initial_x': initial_x_slider,
    'initial_y': initial_y_slider,
    'num_iterations': num_iterations_slider,
    'clip_threshold': clip_threshold_slider
})

# Display the interactive widgets and output area
display(ui, out)

