# Homework #6: Backtracking line search

Deadline: January 30, 2025

## Background

In this assignment, you will implement backtracking line search for the [Rosenbrock function](https://en.wikipedia.org/wiki/Rosenbrock_function). Backtracking line search is a heuristic for choosing the step size in gradient descent.

The assignment is organized into three steps to help guide your implementation:
- Step 1: Algorithm (fill in the TO DOs)
- Step 2: Visualize (run the cells)
- Step 3: Analyze (write a few sentences discussing what you found)

In [1]:
using Plots, LinearAlgebra, CSV, DataFrames, Random, NLsolve

## Step 1. Algorithm

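Backtracking line search requires two parameters: $\sigma$ and $\beta$. It starts with an initial step size ($\alpha = 1$ in the description below; the code takes it as the argument `s`) and then shrinks it by the factor $\beta$ until a sufficient-decrease condition holds. Let's break down the method below: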

1. Given a descent direction $d$ for $f$ at $x \in \operatorname{dom} f$, choose $\sigma \in (0, 0.5]$ and $\beta \in (0, 1)$.

2. Set $\alpha = 1$. 

3. While $f(x + \alpha d) > f(x) + \sigma \alpha \nabla f(x)^T d$, set $\alpha = \beta \alpha$.
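
As a small worked example of the three steps above (the function and numbers here are illustrative assumptions, not part of the assignment), take $f(x) = x^2$ at $x = 1$, so $\nabla f(x) = 2$ and $d = -2$, with $\sigma = 0.25$ and $\beta = 0.5$:

$$\alpha = 1: \quad f(1 - 2) = 1 > f(1) + 0.25 \cdot 1 \cdot 2 \cdot (-2) = 0 \quad \Rightarrow \quad \alpha = 0.5$$

$$\alpha = 0.5: \quad f(1 - 1) = 0 \le f(1) + 0.25 \cdot 0.5 \cdot 2 \cdot (-2) = 0.5 \quad \Rightarrow \quad \text{accept } \alpha = 0.5$$

The loop stops at the first $\alpha$ for which the condition in step 3 fails, so here a single reduction is enough.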

In gradient descent, the search direction $d$ is the negative of the gradient, so $d = - \nabla f(x)$. We can rewrite the condition in step 3 as:

$$f(x - \alpha \nabla f(x)) > f(x) - \sigma \alpha \| \nabla f(x) \|^2$$
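
The rewritten condition comes from substituting $d = -\nabla f(x)$ into the directional-derivative term of step 3:

$$\nabla f(x)^T d = \nabla f(x)^T \left( -\nabla f(x) \right) = -\| \nabla f(x) \|^2$$

Because this quantity is strictly negative whenever $\nabla f(x) \neq 0$, the right-hand side of the condition lies below $f(x)$, so any accepted step strictly decreases the cost.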


Complete the TO DOs below to implement gradient descent with backtracking line search:

In [2]:
function cost_function_rosenbrock(x)
    return 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
end

function gradient_rosenbrock(x)
    return [
        - 400 * x[1] * (x[2] - x[1]^2) - 2*(1 - x[1]),
          200 * (x[2] - x[1]^2),
    ]
end

gradient_rosenbrock (generic function with 1 method)
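
Before building on a hand-derived gradient, it can help to compare it against a numerical approximation. The sketch below is one possible way to do that with central differences; the helper `finite_difference_gradient`, its step `h`, and the quadratic test function are hypothetical names introduced for illustration and are not part of the assignment.

```julia
using LinearAlgebra

# Hypothetical helper (not part of the assignment): approximates the gradient
# of f at x with central differences, perturbing one coordinate at a time.
function finite_difference_gradient(f, x; h = 1e-6)
    g = zeros(length(x))
    for i in eachindex(x)
        e = zeros(length(x))
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2h)
    end
    return g
end

# Sanity check on a quadratic whose gradient is known in closed form.
quadratic(x) = x[1]^2 + 3 * x[2]^2
quadratic_gradient(x) = [2 * x[1], 6 * x[2]]

x_test = [0.25, 4.5]
fd_error = norm(finite_difference_gradient(quadratic, x_test) - quadratic_gradient(x_test))
println("finite-difference error: ", fd_error)
```

In this notebook, the same check can be applied by comparing `finite_difference_gradient(cost_function_rosenbrock, x)` with `gradient_rosenbrock(x)` at a few points; the two should agree up to a small numerical error.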

In [None]:
function gradient_descent_backtracking_linesearch(
    cost_function::Function,
    gradient::Function,
    initial_x::Vector,  # Initial point
    epsilon,            # Termination parameter
    s,                  # Initial alpha
    sigma,              # Backtracking parameter in (0, 0.5]
    beta,               # Rate of decrease in alpha in (0, 1)
)
    # Initialization
    x = initial_x
    k = 0
    x_history = zeros(Float64, (0, 2))
    cost_history = Float64[]
    gradient_norm_history = Float64[]
    alpha_history = Float64[]

    while true
        # Find descent direction d
        gradient_val = gradient(x)
        cost_function_val = cost_function(x)
        gradient_norm = norm(gradient_val)
        d = #TO DO

        # Compute step size alpha
        alpha = s
        while #TO DO
            alpha = #TO DO: update alpha
        end

        # Update history
        x_history = vcat(x_history, x')
        push!(cost_history, cost_function_val)
        push!(gradient_norm_history, gradient_norm)
        push!(alpha_history, alpha)

        if gradient_norm < epsilon
            break
        end

        # Update x
        x = # TO DO

        # Increment iteration count
        k += 1
    end
    return Dict(
        "x" => x_history,
        "cost" => cost_history,
        "gradient_norm" => gradient_norm_history,
        "alpha" => alpha_history,
    )
end

## Step 2: Visualize

Run the following code to visualize the results!

In [None]:
initial_x_rosenbrock = [0.25, 4.5]
epsilon = 1e-5;

results_rosenbrock_backtracking_linesearch = @time gradient_descent_backtracking_linesearch(
    cost_function_rosenbrock,
    gradient_rosenbrock,
    initial_x_rosenbrock,
    epsilon,
    2,      # s
    0.25,   # sigma
    0.5,    # beta
);

length(results_rosenbrock_backtracking_linesearch["cost"])

In [None]:
# Define the range of x and y values
x_range = 0:0.01:3
y_range = 0:0.01:5
# contour expects z[i, j] = f(x_range[j], y_range[i]), so y indexes the rows
z = [cost_function_rosenbrock([x, y]) for y in y_range, x in x_range];

In [None]:
contour_rosenbrock_backtracking_linesearch = contour(
    x_range, y_range, z, 
    levels = 1000, 
    c = :viridis, color = :auto, 
    legend = false,
)
Plots.plot!(
    results_rosenbrock_backtracking_linesearch["x"][:,1],
    results_rosenbrock_backtracking_linesearch["x"][:,2],
    linestyle = :dash,
    linewidth = 2,
    markershape = :circle, 
    color = :red,
    title = "Gradient descent with backtracking line search ($(length(results_rosenbrock_backtracking_linesearch["cost"])) iterations)"
)

## Step 3: Analyze

To complete the assignment, briefly answer the following questions:

- How does backtracking line search compare with constant step size gradient descent (see the lecture notebook)? In particular, compare the number of iterations for the Rosenbrock function and discuss convergence (at a high level).

ANSWER:

- How does backtracking line search compare with exact line search? You can answer this qualitatively by addressing the computational cost and number of iterations. 

ANSWER: 

**Submit a PDF version of this notebook to Canvas (with steps 1 and 3 completed) by January 30 at 11:59 p.m. Please reach out to Shriya (karam809@mit.edu) if you have any questions!**