# Gradient Descent Optimization with `pyxu-gradient-descent`

This guide shows how to use the `pyxu-gradient-descent` package for gradient descent optimization in computational imaging. It walks through a simple example that uses the `GradientDescent` class to minimize a function with the gradient descent algorithm.

## Prerequisites

Ensure that you have the `pyxu-gradient-descent` package installed in your Python environment. You can install it using the following command:

In [1]:
!pip install pyxu-gradient-descent


## Step 1: Import Necessary Modules
Before we begin, we import `GradientDescent`, the class that the plugin adds to the Pyxu package. We also import NumPy and Pyxu's `SquaredL2Norm` functional to build the example.

In [2]:
import numpy as np
from pyxu.operator import SquaredL2Norm
from pyxu.opt.solver import GradientDescent



## Step 2: Define the Function to be Minimized

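In this example, we minimize the least-squares (LS) distance to a given vector `y`, that is, the squared Euclidean distance between the variable `x` and `y`. The minimizer is therefore `y` itself, which makes the result easy to check. We define this function using the `SquaredL2Norm` class from Pyxu.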

In [3]:
# Define the target vector
y = np.array([0, 1, 0, 0, 2, 3, 0, 1, 4, 1, 0])
# Dimension of the problem (length of y)
N = len(y)
# Build the squared L2 norm and turn it into a loss centered on y
sl2 = SquaredL2Norm(dim=N).asloss(y)
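
To make the objective concrete, here is a minimal NumPy-only sketch that does not use Pyxu. It assumes the loss built above is f(x) = ‖x − y‖², with gradient 2(x − y); the helper names `objective` and `gradient` are ours, not part of any package. The sketch checks the analytic gradient against central finite differences.

```python
import numpy as np

# Target vector from the tutorial, stored as floats
y = np.array([0, 1, 0, 0, 2, 3, 0, 1, 4, 1, 0], dtype=float)

def objective(x):
    """Assumed form of the loss: squared Euclidean distance ||x - y||^2."""
    return float(np.sum((x - y) ** 2))

def gradient(x):
    """Analytic gradient of the assumed loss: 2 (x - y)."""
    return 2.0 * (x - y)

# Compare the analytic gradient with central finite differences at a random point
rng = np.random.default_rng(0)
x = rng.standard_normal(y.size)
eps = 1e-6
numeric = np.array([
    (objective(x + eps * e) - objective(x - eps * e)) / (2 * eps)
    for e in np.eye(y.size)
])
gradient_matches = bool(np.allclose(numeric, gradient(x), atol=1e-5))

print("objective at y:", objective(y))
print("gradient matches finite differences:", gradient_matches)
```

The objective vanishes at `x = y`, so `y` is the point the solver should recover.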

## Step 3: Initialize the Gradient Descent Solver

Next, we initialize the gradient descent solver with the function to be minimized.

In [4]:
# Create an instance of the GradientDescent class with the function to be minimized
gd = GradientDescent(f=sl2, show_progress=False)

## Step 4: Set Initial Point and Step Size

We need to set an initial point for the gradient descent algorithm and optionally specify the step size (τ). If the step size is not specified, it will be automatically determined.

In [5]:
# Define a random initial point (unseeded, so printed values differ between runs)
x0 = np.random.randn(N)

# Define the step size (optional)
tau = 0.1
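
To illustrate what the step size does, the following NumPy-only sketch runs plain (non-accelerated) gradient descent on the assumed objective ‖x − y‖². This is an illustration of the textbook update x ← x − τ∇f(x), not the package's implementation. For this objective each step shrinks the error x − y by the factor 1 − 2τ, which is 0.8 for τ = 0.1.

```python
import numpy as np

y = np.array([0, 1, 0, 0, 2, 3, 0, 1, 4, 1, 0], dtype=float)
tau = 0.1

rng = np.random.default_rng(0)
x = rng.standard_normal(y.size)  # random starting point

# Track the distance to the minimizer y at every iteration
errors = [np.linalg.norm(x - y)]
for _ in range(50):
    x = x - tau * 2.0 * (x - y)  # gradient step: grad f(x) = 2 (x - y)
    errors.append(np.linalg.norm(x - y))

errors = np.array(errors)
ratios = errors[1:] / errors[:-1]  # per-iteration contraction factor

print("first contraction factors:", ratios[:3])
print("final distance to y:", errors[-1])
```

A larger step contracts faster, as long as the factor |1 − 2τ| stays below 1.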

## Step 5: Run the Gradient Descent Algorithm

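Now we are ready to run the gradient descent algorithm using the `fit` method of the `GradientDescent` class. In the call below, `acceleration=True` requests the accelerated variant of the algorithm, and `track_objective=True` records the objective value at each iteration so that it can be inspected afterwards.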

In [6]:
# Run the gradient descent algorithm
gd.fit(x0=x0, tau=tau, acceleration=True, track_objective=True)
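
For intuition about acceleration, here is a NumPy-only sketch of a Nesterov-style momentum scheme on the assumed objective ‖x − y‖². The momentum coefficient k/(k+3) is our assumption: the package's exact rule is not stated in this guide. With this coefficient the sketch gives objective ratios f(x_k)/f(x_0) of 0.64, 0.36, ... that agree with the first entries of the history printed later in this guide, which suggests, but does not confirm, a similar rule.

```python
import numpy as np

y = np.array([0, 1, 0, 0, 2, 3, 0, 1, 4, 1, 0], dtype=float)
tau = 0.1

def objective(x):
    """Assumed loss: ||x - y||^2."""
    return float(np.sum((x - y) ** 2))

def gradient(x):
    """Gradient of the assumed loss."""
    return 2.0 * (x - y)

rng = np.random.default_rng(0)
x = rng.standard_normal(y.size)  # current iterate
z = x.copy()                     # extrapolated (momentum) point
values = [objective(x)]

for k in range(1, 26):
    x_new = z - tau * gradient(z)            # gradient step from the extrapolated point
    z = x_new + (k / (k + 3)) * (x_new - x)  # momentum extrapolation (assumed coefficient)
    x = x_new
    values.append(objective(x))

values = np.array(values)
ratios = values[1:5] / values[0]

print("objective ratios f(x_k)/f(x_0), k=1..4:", ratios)
print("objective increases at some iteration:", bool(np.any(np.diff(values) > 0)))
```

Unlike plain gradient descent, the momentum iterates can overshoot the minimizer, so the objective is not guaranteed to decrease at every iteration.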

## Step 6: Retrieve the Solution

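After running the gradient descent algorithm, we can retrieve the solution using the `solution` method of the `GradientDescent` class. Since the minimizer of this objective is `y`, we check the result by comparing it with `y` up to an absolute tolerance of `1e-2`.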

In [7]:
# Retrieve the solution
solution = gd.solution()
print("Solution:", solution)
print(f"Solution is correct --> {np.allclose(y, solution, atol=1e-2)}")

Solution: [-3.27570848e-04  9.99706300e-01  1.56713285e-03  2.65776972e-05
  1.99891020e+00  2.99674239e+00 -2.08196580e-04  9.99719493e-01
  3.99658314e+00  9.99979286e-01 -1.47893029e-04]
Solution is correct --> True
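
To see how close the printed solution is to `y`, the sketch below copies the values from the output above and measures the largest absolute deviation. It also shows that the check depends on the chosen tolerance: `np.allclose` accepts the result with `atol=1e-2` but not with `atol=1e-3`.

```python
import numpy as np

y = np.array([0, 1, 0, 0, 2, 3, 0, 1, 4, 1, 0], dtype=float)

# Solution values copied from the printed output above
solution = np.array([
    -3.27570848e-04, 9.99706300e-01, 1.56713285e-03, 2.65776972e-05,
    1.99891020e+00, 2.99674239e+00, -2.08196580e-04, 9.99719493e-01,
    3.99658314e+00, 9.99979286e-01, -1.47893029e-04,
])

abs_dev = np.abs(solution - y)
worst = int(np.argmax(abs_dev))   # index of the largest deviation
max_dev = float(abs_dev[worst])

passes_loose = bool(np.allclose(y, solution, atol=1e-2))
passes_tight = bool(np.allclose(y, solution, atol=1e-3))

print("largest deviation:", max_dev, "at index", worst)
print("passes atol=1e-2:", passes_loose, "| passes atol=1e-3:", passes_tight)
```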


## Step 7: Retrieve Optimization Statistics

After running the gradient descent algorithm, we can also retrieve per-iteration statistics. Note that the objective does not decrease monotonically (for example, it rises between iterations 7 and 8), which is typical of accelerated methods. Here the statistics are the relative error (`RelError[x]`) and the value of the objective function (`Memorize[objective_func]`) at each iteration:

In [8]:
# Retrieve the stats
data, history = gd.stats()
history

array([( 0, 0.00000000e+00, 2.53759988e+01),
       ( 1, 4.16710246e-01, 1.62406392e+01),
       ( 2, 3.80357198e-01, 9.13535956e+00),
       ( 3, 2.89979702e-01, 4.39146884e+00),
       ( 4, 2.04252865e-01, 1.70487734e+00),
       ( 5, 1.37938790e-01, 4.67148345e-01),
       ( 6, 8.87222657e-02, 5.55415310e-02),
       ( 7, 5.25220156e-02, 2.52948605e-03),
       ( 8, 2.63193150e-02, 4.01507174e-02),
       ( 9, 8.09182982e-03, 6.13165784e-02),
       (10, 3.60051282e-03, 5.12772113e-02),
       (11, 9.94832478e-03, 2.82655782e-02),
       (12, 1.20984534e-02, 9.57267179e-03),
       (13, 1.12348604e-02, 1.10825594e-03),
       (14, 8.54853934e-03, 2.34850329e-04),
       (15, 5.13271879e-03, 1.96149505e-03),
       (16, 1.86075045e-03, 2.99650148e-03),
       (17, 6.96761060e-04, 2.58402292e-03),
       (18, 2.28543631e-03, 1.44474886e-03),
       (19, 2.91598958e-03, 4.67193968e-04),
       (20, 2.77623380e-03, 3.55711560e-05),
       (21, 2.13752565e-03, 3.74053279e-05),
       (22

## Step 8: Automatic Hyperparameter Tuning

`GradientDescent` can choose the step size `tau` automatically: simply omit `tau` when calling `fit`. A well-chosen step size can greatly reduce the number of iterations. In this example, the objective reaches zero after a single iteration.
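
To see why the step size matters so much for this objective, here is a minimal NumPy-only sketch. It assumes the objective is ‖x − y‖², whose gradient 2(x − y) has Lipschitz constant L = 2, and it uses the step 1/L. Whether `GradientDescent` picks exactly this value is an assumption on our part, although it is consistent with the one-iteration convergence reported in this step.

```python
import numpy as np

y = np.array([0, 1, 0, 0, 2, 3, 0, 1, 4, 1, 0], dtype=float)

L = 2.0        # Lipschitz constant of the gradient 2 (x - y)
tau = 1.0 / L  # assumed automatic step size

rng = np.random.default_rng(0)
x0 = rng.standard_normal(y.size)

# One gradient step: x1 = x0 - 0.5 * 2 (x0 - y) = y
x1 = x0 - tau * 2.0 * (x0 - y)
one_step_converged = bool(np.allclose(x1, y))

print("converged after one step:", one_step_converged)
```

With `tau = 0.1`, each plain gradient step removes only 20% of the remaining error, which is why the earlier run needed many more iterations.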

In [9]:
gd.fit(x0=x0, acceleration=True, track_objective=True)
_, history = gd.stats()
history

array([(0, 0.        , 25.37599877), (1, 2.08355123,  0.        ),
       (2, 0.        ,  0.        )],
      dtype=[('iteration', '<i8'), ('RelError[x]', '<f8'), ('Memorize[objective_func]', '<f8')])
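
The history is a NumPy structured array, so its columns can be accessed by the field names listed in the `dtype`. As a small sketch, the code below rebuilds the array printed above by hand (rather than calling the solver) and finds the first iteration at which the objective is numerically zero. The threshold `1e-8` is an arbitrary choice for illustration.

```python
import numpy as np

# Rebuild the printed history by hand, using the same field names and dtypes
history = np.array(
    [(0, 0.0, 25.37599877), (1, 2.08355123, 0.0), (2, 0.0, 0.0)],
    dtype=[
        ("iteration", "<i8"),
        ("RelError[x]", "<f8"),
        ("Memorize[objective_func]", "<f8"),
    ],
)

# Columns are selected by field name
iterations = history["iteration"]
objective = history["Memorize[objective_func]"]

# First iteration whose objective value is below the threshold
first_converged = int(iterations[np.argmax(objective <= 1e-8)])

print("objective per iteration:", objective)
print("first iteration with (near) zero objective:", first_converged)
```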