# **Lab 6: optimization and learning**
**Gustav Grevsten**

# **Abstract**

The purpose of this lab is to implement and test the gradient descent algorithm for finding stationary points of multivariable functions. The step length is computed with the golden-section search method. We conclude that the implemented algorithms yield the expected results.

# **Set up environment**

In [19]:
# Load necessary modules.
import numpy as np

# **Introduction**

Gradient descent is an optimization technique for finding a minimum or maximum of a multivariable function $f: \mathbb{R}^n \to \mathbb{R}$, $n \in \mathbb{N}$, by iteratively adjusting the parameters in the direction of steepest descent or ascent. The iterates approach stationary points, which are points where the gradient of the function is the zero vector. In essence, the algorithm moves against the gradient (for a minimum) or along it (for a maximum) until such a point is reached, and it applies to functions of any dimension. Despite its simplicity, gradient descent is a powerful tool for solving optimization problems and has become a fundamental technique in modern data science and engineering.

# **Method**

In this lab we will employ the gradient descent method, or steepest descent method, in order to find stationary points for multivariable functions. 

Starting from an initial guess $x_0$, we update our guess by taking a step in the direction of the negative gradient of $f$ at the current iterate $x_n$, $n \in \mathbb{N}$:

$$x_{n+1} = x_n - \alpha \nabla f(x_n)$$

where $\alpha$ is a scalar called the step length, and $\nabla f(x_n)$ is the gradient of the function evaluated at $x_n$. This process is repeated until a convergence criterion is met, which in this case is that the norm $\|\nabla f(x_n)\|$ falls below a given tolerance.
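
As a minimal sketch of the update rule and stopping criterion above, the following uses a fixed step length instead of a line search. The function name `gradient_descent_fixed_step`, the quadratic test problem, and the values of `alpha`, `tol` and `max_iter` are illustrative assumptions, not part of the lab's implementation.

```python
import numpy as np

def gradient_descent_fixed_step(grad_f, x0, alpha=0.1, tol=1e-8, max_iter=10_000):
    """Iterate x <- x - alpha * grad_f(x) until ||grad_f(x)|| <= tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) <= tol:
            break  # convergence criterion: small gradient norm
        x = x - alpha * g  # step against the gradient
    return x

# Hypothetical test problem: f(x, y) = (x - 1)^2 + 2 (y + 3)^2, minimized at (1, -3).
def grad_quadratic(v):
    return np.array([2.0 * (v[0] - 1.0), 4.0 * (v[1] + 3.0)])

x_min = gradient_descent_fixed_step(grad_quadratic, [0.0, 0.0])
print(x_min)
```

A fixed step length only converges when it is small enough for the function at hand, which is one motivation for choosing $\alpha$ by a line search instead.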

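In this case, $\alpha$ is determined using the golden-section search method to minimize $f$ along the line segment from $x_n$ to $x_n - \frac{\nabla f(x_n)}{\|\nabla f(x_n)\|}$, that is, one unit of length in the direction of steepest descent. The step length $\alpha$ is taken as the distance from $x_n$ to the minimizer found on this segment.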
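
For reference, the interior points used by golden-section search on a segment with endpoints $a$ and $b$ can be written out explicitly. The symbols $a$, $b$, $c$, $d$ mirror the names used in `line_search`, $\varphi$ is notation introduced here, and the iteration count assumes a segment of unit length and the default tolerance $10^{-2}$:

$$\varphi = \frac{1+\sqrt{5}}{2}, \qquad c = b - \frac{b-a}{\varphi}, \qquad d = a + \frac{b-a}{\varphi}$$

If $f(c) < f(d)$ the segment is replaced by $[a, d]$, otherwise by $[c, b]$. In both cases its length shrinks by the factor $1/\varphi \approx 0.618$, so after $k$ iterations a unit segment has length $\varphi^{-k}$:

$$\varphi^{-k} \le 10^{-2} \iff k \ge \frac{\ln 100}{\ln \varphi} \approx 9.57$$

Under these assumptions, about 10 iterations are needed per line search.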


In [21]:
def line_search(f, a, b, thresh = 10**-2):
    """Golden-section search for a minimizer of f on the segment [a, b]."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    gr = (np.sqrt(5) + 1) / 2  # golden ratio
    inv_gr = 1 / gr
    # Two interior points dividing [a, b] in the golden ratio.
    c = b - (b - a) * inv_gr
    d = a + (b - a) * inv_gr
    while np.linalg.norm(b - a) > thresh:
        # Keep the sub-segment that contains the smaller interior value.
        if f(c) < f(d):
            b = d
        else:
            a = c

        # Recompute both interior points for the shrunken segment.
        c = b - (b - a) * inv_gr
        d = a + (b - a) * inv_gr

    # Return the midpoint of the final segment.
    return (a + b) / 2

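def grad(f, x):
  """Forward-difference approximation of the gradient of f at x."""
  dX = 10**-10  # finite-difference step
  dF = []
  for i in range(len(x)):
    X = x.copy()
    X[i] += dX  # perturb only the i-th coordinate
    # Use the function passed as an argument, not a global f.
    dF.append(np.subtract(f(X), f(x)))
  return np.multiply(dF, 1/dX)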

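def grad_dec(f, x0, thresh = 10**-5):
  """Gradient descent with a golden-section line search for the step length."""
  dF = grad(f, x0)
  x = x0
  # Iterate until the gradient norm falls below the tolerance.
  while np.linalg.norm(dF) > thresh:
    dF = grad(f, x)
    dF_unit = np.multiply(dF, 1/(np.linalg.norm(dF)))
    # alpha: distance from x to the minimizer found on the unit segment
    # in the steepest-descent direction.
    alpha = np.linalg.norm(np.subtract(x, line_search(f, x, np.subtract(x, dF_unit))))
    # Update x_{n+1} = x_n - alpha * grad f(x_n), using the unnormalized gradient.
    x = np.subtract(x, np.multiply(dF, alpha))
  return x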
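
The `grad` function above uses a forward difference, whose truncation error is first order in the step. A common alternative is a central difference, which is second order. The sketch below is an illustration only: the name `central_diff_grad`, the step `h = 1e-6` and the quadratic test function are assumptions made here, not part of the lab's code.

```python
import numpy as np

def central_diff_grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h  # perturb only the i-th coordinate
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# Hypothetical test function g(x, y) = x^2 + 3xy with gradient (2x + 3y, 3x).
def quadratic(v):
    return v[0] ** 2 + 3.0 * v[0] * v[1]

point = np.array([1.0, 2.0])
approx = central_diff_grad(quadratic, point)
exact = np.array([2.0 * point[0] + 3.0 * point[1], 3.0 * point[0]])
max_err = np.max(np.abs(approx - exact))
print(approx, max_err)
```

Comparing a numerical gradient with a known analytic one in this way is a cheap sanity check before running the full descent.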

# **Results**

Here, we test the algorithm presented in the Method section. We use the objective function $f(x,y) = \sin(x) + \cos(y)$ with the gradient $\nabla f(x,y) = [\cos(x), -\sin(y)]^T$. This function has the stationary points $(\frac{\pi}{2} + m\pi, n\pi)$, $m, n \in \mathbb{Z}$. We test the algorithm using the arbitrary starting guesses $(0,0)$, $(1,1)$ and $(-1,-1)$.
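
Before running the tests, the stated stationary points can be checked against the analytic gradient. The sketch below evaluates the gradient at points of the form $(\frac{\pi}{2} + m\pi, n\pi)$ with independent integers $m$ and $n$. The helper name `analytic_grad` and the range of integers tested are arbitrary choices made for illustration.

```python
import numpy as np

def analytic_grad(v):
    """Analytic gradient of f(x, y) = sin(x) + cos(y)."""
    return np.array([np.cos(v[0]), -np.sin(v[1])])

# Largest gradient norm over a small grid of candidate stationary points.
max_grad_norm = 0.0
for m in range(-2, 3):
    for n in range(-2, 3):
        point = np.array([np.pi / 2 + m * np.pi, n * np.pi])
        max_grad_norm = max(max_grad_norm, np.linalg.norm(analytic_grad(point)))

print(max_grad_norm)  # should be at round-off level
```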

In [22]:
def f(v):
  x, y = v[0], v[1]
  return np.sin(x) + np.cos(y)

print("Stationary point found for initial guess (0,0): " + str(grad_dec(f, [0,0])))
print("Stationary point found for initial guess (1,1): " + str(grad_dec(f, [1,1])))
print("Stationary point found for initial guess (-1,-1): " + str(grad_dec(f, [-1,-1])))

Stationary point found for initial guess (0,0): [-1.57078545  0.        ]
Stationary point found for initial guess (1,1): [-1.57078467  3.14159176]
Stationary point found for initial guess (-1,-1): [-1.57079529 -3.14158073]


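As we can see, the algorithm converged to a different stationary point for each of the three starting guesses: approximately $(-\frac{\pi}{2}, 0)$, $(-\frac{\pi}{2}, \pi)$ and $(-\frac{\pi}{2}, -\pi)$. Each computed point agrees with the exact value to within about $10^{-5}$. Note that $(-\frac{\pi}{2}, 0)$ is a saddle point rather than a minimum: starting from $(0,0)$, the $y$-component of the gradient is zero, so the iterates never leave the line $y = 0$.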
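
The accuracy of the printed results can be quantified with a short check. The sketch below copies the three reported points from the output above, measures their distance to the nearest exact stationary points, and classifies those points using the analytic Hessian $\mathrm{diag}(-\sin x, -\cos y)$. The variable and function names are illustrative assumptions.

```python
import numpy as np

# Points reported in the output above.
reported = np.array([
    [-1.57078545, 0.0],
    [-1.57078467, 3.14159176],
    [-1.57079529, -3.14158073],
])
# Nearest exact stationary points of f(x, y) = sin(x) + cos(y).
exact = np.array([
    [-np.pi / 2, 0.0],
    [-np.pi / 2, np.pi],
    [-np.pi / 2, -np.pi],
])
errors = np.linalg.norm(reported - exact, axis=1)

def hessian(v):
    """Analytic Hessian of f(x, y) = sin(x) + cos(y)."""
    return np.diag([-np.sin(v[0]), -np.cos(v[1])])

kinds = []
for p in exact:
    eig = np.linalg.eigvalsh(hessian(p))
    if np.all(eig > 0):
        kinds.append("minimum")
    elif np.all(eig < 0):
        kinds.append("maximum")
    else:
        kinds.append("saddle")

print(errors)
print(kinds)
```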

# **Discussion**

As expected, the results confirm that the method finds stationary points of the test function. It should be noted that the step length $\alpha$ can be calculated using other methods, such as a backtracking line search, that the search direction itself can be chosen differently, as in the conjugate gradient method, and that a different interval could have been chosen for the golden-section search. Each of these choices can have an impact on the convergence of the method.
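
To illustrate one alternative way of choosing the step length, the sketch below uses a backtracking line search with the Armijo sufficient-decrease condition in place of golden-section search. It is not the method used in this lab. The function names, the parameters `alpha0`, `shrink` and `c1`, and the use of the analytic gradient are assumptions made for this example.

```python
import numpy as np

def backtracking_step(f, grad_f, x, alpha0=1.0, shrink=0.5, c1=1e-4, max_halvings=50):
    """Shrink alpha until f(x - alpha*g) <= f(x) - c1*alpha*||g||^2 (Armijo condition)."""
    g = grad_f(x)
    fx = f(x)
    alpha = alpha0
    for _ in range(max_halvings):
        if f(x - alpha * g) <= fx - c1 * alpha * np.dot(g, g):
            break
        alpha *= shrink
    return alpha

def objective(v):
    return np.sin(v[0]) + np.cos(v[1])

def objective_grad(v):
    return np.array([np.cos(v[0]), -np.sin(v[1])])

x = np.array([1.0, 1.0])
for _ in range(200):
    g = objective_grad(x)
    if np.linalg.norm(g) <= 1e-8:
        break  # same kind of stopping criterion as before
    x = x - backtracking_step(objective, objective_grad, x) * g

print(x)
```

Backtracking needs only a few function evaluations per step, while golden-section search locates the minimizer along the segment more precisely. Which trade-off is preferable depends on the cost of evaluating $f$.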

Overall, the results support the usefulness of the gradient descent method for optimization problems of this kind.