# Nonlinear optimisation

This notebook covers several topics in nonlinear optimisation:

1. Plotting the gradient and calculating the Hessian
2. Analytical steepest descent method with SymPy
3. Use of SciPy to solve nonlinear optimisation problems


## Gradient and Hessian

Let's start by plotting the gradient of a 2D function.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d, Axes3D

# Define the function and the gradient
def f(x, y):
    return -(np.cos(x)**2 + np.cos(y)**2)**2

def grad_f(x, y):
    # Chain rule: d/dx[-(cos^2 x + cos^2 y)^2] = 4 (cos^2 x + cos^2 y) sin x cos x
    s = np.cos(x)**2 + np.cos(y)**2
    return 4 * s * np.sin(x) * np.cos(x), 4 * s * np.sin(y) * np.cos(y)
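
A hand-derived gradient is easy to get wrong, so it can help to compare it with a numerical approximation before plotting. The sketch below is not part of the original notebook: the names `f_check`, `grad_f_check` and `central_difference`, the test point and the step size `h` are all illustrative assumptions. It writes the gradient out via the chain rule and compares it with central differences.

```python
import numpy as np

def f_check(x, y):
    # Same 2D function as in the notebook, under a separate (hypothetical) name
    return -(np.cos(x)**2 + np.cos(y)**2)**2

def grad_f_check(x, y):
    # Chain rule: d/dx[-(s)^2] with s = cos^2 x + cos^2 y gives 4 s sin x cos x
    s = np.cos(x)**2 + np.cos(y)**2
    return 4 * s * np.sin(x) * np.cos(x), 4 * s * np.sin(y) * np.cos(y)

def central_difference(func, x, y, h=1e-6):
    # Second-order accurate finite-difference approximation of the gradient
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dfdx, dfdy

# Arbitrary test point inside the plotted domain (an assumption for illustration)
point = (0.3, -0.7)
analytic = np.array(grad_f_check(*point))
numeric = np.array(central_difference(f_check, *point))
max_abs_diff = np.max(np.abs(analytic - numeric))
print("Largest difference between analytic and numerical gradient:", max_abs_diff)
```

A difference many orders of magnitude below the size of the gradient itself suggests the analytic expression is consistent with the function.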
    

In [None]:
# Create a grid of x and y values
x = np.linspace(-np.pi/2, np.pi/2, 100)
y = np.linspace(-np.pi/2, np.pi/2, 100)
X, Y = np.meshgrid(x, y)

# Evaluate the function on the grid
Z = f(X, Y)

# Create the figure for plotting
fig = plt.figure(figsize=(12, 8))
ax = fig.add_subplot(111, projection='3d')

# Plot the surface
ax.plot_surface(X, Y, Z, alpha=0.5, rstride=4, cstride=4, color='b')

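# Plot the gradient vectors (quivers)
# Subsample the grid to reduce the density of the arrows
skip = (slice(None, None, 6), slice(None, None, 5))  # Increase the step sizes (6 and 5) to draw fewer arrows
U, V = grad_f(X[skip], Y[skip])
# Normalise to unit length so that only the direction is shown
norm = np.sqrt(U**2 + V**2)
norm[norm == 0] = 1  # Avoid division by zero where the gradient vanishes
U /= norm
V /= norm
# Draw the arrows in the plane z = -4, the minimum value of f
ax.quiver(X[skip], Y[skip], -4, U, V, 0, color='r', length=0.1, normalize=True)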


# Labels and title
ax.set_xlabel('X axis')
ax.set_ylabel('Y axis')
ax.set_zlabel('Z axis')
ax.set_title('Surface plot with gradient projection in XY-plane')

# Show plot
plt.show()

# Save the figure to file
#fig.savefig('Function_and_gradient.png', dpi=300)
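
The Hessian of the same 2D function can be obtained symbolically. The following is a minimal sketch, assuming SymPy is available. The names `xs`, `ys`, `f_sym`, `grad_sym` and `hess_sym` are illustrative and chosen so that they do not overwrite `f`, `x` or `y` from the cells above.

```python
import sympy as sp

# Symbolic version of the 2D function plotted above
xs, ys = sp.symbols('x y', real=True)
f_sym = -(sp.cos(xs)**2 + sp.cos(ys)**2)**2

# Gradient (column vector of first derivatives) and Hessian (matrix of second derivatives)
grad_sym = sp.Matrix([sp.diff(f_sym, v) for v in (xs, ys)])
hess_sym = sp.hessian(f_sym, (xs, ys))

# Evaluate both at the origin, the centre of the plotted domain
grad_at_origin = grad_sym.subs({xs: 0, ys: 0})
hess_at_origin = hess_sym.subs({xs: 0, ys: 0})

print("Gradient at (0, 0):", list(grad_at_origin))
print("Hessian at (0, 0):", hess_at_origin)
print("Eigenvalues with multiplicities:", hess_at_origin.eigenvals())
```

A zero gradient together with strictly positive Hessian eigenvalues indicates a local minimum, which is consistent with the bowl shape of the surface around the origin.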

## Steepest descent method

We will implement the steepest descent method with an exact (analytical) line search using SymPy. For an introduction to SymPy, see the [Introduction tutorial](https://docs.sympy.org/latest/tutorials/intro-tutorial/index.html).

The implementation should work for any function of three variables for which SymPy can calculate the gradient and solve the resulting one-dimensional equation for the step length.

First we define the function and calculate its gradient.

In [28]:
import sympy as sp

# Declare variables
x1, x2, x3 = sp.symbols('x1 x2 x3')
a = sp.symbols('a')

# Define the function
f = x1**2 + x1 * (1-x2) + x2**2 - x2*x3 + x3**2 + x3

# Display the function
print("The function f(x1, x2, x3) is defined as:")
display(f)

# Compute the gradient of f
gradient_f = sp.Matrix([f.diff(x) for x in (x1, x2, x3)])

# Display the gradient
print("The gradient of f(x1, x2, x3) is:")
display(gradient_f)

The function f(x1, x2, x3) is defined as:


x1**2 + x1*(1 - x2) + x2**2 - x2*x3 + x3**2 + x3

The gradient of f(x1, x2, x3) is:


Matrix([
[  2*x1 - x2 + 1],
[-x1 + 2*x2 - x3],
[ -x2 + 2*x3 + 1]])

Now we set up a loop that iteratively calculates the next point, starting from an initial point. Each iteration performs the following steps:

- Evaluate the negative gradient at the current point
- Form the new point as the current point plus the step length `a` times the negative gradient
- Insert the formula for the new point into the function, which then depends only on `a`
- Calculate the derivative of the function with respect to `a`
- Set the derivative to zero to find the optimal value of `a`, then update `x`
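
Written out as equations, with search direction $d_k = -\nabla f(x_k)$ and step length $a$, each iteration solves a one-dimensional problem. The closed form for $a_k$ below is not used in the notebook. It is a side note that assumes a quadratic function with constant Hessian $H$, which holds for the example function here.

$$
x_{k+1} = x_k + a_k d_k, \qquad \frac{\mathrm{d}}{\mathrm{d}a} f(x_k + a\, d_k)\Big|_{a = a_k} = 0, \qquad a_k = \frac{d_k^\top d_k}{d_k^\top H d_k}
$$

For the first step from $x_0 = (0, 0, 0)$, the gradient is $(1, 0, 1)$, so $d_0 = (-1, 0, -1)$. With

$$
H = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}, \qquad d_0^\top d_0 = 2, \qquad d_0^\top H d_0 = 4,
$$

the step length is $a_0 = 1/2$ and the new point is $x_1 = (-1/2, 0, -1/2)$, in agreement with the first line of the output below.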

In [42]:
# Set the initial point
x = [0, 0, 0]

# Set to True to show debugging output
show = False

for i in range(8):
    print("Step", i+1)
    # Evaluate the negative gradient at the current point
    grad_at_x = -gradient_f.subs({x1: x[0], x2: x[1], x3: x[2]}) 
    if show:
        display(grad_at_x)

    # Calculate the step and add it to the current point
    x_update = [a * g for g in grad_at_x]
    # Element-wise sum; the names differ from the loop counter i to avoid confusion
    x_new = [xi + step for xi, step in zip(x, x_update)]

    if show:
        display(x_update)
        display(x_new)
    
    # Insert the formula for the new point into the function
    f_new = f.subs({x1: x_new[0], x2: x_new[1], x3: x_new[2]}) 

    # Calculate the derivative of the function with respect to a
    df_da = sp.diff(f_new, a)

    if show:
        display(f_new)
        display(df_da)

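    # Find the optimal value of a (stationary point of f along the search direction)
    sol = sp.solvers.solve(df_da, a)
    if not sol:
        # No solution: the gradient is zero, so x is already a stationary point
        print("Gradient is zero; stopping.")
        break

    # Scale the search direction by the optimal step length
    x_step = [sol[0] * g for g in grad_at_x]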

    if show:
        display(sol)
        display(x_step)

    # Add the step to the current x
    x = [xi + step for xi, step in zip(x, x_step)]
    print("x[{}] = [".format(i+1), *x, "]")

Step 1
x[1] = [ -1/2 0 -1/2 ]
Step 2
x[2] = [ -1/2 -1/2 -1/2 ]
Step 3
x[3] = [ -3/4 -1/2 -3/4 ]
Step 4
x[4] = [ -3/4 -3/4 -3/4 ]
Step 5
x[5] = [ -7/8 -3/4 -7/8 ]
Step 6
x[6] = [ -7/8 -7/8 -7/8 ]
Step 7
x[7] = [ -15/16 -7/8 -15/16 ]
Step 8
x[8] = [ -15/16 -15/16 -15/16 ]
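
The iterates above appear to approach a single point. As a cross-check, the stationary point can be computed directly by setting the gradient to zero. The sketch below is not part of the original notebook: it re-declares the symbols so that it runs on its own, and the names `f_expr`, `grad_expr`, `stationary` and `hessian_expr` are illustrative.

```python
import sympy as sp

# Same function of three variables as above, under a separate (hypothetical) name
x1, x2, x3 = sp.symbols('x1 x2 x3')
f_expr = x1**2 + x1 * (1 - x2) + x2**2 - x2 * x3 + x3**2 + x3

# Solve grad f = 0 for the stationary point
grad_expr = [sp.diff(f_expr, v) for v in (x1, x2, x3)]
stationary = sp.solve(grad_expr, (x1, x2, x3), dict=True)

# The Hessian of a quadratic is constant; its eigenvalues classify the stationary point
hessian_expr = sp.hessian(f_expr, (x1, x2, x3))
eigenvalues = hessian_expr.eigenvals()

print("Stationary point:", stationary)
print("Hessian eigenvalues:", eigenvalues)
```

If all eigenvalues are positive, the function is strictly convex and the stationary point is its unique minimiser, so the steepest descent iterates should converge to it.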


## Use SciPy

In [43]:
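from scipy.optimize import minimize

def problem_b(x):
    """Negative of the objective 126*x[0] - 9*x[0]**2 + 182*x[1] - 13*x[1]**2.

    minimize() minimises, so we minimise the negative to maximise the objective.
    """
    return -(126*x[0] - 9*x[0]**2 + 182*x[1] - 13*x[1]**2)

# Initial guess
x0 = [0, 0]

res = minimize(problem_b, x0, method='Nelder-Mead', tol=1e-6)
print(res)
# Negate again to report the maximised objective value
print("Function value {}".format(-problem_b(res.x)))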

       message: Optimization terminated successfully.
       success: True
        status: 0
           fun: -1077.9999999999989
             x: [ 7.000e+00  7.000e+00]
           nit: 95
          nfev: 183
 final_simplex: (array([[ 7.000e+00,  7.000e+00],
                       [ 7.000e+00,  7.000e+00],
                       [ 7.000e+00,  7.000e+00]]), array([-1.078e+03, -1.078e+03, -1.078e+03]))
Function value 1077.9999999999989
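
Nelder-Mead uses only function values. Because this objective is smooth and its gradient is easy to write down, a gradient-based method is a natural alternative. The following is a sketch, not part of the original notebook: the function names `neg_objective` and `neg_objective_grad` are illustrative, and BFGS is one of several gradient-based methods that `minimize` offers.

```python
import numpy as np
from scipy.optimize import minimize

def neg_objective(x):
    # Negative of the objective, as in problem_b above
    return -(126 * x[0] - 9 * x[0]**2 + 182 * x[1] - 13 * x[1]**2)

def neg_objective_grad(x):
    # Analytic gradient of neg_objective with respect to x[0] and x[1]
    return np.array([-(126 - 18 * x[0]), -(182 - 26 * x[1])])

# BFGS uses the supplied gradient (jac) instead of finite differences
res_bfgs = minimize(neg_objective, [0.0, 0.0], method='BFGS', jac=neg_objective_grad)

print("x =", res_bfgs.x)
print("Function value", -res_bfgs.fun)
print("Function evaluations:", res_bfgs.nfev)
```

Setting the gradient to zero by hand gives x[0] = 126/18 = 7 and x[1] = 182/26 = 7, so both solvers can be checked against the exact maximiser.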
