# Project 1 - Finite differences

Many physics codes only provide objective values and are not able to compute their gradients. In that case, the derivative is often approximated using finite differences. The goal of this project is to implement your own finite difference approximation and investigate the impact of round-off error as well as the increase in runtime compared to analytical gradient computation.

In [None]:
# Let's begin by setting up our toy problem again.
from pyplasmaopt import *
import numpy as np  # imported explicitly, since the cells below use np directly
nfp = 2
(coils, currents, magnetic_axis, eta_bar) = get_24_coil_data(nfp=nfp, ppp=10, at_optimum=False)
stellarator = CoilCollection(coils, currents, nfp, True)
iota_target = 0.103
coil_length_target = 4.398229715025710
magnetic_axis_length_target = 6.356206812106860
eta_bar = -2.25
obj = SimpleNearAxisQuasiSymmetryObjective(
        stellarator, magnetic_axis, iota_target, eta_bar=eta_bar,
        coil_length_target=coil_length_target, magnetic_axis_length_target=magnetic_axis_length_target)

A first-order approximation to the derivative at $x$ in the direction of the $i$-th degree of freedom is given by
$$ (\nabla f(x))_i= \frac{f(x+\epsilon e_i) - f(x)}{\epsilon} + O(\epsilon). $$
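
To make the formula concrete, here is a minimal sketch that applies it component by component to a small toy function with a known gradient. The names `forward_difference_gradient`, `toy_objective`, and `x_toy` are made up for this illustration and are not part of `pyplasmaopt`.

```python
import numpy as np

def forward_difference_gradient(f, x, eps=1e-6):
    """Approximate the gradient of f at x, one unit vector e_i at a time."""
    x = np.asarray(x, dtype=float)
    fx = f(x)
    grad = np.empty_like(x)
    for i in range(x.size):
        e_i = np.zeros_like(x)
        e_i[i] = 1.0
        grad[i] = (f(x + eps * e_i) - fx) / eps
    return grad

def toy_objective(x):
    # Hypothetical test function with an easily derived gradient.
    return np.sum(x**2) + np.sin(x[0])

x_toy = np.array([0.5, -1.0, 2.0])
grad_fd = forward_difference_gradient(toy_objective, x_toy)
grad_exact = 2 * x_toy + np.array([np.cos(x_toy[0]), 0.0, 0.0])
max_error = np.max(np.abs(grad_fd - grad_exact))
print('Largest componentwise error =', max_error)
```

Note that a full gradient costs one extra function evaluation per degree of freedom, which is why the cells below compare a single directional derivative instead.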

The following code computes the directional derivative $\nabla f(x)\cdot h$ in a direction $h$ by finite differences and compares it with the derivative that the code computes for us.

In [None]:
def compute_directional_derivative_fd(obj, x, h, eps):
    obj.update(x, compute_derivative=False)
    fx = obj.res
    obj.update(x+eps*h, compute_derivative=False)
    fxh = obj.res
    return (fxh-fx)/eps

x = obj.x0
h = np.random.rand(*(x.shape))
obj.update(x)
derivative = np.sum(obj.dres*h)
eps = 1e-6
fd_derivative = compute_directional_derivative_fd(obj, x, h, eps)
print('Exact derivative =', derivative)
print('Finite differences derivative =', fd_derivative)
print('Error =', abs(derivative-fd_derivative))

## Tasks

1. Vary the stepsize $\epsilon$. As $\epsilon\to0$, how does the error behave?
2. Look up the 2nd, 4th, and 6th order finite difference stencils on Wikipedia and implement them. Show a plot comparing the step size $\epsilon$ with the approximation error. What do you observe?
3. The `minimize` function in scipy supports finite differences to compute the gradient. To use this, set the argument `jac=False` and make sure that the function you pass to `minimize` returns only the objective value, and not both objective and gradient. Now run the optimisation for 500 iterations, once with exact gradients and once using the scipy finite difference approximation. Compare the runtime of the two and plot the objective values. What do you observe? Call `print` on the object that `minimize` returns to find out why each algorithm terminated.

*Note*: If you pass `callback=obj.callback` to `minimize`, then the objective history will be stored in `obj.Jvals`. Before running a second optimisation, you can reset this history by calling `obj.clear_history()`.
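
As a warm-up for task 3, the two ways of calling `minimize` can be tried on a cheap test problem first. The sketch below uses scipy's built-in Rosenbrock function (`rosen`, `rosen_der`) instead of the stellarator objective. The starting point and the `calls` counter are assumptions made for this illustration only.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

x_start = np.array([-1.2, 1.0, -1.2, 1.0])
calls = {"exact": 0, "fd": 0}  # counts how often each objective is evaluated

def value_and_gradient(x):
    # With jac=True, minimize expects the pair (objective, gradient).
    calls["exact"] += 1
    return rosen(x), rosen_der(x)

def value_only(x):
    # With jac=False, minimize expects the objective value only
    # and approximates the gradient by finite differences.
    calls["fd"] += 1
    return rosen(x)

res_exact = minimize(value_and_gradient, x_start, jac=True, method='BFGS')
res_fd = minimize(value_only, x_start, jac=False, method='BFGS')

print('Evaluations with exact gradient:', calls["exact"])
print('Evaluations with finite differences:', calls["fd"])
print(res_exact.message)
print(res_fd.message)
```

The evaluation counts give a rough idea of where the extra runtime comes from: every finite difference gradient needs additional objective evaluations for each degree of freedom.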

## Solutions

### Task 1

In [None]:
for eps in [10**(-i) for i in range(5, 14)]:
    fd_derivative = compute_directional_derivative_fd(obj, x, h, eps)
    err = abs(derivative-fd_derivative)
    print(eps, err)
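
One way to interpret the output: the forward difference error has a truncation part that shrinks proportionally to $\epsilon$ and a round-off part that grows roughly like machine precision divided by $\epsilon$, so the error first decreases and then increases again. The sketch below reproduces this behaviour on $\sin$ at a single point, where the exact derivative is known. The test point `x0` and the list `step_sizes` are arbitrary choices for this illustration.

```python
import numpy as np

x0 = 1.0
exact = np.cos(x0)  # exact derivative of sin at x0
step_sizes = [10.0**(-k) for k in range(1, 16)]

errors = []
for step in step_sizes:
    fd = (np.sin(x0 + step) - np.sin(x0)) / step
    errors.append(abs(fd - exact))
    print(step, errors[-1])

# The smallest error is expected near the square root of machine precision.
best_eps = step_sizes[int(np.argmin(errors))]
print('Step size with the smallest error:', best_eps)
print('Square root of machine precision:', np.sqrt(np.finfo(float).eps))
```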

### Task 2

In [None]:
def compute_directional_derivative_fd_ho(obj, x, h, eps, order):
    if order == 1:
        shifts = [0, 1]
        weights = [-1, 1]
    elif order == 2:
        shifts = [-1, 1]
        weights = [-0.5, 0.5]
    elif order == 4:
        shifts = [-2, -1, 1, 2]
        weights = [1/12, -2/3, 2/3, -1/12]
    elif order == 6:
        shifts = [-3, -2, -1, 1, 2, 3]
        weights = [-1/60, 3/20, -3/4, 3/4, -3/20, 1/60]
    else:
        raise ValueError(f"Unsupported order {order}; choose 1, 2, 4, or 6.")
    obj.update(x + shifts[0]*eps*h, compute_derivative=False)
    fd = weights[0] * obj.res
    for i in range(1, len(shifts)):
        obj.update(x + shifts[i]*eps*h, compute_derivative=False)
        fd += weights[i] * obj.res
    return fd/eps

err1 = []
err2 = []
err4 = []
err6 = []
epss = [2**(-i) for i in range(10, 40)]
for eps in epss:
    fd1 = compute_directional_derivative_fd_ho(obj, x, h, eps, 1)
    err1.append(abs(derivative-fd1))
    fd2 = compute_directional_derivative_fd_ho(obj, x, h, eps, 2)
    err2.append(abs(derivative-fd2))
    fd4 = compute_directional_derivative_fd_ho(obj, x, h, eps, 4)
    err4.append(abs(derivative-fd4))
    fd6 = compute_directional_derivative_fd_ho(obj, x, h, eps, 6)
    err6.append(abs(derivative-fd6))

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.loglog(epss, err1, label="1st order")
plt.loglog(epss, err2, label="2nd order")
plt.loglog(epss, err4, label="4th order")
plt.loglog(epss, err6, label="6th order")
plt.legend()
plt.xlabel('Step size')
plt.ylabel('Error')
plt.show()
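
To check the stencils independently of the stellarator objective, the same shifts and weights can be applied to a scalar function with a known derivative. The sketch below estimates the observed convergence order by halving the step size once. The helper names `fd_derivative_1d` and `observed_order`, the test point, and the step size are assumptions for this illustration.

```python
import numpy as np

# Same shifts and weights as in compute_directional_derivative_fd_ho.
STENCILS = {
    1: ([0, 1], [-1.0, 1.0]),
    2: ([-1, 1], [-0.5, 0.5]),
    4: ([-2, -1, 1, 2], [1/12, -2/3, 2/3, -1/12]),
    6: ([-3, -2, -1, 1, 2, 3], [-1/60, 3/20, -3/4, 3/4, -3/20, 1/60]),
}

def fd_derivative_1d(f, t, step, order):
    """Finite difference approximation of f'(t) with the given stencil."""
    shifts, weights = STENCILS[order]
    return sum(w * f(t + s * step) for s, w in zip(shifts, weights)) / step

def observed_order(f, df, t, step, order):
    """Estimate the convergence order from the errors at step and step/2."""
    err_coarse = abs(fd_derivative_1d(f, t, step, order) - df(t))
    err_fine = abs(fd_derivative_1d(f, t, step / 2, order) - df(t))
    return np.log2(err_coarse / err_fine)

rates = {order: observed_order(np.sin, np.cos, 1.0, 0.1, order)
         for order in STENCILS}
for order, rate in rates.items():
    print('Nominal order', order, 'observed order', rate)
```

With a step size this large, round-off plays no role yet, so the observed orders should lie close to the nominal ones. In the log-log plot above, these orders correspond to the slopes of the curves before round-off error takes over.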

### Task 3

In [None]:
%%time
from scipy.optimize import minimize
obj.clear_history()
def scipy_fun_fd(x):
    obj.update(x, compute_derivative=False)
    res = obj.res
    return res

res = minimize(scipy_fun_fd, obj.x0, jac=False, method='bfgs', tol=1e-20,
               options={"maxiter": 500},
               callback=obj.callback)
print(res)
convergence_fd = obj.Jvals

In [None]:
%%time
obj.clear_history()
def scipy_fun(x):
    obj.update(x)
    res = obj.res
    dres = obj.dres
    return res, dres

res = minimize(scipy_fun, obj.x0, jac=True, method='bfgs', tol=1e-20,
               options={"maxiter": 500},
               callback=obj.callback)
convergence_exact = obj.Jvals
print(res)

In [None]:
plt.semilogy(convergence_exact, label='Exact derivative')
plt.semilogy(convergence_fd, label='Finite difference approximation')
plt.xlabel('Iteration')
plt.ylabel('Objective value')
plt.legend()
plt.show()

The algorithm using finite differences fails after ~300 iterations due to loss of precision in the approximated gradient. Using exact derivatives, the algorithm keeps going and is able to reduce the objective further.