# Lab Week 3: Conjugate Gradients, Newton, and Quasi-Newton

_Tutors: Andreas Grothey, Josh Fogg_

14 February 2025

This lab session experiments with the concepts of the lectures in weeks 3, 4 and 5: Conjugate Gradients, Newton, and Quasi-Newton. There is again an **Assessed Task** at the end.

First download the Python modules required for today's session from Learn (NLOLab3.zip) and unpack them. Some of the files have the same name as in the previous lab session (week 3) but with extended functionality. Please work with the new versions. For most of this session we keep working with the files `GenericLineSearchMethod.py` and `LineSearchPlot.py`, as in the previous lab session (week 3).

## 1. Conjugate Gradients

In the previous lab session we optimized various functions by line search methods based on coordinate descent and steepest descent. Let's see if using the conjugate gradient method improves things. First try the Fletcher-Reeves variant of the CG method with exact line searches on the `Ex21` problem. Go back to where the previous lab session ended and solve the `Ex21` problem with the conjugate gradient method.

### (a)

For this, set the function selector in `LineSearchPlot.py` to `Ex21` and the `direction` selector in `GenericLineSearchMethod.py` to `CG` (for _conjugate gradient_). Also choose the exact line search. Compare the path taken and the solution statistics with those of the steepest descent method on the same problem. Then change the line search to the backtracking Armijo line search and see what happens.

### (b) Polak-Ribière CG

CG with exact line searches worked reasonably well, but the inexact Armijo line search results in a strange path (and a higher number of function evaluations than for the steepest descent method). We said in the lecture that Polak-Ribière has been observed to give better performance than Fletcher-Reeves. Try that by choosing the PR method in `CGDirection.py` (change the selector to `method='PR'`). Run it first with the exact line search and then with the backtracking line search.
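For reference, the two variants differ only in the scalar $\beta_k$ that combines the new negative gradient with the previous direction. The sketch below writes out the standard textbook formulas; the function names are illustrative and are not taken from `CGDirection.py`, whose implementation may differ in detail.

```python
import numpy as np

def cg_beta(g_new, g_old, method="FR"):
    """Return the CG parameter beta from the new and old gradients."""
    if method == "FR":   # Fletcher-Reeves
        return (g_new @ g_new) / (g_old @ g_old)
    if method == "PR":   # Polak-Ribiere
        return (g_new @ (g_new - g_old)) / (g_old @ g_old)
    raise ValueError("method must be 'FR' or 'PR'")

def cg_direction(g_new, g_old, d_old, method="FR"):
    """New search direction: negative gradient plus beta times the old direction."""
    return -g_new + cg_beta(g_new, g_old, method) * d_old
```

When successive gradients are orthogonal, as they are for exact line searches on a quadratic, the two formulas give the same value; they differ once the line search is inexact or the function is not quadratic.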

The backtracking line search stops with an error message saying that the generated direction is not a descent direction. We have seen in the lectures that this can indeed happen. A simple way to overcome this is to use the steepest descent direction instead of the CG direction whenever the CG direction is not a descent direction. For this, change the selector to `fixPR='true'` in `CGDirection.py`. This checks whether the CG direction is a descent direction and, if it is not, returns the steepest descent direction instead. Try this.
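The safeguard amounts to a sign check on the directional derivative $g_k^\top d_k$. The following is a minimal sketch of that idea under an assumed function name; it is not the code behind the `fixPR` selector.

```python
import numpy as np

def safeguarded_direction(d_cg, g):
    """Return d_cg if it is a descent direction, else the steepest descent direction.

    A direction d is a descent direction at x when g^T d < 0,
    where g is the gradient at x.
    """
    if g @ d_cg < 0:
        return d_cg
    return -g
```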

NOTE: **the PR with safeguards actually converges to the other minimum!**

### (c) Rosenbrock

Also try using conjugate gradients on Rosenbrock's problem, which had defied us earlier in week 3 (for this, change the function selector in `LineSearchPlot.py`). Start with the combination CG-PR/Armijo that you used last, but it is interesting to try CG-FR and exact line searches as well.

## 2. Newton Method

### (a)

Change the file `GenericLineSearchMethod.py` so that it does a Newton step. Find the section that tests for `function == 'Newton'` and complete the calculation for $d_k$. You will find the numpy function `np.linalg.solve(A, b)` useful: it solves the system of equations $A{\bf x} = {\bf b}$. Note that you also need to change the line search strategy so that it takes full steps ($\alpha = 1$). You can do that by setting `linesearch='fullstep'`. Once you have done this, test Newton's method on the Rosenbrock problem.
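As a standalone illustration, independent of the lab files, here is a minimal sketch of a full-step Newton iteration. It assumes the standard two-dimensional Rosenbrock function $f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2$ and the classical starting point $(-1.2, 1)$; the function names are hypothetical, and the lab code may use a different start and structure.

```python
import numpy as np

def rosenbrock_grad(x):
    """Gradient of f(x) = 100*(x2 - x1^2)^2 + (1 - x1)^2."""
    return np.array([
        -400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
        200.0 * (x[1] - x[0] ** 2),
    ])

def rosenbrock_hess(x):
    """Hessian of the same function."""
    return np.array([
        [1200.0 * x[0] ** 2 - 400.0 * x[1] + 2.0, -400.0 * x[0]],
        [-400.0 * x[0], 200.0],
    ])

def newton_full_steps(x0, tol=1e-8, max_iter=50):
    """Newton's method with step length alpha = 1 in every iteration."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = rosenbrock_grad(x)
        if np.linalg.norm(g) < tol:
            return x, k
        d = -np.linalg.solve(rosenbrock_hess(x), g)  # solve H d = -g
        x = x + d                                    # full step
    return x, max_iter

x_final, iterations = newton_full_steps([-1.2, 1.0])
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse Hessian explicitly.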

### (b)

While Newton converges quickly, the path taken is very surprising. Can you explain what is happening here?

### (c)

Change the line search to the Armijo line search with $c_1 = 0.1$. Observe that this now takes quite a bit longer, but is more robust. This behaviour is typical of Newton-like methods: Newton often does better with a more lax line search that allows occasional ascent steps. It is difficult to know how much leeway to allow, though.
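For orientation, a backtracking Armijo line search can be sketched as below. The halving factor `rho`, the initial step `alpha0`, and the iteration cap are assumptions made for this example; the line search in `GenericLineSearchMethod.py` may use different values.

```python
import numpy as np

def armijo_backtracking(f, x, d, g, c1=0.1, rho=0.5, alpha0=1.0, max_steps=50):
    """Shrink alpha until f(x + alpha*d) <= f(x) + c1*alpha*g^T d."""
    alpha = alpha0
    fx = f(x)
    slope = g @ d  # directional derivative; negative for a descent direction
    for _ in range(max_steps):
        if f(x + alpha * d) <= fx + c1 * alpha * slope:
            return alpha
        alpha *= rho
    return alpha  # best effort if no acceptable step was found
```

A larger $c_1$ demands more decrease per step and so rejects more of the long Newton steps; a small $c_1$ accepts almost any step that reduces $f$.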

### (d)

As the next test, try Newton with full steps on problem `Ex13` from the starting point $x_0 = (0.5, 1)$. (This was the problem where you determined all the stationary points in the first lab session – we have also seen this example during the lecture on Newton methods.) You will see that Newton converges (quickly) to a saddle point.

### (e)

Change the line search to Armijo and again observe what happens.

### (f)

Finally you may want to try the same (Newton with full steps vs Armijo line search) from the starting point $x_0 = (0.8, 0)$: you just need to uncomment the appropriate line in `LineSearchPlot.py` under the `Ex13` section.

### (g)

Now try this (Newton for problem `Ex13`) from the starting point $x_0 = (-1.2, 0.5)$. You will see that Newton converges to the saddle point for both the full steps and the Armijo line search.

### (h)

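Then try to correct the inertia of the Hessian: calculate its eigenvalues and, if the smallest one is below a small positive threshold, add a multiple of the identity so that the matrix becomes positive definite. You can do this by opening `GenericLineSearchMethod.py` and inserting the code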

```python
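# Assumes LA is numpy.linalg (e.g. `from numpy import linalg as LA`)
# and that n is the problem dimension.
v, w = LA.eig(Hk)            # eigenvalues v and eigenvectors w of the Hessian
l_min = min(v)               # smallest eigenvalue
if l_min < 1e-4:             # Hessian is not safely positive definite
    tau = -l_min+1e-4        # shift that moves the smallest eigenvalue to 1e-4
    Hk = Hk + tau*np.eye(n)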
```

before calculating the Newton step with `dk = - np.linalg.solve(Hk, gk)`. Try the resulting algorithm. Also observe for how many steps the inertia correction is used.
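The effect of the shift can be checked in isolation. The sketch below wraps the same idea in a hypothetical helper and uses `np.linalg.eigvalsh`, which is intended for symmetric matrices and returns real eigenvalues; this is a substitution made for the example, not the code the lab asks you to insert.

```python
import numpy as np

def correct_inertia(H, delta=1e-4):
    """Shift H by tau*I so that its smallest eigenvalue is at least delta.

    Returns the (possibly shifted) matrix and the shift tau that was used.
    """
    l_min = np.linalg.eigvalsh(H).min()
    if l_min < delta:
        tau = delta - l_min
        return H + tau * np.eye(H.shape[0]), tau
    return H, 0.0

# An indefinite Hessian, as found near a saddle point.
H = np.array([[2.0, 0.0], [0.0, -1.0]])
g = np.array([1.0, 1.0])
H_corrected, tau = correct_inertia(H)
d = -np.linalg.solve(H_corrected, g)  # descent direction, since H_corrected is positive definite
```

Adding $\tau I$ moves every eigenvalue up by $\tau$ and leaves the eigenvectors unchanged, so the corrected matrix is positive definite and the resulting direction satisfies $g^\top d < 0$.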

## 3. Quasi-Newton Methods

Let's experiment with Quasi-Newton methods. We would like to use test problems with more than two variables. For this we can use the `chebyquad` problem that was presented in the lectures. Since we cannot easily plot the contours and the path taken by the method for higher-dimensional problems, we will use a different driver function, namely `CallLineSearch.py` instead of `LineSearchPlot.py`. Have a look at its code: it is very basic and simply calls `GenericLineSearchMethod.py` for the `chebyquad` problem from a given starting point. There are a few starting points given in different dimensions. The default one is for two dimensions.

Note: _The chebyquad problem finds the optimal evaluation points for a quadrature rule with equal weights on the interval $[0, 1]$._

### (a)

Let's start with the SR1 Quasi-Newton update rule. For this, change the direction setting in `GenericLineSearchMethod.py` to `direction ='QN'`. This calls the function `QNDirection.py` to calculate the search direction using a quasi-Newton method. Again you may want to inspect this file: it allows you to choose the update rule at the top of the file; SR1 is the default.
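As a reminder of what the update does, here is a minimal sketch of the SR1 formula. It assumes the convention that $H_k$ approximates the inverse Hessian (as the wording of the assessed task suggests), with $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$. The function name and the skip threshold `r` are assumptions for this example; `QNDirection.py` may be organised differently.

```python
import numpy as np

def sr1_update(H, s, y, r=1e-8):
    """Symmetric rank-one update of an inverse-Hessian approximation H."""
    v = s - H @ y
    denom = v @ y
    # Common safeguard: skip the update when the denominator is (nearly) zero.
    if abs(denom) <= r * np.linalg.norm(v) * np.linalg.norm(y):
        return H
    return H + np.outer(v, v) / denom
```

The updated matrix satisfies the secant equation $H_{k+1} y_k = s_k$, but nothing forces the denominator to be positive, so the result need not be positive definite.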

### (b)

Solve the chebyquad problem from the 2-dim starting point by the SR1 update rule (with Armijo line searches) and observe what happens.

### (c)

For comparison try the Conjugate Gradient method (Fletcher–Reeves or Polak–Ribière) on the same problem (for this simply change the direction setting in `GenericLineSearchMethod.py` to `direction ='CG'`).

This is interesting since the CG and QN methods have similar theoretical properties (n-step quadratic termination, superlinear convergence, only need gradients but no Hessians). On this example even the naive SR1-QN update performs better than conjugate gradients.
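The n-step quadratic termination property can be observed directly on a small strictly convex quadratic $f(x) = \tfrac12 x^\top A x - b^\top x$. The sketch below is a textbook linear CG iteration with exact line searches; the test matrix is made up for the example and has nothing to do with `chebyquad`.

```python
import numpy as np

def cg_quadratic(A, b, x0, tol=1e-10):
    """Minimise 0.5*x^T A x - b^T x by CG with exact line searches.

    Returns the final iterate and the number of iterations used,
    which is capped at n = len(b).
    """
    x = np.asarray(x0, dtype=float).copy()
    g = A @ x - b                          # gradient of the quadratic
    d = -g
    iterations = 0
    while np.linalg.norm(g) > tol and iterations < len(b):
        Ad = A @ d
        alpha = (g @ g) / (d @ Ad)         # exact minimiser along d
        x += alpha * d
        g_new = g + alpha * Ad             # gradient at the new point
        beta = (g_new @ g_new) / (g @ g)   # Fletcher-Reeves
        d = -g_new + beta * d
        g = g_new
        iterations += 1
    return x, iterations

# Symmetric positive definite test matrix in n = 5 dimensions.
A = np.diag([1.0, 2.0, 3.0, 4.0, 5.0]) + 0.1 * np.ones((5, 5))
b = np.ones(5)
x_cg, n_iter = cg_quadratic(A, b, np.zeros(5))
```

In exact arithmetic the gradient vanishes after at most $n$ iterations; in floating point it is reduced to rounding-error level for a well-conditioned matrix like this one.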

### (d)

Let's attempt a larger problem by changing the starting point in `CallLineSearch.py` to the 5-dimensional one. You should see that SR1 is not able to generate descent directions at some point.

### (e)

Change the update rule to the DFP rule and then the BFGS rule and observe what happens.
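For comparison with SR1, the two rank-two updates can be sketched as follows, again assuming that $H_k$ approximates the inverse Hessian, with $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$. The function names are illustrative and are not taken from `QNDirection.py`.

```python
import numpy as np

def bfgs_update(H, s, y):
    """BFGS update of an inverse-Hessian approximation H."""
    rho = 1.0 / (y @ s)
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

def dfp_update(H, s, y):
    """DFP update of an inverse-Hessian approximation H (H symmetric)."""
    Hy = H @ y
    return H - np.outer(Hy, Hy) / (y @ Hy) + np.outer(s, s) / (y @ s)
```

Both updates satisfy the secant equation $H_{k+1} y_k = s_k$ and, unlike SR1, keep $H_{k+1}$ positive definite whenever $H_k$ is positive definite and $y_k^\top s_k > 0$, so the direction $-H_{k+1} g_{k+1}$ is a descent direction.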

### (f)

Do the same for the final starting point (9-dimensional).

### (g)

You may want to compare with Newton.

## 4. Assessed Task

### (a)

> For Task 3(e) above (BFGS for the 5-dimensional chebyquad problem), state the final approximation matrix $H_k$ and compare how well the inverse of $H_k$ approximates the Hessian at the solution.
> 
> Note: _It would be sensible to compare eigenvectors and eigenvalues. For the eigenvectors, you may want to state the angle between corresponding pairs, using $\cos\angle(a, b) = \frac{a^{\top} b}{\|a\|_{2}\,\|b\|_{2}}$. Note also that the `np.linalg.eig` function returns eigenvectors of length 1._
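
One possible way to organise such a comparison is sketched below. It assumes both matrices are symmetric, uses `np.linalg.eigh` (which returns eigenvalues in ascending order), and pairs eigenvectors by that order, which is only meaningful when the eigenvalues are well separated. The helper names are hypothetical; you would pass in something like `np.linalg.inv(Hk)` and the true Hessian at the solution.

```python
import numpy as np

def angle_between(a, b):
    """Angle in degrees between the lines spanned by a and b.

    The absolute value makes the result independent of the arbitrary
    sign of an eigenvector.
    """
    cos = abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(cos, 0.0, 1.0)))

def compare_spectra(A, B):
    """Eigenvalues of A and B, and angles between paired eigenvectors."""
    vals_a, vecs_a = np.linalg.eigh(A)  # columns of vecs_a are eigenvectors
    vals_b, vecs_b = np.linalg.eigh(B)
    angles = np.array([
        angle_between(vecs_a[:, i], vecs_b[:, i]) for i in range(len(vals_a))
    ])
    return vals_a, vals_b, angles
```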

Please submit a document (or scan of a handwritten solution) that gives your answer to 4(a) and (b). You can include a piece of Python code and its output to show what you did for part (a). You can upload this as a separate file or copy/paste it into a text/word/pdf file. Please do not submit Python code as a Jupyter notebook (ipynb) or in zip files, since these don't display inline in Learn when we are marking the assignment. Plain Python (`*.py`) is fine.

Submission is on the NLO Learn pages (under Assessments) by **Friday 1 March, 10am**.