* Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All). Or, alternatively, **Restart & Run All**.

* Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE".

* You can always add additional cells to the notebook to experiment, to test your answers, or to provide additional support for your answers.

* You should not need to install new packages to complete an assignment. If you use any packages not available via the MATH405 `Project.toml` then your assignment will likely not be graded correctly.

* Submissions are only accepted via CANVAS!

* Late submissions: within 24h I will reduce the grade to 70%. I will not accept submissions after 24h. Please manage your time well and complete the assignments with plenty of buffer time.

* By entering your name below you confirm that you have completed this assignment on your own and without (direct) help from your colleagues. Plagiarism / copying will be checked by comparing assignments and by testing understanding in workshops and the oral exam (final). I reserve the option to downgrade an assignment at any point.

In [None]:
NAME = ""

---

# MATH 405/607 

# Numerical Methods for Differential Equations

## Assignment 2: Nonlinear systems, interpolation, quadrature


#### Notes

* I will start to be rigorous about following instructions precisely to enable the autograder to correctly read your answers. Please watch out for those instructions. If a solution is correct but is not stored in the correct variables I will only give partial points. 
* **Due date:** Wed 14 October 2020, 1200 noon
* 90 points (out of 125) will count for 100%

In [None]:
include("math405.jl")

### Question 1 [5+5]

Skim the `README` files of [`Roots.jl`](https://github.com/JuliaMath/Roots.jl) and of [NLsolve.jl](https://github.com/JuliaNLSolvers/NLsolve.jl) to understand what these packages are written for. Then use appropriate functions provided by these packages to solve the following nonlinear equations: 

(a) Find all solutions $x \in [0.1, 4.1\pi]$ of 
$$
    x^{-2} = \sin(x)
$$
Assign these to a vector as follows: `Xa = [ x1, x2, ... ]`

(b) Find the unique solution of the system  
$$\begin{aligned}
    & f(x) := \nabla \varphi(x) = 0,  \qquad \text{where} \\
    & \varphi(x) = x_1^6 + \sum_{j = 2}^{10} (x_j - x_{j-1})^6 + x_{10}^6 -0.01 \sum_{j = 1}^{10} x_j,
\end{aligned}$$
and store it in the variable `xb`.

HINT: you need not derive and implement the gradient, but could use an AD package (e.g. `ForwardDiff.jl`) to obtain it from $\varphi$, e.g., 
```julia 
fb(x) = ForwardDiff.gradient(phi, x)
```
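
To illustrate what an AD package computes, here is a toy forward-mode sketch based on dual numbers. It is only an illustration under simplifying assumptions (scalar `Float64` values, only `+`, `-`, `*`), not how `ForwardDiff.jl` is actually implemented; the names `Dual`, `toy_gradient` and `phi_toy` are made up for this example, and `phi_toy` is unrelated to the $\varphi$ of this question.

```julia
# Toy forward-mode AD with dual numbers (illustration only).
struct Dual
    val::Float64   # function value
    der::Float64   # derivative w.r.t. the seeded variable
end

# arithmetic rules: sum, difference and product rule
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:-(a::Dual, b::Dual) = Dual(a.val - b.val, a.der - b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)

# gradient obtained by seeding one coordinate direction at a time
function toy_gradient(phi, x::Vector{Float64})
    n = length(x)
    return [phi([Dual(x[j], j == i ? 1.0 : 0.0) for j in 1:n]).der for i in 1:n]
end

# hypothetical test function: x1^2 + x1*x2 - x2^3
phi_toy(x) = (x[1] * x[1] + x[1] * x[2]) - (x[2] * x[2]) * x[2]

# analytic gradient at (1, 2) is (2*1 + 2, 1 - 3*2^2) = (4, -11)
g_toy = toy_gradient(phi_toy, [1.0, 2.0])
```

With `ForwardDiff.jl` loaded, `ForwardDiff.gradient(phi_toy, [1.0, 2.0])` should give the same values without any of this hand-written machinery.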

In [None]:
# Somewhere in this code you should have the lines 
#  Xa = [ ... ]
#  xb =  ...

# YOUR CODE HERE

### Question 2 [10+5+10]

Newton's method is chaotic (cf. fractals) and in general converges only locally, or at least its global behaviour is unpredictable for most practical purposes. Most implementations of Newton's method therefore include some "globalisation" strategy, i.e. they incorporate ideas (heuristic and rigorous) to improve the global convergence properties, or at least to make the behaviour more predictable. In this question we will explore one strategy of this kind. 
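
A small illustration of the lack of global convergence, on a textbook scalar example that is not part of the assignment: for $f(x) = x^3 - 2x + 2$ the plain Newton iteration started at $x_0 = 0$ cycles between $0$ and $1$ forever, while the start $x_0 = -2$ converges quickly. The names `newton_iterates`, `f_cycle` and `df_cycle` are made up for this sketch.

```julia
# Plain scalar Newton iteration, recording all iterates.
function newton_iterates(f, df, x0, nsteps)
    xs = [x0]
    for _ in 1:nsteps
        x = xs[end]
        push!(xs, x - f(x) / df(x))
    end
    return xs
end

f_cycle(x)  = x^3 - 2x + 2
df_cycle(x) = 3x^2 - 2

# start at 0: the iterates alternate 0, 1, 0, 1, ...
xs_cycle = newton_iterates(f_cycle, df_cycle, 0.0, 6)

# start at -2: the iterates converge to the real root near -1.769
xs_conv = newton_iterates(f_cycle, df_cycle, -2.0, 6)
```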

Suppose your current iterate is $x_n$ then the next iterate would be $x_{n+1} = x_n - \partial f(x_n)^{-1} f(x_n)$. But instead, let us define this increment as a *search vector*. That is, we define 
$$
    p_n := - \partial f(x_n)^{-1} f(x_n)
$$
and look for updates of the form 
$$
    x_{n+1} = x_n + \alpha_n p_n
$$
where $\alpha_n \in (0, 1]$.

(a+) Compute $\frac{d}{d\alpha} |f(x_n + \alpha p_n)|^2$ at $\alpha = 0$, then deduce that, for sufficiently small $\alpha_n$, the update $x_{n+1}$ satisfies
$$
  \text{(DEC)} \qquad   |f(x_{n+1})| \leq (1 - \alpha_n/2) |f(x_n)|
$$
You may use any regularity on $f$ that you need.

Remarks:
* In your proof you should find that the factor $1/2$ in $(1-\alpha/2)$ is somewhat arbitrary. But it is a sensible choice that seems to work quite well in the following tests.
* You can proceed to part (b) without answering this question.

YOUR ANSWER HERE

(b) To achieve the (DEC) condition, we can use a backtracking algorithm. At each iterate $x_n$ our first guess should be $\alpha = 1$ to obtain quadratic convergence in the limit. If this fails the (DEC) condition, then we halve $\alpha$ until it is satisfied.
``` 
WHILE ||f(x + alpha p)|| > (1-alpha/2) * ||f(x)||
    alpha <- alpha / 2
```
* Implement this backtracking condition into a Newton iteration. 
* In light of part (a) of this question, terminate the backtracking with `return nothing` when $\alpha < 10^{-8}$.

In the code-cell below most of the two Newton methods have already been written. Only edit the part of the code that is indicated to incorporate the backtracking loop. Just translate the pseudocode into valid Julia code.

In [None]:

function newton(f, x0, tol, df = x -> ForwardDiff.jacobian(f, x); maxiter = 10)
    x = x0 
    it = 0
    while norm(f(x)) > tol 
        x -= df(x) \ f(x)
        it += 1; 
        if (it > maxiter) || any(isnan, x) || any(isinf, x)
            return nothing
        end
    end 
    return x, it 
end

function damped_newton(f, x0, tol, df = x -> ForwardDiff.jacobian(f, x); maxiter = 100)
    x = x0
    it = 0

    while norm(f(x), Inf) > tol 
        p = - (df(x) \ f(x))
        
        # --------- Backtracking loop 
        # YOUR CODE HERE
        # ---------------------------
        
        x += α * p
        it += 1

        # for debugging!
        # @show α, norm(f(x), Inf)    

        if it > maxiter; return nothing; end 
    end
    return x, it 
end

In [None]:
# you may edit this cell to experiment with your code ...

println("""
 This little test shows how Newton's method can converge 
 unpredictably, to vastly different solutions,
 while the damped Newton method usually converges predictably.
    
 Check that the damped Newton method always converges to the 
 root near 6.38.
""")

# wrapping a scalar problem into a vectorial one (with 1 dimension of course...)
fbes = x -> [besselj(3,x[1])]

println("Newton - Start guess 4.85:")
xn1, itn1 = newton(fbes, [4.85], 1e-10)
println("   x = $(xn1), #it = $(itn1)")

println("Newton - Start guess 4.84:")
xn2, itn2 = newton(fbes, [4.84], 1e-10)
println("   x = $(xn2), #it = $(itn2)")

println("Damped Newton - Start guess 4.85:")
xd1, itd1 = damped_newton(fbes, [4.85], 1e-10)
println("   x = $(xd1), #it = $(itd1)")

println("Damped Newton - Start guess 4.84:")
xd2, itd2 = damped_newton(fbes, [4.84], 1e-10)
println("   x = $(xd2), #it = $(itd2)")

plot(x->besselj(3,x), 0, 20, lw=3, label="", grid=:xy, size=(500,150), legend = :outertopright)
scatter!([4.84], fbes(4.84), label = "Starting guess")
scatter!(xn2, [0.0], label = "Newton")
scatter!(xd2, [0.0], label = "Damped Newton")

(c) Use both the Newton and damped Newton algorithms to solve the problem from Question 1(b), with starting guess `x0[i] = c * i * (11-i)` where `c in [0.01, 0.1]`, and briefly comment on your observations. Use comments such as:
```julia 
# we observe that `newton` ... while `damped_newton` ... 
```

In [None]:
# YOUR CODE HERE

### Question 3 - Barycentric Interpolation [10 + 5 + 10] 

(a) *First barycentric formula:* Let $x_0 < \dots < x_N$ then show that the interpolating polynomial $p_N$ to a function $f$ at the nodes $x_n$ is given by 
$$\begin{aligned}
        p_N(x) &= L(x) \sum_{n = 0}^N f(x_n) \frac{w_n}{x - x_n},  \\
        w_n &= \frac{1}{\prod_{j \neq n} (x_n - x_j)}
\end{aligned}$$
where $L(x) = \prod_{n = 0}^N (x - x_n)$ and we must assume that $x \neq x_n$ for all $n$.

YOUR ANSWER HERE

(b) *Second barycentric formula:* Proceeding from part (a), write $1$ in a "clever way" and divide by it, and hence derive 
$$
    p_N(x) = \frac{ \sum_{n = 0}^N f(x_n) \frac{w_n}{x - x_n} }{
                   \sum_{n = 0}^N \frac{w_n}{x - x_n} }
$$

YOUR ANSWER HERE

The two barycentric formulas appear to be numerically unstable, due to division by small numbers $x - x_n$ if $x$ is near a node. But in fact it was proven by [Higham (2004)](https://doi.org/10.1093/imanum/24.4.547) that they are both stable provided $x \neq x_n$ of course. This is one of the things that we will test in the following. But it is actually fairly elementary to prove (at least for the first barycentric formula!) and a very nice illustration of the standard model of floating point arithmetic!
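
The standard model of floating point arithmetic says that each basic operation returns the exact result up to a relative perturbation, $\mathrm{fl}(a \circ b) = (a \circ b)(1 + \delta)$ with $|\delta| \leq u$, where $u$ is the unit roundoff. The following check, with made-up values `x` and `xn`, suggests why dividing by a small $x - x_n$ is not harmful by itself: the quotient is huge, but its relative error stays at the level of machine precision. It is a sanity check only, not a proof of the stability result.

```julia
x  = 0.3
xn = 0.3 + 1e-13          # a hypothetical node very close to x
d  = x - xn               # tiny difference
q  = 1.0 / d              # huge quotient, as in w_n / (x - x_n)

# reference value computed in extended precision (BigFloat)
q_ref  = 1 / (big(x) - big(xn))
relerr = Float64(abs(big(q) - q_ref) / abs(q_ref))

println("|q| = ", abs(q), ",  relative error = ", relerr)
```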

(c+) Implement the second barycentric formula as follows: 
- Given interpolation points `X::AbstractVector`, implement a function `function baryweights(X)` which returns a vector of the weights $w = (w_n)_{n=0}^N$ as a `Vector{Float64}`.
- Write a second function `function baryeval(x, F, X, W)` where `x` is the argument, `X` the vector of interpolation nodes, `F` the vector of function values and `W` the vector of barycentric weights.
- Make sure you watch out for the special case $x = x_n$. 

Note that you are only asked to produce a "naive" implementation of the barycentric formula. A numerically stable implementation that is robust for very large polynomial degrees is beyond the scope of this exercise.

In [None]:
# YOUR CODE HERE

In [None]:
# You can use these tests to check the correctness of your implementation
# The autograder tests will be a bit more rigorous though. If the graph 
# reaches close to machine precision and the slope matches the "rate" then
# your implementation is probably correct.

# you can edit this test if you wish, e.g. experiment with the `c` value

# The Witch of Agnesi!
c = 1.0
f = x -> 1 / (1  + c^2 * x^2)

# The convergence rate is ρ^{-N} where 
# 0.5 * (ρ - 1/ρ) = 1/c ⇔ ρ^2 - 2ρ/c - 1 = 0 ⇔ ρ = 1/c + sqrt(1/c^2+1)
ρ =  1/c + sqrt(1/c^2 + 1)
# To understand the origin of this calculation look up the concept of the Bernstein ellipse.
# We may be able to cover this at the end of this course.      

xs = range(-1, 1, length=1_000)
fs = f.(xs)
NN = 5:5:40
err = []
for N in NN 
    X = cos.(range(0, pi, length=N))
    F = f.(X)
    W = baryweights(X)
    ps = baryeval.(xs, Ref(F), Ref(X), Ref(W))
    push!(err, norm(fs-ps, Inf))    
end
plot(NN, err, lw=3, m=:o, ms=6, label = "error", size = (400, 200), yaxis = :log10)
plot!(NN[3:end], 30*ρ.^(-NN[3:end]), lw=2, c=:black, ls=:dash, label = "predicted rate")

### Question 4 - Implement a Special Function [15]

Implement a routine 
```julia 
function mysin(x::Float64)
    # ... your code
end
```
which takes a real floating point number `x::Float64` as input and returns the value of $\sin(x)$ to within 7 digits absolute accuracy. Your function may use the operations `+, -, *, /` but may not use any special functions already implemented in Julia (such as `sqrt, exp, sin, cos, ...`). Comment the code, explaining briefly how you constructed the approximation. If you call any external function then please convince yourself that it uses only the basic arithmetic operations.

50% of the score will be for correctness, and 50% for evaluation speed. Full points for 7 digits target accuracy and evaluation time less than twice the evaluation time of the built-in `sin` function. Reduced accuracy or evaluation efficiency will lead to partial points.  If you don't score full points, then I will give up to 5 bonus points for elegance and insightful comments on your construction.

With the restriction on the arithmetic operations that I allow, you can only represent sin in terms of polynomials or rational functions. Use any method you like that we have covered in lectures, workshops, or in this assignment ... it is part of the question for you to figure out what you would like to use to construct your approximant.
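
Whatever polynomial you end up with, it can be evaluated with additions and multiplications only via Horner's rule. The sketch below uses a hypothetical helper `horner` and, purely as an example, the degree-5 Taylor polynomial of `exp` (deliberately not `sin`); how to choose good coefficients is left to you.

```julia
# Horner's rule: evaluate c[1] + c[2]*x + ... + c[end]*x^(length(c)-1)
# using only + and *.
function horner(c, x)
    p = c[end]
    for k in length(c)-1:-1:1
        p = p * x + c[k]
    end
    return p
end

# example coefficients: degree-5 Taylor polynomial of exp at 0
c_exp   = (1.0, 1.0, 1/2, 1/6, 1/24, 1/120)
x_test  = 0.1
err_exp = abs(horner(c_exp, x_test) - exp(x_test))   # roughly x_test^6 / 720
```

Recent Julia versions also provide `evalpoly(x, c)` in `Base`, which should give the same value as `horner(c, x)` here.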

In [None]:
# use this cell to experiment, e.g. determine the parameters for the 
# approximant. But implement your solution in the cell below!


In [None]:

function mysin(x::Float64)
    # YOUR CODE HERE
end



In [None]:
# BEGIN TEST

println("Correctness")
Random.seed!(23456)
X = range(1e-7, pi/2-1e-7, length=1_000_000)
X += 6e-8 * (rand(1_000_000) .- 0.5)
err = norm(mysin.(X) - sin.(X), Inf)
println("error = $err (should be < 1e-7)")
try println(@test err < 1e-7); catch; end 

println("Timing:")
x = 0.5 + 0.5 * rand()
tsin = @belapsed sin($x)
tmysin = @belapsed mysin($x)
println("Evaluation time ratio = ", tmysin / tsin, "; (should be < 2)")
try println(@test tmysin <= 2 * tsin); catch; end 

# END TEST

### Question 5: Simpson Rule [5+5+5+10]

Consider the quadrature rule  (Simpson rule) 
$$ 
    \int_{-h/2}^{h/2} f(x) \,dx \approx \frac{h}{6} \big( f(-h/2) + 4 f(0) + f(h/2) \big)
$$

(a) We said that most quadrature rules can be interpreted as integrating a polynomial interpolant of the integrand. Which polynomial interpolant does the Simpson rule correspond to? (state without proof)

(b) Derive an error bound, assuming that $f \in C^4([-h/2, h/2])$.  
[Full points for a short proof that gets within a factor 2 of the sharp bound; partial points for a proof that gets the correct order.]

(c) State (without proof) the corresponding estimate for the composite Simpson rule with mesh-size $h$.

YOUR ANSWER HERE

(d) Using the integral $\int_0^\pi \sin(x) \,dx = 2$ as an example, confirm the convergence rate predicted in (c), numerically. Produce a figure that compares the numerical convergence rate against the predicted rate. 

Ideally, you should first implement a function, e.g., `function simpson(f, a, b, N)` which implements the Simpson rule. Then implement a short test (see lectures for inspiration!) which checks that the output of your function converges with the predicted rate to the analytic value.
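
As a template for such a convergence test, here is a sketch for the composite trapezoidal rule (deliberately not the Simpson rule), applied to the hypothetical example $\int_0^1 e^x \,dx = e - 1$. The observed order is estimated from the errors at successive refinements and should be close to 2 for this rule; the names `trapezoid`, `errs_trap` and `orders_trap` are made up for this sketch.

```julia
# Composite trapezoidal rule with N subintervals of width h = (b - a) / N.
function trapezoid(f, a, b, N)
    h = (b - a) / N
    return h * (0.5 * f(a) + sum(f(a + n * h) for n in 1:N-1) + 0.5 * f(b))
end

I_exact   = exp(1.0) - 1.0
NN_trap   = [8, 16, 32, 64, 128]
errs_trap = [abs(trapezoid(exp, 0.0, 1.0, N) - I_exact) for N in NN_trap]

# observed order between successive refinements: log2(err(h) / err(h/2))
orders_trap = [log2(errs_trap[i] / errs_trap[i+1]) for i in 1:length(errs_trap)-1]
println("observed orders: ", orders_trap)
```

In the notebook one would typically also plot `errs_trap` against `NN_trap` on log-log axes together with a reference line for the predicted rate.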

In [None]:
# solution part (d) 

# YOUR CODE HERE

### Question 6 - Integrate Some Functions [5+5+5+5]

Integrate each of the following functions numerically. For each function you should 
* use an integrator that you implemented yourself (Simpson?)
* use the `Cubature.jl` package. 

Store the solutions in the variables `I_x` (your own method) and `Icub_x` (the `Cubature.jl` solution) where `x` is `a`, `b`, ...; e.g. 
```julia
f_a = x -> exp(-x^2)
I_a = my_method(f_a, ...) 
Icub_a, err_a = hquadrature(f_a, ...)
```

(a) $f_a(x) = e^{-x^2}, x \in [0, 1]$

(b) $f_b(x) = x \log(x), x \in [0, 1]$ 

(c) $f_c(x) = \sqrt{x} \log(x), x \in [0, 1]$ 

(d) $f_d(x) = \sqrt{x} \exp(- 0.1 x), x \in [1, \infty)$

I encourage you to use brute-force rather than get too clever about manually resolving the singularities. Let the computer do the work for you.
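
One possible reading of "brute force", sketched on an integral that is not among the assigned ones: truncate the infinite interval where the tail is negligible and use many quadrature points. The composite midpoint rule used here (hypothetical helper `midpoint`) never evaluates the integrand at the endpoints, which can also be convenient when the integrand is undefined there. The cut-off `L_cut` and the number of points are ad hoc choices.

```julia
# Composite midpoint rule: f is only evaluated at interior points.
function midpoint(f, a, b, N)
    h = (b - a) / N
    return h * sum(f(a + (n - 0.5) * h) for n in 1:N)
end

# example: ∫_0^∞ exp(-x) dx = 1, truncated to [0, L_cut];
# the neglected tail is exp(-L_cut), far below double precision
L_cut = 40.0
I_mid = midpoint(x -> exp(-x), 0.0, L_cut, 400_000)
println("error = ", abs(I_mid - 1.0))
```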

In [None]:
using Cubature 

# YOUR CODE HERE