# MA3J8 Approximation Theory and Applications 

## 03 - Algebraic Polynomials




In [None]:
using SoftGlobalScope, LinearAlgebra, LaTeXStrings, Plots
gr();

### 03-1 - Runge's Phenomenon

We consider the function $f : [-1, 1] \to \mathbb{R}$, 
$$
   f(x) = \frac{1}{1 + 25 x^2}
$$
Note that $f$ is analytic on $[-1,1]$, hence from our work on trigonometric approximation we expect excellent approximation properties. We choose a uniform grid, 
$$
  x_j = -1 + 2j/N, \qquad j = 0, \dots, N
$$
and interpolate $f$ at those grid points. 

In [None]:
using Polynomials
f(x) = 1 / (1 + 25 * x^2)
N = 10
X = range(-1, stop=1, length=N+1)   # N+1 equispaced nodes x_0, …, x_N
p = polyfit(X, f.(X))
xp = range(-1, stop=1, length=200)
plot(xp, f.(xp), lw=2, label = "f")
plot!(xp, p.(xp), lw=2, label = "p$N")
plot!(X, f.(X), lw=0, m=:o, ms=6, c=2, label = "")

This does not look great. Maybe we just aren't using enough points?

In [None]:
xp = range(-1, stop=1, length=400)
P = plot(xp, f.(xp), lw=2, label = "f")
for N in [10, 20, 30]
    X = range(-1, stop=1, length=N+1)   # N+1 equispaced nodes x_0, …, x_N
    p = polyfit(X, f.(X))
    plot!(P, xp, abs.(p.(xp)), lw=2, label = "p$N", yaxis = (:log, [1e-2, 1e3]))
end 
P

Clearly, the approximations **diverge**. This is called the Runge phenomenon. It is by no means an indicator that polynomials are poor basis functions for approximation. For example, let us use a least-squares fit w.r.t. exact function values on a fine grid. 

In [None]:
xp = range(-1, stop=1, length=400)
P = plot(xp, f.(xp), lw=2, label = "f")
err = []
NN = [10, 20, 30, 40]
for N in NN
    p = polyfit(xp, f.(xp), N)   # degree-N least-squares fit on the fine grid xp
    plot!(P, xp, p.(xp), lw=2, label = "p$N")
    push!(err, norm(f.(xp) - p.(xp), Inf))
end 
plot(P, plot(NN, err, lw=2, m=:o, yaxis = (:log,), label = ""), layout = (1,2))

We have recovered what looks like exponential convergence! Clearly there is something we need to understand.

### 03-2 Interpolation on Chebyshev Points

In the lecture notes we have motivated the Chebyshev interpolation nodes 
$$
  x_j = \cos(\pi j/ N), \qquad j = 0, \dots, N.
$$
We can now check whether they fix the problem we had with equispaced nodes.
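
To see how these nodes differ from the equispaced grid, the following sketch prints the smallest and largest gap between neighbouring Chebyshev nodes. It is only an illustration: the helper `cheb_gaps` is introduced here and is not part of the notebook.

```julia
# Hypothetical helper (not part of the notebook): gaps between
# neighbouring Chebyshev nodes x_j = cos(π j / N), j = 0, …, N.
function cheb_gaps(N)
    X = [cos(j * π / N) for j in N:-1:0]   # nodes in increasing order
    return diff(X)
end

for N in (10, 20, 40)
    g = cheb_gaps(N)
    println("N = $N: min gap = $(minimum(g)), max gap = $(maximum(g)), ",
            "equispaced gap = $(2 / N)")
end
```

The smallest gaps occur next to $\pm 1$ and shrink like $N^{-2}$, while the largest gaps sit near the origin and shrink only like $N^{-1}$: the nodes cluster at the end-points of the interval.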

In [None]:
chebnodes(N) = [ cos(j*π/N) for j = N:-1:0 ]

In [None]:
xp = range(-1, stop=1, length=400)
P = plot(xp, f.(xp), lw=2, label = "f")
NN = [10, 20, 30, 40]
errcheb = []
errfit = [] 
for N in NN
    X = chebnodes(N)
    pcheb = polyfit(X, f.(X))
    plot!(P, xp, pcheb.(xp), lw=2, label = "p$N")
    pfit = polyfit(xp, f.(xp), N)
    push!(errcheb, norm(f.(xp) - pcheb.(xp), Inf))
    push!(errfit, norm(f.(xp) - pfit.(xp), Inf))    
end 
plot(P, 
     plot(NN, [errfit, errcheb], lw=2, m=:o, 
          label = ["fit", "cheb"], yaxis = (:log,)), 
     layout = (1,2))

This is excellent news. We will start from here and explore this in a lot more detail.

Next, we observe another problem: evaluating the Chebyshev interpolant is numerically unstable! (At least in the way it is implemented in the `Polynomials.jl` package; we will return to this later.)
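
One plausible source of such an instability is the monomial basis: computing the coefficients of $1, x, \dots, x^N$ from nodal values amounts to solving a Vandermonde system, which becomes very ill-conditioned as $N$ grows. The sketch below assumes, without having checked the package source, that this is what limits the fit; `vandermonde` is a helper introduced only for this illustration.

```julia
using LinearAlgebra

# Hypothetical helper: Vandermonde matrix V[i, k+1] = X[i]^k, k = 0, …, N
vandermonde(X) = [x^k for x in X, k in 0:length(X)-1]

for N in (10, 20, 40)
    X = [cos(j * π / N) for j in N:-1:0]     # Chebyshev nodes
    println("N = $N: cond(V) = $(cond(vandermonde(X)))")
end
```

Once the condition number approaches the reciprocal of machine epsilon, the computed monomial coefficients can no longer be trusted, even though the interpolation problem itself is perfectly well-behaved on these nodes.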

In [None]:
NN = 10:4:80
errcheb = []
for N in NN
    X = chebnodes(N)
    pcheb = polyfit(X, f.(X))
    push!(errcheb, norm(f.(xp) - pcheb.(xp), Inf))
end 
plot(NN, errcheb, lw=2, m=:o,  label = "", yaxis = (:log,))

Nevertheless, we can still explore Chebyshev interpolation on some examples, as long as we remain aware of the limitation due to numerical instability. The following results look promising, but the numerical instability is clearly a severe limitation for us. We will therefore explore the errors in a little more detail after implementing the barycentric formula.

In [None]:
f1(x) = 1 / (1+x^2)
f2(x) = 1 / (1+25*x^2)
f3(x) = sin(3*x)
f4(x) = abs(sin(3*x))^3
f5(x) = abs(x)
f6(x) = sign(x) * abs(x)^(3/2)
f7(x) = exp(-x^2)

fall = [f1, f2, f3, f4, f5, f6, f7]
nalg = [4,5,6]
falg = fall[nalg]
nana = [1, 2, 3, 7]
fana = fall[nana]

;

In [None]:
# Chebyshev Interpolation error for functions with exponential rates 
# ------------------------------------------------------------------
# For these functions, the convergence is fast, so we don't hit 
# the numerical instability, at least for the fast converging ones
# ------------------------------------------------------------------
NN = 2:6:100
xerr = range(-1, stop=1, length=1_000)
P = plot(xaxis = (L"N",), 
         yaxis = (:log, L"\| f - I_N f\|_{L^\infty}"), 
         legend=:right)
for (f, n) in zip(fana, nana)
    err = []
    for N in NN 
        X = chebnodes(N)
        p = polyfit(X, f.(X))
        push!(err, norm(f.(xerr) - p.(xerr), Inf))
    end
    plot!(P, NN, err, lw=2, m=:o, ms=6, label = "f$n")
end 
P

In [None]:
# Chebyshev Interpolation error for functions with algebraic rates 
# ---------------------------------------------------------------
# But for the more slowly converging tests, we get into the 
# regime of numerical instability very quickly
# ---------------------------------------------------------------

NN = (2).^(2:8)
xerr = range(-1, stop=1, length=1_000)
P = plot(xaxis = (:log, L"N"), 
         yaxis = (:log, L"\| f - I_N f\|_{L^\infty}"), 
         legend=:top)
for (f, n) in zip(falg, nalg)
    err = []
    for N in NN 
        X = chebnodes(N)
        p = polyfit(X, f.(X))
        push!(err, norm(f.(xerr) - p.(xerr), Inf))
    end
    plot!(P, NN, err, lw=2, m=:o, ms=6, label = "f$n")
end 
P

## 03-3 Barycentric Formula

In Sec. 4.4 we derived the barycentric interpolation formula and showed that one of its variants is numerically stable. As a matter of fact, both variants are stable, but for the one we use here this is a little more involved to prove. Here, we implement the specific formula for Chebyshev points.
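
For reference, the formula implemented below can be written as follows, where $f_j = f(x_j)$ and the primes indicate that the terms $j = 0$ and $j = N$ are multiplied by $1/2$. This is the standard form for Chebyshev points; the notation in the lecture notes may differ.
$$
   p_N(x) = \frac{\displaystyle \sum_{j=0}^{N}{}' \frac{(-1)^j f_j}{x - x_j}}{\displaystyle \sum_{j=0}^{N}{}' \frac{(-1)^j}{x - x_j}}, 
   \qquad x_j = \cos(\pi j / N).
$$
The code stores the nodes in increasing order, i.e. with the index reversed. This multiplies numerator and denominator by the same factor $(-1)^N$, which cancels in the quotient.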

In [None]:
"""
Barycentric interpolation with a Chebyshev grid with N grid points.
The interpolant is evaluated at points `x`.
"""
function bary(f::Function, N, x)
    X = chebnodes(N)
    F = f.(X)
    return bary(F, x; X=X)
end

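function bary(F::Vector, x; X = chebnodes(length(F)-1))
    N = length(F)-1
    # end-point terms carry weight 1/2, interior terms weight (-1)^n
    p = 0.5 * ( F[1] ./ (x .- X[1]) + (-1)^N * F[N+1] ./(x .- X[N+1]) )
    q = 0.5 * (1.0 ./ (x .- X[1]) + (-1)^N ./ (x .- X[N+1]))
    for n = 1:N-1
        p += (-1)^n * F[n+1] ./ (x .- X[n+1])
        q += (-1)^n ./ (x .- X[n+1])
    end 
    r = p ./ q
    # if x coincides with a node, p ./ q evaluates to NaN; the interpolant 
    # equals the nodal value there, so we substitute it explicitly
    for n = 1:N+1
        r = ifelse.(x .== X[n], F[n], r)
    end
    return r
end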

"""
generate a grid on which to plot errors; this is chosen to avoid 
any grid points since barycentric interpolation is not defined 
on those.
"""
errgrid(Np) = range(-1+0.0123, stop=1-0.00321, length=Np)

In [None]:
# back to our opening example: no sign of instability, 
# and we get precisely the predicted rate.
# ------------------------------------------------------
f(x) = 1/(1+25*x^2)
xp = errgrid(1000)
NN = 2:10:250
err = [ norm(f.(xp) - bary(f, N, xp), Inf) for N = NN]
pred = 1.5*exp.(-NN/5)
plot(NN, [err, pred], lw=2, m=:o, 
    label=["err", "exp(-N/5)"], yaxis = (:log,))

In [None]:
# The algebraically converging functions revisited, 
# this time with the predicted slopes 
# ---------------------------------------------------
NN = (2).^(2:10)
xerr = range(-1+0.00012, stop=1-0.000032, length=1_000)
P = plot(xaxis = (:log, L"N"), 
         yaxis = (:log, L"\| f - I_N f\|_{L^\infty}"), 
         legend=:bottom)
for (f, n) in zip(falg, nalg)
    err = [ norm(f.(xerr) - bary(f, N, xerr), Inf)  for N in NN ]
    plot!(P, NN, err, lw=2, m=:o, ms=6, label = "f$n")
end 
t = [NN[5], NN[8]]
plot!(P, t, 1*t.^(-1.), lw=2, ls=:dash, c=:black, label=L"\sim N^{-1}, N^{-3/2}, N^{-3}")
plot!(P, t, t.^(-3/2), lw=2, ls=:dash, c=:black, label="")
plot!(P, t, 12*t.^(-3.), lw=2, ls=:dash, c=:black, label="")
P

In [None]:
# the exponentially convergent functions revisited 
# f1, f2 get the predicted slopes, 
# f3, f7 are entire
# ---------------------------------------------------
NN = 2:4:46
xerr = range(-1+0.00012, stop=1-0.000032, length=1_000)
P = plot(xaxis = (L"N",), 
         yaxis = (:log, L"\| f - I_N f\|_{L^\infty}"), 
         legend=:right)
for (f, n) in zip(fana, nana)
    err = [ norm(f.(xerr) - bary(f, N, xerr), Inf)  for N in NN ]
    plot!(P, NN, err, lw=2, m=:o, ms=6, label = "f$n")
end 
t = [NN[5], NN[8]]
plot!(P, t, [exp.(-t), 0.1*exp.(-t/5)], 
      lw=2, ls=:dash, c=:black, 
      label=L"\sim e^{-N}, e^{-N/5}")
P

## 03-4 Applications

### Evaluating a "special function"

Special functions are functions such as $\exp(x), \sin(x), \cos(x), \dots$, the Bessel functions, the $\Gamma$ function, the Airy functions, and many more. Efficient and stable numerical evaluation of such functions is a mostly solved and well-understood problem. Nevertheless, it is useful to see what kind of ideas might be involved. Here, we will just use polynomial interpolation of a Taylor series, but of course in practice one uses much more sophisticated techniques (more on that later).

For simplicity, let us just consider the `sin` function. We can obtain a decent approximation using a Taylor series. We then interpolate the Taylor polynomial to get a Chebyshev interpolant. Note that in principle we only need to evaluate `sin` on $[-\pi/2, \pi/2]$, as all other arguments can be reduced to this interval by shifting and reflection.

In [None]:
tsin(N, x) = imag(sum((im*x)^n / factorial(n) for n = 0:N))

xx = range(-pi/2, stop=pi/2, length=1000)
println("Taylor Expansion:")
for N in [7, 11, 15, 19]
    errN = maximum(abs(sin(x) - tsin(N,x)) for x in xx) 
    println(" N = $N => err = $errN")
end 

This suggests that `tsin(20, x)` is a machine-precision approximation to `sin`. Now we can check how many points we need with a Chebyshev interpolant.
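
The observed errors can be compared with the Lagrange remainder bound $|\sin x - T_N(x)| \leq |x|^{N+1} / (N+1)!$, which holds because all derivatives of $\sin$ are bounded by 1. The sketch below evaluates this bound at $x = \pi/2$; `taylor_bound` is a helper introduced only for this illustration. The even degrees $N = 8, 12, 16, 20$ give the same polynomials as $N = 7, 11, 15, 19$, since the even-order Taylor terms of $\sin$ vanish.

```julia
# Hypothetical helper: remainder bound |x|^(N+1) / (N+1)! for the degree-N
# Taylor polynomial of sin. The factorial is accumulated in floating point
# because factorial(21) overflows Int64.
taylor_bound(N, x) = abs(x)^(N + 1) / prod(1.0:(N + 1))

for N in (8, 12, 16, 20)
    println("N = $N: bound = $(taylor_bound(N, π / 2))")
end
```

For $N = 20$ the bound is of the order of machine epsilon, consistent with the claim that `tsin(20, x)` is accurate to machine precision on $[-\pi/2, \pi/2]$, up to rounding errors in the summation.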

In [None]:
scal_tsin = s -> tsin(20, s*π/2)
println("Chebyshev Interpolant:")
for N in [7, 9, 11, 15]
    errN = norm(bary(scal_tsin, N, xx*2/π) - sin.(xx), Inf)
    println(" N = $N => err = $errN")
end

In [None]:
# we can do even better by approximating only on the 
# interval [0, pi/2]
xx = range(0+3.21e-12, stop=pi/2, length=333)
scal_tsin1 = s -> tsin(20, (1 + s)*π/4)   # maps [-1, 1] to [0, π/2]
println("Chebyshev Interpolant:")
for N in [5, 8, 11, 13]
    errN = norm(bary(scal_tsin1, N, xx*4/π.-1) - sin.(xx), Inf)
    println(" N = $N => err = $errN")
end

This may seem like a small improvement, but a factor 2/3 in the evaluation cost would in fact represent a phenomenal gain in computing speed. The explanation of this gain is easily visualised: the Taylor polynomial optimises the error at the origin, while the Chebyshev interpolant distributes it more uniformly (but not uniformly enough, which is why it is not optimal; see later).

In [None]:
N = 13
xp = range(-pi/2, stop=pi/2, length=400)
terr = tsin.(N, xp) - sin.(xp)
cerr = bary(scal_tsin, N, xp*2/π) - sin.(xp)
plot(xp, [terr, cerr], lw=2, label = ["Taylor-error", "Chebyshev-error"],
         ylim = [-2e-13, 2e-13])

### Evaluating a Matrix Function 

Consider a discrete Laplacian-like matrix, 
$$
    H = \frac{1}{2}\begin{pmatrix}
        0 & 1      &        &        & \\ 
        1 & 0      & 1      &        &  \\ 
          & \ddots & \ddots & \ddots &  \\ 
          &        &      1 &  0     & 1 \\ 
          &        &        &      1 & 0
    \end{pmatrix}
$$
One can readily see that $\sigma(H) \subset [-1,1]$.
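
This can also be checked against the closed-form eigenvalues of a tridiagonal Toeplitz matrix, which for $H \in \mathbb{R}^{d \times d}$ are $\lambda_k = \cos\big(k\pi/(d+1)\big)$, $k = 1, \dots, d$ (a standard result). The sketch below compares them with numerically computed eigenvalues; `H_dense` is a helper introduced only for this illustration.

```julia
using LinearAlgebra

# Hypothetical helper: dense version of H with 1/2 on the first off-diagonals
H_dense(d) = Matrix(SymTridiagonal(zeros(d), fill(0.5, d - 1)))

d = 100
λ_numeric = eigvals(Symmetric(H_dense(d)))               # sorted ascending
λ_exact = sort([cos(k * π / (d + 1)) for k in 1:d])
println("max deviation = ", maximum(abs.(λ_numeric - λ_exact)))
println("σ(H) ⊂ ", extrema(λ_exact))
```

Since $0 < k\pi/(d+1) < \pi$, all eigenvalues lie strictly inside $(-1, 1)$ and approach the end-points as $d$ grows.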

In [None]:
using SparseArrays
Hfun(d) = spdiagm( -1 => ones(d-1)/2, 1 => ones(d-1)/2 )

println("H(5) = ")
display(Matrix(Hfun(5)))

print("σ(H(100)) ⊂ ")
println(extrema(eigvals(Matrix(Hfun(100)))))
;

We wish to evaluate $f_\beta(H)$ where $f_\beta$ is the Fermi-Dirac function 
$$
    f_\beta(z) = \Big( 1 + e^{\beta z} \Big)^{-1}
$$
We construct a Chebyshev interpolant, then use the Chebyshev transform to obtain its Chebyshev coefficients, which then allow us to evaluate $f_\beta(H)$ via the Chebyshev basis recursion as a sequence of matrix multiplications.
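
The recursion in question is the three-term recurrence of the Chebyshev polynomials, applied with a matrix argument. Writing $\tilde F_n$ for the Chebyshev coefficients of the interpolant (notation chosen here to match the variable `F̃` in the code below), it reads
$$
    T_0(H) = I, \qquad T_1(H) = H, \qquad T_{n+1}(H) = 2 H \, T_n(H) - T_{n-1}(H), 
    \qquad f_\beta(H) \approx \sum_{n=0}^{N} \tilde F_n \, T_n(H).
$$
Each recursion step costs one matrix product, so a degree-$N$ approximation requires $N-1$ products in total.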

In [None]:
using FFTW
# we first implement the fast chebyshev transform 

revchebnodes(N) = [ cos(j*π/N) for j = 0:N ]

function fct(F)
    N = length(F)-1
    G = [F; F[N:-1:2]]
    Ĝ = real.(ifft(G))
    return [Ĝ[1]; 2 * Ĝ[2:N]; Ĝ[N+1]]
end 


function eval_chebpoly(F̃, x; ID = one(x))
    N = length(F̃)-1
    Told = ID
    if N == 0; return Told * F̃[1]; end 
    Tnew = x 
    p = Told * F̃[1] + Tnew * F̃[2]
    if N == 1; return p; end 
    for n = 2:N 
        Toldold = Told 
        Told = Tnew 
        Tnew = 2 * x * Told - Toldold
        p += F̃[n+1] * Tnew
    end 
    return p 
end

f_fermi(β, x) = 1/(1 + exp(β*x))

# a little test
β = 10+rand()
xx = range(-1, stop=1, length=1000)
for N in [11, 21, 31, 41]
    F = f_fermi.(β, revchebnodes(N))
    F̃ = fct(F)
    err = norm(f_fermi.(β, xx) - [eval_chebpoly(F̃, x) for x in xx])
    println("N = $N => err = $err")
end

In [None]:
A = Matrix(Hfun(10))
A = exp(A)

In [None]:
# and now we can use this to evaluate a matrix function 
# ------------------------------------------------------

# exact matrix function
f_fermi_mat(β, H) = pinv(I + exp(β * Matrix(H)))
    
# using the chebyshev expansion 
function f_fermi_mat_cheb(β, H, N)
    F = f_fermi.(β, revchebnodes(N))
    F̃ = fct(F)
    eval_chebpoly(F̃, H)
end

In [None]:
β, d = 10.0, 1000
# ---------------
A = Hfun(d)
fH = f_fermi_mat(β, A)
for N in [11, 21, 31, 41]
    fH_N = f_fermi_mat_cheb(β, A, N)
    err = norm(fH_N - fH, Inf)
    println("N = $N => err = $err")
end

In [None]:
println("Runtime f_fermi_mat")
for n = 1:3 
    @time f_fermi_mat(β, A)
end
println("Runtime f_fermi_mat_cheb, N = 11")
for n = 1:3 
    @time f_fermi_mat_cheb(β, A, 11)
end
println("Runtime f_fermi_mat_cheb, N = 31")
for n = 1:3 
    @time f_fermi_mat_cheb(β, A, 31)
end
println("""Don't take these runtimes too seriously; there are a
           lot of optimisations that we are missing in `eval_chebpoly`;
           in particular a lot of the allocations can be avoided.""")

### Solving a BVP 

As a final example, we look at how to solve boundary value problems using Chebyshev polynomials. There is an excellent Julia package, `ApproxFun.jl`, that builds on the kind of ideas we discussed and takes them much further. So instead of putting together our own little toy code, we will show how to use `ApproxFun.jl`.

Consider the BVP 
$$
    \epsilon u'' + 6 (1-x^2) u' + u^2 = 1,  
$$
with boundary conditions $u(-1) = 1$, $u(1) = -1/2$. The code below uses $\epsilon = 0.01$.

In [None]:
using ApproxFun
x = Fun()  # defines the identity function x -> x
N = u -> [u(-1)-1, 
          u(1)+0.5, 
          0.01 * u'' + 6 * (1-x^2) * u' + u^2 - 1]
u = newton(N, 0*x)
@show typeof(u)
@show u.space
@show length(u.coefficients)
@show norm(N(u))
plot(u; lw = 2, label = "Solution to BVP")