# 18.S190/6.S090 Problem Set 1 Solutions

## Problem 1 (5+5+6+4 points)

The [course notebook on finite differences](https://github.com/mitmath/numerical_hub/blob/fbcbf6adef724392624921c5a7cf8a9d53330347/notes/finite-differences.ipynb)
includes, without derivation, a mysterious four-line Julia function
called `stencil` that can compute finite-difference rules for
an arbitrary number of points.  The `stencil` function is reproduced below, in both Julia and Python.

In particular, if you want to compute
the $m$-th derivative of a smooth (analytic) scalar function $f(x)$
at $x_{0}$, it returns the weights $w_{k}$ of an $n$-point ($n>m$)
finite-difference rule from evaluating $f$ at points $x_{k}$ for
$k=1\ldots n$:
$$
f^{(m)}(x_{0})\approx\sum_{k=1}^{n}w_{k}f(x_{k})
$$
by solving the system of equations $Aw=e_{m+1}$, where $e_{j}\in\mathbb{R}^{n}$
is the Cartesian unit vector in the $j$-th direction and $A$ is
an $n\times n$ matrix with entries $A_{ij}=\frac{(x_{j}-x_{0})^{i-1}}{(i-1)!}$ (where $i, j = 1, \ldots, n$ are the rows and columns of $A$, respectively).

Here, you will analyze and derive this technique.

1. Let $x_{0}=0$. According to the notes, you can then compute $f^{(m)}(y)\approx\frac{1}{h^{m}}\sum_{k=1}^{n}w_{k}f(y+hx_{k})$
for an arbitrary point $y$ and an arbitrary step-size scaling factor
$h$ (which can be made smaller and smaller to reduce truncation errors; i.e. $h=\delta x$).
Derive this formula from the $f^{(m)}(x_{0})\approx \cdots$ formula above (via the chain rule and a change of variables).

2. Now evaluate it for $x_{0}=0$ (`0//1` in Julia for exact rational results)
and $x=[0,1,2,3]$ with $m=1$, i.e. using $n=4$ equally spaced points
$\ge x_{0}$ (a *higher-order* "forward-difference" formula). Use the resulting weights, in the formula scaled by
$h$ as above, to approximate the derivative $f'(1)$ for $f(x)=\sin(x)$,
and plot the relative error (compared to the exact derivative) as
a function of $h$ on a log–log scale, similar to the course notebook.
What power law in $h$ does the truncation error (approximately) seem
to follow? That is, what is the “order of accuracy”?

3. Derive the stencil equation $Aw=e_{m+1}$ above: write out the first
$n$ terms of the Taylor series (up to the $f^{(n-1)}$ derivative) for $f(x_{0}+\delta x)$, and try
to find a linear combination of this series evaluated at $\delta x=x_{k}-x_{0}$
for $k=1\ldots n$ in such a way that you obtain $f^{(m)}(x_{0})$.

4. Explain the output of `stencil` for $x =[-1,+1]$, $x_{0}=0$ (or `0//1` in Julia for exact rational results), and $m=0$.

### Solutions:


1. Consider the function $g(x) = f(y+hx)$.  By the chain rule, $g^{(m)}(x) = h^m f^{(m)}(y+hx)$, so it follows that $f^{(m)}(y) = \frac{1}{h^m} g^{(m)}(0)$.  Since $x_0 = 0$, plug in the finite-difference formula for $g^{(m)}(0) \approx \sum_{k=1}^n w_k g(x_k) = \sum_{k=1}^n w_k f(y+h x_k)$, and the result follows.

    (The key thing to remember is that the finite-difference stencil works for *any* function, not just for functions called "$f$".)
   
2. For $x=[0,1,2,3]$, the `stencil` function returns $w = [-11/6, 3, -3/2, 1/3]$ (see Julia code below).  Trying it out numerically for $\sin'(1)$, we find that the error scales as $\boxed{\sim h^3}$, i.e. it is *third-order accurate*.

    
3.  The familiar Taylor series formula using $\delta x = (x_j-x_0)$ takes the form
$f(x_0 + (x_j-x_0) ) = f(x_j)   = \sum_{i=1}^\infty \frac{f^{(i-1)}(x_0)}{ (i-1)! } (x_j -x_0)^{i-1}.$ Letting $A_{ij}=(x_j-x_0)^{i-1}/(i-1)!$, as suggested, we have that
$f(x_j) = \sum_{i=1}^\infty A_{ij} f^{(i-1)}(x_0)$.  To form the approximation we truncate to $n$ terms, and write the equations in matrix form:
$$
\begin{pmatrix}
f(x_1) \\ \vdots \\ f(x_n) 
\end{pmatrix}
=
A^T
\begin{pmatrix}
f(x_0) \\ \vdots \\ f^{(n-1)}(x_0) 
\end{pmatrix} .
$$
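Taking the inner product of both sides with $w = A^{-1}e_{m+1}$, and using $(A^T v)^T A^{-1} = v^T$, we see that
$$ \begin{pmatrix}
f(x_1) \\ \vdots \\ f(x_n) 
\end{pmatrix}^T A^{-1}e_{m+1} =
\begin{pmatrix}
f(x_0) \\ \vdots \\ f^{(n-1)}(x_0) 
\end{pmatrix}^T e_{m+1} = f^{(m)}(x_0)$$
as desired, since the left-hand side is exactly $\sum_{k=1}^n w_k f(x_k)$ with $Aw = e_{m+1}$. (The index is $m+1$ rather than $m$ because the first entry of the derivative vector is the zeroth derivative $f(x_0)$.)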

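4. The `stencil` function returns $w = [1/2, 1/2]$ (see Julia code below). Since $m=0$ asks for the "zeroth derivative", i.e. the function value itself, this is linear interpolation of $f$ at the midpoint $x_0=0$ of the two points: $f(0)\approx\frac{1}{2}[f(-1)+f(+1)]$.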
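
As a concrete check of the stencil equation from part 3, here is a worked instance using the $n=4$, $m=1$ case from part 2 ($x_0=0$, $x=[0,1,2,3]$), where the entries of $A$ are $x_j^{i-1}/(i-1)!$ and the right-hand side is $e_2$:
$$
\begin{pmatrix}
1 & 1 & 1 & 1 \\
0 & 1 & 2 & 3 \\
0 & \frac{1}{2} & 2 & \frac{9}{2} \\
0 & \frac{1}{6} & \frac{4}{3} & \frac{9}{2}
\end{pmatrix}
\begin{pmatrix}
-\frac{11}{6} \\ 3 \\ -\frac{3}{2} \\ \frac{1}{3}
\end{pmatrix}
=
\begin{pmatrix}
0 \\ 1 \\ 0 \\ 0
\end{pmatrix} .
$$
Row 1 says the weights sum to zero (so the $f(x_0)$ term drops out), row 2 picks out $f'(x_0)$ with coefficient 1, and rows 3 and 4 cancel the $f''$ and $f'''$ terms. The first uncancelled term therefore involves $f''''$, which is consistent with the $\sim h^3$ error observed in part 2 after dividing by $h$.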

In [1]:
# in Julia:
function stencil(x::AbstractVector{<:Real}, x₀::Real, m::Integer)
    ℓ = 0:length(x)-1
    m in ℓ || throw(ArgumentError("invalid derivative order m"))
    A = @. (x' - x₀)^ℓ / factorial(ℓ)
    return A \ (ℓ .== m) # vector of weights w
end

stencil (generic function with 1 method)

#### part 1.2

In [2]:
stencil(0:3, 0//1, 1)

4-element Vector{Rational{Int64}}:
 -11//6
    3
  -3//2
   1//3
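
The log–log plot for part 1.2 is not reproduced here. As a plot-free sketch of the same experiment, the snippet below hard-codes the weights printed above and estimates the slope of the relative error versus $h$ between successive step sizes. The names `fd`, `hs`, `relerr`, and `slopes` are illustrative choices rather than anything from the course notebook, and the range of $h$ is an assumption chosen to stay clear of roundoff-dominated step sizes.

```julia
# Weights copied from the output of stencil(0:3, 0//1, 1) above
w = [-11//6, 3//1, -3//2, 1//3]
x = 0:3

# Scaled finite-difference rule from part 1 with m = 1:
# f'(y) ≈ (1/h) Σₖ wₖ f(y + h xₖ)
fd(f, y, h) = sum(w .* f.(y .+ h .* x)) / h

# Step sizes h = 10⁻¹, 10⁻¹·⁵, …, 10⁻³ (an assumed range, large enough
# that truncation error dominates roundoff error)
hs = 10.0 .^ (-1:-0.5:-3)
relerr = [abs(fd(sin, 1.0, h) - cos(1.0)) / abs(cos(1.0)) for h in hs]

# Slope between consecutive points on a log–log plot ≈ order of accuracy
slopes = diff(log.(relerr)) ./ diff(log.(hs))
println(slopes)
```

Slopes close to 3 correspond to the third-order accuracy stated in the solution to part 2.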

#### part 1.4

In [3]:
stencil([-1, 1], 0//1, 0)

2-element Vector{Rational{Int64}}:
 1//2
 1//2

## Problem 2 (4+4+4+8 points)

Write a function `myexp(x)` (in Julia or Python) to compute $e^x$ directly from the Taylor series definition:
$$
e^x = 1 + x + \frac{x^2}{2} + \cdots + \frac{x^n}{n!} + \cdots \, .
$$
(in the default `Float64`/`float` precision … no fair using arbitrary-precision arithmetic).

1. Explain how you can compute each term in the series from the preceding term.  (Not only is this more efficient, but it also helps avoid overflow compared to the naive approach where you compute $x^n$ and $n!$ *separately* and then divide them.)

2. Explain how you decided how many terms to sum, to make reasonably sure that the omitted terms have a negligible contribution.  (Your method should depend on $x$.  Does not need a rigorous argument, just a reasonable explanation.)

3. Check that `myexp(100.0)` gives a small *relative* error ($< 10^{-14}$) compared to `exp(100.0)` in Julia (or `math.exp(100.0)` in Python), even though your `myexp(100.0) - exp(100.0)` is probably huge.

4. Explain why `myexp(-100.0)` gives a completely wrong result, no matter how many terms you include in the sum!

### Solutions:

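1. Each term is $x/n$ times the previous term: $\frac{x^n}{n!} = \frac{x}{n}\cdot\frac{x^{n-1}}{(n-1)!}$.
2. The summation should stop when |current term| is less than the machine epsilon times |current sum|. At that point, adding the term changes the sum by at most about one unit in the last place, and once $n > |x|$ the subsequent terms only shrink in magnitude.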

In [4]:
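function myexp(x; tol=eps(Float64))
    # cur_sum is the partial sum 1 + x + ⋯ + xⁿ⁻¹/(n-1)!, and cur_term = xⁿ/n!
    cur_sum, cur_term, n = 1.0, float(x), 1
    while abs(cur_term / cur_sum) > tol   # stop once the next term is negligible
        cur_sum += cur_term
        n += 1
        cur_term *= x/n                   # xⁿ/n! = (x/n) ⋅ xⁿ⁻¹/(n-1)!
    end
    return cur_sum
end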
   
    

myexp (generic function with 1 method)

3. The difference is huge, but the relative error is indeed very small.

In [5]:
myexp(100) - exp(100), abs( (myexp(100) - exp(100)) / exp(100) )

(4.951760157141521e27, 1.842092399959933e-16)

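4. The result is wrong due to cancellation error: the terms alternate in sign and grow as large as $100^{100}/100! \approx 10^{42}$ before decaying, so the partial sums carry rounding errors of roughly $10^{42}\,\epsilon_{\text{machine}} \approx 10^{26}$, which swamp the exact answer $e^{-100} \approx 3.7\times 10^{-44}$. Including more terms cannot recover the lost digits.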

In [6]:
myexp(-100)

8.144652745098074e25
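
To make the cancellation argument concrete, here is a small sketch that finds the largest term magnitude $|x|^n/n!$ in the series for $x=-100$ and compares the corresponding rounding-error scale with the exact answer. The helper name `largest_term` and the loop cutoff of $2|x|+10$ terms are assumptions made for illustration (the terms peak near $n\approx|x|$, so this cutoff is comfortably past the peak).

```julia
# Largest term magnitude |x|ⁿ/n! in the Taylor series of eˣ,
# built up term by term as in myexp (hypothetical helper)
function largest_term(x)
    term, biggest = 1.0, 1.0
    for n in 1:ceil(Int, 2 * abs(x)) + 10   # assumed cutoff, past the peak at n ≈ |x|
        term *= abs(x) / n
        biggest = max(biggest, term)
    end
    return biggest
end

biggest = largest_term(-100.0)
roundoff = biggest * eps(Float64)   # rough scale of rounding errors in the partial sums
println((biggest, roundoff, exp(-100.0)))
```

The rounding-error scale is the same order of magnitude as the wrong result returned by `myexp(-100)` above, and dozens of orders of magnitude larger than $e^{-100}$.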

## Problem 3 (5+5 points)

Write a function $L4(x,y)$ in Julia or Python that computes the "$L_4$ norm" $L4(x,y) = (x^4 + y^4)^{1/4}$ of two real (floating-point) scalars $x$ and $y$.

1. If you implement this in the most straightforward way, directly from the formula above, does your code give an accurate answer for `L4(1e-100, 0.0)`?  What about for `L4(1e100, 0.0)`?  Why or why not?
2. *Fix* your code so that it gives an accurate answer (a *small relative error* close to machine precision) for all floating-point inputs $x$ and $y$ (including the case in the previous part). (No fair resorting to higher-precision arithmetic!)

### Solutions

In [7]:
L4(x,y) = (abs(x)^4 + abs(y)^4)^(1/4)

L4 (generic function with 1 method)

1. We should have `L4(x,0)` give $|x|$, but for very small or very large `x` the intermediate result $x^4$ suffers floating-point **underflow** or **overflow**, respectively. In the default double precision (`Float64`):

In [8]:
L4(1e-100, 0) # (1e-100)⁴ underflows to 0.0

0.0

In [9]:
L4(1e+100, 0)  # (1e+100)⁴ overflows to Inf

Inf

2. To eliminate this problem, we can simply compute $s = \max\{|x|,|y|\}$ and then pull out this scale factor, since in exact arithmetic $L_4(x,y) = s L_4(x/s,y/s)$ for any $s > 0$.  In this way, we avoid underflow/overflow in the leading-order term.  (If $|y|\ll |x|$ and $|y/x|^4$ underflows to zero, we don't care, because $1 \oplus |y/x|^4$ will round to `1.0` long before that point.)

In [10]:
function L4good(x,y)
    ax, ay = abs(x), abs(y)
    s = max(ax,ay)
    if s == 0
        return float(s) # don't divide by zero if x==y==0
    else
        return s * ((ax/s)^4 + (ay/s)^4)^(1/4)
    end
end

L4good (generic function with 1 method)

In [11]:
L4good(1e-100, 0)

1.0e-100

In [12]:
L4good(1e+100, 0)

1.0e100

In [13]:
L4good(0, 0)

0.0

If we compute the maximum relative error (compared to `BigFloat`) for a million random numbers with random magnitudes from $10^{-308}$ to $10^{+308}$, we can see that it is accurate to within a few ulps:

In [14]:
maxerr = 0.0
for i = 1:10^6
    x = (rand() - 0.5) * 10.0^rand(-308:308)
    y = (rand() - 0.5) * 10.0^rand(-308:308)
    result = L4good(x,y)
    exact = L4good(big(x), big(y)) # in 256-bit precision by default
    maxerr = max(maxerr, Float64(abs(result - exact) / abs(exact)))
end
println("maximum relative err = ", maxerr, " = ", maxerr/eps(Float64), " ulps.")

maximum relative err = 9.950714535858108e-16 = 4.481403427576075 ulps.


## Problem 4 ((2+3)+(3+6) points)

In this problem, you will apply two interpolation methods to approximate the function:

$$
s(t) = t^3 - t^2 + 2t, \quad t \in [0,5].
$$

We will use two different grids for interpolation:

- **Uniform Grid:** Equally spaced nodes over $[0,5]$.
- **Non-Uniform Grid:** Unequally spaced nodes that resemble experimental measurements.

The corresponding function values are provided below:

| $t$ (Uniform) | $s(t)$ |
|-------------------|------------|
| 0  | 0  |
| 1  | 2  |
| 2  | 8  |
| 3  | 24 |
| 4  | 56 |
| 5  | 110 |

| $t$ (Non-Uniform) | $s(t)$ |
|----------------------|------------|
| 0.0  | 0.000  |
| 0.8  | 1.472  |
| 2.1  | 9.051  |
| 3.0  | 24.000 |
| 4.8  | 97.152 |
| 5.0  | 110.000 |

---

### Interpolation Methods

For this problem, you will write **your own implementation** (not just calling a pre-written library function) of the following two interpolation methods:

1. **Variant A (Consecutive Interpolation):**  
   On each subinterval $[t_i,t_{i+1}]$ (assuming that the $t_i$ are sorted in ascending order), the interpolant is the linear function connecting the two consecutive data points $(t_i,s(t_i))$ and $(t_{i+1},s(t_{i+1}))$. 

2. **Variant B (Inverse Distance Weighted Interpolation):**  
   For an evaluation point $t$, compute the following ratio of sums over *all* the data points:

   $$
   s_B(t)= \frac{\sum_i \frac{1}{|t-t_i|^2}\, s(t_i)}
   {\sum_j \frac{1}{|t-t_j|^2}}
   $$

   Note that as $t \to t_i$ (one of the data points), $s_B(t) \to s(t_i)$.
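
The two variants described above can be sketched as follows. This is one possible implementation under stated assumptions, not necessarily the intended solution: the function names `interp_linear` and `interp_idw` are hypothetical, the nodes are assumed to be sorted in ascending order, and an evaluation point that coincides exactly with a node simply returns that node's value (one way to avoid `Inf/Inf` in Variant B).

```julia
# Variant A: piecewise-linear interpolation (assumes ts is sorted ascending)
function interp_linear(ts, ss, t)
    # index i of the subinterval [ts[i], ts[i+1]] containing t,
    # clamped so that t at (or beyond) an endpoint uses the nearest subinterval
    i = clamp(searchsortedlast(ts, t), 1, length(ts) - 1)
    θ = (t - ts[i]) / (ts[i+1] - ts[i])
    return (1 - θ) * ss[i] + θ * ss[i+1]
end

# Variant B: inverse-distance weighting with weights 1/|t - tᵢ|²
function interp_idw(ts, ss, t)
    k = findfirst(==(t), ts)
    k === nothing || return float(ss[k])   # exactly at a node: avoid Inf/Inf
    w = @. 1 / abs(t - ts)^2
    return sum(w .* ss) / sum(w)
end

# Data copied from the two tables above
t_uniform = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
s_uniform = [0.0, 2.0, 8.0, 24.0, 56.0, 110.0]
t_nonuniform = [0.0, 0.8, 2.1, 3.0, 4.8, 5.0]
s_nonuniform = [0.0, 1.472, 9.051, 24.0, 97.152, 110.0]

for tstar in (1.5, 2.5)
    println((tstar,
             interp_linear(t_uniform, s_uniform, tstar),
             interp_idw(t_uniform, s_uniform, tstar),
             interp_linear(t_nonuniform, s_nonuniform, tstar),
             interp_idw(t_nonuniform, s_nonuniform, tstar)))
end
```

Handling the node case with an explicit lookup keeps Variant B well defined everywhere; an alternative would be to take the limit numerically, but the exact lookup matches the stated property $s_B(t) \to s(t_i)$ as $t \to t_i$.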

---

### Part 1: Implementing and inspecting the interpolations

**(a)** For evaluation points $t^*=1.5$ and $t^*=2.5$, compute the interpolated value using both Variant A and Variant B on the uniform grid. Repeat the interpolation for the non-uniform grid.  

**(b)** Plot the two interpolants (Variant A and Variant B) for each grid, for a dense set of points $t \in [0,5]$.  (Be careful when evaluating variant B exactly at a data point, so you don't divide `Inf` by `Inf`.) 

---

### Part 2: Accuracy Analysis

#### (a) Variant A

Given that the true function is $s(t)=t^3-t^2+2t$, for an arbitrary evaluation point $t_0$ in some subinterval $[t_i,t_{i+1}]$, perform a Taylor series expansion of $s(t)$ about $t_i$.  

Using this Taylor series, show that the local interpolation error at any point $t_0$ can be expressed in terms of the following asymptotic upper bound (up to constant factors, i.e. "big-O" notation as in class):

$$
\text{Error} = \mathcal{O}((\Delta t)^2),
$$

where  $\Delta t = \max \{ t_{i+1}-t_i \}$ (the maximum spacing), as $\Delta t \to 0$.  Does it matter if the points $t_i$ are uniform or nonuniform?
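
One possible outline of this argument (a sketch, not necessarily the intended solution) uses the notation $h_i = t_{i+1}-t_i$ and $p(t) = s(t_i) + \frac{s(t_{i+1})-s(t_i)}{h_i}(t-t_i)$ for the Variant A interpolant on $[t_i,t_{i+1}]$; both symbols are introduced here for illustration. Taylor-expanding about $t_i$ gives
$$
s(t_0) = s(t_i) + s'(t_i)(t_0-t_i) + \mathcal{O}((t_0-t_i)^2), \qquad
\frac{s(t_{i+1})-s(t_i)}{h_i} = s'(t_i) + \mathcal{O}(h_i),
$$
so that, since $0 \le t_0 - t_i \le h_i \le \Delta t$,
$$
s(t_0) - p(t_0) = \mathcal{O}((t_0-t_i)^2) + \mathcal{O}(h_i)\,(t_0-t_i) = \mathcal{O}(h_i^2) = \mathcal{O}((\Delta t)^2).
$$
The constants hidden in the $\mathcal{O}$ terms are bounded because $s''(t) = 6t-2$ is bounded on $[0,5]$. Only the local spacing $h_i \le \Delta t$ enters, so this outline does not rely on the points being uniformly spaced.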

#### (b) Variant B

For the inverse distance weighted interpolation (Variant B), plot the error at $t = 2.5$ on a log–log scale as a function of the number of points $N$ for uniformly spaced points ($\Delta t = 5/(N+1)$), for a sequence of *even* numbers $N$ (so that $t=2.5$ is not one of the data points), generating data from the formula $s(t)=t^3-t^2+2t$.

What convergence rate do you observe, i.e. $\mathcal{O}((\Delta t)^2)$ or $\mathcal{O}(\Delta t)$ or ...?

Prove this convergence rate, or at least try to make a convincing informal argument.
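
A plot-free sketch of the numerical experiment is given below. It assumes that "uniformly spaced points with $\Delta t = 5/(N+1)$" means the $N$ nodes $t_k = k\,\Delta t$ for $k = 1,\ldots,N$; that reading, and the names `idw` and `idw_error`, are assumptions made for illustration. Instead of plotting, it prints the slopes between successive points of the log–log curve of error versus $N$.

```julia
s(t) = t^3 - t^2 + 2t

# Inverse-distance-weighted interpolant (Variant B);
# assumes t does not coincide with any node
function idw(ts, ss, t)
    w = @. 1 / abs(t - ts)^2
    return sum(w .* ss) / sum(w)
end

# Error at t = 2.5 with N nodes tₖ = kΔt, Δt = 5/(N+1) (assumed node placement)
function idw_error(N; t = 2.5)
    Δt = 5 / (N + 1)
    ts = Δt .* (1:N)
    return abs(idw(ts, s.(ts), t) - s(t))
end

Ns = [10, 100, 1000, 10_000]   # even N, so t = 2.5 is never a node
errs = idw_error.(Ns)

# Slope p between consecutive points, i.e. error ∝ Nᵖ locally
slopes = diff(log.(errs)) ./ diff(log.(Ns))
println(slopes)
```

Because $\Delta t \propto 1/N$ for large $N$, a slope of $p$ in $N$ corresponds to an error of order $(\Delta t)^{-p}$.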
