# 18.C21 Problem Set 1

Due Wednesday 2/11 at **midnight**.  Submit in PDF format.  For handwritten solutions, use a decent-quality scan/image (e.g. get a scanner app on your phone or use a tablet).  For computational results, submit a PDF printout of your Jupyter notebook showing your code and (clearly labeled) results.  Combine all your submissions into a single PDF file.

**TO GENERATE A PDF OF A JUPYTER NOTEBOOK:** In the Jupyter client (e.g. the [JupyterLab Desktop](https://github.com/jupyterlab/jupyterlab-desktop) app), in the File pull-down menu, select Save and Export Notebook As, and then select the HTML format (not PDF, which may require special software). Then open the downloaded HTML file with your favorite browser, and use the browser's Print function to generate the PDF file.

## Problem 1 (3+3+3+3+3 points)

**(a)** In the [lecture-1 notes](https://github.com/mitmath/numerical_hub/blob/ed1c2102e103cc07052906c998cc3c0a337f0492/notes/finite-differences.ipynb), we presented the centered-difference approximation (denoted here by $D_{\delta x} f$):
$$
f'(x) = \underbrace{\frac{f(x+\delta x) - f(x-\delta x)}{2\delta x}}_{D_{\delta x} f} + O(\delta x^2) \, .
$$
Here, you will derive this approximation a different way: find the unique parabola/quadratic $q(x)$ that goes through the three points $(x, f(x))$ and $(x \pm \delta x, f(x \pm \delta x))$, and then show that the derivative $q'(x)$ at $x$ is **exactly the centered-difference approximation**.
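
As a reminder of what $D_{\delta x} f$ computes (this is not part of the required derivation), here is a minimal Python sketch that evaluates the centered difference for $f(x) = \sin x$ at $x = 1$.  The helper name `centered_difference` is our own choice for illustration; the printed relative errors should shrink roughly in proportion to $\delta x^2$ for these moderately small step sizes.

```python
import math

def centered_difference(f, x, dx):
    """Centered-difference approximation D_dx f of f'(x)."""
    return (f(x + dx) - f(x - dx)) / (2 * dx)

x = 1.0
exact = math.cos(x)  # exact derivative of sin at x

def rel_err(dx):
    """Relative error of the centered difference for sin at x = 1."""
    return abs(centered_difference(math.sin, x, dx) - exact) / abs(exact)

for dx in (1e-1, 1e-2, 1e-3):
    print(f"dx = {dx:g}: relative error = {rel_err(dx):.3e}")
```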

**(b)** In general, suppose that we have $n$ distinct points $(x_k, f(x_k))$ for $k = 1, \ldots, n$.  Show that if you fit these points to a degree-$(n-1)$ polynomial $p(x) = c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}$, then the derivative $p'(x_1)$ (which we can use as an approximation for $f'(x_1)$) is of the form:
$$
f'(x_1) \approx p'(x_1) = \underbrace{v^T M^{-1}}_{w^T} \begin{pmatrix} f(x_1) \\ \vdots \\ f(x_n) \end{pmatrix}
$$
for some matrix $M$ and some vector $v$.  **What are $M$ and $v$**?  The row vector $w^T = v^T M^{-1}$ gives the *weights* of the finite-difference formula.

**(c)** In some programming language, **implement** your scheme from (b) numerically to compute $w$ for the points $x = [0,1,-1,2,-2]$, and show that you **obtain exactly the fourth-order finite-difference formula** from the lecture-1 notes (plugging in $\delta x = 1$).

**(d)** Suppose that we only compute the function $f(x)$ approximately, with some noise or other errors.  If our approximate function is denoted $\tilde f (x)$, and satisfies $|\tilde f(x) - f(x)| \le \Delta$ for some upper bound $\Delta \ge 0$ on the absolute error, **show** that the centered-difference approximation gains an additional error term proportional to $\Delta$:
$$
\left| D_{\delta x} \tilde f - f'(x) \right| \le a \, \delta x^2 + b \, \Delta + O(\delta x^3) \, ,
$$
assuming $f(x)$ has a Taylor series around $x$.  In particular, **give the exact coefficients** $a$ and $b$ in the above equation in terms of $\delta x$ and/or derivatives of $f(x)$ at $x$ (via the Taylor series).  Ignoring the higher-order $O(\delta x^3)$ terms, **what $\delta x$ minimizes the error bound**?

**(e)** Use the equation of (d) as a simple model of roundoff errors for the example $f(x) = \sin x$ from class, computing the derivative at $x=1$.  Suppose that $\Delta = \epsilon_{\text{machine}} f(1)$ (i.e. suppose that the relative error in computing $f(x)$ near $x=1$ is at most machine precision $\epsilon_{\text{machine}} = 2^{-52}$  for [`Float64` precision](https://en.wikipedia.org/wiki/Double-precision_floating-point_format), which might be slightly optimistic).  As in the lecture-1 notes, compute and plot the relative error of the centered-difference approximation versus $\delta x$ on a log–log scale, but also plot your error estimate $a (\delta x)^2 + b \Delta $, divided by $|f(1)|$ to obtain a relative error estimate, and **compare the two**.

## Problem 2 (5+5+5 points)

Consider the function $f(x) = \sqrt{1+x} - 1$, for real $x \ge -1$.

**(a)** Explain **why** computing this directly by the obvious formula `sqrt(1+x) - 1` will be **extremely inaccurate** (have a large *relative* error) for small $|x|$.  Give a **numerical example** of an $x$ for which you **lose roughly *half* of the significant digits** in `Float64` precision (i.e. you only get about 8 correct digits), compared to a high-precision answer (for example using 100 decimal digits via `setprecision(BigFloat, 100, base=10); sqrt(1 + big"1e-3") - 1` in Julia or `mpmath.mp.dps = 100; mpmath.sqrt(1 + mpmath.mpf('1e-3')) - 1` in Python with the `mpmath` library).
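
If you prefer not to install `mpmath`, the same kind of comparison can be sketched with Python's standard-library `decimal` module.  This is only an illustration of how to measure the relative error against a 100-digit reference; the helper names (`f_naive`, `f_reference`, `relative_error`) are our own, and the value $x = 10^{-3}$ is just the sample input from the snippets above, not the $x$ you are asked to find.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 100  # 100 significant decimal digits for the reference

def f_naive(x):
    """The obvious Float64 formula."""
    return math.sqrt(1.0 + x) - 1.0

def f_reference(x):
    """High-precision reference; Decimal(x) converts the float x exactly."""
    return (1 + Decimal(x)).sqrt() - 1

def relative_error(x):
    """Relative error of f_naive(x) compared to the 100-digit reference."""
    ref = f_reference(x)
    return float(abs(Decimal(f_naive(x)) - ref) / abs(ref))

print(relative_error(1e-3))
```

Because `Decimal(x)` represents the `Float64` input exactly, the reported error reflects only the arithmetic in `f_naive`, not the rounding of `x` itself.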

**(b)** Compute $f(x)$ accurately in `Float64` precision (to at least 14 digits) for the same example $x$ as in (a) by **using the Taylor series** for $f(x)$, implemented in your programming language.

**(c)** **Algebraically re-arrange** the function $f(x)$ into an *equivalent* formula (in exact arithmetic) that **doesn't suffer severe inaccuracy** for small $x$, and **check numerically** that it is indeed accurate (with `Float64` precision).   Hint: try multiplying and dividing $f(x)$ by something that gets rid of the subtraction.  Hint 2: look at the "quadratic roots" example in the [lecture 2 notes](https://github.com/mitmath/numerical_hub/blob/ed1c2102e103cc07052906c998cc3c0a337f0492/notes/Floating-Point-Intro.ipynb).

## Problem 3 (3+5+5 points)

The [standard deviation](https://en.wikipedia.org/wiki/Standard_deviation) $\sigma$ of a sequence $x_1, x_2, \ldots, x_n$ ($n > 1$) can be calculated by first computing the [sample mean](https://en.wikipedia.org/wiki/Sample_mean_and_covariance) (the "average") $\mu = \frac{1}{n} \sum_k x_k$ followed by computing:
$$
\sigma = \sqrt{\frac{1}{n-1} \sum_{k=1}^n (x_k - \mu)^2} \, ,
$$
where dividing by $n-1$ instead of $n$ is [Bessel's correction](https://en.wikipedia.org/wiki/Bessel%27s_correction).

This can be implemented straightforwardly in Julia as:
```jl
function my_std(x)
    n = length(x)
    μ = sum(x) / n
    return sqrt(sum((x .- μ).^2) / (n - 1))
end
```
Or in Python (via NumPy) as:
```py
import numpy as np
def my_std(x):
    x = np.asarray(x)
    n = x.size
    μ = np.sum(x) / n
    return np.sqrt(np.sum((x - μ)**2) / (n - 1))
```
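
Before trying extreme inputs, it can be worth checking such an implementation on well-scaled data.  A minimal sketch, assuming only the Python standard library: `my_std_plain` is a hypothetical pure-Python transcription of the formula above, compared against `statistics.stdev`, which also divides by $n-1$.  The sample data are arbitrary.

```python
import math
import statistics

def my_std_plain(x):
    """Two-pass sample standard deviation (n - 1 denominator), pure Python."""
    n = len(x)
    mu = sum(x) / n
    return math.sqrt(sum((xk - mu) ** 2 for xk in x) / (n - 1))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # arbitrary well-scaled sample
print(my_std_plain(data), statistics.stdev(data))
```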

**(a)** Compute, by hand, **the *exact* answer** for $\sigma$ with the $n=2$ sequence $x = [10^{300}, -10^{300}]$.

**(b)** **Compare to the straightforward implementation** of these formulas (the code above, or its analogue in another language) in `Float64` precision using `x = [1e300, -1e300]`.  Also **compare to a "standard" library** in your programming language, e.g. `std(x)` in the Julia [Statistics.jl](https://github.com/JuliaStats/Statistics.jl) package or `numpy.std(x, ddof=1)` in NumPy.  **Also try** $x = [10^{-300}, -10^{-300}]$.  **Explain how and why these results differ from the exact result.**

**(c)** **Implement an improved** `std_corrected(x)` function in your programming language that doesn't suffer the problems you identified above when the elements of $x$ are very large or very small (it should also work when the elements are all zero).  Hint: try multiplying and dividing by a scale factor.

## Problem 4 (5+5+5 points)

In lecture 3, we found that piecewise linear interpolation of a function $f(x)$ sampled on a uniform grid $x_k = k \Delta x$ with spacing $\Delta x$ converges to $f$ with second-order accuracy, i.e. an *absolute* (not necessarily relative) error that is $O(\Delta x^2)$.

In Julia, you can use (e.g.) the [Interpolations.jl package](https://github.com/JuliaMath/Interpolations.jl) to do the interpolation for you.  Given a vector `x` of sample points (sorted in ascending order) and a function $f(x)$, you can create the corresponding piecewise-linear interpolation function `f_interp(x)` with:
```jl
using Interpolations
f_interp = linear_interpolation(x, f.(x)) 
```
You can then apply it to any point `xnew` with `f_interp(xnew)`, or to a vector `xnew` of points with `f_interp.(xnew)`.  In Python [with NumPy](https://numpy.org/doc/stable/reference/generated/numpy.interp.html), to do piecewise linear interpolation from a vector of points `x` to a vector of points `xnew`, given a vector `fx` of the function values at `x`, you can do `numpy.interp(xnew, x, fx)`.
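
To illustrate the second-order convergence quoted above on a *smooth* function, here is a minimal sketch that avoids extra packages.  The function `linear_interp` is our own pure-Python stand-in for `numpy.interp` (valid for points inside the sample range), and the test function $e^x$ on $[0,1]$ is an arbitrary choice; doubling $n$ should reduce the maximum error by roughly a factor of 4.

```python
import math
from bisect import bisect_right

def linear_interp(xnew, x, fx):
    """Piecewise-linear interpolant of samples (x, fx) at xnew; x is ascending."""
    # Index k of the interval [x[k], x[k+1]] containing xnew (clamped to the grid).
    k = min(max(bisect_right(x, xnew) - 1, 0), len(x) - 2)
    t = (xnew - x[k]) / (x[k + 1] - x[k])
    return (1 - t) * fx[k] + t * fx[k + 1]

def max_interp_error(f, a, b, n, m=10_000):
    """Max |I_n(x) - f(x)| over m+1 equally spaced points, using n+1 samples on [a, b]."""
    x = [a + (b - a) * k / n for k in range(n + 1)]
    fx = [f(xk) for xk in x]
    fine = [a + (b - a) * j / m for j in range(m + 1)]
    return max(abs(linear_interp(xj, x, fx) - f(xj)) for xj in fine)

for n in (10, 20, 40, 80):
    print(n, max_interp_error(math.exp, 0.0, 1.0, n))
```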

**(a)** Consider the function $f(x) = |\sin x|$ evaluated on $x \in [0,4]$.  Similar to class, apply piecewise-linear interpolation from $n+1$ equally spaced points on $[0,4]$ (`x = range(0,4,length=n+1)` in Julia or `numpy.linspace(0,4,n+1)` in Python), i.e. $\Delta x = 4/n$, and compare your interpolant $I_n(x)$ to the exact function at $10^6$ equally spaced points.  On a log-log plot, show the **maximum absolute error** $\max_x |I_n(x) - f(x)|$ as a function of $n$ (ranging from $10$ to $10^5$).  Does it converge as $O(\Delta x^2)$?  **Determine and explain the order of convergence.**

**(b)** Does your answer change if you look at the [root-mean-square](https://en.wikipedia.org/wiki/Root_mean_square) error?  Why or why not?  **Determine and explain the order of convergence.**

**(c)** Propose, implement, and check a simple modification to your sample grid `x`, making it somewhat nonuniform, so that you recover $O(1/n^2)$ convergence of the maximum error.