# Interpolation and Extrapolation

```{note}
This lecture follows closely
[Numerical Recipes](https://numerical.recipes/) 2nd Edition in C and
3rd Edition in C++, Chapter 3 "Interpolation and Extrapolation".
```

In scientific computing and machine learning, interpolation and
extrapolation are essential methods for estimating unknown values from
known data.

Interpolation deals with predicting values within the range of
available data.
This is the foundation of most supervised learning tasks in machine
learning, where models predict outputs in the same region where they
were trained.
Standard interpolation methods include:
* Polynomial interpolation is flexible, but high-degree fits on evenly
  spaced points can oscillate strongly near the edges of the interval
  (Runge's phenomenon).
* Rational interpolation can help stabilize the behavior, especially
  near asymptotes or when functions have strong curvature.
* Spline interpolation, particularly cubic splines, provides smooth
  fits with continuity up to the second derivative.
  This is especially useful when a smooth curve is required, such as
  in modeling or visualization.
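To make the oscillation problem concrete, here is a minimal sketch that interpolates a test function with polynomials through evenly spaced nodes and measures the worst-case error. The test function $1/(1 + 25x^2)$ on $[-1, 1]$ and the helper name `max_interp_error` are illustrative choices, not part of the lecture's code.

```python
import numpy as np

def runge(x):
    # classic test function (an illustrative choice)
    return 1.0 / (1.0 + 25.0 * x**2)

x_dense = np.linspace(-1, 1, 2001)  # fine grid for measuring the error

def max_interp_error(n):
    """Max error of the degree-(n-1) polynomial through n evenly spaced nodes."""
    nodes = np.linspace(-1, 1, n)
    coeffs = np.polyfit(nodes, runge(nodes), n - 1)  # exact fit: n points, degree n-1
    return np.max(np.abs(np.polyval(coeffs, x_dense) - runge(x_dense)))

errors = {n: max_interp_error(n) for n in (5, 9, 13)}
for n, err in errors.items():
    print(f"{n:2d} nodes: max error = {err:.3f}")
```

For this function, adding more evenly spaced nodes makes the maximum error larger rather than smaller, with the largest deviations near the ends of the interval.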

Extrapolation extends predictions beyond the available data.
It is inherently unreliable: without additional information, we cannot
know how a function continues beyond the known region.
A promising modern approach is physics-informed machine learning
(PIML), where physical laws, expressed for example as ODEs, are built
into the model.
By respecting known constraints, such methods can make extrapolations
that remain consistent with physics.

Interpolation and function approximation are related but distinct, and
it is useful to distinguish between them:
* Interpolation uses existing data to estimate specific missing
  values.
* Function approximation constructs a simplified function that
  captures the overall behavior of a more complex one.
  This is often done for efficiency or analytic convenience.
  (See [Numerical Recipes](https://numerical.recipes/), Chapter 5.)

## Limitations of Interpolation

Even the best interpolation schemes can fail when the function itself
is ill-behaved.
For example:
\begin{align}
  f(x) = 3x^2 + \frac{1}{\pi^4}\ln\left[(\pi - x)^2\right] + 1
\end{align}
This function looks smooth but has a logarithmic singularity at
$x = \pi$, made subtle by the small prefactor $1/\pi^4$.
Sampling too coarsely near the singularity produces misleading results,
as we see in these Python plots.

In [None]:
import numpy as np

def f(x):
    return 3 * x**2 + np.log((np.pi - x)**2) / np.pi**4 + 1

x1 = np.linspace(3.13, 3.16, 3+1)     # very sparse
x2 = np.linspace(3.13, 3.16, 30+1)    # coarse
x3 = np.linspace(3.13, 3.16, 300+1)   # medium
x4 = np.linspace(3.13, 3.16, 3000+1)  # dense

In [None]:
from matplotlib import pyplot as plt

plt.plot(x4, f(x4),       label='3001 points')
plt.plot(x3, f(x3), '--', label='301 points')
plt.plot(x2, f(x2), 'o:', label='31 points')
plt.plot(x1, f(x1), 'o-', label='4 points')
plt.legend()

This example shows why interpolation methods should always be paired
with error estimates and awareness of the underlying physics or
mathematics.
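One simple way to attach an error estimate is to interpolate from each sampled grid and compare against a much finer reference grid. The sketch below does this with NumPy's built-in piecewise-linear `np.interp`. The reference grid `x_ref` and the dictionary `errs` are names introduced here for illustration.

```python
import numpy as np

def f(x):
    return 3 * x**2 + np.log((np.pi - x)**2) / np.pi**4 + 1

x_ref = np.linspace(3.13, 3.16, 3000 + 1)  # dense reference grid

errs = {}
for n in (3 + 1, 30 + 1, 300 + 1):
    xs = np.linspace(3.13, 3.16, n)
    yi = np.interp(x_ref, xs, f(xs))        # linear interpolation from n samples
    errs[n] = np.max(np.abs(yi - f(x_ref)))  # worst-case deviation on the reference grid
    print(f"{n:4d} samples: max error vs. reference = {errs[n]:.3e}")
```

Note that the reference grid itself does not resolve the singularity, so these numbers only bound the error from below. In practice the true function is rarely available, and a common substitute is to compare interpolants built from successively refined samples.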

## Preliminaries: Searching an Ordered Table

Before we can interpolate, we need to know where in the dataset our
target value lies.
This step is called searching.
* If data are sampled on a regular grid, finding neighbors is trivial:
  just use the array index.
* If data are irregularly spaced, we must locate the two points that
  bracket the target value.
* This search step can be just as costly as the interpolation itself,
  so efficient methods are critical in practice.
  In fact, in multiple dimensions, it may require non-trivial
  algorithms and advanced data structures.
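For the regular-grid case in the first bullet, the bracketing index follows from a single division. A minimal sketch, assuming a uniform grid described by the hypothetical parameters `x0` (first point), `dx` (spacing), and `n` (number of points); the helper name `locate_uniform` is also introduced here:

```python
import math

def locate_uniform(x0, dx, n, v):
    """Index i with x0 + i*dx <= v < x0 + (i+1)*dx, clamped to [0, n-2]."""
    i = math.floor((v - x0) / dx)  # O(1): no search needed
    return min(max(i, 0), n - 2)   # clamp so that i+1 is always a valid index

print(locate_uniform(0.0, 0.5, 11, 1.2))
```

The clamping means targets outside the grid fall back to the first or last interval, which is a design choice made for this sketch. Raising an error would be equally reasonable.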

[Numerical Recipes](https://numerical.recipes/) describes two main
approaches: bisection and hunting, each suited to different scenarios.


### Linear Search

As a baseline, let's consider a simple linear search.
It scans through the array until the first value larger than the
target is found.

In [None]:
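def linear(X, v):
    for i in range(1, len(X)):  # use a Python loop for clarity
        if X[i] > v:            # first value larger than the target
            return i - 1
    return len(X) - 2           # v >= X[-1]: fall back to the last interval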

In [None]:
import numpy as np

for _ in range(5):
    X = sorted(np.random.uniform(0, 100, 10))
    v = np.random.uniform(min(X), max(X))
    i = linear(X, v)
    assert X[i] <= v and v < X[i+1]
    print(f'{X[i]} <= {v} < {X[i+1]}')

This works, but it requires $\mathcal{O}(N)$ steps.
For large datasets, this is inefficient.

### Bisection Search

The bisection method is much faster.
It repeatedly halves the search interval until the target is bracketed.
For $N$ data points, it requires only about $\log_2(N)$ steps.

In [None]:
def bisection(X, v):
    l, h = 0, len(X) - 1
    while h - l > 1:
        m = (l + h) // 2
        if v >= X[m]:
            l = m
        else:
            h = m
    return l  # index l such that X[l] <= v < X[l+1] (clamped at the ends)

In [None]:
for _ in range(5):
    X = sorted(np.random.uniform(0, 100, 10))
    v = np.random.uniform(min(X), max(X))
    i = bisection(X, v)
    assert X[i] <= v and v < X[i+1]
    print(f'{X[i]} <= {v} < {X[i+1]}')

This method is robust and efficient for uncorrelated queries, where
each target value is unrelated to the previous one.
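Python's standard library already provides this binary search in the `bisect` module. As a sketch of how it maps onto our convention: `bisect_right(X, v)` returns the insertion point to the right of any entries equal to `v`, so subtracting one gives the index `i` with `X[i] <= v < X[i+1]`. The wrapper name `locate` and the clamping to a valid interval are additions made here.

```python
from bisect import bisect_right

def locate(X, v):
    """Index i with X[i] <= v < X[i+1], clamped to [0, len(X) - 2]."""
    i = bisect_right(X, v) - 1
    return min(max(i, 0), len(X) - 2)

X = [0.0, 1.0, 2.0, 4.0]
for v in (1.5, 1.0, 4.0, -1.0):
    i = locate(X, v)
    print(f'v = {v}: interval [{X[i]}, {X[i+1]}]')
```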

### Hunting Method

If target values are requested in sequence and tend to be close to one
another, we can do even better.
The hunting method exploits this correlation:
1. Start near the last found index.
2. Step outward (doubling the step size each time) until the target is
   bracketed.
3. Refine the result using bisection in the narrowed interval.

This approach is often faster than starting from scratch with
bisection every time.
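As a rough cost estimate (a back-of-the-envelope sketch, not a precise operation count), suppose the new target lies $d$ intervals away from the last found index in a table of $N$ points. The doubling phase needs about $\log_2 d$ steps to bracket the target, and the bisection that follows works on a bracket of width about $d$:
\begin{align}
  T_\text{hunt} \approx 2\log_2 d,
  \qquad
  T_\text{bisection} \approx \log_2 N.
\end{align}
Under this estimate, hunting wins when $d \lesssim \sqrt{N}$ and is at worst about a factor of two slower than plain bisection.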

In [None]:
def hunt(X, v, i_last):
    n = len(X)
    assert 0 <= i_last < n - 1

    if v >= X[i_last]:
        # hunt upward: grow the bracket [l, h] until v < X[h]
        l, h, step = i_last, min(n-1, i_last+1), 1
        while h < n - 1 and v >= X[h]:
            l, h = h, min(n-1, h + step)
            step *= 2
    else:
        # hunt downward: grow the bracket [l, h] until X[l] <= v
        l, h, step = max(0, i_last-1), i_last, 1
        while l > 0 and v < X[l]:
            l, h = max(0, l - step), l
            step *= 2

    # refine with bisection inside the bracket
    return bisection(X[l:h+1], v) + l

In [None]:
for _ in range(5):
    X = sorted(np.random.uniform(0, 100, 10))
    v = np.random.uniform(min(X), max(X))
    i = hunt(X, v, len(X) // 2)  # start hunting from the middle of the table
    assert X[i] <= v and v < X[i+1]
    print(f'{X[i]} <= {v} < {X[i+1]}')

### Linear Interpolation with Searching

With a search routine in place, interpolation becomes straightforward.
Below is a simple interpolator that supports hunt, bisection, or
linear searching.

In [None]:
class Interpolator:
    
    def __init__(self, X, Y):
        assert len(X) == len(Y)
        self.X, self.Y = X, Y
        self.i_last = len(X) // 2

    def __call__(self, v, method='hunt'):
        if method == 'hunt':
            i = hunt(self.X, v, self.i_last)
        elif method == 'bisection':
            i = bisection(self.X, v)
        else:
            i = linear(self.X, v)

        self.i_last = i  # store last index for hunting

        x0, x1 = self.X[i], self.X[i+1]
        y0, y1 = self.Y[i], self.Y[i+1]
        m      = (y1 - y0) / (x1 - x0)
        return y0 + m * (v - x0)

In [None]:
import matplotlib.pyplot as plt

def f(x):
    return np.exp(-0.5 * x**2)

Xs = np.sort(np.random.uniform(-5, 5, 20))
Ys = f(Xs)

fi = Interpolator(Xs, Ys)

Xi = np.linspace(min(Xs), max(Xs), 100)
Yi = np.array([fi(x) for x in Xi])

plt.plot(Xs, Ys, 'o-', label='Sampled data')
plt.plot(Xi, Yi, '.',  label='Interpolation')
plt.legend()
plt.show()
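As a cross-check, the same search-then-interpolate idea can be written in vectorized form and compared against NumPy's built-in `np.interp`. This is a sketch under our own naming: `lerp` is a hypothetical helper, and `np.searchsorted` with `side='right'` plays the role of the bisection search.

```python
import numpy as np

def lerp(X, Y, V):
    """Piecewise-linear interpolation of (X, Y) at the points V (X must be sorted)."""
    X, Y, V = np.asarray(X), np.asarray(Y), np.asarray(V)
    # searching: index i with X[i] <= V < X[i+1], clamped to a valid interval
    i = np.clip(np.searchsorted(X, V, side='right') - 1, 0, len(X) - 2)
    # interpolating: straight line through the two bracketing points
    m = (Y[i+1] - Y[i]) / (X[i+1] - X[i])
    return Y[i] + m * (V - X[i])

rng = np.random.default_rng(0)
Xs = np.sort(rng.uniform(-5, 5, 20))
Ys = np.exp(-0.5 * Xs**2)
Xi = np.linspace(Xs[0], Xs[-1], 100)

agree = np.allclose(lerp(Xs, Ys, Xi), np.interp(Xi, Xs, Ys))
print('matches np.interp:', agree)
```

Even in this compact form, the search line and the arithmetic lines are clearly separate steps.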

Finally, let's test our claim: hunting should be faster than bisection
when queries are sequential and correlated.

In [None]:
from timeit import timeit

Xs = np.sort(np.random.uniform(-5, 5, 1000))
Ys = f(Xs)
fi = Interpolator(Xs, Ys)
Xi = np.linspace(min(Xs), max(Xs), 10_000)

def job(method):
    Yi = [fi(x, method) for x in Xi]

dt_linear    = timeit("job('linear')",    globals=globals(), number=1)
dt_bisection = timeit("job('bisection')", globals=globals(), number=1)
dt_hunt      = timeit("job('hunt')",      globals=globals(), number=1)

In [None]:
print('Linear   :', dt_linear)
print('Bisection:', dt_bisection)
print('Hunt     :', dt_hunt)

In [None]:
# HANDSON: change the number of sampled points and interpolation
#          points and measure the performance of all three methods.
#          What are the performance characteristics when
#          N_sample >> N_interpolation and
#          N_sample << N_interpolation?


```{note} Main Take Away

It may seem surprising that interpolation, which sounds purely
numerical, actually spends much of its effort on searching.
The interpolation formula itself is simple, but locating the right
interval in the data dominates the work.

This may be less surprising if we ask ourselves what computers are
good at.
Certainly arithmetic, but equally bookkeeping (and, now that we have
the internet, communication).
In fact, much of computer science is devoted to studying how to
organize and retrieve data efficiently.
This is what "data structures and algorithms" are about.
Searching in interpolation is a great example of this.
Efficient bookkeeping can matter as much as the numerical formula.

Our code here was one-dimensional and simple.
But in practice, many scientific applications require multidimensional
interpolation.
In higher dimensions, the search problem quickly becomes far more
complicated.
Efficient methods often rely on tree structures (e.g., kd-trees or
octrees) to organize points and make search practical.
This is the same kind of complexity that appears in $n$-body
simulations and particle-mesh algorithms, where bookkeeping and
efficient search dominate performance.

Thus, linear interpolation serves as a small but important lesson:
computational science (i.e., numerical analysis) and computer science
are deeply connected.
To do science at scale, one must care not only about the equations,
but also about how data are organized and searched.
```