# Sampled data

**ENGSCI233: Computational Techniques and Computer Systems** 

*Department of Engineering Science, University of Auckland*

This topic addresses algorithmic implementations of **interpolation** and **integration**.

Interpolation is when we try to estimate the value of some process using observations around it. There are many ways to do this: drawing lines between data points, or fitting smooth functions like polynomials or cubic splines.

Integration is when we try to estimate the area under a curve or, sometimes, under discrete data. Methods include the trapezium rule and Gaussian quadrature.

You need to know:
- The pros and cons of linear, polynomial and spline interpolation, and how these extend to extrapolation.
- The mathematics of the trapezium rule and Gaussian quadrature. In both cases, the integral is approximated by a weighted sum of function values.

In [None]:
# imports and environment: this cell must be executed before any other in the notebook
%matplotlib inline
from matplotlib import pyplot as plt

Sampled data is defined as ***a set of measurements of a continuous process at discrete points (locations or times)***. 

We shall use the following terminology. 
 - The **continuous process** is $y(x)$, with $x$ the **independent variable** (e.g., position, time) and $y$ the **dependent variable** (e.g., velocity, temperature, force).
 - The set of $N$ **measurement points** is $[x_0,x_1,\cdots x_{n-1}]$ (with $n=N$) or, more compactly, $x_i$, where the index, $i$, indicates "*could be any of the values 0 through to $n-1$*".
 - The corresponding set of $N$ **measurement values** is $[y(x_0), y(x_1),\cdots y(x_{n-1})]$, or $[y_0, y_1,\cdots y_{n-1}]$, or $y_i$.

As an example, consider the set of eight temperature measurements (points and values) in the table below:

|  time  |  2.5 | 3.5  | 4.5 |  5.6 |  8.6 |  9.9 | 13.0  | 13.5|
|-|-|-|-|-|-|-|-|-|
| temperature |24.7 | 21.5 | 21.6 | 22.2 | 28.2 | 26.3 | 41.7 | 54.8|

In this case: $N=8$; the dependent variable is temperature, $T$; and the independent variable is time, $t$, i.e., $y(x)\equiv T(t)$. Note, there is no requirement that the measurement points are evenly spaced.

***Run the code in the cell below to view the data.***

In [None]:
# imports
import numpy as np
from sampled_data233 import plot_data

# specify the data as numpy arrays
xi = np.array([2.5, 3.5, 4.5, 5.6, 8.6, 9.9, 13.0, 13.5])
yi = np.array([24.7, 21.5, 21.6, 22.2, 28.2, 26.3, 41.7, 54.8])

# plot the data
f,ax = plt.subplots(1,1)
plot_data(xi,yi,ax,label='data');

The sampled data $(x_i,y_i)$ provide an **incomplete picture** of the continuous process $y(x)$. We shall consider two uses of such data: **interpolation** and **integration**.

## [1 Interpolation](https://en.wikipedia.org/wiki/Interpolation)

<mark>***Filling in the gaps between data.***</mark>

Often we may need a measurement at a particular point or points, $x_j$, for which there are no data (note, I am using the index $i$ for data we **do have**, and $j$ for data we **don't**). However, if there is data **either side** of the unknown point, then we can use this information to approximate $y_j=y(x_j)$: this is known as **interpolation**. 

***Run the code in the cell below to see proposed interpolation points, $x_j$.***

In [None]:
# import functions
from sampled_data233 import plot_interpolation_lines

# specify the interpolation locations
xj = np.linspace(3., 13., 11)

# plot the data
f,ax = plt.subplots(1,1)
plot_data(xi,yi,ax,label='data')

# plot interpolation locations as vertical lines
plot_interpolation_lines(xj, ax);

***Modify the code above so that interpolation is twice as frequent.***

We shall consider three methods of interpolation. They are all variations on the same strategy:
1. Find an **interpolating function**, $f(x)$, that exactly or approximately matches the data, i.e., $y_i= \text{or}\approx f(x_i)$.
2. Find $y_j$ by **evaluating** $f(x)$ at the points of interest, i.e., $y_j=f(x_j)$.

The three methods - (i) polynomial fitting, (ii) piecewise linear interpolation, and (iii) cubic splines - differ only in how they **determine** $f(x)$.

Note, we are assuming the $x_j$ lies within the range of our data, i.e., $x_0\leq x_j\leq x_{n-1}$. If $x_j$ lies outside this range, the process is called **extrapolation**. 

### [1.1 Polynomial fitting](https://en.wikipedia.org/wiki/Polynomial_regression)

You have probably already encountered this type of interpolation. For example, finding the **line of best fit** is an example of fitting a polynomial of order 1 to a set of points. 

Suppose the interpolating function is a polynomial, e.g., $f(x)=3x^4-2x^2+2$ or $T=f(t)=-8.35t +1.06\times 10^4$. We need to determine:
- What **order** should the polynomial be? (What is the highest power of the independent variable?)
- How should I find its **coefficients**?

The first point is a judgment call on your part. As we shall see, you are trading off polynomial complexity against the ability to fit the data. As a general rule, you should ***seek the simplest polynomial that does a "good job fitting the data"***.

#### 1.1.1 Theory

A polynomial of order $m$ has the form

$$ f(x) = \sum\limits_{k=0}^{m} a_k x^k $$

where we use the index $k$ to distinguish from the data index, $i$, and the interpolation index, $j$. The coefficients are denoted by $a_k$, e.g., for the polynomial $y=2x^2-3x+1$, $a_k=[a_0,a_1,a_2]=[1,-3,2]$. 

We need to find the values of $a_k$ that provide the "**best fit to the data**" - let's define that phrase.

We **know** that the data has values $y_i$ (we measured those). The interpolating function **"predicts"** that the data should have values $f(x_i)$. Generally, these will **not** be equal, i.e., $y_i \neq f(x_i)$.

***"Doesn't that make it a terrible interpolating function?"***

Not necessarily, provided that $f(x_i)$ is **reasonably close** to $y_i$. Formally, we define how close they are by (i) taking the difference between $f(x_i)$ and $y_i$, (ii) squaring it (so there is no difference between under and overpredicting), and (iii) adding the squared differences for all data points (to obtain a single number). Thus, the **sum of squared differences** is given by

$$ R^2 = (y_0-f(x_0))^2+(y_1-f(x_1))^2+\cdots (y_{n-1}-f(x_{n-1}))^2 = \sum\limits_{i=0}^{n-1}\left(y_i-(a_0+a_1x_i+a_2x_i^2+\cdots a_mx_i^m)\right)^2 $$

This single number, sometimes called the **residual**, is a quantitative measure of "*how well does the interpolating function match the data*".
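
As an illustration, the sum of squared differences can be computed directly for any trial set of coefficients. The sketch below uses the temperature data from the table above; the function name `sum_squared_differences` and the two trial straight lines are assumptions chosen for illustration, not part of the course code.

```python
import numpy as np

# temperature data from the table above
xi = np.array([2.5, 3.5, 4.5, 5.6, 8.6, 9.9, 13.0, 13.5])
yi = np.array([24.7, 21.5, 21.6, 22.2, 28.2, 26.3, 41.7, 54.8])

def sum_squared_differences(a, x, y):
    """Return R^2 for the polynomial with coefficients a = [a_0, a_1, ..., a_m]."""
    powers = np.arange(len(a))                              # exponents 0, 1, ..., m
    fx = (np.asarray(a) * x[:, None]**powers).sum(axis=1)   # f(x_i) for every data point
    return np.sum((y - fx)**2)

# two hypothetical trial polynomials of order 1: f(x) = 30 and f(x) = 15 + 2x
r2_flat = sum_squared_differences([30., 0.], xi, yi)
r2_sloped = sum_squared_differences([15., 2.], xi, yi)
print(r2_flat, r2_sloped)
```

The sloped line gives the smaller $R^2$ of the two, so by this measure it matches the data better, although neither is necessarily the best straight line.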

The next goal is to find values of the coefficients, $a_k$, that minimize $R^2$. Generally speaking, how do we find $x$ that minimises $y(x)$? Why, by finding the value $x$ where $dy/dx=0$ of course! (And maybe checking the second derivative is >0 to prove this is a minimum.)

Given there are $m+1$ coefficients for us to find ($a_0$ through $a_m$), this means we can form equations for $m+1$ derivatives (one each with respect to the $m+1$ coefficients $a_k$) and solve these simultaneously. For example, for the case $m=2$, we have $f(x)=a_0+a_1x+a_2x^2$, and we can formulate three equations:

$$ \frac{\partial(R^2)}{\partial a_0}=-2 \sum\limits_{i=0}^{n-1}\left(y_i - (a_0+a_1x_i+a_2x_i^2)\right)=0,$$

$$ \frac{\partial(R^2)}{\partial a_1}=-2 \sum\limits_{i=0}^{n-1} x_i \left(y_i - (a_0+a_1x_i+a_2x_i^2)\right)=0,$$

$$ \frac{\partial(R^2)}{\partial a_2}=-2 \sum\limits_{i=0}^{n-1} x_i^2\left(y_i - (a_0+a_1x_i+a_2x_i^2)\right)=0.$$

***Confirm that you can obtain these equations: differentiate the expression for $R^2$ with respect to the coefficients, $a_k$, and set equal to zero (you will need to use the chain rule and hold the $x_i$ constant).***

Each of these equations can be rearranged in terms of (1) a LHS that depends on the unknown coefficients, $a_k$, and (2) a RHS that depends only on the known data, $x_i$ and $y_i$. For example, the first equation can be rewritten as:

$$ (n)a_0 + \left(\sum x_i\right) a_1 + \left(\sum x_i^2\right)a_2 = \sum y_i  $$

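We can write the three equations in matrix form. The LHS matrix is referred to here as the **Vandermonde matrix**; strictly, it is the product $V^TV$, where $V$ is the rectangular Vandermonde matrix with entries $V_{ik}=x_i^k$: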

$$ \begin{bmatrix} n & \sum x_i & \sum x_i^2 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4\end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_iy_i \\ \sum x_i^2y_i \end{bmatrix}.  $$

This system of equations is now in the form $A\mathbf{x}=\mathbf{b}$ and can be solved (numerically) using the LU factorization algorithm developed last week.

For a polynomial of order $m$, the expression above generalizes:

$$ \begin{bmatrix} 
n & \sum x_i & \sum x_i^2 & \cdots & \sum x_i^m \\ 
\sum x_i & \sum x_i^2 & \sum x_i^3 & \cdots & \sum x_i^{m+1} \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4 & \cdots & \sum x_i^{m+2} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sum x_i^m & \sum x_i^{m+1} & \sum x_i^{m+2} & \cdots & \sum x_i^{2m} \end{bmatrix} 
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_m\end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_iy_i \\ \sum x_i^2y_i \\ \vdots \\ \sum x_i^m y_i \end{bmatrix}.  $$


#### 1.1.2 Algorithm

The key algorithm steps for polynomial fitting are:
```
1. Initialise: polynomial order M, data XI, YI.
2. Construct size M+1 square Vandermonde matrix using data XI.
3. Construct length M+1 RHS vector using data XI, YI.
4. Solve Ax=b for coefficients of f(x).
 - compute LU decomposition of A using LU_factor
 - find x using LU_solve
```
Once the coefficients of $f(x)$ are determined, interpolation proceeds straightforwardly:
```
5. Initialise: interpolation locations, XJ.
6. Evaluate f(x) at XJ.
```
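
Steps 1 to 6 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the course implementation: the function names `fit_polynomial` and `evaluate_polynomial` are hypothetical, and `np.linalg.solve` stands in for the `LU_factor` and `LU_solve` routines mentioned above.

```python
import numpy as np

# temperature data from the table above
xi = np.array([2.5, 3.5, 4.5, 5.6, 8.6, 9.9, 13.0, 13.5])
yi = np.array([24.7, 21.5, 21.6, 22.2, 28.2, 26.3, 41.7, 54.8])

def fit_polynomial(x, y, m):
    """Least-squares coefficients [a_0, ..., a_m] of an order-m polynomial."""
    # rectangular matrix whose column k holds x_i^k
    V = np.vander(x, m + 1, increasing=True)
    A = V.T @ V    # square (m+1) matrix with entries sum_i x_i^(r+c)
    b = V.T @ y    # RHS vector with entries sum_i x_i^r y_i
    # assumption: a library solver replaces the course LU routines
    return np.linalg.solve(A, b)

def evaluate_polynomial(a, x):
    """Evaluate f(x) = sum_k a_k x^k at the points x."""
    return np.vander(np.atleast_1d(x), len(a), increasing=True) @ a

a = fit_polynomial(xi, yi, 2)        # quadratic fit
xj = np.linspace(3., 13., 11)        # interpolation locations
yj = evaluate_polynomial(a, xj)
print(a)
print(yj)
```

Forming $V^TV$ explicitly mirrors the derivation above, but it becomes poorly conditioned as the order grows, which is worth keeping in mind when experimenting with large `m`.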
A completed Python implementation of polynomial fitting is given in `sampling233.py` and demonstrated in the next section.

#### 1.1.3 Demonstration

In [None]:
# import functions
from ipywidgets import interact, fixed
from sampled_data233 import plot_polynomial_elements

def polynomial_fitting_figure(order=1, interpolate = False, extrapolate = False, xi = None, yi = None, xj = None):
    # create figure
    f,ax = plt.subplots(1,1)
    f.set_size_inches(5,5)
    
    # see sampling233.py for details on the Python implementation of polynomial fitting
    plot_polynomial_elements(ax, xi, yi, xj, order, interpolate, extrapolate)

# run the interactive figure environment
interact(polynomial_fitting_figure, order = (1,8,1), interpolate = False, extrapolate = False, xi = fixed(xi), yi = fixed(yi), xj = fixed(xj));

In [None]:
# EXPERIMENT with the order of the fitted function above to obtain
# the BEST polynomial.

# To determine BEST, weigh the CRITERIA below:
# - MISFIT with the data
# - PLAUSIBILITY of the equation
# - ability to EXTRAPOLATE

# ENTER your answer in POLYNOMIAL FITTING ORDER poll on the MODULE PAGE

#### 1.1.4 Concept questions

***Compare the interpolating function for $m=2$ and $m=6$:***
- *Which fits the data better?*
- *Which looks more like a physical process?*
- *Which is the better choice for $m$? If you chose $2$, how do you justify not matching the data?*

> <mark>*~ your answer here ~*</mark>

***What value of $m$ ensures that the $R^2$ is (almost) zero? How does this relate to the number of data points?***

> <mark>*~ your answer here ~*</mark>

***When $R^2=0$, $f(x)$ passes through all the data points exactly. Why might this be a bad thing?***

> <mark>*~ your answer here ~*</mark>

***Why do we encounter problems when setting $m=8$?***

> <mark>*~ your answer here ~*</mark>

***Can $f(x)$ be used for extrapolation? (i.e., is $f(x)$ defined for $x_j<x_0$ or $x_j>x_{n-1}$?)***

> <mark>*~ your answer here ~*</mark>


### [1.2 Piecewise linear interpolation](https://en.wikipedia.org/wiki/Linear_interpolation#Linear_interpolation_between_two_known_points)

This method **fits a straight line segment between neighbouring data points**. For $N$ data points, there are $N-1$ adjacent data pairs, and the set of $N-1$ straight line segments defines the interpolating function, $f(x)$, in a **piecewise** manner.

This interpolation method is commonly used and is easy to implement. It is one of a subset of methods called **Lagrange interpolation**.

Piecewise linear interpolation adopts the same perspective of sampled data as the **finite difference** formula and the **Euler method**, i.e., "*we don't know what happens between these two points, let's assume it's a straight line.*"

#### 1.2.1 Theory

Define the $i^{\text{th}}$ **subinterval** between neighbouring data points, $I_i=[x_i, x_{i+1}]$. The straight line linking these points is given by:

$$y(x) = y_i + \frac{y_{i+1}-y_i}{x_{i+1}-x_i}(x-x_i) = m_ix+c_i$$

where $m_i=(y_{i+1}-y_i)/(x_{i+1}-x_i)$ and $c_i=y_i-m_ix_i$ i.e., the gradient and intercept of the straight line.
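
As a small sketch of these formulas, the gradients and intercepts of all $N-1$ segments can be computed at once with `np.diff`. The array names `mi` and `ci` are chosen here to match the symbols above; they are not taken from the course code.

```python
import numpy as np

# temperature data from the table above
xi = np.array([2.5, 3.5, 4.5, 5.6, 8.6, 9.9, 13.0, 13.5])
yi = np.array([24.7, 21.5, 21.6, 22.2, 28.2, 26.3, 41.7, 54.8])

# one gradient and one intercept per subinterval (N-1 of each)
mi = np.diff(yi) / np.diff(xi)
ci = yi[:-1] - mi * xi[:-1]

# evaluate the segment for subinterval i=0, which spans [2.5, 3.5], at x=3
x = 3.0
y = mi[0] * x + ci[0]
print(y)   # approximately 23.1
```

For this point the result agrees with NumPy's own piecewise linear routine, `np.interp(3.0, xi, yi)`.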

#### 1.2.2 Algorithm

The key algorithm steps for piecewise linear interpolation are:
```
1. Initialise data XI, YI.
2. For the ith subinterval, compute the straight line gradient, mi, and intercept, ci.
3. Find which interpolation points, xj, fall within the subinterval, i.e., xi<=xj<=xi+1.
4. Evaluate the piecewise interpolating function at xj.
```
A Python implementation of piecewise linear interpolation is given in `sampling233.py` and demonstrated in the next section.

#### 1.2.3 Demonstration

In [None]:
# import functions
from sampled_data233 import plot_piecewise_elements

def piecewise_linear_figure(interpolate = False, xi = None, yi = None, xj = None):
    # create figure
    f,ax = plt.subplots(1,1)
    f.set_size_inches(5,5)
    
    # see sampling233.py for details on the Python implementation of piecewise linear interpolation
    plot_piecewise_elements(ax, interpolate, xi, yi, xj)

# run the interactive figure environment
interact(piecewise_linear_figure, interpolate = False, xi = fixed(xi), yi = fixed(yi), xj = fixed(xj));

#### 1.2.4 Concept questions

***The interpolating function passes through each point exactly. Explain when this could be a disadvantage.***

> <mark>*~ your answer here ~*</mark>

***$f(x)$ is continuous at the data points. Are its derivatives?***

> <mark>*~ your answer here ~*</mark>

***Can $f(x)$ be used for extrapolation?***

> <mark>*~ your answer here ~*</mark>


***- - - - CLASS CODING EXERCISE - - - -***

In [None]:
# PART ONE
# --------
# SUPPOSE we have the interpolating function
def interp(x0,y0,x1,y1,x): 
    return y0 + (y1 - y0)/(x1 - x0)*(x - x0)
# and the data XI and YI
xi = np.array([2.5, 3.5, 4.5, 5.6, 8.6, 9.9, 13.0, 13.5])
yi = np.array([24.7, 21.5, 21.6, 22.2, 28.2, 26.3, 41.7, 54.8])

# for example
x = 3
y = interp(xi[0], yi[0], xi[1], yi[1], x)
print(x,y)

# For the interpolation value X below, loop over PAIRS of data points in XI
# to FIND the interval that X lies inside of, then perform the interpolation
x = 6.8
# **your code here**

In [None]:
# OPTIONAL CHALLENGE
# -----------------
# Do the same thing, but for 
xj = [3, 4, 5, 6, 7, 8, 9, 10]

### [1.3 Cubic splines](https://en.wikipedia.org/wiki/Spline_interpolation)

In order to overcome **derivative discontinuity** inherent in piecewise linear interpolation (and higher order Lagrange schemes), cubic splines are used. Like the previous method, for $N$ data points, spline based interpolation breaks the range of the interpolating function into $N-1$ subintervals of the form $I_i = [x_i , x_{i+1}]$. A **different** cubic polynomial, **passing through the data exactly**, is then used to interpolate within each subinterval. 

The polynomials are chosen so that $f(x)$, and its **first and second derivatives are continuous** at the subinterval boundaries, $x_i$. 

#### 1.3.1 Theory

For the $i^{\text{th}}$ subinterval, define the cubic polynomial, $p_i(x)$, with coefficients

$$p_i(x) = a_0^{(i)} + a_1^{(i)}(x - x_i) + a_2^{(i)}(x - x_i)^2 + a_3^{(i)}(x - x_i)^3.$$

Note, the superscript $(i)$ means "*this coefficient belongs to the $i^{\text{th}}$ subinterval*", not "*to the power $i$*".

Similarly for the neighbouring subintervals ($i-1$ and $i+1$), we can write

$$p_{i-1}(x) = a_0^{(i-1)} + a_1^{(i-1)}(x - x_{i-1}) + a_2^{(i-1)}(x - x_{i-1})^2 + a_3^{(i-1)}(x - x_{i-1})^3,$$

$$p_{i+1}(x) = a_0^{(i+1)} + a_1^{(i+1)}(x - x_{i+1}) + a_2^{(i+1)}(x - x_{i+1})^2 + a_3^{(i+1)}(x - x_{i+1})^3.$$

**How many unknowns?**

As with polynomial fitting, there are a certain number of unknown polynomial coefficients. We shall set up an equal number of equations and solve these simultaneously. There are $N-1$ subintervals (for $N$ data points there are $N-1$ neighbouring pairs), and *for each* subinterval, there are 4 unknowns. So we need $4(N-1)$ equations.

Recall when solving ODEs: first we obtain the general solution with **unknown constants**, then we find the particular solution (and the values of the constants) by applying **initial and boundary conditions**. We shall take the same approach here and use conditions for (i) the **value** and (ii) the **first and second derivatives** of the interpolating function at the subinterval boundaries.

**Boundary conditions on function value**

We said that the cubic polynomials for each subinterval **must** pass through the bounding data points, $(x_i, y_i)$, exactly. Therefore, $p_i(x_i)=y_i$ and $p_i(x_{i+1})=y_{i+1}$, or

$$y_i = a_0^{(i)},\quad\quad y_{i+1}=a_0^{(i)} + a_1^{(i)}\Delta x_i + a_2^{(i)}\Delta x_i^2 + a_3^{(i)}\Delta x_i^3,\quad\quad \Delta x_i=x_{i+1}-x_i,\quad\quad i=[0,1,\cdots n-2].$$

This gives us $2$ equations for each of the $N-1$ subintervals ($2(N-1)$ equations down, $2(N-1)$ to go...) 

**Boundary conditions on function derivatives**

We also said that the cubic polynomials for each subinterval must have a first and second order derivative that is **continuous** with its neighbours. The first derivative of the polynomial for the $i^{\text{th}}$ subinterval is:

$$ \frac{dp_i}{dx}=a_1^{(i)}+2a_2^{(i)}(x-x_i)+3a_3^{(i)}(x-x_i)^2 $$

and for the $(i+1)^{\text{th}}$ subinterval:

$$ \frac{dp_{i+1}}{dx}=a_1^{(i+1)}+2a_2^{(i+1)}(x-x_{i+1})+3a_3^{(i+1)}(x-x_{i+1})^2.$$

These two subintervals share a boundary at $x=x_{i+1}$ and here the first derivatives **must** be equal, therefore

$$a_1^{(i)}+2a_2^{(i)}\Delta x_i+3a_3^{(i)}\Delta x_i^2-a_1^{(i+1)}=0,\quad\quad i=[0,1,\cdots n-3].$$

We have one equation of this type for each pair of neighbouring subintervals, i.e., another $N-2$ equations.

Applying similar reasoning, but now requiring the **second** derivative to be continuous, leads to another $N-2$ equations of the form

$$2a_2^{(i)}+6a_3^{(i)}\Delta x_i-2a_2^{(i+1)}=0,\quad\quad i=[0,1,\cdots n-3].$$

**Finding the last two equations**

So far we have used continuity of function value, slope and curvature to define $4(N-1)-2$ equations for $4(N-1)$ unknowns. The last two equations, which define a class of cubic splines called **natural** splines, are obtained by requiring the second derivative at the **data extremes** ($x_0$ and $x_{n-1}$) to be zero, i.e.,

$$ 2a_2^{(0)}=0, \quad\quad 2a_2^{(n-2)}+6a_3^{(n-2)}\Delta x_{n-2}=0.$$

Note, an alternative approach is the "not-a-knot" condition, which instead requires the **third** derivative to be continuous at the first and last interior data points, $x_1$ and $x_{n-2}$.

**Putting it all together**

The $4(N-1)$ equations in $4(N-1)$ unknowns are solved by setting up a matrix equation of the form $A\mathbf{x}=\mathbf{b}$, where $\mathbf{x}$ is the vector of unknown polynomial coefficients, entries in the matrix, $A$, depend only on the terms $\Delta x_i$ (the spacing between data points), and the RHS vector, $\mathbf{b}$, contains only the measurements, $y_{i}$, and zeros.
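
To make the structure of $A$ and $\mathbf{b}$ concrete, here is a sketch of the smallest non-trivial case: three data points, hence two subintervals and eight unknowns. It writes out each equation above by hand and is not a general implementation; the ordering of unknowns, the helper `spline_eval`, and the use of `np.linalg.solve` in place of the course LU routines are all assumptions made for illustration.

```python
import numpy as np

# first three temperature measurements: two subintervals, 8 unknowns
x = np.array([2.5, 3.5, 4.5])
y = np.array([24.7, 21.5, 21.6])
h0, h1 = np.diff(x)   # subinterval widths, Delta x_0 and Delta x_1

# unknowns ordered [a0, a1, a2, a3] for subinterval 0, then the same for subinterval 1
A = np.zeros((8, 8))
b = np.zeros(8)

# function value conditions: each cubic passes through its two bounding points
A[0, 0] = 1.;                           b[0] = y[0]
A[1, 0:4] = [1., h0, h0**2, h0**3];     b[1] = y[1]
A[2, 4] = 1.;                           b[2] = y[1]
A[3, 4:8] = [1., h1, h1**2, h1**3];     b[3] = y[2]

# first and second derivative continuity at the shared boundary x_1
A[4, 1:4] = [1., 2.*h0, 3.*h0**2];      A[4, 5] = -1.
A[5, 2:4] = [2., 6.*h0];                A[5, 6] = -2.

# natural spline conditions: zero second derivative at x_0 and x_2
A[6, 2] = 2.
A[7, 6:8] = [2., 6.*h1]

# row i of coeffs holds the four coefficients for subinterval i
coeffs = np.linalg.solve(A, b).reshape(2, 4)

def spline_eval(xq):
    """Evaluate the spline at a single point xq inside [x_0, x_2]."""
    i = 0 if xq <= x[1] else 1
    dx = xq - x[i]
    a = coeffs[i]
    return a[0] + a[1]*dx + a[2]*dx**2 + a[3]*dx**3

print(coeffs)
print(spline_eval(3.0), spline_eval(4.0))
```

Every row of `A` involves only the widths `h0` and `h1`, and `b` holds only measurements and zeros, as described above.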

#### 1.3.2 Algorithm

The key algorithm steps for constructing cubic splines are:
```
1. Set up the matrix, A, populating with appropriate coefficients of the terms in aj^(i) in the equations above.
2. Set up the RHS vector, b, populating with the appropriate right-hand sides of the equations above.
3. Use LU factorization to solve the matrix equation and obtain the spline coefficients.
4. Use the spline coefficients to interpolate at the desired points.
```
Python implementation of this algorithm is the subject of Lab 4. A demonstration is provided in the next section.

#### 1.3.3 Demonstration

Run the code in the cell below for a visual illustration of cubic spline interpolation (using Python's built in spline functions).

In [None]:
# import functions
from sampled_data233 import plot_spline_elements

def cubic_spline_figure(interpolate = False, SubIntEqn = 0, xi = None, yi = None, xj = None):
    # create figure
    f,ax = plt.subplots(1,1)
    f.set_size_inches(5,5)
    
    # see sampling233.py for use of Python built-in spline interpolation
    plot_spline_elements(ax, interpolate, SubIntEqn, xi, yi, xj)

# run the interactive figure environment
interact(cubic_spline_figure, interpolate = False, SubIntEqn = (0,7,1), xi = fixed(xi), yi = fixed(yi), xj = fixed(xj));

#### 1.3.4 Concept questions

***What are the advantages and disadvantages of cubic spline interpolation compared to polynomial fitting and piecewise linear interpolation?***

> <mark>*~ your answer here ~*</mark>


## [2 Integration](https://en.wikipedia.org/wiki/Numerical_integration)

<mark>***Constructing cumulative measures of discrete quantities.***</mark>

Why do we need numerical methods for integration? Have a go at solving the integral below. Go ahead, I'll wait.

$$ \int \frac{\sin\left(\frac{a\,\cos(x)+b\,\sin(x)}{\cos(x)}\right)}{a\,\cos(x)+b\,\sin(x)}dx$$

**Not all integrals can be solved analytically.** Take the general integral $I=\int\limits_{x_0}^{x_{n-1}}g(x)\,dx$, where we know $g(x)$ as the **integrand**.

We shall consider a class of methods that approximately evaluate this integral. These methods are based on the idea that the value of an integral, $I$, corresponds to the area under the graph of the integrand. There are two cases:
1. We **know** the integrand, $g(x)$, exactly.
2. We **don't know** $g(x)$ exactly, but we do have some data, $(x_i, y_i)$. Therefore, we can find an interpolating function, $f(x)\approx g(x)$.

Numerical integration methods break the integration range into manageably sized subintervals (similar to piecewise linear interpolation and cubic splines) and then compute the area of each. If $g(x)$ is known, then the subintervals can be chosen. Otherwise, the subintervals are defined by the data locations $x_i$. 

### [2.1 Newton-Cotes methods](https://en.wikipedia.org/wiki/Newton%E2%80%93Cotes_formulas)

We will first consider the **Newton-Cotes methods**, the zeroth, first and second order versions of which you may recognise as the Rectangular, Trapezoidal and Simpson's methods, respectively.

These methods approximate the function, $g(x)$, between subinterval boundaries with a polynomial of order, $m$. For example, the Trapezium method fits a straight line between the boundary points $(x_i, g(x_i))$ and $(x_{i+1}, g(x_{i+1}))$.

***Run the code in the cell below to see how an integral is approximated by computing the areas of three subintervals. Both cases are considered: a known function, $g(x)$, and collected data, $(x_i, y_i)$.*** 

In [None]:
# import functions
from sampled_data233 import plot_integration_elements

def integration_example(gx_known, subintervals, area='None'):
    # create figure
    f,ax = plt.subplots(1,1)
    f.set_size_inches(5,5)
    
    # show plot
    plot_integration_elements(ax, gx_known, subintervals, area)

# run the interactive figure environment
interact(integration_example, gx_known=True, subintervals=False, area = ['None', 'A0','A1','A2','Atot']);

#### 2.1.1 Theory

We denote bounding values for the $i^{\text{th}}$ subinterval, $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$. If the integrand, $g(x)$, is known, then $y_i=g(x_i)$ and $y_{i+1}=g(x_{i+1})$.

For the Trapezium rule, the equation of the straight line joining the bounding values is 

$$ y(x) = \frac{y_{i+1}-y_i}{x_{i+1}-x_i}(x-x_i)+y_i,$$

The area of the subinterval, $A_i$, is then the integral beneath this line, i.e.,

$$A_i = \int\limits_{x_i}^{x_{i+1}} y(x) dx = \frac{y_{i+1}+y_i}{2}(x_{i+1}-x_i),$$

which might be interpreted literally as "*the average height of the subinterval, ($\frac{y_{i+1}+y_i}{2}$), times the width of the subinterval, ($x_{i+1}-x_i$)*".

Imagine that we add together the areas for **two neighbouring subintervals**, $I_i$ and $I_{i+1}$. Suppose further that these subintervals have the same width, i.e., $x_{i+2}-x_{i+1}=x_{i+1}-x_i=\Delta x$. Recognizing that the expressions for $A_i$ and $A_{i+1}$ share a common boundary point, we can write

$$ A_i+A_{i+1} = \frac{\Delta x}{2}(y_i+2y_{i+1}+y_{i+2})$$

and for $N-1$ subintervals

$$I=\int\limits_{x_0}^{x_{n-1}} g(x) dx \,\,\approx\,\, \sum\limits_{i=0}^{n-2}A_i =\frac{\Delta x}{2}(y_0+2y_1+\cdots+2y_{n-2}+y_{n-1}),\quad\text{for}\,\,y_i=g(x_i).$$
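
The equal-width formula above translates almost directly into code. The sketch below assumes a known integrand; the function name `trapezium` and the test integrand $g(x)=x^2$ on $[0,1]$ (exact integral $1/3$) are choices made for illustration.

```python
import numpy as np

def trapezium(g, x0, x1, n_sub):
    """Trapezium rule with n_sub equal-width subintervals on [x0, x1]."""
    x = np.linspace(x0, x1, n_sub + 1)   # n_sub + 1 boundary points
    y = g(x)
    dx = (x1 - x0) / n_sub
    # end points are weighted once, interior points twice
    return dx / 2. * (y[0] + 2. * np.sum(y[1:-1]) + y[-1])

def g(x):
    return x**2   # hypothetical integrand with exact integral 1/3 on [0, 1]

for n_sub in [2, 4, 8]:
    print(n_sub, trapezium(g, 0., 1., n_sub) - 1./3.)
```

For this integrand the error falls by roughly a factor of four each time the number of subintervals doubles.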

***Exercise 1: Rectangular method***

The Rectangular method fits a zeroth order polynomial (a horizontal line) to each subinterval, i.e., $y(x)=y_i$. 

- ***Show that the area of the $i^{\text{th}}$ subinterval is given by $A_i=y_i(x_{i+1}-x_i)$.*** 
- ***Show that the total area of $N-1$ equal-width subintervals is***

$$ I \approx \Delta x(y_0+y_1+\cdots+y_{n-2}).$$

***Exercise 2: Simpson's method***

Simpson's method fits a second order polynomial to each subinterval (with midpoint) $[x_i,\frac{1}{2}(x_i+x_{i+1}),x_{i+1}]$. 

- ***Consider a subinterval of width $\Delta x$ centered at $x=0$, i.e., $x_i=-\frac{\Delta x}{2}$ and $x_{i+1}=\frac{\Delta x}{2}$. Show that the quadratic fitting the three function values $[y_i,y_{i+\frac{1}{2}},y_{i+1}]$, where $y_{i+\frac{1}{2}}$ is the function evaluated at the subinterval midpoint, is:***

$$ y = ax^2+bx+c,\quad\text{where}\,\,a=\frac{2}{(\Delta x)^2}(y_i-2y_{i+\frac{1}{2}}+y_{i+1}),\quad b=\frac{1}{\Delta x}(y_{i+1}-y_i),\quad c=y_{i+\frac{1}{2}}$$

- ***Show by integration of $y(x)$ that the area of the subinterval is***

$$ I=\int\limits_{-\Delta x/2}^{\Delta x/2} y(x) dx = \frac{\Delta x}{6}(y_i+4y_{i+\frac{1}{2}}+y_{i+1})$$

- ***Thus, show that the total area of $N-1$ equal-width subintervals is***

$$ I \approx \frac{\Delta x}{6}(y_0 + 4y_{1/2}+2y_1+4y_{3/2}+\cdots+2y_{n-2}+4y_{n-3/2}+y_{n-1}). $$
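
A quick numerical check can complement the algebra in Exercise 2. The sketch below applies the single-subinterval formula to a quadratic, for which the fitted parabola is the integrand itself and so the result should match the exact integral. The function name `simpson_subinterval` and the test integrand are assumptions for illustration.

```python
def simpson_subinterval(g, xa, xb):
    """Simpson's rule on one subinterval [xa, xb], using its midpoint."""
    dx = xb - xa
    xm = 0.5 * (xa + xb)
    return dx / 6. * (g(xa) + 4. * g(xm) + g(xb))

def g(x):
    return 3.*x**2 - 2.*x + 1.   # hypothetical quadratic integrand

# exact integral on [0, 2] from the antiderivative x^3 - x^2 + x
exact = 2.**3 - 2.**2 + 2.
approx = simpson_subinterval(g, 0., 2.)
print(exact, approx)
```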


#### 2.1.2 Demonstration

In [None]:
# import functions
from sampled_data233 import plot_trapezium_elements

def trapezium_figure(N=2):
    # create figure
    f,ax = plt.subplots(1,1)
    f.set_size_inches(5,5)
    
    # show plot
    plot_trapezium_elements(ax, N)

# run the interactive figure environment
interact(trapezium_figure, N = (2,10,1));

#### 2.1.3 Concept questions

***How does increasing the number of subintervals improve the accuracy of the Trapezium method? Describe the tradeoffs.***

> <mark>*~ your answer here ~*</mark>

***Does the trapezium method appear to approximate some subintervals better or worse than others?***

> <mark>*~ your answer here ~*</mark>

***Generalize the Trapezium method to the case of subintervals of uneven width.***

> <mark>*~ your answer here ~*</mark>


### [2.2 Gaussian quadrature](https://en.wikipedia.org/wiki/Gaussian_quadrature)

This is a numerical scheme for evaluating integrals with polynomial integrands up to a given order, exactly. The polynomial order that can be integrated exactly depends on the number of Gauss points in the quadrature scheme. For approximating integrals, these methods are often preferred over Newton-Cotes methods as they require fewer function evaluations for a given accuracy.

#### 2.2.1 Theory

Before we start discussing Gaussian quadrature in detail, it is helpful to standardise the integration process. We achieve this by **normalising** the integral.

We introduce the variable $\xi$ such that $\xi=\frac{x-x_0}{x_1-x_0}$ and, correspondingly, $x=x_0+(x_1-x_0)\xi$. From this definition, at $x=x_0$ we have $\xi=0$ and at $x=x_1$ we have $\xi=1$. This implies that $dx=(x_1-x_0)d\xi=Jd\xi$ (where $J$ is called the **Jacobian** and is the function that transforms $dx\rightarrow d\xi$). The integrand, $g(x)$, becomes $g(x_0+(x_1-x_0)\xi)$. Using these results, we can rewrite the integral, $I$, in normalised form:

$$ I=\int\limits_{x_0}^{x_1} g(x)\,dx \rightarrow \int\limits_0^1 \bar{g}(\xi)\,J\,d\xi=\int\limits_0^1f(\xi)\,d\xi,\quad\text{where}\quad f(\xi) = g(x_0+(x_1-x_0)\xi)(x_1-x_0)$$
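
The normalisation can be expressed as a small helper that turns any integrand on $[x_0,x_1]$ into the corresponding $f(\xi)$ on $[0,1]$. This is a sketch under stated assumptions: the name `normalise`, the test integrand $g(x)=x^3$ on $[1,3]$, and the crude midpoint sum used as a check are all illustrative choices.

```python
import numpy as np

def normalise(g, x0, x1):
    """Return f(xi) = g(x0 + (x1 - x0) xi) (x1 - x0), defined on 0 <= xi <= 1."""
    J = x1 - x0   # the Jacobian of the change of variable
    return lambda xi_norm: g(x0 + J * xi_norm) * J

def g(x):
    return x**3   # hypothetical integrand; exact integral on [1, 3] is (81 - 1)/4 = 20

f = normalise(g, 1., 3.)

# crude check: midpoint sum of f over [0, 1] using many narrow subintervals
n = 10000
xi_mid = (np.arange(n) + 0.5) / n
I_check = np.sum(f(xi_mid)) / n
print(I_check)   # close to 20
```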

Gaussian quadrature evaluates the normalised integral by approximating it as a **weighted sum of function evaluations at particular locations**. *Both* the weights *and* the particular locations are the key to this method. Expressing the previous statement:

$$ \int\limits_0^1 f(\xi)\,d\xi = \sum\limits_{g=0}^{G-1} w_g\,f\left(\xi^{(g)}\right) +E_G= w_0\,f\left(\xi^{(0)}\right)+ w_1\,f\left(\xi^{(1)}\right)+\cdots+ w_{G-1}\,f\left(\xi^{(G-1)}\right)+E_G,$$

where the sample locations $\xi^{(g)}$ are called **Gauss points**, $w_g$ are called the **weights** and $E_G$ is the error associated with the approximation. To implement the method, we need to: (1) find the Gauss point locations, $\xi^{(g)}$, (2) evaluate the integrand at these locations, $f\left(\xi^{(g)}\right)$, and (3) find the weights, $w_g$, to sum up the evaluations.

It can be shown that choosing $G$ Gauss points allows a polynomial of order $2G-1$ to be integrated exactly, i.e., $E_G=0$ ($G$ Gauss points have $2G$ unknowns - $G$ locations and $G$ weights - and a polynomial of order $2G-1$ has $2G$ coefficients). 

Imagine that we will use a two-point scheme ($G=2$) to integrate exactly the cubic polynomial, $f(\xi) = a\xi^3+b\xi^2+c\xi+d$. Then we can write

$$\int\limits_0^1 f(\xi)\,d\xi = w_0\,f\left(\xi^{(0)}\right)+w_1\,f\left(\xi^{(1)}\right).$$

Substituting the cubic expression of $f$ and simplifying the integrals gives

$$ a\int\limits_0^1 \xi^3\,d\xi + b\int\limits_0^1 \xi^2\,d\xi + c\int\limits_0^1 \xi\,d\xi+ d\int\limits_0^1\,d\xi = w_0\left(a\left(\xi^{(0)}\right)^3+b\left(\xi^{(0)}\right)^2+c\left(\xi^{(0)}\right)+d\right)+w_1\left(a\left(\xi^{(1)}\right)^3+b\left(\xi^{(1)}\right)^2+c\left(\xi^{(1)}\right)+d\right),$$

which **must be true** for any combination of the coefficients $[a,b,c,d]$. Let's pick some values then that make our lives easier:

$$ \text{for}\quad a=b=c=0,\quad d=1 \quad \rightarrow \quad \int\limits_0^1\,d\xi= w_0+w_1,$$

which, when the integral is evaluated, yields $w_0+w_1=1$.

***Exercises***

- ***Show that, by setting each of the coefficients $a$, $b$, and $c$ in turn to $1$, and holding the others as $0$, we obtain the three additional equations below:***

$$ \frac{1}{2}=w_0\,\xi^{(0)}+w_1\,\xi^{(1)}$$

$$ \frac{1}{3}=w_0\,\left(\xi^{(0)}\right)^2+w_1\,\left(\xi^{(1)}\right)^2$$

$$ \frac{1}{4}=w_0\,\left(\xi^{(0)}\right)^3+w_1\,\left(\xi^{(1)}\right)^3$$

- ***Show by solving these equations for the four unknowns, $w_0$, $w_1$, $\xi^{(0)}$ and $\xi^{(1)}$, that***

$$ w_0=w_1=\frac{1}{2},\quad\text{and}\quad\xi^{(0)},\xi^{(1)}=\frac{1}{2}\mp\frac{1}{2\sqrt{3}}.$$

- ***Use Gaussian quadrature to evaluate the two integrals below and compare your result to the traditional method (integrate, apply limits). When do the results agree and when do they disagree?***

$$ I_0 = \int\limits_{0}^{1} 3x^2\, dx,\quad\quad I_1 = \int\limits_{0}^{1} 5x^4\, dx, $$
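
Once the weights and Gauss points have been derived by hand, the four equations from the exercises can be checked numerically. The sketch below does only that check; the array names `w` and `pts` are assumptions for illustration.

```python
import numpy as np

# two-point weights and Gauss points quoted in the exercise above
w = np.array([0.5, 0.5])
pts = np.array([0.5 - 1./(2.*np.sqrt(3.)), 0.5 + 1./(2.*np.sqrt(3.))])

# the k-th equation requires: integral of xi^k over [0, 1] = sum_g w_g (xi_g)^k
lhs = np.array([1./(k + 1) for k in range(4)])         # 1, 1/2, 1/3, 1/4
rhs = np.array([np.sum(w * pts**k) for k in range(4)])
print(lhs)
print(rhs)
```

The two printed rows should agree to within floating-point rounding.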

***Higher order schemes***

A fifth degree polynomial may be integrated exactly using 3 Gauss points with:

$$ w_0=w_2=\frac{5}{18},\quad w_1 = \frac{4}{9},\quad\text{and}\quad \xi^{(0)},\xi^{(2)}=\frac{1}{2}\mp\frac{1}{2}\sqrt{\frac{3}{5}}, \quad \xi^{(1)}=\frac{1}{2}.$$

Indeed, it is a **general feature** of these sorts of Gauss quadrature schemes that (i) the positions of the Gauss points are **symmetric** about $\frac{1}{2}$, and that (ii) the weights reflect this symmetry, and further (iii) the weights **sum to 1**.
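
These features can be inspected for any number of Gauss points using NumPy's `np.polynomial.legendre.leggauss`, which returns points and weights for the interval $[-1,1]$. The sketch below maps them onto $[0,1]$; the wrapper name `gauss_points_unit_interval` is an assumption for illustration.

```python
import numpy as np

def gauss_points_unit_interval(G):
    """Gauss points and weights for G-point quadrature on [0, 1]."""
    t, wt = np.polynomial.legendre.leggauss(G)   # defined on [-1, 1]
    # map t in [-1, 1] to xi in [0, 1]; the weights scale by the interval ratio 1/2
    return 0.5 * (t + 1.), 0.5 * wt

pts, w = gauss_points_unit_interval(3)
print(pts)                # symmetric about 1/2
print(w, np.sum(w))       # weights mirror the symmetry and sum to 1
```

For `G=3` the output can be compared against the values quoted above, and changing `G` shows the same symmetry for higher order schemes.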


#### 2.2.2 Demonstration

In [None]:
# here we demonstrate Gaussian quadrature for some already normalized integrals.
w0 = 0.5
w1 = 0.5
xi0 = 0.5-1./(2.*np.sqrt(3))
xi1 = 0.5+1./(2.*np.sqrt(3))

# functions for x**2
def f(x): return x**2        # a function
def I(x): return x**3/3.     # its exact integral

#def f(x): return x**3
#def I(x): return x**4/4.

#def f(x): return x**4
#def I(x): return x**5/5.

#def f(x): return np.sin(x)
#def I(x): return -np.cos(x)

#n = 2
#def f(x): return np.sin(n*x)
#def I(x): return -np.cos(n*x)/n

# integrate f(x) between 0 and 1 exactly
print('exact =',I(1) - I(0))

# Gaussian quadrature
print('approx=',w0*f(xi0)+w1*f(xi1))

# a plot of f(x)
fig,ax = plt.subplots(1,1)
x = np.linspace(0,1,1001)
ax.plot(x, f(x), 'b-');

***For the case of the 4th order polynomial, modify the quadrature above to implement a 3-point rule, and show that the approximation is exact.***

***Does a 3-point rule generate an exact approximation for `sin(nx)`? Comment.***

> <mark>*~ your answer here ~*</mark>

***- - - - CLASS CODING EXERCISE - - - -***

Imagine we wish to evaluate the integral

\begin{equation}
I = \int\limits_0^\pi \sin(x)\,dx
\end{equation}


In [None]:
# PART ONE
# --------
# WRITE down the normalised integral using the first equation in 2.2.1

# f(\xi) = 

# What is the ANALYTICAL solution?

In [None]:
# PART TWO
# --------
# IMPLEMENT the normalized function below
# **hint** use np.pi and np.sin()
def sin_norm(xi):
    return ????

# Use GAUSS quadrature to compute the integral.
Iapprox = ????
Itrue = ????
print(Iapprox)
print('error=',Iapprox-Itrue)

In [None]:
# OPTIONAL CHALLENGE
# ------------------
# Improve your estimate I_approx by
# 1. dividing the integral into two parts 
#    I = I1 + I2 = int_0^pi/2 ... dx + int_pi/2^pi ... dx
# 2. do Gaussian quadrature on each 

def sin_norm2(xi):
    return ????

I_approx = ????
I_true = ????
print('error=', I_approx-I_true)

# this is called a COMPOSITE quadrature scheme

#### 2.2.3 Algorithm

The key algorithm steps to implement Gaussian quadrature are:
```
1. Normalise the integrand.
2. Determine Gauss points and weights for appropriate scheme.
3. Evaluate normalised integrand at Gauss points.
4. Compute the weighted sum.
```

You will implement these steps in the lab.