In [None]:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import scipy

# Problem 1 (5 points) 

Write a function `construct_Lagrange_piecewise_polynomial(x, y, order)` which returns a function that
***interpolates*** the points $\{(x_{(i)},y_{(i)}): i= 0,\cdots,n\}$, where $n = mk$ and $k$ is `order`, with an ***order-$k$ Lagrange piecewise polynomial***, i.e., the continuous piecewise concatenation of $m$ ***Lagrange polynomials***

\begin{align*}
   h(w) = {} & \left[\overset{\text{Piecewise}}{\underset{\text{summation}}{ \sum_{g=0}^{m-1}}} \overset{ \textbf{$k^{th}$ order, } w \in \left[x_{(gk)}, x_{(gk+k)} \right) }{\underset{\textbf{Lagrange polynomial}}{ \sum_{j=0}^{k} y_{(gk+j)} l_{gj}(w)}}\right] + \underset{\text{so } h(x_{(n)}) = y_{(n)}}{y_{(n)} \delta_{x_{(n)}}(w)} && y_{(j)} \text{ corresponds to } x_{(j)}\\
   l_{gj}(w) = {} & \underset{i \not = gk+j}{\prod_{i = gk}^{(g+1)k}} \frac{w-x_{(i)}}{x_{(gk+j)}-x_{(i)}}  \underset{ 1_A(a)=1 \text{ if } a\in A;\; 0 \text{ otherwise}}{\times\; 1_{\left[x_{(gk)}, x_{(gk+k)} \right)}(w)} && x_{(i)} < x_{(j)} \text{ for } i<j
\end{align*}

Note that each $l_{gj}(w)$ is the $j^{th}$ of $k+1$ ***Lagrange polynomial basis functions*** defined over the range of the $g^{th}$ of $m$ subsets of the data, which overlap only at their shared endpoints:

$$\begin{array}{c|ccc|ll}
g & +0 & \cdots & +k & \text{basis functions} & \text{domain} \\\hline
0 & x_{(0)} & \cdots & x_{(k)} & l_{00},\cdots, l_{0k} & \left[x_{(0)}, x_{(k)}\right)\\
1 & x_{(k)} & \cdots & x_{(2k)} & l_{10},\cdots, l_{1k}& \left[x_{(k)}, x_{(2k)}\right)\\
\vdots\\
g & x_{(gk)} & \cdots & x_{(gk+k)}& l_{g0},\cdots, l_{gk}& \left[x_{(gk)}, x_{(gk+k)}\right)\\
\vdots &\\
m-2 & x_{(n-2k)} & \cdots & x_{(n-k)} & l_{(m-2)0},\cdots, l_{(m-2)k}& \left[x_{(n-2k)}, x_{(n-k)}\right)\\
m-1 & x_{(n-k)} & \cdots & x_{(n)} & l_{(m-1)0},\cdots, l_{(m-1)k}& \left[x_{(n-k)}, x_{(n)}\right]\\
\end{array}$$
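
The grouping in the table can be sketched in code. This is only an illustration of the index bookkeeping, assuming `n_points = m*order + 1` nodes; the helper name `piece_indices` is hypothetical and is not part of the required functions.

```python
import numpy as np

def piece_indices(n_points, order):
    """Hypothetical helper: node indices used by each of the m pieces."""
    m = (n_points - 1) // order  # number of pieces when n_points = m*order + 1
    return [np.arange(g * order, g * order + order + 1) for g in range(m)]

# 7 nodes x_(0),...,x_(6) with order k=2 give m=3 pieces sharing endpoints
pieces = piece_indices(n_points=7, order=2)
for g, idx in enumerate(pieces):
    print(g, idx)
# 0 [0 1 2]
# 1 [2 3 4]
# 2 [4 5 6]
```

Adjacent pieces share exactly one node, which is why the pieces join continuously.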

Mapping a function through points, as is done here by the ***Lagrange piecewise polynomial***, is called ***interpolation***, and this is distinct from ***approximation***, in which a reduced representation of a function is used in place of the function. Both of these are again distinct from ***estimation***, in which the parameters within a family of functional forms are chosen so the resulting function resembles observed data points. And finally, these are all again distinct from ***smoothing***, in which the family of functional forms is chosen to be simple and parsimonious and yet still capable of representing the important characteristics of the data, e.g., $E[y|x]$ or $y=\beta_0+\beta_1x$.
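
As a rough illustration of the difference between ***interpolation*** and ***estimation***, the sketch below fits two polynomials to the same five points with `np.polyfit`. The data values are made up for the example, and `np.polyfit` is used here only as a convenient stand-in; it is not the Lagrange construction asked for in this problem.

```python
import numpy as np

# Made-up example data (assumption: any 5 distinct, sorted x values would do)
x = np.array([-1.5, -0.4, 0.1, 0.9, 2.0])
y = np.array([0.3, -1.2, 0.8, 0.1, -0.5])

# Degree len(x)-1: the unique polynomial through all points (interpolation)
interp_coefs = np.polyfit(x, y, deg=len(x) - 1)
# Degree 1: least-squares line y = b0 + b1*x (estimation within a simple family)
line_coefs = np.polyfit(x, y, deg=1)

interp_resid = y - np.polyval(interp_coefs, x)  # essentially zero at every node
line_resid = y - np.polyval(line_coefs, x)      # generally nonzero

print(np.abs(interp_resid).max(), np.abs(line_resid).max())
```

The interpolating polynomial reproduces every observation, while the fitted line only resembles the data.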

*This problem and concluding comments are inspired by **Lagrange polynomials** in the **Models for Interpolation** and **Models for Smoothing Data** sections of Chapter 4.1 **Function Approximation and Smoothing** on pages 154-156 and 157 and the paragraphs in the **introduction** and **Estimation** sections of Chapter **Approximation of Functions** on pages 147 and 162 of James E. Gentle's **Computational Statistics** textbook. [Errata Warning: on page 156, cubic Lagrange polynomials join four adjacent points, not three; and, piecewise Lagrangian polynomials are not necessarily smooth at knots.]*
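
The second erratum can be checked numerically. The sketch below is one possible illustration on made-up data, using `scipy.interpolate.lagrange` (which returns an `np.poly1d`) to build two quadratic pieces that meet at `x[2]`; it is a stand-in for the functions written in this problem.

```python
import numpy as np
from scipy.interpolate import lagrange

# Made-up zig-zag data: two order-2 pieces meeting at the knot x[2]
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 1.0, 0.0, 1.0, 0.0])

left = lagrange(x[:3], y[:3])   # quadratic through the first three points
right = lagrange(x[2:], y[2:])  # quadratic through the last three points
knot = x[2]

values = (left(knot), right(knot))                   # both equal y[2]: continuous
slopes = (left.deriv()(knot), right.deriv()(knot))   # about -2 and +2: a kink
print(values, slopes)
```

The two pieces agree in value at the knot but not in slope, so the piecewise curve is continuous without being differentiable there.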

## Problem 1 Questions 0-1 (2 points)

An ***order-k Lagrange polynomial basis function*** is $\displaystyle l_j(w) = \prod_{i=0, i \not = j}^k \frac{w-x_{(i)}}{x_{(j)}-x_{(i)}}$.

An ***order-k Lagrange polynomial function*** is $h(w) = \displaystyle \sum_{j=0}^k y_{(j)} l_j(w)$.
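
Two properties follow from these definitions and are useful when debugging: $l_j(x_{(i)})$ is $1$ when $i=j$ and $0$ otherwise, and the basis functions sum to $1$ for every $w$. The sketch below checks both on made-up nodes using `scipy.interpolate.lagrange` as an independent reference (an assumption for illustration; the graded functions must still be written by hand).

```python
import numpy as np
from scipy.interpolate import lagrange

x = np.array([-1.5, -0.4, 0.1, 0.9, 2.0])  # made-up, distinct, sorted nodes
k = len(x) - 1

# Interpolating the j-th unit vector yields the j-th basis function l_j
basis = [lagrange(x, np.eye(k + 1)[j]) for j in range(k + 1)]

# Row j holds l_j evaluated at every node: approximately the identity matrix
values_at_nodes = np.array([b(x) for b in basis])

# The basis functions sum to 1 everywhere (partition of unity)
grid = np.linspace(x[0], x[-1], 50)
basis_sum = sum(b(grid) for b in basis)
print(np.round(values_at_nodes, 6))
```

A hand-written `construct_jth_Lagrange_basis_function(j, x)` could be compared against `basis[j]` on a grid in the same way.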

Before attempting to create the `Lagrange_piecewise_polynomial` function, first define the `construct_jth_Lagrange_basis_function` and `construct_Lagrange_polynomial` functions begun below and confirm the correctness of your functions by verifying graphically that the ***Lagrange polynomial*** passes through the points in `x` and `y` with

```python
x,y = np.sort(stats.norm.rvs(size=5)), stats.norm.rvs(size=5)
plt.plot(x,y,'k.')
grid = np.linspace(x[0],x[-1], 100)
for j in range(len(x)):
    plt.plot(grid, construct_jth_Lagrange_basis_function(j, x)(grid),'k--')
plt.plot(grid, construct_Lagrange_polynomial(x,y)(grid))
#check the above first, before expanding it to the piecewise version below
#plt.plot(grid, construct_Lagrange_piecewise_polynomial(x, y, order=2)(grid))
#plt.plot(grid, construct_Lagrange_piecewise_polynomial(x, y, order=1)(grid))
```

Your `construct_jth_Lagrange_basis_function` and `construct_Lagrange_polynomial` functions will be tested for correctness.

***Hint:*** Adding `@np.vectorize` on the line above `def jth_Lagrange_basis_function(w)` and `def Lagrange_polynomial(w)` means each function is written for scalar (`float`) `w` but can be called with a vector (`np.array`) `w`.
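
A minimal sketch of what the decorator does, using a hypothetical scalar function `clipped_square` that is unrelated to the assignment:

```python
import numpy as np

@np.vectorize
def clipped_square(w):  # hypothetical example, written for a scalar w
    if w < 0:           # scalar-only logic: a plain `if` fails on arrays
        return 0.0
    return float(w) ** 2

# The decorated function loops over the array elements for us
out = clipped_square(np.array([-1.0, 0.5, 2.0]))
print(out)  # [0.   0.25 4.  ]
```

By default `np.vectorize` infers the output type from the result of the first element, so it is safest to return a `float` from every branch.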

In [None]:
def construct_jth_Lagrange_basis_function(j, x):
    # order will be len(x)-1
    @np.vectorize  # makes the function below work for np.array w
    def jth_Lagrange_basis_function(w): # defined for scalar w
        pass
    return jth_Lagrange_basis_function

def construct_Lagrange_polynomial(x,y):
    # the sum over j of y[j] times the jth Lagrange basis function evaluated at w
    @np.vectorize  # makes the function below work for np.array w
    def Lagrange_polynomial(w): # defined for scalar w
        pass
    return Lagrange_polynomial

In [None]:
# Cell for scratch work

# You are welcome to add as many new cells into this notebook as you would like.
# Just don't have scratch work cells with runtime errors because
# notebook cells are run sequentially for automated code testing.

# Any cells included for scratch work that are no longer needed may be deleted so long as
# - all the required functions are still defined and available when called
# - no cells requiring variable assignments are deleted
#    - as this causes their `cell ids` to be lost, but these `cell-ids` are required for automated code testing.


In [None]:
# Cell for scratch work


In [None]:
# 1 point [format: callable function f with signature f(j,x) returning
#                  the jth Lagrange basis function of order len(x)-1]
p1q0 = construct_jth_Lagrange_basis_function # equivalent to
# p1q0 = lambda j, x: construct_jth_Lagrange_basis_function(j, x)

# As long as your `construct_jth_Lagrange_basis_function` is
# correct you do not need to change anything in this cell

In [None]:
# 1 point [format: callable function f with signature f(x,y) returning
#                  a Lagrange polynomial of order len(x)-1 passing through x and y]

p1q1 = construct_Lagrange_polynomial # equivalent to
# p1q1 = lambda x,y: construct_Lagrange_polynomial(x,y)

# As long as your `construct_Lagrange_polynomial` is
# correct you do not need to change anything in this cell

## Problem 1 Questions 2-3 (2 points)

Complete the `construct_Lagrange_piecewise_polynomial` function of the problem prompt by correctly piecing together ***Lagrange polynomials*** created from the `construct_Lagrange_polynomial` function.  

The `Lagrange_piecewise_polynomial` will be tested for correctness.
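
One way to sanity-check a finished implementation is against an independent reference. The sketch below is such a reference under stated assumptions: it uses `scipy.interpolate.lagrange` for each piece and `np.searchsorted` to pick the piece, the name `reference_piecewise` is hypothetical, the data are made up, and only 1-D inputs are handled. It also relies on the fact that an order-1 piecewise Lagrange polynomial is ordinary linear interpolation, which `np.interp` computes.

```python
import numpy as np
from scipy.interpolate import lagrange

def reference_piecewise(x, y, order):
    """Hypothetical reference: one scipy Lagrange polynomial per block of order+1 nodes."""
    m = (len(x) - 1) // order
    pieces = [lagrange(x[g*order:g*order+order+1], y[g*order:g*order+order+1])
              for g in range(m)]
    knots = x[::order]  # piece boundaries x[0], x[order], ..., x[-1]

    def h(w):
        w = np.atleast_1d(np.asarray(w, dtype=float))
        # piece g covers [knots[g], knots[g+1]); the clip puts w == x[-1] in the last piece
        g = np.clip(np.searchsorted(knots, w, side="right") - 1, 0, m - 1)
        return np.array([pieces[gi](wi) for gi, wi in zip(g, w)])
    return h

x = np.array([0.0, 1.0, 2.5, 3.0, 4.5])   # made-up sorted nodes
y = np.array([1.0, -0.5, 2.0, 0.0, 1.5])
grid = np.linspace(x[0], x[-1], 101)

linear = reference_piecewise(x, y, order=1)(grid)
quadratic_at_nodes = reference_piecewise(x, y, order=2)(x)
print(np.allclose(linear, np.interp(grid, x, y)), np.allclose(quadratic_at_nodes, y))
```

Comparing your own function to `np.interp` for `order=1` and checking that it reproduces `y` at `x` for `order=2` catches most indexing mistakes at the piece boundaries.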

In [None]:
def construct_Lagrange_piecewise_polynomial(x, y, order):

    '''
    `x`/`y` : are numpy arrays of the same length
    `order` : each piecewise interpolation will use `order+1` data points

              Piecewise functions are end-to-end, so for `order=2` and `len(x)=5`
              two piecewise Lagrange polynomials of order 2 will be made from
              `len(x[:3])=3` and `len(x[2:])=3` data points and connect at `x[2]`
    '''

    if len(x) != len(y):
        return "Error: len(x) is not len(y)."
    if (len(x) - 1) % order != 0:  # need len(x) = m*order + 1; always true for order=1
        return "Error: order and len(x) are not compatible."

    @np.vectorize
    def Lagrange_piecewise_polynomial(w):
        pass

    return Lagrange_piecewise_polynomial # which may be evaluated over, e.g., `np.linspace(x[0],x[-1],n)`

In [None]:
# 1 point [format: callable function f with signature f(x,y), i.e.,
#                  a piecewise Lagrange polynomial of order 2 passing through x and y]
p1q2 = lambda x,y: construct_Lagrange_piecewise_polynomial(x,y, order=2)

# As long as your `construct_Lagrange_piecewise_polynomial` is
# correct you do not need to change anything in this cell

In [None]:
# 1 point [format: callable function f with signature f(x,y), i.e.,
#                  a piecewise Lagrange polynomial of order 1 passing through x and y]
p1q3 = lambda x,y: construct_Lagrange_piecewise_polynomial(x,y, order=1)

# As long as your `construct_Lagrange_piecewise_polynomial` is
# correct you do not need to change anything in this cell

In [None]:
x,y = np.sort(stats.norm.rvs(size=5)), stats.norm.rvs(size=5)
plt.plot(x,y,'k.')
grid = np.linspace(x[0],x[-1], 100)
for j in range(len(x)):
    plt.plot(grid, p1q0(j, x)(grid),'k--')
plt.plot(grid, p1q1(x,y)(grid))
plt.plot(grid, p1q2(x,y)(grid))
plt.plot(grid, p1q3(x,y)(grid))


## Problem 1 Questions 4-7 (1 point)


4. (0.25 points) What is true about ***high order Lagrange piecewise polynomials***?

    1. They generally have discontinuities where the pieces connect
    2. They are continuous and differentiable everywhere
    3. They will not always alternate between convex and concave pieces
    4. They are good for trend fitting and data smoothing
    

5. (0.25 points) Suppose some ***data smoothing / prediction*** model is ***estimated*** and produces $\hat y \approx E[y|x]$ which is a $k^{th}$ degree $(k+1<n=m\times k)$ polynomial in $x$. Which of the following are true?

    1. The ***Lagrange polynomial*** on $(x, \hat y)$ will be the same as the $\hat y$ curve
    2. The ***Lagrange polynomial*** on $(x, y)$ ***interpolates*** the same values as $\hat y$
    3. The $\hat y$ curve from $x_{(0)}$ to $x_{(n)}$ can be defined as an ***order-k piecewise Lagrange polynomial***; that is, the polynomials defining $\hat y$ and the ***order-k piecewise Lagrange polynomial*** are unique and pass through the same points so they're identical
    4. None of the above
    

6. (0.25 points) Which of the following describes ***approximating*** a $k^{th}$ degree ***Lagrange polynomial*** defined over $k+1$ data points by setting $y_{(j)}$ to $0$ for some of the Lagrange basis functions?

    1. A ***lower order piecewise Lagrange polynomial*** resulting in different Lagrange basis functions
    2. Removing some of the Lagrange basis functions producing a $k'<k$ order polynomial defined over $k+1$ data points
    3. Using a ***smoothing matrix*** to produce $\hat y \approx E[y|x]$
    4. None of the above
    

    
7. (0.25 points) Which of the following is correct?

    1. ***Approximation*** is when a reduced representation of a function is used in place of the function
    2. ***Estimation*** is when a family of functional forms is chosen to model $E[y|x]$
    3. ***Data smoothing / Prediction*** is when the parameters within a family of functional forms are chosen so the resulting function resembles observed data
    4. None of the above


In [None]:
# 0.25 points each [format: `str` either "A" or "B" or "C" or "D",
#                   corresponding to choices 1, 2, 3, 4 above, respectively]
p1q4 = #<"A"|"B"|"C"|"D">
p1q5 = #<"A"|"B"|"C"|"D">
p1q6 = #<"A"|"B"|"C"|"D">
p1q7 = #<"A"|"B"|"C"|"D">
# Uncomment the above and keep exactly one of "A", "B", "C", or "D" for each

# This cell will produce a runtime error until the `p1q4`-`p1q7` variables are assigned values