# Content:
1. [Data modeling](#1.-Data-modeling)
2. [Polynomial fitting explained using a quick implementation](#2.-Polynomial-fitting-explained-using-a-quick-implementation)
3. [Polyfit and Polyval](#3.-Polyfit-and-Polyval)
4. [Numpy's polyfit, polyval, and poly1d](#4.-Numpy's-polyfit,-polyval,-and-poly1d)

## 1. Data modeling

![boardwork019.jpg](../boardwork/boardwork019.jpg)
![boardwork020.jpg](../boardwork/boardwork020.jpg)
![boardwork021.jpg](../boardwork/boardwork021.jpg)
![boardwork022.jpg](../boardwork/boardwork022.jpg)
![boardwork023.jpg](../boardwork/boardwork023.jpg)
![boardwork024.jpg](../boardwork/boardwork024.jpg)

#### _Note:_ Instead of inverting the matrix, we will directly solve the equation given on the left side using a linear solver.  
Since $\left[ {\bf X}^{\rm T}{\bf X}\right]$ is symmetric and, as long as the columns of ${\bf X}$ are linearly independent, positive definite, we can use Cholesky decomposition.
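To make the note concrete, here is a minimal sketch that solves the same small system in both ways. The data (`x_demo`, `y_demo`) are made up for illustration and are not from the boardwork; the explicit inverse is shown only for comparison.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Made-up data: four points lying exactly on the line y = 1 + 2x
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = 1.0 + 2.0 * x_demo

# Design matrix with columns 1 and x
X_demo = np.column_stack([np.ones_like(x_demo), x_demo])

A_demo = X_demo.T @ X_demo   # symmetric matrix X^T X
b_demo = X_demo.T @ y_demo   # right-hand side X^T y

# Route 1: explicit matrix inversion (for comparison only)
c_inv = np.linalg.inv(A_demo) @ b_demo

# Route 2: Cholesky factorization followed by triangular solves
c_chol = cho_solve(cho_factor(A_demo), b_demo)

print(c_inv)
print(c_chol)
```

Both routes should recover the intercept 1 and the slope 2. The second one never forms the inverse.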

## 2. Polynomial fitting explained using a quick implementation

Let's write a general program to fit a set of $N$ points to a polynomial of degree $D$.

### Vandermonde matrix

The first step is to calculate the Vandermonde matrix. Let's calculate it for a set of x-values. Suppose we want to fit 4 points to a straight line; then $N=4$ and $D=1$. Remember, however, that indices in Python start at 0, so the matrix needs $D+1$ columns, indexed from 0 to $D$.

In [1]:
import numpy as np
x = np.array([1, 2, 3, 5],float)

N = x.shape[0]

D=1

# Initialize the X-matrix
X = np.ones([N,D+1]) # Note that we are using D+1 here


# Add columns of x, x^2, ..., x^D to build the Vandermonde matrix
#X[:,1]=x[:]
#X[:,2]=x[:]**2
#X[:,3]=x[:]**3

for i in range(1,D+1): # Note that we are using D+1 here
    X[:,i]=x[:]**i

print(X)

[[1. 1.]
 [1. 2.]
 [1. 3.]
 [1. 5.]]


Although the Vandermonde matrix is easy to calculate, note that numpy already has a function for it. We can check that our result agrees with numpy's.

In [2]:
np.vander(x, D+1, increasing=True) # If increasing=True is not given, the columns come out in reverse order (highest power first)!

array([[1., 1.],
       [1., 2.],
       [1., 3.],
       [1., 5.]])
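As a quick check of the remark about column order, the sketch below builds the matrix both ways and compares them. The names `V_inc` and `V_dec` are introduced here only for illustration.

```python
import numpy as np

x_demo = np.array([1, 2, 3, 5], float)

V_inc = np.vander(x_demo, 3, increasing=True)  # columns: x^0, x^1, x^2
V_dec = np.vander(x_demo, 3)                   # default order: x^2, x^1, x^0

print(V_dec)

# The two matrices differ only by a left-right flip of the columns
print(np.array_equal(V_dec, V_inc[:, ::-1]))
```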

### Now let's solve a problem

Let's use the known form of a parabola, say, $y=-0.4x^2$. We can sample some points of $x$ and fit to the known values of $y$. After fitting the data, we can check if the polynomial coefficients come out as expected.

In [3]:
x=np.arange(-5, 6, 1, float) # start, stop, step, dtype

print("x-vector is:\n", x)

y=-0.4*x**2

print("y-vector is:\n", y)

x-vector is:
 [-5. -4. -3. -2. -1.  0.  1.  2.  3.  4.  5.]
y-vector is:
 [-10.   -6.4  -3.6  -1.6  -0.4  -0.   -0.4  -1.6  -3.6  -6.4 -10. ]


In [4]:
D=2  # for a parabola
X=np.vander(x, D+1, increasing=True)  # X is the Vandermonde matrix.

print(X)

[[ 1. -5. 25.]
 [ 1. -4. 16.]
 [ 1. -3.  9.]
 [ 1. -2.  4.]
 [ 1. -1.  1.]
 [ 1.  0.  0.]
 [ 1.  1.  1.]
 [ 1.  2.  4.]
 [ 1.  3.  9.]
 [ 1.  4. 16.]
 [ 1.  5. 25.]]


In [5]:
XT=np.transpose(X)
    
A=np.matmul(XT,X)

print(A)

[[  11.    0.  110.]
 [   0.  110.    0.]
 [ 110.    0. 1958.]]


Now, all we have to do is solve ${\bf A}{\bf c}={\bf b}$, where ${\bf b}={\bf X}^{\rm T}{\bf y}$.

In [6]:
import numpy as np
from scipy.linalg import cho_factor, cho_solve

b=np.matmul(XT,y)

# Factorize A once, then solve A c = b using the Cholesky factor
L, low = cho_factor(A)

c = cho_solve((L, low), b)

print('\nThe solution is\n')
print(c)


The solution is

[ 0.00000000e+00  6.45947942e-17 -4.00000000e-01]


We see that the coefficients of the $x^0$ and $x^1$ terms are zero (to within floating-point round-off). For the quadratic term ($x^2$), the coefficient is $-0.4$, in agreement with the parabola we started with. Now, suppose you want to find the value of the function at a new value of $x$: all you have to do is evaluate the polynomial.

In [7]:
xnew=0.5
ynew=c[0]*xnew**0 + c[1]*xnew**1 + c[2]*xnew**2
print("Value of y at x=", xnew, " is ", ynew)

Value of y at x= 0.5  is  -0.09999999999999996


The result is as expected: $y(0.5)=-0.4 \times 0.5^2=-0.1$.

## 3. Polyfit and Polyval

What we have done so far is to fit a set of points to a polynomial (polyfit) and evaluate the polynomial at new points (polyval). We can write general functions for these two steps.

In [8]:
def chol(A,b):
    '''
        Solves A c = b for a symmetric positive-definite matrix A using Cholesky decomposition
    '''
    from scipy.linalg import cho_factor, cho_solve

    L, low = cho_factor(A)

    c = cho_solve((L, low), b)
    return c


def polyfit(x,y,D):
    '''
        Fits a given set of data x,y to a polynomial of degree D
    '''
    import numpy as np 
        
    X=np.vander(x, D+1, increasing=True)
    XT=np.transpose(X)
    
    A=np.matmul(XT,X)
    
    b=np.matmul(XT,y)

    c=chol(A,b)
    
    return(c)

#=== Let's fit to a parabola 

x=np.arange(-5, 6, 1, float)

y=-0.4*x**2

D=2  # for a parabola
c=polyfit(x,y,D)

for i in range(D+1):
    print("coefficient of x^",i," is ",c[i])

coefficient of x^ 0  is  0.0
coefficient of x^ 1  is  6.459479416000912e-17
coefficient of x^ 2  is  -0.39999999999999997


Now, let's see what happens if we fit the same data to a higher-degree polynomial.

In [9]:
D=5
c=polyfit(x,y,D)

for i in range(D+1):
    print("coefficient of x^",i," is ",c[i])

coefficient of x^ 0  is  8.033876086462141e-15
coefficient of x^ 1  is  -9.08310463879971e-15
coefficient of x^ 2  is  -0.4000000000000023
coefficient of x^ 3  is  1.3349935850729272e-15
coefficient of x^ 4  is  8.83347612444574e-17
coefficient of x^ 5  is  -3.970233448744883e-17


Only the quadratic term survives; all other coefficients are zero to within round-off error. How nice!
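The tiny non-zero values are round-off noise. One way to get a feel for how sensitive the solve is to such noise is to look at condition numbers: in the 2-norm, the condition number of ${\bf X}^{\rm T}{\bf X}$ is the square of that of ${\bf X}$. The sketch below prints both for the degree-5 fit and, as a cross-check, solves the same least-squares problem with `np.linalg.lstsq`, which works on ${\bf X}$ directly. This is only an illustration under the same data as above, not a replacement for the approach used in this notebook.

```python
import numpy as np

x_demo = np.arange(-5, 6, 1, float)
y_demo = -0.4 * x_demo**2

X_demo = np.vander(x_demo, 6, increasing=True)  # degree-5 fit, so 6 columns
A_demo = X_demo.T @ X_demo

cond_X = np.linalg.cond(X_demo)
cond_A = np.linalg.cond(A_demo)
print("cond(X)     =", cond_X)
print("cond(X^T X) =", cond_A)  # roughly cond(X)**2

# Least-squares solve that does not form X^T X explicitly
c_lstsq, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)
print(c_lstsq)
```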

To evaluate the polynomial, i.e., the estimated values of y, one can write another function, called polyval.

In [10]:
def polyval(a,x):
    '''
        Determines the value of the polynomial using x and the coefficient vector a
    '''
    import numpy as np
    D=a.shape[0]
    N=x.shape

    y=np.zeros(N)
    for i in range(D):
        y=y+a[i]*x**i
    
    return(y)

xnew=np.array([-0.5,0.5])  # we will make the new x-values as an array
ynew=polyval(c,xnew)
print(ynew)

[-0.1 -0.1]
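The loop above computes each power of $x$ separately. A common alternative is Horner's scheme, which rewrites the polynomial as nested multiplications, $a_0 + x(a_1 + x(a_2 + \ldots))$. The function name `polyval_horner` is hypothetical and is used here only as a sketch; it assumes the same coefficient order as our `polyval` (lowest power first).

```python
import numpy as np

def polyval_horner(a, x):
    """Evaluate a[0] + a[1]*x + ... + a[D]*x**D using Horner's scheme."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    # Start from the highest-degree coefficient and work downwards
    for coeff in a[::-1]:
        y = y * x + coeff
    return y

a_demo = np.array([0.0, 0.0, -0.4])  # coefficients of -0.4 x^2, lowest power first
print(polyval_horner(a_demo, np.array([-0.5, 0.5])))
```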


## 4. Numpy's polyfit, polyval, and poly1d

Again, since we have learned the basics of polynomial fitting _from scratch_, we can use numpy's built-in routines for production runs. But before that, we need to test whether numpy's results agree with our own values!

In [11]:
x=np.arange(-5, 6, 1, float)

y=-0.4*x**2

D=4  # some polynomial degree

c=np.polyfit(x, y, D)

xnew=np.array([-0.5,0.5]) 
ynew=np.polyval(c,xnew)

print("Estimated value of y at new points of x is: \n",ynew)

Estimated value of y at new points of x is: 
 [-0.1 -0.1]
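One thing to keep in mind when comparing coefficients: `np.polyfit` returns them with the highest power first, which is the reverse of the order used in our own `polyfit`. The sketch below uses a made-up polynomial with three distinct coefficients so that the order is easy to see.

```python
import numpy as np

x_demo = np.arange(-5, 6, 1, float)
y_demo = 1.0 + 2.0 * x_demo - 0.4 * x_demo**2  # made-up test polynomial

c_np = np.polyfit(x_demo, y_demo, 2)  # coefficients of x^2, x^1, x^0
print(c_np)

# Reverse the array to get the order x^0, x^1, x^2
c_increasing = c_np[::-1]
print(c_increasing)
```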


There's also a cool function in numpy to print the polynomial as an expression.

In [12]:
p = np.poly1d(c)
print(p)

            4             3       2
-4.122e-18 x + 1.754e-17 x - 0.4 x + 6.324e-17 x + 1.071e-15
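Besides printing, a `poly1d` object can be called like a function and differentiated. Here is a small sketch using the exact coefficients of $y=-0.4x^2$ rather than the fitted ones; the names `p_demo` and `dp_demo` are introduced only for this example.

```python
import numpy as np

# Coefficients are given with the highest power first: -0.4 x^2 + 0 x + 0
p_demo = np.poly1d([-0.4, 0.0, 0.0])

print(p_demo(0.5))   # evaluate the polynomial at x = 0.5
print(p_demo.order)  # degree of the polynomial

dp_demo = p_demo.deriv()  # first derivative, -0.8 x
print(dp_demo(1.0))
```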


In the next class, we will learn how to quantify the accuracy of the fit.