# Functions for Least Squares

These are functions to determine the parameters ($\hat{x}$) for the matrix equation $y^{model} = A\hat{x}$.

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
%matplotlib notebook 

## Forming matrix for data and y vector

The x data are arranged into a data matrix $A$, one for each type of model: linear, quadratic, cubic, or power law. The first three are shown below; the power law has its own section further down.

$$A_{linear} = \begin{bmatrix}
1 & x_{1} \\
1 & x_{2} \\
\vdots & \vdots \\
\end{bmatrix}, \quad
A_{quadratic} = \begin{bmatrix}
1 & x_{1} & x_{1}^{2}\\
1 & x_{2} & x_{2}^{2} \\
\vdots & \vdots & \vdots\\
\end{bmatrix},\quad
A_{cubic} = \begin{bmatrix}
1 & x_{1} & x_{1}^{2} & x_{1}^{3}\\
1 & x_{2} & x_{2}^{2} & x_{2}^{3} \\
\vdots & \vdots & \vdots & \vdots\\
\end{bmatrix}
$$

The y vector is just the y points:
$\bar{y} = \begin{bmatrix}
y_{1} \\
y_{2} \\
\vdots \\
\end{bmatrix}
$

In [2]:
def lin_matrix(self):
    # self.data must be a column vector of shape (n, 1) for np.hstack to work here
    one_arr = np.ones([self.data.size,1]) # column of ones for the constant term
    lin = np.hstack([one_arr, self.data]) # columns: 1, x
    return lin

def quad_matrix(self):
    lin = self.lin_matrix()
    quad = np.hstack([lin,self.data**2]) # appending the x**2 column to the linear matrix
    return quad

def cub_matrix(self):
    quad = self.quad_matrix()
    cub = np.hstack([quad,self.data**3]) # appending the x**3 column to the quadratic matrix
    return cub
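
As a cross-check on the matrices above, here is a minimal standalone sketch that builds the same columns with `np.vander`. The helper name `design_matrix` and the sample x values are hypothetical and not part of the class above.

```python
import numpy as np

def design_matrix(x, degree):
    """Return a matrix with columns 1, x, x**2, ..., x**degree (hypothetical helper)."""
    x = np.asarray(x, dtype=float).ravel()
    # increasing=True orders the columns from the constant term upward
    return np.vander(x, N=degree + 1, increasing=True)

x = np.array([1.0, 2.0, 3.0])        # illustrative x points
A_linear = design_matrix(x, 1)       # shape (3, 2)
A_quadratic = design_matrix(x, 2)    # shape (3, 3)
A_cubic = design_matrix(x, 3)        # shape (3, 4)
```

Each higher-order matrix is the previous one with an extra column appended, which is the same nesting that `quad_matrix` and `cub_matrix` rely on.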

## Power Law

This section fits a power law of the form $y = \beta x^\alpha$. The model must first be linearized because, according to Adrian's notes, the "linear part of the term 'linear fit' just means linear in the parameters". One way to do that is to take the logarithm of both sides:

$\log(y) = \log(\beta x^\alpha) = \log(\beta) + \log(x^\alpha) = \log(\beta) + \alpha \log(x)$

Therefore, the linearization of $y = \beta x^\alpha$ is $\log(y) = \log(\beta) + \alpha \log(x)$. Let $y^{'}=\log(y)$ and $x^{'}=\log(x)$ so that $y^{'} = \log(\beta) + \alpha x^{'}$, which is linear in the parameters $\log(\beta)$ and $\alpha$.

With this, we can proceed as in the linear case, except that the data matrix and y vector are:

$$A_{power} = \begin{bmatrix}
1 & \log(x_{1}) \\
1 & \log(x_{2}) \\
\vdots & \vdots \\
\end{bmatrix}, \quad
\bar{y} = \begin{bmatrix}
\log(y_{1}) \\
\log(y_{2}) \\
\vdots \\
\end{bmatrix}
$$

In [3]:
def pow_matrix(self):
    one_arr = np.ones([self.data.size,1])
    pow_m = np.hstack([one_arr,np.log(self.data)]) #appending array of ones to array of log xvalues
    return pow_m
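
To illustrate the linearization, the sketch below fits noise-free synthetic data in log space and recovers $\alpha$ and $\beta$. The values of `alpha_true` and `beta_true` are made up for the example, and `np.linalg.lstsq` stands in for the solver defined later in this notebook. The approach needs strictly positive x and y values, since the logarithm is taken of both.

```python
import numpy as np

# Synthetic, noise-free data from y = beta * x**alpha (illustrative values)
alpha_true, beta_true = 1.5, 2.0
x = np.linspace(1.0, 10.0, 20)
y = beta_true * x**alpha_true

# Data matrix and y vector in log space
A_power = np.column_stack([np.ones_like(x), np.log(x)])
y_log = np.log(y)

# Ordinary least squares; the first returned item holds the parameters
params, *_ = np.linalg.lstsq(A_power, y_log, rcond=None)
log_beta, alpha = params

# The fitted intercept is log(beta), so exponentiate to get beta back
beta = np.exp(log_beta)
```

Note that the first fitted parameter is $\log(\beta)$, not $\beta$ itself, so it has to be exponentiated before it is compared with the original model.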

## Noise covariance matrix and y model

The parameters $\hat{x}$ for $y^{model} = A\hat{x}$ are given by $\hat{x} = [A^TN^{-1}A]^{-1}A^TN^{-1}\bar{y}$. From the earlier functions, I already have $A$ and $\bar{y}$. The noise covariance matrix $N$ is defined as follows:
$$ N = \begin{pmatrix}
\sigma_{1}^{2} & 0 & 0 &\ldots{} \\
0 & \sigma_{2}^{2} & 0 & \ldots{} \\
0 & 0 & \sigma_{3}^{2} & \ldots{} \\
\vdots & \vdots & \vdots & \ddots \\
\end{pmatrix} $$
where $\sigma_{i}^{2}$ is the variance (the standard deviation squared) of the $i$-th data point. If no noise covariance is given, it defaults to the identity matrix.
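
A minimal sketch of building such an $N$ from per-point standard deviations. The `sigma` values are hypothetical, and the diagonal form assumes the errors on different points are independent.

```python
import numpy as np

# Hypothetical standard deviations, one for each y value
sigma = np.array([0.5, 1.0, 2.0])

# Diagonal noise covariance: variances on the diagonal, zeros elsewhere
N = np.diag(sigma**2)

# For a diagonal matrix the inverse is the reciprocal of each diagonal entry
N_inv = np.diag(1.0 / sigma**2)
```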

Since I have determined the values for $A, N, \bar{y}$, finding $\hat{x}$ is just a matter of multiplying everything for $\hat{x} = [A^TN^{-1}A]^{-1}A^TN^{-1}\bar{y}$. I broke down the steps of the process:
<ol>
<li>$A^TN^{-1}$</li>
<li>$A^TN^{-1}\bar{y}$</li>
<li>$[A^TN^{-1}A]^{-1}$</li>
<li>$\hat{x} = [A^TN^{-1}A]^{-1}A^TN^{-1}\bar{y}$</li>
<li>$y^{model} = A\hat{x}$</li>
</ol>	

The function below finds the y values for the fit ($y^{model}$), parameters ($\hat{x}$), and the error covariance which is defined as $V= [A^{T}N^{-1}A]^{-1}$. It follows the steps I outlined above. 

In [4]:
#parameters data_matrix, y points, and noise covariance
#returns y_model, x_bar (parameters for fit) and error covariance
def ymodel(self,data_matrix,yval,noise_cov=None):
    if noise_cov is None: # "== None" compares element-wise when noise_cov is an array
        noise_cov = np.identity(data_matrix.shape[0])

    #calculating parameters
    dot_matrix = np.dot(data_matrix.T,np.linalg.inv(noise_cov)) #Step 1
    doty_matrix = np.dot(dot_matrix,yval) #Step 2
    inv_matrix = np.linalg.inv(np.dot(dot_matrix,data_matrix)) #Step 3 (error covariance)
    x_bar = np.dot(inv_matrix, doty_matrix) #Step 4
    
    #finding y model
    predict_y = np.dot(data_matrix, x_bar) #Step 5
    return predict_y, x_bar, inv_matrix
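
As a sanity check on the five steps, the standalone sketch below evaluates the formula directly and compares it with an equivalent "whitened" fit, in which each row of $A$ and $\bar{y}$ is divided by its $\sigma_i$ and the result is solved with `np.linalg.lstsq`. The data, the seed, and the `sigma` values are invented for illustration, and the equivalence relies on $N$ being diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed so the sketch is reproducible
x = np.linspace(0.0, 5.0, 12)
sigma = np.full(x.size, 0.3)
sigma[::3] = 1.0                          # a few noisier points (illustrative)
y = 1.0 + 2.0 * x + rng.normal(0.0, sigma)

A = np.column_stack([np.ones_like(x), x])
N = np.diag(sigma**2)

# Direct evaluation of x_hat = [A^T N^-1 A]^-1 A^T N^-1 y
AtNinv = A.T @ np.linalg.inv(N)           # Step 1
V = np.linalg.inv(AtNinv @ A)             # Step 3 (error covariance)
x_hat = V @ (AtNinv @ y)                  # Steps 2 and 4
y_model = A @ x_hat                       # Step 5

# Equivalent whitened problem: divide each row of A and y by sigma_i
x_hat_whitened, *_ = np.linalg.lstsq(A / sigma[:, None], y / sigma, rcond=None)
```

The two parameter vectors should agree to floating-point precision. Points with a larger $\sigma_i$ are down-weighted in both formulations.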

The function below extrapolates the y model past the given x points. It takes a data matrix that has been extended with additional points to lengthen the curve. It then slices that matrix down to the rows holding the actual data (everything from the given index onward) to calculate the parameters ($\hat{x}$). With these parameters it calculates the y values for the fit $(y^{model})$ using the full extended data matrix, including the additional points.

In [1]:
#args extended data_matrix (with extra x points added), y points, index where actual data begins in data_matrix_extend
#returns y_model, x_bar (parameters for fit) and error covariance                   
def ymodel_extend(self,data_matrix_extend,yval,index_begin_x):
    data_matrix = data_matrix_extend[index_begin_x:] #slicing array to section with only x values from data
    predict_y,x_bar,inv_matrix = self.ymodel(data_matrix,yval)
    predict_y = np.dot(data_matrix_extend, x_bar)
    return predict_y,x_bar,inv_matrix
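
A standalone sketch of the same idea, with made-up data and `np.linalg.lstsq` standing in for `ymodel`. Because the slice `[index_begin_x:]` keeps everything from the index to the end, the extra points are placed before the real data here. Extra points appended after the data would be kept by that slice and then no longer line up with the y values.

```python
import numpy as np

# Illustrative data lying exactly on y = 3 - 0.5 x
x_data = np.array([1.0, 2.0, 3.0, 4.0])
y_data = 3.0 - 0.5 * x_data

# Extra x points placed in front of the real data
x_extra = np.array([-1.0, 0.0])
x_extended = np.concatenate([x_extra, x_data])
A_extended = np.column_stack([np.ones_like(x_extended), x_extended])
index_begin_x = x_extra.size              # row where the real data starts

# Fit on the real rows only, then evaluate on every row
A_data = A_extended[index_begin_x:]
x_hat, *_ = np.linalg.lstsq(A_data, y_data, rcond=None)
y_extended = A_extended @ x_hat
```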

## Error Bar

This section finds the uncertainty on the fitted parameters, which indicates how far they may be from the true parameters. To determine it, use the error covariance, defined as $V= [A^{T}N^{-1}A]^{-1}$. The square root of each diagonal element of $V$ gives the error bar on the corresponding parameter. The off-diagonal elements tell us how the errors on different parameters are correlated.

In [1]:
def error_bar(self,err_cov):
    err = np.sqrt(np.diag(err_cov)) #taking the square root of the diagonal of V
    return err
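
To make the role of the off-diagonal elements concrete, the sketch below takes a made-up $2 \times 2$ error covariance and converts it to a correlation matrix by dividing each element $V_{ij}$ by the product of the two error bars. The numbers in `V` are purely illustrative, not results from any fit.

```python
import numpy as np

# Hypothetical error covariance for two parameters
V = np.array([[0.04, -0.01],
              [-0.01, 0.01]])

# Error bar on each parameter: square root of the diagonal
err = np.sqrt(np.diag(V))

# Correlation matrix: V_ij / (err_i * err_j), with ones on the diagonal
corr = V / np.outer(err, err)
```

Here the error bars are 0.2 and 0.1, and the off-diagonal correlation is $-0.01 / (0.2 \times 0.1) = -0.5$, meaning an overestimate of one parameter tends to go with an underestimate of the other.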