# Linear Least Squares Fitting

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
%matplotlib notebook 

# Declaring variables

Simulated synchrotron spectrum data.

In [2]:
# Raw string so that "\s+" is passed to pandas as a regex (any run of whitespace)
synch_data = pd.read_csv('Synch_spectrum.txt', sep=r"\s+",
                         names=['Frequency (Hz)', 'Intensity (erg cm-2 s-1 sr-1 Hz-1)'],
                         skiprows=[0, 1])
xpoints = synch_data['Frequency (Hz)']
ypoints = synch_data['Intensity (erg cm-2 s-1 sr-1 Hz-1)']

In [3]:
#points for cubic fit
# xpoints = np.array([-3,-2,-1,0,1,1.5,2,2.5, 3]) 
# ypoints = np.array([-24,-5,2,-5,4,7,15,20,30])

xmatrix= [] #linear
xmatrix_quad = [] #quadratic
xmatrix_cub = [] #cubic
xmatrix_pow = [] #power function
yval_log = [] #y vector for power
sigma_sq = []

avg_x = xpoints.mean() # mean of the x values
one_arr = np.ones((xpoints.size, 1)) # column of ones which is concatenated to matrices
arr = xpoints.to_numpy().reshape(-1, 1) # x values as a column vector

# Determining x matrices and y vector

Here, I take the x and y points and arrange them into their respective matrices. Note that `xmatrix` is equivalent to the $A$ matrix in Adrian's notes. For a linear or quadratic fit, the x matrix is:

$$X_{linear} = \begin{bmatrix}
1 & x_{1} \\
1 & x_{2} \\
\vdots & \vdots \\
\end{bmatrix}, \quad
X_{quadratic} = \begin{bmatrix}
1 & x_{1} & x_{1}^{2}\\
1 & x_{2} & x_{2}^{2} \\
\vdots & \vdots & \vdots\\
\end{bmatrix}
$$

For a cubic fit, add one more column for $x^3$. The y vector is just the y points:
$\bar{y} = \begin{bmatrix}
y_{1} \\
y_{2} \\
\vdots \\
\end{bmatrix}
$
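
As a minimal sketch of how these matrices can be built for any polynomial degree, the snippet below uses `np.vander`. The helper name `design_matrix` and the sample points are hypothetical and chosen only for illustration; they are not part of the notebook's data.

```python
import numpy as np

# Hypothetical sample points, for illustration only (not the synchrotron data).
x = np.array([-1.0, 0.0, 1.0, 2.0])

def design_matrix(x, degree):
    """Return the matrix with columns 1, x, x**2, ..., x**degree."""
    # increasing=True orders the columns from x**0 upward,
    # matching the layout of X_linear and X_quadratic above.
    return np.vander(x, N=degree + 1, increasing=True)

X_lin = design_matrix(x, 1)   # shape (4, 2)
X_quad = design_matrix(x, 2)  # shape (4, 3)
X_cub = design_matrix(x, 3)   # shape (4, 4)
```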

In [4]:
# for n in np.arange(xpoints.size):
#     xmatrix.append([1.0,xpoints[n]]) #adding x points to x matrix for linear

# for n in np.arange(xpoints.size):
#     xmatrix_quad.append([1.0,xpoints[n],xpoints[n]**2]) #adding x points to x matrix for quadratic 
    
# for n in np.arange(xpoints.size):
#     xmatrix_cub.append([1.0,xpoints[n],xpoints[n]**2,xpoints[n]**3]) #adding x points to x matrix for cubic 
    
# # changing to nparray
# xmatrix = np.array(xmatrix) 
# xmatrix_quad = np.array(xmatrix_quad) 
# xmatrix_cub = np.array(xmatrix_cub)

# print(xmatrix_quad)

In [5]:
# a = np.array([1,xpoints[np.arange(xpoints.size)]])
# When you compute xmatrix, see if you can come up with a slick way of doing it that doesn’t require using any loops, 
# starting from xpoints. What you have is fine, but this is good practice for when you deal with lots of data. 
# (Loops are slow when you have really big arrays). Hint: np.vstack or np.hstack might be helpful.
    
xmatrix = np.hstack([one_arr, arr])
xmatrix_quad = np.hstack([xmatrix,arr**2])
xmatrix_cub = np.hstack([xmatrix_quad,arr**3])

# Power Law

This is an attempt to fit a power law of the form $y = \beta x^\alpha$. To fit it by linear least squares, the power law must first be linearized since, according to Adrian's notes, the "linear part of the term 'linear fit' just means linear in the parameters". One way to do that (found by searching online) is to take the logarithm of both sides:

$\log(y) = \log(\beta x^\alpha) = \log(\beta) + \log(x^\alpha) = \log(\beta) + \alpha \log(x)$

Therefore, the linearization of $y = \beta x^\alpha$ is $\log(y) = \log(\beta) + \alpha \log(x)$. Let $y^{'}=\log(y)$ and $x^{'}=\log(x)$ so that $y^{'} = \log(\beta) + \alpha x^{'}$. The fit is then linear in the parameters $\log(\beta)$ (the intercept) and $\alpha$ (the slope).

With this, we can proceed as in the linear case, but here the x matrix and y vector are:

$$X_{power} = \begin{bmatrix}
1 & \log(x_{1}) \\
1 & \log(x_{2}) \\
\vdots & \vdots \\
\end{bmatrix}, \quad
\bar{y} = \begin{bmatrix}
\log(y_{1}) \\
\log(y_{2}) \\
\vdots \\
\end{bmatrix}
$$
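
To check the linearization, here is a small sketch on synthetic, noise-free data. The values of `beta_true` and `alpha_true` are made up, and `np.linalg.lstsq` stands in for the matrix formula used in this notebook; it should recover the same parameters when every point has equal weight.

```python
import numpy as np

# Synthetic, noise-free data from y = beta * x**alpha (values are made up).
beta_true, alpha_true = 2.5, -0.7
x = np.linspace(1.0, 10.0, 20)
y = beta_true * x ** alpha_true

# Linearized model: log(y) = log(beta) + alpha * log(x)
A = np.column_stack([np.ones_like(x), np.log(x)])
coeffs, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)

log_beta, alpha = coeffs   # column 0 -> intercept, column 1 -> slope
beta = np.exp(log_beta)    # undo the log to recover beta
```

Because the first column of the matrix is the column of ones, the first fitted parameter is $\log(\beta)$ and the second is $\alpha$.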

In [6]:
# for n in np.arange(xpoints.size):
#     xmatrix_pow.append([1.0, np.log(xpoints[n])]) #adding x points to x matrix for power function 
    
# #changing to nparray
# xmatrix_pow = np.array(xmatrix_pow) 
# print(xmatrix_pow)

xmatrix_pow = np.hstack([one_arr,np.log(arr)])

# for n in np.arange(ypoints.size):
#     yval_log.append(np.log(ypoints[n]))
    
# print(yval_log)

yval_log = np.log(ypoints.to_numpy().reshape(-1, 1)) # log(y) as a column vector

# Noise covariance matrix

This method is based on Adrian's notes for determining $\hat{x}$ in $y^{model} = A\hat{x}$. From his notes, $\hat{x}$ is defined as $\hat{x} = [A^TN^{-1}A]^{-1}A^TN^{-1}\bar{y}$.

From the earlier code, I already found $A$ = xmatrix and $\bar{y}$. To find $N$, I first need to find the variance or $\sigma^{2}$. The $N$ matrix is as follows:
$$ N = \begin{pmatrix}
\sigma_{1}^{2} & 0 & 0 &\ldots{} \\
0 & \sigma_{2}^{2} & 0 & \ldots{} \\
0 & 0 & \sigma_{3}^{2} & \ldots{} \\
\vdots & \vdots & \vdots & \ddots \\
\end{pmatrix} $$
To find $\sigma^{2}$, I used the following formula:
$$\sigma^{2} = \frac{1}{n} \sum_{i=1}^{n}{(x_{i} - \mu)^{2}}$$
where $n$ is the number of terms (written $n$ here to avoid confusion with the noise matrix $N$) and $\mu$ is the mean of the $x_{i}$.
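
As a rough sketch of this step, the snippet below evaluates the variance formula on a made-up sample and builds a diagonal noise matrix from it. It assumes every data point shares the same variance, which is a simplification and not something taken from Adrian's notes; the array `values` is hypothetical.

```python
import numpy as np

# Made-up sample, for illustration only.
values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mu = values.mean()
sigma_sq = np.sum((values - mu) ** 2) / values.size  # the formula above

# np.var uses the same 1/n normalization by default (ddof=0).
sigma_sq_check = np.var(values)

# Assumption: every point shares this variance, so the noise matrix
# is sigma_sq times the identity (zeros off the diagonal).
noise_cov = sigma_sq * np.identity(values.size)
```

If each point had its own measured uncertainty, `np.diag` applied to the array of per-point variances would give the diagonal matrix instead.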

In [7]:
# for n in np.arange(xpoints.size):
#         sigma_sq.append(((ypoints[n] - avg_x)**2)/(xpoints.size))

# sigma_sq= np.array(sigma_sq) #sigma squared or variance
# print(sigma_sq)

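# Per-point term of the variance of y, measured from the mean of y.
# Not used by the fit at the moment, because noise_cov is the identity below.
sigma_sq = (((ypoints - ypoints.mean())**2) / ypoints.size).to_numpy()
# print(sigma_sq)
# noise_cov = np.diag(sigma_sq)
noise_cov = np.identity(xpoints.size) # identity: every point gets unit variance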

Now that I have $A$, $N$, and $\bar{y}$, finding $\hat{x}$ is just a matter of multiplying everything out in $\hat{x} = [A^TN^{-1}A]^{-1}A^TN^{-1}\bar{y}$. I broke the process down into steps:
<ol>
<li>$A^TN^{-1}$</li>
<li>$A^TN^{-1}\bar{y}$</li>
<li>$[A^TN^{-1}A]^{-1}$</li>
<li>$\hat{x} = [A^TN^{-1}A]^{-1}A^TN^{-1}\bar{y}$</li>
<li>$y^{model} = A\hat{x}$</li>
</ol>	
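
The steps above can be sketched as follows for the case of a diagonal $N$. This is an illustrative variant, not the function used in this notebook: the name `weighted_least_squares` and the argument `noise_var` (the diagonal of $N$) are hypothetical, and `np.linalg.solve` replaces the explicit inverse in steps 3 and 4, which is usually better behaved numerically.

```python
import numpy as np

def weighted_least_squares(A, y, noise_var):
    """Sketch of x_hat = [A^T N^-1 A]^-1 A^T N^-1 y for a diagonal N.

    noise_var holds the diagonal of N (one variance per data point).
    """
    At_Ninv = A.T / noise_var          # Step 1: A^T N^-1 (divide column j by variance j)
    rhs = At_Ninv @ y                  # Step 2: A^T N^-1 y
    lhs = At_Ninv @ A                  # A^T N^-1 A
    x_hat = np.linalg.solve(lhs, rhs)  # Steps 3-4 without forming the inverse
    y_model = A @ x_hat                # Step 5: y_model = A x_hat
    return y_model, x_hat

# Made-up points lying exactly on y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x
A = np.column_stack([np.ones_like(x), x])
y_model, x_hat = weighted_least_squares(A, y, noise_var=np.ones_like(x))
```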


In [8]:
# returns y_model and the best-fit parameter vector (x_bar in the code, x-hat in the text)
def variance(m,yval):
    dot_matrix = np.dot(m.T,np.linalg.inv(noise_cov)) #Step 1
    doty_matrix = np.dot(dot_matrix,yval) #Step 2
    inv_matrix = np.linalg.inv(np.dot(dot_matrix,m)) #Step 3
    x_bar = np.dot(inv_matrix, doty_matrix) #Step 4
    y_model = np.dot(m, x_bar) #Step 5
    return y_model, x_bar

# Error Covariance

This section finds the uncertainty on the fitted parameters, which indicates how far the fit may be from the true parameters. To determine it, use the error covariance, defined as $V= [A^{T}N^{-1}A]^{-1}$. The square root of each diagonal element of $V$ gives the error bar on the corresponding parameter. The off-diagonal elements tell us how the errors on different parameters are correlated. Note that this is based on Adrian's notes.
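
A small worked sketch of these quantities is shown below. The sample points and the per-point variance of 0.25 are assumptions made up for the example; the correlation matrix `corr` is an extra step, not from the notes, that rescales the off-diagonal elements so they are easier to read.

```python
import numpy as np

# Made-up sample points and an assumed variance of 0.25 for every point.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
A = np.column_stack([np.ones_like(x), x])
noise_var = np.full(x.size, 0.25)

V = np.linalg.inv((A.T / noise_var) @ A)  # V = [A^T N^-1 A]^-1
errors = np.sqrt(np.diag(V))              # error bar on each parameter
corr = V / np.outer(errors, errors)       # correlation coefficients, between -1 and 1
```

In this example the off-diagonal entry of `corr` is negative, meaning that an overestimate of the slope tends to come with an underestimate of the intercept.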

In [9]:
def error_cov(a_matrix):
    dot_matrix = np.dot(a_matrix.T,np.linalg.inv(noise_cov)) #Step 1 from Noise covariance matrix section
    error = np.linalg.inv(np.dot(dot_matrix,a_matrix))
    return error

def error_bar(a_matrix):
    return np.sqrt(np.diag(error_cov(a_matrix)))

# Calling Functions and Graphing

In [10]:
plt.figure(1)

lin_fit, lin_param = variance(xmatrix, ypoints)
quad_fit, quad_param = variance(xmatrix_quad, ypoints)
cub_fit, cub_param = variance(xmatrix_cub, ypoints)
pow_fit, pow_param = variance(xmatrix_pow, yval_log)

plt.plot(xpoints, lin_fit, label='linear')
# plt.plot(xpoints, quad_fit, label='quadratic')
# plt.plot(xpoints, cub_fit, label='cubic')

plt.scatter(xpoints,ypoints, color='m')
plt.legend(loc='lower left')
plt.title("Linear Best Fit")
plt.xlabel(xpoints.name)
plt.ylabel(ypoints.name)

plt.figure(2)

plt.scatter(np.log(xpoints),yval_log, color='m')
plt.plot(np.log(xpoints), pow_fit, label='power')

plt.title("Linear Best Fit for Power (log-log)")
# Both axes show natural logs of the data, so label them accordingly
plt.xlabel('log(' + xpoints.name + ')')
plt.ylabel('log(' + ypoints.name + ')')

# Errors for Parameters

These are the errors on the parameters of the linear fit. The parameters are the intercept $b$ and the slope $m$ from the equation $y=mx+b$.

In [17]:
print('Error covariance is \n', error_cov(xmatrix))

print('\nThe slope (m) is', lin_param[1],'+/-', error_bar(xmatrix)[1], '(',lin_param[1]+error_bar(xmatrix)[1],',',
      lin_param[1]-error_bar(xmatrix)[1],')')
print('The intercept (b) is', lin_param[0],'+/-', error_bar(xmatrix)[0],'(',lin_param[0]+error_bar(xmatrix)[0],',',
      lin_param[0]-error_bar(xmatrix)[0],')')

Error covariance is 
 [[ 2.57701999e-01 -8.77793609e-10]
 [-8.77793609e-10  4.26223153e-18]]

The slope (m) is -1.0011280767193523e-26 +/- 2.064517263388118e-09 ( 2.064517263388118e-09 , -2.064517263388118e-09 )
The intercept (b) is 5.154605244256639e-18 +/- 0.5076435743479659 ( 0.5076435743479659 , -0.5076435743479659 )


These are the errors on the parameters of the power law fit. The parameters are $log(\beta)$ and $\alpha$ from the equation $y^{'} = log(\beta) + \alpha x^{'}$. Note that $y^{'}=log(y)$ and $x^{'}=log(x)$, so these values describe the fit on the log-log graph.

In [18]:
# Column 0 of xmatrix_pow is the column of ones, so pow_param[0] is log(beta) (intercept)
# and pow_param[1] is alpha (slope)
print('Error covariance is \n', error_cov(xmatrix_pow))
print('\nAlpha is', pow_param[1],'+/-', error_bar(xmatrix_pow)[1], '(',pow_param[1]+error_bar(xmatrix_pow)[1],',',
      pow_param[1]-error_bar(xmatrix_pow)[1],')')
print('Log(beta) is', pow_param[0],'+/-', error_bar(xmatrix_pow)[0],'(',pow_param[0]+error_bar(xmatrix_pow)[0],',',
      pow_param[0]-error_bar(xmatrix_pow)[0],')')

Error covariance is 
 [[59.24922146 -3.12741493]
 [-3.12741493  0.16529228]]

Alpha is [-0.75118538] +/- 0.4065615332029686 ( [-0.34462385] , [-1.15774692] )
Log(beta) is [-26.23002346] +/- 7.697351587556321 ( [-18.53267188] , [-33.92737505] )
