# Least Squares Fitting Package for 1-Parameter Linear Model

This notebook provides a specialized package for fitting a linear model to your data. This version is a 1-parameter fit that adjusts only the slope of the linear model (zero intercept).

The code below assumes that your data is in four-column format: two variables, each with its uncertainty. Typically you will need to modify:
- the name of your data file: modify the line containing `fname = ...`
- the name of the output file for the final figure: modify the `plt.savefig(...)` line
- axis title elements: `x_name`, `x_units`, `y_name`, `y_units`

Note that you do not supply initial guesses; the linear model has an exact analytical solution.
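
As a brief sketch of where that analytical solution comes from, assume independent Gaussian uncertainties $\sigma_i$ on the y values only (the code below does not use the x uncertainties in the fit). The symbols $m$, $\sigma_i$, and $\sigma_m$ are notation introduced here for illustration. The fit minimizes

$$\chi^2(m) = \sum_i \frac{(y_i - m x_i)^2}{\sigma_i^2}$$

Setting the derivative with respect to $m$ to zero,

$$\frac{d\chi^2}{dm} = -2 \sum_i \frac{x_i (y_i - m x_i)}{\sigma_i^2} = 0,$$

gives the best-fit slope and its uncertainty:

$$m = \frac{\sum_i x_i y_i / \sigma_i^2}{\sum_i x_i^2 / \sigma_i^2}, \qquad \sigma_m = \left( \sum_i \frac{x_i^2}{\sigma_i^2} \right)^{-1/2}$$

These are the same expressions the code evaluates for `m` and `m_sigma`.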

The file should contain 4 columns holding the x values, the uncertainty in x, the y values, and the uncertainty in y, preceded by 2 header rows containing the names and units of the variables (this format is produced automatically by the data pack and trim code, as well as by the `data_entry2` function). If you need to swap the order of the variables, you can modify the indices for x and y in the `READ IN DATA COLUMNS` section of the code.
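
To make the expected layout concrete, here is a minimal sketch that reads a file of this shape with the same `np.loadtxt` options used in the code below. The file contents, header text, and variable names ending in `_example` are made up for illustration; an in-memory string stands in for a real file on disk.

```python
import io
import numpy as np

# Hypothetical file contents: two header rows (names, then units), followed by
# data rows of x, x-uncertainty, y, y-uncertainty. The values are invented.
example_csv = """Frequency,Frequency uncertainty,Voltage,Voltage uncertainty
Hz,Hz,V,V
1.0,0.1,2.1,0.2
2.0,0.1,3.9,0.2
3.0,0.1,6.2,0.3
"""

# np.loadtxt accepts a file-like object, so StringIO stands in for a file name
example_data = np.loadtxt(io.StringIO(example_csv), delimiter=",",
                          usecols=(0, 1, 2, 3), skiprows=2)

# Same column choices as the fitting code: x, y, and the uncertainty in y
x_example = example_data[:, 0]
y_example = example_data[:, 2]
y_sigma_example = example_data[:, 3]

print(example_data.shape)
```

If `np.loadtxt` raises an error on your own file, a likely cause is a different number of header rows or a non-numeric entry in one of the data rows.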

You will often also need to modify what is treated as the "x-axis" and the "y-axis". The code is fitting y as a function of x.
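
As an illustration of swapping the axes, the sketch below fits the same made-up data in both orientations. The helper `weighted_slope` and the array `demo` are hypothetical names introduced here, not part of the notebook; the point is that when the first variable is moved to the y-axis, its uncertainty column (index 1) must be used as the y uncertainty.

```python
import numpy as np

def weighted_slope(x, y, y_sigma):
    """Closed-form weighted least-squares slope for the model y = slope * x."""
    weights = 1.0 / y_sigma**2
    slope = np.sum(weights * x * y) / np.sum(weights * x**2)
    slope_sigma = np.sqrt(1.0 / np.sum(weights * x**2))
    return slope, slope_sigma

# Made-up array in the four-column layout: x, x-uncertainty, y, y-uncertainty
demo = np.array([[1.0, 0.1, 2.0, 0.2],
                 [2.0, 0.1, 4.0, 0.2],
                 [3.0, 0.1, 6.0, 0.2]])

# Default orientation: column 2 as a function of column 0, weighted by column 3
m_yx, m_yx_sigma = weighted_slope(demo[:, 0], demo[:, 2], demo[:, 3])

# Swapped orientation: column 0 as a function of column 2, weighted by column 1
m_xy, m_xy_sigma = weighted_slope(demo[:, 2], demo[:, 0], demo[:, 1])

print(m_yx, m_yx_sigma)
print(m_xy, m_xy_sigma)
```

In the notebook itself, the equivalent change is to edit the three index lines that define `x`, `y`, and `y_sigma`, and to swap the axis title elements to match.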

In [None]:
# The script below fits a 1-parameter linear model (a straight line through the origin) to the data
# First, we load some Python packages
import matplotlib.pyplot as plt
import numpy as np

################################################
# LIST OF ALL INPUTS
################################################

# fname is assumed to be a four-column .csv (comma-separated values) file. The first two rows
# are assumed to be headers, like those produced by our code for packing oscilloscope data.
# The four columns are x-values, x-uncertainties, y-values, y-uncertainties.
# The .csv file must be in the same folder as this fit program; otherwise the full
# file path must be given in fname, e.g. fname = 'folder/subfolder/subsubfolder/file.csv'

fname = "RC-tau_data.csv"

x_name = "Frequency"
x_units = "Hz"
y_name = "Voltage"
y_units = "V"

# The model you will fit to is defined below: a straight line through the origin.
# The only parameter in the model is the slope.
# No initial guesses are needed because the best-fit slope is calculated analytically.

param_names = ["slope"]


# definition of the fit function:
# a linear model with a slope and zero intercept

def fit_function(x, slope):
    return x*slope

# load the file "fname", defined above;
# if you have a file with more than 4 columns, you may need to select different indices for "usecols" (counting from 0)

data = np.loadtxt(fname, delimiter=",", comments="#", usecols=(0, 1, 2, 3), skiprows=2)

################################################
# READ IN DATA COLUMNS
################################################

# Here is where you access the data columns.
# You may need to alter these to choose what is on the y-axis and what is on the x-axis.
# Note that the x-uncertainties (column 1) are not used in this fit.

x = data[:, 0]
y = data[:, 2]
y_sigma = data[:, 3]

# calculate the best fit slope analytically (1-parameter solution)
m = np.sum(x*y/y_sigma**2)/np.sum((x/y_sigma)**2)

# calculate uncertainty of the best fit slope
m_sigma = np.sqrt(1/np.sum((x/y_sigma)**2))

print("Slope is", m, "+/-", m_sigma)

###############################################################################
# calculates and prints the chi-squared, degrees of freedom, and reduced chi-squared
###############################################################################

# function that calculates the chi-squared value of a fit
def chi_square(param1, x, y, sigma):
    # sum of squared residuals, each weighted by its uncertainty
    return np.sum((y-fit_function(x, param1))**2/sigma**2)

# calculate and print chi-squared as well as the per degree-of-freedom value
chi2 = chi_square(m,x,y,y_sigma)
# degrees of freedom is the number of data points minus the number of parameters
dof = len(x) - 1
print("\nGoodness of fit - Chi-squared measure:")
print("Chi2 = {}, degrees of freedom = {}, Chi2/dof = {}\n".format(chi2, dof, chi2/dof))


# evaluate the best-fit model on a fine grid for plotting
x_fitfunc = np.linspace(min(x), max(x), 500)
y_fitfunc = fit_function(x_fitfunc, m)
# residual is the difference between the data and the model
y_fit = fit_function(x, m)
residual = y-y_fit
# optionally, create a histogram of the residuals
# hist,bins = np.histogram(residual,bins=30)

fig = plt.figure(figsize=(7,10))

ax1 = fig.add_subplot(211)
ax1.errorbar(x,y,yerr=y_sigma,marker='.',linestyle='',label="measured data")
ax1.plot(x_fitfunc,y_fitfunc,marker="",linestyle="-",linewidth=2,color="r",
         label="best fit")
# add axis labels and title
ax1.set_xlabel('{} [{}]'.format(x_name,x_units))
ax1.set_ylabel('{} [{}]'.format(y_name,y_units))
ax1.set_title('Best fit of 1-Parameter Linear Model')
# set the x and y boundaries of your plot
#plt.xlim(lower_x,upper_x)
#plt.ylim(lower_y,upper_y)
# show a legend. loc='best' places the legend where the least amount of data is
# obstructed.
ax1.legend(loc='best',numpoints=1)


# this code adds a plot of the residuals to the figure
# (the optional histogram of the residuals is commented out below)
# fig = plt.figure(figsize=(7,10))
ax2 = fig.add_subplot(212)
ax2.errorbar(x,residual,yerr=y_sigma,marker='.',linestyle='',
             label="residual (y-y_fit)")
ax2.hlines(0,np.min(x),np.max(x),lw=2,alpha=0.8)
ax2.set_xlabel('{} [{}]'.format(x_name,x_units))
ax2.set_ylabel('y-y_fit [{}]'.format(y_units))
ax2.set_title('Residuals for the Best Fit')
ax2.legend(loc='best',numpoints=1)

# ax3 = fig.add_subplot(313)
# ax3.bar(bins[:-1],hist,width=bins[1]-bins[0])

# ax3.set_ylim(0,1.2*np.max(hist))
# ax3.set_xlabel('y-y_fit [{}]'.format(y_units))
# ax3.set_ylabel('Number of occurrences')
# ax3.set_title('Histogram of the Residuals')

# Modify the file name in plt.savefig below to change the name of the file used
# to store a JPEG of your best fit graphs.
# The figure must be saved before it is shown, because plt.show clears the plot
# information after displaying it.
plt.savefig('FittingResults.jpeg')
plt.show()