## Template 2: Autofit

Developed by Rebeckah Fussell for Cornell Physics Labs.




This is a template you can use to analyze your lab data using the autoFit function introduced in the Fitting II homework. 

autoFit is a function that fits a linear model to a two-dimensional dataset (x and y) using a chi-squared fit weighted by the uncertainty in y. Its results are only meaningful for linear data.
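For intuition, a weighted linear fit of this kind can be sketched in plain NumPy. This is not the autoFit source code (that lives in utilities.ipynb). It is a minimal illustration that assumes the model y = m*x + b and weights of 1/dy**2. The helper name `weighted_linear_fit` and the numbers are made up for the example.

```python
import numpy as np

def weighted_linear_fit(x, y, dy):
    """Fit y = m*x + b by minimizing chi2 = sum(((y - (m*x + b)) / dy)**2)."""
    x, y, dy = (np.asarray(a, dtype=float) for a in (x, y, dy))
    w = 1.0 / dy**2                      # points with small dy count more
    S, Sx, Sy = w.sum(), (w * x).sum(), (w * y).sum()
    Sxx, Sxy = (w * x * x).sum(), (w * x * y).sum()
    delta = S * Sxx - Sx**2
    m = (S * Sxy - Sx * Sy) / delta      # best-fit slope
    b = (Sxx * Sy - Sx * Sxy) / delta    # best-fit intercept
    chi2 = np.sum(((y - (m * x + b)) / dy) ** 2)
    return m, b, chi2

# Made-up data lying exactly on the line y = 2x + 1
x_demo = np.array([1.0, 2.0, 3.0, 4.0])
y_demo = 2.0 * x_demo + 1.0
dy_demo = np.array([0.1, 0.2, 0.1, 0.3])
m_demo, b_demo, chi2_demo = weighted_linear_fit(x_demo, y_demo, dy_demo)
```

Because every residual is divided by dy, the sketch only works when each point has a nonzero uncertainty.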

Make sure that "utilities.ipynb" is in the same folder as this file, then run the cell below to gain access to the autoFit function and any other necessary packages.

In [None]:
%run ./utilities.ipynb

Running help(autoFit) will show you information about how to use the autoFit function.

In [None]:
help(autoFit)

You have a couple of options for entering your data, depending on how you determine your uncertainties:

1. Manually enter the uncertainty associated with each datapoint.
2. Collect multiple trials for each datapoint and have autoFit calculate statistical uncertainties.

Select one of the options above based on your experimental design. Then go to the corresponding section below.

Note that entering only x and y values, with neither uncertainties nor multiple trials from which uncertainties can be calculated, is not an option. A weighted chi-squared cannot be calculated without some information about the uncertainty in your data.

### Option 1


You may have an experiment where it does not make sense to take multiple trials. For example, if your largest source of uncertainty is instrumental (e.g. measuring extensions with a ruler), there is no distribution in your measurements so it does not make sense to calculate statistics on multiple trials. 

Say you selected 6 different x-values and measured a y-value at each. You would also need to determine a dy-value for each of the 6 measurements, corresponding to the uncertainty in that y-value. Your dy-values should have the same units as your y-values.

Enter your x_values, y_values, and dy_values below. You probably took a different number of datapoints than 6, so change the length of the array below accordingly. Make sure that all three arrays have the same length.

In [None]:
x_values = np.array([...,...,...,...,...,...])
y_values = np.array([...,...,...,...,...,...])
dy_values = np.array([...,...,...,...,...,...])
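Before fitting, it can help to confirm that the arrays are consistent. The check below is only a sketch: `check_manual_data` is a hypothetical helper, not part of utilities.ipynb, and the numbers are made up. It assumes, as weighting by uncertainty requires, that every dy value is greater than zero.

```python
import numpy as np

def check_manual_data(x, y, dy):
    """Return a list of problems found in manually entered data (empty if none)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dy = np.asarray(dy, dtype=float)
    problems = []
    # All three arrays need one entry per measurement
    if not (len(x) == len(y) == len(dy)):
        problems.append(f"length mismatch: x={len(x)}, y={len(y)}, dy={len(dy)}")
    # A zero or negative uncertainty cannot be used as a weight
    if np.any(dy <= 0):
        problems.append("every dy value must be positive")
    return problems

good = check_manual_data([1, 2, 3], [2.1, 3.9, 6.2], [0.1, 0.1, 0.2])
bad = check_manual_data([1, 2, 3], [2.1, 3.9], [0.1, 0.0, 0.2])
print(good)  # no problems found
print(bad)   # reports both the length mismatch and the zero uncertainty
```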

Now that we have our data, write a function call to the fitting function. 

You will need to edit the title, xaxis, and yaxis labels. Do this by changing the text inside each set of quotation marks.

In [None]:
#if uncertainties are being entered manually
autoFit(x_values, y_values, dy=dy_values, title="Use title= in your call", xaxis="Use xaxis= in your call", yaxis="Use yaxis= in your call")

### Option 2

Say you collected 5 trials of some y-value at 6 different x-values. First you will want to store the 6 x-values in one array, as below. Type the data from your experiment into this array. You probably took a different number of x-values, so change the length of the array below accordingly. 

In [None]:
x_values = np.array([...,...,...,...,...,...])

Next, you will want to store all your y-values. Following our example of 5 trials at each of 6 x-values, there are two ways you can do this. You can create a separate 5-element array for each of the 6 x-values (possibly called y1_values, y2_values, y3_values, ...), calculate the mean and uncertainty of each, and store these means and uncertainties in new 6-element arrays (possibly called y_values and dy_values).
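As a rough sketch of this first approach, the snippet below reduces each set of trials to a mean and an uncertainty. It assumes the uncertainty is the standard error of the mean (sample standard deviation divided by the square root of the number of trials); your lab may call for a different estimate. The data and the `_demo` names are made up, and only 3 x-values are shown to keep it short.

```python
import numpy as np

# Made-up trial data: 5 trials at each of 3 x-values
y1_values = np.array([1.0, 1.2, 0.8, 1.1, 0.9])
y2_values = np.array([2.0, 2.1, 1.9, 2.2, 1.8])
y3_values = np.array([3.1, 2.9, 3.0, 3.2, 2.8])

def mean_and_uncertainty(trials):
    """Return the mean and the standard error of the mean for one set of trials."""
    trials = np.asarray(trials, dtype=float)
    mean = trials.mean()
    sem = trials.std(ddof=1) / np.sqrt(len(trials))  # sample std / sqrt(N)
    return mean, sem

results = [mean_and_uncertainty(t) for t in (y1_values, y2_values, y3_values)]
y_values_demo = np.array([r[0] for r in results])   # one mean per x-value
dy_values_demo = np.array([r[1] for r in results])  # one uncertainty per x-value
```

The resulting one-dimensional arrays have the same form as the y_values and dy_values entered by hand in Option 1.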

A quicker way is to give all your data in one 2-dimensional y_values array, and let the fitting function calculate means and uncertainties for you. This template shows you how to do that. Row 1 should hold the 5 y-values corresponding to the first x-value in our array above, Row 2 should correspond to the second x-value, and so on (6 rows total in our example). Each column corresponds to one trial (5 columns total in our example).

A 2-dimensional array can be defined as below. Notice that there are 6 rows and 5 columns. Each row has its own square brackets [ ], rows are separated by a single comma, and the whole set of rows is enclosed in one more pair of square brackets (inside the parentheses of the np.array() function call).

Once you understand the structure of this 2-dimensional array definition, type in your own data. You will probably need to change the number of rows and columns in the definition to suit your own lab data. When you are done, check that the number of rows in y_values is equal to the number of elements in x_values.

In [None]:
y_values = np.array([[...,...,...,...,...],
                     [...,...,...,...,...],
                     [...,...,...,...,...],
                     [...,...,...,...,...],
                     [...,...,...,...,...],
                     [...,...,...,...,...]])
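One way to confirm the layout is to look at the array's shape. The sketch below uses made-up numbers with 3 x-values and 4 trials. The last two lines show the kind of row-by-row reduction a fitting function might perform, assuming the standard error of the mean as the uncertainty; whether autoFit uses exactly this estimate is not stated here, so check help(autoFit).

```python
import numpy as np

x_demo = np.array([1.0, 2.0, 3.0])
# 3 rows (one per x-value), 4 columns (one per trial); numbers are made up
y_demo = np.array([[1.0, 1.2, 0.8, 1.0],
                   [2.1, 1.9, 2.0, 2.0],
                   [3.0, 3.2, 2.8, 3.0]])

n_rows, n_trials = y_demo.shape          # shape is (rows, columns)
rows_match = (n_rows == len(x_demo))     # should be True before fitting

# axis=1 averages across the columns, giving one value per row
y_mean = y_demo.mean(axis=1)
dy_mean = y_demo.std(axis=1, ddof=1) / np.sqrt(n_trials)
```

If `rows_match` is False, the rows and columns may be swapped, or a row may be missing.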

Now that we have our data, write a function call to the fitting function. 

You will need to edit the title, xaxis, and yaxis labels. Do this by changing the text inside each set of quotation marks.

In [None]:
#if uncertainties are being calculated automatically
autoFit(x_values, y_values, dy=[], title="Use title= in your call", xaxis="Use xaxis= in your call", yaxis="Use yaxis= in your call")


### Nonlinear Data

You may find while plotting your data that it is not linear. autoFit is only built to accommodate linear data, and it is not appropriate to report data with a linear fit if the data is clearly nonlinear. 

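If you find yourself in this situation, use the code snippet below to plot your data with uncertainties but without a linear fit. The snippet expects one-dimensional y_values and dy_values arrays like those in Option 1; if you entered a 2-dimensional y_values array, first calculate the mean and uncertainty of each row.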

In [None]:
#You can use data you entered earlier in this template, or you can uncomment the lines below to enter new data
import numpy as np
import matplotlib.pyplot as plt

#x_values = np.array([...,...,...,...,...,...])
#y_values = np.array([...,...,...,...,...,...])
#dy_values = np.array([...,...,...,...,...,...])


In [None]:
def poly(x, args):
    '''
    returns the value of the polynomial sum (x**i*args[i])
    '''
    total=x**0*args[0]
    for i in range(1,len(args)):
        total+=x**i*args[i]
    return total

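def plotData(x_values, y_values, dy=None, title="Use title= in your call", xaxis="Use xaxis= in your call", yaxis="Use yaxis= in your call"):
    '''
    plots y_values against x_values with error bars of size dy, without a fit
    '''
    plt.figure()
    #use the dy argument for the error bars rather than a global variable
    plt.errorbar(x_values, y_values, yerr=dy, fmt='.k')
    plt.title(title)
    plt.xlabel(xaxis)
    plt.ylabel(yaxis)
    plt.show()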
    

In [None]:
plotData(x_values, y_values, dy=dy_values, title="Use title= in your call", xaxis="Use xaxis= in your call", yaxis="Use yaxis= in your call")