# Pre-class Reading: Day 16 (Nov 01, 2023)<br>Fitting
Learning goals
1. Use `scipy.optimize.curve_fit` to fit an x-y data set to a model and extract the fitting parameters
1. Use extracted fitting parameters to plot the best fit model
1. Extract uncertainties from fitting parameters
1. Include absolute y-data uncertainties in the fit
1. Make good initial guesses of the fitting parameters to improve the chance that the real best fit will be found instead of a local minimum

## *No self-assessment questions for this reading*

Although you have likely interacted with many of these concepts in Physics 219 and elsewhere, we're presenting some of the details in different ways than you may have encountered, so we would like everybody to work their way through this notebook and then submit it as if it were a regular reading assignment with self-assessment questions.

## *16.1 Fitting with `scipy.optimize.curve_fit`*

The main fitting routine that we will use in this course is the `scipy` fitting routine `curve_fit()`. You can read more about it at:
    
* http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html

This fitting routine uses non-linear least squares fitting. For those of you who took Physics 119, this is the same idea as minimizing chi-squared: the routine searches for the parameters that minimize the sum of the squared residuals.
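
As a minimal sketch of what "minimizing the sum of the squared residuals" means, the snippet below evaluates that sum for a straight-line model on a small made-up data set. The arrays `x_demo` and `y_demo` and the helpers `line_demo` and `sum_sq_residuals` are illustrative names of our own, not part of `scipy`. If the fit has worked, nudging either best-fit parameter should make the sum larger.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical data that roughly follow y = 2x + 1 (made up for illustration)
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_demo = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

def line_demo(x, m, b):
    return m * x + b

def sum_sq_residuals(params):
    # Residual = data value minus the value predicted by the model
    residuals = y_demo - line_demo(x_demo, *params)
    return np.sum(residuals**2)

best, _ = curve_fit(line_demo, x_demo, y_demo)

ssr_best = sum_sq_residuals(best)
ssr_shift_m = sum_sq_residuals(best + np.array([0.1, 0.0]))  # nudge the slope
ssr_shift_b = sum_sq_residuals(best + np.array([0.0, 0.1]))  # nudge the intercept

print(f"best fit:          {ssr_best:.3f}")
print(f"slope nudged:      {ssr_shift_m:.3f}")
print(f"intercept nudged:  {ssr_shift_b:.3f}")
```

When y-uncertainties $\sigma_i$ are supplied, each residual is divided by its $\sigma_i$ before squaring, which gives the usual chi-squared.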

## *16.2 An initial linear fit without uncertainties*

Below we provide a data set that we expect should be modelled by a straight line, $y=mx+b$. Let's first plot these data so we have a sense of what they look like.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

x_data = np.array([.04, .06, .08, .1 , .13, .15])
y_data = np.array([3.14, 3.89, 4.04, 4.33, 4.67, 5.08])
yerror = np.array([0.52, 0.22, 0.53, 0.53, 0.24, 0.54])

plt.errorbar(x_data, y_data, yerr=yerror, fmt = "bo", capsize=5)
plt.xlabel('x values')
plt.ylabel('y values')
plt.grid(True)
plt.show()

Based on some quick mental math, we can get initial estimates of the slope and y-intercept for this data set. Let us initially ignore the uncertainties and notice that the data set has a rise of approximately 2 units ($\approx 5-3$) over the entire data set and a run of approximately 0.1 units ($\approx 0.15-0.04$). Thus the slope is approximately $2/0.1 \approx 20$. A reasonable estimate for the y-intercept would be $\approx 2.0 \mbox{ units}$.
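
The same rise-over-run reasoning can be sketched in a few lines of code. This is only a rough illustration using the first and last data points; the names `m_est` and `b_est` are our own. Because it uses the unrounded endpoint values, it gives slightly different numbers than the mental-math estimates above, but either set is a sensible sanity check for the fit results.

```python
import numpy as np

x_data = np.array([.04, .06, .08, .1, .13, .15])
y_data = np.array([3.14, 3.89, 4.04, 4.33, 4.67, 5.08])

# Rise over run between the first and last data points
m_est = (y_data[-1] - y_data[0]) / (x_data[-1] - x_data[0])

# Rearrange y = m*x + b at the first data point to estimate the intercept
b_est = y_data[0] - m_est * x_data[0]

print(f"m_est = {m_est:.1f};  b_est = {b_est:.1f}")
```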

Let's run the fit and double-check that the results make some sense. Note that we do have y-uncertainties in our data set (`yerror`), but we are not yet using them with `curve_fit`.

In [None]:
# This allows us to use scipy's 'optimize.curve_fit'
from scipy import optimize

# Like with solve_ivp, we need to define a function that curve_fit will interact with
# - The first parameter will always be the x variable
# - The subsequent parameters will be our parameters for our fitting function
# - Finally, we return what this function would calculate our y value to be
#   so that curve_fit can compare it to y_data
def line(x, m, b):
    return m*x + b

# curve_fit returns a tuple consisting of
# - An array of our best-fit parameters, which we call fitparams
# - A covariance matrix (fitcov) from which we can extract the uncertainties in our
#   best-fit parameters.
fitparams, fitcov = optimize.curve_fit(line, x_data, y_data)

# Print the results
print(f"m = {fitparams[0]:.1f};  b = {fitparams[1]:.2f}")

# The diagonal of the covariance matrix gives the variance (standard deviation squared)
# of our fitting parameters and the off-diagonals communicate the correlations between
# our fitting parameters. To get the uncertainty in each fitting parameter, we need to take
# the square root of the variance.
print(f"dm = {np.sqrt(fitcov[0,0]):.2};  db = {np.sqrt(fitcov[1,1]):.2}")

We find that our initial estimates were plausible, but more than one standard deviation away from the actual results.
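
One way to make this comparison quantitative is sketched below: we express the distance between each estimate and its best-fit value in units of the fitted uncertainty. The names `estimates`, `n_sigma` and `corr_mb` are our own, and the correlation coefficient at the end is just one possible way to summarize the off-diagonal element of the covariance matrix.

```python
import numpy as np
from scipy import optimize

x_data = np.array([.04, .06, .08, .1, .13, .15])
y_data = np.array([3.14, 3.89, 4.04, 4.33, 4.67, 5.08])

def line(x, m, b):
    return m*x + b

fitparams, fitcov = optimize.curve_fit(line, x_data, y_data)

# Uncertainties: square roots of the diagonal of the covariance matrix
uncertainties = np.sqrt(np.diag(fitcov))

# Our mental-math estimates for [m, b]
estimates = np.array([20.0, 2.0])

# How many standard deviations is each estimate from the best-fit value?
n_sigma = np.abs(estimates - fitparams) / uncertainties
print(f"m estimate is {n_sigma[0]:.1f} sigma away;  b estimate is {n_sigma[1]:.1f} sigma away")

# Correlation coefficient between m and b from the off-diagonal element
corr_mb = fitcov[0, 1] / (uncertainties[0] * uncertainties[1])
print(f"correlation between m and b = {corr_mb:.2f}")
```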

Let's graph the results. 

In [None]:
## Let's plot the best fit line

# Produce enough x values over the data range to be able to make a smooth curve
xf = np.linspace(x_data.min(), x_data.max(), num = 50)

# Re-use our `line` function from before and give it our best-fit parameters
# - `*` in `*fitparams` unpacks it into fitparams[0] and fitparams[1] when
#   it is being sent to the `line` function
yf = line(xf, *fitparams)

# Plot the best fit line
plt.plot(xf, yf, "r-", label="Fit (ignoring uncertainties)")

## Let's bring back the previous graph
plt.errorbar(x_data, y_data, yerr=yerror, fmt = "bo", capsize=5, label="Data")
plt.xlabel('x values')
plt.ylabel('y values')
plt.grid(True)
plt.legend()
plt.show()

Let's take the next step and include the y-uncertainties from our data in the fit.

## *16.3 Including uncertainties in our fit*

We will use two optional arguments to communicate to `curve_fit` that we want it to use the data y-uncertainties in our fit. This will use the standard approach of weighting each data point by 1/yerror<sup>2</sup>.
1. Our first optional argument is `sigma=yerror`, which tells `curve_fit` that our "one sigma" errors are stored in `yerror`. 
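2. Our second optional argument is `absolute_sigma=True`, which tells `curve_fit` to treat these as absolute errors, which is how we are plotting them. If set to `False` or not specified, `curve_fit` will treat these as relative errors: only their relative sizes matter, and the reported parameter uncertainties are rescaled based on how much the data scatter about the fit.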
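
To see what `absolute_sigma` changes in practice, here is a minimal sketch that runs the same weighted fit both ways. To the best of our understanding of `curve_fit`, the best-fit parameters are identical in the two cases and only the covariance matrix differs, by a factor equal to the reduced chi-squared of the fit. The variable names (`popt_abs`, `popt_rel`, `chi2_red`, and so on) are our own.

```python
import numpy as np
from scipy import optimize

x_data = np.array([.04, .06, .08, .1, .13, .15])
y_data = np.array([3.14, 3.89, 4.04, 4.33, 4.67, 5.08])
yerror = np.array([0.52, 0.22, 0.53, 0.53, 0.24, 0.54])

def line(x, m, b):
    return m*x + b

# Same fit, once with absolute errors and once with relative errors
popt_abs, pcov_abs = optimize.curve_fit(line, x_data, y_data,
                                        sigma=yerror, absolute_sigma=True)
popt_rel, pcov_rel = optimize.curve_fit(line, x_data, y_data,
                                        sigma=yerror, absolute_sigma=False)

# Reduced chi-squared: chi-squared divided by (number of points - number of parameters)
chi2 = np.sum(((y_data - line(x_data, *popt_abs)) / yerror)**2)
chi2_red = chi2 / (len(x_data) - len(popt_abs))

print("same best-fit parameters:", np.allclose(popt_abs, popt_rel))
print(f"reduced chi-squared = {chi2_red:.2f}")
print("uncertainties (absolute):", np.sqrt(np.diag(pcov_abs)))
print("uncertainties (relative):", np.sqrt(np.diag(pcov_rel)))
```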

In [None]:
# We can re-use our previous `line` function so just need to call `curve_fit` with
# our new optional arguments

fitparams2, fitcov2 = optimize.curve_fit(line, x_data, y_data, sigma=yerror, absolute_sigma=True)

print(f"m = {fitparams2[0]:.1f};  b = {fitparams2[1]:.2f}")
print(f"dm = {np.sqrt(fitcov2[0,0]):.2};  db = {np.sqrt(fitcov2[1,1]):.2}")

Let's plot our results. Note that since we already have `xf` and `yf`, we just need to find `yf2`, the y-values predicted by our `line` function when using `fitparams2` instead of `fitparams`.

In [None]:
yf2 = line(xf, *fitparams2)

# Plot the best fit line
plt.plot(xf, yf, "r-", label="Fit (ignoring uncertainties)")
plt.plot(xf, yf2, "g--", label="Fit (including uncertainties)")

## Let's bring back the previous graph
plt.errorbar(x_data, y_data, yerr=yerror, fmt = "bo", capsize=5, label="Data")
plt.xlabel('x values')
plt.ylabel('y values')
plt.grid(True)
plt.legend()
plt.show()

We can see that including `yerror` in the fit as absolute errors results in a fit that gives much more weight to the two data points with small uncertainties.
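
A quick way to see how lopsided the weighting is: compute the 1/yerror<sup>2</sup> weights directly and express each as a fraction of the total. This is only an illustrative calculation (the names `weights` and `weight_fraction` are ours), not something `curve_fit` requires us to do.

```python
import numpy as np

yerror = np.array([0.52, 0.22, 0.53, 0.53, 0.24, 0.54])

# Standard least-squares weights and their share of the total weight
weights = 1 / yerror**2
weight_fraction = weights / weights.sum()

for err, frac in zip(yerror, weight_fraction):
    print(f"yerror = {err:.2f}  ->  {100*frac:.0f}% of the total weight")

# Combined share of the two points with the smallest uncertainties
small_share = weight_fraction[[1, 4]].sum()
print(f"Two most precise points together: {100*small_share:.0f}%")
```

The two most precise points carry well over half of the total weight, which is why the weighted fit is pulled toward them.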

## *16.4 Initial guesses*

The process of least-squares fitting involves looking for a minimum in the sum of the squared residuals (a residual is the distance between a data point and the value predicted by the model). Models will tend to have an absolute minimum value for this sum, corresponding to the best possible fit of that model to those data, but many models will also have many local minima. This means that `curve_fit` may find a solution that minimizes the sum of the squared residuals as compared to nearby variations of the solution parameters, but which is not the absolute minimum. Let's take a look.
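
To make the idea of local minima concrete, here is a small sketch using made-up, noise-free data that follow $y = 2\sin(0.5t)$. We hold the amplitude fixed at 2 (a simplification so that we can scan a single parameter) and evaluate the sum of the squared residuals over a grid of angular frequencies. All names here (`t_demo`, `y_demo`, `omega_grid`, `sum_sq_residuals`) are our own.

```python
import numpy as np

# Hypothetical noise-free data: one full period of y = 2 sin(0.5 t)
t_demo = np.linspace(0, 4 * np.pi, 50)
y_demo = 2.0 * np.sin(0.5 * t_demo)

def sum_sq_residuals(a, omega):
    return np.sum((y_demo - a * np.sin(omega * t_demo))**2)

# Scan omega with the amplitude held fixed at a = 2
omega_grid = np.linspace(0.05, 3.0, 591)
ssr = np.array([sum_sq_residuals(2.0, w) for w in omega_grid])

# A grid point is a local minimum if it is lower than both of its neighbours
is_local_min = (ssr[1:-1] < ssr[:-2]) & (ssr[1:-1] < ssr[2:])
omega_minima = omega_grid[1:-1][is_local_min]

# The absolute minimum on the grid
omega_global = omega_grid[np.argmin(ssr)]

print("local minima found at omega =", np.round(omega_minima, 2))
print(f"absolute minimum at omega = {omega_global:.2f}")
```

Only one of the minima in the scan is the absolute minimum; a fit that starts near one of the others can get stuck there.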

First, let's load some initial data and plot it.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

t_data = np.array([ 0.        ,  0.25645654,  0.51291309,  0.76936963,  1.02582617,
        1.28228272,  1.53873926,  1.7951958 ,  2.05165235,  2.30810889,
        2.56456543,  2.82102197,  3.07747852,  3.33393506,  3.5903916 ,
        3.84684815,  4.10330469,  4.35976123,  4.61621778,  4.87267432,
        5.12913086,  5.38558741,  5.64204395,  5.89850049,  6.15495704,
        6.41141358,  6.66787012,  6.92432667,  7.18078321,  7.43723975,
        7.69369629,  7.95015284,  8.20660938,  8.46306592,  8.71952247,
        8.97597901,  9.23243555,  9.4888921 ,  9.74534864, 10.00180518,
       10.25826173, 10.51471827, 10.77117481, 11.02763136, 11.2840879 ,
       11.54054444, 11.79700098, 12.05345753, 12.30991407, 12.56637061])
y_data = np.array([-1.71815239e-03, -2.08015913e-01,  5.64748219e-01,  9.62492904e-01,
        9.20418866e-01,  9.35737259e-01,  1.32776675e+00,  1.58956031e+00,
        1.88558979e+00,  1.66143882e+00,  2.03767118e+00,  2.25638778e+00,
        2.00445035e+00,  2.04773244e+00,  2.17685040e+00,  1.88546903e+00,
        1.81505858e+00,  1.48373884e+00,  1.59274906e+00,  1.57320479e+00,
        9.91393316e-01,  1.08707412e+00,  7.51733967e-01, -4.56799784e-02,
       -3.91397237e-01, -1.96417994e-01, -2.74839190e-01, -4.23692135e-01,
       -7.91822825e-01, -1.31857939e+00, -1.12749963e+00, -1.84915218e+00,
       -1.90104185e+00, -2.06498665e+00, -1.90021998e+00, -1.85630497e+00,
       -2.16964289e+00, -2.38494504e+00, -1.79373814e+00, -1.84252623e+00,
       -1.61829111e+00, -1.70084067e+00, -1.63726353e+00, -1.41318373e+00,
       -1.14480241e+00, -1.20080281e+00, -9.50884586e-01, -8.69321294e-01,
       -1.45618763e-01, -2.32657944e-01])

plt.scatter(t_data, y_data, s=10, label='Data')
plt.xlabel('t values')
plt.ylabel('y values')
plt.grid(True)
plt.legend()
plt.show()

We're expecting these data to obey the model $y = a \sin(\omega t)$.

As we did previously, we can get some initial estimates of the parameters just by knowing the model function and inspecting the behaviour of the data. Since this is a sine wave, we can quite easily see that we expect $a \approx 2$. Additionally, we can also see that it takes approximately 12 time units to go through one full period and the relationship between period and angular frequency is $\omega = 2\pi/T$, giving us $\omega \approx 2\pi/12 \approx 0.5$. Let's keep these values in mind as we proceed with this fit.
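
If you would rather not read the estimates off the plot by eye, the sketch below shows one crude way to automate them, assuming roughly one clean period of a sine wave. It uses made-up, noise-free data (`t_demo`, `y_demo`); with noisy data the locations of the largest and smallest points are less reliable, so treat this as a rough starting point only.

```python
import numpy as np

# Hypothetical noise-free data: one full period of y = 2 sin(0.5 t)
t_demo = np.linspace(0, 4 * np.pi, 50)
y_demo = 2.0 * np.sin(0.5 * t_demo)

# Amplitude: half of the peak-to-peak range
a_est = (y_demo.max() - y_demo.min()) / 2

# Period: a peak and the neighbouring trough are half a period apart
t_peak = t_demo[np.argmax(y_demo)]
t_trough = t_demo[np.argmin(y_demo)]
T_est = 2 * abs(t_trough - t_peak)

# Angular frequency from omega = 2*pi/T
omega_est = 2 * np.pi / T_est

print(f"a_est = {a_est:.2f};  T_est = {T_est:.1f};  omega_est = {omega_est:.2f}")
```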

Let's first do a fit in the same way we have done it so far:

In [None]:
# Define the sine wave model function
def sin_model(x, a, omega):
    return a * np.sin(omega * x)

# Fit with no initial guess
popt_no_guess, pcov_no_guess = curve_fit(sin_model, t_data, y_data)

# Print the "no guess" best fit parameters
print(f"a = {popt_no_guess[0]:.3};  omega = {popt_no_guess[1]:.3}")

# Generate our graphs
plt.scatter(t_data, y_data, s=10, label='Data')
plt.plot(t_data, sin_model(t_data, *popt_no_guess), 'r-', label='Fit (no initial guess)')
plt.xlabel('t values')
plt.ylabel('y values')
plt.grid(True)
plt.legend()
plt.show()

**Why did this fail so spectacularly???**

The final aspect of using `curve_fit` that we have not yet discussed is that every fit needs to start from initial "guesses" of the fitting parameters. `curve_fit` then adjusts these parameters step by step while monitoring the sum of the squared residuals to determine when it has found a minimum. Importantly, if we don't give it an initial guess, it uses a value of `1` for each parameter.
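
As a quick check of this default behaviour, the sketch below fits made-up, noise-free data twice: once with no initial guess and once with the guess `p0=[1.0, 1.0]` written out explicitly. If the default really is a value of 1 for each parameter, the two calls should return the same parameters. The data arrays `t_demo` and `y_demo` are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical noise-free data: one full period of y = 2 sin(0.5 t)
t_demo = np.linspace(0, 4 * np.pi, 50)
y_demo = 2.0 * np.sin(0.5 * t_demo)

def sin_model(x, a, omega):
    return a * np.sin(omega * x)

# No initial guess given
popt_default, _ = curve_fit(sin_model, t_demo, y_demo)

# Initial guess of 1 for every parameter, written out explicitly
popt_ones, _ = curve_fit(sin_model, t_demo, y_demo, p0=[1.0, 1.0])

print("no p0:       ", popt_default)
print("p0 = [1, 1]: ", popt_ones)
print("identical results:", np.allclose(popt_default, popt_ones))
```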

Let's include what the model `sin_model` looks like with initial guesses of $a=1$ and $\omega=1$, labelled "sin_model with initial guesses = 1" in the plot below.

In [None]:
plt.scatter(t_data, y_data, s=10, label='Data')
plt.plot(t_data, sin_model(t_data, 1, 1), 'b-', label='sin_model with initial guesses = 1')
plt.plot(t_data, sin_model(t_data, *popt_no_guess), 'r-', label='Best fit with initial guesses = 1')
plt.xlabel('t values')
plt.ylabel('y values')
plt.grid(True)
plt.legend()
plt.show()

The initial guess of the angular frequency of 1 was approximately twice as large as what we estimated from the graph. As a result `curve_fit` struggled to get out of its local minimum by adjusting omega and ultimately ended up reducing the amplitude dramatically to reduce the very large penalty being applied due to the largest residuals. 

Finally, we provide our estimates from the start of this section ($a\approx 2$ and $\omega \approx 0.5$) as an optional argument to `curve_fit` using `p0=[2, 0.5]`.

In [None]:
# Fit with an initial guess
popt_guess, pcov_guess = curve_fit(sin_model, t_data, y_data, p0=[2, 0.5])

# Plot
plt.scatter(t_data, y_data, s=10, label='Data')
plt.plot(t_data, sin_model(t_data, *popt_no_guess), 'r-', label='Fit (no initial guess)')
plt.plot(t_data, sin_model(t_data, *popt_guess), 'g-', label='Fit (with initial guess)')
plt.xlabel('t values')
plt.ylabel('y values')
plt.grid(True)
plt.legend()
plt.show()

Hopefully you can see the importance of initial guesses when fitting to nonlinear functions. 
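
When you are unsure whether a fit has landed in a poor local minimum, one simple check (a sketch, not the only approach) is to compare the sum of the squared residuals for fits started from different initial guesses and keep the smallest. The example below uses made-up noisy data generated with a fixed random seed; the names `t_demo`, `y_demo` and `sum_sq_residuals` are our own.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical noisy data: y = 2 sin(0.5 t) plus Gaussian noise (seeded for repeatability)
rng = np.random.default_rng(0)
t_demo = np.linspace(0, 4 * np.pi, 50)
y_demo = 2.0 * np.sin(0.5 * t_demo) + rng.normal(0, 0.2, t_demo.size)

def sin_model(x, a, omega):
    return a * np.sin(omega * x)

def sum_sq_residuals(params):
    return np.sum((y_demo - sin_model(t_demo, *params))**2)

# Fit once with the default starting values and once with our estimates
popt_default, _ = curve_fit(sin_model, t_demo, y_demo)
popt_guess, _ = curve_fit(sin_model, t_demo, y_demo, p0=[2, 0.5])

ssr_default = sum_sq_residuals(popt_default)
ssr_guess = sum_sq_residuals(popt_guess)

print(f"default start:  a = {popt_default[0]:.3f}, omega = {popt_default[1]:.3f}, SSR = {ssr_default:.2f}")
print(f"p0 = [2, 0.5]:  a = {popt_guess[0]:.3f}, omega = {popt_guess[1]:.3f}, SSR = {ssr_guess:.2f}")
```

The fit with the smaller sum of squared residuals is the better description of the data, although even the best of several starts is not guaranteed to be the absolute minimum.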

## *Submitting this reading assignment*
Before submitting your work, please ensure you have worked carefully through all the cells. Afterward, choose: File >> Save_and_Export_Notebook_As >> HTML. This will download an HTML version of your notebook to your computer, which you can upload to Canvas.