# Lab 06 Prelab - A review of residuals and fitting

In [None]:
%reset -f
import data_entry2
import numpy as np
import matplotlib.pyplot as plt

This prelab focuses on revisiting and reinforcing concepts from Lab 05. We start with a review of residual plots.

## Part A - How to read residuals

**Your turn #1a:** Given experimental data and a model for that data, what is a residual?

**Your turn #1b:** What are properties of a residual plot that inform us that the model is a good fit to the data?

##### **Answer for #1a:**

A residual is defined as the difference between experimental data and a model, i.e., $R_i = y_i - \text{model}(x_i)$ or $R_i = y_i - f(x_i)$.
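
As a minimal sketch of this definition, the cell below computes residuals for a few made-up data points and a hypothetical straight-line model. The arrays `x_data` and `y_data`, the function `model`, and the values of `slope` and `intercept` are illustrative assumptions, not values from the lab data.

```python
import numpy as np

# Made-up example data (not from the lab spreadsheets)
x_data = np.array([1.0, 2.0, 3.0, 4.0])
y_data = np.array([2.1, 3.9, 6.2, 7.8])

# Hypothetical straight-line model f(x) = slope * x + intercept
slope = 2.0
intercept = 0.0

def model(x):
    """Return the model prediction f(x) for each x value."""
    return slope * x + intercept

# R_i = y_i - f(x_i), evaluated for every data point at once
residuals = y_data - model(x_data)
print(residuals)
```

Because the data and model values are NumPy arrays, the subtraction is done point by point, which is the same pattern the plotting cells in this notebook use.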

##### **Answer for #1b:**

A model that is a good fit to the data will have a residual plot with:
- No obvious trend
- A roughly equal scatter of points above and below the x-axis (the $R = 0$ line)
- And if the uncertainties are well characterized, we will also see
    + Roughly 68% of error bars crossing the x-axis
    + Nearly all (~95%) of doubled error bars crossing the x-axis
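
The error-bar criteria above can also be checked numerically. The sketch below is one possible way to do so, assuming made-up residuals and uncertainties; the helper name `fraction_crossing_zero` and its `scale` parameter are hypothetical and are not part of `data_entry2` or the lab code.

```python
import numpy as np

def fraction_crossing_zero(residuals, uncertainties, scale=1.0):
    """Fraction of residuals whose (scaled) error bar reaches the R = 0 line."""
    residuals = np.asarray(residuals, dtype=float)
    uncertainties = np.asarray(uncertainties, dtype=float)
    # An error bar crosses zero when |R_i| <= scale * u_i
    crosses = np.abs(residuals) <= scale * uncertainties
    return np.mean(crosses)

# Made-up residuals and uncertainties, for illustration only
example_residuals = np.array([0.05, -0.12, 0.02, 0.25, -0.08])
example_uncertainties = np.array([0.10, 0.10, 0.10, 0.10, 0.10])

# Single error bars, then doubled error bars
print(fraction_crossing_zero(example_residuals, example_uncertainties))
print(fraction_crossing_zero(example_residuals, example_uncertainties, scale=2))
```

With only a handful of points the fractions will rarely land exactly on 68% or 95%, so treat them as a rough guide rather than a pass/fail test.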

### Part A1: Diagnosing models with residuals

Now that we have reminded ourselves of how to use residuals to diagnose the goodness of fit of a model to a given set of experimental data, let's use these criteria to diagnose some fits. Below, we have three data sets to which we have tried to fit a straight-line model with intercept, $y_{\text{model}} = mx + b$. By running the code cells below, we can plot the three data sets with the model, as well as the corresponding residual plot.

For each of the three datasets, point out any features of the residual plot that may indicate a problem with the fit of the model to the data. Explain what should be changed about the model. Finally, go into the code cell and modify the model according to the diagnosis, until the residual plot indicates that the model is a good fit to the data, as per the criteria discussed in Your turn #1.

In [None]:
# Run me to load our three data sets
# Make sure to hit the "Generate Vectors" button!

de1 = data_entry2.sheet("lab06_prelab_data1.csv")

### Your turn #2 (Dataset 1)

View and interact with the plots below to answer the following questions.

**Your turn #2a:** Which feature(s) of the residuals plot below indicates a problem with the fit of the model to the data?

**Your turn #2b:** What should be changed about the model?

**Your turn #2c:** Change the model in the code cell (in the area specified) until the residuals indicate the model is a good fit. What are your new slope and/or y-intercept?

##### **Answer #2a**

We see a trend in the residuals where all of the residual points lie below the x-axis (the y = 0 line).

##### **Answer #2b**

The y-intercept of the model is too large, so we should decrease it.

##### **Answer #2c:**

Changing the intercept from 1.1 to 1 in the code improves the model in the desired fashion.

##### **Cell to generate plots for data set 1**

In [None]:
# Run me to make an initial set of plots, to be modified to answer #2c above.

# Data/Model Plot
# Step 1: find the limits of the data:
xmin = np.min(xVec) # use the np.min function to find the smallest x value
xmax = np.max(xVec) # same for max
#print (xmin, xmax)  # uncomment to see what the limits are

# Step 2: generate a bunch of x points between xmin and xmax
xpoints = np.linspace(xmin, xmax, 200) # gives 200 evenly spaced points between xmin and xmax
#print(xpoints) # uncomment to see the x values that were generated.

# Step 3: calculate the model values:
#################################### MODIFY THIS PART ###########################################
slope = 1 # Estimate of the slope.
intercept = 1.1  # Estimate of the intercept
ypoints = xpoints * slope + intercept # this calculates the yvalues at all 200 points.
#################################################################################################

# Step 4: plot the curve. We plot this as a red line "r-" :
plt.plot(xpoints, ypoints, "r-", label = "y = mx + b")

# Plot the data:
plt.errorbar(xVec, y1Vec, u_y1Vec, fmt="bo", markersize=3, label="Experimental data")
plt.title("y vs. x for first dataset")
plt.xlabel("x (s)")
plt.ylabel("y (m)")
plt.legend()
plt.show()

# Residuals Plot
# Step 1: Calculate the model at each x-datapoint
ymodel = xVec * slope + intercept

# Step 2: Calculate the residual vector
ResVec = y1Vec - ymodel

# Step 3: Plot the residual vector against the x-data vector
plt.errorbar(xVec, ResVec, u_y1Vec, fmt="bo", markersize = 3)

# Step 4: Add a R = 0 x-axis (horizontal line) to the plot
plt.hlines(y=0, xmin=xmin, xmax=xmax, color='k') # draw axis at y = 0.

# Add axis labels and title, and show the graph
plt.title("Residuals for y vs. x for first dataset")
plt.xlabel("x (s)")
plt.ylabel("Residual = data - model (m)")
plt.show()

### Your turn #3 (Dataset 2)

View and interact with the plots below to answer the following questions.

**Your turn #3a:** Which feature(s) of the residuals plot below indicates a problem with the fit of the model to the data?

**Your turn #3b:** What should be changed about the model?

**Your turn #3c:** Change the model in the code cell (in the area specified) until the residuals indicate the model is a good fit. What are your new slope and/or y-intercept?

##### **Answer #3a**

We see an (upwards) linear trend in the residuals plot.

##### **Answer #3b**

The slope of the model is too small, so we should increase it.

##### **Answer #3c**

Changing the slope from 1.9 to 2 in the code improves the model in the desired fashion.

##### **Cell to generate plots for data set 2**

In [None]:
# Run me to make an initial set of plots, to be modified to answer #3c above.

# Data/Model Plot
# Step 1: find the limits of the data:
xmin = np.min(xVec) # use the np.min function to find the smallest x value
xmax = np.max(xVec) # same for max
#print (xmin, xmax)  # uncomment to see what the limits are

# Step 2: generate a bunch of x points between xmin and xmax
xpoints = np.linspace(xmin, xmax, 200) # gives 200 evenly spaced points between xmin and xmax
#print(xpoints) # uncomment to see the x values that were generated.

# Step 3: calculate the model values:
#################################### MODIFY THIS PART ###########################################
slope = 1.9 # Estimate of the slope.
intercept = 3
ypoints = xpoints * slope + intercept # this calculates the yvalues at all 200 points.
#################################################################################################

# Step 4: plot the curve. We plot this as a red line "r-" :
plt.plot(xpoints, ypoints, "r-", label = "y = mx + b")

# Plot the data:
plt.errorbar(xVec, y2Vec, u_y2Vec, fmt="bo", markersize=3, label="Experimental data")
plt.title("y vs. x for second dataset")
plt.xlabel("x (s)")
plt.ylabel("y (m)")
plt.legend()
plt.show()

# Residuals Plot
# Step 1: Calculate the model at each x-datapoint
ymodel = xVec * slope + intercept

# Step 2: Calculate the residual vector
ResVec = y2Vec - ymodel

# Step 3: Plot the residual vector against the x-data vector
plt.errorbar(xVec, ResVec, u_y2Vec, fmt="bo", markersize = 3)

# Step 4: Add a R = 0 x-axis (horizontal line) to the plot
plt.hlines(y=0, xmin=xmin, xmax=xmax, color='k') # draw axis at y = 0.

# Add axis labels and title, and show the graph
plt.title("Residuals for y vs. x for second dataset")
plt.xlabel("x (s)")
plt.ylabel("Residual = data - model (m)")
plt.show()

### Your turn #4 (Dataset 3)

_Note:_ For this data set, the issue with the model is a bit harder to fix compared to the other two examples; for this part it is alright to skip the step of fixing the model in the code, and just comment on what appears to be wrong and how one could feasibly improve the fit of the model to the data (though we welcome the initiative to try to fix the model in the code, as well!)

**Your turn #4a:** Which feature(s) of the residuals plot below indicates a problem with the fit of the model to the data?

**Your turn #4b:** What should be changed about the model?

**Your turn #4c (optional):** Change the model in the code cell (in the area specified) until the residuals indicate the model is a good fit. Note that for this question the model is harder to fix in the code compared to the previous two questions; it is therefore a bonus for those who really want to put their Python model-fixing skills to the test. How did you need to adjust your model to make a good fit?

##### **Answer #4a**

We notice from the residuals plot that there is an upward parabolic trend.

##### **Answer #4b**

This parabolic trend tells us that the current linear model is not a good fit for the data; we should add a (positive) quadratic term.

##### **Answer #4c**

To improve it in the desired fashion, we add $0.01x^2$ to the model, which can be done by modifying the code to:
- `ypoints = 0.01 * xpoints**2 + xpoints * slope + intercept`
- `ymodel = 0.01 * xVec**2 + xVec * slope + intercept`
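
To see why a parabolic residual trend points to a missing quadratic term, the sketch below uses made-up, noise-free data generated from a known quadratic. The coefficient `0.01` mirrors the answer above, but `x_demo`, `y_demo`, and the residual variable names are hypothetical and this is not the `y3Vec` data.

```python
import numpy as np

# Made-up, noise-free data that truly follow y = 0.01 x^2 + 1 x + 3
x_demo = np.linspace(0, 20, 11)
y_demo = 0.01 * x_demo**2 + 1.0 * x_demo + 3.0

slope = 1.0
intercept = 3.0

# Residuals for the straight-line model: the quadratic part is left over
linear_residuals = y_demo - (slope * x_demo + intercept)

# Residuals after adding the quadratic term to the model
quadratic_residuals = y_demo - (0.01 * x_demo**2 + slope * x_demo + intercept)

print(linear_residuals)     # follows 0.01 * x^2, i.e. a parabola
print(quadratic_residuals)  # all approximately zero
```

Whatever the model leaves out shows up in the residuals, so the shape of the residual trend hints at the form of the missing term.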

##### **Cell to generate plots for data set 3**

In [None]:
# Run me to make an initial set of plots, to be modified to answer #4c above.

# Data/Model Plot
# Step 1: find the limits of the data:
xmin = np.min(xVec) # use the np.min function to find the smallest x value
xmax = np.max(xVec) # same for max
#print (xmin, xmax)  # uncomment to see what the limits are

# Step 2: generate a bunch of x points between xmin and xmax
xpoints = np.linspace(xmin, xmax, 200) # gives 200 evenly spaced points between xmin and xmax
#print(xpoints) # uncomment to see the x values that were generated.

# Step 3: calculate the model values:
# MODIFY THIS PART ##############################################################################
slope = 1 # Estimate of the slope.
intercept = 3
ypoints = xpoints * slope + intercept # this calculates the yvalues at all 200 points.
#################################################################################################

# Step 4: plot the curve. We plot this as a red line "r-" :
plt.plot(xpoints, ypoints, "r-", label = "y = mx + b")

# Plot the data:
plt.errorbar(xVec, y3Vec, u_y3Vec, fmt="bo", markersize=3, label="Experimental data")
plt.title("y vs. x for third dataset")
plt.xlabel("x (s)")
plt.ylabel("y (m)")
plt.legend()
plt.show()

# Residuals Plot
# Step 1: Calculate the model at each x-datapoint
# ALSO MODIFY THIS PART, DEPENDING ON YOUR MODEL MODIFICATION ABOVE #############################
ymodel = xVec * slope + intercept
#################################################################################################

# Step 2: Calculate the residual vector
ResVec = y3Vec - ymodel

# Step 3: Plot the residual vector against the x-data vector
plt.errorbar(xVec, ResVec, u_y3Vec, fmt="bo", markersize = 3)

# Step 4: Add a R = 0 x-axis (horizontal line) to the plot
plt.hlines(y=0, xmin=xmin, xmax=xmax, color='k') # draw axis at y = 0.

# Add axis labels and title, and show the graph
plt.title("Residuals for y vs. x for third dataset")
plt.xlabel("x (s)")
plt.ylabel("Residual = data - model (m)")
plt.show()

## Part B - Using residuals to obtain best fit parameters and their uncertainties

In the Notebook tutorial in lab 5, we learned how to use the residual plot to find the best fit parameters in a linear model and evaluate their uncertainties. We review this procedure here, looking at an experimental setting which gives similar linear data.

### A worked example

An experiment was performed using an electric toy car and an ultrasonic position sensor. The toy car was set into motion (travelling at constant velocity) and the position sensor recorded its position, $d$, every 0.5 seconds. From these data we want to determine the speed and initial position of the toy car. From kinematics, we believe that the toy car's position should follow:

$$d = vt + d_0$$

where $v$ is the speed of the car and $d_0$ is the position at $t=0$. We will therefore model this as a linear model with a non-zero y-intercept.

To begin, we plot the data and make initial estimates of the slope and y-intercept by looking at the graph or the data table. If we perform a quick rise/run slope calculation using the first and last data points, we get:

$$\text{initial slope estimate} = m = \frac{1.218\text{m}-0.323\text{m}}{3.0\text{s}-0.5\text{s}} = 0.36\text{ m/s}.$$
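
The rise/run arithmetic above can be checked with a few lines of Python. The variable names are illustrative assumptions; the numbers are the first and last data points quoted in the equation.

```python
# First and last data points quoted in the text: (time in s, position in m)
t_first, d_first = 0.5, 0.323
t_last, d_last = 3.0, 1.218

# Rise over run between the two end points
slope_estimate = (d_last - d_first) / (t_last - t_first)
print(f"initial slope estimate = {slope_estimate:.2f} m/s")
```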

Similarly, to get an initial rough estimate of the y-intercept, we could observe that the $d$ value at $t=1\text{s}$ should be approximately the y-intercept plus the slope times 1 second:

$$\text{initial y-intercept estimate} = b = d_{t=1s} - \text{slope}\times 1\text{s}$$
$$\text{initial y-intercept estimate} = b = 0.537\text{m} - 0.36\text{m} \approx 0.15\text{m}$$

So we build a scatter plot, with an initial linear model using the estimated parameters above, and then build a residuals plot as well. 

It is also worth noticing that the uncertainty in the position, $u_d$, increases as the distance between the car and position sensor increases.

In [None]:
# Run me to load our second data set
# Make sure to hit the "Generate Vectors" button!

de2 = data_entry2.sheet("lab06_prelab_data2.csv")

In [None]:
# Run me to make an initial set of plots using m = 0.36 and b = 0.15 for initial model estimates

# Data/Model Plot
# Step 1: find the limits of the data:
xmin = np.min(tVec) # use the np.min function to find the smallest x value
xmax = np.max(tVec) # same for max
#print (xmin, xmax)  # uncomment to see what the limits are

# Step 2: generate a bunch of x points between xmin and xmax
xpoints = np.linspace(xmin, xmax, 200) # gives 200 evenly spaced points between xmin and xmax
#print(xpoints) # uncomment to see the x values that were generated.

# Step 3: calculate the model values:
slope = 0.36  # Estimate of the slope.
intercept = 0.15  # Estimate of the intercept
ypoints = xpoints * slope + intercept # this calculates the yvalues at all 200 points.

# Step 4: plot the curve. We plot this as a red line "r-" :
plt.plot(xpoints, ypoints, "r-", label = "d = vt + d0")

# Plot the data:
plt.errorbar(tVec, dVec, u_dVec, fmt="bo", markersize=3, label="Experimental data")
plt.title("Position vs. Time for toy car")
plt.xlabel("Time (s)")
plt.ylabel("Position (m)")
plt.legend()
plt.show()

# Residuals Plot
# Step 1: Calculate the model at each x-datapoint
ymodel = tVec * slope + intercept

# Step 2: Calculate the residual vector
ResVec = dVec - ymodel

# Step 3: Plot the residual vector against the x-data vector
plt.errorbar(tVec, ResVec, u_dVec, fmt="bo", markersize = 3)

# Step 4: Add a R = 0 x-axis (horizontal line) to the plot
plt.hlines(y=0, xmin=xmin, xmax=xmax, color='k') # draw axis at y = 0.

# Add axis labels and title, and show the graph
plt.title("Residuals for Position vs. Time for toy car")
plt.xlabel("Time (s)")
plt.ylabel("Residual = data - model (m)")
plt.show()

Looking at the scatter plot above, it looks like our data are roughly linear and that our initial guesses of $m$ and $b$ produce a somewhat reasonable model. However, we will make use of the residuals plot and our criteria of good model fits to improve our current fit:

*Criteria: The scatter of the residuals for a good fit will follow a Gaussian distribution with a mean of 0, which means we are looking for a roughly equal number of residuals above the y=0 line as below it. Ideally we should also see no obvious trend/shape in the residuals plot.*

Let’s use these criteria to make some small adjustments to the slope and intercept of our model. First, notice that the residuals plot shows a small trend where they are slowly decreasing from left to right. Decreasing our slope slightly could fix this, since a smaller slope will decrease the model values more at larger $t$ values, reducing the difference between data and model (the residual) more on the right-hand side of the graph than the left-hand side.
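
The link between a slope that is too large and residuals that fall from left to right can be illustrated numerically. The sketch below uses made-up, noise-free data whose true slope and intercept are chosen to mirror the final values of this worked example; the names `t_demo`, `d_demo`, and `residual_trend` are hypothetical. Fitting a straight line to the residuals with `np.polyfit` is a convenience added here and is not part of the by-eye procedure used in this lab.

```python
import numpy as np

# Made-up, noise-free data with true slope 0.34 and intercept 0.17 (illustration only)
t_demo = np.linspace(0.5, 3.0, 6)
d_demo = 0.34 * t_demo + 0.17

def residual_trend(t, d, slope, intercept):
    """Return the slope of a straight line fitted to the residuals (data - model)."""
    residuals = d - (slope * t + intercept)
    # np.polyfit with degree 1 returns [slope, intercept] of the fitted line
    return np.polyfit(t, residuals, 1)[0]

# A model slope that is too large leaves residuals that decrease with t
print(residual_trend(t_demo, d_demo, slope=0.36, intercept=0.15))

# Matching the slope removes the trend, even though the intercept is still off
print(residual_trend(t_demo, d_demo, slope=0.34, intercept=0.15))
```

A negative trend suggests lowering the model slope and a positive trend suggests raising it. A trend near zero says nothing about the intercept, which still has to be judged from how the residuals sit above or below zero.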

Changing the slope to 0.34 below, we see that the trend of decreasing from left to right goes away, but our model line is now a bit too low, which can also be seen from most of our residuals (data - model) being positive.

In [None]:
# Run me to make an updated set of plots using m = 0.34 instead of m = 0.36

# Data/Model Plot
# Step 1: find the limits of the data:
xmin = np.min(tVec) # use the np.min function to find the smallest x value
xmax = np.max(tVec) # same for max
#print (xmin, xmax)  # uncomment to see what the limits are

# Step 2: generate a bunch of x points between xmin and xmax
xpoints = np.linspace(xmin, xmax, 200) # gives 200 evenly spaced points between xmin and xmax
#print(xpoints) # uncomment to see the x values that were generated.

# Step 3: calculate the model values:
slope = 0.34  # Estimate of the slope; this has been decreased from the previous time.
intercept = 0.15  # Estimate of the intercept
ypoints = xpoints * slope + intercept # this calculates the yvalues at all 200 points.

# Step 4: plot the curve. We plot this as a red line "r-" :
plt.plot(xpoints, ypoints, "r-", label = "d = vt + d0")

# Plot the data:
plt.errorbar(tVec, dVec, u_dVec, fmt="bo", markersize=3, label="Experimental data")
plt.title("Position vs. Time for toy car")
plt.xlabel("Time (s)")
plt.ylabel("Position (m)")
plt.legend()
plt.show()

# Residuals Plot
# Step 1: Calculate the model at each x-datapoint
ymodel = tVec * slope + intercept

# Step 2: Calculate the residual vector
ResVec = dVec - ymodel

# Step 3: Plot the residual vector against the x-data vector
plt.errorbar(tVec, ResVec, u_dVec, fmt="bo", markersize = 3)

# Step 4: Add a R = 0 x-axis (horizontal line) to the plot
plt.hlines(y=0, xmin=xmin, xmax=xmax, color='k') # draw axis at y = 0.

# Add axis labels and title, and show the graph
plt.title("Residuals for Position vs. Time for toy car")
plt.xlabel("Time (s)")
plt.ylabel("Residual = data - model (m)")
plt.show()

Finally, if we adjust the $y$-intercept ($b$), we eventually find that $b$ = 0.17 gets us to a place where the residuals look randomly scattered about 0 and have no obvious trend/shape. If we have a good fit of our model to our data, and our uncertainties have been well estimated, we expect the error bars of ~68% of our residuals to cross zero, which is the same as the error bars of ~68% of our data points crossing the model line.

In [None]:
# Run me to make an updated set of plots using m = 0.34 and b = 0.17

# Data/Model Plot
# Step 1: find the limits of the data:
xmin = np.min(tVec) # use the np.min function to find the smallest x value
xmax = np.max(tVec) # same for max
#print (xmin, xmax)  # uncomment to see what the limits are

# Step 2: generate a bunch of x points between xmin and xmax
xpoints = np.linspace(xmin, xmax, 200) # gives 200 evenly spaced points between xmin and xmax
#print(xpoints) # uncomment to see the x values that were generated.

# Step 3: calculate the model values:
slope = 0.34  # Estimate of the slope
intercept = 0.17  # Estimate of the intercept; this has been increased from 0.15
ypoints = xpoints * slope + intercept # this calculates the yvalues at all 200 points.

# Step 4: plot the curve. We plot this as a red line "r-" :
plt.plot(xpoints, ypoints, "r-", label = "d = vt + d0")

# Plot the data:
plt.errorbar(tVec, dVec, u_dVec, fmt="bo", markersize=3, label="Experimental data")
plt.title("Position vs. Time for toy car")
plt.xlabel("Time (s)")
plt.ylabel("Position (m)")
plt.legend()
plt.show()

# Residuals Plot
# Step 1: Calculate the model at each x-datapoint
ymodel = tVec * slope + intercept

# Step 2: Calculate the residual vector
ResVec = dVec - ymodel

# Step 3: Plot the residual vector against the x-data vector
plt.errorbar(tVec, ResVec, u_dVec, fmt="bo", markersize = 3)

# Step 4: Add a R = 0 x-axis (horizontal line) to the plot
plt.hlines(y=0, xmin=xmin, xmax=xmax, color='k') # draw axis at y = 0.

# Add axis labels and title, and show the graph
plt.title("Residuals for Position vs. Time for toy car")
plt.xlabel("Time (s)")
plt.ylabel("Residual = data - model (m)")
plt.show()

Let’s interpret these results again. The slope of this graph represents the speed of the car for a model that assumes a constant speed (that’s why we are trying to use a straight line to fit this graph) and that speed is 0.34 m/s. The y-intercept represents the position of the car at $t=0$, which is 0.17 m.

It is worth noting in all of this that we probably could have used the original scatter plot to improve the fit in a similar way, but this ability of the residuals plot to zoom in on the difference between data and model values allows us to make these sorts of improvements even when the uncertainties in the original scatter plot become invisibly small. 

But hold on a moment, these fit parameters from the graphs are experimental results and should have uncertainties! Let’s look at how to estimate these.

### Estimating uncertainties in fitting parameters (fitting 2 parameters)

To estimate uncertainties in fitting parameters, we will look at the range of reasonable best-fit lines and treat that range as a confidence interval. We will find the steepest line, "Max" (as in maximum slope), that still does a reasonable job of fitting the data, and the least steep line, "Min" (as in minimum slope), that also does a reasonable job of fitting the data. We will call our original model line "Best". Use the "Best" parameters as the starting guesses for the "Max" and "Min" fit parameters. While searching for these limits, we adjust the slope and the intercept together: because the line must still pass through the data, the maximum reasonable slope is paired with the minimum reasonable intercept, and the minimum reasonable slope with the maximum reasonable intercept.

**Your turn #5:** Try adjusting the slope and intercept parameters in the plot below to find the models that correspond to the maximum and minimum slopes that still correspond to reasonable fits. Recall that you will also need to adjust the y-intercepts in this process.

_Note that similar to estimating confidence intervals with your individual measurements, this process involves some judgement and different people will come up with slightly different, but still reasonable, values; so do not worry if you got slightly different numbers!_

Your maximum slope answers:
* m(max) = 
* b(min) = 

Your minimum slope answers:
* m(min) =
* b(max) =

##### **Answer #5 Max:**

We find that $m_{\text{max}} = 0.37$ and $b_{\text{min}} = 0.13$ correspond to the maximum slope model still having a reasonable fit.

##### **Answer #5 Min:**

We find that $m_{\text{min}} = 0.32$ and $b_{\text{max}} = 0.19$ correspond to the minimum slope model still having a reasonable fit.

##### Plots to adjust to investigate maximum and minimum slopes

In [None]:
# Data/Model Plot
# Step 1: find the limits of the data:
xmin = np.min(tVec) # use the np.min function to find the smallest x value
xmax = np.max(tVec) # same for max
#print (xmin, xmax)  # uncomment to see what the limits are

# Step 2: generate a bunch of x points between xmin and xmax
xpoints = np.linspace(xmin, xmax, 200) # gives 200 evenly spaced points between xmin and xmax
#print(xpoints) # uncomment to see the x values that were generated.

# Step 3: calculate the model values:
slope = 0.34  # Estimate of the slope
intercept = 0.17  # Estimate of the intercept
ypoints = xpoints * slope + intercept # this calculates the yvalues at all 200 points.

# Step 4: plot the curve. We plot this as a red line "r-" :
plt.plot(xpoints, ypoints, "r-", label = "d = vt + d0")

# Plot the data:
plt.errorbar(tVec, dVec, u_dVec, fmt="bo", markersize=3, label="Experimental data")
plt.title("Position vs. Time for toy car")
plt.xlabel("Time (s)")
plt.ylabel("Position (m)")
plt.legend()
plt.show()

# Residuals Plot
# Step 1: Calculate the model at each x-datapoint
ymodel = tVec * slope + intercept

# Step 2: Calculate the residual vector
ResVec = dVec - ymodel

# Step 3: Plot the residual vector against the x-data vector
plt.errorbar(tVec, ResVec, u_dVec, fmt="bo", markersize = 3)

# Step 4: Add a R = 0 x-axis (horizontal line) to the plot
plt.hlines(y=0, xmin=xmin, xmax=xmax, color='k') # draw axis at y = 0.

# Add axis labels and title, and show the graph
plt.title("Residuals for Position vs. Time for toy car")
plt.xlabel("Time (s)")
plt.ylabel("Residual = data - model (m)")
plt.show()

For the sake of comparison, we plot all three different models below (best/max/min):

In [None]:
# Run me to see all three models (best/max/min) on the same scatter plot 

# Step 1: find the limits of the data:
xmin = np.min(tVec) # use the np.min function to find the smallest x value
xmax = np.max(tVec) # same for max
#print (xmin, xmax)  # uncomment to see what the limits are

# Step 2: generate a bunch of x points between xmin and xmax
xpoints = np.linspace(xmin, xmax, 200) # gives 200 evenly spaced points between xmin and xmax
#print(xpoints) # uncomment to see the x values that were generated.

# Step 3: calculate the model values:
best_slope = 0.34  # Estimate of the slope
best_intercept = 0.17  # Estimate of the intercept
best_ypoints = xpoints * best_slope + best_intercept # this calculates the yvalues at all 200 points.

max_slope = 0.37
min_intercept = 0.13
max_ypoints = xpoints * max_slope + min_intercept

min_slope = 0.32
max_intercept = 0.19
min_ypoints = xpoints * min_slope + max_intercept

# Uncertainties
um = (max_slope-min_slope)/2
ub = (max_intercept-min_intercept)/2
print("u[m] =",um,"m/s")
print("u[b] =",ub,"m")

# Step 4: plot the three curves (best in red "r-", max in green "g-", min in yellow "y-"):
plt.plot(xpoints, best_ypoints, "r-", label = "d = vt + d0 best")
plt.plot(xpoints, max_ypoints, "g-", label = "d = vt + d0 max")
plt.plot(xpoints, min_ypoints, "y-", label = "d = vt + d0 min")

# Plot the data:
plt.errorbar(tVec, dVec, u_dVec, fmt="bo", markersize=3, label="Experimental data")
plt.title("Position vs. Time for toy car")
plt.xlabel("Time (s)")
plt.ylabel("Position (m)")
plt.legend()
plt.show()

This method of determining uncertainties in the fit parameters is quite crude and is meant to give only a very rough approximation of the fit-parameter uncertainties. We will take the max and min fit parameter values as the ends of a 68% confidence interval, so the standard uncertainty is half of the width of that interval. For the slope we have:

$$u_m = (m_{\text{max}} - m_{\text{min}})/2 = (0.37-0.32)/2 = 0.025\text{ m/s}$$

and for the y-intercept:

$$u_b = (b_{\text{max}} - b_{\text{min}})/2 = (0.19 - 0.13)/2 = 0.030\text{ m}$$

We report our final results from this fit as the car having a speed ($m = v$) of:

$$v \pm u_v = 0.340 \pm 0.025\text{ m/s}$$

and an initial position ($b = d_0$) of:

$$d_0 \pm u_{d_0} = 0.170 \pm 0.030 \text{ m}$$
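
The half-range calculation can be wrapped in a small helper so the same arithmetic is reused for every parameter. This is only a sketch: the function name `half_range_uncertainty` is hypothetical, and the numbers are the max/min values from the worked example above.

```python
def half_range_uncertainty(param_max, param_min):
    """Standard uncertainty taken as half the range between the max and min values."""
    return (param_max - param_min) / 2

# Max/min values from the worked example
u_m = half_range_uncertainty(0.37, 0.32)
u_b = half_range_uncertainty(0.19, 0.13)

print(f"v   = 0.340 +/- {u_m:.3f} m/s")
print(f"d_0 = 0.170 +/- {u_b:.3f} m")
```

Formatting with `:.3f` keeps the uncertainty and the best-fit value at the same number of decimal places.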

**Your turn #6:** Below is the experimental data for the position vs. time of a different toy car. Following the procedure as we covered in the above worked example (and in the tutorial), use the residual plot to determine the best fit model parameters and their uncertainties; from this conclude the initial position and speed of the toy car.

To review:
1. Start by plotting a scatterplot of the data below, with a linear model (with y-intercept) and a residuals plot. We provide the code for this below, with poor starting guesses of $m = 0.7$ and $b = 0.2$.
2. Using the residuals plot, adjust the slope and the y-intercept values until you see no trend in the residuals. This will give you the best fit slope $m_{\text{best}}$ and y-intercept $b_{\text{best}}$. 
3. Increase the slope/decrease the y-intercept to the extent possible where the model still fits the data reasonably well (as evaluated by the residuals). From this obtain $m_{\text{max}}$ and $b_{\text{min}}$. Analogously, obtain $m_{\text{min}}$ and $b_{\text{max}}$ by pushing the model parameters in the opposite direction.
4. From $m_{\text{max}}/m_{\text{min}}$ and $b_{\text{max}}/b_{\text{min}}$, calculate the uncertainties in $m$ and $b$.
5. The speed of the car can be identified as the slope of the distance vs. time graph, and the initial position can be identified as the y-intercept of the graph. Given this, report the speed and initial position of the toy car with uncertainty.

_Note: It is not necessary to copy and paste the same block of plotting code multiple times (this was done in the worked example above so that it is easier to follow); it suffices to have one single block of plotting code that you can run multiple times while changing the parameters inside._
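
One possible way to keep a single block of plotting code is to wrap it in a function that takes the model parameters as arguments. The sketch below assumes the data are available as array-like objects; the function name `plot_fit_and_residuals` and its argument names are hypothetical and are not provided by `data_entry2`.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_fit_and_residuals(x, y, u_y, slope, intercept):
    """Plot data with a straight-line model, then the residuals; return the residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Data/model plot: draw the model on 200 evenly spaced points
    xpoints = np.linspace(np.min(x), np.max(x), 200)
    plt.plot(xpoints, xpoints * slope + intercept, "r-", label="model")
    plt.errorbar(x, y, u_y, fmt="bo", markersize=3, label="Experimental data")
    plt.legend()
    plt.show()

    # Residuals plot: data minus model at each measured x value
    residuals = y - (x * slope + intercept)
    plt.errorbar(x, residuals, u_y, fmt="bo", markersize=3)
    plt.hlines(y=0, xmin=np.min(x), xmax=np.max(x), color="k")
    plt.ylabel("Residual = data - model")
    plt.show()
    return residuals
```

Once the vectors have been generated, a call such as `plot_fit_and_residuals(tVec, dVec, u_dVec, slope=0.7, intercept=0.2)` could be re-run with different parameter values instead of copying the whole cell.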

In [None]:
# Run me to load our third data set
# Make sure to hit the "Generate Vectors" button!

de3 = data_entry2.sheet("lab06_prelab_data3.csv")

In [None]:


# Data/Model Plot
# Step 1: find the limits of the data:
xmin = np.min(tVec) # use the np.min function to find the smallest x value
xmax = np.max(tVec) # same for max
#print (xmin, xmax)  # uncomment to see what the limits are

# Step 2: generate a bunch of x points between xmin and xmax
xpoints = np.linspace(xmin, xmax, 200) # gives 200 evenly spaced points between xmin and xmax
#print(xpoints) # uncomment to see the x values that were generated.

# Step 3: calculate the model values:
#
# Use the comments below to keep track of your best, max and min fit parameters
#
# m(best) =
# b(best) =
#
# m(max) =
# b(min) =
#
# m(min) =
# b(max) = 
#
############## MODIFY THE MODEL PARAMETERS HERE #####################
slope = 0.7  # Estimate of the slope
intercept = 0.2  # Estimate of the intercept
#####################################################################
ypoints = xpoints * slope + intercept # this calculates the yvalues at all 200 points.

# Step 4: plot the curve. We plot this as a red line "r-" :
plt.plot(xpoints, ypoints, "r-", label = "d = vt + d0")

# Plot the data:
plt.errorbar(tVec, dVec, u_dVec, fmt="bo", markersize=3, label="Experimental data")
plt.title("Position vs. Time for toy car")
plt.xlabel("Time (s)")
plt.ylabel("Position (m)")
plt.legend()
plt.show()

# Residuals Plot
# Step 1: Calculate the model at each x-datapoint
ymodel = tVec * slope + intercept

# Step 2: Calculate the residual vector
ResVec = dVec - ymodel

# Step 3: Plot the residual vector against the x-data vector
plt.errorbar(tVec, ResVec, u_dVec, fmt="bo", markersize = 3)

# Step 4: Add a R = 0 x-axis (horizontal line) to the plot
plt.hlines(y=0, xmin=xmin, xmax=xmax, color='k') # draw axis at y = 0.

# Add axis labels and title, and show the graph
plt.title("Residuals for Position vs. Time for toy car")
plt.xlabel("Time (s)")
plt.ylabel("Residual = data - model (m)")
plt.show()

# Submit

Steps for submission:

1. Click: Run => Run_All_Cells
2. Read through the notebook to ensure all the cells executed correctly and without error.
3. File => Save_and_Export_Notebook_As->HTML
4. Inspect your downloaded html document
5. Upload the HTML document to the lab submission assignment on Canvas.

In [None]:
display_sheets()