If you're running this in Google Colab, you can click "Copy to Drive" (above ↑) or go to **File > Save a Copy in Drive** so you'll have your own version to work on. That requires a Google login.  
<hr/>

# Analyzing Measurement Uncertainty   
We'll use Dr. Natasha Holmes's *Statistics Summary* (linked on the course website) as a guide for how to analyze data in this course. This notebook shows how to use Python to do the calculations in section II, *Statistics for Repeated Measurements with Statistical Variation*, and section III, *Making Comparisons*.

If you need to start over from scratch, open a [clean copy of this activity](https://colab.research.google.com/github/adamlamee/UCF_labs/blob/master/making_comparisons.ipynb). If you need a refresher on how to execute this notebook, try the [intro activity](https://colab.research.google.com/github/adamlamee/UCF_labs/blob/master/intro.ipynb).    

## II. Statistics for Repeated Measurements with Statistical Variation
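
The cells in this section compute three quantities for a sample of $N$ measurements $x_1, \dots, x_N$: the mean, the sample standard deviation, and the standard uncertainty of the mean. As a rough summary of what the code does (the symbols $\bar{x}$, $s$, and $u$ are notation chosen here and may differ from the *Statistics Summary*):

$$
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2},
\qquad
u = \frac{s}{\sqrt{N}}
$$

The $N-1$ in the standard deviation is what `ddof=1` asks numpy for in `np.std`.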

In [None]:
import numpy as np                   # numpy does math
import matplotlib.pyplot as plt      # pyplot makes plots

In [None]:
# you can change the measurements and histogram properties
# keep the same format (e.g., brackets and commas) to avoid errors

sample_a = [1.2, 1.3, 1.1, 0.9, 1.4]               # some measurements from sample A
sample_b = [1.3, 1.1, 1.4, 1.5, 1.2]               # some measurements from sample B to compare
plt.hist((sample_a, sample_b), bins=5, range=[0.8,1.8], color=('pink','purple'));   # makes a histogram

In [None]:
a_mean = np.mean(sample_a)   # calculates the mean of sample a, saves it as a variable named "a_mean"
a_mean                       # displays the mean value you just calculated

In [None]:
# try adding code here to find the mean of sample b
# copy and paste are a programmer's best friends, but rename your variables


In [None]:
a_stdev = np.std(sample_a, ddof=1)      # finds the sample standard deviation (ddof=1 divides by N-1)
a_stdev

In [None]:
# try that again for sample b's standard deviation


In [None]:
a_count = len(sample_a)      # "a_count" is now a variable with the number of observations in sample a
a_count

In [None]:
# now count the observations in sample b


In [None]:
# finding standard uncertainty
a_stunc = a_stdev / np.sqrt(a_count)       # python can do algebra
a_stunc

In [None]:
# how about sample b's standard uncertainty?


## III. Comparing Means
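
The cell below compares the two means by dividing their difference by the combined standard uncertainty. Written out, assuming $\bar{x}_A$, $\bar{x}_B$ are the two sample means and $u_A$, $u_B$ their standard uncertainties (notation chosen here, not necessarily the *Statistics Summary*'s):

$$
t' = \frac{\left|\bar{x}_A - \bar{x}_B\right|}{\sqrt{u_A^{2} + u_B^{2}}}
$$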

In [None]:
# t' statistic
# for the code below to work, you'll need to have done the calculations for sample b above, too
t_prime = abs((a_mean - b_mean) / np.sqrt(a_stunc**2 + b_stunc**2))
t_prime

Nicely done. If you found a t' of about 1.1, congrats! If not, check your math or start over with a [clean copy of this activity](https://colab.research.google.com/github/adamlamee/UCF_labs/blob/master/making_comparisons.ipynb).  
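
If you'd like an independent check on that number, here's a minimal sketch that repeats the whole calculation with Python's built-in `statistics` and `math` modules instead of numpy. The function names (`standard_uncertainty`, `t_prime_check`) and the variables `check_a` and `check_b` are made up for this example; they aren't part of the cells above or the *Statistics Summary*.

```python
import math
import statistics


def standard_uncertainty(sample):
    """Sample standard deviation divided by the square root of the count."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def t_prime_check(first, second):
    """t' for two samples: difference in means over the combined standard uncertainty."""
    difference = statistics.mean(first) - statistics.mean(second)
    combined = math.sqrt(standard_uncertainty(first)**2 + standard_uncertainty(second)**2)
    return abs(difference / combined)


# copies of the original measurements, under new names so your own variables aren't overwritten
check_a = [1.2, 1.3, 1.1, 0.9, 1.4]
check_b = [1.3, 1.1, 1.4, 1.5, 1.2]

check = t_prime_check(check_a, check_b)
print(round(check, 2))   # prints 1.08
```

`statistics.stdev` divides by N - 1, the same as `np.std(..., ddof=1)`, so the two approaches should agree. If you changed the measurements earlier, paste your own lists into `check_a` and `check_b` before comparing.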


## IV. Plotting the Results  
Two options for plotting your data are given below.

A **scatterplot** is pretty standard when your independent variable has levels that are numeric, like distances or lengths. Want to customize this type of plot even more? See matplotlib's [scatter](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.scatter.html) and [errorbar](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.errorbar.html) pages.

A **barplot** is more appropriate when your independent variable has levels that aren't numeric, like "facing left" and "facing right". Want to customize this type of plot even more? See matplotlib's [barplot](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.bar.html) page.

In [None]:
# here's a scatterplot for numeric independent variable (IV) levels

# set up the values that get plotted
x_values = [3.5,10]
y_values = [a_mean, b_mean]
errorbars = [2*a_stdev, 3*b_mean]  # this is totally wrong; edit this to be twice the std unc for a and b

# this part makes the plot
fig, ax = plt.subplots()
ax.scatter(x_values, y_values)
ax.errorbar(x_values, y_values, yerr=errorbars, ecolor='black', capsize=10, fmt='o')

# edit these so your plot looks nice
ax.set_xlabel('label me?')
ax.set_ylabel('label me, too')
ax.set_title('title goes here')
ax.set_xlim(0,15)
ax.set_ylim(0,5)
ax.grid(False)
plt.show()

In [None]:
# here's a barplot for non-numeric ("categorical") independent variable (IV) levels

# set up the values that get plotted
bar_labels = ["long", "short"]     # you'll want to edit these labels
bar_heights = [a_mean, b_mean]     # these will be the heights of the bars, in order
errorbars = [2*a_stdev, 3*b_mean]  # this is totally wrong; edit this to be twice the std unc for a and b

# this part makes the plot
fig, ax = plt.subplots()
ax.bar(bar_labels, bar_heights, yerr=errorbars, align='center', alpha=0.5, color='green', ecolor='black', capsize=10)

# edit these so your plot looks nice
ax.set_xlabel('label me?')
ax.set_ylabel('label me, too')
ax.set_title('title goes here')
ax.grid(False)
plt.show()

<hr/>  

# Credits
This notebook was written by [Adam LaMee](http://www.adamlamee.com) with contributions by UCF graduate student Ifthakar Bin Elius. Thanks to the great folks at [Binder](https://mybinder.org/) and [Google Colaboratory](https://colab.research.google.com/notebooks/intro.ipynb) for making this notebook interactive without you needing to download it or install [Jupyter](https://jupyter.org/) on your own device.