
# Acceptability of a Measured Value

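An important question that we must answer as scientists is: how confident are we in our result? The question is hard to answer because any threshold for "confident enough" is somewhat arbitrary and depends on how a person defines confidence. What we can do is state how far a measured value lies from an expected value, in units of its uncertainty, and how probable a discrepancy of that size is.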



## Mean and Standard Deviation

After $N$ measurements of a normally distributed quantity $x$:
$$
x_1, x_2, \ldots, x_N
$$

the best estimate for the true value $X$ is the mean of our measurements:

$$
(\text{best estimate for } X) = \bar{x} = \frac{\sum x_i}{N}
$$

The best estimate for the width $\sigma$ is the standard deviation of the measurements:

$$
(\text{best estimate for } \sigma) = \sigma_x = \sqrt{\frac{\sum(x_i - \bar{x})^2}{N-1}}
$$

The uncertainty in $\bar{x}$ as an estimate of $X$ is:

$$
(\text{uncertainty in } \bar{x}) = \frac{\sigma_x}{\sqrt{N}}
$$

The fractional uncertainty in $\sigma_x$ as an estimate of the true width $\sigma$ is:

$$
(\text{fractional uncertainty in } \sigma_x) = \frac{1}{\sqrt{2(N-1)}}
$$
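
The four estimates above can be sketched in a few lines of NumPy. The array `x` below holds made-up illustrative values, not the course data. Note that `np.std` divides by $N$ by default, so `ddof=1` is needed to get the $N-1$ denominator used here.

```python
import numpy as np

# Illustrative values only (an assumption, not taken from any data file)
x = np.array([71.0, 72.0, 72.0, 73.0, 71.0])
N = x.size

mean = x.mean()                             # best estimate for X
sigma_x = x.std(ddof=1)                     # best estimate for sigma (N-1 denominator)
sdom = sigma_x / np.sqrt(N)                 # uncertainty in the mean
frac_unc_sigma = 1 / np.sqrt(2 * (N - 1))   # fractional uncertainty in sigma_x

print(f"mean = {mean:.2f}, sigma_x = {sigma_x:.2f}, uncertainty in mean = {sdom:.2f}")
print(f"fractional uncertainty in sigma_x = {frac_unc_sigma:.2f}")
```

With only five measurements the fractional uncertainty in $\sigma_x$ is large, which is a reminder that $\sigma_x$ itself is only a rough estimate when $N$ is small.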

## Acceptability of the Measured Answer

Let's say we measure a quantity $x$ in the standard form:
$$
(\text{value of } x) = x_{best} \pm \sigma
$$
where $\sigma$ is the appropriate standard deviation (for example, the uncertainty in the mean, $\sigma_x/\sqrt{N}$, when $x_{best}$ is an average of $N$ measurements).

Let's also say that there exists an expected value to compare against, $x_{exp}$.

Then, we can say that $x_{best}$ differs from $x_{exp}$ by $t$ standard deviations:
$$
t = \frac{|{x_{best}-x_{exp}}|}{\sigma}
$$

Assuming that $x$ is normally distributed about $x_{exp}$ with width $\sigma$, we can find the probability of a discrepancy as large as ours or larger by referencing [error function tables](https://en.wikipedia.org/wiki/Error_function).
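
Instead of looking up a table, we can compute this probability directly. For a normal distribution, the probability of landing $t$ or more standard deviations from the center, on either side, is $\operatorname{erfc}(t/\sqrt{2})$. A minimal sketch using only the standard library follows; the helper name `prob_outside` is ours, introduced for illustration.

```python
import math

def prob_outside(t):
    """Two-sided probability that a normally distributed result lands
    t or more standard deviations away from the expected value."""
    return math.erfc(t / math.sqrt(2))

# Tabulate the probability for a few values of t
for t in (1.0, 2.0, 3.0):
    print(f"t = {t:.1f}: P(outside t*sigma) = {100 * prob_outside(t):.2f}%")
```

The same number can be obtained as `2 * ss.norm.sf(t)` with `scipy.stats`. A common, though arbitrary, convention is to call a discrepancy significant when this probability falls below 5%, which corresponds to $t$ of roughly 2.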

In [1]:
import numpy as np
import pandas as pd
import scipy.stats as ss
import matplotlib.pyplot as plt

## Practice Problem

The data file "DropTimes.csv" contains 40 measurements (10 sets of 4) of the time (in hundredths of a second) it takes for a stone to fall from a ledge to the ground.

A) Compute the mean, $\bar{x_t}$, and standard deviation, $\sigma_t$, for all 40 measurements.

B) Find the mean of each of the 10 sets of 4 measurements, $\bar{t}_1, \ldots, \bar{t}_{10}$.

C) From your result in part A), what would you expect the standard deviation of the 10 averages to be? What is it actually?

D) Make histogram plots for the 40 individual measurements and for the 10 averages. Use the same scale and bin sizes for both plots. Add a normal distribution curve to the histogram plots.

E) Use $\bar{x_t}$ and $\sigma_t$ to calculate the height from which the stone fell. If the measured height of the ledge is 2.8 meters, is our calculated height a reasonable value?
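
Parts B) and C) rest on the result that the mean of $n$ measurements has uncertainty $\sigma/\sqrt{n}$, so the means of sets of 4 should scatter with a width of about $\sigma_t/\sqrt{4}$. The sketch below illustrates the idea with synthetic, randomly generated times rather than the real file. It also assumes the 40 values are ordered so that each consecutive group of 4 forms one set, which may not match how DropTimes.csv is actually laid out.

```python
import numpy as np

rng = np.random.default_rng(0)                     # fixed seed so the sketch is repeatable
times = rng.normal(loc=72.0, scale=8.0, size=40)   # synthetic stand-in for 40 drop times

sigma_t = times.std(ddof=1)                        # spread of the individual measurements
set_means = times.reshape(10, 4).mean(axis=1)      # one mean per consecutive set of 4

expected = sigma_t / np.sqrt(4)                    # predicted spread of the 10 means
actual = set_means.std(ddof=1)                     # observed spread of the 10 means

print(f"expected {expected:.2f}, actual {actual:.2f}")
```

The two numbers will not agree exactly. With only 10 means, the fractional uncertainty in their standard deviation is $1/\sqrt{2(10-1)}$, or about 24%, so a difference of that order is unremarkable.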


In [None]:
# import data as a dataframe using pandas
# data is in github repository
url = 'https://raw.githubusercontent.com/trolfe13/Guilford_PH231/main/DropTimes.csv'
df = pd.read_csv(url)
df
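
For part E), one possible approach is to treat the stone as falling from rest with no air resistance, so that $h = \frac{1}{2} g t^2$, and to propagate the uncertainty in $t$ through the square, which doubles the fractional uncertainty. The sketch below makes those assumptions, takes $g = 9.81$ m/s$^2$ as exact, and uses placeholder numbers rather than results from the data. The function name `fall_height` is hypothetical, and choosing which uncertainty to use for `delta_t` is left as part of the exercise.

```python
import math

G = 9.81  # m/s^2, assumed value of the gravitational acceleration

def fall_height(t_best, delta_t, g=G):
    """Height fallen from rest in time t, with uncertainty propagated from t alone."""
    h = 0.5 * g * t_best**2
    delta_h = h * 2 * delta_t / t_best   # fractional uncertainty doubles for t squared
    return h, delta_h

# Placeholder numbers, NOT taken from DropTimes.csv.
# The file records times in hundredths of a second, so divide by 100 first.
t_best, delta_t = 80.0 / 100, 3.0 / 100   # seconds
h, delta_h = fall_height(t_best, delta_t)

h_expected = 2.8                           # measured height in meters, from part E
t_score = abs(h - h_expected) / delta_h    # discrepancy in standard deviations

print(f"h = {h:.2f} +/- {delta_h:.2f} m, t = {t_score:.2f}")
```

The resulting `t_score` plays the role of $t$ in the acceptability test above, so the probability of a discrepancy at least that large follows from the normal distribution.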