# Hypothesis Testing

In [1]:
import numpy as np
import scipy.stats as stats

## Payphone Fills

Past data have shown that if the payphones at the airport are emptied every 14 days, the coin collectors will be 70% full on average. The phone company tries to schedule the collection visits at 70% full because money is lost if the phones get full and unusable, but visiting the phones too frequently is also an expense. The company keeps data on fill amounts during collection, in case they need to increase or decrease collection frequency. During the last visit, suppose that 5 phones were 50%, 40%, 70%, 75%, and 45% full, respectively. Do you think the frequency of visits needs to be changed, or is this just chance variation? (95% confidence is fine.)

Hint: Note that an action will be taken (increase or decrease visits) if the average fill shifts away from the mean (70%) in either direction. State your hypotheses based on this fact.

Hint 2: You’ll need a p-value calculator to find the p-value: https://www.graphpad.com/quickcalcs/pvalue1.cfm 

In [2]:
def sample_std_dev(array):
    sample_mu = array.mean()
    n = len(array)
    return np.sqrt(np.sum((array - sample_mu) ** 2 )/ (n-1))

def t_statistic(samples_array, mu):
    sample_mu = samples_array.mean()
    s = sample_std_dev(samples_array)
    n = len(samples_array)
    return (sample_mu - mu) / (s / np.sqrt(n))

def p_value_from_t_statistic(t_stat, dof, tails=1):
    if tails == 1:
        # Left-tailed test: P(T <= t_stat)
        return stats.t.cdf(t_stat, dof)
    else:
        # Two-tailed test: double the tail area beyond |t_stat|
        return stats.t.sf(abs(t_stat), dof) * 2

Given fill data:

In [3]:
fills = np.array([50, 40, 70, 75, 45])

We need to compute t as shown below:

$$ t = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}} $$

In [4]:
x_bar = fills.mean()
x_bar

56.0

\\(\mu\\) is the average fill under the null hypothesis, the value we are testing against

In [5]:
mu = 70 # Null Hypothesis H0

The sample standard deviation uses \\(n-1\\) degrees of freedom

In [6]:
# With formula
s = sample_std_dev(fills)
print("{:.4f}".format(s))

15.5724


In [7]:
# With numpy
s = fills.std(ddof=1)
print("{:.4f}".format(s))

15.5724


n is our number of samples

In [8]:
# With len function
n = len(fills)

In [9]:
# With numpy shape attribute
fills.shape

(5,)

In [10]:
n = fills.shape[0]
print("n: {}".format(n))

n: 5


In [11]:
t_stat = t_statistic(fills, mu)
print("t_statistic: {:.4f}".format(t_stat))

t_statistic: -2.0103
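
As a hand check, substituting the values computed above into the formula (with \\(s\\) rounded to four decimals) gives roughly the same number:

$$ t = \frac{56 - 70}{\frac{15.5724}{\sqrt{5}}} \approx \frac{-14}{6.9642} \approx -2.0103 $$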


Two-tailed significance level

In [12]:
# p is already doubled for the two-tailed test, so compare it with the full alpha
alpha = 1.0 - 0.95
print('alpha: {:.3f}'.format(alpha))

alpha: 0.050


In [13]:
# p-value from scipy stats, or from the calculator linked above
p = p_value_from_t_statistic(t_stat, n-1, 2)
print('p-value: {:.4f}'.format(p))

p-value: 0.1148


Can we reject \\(H_0\\)?

In [14]:
p <= alpha

False

We cannot reject the null hypothesis because the two-tailed p-value (0.1148) is greater than our significance level of 0.05. In other words, the data do not provide significant evidence that the average payphone fill after 14 days differs from 70%, so the lower fills seen on this visit are consistent with chance variation.
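
An equivalent way to look at the two-tailed test is a 95% confidence interval for the mean fill: if 70 falls inside the interval, the null hypothesis is not rejected. The following is a minimal sketch of that check, not part of the original exercise. The names `std_err`, `t_crit`, `ci_low`, and `ci_high` are introduced here for illustration.

```python
import numpy as np
import scipy.stats as stats

fills = np.array([50, 40, 70, 75, 45])
mu = 70  # mean fill under the null hypothesis
n = len(fills)

x_bar = fills.mean()
# Standard error of the mean, using the sample standard deviation (ddof=1)
std_err = fills.std(ddof=1) / np.sqrt(n)

# Critical t value that leaves 2.5% in each tail with n-1 degrees of freedom
t_crit = stats.t.ppf(0.975, n - 1)

ci_low = x_bar - t_crit * std_err
ci_high = x_bar + t_crit * std_err
print("95% CI: ({:.2f}, {:.2f})".format(ci_low, ci_high))
print("mu inside interval:", ci_low <= mu <= ci_high)
```

Because the interval is wide with only five phones, 70 lies inside it, which agrees with the p-value comparison.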

## Trains

People in DC constantly complain that the metro consistently runs an average of 10 minutes late. You actually think it’s less than this, so you gather data for ten different trains at a specific location in DC. The following is your data in minutes of lateness: [4, 12, 6, 2, 1, 6, 7, 3, 16, 0]. Based on your data, are the people in DC correct?

In [19]:
h0 = 10
# h1 -> < 10 therefore left-tailed (one-tailed) test

# One-tailed significance value
C = 0.95
alpha = 1 - C

train_samples = np.array([4, 12, 6, 2, 1, 6, 7, 3, 16, 0])
n = len(train_samples)
sample_mean = train_samples.mean()
s = sample_std_dev(train_samples)

In [20]:
t_stat = t_statistic(train_samples, h0)
print("t_statistic: {:.4f}".format(t_stat))

t_statistic: -2.7129


In [23]:
p = p_value_from_t_statistic(t_stat, n-1, tails=1)
print('p-value: {:.4f}'.format(p))

p-value: 0.0119


In [24]:
p <= alpha

True

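We can reject the null hypothesis because our p-value (0.0119) is less than our significance level (alpha = 0.05). In other words, if trains were 10 minutes late on average, there would be only about a 1.2% chance of observing a sample mean this low or lower. The data support the belief that the average delay is less than 10 minutes.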
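
As a cross-check on the hand-written helpers, SciPy's built-in one-sample t-test should give the same statistic and p-value. This is a sketch that assumes SciPy 1.6 or newer, where `ttest_1samp` accepts the `alternative` argument. The name `result` is introduced here for illustration.

```python
import numpy as np
import scipy.stats as stats

train_samples = np.array([4, 12, 6, 2, 1, 6, 7, 3, 16, 0])

# alternative="less" tests H1: mean lateness < 10 minutes (left-tailed)
result = stats.ttest_1samp(train_samples, popmean=10, alternative="less")
print("t: {:.4f}, p-value: {:.4f}".format(result.statistic, result.pvalue))
```

The printed values should match the t-statistic and p-value computed above.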