T or Z? If $\sigma$ is unknown and has to be estimated from the sample, use t. If $\sigma$ is known, use z. That said, if you're estimating $\sigma$ and $n > 30$, z is usually an acceptable approximation, since the t distribution approaches the standard normal as $n$ grows. If you're not sure, go with t: it is more conservative and converges to z anyway with more data points.

http://rpsychologist.com/d3/tdist/ example
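
To make the "t approaches z" rule of thumb concrete, here is a small sketch that compares one-tailed critical values at $\alpha$ = 0.05. The sample sizes in the list are arbitrary choices for illustration.

```python
import scipy.stats as scs

alpha = 0.05
z_crit = scs.norm.ppf(1 - alpha)  # one-tailed z critical value

# Illustrative sample sizes (assumed for this sketch, not from any data set)
sample_sizes = [5, 10, 30, 100, 1000]
gaps = []
for n in sample_sizes:
    t_crit = scs.t.ppf(1 - alpha, n - 1)  # one-tailed t critical value, dof = n - 1
    gaps.append(t_crit - z_crit)
    print("n = {:4d}  t = {:0.3f}  z = {:0.3f}  gap = {:0.3f}".format(n, t_crit, z_crit, gaps[-1]))
```

The gap is always positive (t is the more conservative choice) and shrinks as the sample size grows.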

# One Sample Test of Population Mean

**Problem**: On average, a patient should take 20mg of Vinceadrine per day. A large drug company believes more Vinceadrine than this is consumed. They obtained anonymous drug data from 7 random patients and found that they consumed the following amounts of Vinceadrine: 20, 30, 25, 25, 30, 15, 40. Is there sufficient evidence to say that the amount consumed is higher?

Step 1 Define our hypotheses:

$H_0$: $\mu$ <= 20

$H_1$: $\mu$ > 20

Step 2 Determine the appropriate test and level of significance:

1-tailed test, $\alpha$ = 0.05

Step 3 Calculate our test statistic:

$t= \frac{\bar{x}-\mu}{\frac{s}{\sqrt{n}}}$

Step 4 Compare our test statistic to the critical value:

$t$ > critical value?

Step 5:

Depending on the result of the comparison between our test statistic and our critical value, we reject or fail to reject the null hypothesis.

In [12]:
import scipy.stats as scs
import numpy as np
numbers = [20, 30, 25, 25, 30, 15, 40]
mu = 20
alpha = 0.05
xbar = np.mean(numbers)
std = np.std(numbers, ddof=1)
n = len(numbers)
dof = n - 1
t_value = scs.t.ppf(1-alpha, dof)
t_stat = (xbar - mu)/(std/np.sqrt(n))

print("Degrees of freedom = {:d}".format(dof))
print("xbar = {:0.1f}".format(xbar))
print("Standard Deviation = {:0.3f}".format(std))
print("T-value = {:0.3f}".format(t_value))
print("T-statistic = {:0.3f}".format(t_stat))

reject = t_stat > t_value
print("The t-statistic is greater than the t-value: {}".format(reject))
print("We should " + ('' if reject else 'not ') + "reject the null hypothesis")

Degrees of freedom = 6
xbar = 26.4
Standard Deviation = 8.018
T-value = 1.943
T-statistic = 2.121
The t-statistic is greater than the t-value: True
We should reject the null hypothesis
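
The five steps above can be wrapped into one reusable function. This is a minimal sketch: `one_sample_t_upper` is a hypothetical helper name, not a SciPy function, and it only covers the upper-tailed case used in this problem. It also returns the one-sided p-value, which gives the same decision as the critical-value comparison.

```python
import numpy as np
import scipy.stats as scs

def one_sample_t_upper(data, mu0, alpha=0.05):
    """Upper-tailed one-sample t-test (hypothetical helper, not part of SciPy)."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    dof = n - 1
    # Test statistic: (xbar - mu0) / (s / sqrt(n)), using the sample std (ddof=1)
    t_stat = (data.mean() - mu0) / (data.std(ddof=1) / np.sqrt(n))
    t_crit = scs.t.ppf(1 - alpha, dof)  # critical value for the upper tail
    p_value = scs.t.sf(t_stat, dof)     # one-sided p-value, P(T > t_stat)
    return t_stat, t_crit, p_value, t_stat > t_crit

t_stat, t_crit, p_value, reject = one_sample_t_upper([20, 30, 25, 25, 30, 15, 40], mu0=20)
print("t = {:0.3f}, critical value = {:0.3f}, p-value = {:0.3f}".format(t_stat, t_crit, p_value))
print("Reject the null hypothesis: {}".format(reject))
```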


# Another Sample Test of Population Mean

**Problem**: Let's say the received knowledge, status-quo understanding of the average IQ of DSI students is 100.  (IQ tests are designed so the population average is 100.)  But, we think that all our students are really smart and that perhaps the actual average IQ of a DSI 201 student is higher than 100.
To test this we randomly sample 5 students and find the following scores: 91,101,111,121,131.
Is there sufficient evidence to say that the average IQ is higher?

Step 1 Define our hypotheses:

$H_0$: $\mu$ <= 100

$H_1$: $\mu$ > 100

Step 2 Determine the appropriate test and level of significance:

1-tailed test, $\alpha$ = 0.05

Step 3 Calculate our test statistic:

$t= \frac{\bar{x}-\mu}{\frac{s}{\sqrt{n}}}$

Step 4 Compare our test statistic to the critical value:

$t$ > critical value?

Step 5:

Depending on the result of the comparison between our test statistic and our critical value, we reject or fail to reject the null hypothesis.

In [19]:
import scipy.stats as scs
import numpy as np
numbers = [91,101,111,121,131]
mu = 100
alpha = 0.05
xbar = np.mean(numbers)
std = np.std(numbers, ddof=1)
n = len(numbers)
dof = n - 1
t_value = scs.t.ppf(1-alpha, dof)
t_stat_man = (xbar - mu)/(std/np.sqrt(n))

print("Degrees of freedom = {:d}".format(dof))
print("xbar = {:0.1f}".format(xbar))
print("Standard Deviation = {:0.3f}".format(std))
print("T-value = {:0.3f}".format(t_value))
print("T-statistic = {:0.3f}".format(t_stat_man))

reject = t_stat_man > t_value
print("The t-statistic is greater than the t-value: {}".format(reject))
print("We should " + ('' if reject else 'not ') + "reject the null hypothesis")

Degrees of freedom = 4
xbar = 111.0
Standard Deviation = 15.811
T-value = 2.132
T-statistic = 1.556
The t-statistic is greater than the t-value: False
We should not reject the null hypothesis


In [20]:
t_stat, p_value = scs.ttest_1samp(numbers, mu)
print("T-stat = {:0.3f}".format(t_stat))
print("P-value = {:0.3f}".format(p_value))

reject_p = p_value < alpha
print("The p-value is less than alpha: {}".format(reject_p))
print("We should " + ('' if reject_p else 'not ') + "reject the null hypothesis")

T-stat = 1.556
P-value = 0.195
The p-value is less than alpha: False
We should not reject the null hypothesis


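In [92]:
# Two-sided p-value computed by hand; it matches the ttest_1samp p-value above
p_val = scs.t.sf(np.abs(t_stat_man), dof)*2
print("P-value = {:0.3f}".format(p_val))

P-value = 0.195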
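
The hypotheses in this example are one-tailed, while `scs.ttest_1samp` reports a two-sided p-value by default. A possible way to get the matching one-sided p-value is the `alternative` keyword, sketched below under the assumption that SciPy 1.6 or newer is installed.

```python
import scipy.stats as scs

numbers = [91, 101, 111, 121, 131]
mu = 100

# Default behaviour: two-sided p-value
t_two, p_two = scs.ttest_1samp(numbers, mu)
# Upper-tailed p-value for H1: mu > 100 (the `alternative` keyword assumes SciPy >= 1.6)
t_one, p_one = scs.ttest_1samp(numbers, mu, alternative='greater')

print("Two-sided p-value = {:0.3f}".format(p_two))
print("One-sided p-value = {:0.3f}".format(p_one))
```

Because the t-statistic is positive, the one-sided p-value is half the two-sided one. It is still above $\alpha$ = 0.05, so the conclusion does not change.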

# One Sample Test of Population Proportion

Problem: Suppose my girlfriend and I flip a coin to see who has to do the dishes. She believes that I'm being nice and losing on purpose (i.e. my chances of winning < 50%). In the random sample of 200 days out of the year, I only won 82 times. Was I rigging the coin toss?

Step 1:

$H_0$: p >= 0.5

$H_1$: p < 0.5

Step 2:

1-tailed test, $\alpha$ = 0.01

Step 3:

Z-test: A rough rule to check that the z-test is okay is $n\hat{p} > 5$ and $n(1-\hat{p}) > 5$.

$z = \frac{\hat{p}-p}{\sqrt{\frac{p(1-p)}{n}}}$ < critical value? (This is a lower-tailed test, so we reject when $z$ falls below the negative critical value.)

Step 4:

Reject or fail to reject null hypothesis

In [93]:
total = 200
won = 82
p = 0.5
alpha = 0.01  # significance level chosen in Step 2
phat = won/total
z_value = scs.norm.ppf(alpha)
z_stat = (phat - p)/np.sqrt((p*(1-p)/total))

print("Phat = {:0.2f}".format(phat))
print("z-value = {:0.3f}".format(z_value))
print("z-stat = {:0.3f}".format(z_stat))
print("The z-stat is less than the z-value: {}".format(z_stat < z_value))

Phat = 0.41
z-value = -2.326
z-stat = -2.546
The z-stat is less than the z-value: True
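
As a cross-check on the critical-value comparison, the same decision can be phrased with p-values. The sketch below computes the lower-tail p-value from the normal approximation and, as an extra check that is not part of the z-test itself, the exact binomial probability of winning 82 times or fewer under $H_0$.

```python
import numpy as np
import scipy.stats as scs

total, won, p, alpha = 200, 82, 0.5, 0.01
phat = won / total
z_stat = (phat - p) / np.sqrt(p * (1 - p) / total)

p_value_z = scs.norm.cdf(z_stat)              # lower-tail p-value, normal approximation
p_value_exact = scs.binom.cdf(won, total, p)  # exact P(X <= 82) when p = 0.5

print("z-stat = {:0.3f}".format(z_stat))
print("Normal-approximation p-value = {:0.4f}".format(p_value_z))
print("Exact binomial p-value = {:0.4f}".format(p_value_exact))
print("Reject the null hypothesis: {}".format(p_value_z < alpha))
```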


# Two Sample Comparison of Means

Problem: Is the average price of a bottle of lemonade in Oakland different from the average price of a drink in San Francisco? A sample of drinks from San Francisco stores was taken and the prices of the drinks in the sample were 2.69, 1.50, 3.49, 4.69, 2.89. A sample of drinks from Oakland stores was taken and the prices of the drinks in the sample were 2.19, 1.10, 1.49, 2.69, 1.89.

Step 1:

$H_0$: $\mu_1 - \mu_2 = D$ (D can be 0 when you just want to know if they're different)

$H_1$: $\mu_1 - \mu_2 \neq D$

Step 2:

Choose level of significance, $\alpha$

Step 3:

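$t= \frac{\bar{x}_1-\bar{x}_2-D}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}$, $|t|$ > critical value? (The test is two-tailed, so compare the absolute value of $t$ with the critical value at $\alpha/2$.)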

Step 4:

Reject or fail to reject null hypothesis

In [99]:
sf = [2.69, 1.50, 3.49, 4.69, 2.89]
oak = [2.19, 1.10, 1.49, 2.69, 1.89]
alpha = 0.05
sf_xbar = np.mean(sf)
oak_xbar = np.mean(oak)
sf_std = np.std(sf, ddof=1)
oak_std = np.std(oak, ddof=1)
sf_n = len(sf)
oak_n = len(oak)
# Conservative choice of degrees of freedom: the smaller sample size minus 1
dof = min(sf_n, oak_n) - 1
# Two-tailed test, so split alpha across both tails
t_value = scs.t.ppf(1 - alpha/2, dof)
t_stat_man = (sf_xbar - oak_xbar)/np.sqrt((sf_std**2/sf_n) + (oak_std**2/oak_n))
print("Degrees of freedom = {:d}".format(dof))
print("SF xbar = {:0.3f}, Oakland xbar = {:0.3f}".format(sf_xbar, oak_xbar))
print("SF standard deviation = {:0.3f}, Oakland standard deviation = {:0.3f}".format(sf_std, oak_std))
print("T-value = {:0.3f}".format(t_value))
print("T-statistic = {:0.3f}".format(t_stat_man))

reject = abs(t_stat_man) > t_value
print("The absolute t-statistic is greater than the t-value: {}".format(reject))
print("We should " + ('' if reject else 'not ') + "reject the null hypothesis")

Degrees of freedom = 4
SF xbar = 3.052, Oakland xbar = 1.872
SF standard deviation = 1.166, Oakland standard deviation = 0.615
T-value = 2.776
T-statistic = 2.001
The absolute t-statistic is greater than the t-value: False
We should not reject the null hypothesis
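
The same comparison can be run with SciPy's built-in two-sample test. This sketch uses `scs.ttest_ind` with `equal_var=False` (Welch's t-test), which uses the same unpooled standard error as the formula in Step 3 but estimates the degrees of freedom from the data, so its p-value need not match a hand calculation that fixes the degrees of freedom.

```python
import scipy.stats as scs

sf = [2.69, 1.50, 3.49, 4.69, 2.89]
oak = [2.19, 1.10, 1.49, 2.69, 1.89]
alpha = 0.05

# Welch's t-test: unequal variances, two-sided by default
t_stat, p_value = scs.ttest_ind(sf, oak, equal_var=False)

print("T-stat = {:0.3f}".format(t_stat))
print("P-value = {:0.3f}".format(p_value))
print("Reject the null hypothesis: {}".format(p_value < alpha))
```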


# Two Sample Comparison of Proportions
Problem: Are the clickthrough rates different on version 1 versus version 2 of our website? We took data from 1200 visitors for a week of traffic through each version of the page, and found that the CTR for version 1 is 0.04 and the CTR for version 2 is 0.06.

Step 1:

$H_0$: $p_1 - p_2 = D$

$H_1$: $p_1 - p_2 \neq D$

(Can set $D$ to 0 here to just see if they're the same, or can set it to something else to see if version 2 changed our clickthrough rate by some particular amount.)

Step 2:

choose level of significance $\alpha$

Step 3:

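Z-test: A rough rule to check that the z-test is okay is $n\hat{p} > 5$ and $n(1-\hat{p}) > 5$ for each sample.

$z = \frac{\hat{p}_1-\hat{p}_2-D}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}$, $|z|$ > critical value? (The test is two-tailed, so compare the absolute value of $z$ with the critical value at $\alpha/2$.)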

Step 4:

Reject or fail to reject null hypothesis

In [97]:
n1 = 1200
n2 = 1200
phat1 = 0.04
phat2 = 0.06
alpha = 0.01
# Two-tailed test, so split alpha across both tails
z_value = scs.norm.ppf(1 - alpha/2)
z_stat = (phat1-phat2)/(np.sqrt((phat1*(1-phat1)/n1)+(phat2*(1-phat2)/n2)))
print("Z-val: {:0.3f}".format(z_value))
print("Z-stat: {:0.3f}".format(z_stat))
print("The absolute z-stat is greater than the z-value: {}".format(abs(z_stat) > z_value))


Z-val: 2.576
Z-stat: -2.250
The absolute z-stat is greater than the z-value: False
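
When $D = 0$, a common variant pools the two samples to estimate the standard error under $H_0$. The sketch below shows that variant together with a two-sided p-value. The click counts (48 and 72) are an assumption derived from the stated rates times 1200 visitors, and the pooled formula is a swapped-in alternative to the unpooled one used above.

```python
import numpy as np
import scipy.stats as scs

n1, n2 = 1200, 1200
clicks1, clicks2 = 48, 72  # assumed counts: 0.04 * 1200 and 0.06 * 1200
alpha = 0.01

phat1, phat2 = clicks1 / n1, clicks2 / n2
p_pool = (clicks1 + clicks2) / (n1 + n2)  # pooled proportion under H0: p1 = p2
se_pool = np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z_pooled = (phat1 - phat2) / se_pool
p_value = 2 * scs.norm.sf(abs(z_pooled))  # two-sided p-value

print("Pooled z-stat = {:0.3f}".format(z_pooled))
print("Two-sided p-value = {:0.4f}".format(p_value))
print("Reject the null hypothesis: {}".format(p_value < alpha))
```

With these numbers the pooled and unpooled statistics are close, and neither rejects the null hypothesis at $\alpha$ = 0.01.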


# Chi-Square Goodness of Fit Test
How well does the expected model fit the data?

This is for cases where the observations fall into bins and the chance that an observation falls into a bin is a certain percentage.

Bins might be hours, days, or type of customer (premium, paid, free).
For example: did the expected number of users of the app each day match reality?

# Goodness of Fit - Example
$$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$$
Dice Game:

|Value|Observed Frequency|Expected Frequency|
|-----|------------------|------------------|
|1    |16                |?                 |
|2    |5                 |?                 |
|3    |9                 |?                 |
|4    |7                 |?                 |
|5    |6                 |?                 |
|6    |17                |?                 |
|Total|60                |?                 |

If the die is fair, each of the 6 values is expected $60/6 = 10$ times:

|Value|Observed Frequency|Expected Frequency|
|-----|------------------|------------------|
|1    |16                |10                |
|2    |5                 |10                |
|3    |9                 |10                |
|4    |7                 |10                |
|5    |6                 |10                |
|6    |17                |10                |
|Total|60                |60                |

```
obs_table = ...  # use an np array
exp_table = ...  # use an np array
chi2_stat = ...  # use the chi2 formula
print("Chi2 Statistic: {}".format(chi2_stat))
print("Critical Chi2 Value: {:0.2f}".format(scs.chi2.ppf(0.95, 5)))
```

In [98]:
obs_table = np.array([16,5,9,7,6,17])
exp_table = np.array([10,10,10,10,10,10])
chi2_stat = sum((exp_table - obs_table)**2/exp_table)
print("Chi2 Statistic: {}".format(chi2_stat))
print("Critical Chi2 Value: {:0.2f}".format(scs.chi2.ppf(0.95,5)))

Chi2 Statistic: 13.6
Critical Chi2 Value: 11.07


In [91]:
obs_table = np.array([16,5,9,7,6,17])
exp_table = np.array([10,10,10,10,10,10])
scs.chisquare(obs_table,exp_table)

Power_divergenceResult(statistic=13.6, pvalue=0.018360196409519448)
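
To connect the two cells above, the sketch below derives the p-value by hand from the statistic with `scs.chi2.sf` and states the decision explicitly. It assumes $\alpha$ = 0.05, which matches the 0.95 quantile used for the critical value, and 5 degrees of freedom (6 bins minus 1).

```python
import numpy as np
import scipy.stats as scs

obs = np.array([16, 5, 9, 7, 6, 17])
exp = np.array([10, 10, 10, 10, 10, 10])
alpha = 0.05           # assumed significance level
dof = len(obs) - 1     # number of bins minus 1

chi2_stat = np.sum((obs - exp) ** 2 / exp)
chi2_crit = scs.chi2.ppf(1 - alpha, dof)  # critical value
p_value = scs.chi2.sf(chi2_stat, dof)     # P(chi2 > statistic)
reject = chi2_stat > chi2_crit

print("Chi2 statistic = {:0.2f}, critical value = {:0.2f}".format(chi2_stat, chi2_crit))
print("P-value = {:0.4f}".format(p_value))
print("Reject the null hypothesis: {}".format(reject))
```

Both views agree: the statistic exceeds the critical value and the p-value is below 0.05, so the observed rolls are unlikely to come from a fair die.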

# Bonferroni Correction


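$P(A \cup B) = P(A) + P(B) - P(A \cap B)$ $\Rightarrow$ 0.05 + 0.05 - ?

With two tests at $\alpha = 0.05$ each, the chance of at least one false positive can be as high as $0.05 + 0.05 = 0.10$, so divide $\alpha$ by the number of tests: $\frac{0.05}{2} = 0.025$ $\rightarrow$ new $\alpha$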

Case with 20 tests:

$\frac{\alpha}{n} = \frac{0.05}{20} = 0.0025$

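$P$(at least one significant result) $=$ $1 - P$(no significant results)

$P$(at least one significant result) $=$ $1-(1-0.0025)^{20}$ $=$ $0.0488$ (assuming the 20 tests are independent and every null hypothesis is true)

$\Rightarrow$ Bonferroni is slightly conservative: the overall false positive rate of 0.0488 stays just under the target of 0.05. That's good.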
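
The arithmetic above can be checked in a few lines. This sketch compares the chance of at least one false positive across 20 tests with and without the correction, assuming the tests are independent and every null hypothesis is true.

```python
alpha = 0.05
n_tests = 20

# Without correction: each test is run at alpha
fwer_uncorrected = 1 - (1 - alpha) ** n_tests

# With Bonferroni: each test is run at alpha / n_tests
alpha_bonf = alpha / n_tests
fwer_bonf = 1 - (1 - alpha_bonf) ** n_tests

print("Per-test alpha after correction = {:0.4f}".format(alpha_bonf))
print("P(at least one false positive), uncorrected = {:0.4f}".format(fwer_uncorrected))
print("P(at least one false positive), Bonferroni = {:0.4f}".format(fwer_bonf))
```

Without the correction, the chance of at least one false positive across 20 tests is well over one half, which is the problem Bonferroni is meant to fix.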