# Hypothesis Testing in Practice

## Chi-square

Is there a relationship between a student's outcome on the stats quiz and whether or not that student completed the review?

In [13]:
import pandas as pd

columns = ['passed', 'failed']
index = ['review', 'no review']
alpha = .05

observed = pd.DataFrame([[11, 1], [2, 3]], index=index, columns=columns)
observed

| | passed | failed |
|---|---|---|
| review | 11 | 1 |
| no review | 2 | 3 |


Run the chi-square test to generate the expected values and the p-value.

In [10]:
from scipy import stats
chi2, p, degf, expected = stats.chi2_contingency(observed)

# expected values
pd.DataFrame(expected, index=index, columns=columns)

| | passed | failed |
|---|---|---|
| review | 9.176471 | 2.823529 |
| no review | 3.823529 | 1.176471 |
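The expected values above follow from the row and column totals: under independence, each cell's expected count is (row total × column total) / n. A minimal sketch of that arithmetic, using NumPy directly (the variable names `row_totals`, `col_totals`, and `expected_by_hand` are illustrative, not from the original notebook):

```python
import numpy as np

# Observed counts from the table above
# (rows: review, no review; columns: passed, failed).
observed = np.array([[11, 1], [2, 3]])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (2, 1): [[12], [5]]
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 2): [[13, 4]]
n = observed.sum()                                # 17 students in total

# Expected count under independence: (row total * column total) / n.
expected_by_hand = row_totals * col_totals / n
print(expected_by_hand)
```

For example, the review/passed cell is 12 × 13 / 17 ≈ 9.18, which agrees with the `expected` array returned by `stats.chi2_contingency`.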


Is there a significant relationship between passing/failing the quiz and review completion? 

In [12]:
p

0.09674413639330004

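No! The p-value (about 0.097) is not less than our alpha of .05, so we fail to reject the null hypothesis that quiz outcome and review completion are independent. The result would be significant only at the 10% level (i.e. a 90% confidence level), not at 5%. 
This is where we think about why. The most likely reason is the small sample: with only 17 students, three of the four expected counts fall below 5, so the test has little power. 
So I will aim to collect more data and see if a significant relationship emerges. 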
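With counts this small, one option that is not part of the original analysis is Fisher's exact test, which does not rely on the chi-square approximation. The sketch below assumes the common rule of thumb that expected counts should be at least 5 (that threshold is an assumption, not something the notebook states) and uses `scipy.stats.fisher_exact` on the same table:

```python
import numpy as np
from scipy import stats

# Same observed table as above (rows: review, no review; columns: passed, failed).
observed = np.array([[11, 1], [2, 3]])

# Count cells whose expected value is below 5 (rule-of-thumb threshold, assumed here).
_, _, _, expected = stats.chi2_contingency(observed)
small_cells = int((expected < 5).sum())

# Fisher's exact test computes the p-value from the hypergeometric distribution.
odds_ratio, p_fisher = stats.fisher_exact(observed, alternative='two-sided')
print(f"cells with expected count < 5: {small_cells}")
print(f"Fisher exact p-value: {p_fisher:.4f}")
```

On this table the exact two-sided p-value comes out near 0.053, which leads to the same decision at alpha = .05 as the chi-square test.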


There is one anomaly in our data: a student whose circumstances were very different at the time of the review than at the time of the quiz (which was delayed due to extenuating circumstances). So, I'm interested to see what happens if we remove this data point, both for the test above and for the one we will run below. 

In [14]:
observed = pd.DataFrame([[11, 0], [2, 3]], index=index, columns=columns)
observed

| | passed | failed |
|---|---|---|
| review | 11 | 0 |
| no review | 2 | 3 |


In [15]:
chi2, p, degf, expected = stats.chi2_contingency(observed)

# expected values
pd.DataFrame(expected, index=index, columns=columns)

| | passed | failed |
|---|---|---|
| review | 8.9375 | 2.0625 |
| no review | 4.0625 | 0.9375 |


In [16]:
p

0.030837167828904243

As I expected, with the anomaly removed we do see significance at the .05 level (p is about 0.031). 
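One detail worth knowing when reading these p-values: for a 2x2 table, `stats.chi2_contingency` applies Yates' continuity correction by default (`correction=True`), which makes the test more conservative. The sketch below simply reruns both tables with and without the correction so the effect is visible; the `tables` and `p_values` dictionaries are illustrative names, not part of the original notebook.

```python
import pandas as pd
from scipy import stats

columns = ['passed', 'failed']
index = ['review', 'no review']

# The two observed tables used above.
tables = {
    'with anomaly': pd.DataFrame([[11, 1], [2, 3]], index=index, columns=columns),
    'anomaly removed': pd.DataFrame([[11, 0], [2, 3]], index=index, columns=columns),
}

p_values = {}
for label, table in tables.items():
    # correction=True is SciPy's default when the table has one degree of freedom.
    _, p_corrected, _, _ = stats.chi2_contingency(table, correction=True)
    _, p_uncorrected, _, _ = stats.chi2_contingency(table, correction=False)
    p_values[label] = (p_corrected, p_uncorrected)
    print(f"{label}: corrected p = {p_corrected:.4f}, uncorrected p = {p_uncorrected:.4f}")
```

The corrected values reproduce the p-values reported above; the uncorrected ones are smaller, so which variant is used should be stated alongside any conclusion.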

## Mann-Whitney Test
### A non-parametric alternative to the t-test

Is there a significant difference in quiz grades between those who did the review and those who did not? 

The assumptions of the parametric t-test cannot be verified with a sample this small (specifically normality & equal variance; independence is met), therefore we will use the non-parametric alternative for comparing 2 independent groups (Mann-Whitney). Strictly speaking, Mann-Whitney compares the ranks of the two groups rather than their means, so the hypotheses are stated in terms of which group tends to score higher. 

- $H_{0}$: quiz grades of those who do the review tend to be less than or equal to quiz grades of those who do not do the review. 

- $H_{a}$: quiz grades of those who do the review tend to be greater than quiz grades of those who do not do the review. 
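The t-test assumptions mentioned above can be probed directly, although with samples this small such checks have little power and should be read loosely. The sketch below is an addition, not part of the original analysis: it uses the grade lists from the cells that follow, with `scipy.stats.shapiro` for normality and `scipy.stats.levene` for equal variance.

```python
from scipy import stats

# Quiz grades used in the Mann-Whitney cells of this notebook.
x = [79, 72, 85, 78, 73, 94, 87, 75, 88, 88, 91, 20]  # did the review
y = [34, 76, 59, 60, 91]                              # did not do the review

# Shapiro-Wilk: null hypothesis is that the sample comes from a normal distribution.
_, p_normal_x = stats.shapiro(x)
_, p_normal_y = stats.shapiro(y)

# Levene: null hypothesis is that the two groups have equal variances.
_, p_equal_var = stats.levene(x, y)

print(f"Shapiro-Wilk p (review):    {p_normal_x:.4f}")
print(f"Shapiro-Wilk p (no review): {p_normal_y:.4f}")
print(f"Levene p (equal variance):  {p_equal_var:.4f}")
```

A large p-value here does not confirm an assumption; with 12 and 5 observations it mostly reflects how little evidence there is either way, which is the reason for preferring a rank-based test.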

In [19]:
x = [79, 72, 85, 78, 73, 94, 87, 75, 88, 88, 91, 20]
y = [34, 76, 59, 60, 91]

stats.mannwhitneyu(x, y, alternative='greater')

MannwhitneyuResult(statistic=42.5, pvalue=0.10267349519081087)

Again, we have the outlier we discussed earlier (the grade of 20), so let's see the significance when we drop it. 

In [20]:
x = [79, 72, 85, 78, 73, 94, 87, 75, 88, 88, 91]
y = [34, 76, 59, 60, 91]

stats.mannwhitneyu(x, y, alternative='greater')

MannwhitneyuResult(statistic=42.5, pvalue=0.049974545757899995)

As expected, with the outlier removed the quiz grades of the students who did the review are significantly higher than those of the students who did not (p is about 0.04997). That is only just under our alpha of .05, so the result should be read with caution. 
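The reported statistic is 42.5 in both runs, which can look odd given that one sample lost a value. The U statistic for `x` counts the pairs in which a review grade beats a no-review grade (ties count as half), and the dropped grade of 20 is below every grade in `y`, so it contributed no wins. A minimal sketch of that pair counting in plain Python (`u_statistic` is a hypothetical helper written for illustration, not a SciPy function):

```python
def u_statistic(a, b):
    """Count pairs (a_i, b_j) with a_i > b_j, scoring each tie as half a win."""
    u = 0.0
    for a_i in a:
        for b_j in b:
            if a_i > b_j:
                u += 1.0
            elif a_i == b_j:
                u += 0.5
    return u

x_full = [79, 72, 85, 78, 73, 94, 87, 75, 88, 88, 91, 20]
x_trimmed = [grade for grade in x_full if grade != 20]  # outlier removed
y = [34, 76, 59, 60, 91]

u_full = u_statistic(x_full, y)
u_trimmed = u_statistic(x_trimmed, y)
print(u_full, u_trimmed)  # both 42.5
```

What changes between the two runs is the number of possible pairs (60 versus 55), so the same U sits further into the tail of its null distribution once the outlier is gone, which is why the p-value drops.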