# Hypothesis Testing
We can test for a **population mean** with:
- z-test, for known variance $\sigma^2$:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

- t-test, for unknown variance (estimated by the sample variance $s^2$):
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

Further, we can test for a **difference in means** under the assumption of
- **unequal variance** and known $\sigma_1^2$ & $\sigma_2^2$: $$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$


- **equal variance** and unknown $\sigma_1^2$ & $\sigma_2^2$: $$t = \frac{{\bar{x}_1 - \bar{x}_{2} - \mu}}{{\sqrt{\frac{{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}}{{n_1 + n_2 - 2}} \left(\frac{1}{{n_1}} + \frac{1}{{n_2}}\right)}}}$$
Here $\mu$ is the hypothesized difference in means; if $\mu = 0$, the numerator looks like the one above.

And for a **population proportion**:
- the test statistic is $$z=\frac{\hat{p} - p}{\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}}$$ where $\hat{p}$ is the sample proportion and $p$ is the hypothesized proportion.
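
The proportion test is not demonstrated on data in this notebook, so here is a minimal sketch of the statistic above. The function name `one_proportion_z_test` and the counts (60 successes in 100 trials, hypothesized proportion 0.5) are made up for illustration. The standard error follows the formula as written here, with $\hat{p}$ in the denominator; some textbooks use the hypothesized proportion there instead.

```python
from math import sqrt

from scipy.stats import norm


def one_proportion_z_test(successes, n, p0):
    """Two-tailed z-test for a proportion (hypothetical helper, standard error based on p_hat)."""
    p_hat = successes / n                  # sample proportion
    se = sqrt(p_hat * (1 - p_hat) / n)     # standard error as in the formula above
    z = (p_hat - p0) / se
    p_value = 2 * norm.sf(abs(z))          # area in both tails of the standard normal
    return z, p_value


# made-up example: 60 successes out of 100, H_0: p = 0.5
z_stat, p_val = one_proportion_z_test(successes=60, n=100, p0=0.5)
print(f"z = {z_stat:.4f}, two-tailed p-value = {p_val:.4f}")
```

For a one-tailed test, `norm.sf(z)` (right tail) or `norm.cdf(z)` (left tail) would replace the two-tailed p-value.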



In [None]:
import pandas as pd

athletes = pd.read_excel("data/athlets.xlsx")
athletes.head()

Unnamed: 0,Name,Height(cm),Weight(kg),Country/Team,Gender
0,"Abbadi, Ilyas",170,69,Algeria,M
1,"Achour, Dallal Merwa",176,45,Algeria,W
2,"Hammouche, Salima",165,45,Algeria,W
3,"Rouba, Amina",173,59,Algeria,W
4,"Barbosa, Neide",178,78,Angola,W


## one-sample, right-tailed means test

Hypothesis:

$H_0: \mu_{height} \leq 173$ <br>
$H_1: \mu_{height} > 173$

In [None]:
from scipy.stats import ttest_1samp, t

# single population mean: test whether the average height of male athletes is greater than 173 cm (the population average)
# H_0 must contain the equality, so the strict inequality (> 173) goes into H_1 and we try to reject H_0
heights_male_ath = athletes.loc[athletes['Gender'] == 'M', 'Height(cm)']

# alternative="greater" (SciPy >= 1.6) makes p_value right-tailed; the t-statistic itself is unchanged
t_statistic, p_value = ttest_1samp(heights_male_ath, popmean=173, alternative="greater")
critical_value = t.ppf(0.95, df=len(heights_male_ath) - 1)  # alpha = 0.05, right tail

# the rejection region is to the right: if the test statistic falls to the right of the critical value we reject H_0
print(f"t-statistic: {t_statistic}\ncritical value: {critical_value}")
if t_statistic >= critical_value:
    print("reject H_0")
else:
    print("failed to reject H_0, the assumption of H_0 remains")

t-statistic: 30.03511435343714
critical value: 1.6468349454267153
reject H_0
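
The same decision can also be reached through the p-value instead of the critical value. The sketch below uses synthetic heights (not the athletes file) and assumed names such as `sample` and `p_right`; it only illustrates that both routes agree for a right-tailed test.

```python
import numpy as np
from scipy.stats import ttest_1samp, t

# synthetic stand-in for a height sample; NOT the athletes data
rng = np.random.default_rng(0)
sample = rng.normal(loc=180, scale=10, size=200)

alpha = 0.05
t_stat, _ = ttest_1samp(sample, popmean=173)
df = len(sample) - 1

# right-tailed p-value: probability of a t value at least this large under H_0
p_right = t.sf(t_stat, df)
critical = t.ppf(1 - alpha, df)

reject_by_p = p_right < alpha            # p-value approach
reject_by_critical = t_stat > critical   # critical-value approach
print(f"p-value: {p_right:.3g}, reject via p-value: {reject_by_p}, via critical value: {reject_by_critical}")
```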


## two-sample, two-tailed difference in means test
with the assumption of equal or unequal variance; switch between the assumptions with ``equal_var=True`` / ``equal_var=False``

Hypothesis:

$H_0: \mu_{men} - \mu_{women} = 12.5$ <br>
$H_1: \mu_{men} - \mu_{women} \neq 12.5$

$$t = \frac{{\bar{x}_1 - \bar{x}_{2} - \mu}}{{\sqrt{\frac{{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}}{{n_1 + n_2 - 2}} \left(\frac{1}{{n_1}} + \frac{1}{{n_2}}\right)}}}$$

for **unequal variances** the test statistic uses the separate sample variances instead of the pooled variance (Welch's t-test):
$$t = \frac{\bar{x}_1 - \bar{x}_2 - \mu}{\sqrt{\frac{{s_1}^2}{n_1} + \frac{{s_2}^2}{n_2}}}$$

**Homogeneity of Variances**<br>
Whether the population variances behind the two samples are equal can be tested with:
- Rule of thumb: treat the variances as equal if the ratio of the larger to the smaller sample variance is less than 4
- F-test
- Levene's test
- Bartlett's test
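
The four checks listed above can be sketched with SciPy as follows. This is not the notebook's own `homo_variance_test` helper (its implementation is not shown here); the data are synthetic and the variable names are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

# synthetic samples with different spreads; NOT the athletes data
rng = np.random.default_rng(42)
a = rng.normal(loc=180, scale=10, size=300)
b = rng.normal(loc=170, scale=8, size=300)

var_a, var_b = np.var(a, ddof=1), np.var(b, ddof=1)  # sample variances

# rule of thumb: ratio of larger to smaller variance below 4
ratio = max(var_a, var_b) / min(var_a, var_b)
print(f"rule of thumb: ratio = {ratio:.2f}, treat as equal: {ratio < 4}")

# two-tailed F-test on the variance ratio
f_stat = var_a / var_b
dfn, dfd = len(a) - 1, len(b) - 1
p_f = 2 * min(stats.f.cdf(f_stat, dfn, dfd), stats.f.sf(f_stat, dfn, dfd))

# Levene's and Bartlett's tests are available directly in SciPy
p_levene = stats.levene(a, b).pvalue
p_bartlett = stats.bartlett(a, b).pvalue

for name, p in [("F-test", p_f), ("Levene", p_levene), ("Bartlett", p_bartlett)]:
    print(f"{name}: p = {p:.6f}, variances equal: {p >= 0.05}")
```

With large samples the formal tests can flag a difference that the rule of thumb tolerates, which is the pattern seen in the results table further down.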

In [None]:
athletes = pd.read_excel("data/athlets.xlsx")
m = athletes.loc[athletes["Gender"] == "M", "Height(cm)"]
f = athletes.loc[athletes["Gender"] == "W", "Height(cm)"]

In [None]:
from stats_functions import homo_variance_test
homo_variance_test(m, f)

Unnamed: 0,Test,Result,Reasoning
0,Rule of thumb,Variances are equal,100.17 / 72.53 = 1.38
1,F-test,Variances are not equal,p value of F-test = 0.00000000
2,Levene's test,Variances are not equal,p value of Levene's test = 0.00002614
3,Bartlett's test,Variances are not equal,p value of Bartlett's test = 0.00001811


In [None]:
from stats_functions import t_test_2sample

# the two-tailed p-value (0.39) exceeds alpha = 0.05, i.e. |t| is below the two-tailed critical value, so we cannot reject H_0
t_test_2sample(m, f, 0.05, 12.5, equal_var=False)

Unnamed: 0_level_0,data_1,data_2
t-test statistics,Unnamed: 1_level_1,Unnamed: 2_level_1
Mean,183.8262,170.904048
Variance,100.172352,72.53132
Observations,771.0,667.0
Pooled Variance,0.238668,
Hypothesized Mean Difference,12.5,
df,1436.0,
t-statistic,0.864115,
P(T<=t) one-tail,0.193834,
t critical one-tail,1.645915,
P(T<=t) two-tail,0.387669,
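
As a plausibility check, the t-statistic in this table can be recomputed by hand from the unequal-variance formula, using only the summary values printed above. The variable names are assumptions; the degrees of freedom are simply copied from the table rather than derived.

```python
from math import sqrt

from scipy.stats import t

# summary values copied from the output table above
mean_1, var_1, n_1 = 183.8262, 100.172352, 771
mean_2, var_2, n_2 = 170.904048, 72.53132, 667
mu = 12.5          # hypothesized mean difference
df_table = 1436    # df as reported in the table

# unequal-variance statistic: separate variances in the standard error
se_squared = var_1 / n_1 + var_2 / n_2
t_stat = (mean_1 - mean_2 - mu) / sqrt(se_squared)
p_two_tail = 2 * t.sf(abs(t_stat), df_table)

print(f"squared standard error: {se_squared:.6f}")
print(f"t-statistic: {t_stat:.4f}")
print(f"two-tailed p-value: {p_two_tail:.4f}")
```

The squared standard error computed this way agrees with the value shown in the `Pooled Variance` row, which suggests that for `equal_var=False` this row holds $s_1^2/n_1 + s_2^2/n_2$ rather than a pooled variance.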


In [None]:
intern = pd.read_csv("data/intern.csv")

# fix column names (spaces)
intern.columns = ['Intern Number', 'Original Score', 'Score After Training', 'Gender', 'Age'] 

after_score_men = intern[intern["Gender"] == 'M']["Score After Training"]
after_score_female = intern[intern["Gender"] == 'F']["Score After Training"]

print(f"mean score women: {after_score_female.mean()}")
print(f"mean score men: {after_score_men.mean()}")

mean score women: 86.55
mean score men: 85.35


In [None]:
from stats_functions import t_test_2sample
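# H_0: no difference between the mean scores of women and men after training (hypothesized difference 0)
t_test_2sample(after_score_female, after_score_men, 0.05, 0, equal_var=True)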

Unnamed: 0_level_0,data_1,data_2
t-test statistics,Unnamed: 1_level_1,Unnamed: 2_level_1
Mean,86.55,85.35
Variance,67.523684,70.344737
Observations,20.0,20.0
Pooled Variance,68.934211,
Hypothesized Mean Difference,0.0,
df,38.0,
t-statistic,0.45705,
P(T<=t) one-tail,0.325119,
t critical one-tail,1.685954,
P(T<=t) two-tail,0.650237,
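
The pooled-variance formula from the beginning of this section can be checked against this table using only its summary rows. The sketch below is a hand computation with assumed variable names, not the code of `t_test_2sample`.

```python
from math import sqrt

from scipy.stats import t

# summary values copied from the output table above
mean_1, var_1, n_1 = 86.55, 67.523684, 20
mean_2, var_2, n_2 = 85.35, 70.344737, 20
mu = 0.0  # hypothesized mean difference

# pooled variance: weighted average of the two sample variances
df = n_1 + n_2 - 2
pooled_var = ((n_1 - 1) * var_1 + (n_2 - 1) * var_2) / df

# equal-variance t-statistic and two-tailed p-value
t_stat = (mean_1 - mean_2 - mu) / sqrt(pooled_var * (1 / n_1 + 1 / n_2))
p_two_tail = 2 * t.sf(abs(t_stat), df)

print(f"pooled variance: {pooled_var:.6f}")
print(f"t-statistic: {t_stat:.5f}, df: {df}")
print(f"two-tailed p-value: {p_two_tail:.4f}")
```

Since the two-tailed p-value is far above 0.05, the data give no evidence of a difference between the two group means.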


## two-sample paired t-test

In [None]:
training = pd.read_excel("data/training.xlsx")
training.head()

after = training["Score_after"]
before = training["Score_before"]

In [None]:
from stats_functions import ttest_paired_2sample
ttest_paired_2sample(after, before, 0.05)

Unnamed: 0_level_0,data_1,data_2
t-test statistics,Unnamed: 1_level_1,Unnamed: 2_level_1
Mean,62.3,58.3
Variance,327.520408,460.908163
Observations,50.0,50.0
Pearson Correlation Coefficient,0.316288,
Hypothesized Mean Difference,0.0,
df,49.0,
t-statistic,1.214182,
P(T<=t) one-tail,0.115249,
t critical one-tail,1.676551,
P(T<=t) two-tail,0.230498,
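
The paired t-statistic can be recovered from the summary rows of this table, because the variance of the differences follows from the two variances and the correlation: $s_d^2 = s_1^2 + s_2^2 - 2 r s_1 s_2$. The sketch below is a hand check with assumed variable names, not the code of `ttest_paired_2sample`.

```python
from math import sqrt

# summary values copied from the output table above
mean_after, var_after = 62.3, 327.520408
mean_before, var_before = 58.3, 460.908163
r = 0.316288   # Pearson correlation between the paired scores
n = 50

# variance of the pairwise differences from variances and correlation
var_diff = var_after + var_before - 2 * r * sqrt(var_after * var_before)

# paired t-statistic: mean difference over its standard error, df = n - 1
t_stat = (mean_after - mean_before) / sqrt(var_diff / n)
print(f"variance of differences: {var_diff:.4f}")
print(f"t-statistic: {t_stat:.4f}, df: {n - 1}")
```

A higher positive correlation between the paired measurements shrinks the variance of the differences and therefore increases the t-statistic for the same mean difference.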


In [None]:
before_score = intern["Original Score"]
after_score = intern["Score After Training"]

from stats_functions import ttest_paired_2sample
ttest_paired_2sample(after_score, before_score)

Unnamed: 0_level_0,data_1,data_2
t-test statistics,Unnamed: 1_level_1,Unnamed: 2_level_1
Mean,85.95,74.6
Variance,67.535897,215.682051
Observations,40.0,40.0
Pearson Correlation Coefficient,-0.186066,
Hypothesized Mean Difference,0.0,
df,39.0,
t-statistic,3.962802,
P(T<=t) one-tail,0.000153,
t critical one-tail,1.684875,
P(T<=t) two-tail,0.000306,
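
For comparison, SciPy offers the paired test directly as `scipy.stats.ttest_rel`. The sketch below uses synthetic before/after scores (not the intern data) and shows that the paired test is the same as a one-sample t-test on the pairwise differences.

```python
import numpy as np
from scipy.stats import ttest_rel, ttest_1samp

# synthetic paired scores; NOT the intern or training data
rng = np.random.default_rng(1)
before_demo = rng.normal(loc=70, scale=12, size=40)
after_demo = before_demo + rng.normal(loc=5, scale=10, size=40)

# paired t-test (two-tailed by default)
paired = ttest_rel(after_demo, before_demo)

# equivalent formulation: one-sample t-test of the differences against 0
on_diffs = ttest_1samp(after_demo - before_demo, popmean=0)

print(f"paired:      t = {paired.statistic:.4f}, p = {paired.pvalue:.4f}")
print(f"differences: t = {on_diffs.statistic:.4f}, p = {on_diffs.pvalue:.4f}")
```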
