# Hypothesis Testing Practice
---
In this notebook I practice hypothesis testing with Python on the well-known Iris dataset, plus a women's international football results dataset for the chi-square test.

**In case you can not see the results in the notebook, please view this [link](https://www.kaggle.com/anzhemeng/python-x-hypo-testing).**

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
data = pd.read_csv('/kaggle/input/iris/Iris.csv')
data.head()

In [None]:
data['Species'].unique()

## T-Test
A t-test is an inferential statistical test used to determine whether there is a significant difference between the means of two groups, which may be related in certain features, or between a sample mean and a hypothesised value. It is mostly used when the data are approximately normally distributed (for example, the number of heads in 100 coin flips, recorded over many repetitions) and the population variance is unknown. The t-test is a hypothesis testing tool: it tests an assumption about a population using sample data.

### a. One sample t-test
The One Sample t Test determines whether the sample mean is statistically different from a known or hypothesised population mean. The One Sample t Test is a parametric test.
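
As a minimal sketch of what the test computes, the t statistic can be worked out by hand with the standard library. The `sample` values and `mu0` below are hypothetical, chosen only for illustration; `ttest_1samp` additionally converts the statistic into a p-value using the t distribution with `n - 1` degrees of freedom.

```python
import math
import statistics

# Hypothetical petal lengths in cm (illustrative values, not the full dataset).
sample = [1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5]
mu0 = 1.3  # hypothesised population mean (assumption for this sketch)

n = len(sample)
sample_mean = statistics.mean(sample)
sample_std = statistics.stdev(sample)        # sample standard deviation (n - 1 denominator)
standard_error = sample_std / math.sqrt(n)   # standard error of the mean

# t statistic: distance between sample mean and mu0, in standard-error units
t_stat = (sample_mean - mu0) / standard_error
df = n - 1  # degrees of freedom of the reference t distribution

print(f"mean = {sample_mean:.3f}, t = {t_stat:.3f}, df = {df}")
```

The larger the absolute value of `t_stat`, the less compatible the sample is with the hypothesised mean.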

Example: check whether the average petal length is 1.3 cm. So I set $H_{0}$: the average petal length is 1.3 cm, and $H_1$: the average petal length is not 1.3 cm.

In [None]:
from scipy.stats import ttest_1samp

PetalLengthCm_mean = data['PetalLengthCm'].mean()
print('PetalLength mean value:', PetalLengthCm_mean)
# Test H0: the population mean petal length equals 1.3
tset, pval = ttest_1samp(data['PetalLengthCm'], 1.3)
print('p-value:', pval)
if pval < 0.05:    # alpha value is 0.05 or 5%
    print("we reject the null hypothesis")
else:
    # A large p-value is not proof of H0, so we only "fail to reject" it
    print("we fail to reject the null hypothesis")

**According to the test result, the null hypothesis is rejected: the data give strong evidence that the average petal length is not 1.3 cm.**

### b. Two-sample t-test
The Independent Samples t Test or 2-sample t-test compares the means of two independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different. The Independent Samples t Test is a parametric test. This test is also known as: Independent t Test.
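
When the two groups have clearly different spreads, Welch's variant of the two-sample t-test is often preferred because it does not assume equal variances. The sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom by hand; `group_a`, `group_b`, and `welch_t` are hypothetical names and values used only for illustration. In SciPy, passing `equal_var=False` to `ttest_ind` requests this variant.

```python
import math
import statistics

# Hypothetical measurements for two independent groups (illustrative only).
group_a = [4.7, 4.5, 4.9, 4.0, 4.6, 4.5]
group_b = [6.0, 5.1, 5.9, 5.6, 5.8, 6.6, 4.5]

def welch_t(x, y):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    vx = statistics.variance(x) / len(x)   # squared standard error of mean(x)
    vy = statistics.variance(y) / len(y)   # squared standard error of mean(y)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

t_stat, df = welch_t(group_a, group_b)
print(f"Welch t = {t_stat:.3f}, df = {df:.2f}")
```

The degrees of freedom are generally not an integer here; they always fall between the smaller of the two `n - 1` values and `n1 + n2 - 2`.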

Example: check whether the mean petal length differs from the mean sepal length.

$H_0$: The mean petal length equals the mean sepal length;

$H_1$: The mean petal length does not equal the mean sepal length.

In [None]:
from scipy.stats import ttest_ind

PetalLengthCm_mean = data['PetalLengthCm'].mean()
SepalLengthCm_mean = data['SepalLengthCm'].mean()
print('PetalLength mean value:', PetalLengthCm_mean)
print('SepalLength mean value:', SepalLengthCm_mean)
PetalLengthCm_std = data['PetalLengthCm'].std()
SepalLengthCm_std = data['SepalLengthCm'].std()
print('PetalLength std value:', PetalLengthCm_std)
print('SepalLength std value:', SepalLengthCm_std)
# Test H0: the two population means are equal
ttest, pval = ttest_ind(data['PetalLengthCm'], data['SepalLengthCm'])
print('p-value:', pval)
if pval < 0.05:
    print("we reject the null hypothesis")
else:
    print("we fail to reject the null hypothesis")

**According to the test result, the null hypothesis is rejected: the mean petal length and the mean sepal length differ significantly. Note that a t-test compares means; it does not measure correlation between the two features.**

### c. Paired sample t-test
The paired sample t-test is also called the dependent sample t-test. It is a univariate test that checks for a significant difference between two related variables.
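
A paired test is equivalent to a one-sample t-test on the per-pair differences, which assumes every value in one sample has a natural partner in the other. The sketch below shows that reduction by hand; the `before` and `after` lists are hypothetical measurements on the same five subjects, not values from the Iris data.

```python
import math
import statistics

# Hypothetical before/after measurements on the same five subjects.
before = [5.1, 4.9, 4.7, 4.6, 5.0]
after = [5.4, 5.0, 5.1, 4.9, 5.2]

# A paired test works on the per-subject differences.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
se_diff = statistics.stdev(diffs) / math.sqrt(n)   # standard error of the mean difference

t_stat = mean_diff / se_diff   # tests H0: mean difference == 0
df = n - 1
print(f"mean difference = {mean_diff:.3f}, t = {t_stat:.3f}, df = {df}")
```

Because only the differences enter the calculation, variation between subjects cancels out, which is what makes the paired design more sensitive than an independent two-sample test on the same numbers.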

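Example: check whether the petal lengths of Iris-setosa and Iris-versicolor differ. Note that these are measurements of different flowers, so the samples are paired only by row order; the example illustrates the mechanics of the test.

$H_0$: The mean difference between the two petal lengths is zero;

$H_1$: The mean difference between the two petal lengths is not zero.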

In [None]:
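from scipy import stats

# Select the petal lengths of each species (50 rows each, so the samples align)
setosa = data.loc[data['Species'] == 'Iris-setosa', 'PetalLengthCm']
versicolor = data.loc[data['Species'] == 'Iris-versicolor', 'PetalLengthCm']
ttest, pval = stats.ttest_rel(setosa, versicolor)
print(pval)
if pval < 0.05:
    print("reject null hypothesis")
else:
    print("fail to reject null hypothesis")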

**According to the test result, the null hypothesis is rejected: the petal lengths of Iris-setosa and Iris-versicolor differ significantly.**

## Z test
You would use a Z test if:
* Your sample size is greater than 30. Otherwise, use a t test.
* Data points should be independent from each other. In other words, one data point isn’t related or doesn’t affect another data point.
* Your data should be normally distributed. However, for large sample sizes (over 30) this doesn’t always matter.
* Your data should be randomly selected from a population, where each item has an equal chance of being selected.
* Sample sizes should be equal if at all possible. 
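
To make the mechanics concrete, here is a minimal standard-library sketch of a one-sample Z test. The helper `one_sample_z` and the generated `sample` are hypothetical; the sketch estimates the standard deviation from the sample, which I believe is also what `statsmodels` does in `ztest`, and takes the p-value from the standard normal distribution.

```python
import math
from statistics import NormalDist, mean, stdev

def one_sample_z(sample, mu0):
    """Return the z statistic and two-sided p-value for H0: population mean == mu0."""
    n = len(sample)
    z = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
    # Two-sided p-value: probability of a standard normal value at least as extreme as |z|
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical sample of 40 values scattered around 1.3 (illustrative only).
sample = [1.3 + 0.01 * (i % 9 - 4) for i in range(40)]
z_stat, p_value = one_sample_z(sample, mu0=1.3)
print(f"z = {z_stat:.3f}, p = {p_value:.3f}")
```

The only difference from the one-sample t-test is the reference distribution: the normal distribution replaces the t distribution, which is a reasonable approximation once the sample is large.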

### a. One-sample Z test
The example here is the same as in the one-sample t-test section.

In [None]:
from scipy import stats
from statsmodels.stats import weightstats as stests

# Test H0: the population mean petal length equals 1.3
ztest, pval = stests.ztest(data['PetalLengthCm'], x2=None, value=1.3)
print(float(pval))
if pval < 0.05:
    print("reject null hypothesis")
else:
    print("fail to reject null hypothesis")

**According to the test result, the null hypothesis is rejected: the data give strong evidence that the average petal length is not 1.3 cm.**

### b. Two-sample Z test
The example here takes two independent groups of data and checks whether their means are equal.

$H_0$: the difference between the mean petal length and the mean sepal length is 0;

$H_1$: the difference between the mean petal length and the mean sepal length is not 0.

In [None]:
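# Test H0: mean(PetalLengthCm) - mean(SepalLengthCm) == 0
ztest, pval1 = stests.ztest(data['PetalLengthCm'], x2=data['SepalLengthCm'], value=0, alternative='two-sided')
print(float(pval1))
if pval1 < 0.05:    # use this test's p-value, not pval from the previous cell
    print("reject null hypothesis")
else:
    print("fail to reject null hypothesis")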

**According to the test result, the null hypothesis is rejected: the difference between the mean petal length and the mean sepal length is not zero.**

## ANOVA (F-TEST)
We could carry out a separate t-test for each pair of groups, but when you conduct many tests you increase the chances of false positives. The analysis of variance or ANOVA is a statistical inference test that lets you compare multiple groups at the same time.
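
The inflation of false positives can be quantified with a short calculation. The sketch below assumes the pairwise tests are independent, which is a simplification, and uses the three iris species as the example group count.

```python
from math import comb

alpha = 0.05   # per-test significance level
groups = 3     # e.g. the three iris species

pairs = comb(groups, 2)                  # number of pairwise comparisons
# P(at least one false positive), assuming the tests are independent
familywise = 1 - (1 - alpha) ** pairs

print(f"{pairs} pairwise tests -> family-wise error rate = {familywise:.4f}")
# prints: 3 pairwise tests -> family-wise error rate = 0.1426
```

Even with only three groups, the chance of at least one spurious rejection rises from 5% to about 14%, and it grows quickly as more groups are added.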

### One-Way F-test (ANOVA)
It tells whether two or more groups have the same mean, based on the F statistic: the ratio of the variation between group means to the variation within groups.
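
As a rough sketch of how the F statistic is built, the between-group and within-group sums of squares can be computed by hand. The `groups` dictionary below holds hypothetical values for illustration; `stats.f_oneway` additionally turns the statistic into a p-value using the F distribution with `(k - 1, n - k)` degrees of freedom.

```python
import statistics

# Hypothetical measurements for three groups (illustrative only).
groups = {
    "a": [5.1, 4.9, 4.7, 5.0],
    "b": [7.0, 6.4, 6.9, 5.5],
    "c": [6.3, 5.8, 7.1, 6.5],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = statistics.mean(all_values)
k = len(groups)        # number of groups
n = len(all_values)    # total number of observations

# Between-group sum of squares: how far each group mean is from the grand mean.
ss_between = sum(len(v) * (statistics.mean(v) - grand_mean) ** 2 for v in groups.values())
# Within-group sum of squares: spread of observations around their own group mean.
ss_within = sum((x - statistics.mean(v)) ** 2 for v in groups.values() for x in v)

# F is the ratio of the two mean squares.
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f"F({k - 1}, {n - k}) = {f_stat:.3f}")
```

A large F means the group means are spread out much more than the noise within each group would explain.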

Example: there are 3 species of iris, and I want to check whether the mean sepal length is the same across all 3 groups.

$H_0$: The mean sepal length is the same for all three species;

$H_1$: At least one species has a different mean sepal length.

In [None]:
df_anova = data[['SepalLengthCm','Species']]
grps = pd.unique(df_anova.Species.values)
# Map each species name to its sepal length values
d_data = {grp: df_anova['SepalLengthCm'][df_anova.Species == grp] for grp in grps}

F, p = stats.f_oneway(d_data['Iris-setosa'], d_data['Iris-versicolor'], d_data['Iris-virginica'])
print("p-value for significance is: ", p)
if p < 0.05:
    print("reject null hypothesis")
else:
    print("fail to reject null hypothesis")

**According to the test result, the null hypothesis is rejected: the mean sepal length is not the same across the three species.**

### Two-Way F-test
The two-way F-test is an extension of the one-way F-test. It is used when we have 2 independent variables (factors) and 2 or more groups. The two-way F-test does not tell which variable is dominant; if individual significance needs to be checked, post-hoc testing must be performed.

In [None]:
import statsmodels.api as sm
from statsmodels.formula.api import ols
import warnings

warnings.filterwarnings('ignore')
# C(...) treats each distinct measured value as a category level;
# '*' includes both main effects and their interaction
model = ols('PetalLengthCm ~ C(SepalLengthCm)*C(SepalWidthCm)', data).fit()
print(f"Overall model F({model.df_model: .0f},{model.df_resid: .0f}) = {model.fvalue: .3f}, p = {model.f_pvalue: .4f}")
# Type 2 ANOVA table for the fitted model
res = sm.stats.anova_lm(model, typ=2)
res

## Chi-Square Test
The test is applicable when you have two **categorical** variables from a single population. It is used to determine whether there is a significant association between the two variables.
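
The core of the test is comparing observed counts with the counts expected if the two variables were independent. The sketch below works through a hypothetical 2x2 table by hand. It omits the continuity correction that `scipy.stats.chi2_contingency` applies by default to 2x2 tables, so its numbers would differ slightly from SciPy's on the same input.

```python
import math

# Hypothetical 2x2 contingency table of observed counts (illustrative only).
observed = [
    [30, 10],
    [20, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
dof = (len(row_totals) - 1) * (len(col_totals) - 1)

# For dof == 1 only, the chi-square tail probability has a closed form via erfc.
p_value = math.erfc(math.sqrt(chi_square / 2))
print(f"chi-square = {chi_square:.3f}, dof = {dof}, p = {p_value:.5f}")
```

For tables with more than one degree of freedom the closed form no longer applies, and the tail probability should come from `scipy.stats.chi2` instead.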

Here I am going to use another dataset (women's international football match results), which contains enough categorical features. I am going to check the relationship between the tournament a game belongs to and whether it was played at a neutral venue.

In [None]:
from scipy.stats import chi2

df_chi = pd.read_csv('../input/womens-international-football-results/results.csv')
contingency_table = pd.crosstab(df_chi["tournament"], df_chi["neutral"])
print('contingency_table :-\n', contingency_table)

# Observed values
Observed_Values = contingency_table.values
print("Observed Values :-\n", Observed_Values)

# chi2_contingency returns (statistic, p-value, dof, expected frequencies)
b = stats.chi2_contingency(contingency_table)
Expected_Values = b[3]
print("Expected Values :-\n", Expected_Values)

# Degrees of freedom from the full table shape, not just its first two rows/columns
no_of_rows, no_of_columns = contingency_table.shape
ddof = (no_of_rows - 1) * (no_of_columns - 1)
alpha = 0.05

# Pearson chi-square statistic summed over every cell of the table
chi_square_statistic = ((Observed_Values - Expected_Values) ** 2 / Expected_Values).sum()
critical_value = chi2.ppf(q=1 - alpha, df=ddof)
# p-value: upper-tail probability of the chi-square distribution
p_value = chi2.sf(chi_square_statistic, df=ddof)

print('Significance level: ', alpha)
print('Degree of Freedom: ', ddof)
print('chi-square statistic:', chi_square_statistic)
print('critical_value:', critical_value)
print('p-value:', p_value)

# The critical-value rule and the p-value rule always agree, so one check is enough
if p_value <= alpha:
    print("Reject H0, there is a relationship between the 2 categorical variables")
else:
    print("Retain H0, no relationship between the 2 categorical variables was detected")

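**Interpreting the test result: if the printed p-value is at or below 0.05, $H_0$ (independence) is rejected and the two categorical features are associated; otherwise there is not sufficient evidence to conclude that they are associated.**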