# t-test, ANOVA and chi-squared

In this script we discuss what the t-test, ANOVA and the chi-squared test are, why they are used, and how to apply them effectively.

##### Hypothesis testing
Hypothesis testing is a statistical procedure for checking whether the data support a hypothesis (an "educated guess"). For example, suppose we have a dataset with 10 variables. If, after looking at some plots or distributions, we guess that two of the variables are strongly correlated, that guess is a hypothesis. The statistical procedures used to check the validity of this hypothesis are called hypothesis tests. The t-test, ANOVA, the chi-squared test and correlation tests are some of the common hypothesis testing techniques.

##### t-test
The t-test checks whether the means of two groups are reliably different from each other. It computes the t-value, which is the ratio of the difference between the two group means to the variability within the groups (the standard error of that difference). If the absolute t-value is not large enough, we do not have enough evidence to say that the two groups differ.
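As a rough illustration of that ratio, the sketch below computes the pooled-variance t-value by hand and compares it with `scipy.stats.ttest_ind`. The two arrays are made-up numbers chosen for the example, not values from any real dataset.

```python
import numpy as np
from scipy import stats

# Made-up scores for two independent groups (illustrative values only)
group_a = np.array([4.1, 3.8, 4.5, 4.0, 4.3, 3.9])
group_b = np.array([3.6, 3.9, 3.5, 3.7, 3.4, 3.8])

n_a, n_b = len(group_a), len(group_b)

# Pooled sample variance: the variability within the groups
pooled_var = ((n_a - 1) * group_a.var(ddof=1) +
              (n_b - 1) * group_b.var(ddof=1)) / (n_a + n_b - 2)

# Standard error of the difference between the two means
std_err = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))

# t-value: difference between the means divided by its standard error
t_manual = (group_a.mean() - group_b.mean()) / std_err

# Same statistic from SciPy (equal_var=True selects the pooled-variance test)
t_scipy, p_scipy = stats.ttest_ind(group_a, group_b, equal_var=True)

print(t_manual, t_scipy)
```

The two printed values should agree up to floating-point error.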


The p-value is the probability of observing a pattern at least as extreme as the one in our sample if the null hypothesis were true, i.e. if only random chance were at work. We calculate the p-value from the corresponding t-value. If the p-value is low, a pattern like the one we found would rarely be produced by chance alone.
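A minimal sketch of how a p-value can be obtained from a t-value, assuming a two-sided, pooled-variance test. The data are made up, and the degrees of freedom follow the usual `n_a + n_b - 2` rule for that test.

```python
import numpy as np
from scipy import stats

# Made-up scores for two independent groups (illustrative values only)
group_a = np.array([4.1, 3.8, 4.5, 4.0, 4.3, 3.9])
group_b = np.array([3.6, 3.9, 3.5, 3.7, 3.4, 3.8])

t_value, p_scipy = stats.ttest_ind(group_a, group_b, equal_var=True)

# Degrees of freedom for the pooled-variance t-test
dof = len(group_a) + len(group_b) - 2

# Two-sided p-value: probability, under H0, of a |t| at least this large
p_manual = 2 * stats.t.sf(abs(t_value), dof)

print(p_manual, p_scipy)
```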

##### ANOVA  (Analysis of Variance) - F score

ANOVA is a statistical method that splits the observed variance in the data into different components (such as the variance between groups and the variance within groups), which are then used for further tests.

- A one-way ANOVA is typically used to compare the means of three or more groups defined by a single independent variable, to gain information about the relationship between the dependent and independent variables.

- A two-way ANOVA tests the effect of two independent variables on a dependent variable. It analyzes the effect of each independent variable on the outcome as well as the interaction between the two.

Here we calculate the F-score as the ratio of the between-group variance to the within-group variance. If the F-score is large enough to fall within the rejection region, we reject the null hypothesis.
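The F-score for a one-way ANOVA can be sketched by hand as follows. The three groups are made-up values for illustration only; the result is compared against `scipy.stats.f_oneway`.

```python
import numpy as np
from scipy import stats

# Three made-up groups (illustrative values only)
groups = [
    np.array([0.5, 0.8, 0.3, 0.9, 0.6]),
    np.array([0.1, -0.2, 0.2, 0.0, 0.3]),
    np.array([-0.4, -0.7, -0.1, -0.9, -0.5]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)        # number of groups
n = len(all_values)    # total number of observations

# Between-group sum of squares: how far each group mean is from the grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: spread of the values around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F-score: ratio of the two mean squares
f_manual = (ss_between / (k - 1)) / (ss_within / (n - k))

# p-value: upper tail of the F distribution with (k - 1, n - k) degrees of freedom
p_manual = stats.f.sf(f_manual, k - 1, n - k)

f_scipy, p_scipy = stats.f_oneway(*groups)
print(f_manual, f_scipy)
print(p_manual, p_scipy)
```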

##### Chi Squared Value

A chi-squared statistic is one way to show a relationship between two categorical variables. It is a single number that shows how much difference exists between the observed counts and the counts we would expect if there were no relationship at all in the population. A low chi-squared value means the observed counts are close to the expected counts, so there is little evidence of a relationship between the two variables; a high value suggests that the variables are related.

#### Which statistic to use?

This depends on two factors:
- Purpose: Is it a relationship problem (any connection or correlation) or a comparison problem (any difference)?
- Type of Data: Is the feature value categorical or numerical?

On the basis of these two factors, we choose the statistic as follows:

Comparison Problem + Categorical type Data -> Chi-Squared

Comparison Problem + Categorical AND Numeric type Data -> t-test or ANOVA

Relationship Problem + Numeric type Data -> Correlation
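The correlation case is not implemented later in this script, so here is a small sketch of it using `scipy.stats.pearsonr`. The variable names and values are hypothetical and only serve to show the call.

```python
import numpy as np
from scipy import stats

# Made-up paired numeric measurements (hypothetical example data)
hours_studied = np.array([1, 2, 3, 4, 5, 6, 7, 8])
exam_score = np.array([52, 55, 61, 60, 68, 71, 75, 80])

# Pearson correlation coefficient and the p-value for H0: no linear correlation
r_value, p_value = stats.pearsonr(hours_studied, exam_score)

print(r_value, p_value)
```

A coefficient close to +1 or -1 together with a low p-value suggests a strong linear relationship between the two numeric variables.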


In [1]:
# To see the implementation, let's first import the libraries and load the dataset

In [3]:
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
%matplotlib inline
from scipy import stats

In [11]:
rating = pd.read_csv('teachingratings.csv')
rating.head()

Unnamed: 0,minority,age,gender,credits,beauty,eval,division,native,tenure,students,allstudents,prof,PrimaryLast,vismin,female,single_credit,upper_division,English_speaker,tenured_prof
0,yes,36,female,more,0.289916,4.3,upper,yes,yes,24,43,1,0,1,1,0,1,1,1
1,yes,36,female,more,0.289916,3.7,upper,yes,yes,86,125,1,0,1,1,0,1,1,1
2,yes,36,female,more,0.289916,3.6,upper,yes,yes,76,125,1,0,1,1,0,1,1,1
3,yes,36,female,more,0.289916,4.4,upper,yes,yes,77,123,1,1,1,1,0,1,1,1
4,no,59,male,more,-0.737732,4.5,upper,yes,yes,17,20,2,0,0,0,0,1,1,1


In [6]:
# Now we will examine three different relationships and test them using the three techniques

##### t-test implementation

In [8]:
# Q1: Does gender affect teaching evaluation scores?
# Let's assume that gender does not affect the evaluation scores
# So, null hypothesis, H0 : There is no effect of gender on the evaluation scores
# And alternative hypothesis, H1 : There is an effect of gender on the evaluation scores

In [94]:
# The "gender" feature is categorical and "eval" (evaluation scores) is numerical
# So we can use either the t-test or ANOVA to evaluate the hypothesis
# Before running a t-test or ANOVA we first need a homogeneity of variance (HOV) test
# If the variances are not homogeneous, the standard (pooled-variance) t-test is not a reliable choice
# We will use Levene's test to check HOV
# If the p-value of Levene's test is greater than 0.05, then HOV is assumed to be present

In [14]:
stats.levene(rating[rating['gender'] == 'male']['eval'], rating[rating['gender'] == 'female']['eval'],
               center = 'mean')

LeveneResult(statistic=0.19032922435292574, pvalue=0.6628469836244741)

In [16]:
# The pvalue of Levene's test is 0.66 which is much greater than 0.05
# It shows the variance is homogeneous
# We can now perform t-test

In [17]:
stats.ttest_ind(rating[rating['gender'] == 'male']['eval'], rating[rating['gender'] == 'female']['eval'],
               equal_var = True)

Ttest_indResult(statistic=3.249937943510772, pvalue=0.0012387609449522217)

In [52]:
# The p_value of the t-test is lower than 0.05, so we can reject the null hypothesis
# i.e. gender has an effect on the evaluation score
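Had Levene's test rejected homogeneity of variance, one common option is Welch's t-test, which `stats.ttest_ind` runs when `equal_var=False`. The sketch below uses made-up groups with visibly different spreads (not the ratings data) and checks the statistic against the Welch formula.

```python
import numpy as np
from scipy import stats

# Made-up groups with clearly different spreads (illustrative values only)
narrow = np.array([4.0, 4.1, 3.9, 4.2, 4.0, 3.8, 4.1, 4.0])
wide = np.array([2.5, 5.0, 3.1, 4.8, 2.2, 4.9])

# Welch's t-test: does not assume equal variances
t_welch, p_welch = stats.ttest_ind(narrow, wide, equal_var=False)

# By hand: each group contributes its own variance to the standard error
std_err = np.sqrt(narrow.var(ddof=1) / len(narrow) + wide.var(ddof=1) / len(wide))
t_manual = (narrow.mean() - wide.mean()) / std_err

print(t_welch, t_manual, p_welch)
```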

##### ANOVA test implementation

In [93]:
# Q2: Is there any relationship between age and beauty score?
# Let's assume that age and beauty score are not related
# So, null hypothesis, H0 : There is no relation between age and beauty score
#                           (i.e. the three population means are equal)
# And alternative hypothesis, H1 : There is a relation between age and beauty score
#                           (i.e. at least one of the three population means differs)

In [87]:
# The "age" & "beauty" (beauty score) features are both numerical
# It is not very practical to look at the variability of beauty score for each unique age value
# So we will divide the "age" feature into three groups and treat it as categorical
# We can then use ANOVA to evaluate the hypothesis

In [81]:
# First create a column with three different age groups
# Note: both comparisons are strict, so an age of exactly 40 ends up in 'senior'

rating['age_category'] = rating['age'].apply( lambda x: 'young' if x<40 
                                             else ('middle_aged' if 40<x<57 else 'senior' ))
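An alternative way to build such groups is `pd.cut`, which makes the bin edges explicit. This is only a sketch on a small made-up series, and it assumes the intended groups are under 40, 40 to 56, and 57 or older; with `right=False` each bin includes its left edge, so an age of exactly 40 is counted as `middle_aged`.

```python
import numpy as np
import pandas as pd

# Made-up ages, including the boundary values 40 and 57
ages = pd.Series([35, 40, 56, 57, 70])

# Bins are [-inf, 40), [40, 57), [57, inf) because right=False
age_category = pd.cut(
    ages,
    bins=[-np.inf, 40, 57, np.inf],
    right=False,
    labels=['young', 'middle_aged', 'senior'],
)

print(age_category.tolist())
```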

In [79]:
# As with the t-test, we have to check the equality of variances

stats.levene(rating[rating['age_category']=='young']['beauty'],
                   rating[rating['age_category']=='middle_aged']['beauty'], 
                   rating[rating['age_category']=='senior']['beauty'], 
                   center='mean')

LeveneResult(statistic=6.9340131620932075, pvalue=0.0010791898776245528)

In [82]:
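# p_value is lower than 0.05, so the variances of the three groups are not homogeneous
# This violates an assumption of ANOVA, so the result below should be read with caution
# We still run the one-way ANOVA here to illustrate the procedure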

In [91]:
# Let's give aliases to those series from three different categories data

young = rating[rating['age_category']=='young']['beauty']
middle_aged = rating[rating['age_category']=='middle_aged']['beauty']
senior = rating[rating['age_category']=='senior']['beauty']


# Now running the ANOVA test

f_value_anova, p_value_anova = stats.f_oneway(young, middle_aged, senior)
print(f_value_anova, p_value_anova)

24.709209313848543 6.414064422300409e-11


In [88]:
# p_value is less than 0.05 which means that we can reject the null hypothesis
# So, there is a relation between age and beauty score (the group means are not all equal)
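Because Levene's test above returned a p-value of about 0.001, the equal-variance assumption behind the classic one-way ANOVA is doubtful for these groups. One commonly used rank-based alternative is the Kruskal-Wallis test, available as `stats.kruskal`. The sketch below runs it on made-up groups, not on the ratings data, so its numbers say nothing about the real result.

```python
import numpy as np
from scipy import stats

# Made-up beauty scores for three age groups (illustrative values only)
young = np.array([0.9, 0.4, 1.2, 0.1, 0.7])
middle_aged = np.array([0.0, -0.3, 0.3, -0.1, 0.2])
senior = np.array([-0.5, -0.9, -0.2, -1.4, -0.6])

# Kruskal-Wallis H-test: compares the groups using ranks instead of raw values
h_value, p_value = stats.kruskal(young, middle_aged, senior)

print(h_value, p_value)
```

On the real data, the same call could be made with the `young`, `middle_aged` and `senior` series defined earlier.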

##### Chi Squared Test implementation

In [95]:
# Q3: Is there any relationship between gender and tenure?
# Let's assume that gender and tenure are not related
# So, null hypothesis, H0 : There is no relation between gender and tenure
#                           (i.e. the tenure status is independent of gender)
# And alternative hypothesis, H1 : There is a relation between gender and tenure
#                           (i.e. the tenure status is dependent on gender)

In [96]:
# The "gender" & "tenure" features are both categorical 
# We can use chi-squared test to evaluate the hypothesis

In [98]:
# First we need to create a crosstable 

cross_table = pd.crosstab(rating['tenure'], rating['gender'])
cross_table

gender,female,male
tenure,,
no,50,52
yes,145,216


In [102]:
# Now running the chi squared test

stats.chi2_contingency(cross_table, correction = True)

(2.20678166999886,
 0.1374050603563787,
 1,
 array([[ 42.95896328,  59.04103672],
        [152.04103672, 208.95896328]]))

In [101]:
# It returns the 𝜒2 value, p_value, degrees of freedom and expected values respectively
# We found that the p_value is 0.137 which is higher than 0.05
# We cannot reject the null hypothesis
# So there is no strong evidence that tenure is dependent on gender
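The expected counts and the 𝜒2 value returned above can be reproduced by hand from the crosstable. The sketch below hard-codes the observed counts shown earlier and applies Yates' continuity correction, which is what `correction = True` does for a 2x2 table.

```python
import numpy as np
from scipy import stats

# Observed counts from the crosstable (rows: tenure no/yes, columns: female/male)
observed = np.array([[50, 52],
                     [145, 216]])

row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
grand_total = observed.sum()

# Expected count per cell if tenure and gender were independent
expected = row_totals * col_totals / grand_total

# Chi-squared with Yates' continuity correction (subtract 0.5 from each |O - E|)
chi2_manual = (((np.abs(observed - expected) - 0.5) ** 2) / expected).sum()

# Degrees of freedom: (rows - 1) * (columns - 1)
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_manual = stats.chi2.sf(chi2_manual, dof)

chi2_scipy, p_scipy, dof_scipy, expected_scipy = stats.chi2_contingency(
    observed, correction=True)

print(chi2_manual, p_manual)
print(expected)
```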