# Hypothesis Testing

A hypothesis test checks whether an observed difference, for example between two data sets, is statistically significant or could plausibly be due to chance.



## Overview

This module covers:

1) One sample and Two sample t-tests

2) ANOVA

3) Type I and Type II errors

4) Chi-Squared Tests

## Question 1 

*A student is trying to decide between two GPUs. He wants to use the GPU for his research to run deep learning algorithms, so the only thing he is concerned with is speed.*

*He picks a deep learning algorithm on a large data set and runs it on both GPUs 15 times, timing each run in hours. The results are given in the lists GPU1 and GPU2 below.*

In [1]:
from scipy import stats 
import numpy as np

In [2]:
GPU1 = np.array([11,9,10,11,10,12,9,11,12,9,11,12,9,10,9])
GPU2 = np.array([11,13,10,13,12,9,11,12,12,11,12,12,10,11,13])

#Assumption: Both samples (GPU1 & GPU2) are random, independent & drawn from normally distributed populations

Hint: You can use the ttest_1samp and ttest_ind functions from scipy.stats to perform t-tests

**First T test**

*One sample t-test*

Check whether the mean of GPU1 is equal to zero.
- Null hypothesis: the mean is equal to zero.
- Alternate hypothesis: the mean is not equal to zero.

In [10]:
x1=GPU1.mean()
print (x1)

10.333333333333334


In [19]:
#H0 states that mean=0. The sample mean of GPU1 (about 10.33) is far from 0, which already suggests that
#H0 is implausible; the 1-sample ttest checks this formally.
t_stat,p_value = stats.ttest_1samp(GPU1,0)
print ('The t_stat for 1-sample ttest on GPU1 is: %.2f'%t_stat)
print ('The p-value for 1-sample ttest on GPU1 is: %.10f'%p_value)

The t_stat for 1-sample ttest on GPU1 is: 34.06
The p-value for 1-sample ttest on GPU1 is: 0.0000000000


In [None]:
#As the NULL HYPOTHESIS is "mean=0" and the ALTERNATE HYPOTHESIS is "mean != 0", we have a 2-tailed test.
#At the 5% significance level (95% confidence level), the t_stat lies in the REJECTION REGION.
#Likewise, the p-value is effectively 0, which is far below ALPHA=0.05

In [None]:
#The 1-sample ttest therefore confirms what the sample mean of GPU1 already suggested: we should
#reject the NULL HYPOTHESIS
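
The decision can be cross-checked by computing the statistic by hand. The following is a minimal sketch, assuming the textbook formula t = (sample mean - mu0) / (s / sqrt(n)) with the sample standard deviation (ddof=1) and a two-tailed critical value taken from `stats.t.ppf`. The names `mu0`, `alpha`, `t_manual` and `t_crit` are introduced here for illustration only.

```python
import numpy as np
from scipy import stats

GPU1 = np.array([11,9,10,11,10,12,9,11,12,9,11,12,9,10,9])

mu0 = 0.0      # hypothesised mean under H0
alpha = 0.05   # assumed significance level (95% confidence level)

n = GPU1.size
# Standard error of the mean, using the sample standard deviation (ddof=1)
se = GPU1.std(ddof=1) / np.sqrt(n)
t_manual = (GPU1.mean() - mu0) / se

# Two-tailed critical value for n-1 degrees of freedom
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)

# Reference value from scipy for comparison
t_scipy, p_value = stats.ttest_1samp(GPU1, mu0)

print('manual t: %.2f, scipy t: %.2f' % (t_manual, t_scipy))
print('critical value: +-%.3f' % t_crit)
print('reject H0:', abs(t_manual) > t_crit)
```

With only 15 observations the critical value comes from the t distribution with 14 degrees of freedom rather than from the normal distribution.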

## Question 2

Given,

Null Hypothesis : There is no significant difference between data sets

Alternate Hypothesis : There is a significant difference

*Perform a two-sample t-test and decide whether or not to reject the null hypothesis.*

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html

In [14]:
x1=GPU1.mean()
x2=GPU2.mean()
print ('Sample mean for GPU1 is: %.2f'%x1)
print ('Sample mean for GPU2 is: %.2f'%x2)

Sample mean for GPU1 is: 10.33
Sample mean for GPU2 is: 11.47


In [20]:
tstat,p_value = stats.ttest_ind(GPU1,GPU2)
print ('The t_stat for 2-sample ttest on GPU1 & GPU2 is: %.2f'%tstat)
print ('The p-value for 2-sample ttest on GPU1 & GPU2 is: %.2f'%p_value)

The t_stat for 2-sample ttest on GPU1 & GPU2 is: -2.63
The p-value for 2-sample ttest on GPU1 & GPU2 is: 0.01


In [None]:
#With 15+15-2=28 degrees of freedom, the 2-tailed critical value at alpha=0.05 is about +-2.05, so the tstat
#value of -2.63 lies in the REJECTION REGION.
#Moreover, the p-value (0.01) is LESS THAN alpha=0.05 (95% confidence level).
#Thus we reject the NULL HYPOTHESIS.
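
To see where the statistic comes from, the sketch below recomputes it with the pooled-variance formula and also runs Welch's variant, which drops the equal-variance assumption. It assumes the default behaviour of `stats.ttest_ind` (equal variances, two-sided); the names `sp2`, `t_manual`, `t_welch` and `p_welch` are illustrative only.

```python
import numpy as np
from scipy import stats

GPU1 = np.array([11,9,10,11,10,12,9,11,12,9,11,12,9,10,9])
GPU2 = np.array([11,13,10,13,12,9,11,12,12,11,12,12,10,11,13])

n1, n2 = GPU1.size, GPU2.size
# Pooled variance: weighted average of the two sample variances
sp2 = ((n1 - 1) * GPU1.var(ddof=1) + (n2 - 1) * GPU2.var(ddof=1)) / (n1 + n2 - 2)
t_manual = (GPU1.mean() - GPU2.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))

# Default call: Student's t-test, which assumes equal variances
t_pooled, p_pooled = stats.ttest_ind(GPU1, GPU2)
# Welch's t-test: does not assume equal variances
t_welch, p_welch = stats.ttest_ind(GPU1, GPU2, equal_var=False)

print('manual t: %.2f, pooled t: %.2f, Welch t: %.2f' % (t_manual, t_pooled, t_welch))
print('pooled p: %.4f, Welch p: %.4f' % (p_pooled, p_welch))
```

Because both samples have the same size, the pooled and Welch statistics coincide here; only the degrees of freedom, and hence the p-value, can differ slightly.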

## Question 3

The student now tries a third GPU, GPU3.

In [21]:
GPU3 = np.array([9,10,9,11,10,13,12,9,12,12,13,12,13,10,11])

#Assumption: Both samples (GPU1 & GPU3) are random, independent & drawn from normally distributed populations

*Perform a two-sample t-test and check whether there is a significant difference between the speeds of the two GPUs, GPU1 and GPU3.*

#### Answer:

In [22]:
#Here the NULL HYPOTHESIS is that the means of the 2 GPUs are identical, i.e. u1=u3, while
#the ALTERNATE HYPOTHESIS is that the 2 means are not identical, i.e. u1!=u3
#Since the alternative allows a difference in either direction, we have to use a 2-TAIL TEST

x1=GPU1.mean()
x3=GPU3.mean()
print ('Sample mean for GPU1 is: %.2f'%x1)
print ('Sample mean for GPU3 is: %.2f'%x3)

tstat,p_value = stats.ttest_ind(GPU1,GPU3)
print ('The t_stat for 2-sample ttest on GPU1 & GPU3 is: %.2f'%tstat)
print ('The p-value for 2-sample ttest on GPU1 & GPU3 is: %.2f'%p_value)

Sample mean for GPU1 is: 10.33
Sample mean for GPU3 is: 11.07
The t_stat for 2-sample ttest on GPU1 & GPU3 is: -1.50
The p-value for 2-sample ttest on GPU1 & GPU3 is: 0.15


In [None]:
#As this is a 2-tail test (as mentioned earlier), we reject H0 only if |tstat| exceeds the critical value or,
#equivalently, if the p-value is below 0.05. With 15+15-2=28 degrees of freedom, the critical value at
#alpha=0.05 (95% confidence level) is about +-2.05

#As per the above data, |tstat| = 1.50 < 2.05, i.e. tstat does not lie in the REJECTION REGION.
#In addition, given that the p-value (0.15) > 0.05, we can not REJECT the NULL HYPOTHESIS
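
An equivalent way to read this result is through a confidence interval for the difference in means: if the interval contains 0, the null hypothesis is not rejected at the matching significance level. The sketch below assumes the same pooled-variance setup as the default `stats.ttest_ind` call; `alpha`, `t_crit`, `ci_low` and `ci_high` are illustrative names.

```python
import numpy as np
from scipy import stats

GPU1 = np.array([11,9,10,11,10,12,9,11,12,9,11,12,9,10,9])
GPU3 = np.array([9,10,9,11,10,13,12,9,12,12,13,12,13,10,11])

alpha = 0.05                      # assumed significance level
n1, n3 = GPU1.size, GPU3.size
df = n1 + n3 - 2                  # degrees of freedom of the pooled test

diff = GPU1.mean() - GPU3.mean()
# Pooled variance and standard error of the difference in means
sp2 = ((n1 - 1) * GPU1.var(ddof=1) + (n3 - 1) * GPU3.var(ddof=1)) / df
se = np.sqrt(sp2 * (1 / n1 + 1 / n3))

# Two-tailed critical value from the t distribution
t_crit = stats.t.ppf(1 - alpha / 2, df)
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se

print('critical value: +-%.3f' % t_crit)
print('95%% CI for mean(GPU1) - mean(GPU3): (%.2f, %.2f)' % (ci_low, ci_high))
print('interval contains 0:', ci_low < 0 < ci_high)
```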

## ANOVA

## Question 4 

If you need to compare more than two data sets at a time, then ANOVA is your best bet. 

*The results from three experiments with overlapping 95% confidence intervals are given below, and we want to confirm that the results for all three experiments are not significantly different.*

Before conducting ANOVA, check whether the assumption of equal variances holds (using Levene's test). If it does not, state that the result of ANOVA cannot be relied upon.

In [24]:
import numpy as np

e1 = np.array([1.595440,1.419730,0.000000,0.000000])
e2 = np.array([1.433800,2.079700,0.892139,2.384740])
e3 = np.array([0.036930,0.938018,0.995956,1.006970])

#Assumption: All 3 samples (e1, e2 & e3) are random, independent & drawn from normally distributed populations

Perform Levene's test on the data

The Levene test tests the null hypothesis that all input samples are from populations with equal variances. Levene’s test is an alternative to Bartlett’s test in the case where there are significant deviations from normality.

source: scipy.org

#### Answer:

In [25]:
stats.levene(e1,e2,e3)

LeveneResult(statistic=2.6741725711150446, pvalue=0.12259792666001798)

In [None]:
#As the p-value (0.12) > 0.05 (5% significance level), we CAN'T REJECT the NULL HYPOTHESIS, which states that
#"all input samples are from populations with equal variances". We can therefore proceed with ANOVA
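
Levene's statistic can be understood as a one-way ANOVA applied to the absolute deviations of each observation from its group centre. The sketch below illustrates this, assuming the median as the centre (`center='median'`, which is the scipy default); `abs_dev`, `f_stat` and `w_stat` are illustrative names.

```python
import numpy as np
from scipy import stats

e1 = np.array([1.595440,1.419730,0.000000,0.000000])
e2 = np.array([1.433800,2.079700,0.892139,2.384740])
e3 = np.array([0.036930,0.938018,0.995956,1.006970])

groups = [e1, e2, e3]
# Absolute deviation of every observation from its own group's median
abs_dev = [np.abs(g - np.median(g)) for g in groups]

# One-way ANOVA on the deviations reproduces the Levene statistic
f_stat, p_manual = stats.f_oneway(*abs_dev)
w_stat, p_levene = stats.levene(e1, e2, e3, center='median')

print('ANOVA on |x - median|: F=%.4f, p=%.4f' % (f_stat, p_manual))
print('stats.levene:          W=%.4f, p=%.4f' % (w_stat, p_levene))
```

Groups with larger spread have larger average deviations, so a difference in variances shows up as a difference in the means of these deviations.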

## Question 5

The one-way ANOVA tests the null hypothesis that two or more groups have the same population mean. The test is applied to samples from two or more groups, possibly with differing sizes.

Use the stats.f_oneway() function to perform the one-way ANOVA test.

In [28]:
print ('Sample mean for Experiment 1(e1) is: %.2f'%e1.mean())
print ('Sample mean for Experiment 2(e2) is: %.2f'%e2.mean())
print ('Sample mean for Experiment 3(e3) is: %.2f'%e3.mean())

Sample mean for Experiment 1(e1) is: 0.75
Sample mean for Experiment 2(e2) is: 1.70
Sample mean for Experiment 3(e3) is: 0.74


In [26]:
stats.f_oneway(e1,e2,e3)

F_onewayResult(statistic=2.51357622845924, pvalue=0.13574644501798466)

In [None]:
#As the p-value (0.14) > 0.05 (5% significance level), we CAN'T REJECT the NULL HYPOTHESIS that all three
#experiments have the same population mean, i.e. "results for all three experiments are not significantly different."
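
The F statistic is the ratio of between-group variability to within-group variability. The sketch below recomputes it from the sums of squares and compares it with `stats.f_oneway`; the names `ss_between`, `ss_within`, `f_manual` and `p_manual` are illustrative, and the p-value is taken from the upper tail of the F distribution via `stats.f.sf`.

```python
import numpy as np
from scipy import stats

e1 = np.array([1.595440,1.419730,0.000000,0.000000])
e2 = np.array([1.433800,2.079700,0.892139,2.384740])
e3 = np.array([0.036930,0.938018,0.995956,1.006970])

groups = [e1, e2, e3]
k = len(groups)                          # number of groups
n_total = sum(g.size for g in groups)    # total number of observations
grand_mean = np.concatenate(groups).mean()

# Variation of the group means around the grand mean
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
# Variation of the observations around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

f_manual = (ss_between / (k - 1)) / (ss_within / (n_total - k))
p_manual = stats.f.sf(f_manual, k - 1, n_total - k)

f_scipy, p_scipy = stats.f_oneway(*groups)
print('manual: F=%.4f, p=%.4f' % (f_manual, p_manual))
print('scipy:  F=%.4f, p=%.4f' % (f_scipy, p_scipy))
```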

## Question 6

*In one or two sentences, explain **Type I** and **Type II** errors.*

#### Answer:

1.  A Type-I error is rejecting the NULL HYPOTHESIS when it is actually TRUE; its probability is the significance level (alpha).
    A Type-II error is not rejecting the NULL HYPOTHESIS when it is actually FALSE; its probability is denoted beta.
2.  A Type-I error is also called a FALSE POSITIVE, while a Type-II error is called a FALSE NEGATIVE
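
Both error rates can be estimated by simulation. The sketch below is only an illustration under assumed settings: normally distributed samples of size 15, a standard deviation of 1.2, a true difference of 1.0 for the Type-II case, and a fixed random seed. None of these values come from the exercise.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
alpha = 0.05
n_trials, n = 2000, 15           # hypothetical simulation settings

# Case 1: H0 is TRUE (both groups share the same mean),
# so every rejection is a Type-I error (false positive)
a = rng.normal(10.0, 1.2, size=(n_trials, n))
b = rng.normal(10.0, 1.2, size=(n_trials, n))
_, p_null = stats.ttest_ind(a, b, axis=1)
type1_rate = np.mean(p_null < alpha)

# Case 2: H0 is FALSE (the means differ by 1.0),
# so every non-rejection is a Type-II error (false negative)
c = rng.normal(11.0, 1.2, size=(n_trials, n))
_, p_alt = stats.ttest_ind(a, c, axis=1)
type2_rate = np.mean(p_alt >= alpha)

print('estimated Type-I error rate:  %.3f' % type1_rate)
print('estimated Type-II error rate: %.3f' % type2_rate)
```

The Type-I rate should land close to alpha, while the Type-II rate depends on the sample size and on how large the true difference is.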

## Question 7 

You are a manager of a Chinese restaurant. You want to determine whether the waiting time to place an order has changed in the past month from its previous population mean value of 4.5 minutes. 
State the null and alternative hypotheses.

#### Answer:


NULL HYPOTHESIS (H0): The mean waiting time is EQUAL TO 4.5 minutes

ALTERNATE HYPOTHESIS (Ha): The mean waiting time is NOT EQUAL TO 4.5 minutes (the question asks whether it has changed in either direction, so this is a 2-tailed test)
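
Since the question asks whether the waiting time has changed, in either direction, a two-sided one-sample t-test against 4.5 minutes is a natural way to check it once data are collected. The sketch below uses simulated, purely hypothetical waiting times, because the exercise provides none; the sample size, mean, spread and seed are all assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# HYPOTHETICAL data: 25 simulated waiting times in minutes (not from the exercise)
waiting_times = rng.normal(loc=5.0, scale=1.5, size=25)

mu0 = 4.5      # previous population mean stated in the question
alpha = 0.05   # assumed significance level

# ttest_1samp is two-sided by default, matching a "has it changed?" question
t_stat, p_value = stats.ttest_1samp(waiting_times, mu0)

print('sample mean: %.2f minutes' % waiting_times.mean())
print('t_stat: %.2f, p-value: %.4f' % (t_stat, p_value))
print('reject H0:', p_value < alpha)
```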