# Hypothesis Testing

The purpose of a hypothesis test is to decide whether an observed difference between two data sets is statistically significant or could plausibly be due to chance.



## Overview

This module covers:

1) One-sample and two-sample t-tests

2) ANOVA

3) Type I and Type II errors

4) Chi-squared tests

## Question 1 

*A student is trying to decide between two GPUs. He wants to use the GPU to run deep learning algorithms for his research, so the only thing he is concerned with is speed.*

*He picks a deep learning algorithm and a large data set and runs it on both GPUs 15 times, timing each run in hours. The results are given in the lists GPU1 and GPU2 below.*

In [65]:
from scipy import stats 
import numpy as np

In [4]:
GPU1 = np.array([11,9,10,11,10,12,9,11,12,9,11,12,9,10,9])
GPU2 = np.array([11,13,10,13,12,9,11,12,12,11,12,12,10,11,13])

#Assumption: Both the datasets (GPU1 & GPU 2) are random, independent, parametric & normally distributed

Hint: You can import the t-test functions (for example, ttest_1samp and ttest_ind) from scipy.stats to perform t-tests.

**First T test**

*One sample t-test*

Check whether the mean of GPU1 is equal to zero.
- The null hypothesis is that the mean is equal to zero.
- The alternate hypothesis is that the mean is not equal to zero.

In [66]:
GPU1.mean()

10.333333333333334
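The mean on its own does not test the hypothesis. A minimal sketch of the one-sample t-test itself, using `scipy.stats.ttest_1samp` with a hypothesized mean of 0, might look like the following. The threshold `alpha = 0.05` is an assumption; the question does not specify a significance level.

```python
import numpy as np
from scipy.stats import ttest_1samp

GPU1 = np.array([11, 9, 10, 11, 10, 12, 9, 11, 12, 9, 11, 12, 9, 10, 9])

# H0: the mean run time of GPU1 is 0; H1: the mean is not 0
t_statistic, p_value = ttest_1samp(GPU1, popmean=0)
print(t_statistic, p_value)

alpha = 0.05  # assumed significance level, not given in the question
if p_value < alpha:
    print("Reject the null hypothesis: the mean differs from zero")
else:
    print("Fail to reject the null hypothesis")
```

Since every run time is far above zero, a very large t statistic and a tiny p-value are expected here.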

## Question 2

Given,

Null hypothesis: There is no significant difference between the two data sets.

Alternate hypothesis: There is a significant difference between the two data sets.

*Perform a two-sample t-test and decide whether to reject the null hypothesis.*

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html

In [67]:
import numpy as np
from scipy.stats import ttest_1samp, ttest_ind

# Independent two-sample t-test (assumes equal variances by default)
t_statistic, p_value = ttest_ind(GPU1, GPU2)
print(t_statistic, p_value)

-2.627629513471839 0.013794282041452725
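The printed p-value still has to be compared with a significance level to answer the question. A small sketch of that decision step follows, assuming `alpha = 0.05` (the question does not state a threshold, so this value is an assumption):

```python
import numpy as np
from scipy.stats import ttest_ind

GPU1 = np.array([11, 9, 10, 11, 10, 12, 9, 11, 12, 9, 11, 12, 9, 10, 9])
GPU2 = np.array([11, 13, 10, 13, 12, 9, 11, 12, 12, 11, 12, 12, 10, 11, 13])

alpha = 0.05  # assumed significance level
t_statistic, p_value = ttest_ind(GPU1, GPU2)

# Reject H0 when the p-value falls below the chosen threshold
reject_null = p_value < alpha
print(f"mean GPU1 = {GPU1.mean():.3f}, mean GPU2 = {GPU2.mean():.3f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

With p ≈ 0.0138 below 0.05, the null hypothesis is rejected at that level. The negative t statistic reflects that GPU1 has the lower mean run time.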


## Question 3

He is also trying a third GPU, GPU3.

In [68]:
GPU3 = np.array([9,10,9,11,10,13,12,9,12,12,13,12,13,10,11])

#Assumption: Both the datasets (GPU1 & GPU 3) are random, independent, parametric & normally distributed

*Perform a two-sample t-test and check whether there is a significant difference between the speeds of GPU1 and GPU3.*

#### Answer:

In [69]:
t_statistic, p_value = ttest_ind(GPU1, GPU3)
print(t_statistic, p_value)

-1.4988943759093303 0.14509210993138993
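With p ≈ 0.145, the test does not reject the null hypothesis at the commonly used 0.05 level (an assumed threshold). Because `ttest_ind` pools the variances by default, one possible robustness check is to test the equal-variance assumption with `scipy.stats.levene` and to rerun the comparison as Welch's t-test (`equal_var=False`). This is a sketch of such a check, not part of the original answer:

```python
import numpy as np
from scipy.stats import levene, ttest_ind

GPU1 = np.array([11, 9, 10, 11, 10, 12, 9, 11, 12, 9, 11, 12, 9, 10, 9])
GPU3 = np.array([9, 10, 9, 11, 10, 13, 12, 9, 12, 12, 13, 12, 13, 10, 11])

# Levene's test: H0 is that the two samples have equal variances
levene_stat, levene_p = levene(GPU1, GPU3)
print("Levene:", levene_stat, levene_p)

# Pooled-variance t-test (the default) and Welch's t-test (no equal-variance assumption)
pooled_t, pooled_p = ttest_ind(GPU1, GPU3)
welch_t, welch_p = ttest_ind(GPU1, GPU3, equal_var=False)
print("Pooled:", pooled_t, pooled_p)
print("Welch: ", welch_t, welch_p)
```

Because both samples have 15 observations, the two t statistics coincide and only the degrees of freedom, and therefore the p-value, can differ.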


## Question 6

*In one or two sentences, explain **Type I** and **Type II** errors.*

#### Answer:

- A Type I error is the rejection of a true null hypothesis: a "false positive" finding or conclusion.
- A Type II error is the non-rejection of a false null hypothesis: a "false negative" finding or conclusion.
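Both error rates can be estimated by simulation. The sketch below uses made-up settings (normal data, 15 observations per group, a true mean shift of 0.5, `alpha = 0.05`), all of which are illustrative assumptions rather than values from this module:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
alpha = 0.05          # assumed significance level
n_sims, n = 2000, 15  # number of simulated experiments and sample size per group

# Type I error: both groups come from the same distribution, so H0 is true
a = rng.normal(loc=10, scale=1, size=(n_sims, n))
b = rng.normal(loc=10, scale=1, size=(n_sims, n))
_, p_null = ttest_ind(a, b, axis=1)
type1_rate = np.mean(p_null < alpha)   # fraction of false positives

# Type II error: the second group's mean is shifted by 0.5, so H0 is false
c = rng.normal(loc=10.5, scale=1, size=(n_sims, n))
_, p_alt = ttest_ind(a, c, axis=1)
type2_rate = np.mean(p_alt >= alpha)   # fraction of false negatives

print("Estimated Type I error rate: ", type1_rate)
print("Estimated Type II error rate:", type2_rate)
```

The estimated Type I rate should land close to `alpha`, while the Type II rate depends on the sample size and on how large the true difference is.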

## Question 7 

You are the manager of a Chinese restaurant. You want to determine whether the waiting time to place an order has changed in the past month from its previous population mean value of 4.5 minutes.
State the null and alternative hypotheses.

#### Answer:


- The null hypothesis is that the population mean has not changed from its previous value of 4.5 minutes, i.e., H0: µ = 4.5.
- The alternative hypothesis is that the population mean is not 4.5 minutes, i.e., H1: µ ≠ 4.5.
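These hypotheses correspond to a two-sided one-sample t-test against 4.5. The question provides no data, so the sketch below runs the test on simulated waiting times. The sample (`wait_times`, its size, and its distribution) is entirely hypothetical and only illustrates the mechanics:

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(42)

# Hypothetical data: 30 simulated waiting times in minutes (not real measurements)
wait_times = rng.normal(loc=5.0, scale=1.2, size=30)

# H0: mu = 4.5, H1: mu != 4.5 (two-sided by default)
t_statistic, p_value = ttest_1samp(wait_times, popmean=4.5)
print(t_statistic, p_value)

alpha = 0.05  # assumed significance level
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```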

## Chi-squared test

## Question 8

Let's create a small dataset for dice rolls of four players

In [70]:
import numpy as np

d1 = [5, 8, 3, 8]
d2 = [9, 6, 8, 5]
d3 = [8, 12, 7, 2]
d4 = [4, 16, 7, 3]
d5 = [3, 9, 6, 5]
d6 = [7, 2, 5, 7]

dice = np.array([d1, d2, d3, d4, d5, d6])

Run the test using the scipy.stats library.

Depending on the test, we are generally looking for a threshold at either 0.05 or 0.01. Our test is significant (i.e. we reject the null hypothesis) if we get a p-value below our threshold.

For our purposes, we’ll use 0.01 as the threshold.

Use the stats.chi2_contingency() function.

This function computes the chi-square statistic and p-value for the hypothesis test of independence of the observed frequencies in the contingency table.

Print the following:

- chi2 stat
- p-value
- degrees of freedom
- contingency (the table of expected frequencies)



In [71]:
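from scipy import stats

# chi2_contingency returns the statistic, the p-value, the degrees of freedom,
# and the table of expected frequencies under independence
chi2_stat, p_val, dof, ex = stats.chi2_contingency(dice)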
print("chi2 Stat")
print(chi2_stat)
print("\n")

print("p-Value")
print(p_val)
print("\n")

print("degree of freedom")
print(dof)
print("\n")

print("contingency")
print(ex)

chi2 Stat
23.315671914716496


p-Value
0.07766367301496693


degree of freedom
15


contingency
[[ 5.57419355  8.20645161  5.57419355  4.64516129]
 [ 6.50322581  9.57419355  6.50322581  5.41935484]
 [ 6.73548387  9.91612903  6.73548387  5.61290323]
 [ 6.96774194 10.25806452  6.96774194  5.80645161]
 [ 5.34193548  7.86451613  5.34193548  4.4516129 ]
 [ 4.87741935  7.18064516  4.87741935  4.06451613]]
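The values printed above can be reproduced by hand, which makes the test easier to follow: each expected frequency is (row total × column total) / grand total, the statistic sums (observed − expected)² / expected, and the degrees of freedom are (rows − 1) × (columns − 1). The sketch below does this and compares the p-value with the 0.01 threshold chosen earlier. The variable names ending in `_manual` are introduced here for illustration.

```python
import numpy as np
from scipy import stats

dice = np.array([[5, 8, 3, 8],
                 [9, 6, 8, 5],
                 [8, 12, 7, 2],
                 [4, 16, 7, 3],
                 [3, 9, 6, 5],
                 [7, 2, 5, 7]])

# Expected frequencies under independence
row_totals = dice.sum(axis=1, keepdims=True)   # shape (6, 1)
col_totals = dice.sum(axis=0, keepdims=True)   # shape (1, 4)
expected = row_totals * col_totals / dice.sum()

# Chi-square statistic, degrees of freedom, and upper-tail p-value
chi2_manual = ((dice - expected) ** 2 / expected).sum()
dof_manual = (dice.shape[0] - 1) * (dice.shape[1] - 1)
p_manual = stats.chi2.sf(chi2_manual, dof_manual)

# Compare with SciPy's result
chi2_stat, p_val, dof, ex = stats.chi2_contingency(dice)
print(chi2_manual, chi2_stat)
print(p_manual, p_val)

alpha = 0.01  # threshold chosen in the text above
print("Reject H0" if p_manual < alpha else "Fail to reject H0")
```

Since the p-value of about 0.078 is above 0.01, the null hypothesis of independence is not rejected at this threshold.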


## Question 9

### Z-test

Compute z-scores for the dice data above using the stats.zscore function from SciPy. Convert the z-score values to p-values and take the mean of the array.

In [72]:
print("\nZ-score for dice : \n", stats.zscore(dice, axis=0))


Z-score for dice : 
 [[-0.46291005 -0.18884739 -1.83711731  1.44115338]
 [ 1.38873015 -0.64208114  1.22474487  0.        ]
 [ 0.9258201   0.7176201   0.61237244 -1.44115338]
 [-0.9258201   1.62408759  0.61237244 -0.96076892]
 [-1.38873015  0.03776948  0.          0.        ]
 [ 0.46291005 -1.54854863 -0.61237244  0.96076892]]


In [77]:
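import scipy.special

# Note: ndtr (the standard normal CDF) is applied here to the raw counts in dice,
# not to the z-scores computed above, so these are upper-tail probabilities of the raw counts
p_values = 1 - scipy.special.ndtr(dice)
p_values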

array([[2.86651572e-07, 6.66133815e-16, 1.34989803e-03, 6.66133815e-16],
       [0.00000000e+00, 9.86587700e-10, 6.66133815e-16, 2.86651572e-07],
       [6.66133815e-16, 0.00000000e+00, 1.27986510e-12, 2.27501319e-02],
       [3.16712418e-05, 0.00000000e+00, 1.27986510e-12, 1.34989803e-03],
       [1.34989803e-03, 0.00000000e+00, 9.86587700e-10, 2.86651572e-07],
       [1.27986510e-12, 2.27501319e-02, 2.86651572e-07, 1.27986510e-12]])

In [78]:
p_values.mean()

0.002065949075736128
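To follow the question literally, the conversion can be applied to the z-scores rather than to the raw counts. The sketch below does that with two-sided p-values, P(|Z| ≥ |z|). The two-sided convention is an assumption, since the question does not say which tail to use.

```python
import numpy as np
from scipy import stats

dice = np.array([[5, 8, 3, 8],
                 [9, 6, 8, 5],
                 [8, 12, 7, 2],
                 [4, 16, 7, 3],
                 [3, 9, 6, 5],
                 [7, 2, 5, 7]])

# Standardize each column (axis=0), as in the z-score cell above
z_scores = stats.zscore(dice, axis=0)

# Two-sided p-value for each z-score: 2 * P(Z >= |z|)
p_values = 2 * stats.norm.sf(np.abs(z_scores))
mean_p = p_values.mean()

print(p_values)
print(mean_p)
```

A z-score of exactly 0 maps to a p-value of 1, and larger absolute z-scores map to smaller p-values.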

## Question 10

A paired-sample t-test compares means from the same group at different times.

The basic two sample t-test is designed for testing differences between independent groups. 
In some cases, you might be interested in testing differences between samples of the same group at different points in time. 
We can conduct a paired t-test using the scipy function stats.ttest_rel(). 

In [79]:
before = stats.norm.rvs(scale=30, loc=100, size=500)  # 500 draws from a normal distribution with mean 100 and std 30
after = before + stats.norm.rvs(scale=5, loc=-1.25, size=500)  # add a change with mean -1.25 and std 5

Test whether a weight-loss drug works by comparing the weights of the same group of patients before and after treatment, using the data above.

In [83]:
import pandas as pd

np.random.seed(11)  # fix the seed so the simulated weights are reproducible

before = stats.norm.rvs(scale=30, loc=100, size=500)

after = before + stats.norm.rvs(scale=5, loc=-1.25, size=500)

weight_df = pd.DataFrame({"weight_before": before,
                          "weight_after": after,
                          "weight_change": after - before})

weight_df.describe()



| | weight_before | weight_after | weight_change |
|---|---|---|---|
| count | 500.0 | 500.0 | 500.0 |
| mean | 99.010086 | 97.855418 | -1.154668 |
| std | 29.388198 | 29.82296 | 5.184758 |
| min | -3.808483 | -6.195302 | -14.959653 |
| 25% | 79.423261 | 77.896998 | -4.483683 |
| 50% | 99.143563 | 97.987216 | -1.040709 |
| 75% | 118.482832 | 117.424764 | 2.216117 |
| max | 172.625839 | 168.795242 | 15.323412 |


In [84]:
stats.ttest_rel(a = before,
                b = after)

Ttest_relResult(statistic=4.97982085404432, pvalue=8.783191375418934e-07)
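The p-value above is far below 0.05 (an assumed threshold), so the test indicates a significant change in weight between the two measurements. One way to see what `ttest_rel` does is to note that a paired t-test is equivalent to a one-sample t-test on the per-patient differences against a mean of zero. The sketch below checks this on the same simulated data:

```python
import numpy as np
from scipy import stats

np.random.seed(11)  # same seed as the cell above
before = stats.norm.rvs(scale=30, loc=100, size=500)
after = before + stats.norm.rvs(scale=5, loc=-1.25, size=500)

# Paired t-test on the two measurements
paired = stats.ttest_rel(before, after)

# Equivalent one-sample t-test on the differences, H0: mean difference = 0
one_sample = stats.ttest_1samp(before - after, popmean=0)

alpha = 0.05  # assumed significance level
print(paired.statistic, paired.pvalue)
print(one_sample.statistic, one_sample.pvalue)
print("Reject H0" if paired.pvalue < alpha else "Fail to reject H0")
```

A positive statistic here means the weights before treatment are higher on average than the weights after it.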