# Hypothesis Testing

A hypothesis test is used to judge whether an observed difference, for example between the means of two data sets, is statistically significant or could plausibly be due to chance.



## Overview

This module covers:

1) One-sample and two-sample t-tests

2) ANOVA

3) Type I and Type II errors

4) Chi-Squared Tests

## Question 1 

*A student is trying to decide between two GPUs. He wants to use the GPU to run deep learning algorithms for his research, so the only thing he is concerned with is speed.*

*He picks a deep learning algorithm and a large data set, runs it on both GPUs 15 times, and times each run in hours. The results are given in the lists GPU1 and GPU2 below.*

In [20]:
from scipy import stats 
import numpy as np
import pandas as pd

In [21]:
GPU1 = np.array([11,9,10,11,10,12,9,11,12,9,11,12,9,10,9])
GPU2 = np.array([11,13,10,13,12,9,11,12,12,11,12,12,10,11,13])

#Assumption: Both data sets (GPU1 and GPU2) are random, independent samples from normally distributed populations

Hint: You can import the t-test functions (for example ttest_1samp and ttest_ind) from scipy.stats to perform t-tests.

**First T test**

*One sample t-test*

Check whether the mean of GPU1 is equal to zero.
- The null hypothesis is that the mean is equal to zero.
- The alternative hypothesis is that the mean is not equal to zero.

In [22]:
from scipy.stats import ttest_1samp, ttest_ind, levene, f_oneway, chisquare, chi2_contingency, zscore, ttest_rel, norm

In [23]:
t_statistic, p_value = ttest_1samp(GPU1,0)

p_value

7.228892044970457e-15

Since the p_value is far less than 0.05, reject the null hypothesis that the mean of GPU1 is equal to zero.
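The statistic reported by ttest_1samp can be reproduced by hand, which makes it clearer what is being tested. The sketch below assumes the standard one-sample formula t = (sample mean - mu0) / (s / sqrt(n)) with the sample standard deviation (ddof=1). The names mu0, t_manual and p_manual are introduced here for illustration only.

```python
import numpy as np
from scipy import stats

GPU1 = np.array([11, 9, 10, 11, 10, 12, 9, 11, 12, 9, 11, 12, 9, 10, 9])
mu0 = 0  # hypothesised population mean (illustrative name)

n = GPU1.size
sample_mean = GPU1.mean()
sample_std = GPU1.std(ddof=1)             # sample standard deviation
standard_error = sample_std / np.sqrt(n)  # standard error of the mean

t_manual = (sample_mean - mu0) / standard_error
# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# Reference values from SciPy for comparison
t_scipy, p_scipy = stats.ttest_1samp(GPU1, mu0)

print('manual:', t_manual, p_manual)
print('scipy: ', t_scipy, p_scipy)
```

Both lines should print the same statistic and p-value, up to floating-point rounding.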

## Question 2

Given,

Null Hypothesis : There is no significant difference between data sets

Alternate Hypothesis : There is a significant difference

*Do two-sample testing and check whether to reject Null Hypothesis or not.*

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html

In [24]:
t_statistic, p_value = ttest_ind(GPU1,GPU2)

p_value

0.013794282041452725

Conclusion

p_value < 0.05

Reject the null hypothesis: there is enough evidence of a significant difference between the run times of GPU1 and GPU2.
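By default, ttest_ind assumes equal population variances and uses a pooled variance estimate. As a rough sketch of that calculation (the names pooled_var, t_manual and p_manual are illustrative, not part of SciPy), the result can be checked against the function:

```python
import numpy as np
from scipy import stats

GPU1 = np.array([11, 9, 10, 11, 10, 12, 9, 11, 12, 9, 11, 12, 9, 10, 9])
GPU2 = np.array([11, 13, 10, 13, 12, 9, 11, 12, 12, 11, 12, 12, 10, 11, 13])

n1, n2 = GPU1.size, GPU2.size
var1, var2 = GPU1.var(ddof=1), GPU2.var(ddof=1)

# Pooled variance: average of the two sample variances,
# weighted by their degrees of freedom
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
standard_error = np.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_manual = (GPU1.mean() - GPU2.mean()) / standard_error
# Two-sided p-value with n1 + n2 - 2 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n1 + n2 - 2)

t_scipy, p_scipy = stats.ttest_ind(GPU1, GPU2)  # equal_var=True by default

print('manual:', t_manual, p_manual)
print('scipy: ', t_scipy, p_scipy)
```

If the equal-variance assumption is in doubt, passing equal_var=False to ttest_ind runs Welch's t-test instead.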

## Question 3

He is also trying a third GPU, GPU3.

In [25]:
GPU3 = np.array([9,10,9,11,10,13,12,9,12,12,13,12,13,10,11])

#Assumption: Both data sets (GPU1 and GPU3) are random, independent samples from normally distributed populations

*Do two-sample testing and check whether there is a significant difference between the speeds of the two GPUs, GPU1 and GPU3.*

#### Answer:

H0 : There is no significant difference between the speeds of GPU1 and GPU3

H1 : There is a significant difference between the speeds of GPU1 and GPU3

In [26]:
ttest_ind(GPU1,GPU3)

Ttest_indResult(statistic=-1.4988943759093303, pvalue=0.14509210993138993)

Conclusion
1. pvalue > 0.05
2. Do not reject null hypothesis.

## ANOVA

## Question 4 

If you need to compare more than two data sets at a time, an ANOVA is your best bet. 

*The results from three experiments with overlapping 95% confidence intervals are given below, and we want to confirm that the results for all three experiments are not significantly different.*

Before conducting ANOVA, test whether the assumption of equal variances is satisfied (using Levene's test). If it is not, state that we cannot depend on the result of ANOVA.

In [27]:
import numpy as np

e1 = np.array([1.595440,1.419730,0.000000,0.000000])
e2 = np.array([1.433800,2.079700,0.892139,2.384740])
e3 = np.array([0.036930,0.938018,0.995956,1.006970])

#Assumption: All 3 data sets (e1, e2 and e3) are random, independent samples from normally distributed populations

Perform Levene's test on the data.

The Levene test tests the null hypothesis that all input samples are from populations with equal variances. Levene’s test is an alternative to Bartlett’s test in the case where there are significant deviations from normality.

source: scipy.org

#### Answer:

H0 : All variances are equal

H1 : At least one variance differs from the others

In [28]:
levene(e1,e2,e3)

LeveneResult(statistic=2.6741725711150446, pvalue=0.12259792666001798)

Conclusion

pvalue > 0.05

Do not reject the null hypothesis. There is no evidence that the variances differ, so we can depend on the result of ANOVA.
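To see what Levene's test measures, note that with SciPy's default center='median' the statistic is the one-way ANOVA F statistic computed on the absolute deviations of each observation from its group median. The sketch below checks this on the same data. The name abs_dev is illustrative.

```python
import numpy as np
from scipy.stats import levene, f_oneway

e1 = np.array([1.595440, 1.419730, 0.000000, 0.000000])
e2 = np.array([1.433800, 2.079700, 0.892139, 2.384740])
e3 = np.array([0.036930, 0.938018, 0.995956, 1.006970])

# Absolute deviation of every observation from its own group median
abs_dev = [np.abs(g - np.median(g)) for g in (e1, e2, e3)]

# One-way ANOVA on the deviations: groups with larger spread
# have larger mean deviations
f_stat, f_p = f_oneway(*abs_dev)

# Levene's test as provided by SciPy (center='median' is the default)
w_stat, w_p = levene(e1, e2, e3)

print('ANOVA on deviations:', f_stat, f_p)
print('levene:             ', w_stat, w_p)
```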

## Question 5

The one-way ANOVA tests the null hypothesis that two or more groups have the same population mean. The test is applied to samples from two or more groups, possibly with differing sizes.

Use the stats.f_oneway() function to perform a one-way ANOVA test.

H0 : The three groups e1, e2, e3 have the same population mean.

H1 : At least one of the groups e1, e2, e3 has a different population mean.

In [29]:
f,p = f_oneway(e1,e2,e3)

print('F value',f)
print('P value',p)

F value 2.51357622845924
P value 0.13574644501798466


Since pvalue > 0.05, do not reject the null hypothesis. There is not enough evidence to conclude that the population means of e1, e2 and e3 differ.
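The F value is the ratio of the variation between group means to the variation within groups. A minimal sketch of that computation, assuming the usual one-way ANOVA decomposition (the names ss_between, ss_within and f_manual are illustrative), is:

```python
import numpy as np
from scipy import stats

e1 = np.array([1.595440, 1.419730, 0.000000, 0.000000])
e2 = np.array([1.433800, 2.079700, 0.892139, 2.384740])
e3 = np.array([0.036930, 0.938018, 0.995956, 1.006970])

groups = [e1, e2, e3]
k = len(groups)                        # number of groups
n_total = sum(g.size for g in groups)  # total number of observations
grand_mean = np.concatenate(groups).mean()

# Sum of squares between groups and within groups
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Mean squares: each sum of squares divided by its degrees of freedom
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)

f_manual = ms_between / ms_within
p_manual = stats.f.sf(f_manual, k - 1, n_total - k)

f_scipy, p_scipy = stats.f_oneway(e1, e2, e3)

print('manual:', f_manual, p_manual)
print('scipy: ', f_scipy, p_scipy)
```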

## Question 6

*In one or two sentences, explain **Type I** and **Type II** errors.*

#### Answer:

Type-I Error

A Type-I error occurs when the test rejects the null hypothesis although the null hypothesis is actually true (a false positive). In this scenario, we take corrective measures on the process under test even though none are required.

Type-II Error

A Type-II error occurs when the test does not reject the null hypothesis although the null hypothesis is actually false (a false negative). In this scenario, we take no corrective measures on the process under test even though they are required.
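Both error rates can be estimated by simulation. The sketch below is an illustration under assumed settings that are not part of the exercise: a one-sample t-test against a mean of 0, samples of 30 observations with standard deviation 1, a true mean of 0.5 in the "null is false" case, and a fixed random seed.

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
alpha = 0.05                    # significance level
n_trials, n_obs = 2000, 30      # illustrative simulation sizes

# Case 1: the null hypothesis is true (population mean really is 0).
# Every rejection here is a Type I error.
null_true = rng.normal(loc=0.0, scale=1.0, size=(n_trials, n_obs))
_, p_null_true = ttest_1samp(null_true, 0.0, axis=1)
type_1_rate = np.mean(p_null_true < alpha)

# Case 2: the null hypothesis is false (population mean is 0.5,
# but we still test against 0). Every non-rejection is a Type II error.
null_false = rng.normal(loc=0.5, scale=1.0, size=(n_trials, n_obs))
_, p_null_false = ttest_1samp(null_false, 0.0, axis=1)
type_2_rate = np.mean(p_null_false >= alpha)

print('Estimated Type I error rate ', type_1_rate)
print('Estimated Type II error rate', type_2_rate)
```

The estimated Type I rate should land close to alpha, while the Type II rate depends on the sample size and on how far the true mean is from the hypothesised one.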

## Question 7 

You are the manager of a Chinese restaurant. You want to determine whether the waiting time to place an order has changed in the past month from its previous population mean value of 4.5 minutes.
State the null and alternative hypotheses.

#### Answer:


H0 : The waiting time to place an order has not changed (μ is equal to 4.5)

H1 : The waiting time to place an order has changed (μ is not equal to 4.5)

## Chi square test

## Question 8

Let's create a small dataset for dice rolls of four players

In [30]:
import numpy as np

d1 = [5, 8, 3, 8]
d2 = [9, 6, 8, 5]
d3 = [8, 12, 7, 2]
d4 = [4, 16, 7, 3]
d5 = [3, 9, 6, 5]
d6 = [7, 2, 5, 7]

dice = np.array([d1, d2, d3, d4, d5, d6])

Run the test using the SciPy stats library.

Depending on the test, we are generally looking for a threshold at either 0.05 or 0.01. Our test is significant (i.e. we reject the null hypothesis) if we get a p-value below our threshold.

For our purposes, we’ll use 0.01 as the threshold.

Use the stats.chi2_contingency() function.

This function computes the chi-square statistic and p-value for the hypothesis test of independence of the observed frequencies in the contingency table.

Print the following:

- chi2 statistic
- p-value
- degrees of freedom
- expected frequencies (named contingency in the code below)



In [31]:
chi_statistic, p_value, dof, contingency = chi2_contingency(dice)

print('chi2 statistic ', chi_statistic)
print('---------------------------------')
print('pvalue ',p_value)
print('----------------------------------')
print('degrees of freedom', dof)
print('----------------------------------')
print('contingency ', contingency)

chi2 statistic  23.315671914716496
---------------------------------
pvalue  0.07766367301496693
----------------------------------
degrees of freedom 15
----------------------------------
contingency  [[ 5.57419355  8.20645161  5.57419355  4.64516129]
 [ 6.50322581  9.57419355  6.50322581  5.41935484]
 [ 6.73548387  9.91612903  6.73548387  5.61290323]
 [ 6.96774194 10.25806452  6.96774194  5.80645161]
 [ 5.34193548  7.86451613  5.34193548  4.4516129 ]
 [ 4.87741935  7.18064516  4.87741935  4.06451613]]

Conclusion

pvalue > 0.01

Do not reject the null hypothesis of independence.
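The expected frequencies and degrees of freedom returned by chi2_contingency can be reproduced by hand. Under independence, each expected cell count is (row total × column total) / grand total, and the degrees of freedom are (rows - 1) × (columns - 1). The sketch below checks this. The names ending in _manual are illustrative.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

dice = np.array([
    [5, 8, 3, 8],
    [9, 6, 8, 5],
    [8, 12, 7, 2],
    [4, 16, 7, 3],
    [3, 9, 6, 5],
    [7, 2, 5, 7],
])

row_totals = dice.sum(axis=1, keepdims=True)  # shape (6, 1)
col_totals = dice.sum(axis=0, keepdims=True)  # shape (1, 4)
grand_total = dice.sum()

# Expected counts if rows and columns were independent
expected_manual = row_totals * col_totals / grand_total

# Pearson chi-square statistic and its degrees of freedom
chi2_manual = ((dice - expected_manual) ** 2 / expected_manual).sum()
dof_manual = (dice.shape[0] - 1) * (dice.shape[1] - 1)
p_manual = chi2.sf(chi2_manual, dof_manual)

chi2_scipy, p_scipy, dof_scipy, expected_scipy = chi2_contingency(dice)

print('manual:', chi2_manual, p_manual, dof_manual)
print('scipy: ', chi2_scipy, p_scipy, dof_scipy)
```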


## Question 9

### Z-test

Get the z-scores of the dice data above using the stats.zscore function from SciPy. Convert the z-scores to two-sided p-values and take the mean of the resulting array.

In [32]:
z_score_values = zscore(dice)

z_score_values

array([[-0.46291005, -0.18884739, -1.83711731,  1.44115338],
       [ 1.38873015, -0.64208114,  1.22474487,  0.        ],
       [ 0.9258201 ,  0.7176201 ,  0.61237244, -1.44115338],
       [-0.9258201 ,  1.62408759,  0.61237244, -0.96076892],
       [-1.38873015,  0.03776948,  0.        ,  0.        ],
       [ 0.46291005, -1.54854863, -0.61237244,  0.96076892]])

In [33]:
p_values = norm.sf(abs(z_score_values)) * 2

print(p_values)

print(p_values.mean())

[[0.64342884 0.85021243 0.06619258 0.14954135]
 [0.16491482 0.5208205  0.22067136 1.        ]
 [0.35453948 0.47299156 0.54029137 0.14954135]
 [0.35453948 0.10435712 0.54029137 0.33666837]
 [0.16491482 0.96987148 1.         1.        ]
 [0.64342884 0.12149026 0.54029137 0.33666837]]
0.4685694646738299
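For reference, zscore standardises along axis 0 by default, so each column is centred on its own mean and divided by its own population standard deviation (ddof=0). A minimal sketch of the same steps with plain NumPy, using illustrative names z_manual and p_manual, is:

```python
import numpy as np
from scipy.stats import norm, zscore

dice = np.array([
    [5, 8, 3, 8],
    [9, 6, 8, 5],
    [8, 12, 7, 2],
    [4, 16, 7, 3],
    [3, 9, 6, 5],
    [7, 2, 5, 7],
])

# Column-wise mean and population standard deviation (ddof=0)
col_mean = dice.mean(axis=0)
col_std = dice.std(axis=0)

z_manual = (dice - col_mean) / col_std
z_scipy = zscore(dice)  # axis=0 and ddof=0 are the defaults

# Two-sided p-value: probability of a standard normal value
# at least as extreme as |z| in either tail
p_manual = 2 * norm.sf(np.abs(z_manual))

print(np.allclose(z_manual, z_scipy))
print(p_manual.mean())
```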


## Question 10

A Paired sample t-test compares means from the same group at different times.

The basic two sample t-test is designed for testing differences between independent groups. 
In some cases, you might be interested in testing differences between samples of the same group at different points in time. 
We can conduct a paired t-test using the scipy function stats.ttest_rel(). 

In [34]:
before = stats.norm.rvs(scale=30, loc=100, size=500)  # Draws 500 weights from a normal distribution with mean 100 and std 30
after = before + stats.norm.rvs(scale=5, loc=-1.25, size=500)  # Adds a per-patient change with mean -1.25 and std 5

Test whether a weight-loss drug works by comparing the weights of the same group of patients before and after treatment, using the data above.

H0 : The mean weight is the same before and after treatment.

H1 : The mean weight is not the same before and after treatment.

In [35]:
t_statistic, p_value = ttest_rel(before, after)

print('t-statistic', t_statistic)

print('-----------------------')

print('p-value ', p_value)

The data are drawn without a fixed random seed, so the exact t-statistic and p-value differ from run to run.

Conclusion

p-value < 0.05

Reject the null hypothesis: the mean weight after treatment differs from the mean weight before treatment.
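A paired t-test is equivalent to a one-sample t-test on the per-patient differences, tested against a mean of zero. The sketch below illustrates this with data generated in the same way as above, except that a seeded NumPy generator is used so the run is repeatable. The seed value and the names ending in _rel and _diff are assumptions made for this example.

```python
import numpy as np
from scipy.stats import ttest_1samp, ttest_rel

rng = np.random.default_rng(42)  # fixed seed (illustrative choice)

# Same construction as above: baseline weights plus a per-patient change
before = rng.normal(loc=100, scale=30, size=500)
after = before + rng.normal(loc=-1.25, scale=5, size=500)

# Paired t-test on the two related samples
t_rel, p_rel = ttest_rel(before, after)

# One-sample t-test on the differences against a mean of 0
differences = before - after
t_diff, p_diff = ttest_1samp(differences, 0.0)

print('paired:     ', t_rel, p_rel)
print('differences:', t_diff, p_diff)
```

Both lines should report the same statistic and p-value, which is why the paired test is far more sensitive here than an independent two-sample test would be: the large patient-to-patient spread cancels out in the differences.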