### Hypothesis Testing

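Hypothesis testing is a procedure for deciding whether the result observed in a sample is strong enough evidence against a stated assumption about the population, or whether it could plausibly be due to chance.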

* **Null hypothesis:** Denoted with H0, a null hypothesis is an **assumption that the population average is identical to a specific value**. The typical notation is μ = μ0, where μ refers to the population mean and μ0 refers to the hypothesized value.
<br><br>
* **Alternative hypothesis:** Denoted with H1 or Ha, the alternative hypothesis is the claim that contradicts the null hypothesis (for example μ ≠ μ0, μ < μ0 or μ > μ0). We reject the null hypothesis in its favor only if the data provide enough evidence.
<br><br>
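* **Significance Level:** Denoted with α, this is the probability of rejecting the null hypothesis when it is actually true that we are willing to accept (commonly 0.05). It sets how strong the evidence must be before we reject the null hypothesis.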
<br><br>
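* **Test Statistic:** A value computed from the sample data that measures how far the sample result is from what the null hypothesis predicts, in units of standard error. Once we have chosen the type of hypothesis test (z-test, t-test) and checked that its assumptions are met, we use the test statistic to decide whether to reject or fail to reject the null hypothesis.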
<br><br>
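* **p-value:** The probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed. We reject the null hypothesis when the p-value is smaller than the significance level.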


**Academic Example:**
   
Boys of a certain age are known to have a mean weight of 85 pounds. A complaint is made that the boys living in a municipal children's home are underfed. As one bit of evidence, 25 boys (of the same age) are weighed and found to have a mean weight of 80.94 pounds. It is known that the population standard deviation is 11.6 pounds. Based on the available data, what should be concluded concerning the complaint? 
 
 
How to reason about the problem:

We assume that the mean weight of the boys in the home is 85 pounds, the same as in the general population. We do not have data for every boy in the home, otherwise we could compute the actual mean directly; we only have a sample of 25 boys. Based on this sample, we use a statistical test to decide whether the data give enough evidence to reject that assumption.

**Step 1:** Define the null hypothesis - This is our assumption about the population. It is denoted by H0, and in this case H0: μ = 85.

**Step 2:** Define the alternative hypothesis - This is what we conclude if our assumption does not hold. It is denoted by Ha, and in this case Ha: μ < 85, because the complaint is that the boys are underfed.

**Step 3:** Determine if it is a one-tailed or a two-tailed test. A two-tailed test is used when the alternative hypothesis allows the mean to be either greater or smaller than the hypothesized value (μ ≠ μ0). In this case we are only checking whether the mean weight of the boys in the home is smaller than the mean of the population of boys, so it is a one-tailed (left-tailed) test.

**Step 4:** Choose a test statistic based on the information available. Assuming the data are normally distributed and the population variance is known (the population standard deviation is provided), we use a z-test, even though the sample is small. This test is based on the "z-distribution", which is the standard normal distribution. If the population variance is not known and has to be estimated from the sample, we use a t-test; the difference matters most for small samples (a common rule of thumb is fewer than 30 observations). The t-test is based on Student's t-distribution, which is similar to the standard normal distribution but has heavier tails, especially for few degrees of freedom.

<img src=https://education-team-2020.s3-eu-west-1.amazonaws.com/data-analytics/7.03/7.03-t_distribution.png width="500">
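To make the "heavier tails" point concrete, here is a small sketch using `scipy.stats`. The cutoff of -2 and the degrees of freedom are arbitrary values chosen for illustration. It compares the probability of falling below the cutoff under the standard normal distribution and under t-distributions with different degrees of freedom.

```python
from scipy.stats import norm, t

cutoff = -2.0  # arbitrary point in the left tail, chosen for illustration

# Probability of a value below the cutoff under the standard normal distribution
p_normal = norm.cdf(cutoff)
print(f"normal:     P(X < {cutoff}) = {p_normal:.4f}")

# Same probability under t-distributions with different degrees of freedom
tail_probs = {df: t.cdf(cutoff, df) for df in (5, 24, 100)}
for df, p in tail_probs.items():
    print(f"t (df={df:>3}): P(X < {cutoff}) = {p:.4f}")
```

The fewer the degrees of freedom, the larger the tail probability; as the degrees of freedom grow, it approaches the value for the normal distribution.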


**Step 5:** Choose the level of significance. This defines the rejection region (critical region): it is the probability of rejecting the null hypothesis when it is actually true (a Type I error). It is denoted by the Greek letter α and is usually set to 0.05.

**Step 6:** Calculate the test statistic based on the given information.
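For a z-test on a single mean, the statistic is the distance between the sample mean and the hypothesized mean, measured in standard errors. With the numbers from this example (sample mean 80.94, hypothesized mean 85, population standard deviation 11.6, n = 25):

$$
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{80.94 - 85}{11.6 / \sqrt{25}} = \frac{-4.06}{2.32} = -1.75
$$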

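**Step 7:** Look up the critical value in the table.
<br> For a z-test the critical values are fixed for a given significance level (for example, about -1.645 for a left-tailed test at α = 0.05).
<br> For a t-test the critical value also depends on the degrees of freedom (df), which for a one-sample test is *sample_size - 1*.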
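The table lookup can also be done in code. The sketch below assumes `scipy.stats` and a left-tailed test at α = 0.05; `ppf` is the inverse of the cumulative distribution function. The t value is shown only for comparison, since this example uses a z-test.

```python
from scipy.stats import norm, t

alpha = 0.05   # significance level
n = 25         # sample size from the example
df = n - 1     # degrees of freedom for a one-sample t-test

# Left-tailed critical values: reject H0 if the statistic falls below them
z_critical = norm.ppf(alpha)
t_critical = t.ppf(alpha, df)

print(f"z critical value:           {z_critical:.3f}")
print(f"t critical value (df={df}):   {t_critical:.3f}")
```

For a two-tailed test, the significance level is split between both tails, so the lookup would use `alpha / 2` on each side.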

**Step 8:** Make conclusions:
* If the test statistic falls in the critical region, we reject the null hypothesis.
* If the test statistic falls outside the critical region, we fail to reject the null hypothesis.

In [12]:
import math

sample_mean = 80.94 # sample of boys from the home
pop_mean = 85
pop_std = 11.6
n = 25

statistic = (sample_mean - pop_mean)/(pop_std/math.sqrt(n))
statistic
# Spoiler: we reject the null hypothesis.
# The statistic (-1.75) is below the left-tailed critical value at alpha = 0.05 (about -1.645),
# so at the 5% significance level there is enough evidence to reject the null hypothesis.
# The data suggest that the mean weight of the boys in the home is less than 85 pounds,
# which supports the complaint that they are underfed.

-1.750000000000001
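The same conclusion can be reached through the p-value instead of the critical region. The sketch below uses `statistics.NormalDist` from the Python standard library and takes the statistic of -1.75 computed above; the variable names are illustrative.

```python
from statistics import NormalDist

z = -1.75      # test statistic computed above
alpha = 0.05   # significance level

# Left-tailed test: the p-value is the probability of a z value at or below the observed one
p_value = NormalDist().cdf(z)

# Critical value for the same test, for comparison with the table lookup
z_critical = NormalDist().inv_cdf(alpha)

reject_null = p_value < alpha
print(f"p-value = {p_value:.4f}, critical value = {z_critical:.3f}, reject H0: {reject_null}")
```

Both criteria agree: the statistic is below the critical value exactly when the p-value is below α.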

### p-value

**Academic Example:** 

A psychologist was interested in exploring whether or not male and female college students have different driving behaviors. There were a number of ways that she could quantify driving behaviors. She opted to focus on the fastest speed ever driven by an individual. Therefore, the particular statistical question she framed was as follows:

* Is the mean fastest speed driven by male college students different from the mean fastest speed driven by female college students?

She conducted a survey of a random sample of n = 34 male college students and a random sample of m = 29 female college students. Here is a descriptive summary of the results of her survey:


In [4]:
# Males
m_samples = 34
m_sample_mean = 105.5
m_sample_std = 20.1

# Females
f_samples = 29
f_sample_mean = 90.0
f_sample_std = 12.2

In [5]:
from scipy.stats import ttest_ind, norm

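# The raw survey data are not available, so we simulate samples from normal
# distributions with the reported means and standard deviations.
# The draws are random, so the values and test results change from run to run.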
males = norm.rvs(loc=m_sample_mean, scale=m_sample_std, size=m_samples)
females = norm.rvs(loc=f_sample_mean, scale=f_sample_std, size=f_samples)

In [6]:
males

array([102.21296284, 119.69124994, 100.09425561, 128.90404039,
        95.42643763, 105.18426745,  76.6183121 , 124.02159315,
       114.45196132, 130.38464175, 104.87760434,  96.79083634,
        96.90349971, 123.96567382, 122.60474256,  75.33490284,
       114.81980128,  83.87905974,  49.26615422, 112.34655357,
       125.67539659,  85.9088677 , 105.95726443, 112.74461198,
        85.44284505, 126.88348964,  82.29865526,  80.59286187,
        96.20560345,  94.42562453, 125.46469407,  87.61828344,
       135.96343472, 138.91478614])

In [7]:
females

array([ 98.21238723,  83.94391936,  93.53224009,  96.10336184,
        74.14651259,  74.54241319,  80.43000964,  72.91687125,
       107.73025472,  87.95389563,  96.01842212,  74.48863161,
        91.03561676,  98.57857566,  95.78016482,  96.21750321,
        82.25753982,  50.87001799,  85.17564312,  91.62610393,
        80.90108091,  98.14229428, 104.49251582,  75.31715242,
        88.91338988,  95.20656221,  99.11571874,  98.95894577,
        86.47918401])

In [8]:
ttest_ind(males, females)

Ttest_indResult(statistic=3.7916586478857672, pvalue=0.00034559169347544104)
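Because only summary statistics were reported, the test can also be run on them directly instead of on simulated samples. This is a sketch, assuming `scipy.stats.ttest_ind_from_stats` and that the reported standard deviations are sample standard deviations. Setting `equal_var=False` switches to Welch's t-test; that is a choice made here, not part of the original analysis, because the two standard deviations differ noticeably.

```python
from scipy.stats import ttest_ind_from_stats

# Summary statistics from the survey
m_samples, m_sample_mean, m_sample_std = 34, 105.5, 20.1   # males
f_samples, f_sample_mean, f_sample_std = 29, 90.0, 12.2    # females

# Two-sided, two-sample t-test computed from the summary statistics
result = ttest_ind_from_stats(
    mean1=m_sample_mean, std1=m_sample_std, nobs1=m_samples,
    mean2=f_sample_mean, std2=f_sample_std, nobs2=f_samples,
    equal_var=False,  # Welch's t-test: do not assume equal variances
)
print(f"statistic = {result.statistic:.3f}, p-value = {result.pvalue:.5f}")
```

Unlike the simulated samples, this result does not change between runs, since it depends only on the reported means, standard deviations and sample sizes.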

In [None]:
# We reject the null hypothesis.
# The p-value (about 0.0003 in the run shown above) is below the significance level of 0.05,
# so there is enough evidence to reject the null hypothesis.
# The mean fastest speeds of males and females are not the same.
# Comparing the sample means, males report a higher fastest speed.