## Goals of Today's Lab:

1. A brief recap of hypothesis testing
2. Parametric hypothesis testing
3. Non-parametric hypothesis testing

In [4]:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

## Steps of a Hypothesis Test

1. Set up null and alternative hypotheses

2. Specify the appropriate statistical test

3. Choose a significance level (alpha)

4. Determine the critical value of test statistic or p-value (find the region of rejection)

5. Calculate the observed value of the test statistic

6. Make a decision
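
As a rough illustration, the six steps above can be sketched in code. This is a minimal sketch on made-up numbers: the `sample` values, the hypothesized mean `mu_0`, and the choice of a one-sample t-test are assumptions for illustration, not part of the lab data.

```python
import numpy as np
from scipy import stats

# Hypothetical sample (not from the lab data)
sample = np.array([5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.7])

# Step 1: H_0: mu = 5.0 versus H_A: mu != 5.0
mu_0 = 5.0

# Step 2: one-sample t-test (population sigma unknown, small sample)

# Step 3: significance level
alpha = 0.05

# Step 4: two-sided critical value with n - 1 degrees of freedom
df = len(sample) - 1
t_critical = stats.t.ppf(1 - alpha / 2, df)

# Step 5: observed test statistic (and its p-value)
t_observed, p_value = stats.ttest_1samp(sample, mu_0)

# Step 6: reject H_0 if the statistic falls in the rejection region
reject_null = abs(t_observed) > t_critical
print(f"t = {t_observed:.3f}, critical = {t_critical:.3f}, "
      f"p = {p_value:.3f}, reject H_0: {reject_null}")
```

Comparing the statistic to the critical value and comparing the p-value to alpha lead to the same decision; they are two views of the same rejection region.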

### Practice

<img src="./img/talking.jpeg" width="60" align='left'>

<br>

In your breakout room, determine which of the five pairs of null and alternative hypotheses are valid.



1. $H_0: \bar{X} = 0$  
   $H_A: \bar{X} \neq 0$

2. $H_0: \mu \leq 34$  
   $H_A: \mu > 34$
   
3. $H_0: \text{The mean girth of all Redwood Trees in California} = 225 in.$  
   $H_A: \text{The mean girth of all Redwood Trees in California} < 225 in.$
   
4. $H_0: \sigma_1 = \sigma_2$  
   $H_A: \sigma_1 \neq \sigma_2$
   
5. $H_0: \text{The sample standard deviation of 2018 SAT test scores} = 50$  
   $H_A: \text{The sample standard deviation of 2018 SAT test scores} > 50$   

### Relating Hypotheses to Errors


#### Why is it necessary for the Null Hypothesis to be the hypothesis of no difference?

- It represents the "status-quo" 
    - A new drug for a disease 
- It represents a relevant baseline value
    - A new feed that claims to increase cattle weight by 20 lb on average
- It represents a value generally true unless proven otherwise
    - A person on trial - innocent until proven guilty
    
    
#### How does this translate into Type-I and Type-II errors?

How we set up the hypotheses determines which error we worry about more.  
For a fixed sample size, it is incredibly hard to control both errors at the same time; as one decreases, the other increases.

This is the reason why a Type-I error is usually treated as the more serious one.  
By rejecting the status quo, we are proposing a very serious change.

<img src="img/testing_error.png" width="550">
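
The trade-off between the two error rates can be checked numerically. The sketch below assumes a one-sided z-test with a hypothetical true effect of 0.5 standard deviations and a sample size of 20; both numbers are made up for illustration.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical setup: one-sided z-test, true effect of 0.5 standard deviations
effect_size = 0.5
n = 20

alphas = [0.10, 0.05, 0.01]
betas = []
for alpha in alphas:
    # Rejection threshold on the z scale for this Type-I error rate
    z_critical = norm.ppf(1 - alpha)
    # beta = P(fail to reject H_0 | H_A is true), the Type-II error rate
    beta = norm.cdf(z_critical - effect_size * np.sqrt(n))
    betas.append(beta)
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

With the sample size held fixed, each smaller alpha pushes the rejection threshold further out, so beta grows.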





## BUT WHY DO WE CARE SO MUCH ABOUT HYPOTHESIS TESTS??!

Hypothesis testing is, without doubt, one of the most universal concepts in statistical data analysis.     
It is central to many statistical methods, including:  
- analysis of variance
- regression analysis
- analysis of categorical data, etc.

So...

<img src="img/hypothesis_meme_2.jpg" width="350">

**p-values**

A very low p-value signals incompatibility with the null hypothesis.   
&nbsp;  
When p-values are low, one of two things is true:  
(a) either you have just witnessed a rare chance event, or   
(b) your null hypothesis is false.
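
For a concrete feel, a p-value is just a tail probability of the test statistic's distribution under the null hypothesis. The sketch below uses a hypothetical t statistic of 2.1 with 16 degrees of freedom; both values are assumptions for illustration.

```python
from scipy.stats import t

# Hypothetical values for illustration
t_observed = 2.1
df = 16

# Right-tailed p-value: probability of a statistic at least this large under H_0
p_right = t.sf(t_observed, df)

# Two-sided p-value: probability of a statistic at least this extreme in either tail
p_two_sided = 2 * t.sf(abs(t_observed), df)

print(f"right-tailed p = {p_right:.4f}, two-sided p = {p_two_sided:.4f}")
```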


## Parametric Tests

A parametric statistical test is one that **makes assumptions about the parameters** of the population distribution(s) from which one's data are drawn.

For practical purposes, you can think of "parametric" as referring to tests, such as t-tests and the analysis of variance, that assume the underlying source population(s) to be normally distributed; they generally also assume that one's measures derive from an equal-interval scale. Examples of parametric tests include:

- **Z-Tests (Continuous Data)** - For comparing up to two population means, given standard deviation
- **t-Tests** - For comparing up to two population means
- **Chi-Squared Tests** - For comparing categorical variables
- **F-Test** - For comparing the means of more than two populations (as in the analysis of variance)
- **Z-Test for Proportions** - For comparing population proportions


## Non-Parametric Tests

A non-parametric test is one that **makes no such assumptions about the parameters** of the population distribution. You can think of "non-parametric" as referring to tests that do not rely on these particular assumptions. Examples of non-parametric tests include:
- **One-Sample Sign Test** - To compare samples for change before and after a treatment 
- **Wilcoxon Signed Rank Test** - To compare samples for change before and after a treatment 
- **The Mann-Kendall Trend Test** - To assess trends in time-series data.
- **Mann-Whitney U Test** - To compare two populations
- **Kruskal-Wallis Test** - To compare more than two independent populations
- **Kolmogorov-Smirnov Test** - To compare probability distributions
- **Runs Test** - To test for randomness
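
As a small example of one test from the list above, here is a sketch of a Mann-Whitney U test with `scipy.stats.mannwhitneyu`. The two groups are made-up numbers, not lab data.

```python
from scipy import stats

# Hypothetical measurements from two independent groups
group_a = [12.1, 14.3, 11.8, 15.2, 13.7, 12.9, 14.8]
group_b = [10.2, 11.5, 9.8, 12.0, 10.9, 11.1, 13.0]

# H_0: the two groups come from the same distribution
# H_A: one group tends to produce larger values than the other
u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"U = {u_stat}, p = {p_value:.4f}")
```

The test works on the ranks of the observations rather than their raw values, which is why it does not need the normality assumption.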

### Example 1: Ice cream flavors

<img src="img/ice_cream.jpg" width="450">

As an extremely biased person, I strongly believe that the local ice cream shop gets more orders for vanilla ice cream than for chocolate ice cream.  
Why, you ask?  
Because I don't like chocolate ice cream so much, while I LOVE vanilla.

So, to make my case, we will conduct a hypothesis test with an alpha of 0.10, using the six-step process!


The data on ice cream sales is:

``` python
van_orders = [10.5, 11.9, 10, 9.7, 9.6, 10.1, 9.4, 9.5, 9.8]
choc_orders = [9.6, 9.9, 9.4, 8.9, 9.6, 9.3, 8.8, 10.5, 8.8]
```

**Step 1: State the null and alternative hypotheses**

In [42]:
van_orders = [10.5, 11.9, 10, 9.7, 9.6, 10.1, 9.4, 9.5, 9.8]
choc_orders = [9.6, 9.9, 9.4, 8.9, 9.6, 9.3, 8.8, 10.5, 8.8]
van_mean = np.array(van_orders).mean()
choc_mean = np.array(choc_orders).mean()

\begin{equation}\label{eq:}
H_0: \mu_1 = \mu_2
\end{equation}

\begin{equation}\label{eq:2}
H_1: \mu_1 > \mu_2 
\end{equation}

$\mu_1$ is the mean of vanilla orders and $\mu_2$ is the mean of chocolate orders.

**Step 2: Choosing the appropriate test.  Which test should you select and why?** 

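A two-sample t-test: two independent groups are being compared, the outcome (orders) is continuous, the population standard deviations are unknown, and the samples are small.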

**Step 3: Choosing a Significance level**

\begin{equation}\label{eq:3}
\alpha = .1 
\end{equation}


Because the problem statement specifies alpha = 0.10.

**Step 4: Determine the critical values**

In [41]:
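from scipy.stats import t

a = .1
# Pooled degrees of freedom for a two-sample t-test: n_1 + n_2 - 2
df = len(van_orders) + len(choc_orders) - 2

# H_1 is right-tailed (mu_1 > mu_2), so the rejection region is the upper tail.
# By symmetry of the t distribution, the upper critical value is -t.ppf(a, df).
value = -t.ppf(a, df)
# Sample standard deviations (ddof=1), kept for reference
s_1 = np.array(van_orders).std(ddof=1)
s_2 = np.array(choc_orders).std(ddof=1)
value

1.3367571673273144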

**Step 5:  Calculating the observed value**

In [53]:
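# H_1 is one-sided (mu_1 > mu_2), so request a right-tailed p-value
# (the alternative argument requires SciPy 1.6 or later)
t_observed, p_observed = stats.ttest_ind(van_orders, choc_orders, alternative='greater')
a > p_observed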

True

**Step 6: Make a decision**

Since the p-value is below alpha = 0.10, we reject the null hypothesis that mean vanilla and chocolate orders are equal, and conclude that mean vanilla orders are statistically significantly greater than mean chocolate orders.
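
To see where the observed statistic comes from, it can also be computed by hand. The sketch below assumes equal variances (the default of `stats.ttest_ind`) and uses the pooled-variance formula; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

van_orders = [10.5, 11.9, 10, 9.7, 9.6, 10.1, 9.4, 9.5, 9.8]
choc_orders = [9.6, 9.9, 9.4, 8.9, 9.6, 9.3, 8.8, 10.5, 8.8]

n_1, n_2 = len(van_orders), len(choc_orders)

# Sample variances (ddof=1 divides by n - 1)
var_1 = np.var(van_orders, ddof=1)
var_2 = np.var(choc_orders, ddof=1)

# Pooled variance and standard error of the difference in means
pooled_var = ((n_1 - 1) * var_1 + (n_2 - 1) * var_2) / (n_1 + n_2 - 2)
std_error = np.sqrt(pooled_var * (1 / n_1 + 1 / n_2))

# Observed t statistic and right-tailed p-value
t_manual = (np.mean(van_orders) - np.mean(choc_orders)) / std_error
p_manual = stats.t.sf(t_manual, n_1 + n_2 - 2)

# Cross-check against SciPy's default two-sided, equal-variance test
t_scipy, p_two_sided = stats.ttest_ind(van_orders, choc_orders)
print(f"manual t = {t_manual:.4f}, scipy t = {t_scipy:.4f}")
print(f"right-tailed p = {p_manual:.4f}, two-sided p = {p_two_sided:.4f}")
```

Because the observed statistic is positive, the right-tailed p-value is half of the two-sided one.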

#### Bonus!

Write out the Type I and Type II errors in plain language for this scenario.

### Example 2

You measure the delivery times of ten different restaurants in two different neighborhoods. You want to know if restaurants in the different neighborhoods have the same delivery times. It's okay to assume both samples have equal variances. Set your significance threshold to 0.05.

``` python
delivery_times_A = [28.4, 23.3, 30.4, 28.1, 29.4, 30.6, 27.8, 30.9, 27.0, 32.8]
delivery_times_B = [26.4, 26.3, 27.4, 30.4, 25.1, 28.4, 23.3, 24.7, 31.8, 24.3]
```

**Step 1: State the null and alternative hypotheses**

**Step 2: Choosing the appropriate test.  Which test should you select and why?** 

**Step 3: Choosing a Significance level**

**Step 4: Determine the critical values**

**Step 5:  Calculating the observed value**

**Step 6: Make a decision**

#### Bonus!

Write out the Type I and Type II errors in plain language for this scenario.