# Hypothesis Testing Continued

## Recap 

### **P-values and alpha values:** <br/>
The significance level $\alpha$ is the threshold that defines whether a p-value counts as low or high. If the p-value is less than $\alpha$, we reject the null hypothesis at a significance level of $\alpha$; otherwise, we fail to reject it.
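
As a minimal sketch of this decision rule, the hypothetical helper `decide` below (not part of any library) compares a p-value with $\alpha$. The p-values passed in are made-up illustrations.

```python
def decide(p_value, alpha=0.05):
    """Return the decision for a hypothesis test at significance level alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

# Illustrative p-values, not taken from real data
print(decide(0.03))              # p-value is below alpha = 0.05
print(decide(0.03, alpha=0.01))  # same p-value, stricter alpha
```

The same p-value can lead to different decisions depending on the $\alpha$ chosen before the test.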

### **Z-scores:** 
> The z-score tells us how many standard deviations above or below the mean an observation is. z-scores allow us to compare scores from different normal distributions.
$$\large \text{z} = \frac{x - \mu}{\sigma}$$
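
To illustrate comparing scores from different normal distributions, here is a small sketch. The function name `z_score` and the exam numbers are assumptions made up for this example.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical exam results drawn from two different normal distributions
z_math = z_score(85, mu=75, sigma=5)      # 2.0 standard deviations above the mean
z_history = z_score(80, mu=70, sigma=10)  # 1.0 standard deviation above the mean

print(z_math, z_history)
```

Both raw scores are 10 points above their means, but the math score is the more unusual one relative to its own distribution.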

**z-scores and probabilities:**<br/>
To compute the probability of obtaining a z-score less than a given value z, use: <center>stats.norm.cdf(z)</center>

To compute the probability of obtaining a z-score greater than or equal to a given value z, use: 
    <center>1 - stats.norm.cdf(z)</center><br/>
        
To compute the z-score for a given percentile, like the 95th, use:
        <center>stats.norm.ppf(0.95)</center><br/>
        
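The three `scipy.stats.norm` calls can be tried together in a short sketch. The value `z = 1.0` is chosen only for illustration.

```python
from scipy import stats

z = 1.0
p_below = stats.norm.cdf(z)        # P(Z < 1.0), about 0.8413
p_above = 1 - stats.norm.cdf(z)    # P(Z >= 1.0), about 0.1587
z_95 = stats.norm.ppf(0.95)        # z-score at the 95th percentile, about 1.645

# ppf is the inverse of cdf, so this round trip recovers 0.95
# (up to floating-point error)
print(stats.norm.cdf(z_95))
```
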
_NOTE:_ The _one-sample_ z-test is used when you want to know if your sample comes from a particular population. The _one-sample_ z-test is used only for tests related to the sample mean. The test statistic of one-sample z-tests is called the z-statistic.

Recall the test statistic for a one-sample z-test is the z-statistic:

$$ \large \text{z-statistic} = \dfrac{\bar x - \mu_0}{{\sigma}/{\sqrt{n}}} $$

where:

- $\bar x$ is the sample mean
- $n$ is the number of items in the sample
- $\sigma$ is the population standard deviation
- $\mu_0$ is the population mean under the null hypothesis

The z-statistic differs from the z-score formula: we divide the population standard deviation by $\sqrt{n}$ because we are measuring how far a sample mean, not a single observation, lies from $\mu_0$. The standard deviation of the sample mean (the standard error) is $\sigma/\sqrt{n}$.
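
The $\sigma/\sqrt{n}$ scaling in the denominator can be checked with a quick simulation. This is only a sketch: the population parameters, sample size, seed, and number of repetitions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
mu, sigma, n = 72.5, 12.5, 30    # illustrative population parameters and sample size

# Draw 20,000 samples of size n and record each sample's mean
sample_means = rng.normal(mu, sigma, size=(20_000, n)).mean(axis=1)

empirical_se = sample_means.std()       # spread of the simulated sample means
theoretical_se = sigma / np.sqrt(n)     # standard error predicted by the formula
print(round(empirical_se, 2), round(theoretical_se, 2))
```

The two printed values should be close, which is why the z-statistic divides by $\sigma/\sqrt{n}$ rather than $\sigma$.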

Imagine we have measured the blood pressure for a population of individuals. The average blood pressure for this population is 72.5 mm Hg, with a standard deviation of 12.5 mm Hg.

We then measure the blood pressure of 30 other individuals. Here are the observed blood pressures (in units of mm Hg):

62.9, 66.2, 65.0, 84.7, 68.2, 73.1, 68.3, 57.6, 65.8, 67.8, 54.0, 66.8, 56.4, 54.3, 48.3, 
73.9, 62.2, 53.0, 52.2, 74.5, 66.1, 66.7, 77.7, 73.6, 76.5, 64.2, 59.5, 66.1, 58.3, 64.9

We want to know if the average blood pressure of these 30 individuals is significantly lower than the population's average blood pressure, at a significance level of $\alpha$ = 0.05.

State the null and alternative hypotheses for this problem.

$H_0$: $\mu \leq M$ (The mean blood pressure of the sampled group is not lower than the population mean blood pressure.)

$H_a$: $\mu \gt M$ (The mean blood pressure of the sampled group is lower than the population mean blood pressure.)

Here, $\mu$ is the population mean blood pressure (72.5 mm Hg), and $M$ is the mean blood pressure of the group from which the 30 individuals were sampled.

Perform a one-sample z-test. Interpret the result of the test.

```python
import pandas as pd 
import numpy as np
from scipy import stats 
import matplotlib.pyplot as plt
%matplotlib inline 

import seaborn as sns
sns.set_style('darkgrid')
```

```python
measurements = [62.9, 66.2, 65.0, 84.7, 68.2, 73.1, 68.3, 57.6, 65.8, 67.8, 54.0, 66.8, 
                56.4, 54.3, 48.3, 73.9, 62.2, 53.0, 52.2, 74.5, 66.1, 66.7, 77.7, 73.6, 
                76.5, 64.2, 59.5, 66.1, 58.3, 64.9]

x_bar = np.mean(measurements)          # sample mean
n = len(measurements)                  # sample size
mu = 72.5                              # population mean under the null hypothesis
sigma = 12.5                           # population standard deviation
z = (x_bar - mu)/(sigma/np.sqrt(n))    # z-statistic

p = stats.norm.cdf(z)                  # lower-tail p-value: P(Z < z)
print("z:", round(z, 4))
print("p-value:", round(p, 4))
```

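z: -3.3039
p-value: 0.0005

Since the p-value (0.0005) is below $\alpha = 0.05$, we reject the null hypothesis: the mean blood pressure of these 30 individuals is significantly lower than the population mean of 72.5 mm Hg.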
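
The same decision can be reached by comparing the z-statistic with a critical value instead of comparing the p-value with $\alpha$. A minimal sketch, reusing the z-statistic printed above:

```python
from scipy import stats

alpha = 0.05
z = -3.3039                       # z-statistic reported by the test above
z_crit = stats.norm.ppf(alpha)    # lower-tail critical value, about -1.645

# For a lower-tailed test, reject H0 when z falls below the critical value
reject = z < z_crit
print(round(z_crit, 4), reject)
```

Because the z-statistic is far below the critical value, it lies in the rejection region.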


### t-distributions - when our sample isn't standard normal 
> We use a t-distribution instead of the normal distribution when:<br/>
    1. The population standard deviation is unknown, or<br/>
    2. The sample size is small (a common rule of thumb is $n < 30$)
    
![](https://github.com/flatiron-school/ds-confidence_intervals/raw/cbc262a0d52a771b48d737e8c195735cd97113c1/img/z_vs_t.png)

**Use `t = stats.t.ppf(0.95, df=50-1)` to compute the t-score at the 95th percentile for a sample of size 50 (49 degrees of freedom).**
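
To see how the t-score relates to the z-score, the sketch below prints the 95th-percentile t-score for several degrees of freedom next to the normal value. The particular `df` values are arbitrary choices for illustration.

```python
from scipy import stats

z_95 = stats.norm.ppf(0.95)   # normal 95th percentile, about 1.645

# The t critical value shrinks toward the z value as degrees of freedom grow
for df in (4, 9, 29, 49, 999):
    t_95 = stats.t.ppf(0.95, df=df)
    print(f"df={df:>3}  t={t_95:.4f}  z={z_95:.4f}")
```

Smaller samples give heavier tails and therefore larger critical values, which widens confidence intervals.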

**Confidence Intervals for $t$-Distribution:**<br/>
The construction of confidence intervals for the $t$-distribution is similar to how they are made for the normal distribution. But instead of $z$-scores, we'll have $t$-scores. And if we don't have access to the population standard deviation, we'll make use of the sample standard deviation instead.

left endpoint: $\bar{x} - t\times\frac{s}{\sqrt{n}}$<br/>
right endpoint: $\bar{x} + t\times\frac{s}{\sqrt{n}}$
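
The endpoint formulas can be evaluated directly from raw data. The sketch below uses a made-up sample and a 95% confidence level, both assumptions for illustration only.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements, used only to illustrate the formula
sample = np.array([4.9, 5.1, 4.7, 5.0, 4.8, 5.2, 4.6, 5.0])

n = len(sample)
sample_mean = sample.mean()
standard_error = sample.std(ddof=1) / np.sqrt(n)   # ddof=1 gives the sample standard deviation
t = stats.t.ppf(0.975, df=n - 1)                   # two-sided 95% interval leaves 2.5% in each tail

ci = (sample_mean - t * standard_error, sample_mean + t * standard_error)
print(ci)
```

Note that a two-sided 95% interval uses the 97.5th percentile of the t-distribution, not the 95th.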

**Raw coding CI for t-distribution:**
```python
# Calculating the confidence interval
(sample_mean - t * standard_error, sample_mean + t * standard_error)
```

**Alternate method: `stats.t.interval` returns the interval directly (shown at the end of the scenario below). Use it only if you don't need the t-score itself.**

**Scenario:**

You are inspecting a hardware factory and want to construct a 90% confidence interval for the mean screw length. You draw a sample of 30 screws and calculate their mean length as 4.8 centimeters and their sample standard deviation as 0.4 centimeters. What are the bounds of your confidence interval?

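```python
n = 30                                  # sample size
mean = 4.8                              # sample mean (cm)
t_value = stats.t.ppf(0.95, n-1)        # 90% two-sided interval leaves 5% in each tail
margin_error = t_value * 0.4/(n**0.5)   # t-score times the standard error s/sqrt(n)
confidence_interval = (mean - margin_error, mean + margin_error)

confidence_interval
```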

(4.6759133066001235, 4.924086693399876)

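```python
# The confidence level is passed positionally because its keyword name
# differs across SciPy versions (`alpha` in older releases, `confidence` in newer ones)
stats.t.interval(
    0.90,                 # Confidence level (the scenario asks for 90%)
    df=n-1,               # Degrees of freedom
    loc=mean,             # Sample mean
    scale=0.4/(n**0.5)    # Standard error: sample standard deviation / sqrt(n)
)
```

This returns the same interval as the manual calculation above, approximately (4.676, 4.924). `stats.sem()` computes the standard error from raw data, so it cannot be used here, where only the summary statistics are known.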

## Practice 

### Gotta Have My Pants! 👖
I'm buying jeans from store A and store B. I know nothing about their inventory other than prices.

> store1 = [20,30,30,50,75,25,30,30,40,80] <br/>
> store2 = [60,30,70,90,60,40,70,40]

Should I go just to one store for a less expensive pair of jeans? I'm pretty apprehensive about my decision, so $\alpha = 0.1$. It's okay to assume the samples have equal variances.

1. **State the null and alternative hypotheses:**

Null: Store A and store B have the same mean jean price.

Alternative: Store A and store B do not have the same mean jean price.

2. **What kind of test should we run? Why?**

Run a two-tailed, two-sample independent t-test. The two stores are independent samples, the population standard deviations are unknown, and the sample sizes are small.

```python
# Perform the test
store1 = [20,30,30,50,75,25,30,30,40,80]
store2 = [60,30,70,90,60,40,70,40]

# ttest_ind assumes equal variances by default and returns a two-tailed p-value
stats.ttest_ind(store1, store2)
```

Ttest_indResult(statistic=-1.70113828065953, pvalue=0.10826653002468378)

**Make a decision.**

Since the p-value (0.108) is greater than $\alpha = 0.1$, we fail to reject the null hypothesis at a significance level of $\alpha = 0.1$. We do not have evidence to support that jean prices are different in store A and store B.
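
To make explicit what the equal-variance test computes, here is a sketch of the pooled-variance t-statistic worked out by hand. The variable names are chosen for this example.

```python
import numpy as np
from scipy import stats

store1 = [20, 30, 30, 50, 75, 25, 30, 30, 40, 80]
store2 = [60, 30, 70, 90, 60, 40, 70, 40]
n1, n2 = len(store1), len(store2)

# Pooled variance: weighted average of the two sample variances (ddof=1)
pooled_var = ((n1 - 1) * np.var(store1, ddof=1)
              + (n2 - 1) * np.var(store2, ddof=1)) / (n1 + n2 - 2)

# Difference in sample means divided by its standard error
t_stat = (np.mean(store1) - np.mean(store2)) / np.sqrt(pooled_var * (1 / n1 + 1 / n2))
p_value = 2 * stats.t.sf(abs(t_stat), df=n1 + n2 - 2)   # two-tailed p-value

print(round(t_stat, 4), round(p_value, 4))
```

The printed values should agree with the statistic and p-value reported by `stats.ttest_ind(store1, store2)`.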

### Rats on protein diets 
Consider the gain in weight (in grams) of 19 female rats between 28 and 84 days after birth.

Twelve rats were fed on a high protein diet and seven rats were fed on a low protein diet.

high_protein = [134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123] <br/>
low_protein = [70, 118, 101, 85, 107, 132, 94] <br/>

Is there any difference in the weight gain of rats fed on a high protein diet vs. a low protein diet? Test at a significance level of $\alpha = 0.05$. It's OK to assume equal variances.

1. **State the Null and alternative hypotheses:**

Null: There is no difference in mean weight gain between rats fed a high protein diet and rats fed a low protein diet.

Alternative: Mean weight gain differs by kind of diet.

2. **What kind of test should we perform and why?**

Two-sided unpaired two-sample t-test. The two groups are independent, the population standard deviations are unknown, and the sample sizes are small.

```python
# Perform the test: reject or fail to reject
high_protein = [134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123]
low_protein = [70, 118, 101, 85, 107, 132, 94]
stats.ttest_ind(high_protein, low_protein)  # two-tailed test
```

Ttest_indResult(statistic=1.89143639744233, pvalue=0.07573012895667763)

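Since the p-value (0.076) is greater than 0.05, we fail to reject the null hypothesis at a significance level of $\alpha = 0.05$. We do not have evidence that weight gain differs between the two diets.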

What if we wanted to test if the rats who ate a high protein diet gained more weight than those who ate a low-protein diet?

Null: Mean weight gain of rats on the high protein diet is no greater than that of rats on the low protein diet.

Alternative: Mean weight gain of rats on the high protein diet is greater than that of rats on the low protein diet.

Kind of test? One-sided unpaired two-sample t-test.

Calculate the critical value of the t-statistic:

```python
# Critical t-statistic for an upper-tailed test at alpha = 0.05
stats.t.ppf(q=0.95, df=len(high_protein) + len(low_protein) - 2)
```

1.7396067260750672

We can reject the null hypothesis in favor of the alternative at $\alpha = 0.05$ (one-sided test). The observed t-statistic (1.891) exceeds the critical value (1.740), so it lies in the rejection region.
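
The one-sided conclusion can also be checked with a p-value. The sketch below computes the upper-tail probability of the observed t-statistic with `stats.t.sf`. Recent SciPy releases also accept `alternative="greater"` in `stats.ttest_ind`, if your installed version supports it.

```python
from scipy import stats

high_protein = [134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123]
low_protein = [70, 118, 101, 85, 107, 132, 94]

t_stat, p_two_sided = stats.ttest_ind(high_protein, low_protein)
df = len(high_protein) + len(low_protein) - 2

# Upper-tail probability: P(T >= t_stat) under the null hypothesis
p_one_sided = stats.t.sf(t_stat, df=df)

print(round(t_stat, 4), round(p_one_sided, 4))
```

Because the t-statistic is positive, the one-sided p-value is half the two-sided one, which explains why the one-sided test rejects at $\alpha = 0.05$ while the two-sided test does not.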

## Summary
**Key Takeaways:**

- A statistical hypothesis test is a method for testing a hypothesis about a parameter in a population using data measured in a sample.

- Hypothesis tests consist of a null hypothesis and an alternative hypothesis.

- We test a hypothesis by determining the chance of obtaining a sample statistic at least as extreme as the one observed if the null hypothesis regarding the population parameter is true.

- One-sample z-tests and one-sample t-tests are hypothesis tests for the population mean $\mu$.

- We use a one-sample z-test for the population mean when the population standard deviation is known and the sample size is sufficiently large. We use a one-sample t-test for the population mean when the population standard deviation is unknown or when the sample size is small.

- Two-sample t-tests are hypothesis tests for differences in two population means.