# Hypothesis Testing

## Scenarios

- Chemistry - do inputs from two different barley fields produce different
yields?
- Astrophysics - do star systems with near-orbiting gas giants have hotter
stars?
- Economics - demography, surveys, etc.
- Medicine - BMI vs. Hypertension, etc.
- Business - which ad is more effective given engagement?

![img1](./img/img1.png)

![img2](./img/img2.png)

### Null Hypothesis / Alternative Hypothesis Structure

![img3](./img/img3.png)                  

### The Null Hypothesis

![gmonk](https://vignette.wikia.nocookie.net/villains/images/2/2f/Ogmork.jpg/revision/latest?cb=20120217040244) There is NOTHING, **no** difference.
![the nothing](https://vignette.wikia.nocookie.net/theneverendingstory/images/a/a0/Bern-foster-the-neverending-story-clouds.jpg/revision/latest?cb=20160608083230)

### The Alternative Hypothesis

![difference](./img/giphy.gif)

### Error

- TYPE I: False positive rate (incorrectly reject) --> reject H0 when H0 is true
- TYPE II: False negative rate (incorrectly fail to reject) --> not reject H0 when H1 is true
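
Both error rates can be estimated by simulation. The sketch below is illustrative only: the normal populations, the sample size of 10, the true mean of 0.5 under H1, and alpha = 0.05 are assumptions chosen for the demonstration, not values from these notes.

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
alpha = 0.05      # assumed significance level
n_trials = 2000   # number of simulated experiments
n = 10            # assumed sample size per experiment

# H0 is true: every sample really comes from a population with mean 0.
null_samples = rng.normal(loc=0.0, scale=1.0, size=(n_trials, n))
p_null = ttest_1samp(null_samples, 0.0, axis=1).pvalue
type1_rate = np.mean(p_null < alpha)   # rejected H0 although H0 is true

# H1 is true: the population mean is actually 0.5, but we still test against 0.
alt_samples = rng.normal(loc=0.5, scale=1.0, size=(n_trials, n))
p_alt = ttest_1samp(alt_samples, 0.0, axis=1).pvalue
type2_rate = np.mean(p_alt >= alpha)   # failed to reject H0 although H1 is true

print(f"Type I rate  ~ {type1_rate:.3f}")
print(f"Type II rate ~ {type2_rate:.3f}")
```

The Type I rate should land close to alpha, while the Type II rate depends on how large the true difference is and how many observations each experiment has.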

### Choosing the right error rate

- Alpha, α
- Sigma, σ
- Depends on field of study, 0.2 >= α >= 0.00001

- Does p fall in the alpha (rejection) region?
- alpha = .05, p < .05, significant!

### T-test

Why use it?
- Sometimes the population standard deviation is irrelevant, and sometimes it’s
unknown. (we’ll get to the different types of t-test later)
- Sometimes a sample is too small to be confident that it’s an accurate representation of reality

### T vs Z (again)

A t-test is like a modified z-test:
- Penalize for small sample size - “degrees of freedom”
- Use sample std. dev. s to estimate population σ
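
Written out for a single sample, the two statistics differ only in which standard deviation sits in the denominator. Here $\bar{x}$ is the sample mean, $\mu$ the population mean under H0, and $n$ the sample size; these are the usual textbook symbols, not notation taken from the figures.

$$
z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}, \qquad
t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad
\text{df} = n - 1
$$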

![img5](./img/img5.png)

### T and Z in detail
![img4](./img/img4.png)

### T-value table

![img6](./img/img6.png)

### P-Values
![picjellybeans](https://imgs.xkcd.com/comics/significant.png)

### Language of Hypothesis Testing

If p < α : we *reject* the null hypothesis<br>
If p ≥ α : we *fail to reject* the null hypothesis
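
The decision rule is small enough to write down directly. `decide` is a hypothetical helper name used only for this illustration, and the default alpha of 0.05 is an assumption.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the conventional wording for a test decision."""
    if p_value < alpha:
        return "reject the null hypothesis"
    # Never "accept": a large p only means the evidence was insufficient.
    return "fail to reject the null hypothesis"


print(decide(0.03))        # p below alpha
print(decide(0.12, 0.05))  # p above alpha
```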


Language is **important**

- What is a p-value? The probability of seeing results at least as extreme as those observed, assuming the null hypothesis is true. It is *not* the probability that the null hypothesis is true.

### What if the experiment fails?

- Don’t throw out failed experiments
- This methodology, with this data, does not produce significant results
  - More data
  - More time
  - More details

### T-test success recipe

Regardless of the type of t-test you are performing, there are 5 main steps to executing it:

- Set up null and alternative hypotheses --> H0, H1

- Choose a significance level --> alpha = .05

- Calculate the test statistic --> t (from the sample means, standard deviations, and sizes)

- Determine the critical value or the p-value (find the rejection region)

- Compare the t-value with the critical t-value (or p with alpha) to reject or fail to reject the null hypothesis.
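
One way to walk through the five steps in code, using the critical-value route. This is a sketch for a two-tailed one-sample test; it borrows the sample and population mean from Question 1 below, and the variable names are illustrative.

```python
from statistics import mean, stdev
from scipy.stats import t as t_dist

# Step 1: H0: the population mean equals mu0; H1: it does not (two-tailed).
mu0 = 85
sample = [90, 100, 110]

# Step 2: significance level.
alpha = 0.05

# Step 3: test statistic.
n = len(sample)
df = n - 1
t_stat = (mean(sample) - mu0) / (stdev(sample) / n ** 0.5)

# Step 4: critical value that bounds the two-tailed rejection region.
t_crit = t_dist.ppf(1 - alpha / 2, df)

# Step 5: compare the statistic with the critical value.
decision = "reject H0" if abs(t_stat) > t_crit else "fail to reject H0"
print(t_stat, t_crit, decision)
```

With only two degrees of freedom the critical value is large, so even a t of about 2.6 does not reach the rejection region.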

# Question 1
Is this sample any different from the population?
- Population mean = 85
- Sample = [90,100,110]

#### Using `scipy`

In [1]:
from scipy.stats import ttest_1samp
data = [90,100,110]
ttest_1samp(data,85)

Ttest_1sampResult(statistic=2.5980762113533156, pvalue=0.12168993434632014)

#### Manual implementation

In [2]:
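from statistics import mean, stdev

data = [90,100,110]
mu = 85
n = len(data)
s = stdev(data)   # sample standard deviation (n-1 in the denominator)
df = n-1          # degrees of freedom

# t = (sample mean - population mean) / standard error of the mean
t = (mean(data)-mu)/(s/(n**.5))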

In [3]:
print(t)
print(df)

2.5980762113533156
2
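
The manual version stops at t and df. To recover the p-value that `ttest_1samp` reported, the t statistic can be passed through the t distribution's survival function. This sketch assumes a two-tailed test, which is scipy's default.

```python
from statistics import mean, stdev
from scipy.stats import t as t_dist

data = [90, 100, 110]
mu = 85
n = len(data)
df = n - 1
t_stat = (mean(data) - mu) / (stdev(data) / n ** 0.5)

# Two-tailed p-value: probability of a |t| at least this large under H0.
p_value = 2 * t_dist.sf(abs(t_stat), df)
print(p_value)
```

The result should agree with the `pvalue` field of the scipy output above (about 0.12), so at alpha = .05 we fail to reject the null hypothesis.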


# Question 2

I'm buying jeans from store A and store B. I know nothing about their inventory other than prices. Should I go to just one store for a less expensive pair of jeans?
I'm pretty apprehensive about this big decision, so alpha = 0.10.

Try this both manually and with scipy

- [20,30,30,50,75,25,30,30,40,80]
- [60,30,70,90,60,40,70,40]

In [1]:
store1 = [20,30,30,50,75,25,30,30,40,80]
store2 = [60,30,70,90,60,40,70,40]

from scipy.stats import ttest_ind
ttest_ind(store1, store2, equal_var = False)

Ttest_indResult(statistic=-1.7120298677915535, pvalue=0.10685037968363302)

In [3]:
ttest_ind(store1, store2, equal_var = True)

Ttest_indResult(statistic=-1.70113828065953, pvalue=0.10826653002468378)
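
The exercise also asks for a manual version. A possible sketch of the unequal-variance (Welch) test, matching `equal_var = False`, is shown here. It assumes a two-tailed test and uses the Welch-Satterthwaite approximation for the degrees of freedom; the variable names are illustrative.

```python
from statistics import mean, variance
from scipy.stats import t as t_dist

store1 = [20, 30, 30, 50, 75, 25, 30, 30, 40, 80]
store2 = [60, 30, 70, 90, 60, 40, 70, 40]
alpha = 0.10

n1, n2 = len(store1), len(store2)
# Squared standard error contributed by each sample (sample variance / n).
se1 = variance(store1) / n1
se2 = variance(store2) / n2

t_stat = (mean(store1) - mean(store2)) / (se1 + se2) ** 0.5

# Welch-Satterthwaite approximation for the degrees of freedom.
df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))

p_value = 2 * t_dist.sf(abs(t_stat), df)  # two-tailed
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(t_stat, df, p_value, decision)
```

The p-value of about 0.107 sits just above alpha = 0.10, so the prices do not give enough evidence to say one store is cheaper.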

# Question 3
Given the same data as in Question 1, how many more samples would you need to reach p = 0.01, assuming the sample mean and sample std. dev. do not change?

In [8]:
data = [90,100,110]
mu = 85
n = len(data)
s = stdev(data)
df = n-1

t = (100-85)/(s/(n**.5))

In [10]:
for n in range(3,10):
    df = n-1
    t = (100-85)/(s/(n**.5))
    print(df, t)

2 2.5980762113533156
3 3.0
4 3.3541019662496843
5 3.674234614174767
6 3.968626966596886
7 4.242640687119286
8 4.5


In [11]:
print(t)
print(df)

4.5
8


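Reading a t-table at p = 0.01 for a one-tailed test, you'd need 5 degrees of freedom, n=6 (t ≈ 3.674 exceeds the critical value 3.365).  That's 3 more samples. For a two-tailed test, which is what scipy reports by default, the critical value at 5 degrees of freedom is 4.032, so you'd need 6 degrees of freedom, n=7 (t ≈ 3.969 exceeds 3.707).  That's 4 more samples.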
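
The same search can be done numerically instead of with the table. This sketch holds the sample mean and standard deviation from Question 1 fixed, as the question assumes, and reports the first n at which the p-value drops below 0.01 for both a one-tailed and a two-tailed test. The dictionary `first_n` is an illustrative name.

```python
from scipy.stats import t as t_dist

# Sample mean, population mean, and sample std. dev. from Question 1.
x_bar, mu, s = 100, 85, 10.0
target = 0.01

first_n = {"one-tailed": None, "two-tailed": None}
for n in range(3, 10):
    df = n - 1
    t_stat = (x_bar - mu) / (s / n ** 0.5)
    p_one = t_dist.sf(t_stat, df)   # P(T >= t) under H0
    p_two = 2 * p_one               # both tails
    print(n, df, round(t_stat, 3), round(p_one, 4), round(p_two, 4))
    if first_n["one-tailed"] is None and p_one < target:
        first_n["one-tailed"] = n
    if first_n["two-tailed"] is None and p_two < target:
        first_n["two-tailed"] = n

print(first_n)
```

Under these assumptions the loop finds n = 6 for the one-tailed test and n = 7 for the two-tailed one.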