# Hypothesis Testing

## Scenarios

- Chemistry - do inputs from two different barley fields produce different
yields?
- Astrophysics - do star systems with near-orbiting gas giants have hotter
stars?
- Economics - demography, surveys, etc.
- Medicine - BMI vs. Hypertension, etc.
- Business - which ad is more effective given engagement?

![img1](./img/img1.png)

![img2](./img/img2.png)

### Null Hypothesis / Alternative Hypothesis Structure

![img3](./img/img3.png)                  

### The Null Hypothesis

![gmonk](https://vignette.wikia.nocookie.net/villains/images/2/2f/Ogmork.jpg/revision/latest?cb=20120217040244) There is NOTHING, **no** difference.
![the nothing](https://vignette.wikia.nocookie.net/theneverendingstory/images/a/a0/Bern-foster-the-neverending-story-clouds.jpg/revision/latest?cb=20160608083230)

### The Alternative Hypothesis

![difference](./img/giphy.gif)

### Error

- TYPE I: False positive (incorrectly reject a null hypothesis that is true); its rate is α
- TYPE II: False negative (incorrectly fail to reject a null hypothesis that is false)
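
To make the Type I error rate concrete, here is a minimal simulation sketch. It assumes normally distributed data and a null hypothesis that is actually true (the population mean really is 0). The names `n_trials` and `sample_size` are hypothetical and not from these notes. When the null is true, roughly a fraction α of the tests should still reject it.

```python
import numpy as np
from scipy import stats

alpha = 0.05          # chosen significance level
n_trials = 2000       # hypothetical number of repeated experiments
sample_size = 10      # hypothetical size of each sample

rng = np.random.default_rng(0)
# Every row is one experiment drawn from a population whose mean really is 0,
# so the null hypothesis "mean == 0" is true for all of them.
samples = rng.normal(loc=0.0, scale=1.0, size=(n_trials, sample_size))

# One t-test per row; any rejection here is a false positive (Type I error).
p_values = stats.ttest_1samp(samples, 0.0, axis=1).pvalue
type_1_rate = np.mean(p_values < alpha)

print(type_1_rate)    # should land close to alpha
```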

### Choosing the right error rate

- Alpha, α
- Sigma, σ
- Depends on field of study, 0.00001 ≤ α ≤ 0.2

### T-test

Why use it?
- Sometimes the population standard deviation is irrelevant, and sometimes it’s
unknown (we’ll get to the different types of t-test later).
- Sometimes a sample is too small to be confident that it’s an accurate representation of reality.

### T vs Z (again)

A t-test is like a modified z-test:
- Penalize for small sample size - “degrees of freedom”
- Use sample std. dev. s to estimate population σ

![img5](./img/img5.png)

### T and Z in detail
![img4](./img/img4.png)
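
Written out, the two statistics are as follows. This is a sketch consistent with the formulas used in the manual implementation later in these notes, not a transcription of the image: $\bar{x}$ is the sample mean, $\mu$ the population mean, $s$ the sample standard deviation, $\sigma$ the population standard deviation, and $n$ the sample size.

$$
z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}, \qquad
t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad
\text{df} = n - 1
$$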

### T-value table

![img6](./img/img6.png)
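
A table like this can also be generated rather than looked up. The following is a sketch using `scipy.stats.t.ppf`; the α levels and degrees of freedom are arbitrary choices for illustration and are not taken from the image above.

```python
from scipy import stats

alphas = [0.10, 0.05, 0.01]        # two-tailed significance levels (illustrative)
dfs = [1, 2, 3, 4, 5, 10, 30]      # degrees of freedom (illustrative)

# Two-tailed critical value: the point that leaves alpha/2 in each tail.
t_table = {
    df: [stats.t.ppf(1 - a / 2, df) for a in alphas]
    for df in dfs
}

print("df   " + "  ".join(f"a={a:<5}" for a in alphas))
for df, row in t_table.items():
    print(f"{df:<4} " + "  ".join(f"{value:7.3f}" for value in row))

# The z critical values are the limit the t values approach as df grows.
z_row = [stats.norm.ppf(1 - a / 2) for a in alphas]
print("z    " + "  ".join(f"{value:7.3f}" for value in z_row))
```

Each t critical value is larger than the matching z value, which is the small-sample penalty described earlier.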

### P-Values
![picjellybeans](https://imgs.xkcd.com/comics/significant.png)

### Language of Hypothesis Testing

If p < α : we *reject* the null hypothesis<br>
If p > α : we *fail to reject* the null hypothesis


Language is **important**

### What if the experiment fails?

- Don’t throw out failed experiments
- This methodology, with this data, does not produce significant results
  - More data
  - More time
  - More details

### T-test success recipe

Regardless of the type of t-test you are performing, there are five main steps:

- Set up null and alternative hypotheses

- Choose a significance level

- Calculate the test statistic

- Determine the critical value or the p-value (find the rejection region)

- Compare the t-value with the critical t-value (or the p-value with α) to reject or fail to reject the null hypothesis.
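
The five steps can be sketched as one small function. The name `one_sample_t_recipe` and its return format are hypothetical, and the sketch assumes a two-tailed, one-sample test. Other t-tests change steps 3 and 4 but keep the same overall shape.

```python
from statistics import mean, stdev
from scipy import stats

def one_sample_t_recipe(data, mu, alpha=0.05):
    """Two-tailed one-sample t-test, written out step by step (illustrative)."""
    # Step 1: H0: population mean == mu; H1: population mean != mu.
    # Step 2: the significance level is the `alpha` argument.
    n = len(data)
    df = n - 1
    # Step 3: test statistic.
    t_value = (mean(data) - mu) / (stdev(data) / n ** 0.5)
    # Step 4: critical value bounding the two-tailed rejection region.
    t_critical = stats.t.ppf(1 - alpha / 2, df)
    # Step 5: compare, and phrase the decision carefully.
    decision = "reject H0" if abs(t_value) > t_critical else "fail to reject H0"
    return t_value, t_critical, decision

print(one_sample_t_recipe([90, 100, 110], 85))
```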

# Question 1
Is this sample any different from the population?
- Population mean = 85
- Sample = [90,100,110]

#### Using `scipy`

In [14]:
# one-sample t-test with scipy
from scipy.stats import ttest_1samp
data = [90,100,110] # the sample; its mean is x-bar = sum(data)/len(data)
µ = 85 # the population mean
ttest_1samp(data, µ) # the p-value (0.12) is greater than 0.05, so we fail to reject the null hypothesis H0

Ttest_1sampResult(statistic=2.5980762113533156, pvalue=0.12168993434632014)

#### Manual implementation

In [25]:
# compute the t-value by hand
from statistics import stdev

data = [90, 100, 110]
x_bar = sum(data)/len(data) # sample mean, x-bar: the sum divided by the length
µ = 85 # population mean
s = stdev(data) # sample standard deviation
n = len(data) # sample size
df = n - 1 # degrees of freedom

t_test = (x_bar-µ)/(s/n**0.5) # t-statistic: uses s because the population standard deviation σ is unknown
# z_test = (x_bar-µ)/(σ/n**0.5) # would need the population standard deviation σ, which we don't have

print(df)
print(t_test)

2
2.5980762113533156
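
The manual calculation stops at the t-value. One way to reach the p-value that `ttest_1samp` reported is to look up the tail area of the t-distribution, sketched here with `scipy.stats.t.sf`. This assumes the two-tailed test that `ttest_1samp` performs by default; `mu` stands in for µ.

```python
from statistics import stdev
from scipy import stats

data = [90, 100, 110]
mu = 85
n = len(data)
df = n - 1
t_test = (sum(data) / n - mu) / (stdev(data) / n ** 0.5)

# sf(t) is the area to the right of t; doubling it covers both tails.
p_value = 2 * stats.t.sf(abs(t_test), df)
print(p_value)  # about 0.12, in line with the scipy result above
```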


# Question 2

I'm buying jeans from store A and store B. I know nothing about their inventory other than prices. Should I go to just one store for a less expensive pair of jeans?
I'm pretty apprehensive about this big decision, so alpha = 0.10.

Try this both manually and with scipy

- [20,30,30,50,75,25,30,30,40,80]
- [60,30,70,90,60,40,70,40]

In [26]:
store1 = [20,30,30,50,75,25,30,30,40,80] 
store2 = [60,30,70,90,60,40,70,40]

from scipy.stats import ttest_ind


In [27]:
ttest_ind(store1, store2, equal_var = False) # p = 0.107 is greater than alpha = 0.10, so we fail to reject the null hypothesis H0

Ttest_indResult(statistic=-1.7120298677915535, pvalue=0.10685037968363302)
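
The manual half of the exercise could look like the following sketch. It assumes Welch's unequal-variance t-test, which is what `equal_var = False` selects in `ttest_ind`, and uses the Welch-Satterthwaite approximation for the degrees of freedom. Names such as `se` and `df_welch` are illustrative.

```python
from statistics import mean, variance
from scipy import stats

store1 = [20, 30, 30, 50, 75, 25, 30, 30, 40, 80]
store2 = [60, 30, 70, 90, 60, 40, 70, 40]
alpha = 0.10

n1, n2 = len(store1), len(store2)
# Squared standard error contributed by each sample (sample variance / n).
v1 = variance(store1) / n1
v2 = variance(store2) / n2

se = (v1 + v2) ** 0.5
t_value = (mean(store1) - mean(store2)) / se

# Welch-Satterthwaite degrees of freedom (generally not an integer).
df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Two-tailed p-value from the t-distribution.
p_value = 2 * stats.t.sf(abs(t_value), df_welch)

print(t_value, df_welch, p_value)
print("reject H0" if p_value < alpha else "fail to reject H0")
```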

In [29]:
import numpy as np
from scipy import stats 
np.random.seed(12345678)
# Test with samples that have identical means:

rvs1 = stats.norm.rvs(loc=5,scale=10,size=500) # this creates an array of 500 values
rvs2 = stats.norm.rvs(loc=5,scale=10,size=500)
print(stats.ttest_ind(rvs1,rvs2))
# (0.26833823296239279, 0.78849443369564776)

print(stats.ttest_ind(rvs1,rvs2, equal_var = False))
# (0.26833823296239279, 0.78849452749500748)

Ttest_indResult(statistic=0.26833823296238857, pvalue=0.788494433695651)
Ttest_indResult(statistic=0.26833823296238857, pvalue=0.7884945274950106)



In [72]:
np.random.seed(12345678)
stats.norm.rvs(loc=5, scale=10, size=10)

array([ 10.53708189,  -9.59631993,  -7.94585139, -10.09673945,
        20.71874901,  -4.75696191,   9.80698788,  11.25614307,
        12.22353018,  14.10326442])

In [75]:
stats.norm.rvs(loc=5, scale=10, size=10)

array([ 2.65451189, -4.8131382 , 17.78519151, 16.19017138, 15.63490301,
       15.64924909, 18.12775051, 11.94768725, 22.17620133,  8.781742  ])

# Question 3
Given the same data as in Question 1, how many more samples would you need to achieve p ≤ 0.01, assuming the sample mean and sample std. dev. do not change?

In [31]:
data = [90,100,110]
µ = 85
n = len(data)
x_bar = sum(data)/len(data)
s = stdev(data)
df = n-1

t_test = (x_bar-µ)/(s/(n**.5))
print(t_test)

2.5980762113533156


In [32]:
# recompute t for larger sample sizes, holding x_bar and s fixed
for n in range(3,10):
    df = n-1
    t_test = (x_bar-µ)/(s/(n**.5))
    print(df, t_test)

2 2.5980762113533156
3 3.0
4 3.3541019662496843
5 3.674234614174767
6 3.968626966596886
7 4.242640687119286
8 4.5
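
Rather than comparing each t-value against a t-table by eye, the p-value for each hypothetical sample size can be computed directly. This sketch keeps the sample mean (100) and sample std. dev. (10) fixed, as the question assumes. It reports both one-tailed and two-tailed p-values, because the required n depends on which one is meant; `target_p` and `first_n` are illustrative names.

```python
from statistics import stdev
from scipy import stats

data = [90, 100, 110]
mu = 85
x_bar = sum(data) / len(data)
s = stdev(data)
target_p = 0.01

first_n = {}  # smallest n reaching the target, per kind of test
for n in range(3, 10):
    df = n - 1
    t_value = (x_bar - mu) / (s / n ** 0.5)
    p_one = stats.t.sf(t_value, df)   # one-tailed: area to the right of t
    p_two = 2 * p_one                 # two-tailed: both tails
    print(n, df, round(t_value, 3), round(p_one, 4), round(p_two, 4))
    if p_one <= target_p:
        first_n.setdefault("one-tailed", n)
    if p_two <= target_p:
        first_n.setdefault("two-tailed", n)

print(first_n)
```

Under these assumptions a one-tailed test first reaches p ≤ 0.01 at n = 6, and a two-tailed test at n = 7.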


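You'd need 5 degrees of freedom, n=6. That's 3 more samples.
At 5 degrees of freedom, t = 3.674 exceeds the one-tailed critical value in the t-table for α = 0.01 (3.365). Against the two-tailed critical value (4.032) it falls short; a two-tailed test first passes at 6 degrees of freedom (3.969 > 3.707), i.e. n = 7.
Let's add three data points and see the result. Note that the new sample has a slightly different mean (102.5) and std. dev. (about 9.35), so it only approximates the assumption above.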

In [34]:
data1 = [90,95,100,105,110,115]
µ1 = 85 
n1 = len(data1)
x_bar1 = sum(data1)/len(data1)
s1 = stdev(data1)
df= (n1-1)
t_test1 = (x_bar1 - µ1)/(s1/(n1**0.5))
print(t_test1)

4.58257569495584


In [38]:
for i in range (1,10):
    df1 = i-1
    t_test1 = (x_bar1-µ1)/(s1/(i**0.5))
    print(df1, t_test1)

0 1.8708286933869707
1 2.6457513110645907
2 3.2403703492039297
3 3.7416573867739413
4 4.183300132670378
5 4.58257569495584
6 4.949747468305833
7 5.291502622129181
8 5.612486080160911


In [39]:
ttest_1samp(data1, µ1)
# the new p-value of 0.0059 is less than alpha = 0.01, so we reject the null hypothesis H0

Ttest_1sampResult(statistic=4.58257569495584, pvalue=0.00593354451759226)