## Hypothesis Testing

[Hypothesis testing](#Hypothesis)<br/>
[AB testing](#AB)<br/>
[Z-test](#Z)<br/>
[T-test](#T)<br/>

In [14]:
from __future__ import division
import numpy as np

<a id='Hypothesis'></a>
### Hypothesis Testing

- State null hypothesis H0 (typically status quo, no effect)
- Choose a significance level alpha
- Choose and compute appropriate test statistics
- Compute p-value and 'reject' or 'fail to reject' H0
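
The four steps above can be sketched with a one-sample t-test. This is only an illustration: the `sample` values, the hypothesised mean `mu_0`, and the choice of `ttest_1samp` are assumptions made for the example, not part of the original notes.

```python
from scipy.stats import ttest_1samp

# Hypothetical measurements, invented purely for illustration
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.5]

# Step 1: state H0 (the population mean equals mu_0)
mu_0 = 5.0

# Step 2: choose a significance level
alpha = 0.05

# Step 3: compute the test statistic (here a one-sample t statistic)
t_stat, p_value = ttest_1samp(sample, mu_0)

# Step 4: compare the two-sided p-value with alpha
reject_h0 = p_value < alpha
print("t = {:.3f}, p = {:.4f}, reject H0: {}".format(t_stat, p_value, reject_h0))
```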

**Two sided tests**
- Reject H0 if test statistic is in upper or lower tail
- Compute p-value using probability of being in either tail

**One sided tests**
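- Reject H0 only if the test statistic falls in the tail named by the alternative hypothesis (e.g. when testing whether advertising increases sales, only the upper tail counts as evidence against H0)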
- Compute p-value using probability of being in only one tail
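
As a rough sketch of the difference, the p-values for both kinds of test can be computed from the same statistic. The value `z = 1.8` is hypothetical, and the statistic is assumed to follow a standard normal distribution under H0.

```python
from scipy.stats import norm

z = 1.8  # hypothetical test statistic, chosen only for illustration

# Two-sided: probability of being at least this extreme in either tail
p_two_sided = 2 * norm.sf(abs(z))

# One-sided, H1 says the effect is positive: upper tail only
p_upper = norm.sf(z)

# One-sided, H1 says the effect is negative: lower tail only
p_lower = norm.cdf(z)

print("two-sided: {:.4f}".format(p_two_sided))
print("upper tail: {:.4f}".format(p_upper))
print("lower tail: {:.4f}".format(p_lower))
```

With this statistic the upper-tail test rejects H0 at `alpha = 0.05` while the two-sided test does not, which is why the direction of the test should be fixed before looking at the data.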

**type I error** - `alpha`, reject H0 when it is true (convict innocent)<br/>
**type II error** - `beta`, fail to reject H0 when it is false (release guilty)

<img src="img/hypothesis_testing.png" width="300">

|              |                   | Truth             |                    |
|--------------|-------------------|-------------------|--------------------|
|              |                   | H0 true           | H0 false           |
|**Findings**  | Reject H0         | Type I error (FP) | Correct (TP)       |
|              | Fail to reject H0 | Correct (TN)      | Type II error (FN) |

**p-value** - probability, assuming H0 is true, of observing data at least as extreme as what was observed<br/>
**confidence interval** - if we compute a 95% CI from each of many random samples of the population, about 95% of those intervals will contain the true population value

**power (sensitivity)** - `1-beta`, probability of rejecting H0 when it is false (i.e. not making a type II error)<br/>
**significance level (type I error rate)** - `alpha`, probability of rejecting H0 given that it is true
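
A small simulation can make `alpha` and power concrete. The sketch below is an illustration under assumed settings (normal data, 30 observations per experiment, a true mean of 0.5 for the alternative, a fixed seed); none of these numbers come from the notes above.

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.RandomState(0)  # fixed seed so the run is repeatable
alpha = 0.05
n_experiments, n = 2000, 30     # assumed simulation sizes

# H0 is true (mean really is 0): the rejection rate estimates alpha
null_samples = rng.normal(loc=0.0, scale=1.0, size=(n_experiments, n))
_, p_null = ttest_1samp(null_samples, 0.0, axis=1)
type_1_rate = np.mean(p_null < alpha)

# H0 is false (true mean is 0.5): the rejection rate estimates power
alt_samples = rng.normal(loc=0.5, scale=1.0, size=(n_experiments, n))
_, p_alt = ttest_1samp(alt_samples, 0.0, axis=1)
power = np.mean(p_alt < alpha)

print("estimated type I error rate: {:.3f}".format(type_1_rate))
print("estimated power: {:.3f}".format(power))
```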

<a id='AB'></a>
### AB Testing

Can we say with statistical significance whether one of the landing pages is better (i.e. gets more registrations) than the other?

```
                 Visitors    Registrations
Landing Page 2   1,012,285   349,643
Landing Page 3     995,750   320,432

H0: Landing pages 2 and 3 have the same registration rate.
H1: One of the pages has a higher registration rate.

Registrations per visitor
2: 349,643 / 1,012,285 = 0.3454
3: 320,432 /   995,750 = 0.3218
```

In [2]:
lp2 = np.zeros(1012285)
lp2[0:349643] = 1
lp3 = np.zeros(995750)
lp3[0:320432] = 1

In [15]:
from scipy.stats import ttest_ind
ttest_ind(lp2, lp3, equal_var=False)

Ttest_indResult(statistic=35.476606227821215, pvalue=1.3745040544084776e-275)

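The p-value (1.37e-275) is far below 0.05, so we reject H0: the registration rates of the two pages differ, and the positive statistic indicates that landing page 2 converts better.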
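
As a cross-check, the same comparison can be written as a two-proportion z-test computed directly from the counts, without building the 0/1 arrays. This is a sketch of a swapped-in technique (pooled-proportion z-test), not the method used in the cell above; the variable names are chosen for illustration.

```python
import numpy as np
from scipy.stats import norm

# Counts from the table above (floats avoid integer division on Python 2)
visitors = np.array([1012285, 995750], dtype=float)
registrations = np.array([349643, 320432], dtype=float)

rates = registrations / visitors

# Under H0 both pages share one rate, estimated from the pooled counts
pooled = registrations.sum() / visitors.sum()
se = np.sqrt(pooled * (1 - pooled) * (1 / visitors[0] + 1 / visitors[1]))

z = (rates[0] - rates[1]) / se
p_value = 2 * norm.sf(abs(z))  # two-sided

print("rates:", rates)
print("z = {:.2f}, p = {:.3g}".format(z, p_value))
```

With samples this large the z statistic should land very close to the t statistic reported by `ttest_ind`.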

<a id='Z'></a>
### Z-test

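**z-test** - use when the population variance is known (or the sample is large enough for the sample variance to stand in for it)

To compare the means of two independent samples assuming a common standard deviation, use `ztest` from `statsmodels.stats.weightstats`: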

In [7]:
from statsmodels.stats.weightstats import ztest

ztest([2,3,4,5], [3,4,5,6])

(-1.0954451150103321, 0.27332167829229814)
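
To see where these two numbers come from, the statistic can be reproduced by hand. The sketch below assumes the pooled-variance form with `ddof=1` sample variances, which reproduces the output above for this data.

```python
import numpy as np
from scipy.stats import norm

x1 = np.array([2, 3, 4, 5], dtype=float)
x2 = np.array([3, 4, 5, 6], dtype=float)
n1, n2 = len(x1), len(x2)

# Pooled sample variance (each sample variance uses ddof=1)
pooled_var = ((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1)) / (n1 + n2 - 2)

# Standard error of the difference in means
se = np.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))

z = (x1.mean() - x2.mean()) / se
p_value = 2 * norm.sf(abs(z))  # two-sided p-value from the standard normal

print(z, p_value)
```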

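`CompareMeans.ztest_ind` runs the same test and also accepts `usevar='unequal'` for samples with different standard deviations. The call below keeps the default `usevar='pooled'`, so it returns the same result as `ztest`: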

In [13]:
from statsmodels.stats.weightstats import CompareMeans, DescrStatsW

cm = CompareMeans(DescrStatsW([2,3,4,5]), DescrStatsW([3,4,5,6]))
cm.ztest_ind(alternative='two-sided')

(-1.0954451150103321, 0.27332167829229814)

<a id='T'></a>
### T-test

**t-test** - use when the population variance is unknown and has to be estimated from the sample

Calculate the T-test for the means of two independent samples of scores:

In [22]:
from scipy.stats import ttest_ind
# Welch's t-test (does not assume equal population variances)
print(ttest_ind([0,0,1,0,2], [0,1,1,0,2], equal_var=False))
# Standard independent two-sample t-test (assumes equal population variances)
print(ttest_ind([0,0,1,0,2], [0,1,1,0,2], equal_var=True))

Ttest_indResult(statistic=-0.36514837167011088, pvalue=0.72450697149417942)
Ttest_indResult(statistic=-0.36514837167011083, pvalue=0.72446582573474294)
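
The two statistics agree here because both samples have the same size; only the degrees of freedom, and therefore the p-values, differ. A hand computation of Welch's version, using the Welch-Satterthwaite approximation for the degrees of freedom, might look like the sketch below (variable names are illustrative).

```python
import numpy as np
from scipy.stats import t

a = np.array([0, 0, 1, 0, 2], dtype=float)
b = np.array([0, 1, 1, 0, 2], dtype=float)

# Per-sample variance of the mean (sample variance with ddof=1, divided by n)
va = a.var(ddof=1) / len(a)
vb = b.var(ddof=1) / len(b)

# Welch's t statistic: no pooling of the two variances
t_stat = (a.mean() - b.mean()) / np.sqrt(va + vb)

# Welch-Satterthwaite approximation to the degrees of freedom
df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))

p_value = 2 * t.sf(abs(t_stat), df)  # two-sided

print(t_stat, df, p_value)
```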
