# A/B Test 

Use an A/B test to evaluate a new product or a new feature online.

**Steps**:
- Randomly split users into 2 sets: a control set (existing feature) and an experiment set (new feature)
- Compare how users in each set respond to determine the best version of the feature.
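
The random split in the first step can be sketched as below. This is a minimal illustration, assuming the full list of user IDs is known up front; `assign_groups` and the fixed `seed` are hypothetical names chosen for the example, not part of any library.

```python
import random

def assign_groups(user_ids, seed=42):
    """Shuffle users and split them into two equally sized groups.

    Hypothetical helper: the seed only makes the example reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # First half sees the existing feature, second half the new one
    return shuffled[:half], shuffled[half:]

control, experiment = assign_groups(range(1000))
```

Each user lands in exactly one group, so the two groups can be compared without overlap.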

**We can't use A/B Test when**:
- Results would take too long to observe
- No data is available for the experiment

**In practice**:
* Construct the user flow (the customer funnel)
* Choose a metric:
    - Click through rate: `nbClick/nbPageView`, to measure the usability of a feature
    - Click through probability: `nbUniqueVisitorWhoClick/nbUniqueVisitorToPag`, to measure the impact of a feature
* Perform experiment sizing
* Analyze results
* Draw conclusion
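
The two metrics in the list above differ only in what is counted: every click and page view for the rate, unique visitors for the probability. The sketch below computes both from a tiny made-up event log; the `events` list and its `(visitor_id, kind)` layout are assumptions for illustration.

```python
# Hypothetical event log: (visitor_id, event_kind)
events = [
    ("u1", "view"), ("u1", "click"), ("u1", "view"), ("u1", "click"),
    ("u2", "view"),
    ("u3", "view"), ("u3", "view"), ("u3", "click"),
]

# Click through rate: all clicks divided by all page views
page_views = sum(1 for _, kind in events if kind == "view")
clicks = sum(1 for _, kind in events if kind == "click")
click_through_rate = clicks / page_views

# Click through probability: unique visitors who clicked at least once,
# divided by unique visitors who saw the page
visitors = {visitor for visitor, kind in events if kind == "view"}
clickers = {visitor for visitor, kind in events if kind == "click"}
click_through_probability = len(clickers) / len(visitors)

print(click_through_rate, click_through_probability)
```

A visitor who clicks several times raises the rate but counts only once in the probability.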



#### Hypothesis testing
How likely is it that my result was obtained by chance? I have to calculate P(results due to chance).

So we need a hypothesis describing what the result would be if the experiment has no effect - this is called the **NULL HYPOTHESIS (H0)**

If the experiment has no effect, the probability for the control group is equal to the probability for the experiment group. In other words, the difference between the two probabilities is zero.

So **H0 : Pcont = Pexp, or Pcont-Pexp = 0**

We also need a hypothesis describing what the result would be if the experiment has an effect, which is the opposite of H0 - this is called the **ALTERNATIVE HYPOTHESIS (H1)**

So **H1 : Pcont-Pexp != 0**

Next steps:
* Measure Pcont & Pexp
* Calculate hyp = Pcont-Pexp
* Calculate the probability that a result at least as extreme as hyp occurs by chance if H0 is true: P(hyp|H0)
* To test at a 95% confidence level, set alpha = 1-0.95 = 0.05
* If P(hyp|H0) < alpha, we reject H0 in favor of H1; otherwise we fail to reject H0
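
The last three steps can be sketched as follows. This is an illustrative calculation, assuming the observed difference and its standard error are already known and that the difference is approximately normal under H0; `two_sided_p_value` and the input values are hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(d_hat, se):
    """Probability of a difference at least this extreme if H0 is true.

    Hypothetical helper: assumes d_hat ~ Normal(0, se) under H0.
    """
    z = d_hat / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 1 - 0.95  # 95% confidence level

# Made-up inputs: a difference of 0.02 with a standard error of 0.01
p_value = two_sided_p_value(d_hat=0.02, se=0.01)
reject_h0 = p_value < alpha

print(p_value, reject_h0)
```

A small p-value means the observed difference would be unlikely if the experiment had no effect, which is why it leads to rejecting H0.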

****
* TotalSuccess = total number of successes (clicks) across both groups
* TotalUsers = total number of users across both groups

##### Pooled probability of a click:
$$\hat{P}_{pool} = \frac{TotalSuccess}{TotalUsers} = \frac{X_{exp} + X_{cont}}{N_{exp} + N_{cont}}$$

##### Pooled standard error of a click:
$$SE_{pool} = \sqrt{\hat{P}_{pool} * (1-\hat{P}_{pool}) * (\frac{1}{N_{cont}} + \frac{1}{N_{exp}})}$$
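
As a worked example of the two formulas above, the sketch below plugs in made-up counts. The values of `x_cont`, `n_cont`, `x_exp` and `n_exp` are hypothetical and do not come from a real experiment.

```python
from math import sqrt

# Hypothetical counts: clicks (x) and users (n) in each group
x_cont, n_cont = 100, 1000
x_exp, n_exp = 130, 1000

# Pooled probability of a click across both groups
p_pool = (x_exp + x_cont) / (n_exp + n_cont)

# Pooled standard error of the difference between the two groups
se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_cont + 1 / n_exp))

print(p_pool, se_pool)
```

With these counts the pooled probability is 0.115 and the pooled standard error is roughly 0.0143.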

****
##### Difference between Pexp & Pcont :
$$ \hat{d} = \hat{P}_{exp} - \hat{P}_{cont} $$

****
Under the null hypothesis:
$$d = 0$$

So we expect $\hat{d}$ to be approximately normal with mean 0 and standard deviation $SE_{pool}$:
$$\hat{d} \sim \mathcal{N} (0,SE_{pool})$$

****
Compare $\hat{d}$ to the critical Z-score (1.96 for a two-sided test at alpha = 0.05):
- if:
$$\hat{d} > 1.96*SE_{pool}$$
- or:
$$\hat{d} < -1.96*SE_{pool}$$
We reject H0 and say that our difference is statistically significant. That means we reject the claim that our experiment has no effect.
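
The decision rule above can be sketched end to end as follows. This is a minimal illustration under the normal approximation; `pooled_z_test` is a hypothetical helper and the counts passed to it are made up. The critical value is derived from `alpha` rather than hard-coded, and it comes out at about 1.96 for alpha = 0.05.

```python
from math import sqrt
from statistics import NormalDist

def pooled_z_test(x_cont, n_cont, x_exp, n_exp, alpha=0.05):
    """Return (d_hat, margin, significant) for a two-sided pooled test.

    Hypothetical helper: x_* are click counts, n_* are user counts.
    """
    p_pool = (x_cont + x_exp) / (n_cont + n_exp)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_cont + 1 / n_exp))
    d_hat = x_exp / n_exp - x_cont / n_cont
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    margin = z_crit * se_pool
    # Reject H0 when d_hat falls outside [-margin, +margin]
    return d_hat, margin, abs(d_hat) > margin

# Made-up counts: 10% vs 13% click probability
d_hat, margin, significant = pooled_z_test(100, 1000, 130, 1000)

# Made-up counts: 10% vs 11% click probability
_, _, significant_small = pooled_z_test(100, 1000, 110, 1000)

print(d_hat, margin, significant, significant_small)
```

With 1000 users per group, a 3-point difference falls just outside the margin while a 1-point difference does not, which shows how strongly the conclusion depends on sample size.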

### Experiment sizing

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Current click through rate
cctr = 0.1

# Minimum increase we want to detect on the new feature: 2 percentage points
# (a common value requested by the business side)
practical_significance = 0.02

# Desired click through rate in the experiment
dctr = cctr + practical_significance

# How many data points (page views) do we need in each group to reliably
# detect that kind of change? This is a statistical power calculation.

# Leave out the "nobs1" parameter to solve for it
nip = NormalIndPower()
nip.solve_power(effect_size = proportion_effectsize(dctr, cctr), alpha = .05, power = 0.8)
```

```
3834.5957398840183
```
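
As a cross-check on the value above, the same sample size can be approximated in closed form using only the standard library. The sketch assumes a two-sided test with equally sized groups and uses Cohen's h as the effect size, $n = 2\left(\frac{z_{1-\alpha/2} + z_{power}}{h}\right)^2$; the variable names are chosen for the example. It ignores the negligible probability of rejecting in the wrong direction, so it should agree closely with the solver rather than exactly.

```python
from math import asin, sqrt
from statistics import NormalDist

cctr = 0.1   # current click through rate
dctr = 0.12  # desired click through rate
alpha = 0.05
power = 0.8

# Cohen's h: difference of arcsine-transformed proportions
h = 2 * (asin(sqrt(dctr)) - asin(sqrt(cctr)))

# Standard normal quantiles for the significance level and the power
z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
z_power = NormalDist().inv_cdf(power)

# Observations needed in each group (equal group sizes assumed)
n_per_group = 2 * ((z_alpha + z_power) / h) ** 2

print(n_per_group)
```

The result is the number of observations per group, so the experiment as a whole needs about twice that many page views.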