 # A/B test

## What is an A/B test
An experiment that compares two competing options (A, B). We use it to determine whether the options differ in a statistical sense (hypothesis testing, here with a permutation test), so we can decide which option is better.

### Steps of A/B test

**0. Idea & Definition:** Question, goal, data/subjects, options, test statistics.

**1. Subjects:** Set of all subjects.

**2. Randomization:** Randomly assign subjects to the two groups (A, B).

**3. Results:** Expose subjects to options, measure results, and compute test statistics.

**4. Hypothesis testing:** Determine if the observed difference is statistically significant. (Permutation test)

**5. Action/Decision:** Act based on the results.
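
Step 2 (randomization) can be sketched as follows. This is a minimal illustration, not part of the original notebook: the function name `assign_groups` and the `user_…` subject IDs are hypothetical, and an even split between the two groups is assumed.

```python
import random

def assign_groups(subjects, seed=None):
    """Randomly split subjects into two groups of (nearly) equal size."""
    rng = random.Random(seed)  # local generator so the split is reproducible
    shuffled = rng.sample(subjects, len(subjects))  # shuffled copy, input untouched
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"user_{i}" for i in range(20)]  # hypothetical subject IDs
group_a, group_b = assign_groups(subjects, seed=2)
print(len(group_a), len(group_b))
```

Every subject lands in exactly one group, and chance alone decides which one. That is what lets the Null model treat group membership as random.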


You need to test (hypothesis testing with a p-value) whether the result is statistically significant or could be due to random chance.

Test statistic: the yes-rate difference (A% - B%).

The observed difference between A and B must be due to either:
* Null hypothesis: Random chance (subject assignment)
* Alternative hypothesis: Real difference between A & B.

**Hypothesis testing:** is random chance (Null hypothesis) a reasonable explanation for the observed difference?
* Assume that the Null hypothesis is true
* Create corresponding Null model (probability model)
* Test whether the observed difference is a reasonable outcome of that Null model
* Is the observed difference within the random variability of the Null model?


**P-value**
* Given a random chance (probability) model that embodies the Null hypothesis
* The p-value is the probability of obtaining results at least as unusual/extreme as the observed result

**Significance level (alpha):**
* The probability threshold of 'unusualness' (commonly 0.05)
* Must be defined before the experiment
* The probability we accept for a type 1 error (False positive).

**Decision:**
* **p-value >= alpha:** retain the null hypothesis (the observed difference could plausibly be due to random chance).
* **p-value < alpha:** reject the null hypothesis (the observed difference is statistically significant).
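
The decision rule above fits in a few lines. This is only a sketch: the helper name `decide` is hypothetical, and the default `alpha=0.05` is the threshold mentioned in these notes.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule for a test at significance level alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "retain the null hypothesis"

print(decide(0.03))  # below alpha
print(decide(0.05))  # equal to alpha counts as "retain"
```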

We can use the permutation method to calculate the p-value.

**Permutation test** is a **resampling** procedure used for hypothesis testing.

**Resampling:** repeatedly sample values from the observed data to assess a statistic's random variability.

* Bootstrap: resampling **with** replacement, used to assess the reliability of an estimate.
* Permutation: resampling **without** replacement, used for **hypothesis testing** (a drawn ball is not put back in the bag).
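
The two resampling styles differ only in whether a drawn value can be drawn again. A small sketch with made-up data (the list `data` is illustrative, not from the experiment):

```python
import random

rng = random.Random(2)
data = [1, 1, 1, 0, 0]  # illustrative observations

# Bootstrap: draw with replacement, so a value may appear more often than in data
bootstrap_draw = rng.choices(data, k=len(data))

# Permutation: draw without replacement, which is just a reshuffle of data
permutation_draw = rng.sample(data, len(data))

print("bootstrap:  ", bootstrap_draw)
print("permutation:", permutation_draw)
print("same total after permutation:", sum(permutation_draw) == sum(data))
```

A permutation always keeps the same number of ones and zeros. That is why it suits the Null model here: the overall yes/no counts stay fixed and only the group labels change.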

**Permutation Test:** 
* A resampling procedure used for hypothesis testing.
* A process for combining two (or more) data samples and randomly reallocating the observations to assess the random variability of the test statistic.
* A way to create the Null model and compute the p-value.
* Advantage: no distributional assumptions are needed; the Null model is created from the data itself.

So we shuffle the pooled results, randomly reassign them to A and B, and compute the test statistic for that reshuffle. Repeating this many times gives the Null distribution; the p-value is the share of reshuffles whose statistic is at least as extreme as the observed one.

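**Two-way test** (Null: A = B, alternative: A != B): both large positive and large negative differences count as extreme, so absolute values are compared.

**One-way test** (Null: A <= B, alternative: A > B): only differences at least as large as the observed one count as extreme.

Calculate the p-value for each of them and compare it with alpha to make the decision.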
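
Written out, the two permutation p-values look like this. The notation is an assumption introduced here: $d_{\text{obs}}$ is the observed difference A% - B%, $d_i$ is the difference in the $i$-th reshuffle, $P$ is the number of reshuffles, and $\mathbf{1}[\cdot]$ is 1 when the condition holds and 0 otherwise.

$$
p_{\text{two-way}} = \frac{1}{P}\sum_{i=1}^{P} \mathbf{1}\left[\,|d_i| \ge |d_{\text{obs}}|\,\right],
\qquad
p_{\text{one-way}} = \frac{1}{P}\sum_{i=1}^{P} \mathbf{1}\left[\,d_i \ge d_{\text{obs}}\,\right]
$$

Both are simply the fraction of reshuffles that are at least as extreme as the observed result, with "extreme" defined by the alternative hypothesis.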

In [1]:
import pandas as pd
from random import sample
from random import seed

In [2]:
t = 20
a = 10
b = 10
a_yes = 7
b_yes = 4

t_yes = a_yes + b_yes
t_no = t - t_yes

a_yes_pc = a_yes / a * 100
b_yes_pc = b_yes / b * 100

ab_yes_pc = a_yes_pc - b_yes_pc

print("Observed Yes Rate (%) A:", a_yes_pc,
     ", B:", b_yes_pc, ", A-B", ab_yes_pc,"\nTotal counts: Yes:", t_yes, ", No:", t_no)

Observed Yes Rate (%) A: 70.0 , B: 40.0 , A-B 30.0 
Total counts: Yes: 11 , No: 9


In [28]:
seed(2)
bag = [1] * t_yes + [0] * t_no # creating a bag with total numbers of yes' and no's

bag_shuffled = sample(bag, len(bag)) # shuffle the bag (sampling without replacement)
a_rand_samp = bag_shuffled[:a]
b_rand_samp = bag_shuffled[a:]

a_yes_pc_rs = sum(a_rand_samp) / a * 100
b_yes_pc_rs = sum(b_rand_samp) / b * 100

ab_yes_pc_rs = a_yes_pc_rs - b_yes_pc_rs

print("Bag :", str(bag).replace(",", ""),"\n",
     "Bag shuffled: ", str(bag_shuffled).replace(",", ""),"\n",
     "A resample: ", str(a_rand_samp).replace(",", ""),"\n",
     "B resample: ", str(b_rand_samp).replace(",", ""),"\n",
     "Resample Yes Rate (%): A:", str(a_yes_pc_rs).replace(",", ""),"B:",str(b_yes_pc_rs).replace(",", ""), "A-B",str(ab_yes_pc_rs).replace(",", ""), "\n",
     "Observed Yes Rate (%) A:", a_yes_pc,
     ", B:", b_yes_pc, ", A-B", ab_yes_pc)

Bag : [1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0] 
 Bag shuffled:  [1 1 0 0 1 0 0 1 1 0 1 1 1 0 0 0 1 0 1 1] 
 A resample:  [1 1 0 0 1 0 0 1 1 0] 
 B resample:  [1 1 1 0 0 0 1 0 1 1] 
 Resample Yes Rate (%): A: 50.0 B: 60.0 A-B -10.0 
 Observed Yes Rate (%) A: 70.0 , B: 40.0 , A-B 30.0


In [75]:
seed(2)
p = 100

perm_res = [0] * p

for i in range(p):
    bag = sample(bag, len(bag))
    a_rs = bag[:a]
    b_rs = bag[a:]
    perm_res[i] = 100 * sum(a_rs) / a - sum(b_rs) / b * 100

print(perm_res)

[10.0, 30.0, 30.0, -10.0, -10.0, 10.0, 10.0, 10.0, 10.0, -10.0, -10.0, 10.0, -10.0, 10.0, 30.0, 50.0, 10.0, -30.0, 10.0, -10.0, -30.0, -10.0, -30.0, -10.0, -50.0, -30.0, 10.0, 30.0, -30.0, 30.0, -10.0, 10.0, 10.0, 30.0, 50.0, 10.0, -10.0, -10.0, -10.0, -30.0, -10.0, -10.0, 10.0, 10.0, -30.0, 30.0, 10.0, -10.0, 30.0, -10.0, -10.0, -10.0, 10.0, 10.0, 10.0, 10.0, -50.0, -10.0, -10.0, -10.0, 30.0, -10.0, 10.0, 10.0, -30.0, 10.0, 10.0, -10.0, 10.0, 10.0, 30.0, -10.0, -30.0, -50.0, -10.0, -10.0, -10.0, 10.0, 10.0, 10.0, -50.0, 10.0, 10.0, 30.0, -50.0, -10.0, 10.0, -10.0, 30.0, -10.0, -10.0, -10.0, -50.0, 10.0, -30.0, -10.0, 30.0, 30.0, -10.0, -10.0]


In [76]:
per_res_s = pd.Series(perm_res)
# Count how often each permuted difference occurs, ordered by value
print(per_res_s.value_counts().sort_index().to_frame().T.to_string(index=False))

 -50.0  -30.0  -10.0   10.0   30.0   50.0
     6     10     35     33     14      2


In [77]:
# Two-way hypothesis test
extreme_count = sum(per_res_s.abs() >= abs(ab_yes_pc))

# One-way hypothesis test
pos_extreme_count = sum(per_res_s >= (ab_yes_pc))

print("Number of permutations: ", p, 
      "\nTwo-way: Extreme count: ", extreme_count,
      "\nTwo-way: Extreme ratio (p-value): ", extreme_count / p,
      "\nOne-way: Extreme count: ", pos_extreme_count,
      "\nOne-way: Extreme ratio (p-value): ", pos_extreme_count / p,
     )

Number of permutations:  100 
Two-way: Extreme count:  32 
Two-way: Extreme ratio (p-value):  0.32 
One-way: Extreme count:  16 
One-way: Extreme ratio (p-value):  0.16


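Both p-values (0.32 two-way, 0.16 one-way) are larger than 0.05 (the standard alpha threshold), so we retain the null hypothesis: the observed 30-point difference is within what random assignment alone could produce.

On the density plot of the permuted differences, the p-value is the share of the distribution that lies at or beyond the observed result, so the observed difference sets the boundary of the "extreme" region.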
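
As a cross-check on the simulated values, this small example can also be enumerated exactly instead of sampled: with 11 yes answers among 20 subjects, every way of giving `k` of them to group A can be counted directly. This is a sketch under the same setup as the notebook (10 subjects per group, observed 7 vs 4 yes answers); the helper `diff_pc` is a hypothetical name.

```python
from math import comb

t, a, b, t_yes = 20, 10, 10, 11  # totals from the example above
a_yes_obs = 7                    # observed yes count in group A

def diff_pc(k):
    """Yes-rate difference A - B (in %) when group A holds k of the yes answers."""
    return 100 * k / a - 100 * (t_yes - k) / b

obs = diff_pc(a_yes_obs)
total = comb(t, a)               # number of ways to choose group A
one_way = 0
two_way = 0
for k in range(a + 1):
    # ways to put k yes answers and a - k no answers into group A
    ways = comb(t_yes, k) * comb(t - t_yes, a - k)
    if diff_pc(k) >= obs:
        one_way += ways
    if abs(diff_pc(k)) >= abs(obs):
        two_way += ways

one_way /= total
two_way /= total
print("Exact one-way p-value:", round(one_way, 3))
print("Exact two-way p-value:", round(two_way, 3))
```

The exact values come out at roughly 0.185 (one-way) and 0.370 (two-way). They are close to the estimates from 100 permutations and lead to the same decision.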