
# AB-Testing

![cats](images/cats.jpeg)


Imagine we are the data scientists at the video game company Tactile Entertainment. The developers of the game Cookie Cats want to change the game to increase player retention. At a certain level, players reach a gate that forces them either to wait or to make an in-app purchase. The gate currently sits at level 30 and the proposal is to move it to level 40, measuring the effect on retention at 1 and 7 days. Before the change is made permanent, an A/B test is run.

The data is stored in `data/cookie_cats.csv`. Our control group is the current version, `gate_30`, and the treatment group is the version `gate_40`. We must run the test for 1-day retention (`retention_1`) and for 7-day retention (`retention_7`).
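The notebook does not write the hypotheses out explicitly. A natural formalization, assuming a two-sided test on retention proportions (the symbols $p_{30}$ and $p_{40}$ are introduced here only for illustration and denote the true retention rates under `gate_30` and `gate_40`), would be:

$$
H_0: p_{30} = p_{40} \qquad \text{vs.} \qquad H_1: p_{30} \neq p_{40}
$$

The same pair of hypotheses is tested twice, once with `retention_1` and once with `retention_7` as the outcome.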

In [13]:
# libraries

import pandas as pd
import numpy as np
from scipy import stats

from statsmodels.stats.proportion import proportions_ztest, proportion_confint
from scipy.stats import norm, sem
import statsmodels.stats.api as sms
import pylab as plt
from scipy.stats import bernoulli, beta
from bayes import *
import warnings
warnings.filterwarnings("ignore")


In [2]:
# data
df=pd.read_csv('data/cookie_cats.csv')
df.head() , df.shape

(   userid  version  sum_gamerounds  retention_1  retention_7
 0     116  gate_30               3        False        False
 1     337  gate_30              38         True        False
 2     377  gate_40             165         True        False
 3     483  gate_40               1        False        False
 4     488  gate_40             179         True         True,
 (90189, 5))

In [3]:
# Assumption: players who never played a single round are dropped from the DataFrame.
df = df[df['sum_gamerounds'] != 0]
df.shape

(86195, 5)

In [4]:
df.version.value_counts()

version
gate_40    43432
gate_30    42763
Name: count, dtype: int64

In [5]:
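# Group sizes and number of players retained after 1 day, computed from df
imps_ctrl = (df.version == 'gate_30').sum()    # 42763 players in control
convs_ctrl = ((df.version == 'gate_30') & df.retention_1).sum()
imps_test = (df.version == 'gate_40').sum()    # 43432 players in treatment
convs_test = ((df.version == 'gate_40') & df.retention_1).sum()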


In [6]:
CR_ctrl = convs_ctrl/imps_ctrl
CR_test = convs_test/imps_test
f'Retention rates: Control: {CR_ctrl}, Test: {CR_test}'

'Retention rates: Control: 0.46753034165049223, Test: 0.46217074967765703'

In [7]:

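# Effect size (Cohen's h) for detecting an absolute change of 0.2
# (20 percentage points) from the control retention rate
efecto = sms.proportion_effectsize(CR_ctrl, CR_ctrl + 0.2)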
efecto

-0.4066546818588541

In [8]:
n_requirido = sms.NormalIndPower().solve_power(efecto,
                                               power=0.99,
                                               alpha=0.01)
n_requirido

290.6407148839068
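To make the sample-size step easier to follow, here is a minimal sketch that reproduces the figure above with the standard library only. It assumes the usual closed-form approximation for a two-sided, equal-group z-test, $n \approx 2\,\big((z_{1-\alpha/2} + z_{\text{power}})/h\big)^2$, rather than the numerical solver that `NormalIndPower.solve_power` uses. The helper names `cohen_h` and `n_per_group` and the extra lifts 0.05 and 0.01 are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist


def cohen_h(p1, p2):
    """Cohen's effect size h for two proportions (arcsine transform)."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


def n_per_group(p_base, lift, alpha=0.01, power=0.99):
    """Approximate per-group sample size for a two-sided z-test
    with equally sized groups (closed-form approximation)."""
    h = cohen_h(p_base, p_base + lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * ((z_alpha + z_power) / h) ** 2


# 1-day retention rate of gate_30, taken from the notebook output
p_base = 0.46753034165049223

# 0.2 is the lift used in the notebook; 0.05 and 0.01 are illustrative only
for lift in (0.2, 0.05, 0.01):
    n = math.ceil(n_per_group(p_base, lift))
    print(f"absolute lift {lift:.2f} -> about {n} players per group")
```

The required sample size grows roughly with the inverse square of the effect size, so the choice of the minimum lift to detect largely determines how many players are needed.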

In [9]:
control = df[df.version == 'gate_30'].sample(n=291, random_state=42)

tratamiento = df[df.version == 'gate_40'].sample(n=291, random_state=42)

ab_test = pd.concat([control, tratamiento], axis=0)

ab_test.reset_index(drop=True, inplace=True)

ab_test.head()

    userid  version  sum_gamerounds  retention_1  retention_7
0  2741718  gate_30              15         True        False
1  1579627  gate_30              25        False         True
2  3558203  gate_30               5        False        False
3    50850  gate_30              34         True         True
4  6476239  gate_30              38         True         True


In [14]:
tasas_conv = ab_test.groupby('version')['retention_1']

tasas_conv = tasas_conv.agg([np.mean,
                               lambda x: np.std(x, ddof=0),
                               lambda x: stats.sem(x, ddof=0)
                               ])


tasas_conv.columns = ['conversion_rate', 'std', 'sem']

tasas_conv

         conversion_rate       std       sem
version
gate_30         0.491409  0.499926  0.029306
gate_40         0.412371  0.492261  0.028857


In [20]:

control_res = ab_test[ab_test.version=='gate_30']['retention_1']

trat_res = ab_test[ab_test.version=='gate_40']['retention_1']
sum(control_res), sum(trat_res)

(143, 120)

In [22]:
impresiones = [control_res.shape[0], trat_res.shape[0]]  # players in each group

conversiones = [sum(control_res), sum(trat_res)]     # players retained after 1 day



z_score, p_value = proportions_ztest(conversiones, nobs=impresiones)

(control_a, trata_a), (control_b, trata_b) = proportion_confint(conversiones, 
                                                                nobs=impresiones,
                                                                alpha=0.05)

(control_a, trata_a), (control_b, trata_b)

((0.43396985354894196, 0.35581271178970525),
 (0.5488480158668656, 0.4689295562515318))

In [23]:
print(f'z-score: {z_score:.2f}')

print(f'p-value: {p_value:.3f}')

print(f'95% conf. interval for control group: [{control_a:.3f}, {control_b:.3f}]')

print(f'95% conf. interval for treatment group: [{trata_a:.3f}, {trata_b:.3f}]')

z-score: 1.92
p-value: 0.055
95% conf. interval for control group: [0.434, 0.549]
95% conf. interval for treatment group: [0.356, 0.469]


Since the $p$-value $= 0.055$ is greater than $\alpha = 0.05$, we cannot reject the null hypothesis $H_0$: in this sample of 291 players per group, 1-day retention (`retention_1`) with the gate at level 40 is not significantly different from the old design with the gate at level 30.
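To show where these numbers come from, the following sketch recomputes the test by hand with the standard library. It assumes the pooled two-proportion z-test and the normal-approximation (Wald) interval, which are the default behaviours of `proportions_ztest` and `proportion_confint`. The function names `two_proportion_ztest` and `wald_interval` are hypothetical; the counts (143 and 120 retained out of 291 each) come from the cells above.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions (pooled variance)."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def wald_interval(x, n, alpha=0.05):
    """Normal-approximation confidence interval for a single proportion."""
    p = x / n
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# 1-day retention counts from the sampled groups: control, then treatment
z, p_value = two_proportion_ztest(143, 291, 120, 291)
ctrl_lo, ctrl_hi = wald_interval(143, 291)
trat_lo, trat_hi = wald_interval(120, 291)

print(f"z-score: {z:.2f}")
print(f"p-value: {p_value:.3f}")
print(f"control:   [{ctrl_lo:.3f}, {ctrl_hi:.3f}]")
print(f"treatment: [{trat_lo:.3f}, {trat_hi:.3f}]")
```

Up to rounding, this should agree with the statsmodels output shown above.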


In [26]:
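# Group sizes and number of players retained after 7 days, computed from df
imps_ctrl = (df.version == 'gate_30').sum()    # 42763 players in control
convs_ctrl = ((df.version == 'gate_30') & df.retention_7).sum()
imps_test = (df.version == 'gate_40').sum()    # 43432 players in treatment
convs_test = ((df.version == 'gate_40') & df.retention_7).sum()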


In [27]:
CR_ctrl = convs_ctrl/imps_ctrl
CR_test = convs_test/imps_test
f'Retention rates: Control: {CR_ctrl}, Test: {CR_test}'

'Retention rates: Control: 0.19844257886490657, Test: 0.1903205010130779'

In [28]:
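# Effect size (Cohen's h) for detecting an absolute change of 0.2
# (20 percentage points) from the control retention rate
efecto = sms.proportion_effectsize(CR_ctrl, CR_ctrl + 0.2)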
efecto

-0.4428623422097697

In [29]:
n_requirido = sms.NormalIndPower().solve_power(efecto,
                                               power=0.99,
                                               alpha=0.01)
n_requirido

245.05891080029676

In [30]:
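# The 1-day sample size (n=291) is reused here; it exceeds the
# 246 players per group required by the calculation above
control = df[df.version == 'gate_30'].sample(n=291, random_state=42)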

tratamiento = df[df.version == 'gate_40'].sample(n=291, random_state=42)

ab_test = pd.concat([control, tratamiento], axis=0)

ab_test.reset_index(drop=True, inplace=True)

ab_test.head()

    userid  version  sum_gamerounds  retention_1  retention_7
0  2741718  gate_30              15         True        False
1  1579627  gate_30              25        False         True
2  3558203  gate_30               5        False        False
3    50850  gate_30              34         True         True
4  6476239  gate_30              38         True         True


In [31]:
tasas_conv = ab_test.groupby('version')['retention_7']

tasas_conv = tasas_conv.agg([np.mean,
                               lambda x: np.std(x, ddof=0),
                               lambda x: stats.sem(x, ddof=0)
                               ])


tasas_conv.columns = ['conversion_rate', 'std', 'sem']

tasas_conv

         conversion_rate       std       sem
version
gate_30         0.216495  0.411855  0.024143
gate_40         0.161512  0.368003  0.021573


In [33]:
control_res = ab_test[ab_test.version=='gate_30']['retention_7']

trat_res = ab_test[ab_test.version=='gate_40']['retention_7']
sum(control_res), sum(trat_res)

(63, 47)

In [34]:
impresiones = [control_res.shape[0], trat_res.shape[0]]  # players in each group

conversiones = [sum(control_res), sum(trat_res)]     # players retained after 7 days



z_score, p_value = proportions_ztest(conversiones, nobs=impresiones)

(control_a, trata_a), (control_b, trata_b) = proportion_confint(conversiones, 
                                                                nobs=impresiones,
                                                                alpha=0.05)

(control_a, trata_a), (control_b, trata_b)

((0.1691746743163834, 0.11923032659353525),
 (0.2638150164052661, 0.20379372838928264))

In [35]:
print(f'z-score: {z_score:.2f}')

print(f'p-value: {p_value:.3f}')

print(f'95% conf. interval for control group: [{control_a:.3f}, {control_b:.3f}]')

print(f'95% conf. interval for treatment group: [{trata_a:.3f}, {trata_b:.3f}]')

z-score: 1.69
p-value: 0.090
95% conf. interval for control group: [0.169, 0.264]
95% conf. interval for treatment group: [0.119, 0.204]
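The two per-group intervals overlap, but overlap on its own is not a test of the difference between groups. As a complementary view, here is a minimal sketch of a confidence interval for the difference $p_{30} - p_{40}$ itself, assuming an unpooled Wald interval. The function name `diff_interval` is hypothetical and this technique is not part of the notebook's statsmodels workflow; the counts (63 and 47 retained out of 291 each) come from the cells above.

```python
import math
from statistics import NormalDist


def diff_interval(x1, n1, x2, n2, alpha=0.05):
    """Unpooled Wald confidence interval for the difference p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se


# 7-day retention counts from the sampled groups: control, then treatment
lo, hi = diff_interval(63, 291, 47, 291)
print(f"95% CI for the 7-day retention difference: [{lo:.3f}, {hi:.3f}]")
```

If the interval contains 0, that agrees with a $p$-value above 0.05: in this sample the 7-day retention difference between `gate_30` and `gate_40` is not statistically significant at the 5% level.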
