In [1]:
import pandas as pd
import numpy as np
from scipy.stats import norm

## Loading data

In [2]:
exp_data = pd.read_csv("https://raw.githubusercontent.com/woldemarg/ds_tests/master/data_analysis/company_4/task_solution/data/experiment_raw.csv",
                       parse_dates=["date"])

## Get confidence score
* [see reference #1](https://cosmiccoding.com.au/tutorials/ab_tests)
* [see reference #2](https://towardsdatascience.com/the-math-behind-a-b-testing-with-example-code-part-1-of-2-7be752e1d06f)

In [11]:
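def get_confidence_ab(df):
    """One-sided z-test confidence that row 1 (new) converts better than row 0 (old).

    Expects two rows, with successes in column 0 and trials in column 1.
    """
    rate_old = df.iloc[0, 0] / df.iloc[0, 1]
    rate_new = df.iloc[1, 0] / df.iloc[1, 1]
    # standard errors of the two conversion rates (binomial approximation)
    std_old = np.sqrt(rate_old * (1 - rate_old) / df.iloc[0, 1])
    std_new = np.sqrt(rate_new * (1 - rate_new) / df.iloc[1, 1])
    # z-score of the difference, using the unpooled standard error
    z_score = (rate_new - rate_old) / np.sqrt(std_old**2 + std_new**2)
    return norm.cdf(z_score)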
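
As a quick sanity check, the function can be run on a small hand-made table. This is only a sketch: the counts below are hypothetical (they are not taken from `experiment_raw.csv`), the labels "A" and "B" are assumed, and the function body is repeated so the cell runs on its own.

```python
import numpy as np
import pandas as pd
from scipy.stats import norm


def get_confidence_ab(df):
    # Same logic as the notebook function: row 0 is the baseline, row 1 the new variant.
    rate_old = df.iloc[0, 0] / df.iloc[0, 1]
    rate_new = df.iloc[1, 0] / df.iloc[1, 1]
    std_old = np.sqrt(rate_old * (1 - rate_old) / df.iloc[0, 1])
    std_new = np.sqrt(rate_new * (1 - rate_new) / df.iloc[1, 1])
    z_score = (rate_new - rate_old) / np.sqrt(std_old**2 + std_new**2)
    return norm.cdf(z_score)


# Hypothetical counts: 100/1000 conversions for "A", 130/1000 for "B".
toy = pd.DataFrame({"sum": [100, 130], "size": [1000, 1000]},
                   index=["A", "B"])

confidence = get_confidence_ab(toy)
p_one_sided = 1 - confidence  # one-sided p-value for "B is better than A"

print(f"confidence = {confidence:.3f}, one-sided p = {p_one_sided:.3f}")
```

The returned value is the one-sided confidence, so `1 - confidence` can be read as the one-sided p-value for the hypothesis that "B" converts better.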

## Function to perform analysis with different params

In [4]:
def print_ab_results(data,
                     key,
                     strategy="periods",
                     freq="1M",
                     param_list=None):
    # avoid a mutable default argument
    param_list = [] if param_list is None else param_list

    if strategy == "periods":
        groups = (data
                  .groupby(["experiment_mobile_checkout_theme",
                            pd.Grouper(key="date", freq=freq)],
                           sort=False)[key])
    elif strategy == "features":
        groups = data.groupby(param_list,
                              sort=False)[key]
    else:
        raise ValueError("strategy must be 'periods' or 'features'")

    # successes ("sum") and trials ("size") per group
    grouped = (groups
               .agg(["sum", "size"])
               .sort_index(level=-1,
                           sort_remaining=True,
                           ascending=False))

    if len(param_list) != 1:
        # one comparison per value of the last index level (period or feature)
        for idx, df_select in grouped.groupby(level=-1):
            print("B's conversion as for '{}' estimated by '{}' is better with {:.1%} confidence"
                  .format(idx, key, get_confidence_ab(df_select)))
    else:
        print("B's conversion estimated by '{}' is better with {:.1%} confidence"
              .format(key, get_confidence_ab(grouped)))
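
To make the intermediate table easier to picture, here is a minimal sketch of the same group-and-aggregate step on toy data. Everything in the frame is hypothetical: the theme labels "old" and "new" and all values are made up for illustration and may differ from the labels in the real dataset.

```python
import pandas as pd

# Hypothetical toy data: theme labels and all values are made up.
toy = pd.DataFrame({
    "experiment_mobile_checkout_theme": ["old", "old", "new", "new", "old", "new"],
    "os": ["ios", "android", "ios", "android", "ios", "ios"],
    "transaction_try": [1, 0, 1, 1, 0, 1],
})

# "sum" counts successes and "size" counts trials for each (theme, os) pair.
table = (toy
         .groupby(["experiment_mobile_checkout_theme", "os"],
                  sort=False)["transaction_try"]
         .agg(["sum", "size"])
         .sort_index(level=-1, sort_remaining=True, ascending=False))
print(table)

# Each per-OS slice is what get_confidence_ab would receive. Its first row is
# treated as the baseline, so the row order is worth checking explicitly.
for os_name, df_select in table.groupby(level=-1):
    print(os_name, df_select.index.get_level_values(0).tolist())
```

Because `get_confidence_ab` addresses rows by position, printing the theme order per slice is a cheap way to confirm which branch ends up as the baseline.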

## Tasks

1) Choose a relevant test and determine whether there is a statistically significant difference between the experiment branches in the rate of **purchase attempts** ...

In [5]:
print_ab_results(data=exp_data,
                 key="transaction_try",
                 strategy="features",
                 param_list=["experiment_mobile_checkout_theme"])


B's conversion estimated by 'transaction_try' is better with 100.0% confidence


... and in the rate of **successful purchases**

In [6]:
print_ab_results(data=exp_data,
                 key="transaction_success", #!
                 strategy="features",
                 param_list=["experiment_mobile_checkout_theme"])

B's conversion estimated by 'transaction_success' is better with 99.9% confidence


2) Check the effect broken down by operating system (**purchase attempts**)

In [7]:
print_ab_results(data=exp_data,
                 key="transaction_try",
                 strategy="features",
                 param_list=["experiment_mobile_checkout_theme", "os"])

B's conversion as for 'android' estimated by 'transaction_try' is better with 100.0% confidence
B's conversion as for 'ios' estimated by 'transaction_try' is better with 100.0% confidence


2) Check the effect broken down by operating system (**successful purchases**)

In [8]:
print_ab_results(data=exp_data,
                 key="transaction_success",
                 strategy="features",
                 param_list=["experiment_mobile_checkout_theme", "os"]) #!

B's conversion as for 'android' estimated by 'transaction_success' is better with 98.4% confidence
B's conversion as for 'ios' estimated by 'transaction_success' is better with 98.9% confidence


3) Check the stability of the effect over time (**purchase attempts**)

In [9]:
print_ab_results(data=exp_data,                 
                 key="transaction_try")

B's conversion as for '2020-04-30 00:00:00' estimated by 'transaction_try' is better with 94.1% confidence
B's conversion as for '2020-05-31 00:00:00' estimated by 'transaction_try' is better with 100.0% confidence


3) Check the stability of the effect over time (**successful purchases**)

In [10]:
print_ab_results(data=exp_data,                 
                 key="transaction_success") #!

B's conversion as for '2020-04-30 00:00:00' estimated by 'transaction_success' is better with 36.3% confidence
B's conversion as for '2020-05-31 00:00:00' estimated by 'transaction_success' is better with 99.9% confidence


It appears that in April the effect of design "B" was less significant. Users may have been getting used to the new interface.
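
One way to read the April numbers more closely is to invert the reported confidence back into a z-score. The sketch below assumes the printed percentages (rounded to 0.1%) are the `norm.cdf(z)` values returned by `get_confidence_ab`, so the recovered values are approximate. The two-sided p-value is an extra illustration and is not computed anywhere in the notebook.

```python
from scipy.stats import norm

# Confidence values copied from the April outputs above (rounded to 0.1%).
april = {"transaction_try": 0.941, "transaction_success": 0.363}

results = {}
for key, confidence in april.items():
    z = norm.ppf(confidence)            # inverse of norm.cdf
    p_two_sided = 2 * norm.sf(abs(z))   # two-sided p-value for the same z
    results[key] = (z, p_two_sided)
    print(f"{key}: z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
```

A confidence below 50% corresponds to a negative z-score, which means that in the April slice the second row's observed success rate was lower than the first row's, rather than merely less clearly higher.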