# Overview of *Ambrosia* ``Designer`` class Spark data support

This example shows the functionality of the ``Designer`` class on Spark DataFrames. Synthetic data on LTV and user retention rate is used.

The functionality of the ``Designer`` class on Spark data is currently limited compared to the pandas format. \
To learn about the full functionality of ``Designer``, and about why A/B test parameters need to be designed and how this can be done, see the main tutorial on the ``Designer`` class.

In [2]:
import os

import pandas as pd
import pyspark

from ambrosia.designer import Designer

Build a local Spark session

In [3]:
os.environ['SPARK_LOCAL_IP'] = '127.0.0.1'
spark = pyspark.sql.SparkSession.builder.master("local[1]").getOrCreate()
spark.sparkContext.setLogLevel('ERROR')

Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).


23/04/21 17:39:34 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


Create a Spark DataFrame

In [4]:
ltv_and_retention_dataset = pd.read_csv(
    "./../tests/test_data/ltv_retention.csv")
sdf = spark.createDataFrame(ltv_and_retention_dataset)

In [5]:
sdf.printSchema()

root
 |-- LTV: double (nullable = true)
 |-- retention: double (nullable = true)



### Spark A/B test parameters theoretical design

First, we will use a theoretical approach to find the missing parameters of a hypothetical experiment. \
We will obtain theoretical estimates of the group size, the MDE (minimal detectable effect), and the power of the test, each computed from the other parameters, which are assumed known.
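
To make the idea of a theoretical design concrete, here is a minimal sketch of the textbook normal-approximation formula for the per-group size of a two-sided, two-sample test. It is not taken from the Ambrosia source, and the exact formulas used by ``Designer`` may differ. The function name `theoretical_group_size` and the metric mean and standard deviation are hypothetical.

```python
import math
from statistics import NormalDist


def theoretical_group_size(mean, std, relative_effect, alpha=0.05, beta=0.2):
    """Per-group size for a two-sided, two-sample z-test (textbook approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # quantile for the type I error
    z_beta = z.inv_cdf(1 - beta)        # quantile for the type II error
    delta = mean * relative_effect      # absolute uplift to detect
    n = 2 * (z_alpha + z_beta) ** 2 * std ** 2 / delta ** 2
    return math.ceil(n)


# Hypothetical metric with mean 100 and standard deviation 100 (assumed values).
sizes = {
    effect: theoretical_group_size(100.0, 100.0, effect)
    for effect in (0.05, 0.20)
}
print(sizes)
```

Because the required size scales as the inverse square of the effect, detecting a 5% uplift needs roughly 16 times more users per group than detecting a 20% uplift.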

Create a class instance and set the grid of parameters; the type I and type II error rates keep their default values

In [6]:
designer = Designer(dataframe=sdf,
                    effects=[1.05, 1.2],
                    sizes=[100, 1000],
                    metrics='LTV')

Design group sizes

In [7]:
designer.run('size', 'theory')

| Effect | Errors ($\alpha$, $\beta$): (0.05; 0.2) |
|---|---|
| 5.0% | 6206 |
| 20.0% | 389 |


Design minimal detectable effect

In [8]:
designer.run('effect', 'theory')

| Group sizes | Errors ($\alpha$, $\beta$): (0.05; 0.2) |
|---|---|
| 100 | 39.6% |
| 1000 | 12.5% |


Design test power

In [9]:
designer.run('power', 'theory')

| $\alpha$ | Effect | Group size 100 | Group size 1000 |
|---|---|---|---|
| 0.05 | 5.0% | 6.4% | 20.3% |
| 0.05 | 20.0% | 29.4% | 99.4% |
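
The MDE and the power are two views of one relation between effect, group size and error rates. The sketch below uses the standard normal approximation for a two-sided, two-sample test to show that link. It is an illustration under assumed inputs, not Ambrosia's implementation: the helper names `theoretical_mde` and `theoretical_power` and the metric moments are hypothetical.

```python
import math
from statistics import NormalDist

Z = NormalDist()


def theoretical_mde(mean, std, group_size, alpha=0.05, beta=0.2):
    """Relative minimal detectable effect for equal-sized groups."""
    z_sum = Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(1 - beta)
    return z_sum * std * math.sqrt(2 / group_size) / mean


def theoretical_power(mean, std, group_size, relative_effect, alpha=0.05):
    """Power of a two-sided z-test for a given relative effect."""
    standard_error = std * math.sqrt(2 / group_size)
    shift = mean * relative_effect / standard_error
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    # Probability of rejecting in either tail.
    return Z.cdf(shift - z_alpha) + Z.cdf(-shift - z_alpha)


# Hypothetical metric with mean 100 and standard deviation 100 (assumed values).
mde_100 = theoretical_mde(100.0, 100.0, group_size=100)
power_at_mde = theoretical_power(100.0, 100.0, 100, mde_100)
power_no_effect = theoretical_power(100.0, 100.0, 100, 0.0)
print(f"MDE: {mde_100:.1%}, power at MDE: {power_at_mde:.1%}")
```

By construction, the power evaluated at the MDE is approximately $1 - \beta$, and with no effect the rejection probability equals $\alpha$.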


### Spark A/B test parameters empirical design

Now let's calculate the parameters by repeatedly sampling groups from the supplied data and simulating a hypothetical effect. \
With a high value of the `bootstrap_size` parameter (the number of sampled groups per step), this approach gives a more accurate estimate of the parameters, but it requires far more computational resources than the theoretical one.
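
The resampling idea can be sketched in plain Python. The snippet below is an assumption-laden illustration rather than the Ambrosia algorithm: it draws pairs of groups with replacement, multiplies one group by the effect, and counts how often a large-sample z-test (used here as a stand-in for the t-test) rejects the null hypothesis. The function `empirical_power` and the synthetic exponential data are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def empirical_power(values, group_size, effect, alpha=0.05,
                    bootstrap_size=200, seed=0):
    """Share of resampled A/B pairs in which the test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(bootstrap_size):
        group_a = rng.choices(values, k=group_size)
        # Model the hypothetical effect as a multiplicative uplift in group B.
        group_b = [x * effect for x in rng.choices(values, k=group_size)]
        standard_error = math.sqrt(
            variance(group_a) / group_size + variance(group_b) / group_size
        )
        z_stat = (fmean(group_b) - fmean(group_a)) / standard_error
        rejections += abs(z_stat) > z_crit
    return rejections / bootstrap_size


# Synthetic stand-in data (not the LTV dataset used in this example).
data_rng = random.Random(42)
population = [data_rng.expovariate(1 / 100) for _ in range(5000)]

power_small = empirical_power(population, group_size=500, effect=1.0)
power_large = empirical_power(population, group_size=500, effect=1.5)
print(power_small, power_large)
```

A larger `bootstrap_size` reduces the Monte Carlo noise of the estimate at the cost of proportionally more resampling work, which is the trade-off described above.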

In [10]:
designer = Designer(dataframe=sdf,
                    second_type_errors=0.1,
                    effects=[1.1, 1.2],
                    sizes=[500, 2000],
                    metrics='LTV')

The Spark empirical design does not currently support the ``criterion`` parameter, which selects among statistical criteria in the empirical design for pandas data; here the ``t-test`` criterion is always used.

In [11]:
designer.run('size', 'empiric', bootstrap_size=20)


| effect | errors: (0.1, 0.05) |
|---|---|
| 20.0% | 247 |
| 10.0% | 1198 |


In [12]:
designer.run('effect', 'empiric', bootstrap_size=20)


| group_sizes | errors: (0.1, 0.05) |
|---|---|
| 500 | 9.6% |
| 2000 | 5.9% |


### Spark design for binary metrics

For binary metrics, either the ``"theory"`` or the ``"binary"`` approach can be used. \
The first approach uses different approximations for binary data, while the second calculates experimental parameters from constructed confidence intervals of various types.
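
As an illustration of one such interval type, the sketch below computes the Newcombe hybrid score interval for a difference of two proportions, built from two Wilson intervals. It follows the standard published formulas and is not taken from the Ambrosia source; how ``Designer`` turns such intervals into group sizes or power is not shown here. The function names and the example counts are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, alpha=0.05):
    """Wilson score interval for a single proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


def newcombe_interval(successes_a, n_a, successes_b, n_b, alpha=0.05):
    """Newcombe hybrid score interval for the difference p_b - p_a."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    low_a, up_a = wilson_interval(successes_a, n_a, alpha)
    low_b, up_b = wilson_interval(successes_b, n_b, alpha)
    diff = p_b - p_a
    lower = diff - math.sqrt((p_b - low_b) ** 2 + (up_a - p_a) ** 2)
    upper = diff + math.sqrt((up_b - p_b) ** 2 + (p_a - low_a) ** 2)
    return lower, upper


# Hypothetical counts: 40% vs 48% conversion, i.e. a 20% relative uplift.
small = newcombe_interval(60, 150, 72, 150)
large = newcombe_interval(600, 1500, 720, 1500)
print(small, large)
```

With these assumed counts, the interval for groups of 150 still contains zero, while the interval for groups of 1500 lies entirely above zero, which is the kind of signal an interval-based design can use to judge whether a group size is sufficient.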

In [14]:
designer = Designer(dataframe=sdf,
                    second_type_errors=0.5,
                    sizes=150,
                    effects=1.2,
                    metrics='retention')

Group size:

In [15]:
designer.run('size', 'theory')

| Effect | Errors ($\alpha$, $\beta$): (0.05; 0.5) |
|---|---|
| 20.0% | 289 |


In [16]:
designer.run('size', 'binary', interval_type='newcombe', amount=100000)

| Effect | Errors ($\alpha$, $\beta$): (0.05; 0.5) |
|---|---|
| 20.0% | 280 |


Power:

In [17]:
designer.run('power', 'theory')

| $\alpha$ | Effect | Group size 150 |
|---|---|---|
| 0.05 | 20.0% | 29.2% |


In [18]:
designer.run('power', 'binary', interval_type='newcombe', amount=100000)

| $\alpha$ | Effect | Group size 150 |
|---|---|---|
| 0.05 | 20.0% | 30.3% |


In [19]:
spark.stop()