# Experiments

## Intro

One of the most troublesome tasks we have is running multiple statistical tests (t-tests)
while analysing the results of an experiment. We often have to repeat this operation for
every metric in a test.

Here's how to do it quickly. :)

## Data prep

The key step is to have the data prepared in a standardized way. What is absolutely necessary?
* each variant in a separate row;
* a `variant` column. Its name can be different, but we need to have the variant as a
dimension;
* a `total_users` column. Again, you can name it differently, but we'll need this number to
calculate all the conversion rates;
* a list of metrics (column names) that we want to run the tests for.
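
If the data starts out at user level, the shape described above can be produced with a plain pandas aggregation. This is only a minimal sketch: the `raw` frame, its `user_id` column, and the `converted1` / `converted2` flags are made-up names for illustration, not something `da_toolkit` requires.

```python
import pandas as pd

# Hypothetical user-level data: one row per user, with 0/1 conversion flags.
raw = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6],
    "variant": ["0", "0", "0", "1", "1", "1"],
    "converted1": [1, 0, 1, 0, 0, 1],
    "converted2": [0, 0, 1, 1, 1, 1],
})

# Collapse to one row per variant: a user count plus one count per metric.
df = raw.groupby("variant", as_index=False).agg(
    total_users=("user_id", "nunique"),
    converted_users1=("converted1", "sum"),
    converted_users2=("converted2", "sum"),
)
print(df)
```

The resulting `df` has one row per variant, a `variant` column, a `total_users` column, and one column per metric, which is the layout the list above asks for.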

Below is a good example. Other columns, such as the unnamed first one here, are allowed; they are simply ignored in the analysis:

| | variant | total\_users | converted\_users1 | converted\_users2 |
| :--- | :--- | :--- | :--- | :--- |
| us | 0 | 24386 | 86 | 246 |
| us | 1 | 24376 | 376 | 243 |

### Data for the analysis of statistical means
Analyzing statistical means requires one more column (or set of columns) in the analyzed dataset: to calculate
the t-stat for means, the variance of each metric must be provided. Details on how to add it to the dataset are covered
in the example with mean analysis.
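
As a rough illustration of where such variance columns can come from, the sketch below derives a mean and a variance per variant from user-level data with pandas. The `raw` frame and the column names (`clicks`, `avg_clicks`, `var_clicks`) are hypothetical, and it assumes the sample variance (`ddof=1`, the pandas default) is what the analysis expects; that is an assumption, not something confirmed by the package.

```python
import pandas as pd

# Hypothetical user-level data: number of clicks per user.
raw = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6],
    "variant": ["0", "0", "0", "1", "1", "1"],
    "clicks": [1, 2, 3, 2, 4, 6],
})

# One row per variant with the user count, the mean, and the variance.
df = raw.groupby("variant", as_index=False).agg(
    total_users=("user_id", "nunique"),
    avg_clicks=("clicks", "mean"),
    var_clicks=("clicks", "var"),  # sample variance (ddof=1)
)
print(df)
```

In practice the same aggregation is often done directly in the SQL query that feeds the analysis.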

## Running the test
With the data prepared that way, running the tests requires just
two lines of code.

```python
from da_toolkit.experiments import Analysis
Analysis(df=df, metrics=['converted_users1', 'converted_users2'])
```

The tests are run using a Python package called [statsmodels](https://www.statsmodels.org/)
(specifically its `proportions_ztest` function, so the test for proportions is technically a z-test).
But you don't have to worry: it produces the same results as the spreadsheet-based solution you may be familiar with.
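
For readers who want to see what such a test computes, here is a standard-library sketch of the textbook pooled two-proportion z-test. It is not the code `Analysis` or statsmodels runs internally; the function name `two_proportion_ztest` is made up for this illustration, and the input numbers are the `users_prod_click` counts from the example below (variant 0 first, then variant 1).

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Pooled, two-sided z-test for the difference of two proportions."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_val = math.erfc(abs(z) / math.sqrt(2))
    # Relative change of the second group versus the first.
    delta = (p_b - p_a) / p_a
    return z, p_val, delta

z, p_val, delta = two_proportion_ztest(11330, 18336, 11059, 18172)
print(round(z, 3), round(p_val, 3), round(delta, 4))
```

With these inputs the sketch lands very close to the `z_stat`, `p_val`, and `delta` reported for `users_prod_click` in the example, which suggests (but does not prove) that the package uses the same pooled formulation.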

That's it. Let's take a look at a real-life example.

## Example
### T-test for proportions

```python
# importing modules
from da_toolkit.databases import BigQuery
from da_toolkit.experiments import Analysis
```

Getting my experiment data:

```python
bq = BigQuery()
with open('queries/experiment_query.sql') as q:
    query = q.read()
df = bq.query(query)

df['variant'] = df['variant'].astype('str')  # variant has to be a string
df
```

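| | variant | users | users\_prod\_click | users\_add\_to\_cart | avg\_prod\_clicks | avg\_add\_to\_cart\_clicks | var\_prod\_clicks | var\_add\_to\_cart\_clicks |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 0 | 1 | 18172 | 11059 | 11304 | 1.981644 | 2.823337 | 5.330012 | 19.420703 |
| 1 | 0 | 18336 | 11330 | 11242 | 1.933363 | 2.95588 | 5.235474 | 39.988623 |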


Running the tests for 2 metrics (by passing their column names as the `metrics` argument).

The results are stored in two attributes: a dictionary (`results`) and a (more convenient)
pandas DataFrame (`results_df`).

```python
exp = Analysis(df=df, metrics=['users_prod_click', 'users_add_to_cart'], total_col='users', alpha=0.1)
exp.results_df
```

| | | cvr | delta | z\_stat | p\_val | power | res |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| users\_prod\_click | 1 | 0: 0.617910, 1: 0.608574 | -0.01511 | 1.831522 | 0.067023 | 0.574299 | significant! |
| users\_add\_to\_cart | 1 | 0: 0.613111, 1: 0.622056 | 0.01459 | -1.758425 | 0.078675 | 0.545563 | significant! |


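We get the results as a table with one row per analyzed metric and the test results in columns. Note that with `alpha=0.1` both metrics are flagged as significant (p-values of 0.067 and 0.079), even though the variants were artificially created and no real difference is expected; at the more common `alpha=0.05` neither difference would be significant.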

### T-test for means
Similarly, we will run the tests for 2 metrics (again, by passing their column names as the `metrics` argument).
This time we also have to specify the columns that hold the respective variance values for the metrics. If we don't, `Analysis`
will assume that the data contains columns that have the same names as the metrics but with a `var_` prefix.

To run the test for means, we add one more argument: `kind='mean'`.

```python
metrics = ['avg_prod_clicks', 'avg_add_to_cart_clicks']
variances = ['var_prod_clicks', 'var_add_to_cart_clicks']
exp = Analysis(df=df, metrics=metrics, variances=variances, total_col='users', alpha=0.05, kind='mean')
exp.results_df
```

| | | mean | delta | t\_stat | p\_val | res |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| avg\_prod\_clicks | 1 | [1.9333627537511033, 1.9816439099376078] | 0.024973 | -2.006856 | 0.044772 | significant! |
| avg\_add\_to\_cart\_clicks | 1 | [2.9558797367016543, 2.8233368719037513] | -0.04484 | 2.321487 | 0.020266 | significant! |
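
To make the role of the variance columns concrete, the following standard-library sketch computes a two-sample t-statistic from summary statistics only (mean, variance, and user count per variant). It assumes a pooled-variance (Student's) t-test, which is an inference from the numbers above rather than something confirmed from the `da_toolkit` source, and it approximates the p-value with the normal distribution, which is reasonable only because the samples here are large. The function name `pooled_t_from_summary` is hypothetical.

```python
import math

def pooled_t_from_summary(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Pooled-variance two-sample t-statistic from summary statistics."""
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean_a - mean_b) / se
    # Normal approximation to the two-sided p-value (fine for large samples).
    p_val = math.erfc(abs(t) / math.sqrt(2))
    return t, p_val

# Variant 0 first, then variant 1, using the values from the data table.
t_prod, p_prod = pooled_t_from_summary(1.933363, 5.235474, 18336,
                                       1.981644, 5.330012, 18172)
t_cart, p_cart = pooled_t_from_summary(2.95588, 39.988623, 18336,
                                       2.823337, 19.420703, 18172)
print(round(t_prod, 3), round(p_prod, 3))
print(round(t_cart, 3), round(p_cart, 3))
```

Under these assumptions the sketch comes out very close to the `t_stat` and `p_val` values in the results table for both metrics.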
