# A/B Testing

This notebook visualizes and measures the results of an A/B test.

In [2]:
%matplotlib notebook

# Install dependencies into the environment of the running kernel
import sys
!{sys.executable} -m pip install numpy lightkurve matplotlib seaborn

import matplotlib.pyplot as plt
import lightkurve as lk




## Star selection

In [3]:
TIC = 'TIC 284475976' # TIC Star ID
sector_data = lk.search_lightcurve(TIC, author = 'SPOC', sector = 23) # can remove each arg if needed
sector_data
lc = sector_data.download()
lc.plot()

<AxesSubplot: xlabel='Time - 2457000 [BTJD days]', ylabel='Flux [$\\mathrm{e^{-}\\,s^{-1}}$]'>

In [4]:
lc.plot(linewidth = 0, marker = '.', color = 'lightcyan', alpha = 0.3)

<AxesSubplot: xlabel='Time - 2457000 [BTJD days]', ylabel='Flux [$\\mathrm{e^{-}\\,s^{-1}}$]'>

In [5]:
# Plotting from multiple sectors
TIC_2 = 'TIC 55525572'
available_data_all = lk.search_lightcurve(TIC_2, author = 'SPOC')
available_data_all

#,mission,year,author,exptime (s),target_name,distance (arcsec)
0,TESS Sector 04,2018,SPOC,120,55525572,0.0
1,TESS Sector 05,2018,SPOC,120,55525572,0.0
2,TESS Sector 06,2018,SPOC,120,55525572,0.0
3,TESS Sector 08,2019,SPOC,120,55525572,0.0
4,TESS Sector 09,2019,SPOC,120,55525572,0.0
5,TESS Sector 10,2019,SPOC,120,55525572,0.0
6,TESS Sector 11,2019,SPOC,120,55525572,0.0
7,TESS Sector 12,2019,SPOC,120,55525572,0.0
8,TESS Sector 13,2019,SPOC,120,55525572,0.0
9,TESS Sector 27,2020,SPOC,20,55525572,0.0


In [6]:
select_sector = available_data_all[0:4]
select_sector

#,mission,year,author,exptime (s),target_name,distance (arcsec)
0,TESS Sector 04,2018,SPOC,120,55525572,0.0
1,TESS Sector 05,2018,SPOC,120,55525572,0.0
2,TESS Sector 06,2018,SPOC,120,55525572,0.0
3,TESS Sector 08,2019,SPOC,120,55525572,0.0


In [7]:
lc_collection = select_sector.download_all() # download all the sectors ([0:4])
lc_collection

LightCurveCollection of 4 objects:
    0: <TessLightCurve LABEL="TIC 55525572" SECTOR=4 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    1: <TessLightCurve LABEL="TIC 55525572" SECTOR=5 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    2: <TessLightCurve LABEL="TIC 55525572" SECTOR=6 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    3: <TessLightCurve LABEL="TIC 55525572" SECTOR=8 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>

In [8]:
lc_collection.plot(linewidth = 0, marker = '.')

<AxesSubplot: xlabel='Time - 2457000 [BTJD days]', ylabel='Flux [$\\mathrm{e^{-}\\,s^{-1}}$]'>

Some brief ideas and notes:

1. Move the plot into an interactive format (like a fork of a Zooniverse frontend), record where users click on the curves, and save those clicks to a database.

2. Connect Deepnote to Umbrel?

In [9]:


visits = _deepnote_execute_sql("""select
    visits.user_id,
    visits.visited_at,
    users.signed_up_at,
    users_ab.variant
from
    visits
    left join users on visits.user_id = users.user_id
    inner join users_ab on visits.user_id = users_ab.user_id
""", 'SQL_26D1FA15_65F7_4B77_8193_83C3976930E2')
visits

NameError: name '_deepnote_execute_sql' is not defined
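
The `NameError` suggests the cell was run outside Deepnote, where the `_deepnote_execute_sql` helper is not available. As a minimal sketch, assuming the three tables can be reached through a standard database connection, the same query can be run with `pandas.read_sql_query`. The in-memory SQLite database and its rows below are hypothetical stand-ins for the real data source.

```python
import sqlite3
import pandas as pd

# In-memory database with hypothetical rows; the real tables live behind the
# Deepnote SQL integration and are not reproduced here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table visits (user_id text, visited_at text);
create table users (user_id text, signed_up_at text);
create table users_ab (user_id text, variant text);
insert into visits values
    ('u1', '2021-07-05 10:00:00'),
    ('u2', '2021-07-06 11:00:00'),
    ('u3', '2021-07-07 12:00:00');
insert into users values ('u1', '2021-07-05 10:05:00');
insert into users_ab values ('u1', 'variant_a'), ('u2', 'variant_b');
""")

query = """
select
    visits.user_id,
    visits.visited_at,
    users.signed_up_at,
    users_ab.variant
from
    visits
    left join users on visits.user_id = users.user_id
    inner join users_ab on visits.user_id = users_ab.user_id
"""

# parse_dates yields datetime columns, which the later .dt accessors rely on
visits = pd.read_sql_query(query, conn, parse_dates=["visited_at", "signed_up_at"])
conn.close()
print(visits)
```

In this toy data, `u3` has no row in `users_ab`, so the inner join drops it. `u2` never signed up, so the left join leaves its `signed_up_at` empty.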

## Conversion Rate

We need to check whether one variant resulted in a higher conversion rate than the other. Let's start with a weekly time series.

In [None]:
visits['registered'] = visits.signed_up_at.notna()
visits = visits.sort_values('visited_at').drop_duplicates('user_id',keep='first').copy()

conversion = visits.copy()
conversion['week'] = conversion.visited_at.dt.tz_localize(None).dt.to_period('W').dt.to_timestamp()
conversion = conversion.groupby(['variant','week']).registered.value_counts(normalize=True,dropna=False).reset_index(name='conversion')
conversion = conversion.loc[conversion.registered == True]
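
To make the steps above concrete, here is a small self-contained sketch on hypothetical data (the `visits_demo` frame and its values are invented for illustration). It uses the mean of the boolean `registered` column as the weekly conversion rate, which is an alternative to the `value_counts(normalize=True)` approach in the cell above.

```python
import pandas as pd

# Hypothetical visit log: one row per visit; signed_up_at is NaT for
# visitors who never registered.
visits_demo = pd.DataFrame({
    "user_id": ["u1", "u1", "u2", "u3", "u4", "u5"],
    "visited_at": pd.to_datetime([
        "2021-07-05", "2021-07-13", "2021-07-06",
        "2021-07-12", "2021-07-14", "2021-07-15",
    ]),
    "signed_up_at": pd.to_datetime([
        "2021-07-05", "2021-07-05", None, None, "2021-07-14", None,
    ]),
    "variant": ["variant_a"] * 6,
})

visits_demo["registered"] = visits_demo.signed_up_at.notna()

# Keep only each user's first visit so every user is counted once
first_visits = visits_demo.sort_values("visited_at").drop_duplicates("user_id", keep="first")

# Bucket visits by calendar week (the default 'W' period starts on Monday)
first_visits = first_visits.assign(
    week=first_visits.visited_at.dt.to_period("W").dt.to_timestamp()
)

# The mean of a boolean column is the share of True values
weekly = (
    first_visits.groupby(["variant", "week"])
    .registered.mean()
    .reset_index(name="conversion")
)
print(weekly)
```

One difference from the cell above is worth noting. A week in which nobody registered gets a rate of 0 here, whereas filtering the `value_counts` result on `registered == True` leaves such a week out entirely.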

In [None]:
_deepnote_run_altair(conversion, """{"data":{"name":"placeholder"},"mark":{"type":"line","tooltip":{"content":"data"}},"height":220,"$schema":"https://vega.github.io/schema/vega-lite/v4.json","autosize":{"type":"fit"},"encoding":{"x":{"sort":null,"type":"temporal","field":"week","scale":{"type":"linear","zero":false}},"y":{"sort":null,"type":"quantitative","field":"conversion","scale":{"type":"linear","zero":false}},"color":{"sort":null,"type":"nominal","field":"variant","scale":{"type":"linear","zero":false}}}}""")

Here is the conversion rate for each variant, averaged over the weekly rates:

In [None]:
conversion.groupby('variant').conversion.mean().reset_index(name='conversion_rate')

,variant,conversion_rate
0,variant_a,0.00813
1,variant_b,0.013527
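
Note that `conversion.groupby('variant').conversion.mean()` averages the weekly rates, so every week carries the same weight regardless of how many visitors it had. A pooled rate (total registrations divided by total visitors) can differ noticeably when traffic is uneven. The sketch below uses hypothetical weekly totals to show the gap.

```python
import pandas as pd

# Hypothetical weekly totals for a single variant
weekly = pd.DataFrame({
    "week": ["w1", "w2"],
    "visitors": [1000, 100],
    "registered": [10, 5],
})
weekly["conversion"] = weekly.registered / weekly.visitors

# Each week counts equally
mean_of_weekly = weekly.conversion.mean()
# Each visitor counts equally
pooled = weekly.registered.sum() / weekly.visitors.sum()

print(f"mean of weekly rates: {mean_of_weekly:.4f}")
print(f"pooled rate:          {pooled:.4f}")
```

Which of the two is more appropriate depends on the question being asked. The chi-square test in the next section works on user counts, so it corresponds to the pooled view.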


### Significance Test

It looks like Variant B has a higher conversion rate than Variant A. We need to check whether this difference is statistically significant.

In [None]:
from scipy.stats import chi2_contingency

# Calculate the proportion of registered users for variant A vs B
proportions = visits.groupby('variant').registered.value_counts(dropna=False).reset_index(name='nusers')
proportions = proportions.pivot(index='variant',columns='registered',values='nusers')

# Chi-Square test
g,p,dof,expctd = chi2_contingency(proportions)

print(f'p-value is {p}')
if p < 0.05:
    print(f'The difference in Conversion Rate between Variants A and B is statistically significant.')
else:
    print(f'The difference in Conversion Rate between Variants A and B is NOT statistically significant.')

p-value is 4.536508742253485e-12
The difference in Conversion Rate between Variants A and B is statistically significant.
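
For readers unfamiliar with `chi2_contingency`, the following sketch shows what it receives and returns for a 2x2 table. The counts are hypothetical and are not the notebook's data.

```python
from scipy.stats import chi2_contingency

# Hypothetical counts, NOT the notebook's data.
# Rows: variant_a, variant_b; columns: not registered, registered.
table = [
    [9900, 100],
    [9850, 150],
]

# Returns the test statistic, the p-value, the degrees of freedom, and the
# counts expected if registration were independent of the variant.
chi2, p, dof, expected = chi2_contingency(table)

print(f"chi2 = {chi2:.3f}, dof = {dof}, p-value = {p:.4g}")
print("expected counts under independence:")
print(expected)
```

For a 2x2 table, `chi2_contingency` applies Yates' continuity correction by default (`correction=True`), which makes the test slightly more conservative.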


## Retention Rate

We also want to know whether one variant of the website results in higher retention than the other. For this, let's look at the weekly sessions of our users.

In [None]:


sessions_weekly = _deepnote_execute_sql("""select
    sessions.user_id,
    date_trunc('week',users.signed_up_at) as signed_up_at_week,
    floor(extract('day' from session_started_at - signed_up_at)/7) as week, -- The number of weeks that passed since the user signed up
    users_ab.variant
from
    sessions
    left join users on sessions.user_id = users.user_id
    inner join users_ab on sessions.user_id = users_ab.user_id
""", 'SQL_26D1FA15_65F7_4B77_8193_83C3976930E2')
sessions_weekly

,user_id,signed_up_at_week,week,variant
0,20730e99d00d466eb218cfecee852647,2021-09-27 00:00:00+00:00,4.0,variant_b
1,5541a58e2e1f4cbdafc9c1b12535f2d9,2021-07-12 00:00:00+00:00,14.0,variant_b
2,5541a58e2e1f4cbdafc9c1b12535f2d9,2021-07-12 00:00:00+00:00,8.0,variant_b
3,5541a58e2e1f4cbdafc9c1b12535f2d9,2021-07-12 00:00:00+00:00,8.0,variant_b
4,5541a58e2e1f4cbdafc9c1b12535f2d9,2021-07-12 00:00:00+00:00,8.0,variant_b
...,...,...,...,...
14240,b6d3a798e7a24dfc8ee996817824d101,2021-07-05 00:00:00+00:00,9.0,variant_b
14241,2c3fede222b54a4e8c6181a42ed8c3a1,2021-08-16 00:00:00+00:00,5.0,variant_b
14242,afe8126a93a2473e8458b77a911344fc,2021-08-30 00:00:00+00:00,4.0,variant_b
14243,cbd0ebd955014ee8a6f879ff5aaf166f,2021-07-26 00:00:00+00:00,8.0,variant_b


In [None]:
import pandas as pd

def get_retention(df):
    retention = df.copy()

    # Save the cohort size before we start calculating retention
    cohort_size = retention.groupby(['signed_up_at_week']).user_id.nunique().reset_index(name='cohort_size')

    # For each cohort-week, calculate the number of users who visited
    retention = retention.groupby(['signed_up_at_week','week']).user_id.nunique().reset_index(name='n_users')

    # Pivot and melt the table. This is a little trick that allows us to add rows during weeks where a cohort was not active.
    retention = retention.pivot(index=['signed_up_at_week'],columns='week',values='n_users').fillna(0)
    retention = retention.melt(value_name='n_users',ignore_index=False).reset_index()

    # Exclude cohort-weeks that are not yet complete for the whole cohort.
    # Adding 6 days to the cohort's sign-up week covers the last users to sign up in that cohort.
    retention = retention.loc[
        ~(retention.signed_up_at_week + pd.to_timedelta(retention.week + 1,'W') + pd.Timedelta(6,'D') >
        pd.Timestamp.now(tz='UTC').floor('D'))
    ]

    # Divide by the cohort size to get a percentage
    retention = retention.merge(cohort_size,on=['signed_up_at_week'])
    retention['prop'] = retention.n_users / retention.cohort_size

    retention_avg = retention.groupby('week').prop.mean().reset_index()

    retention_avg = retention_avg.rename(columns = dict(
        prop='Retention',
        week='Week',
    ))

    return retention_avg

sessions_a = sessions_weekly.loc[sessions_weekly.variant == 'variant_a']
retention_a = get_retention(sessions_a)
retention_a['variant'] = 'variant_a'

sessions_b = sessions_weekly.loc[sessions_weekly.variant == 'variant_b']
retention_b = get_retention(sessions_b)
retention_b['variant'] = 'variant_b'

retention = pd.concat([retention_a,retention_b])
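
The pivot-and-melt step inside `get_retention` is easier to follow on a tiny example. The sketch below uses hypothetical cohorts `A` and `B`, where `B` has no activity recorded for week 1, and shows how the missing cohort-week reappears as an explicit row with zero users.

```python
import pandas as pd

# Hypothetical active-user counts; cohort B has no row for week 1
counts = pd.DataFrame({
    "signed_up_at_week": ["A", "A", "B"],
    "week": [0, 1, 0],
    "n_users": [10, 4, 8],
})

# Pivot to a cohort x week grid; the missing (B, 1) cell becomes NaN, then 0
wide = counts.pivot(index="signed_up_at_week", columns="week", values="n_users").fillna(0)

# Melt back to long format; ignore_index=False keeps the cohort index
long_df = wide.melt(value_name="n_users", ignore_index=False).reset_index()
print(long_df)
```

Without this step, a week in which a cohort had no sessions would be missing from the data instead of counting as 0, and the average retention for that week would be too high.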

In [None]:
_deepnote_run_altair(retention, """{"data":{"name":"placeholder"},"mark":{"type":"line","tooltip":{"content":"data"}},"height":220,"$schema":"https://vega.github.io/schema/vega-lite/v4.json","autosize":{"type":"fit"},"encoding":{"x":{"sort":null,"type":"quantitative","field":"Week","scale":{"type":"linear","zero":false}},"y":{"sort":null,"type":"quantitative","field":"Retention","scale":{"type":"linear","zero":false}},"color":{"sort":null,"type":"nominal","field":"variant","scale":{"type":"linear","zero":false}}}}""")

Here is the Week-4 Retention for each variant.

In [None]:
retention.loc[retention.Week == 4][['variant','Retention']].reset_index(drop=True)

,variant,Retention
0,variant_a,0.176021
1,variant_b,0.377029


### Significance Test

Once again, we check whether the difference in retention is statistically significant.

In [None]:
from scipy.stats import chi2_contingency

# Label users whose latest session was 4 or more weeks after sign-up
users_retention = sessions_weekly.groupby(['user_id','variant']).apply(lambda x: x.week.max() >= 4).reset_index(name='retained')

# Calculate the proportion of retained users for variant A vs B
proportions = users_retention.groupby('variant').retained.value_counts(dropna=False).reset_index(name='nusers')
proportions = proportions.pivot(index='variant',columns='retained',values='nusers')

# Chi-Square test
g,p,dof,expctd = chi2_contingency(proportions)

print(f'p-value is {p}')
if p < 0.05:
    print(f'The difference in Retention between Variants A and B is statistically significant.')
else:
    print(f'The difference in Retention between Variants A and B is NOT statistically significant.')

p-value is 2.5434010296902054e-12
The difference in Retention between Variants A and B is statistically significant.

