# Chime Case Study
Ravi Dayabhai

In [1]:
%load_ext autoreload
%autoreload 2
%config InlineBackend.figure_format = 'retina'

In [2]:
# Import dependencies
import glob
import os

import matplotlib.pyplot as plt
import pandas as pd

In [3]:
# Global variables

# Revenue
INTERCHANGE_RATE = 1.5 / 100 # per purchase dollar

# Costs
COGS = 1.05 / 100 # per purchase dollar
SPEND_TRACKER_PMPM = 5 # cost per member-month
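
As a quick sanity check on these constants, the sketch below works out the contribution margin per purchase dollar and the monthly purchase volume at which that margin would cover the spend tracker cost. It assumes margin is simply interchange minus COGS; the names `margin_per_dollar` and `break_even_monthly_spend` are mine, not part of the case study.

```python
# Constants mirror the notebook's global variables.
INTERCHANGE_RATE = 1.5 / 100   # revenue per purchase dollar
COGS = 1.05 / 100              # cost per purchase dollar
SPEND_TRACKER_PMPM = 5         # spend tracker cost per member-month

# Assumption: contribution margin is interchange revenue minus COGS.
margin_per_dollar = INTERCHANGE_RATE - COGS

# Monthly purchase volume at which margin exactly covers the tracker cost.
break_even_monthly_spend = SPEND_TRACKER_PMPM / margin_per_dollar

print(f"Margin per purchase dollar: {margin_per_dollar:.4f}")
print(f"Break-even monthly spend: ${break_even_monthly_spend:,.2f}")
```

Under that assumption, a member who uses the tracker needs roughly \$1,111 in monthly purchases before margin alone pays for the feature, which is a useful yardstick when reading the spend figures later on.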

## Data Load / Processing

The data are clean and relatively small, so loading requires minimal processing.

In [4]:
# Load data
DATA_PATH = "../data/"
data_files = glob.glob(os.path.join(DATA_PATH, "*.csv"))
data_dict = {}
for file in data_files:
    file_name = os.path.splitext(os.path.basename(file))[0]
    df_i = pd.read_csv(file)
    data_dict[file_name] = df_i

# Assign to user-friendly variables
df_perf = data_dict['performance'].copy()
df_acq_segment = data_dict['acquisition_segment_counts'].copy()
df_acq_agg = data_dict['acquisition_agg'].copy()

# Set indices
df_acq_agg.set_index('Variant', inplace=True)
df_acq_segment.set_index('Variant', inplace=True)
df_perf.set_index(['Months Since Conversion', 'Test Group', 'Segment'], 
                  inplace=True)

## EDA

The exploratory data analysis will largely be driven by the questions extracted from the case study prompt.

**Notes**:

- The _Treatment_, in this context, is a customer journey featuring a spend tracker, whereas the _Control_ arm is not made aware that a spend tracker exists.
- Average monthly spend for Segment B is _higher_ than Segment A's.
- Members in Segment B are _more likely_ to hold bank accounts in addition to Chime.
- The Spend Tracker is free _to all members_ and will cost Chime \$5 per member per month (PMPM), i.e., it is not revenue generating itself.

### Acquisition

- What does the funnel look like for Control vs. Treatment?
- Assuming sound data collection, did the Treatment drive lift in CTR? In conversions?
- How does overall CAC compare for Control vs. Treatment?
  - Overall CAC for the Treatment should include the cost to provide the free spend tracker to all members _that use_ the feature.
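
One way to fold the spend tracker cost into CAC is sketched below. The function `fully_loaded_cac` and its parameters `adoption_rate` and `avg_active_months` are hypothetical names of mine, and the values passed for them are placeholders rather than measurements; only the spend, conversion count, and \$5 PMPM come from the case study.

```python
def fully_loaded_cac(spend, conversions, adoption_rate, avg_active_months, pmpm=5.0):
    """Acquisition spend per conversion plus expected spend tracker cost per conversion.

    adoption_rate: share of converted members who use the tracker (hypothetical input).
    avg_active_months: months of tracker usage attributed to acquisition (hypothetical input).
    """
    if conversions <= 0:
        raise ValueError("conversions must be positive")
    media_cac = spend / conversions
    tracker_cost = adoption_rate * avg_active_months * pmpm
    return media_cac + tracker_cost


# Spend and conversions are the Test variant's acquisition totals;
# the adoption rate and month count are placeholders for illustration only.
test_cac = fully_loaded_cac(1_020_581, 23_007, adoption_rate=0.5, avg_active_months=1)
print(round(test_cac, 2))
```

How many months of tracker cost belong in CAC, as opposed to ongoing cost of service, is a judgment call, so I keep it as an explicit parameter.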

In [5]:
# Funnel metrics by variant
df_acq_agg['CTR'] = df_acq_agg['Unique Clicks'] / df_acq_agg['Population']
df_acq_agg['CR'] = df_acq_agg['Conversions'] / df_acq_agg['Unique Clicks']

# Cost-per-Unique-Click by variant
df_acq_agg['CPUC'] = df_acq_agg['Spend'] / df_acq_agg['Unique Clicks']

# Cost-per-Conversion by variant
df_acq_agg['CAC'] = df_acq_agg['Spend'] / df_acq_agg['Conversions']

df_acq_agg

| Variant | Spend | Population | Unique Clicks | Conversions | CTR | CR | CPUC | CAC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Control | 1010476 | 20014153 | 200140 | 19815 | 0.01 | 0.099006 | 5.048846 | 50.995508 |
| Test | 1020581 | 20214295 | 208206 | 23007 | 0.0103 | 0.110501 | 4.901785 | 44.359586 |
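
To judge whether the differences in the table above are more than noise, a pooled two-proportion z-test is one option. This is a minimal sketch and not part of the original analysis: it assumes each member of the population (or each clicker) is an independent trial, and `two_proportion_ztest` is a helper I wrote with the standard library only.

```python
from math import erfc, sqrt


def two_proportion_ztest(successes_a, trials_a, successes_b, trials_b):
    """Return (relative lift of B over A, z statistic, two-sided p-value)."""
    p_a = successes_a / trials_a
    p_b = successes_b / trials_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    z = (p_b - p_a) / std_err
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return p_b / p_a - 1, z, p_value


# Counts taken from the table above (Control first, Test second).
ctr = two_proportion_ztest(200_140, 20_014_153, 208_206, 20_214_295)
cr = two_proportion_ztest(19_815, 200_140, 23_007, 208_206)

for name, (lift, z, p) in {"CTR": ctr, "CR": cr}.items():
    print(f"{name}: lift={lift:+.2%}, z={z:.2f}, p={p:.3g}")
```

Note that CR here is conditional on clicking, so the two click populations are not randomized groups in the strict sense; comparing conversions per population member is a reasonable cross-check.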


#### Response Segmentation

First, let's take a look at the distribution of Segment A vs. Segment B respondents in each variant assignment.

### Profitability

**Hypothesis**: The spend tracker helps overall profitability by driving more cumulative spending per member over their lifetime. The lift in profitability can be decomposed into:
- Incremental profitability due to _higher retention per member_.
- Incremental profitability due to _higher average monthly spend per member_.
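
The decomposition above can be sketched for a single month as follows. The inputs are hypothetical, `decompose_lift` is my own helper, and the split shown (retention effect valued at Control spend, spend effect valued at Treatment retention) is one attribution convention among several. The \$5 PMPM tracker cost is deliberately left out here and would need to be netted off separately.

```python
# Interchange rate minus COGS, taken from the notebook's global variables.
MARGIN_PER_DOLLAR = (1.5 - 1.05) / 100


def decompose_lift(ret_c, spend_c, ret_t, spend_t, margin=MARGIN_PER_DOLLAR):
    """Split the monthly profit lift per converted member into two effects.

    ret_*: share of converted members still active in the month.
    spend_*: average monthly purchase dollars per active member.
    """
    retention_effect = (ret_t - ret_c) * spend_c * margin
    spend_effect = ret_t * (spend_t - spend_c) * margin
    total = (ret_t * spend_t - ret_c * spend_c) * margin
    return retention_effect, spend_effect, total


# Hypothetical inputs for one month since conversion (not from the data).
retention_effect, spend_effect, total = decompose_lift(0.80, 400.0, 0.85, 420.0)
print(retention_effect, spend_effect, total)
```

The two effects sum to the total by construction, so repeating this per month and adding up gives a cumulative view of where any lift comes from.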

### Member Value

From the case study prompt:

> To evaluate value to members, we look at what percentage of our members test the feature and continue to use it through time.
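
A rough way to express that definition in code is sketched below. It assumes we have a count of converted members and the number of tracker users in each month since conversion, and that anyone who tests the feature does so in the first month; `usage_curve` and the counts are hypothetical.

```python
def usage_curve(members, users_by_month):
    """Return (trial rate, share of first-month users still active each month)."""
    if members <= 0 or not users_by_month or users_by_month[0] <= 0:
        raise ValueError("need a positive member count and first-month users")
    # Assumption: first-month users are everyone who ever tests the feature.
    trial_rate = users_by_month[0] / members
    continued = [n / users_by_month[0] for n in users_by_month]
    return trial_rate, continued


# Hypothetical counts: 1,000 converted members, tracker users in months 0-3.
trial_rate, continued = usage_curve(1_000, [400, 300, 240, 216])
print(trial_rate, continued)
```

A curve that flattens over time would suggest members who try the tracker keep finding it useful, which is the signal the prompt asks for.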
