+ A/B Tests: Measure impact of changes on KPIs
+ KPIs: metrics important to an organization
    + How to identify: Experience + Domain knowledge + EDA
        + Experience + Knowledge: What is important to a business
        + EDA: What metrics and relationships impact these KPIs
    + Choosing a KPI
        + Stability over time
        + Importance across different user groups
        + Correlation with other business factors

# Key Performance Indicators: Measuring Business Success

## Loading & examining our data

In [37]:
import pandas as pd
import datetime
import numpy as np

In [5]:
customer_data = pd.read_csv('Data/user_demographics_v1.csv')
app_purchases = pd.read_csv("Data/purchase_data_v1.csv")

In [6]:
customer_data.head()

Unnamed: 0,uid,reg_date,device,gender,country,age
0,54030035.0,2017-06-29T00:00:00Z,and,M,USA,19
1,72574201.0,2018-03-05T00:00:00Z,iOS,F,TUR,22
2,64187558.0,2016-02-07T00:00:00Z,iOS,M,USA,16
3,92513925.0,2017-05-25T00:00:00Z,and,M,BRA,41
4,99231338.0,2017-03-26T00:00:00Z,iOS,M,FRA,59


In [7]:
app_purchases.head()

Unnamed: 0,date,uid,sku,price
0,2017-07-10,41195147,sku_three_499,499
1,2017-07-15,41195147,sku_three_499,499
2,2017-11-12,41195147,sku_four_599,599
3,2017-09-26,91591874,sku_two_299,299
4,2017-12-01,91591874,sku_four_599,599


In [23]:
# Strip the ISO 8601 'T' separator and the trailing 'Z' so the values parse as timezone-naive datetimes
customer_data['reg_date'] = customer_data['reg_date'].str.replace('T', ' ')
customer_data['reg_date'] = customer_data['reg_date'].str.replace('Z', '')
customer_data['reg_date'] = pd.to_datetime(customer_data['reg_date'])
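
The string replacements above exist only to get timezone-naive datetimes out of ISO 8601 strings. As a minimal sketch of an alternative, assuming every timestamp is in UTC (as the trailing `Z` suggests), the values can be parsed as UTC and the timezone dropped afterwards. The `raw` series is a toy sample copied from the `reg_date` column shown earlier.

```python
import pandas as pd

# Toy sample copied from the reg_date column shown above
raw = pd.Series(['2017-06-29T00:00:00Z', '2018-03-05T00:00:00Z'])

# Parse as timezone-aware UTC timestamps, then drop the timezone so the
# result can be compared with naive datetimes such as pd.to_datetime('2018-03-17')
parsed = pd.to_datetime(raw, utc=True).dt.tz_localize(None)

print(parsed)
```

Keeping the column naive matters here because later cells compare `reg_date` with a naive `current_date`; pandas raises an error when naive and timezone-aware values are compared.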

In [25]:
# Left join: keep every purchase and attach the buyer's demographics by uid
purchase_data = app_purchases.merge(customer_data, on='uid', how='left')
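
A left merge silently keeps purchases whose `uid` has no demographic record and silently duplicates rows if a `uid` appears more than once on the right. The sketch below, on hypothetical toy frames (`purchases` and `customers` are not the notebook's data), shows one way to check both conditions with the `validate` and `indicator` parameters of `DataFrame.merge`.

```python
import pandas as pd

# Hypothetical toy data: uid 3 has a purchase but no demographic record
purchases = pd.DataFrame({'uid': [1, 1, 2, 3], 'price': [499, 299, 599, 199]})
customers = pd.DataFrame({'uid': [1, 2], 'country': ['USA', 'BRA']})

merged = purchases.merge(
    customers,
    on='uid',
    how='left',
    validate='many_to_one',  # raises MergeError if a uid is duplicated in customers
    indicator=True,          # adds a _merge column describing where each row matched
)

# Purchases that found no matching customer row
unmatched = (merged['_merge'] == 'left_only').sum()
print(unmatched)
```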

In [30]:
purchase_data.head()

Unnamed: 0,date,uid,sku,price,reg_date,device,gender,country,age
0,2017-07-10,41195147,sku_three_499,499,2017-06-26,and,M,BRA,17
1,2017-07-15,41195147,sku_three_499,499,2017-06-26,and,M,BRA,17
2,2017-11-12,41195147,sku_four_599,599,2017-06-26,and,M,BRA,17
3,2017-09-26,91591874,sku_two_299,299,2017-01-05,and,M,TUR,17
4,2017-12-01,91591874,sku_four_599,599,2017-01-05,and,M,TUR,17


In [31]:
purchase_data['date'] = pd.to_datetime(purchase_data['date'])
purchase_data['reg_date'] = pd.to_datetime(purchase_data['reg_date'])

## Calculating KPIs

Goal: Examine the KPI "user conversion rate" after the free trial  
Week One Conversion Rate: Limit to users who convert in their first week after the trial ends
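
The definition above can be sketched as follows. This is an illustration under stated assumptions, not the notebook's own calculation: the trial length (`TRIAL_DAYS = 7`), the `first_purchase` column, and the toy `users` frame are all hypothetical. Only users whose full conversion window has already elapsed are counted, so recent sign-ups do not drag the rate down.

```python
import pandas as pd

TRIAL_DAYS = 7    # hypothetical trial length; not stated in the source
WINDOW_DAYS = 7   # "week one" after the trial ends

# Hypothetical toy data: one row per user, NaT means no purchase yet
users = pd.DataFrame({
    'uid': [1, 2, 3, 4],
    'reg_date': pd.to_datetime(['2018-01-01', '2018-01-01', '2018-01-10', '2018-03-10']),
    'first_purchase': pd.to_datetime(['2018-01-10', None, '2018-02-20', None]),
})
current_date = pd.Timestamp('2018-03-17')

trial_end = users['reg_date'] + pd.Timedelta(days=TRIAL_DAYS)
window_end = trial_end + pd.Timedelta(days=WINDOW_DAYS)

# Eligible: the user's whole conversion window lies in the past
eligible = window_end <= current_date

# Converted: first purchase falls inside the window (comparisons with NaT are False)
converted = (users['first_purchase'] >= trial_end) & (users['first_purchase'] < window_end)

week_one_rate = converted[eligible].sum() / eligible.sum()
print(week_one_rate)
```

In this toy data user 4 registered too recently to be eligible, and user 3 purchased after the window closed, so one of three eligible users counts as converted.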

In [33]:
# Current date
current_date = pd.to_datetime('2018-03-17 00:00:00')
# Maximum date: latest registration date that still leaves a full 28-day window
max_purchase_date = current_date - datetime.timedelta(days=28)
# Filter to only include users who registered before our max date
purchase_data_filt = purchase_data[purchase_data.reg_date < max_purchase_date]

# Filter to contain only purchases within the first 28 days of registration
purchase_data_filt = purchase_data_filt[(purchase_data_filt['date'] <=
                                         purchase_data_filt['reg_date'] + 
                                         datetime.timedelta(days=28))]
# Output the mean price paid per purchase
print(purchase_data_filt.price.mean())

414.4237288135593


In [43]:
# Average purchase price by cohort
# Set the max registration date to be 28 days before the current date
max_reg_date = current_date - datetime.timedelta(days=28)

# Find the month 1 values:
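# Keep the price for purchases made within 28 days of registration, NaN otherwise
# (np.nan is used because the np.NaN alias was removed in NumPy 2.0)
month1 = np.where((purchase_data['reg_date'] < max_reg_date) &
                  (purchase_data['date'] < purchase_data['reg_date'] + datetime.timedelta(days=28)),
                  purchase_data['price'],
                  np.nan)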
                 
# Update the value in the DataFrame 
purchase_data['month1'] = month1
# Group the data by gender and device 
purchase_data_upd = purchase_data.groupby(by=['gender', 'device'], as_index=False)
# Aggregate the month1 and price data 
purchase_summary = purchase_data_upd.agg(
    {'month1': ['mean', 'median'],
     'price': ['mean', 'median']})

# Examine the results 
print(purchase_summary)

  gender device      month1              price       
                       mean median        mean median
0      F    and  388.204545  299.0  400.747504    299
1      F    iOS  432.587786  499.0  404.435330    299
2      M    and  413.705882  399.0  416.237308    499
3      M    iOS  433.313725  499.0  405.272401    299
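
The dictionary form of `agg` used above produces two-level column headers, which can be awkward to select from or export. A minimal sketch of the same kind of summary with flat column names, using pandas named aggregation on a hypothetical toy frame (the values are illustrative, not taken from the dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with the same column names as purchase_data
toy = pd.DataFrame({
    'gender': ['F', 'F', 'M', 'M'],
    'device': ['and', 'and', 'iOS', 'iOS'],
    'price': [299, 499, 499, 599],
    'month1': [299.0, np.nan, 499.0, 599.0],  # NaN marks purchases outside the first 28 days
})

# Named aggregation: output_column=(input_column, function)
summary = toy.groupby(['gender', 'device'], as_index=False).agg(
    month1_mean=('month1', 'mean'),
    month1_median=('month1', 'median'),
    price_mean=('price', 'mean'),
    price_median=('price', 'median'),
)

print(summary)
```

Both `mean` and `median` skip `NaN` by default, so the `month1` statistics cover only purchases inside the first-month window while the `price` statistics cover all purchases.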
