**Customer Lifetime Value analysis**

Problem: How much to spend on acquiring new customers?

*What is a customer's expected lifetime on the platform, and how much revenue do they bring in on average?*

Factors to consider:
- purchase frequency (simplified here to the total number of purchases per customer, rather than, for example, purchases per year)
- average order value
- average customer lifespan (a survival analysis method would suit this, but that's for another day)
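
To make the first two factors concrete, here is a minimal sketch on a made-up orders table. The column names `userID` and `sales` follow `presentation_table`. The values are invented, and the sketch assumes `sales` is stored in cents, as the division by 100 later in this notebook suggests.

```python
import pandas as pd

# Toy orders table with invented values; sales assumed to be in cents
toy = pd.DataFrame({
    "userID": [1, 1, 1, 2, 3, 3],
    "sales": [500, 700, 600, 1000, 400, 400],
})

toy_aov = toy["sales"].mean() / 100                      # euros per order
toy_avg_purchases = toy.groupby("userID").size().mean()  # orders per customer
toy_clv = toy_aov * toy_avg_purchases                    # euros per customer

# Sanity check: AOV * purchases per customer equals total revenue per customer
toy_revenue_per_customer = toy["sales"].sum() / 100 / toy["userID"].nunique()

print(f"AOV: {toy_aov:.2f}, purchases: {toy_avg_purchases:.2f}, CLV: {toy_clv:.2f}")
```

With these simplifications the CLV is total revenue divided by the number of customers, so the split into two factors is mainly useful for seeing which one drives the result.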

In [105]:
import pandas as pd
import sqlite3
import matplotlib.pyplot as plt
import seaborn as sns


conn = sqlite3.connect('mock_resq.db')

df = pd.read_sql_query("SELECT * FROM presentation_table", conn)
conn.close()

print(df.head())
print()
print(df.info())

                userID   cohort  M1_retention                   id  \
0   833181563296211638  2023-03             1  4648711062057701806   
1  7763311846463275691  2022-11             1  1676056141507951956   
2  8919282109171104948  2022-10             0  7745602867536251060   
3  5785370845306063462  2022-11             0  7319989469562109720   
4  8918527236425591239  2022-09             1  8979946097528312402   

             createdAt   sales              partner segment  
0  2023-08-31 10:14:49  1000.0  3518867990385707647    meal  
1  2023-03-21 17:04:54   400.0  6413422964860176913    meal  
2  2023-07-19 09:48:28   680.0   123356649204044788   snack  
3  2023-08-10 12:29:01  1099.0  7268869293921836511    meal  
4  2022-10-03 09:55:15   200.0  7530970657789428790   snack  

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 299971 entries, 0 to 299970
Data columns (total 8 columns):
 #   Column        Non-Null Count   Dtype  
---  ------        --------------   -----  
 0   use

In [106]:
# Change dates to datetime
df['cohort'] = pd.to_datetime(df['cohort'])
df['createdAt'] = pd.to_datetime(df['createdAt'])

In [113]:
# Let's start with average order value, since it's the simplest
# (the division by 100 assumes that `sales` is stored in cents)

aov = df['sales'].mean() / 100
print(f'Average order value: {aov:.2f} euros')

Average order value: 7.70 euros


In [108]:
# Purchase frequency
# We are interested in how many purchases an average customer makes

per_customer = df.groupby('userID').size()
avg_purchases = per_customer.mean()
print(f'The average customer makes {avg_purchases:.3f} purchases in their lifetime')
# note: this also works: len(df) / len(df.groupby('userID').size())

counts_df = df['userID'].value_counts().reset_index(name='counts')

# normalized df shows the % of each order count
norm_counts_df = counts_df['counts'].value_counts(normalize=True)
print()
print('n orders and their percentage')
print(norm_counts_df)
print()

# select by order count (the index label) rather than by position in the sorted output
prop_only_one_order = norm_counts_df.loc[1]
prop_two_to_five = norm_counts_df[(norm_counts_df.index >= 2) & (norm_counts_df.index <= 5)].sum()
prop_more_than_five = norm_counts_df[norm_counts_df.index > 5].sum()
print(f'{100 * prop_only_one_order:.2f}% of customers have made only one order')
print(f'{100 * prop_two_to_five:.2f}% of customers have made between 2 and 5 orders')
print(f'{100 * prop_more_than_five:.2f}% of customers have made more than 5 orders')

# recurring customers df: orders from customers who returned for at least a second order
# (used for the lifespan calculation)
rc_df = df[df['userID'].isin(counts_df[counts_df['counts'] > 1]['userID'])]
#rc_df 

The average customer makes 2.435 purchases in their lifetime

n orders and their percentage
counts
1      0.539421
2      0.197300
3      0.094961
4      0.053633
5      0.032189
         ...   
72     0.000008
76     0.000008
95     0.000008
57     0.000008
266    0.000008
Name: proportion, Length: 63, dtype: float64

53.94% of customers have made only one order
37.81% of customers have made between 2 and 5 orders
8.25% of customers have made more than 5 orders
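
The three buckets above can also be built in a single step with `pd.cut`, which groups by the order count itself instead of by position in the sorted output. A minimal sketch on invented per-customer counts; `orders_per_customer` is a hypothetical stand-in for `counts_df['counts']`:

```python
import pandas as pd

# Hypothetical per-customer order counts (invented values)
orders_per_customer = pd.Series([1, 1, 1, 2, 3, 5, 6, 12], name="counts")

# Bins are right-inclusive: (0, 1], (1, 5], (5, inf)
buckets = pd.cut(
    orders_per_customer,
    bins=[0, 1, 5, float("inf")],
    labels=["1 order", "2-5 orders", "more than 5 orders"],
)

# Share of customers in each bucket, kept in bucket order
bucket_share = buckets.value_counts(normalize=True, sort=False)
print((100 * bucket_share).round(2))
```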


In [111]:
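# average customer lifespan
# we can for example use the recurring customers and calculate:
# last order date - first order date
# note: this leaves out one-time customers, and customers who are still active
# at the end of the data are cut off, so treat the result as a rough estimate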

user_lifespans = rc_df.groupby('userID')['createdAt'].agg(
    first_purchase='min',
    last_purchase='max',
    purchase_count='count',
)
user_lifespans['lifespan'] = (user_lifespans['last_purchase'] - user_lifespans['first_purchase']).dt.days
avg_lifespan_days = user_lifespans['lifespan'].mean()

print(f"Average lifespan for recurring customers: {avg_lifespan_days:.2f} days")

Average lifespan for recurring customers: 167.12 days
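
The average above only covers recurring customers. As a rough illustration of how much that choice matters, the sketch below computes the lifespan on an invented orders table twice: once for recurring customers only, and once for everyone, counting one-time customers as a lifespan of 0 days. The data and the `toy_` names are made up; the zero-day convention is an assumption, not something this analysis settles.

```python
import pandas as pd

# Toy orders (invented): user 1 spans 30 days, user 2 ordered once, user 3 spans 90 days
toy_orders = pd.DataFrame({
    "userID": [1, 1, 2, 3, 3, 3],
    "createdAt": pd.to_datetime([
        "2023-01-01", "2023-01-31",
        "2023-02-10",
        "2023-03-01", "2023-03-11", "2023-05-30",
    ]),
})

toy_spans = toy_orders.groupby("userID")["createdAt"].agg(
    first_purchase="min",
    last_purchase="max",
    purchase_count="count",
)
toy_spans["lifespan"] = (toy_spans["last_purchase"] - toy_spans["first_purchase"]).dt.days

# Recurring customers only, as in the cell above
recurring_mean = toy_spans.loc[toy_spans["purchase_count"] > 1, "lifespan"].mean()
# All customers, with one-time customers counted as 0 days (an assumption)
overall_mean = toy_spans["lifespan"].mean()

print(f"recurring only: {recurring_mean:.1f} days, all customers: {overall_mean:.1f} days")
```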


In [118]:
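# Customer Lifetime Value
# CLV = average order value * average purchases per customer

clv = aov * avg_purchases
print(f'Average Customer Lifetime Value: {clv:.2f} euros')
# caveat: clv and avg_purchases cover all customers, while avg_lifespan_days
# covers recurring customers only, so the two figures below are rough estimates
print(f"Average revenue per day: {clv / avg_lifespan_days:.2f} euros")
print(f"Average time between purchases: {avg_lifespan_days / avg_purchases:.2f} days")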


Average Customer Lifetime Value: 18.76 euros
Average revenue per day: 0.11 euros
Average time between purchases: 68.64 days
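
The time between purchases above is derived as lifespan divided by purchase count. An alternative is to measure the gaps between consecutive orders directly. A minimal sketch on invented data, assuming the same `userID` and `createdAt` columns; the `toy_` and `gap` names are hypothetical:

```python
import pandas as pd

# Toy orders (invented): user 1 has one gap, user 2 has none, user 3 has two
toy_orders = pd.DataFrame({
    "userID": [1, 1, 2, 3, 3, 3],
    "createdAt": pd.to_datetime([
        "2023-01-01", "2023-01-31",
        "2023-02-10",
        "2023-03-01", "2023-03-11", "2023-05-30",
    ]),
})

# Sort so that diff() compares each order with the same user's previous order
toy_orders = toy_orders.sort_values(["userID", "createdAt"])

# The first order of each user has no previous order, so its gap is missing and is dropped
gaps = toy_orders.groupby("userID")["createdAt"].diff().dt.days.dropna()
mean_gap = gaps.mean()

print(f"Average gap between consecutive orders: {mean_gap:.1f} days")
```

Customers with a single order contribute no gaps here, so this measures how often returning customers come back, not how likely a customer is to return at all.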
