## Customer Lifetime Value (CLV)

### 1.Overview

Customer Lifetime Value (CLV) analysis is a crucial concept in marketing and customer relationship management. It helps businesses understand the long-term value that each customer brings to the company, enabling them to make data-driven decisions regarding customer acquisition, retention, and marketing strategies.

### 2.Target

By analyzing customer lifetime value, you can identify the most effective marketing channels and campaigns for acquiring high-value customers. Additionally, you can develop targeted retention strategies to keep these valuable customers engaged and loyal.

### 3.DataSet information:

For this project we will use a dataset describing customers’ relationships with the business. The dataset is stored as `customer_acquisition_data.csv` in the `Data` folder.

**Attribute Information:**

* **customer_id:** Customer Identification.
* **channel:** The various marketing and communication channels through which a company interacts with its customers.
* **cost:** The cost associated with acquiring and serving the customer.
* **conversion_rate:** The share of potential customers who take a desired action, such as making a purchase, out of everyone who was exposed to a marketing campaign or visited a particular channel. The values are stored as fractions (for example, 0.167592 rather than 16.76%).
* **revenue:** The total amount of money generated from sales or customer transactions within a specific period.

### 4.Import libraries

In [1]:
#Import main libraries:
import pandas as pd
import plotly.graph_objs as go
import plotly.express as px
import plotly.io as pio
pio.templates.default = "plotly_dark"

### 5.Import DataSet

In [2]:
#Import Dataset:
df_clv = pd.read_csv("Data/customer_acquisition_data.csv")
df_clv.sample(n=10)

| | customer_id | channel | cost | conversion_rate | revenue |
|---|---|---|---|---|---|
| 230 | 231 | email marketing | 5.246263 | 0.043822 | 1643 |
| 76 | 77 | email marketing | 5.246263 | 0.043822 | 1736 |
| 84 | 85 | paid advertising | 30.450327 | 0.016341 | 4312 |
| 415 | 416 | social media | 9.546326 | 0.167592 | 941 |
| 123 | 124 | referral | 8.320327 | 0.123145 | 3376 |
| 312 | 313 | social media | 9.546326 | 0.167592 | 2974 |
| 355 | 356 | social media | 9.546326 | 0.167592 | 3278 |
| 494 | 495 | paid advertising | 30.450327 | 0.016341 | 956 |
| 517 | 518 | paid advertising | 30.450327 | 0.016341 | 1397 |
| 320 | 321 | paid advertising | 30.450327 | 0.016341 | 1804 |


In [3]:
#Dataframe Size:
print('df_clv dimensions: ', df_clv.shape)

df_clv dimensions:  (800, 5)


**OBS**:
* There are 5 columns (features) and 800 rows.

### 6.Data Preparation:

In [4]:
#Check for missing values and duplicate rows:
print("Duplicate rows:\n ", df_clv.duplicated().sum())
print("Missing Values:\n" + df_clv.isnull().sum().to_string())

Duplicate rows:
  0
Missing Values:
customer_id        0
channel            0
cost               0
conversion_rate    0
revenue            0


There are no duplicate rows or missing values.
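
The same two checks can be wrapped in a small helper, so they are easy to rerun whenever the data file changes. This is only a sketch: `check_quality` is a hypothetical helper name, and the toy frame below uses made-up values chosen to trigger both conditions.

```python
import pandas as pd


def check_quality(df: pd.DataFrame) -> dict:
    """Return the number of duplicated rows and the missing values per column."""
    return {
        "duplicate_rows": int(df.duplicated().sum()),
        "missing_values": df.isnull().sum().to_dict(),
    }


# Toy frame (made-up values) with one repeated row and one missing revenue.
toy = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "channel": ["referral", "social media", "social media", "referral"],
    "revenue": [100.0, 250.0, 250.0, None],
})

report = check_quality(toy)
print(report)
```

Calling `check_quality(df_clv)` on the real data should report zero duplicates and zero missing values, matching the output above.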

### 7.Data Visualization:

In [37]:
#Plot the distribution of acquisition cost per customer, colored by channel:
fig = px.histogram(data_frame=df_clv, 
                   x="cost", 
                   nbins=20, 
                   title='Distribution of Acquisition Cost',
                   color='channel',text_auto=True)
fig.show()

**OBS:**

* The **"referral"** and **"social media"** channels together account for the largest group of customers (around 400), with acquisition costs between roughly 8 and 10.

* **"paid advertising"** has 194 customers with the highest acquisition cost (approximately 30).

* **"email marketing"** has 214 customers with the lowest acquisition cost (approximately 5).
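
The counts and cost levels read off the histogram can also be checked numerically with a `groupby`. The sketch below is self-contained, so it rebuilds only the ten rows from the `df_clv.sample(n=10)` output shown earlier; the name `sample` is an assumption for illustration, and on the full data `df_clv` would take its place.

```python
import pandas as pd

# Ten rows copied from the sample output shown earlier (not the full dataset).
sample = pd.DataFrame({
    "channel": [
        "email marketing", "email marketing", "paid advertising", "social media",
        "referral", "social media", "social media", "paid advertising",
        "paid advertising", "paid advertising",
    ],
    "cost": [
        5.246263, 5.246263, 30.450327, 9.546326, 8.320327,
        9.546326, 9.546326, 30.450327, 30.450327, 30.450327,
    ],
})

# Number of customers and the range of acquisition cost per channel.
cost_summary = (
    sample.groupby("channel")["cost"]
    .agg(customers="count", min_cost="min", max_cost="max")
    .sort_values("min_cost")
)
print(cost_summary)
```

In these ten rows every channel has a single cost value, which is why the histogram shows a few isolated bars rather than a continuous distribution.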


In [62]:
#Plot the distribution of revenue per customer:
fig = px.histogram(data_frame=df_clv, 
                   x="revenue", 
                   nbins=50, 
                   title='Distribution of Revenue',
                   color='channel')
fig.show()

**OBS:**

* All channels appear to contribute to revenue roughly equally.

In [7]:
# Compare the average acquisition cost across channels
# Identify the most and least expensive channels
cost_by_channel = df_clv.groupby('channel')['cost'].mean().reset_index()

fig = px.bar(cost_by_channel, 
             x='channel', 
             y='cost', 
             title='Customer Acquisition Cost by Channel',
             color='channel')
fig.show()

**OBS:**

* "**paid advertising**" is the most expensive channel.
* "**email marketing**" is the least expensive channel.

In [8]:
# Plot "conversion_rate" vs "channel"
# Identify which channels are most and least effective at converting customers:
conversion_by_channel = df_clv.groupby('channel')['conversion_rate'].mean().reset_index()

fig = px.bar(conversion_by_channel, x='channel', 
             y='conversion_rate', 
             title='Conversion Rate by Channel',
             color='channel')
fig.show()

**OBS:**

* "**Social media**" is the most effective channel for converting customers.
* "**paid advertising**" is the least effective channel for converting customers.

In [9]:
# Calculate the total revenue by channel.
# Find the channels that generate the most and the least revenue:
revenue_by_channel = df_clv.groupby('channel')['revenue'].sum().reset_index()

fig = px.pie(data_frame=revenue_by_channel, 
             values='revenue', 
             names='channel', 
             title='Total Revenue by Channel', 
             hole=0.6, 
             color='channel')

fig.show()

**OBS:**

* "**email marketing**" generates the largest share of total revenue (27.3%).
* "**social media**" generates the smallest share of total revenue (22.3%).
* The gap between the largest and smallest shares is only 5 percentage points, so revenue is spread fairly evenly across channels.
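
The percentages shown in the pie chart are each channel's revenue divided by the total revenue. A minimal sketch of that calculation follows, using only the ten sample rows displayed earlier (the name `sample` is an assumption), so the resulting shares will differ from the 27.3% and 22.3% obtained on all 800 rows.

```python
import pandas as pd

# Channel and revenue for the ten sample rows shown earlier.
sample = pd.DataFrame({
    "channel": [
        "email marketing", "email marketing", "paid advertising", "social media",
        "referral", "social media", "social media", "paid advertising",
        "paid advertising", "paid advertising",
    ],
    "revenue": [1643, 1736, 4312, 941, 3376, 2974, 3278, 956, 1397, 1804],
})

# Total revenue per channel, then each channel's share of the overall total.
revenue_by_channel = sample.groupby("channel")["revenue"].sum()
revenue_share = (revenue_by_channel / revenue_by_channel.sum() * 100).sort_values(
    ascending=False
)
print(revenue_share.round(1))
```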

In [10]:
# Calculate the return on investment (ROI) for each channel.
# Here ROI is computed as the revenue-to-cost ratio:
# ROI = revenue / cost

df_clv['roi'] = df_clv['revenue'] / df_clv['cost']
roi_by_channel = df_clv.groupby('channel')['roi'].mean().reset_index()

fig = px.bar(roi_by_channel, 
             x='channel', 
             y='roi', 
             title='Return on Investment (ROI) by Channel',
             color='channel')
fig.show()

**OBS:**

* **email marketing** has the highest ROI of all channels.
* **paid advertising** has the lowest ROI.
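
ROI is often defined net of cost, as `(revenue - cost) / cost`, rather than as the plain ratio `revenue / cost` used in this notebook. The two definitions differ by exactly 1 for every customer, so they rank the channels identically. The sketch below illustrates this on four rows taken from the sample output shown earlier; the column names `roi_ratio` and `roi_net` are assumptions introduced for the comparison.

```python
import pandas as pd

# One row per channel, copied from the sample output shown earlier.
sample = pd.DataFrame({
    "channel": ["email marketing", "paid advertising", "social media", "referral"],
    "cost": [5.246263, 30.450327, 9.546326, 8.320327],
    "revenue": [1643, 4312, 941, 3376],
})

# Ratio used in this notebook versus the net-of-cost definition.
sample["roi_ratio"] = sample["revenue"] / sample["cost"]
sample["roi_net"] = (sample["revenue"] - sample["cost"]) / sample["cost"]

# Channel order under each definition, from highest to lowest ROI.
ranking_ratio = sample.sort_values("roi_ratio", ascending=False)["channel"].tolist()
ranking_net = sample.sort_values("roi_net", ascending=False)["channel"].tolist()

print(sample[["channel", "roi_ratio", "roi_net"]].round(2))
print("Same ranking:", ranking_ratio == ranking_net)
```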

### 8.Customer Lifetime Value (CLV):

Calculate the customer lifetime value from each channel with the following formula:

$$
\text{CLV} = (\text{revenue} - \text{cost}) \times \frac{\text{conversion\_rate}}{\text{cost}}
$$
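
To make the formula concrete, here is a minimal sketch that applies it to two customers from the sample output shown earlier. The function name `customer_lifetime_value` is a hypothetical helper, not part of the notebook.

```python
def customer_lifetime_value(revenue: float, cost: float, conversion_rate: float) -> float:
    """CLV as defined in this notebook: net revenue scaled by conversion rate per unit cost."""
    return (revenue - cost) * conversion_rate / cost


# A social media customer and a paid advertising customer from the sample rows.
social = customer_lifetime_value(revenue=2974, cost=9.546326, conversion_rate=0.167592)
paid = customer_lifetime_value(revenue=956, cost=30.450327, conversion_rate=0.016341)

print(f"social media: {social:.2f}, paid advertising: {paid:.2f}")
```

With this definition, a high acquisition cost lowers the score twice: once through the net revenue and again through the division by cost. Together with its low conversion rate, this is why the paid advertising customer scores far lower here.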

In [11]:
# customer lifetime Value:

df_clv['clv'] = (df_clv['revenue'] - df_clv['cost']) * df_clv['conversion_rate'] / df_clv['cost']

channel_clv = df_clv.groupby('channel')['clv'].mean().reset_index()

fig = px.bar(channel_clv, x='channel', y='clv', color='channel',
             title='Customer Lifetime Value by Channel')

fig.update_xaxes(title='Channel')
fig.update_yaxes(title='CLV')

fig.show()

**OBS:**
* "**Social Media**" and "**referral**" channels have the highest CLV.
* "**Paid Advertising**" has the lowest CLV.

In [33]:
#Compare the CLV distribution of "social media" and "referral" channels:

subset = df_clv.loc[df_clv['channel'].isin(['social media', 'referral'])]

fig = px.box(data_frame=subset, 
             x='channel', 
             y='clv', 
             title='CLV Distribution by Channel',
             color='channel',
             color_discrete_map={'referral': '#00CC96', 'social media': '#AB63FA'})

fig.update_xaxes(title='Channel')
fig.update_yaxes(title='CLV')
fig.update_layout(legend_title='Channel')

fig.show()

**OBS:**
* The Customer Lifetime Value of the "social media" channel is slightly higher than that of the "referral" channel.
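
The box plot comparison can be backed by a few summary statistics per channel. The sketch below uses only the four social media and referral rows from the sample output shown earlier (the name `sample` is an assumption), so it illustrates the calculation rather than reproducing the full-data result.

```python
import pandas as pd

# Social media and referral rows copied from the sample output shown earlier.
sample = pd.DataFrame({
    "channel": ["social media", "referral", "social media", "social media"],
    "cost": [9.546326, 8.320327, 9.546326, 9.546326],
    "conversion_rate": [0.167592, 0.123145, 0.167592, 0.167592],
    "revenue": [941, 3376, 2974, 3278],
})

# Same CLV definition as in the notebook.
sample["clv"] = (sample["revenue"] - sample["cost"]) * sample["conversion_rate"] / sample["cost"]

# Count, median and mean CLV per channel (the median is the line inside each box).
clv_summary = sample.groupby("channel")["clv"].agg(["count", "median", "mean"])
print(clv_summary.round(2))
```

On the full data, `subset.groupby('channel')['clv'].agg(['count', 'median', 'mean'])` would give the corresponding figures for all customers in the two channels.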

### 9.Final Observations:

1. **Reduce investment in "paid advertising"**: Since it is the most expensive channel and has the lowest Return on Investment (ROI), it may be beneficial to reconsider the allocation of resources to this channel. By reducing investment in paid advertising, you can potentially allocate those resources to other channels with higher effectiveness and ROI.

2. **Increase investment in "social media"**: "social media" has the highest conversion rate and the highest Customer Lifetime Value (CLV). This suggests that allocating more resources to social media marketing could yield better results and contribute to the overall success of the business.

3. **Increase investment in "referral"**: The "referral" channel has the second-highest Customer Lifetime Value (CLV), only slightly below "social media". This indicates that customer referrals can be a valuable source of new customers and revenue. Increasing investment in referral programs, or in strategies that encourage and incentivize customer referrals, can benefit long-term customer acquisition and retention.

4. By focusing on channels with **higher effectiveness**, **lower costs**, and **higher CLV**, you can optimize your marketing strategies and maximize the return on your marketing investments.
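
The metrics behind these recommendations can be collected into one table per channel, which makes the trade-offs easier to compare side by side. The sketch below assumes the same metric definitions used in this notebook and runs on the ten sample rows shown earlier, so its numbers and ordering will not match the full 800-row results exactly; `sample` and `summary` are names introduced here for illustration.

```python
import pandas as pd

# Ten rows copied from the sample output shown earlier (not the full dataset).
sample = pd.DataFrame({
    "channel": [
        "email marketing", "email marketing", "paid advertising", "social media",
        "referral", "social media", "social media", "paid advertising",
        "paid advertising", "paid advertising",
    ],
    "cost": [
        5.246263, 5.246263, 30.450327, 9.546326, 8.320327,
        9.546326, 9.546326, 30.450327, 30.450327, 30.450327,
    ],
    "conversion_rate": [
        0.043822, 0.043822, 0.016341, 0.167592, 0.123145,
        0.167592, 0.167592, 0.016341, 0.016341, 0.016341,
    ],
    "revenue": [1643, 1736, 4312, 941, 3376, 2974, 3278, 956, 1397, 1804],
})

# Per-customer metrics, defined as in the notebook.
sample["roi"] = sample["revenue"] / sample["cost"]
sample["clv"] = (sample["revenue"] - sample["cost"]) * sample["conversion_rate"] / sample["cost"]

# Average of every metric per channel, best CLV first.
summary = (
    sample.groupby("channel")[["cost", "conversion_rate", "revenue", "roi", "clv"]]
    .mean()
    .sort_values("clv", ascending=False)
)
print(summary.round(3))
```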
