
# Plum Case Study

## Objectives
1. **Which acquisition source performs best?**
2. **Which types of users are high quality?**

This report analyzes user acquisition performance and user quality using revenue, retention, and feature adoption metrics.


## 1. Exploratory Data Analysis

### Population Percentage by Acquisition Source

In [13]:
import pandas as pd
import plotly.express as px

# Convert the metric columns to numeric; errors='coerce' turns non-numeric values into NaN
acquisition_performance['total_revenue'] = pd.to_numeric(acquisition_performance['total_revenue'], errors='coerce')
acquisition_performance['total_cpa'] = pd.to_numeric(acquisition_performance['total_cpa'], errors='coerce')
acquisition_performance['roi'] = pd.to_numeric(acquisition_performance['roi'], errors='coerce')  # Will convert "High ROI (no CPA)" to NaN
acquisition_performance['revenue_per_user'] = pd.to_numeric(acquisition_performance['revenue_per_user'], errors='coerce')

# Check the updated data types
print(acquisition_performance.dtypes)

print(acquisition_performance)



acquisition_source     object
total_revenue         float64
total_cpa             float64
total_users             int64
retention_rate         object
roi                   float64
revenue_per_user      float64
dtype: object
  acquisition_source  total_revenue  total_cpa  total_users retention_rate  \
1             Google       54706.46   21085.18         2798           0.87   
3           Referral       48012.39   21464.61         2846           0.90   
4             Tiktok       43736.04   28863.64         2305           0.85   
0           Facebook       34625.74   33360.73         1905           0.86   
2            Organic          24.59       0.00          146           0.92   

    roi  revenue_per_user  
1  1.59             19.55  
3  1.24             16.87  
4  0.52             18.97  
0  0.04             18.18  
2   NaN              0.17  
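
The `roi` and `revenue_per_user` columns can be cross-checked against the other columns in the printed table. The sketch below rebuilds the table by hand and assumes ROI is defined as (revenue − CPA) / CPA. That definition is inferred from the printed figures, not stated in the report. The `summary` and `cost` names are illustrative only.

```python
import numpy as np
import pandas as pd

# Figures copied from the printed table above.
summary = pd.DataFrame({
    "acquisition_source": ["Google", "Referral", "Tiktok", "Facebook", "Organic"],
    "total_revenue": [54706.46, 48012.39, 43736.04, 34625.74, 24.59],
    "total_cpa": [21085.18, 21464.61, 28863.64, 33360.73, 0.00],
    "total_users": [2798, 2846, 2305, 1905, 146],
})

# Assumed definition: ROI = (revenue - cost) / cost.
# A zero cost is replaced with NaN so Organic yields NaN instead of infinity.
cost = summary["total_cpa"].replace(0, np.nan)
summary["roi"] = ((summary["total_revenue"] - cost) / cost).round(2)

# Revenue per user is total revenue divided by the number of users.
summary["revenue_per_user"] = (summary["total_revenue"] / summary["total_users"]).round(2)

print(summary[["acquisition_source", "roi", "revenue_per_user"]])
```

Under that assumption the recomputed values agree with the table to two decimals, and Organic has no ROI because its acquisition cost is zero.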


### Trends in User Acquisition Sources Over Time

In [14]:
# Calculate retention rate for each acquisition source
retention_rate = df.groupby('acquisition_source')['retention'].mean()

# Sort the retention rates in descending order
retention_rate = retention_rate.sort_values(ascending=False)

# Convert the Series to a DataFrame for easier use with Plotly
retention_rate_df = retention_rate.reset_index()
retention_rate_df.columns = ['acquisition_source', 'retention_rate']

# Multiply retention rates by 100 for proper percentage display
retention_rate_df['retention_rate'] = retention_rate_df['retention_rate'] * 100

# Color map for the plot (keys must match the source labels in the data exactly)
color_map = {
    'Tiktok': '#d01962',
    'Google': '#04c751',
    'Referral': '#b576e8',
    'Organic': '#225c48',
    'Facebook': '#6c9cf4'
}

# Create the bar chart
fig = px.bar(
    retention_rate_df,
    x='acquisition_source',
    y='retention_rate',
    labels={'acquisition_source': 'Acquisition Source', 'retention_rate': 'Retention Rate'},
    title="Retention Rate by Acquisition Source",
    color='acquisition_source',  # Specify color column here
    color_discrete_map=color_map  # Use the color map
)

# Add numeric annotation for each acquisition source
for i, row in retention_rate_df.iterrows():
    fig.add_annotation(
        x=row['acquisition_source'],
        y=row['retention_rate'] - 5,
        text=f"{row['retention_rate']:.2f}%",  # Label with two decimals, e.g. 87.00%
        showarrow=False,
        font=dict(size=12, color="white"),  # Change font color to white for visibility
        align="center",
        yshift=5  # Adjust vertical position if necessary
    )

# Customize appearance
fig.update_layout(
    template="plotly_white",
    plot_bgcolor="#1a0a2b",
    paper_bgcolor="#1a0a2b",
    font_color="white",
    xaxis_title="Acquisition Source",
    yaxis_title="Retention Rate",
    yaxis_ticksuffix="%",  # Show % sign
    showlegend=False,
    yaxis=dict(
        tickformat='.0f',  # Proper format for integer percentage
        gridcolor='gray',  # Set gridlines to gray
        gridwidth=0.5,  # Thickness of gridlines
        showgrid=True,  # Show gridlines
        griddash='dot'
    )
)

fig.show()


### Platform Distribution by Acquisition Source

In [16]:
# Sort the DataFrame by 'revenue_per_user' in descending order
acquisition_performance = acquisition_performance.sort_values(by='revenue_per_user', ascending=False)

color_map = {
    'Tiktok': '#d01962',
    'Google': '#04c751',
    'Referral': '#b576e8',
    'Organic': '#225c48',  # Key must match the 'Organic' label in the data
    'Facebook': '#6c9cf4'
}

# Create bar plot for revenue_per_user
fig = px.bar(
    acquisition_performance,
    x='acquisition_source',
    y='revenue_per_user',
    title="Revenue Per User by Acquisition Source",
    labels={'acquisition_source': 'Acquisition Source', 'revenue_per_user': 'Revenue Per User'},
    color='acquisition_source',  # Specify color column here
    color_discrete_map=color_map  # Use the color map
)

# Customize appearance
fig.update_layout(
    template="plotly_white",
    plot_bgcolor="#1a0a2b",
    paper_bgcolor="#1a0a2b",
    font_color="white",
    xaxis_title="Acquisition Source",
    yaxis_title="Revenue Per User",
    title=dict(
        text="Revenue Per User by Acquisition Source",
        x=0.5,  # 0.5 centers the title
        xanchor='center',  # Centers the title horizontally
        font=dict(size=16)
    ),
    showlegend=False,
    yaxis=dict(
        range=[0, acquisition_performance['revenue_per_user'].max() + 1],  # Adjust the range based on maximum value
        gridcolor='gray',  # Set gridlines to gray
        gridwidth=0.5,  # Thickness of gridlines
        showgrid=True,  # Show gridlines
        zeroline=False,  # Disable zero line
        tickcolor='gray',  # Set color of the y-axis ticks
        griddash='dot'
    )
)

# Show plot
fig.show()


## 2. Acquisition Source Performance

### Revenue-to-CPA Ratio by Acquisition Source

### Retention Percentage by Acquisition Source

In [50]:
# Define the color map for acquisition sources
color_map = {
    'Tiktok': '#d01962',
    'Google': '#04c751',
    'Referral': '#b576e8',
    'Organic': '#225c48',  # Key must match the 'Organic' label in the data
    'Facebook': '#6c9cf4'
}

# Define a function for consistent styling
def update_graph_style(fig, y_range=None):
    fig.update_layout(
        template="plotly_white",
        plot_bgcolor="#1a0a2b",
        paper_bgcolor="#1a0a2b",
        font_color="white",
        xaxis_title="Acquisition Source",
        title=dict(
            text=fig.layout.title.text,  # Keep the existing title
            x=0.5,  # Center the title
            xanchor='center',
            font=dict(size=16)
        ),
        yaxis=dict(
            gridcolor='gray',
            gridwidth=0.5,
            zeroline=False,
            tickcolor='gray',
            griddash='dot',
            showgrid=True
        ),
        showlegend=False  # Keep legend off unless necessary
    )
    if y_range:
        fig.update_yaxes(range=y_range)  # Apply custom range if provided


# Calculate retention percentage per acquisition source
df['retained'] = df['disabled_month'].isna().astype(int)  # 1 if user is retained (not disabled), 0 if disabled
retention_by_source = df.groupby('acquisition_source').agg(
    total_users=('user_id', 'count'),
    retained_users=('retained', 'sum')
).reset_index()

# Calculate retention percentage
retention_by_source['retention_percentage'] = (retention_by_source['retained_users'] / retention_by_source['total_users']) * 100

# Format the retention percentage as a whole number with a % sign
retention_by_source['formatted_retention_percentage'] = retention_by_source['retention_percentage'].apply(lambda x: f"{x:.0f}%")

# Sort on the numeric column; sorting the formatted strings would order them lexicographically
retention_by_source = retention_by_source.sort_values(by='retention_percentage', ascending=False)

# --- Plot: Retention Percentage by Acquisition Source ---
fig2 = px.bar(
    retention_by_source,
    x='acquisition_source',
    y='retention_percentage',
    title="Retention Percentage by Acquisition Source",
    labels={'acquisition_source': 'Acquisition Source', 'retention_percentage': 'Retention (%)'},
    color='acquisition_source',
    color_discrete_map=color_map,
    text='formatted_retention_percentage'  # Use the formatted labels
)

# Update plot styling
update_graph_style(fig2, y_range=[0, retention_by_source['retention_percentage'].max() + 5])

# Make text bold
fig2.update_traces(textfont=dict(family="Arial", size=14, color="white", weight="bold"))

fig2.show()


### Total Revenue and Average Revenue Per User

Total Revenue by Acquisition Source

- **Google** generates the highest total revenue (£54,706), about 14% more than Referral.
- **Referral** follows with £48,012.
- **Tiktok** brings in £43,736, a solid result that trails Referral.
- **Facebook** generates £34,626, the lowest of the paid sources.
- **Organic** contributes the least, with only £25.

Key Takeaway:

Google generates the most total revenue, followed by Referral and Tiktok. Organic contributes almost nothing (£25 from 146 users), which suggests the channel needs strategic investment.

Recommendation:

Invest more in optimizing Organic for higher revenue potential. Continue to scale Referral and Tiktok, focusing on retaining high-value users. Explore deeper targeting on Facebook to boost its performance further.
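
To put the totals in proportion, each source's share of overall revenue can be computed directly. This is a minimal sketch that uses the totals from the summary table printed earlier in the report. The `total_revenue` and `share_pct` names are illustrative and do not come from the original notebook.

```python
import pandas as pd

# Total revenue per source, copied from the summary table in this report.
total_revenue = pd.Series(
    {
        "Google": 54706.46,
        "Referral": 48012.39,
        "Tiktok": 43736.04,
        "Facebook": 34625.74,
        "Organic": 24.59,
    },
    name="total_revenue",
)

# Share of overall revenue, as a percentage rounded to one decimal place.
share_pct = (total_revenue / total_revenue.sum() * 100).round(1)

print(share_pct.sort_values(ascending=False))
```

Shares make it easier to compare channels of very different size than raw totals do, which matters here because Organic has far fewer users than the paid sources.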

Average Revenue per User by Acquisition Source

- **Google** leads with an average revenue of £19.55 per user.
- **Tiktok** follows closely with £18.97 per user.
- **Facebook** comes in at £18.18 per user.
- **Referral** yields £16.87 per user, the lowest of the four larger sources.
- **Organic** stands at just £0.17 per user, a negligible average.

Key Takeaway:

Google provides the highest revenue per user, followed closely by Tiktok, while Organic underperforms with negligible revenue.

Recommendation:

Enhance the monetization of Organic to increase its average revenue per user. Continue scaling Google and Tiktok, which show strong average revenue. For Referral and Facebook, implement targeted strategies to boost per-user value.

## 3. Identifying High-Quality Users

### Revenue Distribution by Premium Features

In [65]:
# 3. Total and Average Revenue by Acquisition Source

# Define the color map for acquisition sources
color_map = {
    'Tiktok': '#d01962',
    'Google': '#04c751',
    'Referral': '#b576e8',
    'Organic': '#225c48',  # Key must match the 'Organic' label in the data
    'Facebook': '#6c9cf4'
}

# Define a function for consistent styling
def update_graph_style(fig, y_range=None):
    fig.update_layout(
        template="plotly_white",
        plot_bgcolor="#1a0a2b",
        paper_bgcolor="#1a0a2b",
        font_color="white",
        xaxis_title="Acquisition Source",
        title=dict(
            text=fig.layout.title.text,  # Keep the existing title
            x=0.5,  # Center the title
            xanchor='center',
            font=dict(size=16)
        ),
        yaxis=dict(
            gridcolor='gray',
            gridwidth=0.5,
            zeroline=False,
            tickcolor='gray',
            griddash='dot',
            showgrid=True
        ),
        showlegend=False  # Keep legend off unless necessary
    )
    if y_range:
        fig.update_yaxes(range=y_range)  # Apply custom range if provided

total_revenue_by_source = df.groupby('acquisition_source')['revenue'].sum().reset_index()
avg_revenue_by_source = df.groupby('acquisition_source')['revenue'].mean().reset_index()

# Sort dataframes by revenue
total_revenue_by_source = total_revenue_by_source.sort_values(by='revenue', ascending=False)
avg_revenue_by_source = avg_revenue_by_source.sort_values(by='revenue', ascending=False)

# Format the total revenue and average revenue with 0 decimal places and add £ sign
total_revenue_by_source['formatted_revenue'] = total_revenue_by_source['revenue'].apply(lambda x: f"{x:.0f}£")
avg_revenue_by_source['formatted_revenue'] = avg_revenue_by_source['revenue'].apply(lambda x: f"{x:.0f}£")

# Plot: Total Revenue by Acquisition Source
fig3 = px.bar(
    total_revenue_by_source,
    x='acquisition_source',
    y='revenue',
    title='Total Revenue by Acquisition Source',
    text='formatted_revenue',
    labels={'revenue': 'Total Revenue', 'acquisition_source': 'Acquisition Source'},
    color='acquisition_source',
    color_discrete_map=color_map
)
fig3.update_layout(yaxis_title="Total Revenue (£)", xaxis_title="Acquisition Source")
fig3.update_traces(textfont=dict(family="Arial", size=14, color="white", weight="bold"))
update_graph_style(fig3)  # Apply the style settings
fig3.show()

# Plot: Average Revenue Per User by Acquisition Source
fig4 = px.bar(
    avg_revenue_by_source,
    x='acquisition_source',
    y='revenue',
    title='Average Revenue per User by Acquisition Source',
    text='formatted_revenue',
    labels={'revenue': 'Avg Revenue Per User (£)', 'acquisition_source': 'Acquisition Source'},
    color='acquisition_source',
    color_discrete_map=color_map
)
fig4.update_layout(yaxis_title="Avg Revenue (£)", xaxis_title="Acquisition Source")
fig4.update_traces(textfont=dict(family="Arial", size=14, color="white", weight="bold"))
update_graph_style(fig4)  # Apply the style settings
fig4.show()


### Adoption of Premium Features

3. **Total Revenue by Acquisition Source**
   - Purpose: Identifies sources that drive overall revenue.
   - Insight: Valuable for understanding revenue generation at scale.

4. **Average Revenue per User by Acquisition Source**
   - Purpose: Pinpoints sources that bring high-revenue users individually.
   - Insight: Complements total revenue by providing per-user value data.


## Conclusion and Recommendations

### Best Acquisition Source
- **Google** performs the best with the highest ROI and revenue per user.
- **Referral** is also strong, with solid ROI and retention rates.

### High-Quality Users
- Users investing in **SIPP** and **Funds** generate the highest revenues.
- Premium plan users (Pro/Plus) have the longest retention, highlighting their quality.
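
A comparison of user quality by plan could be sketched as follows. The data is a made-up toy sample, and the `plan` column name is a hypothetical stand-in for however the plan tier is stored. Only `user_id`, `revenue`, and `disabled_month` appear in the notebook code. The retained flag follows the same rule used earlier in the report: a user counts as retained when `disabled_month` is missing.

```python
import pandas as pd

# Hypothetical toy data; the values are illustrative and not taken from the Plum dataset.
users = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6],
    "plan": ["Basic", "Basic", "Pro", "Pro", "Plus", "Plus"],  # assumed column name
    "revenue": [0.0, 5.0, 20.0, 30.0, 25.0, 35.0],
    "disabled_month": [3, None, None, None, 10, None],
})

# A user counts as retained when no disabled month is recorded.
users["retained"] = users["disabled_month"].isna().astype(int)

# Average revenue and retention rate per plan, highest average revenue first.
quality = (
    users.groupby("plan")
    .agg(
        users=("user_id", "count"),
        avg_revenue=("revenue", "mean"),
        retention_rate=("retained", "mean"),
    )
    .sort_values("avg_revenue", ascending=False)
)

print(quality)
```

The same grouping could be applied to feature flags such as SIPP or Funds, provided the dataset records them per user.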

### Recommendations
1. **Scale Google and Referral acquisition** to maximize ROI and revenue.
2. **Boost feature adoption** (SIPP, Interest Pocket) through targeted education and campaigns.
3. **Convert Basic users** to Pro/Plus plans with incentives to improve retention and revenue.
4. **Optimize Facebook and TikTok campaigns** to improve cost efficiency and retention.

By implementing these strategies, Plum can enhance both acquisition performance and user quality, driving sustainable growth.
