## Customer Campaign Targetting
ML implementation series for Product Managers, post #4

### DISCLAIMER: It is greatly beneficial if you know Python and ML basics before hand. If not, I would highly urge you to learn. This should be non-negotiable. This would form the basement for future posts in this series and your career as PM working with ML teams.

## Problem statement

The email marketing team sends campaigns to millions of customers every month. The Director of CRM walks into the meeting room, visibly frustrated:

**"We're blasting everyone with the same campaigns. Open rates are declining. Click rates are stuck around 3-4%. Our best customers are getting annoyed, and we're burning budget on customers who never respond."**

The current state:
- Send emails to entire database
- Hope for engagement
- Waste budget on non-responders
- Annoy loyal customers with irrelevant messages
- No way to know who will actually engage

Marketing has the data:
- Campaign history (who opened, clicked, converted)
- Customer behavior from posts 1-4 (churn risk, segments, CLV)
- But no intelligence layer to predict response

**As a PM, the question became:**  
Can we predict which customers will engage with each campaign before we send it?


## Dataset overview

Same customer data platform (CDP) with 5,000 customers from posts 1-4.

### Tables used:
- **cdp_campaign_responses**: Historical campaign data (5,849 campaign sends)
- **cdp_customers**: Customer demographics
- **cdp_customer_features**: Behavioral metrics

### Campaign data structure:

Each record contains:
- response_id: Unique campaign send identifier
- customer_id: Who received it
- campaign_name: Campaign identifier
- campaign_type: Email, SMS, Push, Social Media, Display Ads
- sent_date: When it was sent
- delivered, opened, clicked, converted: Response metrics
- unsubscribed: Did they opt out?

### Current performance baseline:
- Delivery rate: 95%+
- Open rate: 20-25%
- **Click rate: 3-4%** (our prediction target)
- Conversion rate: <1%

**The problem:** Only 3-4% of campaigns drive meaningful engagement. We're wasting 96% of sends.


## ML approach: random forest classification

### The core question

"Given a customer's profile and campaign history, will they click on this campaign?"

This is a binary classification problem:
- Class 1: Will click
- Class 0: Won't click

### Why this approach over other ML approaches?

Let's compare the options a PM should understand:

#### Option 1: Rule-based targeting
- **How it works:** "Send to everyone who opened the last campaign"
- **Pros:** Simple, no model needed, easy to explain
- **Cons:** Misses complex patterns, can't handle multiple signals simultaneously
- **When to use:** Never for production (too simplistic)
- **PM perspective:** This is what marketing does today, and it's not working

#### Option 2: Logistic regression
- **How it works:** "Calculate a weighted score based on features, predict probability"
- **Pros:** Fast, interpretable, works well with linear relationships
- **Cons:** Assumes linear relationships, doesn't capture feature interactions
- **When to use:** When you need extreme speed and simplicity
- **PM perspective:** Good baseline, but leaves ROI on the table

#### Option 3: Gradient boosting (XGBoost, LightGBM)
- **How it works:** "Build sequential trees, each correcting previous errors"
- **Pros:** Highest accuracy, handles complex patterns
- **Cons:** Slower to train, harder to explain, can overfit
- **When to use:** When you need maximum accuracy and have ML expertise
- **PM perspective:** Overkill for most use cases, hard to explain to stakeholders

#### Option 4: Random forest (our choice)
- **How it works:** "Build many decision trees, average their predictions"
- **Pros:** 
  - Handles non-linear relationships
  - Captures feature interactions naturally
  - Works well with imbalanced data
  - Provides feature importance
  - Less prone to overfitting than single trees
  - Interpretable enough for stakeholders
- **Cons:** Slower than logistic regression (but still fast enough)
- **When to use:** When you need accuracy + interpretability + robustness
- **PM perspective:** Best balance of performance and explainability


### Why we chose random forest:

**1. Campaign response has complex, non-linear patterns**
- A customer with 50% open rate and high CLV behaves differently than one with 50% open rate and low CLV
- Segment + engagement + timing create interactions that logistic regression can't capture

**2. Data is highly imbalanced**
- Only 3-4% of campaigns result in clicks
- Random forest with class balancing handles this well

**3. We need feature importance for stakeholder buy-in**
- Marketing needs to understand WHY the model makes predictions
- Random forest provides clear feature importance rankings

**4. It's production-ready**
- Fast enough to score millions of customers
- Stable enough to not require constant retraining
- Robust enough to handle new patterns


## How random forest works (PM-friendly explanation)

Imagine you're trying to predict if a customer will click on a campaign email.

### The traditional way (what marketing does now):

"If they clicked on the last campaign, send them this one too."

Problem: Too simplistic. Ignores timing, segment, campaign type, recent behavior.

### The random forest way:

Build 100 different "decision trees," each asking different questions about the customer:

**Tree 1:**
- "Did they click on previous campaigns?"
  - Yes → "Are they in the high-value segment?"
    - Yes → "Predict: WILL CLICK"
    - No → "Is their email open rate >20%?"
      - Yes → "Predict: MIGHT CLICK"
      - No → "Predict: WON'T CLICK"
  - No → "Predict: WON'T CLICK"

**Tree 2:**
- "Is their email open rate >20%?"
  - Yes → "Are they at high churn risk?"
    - Yes → "Predict: WILL CLICK (urgent retention)"
    - No → "Days since last purchase?"
      - <30 → "Predict: WILL CLICK"
      - >30 → "Predict: MIGHT CLICK"
  - No → "Predict: WON'T CLICK"

**Tree 3:**
- "What's their CLV?"
  - >\$1000 → "Campaign type?"
    - Email → "Predict: WILL CLICK"
    - SMS → "Predict: MIGHT CLICK"
  - <\$1000 → "Predict: WON'T CLICK"

Each of the 100 trees votes. The final prediction:
- If 70+ trees say "WILL CLICK" → High confidence (send campaign)
- If 40-70 trees say "WILL CLICK" → Medium confidence (personalize first)
- If <40 trees say "WILL CLICK" → Low confidence (skip)


### Why this works:

**1. Wisdom of crowds:** Different trees capture different patterns, averaging reduces errors

**2. Probability scores:** Not just "yes/no" but "70% likely to click" → lets you set thresholds

**3. Handles complexity:** Can learn that "high engagement + low churn + recent purchase + email campaign" = high response

**4. Interpretable:** You can look at feature importance and explain to marketing why certain customers are prioritized


$ Let's-get-into-it $

In [10]:

import pandas as pd
import numpy as np
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.cluster import KMeans

print("="*70)
print("CAMPAIGN RESPONSE PREDICTION")
print("="*70)

# Load all datasets
print("\nLoading datasets...")
cdp_customers = pd.read_csv('cdp_customers.csv')
cdp_customer_features = pd.read_csv('cdp_customer_features.csv')
cdp_campaigns = pd.read_csv('cdp_campaign_responses.csv')

print(f"✓ Customers: {len(cdp_customers):,}")
print(f"✓ Customer features: {len(cdp_customer_features):,}")
print(f"✓ Campaign responses: {len(cdp_campaigns):,}")

# Merge customer data
customer_data = cdp_customers.merge(cdp_customer_features, on='customer_id', how='left')


print("="*70)
print("CURRENT STATE: Campaign Performance")
print("="*70)

# Calculate current metrics
total_campaigns = len(cdp_campaigns)
delivered_rate = cdp_campaigns['delivered'].mean() * 100
open_rate = cdp_campaigns['opened'].mean() * 100
click_rate = cdp_campaigns['clicked'].mean() * 100
conversion_rate = cdp_campaigns['converted'].mean() * 100

print(f"\n📧 Campaign Metrics:")
print(f"   Total campaigns sent: {total_campaigns:,}")
print(f"   Delivery rate: {delivered_rate:.1f}%")
print(f"   Open rate: {open_rate:.1f}%")
print(f"   Click rate: {click_rate:.1f}%")
print(f"   Conversion rate: {conversion_rate:.1f}%")

print(f"\n💡 The Problem:")
print(f"   With {open_rate:.1f}% open rate and {click_rate:.1f}% CTR,")
print(f"   we're wasting {100-click_rate:.1f}% of email sends.")
print(f"   That means {(total_campaigns * (100-click_rate)/100):.0f} wasted emails.")

# Campaign type breakdown
print(f"\n📊 Campaign Type Distribution:")
print(cdp_campaigns['campaign_type'].value_counts())


CAMPAIGN RESPONSE PREDICTION

Loading datasets...
✓ Customers: 5,000
✓ Customer features: 5,000
✓ Campaign responses: 5,849
CURRENT STATE: Campaign Performance

📧 Campaign Metrics:
   Total campaigns sent: 5,849
   Delivery rate: 95.1%
   Open rate: 23.6%
   Click rate: 3.7%
   Conversion rate: 0.3%

💡 The Problem:
   With 23.6% open rate and 3.7% CTR,
   we're wasting 96.3% of email sends.
   That means 5631 wasted emails.

📊 Campaign Type Distribution:
campaign_type
Email                2884
SMS                   914
Push Notification     883
Social Media          589
Display Ads           579
Name: count, dtype: int64


In [12]:
# Step 1: Create segment labels (from Post #2)
segmentation_features = [
    'recency_days', 'frequency', 'monetary_value',
    'avg_order_value', 'engagement_score',
    'email_open_rate', 'email_click_rate'
]

X_segment = customer_data[segmentation_features].fillna(customer_data[segmentation_features].median())
kmeans = KMeans(n_clusters=4, random_state=42, n_init=10)
customer_data['segment_label'] = kmeans.fit_predict(X_segment)

In [14]:
# Step 2: Merge campaign data with customer features
df_campaign = cdp_campaigns.merge(
    customer_data[['customer_id', 'segment_label', 'churn_risk',
                   'customer_lifetime_value', 'engagement_score',
                   'email_open_rate', 'email_click_rate', 'frequency', 'recency_days']],
    on='customer_id',
    how='left'
)

print(f"Campaign data shape: {df_campaign.shape}")

Campaign data shape: (5849, 18)


In [16]:
# Encode churn_risk (from Post #1)
churn_mapping = {'Low': 0, 'Medium': 1, 'High': 2}
df_campaign['churn_risk_encoded'] = df_campaign['churn_risk'].map(churn_mapping)

# Encode campaign_type
campaign_type_mapping = {
    'Email': 0,
    'SMS': 1,
    'Push Notification': 2,
    'Social Media': 3,
    'Display Ads': 4
}
df_campaign['campaign_type_encoded'] = df_campaign['campaign_type'].map(campaign_type_mapping)

# Create time-based features
df_campaign['sent_date'] = pd.to_datetime(df_campaign['sent_date'])
df_campaign['days_since_campaign'] = (pd.Timestamp.now() - df_campaign['sent_date']).dt.days


In [18]:
# Features combining all previous posts
feature_columns = [
    'segment_label',              # From Post #2
    'churn_risk_encoded',         # From Post #1
    'customer_lifetime_value',    # From Post #4
    'engagement_score',
    'email_open_rate',
    'email_click_rate',
    'frequency',
    'recency_days',
    'campaign_type_encoded',
    'days_since_campaign'
]

# Prepare features and target
X = df_campaign[feature_columns].copy()
X = X.fillna(X.median())

# Target: Did they click?
y = df_campaign['clicked'].astype(int)

print(f"Feature matrix: {X.shape}")
print(f"Positive class rate: {y.mean():.1%}")


Feature matrix: (5849, 10)
Positive class rate: 3.7%


In [22]:
# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"Training set: {X_train.shape[0]:,} campaigns")
print(f"Test set: {X_test.shape[0]:,} campaigns")


Training set: 4,679 campaigns
Test set: 1,170 campaigns


In [24]:
# Train Random Forest with class balancing
rf_model = RandomForestClassifier(
    n_estimators=100,
    max_depth=10,
    min_samples_split=50,
    class_weight='balanced',  # Critical for imbalanced data
    random_state=42,
    n_jobs=-1
)

# Train model
rf_model.fit(X_train, y_train)

# Get predictions and probabilities
y_pred_test = rf_model.predict(X_test)
y_pred_proba_test = rf_model.predict_proba(X_test)[:, 1]

print("Model trained successfully!")


Model trained successfully!


In [26]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score

# Calculate metrics
accuracy = accuracy_score(y_test, y_pred_test)
precision = precision_score(y_test, y_pred_test)
recall = recall_score(y_test, y_pred_test)
f1 = f1_score(y_test, y_pred_test)
roc_auc = roc_auc_score(y_test, y_pred_proba_test)

print(f"Accuracy: {accuracy:.1%}")
print(f"Precision: {precision:.1%}")
print(f"Recall: {recall:.1%}")
print(f"F1 Score: {f1:.3f}")
print(f"ROC-AUC: {roc_auc:.3f}")


Accuracy: 88.2%
Precision: 24.2%
Recall: 100.0%
F1 Score: 0.389
ROC-AUC: 0.923


In [28]:
# Analyze feature importance
feature_importance = pd.DataFrame({
    'feature': X.columns,
    'importance': rf_model.feature_importances_
}).sort_values('importance', ascending=False)

print("Feature importance:")
print(feature_importance)


Feature importance:
                   feature  importance
5         email_click_rate    0.775638
4          email_open_rate    0.133850
9      days_since_campaign    0.020754
7             recency_days    0.018524
3         engagement_score    0.017147
2  customer_lifetime_value    0.014627
6                frequency    0.009781
8    campaign_type_encoded    0.006179
0            segment_label    0.001785
1       churn_risk_encoded    0.001714


## Why this follows posts 1, 2, 3, and 4

We've built the strategic foundation across four posts:

**Post 1 (churn prediction):** Know who will leave  
**Post 2 (segmentation):** Know who they are  
**Post 3 (recommendations):** Know what to offer  
**Post 4 (CLV):** Know how much to invest

But all of this customer intelligence is useless if we can't engage customers effectively.

**Post 5 answers:** When and how to engage each customer through campaigns.

This completes the ML strategy stack: from intelligence to execution.


## Business impact

### The smart targeting strategy

Traditional marketing blasts everyone and hopes for engagement. ML-powered marketing targets intelligently.

**Current approach:**
- Send to entire database
- Low engagement rates
- High email fatigue
- Wasted budget

**ML approach:**
- Score every customer by response likelihood
- Send to top 30-50% by score
- Higher engagement rates
- Lower costs
- Happier customers

### Expected improvements:

| Metric | Current | ML-Powered | Improvement |
|--------|---------|------------|-------------|
| Emails sent | 100% | 30-40% | 60-70% reduction |
| Click rate | 3-4% | 8-12% | 2-3x improvement |
| Cost per click | High | Low | 70-80% reduction |
| Customer satisfaction | Low | Higher | Less email fatigue |


### Segment-specific insights (connecting to post 2):

Different segments respond differently to campaigns:

**Campaign Champions:**
- High engagement, respond to most campaigns
- Strategy: Send frequently, test new campaign types

**High-Value Customers:**
- Selective engagement, respond to personalized offers
- Strategy: Send less frequently, include recommended products (post 3)

**Engaged Browsers:**
- Medium engagement, respond to product-focused campaigns
- Strategy: Include recommendations, conversion-focused messaging

**At-Risk Dormant:**
- Low engagement, only respond to strong incentives
- Strategy: Win-back campaigns only, avoid over-mailing


## Key insights from the solution

### Insight 1: Past behavior is the strongest predictor

**Finding:** Historical email engagement (open rate, click rate) accounts for 75%+ of prediction power.

**What this means for PMs:**
- Customers who engaged before will engage again
- But timing matters: recent engagement > old engagement
- Use this to build a simple "engagement score" even without ML

**Action:** Create an engagement tier system:
- Tier 1 (hot): Clicked in last 30 days
- Tier 2 (warm): Opened in last 60 days
- Tier 3 (cold): No engagement in 90+ days
---------

### Insight 2: Segment membership changes response patterns (post 2)

**Finding:** Campaign Champions and Engaged Browsers have 2-3x higher response rates than At-Risk Dormant customers.

**What this means for PMs:**
- One-size-fits-all campaigns waste budget on wrong segments
- Segment-specific campaigns perform better
- Combining segment + engagement score = smarter targeting

**Action:** Build segment-specific campaign calendars:
- High-Value: Monthly product launches
- Engaged Browsers: Weekly deals on browsed categories
- Campaign Champions: Daily engagement campaigns (they love them)
- At-Risk Dormant: Quarterly win-back only

---------

### Insight 3: CLV determines campaign investment (post 4)

**Finding:** High-CLV customers warrant more campaign attempts even with lower response scores.

**What this means for PMs:**
- A high-value customer with 10% response rate is worth targeting
- A low-value customer with 10% response rate might not be
- ROI = (CLV × response rate) - campaign cost

**Action:** Create CLV-based targeting rules:
- CLV > \$2000: Send even if response score is medium
- CLV \$500-2000: Send only if response score is high
- CLV < \$500: Send only if response score is very high

--------

### Insight 4: Churn risk changes urgency (post 1)

**Finding:** High-churn customers are more responsive to retention campaigns but less responsive to standard marketing.

**What this means for PMs:**
- Don't waste retention messaging on low-churn customers
- High-churn customers need different campaign types
- Urgency matters: "Last chance" works for high-churn, annoys low-churn

**Action:** Create churn-specific campaign strategies:
- High churn + High CLV: Urgent retention with incentives
- High churn + Low CLV: Let them churn naturally (don't waste budget)
- Low churn: Standard marketing, no urgency

--------

### Insight 5: Campaign type matters

**Finding:** Email and SMS have higher response rates than Push or Social Media for this customer base.

**What this means for PMs:**
- Channel optimization matters
- Some customers prefer certain channels
- Test and learn by channel

**Action:** Build channel preference profiles:
- Email-responsive: Send email campaigns
- SMS-responsive: Send SMS for urgent offers
- Push-responsive: Use for time-sensitive deals
- Multi-channel: Test and rotate

--------

## Why this solution works (PM perspective)

### 1. It's built on four posts of foundation

This isn't a standalone model. It leverages:
- **Churn risk (post 1):** Prioritize retention campaigns
- **Segments (post 2):** Tailor messaging by customer type
- **Recommendations (post 3):** Personalize campaign content
- **CLV (post 4):** Allocate budget intelligently

Without posts 1-4, this would just be a campaign response model. With them, it's a complete engagement strategy.

### 2. It's immediately actionable

Clear probability scores mean clear decisions:
- Score > 0.7: Send
- Score 0.3-0.7: Personalize then send
- Score < 0.3: Skip

No ambiguity. Marketing can act on this today.

### 3. It reduces waste without sacrificing revenue

The goal isn't to send fewer emails. It's to send smarter emails.
- Send less to non-responders (reduce costs)
- Send more to likely responders (maintain revenue)
- Result: Same revenue, lower costs, happier customers

### 4. It's measurable

Unlike "spray and pray," ML targeting is measurable:
- Track predicted vs actual response
- Calculate ROI improvement
- A/B test ML targeting vs random sampling
- Prove value to stakeholders with data

### 5. It improves over time

As you send more campaigns and collect more data:
- Model learns new patterns
- Predictions get more accurate
- ROI improves continuously


## Connection to posts 1, 2, 3, and 4

### The complete ML strategy stack

Posts 1-5 form a complete customer engagement system:

**Strategic intelligence (posts 1-4):**
- **Post 1:** Predict churn → Know urgency
- **Post 2:** Segment customers → Know audience
- **Post 3:** Recommend products → Know content
- **Post 4:** Predict CLV → Know investment level

**Tactical execution (post 5):**
- **Post 5:** Predict response → Know who to engage

### Example workflow: new product launch campaign

**Step 1 (post 4):** Filter to customers with CLV > \$500 (worth the campaign cost)

**Step 2 (post 1):** Segment by churn risk:
- High churn: "Don't miss this launch" (urgency)
- Low churn: "Exclusive for loyal customers" (reward)

**Step 3 (post 2):** Target specific segments:
- High-Value Customers: Premium version
- Engaged Browsers: Standard version with discount
- Skip At-Risk Dormant (they won't respond anyway)

**Step 4 (post 3):** Personalize content:
- Include recommended complementary products
- Show "customers like you also bought..."

**Step 5 (post 5):** Score response likelihood:
- Send to top 40% by response score
- Personalize send time by engagement history

**Result:** 
- Traditional: 5M emails, 3% response, 150K clicks
- ML-powered: 2M emails, 7.5% response, 150K clicks
- Impact: Same clicks, 60% lower cost, less email fatigue

This is the power of layering ML models.


## What's next?
### Post 6: Next best action

We've built the intelligence (posts 1-4) and applied it to campaigns (post 5). The next question is:

"For each customer, what's the single best action we should take right now?"

Should we send a campaign? Recommend a product? Offer support? Update a retention offer?

Same dataset. New problem. Holistic customer engagement.

---------

Part of the "Machine learning for product leaders" series - teaching PMs just enough ML to lead with confidence.