# Digital Marketing Campaign Optimization

**Goal**: Maximize conversions by finding optimal intervention strategies

This notebook demonstrates using Intervention Search to optimize digital marketing campaigns by identifying which levers (budget, targeting, creative quality, etc.) to adjust for maximum conversions.

## 1. Load Marketing Campaign Data

In [1]:
import pandas as pd
import numpy as np
import networkx as nx
import warnings
warnings.filterwarnings('ignore')

# Load data
df = pd.read_csv('data/marketing_data.csv')
print(f"Loaded {len(df)} marketing campaigns")
print(f"\nKey metrics:")
print(f"  • Avg conversions: {df['conversions'].mean():.1f}")
print(f"  • Avg CPA: ${df['cost_per_acquisition'].mean():.2f}")
print(f"  • Avg CTR: {df['click_through_rate'].mean():.2f}%")
df.head()

Loaded 600 marketing campaigns

Key metrics:
  • Avg conversions: 270.0
  • Avg CPA: $61.44
  • Avg CTR: 3.20%


Unnamed: 0,campaign_id,ad_budget,targeting_quality,ad_creative_quality,landing_page_quality,audience_size,day_of_week,impressions,click_through_rate,clicks,conversion_rate,conversions,cost_per_click,cost_per_acquisition
0,CAMP_0000,7803.53,16.9,75.8,95.7,130234,Saturday,122722.0,2.432,2963.0,8.741,254.6,2.63,30.65
1,CAMP_0001,19038.93,27.9,2.5,73.8,578302,Thursday,302995.0,1.511,4532.0,7.118,328.4,4.2,57.98
2,CAMP_0002,14773.88,17.7,2.2,35.3,777454,Wednesday,244173.0,1.497,3651.0,4.803,175.3,4.05,84.28
3,CAMP_0003,12173.84,8.9,32.4,29.7,939242,Friday,206515.0,1.468,3056.0,3.995,126.4,3.98,96.3
4,CAMP_0004,3542.36,12.1,48.9,35.0,890765,Thursday,75565.0,1.911,1514.0,3.813,55.0,2.34,64.4
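
The sample rows hint at how the cost columns relate to the others: in the five rows shown, `cost_per_acquisition` matches `ad_budget / conversions` and `cost_per_click` matches `ad_budget / clicks` to within rounding. The sketch below checks this on a hand-copied version of those rows. Whether the relationship holds for the full file is an assumption to verify against `df`.

```python
import math

# Five rows copied from the df.head() output above (only the columns needed here).
rows = [
    # (ad_budget, clicks, conversions, cost_per_click, cost_per_acquisition)
    (7803.53, 2963.0, 254.6, 2.63, 30.65),
    (19038.93, 4532.0, 328.4, 4.20, 57.98),
    (14773.88, 3651.0, 175.3, 4.05, 84.28),
    (12173.84, 3056.0, 126.4, 3.98, 96.30),
    (3542.36, 1514.0, 55.0, 2.34, 64.40),
]

# The displayed values are rounded, so compare with a small absolute tolerance.
TOLERANCE = 0.05

cpc_ok = all(
    math.isclose(budget / clicks, cpc, abs_tol=TOLERANCE)
    for budget, clicks, _, cpc, _ in rows
)
cpa_ok = all(
    math.isclose(budget / conversions, cpa, abs_tol=TOLERANCE)
    for budget, _, conversions, _, cpa in rows
)

print(f"CPC consistent with ad_budget / clicks: {cpc_ok}")
print(f"CPA consistent with ad_budget / conversions: {cpa_ok}")
```

If these identities hold across the dataset, the two cost columns are deterministic functions of other columns, which is one reason to leave them out of the causal graph.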


## 2. Define Marketing Causal Graph

**Causal Structure:**
- `ad_budget → impressions → clicks → conversions`
- `audience_size → impressions`
- `day_of_week → impressions`
- `targeting_quality → click_through_rate → clicks`
- `ad_creative_quality → click_through_rate`
- `landing_page_quality → conversion_rate → conversions`

In [2]:
# Define nodes
nodes = ['ad_budget', 'targeting_quality', 'ad_creative_quality', 
         'landing_page_quality', 'audience_size', 'day_of_week',
         'impressions', 'click_through_rate', 'clicks', 
         'conversion_rate', 'conversions']

# Define causal edges
edges = [
    ('ad_budget', 'impressions'),
    ('audience_size', 'impressions'),
    ('day_of_week', 'impressions'),
    ('targeting_quality', 'click_through_rate'),
    ('ad_creative_quality', 'click_through_rate'),
    ('impressions', 'clicks'),
    ('click_through_rate', 'clicks'),
    ('landing_page_quality', 'conversion_rate'),
    ('clicks', 'conversions'),
    ('conversion_rate', 'conversions')
]

# Create adjacency matrix
adj_matrix = pd.DataFrame(0, index=nodes, columns=nodes)
for parent, child in edges:
    adj_matrix.loc[parent, child] = 1

print("Marketing Causal Graph Structure:")
print(f"  • Nodes: {len(nodes)}")
print(f"  • Edges: {len(edges)}")

Marketing Causal Graph Structure:
  • Nodes: 11
  • Edges: 10
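
Before training, it can help to confirm that the edge list really forms a directed acyclic graph, since a cycle would make the causal ordering undefined. The following is a minimal sketch using the standard-library `graphlib` module (Python 3.9+). It is a standalone check, not part of the `ht_categ` API.

```python
from graphlib import CycleError, TopologicalSorter

# Same edge list as in the cell above, as (parent, child) pairs.
edges = [
    ('ad_budget', 'impressions'),
    ('audience_size', 'impressions'),
    ('day_of_week', 'impressions'),
    ('targeting_quality', 'click_through_rate'),
    ('ad_creative_quality', 'click_through_rate'),
    ('impressions', 'clicks'),
    ('click_through_rate', 'clicks'),
    ('landing_page_quality', 'conversion_rate'),
    ('clicks', 'conversions'),
    ('conversion_rate', 'conversions'),
]

# TopologicalSorter expects a mapping from each node to the set of its parents.
parents = {}
for parent, child in edges:
    parents.setdefault(child, set()).add(parent)
    parents.setdefault(parent, set())  # make sure root nodes appear too

try:
    order = list(TopologicalSorter(parents).static_order())
    is_dag = True
except CycleError:
    order, is_dag = [], False

# Root nodes have no parents; these are the levers with no modeled causes.
roots = sorted(node for node, node_parents in parents.items() if not node_parents)

print(f"Is a DAG: {is_dag}")
print(f"Root nodes: {roots}")
print(f"One valid causal ordering: {order}")
```

The six root nodes found this way should match the nodes that the training log reports as having no parents.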


## 3. Train Causal Models

In [5]:
import sys
sys.path.append('..')  # Adjust the path as needed to import ht_categ

In [7]:
from ht_categ import HT, HTConfig

# Train HT model
config = HTConfig(graph=adj_matrix, model_type='XGBoost')
ht_model = HT(config)
ht_model.train(df)

print("✓ Causal models trained")
print(f"\nModel Performance (R²):")
for node, metrics in sorted(ht_model.model_metrics.items(), 
                            key=lambda x: x[1].get('r2', 0), reverse=True):
    if 'r2' in metrics:
        quality = '🟢' if metrics['r2'] > 0.7 else '🟡' if metrics['r2'] > 0.5 else '🔴'
        print(f"  {quality} {node}: {metrics['r2']:.3f}")

🎓 TRAINING MODELS WITH QUALITY ASSESSMENT

📊 Detecting variable types...
   ✓ ad_budget: CONTINUOUS
   ✓ targeting_quality: CONTINUOUS
   ✓ ad_creative_quality: CONTINUOUS
   ✓ landing_page_quality: CONTINUOUS
   ✓ audience_size: CONTINUOUS
   ✓ day_of_week: CATEGORICAL (7 classes: ['Friday', 'Monday', 'Saturday', 'Sunday', 'Thursday']...)
   ✓ impressions: CONTINUOUS
   ✓ click_through_rate: CONTINUOUS
   ✓ clicks: CONTINUOUS
   ✓ conversion_rate: CONTINUOUS
   ✓ conversions: CONTINUOUS

🔧 Training models (model_type: XGBoost)...
   ✓ ad_budget: Root node (no parents) - baseline scaling only
   ✓ targeting_quality: Root node (no parents) - baseline scaling only
   ✓ ad_creative_quality: Root node (no parents) - baseline scaling only
   ✓ landing_page_quality: Root node (no parents) - baseline scaling only
   ✓ audience_size: Root node (no parents) - baseline scaling only
   ✓ day_of_week: Root node (no parents) - baseline scaling only
   ✓ 

## 4. Scenario 1: Increase Conversions by 25%

In [None]:
from intervention_search import InterventionSearch

# Initialize searcher with more simulations for more stable CI estimates
searcher = InterventionSearch(
    graph=ht_model.graph,
    ht_model=ht_model,
    n_simulations=5000  # Increased from 100 to reduce Monte Carlo noise in the confidence intervals
)

# Find interventions for +25% conversions
results_25 = searcher.find_interventions(
    target_outcome='conversions',
    target_change=25.0,
    tolerance=4.0,
    confidence_level=0.90,
    max_intervention_pct=30.0,
    verbose=True
)

In [None]:
# Display best intervention
best = results_25['best_intervention']

print("\n" + "="*70)
print("BEST INTERVENTION: +25% Conversions")
print("="*70)
print(f"\n📊 Intervene on: {', '.join(best['nodes'])}")
print(f"\n🎯 Required Changes:")
for node, change in best['required_pct_changes'].items():
    baseline = ht_model.baseline_stats[node]['mean']
    new_value = baseline * (1 + change/100)
    print(f"  • {node}: {change:+.1f}%")
    print(f"    (from {baseline:.1f} to {new_value:.1f})")

print(f"\n📈 Expected Results:")
print(f"  • Predicted effect: {best['actual_effect']:+.1f}%")
print(f"  • 90% CI: [{best['ci_90'][0]:+.1f}%, {best['ci_90'][1]:+.1f}%]")
print(f"  • Confidence: {best['confidence']:.0%}")
print(f"  • Status: {'✅ APPROVED' if best.get('within_tolerance', False) else '⚠️ NEEDS REVIEW'}")
print("="*70)

## 5. Scenario 2: Multi-Node Intervention Strategy

Explore two-node combinations, which may reach the target with smaller changes per node.

In [None]:
# Search with combinations allowed
results_combo = searcher.find_interventions(
    target_outcome='conversions',
    target_change=25.0,
    tolerance=4.0,
    confidence_level=0.90,
    allow_combinations=True,  # Enable 2-node combinations
    max_intervention_pct=20.0,  # Smaller changes per node
    verbose=True
)

# Show top combinations
print("\nTop 3 Intervention Combinations:\n")
for i, candidate in enumerate(results_combo['all_candidates'][:3], 1):
    print(f"{i}. Nodes: {', '.join(candidate['nodes'])}")
    print(f"   Effect: {candidate['actual_effect']:+.1f}% ± {candidate['uncertainty']:.1f}%")
    print(f"   Confidence: {candidate['confidence']:.0%}")
    for node, change in candidate['required_pct_changes'].items():
        print(f"     - {node}: {change:+.1f}%")
    print()


🎯 INTERVENTION SEARCH v2.0 (Production Grade)
Target: +25.0% change in conversions
Tolerance: ±4.0% points
Max intervention: ±20.0%
Monte Carlo simulations: 100

📊 Pre-flight checks...
   Candidate nodes: 10
   Overall model quality: F

🔍 Searching 10 candidates...
   Testing: clicks... ✓ +15.6% → +21.2%
   Testing: impressions... ✓ +20.0% → +21.4%
   Testing: click_through_rate... ✓ +20.0% → +22.0%
   Testing: conversion_rate... ✓ +20.0% → +21.3%

🔗 Testing 2-node combinations...
   ✓ landing_page_quality + ad_creative_quality → +27.1%
   ✓ landing_page_quality + audience_size → +30.0%
   ✓ landing_page_quality + ad_budget → +21.7%
   ✓ landing_page_quality + clicks → +30.0%
   ✓ landing_page_quality + targeting_quality → +18.9%
   ✓ landing_page_quality + impressions → +24.3%


## 6. Comparative Analysis: Single vs Multi-Node

Compare the best single-node intervention with the best multi-node combination.

In [None]:
# Create comparison table
comparison = []
for label, res in [('Single Node', results_25),
                   ('Multi-Node', results_combo)]:
    best = res['best_intervention']
    ci_width = best['ci_90'][1] - best['ci_90'][0]
    comparison.append({
        'Strategy': label,
        'Nodes': ', '.join(best['nodes']),
        'Effect': f"{best['actual_effect']:+.1f}%",
        'CI Width': f"{ci_width:.1f}%",
        'Confidence': f"{best['confidence']:.0%}",
        'Status': '✅' if best.get('within_tolerance', False) else '⚠️'
    })

comp_df = pd.DataFrame(comparison)
print("\nStrategy Comparison:")
print(comp_df.to_string(index=False))

## 7. Cost-Effectiveness Analysis

Estimate the cost implications of each intervention using illustrative per-lever cost weights.

In [None]:
# Estimate costs for different interventions
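def estimate_intervention_cost(intervention, baseline_stats):
    """Rough cost estimate for marketing interventions.

    Sums |change| in baseline units times an illustrative per-lever weight.
    Nodes missing from cost_map (e.g. clicks, impressions) contribute zero cost.
    """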
    cost_map = {
        'ad_budget': 1.0,  # Direct cost multiplier
        'targeting_quality': 0.3,  # Platform fees for better targeting
        'ad_creative_quality': 0.5,  # Creative production costs
        'landing_page_quality': 0.4,  # Development costs
    }
    
    total_cost = 0
    for node, pct_change in intervention['required_pct_changes'].items():
        if node in cost_map:
            baseline = baseline_stats[node]['mean']
            change_amount = baseline * abs(pct_change) / 100
            total_cost += change_amount * cost_map[node]
    
    return total_cost

# Estimate cost-effectiveness for the first 5 candidates, in the order returned by the search
print("\nCost-Effectiveness Analysis (Top 5):\n")
for i, candidate in enumerate(results_combo['all_candidates'][:5], 1):
    cost = estimate_intervention_cost(candidate, ht_model.baseline_stats)
    effect = candidate['actual_effect']
    # Effect (% change) per unit of estimated cost; the +1 guards against division by zero
    roi = effect / (cost + 1)
    
    print(f"{i}. {', '.join(candidate['nodes'])}")
    print(f"   Est. Cost: ${cost:.0f} | Effect: {effect:+.1f}% | ROI: {roi:.2f}")
    print()
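
The loop above reports the ratio for the first five candidates but does not reorder them by it. A minimal sketch of ranking by effect per unit cost follows. The candidate values are invented for illustration and do not come from the search results, and `effect_per_cost` is a hypothetical helper that mirrors the `roi` expression.

```python
# Hypothetical candidates: effects and costs are made-up illustration values.
candidates = [
    {"nodes": ["ad_budget"], "actual_effect": 22.0, "cost": 2000.0},
    {"nodes": ["landing_page_quality"], "actual_effect": 21.0, "cost": 4.0},
    {"nodes": ["landing_page_quality", "ad_creative_quality"],
     "actual_effect": 27.0, "cost": 9.0},
]

def effect_per_cost(candidate):
    """Effect (%) per unit of estimated cost; +1 avoids division by zero."""
    return candidate["actual_effect"] / (candidate["cost"] + 1.0)

# Highest ratio first.
ranked = sorted(candidates, key=effect_per_cost, reverse=True)

for rank, candidate in enumerate(ranked, 1):
    label = ", ".join(candidate["nodes"])
    print(f"{rank}. {label}: effect {candidate['actual_effect']:+.1f}%, "
          f"cost {candidate['cost']:.0f}, ratio {effect_per_cost(candidate):.2f}")
```

Because the cost weights mix dollars with quality-score units, the ratio is best read as a rough ordering rather than a financial return.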

## 8. Key Insights & Recommendations

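### 🎯 Insights:

1. **Primary Drivers**: The analysis reveals which marketing levers have the strongest causal impact on conversions
2. **Trade-offs**: Single-node interventions may require larger changes, while multi-node strategies spread smaller changes across levers. In the logged run with a ±20% cap, single-node candidates reached +21.2% to +22.0%, and the listed two-node combinations ranged from +18.9% to +30.0%
3. **Confidence**: Higher quality grades indicate more reliable model predictions for those paths. The logged run reports an overall model quality of F, so its estimates should be treated with caution
4. **Cost-Effectiveness**: Not all interventions with similar effects have the same cost implications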

### 💡 Recommendations:

1. **Start with High-ROI Interventions**: Prioritize changes that deliver strong effects at lower cost
2. **Monitor Model Quality**: Focus on intervention paths with R² > 0.7 for highest reliability
3. **A/B Test**: Validate predictions with controlled experiments before full rollout
4. **Iterative Approach**: Implement changes gradually and measure actual vs. predicted outcomes
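
For the A/B testing step, one common way to compare conversion rates between a control and a variant is a two-proportion z-test. The sketch below uses only the standard library. The visitor and conversion counts are hypothetical, and the test choice is an assumption rather than something the notebook prescribes.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical experiment: control (A) vs. variant with the intervention (B).
control_conversions, control_clicks = 50, 1000
variant_conversions, variant_clicks = 65, 1000

z, p_value = two_proportion_z_test(
    control_conversions, control_clicks, variant_conversions, variant_clicks
)

# Observed relative lift, comparable to the predicted % change in conversions.
control_rate = control_conversions / control_clicks
variant_rate = variant_conversions / variant_clicks
observed_lift_pct = 100.0 * (variant_rate - control_rate) / control_rate

print(f"Observed lift: {observed_lift_pct:+.1f}%")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

With counts of this size, even a lift larger than the +25% target may not be statistically distinguishable from noise, which is a reason to size the experiment before launching it.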

### ⚠️ Caveats:

- Confidence intervals account for model uncertainty but not external factors
- Cost estimates are illustrative; actual costs may vary
- Results assume the causal graph structure is correct

## Summary

This notebook demonstrated:
- ✅ Building causal models for digital marketing
- ✅ Finding optimal single and multi-node interventions
- ✅ Comparing intervention strategies
- ✅ Analyzing cost-effectiveness
- ✅ Generating actionable business recommendations

The Intervention Search system enables data-driven decision making by:
- Quantifying uncertainty through Monte Carlo simulation
- Validating model quality for each causal path
- Ranking interventions by multiple objectives (effect, confidence, simplicity)