# Tutorial: Causal Inference 05 - Capstone Policy Simulation

Audience:
- Students ready to connect causal estimates to action plans.

Prerequisites:
- Notebooks 01 to 04.

Learning goals:
- Simulate targeting policies under budget constraints.
- Select the best model for each budget.
- Translate uplift into expected incremental conversions and ROI.


## Outline

0. Setup: fit candidate models and collect uplift scores.
1. Simulate budget policies.
2. Select the best model at each budget.
3. Add a simple ROI layer.


In [None]:
from pathlib import Path
import sys

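# Locate the project root whether the notebook runs from the repo root
# or from a subfolder one level below it.
project_root = Path.cwd().resolve()
if not (project_root / "src").exists():
    project_root = project_root.parent

# Make the local `causal_showcase` package importable.
sys.path.insert(0, str(project_root / "src"))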

import pandas as pd

from causal_showcase.data import load_marketing_ab_data, train_test_split_prepared
from causal_showcase.modeling import fit_meta_learners, fit_uplift_tree
from causal_showcase.policy import select_best_model_per_budget, simulate_policy_table

data_path = project_root / "data" / "raw" / "marketing_ab.csv"
prepared = load_marketing_ab_data(data_path)
train_data, test_data = train_test_split_prepared(prepared)

learner_results = fit_meta_learners(train_data, test_data)
tree_result = fit_uplift_tree(train_data, test_data)

# Collect test-set uplift scores from every candidate model under one dict.
score_by_model = {name: res.uplift_scores for name, res in learner_results.items()}
score_by_model["Uplift Tree (KL)"] = tree_result.uplift_scores


## Step 1 - Simulate policies by budget

Each budget is a fraction of the test population. For each budget, we target only the top-k users ranked by predicted uplift, where k is that fraction of the test users.
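
To make the top-k rule concrete, here is a minimal sketch on toy data. It is not the implementation of `simulate_policy_table`; the helper names `top_k_mask` and `estimated_incremental_conversions` are hypothetical, and the estimator (treated minus control conversion rate among targeted users, scaled by the number targeted) is one common choice assumed here for illustration.

```python
import numpy as np


def top_k_mask(scores: np.ndarray, budget_fraction: float) -> np.ndarray:
    """Boolean mask selecting the top `budget_fraction` of users by score."""
    n_target = int(round(budget_fraction * len(scores)))
    order = np.argsort(-scores, kind="stable")  # highest score first
    mask = np.zeros(len(scores), dtype=bool)
    mask[order[:n_target]] = True
    return mask


def estimated_incremental_conversions(
    y: np.ndarray, treatment: np.ndarray, mask: np.ndarray
) -> float:
    """(treated rate - control rate) among targeted users, times targeted count."""
    y_treated = y[mask & (treatment == 1)]
    y_control = y[mask & (treatment == 0)]
    if len(y_treated) == 0 or len(y_control) == 0:
        return float("nan")  # no comparison possible in this slice
    return float((y_treated.mean() - y_control.mean()) * mask.sum())


# Toy data (made up for illustration): 10 users, randomized treatment.
toy_scores = np.array([0.9, 0.1, 0.8, 0.3, 0.7, 0.2, 0.6, 0.4, 0.5, 0.0])
toy_treatment = np.array([1, 0, 0, 1, 1, 0, 0, 1, 0, 1])
toy_y = np.array([1, 0, 0, 0, 1, 0, 1, 0, 0, 0])

toy_mask = top_k_mask(toy_scores, budget_fraction=0.4)
toy_gain = estimated_incremental_conversions(toy_y, toy_treatment, toy_mask)
print(toy_mask.sum(), toy_gain)
```

With a 40% budget, the four highest-scoring users are targeted, and the gain estimate uses only the treated and control users inside that targeted slice.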


In [None]:
budgets = [0.1, 0.2, 0.3, 0.4, 0.5]
policy_df = simulate_policy_table(
    y=test_data.outcome,
    treatment=test_data.treatment,
    score_by_model=score_by_model,
    budgets=budgets,
)

policy_df.sort_values(["budget_fraction", "expected_incremental_conversions"], ascending=[True, False])


## Step 2 - Best model by budget

This gives a practical targeting recommendation under each spend constraint.
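
As a rough illustration of what per-budget selection can look like, the sketch below picks the row with the highest expected incremental conversions within each budget. The selection criterion, the `model` column name, the model labels, and the numbers are assumptions made for this toy table; `select_best_model_per_budget` may work differently.

```python
import pandas as pd

# Toy policy table (values are made up for illustration).
toy_policy = pd.DataFrame(
    {
        "budget_fraction": [0.1, 0.1, 0.2, 0.2],
        "model": ["Model A", "Model B", "Model A", "Model B"],
        "expected_incremental_conversions": [12.0, 15.0, 30.0, 22.0],
    }
)

# For each budget, find the index of the row with the largest expected gain.
best_idx = toy_policy.groupby("budget_fraction")[
    "expected_incremental_conversions"
].idxmax()

toy_best = toy_policy.loc[best_idx].reset_index(drop=True)
print(toy_best)
```

In this toy table the winner differs by budget, which is why the comparison is made per budget rather than once overall.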


In [None]:
best_df = select_best_model_per_budget(policy_df)
best_df


## Step 3 - Add ROI assumptions

Example assumptions:
- Each incremental conversion is worth `$120`.
- Contacting one targeted user costs `$1.50`.
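
Under these two assumptions, targeting pays off only when the average uplift per targeted user exceeds the ratio of contact cost to conversion value. The sketch below works through that arithmetic; the helper names `net_value` and `roi` and the example figures (1,000 targeted users, 20 incremental conversions) are hypothetical.

```python
value_per_conversion = 120.0
contact_cost_per_user = 1.5

# Minimum average uplift per targeted user needed to cover the contact cost.
break_even_uplift = contact_cost_per_user / value_per_conversion


def net_value(incremental_conversions: float, targeted_users: int) -> float:
    """Expected value of incremental conversions minus total contact cost."""
    return (
        incremental_conversions * value_per_conversion
        - targeted_users * contact_cost_per_user
    )


def roi(incremental_conversions: float, targeted_users: int) -> float:
    """Net value divided by total contact cost."""
    cost = targeted_users * contact_cost_per_user
    return net_value(incremental_conversions, targeted_users) / cost


# Hypothetical scenario: 1,000 targeted users, 20 incremental conversions.
example_net = net_value(20, 1000)  # 20 * 120 - 1000 * 1.5
example_roi = roi(20, 1000)
print(break_even_uplift, example_net, example_roi)
```

Here the break-even uplift is 1.25 percentage points per targeted user, and the hypothetical scenario (2% uplift) clears it with a net value of 900 and an ROI of 0.6.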


In [None]:
value_per_conversion = 120.0
contact_cost_per_user = 1.5

roi_df = policy_df.copy()
roi_df["expected_value"] = roi_df["expected_incremental_conversions"] * value_per_conversion
roi_df["targeting_cost"] = roi_df["targeted_users"] * contact_cost_per_user
roi_df["expected_net_value"] = roi_df["expected_value"] - roi_df["targeting_cost"]

roi_df.sort_values(["budget_fraction", "expected_net_value"], ascending=[True, False]).head(15)


## Exercise and extension

- Exercise: Change the value and cost assumptions and check whether the best model at each budget changes.
- Pitfall: Treating uplift as deterministic truth. It is an estimate and carries uncertainty.
- Extension: Add confidence bounds to the estimates and select policies that stay best under them.
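
One possible starting point for the extension is a percentile bootstrap on the uplift estimate within a targeted group. The sketch below uses synthetic data and a hypothetical helper `bootstrap_uplift_ci`; it is one simple way to get an interval, not a method prescribed by this tutorial or its package.

```python
import numpy as np


def bootstrap_uplift_ci(
    y: np.ndarray,
    treatment: np.ndarray,
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> tuple[float, float]:
    """Percentile bootstrap interval for (treated rate - control rate)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample users with replacement
        y_b, t_b = y[idx], treatment[idx]
        n_treated = int(t_b.sum())
        if n_treated == 0 or n_treated == n:
            continue  # skip resamples that lack one of the two arms
        estimates.append(y_b[t_b == 1].mean() - y_b[t_b == 0].mean())
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


# Synthetic targeted group: 5% baseline conversion, +3 points when treated.
rng = np.random.default_rng(42)
sim_treatment = rng.integers(0, 2, size=2000)
sim_y = rng.binomial(1, 0.05 + 0.03 * sim_treatment)

ci_lower, ci_upper = bootstrap_uplift_ci(sim_y, sim_treatment)
print(ci_lower, ci_upper)
```

A conservative selection rule could then rank models by the lower bound of the interval instead of the point estimate, so that a model wins only when its advantage survives the resampling noise.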


In [None]:
# Answer scaffold: modify these two numbers and recompute the ranking.
custom_value_per_conversion = 80.0
custom_contact_cost = 2.5

custom_roi = policy_df.copy()
custom_roi["expected_net_value"] = (
    custom_roi["expected_incremental_conversions"] * custom_value_per_conversion
    - custom_roi["targeted_users"] * custom_contact_cost
)

custom_roi.sort_values(["budget_fraction", "expected_net_value"], ascending=[True, False]).head(10)
