# Tutorial: Causal Inference 04 - Qini Curves and Targeting Policy

Audience:
- Students who can train uplift models and now need policy evaluation.

Prerequisites:
- Notebooks 01 to 03.

Learning goals:
- Compute Qini-style curves and their area under the curve (AUC).
- Compare policy quality across learners.
- Understand budget-dependent targeting decisions.


## Outline

1. Setup: load the data, then fit all meta-learners and the uplift tree.
2. Step 1: build Qini curves and a Qini AUC table.
3. Step 2: plot the Qini curves.
4. Step 3: compare `uplift_at_k` across budget fractions.
5. Exercise, pitfall, and extension.


In [None]:
from pathlib import Path
import sys

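# Locate the project root: use the current directory if it contains `src`,
# otherwise assume the notebook runs from a folder one level below the root.
project_root = Path.cwd().resolve()
if not (project_root / "src").exists():
    project_root = project_root.parent

# Make the `causal_showcase` package importable without installing it.
sys.path.insert(0, str(project_root / "src"))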

import matplotlib.pyplot as plt
import pandas as pd

from causal_showcase.data import load_marketing_ab_data, train_test_split_prepared
from causal_showcase.evaluation import qini_auc, qini_curve, uplift_at_k
from causal_showcase.modeling import fit_meta_learners, fit_uplift_tree

data_path = project_root / "data" / "raw" / "marketing_ab.csv"
prepared = load_marketing_ab_data(data_path)
train_data, test_data = train_test_split_prepared(prepared)

learner_results = fit_meta_learners(train_data, test_data)
tree_result = fit_uplift_tree(train_data, test_data)


## Step 1 - Qini AUC comparison

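Qini AUC condenses each Qini curve into a single number. A higher value generally indicates that the model ranks users better by incremental impact, which makes it a convenient first comparison across learners.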
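
To make the metric less of a black box, here is a minimal sketch of how a Qini-style curve and its area could be computed. The names `sketch_qini_curve` and `sketch_qini_auc` are hypothetical stand-ins, not the functions in `causal_showcase.evaluation`, whose exact definitions (for example, whether a random-targeting baseline is subtracted) may differ. The sketch assumes a binary treatment flag and returns the same `fraction` and `incremental_gain` columns that the plotting cell reads.

```python
import numpy as np
import pandas as pd


def sketch_qini_curve(outcome, treatment, scores):
    """Hypothetical Qini-style curve; the package's qini_curve may differ."""
    outcome = np.asarray(outcome, dtype=float)
    treatment = np.asarray(treatment, dtype=int)
    # Rank users from highest to lowest predicted uplift.
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    y, t = outcome[order], treatment[order]

    n_treated = np.cumsum(t)
    n_control = np.cumsum(1 - t)
    y_treated = np.cumsum(y * t)
    y_control = np.cumsum(y * (1 - t))

    # Rescale control outcomes to the treated group size seen so far;
    # the ratio stays 0 until at least one control user has been seen.
    ratio = np.divide(
        n_treated, n_control, out=np.zeros(len(y)), where=n_control > 0
    )
    gain = y_treated - y_control * ratio

    fraction = np.arange(1, len(y) + 1) / len(y)
    # Prepend the origin so the curve starts at (0, 0).
    return pd.DataFrame(
        {
            "fraction": np.concatenate([[0.0], fraction]),
            "incremental_gain": np.concatenate([[0.0], gain]),
        }
    )


def sketch_qini_auc(curve):
    """Plain trapezoidal area under the curve (assumption: no baseline subtracted)."""
    x = curve["fraction"].to_numpy()
    g = curve["incremental_gain"].to_numpy()
    return float(np.sum(np.diff(x) * (g[1:] + g[:-1]) / 2.0))


# Toy data, made up for illustration only.
curve = sketch_qini_curve(
    outcome=[1, 0, 1, 0],
    treatment=[1, 0, 1, 0],
    scores=[0.9, 0.8, 0.7, 0.6],
)
print(curve)
print(sketch_qini_auc(curve))
```

Because the area depends on how gain is scaled, compare AUC values only between models evaluated on the same test set with the same curve definition.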


In [None]:
curves = {}
rows = []

for name, result in learner_results.items():
    curve = qini_curve(test_data.outcome, test_data.treatment, result.uplift_scores)
    curves[name] = curve
    rows.append({"model": name, "qini_auc": qini_auc(curve)})

tree_curve = qini_curve(test_data.outcome, test_data.treatment, tree_result.uplift_scores)
curves["Uplift Tree (KL)"] = tree_curve
rows.append({"model": "Uplift Tree (KL)", "qini_auc": qini_auc(tree_curve)})

qini_df = pd.DataFrame(rows).sort_values("qini_auc", ascending=False)
qini_df


## Step 2 - Visualize Qini curves

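The shape of each curve shows where a model creates value as you expand targeting from the highest-scored users to the whole population. The dashed horizontal line marks zero incremental gain.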


In [None]:
fig, ax = plt.subplots(figsize=(9, 5))
for name, curve in curves.items():
    ax.plot(curve["fraction"], curve["incremental_gain"], label=name)
ax.axhline(0.0, color="black", linestyle="--", linewidth=1)
ax.set_xlabel("Population fraction targeted")
ax.set_ylabel("Incremental gain")
ax.set_title("Qini-Style Curves")
ax.legend(loc="best")
fig.tight_layout()
out_path = project_root / "artifacts" / "figures" / "notebook_qini_curves.png"
out_path.parent.mkdir(parents=True, exist_ok=True)
fig.savefig(out_path, dpi=160)
plt.close(fig)
print(f"Saved figure to {out_path}")
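
A common reference when reading Qini curves is random targeting, drawn as a straight line from zero to the gain reached at full targeting. The sketch below uses a hypothetical helper, `random_targeting_baseline`, and assumes the curve is sorted by `fraction` and ends at the full population; the toy numbers are made up for illustration.

```python
import pandas as pd


def random_targeting_baseline(curve):
    """Straight line from zero gain to the curve's gain at full targeting.

    Hypothetical helper: assumes `curve` is sorted by `fraction` and that
    its last row corresponds to targeting everyone.
    """
    final_gain = curve["incremental_gain"].iloc[-1]
    return curve["fraction"] * final_gain


# Toy curve, made up for illustration only.
toy_curve = pd.DataFrame(
    {"fraction": [0.0, 0.5, 1.0], "incremental_gain": [0.0, 3.0, 4.0]}
)
baseline = random_targeting_baseline(toy_curve)
print(baseline.tolist())
```

Under these assumptions, the baseline could be added to the figure with `ax.plot(curve["fraction"], random_targeting_baseline(curve), linestyle=":")`. A model adds value wherever its curve sits above that line.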


## Step 3 - Budget sensitivity (`uplift_at_k`)

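Rankings can change with the budget: a model that wins when you target the top 30% of users may not win at 10% or 50%. Compare models at the budget fractions you can actually afford.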
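
As a rough illustration of what a budget-level metric measures, the sketch below takes the top fraction of users by score and compares the mean outcome of treated and control users inside that slice. `sketch_uplift_at_k` is a hypothetical stand-in; the package's `uplift_at_k` may define the quantity differently (for example, as a count rather than a rate). The data is made up.

```python
import numpy as np


def sketch_uplift_at_k(outcome, treatment, scores, top_fraction):
    """Hypothetical uplift-at-k: treated minus control mean outcome in the top slice."""
    outcome = np.asarray(outcome, dtype=float)
    treatment = np.asarray(treatment, dtype=int)
    scores = np.asarray(scores, dtype=float)

    # Number of users the budget allows us to target (at least one).
    k = max(1, int(round(top_fraction * len(scores))))
    top = np.argsort(-scores, kind="stable")[:k]
    y, t = outcome[top], treatment[top]

    # The difference is undefined if the slice lacks either group.
    if t.sum() == 0 or (1 - t).sum() == 0:
        return float("nan")
    return float(y[t == 1].mean() - y[t == 0].mean())


# Toy data, made up for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_treatment = [1, 0, 1, 0, 1, 0, 1, 0]
toy_outcome = [1, 0, 1, 0, 0, 0, 0, 1]

for fraction in (0.5, 1.0):
    value = sketch_uplift_at_k(toy_outcome, toy_treatment, toy_scores, fraction)
    print(fraction, value)
```

In this toy data the estimated uplift is much larger in the top half than in the full population, which is the pattern a good ranking should produce.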


In [None]:
budgets = [0.1, 0.2, 0.3, 0.4, 0.5]
budget_rows = []

all_scores = {name: r.uplift_scores for name, r in learner_results.items()}
all_scores["Uplift Tree (KL)"] = tree_result.uplift_scores

for budget in budgets:
    for model_name, scores in all_scores.items():
        budget_rows.append(
            {
                "budget": budget,
                "model": model_name,
                "uplift_at_k": uplift_at_k(
                    test_data.outcome,
                    test_data.treatment,
                    scores,
                    top_fraction=budget,
                ),
            }
        )

pd.DataFrame(budget_rows).sort_values(["budget", "uplift_at_k"], ascending=[True, False])
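
The long table above can be hard to scan. One option, sketched here on made-up rows with the same three columns, is to pivot it so each budget is a row and each model a column. The model names `A` and `B` and all values are placeholders.

```python
import pandas as pd

# Toy rows in the same shape as `budget_rows`; values are made up.
toy_rows = [
    {"budget": 0.1, "model": "A", "uplift_at_k": 0.08},
    {"budget": 0.1, "model": "B", "uplift_at_k": 0.05},
    {"budget": 0.5, "model": "A", "uplift_at_k": 0.02},
    {"budget": 0.5, "model": "B", "uplift_at_k": 0.03},
]

# One row per budget, one column per model.
wide = pd.DataFrame(toy_rows).pivot(
    index="budget", columns="model", values="uplift_at_k"
)
# Column label of the largest value in each row, i.e. the winning model.
winners = wide.idxmax(axis=1)
print(wide)
print(winners)
```

A wide layout like this makes it easy to see where the winner changes from one budget to the next.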


## Exercise, pitfall, and extension

- Exercise: Find the best model for each budget and summarize the winners in one table.
- Pitfall: Choosing a model from a single summary metric, such as Qini AUC, without checking how it performs at your actual budget.
- Extension: Add business costs and compute net value, not just uplift.
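
For the extension, one possible starting point is sketched below. It assumes `uplift_at_k` can be read as a per-user difference in conversion rate, and it introduces hypothetical parameters `n_users`, `value_per_conversion`, and `cost_per_contact` that are not part of the notebook. The toy table and prices are made up.

```python
import pandas as pd


def add_net_value(table, n_users, value_per_conversion, cost_per_contact):
    """Hypothetical helper: turn uplift per budget into a net monetary value.

    Assumes `uplift_at_k` is a per-user difference in conversion rate.
    """
    out = table.copy()
    out["n_targeted"] = (out["budget"] * n_users).round().astype(int)
    out["incremental_conversions"] = out["uplift_at_k"] * out["n_targeted"]
    # Revenue from incremental conversions minus the cost of contacting users.
    out["net_value"] = (
        out["incremental_conversions"] * value_per_conversion
        - out["n_targeted"] * cost_per_contact
    )
    return out


# Toy table and prices, made up for illustration only.
toy = pd.DataFrame(
    {
        "budget": [0.1, 0.1, 0.5, 0.5],
        "model": ["A", "B", "A", "B"],
        "uplift_at_k": [0.08, 0.05, 0.02, 0.03],
    }
)
priced = add_net_value(
    toy, n_users=10_000, value_per_conversion=40.0, cost_per_contact=1.0
)
print(priced[["budget", "model", "n_targeted", "net_value"]])
```

In this toy setup, model `A` at a 50% budget has positive uplift but negative net value, because contact costs outweigh the incremental revenue. That is the kind of outcome an uplift-only comparison would hide.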


In [None]:
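def best_model_by_budget(table: pd.DataFrame) -> pd.DataFrame:
    """Return the top row per budget, ranked by `uplift_at_k`."""
    # Sort so the highest uplift comes first within each budget.
    ranked = table.sort_values(["budget", "uplift_at_k"], ascending=[True, False])
    # Keep the first row of each budget group; ties keep whichever row sorts first.
    return ranked.groupby("budget", as_index=False).head(1).reset_index(drop=True)

best_model_by_budget(pd.DataFrame(budget_rows))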
