# EM to disaggregate incremental MMM contributions to MTA level (campaigns/creatives)

This notebook shows a **top-down** approach:

- An MMM produces incremental contributions by **channel/medium** at the **time** level (and optionally **geo**):  
  \( C_{m,t,g}^{inc} \)

- We want to split \( C_{m,t,g}^{inc} \) across the lower-level units (campaigns/creatives) \(k\) within medium \(m\):  
  \( z_{m,k,t,g} \), subject to the constraint:  
  \( \sum_k z_{m,k,t,g} = C_{m,t,g}^{inc} \)

- As “exposure” we use a delivery signal (impressions/GRPs) **transformed as in the MMM**:
  - **Adstock** (carryover)
  - **Saturation** (diminishing returns, e.g. Hill)

The split (E-step) is proportional to:
\[
\lambda_{m,k}\;\cdot\; s(\text{Adstock}(x_{m,k,t,g}))
\]
where \(x\) is impressions/GRPs, \(s(\cdot)\) is the saturation function, and \(\lambda_{m,k}\) is the relative productivity of campaign \(k\) within medium \(m\).

> Causal note: if the MMM is incremental/causal at the channel level, so is the total \(C^{inc}\).  
> The within-channel split is **model-based attribution**; it is not guaranteed to be causal unless there is additional within-channel identification.
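To make the proportional split concrete, here is a minimal sketch for a single (medium, week, geo) cell. All numbers are hypothetical and chosen only for illustration; `lam` and `s_eff` stand in for \(\lambda_{m,k}\) and \(s(\text{Adstock}(x_{m,k,t,g}))\).

```python
import numpy as np

# Hypothetical values for one (medium, week, geo) cell with three campaigns.
C_inc = 100.0                        # incremental contribution reported by the MMM
lam = np.array([1.5, 1.0, 0.5])      # assumed relative productivity per campaign
s_eff = np.array([0.2, 0.6, 0.4])    # assumed saturated adstock per campaign

weights = lam * s_eff                # unnormalized split weights
share = weights / weights.sum()      # shares sum to 1 within the cell
z = C_inc * share                    # contribution allocated to each campaign

print(share)
print(z, z.sum())                    # the allocations add back up to C_inc
```

The second campaign receives the largest share here because its weight \(\lambda \cdot s\) is the largest, even though its \(\lambda\) is not.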


## 1) Utilities: adstock and saturation (Hill)

In [None]:
import numpy as np
import pandas as pd

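def adstock_geometric(x, theta):
    """Geometric adstock: a[t] = x[t] + theta * a[t-1], starting from zero."""
    x = np.asarray(x, dtype=float)
    a = np.zeros_like(x)
    for t in range(len(x)):
        a[t] = x[t] + (theta * a[t-1] if t > 0 else 0.0)
    return a

def hill_saturation(x, alpha, gamma):
    """Hill curve x^alpha / (x^alpha + gamma^alpha); equals 0.5 at x = gamma."""
    x = np.asarray(x, dtype=float)
    x_pos = np.clip(x, 0.0, None)  # negative inputs are treated as zero
    # The 1e-12 offsets avoid 0/0 when both x and gamma are zero
    xa = np.power(x_pos + 1e-12, alpha)
    ga = np.power(gamma + 1e-12, alpha)
    return xa / (xa + ga)

def effective_exposure(x, theta, alpha, gamma):
    """Return (adstocked series, saturated adstock) for one delivery series."""
    a = adstock_geometric(x, theta=theta)
    s = hill_saturation(a, alpha=alpha, gamma=gamma)
    return a, s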

x = np.array([0, 10, 0, 0, 5, 0], dtype=float)
a, s = effective_exposure(x, theta=0.6, alpha=1.2, gamma=8.0)
pd.DataFrame({"x": x, "adstock": a, "sat": s})
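The recursion in `adstock_geometric` can also be written as a truncated convolution, \(a_t = \sum_{j \le t} \theta^{\,t-j} x_j\). The sketch below is an illustrative cross-check rather than part of the original notebook; `adstock_geometric_conv` and `adstock_loop` are hypothetical helper names.

```python
import numpy as np

def adstock_geometric_conv(x, theta):
    """Geometric adstock as a truncated convolution with weights theta**lag."""
    x = np.asarray(x, dtype=float)
    kernel = theta ** np.arange(len(x))        # weights for lags 0..T-1
    return np.convolve(x, kernel)[: len(x)]    # keep only the first T terms

def adstock_loop(x, theta):
    """Reference recursion a[t] = x[t] + theta * a[t-1]."""
    a = np.zeros(len(x))
    for t in range(len(x)):
        a[t] = x[t] + (theta * a[t - 1] if t > 0 else 0.0)
    return a

x_demo = np.array([0, 10, 0, 0, 5, 0], dtype=float)
a_conv = adstock_geometric_conv(x_demo, theta=0.6)
a_loop = adstock_loop(x_demo, theta=0.6)
print(a_conv)
print(np.allclose(a_conv, a_loop))
```

The convolution form avoids the Python loop, at the cost of building a kernel as long as the series; for 52 weekly points either version is cheap.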


## 2) Mock data: channel (medium) -> campaigns, over time x geo

In [None]:
import numpy as np
import pandas as pd

rng = np.random.default_rng(123)

M = 2
K_per_m = 6
T = 52
G = 5

media_names = ["Search", "Social"]
geos = [f"G{i+1}" for i in range(G)]
weeks = np.arange(T)

theta_m = {"Search": 0.40, "Social": 0.65}
alpha_m = {"Search": 1.20, "Social": 1.40}
gamma_m = {"Search": 0.35, "Social": 0.25}

campaign_rows = []
for m in media_names:
    for k in range(K_per_m):
        campaign_rows.append({
            "medium": m,
            "campaign": f"{m}_C{k+1}",
            "creative_family": rng.choice(["A", "B", "C"]),
            "objective": rng.choice(["Awareness", "Consideration", "Conversion"], p=[0.3,0.4,0.3]),
        })
campaign_df = pd.DataFrame(campaign_rows)

geo_effect = rng.lognormal(mean=0.0, sigma=0.25, size=G)
time_season = 1.0 + 0.25*np.sin(2*np.pi*weeks/T) + 0.10*np.sin(4*np.pi*weeks/T)

x_rows = []
for _, row in campaign_df.iterrows():
    m = row["medium"]
    k = row["campaign"]
    base = rng.lognormal(mean=0.0, sigma=0.5)
    for g_idx, g in enumerate(geos):
        level = base * geo_effect[g_idx] * rng.lognormal(mean=0.0, sigma=0.15)
        x_t = level * time_season * rng.lognormal(mean=0.0, sigma=0.25, size=T)
        mask = rng.random(T) < 0.08
        x_t[mask] = 0.0
        for t in range(T):
            x_rows.append({"medium": m, "campaign": k, "geo": g, "t": int(t), "x_raw": float(x_t[t])})

x_df = pd.DataFrame(x_rows)

# Scale per medium to stabilize Hill gamma interpretation
x_df["x_scaled"] = x_df["x_raw"]
for m in media_names:
    med = np.median(x_df.loc[(x_df.medium==m) & (x_df.x_raw>0), "x_raw"])
    x_df.loc[x_df.medium==m, "x_scaled"] = x_df.loc[x_df.medium==m, "x_raw"] / (med + 1e-12)

x_df.head()


## 3) Ground truth: within-medium productivity and generation of the MMM incremental contributions

In [None]:
import numpy as np
import pandas as pd

lambda_true = {}
for m in media_names:
    lam = rng.lognormal(mean=0.0, sigma=0.6, size=K_per_m)
    lam = lam / lam.mean()
    camps = campaign_df.loc[campaign_df.medium==m, "campaign"].tolist()
    for i, k in enumerate(camps):
        lambda_true[(m, k)] = float(lam[i])

campaign_df["lambda_true"] = campaign_df.apply(lambda r: lambda_true[(r["medium"], r["campaign"])], axis=1)

# Effective exposure per (medium,campaign,geo) along time
s_rows = []
for (m,k,g), grp in x_df.groupby(["medium","campaign","geo"], sort=False):
    grp = grp.sort_values("t")
    x = grp["x_scaled"].to_numpy()
    a, s = effective_exposure(x, theta=theta_m[m], alpha=alpha_m[m], gamma=gamma_m[m])
    out = grp[["medium","campaign","geo","t"]].copy()
    out["adstock"] = a
    out["s_eff"] = s
    s_rows.append(out)
s_df = pd.concat(s_rows, ignore_index=True)

beta_m = {"Search": 120.0, "Social": 80.0}

sig = s_df.merge(campaign_df[["medium","campaign","lambda_true"]], on=["medium","campaign"], how="left")
sig["mu"] = sig["lambda_true"] * sig["s_eff"]

C_rows = []
for (m,t,g), grp in sig.groupby(["medium","t","geo"], sort=False):
    total_mu = grp["mu"].sum()
    mean = beta_m[m] * total_mu
    shape = 20.0
    scale = (mean / shape) if mean > 0 else 0.0
    C = float(rng.gamma(shape=shape, scale=scale)) if mean > 0 else 0.0
    C_rows.append({"medium": m, "t": int(t), "geo": g, "C_inc": C})
C_df = pd.DataFrame(C_rows)

sig2 = sig.merge(C_df, on=["medium","t","geo"], how="left")
sig2["share_true"] = sig2["mu"] / (sig2.groupby(["medium","t","geo"])["mu"].transform("sum") + 1e-12)
sig2["z_true"] = sig2["C_inc"] * sig2["share_true"]

sig2.head(), C_df.head(), campaign_df.head()
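The cell above draws each channel total from a Gamma distribution with `shape=20` and `scale=mean/shape`. Under that parametrization the expected value is `mean` and the coefficient of variation is \(1/\sqrt{\text{shape}}\), roughly 22% noise around the deterministic MMM signal. A quick simulation with hypothetical values illustrates this:

```python
import numpy as np

rng_demo = np.random.default_rng(0)
mean, shape = 50.0, 20.0             # hypothetical target mean; shape as in the cell above
draws = rng_demo.gamma(shape=shape, scale=mean / shape, size=200_000)

cv = draws.std() / draws.mean()
print(draws.mean())                  # close to the target mean
print(cv, 1.0 / np.sqrt(shape))      # empirical vs. theoretical coefficient of variation
```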


## 4) EM to disaggregate C_inc to campaigns (no covariates): closed-form equations
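The code in this section does not write out its update equations, so the following is a sketch of the closed forms it appears to implement. The Poisson working model is an assumption made here for the derivation (the mock data uses Gamma noise on the channel total); under it, the E-step is a proportional split and the M-step is a ratio of sums. \(K\) denotes the number of campaigns in the medium.

```latex
% Working model (assumed): z_{m,k,t,g} ~ Poisson(\lambda_{m,k} s_{m,k,t,g}),
% with only the totals C^{inc}_{m,t,g} = \sum_k z_{m,k,t,g} observed.
\begin{aligned}
\text{E-step:}\quad
  & \hat z_{m,k,t,g} = C_{m,t,g}^{inc}\,
    \frac{\lambda_{m,k}\, s_{m,k,t,g}}{\sum_{k'} \lambda_{m,k'}\, s_{m,k',t,g}} \\[4pt]
\text{M-step:}\quad
  & \lambda_{m,k} = \frac{\sum_{t,g} \hat z_{m,k,t,g}}{\sum_{t,g} s_{m,k,t,g}} \\[4pt]
\text{Normalization:}\quad
  & \lambda_{m,k} \leftarrow \frac{\lambda_{m,k}}{\tfrac{1}{K}\sum_{k'} \lambda_{m,k'}}
\end{aligned}
```

The normalization does not change the shares, because the E-step depends on \(\lambda\) only through ratios within each \((t,g)\) cell.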

In [None]:
import numpy as np
import pandas as pd

def em_mmm_to_mta_single_medium(sig_m, C_m, max_iter=200, tol=1e-8, verbose=True):
    df = sig_m.merge(C_m, on=["t","geo"], how="left")
    df["C_inc"] = df["C_inc"].fillna(0.0)

    campaigns = df["campaign"].unique().tolist()
    K = len(campaigns)
    camp_to_idx = {c:i for i,c in enumerate(campaigns)}

    k_idx = df["campaign"].map(camp_to_idx).to_numpy().astype(int)
    s = df["s_eff"].to_numpy().astype(float)

    # Integer id for each (t, geo) cell, numbered in order of first appearance
    tg = df.groupby(["t", "geo"], sort=False).ngroup().to_numpy().astype(int)
    C = df["C_inc"].to_numpy().astype(float)

    n_groups = int(tg.max()) + 1

    lambda_hat = np.ones(K, dtype=float)
    history = {"rel_change": []}

    s_sum_k = np.bincount(k_idx, weights=s, minlength=K) + 1e-12

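    for it in range(max_iter):
        # E-step: split each (t, geo) total in proportion to lambda_k * s
        num = lambda_hat[k_idx] * s
        den_g = np.bincount(tg, weights=num, minlength=n_groups) + 1e-12
        share = num / den_g[tg]
        zhat = C * share

        # M-step: allocated contribution per unit of effective exposure
        z_sum_k = np.bincount(k_idx, weights=zhat, minlength=K)
        lambda_new = z_sum_k / s_sum_k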

        # normalize scale within medium (optional)
        lambda_new = lambda_new / (lambda_new.mean() + 1e-12)

        rel = float(np.linalg.norm(lambda_new - lambda_hat) / (np.linalg.norm(lambda_hat) + 1e-12))
        history["rel_change"].append(rel)

        if verbose and (it < 5 or it % 20 == 0):
            print(f"iter {it:3d} | rel_change={rel: .3e}")

        lambda_hat = lambda_new
        if rel < tol:
            break

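    # Final E-step so that zhat is consistent with the returned lambda_hat
    num = lambda_hat[k_idx] * s
    den_g = np.bincount(tg, weights=num, minlength=n_groups) + 1e-12
    zhat = C * num / den_g[tg]

    df_out = df[["campaign","t","geo","s_eff","C_inc"]].copy()
    df_out["zhat"] = zhat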
    return pd.Series(lambda_hat, index=campaigns, name="lambda_hat"), df_out, history

results = []
lambda_hats = []
histories = {}

for m in media_names:
    sig_m = sig2.loc[sig2.medium==m, ["campaign","t","geo","s_eff"]].copy()
    C_m = C_df.loc[C_df.medium==m, ["t","geo","C_inc"]].copy()

    lam_hat, df_hat, hist = em_mmm_to_mta_single_medium(sig_m, C_m, verbose=False)
    df_hat["medium"] = m
    results.append(df_hat)

    lam_hat_df = lam_hat.reset_index().rename(columns={"index":"campaign"})
    lam_hat_df["medium"] = m
    lambda_hats.append(lam_hat_df)

    histories[m] = hist

zhat_df = pd.concat(results, ignore_index=True)
lambda_hat_df = pd.concat(lambda_hats, ignore_index=True)

lambda_hat_df.head(), zhat_df.head()


## 5) Checks and evaluation (ground truth available in the mock data)

In [None]:
import numpy as np
import pandas as pd

chk = (zhat_df.groupby(["medium","t","geo"])["zhat"].sum()
       .reset_index()
       .merge(C_df, on=["medium","t","geo"], how="left"))
chk["gap"] = chk["zhat"] - chk["C_inc"]
print("max |gap| =", float(chk["gap"].abs().max()))

lam_true_df = campaign_df[["medium","campaign","lambda_true"]].copy()
lam_cmp = lam_true_df.merge(lambda_hat_df, on=["medium","campaign"], how="left")

corr_lam = lam_cmp.groupby("medium").apply(lambda d: np.corrcoef(d["lambda_true"], d["lambda_hat"])[0,1])
print("corr(lambda_true, lambda_hat) by medium:")
display(corr_lam)

lam_cmp.head()


In [None]:
import numpy as np
import pandas as pd

z_cmp = (sig2[["medium","campaign","t","geo","z_true"]]
         .merge(zhat_df[["medium","campaign","t","geo","zhat"]], on=["medium","campaign","t","geo"], how="left"))

for m in media_names:
    d = z_cmp.loc[z_cmp.medium==m]
    corr = np.corrcoef(d["z_true"], d["zhat"])[0,1]
    rmse = float(np.sqrt(np.mean((d["z_true"] - d["zhat"])**2)))
    print(m, "| corr(z_true, zhat) =", corr, "| rmse =", rmse)


## 6) Where adstock + saturation come in (summary)


In this approach, adstock + saturation are used to build the effective exposure:

\[
w_{m,k,t,g} := s_{m,k,t,g} = \text{Hill}(\text{Adstock}(x_{m,k,t,g}))
\]

and the E-step then splits:

\[
\hat z_{m,k,t,g} = C_{m,t,g}^{inc} \cdot
\frac{\lambda_{m,k} w_{m,k,t,g}}
{\sum_{k'} \lambda_{m,k'} w_{m,k',t,g}}
\]

This respects the MMM structure and conserves mass: \(\sum_k \hat z_{m,k,t,g} = C_{m,t,g}^{inc}\).
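Two properties of this E-step are worth checking numerically: the allocations add back up to each cell total, and rescaling all \(\lambda\) by a common factor leaves the allocations unchanged (which is why normalizing \(\lambda\) within a medium is harmless). The sketch below uses a hypothetical helper `e_step` and made-up arrays; it is an illustration, not the notebook's EM function.

```python
import numpy as np

def e_step(C, lam, w):
    """Split each cell total C[j] across campaigns in proportion to lam[k] * w[j, k]."""
    num = lam[None, :] * w                        # shape (n_cells, K)
    return C[:, None] * num / num.sum(axis=1, keepdims=True)

rng_demo = np.random.default_rng(1)
w = rng_demo.uniform(0.1, 1.0, size=(4, 3))       # hypothetical effective exposures
C = np.array([10.0, 0.0, 25.0, 40.0])             # hypothetical MMM contributions
lam = np.array([0.5, 1.0, 1.5])

z1 = e_step(C, lam, w)
z2 = e_step(C, 7.0 * lam, w)                      # same productivities, different scale

print(np.allclose(z1.sum(axis=1), C))             # mass is conserved per cell
print(np.allclose(z1, z2))                        # the scale of lambda drops out
```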


## 7) (Optional) Extension to campaign-level features (lambda = exp(X beta))


If you want to generalize / regularize:

\[
\lambda_{m,k} = \exp(X_{m,k} \beta_m)
\]

The M-step then becomes a GLM (Poisson/Gamma) with an offset (as in HCP–Brick).
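As a rough sketch of what such an M-step could look like, the code below fits a Poisson GLM with log link by Newton iterations in plain NumPy. Everything here is an assumption for illustration: the Poisson family, the helper name `poisson_mstep`, the two binary features, and the synthetic data. The response `y` plays the role of \(\sum_{t,g} \hat z_{m,k,t,g}\) and the offset is \(\log \sum_{t,g} s_{m,k,t,g}\); aggregating over \((t,g)\) is valid in the Poisson case because \(\lambda_{m,k}\) does not vary within a campaign.

```python
import numpy as np

def poisson_mstep(X, y, offset, n_iter=50, ridge=1e-8):
    """Newton steps for a Poisson GLM with log link: E[y] = exp(X @ beta + offset)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta + offset)
        grad = X.T @ (y - mu)                               # score vector
        hess = X.T @ (X * mu[:, None]) + ridge * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

# Synthetic, noise-free example with 8 hypothetical campaigns.
X = np.column_stack([
    np.ones(8),                          # intercept
    np.tile([0.0, 1.0], 4),              # hypothetical flag: conversion objective
    np.tile([0.0, 0.0, 1.0, 1.0], 2),    # hypothetical flag: video creative
])
S = np.linspace(5.0, 20.0, 8)            # stand-in for summed effective exposure
beta_true = np.array([0.1, 0.6, -0.4])
y = S * np.exp(X @ beta_true)            # stand-in for summed E-step allocations

beta_hat = poisson_mstep(X, y, offset=np.log(S))
lam_hat = np.exp(X @ beta_hat)           # implied campaign productivities
print(beta_hat)
```

In practice a GLM routine from a statistics library would normally replace the hand-written Newton loop, and a penalty on \(\beta_m\) would supply the regularization mentioned above.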


## 8) Hourly granularity (for later)


If the MMM is at the weekly/daily level but you need hourly attribution:

1) Run EM at the (t,g) level to obtain \(\hat z_{m,k,t,g}\)
2) Downscale to hours using a delivery profile, conserving mass:

\[
\hat z_{m,k,t,g,h} = \hat z_{m,k,t,g} \cdot
\frac{u_{m,k,t,g,h}}{\sum_h u_{m,k,t,g,h}}
\]

where \(u\) can be hourly impressions or a smoothed intensity model.
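The downscaling step can be sketched as below. The helper name `downscale_hourly` and the toy inputs are hypothetical, and the flat fallback for rows with no hourly delivery is an assumption of this sketch: with adstock, a campaign can receive a positive \(\hat z\) in a period where it delivered nothing, and the formula above is undefined in that case.

```python
import numpy as np

def downscale_hourly(z, u):
    """Spread each total z[i] over hours in proportion to the delivery profile u[i, :]."""
    z = np.asarray(z, dtype=float)
    u = np.asarray(u, dtype=float)
    tot = u.sum(axis=1, keepdims=True)
    # Assumption: rows with zero hourly delivery fall back to a flat profile.
    prof = np.where(tot > 0, u / np.where(tot > 0, tot, 1.0), 1.0 / u.shape[1])
    return z[:, None] * prof

# Toy example with 4 "hours" per period to keep the output short.
z_demo = np.array([12.0, 6.0])
u_demo = np.array([[1.0, 2.0, 3.0, 0.0],
                   [0.0, 0.0, 0.0, 0.0]])    # second row: no delivery in this period

z_hourly = downscale_hourly(z_demo, u_demo)
print(z_hourly)
print(np.allclose(z_hourly.sum(axis=1), z_demo))   # mass is conserved per row
```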
