# Uber Eats A/B Testing Simulation
#### 50/50: "Free Delivery" vs "$5 Off" with CUPED and Minimum Order Value (MOV)


### Background
A food-delivery app relies heavily on promotional incentives to convert browsers into purchasers. Delivery fees are a known friction point, but dollar-off credits can feel more tangible to users. This study investigates how the two incentive types affect conversion and net revenue.

### Business Problem
“We need to determine which type of coupon—‘Free Delivery’ vs. ‘$5 Off’—drives a greater uplift in order conversion, without eroding our margins more than necessary.”


### Objective
Identify the promotional offer that maximizes net incremental revenue per user—balancing increased conversion against the subsidy cost of the coupon.

In [None]:
from dataclasses import dataclass
from math import sqrt
from typing import Dict, Tuple

import numpy as np
import pandas as pd

## 1. Configuration: Inputs & Params
#### All knobs in one place: baseline demand/AOV, delivery-fee shape, take-rate and fee margin, promo lifts/AOV multipliers, MOV (threshold + friction), CUPED on/off, and scale (n, seed)

In [2]:
np.random.seed(42)

# -----------------------------
# 1) Experiment configuration
# -----------------------------
@dataclass
class BizParams:
    # Demand & order value (baseline)
    base_conv: float = 0.10               # baseline order probability (no promo)
    aov_lognorm_mu: float = 3.3           # lognormal params -> median ~ $27, mean ~ $29
    aov_lognorm_sigma: float = 0.35

    # Delivery fee distribution
    deliv_fee_mean: float = 4.99
    deliv_fee_sd: float = 1.2
    min_fee: float = 2.49

    # Platform economics
    take_rate: float = 0.12               # commission on food subtotal
    df_margin_frac: float = 0.20          # margin on delivery fee AFTER courier cost (e.g., 20%)

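    # Promo effects (nominal uplifts; each user's lift is scaled by 0.5 + price_sens/2,
    # which averages ~0.6 under Beta(2, 8), so realized mean lifts are ~0.6x these values)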
    uplift_fd_pp: float = 0.025           # Free Delivery absolute lift in conversion (e.g., +2.5 pp)
    uplift_5off_pp: float = 0.030         # $5 Off absolute lift (e.g., +3.0 pp)
    aov_mult_fd: float = 1.00             # AOV multiplier under Free Delivery
    aov_mult_5off: float = 1.05           # AOV multiplier under $5 Off (upsell)

    # $5 Off rules
    five_off_value: float = 5.00
    five_off_threshold: float = 15.00     # minimum basket subtotal to redeem $5 off

    # Minimum Order Value (MOV)
    mov_value: float = 12.00              # platform/restaurant minimum basket to place an order
    mov_friction: float = 0.15            # conversion drag if expected subtotal < MOV (0..1)

    # Heterogeneity (price sensitivity)
    het_beta_a: float = 2.0               # ~Beta(2, 8) on [0,1]
    het_beta_b: float = 8.0

    # Simulation controls
    n_exposures: int = 100_000
    bootstraps: int = 500                 # for CI of NIR per 1k exposures

    # CUPED controls
    cuped: bool = True
    cuped_noise_sd: float = 0.30          # noise in pre-period covariate to avoid perfect correlation
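
The lognormal parameters above pin down the typical basket size. A quick check, using only the standard closed forms for a lognormal distribution (median = exp(mu), mean = exp(mu + sigma²/2)), might look like the sketch below. The variable names are illustrative and not part of the notebook.

```python
from math import exp

# Values copied from BizParams above
mu, sigma = 3.3, 0.35

median_aov = exp(mu)                  # lognormal median
mean_aov = exp(mu + sigma**2 / 2)     # lognormal mean (pulled up by the right tail)

print(f"median AOV ~ ${median_aov:.2f}, mean AOV ~ ${mean_aov:.2f}")
```

The median lands near $27 and the mean near $29, which is consistent with the average baseline AOV reported in the run output.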

## 2. Functions
#### Tiny utilities to draw AOV/fees, compute net platform revenue (commission + fee − courier − promo), apply MOV friction and flooring, do Bernoulli draws, and bootstrap CIs

In [4]:
# -----------------------------
# 2) Functions
# -----------------------------
def draw_aov(mu: float, sigma: float, n: int) -> np.ndarray:
    """Lognormal basket subtotal (food only)."""
    return np.random.lognormal(mean=mu, sigma=sigma, size=n)

def draw_delivery_fee(mean: float, sd: float, n: int, min_fee: float) -> np.ndarray:
    fees = np.maximum(np.random.normal(loc=mean, scale=sd, size=n), min_fee)
    return np.round(fees, 2)

def courier_cost_from_fee(deliv_fee: np.ndarray, df_margin_frac: float) -> np.ndarray:
    """Courier cost = fee - margin; margin = df_margin_frac * fee."""
    return deliv_fee * (1.0 - df_margin_frac)

def compute_net_rev(subtotal: np.ndarray,
                    deliv_fee: np.ndarray,
                    charged_delivery: np.ndarray,
                    promo_discount: np.ndarray,
                    take_rate: float,
                    df_margin_frac: float) -> np.ndarray:
    """
    Net revenue (platform) per order:
      revenue = take_rate * subtotal + (charged_delivery ? deliv_fee : 0)
      cost    = courier_cost (always incurred) + promo_discount (cash-like)
      net     = revenue - cost
    """
    rev_comm = take_rate * subtotal
    rev_fee = np.where(charged_delivery, deliv_fee, 0.0)
    courier_cost = courier_cost_from_fee(deliv_fee, df_margin_frac)
    promo_cost = promo_discount
    net = rev_comm + rev_fee - courier_cost - promo_cost
    return net

def bernoulli(p: np.ndarray) -> np.ndarray:
    return (np.random.rand(p.size) < p).astype(int)

def bootstrap_ci(x: np.ndarray, reps: int = 500, alpha: float = 0.05) -> Tuple[float, float]:
    n = x.size
    means = np.empty(reps)
    for b in range(reps):
        idx = np.random.randint(0, n, n)
        means[b] = x[idx].mean()
    lo = np.percentile(means, 100*alpha/2)
    hi = np.percentile(means, 100*(1-alpha/2))
    return lo, hi

def apply_mov_friction(p: np.ndarray, expected_subtotal: np.ndarray,
                       mov_value: float, mov_friction: float) -> np.ndarray:
    """Reduce conversion probability if expected subtotal is below MOV."""
    penalize = (expected_subtotal < mov_value).astype(float)
    return np.clip(p * (1 - mov_friction * penalize), 0, 1)

def floor_to_mov(subtotal: np.ndarray, mov_value: float) -> np.ndarray:
    """If an order occurs and subtotal < MOV, user tops up to meet MOV."""
    return np.maximum(subtotal, mov_value)
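
To make the per-order economics concrete, the sketch below restates the net-revenue formula for a single round-number order ($30 subtotal, $5 delivery fee) under each arm. `net_per_order` is a hypothetical scalar helper written for this illustration; it mirrors `compute_net_rev` but is not part of the notebook.

```python
def net_per_order(subtotal, deliv_fee, charged_delivery, promo_discount,
                  take_rate=0.12, df_margin_frac=0.20):
    """Scalar restatement of the platform net-revenue formula (illustrative)."""
    commission = take_rate * subtotal
    fee_revenue = deliv_fee if charged_delivery else 0.0
    # Courier cost is paid whether or not the customer is charged the fee
    courier_cost = deliv_fee * (1.0 - df_margin_frac)
    return commission + fee_revenue - courier_cost - promo_discount

baseline = net_per_order(30.0, 5.0, True, 0.0)            # no promo
free_delivery = net_per_order(30.0, 5.0, False, 0.0)      # fee waived
five_off = net_per_order(30.0 * 1.05, 5.0, True, 5.0)     # 5% larger basket, $5 discount

print(f"baseline={baseline:.2f}  free_delivery={free_delivery:.2f}  five_off={five_off:.2f}")
```

With these inputs the baseline order nets about $4.60, while both promo orders net slightly below zero: the commission alone does not cover a subsidy of roughly $5.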

## 3. CUPED covariate
##### Simulates a pre-period, no-promo net-revenue signal per user (independent of treatment). Used for variance reduction

In [5]:
def simulate_preperiod_covariate(aov_base: np.ndarray,
                                 fee_base: np.ndarray,
                                 base_conv: float,
                                 take_rate: float,
                                 df_margin_frac: float,
                                 mov_value: float,
                                 mov_friction: float,
                                 noise_sd: float) -> np.ndarray:
    """
    Build a pre-experiment covariate C correlated with outcomes, independent of treatment.
    We simulate a past-week baseline net revenue (no promo), with mild noise so it's not identical.
    """
    n = aov_base.size
    # Correlated pre-period subtotals/fees via small noise
    pre_aov = np.exp(np.log(aov_base + 1e-8) + np.random.normal(0, noise_sd, n))
    pre_fee = np.maximum(fee_base + np.random.normal(0, noise_sd * 1.2, n), 1.0)

    # MOV friction in pre-period using expected basket pre_aov
    p_pre = apply_mov_friction(np.full(n, base_conv), pre_aov, mov_value, mov_friction)
    order_pre = bernoulli(p_pre)

    # If convert, floor subtotal to MOV
    sub_pre = floor_to_mov(pre_aov, mov_value)
    # Baseline charges delivery and no promo
    charged_pre = np.ones(n, dtype=bool)
    promo_pre = np.zeros(n)
    net_pre = np.zeros(n)
    idx = (order_pre == 1)
    if idx.any():
        net_pre[idx] = compute_net_rev(
            subtotal=sub_pre[idx],
            deliv_fee=pre_fee[idx],
            charged_delivery=charged_pre[idx],
            promo_discount=promo_pre[idx],
            take_rate=take_rate,
            df_margin_frac=df_margin_frac
        )
    return net_pre  # zero for non-converters, positive for converters
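
How a pre-period covariate reduces variance can be sketched on synthetic data. The example below is a generic CUPED illustration under assumed Gaussian inputs (the coefficients 0.8 and 0.6 are arbitrary); it does not use the notebook's simulated revenue.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

c = rng.normal(0.0, 1.0, n)                # synthetic pre-period covariate
y = 0.8 * c + rng.normal(0.0, 0.6, n)      # synthetic outcome, correlated with c

# CUPED adjustment: subtract the part of y that the covariate explains
theta = np.cov(y, c, ddof=1)[0, 1] / np.var(c, ddof=1)
y_star = y - theta * (c - c.mean())

rho = np.corrcoef(y, c)[0, 1]
var_ratio = y_star.var(ddof=1) / y.var(ddof=1)

print(f"rho={rho:.3f}  var(Y*)/var(Y)={var_ratio:.3f}  1-rho^2={1 - rho**2:.3f}")
print(f"mean(Y)={y.mean():.4f}  mean(Y*)={y_star.mean():.4f}")
```

The adjusted outcome keeps the same mean, and its variance shrinks by the factor 1 − ρ². A covariate that is nearly uncorrelated with the outcome therefore buys almost nothing.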


## 4. Simulator (with MOV & CUPED)
##### Generates baseline counterfactuals, randomizes 50/50 into Free Delivery or $5 Off, applies MOV rules, computes per-order net revenue, and returns DataFrames with nir_per_exposure = net_variant − net_baseline


In [6]:
# -----------------------------
# 4) Simulator (with MOV and CUPED)
# -----------------------------
def simulate(params: BizParams) -> Dict[str, pd.DataFrame]:
    n = params.n_exposures
    # User heterogeneity (price sensitivity)
    price_sens = np.random.beta(params.het_beta_a, params.het_beta_b, size=n)

    # Baseline (no promo) per-exposure counterfactuals
    aov_base = draw_aov(params.aov_lognorm_mu, params.aov_lognorm_sigma, n)
    fee_base = draw_delivery_fee(params.deliv_fee_mean, params.deliv_fee_sd, n, params.min_fee)

    # MOV friction on baseline conversion using expected subtotal aov_base
    p0 = apply_mov_friction(np.full(n, params.base_conv), aov_base, params.mov_value, params.mov_friction)
    order0 = bernoulli(p0)

    # Baseline subtotal floors to MOV if order occurs
    sub0 = floor_to_mov(aov_base, params.mov_value)

    net0 = np.zeros(n)
    if order0.any():
        net0_conv = compute_net_rev(
            subtotal=sub0[order0 == 1],
            deliv_fee=fee_base[order0 == 1],
            charged_delivery=np.ones(order0.sum(), dtype=bool),   # baseline charges delivery
            promo_discount=np.zeros(order0.sum()),
            take_rate=params.take_rate,
            df_margin_frac=params.df_margin_frac
        )
        net0[order0 == 1] = net0_conv

    # Pre-period CUPED covariate (baseline, no promo)
    cov_pre = simulate_preperiod_covariate(
        aov_base=aov_base,
        fee_base=fee_base,
        base_conv=params.base_conv,
        take_rate=params.take_rate,
        df_margin_frac=params.df_margin_frac,
        mov_value=params.mov_value,
        mov_friction=params.mov_friction,
        noise_sd=params.cuped_noise_sd
    )

    # Randomize variants 50/50
    z = np.random.choice(["FREE_DELIVERY", "FIVE_OFF"], size=n)

    # --- Variant A: Free Delivery ---
    is_fd = (z == "FREE_DELIVERY")
    # Lift scaled by price sensitivity; apply MOV friction using expected subtotal under FD
    expected_sub_fd = aov_base * params.aov_mult_fd
    p_fd = np.clip(params.base_conv + params.uplift_fd_pp * (0.5 + price_sens/2), 0, 1)
    p_fd = apply_mov_friction(p_fd, expected_sub_fd, params.mov_value, params.mov_friction)
    order_fd = np.zeros(n, dtype=int)
    order_fd[is_fd] = bernoulli(p_fd[is_fd])

    # Subtotals under FD (and MOV floor if converted)
    sub_fd = floor_to_mov(aov_base * params.aov_mult_fd, params.mov_value)
    fee_fd = fee_base
    # Free delivery => delivery not charged to customer
    charged_fd = np.zeros(n, dtype=bool)
    promo_fd = np.zeros(n)
    net_fd = np.zeros(n)
    idx = np.where((is_fd) & (order_fd == 1))[0]
    if idx.size:
        net_fd[idx] = compute_net_rev(
            subtotal=sub_fd[idx],
            deliv_fee=fee_fd[idx],
            charged_delivery=charged_fd[idx],   # 0 => no fee revenue collected
            promo_discount=promo_fd[idx],
            take_rate=params.take_rate,
            df_margin_frac=params.df_margin_frac
        )

    # --- Variant B: $5 Off ---
    is_5 = (z == "FIVE_OFF")
    expected_sub_5 = aov_base * params.aov_mult_5off
    p_5 = np.clip(params.base_conv + params.uplift_5off_pp * (0.5 + price_sens/2), 0, 1)
    p_5 = apply_mov_friction(p_5, expected_sub_5, params.mov_value, params.mov_friction)
    order_5 = np.zeros(n, dtype=int)
    order_5[is_5] = bernoulli(p_5[is_5])

    sub_5 = floor_to_mov(aov_base * params.aov_mult_5off, params.mov_value)
    fee_5 = fee_base
    charged_5 = np.ones(n, dtype=bool)     # still charge delivery
    # $5 off applies only if subtotal meets threshold; capped at subtotal
    raw_disc = np.where(sub_5 >= params.five_off_threshold, params.five_off_value, 0.0)
    promo_5 = np.minimum(raw_disc, sub_5)
    net_5 = np.zeros(n)
    idx = np.where((is_5) & (order_5 == 1))[0]
    if idx.size:
        net_5[idx] = compute_net_rev(
            subtotal=sub_5[idx],
            deliv_fee=fee_5[idx],
            charged_delivery=charged_5[idx],
            promo_discount=promo_5[idx],
            take_rate=params.take_rate,
            df_margin_frac=params.df_margin_frac
        )

    # Pack per-exposure frame
    df = pd.DataFrame({
        "variant": z,
        "price_sens": price_sens,
        "aov_base": aov_base,
        "deliv_fee": fee_base,
        "order_baseline": order0,
        "net_baseline": net0,
        "order_variant": np.where(z == "FREE_DELIVERY", order_fd, order_5),
        "net_variant": np.where(z == "FREE_DELIVERY", net_fd, net_5),
        "cuped_cov_pre": cov_pre
    })
    # Per-exposure incremental net revenue vs baseline
    df["nir_per_exposure"] = df["net_variant"] - df["net_baseline"]

    return {
        "all": df,
        "A_FREE_DELIVERY": df[df["variant"] == "FREE_DELIVERY"].copy(),
        "B_FIVE_OFF": df[df["variant"] == "FIVE_OFF"].copy()
    }
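
The conversion rates the simulator should produce on average can be approximated in closed form. The sketch below assumes the default `BizParams` values and relies on price sensitivity and basket size being drawn independently, as they are in `simulate`; the variable names are illustrative.

```python
from math import log
from statistics import NormalDist

# Defaults copied from BizParams
base_conv, uplift_fd_pp, uplift_5off_pp = 0.10, 0.025, 0.030
het_beta_a, het_beta_b = 2.0, 8.0
mu, sigma = 3.3, 0.35
mov_value, mov_friction = 12.0, 0.15
aov_mult_fd, aov_mult_5off = 1.00, 1.05

# Average of the per-user lift scale (0.5 + price_sens/2) under Beta(a, b)
mean_sens = het_beta_a / (het_beta_a + het_beta_b)
lift_scale = 0.5 + mean_sens / 2

def mov_factor(aov_mult):
    """Average conversion multiplier from MOV friction for a given AOV multiplier."""
    share_below = NormalDist().cdf((log(mov_value / aov_mult) - mu) / sigma)
    return 1 - mov_friction * share_below

p_fd = (base_conv + uplift_fd_pp * lift_scale) * mov_factor(aov_mult_fd)
p_5off = (base_conv + uplift_5off_pp * lift_scale) * mov_factor(aov_mult_5off)

print(f"expected conversion: FD={p_fd:.4f}  $5 Off={p_5off:.4f}")
```

Both expected rates sit between 11% and 12%, and fewer than 1% of baskets fall below the MOV, so MOV friction has only a marginal effect under these defaults.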


## 5. Metrics & tests
#### Summaries: exposures, orders, conversion, NIR/1k with 95% CIs (raw). CUPED adds 𝜃 and adjusted NIR/1k (tighter CIs). Welch test compares arms; report Δ per 1k and p-value

In [7]:
# -----------------------------
# 5) Metrics, CUPED & tests
# -----------------------------
def summarize_variant(df: pd.DataFrame, boots: int) -> Dict[str, float]:
    n = len(df)
    orders = df["order_variant"].sum()
    cr = orders / n
    # report baseline AOV for reference
    aov_orders = df.loc[df["order_variant"] == 1, "aov_base"].mean()
    nir_exp = df["nir_per_exposure"].values
    nir_per_1k = nir_exp.mean() * 1000.0
    lo, hi = bootstrap_ci(nir_exp, reps=boots)
    ci_1k = (lo*1000.0, hi*1000.0)
    out = {
        "exposures": n,
        "orders": int(orders),
        "conv_rate": cr,
        "avg_aov_base": aov_orders,
        "nir_per_1k": nir_per_1k,
        "nir_per_1k_ci_lo": ci_1k[0],
        "nir_per_1k_ci_hi": ci_1k[1],
    }
    return out

def cuped_adjusted_metric(df: pd.DataFrame, boots: int) -> Dict[str, float]:
    """
    CUPED on incremental outcome:
      Y  = nir_per_exposure
      C  = cuped_cov_pre (pre-period baseline net revenue)
      Y* = Y - theta * (C - mean(C)),  where theta = cov(Y, C) / var(C)
      theta and mean(C) are estimated on the rows of df passed in, so mean(Y*) == mean(Y).
    """
    y = df["nir_per_exposure"].values
    c = df["cuped_cov_pre"].values
    c_mean = c.mean()
    var_c = np.var(c, ddof=1)
    if var_c <= 1e-12:
        theta = 0.0
    else:
        theta = np.cov(y, c, ddof=1)[0, 1] / var_c
    y_star = y - theta * (c - c_mean)

    y1k = y_star.mean() * 1000.0
    lo, hi = bootstrap_ci(y_star, reps=boots)
    return {
        "cuped_theta": theta,
        "nir_per_1k_cuped": y1k,
        "nir_per_1k_cuped_ci_lo": lo*1000.0,
        "nir_per_1k_cuped_ci_hi": hi*1000.0
    }

def diff_in_means_test(a: np.ndarray, b: np.ndarray) -> Tuple[float, float]:
    """
    Welch test for difference in means (two-sided), normal approx p-value.
    Returns (diff, p_value) where diff = mean(b) - mean(a).
    """
    from math import erf, sqrt  # local import keeps the function self-contained

    na, nb = a.size, b.size
    ma, mb = a.mean(), b.mean()
    va, vb = a.var(ddof=1), b.var(ddof=1)
    diff = mb - ma
    se = sqrt(va/na + vb/nb)                 # Welch (unequal-variance) standard error
    z = diff / se if se > 0 else 0.0
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))   # two-sided normal tail
    return diff, p
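
The two-sided normal p-value can also be written with the complementary error function, since 2·(1 − Φ(|z|)) = erfc(|z|/√2). The sketch below compares the two forms; the function names are hypothetical. The `erfc` form is offered as an optional refinement because it keeps precision for very large |z|, where the `erf` form rounds to zero.

```python
from math import erf, erfc, sqrt

def p_two_sided_erf(z: float) -> float:
    """Form used in diff_in_means_test."""
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

def p_two_sided_erfc(z: float) -> float:
    """Algebraically identical, but stable in the far tail."""
    return erfc(abs(z) / sqrt(2))

for z in (1.96, 5.0, 10.0):
    print(f"z={z:>5}: erf form={p_two_sided_erf(z):.3e}  erfc form={p_two_sided_erfc(z):.3e}")
```

For the p-values reported in this notebook the two forms agree closely, so the choice only matters for extreme test statistics.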



## 6. Run Experiment

In [None]:
# -----------------------------
# 6) Run experiment
# -----------------------------
if __name__ == "__main__":
    P = BizParams() 
    res = simulate(P)
    A = res["A_FREE_DELIVERY"]
    B = res["B_FIVE_OFF"]

    # --- RAW (unadjusted) summaries ---
    summ_A = summarize_variant(A, P.bootstraps)
    summ_B = summarize_variant(B, P.bootstraps)

    diff_raw, p_raw = diff_in_means_test(A["nir_per_exposure"].values,
                                         B["nir_per_exposure"].values)
    diff_raw_1k = diff_raw * 1000.0

    # --- CUPED (adjusted) summaries ---
    cuped_A = cuped_adjusted_metric(A, P.bootstraps)
    cuped_B = cuped_adjusted_metric(B, P.bootstraps)

    yA = A["nir_per_exposure"].values
    yB = B["nir_per_exposure"].values
    cA = A["cuped_cov_pre"].values
    cB = B["cuped_cov_pre"].values
    c_mean_pooled = np.concatenate([cA, cB]).mean()
    var_c_pooled = np.var(np.concatenate([cA, cB]), ddof=1)
    theta_pooled = 0.0 if var_c_pooled <= 1e-12 else (
        np.cov(np.concatenate([yA, yB]), np.concatenate([cA, cB]), ddof=1)[0, 1] / var_c_pooled
    )
    # CUPED-adjusted outcomes per exposure
    yA_star = yA - theta_pooled * (cA - c_mean_pooled)
    yB_star = yB - theta_pooled * (cB - c_mean_pooled)
    diff_cuped, p_cuped = diff_in_means_test(yA_star, yB_star)
    diff_cuped_1k = diff_cuped * 1000.0

    # ---------------- OUTPUT ----------------
    print("\n=== Uber Eats A/B Test Simulation (with MOV & CUPED) ===")
    print(f"Exposures: A={summ_A['exposures']:,}  B={summ_B['exposures']:,}")
    print(f"Conv Rate: A={summ_A['conv_rate']:.3%}  B={summ_B['conv_rate']:.3%}")
    print(f"Avg AOV (baseline ref): A=${summ_A['avg_aov_base']:.2f}  B=${summ_B['avg_aov_base']:.2f}")

    print("\n--- RAW Net Incremental Revenue per 1,000 exposures ---")
    print(f"A Free Delivery: ${summ_A['nir_per_1k']:.2f}  "
          f"(95% CI: ${summ_A['nir_per_1k_ci_lo']:.2f}, ${summ_A['nir_per_1k_ci_hi']:.2f})")
    print(f"B $5 Off      : ${summ_B['nir_per_1k']:.2f}  "
          f"(95% CI: ${summ_B['nir_per_1k_ci_lo']:.2f}, ${summ_B['nir_per_1k_ci_hi']:.2f})")
    print(f"Difference (B - A): ${diff_raw_1k:.2f}   Welch p-value: {p_raw:.4g}")

    if P.cuped:
        print("\n--- CUPED-ADJUSTED Net Incremental Revenue per 1,000 exposures ---")
        print(f"A Free Delivery (CUPED): ${cuped_A['nir_per_1k_cuped']:.2f}  "
              f"(95% CI: ${cuped_A['nir_per_1k_cuped_ci_lo']:.2f}, ${cuped_A['nir_per_1k_cuped_ci_hi']:.2f})")
        print(f"B $5 Off       (CUPED): ${cuped_B['nir_per_1k_cuped']:.2f}  "
              f"(95% CI: ${cuped_B['nir_per_1k_cuped_ci_lo']:.2f}, ${cuped_B['nir_per_1k_cuped_ci_hi']:.2f})")
        print(f"Pooled theta (CUPED): {theta_pooled:.4f}")
        print(f"Difference (B - A), CUPED: ${diff_cuped_1k:.2f}   Welch p-value: {p_cuped:.4g}")

    winner_raw = "B ($5 Off)" if diff_raw_1k > 0 else "A (Free Delivery)"
    winner_cuped = "B ($5 Off)" if diff_cuped_1k > 0 else "A (Free Delivery)"
    print(f"\nWinner on expected NIR/1k (RAW):   {winner_raw}")
    print(f"Winner on expected NIR/1k (CUPED): {winner_cuped}")



=== Uber Eats A/B Test Simulation (with MOV & CUPED) ===
Exposures: A=50,025  B=49,975
Conv Rate: A=11.620%  B=11.670%
Avg AOV (baseline ref): A=$28.79  B=$28.61

--- RAW Net Incremental Revenue per 1,000 exposures ---
A Free Delivery: $-508.25  (95% CI: $-521.35, $-493.77)
B $5 Off      : $-457.39  (95% CI: $-470.09, $-445.12)
Difference (B - A): $50.85   Welch p-value: 4.704e-08

--- CUPED-ADJUSTED Net Incremental Revenue per 1,000 exposures ---
A Free Delivery (CUPED): $-508.25  (95% CI: $-521.74, $-494.82)
B $5 Off       (CUPED): $-457.39  (95% CI: $-470.57, $-445.87)
Pooled theta (CUPED): -0.0002
Difference (B - A), CUPED: $50.86   Welch p-value: 4.699e-08

Winner on expected NIR/1k (RAW):   B ($5 Off)
Winner on expected NIR/1k (CUPED): B ($5 Off)


## Summary

$5 Off wins, but both promos destroy value under these assumptions.

Economics: NIR/1k is negative for both arms (A: -$508, B: -$457). The 95% CIs are entirely below zero, so either promo reduces net revenue vs. no promo.
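
A back-of-envelope calculation shows where the loss comes from. The sketch below uses rounded means (a $28.80 basket, a $5.00 fee, 10.0% baseline and 11.5% Free Delivery conversion) as stated assumptions instead of the simulated draws, and it ignores MOV effects.

```python
# Rounded inputs (assumptions for this illustration, not simulator output)
take_rate, df_margin_frac = 0.12, 0.20
mean_aov, mean_fee = 28.8, 5.0
p_base, p_fd = 0.100, 0.115

# Net revenue per order: commission plus fee margin (baseline) vs commission minus courier cost (FD)
net_base_order = take_rate * mean_aov + df_margin_frac * mean_fee
net_fd_order = take_rate * mean_aov - (1 - df_margin_frac) * mean_fee

# Per-exposure values and the incremental result vs no promo
base_per_exposure = p_base * net_base_order
fd_per_exposure = p_fd * net_fd_order
nir_fd_per_1k = 1000 * (fd_per_exposure - base_per_exposure)

print(f"baseline/1k={1000 * base_per_exposure:.0f}  FD/1k={1000 * fd_per_exposure:.0f}  "
      f"NIR/1k={nir_fd_per_1k:.0f}")
```

The approximation lands close to the simulated Free Delivery figure. Most of the loss is forgone margin on orders that would have happened without a promo: the subsidy applies to every order in the arm, while the lift adds only about 1.5 percentage points of conversion.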

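Which one is less bad? B outperforms A by about $50.9 per 1,000 exposures (p ≈ 4.7×10⁻⁸). The conversion rates are nearly identical (11.67% vs. 11.62%), so the gap is driven by per-order economics rather than order volume. Both promos cost the platform roughly $5 per order (the waived delivery fee averages about $4.99), so B's edge likely comes from its 5% AOV multiplier and from baskets under the $15 threshold, which receive no discount.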

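CUPED: θ ≈ 0 and the adjusted difference matches the raw one ($50.86 vs. $50.85), so the pre-period covariate carried little signal and did not reduce variance. The per-arm CUPED means equal the raw means exactly because `cuped_adjusted_metric` centers C on each arm's own mean, which leaves the mean of Y* unchanged by construction; only the bootstrap CIs differ slightly.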
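
One plausible reading of θ ≈ 0, offered as an illustration and not as a diagnosis of the notebook: the pre-period conversion is an independent Bernoulli draw at the same base rate for nearly every user, so the covariate and the outcome are linked only through basket size among the few users who convert in both periods. The simplified sketch below contrasts that setup with a hypothetical one in which each user has a persistent purchase propensity. The Beta(0.5, 4.5) propensity and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
value = rng.lognormal(3.3, 0.35, n)        # per-user order value, shared across periods

# Case 1: conversion drawn independently in each period at a flat 10% rate
c_indep = (rng.random(n) < 0.10) * value   # pre-period revenue
y_indep = (rng.random(n) < 0.10) * value   # in-experiment revenue

# Case 2: each user has a persistent propensity (mean 10%) that drives both periods
propensity = rng.beta(0.5, 4.5, n)
c_shared = (rng.random(n) < propensity) * value
y_shared = (rng.random(n) < propensity) * value

corr_indep = np.corrcoef(y_indep, c_indep)[0, 1]
corr_shared = np.corrcoef(y_shared, c_shared)[0, 1]

print(f"corr (independent draws) = {corr_indep:.3f}")
print(f"corr (shared propensity) = {corr_shared:.3f}")
```

With independent draws the correlation stays close to zero, so CUPED has nothing to remove. A covariate only helps when something persistent about the user, such as purchase propensity, links the two periods.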