# Week 11: Advanced Incrementality Testing Platform

## Learning Objectives
- Implement scalable geo-lift analysis
- Build automated matched market selection systems
- Apply synthetic control methods at scale
- Perform causal impact analysis with large time series
- Create incrementality monitoring dashboards
- Design production deployment patterns
- Build end-to-end incrementality measurement platforms
- Handle power analysis and sample size calculations

## Prerequisites
```bash
pip install pandas numpy scipy statsmodels
pip install scikit-learn causalimpact
pip install plotly seaborn matplotlib
pip install cvxpy pymc3 arviz
pip install psycopg2-binary sqlalchemy
pip install fastapi uvicorn redis
```

## 1. Setup and Environment

In [None]:
# Standard library
import os
import sys
import logging
import time
import warnings
import json
import pickle
from datetime import datetime, timedelta
from typing import Dict, List, Tuple, Optional, Union
from collections import defaultdict
import itertools

# Data manipulation
import numpy as np
import pandas as pd

# Statistics
from scipy import stats
from scipy.spatial.distance import euclidean, cosine
from scipy.optimize import minimize
from statsmodels.tsa.stattools import adfuller, grangercausalitytests
from statsmodels.stats.power import tt_ind_solve_power
import statsmodels.api as sm

# Machine learning
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
from sklearn.neighbors import NearestNeighbors
import cvxpy as cp

# Causal inference
try:
    from causalimpact import CausalImpact
except ImportError:
    print("CausalImpact not available - install with: pip install causalimpact")

# Database
import psycopg2
from sqlalchemy import create_engine

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Configuration
warnings.filterwarnings('ignore')
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

pd.set_option('display.max_columns', 100)
pd.set_option('display.max_rows', 100)
pd.set_option('display.float_format', '{:.4f}'.format)
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (14, 7)

print(f"Pandas: {pd.__version__}")
print(f"NumPy: {np.__version__}")
import scipy  # scipy.stats has no __version__; read it from the top-level package
print(f"SciPy: {scipy.__version__}")
print("Environment ready!")

## 2. Scalable Geo-Lift Analysis

In [None]:
class GeoLiftAnalyzer:
    """Scalable geo-lift test analysis."""
    
    def __init__(self):
        self.logger = logging.getLogger(self.__class__.__name__)
        self.results = {}
    
    def calculate_lift(self,
                      treatment_pre: np.ndarray,
                      treatment_post: np.ndarray,
                      control_pre: np.ndarray,
                      control_post: np.ndarray) -> Dict:
        """Calculate geo-lift using difference-in-differences.
        
        Args:
            treatment_pre: Treatment geo metrics before test
            treatment_post: Treatment geo metrics during test
            control_pre: Control geo metrics before test
            control_post: Control geo metrics during test
        """
        
        # Calculate means
        treatment_pre_mean = treatment_pre.mean()
        treatment_post_mean = treatment_post.mean()
        control_pre_mean = control_pre.mean()
        control_post_mean = control_post.mean()
        
        # Difference-in-differences
        treatment_diff = treatment_post_mean - treatment_pre_mean
        control_diff = control_post_mean - control_pre_mean
        did_estimate = treatment_diff - control_diff
        
        # Percentage lift
        expected_treatment_post = treatment_pre_mean * (control_post_mean / control_pre_mean)
        pct_lift = (treatment_post_mean - expected_treatment_post) / expected_treatment_post * 100
        
        # Statistical significance of the DiD estimate (z-test).
        # The standard error combines all four group-period means and assumes
        # independent observations, so it ignores serial correlation.
        samples = (treatment_pre, treatment_post, control_pre, control_post)
        se_diff = np.sqrt(sum(np.var(s, ddof=1) / len(s) for s in samples))
        t_stat = did_estimate / se_diff
        p_value = 2 * stats.norm.sf(abs(t_stat))
        
        # 95% confidence interval (normal approximation)
        ci_lower = did_estimate - 1.96 * se_diff
        ci_upper = did_estimate + 1.96 * se_diff
        
        return {
            'did_estimate': did_estimate,
            'pct_lift': pct_lift,
            'treatment_pre_mean': treatment_pre_mean,
            'treatment_post_mean': treatment_post_mean,
            'control_pre_mean': control_pre_mean,
            'control_post_mean': control_post_mean,
            'p_value': p_value,
            'is_significant': p_value < 0.05,
            'ci_lower': ci_lower,
            'ci_upper': ci_upper,
            't_statistic': t_stat
        }
    
    def parallel_trends_test(self,
                            treatment_pre: pd.DataFrame,
                            control_pre: pd.DataFrame,
                            date_col: str,
                            metric_col: str) -> Dict:
        """Test parallel trends assumption."""
        
        self.logger.info("Testing parallel trends assumption")
        
        # Calculate trends
        treatment_trend = self._calculate_trend(treatment_pre, date_col, metric_col)
        control_trend = self._calculate_trend(control_pre, date_col, metric_col)
        
        # Compare slopes
        slope_diff = abs(treatment_trend['slope'] - control_trend['slope'])
        
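        # Heuristic check: treat the trends as parallel when the slopes differ
        # by less than 10% of the control slope. This is not a formal test; a
        # regression with a group-by-time interaction term would provide one.
        is_parallel = slope_diff < 0.1 * abs(control_trend['slope'])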
        
        return {
            'treatment_slope': treatment_trend['slope'],
            'control_slope': control_trend['slope'],
            'slope_difference': slope_diff,
            'is_parallel': is_parallel,
            'treatment_r2': treatment_trend['r2'],
            'control_r2': control_trend['r2']
        }
    
    def _calculate_trend(self, df: pd.DataFrame, date_col: str, metric_col: str) -> Dict:
        """Calculate linear trend."""
        
        df = df.copy()
        df['time_idx'] = (df[date_col] - df[date_col].min()).dt.days
        
        X = df[['time_idx']].values
        y = df[metric_col].values
        
        X = sm.add_constant(X)
        model = sm.OLS(y, X).fit()
        
        return {
            'slope': model.params[1],
            'intercept': model.params[0],
            'r2': model.rsquared,
            'p_value': model.pvalues[1]
        }
    
    def power_analysis(self,
                      baseline_mean: float,
                      baseline_std: float,
                      min_detectable_effect: float,
                      alpha: float = 0.05,
                      power: float = 0.8) -> int:
        """Calculate required sample size per group for given power.
        
        Args:
            baseline_mean: Expected baseline metric value
            baseline_std: Standard deviation of metric
            min_detectable_effect: Minimum lift to detect (as pct)
            alpha: Significance level
            power: Desired statistical power
        """
        
        effect_size = (baseline_mean * min_detectable_effect / 100) / baseline_std
        
        # Calculate required sample size
        n = tt_ind_solve_power(
            effect_size=effect_size,
            alpha=alpha,
            power=power,
            alternative='two-sided'
        )
        
        return int(np.ceil(n))
    
    def estimate_runtime(self,
                        baseline_mean: float,
                        min_detectable_effect: float,
                        daily_variance: float,
                        alpha: float = 0.05,
                        power: float = 0.8) -> int:
        """Estimate required test duration in days."""
        
        # Effect size
        effect_size = (baseline_mean * min_detectable_effect / 100) / np.sqrt(daily_variance)
        
        # Critical value
        z_alpha = stats.norm.ppf(1 - alpha/2)
        z_beta = stats.norm.ppf(power)
        
        # Required days
        n_days = int(np.ceil(((z_alpha + z_beta) / effect_size) ** 2))
        
        return n_days


# Example
analyzer = GeoLiftAnalyzer()
print("Geo-lift analyzer ready")
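
To make the difference-in-differences arithmetic concrete, the sketch below simulates daily revenue for one treatment geo and one control geo with a known lift, then tries to recover it. The `did_lift` helper, the simulated numbers, and the true lift of 8 units are assumptions made for illustration. The standard error treats days as independent, which real geo data rarely satisfies, so read the p-value as optimistic.

```python
import numpy as np
from scipy import stats


def did_lift(treatment_pre, treatment_post, control_pre, control_post):
    """Difference-in-differences estimate with a normal-approximation test.

    Hypothetical helper for illustration; assumes independent observations.
    """
    did = (treatment_post.mean() - treatment_pre.mean()) - (
        control_post.mean() - control_pre.mean()
    )
    # Variance of each group-period mean, summed because the means are independent
    groups = (treatment_pre, treatment_post, control_pre, control_post)
    se = np.sqrt(sum(g.var(ddof=1) / len(g) for g in groups))
    z = did / se
    p_value = 2 * stats.norm.sf(abs(z))
    return {
        "did_estimate": did,
        "std_error": se,
        "p_value": p_value,
        "ci_95": (did - 1.96 * se, did + 1.96 * se),
    }


# Simulated data (assumption): 60 pre days, 30 test days, true lift of 8 units
rng = np.random.default_rng(42)
true_lift = 8.0
control_pre = rng.normal(100, 5, 60)
control_post = rng.normal(104, 5, 30)                # market-wide drift of +4
treatment_pre = rng.normal(120, 5, 60)
treatment_post = rng.normal(124 + true_lift, 5, 30)  # same drift plus the lift

result = did_lift(treatment_pre, treatment_post, control_pre, control_post)
print(f"DiD estimate: {result['did_estimate']:.2f} (true lift {true_lift})")
print(f"95% CI: ({result['ci_95'][0]:.2f}, {result['ci_95'][1]:.2f})")
print(f"p-value: {result['p_value']:.4f}")
```

The control geo absorbs the market-wide drift of +4, so the estimate isolates the lift even though the two geos start at different revenue levels.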

## 3. Automated Matched Market Selection

In [None]:
class MatchedMarketSelector:
    """Automated selection of matched control markets."""
    
    def __init__(self):
        self.logger = logging.getLogger(self.__class__.__name__)
    
    def find_matches(self,
                    treatment_markets: List[str],
                    candidate_controls: List[str],
                    market_data: pd.DataFrame,
                    matching_features: List[str],
                    n_matches: int = 1,
                    method: str = 'euclidean') -> pd.DataFrame:
        """Find best matching control markets.
        
        Args:
            treatment_markets: List of treatment market IDs
            candidate_controls: List of candidate control market IDs
            market_data: DataFrame with market characteristics
            matching_features: Features to use for matching
            n_matches: Number of control markets to select
            method: Distance metric ('euclidean', 'cosine', 'dtw')
        """
        
        self.logger.info(f"Finding {n_matches} matches for {len(treatment_markets)} treatment markets")
        
        # Normalize features on a copy so the caller's DataFrame is not modified
        market_data = market_data.copy()
        scaler = StandardScaler()
        market_data[matching_features] = scaler.fit_transform(market_data[matching_features])
        
        # Get treatment and control data
        treatment_data = market_data[market_data.index.isin(treatment_markets)]
        control_data = market_data[market_data.index.isin(candidate_controls)]
        
        matches = []
        
        for treatment_market in treatment_markets:
            treatment_features = treatment_data.loc[treatment_market, matching_features].values
            
            # Calculate distances to all controls
            distances = []
            for control_market in candidate_controls:
                control_features = control_data.loc[control_market, matching_features].values
                
                if method == 'euclidean':
                    dist = euclidean(treatment_features, control_features)
                elif method == 'cosine':
                    dist = cosine(treatment_features, control_features)
                elif method == 'dtw':
                    dist = self._dtw_distance(treatment_features, control_features)
                else:
                    raise ValueError(f"Unknown distance method: {method}")
                
                distances.append({
                    'treatment': treatment_market,
                    'control': control_market,
                    'distance': dist
                })
            
            # Select top matches
            distances_df = pd.DataFrame(distances).sort_values('distance')
            top_matches = distances_df.head(n_matches)
            matches.append(top_matches)
        
        return pd.concat(matches, ignore_index=True)
    
    def _dtw_distance(self, x: np.ndarray, y: np.ndarray) -> float:
        """Calculate Dynamic Time Warping distance."""
        
        n, m = len(x), len(y)
        dtw = np.zeros((n+1, m+1))
        dtw[0, :] = np.inf
        dtw[:, 0] = np.inf
        dtw[0, 0] = 0
        
        for i in range(1, n+1):
            for j in range(1, m+1):
                cost = abs(x[i-1] - y[j-1])
                dtw[i, j] = cost + min(dtw[i-1, j], dtw[i, j-1], dtw[i-1, j-1])
        
        return dtw[n, m]
    
    def validate_matches(self,
                        matches: pd.DataFrame,
                        market_data: pd.DataFrame,
                        time_series_col: str,
                        metric_col: str) -> pd.DataFrame:
        """Validate quality of matched markets."""
        
        validation_results = []
        
        for _, match in matches.iterrows():
            treatment = match['treatment']
            control = match['control']
            
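            # Get time series, ordered by time so the two series align
            # (assumes a long-format panel with a 'market_id' column, as in design_test)
            ordered = market_data.sort_values(time_series_col)
            treatment_ts = ordered[ordered['market_id'] == treatment][metric_col].values
            control_ts = ordered[ordered['market_id'] == control][metric_col].values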
            
            # Calculate correlation
            correlation = np.corrcoef(treatment_ts, control_ts)[0, 1]
            
            # Test for cointegration
            try:
                adf_stat, adf_pvalue, _, _, _, _ = adfuller(treatment_ts - control_ts)
                is_cointegrated = adf_pvalue < 0.05
            except Exception:
                is_cointegrated = False
            
            # MAPE between series
            mape = np.mean(np.abs((treatment_ts - control_ts) / treatment_ts)) * 100
            
            validation_results.append({
                'treatment': treatment,
                'control': control,
                'correlation': correlation,
                'is_cointegrated': is_cointegrated,
                'mape': mape,
                'match_quality': 'Excellent' if correlation > 0.9 else ('Good' if correlation > 0.7 else 'Fair')
            })
        
        return pd.DataFrame(validation_results)
    
    def optimize_match_selection(self,
                                 treatment_markets: List[str],
                                 candidate_controls: List[str],
                                 market_data: pd.DataFrame,
                                 metric_col: str,
                                 max_controls: int = 5) -> List[str]:
        """Optimize selection of control group using integer programming."""
        
        n_candidates = len(candidate_controls)
        
        # Decision variables (binary: include control or not)
        x = cp.Variable(n_candidates, boolean=True)
        
        # Calculate similarity scores
        similarities = []
        for control in candidate_controls:
            treatment_avg = market_data[market_data.index.isin(treatment_markets)][metric_col].mean()
            control_val = market_data.loc[control, metric_col]
            similarity = 1 / (1 + abs(treatment_avg - control_val))
            similarities.append(similarity)
        
        # Objective: maximize total similarity
        objective = cp.Maximize(cp.sum(cp.multiply(similarities, x)))
        
        # Constraints
        constraints = [
            cp.sum(x) <= max_controls,  # Max number of controls
            cp.sum(x) >= 1  # At least one control
        ]
        
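        # Solve (boolean variables require a mixed-integer solver to be installed)
        problem = cp.Problem(objective, constraints)
        problem.solve()
        
        if x.value is None:
            raise RuntimeError(f"Control selection failed with status: {problem.status}")
        
        # Get selected controls
        selected_idx = np.where(x.value > 0.5)[0]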
        selected_controls = [candidate_controls[i] for i in selected_idx]
        
        return selected_controls


selector = MatchedMarketSelector()
print("Matched market selector ready")
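
The nested loops in `find_matches` can become slow with many markets. As a rough sketch of the same idea, the example below standardizes the features and computes all treatment-to-control Euclidean distances in one NumPy broadcast. The `match_markets` helper and the five-market table are hypothetical, and only the Euclidean metric is covered.

```python
import numpy as np


def match_markets(features, market_ids, treatment_ids, n_matches=2):
    """Return the n_matches nearest control markets for each treatment market.

    Hypothetical helper: features is an (n_markets, n_features) array whose
    rows line up with market_ids.
    """
    features = np.asarray(features, dtype=float)
    # Z-score each feature so no single scale dominates the distance
    z = (features - features.mean(axis=0)) / features.std(axis=0)

    ids = list(market_ids)
    treat_idx = [ids.index(m) for m in treatment_ids]
    ctrl_idx = [i for i in range(len(ids)) if i not in treat_idx]

    # Pairwise Euclidean distances, shape (n_treatment, n_controls)
    diffs = z[treat_idx][:, None, :] - z[ctrl_idx][None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))

    matches = {}
    for row, t in enumerate(treat_idx):
        nearest = np.argsort(dist[row])[:n_matches]
        matches[ids[t]] = [(ids[ctrl_idx[j]], float(dist[row, j])) for j in nearest]
    return matches


# Made-up market table (assumption): population, median income, competition index
market_ids = ["A", "B", "C", "D", "E"]
features = [
    [1_000_000, 55_000, 0.30],   # A (treatment)
    [1_050_000, 54_000, 0.32],   # B: close to A on every feature
    [200_000, 40_000, 0.80],     # C
    [5_000_000, 90_000, 0.10],   # D
    [900_000, 60_000, 0.35],     # E: fairly close to A
]

matches = match_markets(features, market_ids, ["A"])
for market, controls in matches.items():
    print(market, controls)
```

Without the z-scoring step, population would dominate the distance because its raw scale is several orders of magnitude larger than the competition index.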

## 4. Synthetic Control at Scale

In [None]:
class SyntheticControl:
    """Synthetic control method for causal inference."""
    
    def __init__(self):
        self.logger = logging.getLogger(self.__class__.__name__)
        self.weights = None
        self.synthetic_control = None
    
    def fit(self,
           treatment_pre: np.ndarray,
           control_pre: np.ndarray,
           method: str = 'standard') -> np.ndarray:
        """Fit synthetic control using donor pool.
        
        Args:
            treatment_pre: Pre-intervention treatment unit data (T x 1)
            control_pre: Pre-intervention control units data (T x N)
            method: 'standard', 'ridge', or 'elastic_net'
        """
        
        self.logger.info(f"Fitting synthetic control with {method} method")
        
        if method == 'standard':
            self.weights = self._fit_standard(treatment_pre, control_pre)
        elif method == 'ridge':
            self.weights = self._fit_ridge(treatment_pre, control_pre)
        elif method == 'elastic_net':
            self.weights = self._fit_elastic_net(treatment_pre, control_pre)
        else:
            raise ValueError(f"Unknown method: {method}")
        
        # Create synthetic control
        self.synthetic_control = control_pre @ self.weights
        
        return self.weights
    
    def _fit_standard(self, treatment: np.ndarray, controls: np.ndarray) -> np.ndarray:
        """Standard synthetic control (quadratic programming)."""
        
        n_controls = controls.shape[1]
        
        # Decision variables
        w = cp.Variable(n_controls)
        
        # Objective: minimize squared difference
        objective = cp.Minimize(cp.sum_squares(treatment - controls @ w))
        
        # Constraints: weights sum to 1 and are non-negative
        constraints = [
            cp.sum(w) == 1,
            w >= 0
        ]
        
        # Solve
        problem = cp.Problem(objective, constraints)
        problem.solve()
        
        return w.value
    
    def _fit_ridge(self, treatment: np.ndarray, controls: np.ndarray, alpha: float = 1.0) -> np.ndarray:
        """Synthetic control with ridge regularization."""
        
        n_controls = controls.shape[1]
        w = cp.Variable(n_controls)
        
        objective = cp.Minimize(
            cp.sum_squares(treatment - controls @ w) +
            alpha * cp.sum_squares(w)
        )
        
        constraints = [cp.sum(w) == 1, w >= 0]
        
        problem = cp.Problem(objective, constraints)
        problem.solve()
        
        return w.value
    
    def _fit_elastic_net(self, treatment: np.ndarray, controls: np.ndarray,
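                        alpha: float = 1.0, l1_ratio: float = 0.5) -> np.ndarray:
        """Synthetic control with elastic net regularization.
        
        Note: with non-negative weights that sum to 1, the L1 norm always equals 1,
        so the L1 term is constant and only the L2 term shapes the solution.
        """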
        
        n_controls = controls.shape[1]
        w = cp.Variable(n_controls)
        
        objective = cp.Minimize(
            cp.sum_squares(treatment - controls @ w) +
            alpha * l1_ratio * cp.norm(w, 1) +
            alpha * (1 - l1_ratio) * cp.sum_squares(w)
        )
        
        constraints = [cp.sum(w) == 1, w >= 0]
        
        problem = cp.Problem(objective, constraints)
        problem.solve()
        
        return w.value
    
    def predict(self, control_post: np.ndarray) -> np.ndarray:
        """Predict counterfactual for post-intervention period."""
        
        if self.weights is None:
            raise ValueError("Model not fitted. Call fit() first.")
        
        return control_post @ self.weights
    
    def calculate_effect(self,
                        treatment_post: np.ndarray,
                        control_post: np.ndarray) -> Dict:
        """Calculate treatment effect."""
        
        # Counterfactual
        counterfactual = self.predict(control_post)
        
        # Effect
        effect = treatment_post - counterfactual
        
        # Statistics
        avg_effect = effect.mean()
        cumulative_effect = effect.sum()
        pct_effect = (avg_effect / counterfactual.mean()) * 100
        
        return {
            'treatment': treatment_post,
            'counterfactual': counterfactual,
            'effect': effect,
            'avg_effect': avg_effect,
            'cumulative_effect': cumulative_effect,
            'pct_effect': pct_effect
        }
    
    def placebo_test(self,
                    treatment_pre: np.ndarray,
                    control_pre: np.ndarray,
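                    n_placebos: int = 100) -> Dict:
        """Perform in-space placebo tests on the pre-intervention fit.
        
        Each donor is treated in turn as a placebo unit and fitted from the
        remaining donors; at most n_placebos donors are used. Only pre-period
        data is passed in, so the p-value compares pre-period gaps; inference
        on the treatment effect needs the same comparison on post-period gaps.
        """
        
        if self.synthetic_control is None:
            raise ValueError("Model not fitted. Call fit() first.")
        
        n_controls = control_pre.shape[1]
        n_runs = min(n_placebos, n_controls)
        self.logger.info(f"Running {n_runs} placebo tests")
        
        placebo_effects = []
        
        for i in range(n_runs):
            # Use control i as placebo treatment
            placebo_treatment = control_pre[:, i]
            placebo_controls = np.delete(control_pre, i, axis=1)
            
            # Fit synthetic control
            try:
                weights = self._fit_standard(placebo_treatment, placebo_controls)
                synthetic = placebo_controls @ weights
                effect = placebo_treatment - synthetic
                placebo_effects.append(effect.mean())
            except Exception:
                continue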
        
        placebo_effects = np.array(placebo_effects)
        
        # Calculate p-value
        actual_effect = (treatment_pre - self.synthetic_control).mean()
        p_value = np.mean(np.abs(placebo_effects) >= np.abs(actual_effect))
        
        return {
            'placebo_effects': placebo_effects,
            'actual_effect': actual_effect,
            'p_value': p_value,
            'is_significant': p_value < 0.05
        }
    
    def get_donor_weights(self) -> pd.DataFrame:
        """Get donor unit weights."""
        
        if self.weights is None:
            raise ValueError("Model not fitted")
        
        return pd.DataFrame({
            'donor_id': range(len(self.weights)),
            'weight': self.weights
        }).sort_values('weight', ascending=False)


sc = SyntheticControl()
print("Synthetic control ready")
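
Placebo inference for synthetic control is usually based on post-period gaps rather than pre-period fit. The sketch below shows one common variant: compare the treated unit's post/pre RMSPE ratio with the ratios obtained when each donor is treated as a placebo. It solves for the weights with `scipy.optimize.minimize` (SLSQP) instead of `cvxpy`, and the panel, the true weights, and the effect of 5 units are simulated assumptions.

```python
import numpy as np
from scipy.optimize import minimize


def fit_weights(treated_pre, donors_pre):
    """Least-squares donor weights constrained to w >= 0 and sum(w) == 1."""
    n = donors_pre.shape[1]
    # Centering leaves the residual unchanged when sum(w) == 1 and improves conditioning
    shift = donors_pre.mean()
    y, X = treated_pre - shift, donors_pre - shift

    def loss(w):
        resid = y - X @ w
        return resid @ resid

    def grad(w):
        return -2 * X.T @ (y - X @ w)

    res = minimize(
        loss, x0=np.full(n, 1 / n), jac=grad, method="SLSQP",
        bounds=[(0, 1)] * n,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1}],
        options={"maxiter": 500, "ftol": 1e-10},
    )
    return res.x


def rmspe(x):
    """Root mean squared prediction error of a gap series."""
    return np.sqrt(np.mean(x ** 2))


def rmspe_ratio(unit, donors, t0):
    """Post/pre RMSPE ratio of the gap between a unit and its synthetic control."""
    w = fit_weights(unit[:t0], donors[:t0])
    gap = unit - donors @ w
    return rmspe(gap[t0:]) / rmspe(gap[:t0]), gap, w


# Simulated panel (assumption): 40 pre periods, 10 post periods, 6 donors
rng = np.random.default_rng(0)
t0, t1, n_donors = 40, 10, 6
donors = 100 + rng.normal(0, 1, (t0 + t1, n_donors)).cumsum(axis=0)
true_w = np.array([0.5, 0.3, 0.2, 0.0, 0.0, 0.0])
treated = donors @ true_w + rng.normal(0, 0.2, t0 + t1)
treated[t0:] += 5.0  # true post-period effect

treated_ratio, gap, weights = rmspe_ratio(treated, donors, t0)
avg_effect = gap[t0:].mean()

# In-space placebos: pretend each donor was treated and fit it from the others
placebo_ratios = np.array([
    rmspe_ratio(donors[:, i], np.delete(donors, i, axis=1), t0)[0]
    for i in range(n_donors)
])
# Share of units (treated included) with a ratio at least as extreme
p_value = (1 + np.sum(placebo_ratios >= treated_ratio)) / (n_donors + 1)

print(f"Estimated average effect: {avg_effect:.2f} (true effect 5.0)")
print(f"Treated RMSPE ratio: {treated_ratio:.1f}, placebo p-value: {p_value:.3f}")
```

With only six donors the smallest attainable p-value is 1/7, so a donor pool of this size cannot reach the 0.05 threshold no matter how large the effect is.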

## 5. Causal Impact with Large Time Series

In [None]:
class CausalImpactAnalyzer:
    """Scalable causal impact analysis using Bayesian structural time series."""
    
    def __init__(self):
        self.logger = logging.getLogger(self.__class__.__name__)
        self.model = None
        self.results = None
    
    def analyze(self,
               data: pd.DataFrame,
               pre_period: Tuple[str, str],
               post_period: Tuple[str, str],
               response_col: str,
               covariate_cols: Optional[List[str]] = None) -> Dict:
        """Perform causal impact analysis.
        
        Args:
            data: Time series data with DatetimeIndex
            pre_period: (start, end) dates for pre-intervention
            post_period: (start, end) dates for post-intervention
            response_col: Name of response variable
            covariate_cols: Optional control variables
        """
        
        self.logger.info("Running causal impact analysis")
        
        try:
            # Prepare data
            if covariate_cols:
                data_formatted = data[[response_col] + covariate_cols]
            else:
                data_formatted = data[[response_col]]
            
            # Run CausalImpact
            ci = CausalImpact(data_formatted, pre_period, post_period)
            
            # Extract results. The row names and the p_value attribute below
            # follow the pycausalimpact API (an assumption); adjust them if the
            # installed CausalImpact package exposes different names.
            summary_data = ci.summary_data
            
            self.results = {
                'average_effect': summary_data['average']['actual'] - summary_data['average']['predicted'],
                'cumulative_effect': summary_data['cumulative']['actual'] - summary_data['cumulative']['predicted'],
                'relative_effect': summary_data['average']['rel_effect'],
                'p_value': ci.p_value,
                'lower_95': summary_data['average']['abs_effect_lower'],
                'upper_95': summary_data['average']['abs_effect_upper'],
                'is_significant': ci.p_value < 0.05,
                'model': ci
            }
            
            return self.results
            
        except Exception as e:
            self.logger.error(f"CausalImpact analysis failed: {e}")
            return self._fallback_analysis(data, pre_period, post_period, response_col)
    
    def _fallback_analysis(self,
                          data: pd.DataFrame,
                          pre_period: Tuple[str, str],
                          post_period: Tuple[str, str],
                          response_col: str) -> Dict:
        """Fallback to simple analysis if CausalImpact fails."""
        
        pre_data = data.loc[pre_period[0]:pre_period[1], response_col]
        post_data = data.loc[post_period[0]:post_period[1], response_col]
        
        # Simple difference
        avg_effect = post_data.mean() - pre_data.mean()
        cumulative_effect = post_data.sum() - (pre_data.mean() * len(post_data))
        relative_effect = avg_effect / pre_data.mean() * 100
        
        # T-test
        t_stat, p_value = stats.ttest_ind(post_data, pre_data)
        
        return {
            'average_effect': avg_effect,
            'cumulative_effect': cumulative_effect,
            'relative_effect': relative_effect,
            'p_value': p_value,
            'is_significant': p_value < 0.05,
            'method': 'fallback'
        }
    
    def plot_results(self):
        """Plot causal impact results."""
        
        if self.results and 'model' in self.results:
            self.results['model'].plot()
            plt.tight_layout()
            plt.show()


ci_analyzer = CausalImpactAnalyzer()
print("Causal impact analyzer ready")
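
The pre-period/post-period logic can be illustrated without the CausalImpact dependency. The sketch below is a deliberately simplified stand-in: it fits ordinary least squares on the pre-period and projects the counterfactual forward, rather than fitting a Bayesian structural time-series model, and it produces no credible intervals. The `regression_counterfactual` helper, the column names, and the simulated lift of 10 units per day are assumptions.

```python
import numpy as np
import pandas as pd


def regression_counterfactual(data, pre_period, post_period, response_col, covariate_cols):
    """Fit OLS on the pre-period and project the counterfactual into the post-period.

    Hypothetical helper; not the BSTS model used by CausalImpact.
    """
    pre = data.loc[pre_period[0]:pre_period[1]]
    post = data.loc[post_period[0]:post_period[1]]

    def design(df):
        # Intercept column plus the covariates
        return np.column_stack([np.ones(len(df)), df[covariate_cols].to_numpy()])

    coef, *_ = np.linalg.lstsq(design(pre), pre[response_col].to_numpy(), rcond=None)
    predicted = design(post) @ coef
    effect = post[response_col].to_numpy() - predicted

    return {
        "average_effect": effect.mean(),
        "cumulative_effect": effect.sum(),
        "relative_effect": effect.sum() / predicted.sum(),
        "pointwise": pd.Series(effect, index=post.index),
    }


# Simulated series (assumption): the response tracks a control-market covariate,
# and a lift of 10 units per day starts on 2024-03-11
rng = np.random.default_rng(7)
dates = pd.date_range("2024-01-01", periods=100, freq="D")
control = 200 + np.linspace(0, 20, 100) + rng.normal(0, 2, 100)
response = 50 + 0.8 * control + rng.normal(0, 1, 100)
data = pd.DataFrame({"y": response, "x": control}, index=dates)
data.loc["2024-03-11":, "y"] += 10.0

result = regression_counterfactual(
    data,
    pre_period=("2024-01-01", "2024-03-10"),
    post_period=("2024-03-11", "2024-04-09"),
    response_col="y",
    covariate_cols=["x"],
)
print(f"Average effect: {result['average_effect']:.2f} (true lift 10.0)")
print(f"Cumulative effect: {result['cumulative_effect']:.1f}")
print(f"Relative effect: {result['relative_effect']:.1%}")
```

Unlike `_fallback_analysis`, which compares post-period and pre-period means directly, this version uses the covariate to account for the upward trend, so the trend is not mistaken for lift.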

## 6. Incrementality Monitoring Dashboard

In [None]:
class IncrementalityDashboard:
    """Real-time incrementality monitoring dashboard."""
    
    def __init__(self, db_config: Dict):
        self.db_config = db_config
        self.logger = logging.getLogger(self.__class__.__name__)
        self.engine = create_engine(
            f"postgresql://{db_config['user']}:{db_config['password']}@"
            f"{db_config['host']}:{db_config['port']}/{db_config['database']}"
        )
    
    def get_active_tests(self) -> pd.DataFrame:
        """Get all active incrementality tests."""
        
        query = """
        SELECT 
            test_id,
            test_name,
            channel,
            start_date,
            end_date,
            status,
            treatment_markets,
            control_markets
        FROM incrementality_tests
        WHERE status = 'active'
        ORDER BY start_date DESC
        """
        
        return pd.read_sql(query, self.engine)
    
    def get_test_metrics(self, test_id: str, current_date: str) -> Dict:
        """Get real-time metrics for a test."""
        
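        from sqlalchemy import text  # local import keeps this method self-contained
        
        # Bind values as parameters instead of formatting them into the SQL
        # string, which avoids SQL injection and quoting errors
        query = text("""
        SELECT 
            date,
            market_id,
            is_treatment,
            revenue,
            conversions,
            spend
        FROM test_daily_metrics
        WHERE test_id = :test_id
          AND date <= :current_date
        ORDER BY date, market_id
        """)
        
        data = pd.read_sql(
            query, self.engine,
            params={'test_id': test_id, 'current_date': current_date}
        )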
        
        # Calculate running metrics
        treatment = data[data['is_treatment']]
        control = data[~data['is_treatment']]
        
        metrics = {
            'treatment_revenue': treatment['revenue'].sum(),
            'control_revenue': control['revenue'].sum(),
            'treatment_conversions': treatment['conversions'].sum(),
            'control_conversions': control['conversions'].sum(),
            'treatment_spend': treatment['spend'].sum(),
            'days_running': data['date'].nunique()
        }
        
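        # Calculate lift (undefined when there is no data or no control revenue)
        if metrics['days_running'] > 0 and metrics['control_revenue'] != 0:
            treatment_avg = metrics['treatment_revenue'] / metrics['days_running']
            control_avg = metrics['control_revenue'] / metrics['days_running']
            metrics['lift_pct'] = (treatment_avg - control_avg) / control_avg * 100
        else:
            metrics['lift_pct'] = float('nan')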
        
        return metrics
    
    def create_dashboard(self, test_id: str) -> go.Figure:
        """Create interactive dashboard for a test."""
        
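        from sqlalchemy import text  # local import keeps this method self-contained
        
        # Get data, binding test_id as a parameter instead of formatting it into the SQL
        query = text("""
        SELECT 
            date,
            SUM(CASE WHEN is_treatment THEN revenue ELSE 0 END) as treatment_revenue,
            SUM(CASE WHEN NOT is_treatment THEN revenue ELSE 0 END) as control_revenue
        FROM test_daily_metrics
        WHERE test_id = :test_id
        GROUP BY date
        ORDER BY date
        """)
        
        data = pd.read_sql(query, self.engine, params={'test_id': test_id})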
        
        # Create subplots
        fig = make_subplots(
            rows=2, cols=2,
            subplot_titles=[
                'Revenue Over Time',
                'Cumulative Revenue',
                'Daily Lift %',
                'Confidence Intervals'
            ]
        )
        
        # Revenue over time
        fig.add_trace(
            go.Scatter(x=data['date'], y=data['treatment_revenue'], 
                      name='Treatment', line=dict(color='blue')),
            row=1, col=1
        )
        fig.add_trace(
            go.Scatter(x=data['date'], y=data['control_revenue'],
                      name='Control', line=dict(color='orange')),
            row=1, col=1
        )
        
        # Cumulative revenue
        data['treatment_cumsum'] = data['treatment_revenue'].cumsum()
        data['control_cumsum'] = data['control_revenue'].cumsum()
        
        fig.add_trace(
            go.Scatter(x=data['date'], y=data['treatment_cumsum'],
                      name='Treatment Cumulative', line=dict(color='blue')),
            row=1, col=2
        )
        fig.add_trace(
            go.Scatter(x=data['date'], y=data['control_cumsum'],
                      name='Control Cumulative', line=dict(color='orange')),
            row=1, col=2
        )
        
        # Daily lift
        data['lift'] = (data['treatment_revenue'] - data['control_revenue']) / data['control_revenue'] * 100
        
        fig.add_trace(
            go.Scatter(x=data['date'], y=data['lift'],
                      name='Lift %', line=dict(color='green')),
            row=2, col=1
        )
        fig.add_hline(y=0, line_dash="dash", line_color="gray", row=2, col=1)
        
        fig.update_layout(height=800, showlegend=True, title_text=f"Test {test_id} Dashboard")
        
        return fig
    
    def alert_check(self, test_id: str) -> List[Dict]:
        """Check for alerts and anomalies."""
        
        alerts = []
        
        metrics = self.get_test_metrics(test_id, datetime.now().strftime('%Y-%m-%d'))
        
        # Check for negative lift
        if metrics['lift_pct'] < -5:
            alerts.append({
                'severity': 'high',
                'message': f"Negative lift detected: {metrics['lift_pct']:.1f}%",
                'timestamp': datetime.now()
            })
        
        # Check for data quality
        if metrics['treatment_revenue'] == 0 or metrics['control_revenue'] == 0:
            alerts.append({
                'severity': 'critical',
                'message': 'Missing data detected',
                'timestamp': datetime.now()
            })
        
        return alerts


print("Incrementality dashboard ready")
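
The dashboard queries take `test_id` from the caller, so the value should reach the database as a bound parameter rather than as part of the SQL text. The sketch below uses the standard-library `sqlite3` module as a stand-in for PostgreSQL (an assumption made so the example runs without a server). The `test_daily_metrics` table mirrors the one queried above, the rows are made up, and `revenue_by_group` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_daily_metrics "
    "(test_id TEXT, date TEXT, market_id TEXT, is_treatment INTEGER, revenue REAL)"
)
conn.executemany(
    "INSERT INTO test_daily_metrics VALUES (?, ?, ?, ?, ?)",
    [
        ("t1", "2024-01-01", "NYC", 1, 120.0),
        ("t1", "2024-01-01", "BOS", 0, 100.0),
        ("t1", "2024-01-02", "NYC", 1, 130.0),
        ("t1", "2024-01-02", "BOS", 0, 100.0),
        ("t2", "2024-01-01", "SEA", 1, 999.0),
    ],
)


def revenue_by_group(conn, test_id, current_date):
    """Total revenue per group for one test, using named bound parameters."""
    query = """
        SELECT is_treatment, SUM(revenue)
        FROM test_daily_metrics
        WHERE test_id = :test_id AND date <= :current_date
        GROUP BY is_treatment
    """
    rows = conn.execute(query, {"test_id": test_id, "current_date": current_date})
    return {("treatment" if flag else "control"): total for flag, total in rows}


totals = revenue_by_group(conn, "t1", "2024-01-02")
lift_pct = (totals["treatment"] - totals["control"]) / totals["control"] * 100
print(totals, f"lift: {lift_pct:.1f}%")

# A malicious test_id is treated as a plain string and simply matches no rows
injected = revenue_by_group(conn, "t1' OR '1'='1", "2024-01-02")
print(injected)
```

SQLAlchemy's `text()` construct accepts the same `:name` placeholder style, so the query text carries over to the PostgreSQL engine with the values passed separately.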

## 7. Real-World Project: Build Incrementality Measurement Platform

In [None]:
class IncrementalityPlatform:
    """End-to-end incrementality measurement platform."""
    
    def __init__(self, config: Dict):
        self.config = config
        self.logger = logging.getLogger(self.__class__.__name__)
        
        # Initialize components
        self.selector = MatchedMarketSelector()
        self.analyzer = GeoLiftAnalyzer()
        self.sc = SyntheticControl()
        self.ci_analyzer = CausalImpactAnalyzer()
    
    def design_test(self,
                   channel: str,
                   markets_data: pd.DataFrame,
                   n_treatment: int,
                   n_control: int,
                   min_detectable_effect: float = 10.0) -> Dict:
        """Design incrementality test."""
        
        self.logger.info(f"Designing test for {channel}")
        
        # 1. Calculate required sample size
        baseline_mean = markets_data['revenue'].mean()
        baseline_std = markets_data['revenue'].std()
        
        required_markets = self.analyzer.power_analysis(
            baseline_mean=baseline_mean,
            baseline_std=baseline_std,
            min_detectable_effect=min_detectable_effect
        )
        
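        # 2. Select treatment markets at random from the distinct market IDs
        #    (markets_data is a market-by-date panel, so sampling rows could
        #    pick the same market more than once)
        market_ids = markets_data['market_id'].drop_duplicates()
        treatment_markets = market_ids.sample(n=n_treatment).tolist()
        
        # 3. Find matched controls
        candidate_controls = market_ids[~market_ids.isin(treatment_markets)].tolist()
        
        # find_matches expects one row per market, indexed by market ID
        matching_features = ['population', 'income', 'competition_index']
        market_features = markets_data.groupby('market_id')[matching_features].mean()
        
        matches = self.selector.find_matches(
            treatment_markets=treatment_markets,
            candidate_controls=candidate_controls,
            market_data=market_features,
            matching_features=matching_features,
            n_matches=n_control
        )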
        
        control_markets = matches['control'].unique().tolist()
        
        # 4. Validate matches
        validation = self.selector.validate_matches(
            matches=matches,
            market_data=markets_data,
            time_series_col='date',
            metric_col='revenue'
        )
        
        # 5. Estimate test duration
        duration = self.analyzer.estimate_runtime(
            baseline_mean=baseline_mean,
            min_detectable_effect=min_detectable_effect,
            daily_variance=markets_data.groupby('date')['revenue'].var().mean()
        )
        
        return {
            'test_design': {
                'channel': channel,
                'treatment_markets': treatment_markets,
                'control_markets': control_markets,
                'required_markets': required_markets,
                'estimated_duration_days': duration,
                'min_detectable_effect': min_detectable_effect
            },
            'validation': validation,
            'matches': matches
        }
    
    def execute_test(self, test_design: Dict, start_date: str, end_date: str) -> str:
        """Register the test and set up monitoring; returns the new test ID."""
        
        self.logger.info("Executing incrementality test")
        
        # Store test in database
        test_id = self._create_test_record(test_design, start_date, end_date)
        
        # Monitor test
        self._setup_monitoring(test_id)
        
        return test_id
    
    def analyze_results(self,
                       test_id: str,
                       data: pd.DataFrame,
                       method: str = 'all') -> Dict:
        """Analyze test results using multiple methods."""
        
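        valid_methods = ['all', 'did', 'synthetic_control', 'causal_impact']
        if method not in valid_methods:
            raise ValueError(f"Unknown method '{method}'; expected one of {valid_methods}")
        
        self.logger.info(f"Analyzing test {test_id} with method: {method}")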
        
        results = {}
        
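        # Prepare data. Cast the flag to bool so that 0/1 integer flags are
        # used as a row mask rather than as column labels.
        is_treated = data['is_treatment'].astype(bool)
        treatment_data = data[is_treated]
        control_data = data[~is_treated]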
        
        # Method 1: Difference-in-differences
        if method in ['all', 'did']:
            did_results = self.analyzer.calculate_lift(
                treatment_pre=treatment_data[treatment_data['period'] == 'pre']['revenue'].values,
                treatment_post=treatment_data[treatment_data['period'] == 'post']['revenue'].values,
                control_pre=control_data[control_data['period'] == 'pre']['revenue'].values,
                control_post=control_data[control_data['period'] == 'post']['revenue'].values
            )
            results['difference_in_differences'] = did_results
        
        # Method 2: Synthetic control
        if method in ['all', 'synthetic_control']:
            # Average the treated markets by date so the treated series has one
            # value per row of the date x market control matrix, even when
            # several markets are treated.
            treatment_pre = (
                treatment_data[treatment_data['period'] == 'pre']
                .groupby('date')['revenue'].mean().values
            )
            control_pre = control_data[control_data['period'] == 'pre'].pivot(
                index='date', columns='market_id', values='revenue'
            ).values
            
            self.sc.fit(treatment_pre, control_pre)
            
            control_post = control_data[control_data['period'] == 'post'].pivot(
                index='date', columns='market_id', values='revenue'
            ).values
            treatment_post = (
                treatment_data[treatment_data['period'] == 'post']
                .groupby('date')['revenue'].mean().values
            )
            
            sc_results = self.sc.calculate_effect(treatment_post, control_post)
            results['synthetic_control'] = sc_results
        
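        # Method 3: Causal Impact
        if method in ['all', 'causal_impact']:
            # Build one row per date: the treated-market mean is the response and
            # the control-market mean is a covariate. The long-format frame has
            # duplicate dates and non-numeric columns, which a time-series model
            # cannot use directly. This assumes the analyzer treats every column
            # other than response_col as a covariate.
            ci_data = pd.DataFrame({
                'revenue': treatment_data.groupby('date')['revenue'].mean(),
                'control_revenue': control_data.groupby('date')['revenue'].mean()
            })
            pre_dates = data.loc[data['period'] == 'pre', 'date']
            post_dates = data.loc[data['period'] == 'post', 'date']
            ci_results = self.ci_analyzer.analyze(
                data=ci_data,
                pre_period=(pre_dates.min(), pre_dates.max()),
                post_period=(post_dates.min(), post_dates.max()),
                response_col='revenue'
            )
            results['causal_impact'] = ci_results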
        
        # Consensus result
        results['consensus'] = self._calculate_consensus(results)
        
        return results
    
    def _calculate_consensus(self, results: Dict) -> Dict:
        """Calculate consensus estimate across methods."""
        
        # Each method is assumed to report its effect on the same percentage scale.
        effects = []
        
        if 'difference_in_differences' in results:
            effects.append(results['difference_in_differences']['pct_lift'])
        
        if 'synthetic_control' in results:
            effects.append(results['synthetic_control']['pct_effect'])
        
        if 'causal_impact' in results:
            effects.append(results['causal_impact']['relative_effect'])
        
        # np.min and np.max raise an opaque error on an empty list, so fail early
        if not effects:
            raise ValueError("No method results available to compute a consensus")
        
        return {
            'mean_effect': np.mean(effects),
            'median_effect': np.median(effects),
            'std_effect': np.std(effects),
            'min_effect': np.min(effects),
            'max_effect': np.max(effects)
        }
    
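    def _create_test_record(self, test_design: Dict, start_date: str, end_date: str) -> str:
        """Create test record in database."""
        # Placeholder: a real implementation would insert the record into the database.
        # The short random suffix keeps IDs unique when two tests start in the same second.
        from uuid import uuid4
        test_id = f"test_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{uuid4().hex[:8]}"
        return test_id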
    
    def _setup_monitoring(self, test_id: str):
        """Setup real-time monitoring for test."""
        # Implementation would setup monitoring
        pass
    
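    def generate_report(self, test_id: str, results: Dict) -> str:
        """Generate executive report."""
        
        consensus = results['consensus']
        
        # Build the report from a list of lines so the output does not inherit
        # the indentation of this source file.
        lines = [
            "INCREMENTALITY TEST RESULTS",
            "===========================",
            "",
            f"Test ID: {test_id}",
            f"Date: {datetime.now().strftime('%Y-%m-%d')}",
            "",
            "SUMMARY:",
            "--------",
            f"Consensus Lift: {consensus['mean_effect']:.1f}%",
            f"Range: {consensus['min_effect']:.1f}% to {consensus['max_effect']:.1f}%",
            "",
            "METHODOLOGY COMPARISON:",
            "-----------------------",
        ]
        
        if 'difference_in_differences' in results:
            did = results['difference_in_differences']
            lines += [
                "",
                "Difference-in-Differences:",
                f"  - Lift: {did['pct_lift']:.1f}%",
                f"  - P-value: {did['p_value']:.4f}",
                f"  - Significant: {did['is_significant']}",
            ]
        
        if 'synthetic_control' in results:
            sc = results['synthetic_control']
            lines += ["", "Synthetic Control:", f"  - Effect: {sc['pct_effect']:.1f}%"]
        
        if 'causal_impact' in results:
            ci = results['causal_impact']
            lines += ["", "Causal Impact:", f"  - Effect: {ci['relative_effect']:.1f}%"]
        
        return "\n".join(lines)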


# Example usage
# platform = IncrementalityPlatform(config={'db': 'connection_string'})
# test_design = platform.design_test(
#     channel='paid_search',
#     markets_data=market_df,
#     n_treatment=10,
#     n_control=10
# )

print("Incrementality platform ready")
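
The `analyze_results` method above reads the columns `date`, `market_id`, `is_treatment`, `period`, and `revenue`. As a rough illustration of that input shape, the sketch below simulates a small panel and computes a plain difference-in-differences estimate by hand. The market names, revenue level, noise, and the 10% lift are all invented for this toy example; it does not call `GeoLiftAnalyzer`, whose internals may differ.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical panel: 4 markets, 60 days, the last 20 days form the test period
dates = pd.date_range("2024-01-01", periods=60, freq="D")
treated_markets = {"A", "B"}
true_lift = 0.10  # simulated lift, an assumption of this example

frames = []
for market in ["A", "B", "C", "D"]:
    revenue = 1000 + rng.normal(0, 10, size=len(dates))
    is_post = np.arange(len(dates)) >= 40
    if market in treated_markets:
        # Apply the simulated lift to treated markets in the post period only
        revenue = np.where(is_post, revenue * (1 + true_lift), revenue)
    frames.append(pd.DataFrame({
        "date": dates,
        "market_id": market,
        "is_treatment": market in treated_markets,
        "period": np.where(is_post, "post", "pre"),
        "revenue": revenue,
    }))
panel = pd.concat(frames, ignore_index=True)

# Mean revenue for each (group, period) cell
cell = (
    panel.assign(group=np.where(panel["is_treatment"], "treatment", "control"))
    .groupby(["group", "period"])["revenue"].mean()
    .unstack("period")
)

# DiD: change in the treated group minus change in the control group
did = (
    (cell.loc["treatment", "post"] - cell.loc["treatment", "pre"])
    - (cell.loc["control", "post"] - cell.loc["control", "pre"])
)
pct_lift = 100 * did / cell.loc["treatment", "pre"]
print(f"DiD estimate: {did:.1f} ({pct_lift:.1f}% of the treated pre-period mean)")
```

Because the control group's pre-to-post change is subtracted, any shift common to all markets drops out of the estimate, which is why the result should land near the simulated lift.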

## 8. Exercises

### Exercise 1: Geo-Lift Test Design
1. Design a geo-lift test for a marketing channel
2. Calculate required sample size for 10% MDE
3. Estimate test duration
4. Validate parallel trends assumption
5. Analyze results with DID
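
As a starting point for step 2, the sketch below sizes each group with the normal approximation for a two-sided, two-sample comparison. The baseline mean and standard deviation are placeholder values, and the function name `markets_per_group` is hypothetical. The t-based solver `tt_ind_solve_power`, imported in the setup cell, should give a slightly larger answer because it accounts for estimating the variance.

```python
import math
from scipy.stats import norm


def markets_per_group(baseline_mean, baseline_std, mde_pct, alpha=0.05, power=0.80):
    """Approximate units per group needed to detect a relative lift of mde_pct."""
    effect = baseline_mean * mde_pct / 100.0   # absolute effect to detect
    d = effect / baseline_std                  # standardized effect (Cohen's d)
    z_alpha = norm.ppf(1 - alpha / 2)          # two-sided significance threshold
    z_power = norm.ppf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Placeholder baseline: mean revenue 1000 with standard deviation 250
n = markets_per_group(baseline_mean=1000.0, baseline_std=250.0, mde_pct=10.0)
print(f"Units per group for a 10% MDE: {n}")
```

The required count scales with the inverse square of the standardized effect, so halving the MDE roughly quadruples the number of units needed.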

### Exercise 2: Matched Market Selection
1. Load multi-market dataset
2. Select treatment markets
3. Find optimal matched controls
4. Validate match quality
5. Test for cointegration

### Exercise 3: Synthetic Control Analysis
1. Implement synthetic control method
2. Fit model on pre-intervention period
3. Calculate treatment effect
4. Run placebo tests
5. Visualize results
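
A minimal starting point for steps 1 to 3, assuming simulated data: donor weights are constrained to be non-negative and to sum to one, and are chosen to minimize pre-period squared error. This sketch uses `scipy.optimize.minimize` with SLSQP rather than CVXPY to keep dependencies small; the `SyntheticControl` class earlier in the notebook may be implemented differently. The donor pool, true weights, and effect size are invented.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n_pre, n_post, n_donors = 60, 20, 5

# Hypothetical donor pool: each column is one control market's daily revenue
donors = 100 + rng.normal(0, 5, size=(n_pre + n_post, n_donors)).cumsum(axis=0)
true_w = np.array([0.6, 0.4, 0.0, 0.0, 0.0])
treated = donors @ true_w + rng.normal(0, 1, size=n_pre + n_post)
treated[n_pre:] += 15.0  # simulated treatment effect in the post period

X_pre, y_pre = donors[:n_pre], treated[:n_pre]


def loss(w):
    """Sum of squared pre-period errors between treated and synthetic series."""
    return np.sum((y_pre - X_pre @ w) ** 2)


def grad(w):
    """Analytic gradient of the loss."""
    return -2.0 * X_pre.T @ (y_pre - X_pre @ w)


res = minimize(
    loss,
    x0=np.full(n_donors, 1.0 / n_donors),
    jac=grad,
    method="SLSQP",
    bounds=[(0.0, 1.0)] * n_donors,
    constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    options={"maxiter": 1000, "ftol": 1e-9},
)
weights = res.x

# Effect: average post-period gap between the treated and synthetic series
synthetic_post = donors[n_pre:] @ weights
effect = np.mean(treated[n_pre:] - synthetic_post)
print("weights:", np.round(weights, 3))
print(f"estimated effect: {effect:.2f}")
```

For step 4, a placebo test can reuse the same fit by treating each donor in turn as if it were the treated market and comparing its estimated gap with the real one.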

### Exercise 4: Production Platform
1. Build end-to-end incrementality platform
2. Implement automated test design
3. Create monitoring dashboard
4. Set up alerting system
5. Generate automated reports

## Resources

### Documentation
- [CausalImpact R Package](https://google.github.io/CausalImpact/)
- [Synthetic Control Methods](https://synth-inference.github.io/)
- [GeoLift (Meta)](https://facebookincubator.github.io/GeoLift/)
- [CVXPY - Optimization](https://www.cvxpy.org/)

### Papers
- Abadie et al. (2010): Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program
- Brodersen et al. (2015): Inferring causal impact using Bayesian structural time-series models
- Athey & Imbens (2017): The State of Applied Econometrics: Causality and Policy Evaluation
- Vaver & Koehler (2011): Measuring Ad Effectiveness Using Geo Experiments

### Tools
- GeoLift: Geo experimentation platform
- CausalImpact: Bayesian causal impact
- DoWhy: Causal inference library
- EconML: Econometric ML methods