# DeepCausalMMM Quick Start Guide

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adityapt/deepcausalmmm/blob/main/examples/quickstart.ipynb)
[![PyPI](https://badge.fury.io/py/deepcausalmmm.svg)](https://pypi.org/project/deepcausalmmm/)

Welcome! This notebook will walk you through using **DeepCausalMMM** for Marketing Mix Modeling.

## What You'll Learn

1. 📦 Install DeepCausalMMM
2. 📊 Generate synthetic MMM data
3. 🚀 Train a model
4. 📈 Analyze results
5. 📉 Fit response curves

## Step 1: Installation

In [None]:
!pip install deepcausalmmm -q

## Step 2: Import Libraries

In [None]:
import pandas as pd
import numpy as np
import deepcausalmmm

from deepcausalmmm import DeepCausalMMM
from deepcausalmmm.core import get_default_config
from deepcausalmmm.core.trainer import ModelTrainer
from deepcausalmmm.core.data import UnifiedDataPipeline
from deepcausalmmm.utils.data_generator import generate_synthetic_mmm_data
from deepcausalmmm.postprocess import ResponseCurveFit

print(f"✅ DeepCausalMMM v{deepcausalmmm.__version__} loaded!")

## Step 3: Generate Synthetic Data

For this demo, we'll generate synthetic MMM data with:
- 10 regions (DMAs)
- 52 weeks
- 5 media channels
- 3 control variables

In [None]:
# Generate data
df = generate_synthetic_mmm_data(
    n_regions=10,
    n_weeks=52,
    n_media=5,
    n_controls=3,
    seed=42
)

print(f"📊 Data shape: {df.shape}")
print(f"\nColumns: {list(df.columns)}")
df.head()

## Step 4: Configure Model

Start from the default configuration and customize it for a quick demo:

In [None]:
config = get_default_config()

# Reduce epochs for quick demo
config['n_epochs'] = 200  # Use 2500+ for production
config['learning_rate'] = 0.01

print("⚙️ Configuration:")
print(f"  Epochs: {config['n_epochs']}")
print(f"  Learning Rate: {config['learning_rate']}")
print(f"  Hidden Dim: {config['hidden_dim']}")

## Step 5: Prepare Data

In [None]:
# Initialize pipeline
pipeline = UnifiedDataPipeline(config)

# Process data
processed_data = pipeline.fit_transform(df)

print("✅ Data processed!")
print(f"Training: {processed_data['train_tensors']['X_media'].shape}")
print(f"Holdout: {processed_data['holdout_tensors']['X_media'].shape}")

## Step 6: Train Model

In [None]:
# Train
trainer = ModelTrainer(config)
model, results = trainer.train(processed_data)

print("\n✅ Training complete!")

## Step 7: View Results

In [None]:
print("📊 Performance Metrics:\n")
print(f"Training R²: {results['train_r2']:.4f}")
print(f"Holdout R²: {results['holdout_r2']:.4f}")
print(f"\nTraining RMSE: {results['train_rmse']:.2f}")
print(f"Holdout RMSE: {results['holdout_rmse']:.2f}")

gap = abs(results['train_r2'] - results['holdout_r2']) / results['train_r2'] * 100
print(f"\nPerformance Gap: {gap:.2f}%")
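
The performance gap above is a relative difference, so it divides by the training R². If a short demo run ends with a training R² at or below zero, that ratio becomes undefined or misleading. The sketch below wraps the same formula in a guarded helper. The name `relative_r2_gap` is hypothetical and not part of DeepCausalMMM.

```python
def relative_r2_gap(train_r2: float, holdout_r2: float) -> float:
    """Relative train/holdout R² gap in percent.

    Hypothetical helper (not a DeepCausalMMM function). Returns NaN when
    the training R² is not positive, because the ratio is then meaningless.
    """
    if train_r2 <= 0:
        return float("nan")
    return abs(train_r2 - holdout_r2) / train_r2 * 100


# Illustrative numbers only, not results from this notebook
example_gap = relative_r2_gap(0.90, 0.81)
print(f"Performance Gap: {example_gap:.2f}%")
```

With the notebook's own values you would call `relative_r2_gap(results['train_r2'], results['holdout_r2'])`.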

## Step 8: Get Contributions

In [None]:
# Get predictions and contributions
postprocess = pipeline.predict_and_postprocess(model, split='holdout')

predictions = postprocess['predictions']
media_contrib = postprocess['media_contributions']

print(f"Predictions: {predictions.shape}")
print(f"Media contributions: {media_contrib.shape}")
print(f"\nTotal predicted: {predictions.sum():.2f}")

## Step 9: Analyze Channels

In [None]:
# Channel contributions
channel_totals = media_contrib.sum(axis=(0, 1))
channels = [f"Channel_{i+1}" for i in range(len(channel_totals))]

contrib_df = pd.DataFrame({
    'Channel': channels,
    'Contribution': channel_totals,
    'Percentage': channel_totals / channel_totals.sum() * 100
}).sort_values('Contribution', ascending=False)

print("📊 Channel Impact:\n")
print(contrib_df.to_string(index=False))
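
To see what `sum(axis=(0, 1))` does, here is a toy example in plain NumPy. It assumes the contribution array is laid out as (regions, weeks, channels), which is what the indexing in this notebook implies. The numbers are made up for illustration.

```python
import numpy as np

# Toy contributions with assumed layout (regions, weeks, channels) = (2, 3, 2)
toy_contrib = np.array([
    [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]],  # region 0, three weeks
    [[3.0, 4.0], [3.0, 4.0], [3.0, 4.0]],  # region 1, three weeks
])

# Collapse the region and week axes, leaving one total per channel
toy_totals = toy_contrib.sum(axis=(0, 1))

# Share of total media contribution per channel, in percent
toy_share = toy_totals / toy_totals.sum() * 100

print(toy_totals)  # [12. 18.]
print(toy_share)   # [40. 60.]
```

Summing over both leading axes gives a single number per channel, so the percentages describe each channel's share across all regions and weeks in the selected split.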

## Step 10: Response Curves

Fit Hill saturation curves to understand diminishing returns:

In [None]:
# Prepare data for top channel
top_channel_idx = channel_totals.argmax()
top_channel = channels[top_channel_idx]

# Get impressions from original data
impressions_col = [c for c in df.columns if 'media' in c.lower()][top_channel_idx]
impressions = df[impressions_col].values

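# Get contributions, then trim both arrays to a common length so the
# DataFrame columns below always match
contributions = media_contrib[:, :, top_channel_idx].flatten()
n = min(len(impressions), len(contributions))
impressions, contributions = impressions[:n], contributions[:n]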

# Create curve data
curve_data = pd.DataFrame({
    'week_monday': pd.date_range('2023-01-02', periods=len(impressions), freq='W-MON'),  # Mondays; plain 'W' anchors on Sunday
    'impressions': impressions,
    'spend': impressions * 0.5,
    'predicted': contributions
})

# Fit curve
fitter = ResponseCurveFit(data=curve_data, model_level='Overall')
fitter.fit(
    x_label='Impressions',
    y_label='Contributions',
    title=f'Response Curve: {top_channel}',
    generate_figure=True,
    save_figure=True,
    output_path='response_curve.html'
)

print(f"\n📉 Response Curve for {top_channel}:")
print(f"  Slope: {fitter.slope:.3f}")
print(f"  Half-Saturation: {fitter.saturation:,.0f}")
print(f"  R²: {fitter.r_2:.3f}")
print(f"\n💾 Saved to: response_curve.html")
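
For intuition about the slope and half-saturation values printed above, the sketch below evaluates a standard Hill function in NumPy. The exact parameterization used inside `ResponseCurveFit` is not shown in this notebook, so the formula, the `hill` function, and its `top` parameter are assumptions for illustration.

```python
import numpy as np


def hill(x, slope, saturation, top=1.0):
    """Standard Hill curve: top * x^slope / (saturation^slope + x^slope).

    Illustrative only; assumed form, not taken from DeepCausalMMM.
    `saturation` is the input level at which the response reaches top / 2.
    """
    x = np.asarray(x, dtype=float)
    return top * x**slope / (saturation**slope + x**slope)


# Made-up parameters: slope 2, half-saturation at 1,000 impressions
levels = np.array([0.0, 1000.0, 2000.0, 3000.0])
response = hill(levels, slope=2.0, saturation=1000.0)

# Each extra 1,000 impressions adds less response than the previous one
increments = np.diff(response)
print(response)    # [0.  0.5 0.8 0.9]
print(increments)  # [0.5 0.3 0.1]
```

Under this form, the response at the half-saturation point is exactly half of the maximum, and the shrinking increments are the diminishing returns the fitted curve is meant to capture.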

## 🎉 Congratulations!

You've successfully:
- ✅ Installed DeepCausalMMM
- ✅ Trained a model
- ✅ Analyzed channel contributions
- ✅ Fitted response curves

## Next Steps

### 📚 Learn More:
- [Documentation](https://deepcausalmmm.readthedocs.io/)
- [API Reference](https://deepcausalmmm.readthedocs.io/en/latest/api/)
- [GitHub](https://github.com/adityapt/deepcausalmmm)

### 🔧 Advanced Features:
- Run the comprehensive dashboard
- Explore DAG causal discovery
- Analyze multiple regions
- Optimize hyperparameters

### 📖 Citation:
```bibtex
@software{Puttaparthi_Tirumala_DeepCausalMMM_2025,
  author = {Puttaparthi Tirumala, Aditya},
  doi = {10.5281/zenodo.17274024},
  title = {{DeepCausalMMM}},
  url = {https://github.com/adityapt/deepcausalmmm},
  version = {1.0.17},
  year = {2025}
}
```

**Happy Modeling! 🚀**