# Robyn: Marketing Mix Modeling Application

This notebook demonstrates the usage of Robyn, a Marketing Mix Modeling (MMM) application. We'll go through the main steps of initializing the model, running it, and performing budget allocation.



## 1. Import Required Libraries

First, let's import the necessary libraries. The simulated data used for this demonstration is loaded in the next section.

In [1]:
import sys

# Machine-specific path: point this at your own checkout of Robyn's Python sources.
sys.path.append("/Users/yijuilee/robynpy_release_reviews/Robyn/python/src")
import pandas as pd
import numpy as np
from robyn.robyn import Robyn
from robyn.data.entities.mmmdata import MMMData
from robyn.data.entities.holidays_data import HolidaysData
from robyn.data.entities.hyperparameters import Hyperparameters, ChannelHyperparameters
from robyn.data.entities.calibration_input import CalibrationInput, ChannelCalibrationData
from robyn.data.entities.enums import AdstockType, DependentVarType, CalibrationScope

2024-11-14 18:40:01,943 - robyn - INFO - Logging is set up to console only.


## 2.1. Load Simulated Data

This demonstration uses a simulated weekly dataset. For a real analysis, replace it with your own marketing data.

In [2]:
# Read the simulated weekly data (the holidays data is read in the next cell)
dt_simulated_weekly = pd.read_csv("resources/dt_simulated_weekly.csv")

print("Simulated Data...")
dt_simulated_weekly.head()

Simulated Data...


Unnamed: 0,DATE,revenue,tv_S,ooh_S,print_S,facebook_I,search_clicks_P,search_S,competitor_sales_B,facebook_S,events,newsletter
0,2015-11-23,2754372.0,22358.346667,0.0,12728.488889,24301280.0,0.0,0.0,8125009,7607.132915,na,19401.653846
1,2015-11-30,2584277.0,28613.453333,0.0,0.0,5527033.0,9837.238486,4133.333333,7901549,1141.95245,na,14791.0
2,2015-12-07,2547387.0,0.0,132278.4,453.866667,16651590.0,12044.119653,3786.666667,8300197,4256.375378,na,14544.0
3,2015-12-14,2875220.0,83450.306667,0.0,17680.0,10549770.0,12268.070319,4253.333333,8122883,2800.490677,na,2800.0
4,2015-12-21,2215953.0,0.0,277336.0,0.0,2934090.0,9467.248023,3613.333333,7105985,689.582605,na,15478.0
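
When swapping in real data, it can help to confirm that the file has every column the `MMMDataSpec` later in this notebook refers to. The following is a minimal sketch using only the standard library; the helper name `missing_columns` and the inline sample are illustrative assumptions and are not part of Robyn.

```python
import csv
import io


def missing_columns(csv_text, required):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Column names taken from the MMMDataSpec used in this notebook.
REQUIRED = [
    "DATE", "revenue",
    "tv_S", "ooh_S", "print_S", "facebook_S", "search_S",
    "facebook_I", "search_clicks_P",
    "competitor_sales_B", "events", "newsletter",
]

# Hypothetical, deliberately incomplete sample to show the check failing.
sample = "DATE,revenue,tv_S,ooh_S\n2015-11-23,2754372.0,22358.35,0.0\n"
print(missing_columns(sample, REQUIRED))
```

An empty list means all required columns are present. For a file on disk, pass its contents, for example `open(path).read()`.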


In [3]:
dt_prophet_holidays = pd.read_csv("resources/dt_prophet_holidays.csv")

print("Holidays Data...")
dt_prophet_holidays.head()

Holidays Data...


Unnamed: 0,ds,holiday,country,year
0,1995-01-01,New Year's Day,AD,1995
1,1995-01-06,Epiphany,AD,1995
2,1995-02-28,Carnival,AD,1995
3,1995-03-14,Constitution Day,AD,1995
4,1995-04-14,Good Friday,AD,1995


## 2.2. Initialize Robyn

Now, let's initialize Robyn with the simulated data, the holidays data, and a hyperparameter configuration.

In [4]:
# Create the Robyn instance with a working directory
robyn = Robyn(working_dir="~/temp/robyn")

# Create MMMData
mmm_data_spec = MMMData.MMMDataSpec(
    dep_var="revenue",
    dep_var_type="revenue",
    date_var="DATE",
    context_vars=["competitor_sales_B", "events"],
    paid_media_spends=["tv_S", "ooh_S", "print_S", "facebook_S", "search_S"],
    paid_media_vars=["tv_S", "ooh_S", "print_S", "facebook_I", "search_clicks_P"],
    organic_vars=["newsletter"],
    window_start="2016-01-01",
    window_end="2018-12-31",
)

mmm_data = MMMData(data=dt_simulated_weekly, mmmdata_spec=mmm_data_spec)

# Create HolidaysData from the Prophet holidays table loaded above
holidays_data = HolidaysData(
    dt_holidays=dt_prophet_holidays,
    prophet_vars=["trend", "season", "holiday"],
    prophet_country="DE",
    prophet_signs=["default", "default", "default"],
)

# Create Hyperparameters
hyperparameters = Hyperparameters(
    {
        "facebook_S": ChannelHyperparameters(
            alphas=[0.5, 3],
            gammas=[0.3, 1],
            thetas=[0, 0.3],
        ),
        "print_S": ChannelHyperparameters(
            alphas=[0.5, 3],
            gammas=[0.3, 1],
            thetas=[0.1, 0.4],
        ),
        "tv_S": ChannelHyperparameters(
            alphas=[0.5, 3],
            gammas=[0.3, 1],
            thetas=[0.3, 0.8],
        ),
        "search_S": ChannelHyperparameters(
            alphas=[0.5, 3],
            gammas=[0.3, 1],
            thetas=[0, 0.3],
        ),
        "ooh_S": ChannelHyperparameters(
            alphas=[0.5, 3],
            gammas=[0.3, 1],
            thetas=[0.1, 0.4],
        ),
        "newsletter": ChannelHyperparameters(
            alphas=[0.5, 3],
            gammas=[0.3, 1],
            thetas=[0.1, 0.4],
        ),
    },
    adstock=AdstockType.GEOMETRIC,
    lambda_=[0, 1],
    train_size=[0.5, 0.8],
)


# Calibration is not supported yet.
# Example CalibrationInput (dummy values), left commented out for reference:
# calibration_input = CalibrationInput({
#     "tv_spend": ChannelCalibrationData(
#         lift_start_date=pd.Timestamp("2022-03-01"),
#         lift_end_date=pd.Timestamp("2022-03-15"),
#         lift_abs=10000,
#         spend=50000,
#         confidence=0.9,
#         metric="revenue",
#         calibration_scope=CalibrationScope.IMMEDIATE
#     )
# })

# Initialize Robyn with the data, holidays, and hyperparameters defined above
robyn.initialize(
    mmm_data=mmm_data,
    holidays_data=holidays_data,
    hyperparameters=hyperparameters,
)

print("Robyn initialized successfully!")

2024-11-14 18:40:03,950 - root - INFO - Robyn initialized with working directory: ~/temp/robyn
2024-11-14 18:40:03,955 - robyn.data.validation.mmmdata_validation - INFO - Starting complete MMMData validation
2024-11-14 18:40:03,957 - robyn.data.validation.mmmdata_validation - INFO - Missing and infinite value check passed successfully
2024-11-14 18:40:03,959 - robyn.data.validation.mmmdata_validation - INFO - No-variance check passed successfully
2024-11-14 18:40:03,959 - robyn.data.validation.mmmdata_validation - INFO - Variable names validation passed successfully
2024-11-14 18:40:03,960 - robyn.data.validation.mmmdata_validation - INFO - Date variable validation passed successfully
2024-11-14 18:40:03,960 - robyn.data.validation.mmmdata_validation - INFO - Dependent variable validation passed successfully
2024-11-14 18:40:03,960 - robyn.data.validation.mmmdata_validation - INFO - All validations passed successfully
2024-11-14 18:40:03,961 - robyn.data.validation.holidays_data_valida

Rolling Window Start Index: 6
Rolling Window End Index: 162
Validation complete
Robyn initialized successfully!
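
The `thetas`, `alphas`, and `gammas` ranges above are search bounds for each channel's adstock and saturation parameters. As a rough illustration of what those parameters do, here is a minimal pure-Python sketch of geometric adstock followed by a Hill-type saturation curve. It assumes the commonly described formulation (carry-over `x[t] + theta * carried[t-1]`, and an inflexion point interpolated between the minimum and maximum of the series by `gamma`); the function names are hypothetical, and this is not Robyn's actual implementation.

```python
def geometric_adstock(spend, theta):
    """Carry a fraction `theta` of the previous period's effect into the next."""
    carried = 0.0
    out = []
    for x in spend:
        carried = x + theta * carried
        out.append(carried)
    return out


def hill_saturation(values, alpha, gamma):
    """Map values into [0, 1) with an S-shaped (alpha > 1) or C-shaped curve.

    Assumes at least one value is positive; otherwise the inflexion is 0
    and the ratio is undefined.
    """
    lo, hi = min(values), max(values)
    inflexion = lo * (1 - gamma) + hi * gamma
    return [v**alpha / (v**alpha + inflexion**alpha) for v in values]


spend = [100.0, 0.0, 0.0, 50.0]                   # hypothetical weekly spend
adstocked = geometric_adstock(spend, theta=0.5)   # [100.0, 50.0, 25.0, 62.5]
saturated = hill_saturation(adstocked, alpha=2, gamma=0.5)
print(adstocked)
print(saturated)
```

A larger `theta` means a longer carry-over, which is consistent with `tv_S` having the highest `thetas` range in the configuration above. `alpha` controls the curve shape and `gamma` shifts the inflexion point.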


## 3. Run Robyn Model

After initialization, we can run the Robyn model.

In [5]:
from robyn.modeling.entities.modelrun_trials_config import TrialsConfig


# 5 trials of 54 iterations each; kept low here so the demonstration runs quickly
trials_config = TrialsConfig(iterations=54, trials=5)

# Run the model
robyn.model_e2e_run(plot=False, trials_config=trials_config)

2024-11-14 18:40:03,977 - robyn.modeling.feature_engineering - INFO - Starting feature engineering process
2024-11-14 18:40:03,979 - robyn.modeling.feature_engineering - INFO - Starting Prophet decomposition
2024-11-14 18:40:03,980 - robyn.modeling.feature_engineering - INFO - Starting Prophet decomposition
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.holidays['ds'] = pd.to_datetime(self.holidays['ds'])
2024-11-14 18:40:04,712 - cmdstanpy - DEBUG - input tempfile: /var/folders/gm/g5cpl7110m96nfd1qr1xwnwc0000gn/T/tmp06ul5sco/rcxqefyb.json
2024-11-14 18:40:04,722 - cmdstanpy - DEBUG - input tempfile: /var/folders/gm/g5cpl7110m96nfd1qr1xwnwc0000gn/T/tmp06ul5sco/a0tt7zfw.json
2024-11-14 18:40:04,724 - cmdstanpy - DEBUG - idx 0
2024-11-14 18:40:04,725 - cmdstanpy - DEBU

Model evaluation complete.
Model training and evaluation complete.


## 4. Budget Allocation

Finally, let's perform budget allocation using the trained model.

This notebook demonstrates the basic workflow of using Robyn for Marketing Mix Modeling. In a real-world scenario, replace the simulated data with your own marketing data and adjust the hyperparameter ranges, trial settings, and allocation constraints accordingly.

Robyn also supports model evaluation, visualization, and interpretation of results, which are beyond the scope of this basic demonstration.
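
In the allocation cell that follows, `channel_constr_low` and `channel_constr_up` are multipliers of each channel's base spend, so `0.7` means 30% below base and `1.2` means 20% above. The sketch below turns those multipliers into absolute bounds. The `multiplier` argument reflects an assumption about how `channel_constr_multiplier` widens the bounds (scaling the distance from 1.0 and clipping the lower bound at zero); the helper `spend_bounds` is hypothetical and not part of Robyn.

```python
def spend_bounds(base_spend, low, up, multiplier=1.0):
    """Return (lower, upper) absolute spend bounds for one channel.

    `low` and `up` are multipliers of `base_spend`. `multiplier` widens the
    range around 1.0 (assumed behaviour of channel_constr_multiplier).
    """
    ext_low = max(0.0, 1 - (1 - low) * multiplier)
    ext_up = 1 + (up - 1) * multiplier
    return base_spend * ext_low, base_spend * ext_up


# Illustrative base spend of 1221 with the tv_S constraints from this notebook.
lower, upper = spend_bounds(1221.0, low=0.7, up=1.2)
print(round(lower, 1), round(upper, 1))

# The same channel with the bounds widened by a factor of 3.
wide_lower, wide_upper = spend_bounds(1221.0, low=0.7, up=1.2, multiplier=3.0)
print(round(wide_lower, 1), round(wide_upper, 1))
```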

In [6]:
from robyn.allocator.entities.allocation_constraints import AllocationConstraints
from robyn.allocator.entities.allocation_config import AllocationConfig
from robyn.allocator.entities.enums import OptimizationScenario, ConstrMode

# Define allocation constraints
channel_constraints = AllocationConstraints(
    channel_constr_low={
        "tv_S": 0.7,  # -30% from base
        "ooh_S": 0.7,
        "print_S": 0.7,
        "facebook_S": 0.7,
        "search_S": 0.7,
    },
    channel_constr_up={
        "tv_S": 1.2,  # +20% from base
        "ooh_S": 1.5,  # +50% from base
        "print_S": 1.5,
        "facebook_S": 1.5,
        "search_S": 1.5,
    },
    channel_constr_multiplier=3.0,
)
# Configure allocation scenario
allocation_config = AllocationConfig(
    scenario=OptimizationScenario.MAX_RESPONSE,
    constraints=channel_constraints,
    date_range="last",  # Use last period as initial
    total_budget=None,  # Use historical budget
    maxeval=100000,
    optim_algo="SLSQP_AUGLAG",
    constr_mode=ConstrMode.EQUALITY,
)

# Call the budget_allocator method
allocation_result = robyn.budget_allocator(
    select_model=None,  # None lets Robyn choose a model (2_1_1 in the run below); pass a model ID to pick one yourself
    allocation_constraints=channel_constraints,
    allocator_config=allocation_config,
    plot=False,
    export=False,
)
# Display the allocation result
print(allocation_result)

2024-11-14 18:40:55,274 - robyn.allocator.budget_allocator - INFO - Initializing BudgetAllocator
2024-11-14 18:40:55,275 - robyn.allocator.media_response - INFO - Initializing MediaResponseParamsCalculator
2024-11-14 18:40:55,276 - robyn.allocator.allocation_optimizer - INFO - Initializing AllocationOptimizer
2024-11-14 18:40:55,276 - robyn.allocator.media_response - INFO - Starting media response parameters calculation for model 2_1_1
2024-11-14 18:40:55,280 - robyn.allocator.media_response - INFO - Successfully calculated media response parameters: MediaResponseParameters(alphas=5 channels, inflexions=5 channels, coefficients=5 channels)
2024-11-14 18:40:55,280 - robyn.allocator.budget_allocator - INFO - BudgetAllocator initialization completed successfully
2024-11-14 18:40:55,281 - robyn.allocator.budget_allocator - INFO - Starting budget allocation optimization
2024-11-14 18:40:55,283 - robyn.allocator.response_calculator - INFO - Successfully calculated gradient value: -0.0000
202


            Model ID: 2_1_1
            Scenario: max_response
            Use case: 
            Window: 2019-11-11 00:00:00:2019-11-11 00:00:00 (1 week)
            Dep. Variable Type: revenue
            Media Skipped: None
            Relative Spend Increase: 0.0% (+0K)
            Total Response Increase (Optimized): 0.0%
            Allocation Summary:
            

                - tv_S:
                  Optimizable bound: [-30%, 20%],
                  Initial spend share: 7.14% -> Optimized bounded: 7.14%
                  Initial response share: 9.90% -> Optimized bounded: 9.90%
                  Initial abs. mean spend: 1.221K -> Optimized: 1.221K [Delta = 0%]
                

                - ooh_S:
                  Optimizable bound: [-30%, 50%],
                  Initial spend share: 7.14% -> Optimized bounded: 7.14%
                  Initial response share: 0.00% -> Optimized bounded: 0.00%
                  Initial abs. mean spend: 1.221K -> Optimized: 1.221K [Del