## Step 1: Installing Meridian

In [5]:
!pip install --upgrade google-meridian

Collecting google-meridian
  Downloading google_meridian-1.1.6-py3-none-any.whl.metadata (22 kB)
Collecting arviz (from google-meridian)
  Using cached arviz-0.22.0-py3-none-any.whl.metadata (8.9 kB)
Collecting altair>=5 (from google-meridian)
  Using cached altair-5.5.0-py3-none-any.whl.metadata (11 kB)
Collecting immutabledict (from google-meridian)
  Using cached immutabledict-4.2.1-py3-none-any.whl.metadata (3.5 kB)
Collecting joblib (from google-meridian)
  Using cached joblib-1.5.1-py3-none-any.whl.metadata (5.6 kB)
Collecting natsort<8,>=7.1.1 (from google-meridian)
  Using cached natsort-7.1.1-py3-none-any.whl.metadata (22 kB)
Collecting numpy<3,>=2.0.2 (from google-meridian)
  Using cached numpy-2.3.2-cp312-cp312-macosx_14_0_x86_64.whl.metadata (62 kB)
Collecting pandas<3,>=2.2.2 (from google-meridian)
  Using cached pandas-2.3.1-cp312-cp312-macosx_10_13_x86_64.whl.metadata (91 kB)
Collecting scipy<2,>=1.13.1 (from google-meridian)
  Using cached scipy-1.16.1-cp312-cp312-macos
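
After installing, it can help to confirm which version the kernel actually sees, since behavior may differ between Meridian releases. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name and is not part of Meridian.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "google-meridian" is the distribution name used in the pip command above.
print("google-meridian:", installed_version("google-meridian"))
```

A `None` result suggests the install went into a different environment than the one the notebook kernel is running in.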

In [47]:
import arviz as az
import IPython
from meridian import constants
from meridian.analysis import analyzer
from meridian.analysis import formatter
from meridian.analysis import optimizer
from meridian.analysis import summarizer
from meridian.analysis import visualizer
from meridian.data import load
from meridian.data import test_utils
from meridian.model import model
from meridian.model import prior_distribution
from meridian.model import spec
import numpy as np
import pandas as pd
from psutil import virtual_memory
import tensorflow as tf
import tensorflow_probability as tfp

# Report the machine's total RAM and the GPUs/CPUs visible to TensorFlow.
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))
print(
    'Num GPUs Available: ',
    len(tf.config.experimental.list_physical_devices('GPU')),
)
print(
    'Num CPUs Available: ',
    len(tf.config.experimental.list_physical_devices('CPU')),
)

Your runtime has 17.2 gigabytes of available RAM

Num GPUs Available:  0
Num CPUs Available:  1


## Step 2: Loading the data

In [50]:
df = pd.read_csv('./data/demo_v1.csv')

Create a `CoordToColumns` object that specifies which columns hold the controls, KPI, media, media spend, reach, frequency, and RF spend. Then build dictionaries that map each media-related column to its channel name. Both are passed to a `CsvDataLoader`, which reads the CSV file and produces the input data for the modeling workflow.

In [51]:
channels = ["Channel0", "Channel1", "Channel2"]

coord_to_columns = load.CoordToColumns(
    controls=[
        "sentiment_score_control",
        "competitor_activity_score_control"
    ],
    kpi='conversions',
    media=[f"{channel}_impression" for channel in channels],
    media_spend=[f"{channel}_spend" for channel in channels],
    reach=["Channel3_reach"],
    frequency=["Channel3_frequency"],
    rf_spend=["Channel3_spend"],
)

correct_media_to_channel = {
    f"{channel}_impression": channel for channel in channels 
}

correct_media_spend_to_channel = {
    f"{channel}_spend": channel for channel in channels
}

correct_reach_to_channel = {
    "Channel3_reach": "Channel3"
}

correct_frequency_to_channel = {
    "Channel3_frequency": "Channel3"
}

correct_rf_spend_to_channel = {
    "Channel3_spend": "Channel3"
}

loader = load.CsvDataLoader(
    csv_path="./data/demo.csv",
    kpi_type='non_revenue',
    coord_to_columns=coord_to_columns,
    media_to_channel=correct_media_to_channel,
    media_spend_to_channel=correct_media_spend_to_channel,
    reach_to_channel=correct_reach_to_channel,
    frequency_to_channel=correct_frequency_to_channel,
    rf_spend_to_channel=correct_rf_spend_to_channel,
)

data = loader.load()

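
Because the loader depends on every mapped column being present in the CSV, a quick header check before loading can surface naming mistakes early. This is a sketch under stated assumptions: `missing_columns` is a hypothetical helper, and the stand-in header (including `time` and `geo`) is made up for illustration rather than taken from the demo file.

```python
import csv
import io


def missing_columns(csv_file, expected):
    """Return the expected column names that are absent from the CSV header."""
    header = next(csv.reader(csv_file), [])
    present = set(header)
    return [name for name in expected if name not in present]


channels = ["Channel0", "Channel1", "Channel2"]
expected = (
    ["sentiment_score_control", "competitor_activity_score_control", "conversions"]
    + [f"{c}_impression" for c in channels]
    + [f"{c}_spend" for c in channels]
    + ["Channel3_reach", "Channel3_frequency", "Channel3_spend"]
)

# Stand-in header for illustration only. In the notebook you would pass
# open("./data/demo.csv", newline="") instead of this StringIO object.
sample = io.StringIO("time,geo,conversions,Channel0_impression,Channel0_spend\n")
print(missing_columns(sample, expected))
```

An empty list means every column named in `CoordToColumns` was found in the header.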


## Step 3: Configuring the model

Initialize the `Meridian` class by passing the loaded data and the customized model specification.

In [52]:
roi_rf_mu = 0.2  # Mu for ROI prior for each RF channel.
roi_rf_sigma = 0.9  # Sigma for ROI prior for each RF channel.
prior = prior_distribution.PriorDistribution(
    roi_rf=tfp.distributions.LogNormal(
        roi_rf_mu, roi_rf_sigma, name=constants.ROI_RF
    )
)
model_spec = spec.ModelSpec(prior=prior)

mmm = model.Meridian(input_data=data, model_spec=model_spec)
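
To get a feel for what this prior implies, note that the two `LogNormal` parameters describe the underlying normal distribution in log space, not the ROI scale itself. The sketch below uses only the standard library and the textbook log-normal formulas; `lognormal_summary` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def lognormal_summary(mu, sigma, mass=0.90):
    """Median, mean, and central interval of LogNormal(mu, sigma)."""
    z = NormalDist().inv_cdf(0.5 + mass / 2)  # standard-normal quantile
    return {
        "median": math.exp(mu),
        "mean": math.exp(mu + sigma**2 / 2),
        "interval": (math.exp(mu - z * sigma), math.exp(mu + z * sigma)),
    }


# Same mu and sigma as the roi_rf prior configured above.
summary = lognormal_summary(mu=0.2, sigma=0.9)
print(summary)
```

The mean exceeds the median because the log-normal distribution is right-skewed, so a larger `sigma` widens the upper tail much more than the lower one.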

Use the `sample_prior()` and `sample_posterior()` methods to draw samples from the prior and posterior distributions of the model parameters.

In [53]:
%%time
mmm.sample_prior(500)
mmm.sample_posterior(
    n_chains=4, n_adapt=500, n_burnin=500, n_keep=1000, seed=1
)



CPU times: user 11min 10s, sys: 2min 7s, total: 13min 18s
Wall time: 13min 10s
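
The sampling arguments determine how many posterior draws end up available for analysis. The arithmetic below is a sketch that assumes adaptation and burn-in draws are discarded and only `n_keep` draws per chain are retained, which is the usual reading of these argument names.

```python
# Values passed to sample_posterior() in the cell above.
n_chains, n_adapt, n_burnin, n_keep = 4, 500, 500, 1000

kept_draws = n_chains * n_keep
discarded_per_chain = n_adapt + n_burnin
total_steps = n_chains * (discarded_per_chain + n_keep)

print(f"posterior draws kept: {kept_draws}")
print(f"warm-up steps discarded per chain: {discarded_per_chain}")
print(f"total MCMC steps across chains: {total_steps}")
```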


## Step 4: Model diagnosis

Once the model is fitted, the next step is to assess convergence. The following cell plots the R-hat statistic for the model parameters. Values of R-hat close to 1 indicate convergence.

In [58]:
model_diagnostics = visualizer.ModelDiagnostics(mmm)
model_diagnostics.plot_rhat_boxplot()
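
To illustrate what R-hat measures, here is a minimal sketch of the classic (non-split) Gelman-Rubin statistic on synthetic chains. It is not necessarily the exact variant Meridian computes, and `gelman_rubin_rhat` is a hypothetical helper name.

```python
import random
import statistics


def gelman_rubin_rhat(chains):
    """Classic (non-split) Gelman-Rubin R-hat for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(0)
# Four chains sampling the same distribution: R-hat should be near 1.
mixed = [[rng.gauss(0, 1) for _ in range(500)] for _ in range(4)]
# Two of four chains centered on a different value: R-hat well above 1.
stuck = [[rng.gauss(5 * (i % 2), 1) for _ in range(500)] for i in range(4)]

print(round(gelman_rubin_rhat(mixed), 3), round(gelman_rubin_rhat(stuck), 3))
```

The statistic compares between-chain variance with within-chain variance, so it grows when chains disagree about where the posterior mass lies.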

Plot the ROI posterior distribution against the ROI prior distribution for each media channel:

In [57]:
model_diagnostics.plot_prior_and_posterior_distribution()


The following plot compares the model's expected KPI with the actual KPI:

In [59]:
model_fit = visualizer.ModelFit(mmm)
model_fit.plot_model_fit()

In [60]:
model_diagnostics.predictive_accuracy_table()

| | metric | geo_granularity | value |
|---|---|---|---|
| 0 | R_Squared | geo | 0.997398 |
| 1 | R_Squared | national | 0.999493 |
| 2 | MAPE | geo | 0.019263 |
| 3 | MAPE | national | 0.003022 |
| 4 | wMAPE | geo | 0.017713 |
| 5 | wMAPE | national | 0.002885 |
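
For readers unfamiliar with these metrics, the sketch below computes them from their standard textbook definitions. That these match Meridian's internal definitions exactly is an assumption, and the numbers are made up for illustration rather than taken from the demo dataset.

```python
def r_squared(actual, predicted):
    """1 minus the ratio of residual to total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error (as a fraction, not a percent)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def wmape(actual, predicted):
    """Absolute errors weighted by the size of the actual values."""
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return total_error / sum(abs(a) for a in actual)


# Made-up numbers for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(r_squared(actual, predicted), mape(actual, predicted), wmape(actual, predicted))
```

wMAPE is lower than MAPE here because the largest observation is predicted exactly, and large observations carry more weight in wMAPE.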


## Step 5: Model results

To generate the two-page HTML summary output, first initialize the `Summarizer` class with the model object. Then, use the `output_model_results_summary` method, providing the filename, file path, start date, and end date to generate and save the summary to your specified location.

In [61]:
mmm_summarizer = summarizer.Summarizer(mmm)

In [62]:
filepath = './reports/'
start_date = '2021-01-25'
end_date = '2024-01-15'
mmm_summarizer.output_model_results_summary(
    'summary_output.html', filepath, start_date, end_date
)
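
This notebook does not show whether the summarizer creates a missing output directory or validates the date range, so it may be worth doing both beforehand. A minimal sketch using the standard library; `prepare_report_window` is a hypothetical helper name.

```python
from datetime import date
from pathlib import Path


def prepare_report_window(directory, start_date, end_date):
    """Create the output directory and validate the ISO date range."""
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError(f"start_date {start_date} is after end_date {end_date}")
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir, (end - start).days


# Same path and dates as in the cell above.
out_dir, span_days = prepare_report_window("./reports/", "2021-01-25", "2024-01-15")
print(out_dir, span_days)
```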



In [46]:
IPython.display.HTML(filename='./reports/summary_output.html')

| Dataset | R-squared | MAPE | wMAPE |
|---|---|---|---|
| All Data | 1.0 | 0% | 0% |


The `MediaSummary` class is used to generate model results summaries. By default, it produces summary statistics using a 90% credible interval over the entire modeling period.

In [69]:
media_summary = visualizer.MediaSummary(mmm)
media_summary.summary_table()



| | channel | distribution | impressions | % impressions | spend | % spend | cpm | incremental KPI | % contribution | roi | effectiveness | mroi | cpik |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Channel0 | prior | 1310841984 | 38% | 14573955 | 37% | $11 | 1,861,719,040 (339,138,621, 5,283,881,139) | 11.1% (2.0%, 31.4%) | 128 (23, 363) | 1.42 (0.26, 4.03) | 61 (9, 187) | $0.0 ($0.0, $0.0) |
| 1 | Channel0 | posterior | 1310841984 | 38% | 14573955 | 37% | $11 | 635,215,424 (328,041,432, 1,009,432,659) | 5.3% (2.7%, 8.4%) | 44 (23, 69) | 0.48 (0.25, 0.77) | 23 (11, 37) | $0.0 ($0.0, $0.0) |
| 2 | Channel1 | prior | 859961728 | 25% | 9832452 | 25% | $11 | 1,296,113,152 (237,547,667, 3,550,899,802) | 7.7% (1.4%, 21.1%) | 132 (24, 361) | 1.51 (0.28, 4.13) | 62 (9, 177) | $0.0 ($0.0, $0.0) |
| 3 | Channel1 | posterior | 859961728 | 25% | 9832452 | 25% | $11 | 724,375,104 (536,646,813, 958,481,699) | 6.1% (4.5%, 8.0%) | 74 (55, 97) | 0.84 (0.62, 1.11) | 37 (28, 48) | $0.0 ($0.0, $0.0) |
| 4 | Channel2 | prior | 696530432 | 20% | 8318124 | 21% | $12 | 999,141,888 (226,182,054, 2,674,204,685) | 5.9% (1.3%, 15.9%) | 120 (27, 321) | 1.43 (0.32, 3.84) | 58 (9, 153) | $0.0 ($0.0, $0.0) |
| 5 | Channel2 | posterior | 696530432 | 20% | 8318124 | 21% | $12 | 762,009,792 (604,876,886, 944,017,475) | 6.4% (5.1%, 7.9%) | 92 (73, 113) | 1.09 (0.87, 1.36) | 57 (46, 69) | $0.0 ($0.0, $0.0) |
| 6 | Channel3 | prior | 581757760 | 17% | 6841827 | 17% | $12 | 771,788,224 (162,399,949, 2,057,279,533) | 4.6% (1.0%, 12.2%) | 113 (24, 301) | 1.33 (0.28, 3.54) | 113 (24, 301) | $0.0 ($0.0, $0.0) |
| 7 | Channel3 | posterior | 581757760 | 17% | 6841827 | 17% | $12 | 897,701,120 (779,338,240, 1,018,507,462) | 7.5% (6.5%, 8.5%) | 131 (114, 149) | 1.54 (1.34, 1.75) | 131 (114, 149) | $0.0 ($0.0, $0.0) |
| 8 | All Channels | prior | 3449091840 | 100% | 39566360 | 100% | $11 | 4,928,763,904 (2,107,431,040, 10,128,951,654) | 29.3% (12.5%, 60.2%) | 125 (53, 256) | nan (nan, nan) | nan (nan, nan) | $0.0 ($0.0, $0.0) |
| 9 | All Channels | posterior | 3449091840 | 100% | 39566360 | 100% | $11 | 3,019,305,472 (2,596,828,800, 3,478,060,813) | 25.2% (21.7%, 29.1%) | 76 (66, 88) | nan (nan, nan) | nan (nan, nan) | $0.0 ($0.0, $0.0) |
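
Several columns in this table appear to be simple ratios of the others. The sketch below checks that reading against the Channel0 posterior row. The formulas are inferred from the numbers rather than taken from Meridian's documentation, and a ratio of point estimates only approximates the corresponding posterior summary.

```python
# Channel0 posterior row from the summary table above (point estimates).
impressions = 1_310_841_984
spend = 14_573_955
incremental_kpi = 635_215_424

cpm = spend / impressions * 1000  # cost per thousand impressions
roi = incremental_kpi / spend  # incremental KPI per unit of spend
effectiveness = incremental_kpi / impressions  # incremental KPI per impression
cpik = spend / incremental_kpi  # cost per incremental KPI unit

print(f"cpm={cpm:.0f} roi={roi:.0f} effectiveness={effectiveness:.2f} cpik={cpik:.1f}")
```

Under this reading, the `cpik` column shows `$0.0` throughout because the cost per incremental conversion is only a few cents and is displayed with one decimal place.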


## Step 6: Budget optimization and generating report

You can select different scenarios for budget allocation. By default, the library finds the optimal allocation across channels for a specified budget to maximize return on investment (ROI).

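To run the default fixed-budget scenario, create an instance of the `BudgetOptimizer` class and call its `optimize()` method. The cell below passes `use_kpi=True` because the KPI in this example is non-revenue (`conversions`), so the optimization is run on the KPI rather than on revenue.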

In [63]:
%%time
budget_optimizer = optimizer.BudgetOptimizer(mmm)
optimization_results = budget_optimizer.optimize(use_kpi=True)



CPU times: user 4min 4s, sys: 15.2 s, total: 4min 19s
Wall time: 2min 47s
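
To build intuition for a fixed-budget scenario, the toy sketch below splits a budget across channels with diminishing returns by repeatedly giving the next increment to the channel with the largest marginal gain. It is an illustration only and not Meridian's optimization algorithm. The square-root response curves, the coefficients, and the helper name `allocate_fixed_budget` are all hypothetical.

```python
import math


def allocate_fixed_budget(coefficients, budget, step):
    """Greedy grid allocation for toy concave responses a * sqrt(spend)."""
    spend = {name: 0.0 for name in coefficients}

    def gain(name):
        a, s = coefficients[name], spend[name]
        return a * (math.sqrt(s + step) - math.sqrt(s))

    for _ in range(round(budget / step)):
        best = max(spend, key=gain)  # channel with the largest marginal gain
        spend[best] += step
    return spend


# Hypothetical response coefficients; not estimated from the demo model.
toy = {"Channel0": 1.0, "Channel1": 1.5, "Channel2": 2.0}
allocation = allocate_fixed_budget(toy, budget=100.0, step=1.0)
print(allocation)
```

Because every response curve is concave, each extra unit of spend on a channel is worth less than the previous one, which is why the budget ends up spread across channels instead of concentrated in the strongest one.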


Export the HTML optimization report:

In [64]:
filepath = './reports/'
optimization_results.output_optimization_summary(
    'optimization_output.html', filepath
)



In [67]:
IPython.display.HTML(filename='./reports/optimization_output.html')

| Channel | Non-optimized spend | Optimized spend |
|---|---|---|
| Channel2 | 21% | 27% |
| Channel0 | 37% | 26% |
| Channel1 | 25% | 24% |
| Channel3 | 17% | 22% |
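
The share changes can be translated into approximate spend changes. The sketch below assumes the optimization budget equals the total spend in the "All Channels" row of the media summary table (39,566,360). The reported shares are rounded (the optimized ones sum to 99%), so the resulting amounts are rough.

```python
# Spend shares from the optimization report above, in percent:
# channel -> (non-optimized, optimized).
shares = {
    "Channel2": (21, 27),
    "Channel0": (37, 26),
    "Channel1": (25, 24),
    "Channel3": (17, 22),
}
# Assumption: the budget equals total historical spend across all channels.
total_budget = 39_566_360

shifts = {channel: after - before for channel, (before, after) in shares.items()}
for channel, points in shifts.items():
    approx_change = total_budget * points / 100
    print(f"{channel}: {points:+d} pts, about {approx_change:+,.0f}")
```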


## Step 7: Saving the model

Save the model as follows:

In [68]:
file_path = './models/demo_mmm.pkl'
model.save_mmm(mmm, file_path)

To load the model, run the following:

In [None]:
mmm = model.load_mmm(file_path)