<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google/meridian/blob/main/demo/Meridian_Getting_Started.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/google/meridian/blob/main/demo/Meridian_Getting_Started.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

# **wilson_mmm_beattie**

Welcome to the Meridian end-to-end demo. This simplified demo showcases the fundamental functionalities and basic usage of the library, including working examples of the major modeling steps:


<ol start="0">
  <li><a href="#install">Install</a></li>
  <li><a href="#load-data">Load the data</a></li>
  <li><a href="#configure-model">Configure the model</a></li>
  <li><a href="#model-diagnostics">Run model diagnostics</a></li>
  <li><a href="#generate-summary">Generate model results & two-page output</a></li>
  <li><a href="#generate-optimize">Run budget optimization & two-page output</a></li>
  <li><a href="#save-model">Save the model object</a></li>
</ol>


Note that this notebook skips all of the exploratory data analysis and preprocessing steps. It assumes that you have completed these tasks before reaching this point in the demo.

This notebook utilizes sample data. As a result, the numbers and results obtained might not accurately reflect what you encounter when working with a real dataset.

<a name="install"></a>
## Step 0: Install

1\. Make sure you are using one of the available GPU Colab runtimes, which is **required** to run Meridian. You can change your notebook's runtime under `Runtime > Change runtime type` in the menu. All users can use the T4 GPU runtime free of charge, and it is sufficient to run the demo colab. Users who have purchased one of Colab's paid plans have access to premium NVIDIA GPUs (such as the V100, A100, or L4).

2\. Install the latest version of Meridian, and verify that a GPU is available.

In [1]:
# Install meridian: from PyPI @ latest release
!pip install --upgrade google-meridian[colab,and-cuda]

# Install meridian: from PyPI @ specific version
# !pip install google-meridian[colab,and-cuda]==1.0.3

# Install meridian: from GitHub @HEAD
# !pip install --upgrade "google-meridian[colab,and-cuda] @ git+https://github.com/google/meridian.git"

Collecting google-meridian[and-cuda,colab]
  Downloading google_meridian-1.0.8-py3-none-any.whl.metadata (22 kB)
Collecting numpy<3,>=2.0.2 (from google-meridian[and-cuda,colab])
  Using cached numpy-2.0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
Using cached numpy-2.0.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.2 MB)
Downloading google_meridian-1.0.8-py3-none-any.whl (205 kB)
Installing collected packages: numpy, google-meridian
  Attempting uninstall: numpy
    Found existing installation: numpy 2.2.4
    Uninstalling numpy-2.2.4:
      Successfully uninstalled numpy-2.2.4
  Attempting uninstall: google-meridian
    Found existing installation: google-meridian 1.0.7
    Uninstalling google-meridian-1.0.7:
      Successfully uninstalled google-meridian-1.0.7
Successfully installed google-meridian-1.0.8 numpy-2.0.2


In [1]:
import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow_probability as tfp
import arviz as az

import IPython

from meridian import constants
from meridian.data import load
from meridian.data import test_utils
from meridian.model import model
from meridian.model import spec
from meridian.model import prior_distribution
from meridian.analysis import optimizer
from meridian.analysis import analyzer
from meridian.analysis import visualizer
from meridian.analysis import summarizer
from meridian.analysis import formatter

# check if GPU is available
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
print("Num CPUs Available: ", len(tf.config.experimental.list_physical_devices('CPU')))

2025-04-16 05:16:14.337833: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1744780574.376395    3839 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1744780574.385335    3839 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-04-16 05:16:14.418493: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


Your runtime has 16.8 gigabytes of available RAM

Num GPUs Available:  0
Num CPUs Available:  1


2025-04-16 05:16:19.220992: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:152] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)


<a name="load-data"></a>
## Step 1: Load the data

This notebook loads its data from a BigQuery view into a pandas DataFrame, in place of the [simulated dataset in CSV format](https://github.com/google/meridian/blob/main/meridian/data/simulated_data/csv/geo_all_channels.csv) used by the Getting Started demo.

1\. After the data is loaded, map the column names to their corresponding variable types. For example, the column name 'seppromo' is mapped to `controls`. Meridian's variable types include `time`, `controls`, `population`, `kpi`, `revenue_per_kpi`, `media` and `spend`; the mapping in this notebook uses `time`, `kpi`, `media`, `media_spend` and `controls`. If your data includes organic media or non-media treatments, you can add them using the `organic_media` and `non_media_treatments` arguments. For the definition of each variable, see
[Collect and organize your data](https://developers.google.com/meridian/docs/user-guide/collect-data).

In [2]:
# Load data from BigQuery
from google.cloud import bigquery

# Initialize BigQuery client (automatically picks up credentials)
client = bigquery.Client()

# Define the query against the project's BigQuery view
query = "SELECT * FROM `ou-dsa5900.mmm_spring2025.wilson_mmm_view`"

# Run the query and load into pandas dataframe
query_job = client.query(query)
result = query_job.result()
wilsonpdf = result.to_dataframe()

# Convert the date field to a string for Meridian
wilsonpdf['date'] = wilsonpdf['date'].astype(str)
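
Before handing the dates to Meridian, it can help to confirm that the strings are in `yyyy-mm-dd` form and to see how the time points are spaced. The following is a minimal sketch that uses only the standard library; `check_dates` is a hypothetical helper (not part of Meridian), and the assumption that Meridian expects ISO-formatted, regularly spaced dates should be checked against the data-loading documentation.

```python
from datetime import date


def check_dates(date_strings):
    """Return (is_iso, gaps): whether every string parses as an ISO date,
    and the set of distinct gaps (in days) between consecutive unique dates."""
    try:
        parsed = sorted({date.fromisoformat(s) for s in date_strings})
    except ValueError:
        # At least one value is not an ISO-formatted date.
        return False, set()
    gaps = {(later - earlier).days for earlier, later in zip(parsed, parsed[1:])}
    return True, gaps


# Toy weekly dates, for illustration only (not taken from the dataset).
ok, gaps = check_dates(["2023-01-01", "2023-01-08", "2023-01-15"])
print(ok, gaps)  # True {7}
```

In the notebook, this could be called as `check_dates(wilsonpdf['date'].tolist())`; more than one distinct gap suggests irregular spacing that is worth inspecting before modeling.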



In [None]:
wilsonpdf

Unnamed: 0,date,revenue,pmax_cost,search_cost,shopping_cost,youtube_cost,demandgen_cost,meta_cost,seppromo
0,2023-01-12,89087,4299.69,3830.39,39.16,0.00,0.0,2651.71,0
1,2023-01-30,92983,3896.93,4447.19,47.17,0.00,0.0,2658.22,0
2,2023-03-12,110176,5868.78,4329.58,7.10,0.00,0.0,5772.84,0
3,2023-04-02,119599,6230.93,3872.12,0.00,0.00,0.0,5422.50,0
4,2023-04-23,136232,7353.90,5082.00,0.00,0.00,0.0,6415.54,0
...,...,...,...,...,...,...,...,...,...
754,2024-08-30,151341,6962.09,2534.78,0.00,4838.29,0.0,8813.21,0
755,2024-09-29,132765,16847.94,1639.90,0.00,2238.34,0.0,11046.58,0
756,2024-10-20,180922,10329.01,3028.77,0.00,0.00,0.0,13492.35,0
757,2024-10-28,151799,12153.71,2141.86,0.00,0.00,0.0,7653.11,0


In [4]:
# Map dataframe columns to Meridian variables.  Note, we can ignore media impressions by setting media equal to spend.
coord_to_columns = load.CoordToColumns(
    time='date',
    kpi='revenue',
    media_spend=[
        'pmax_cost',
        'search_cost',
        'shopping_cost',
        'youtube_cost',
        'demandgen_cost',
        'meta_cost'
    ],
    media=[
        'pmax_cost',
        'search_cost',
        'shopping_cost',
        'youtube_cost',
        'demandgen_cost',
        'meta_cost'
    ],
#    non_media_treatments=['seppromo'],
    controls=['seppromo'],
)

media_spend_to_channel = {
    'pmax_cost' : 'pmax',
    'search_cost' : 'search',
    'shopping_cost' : 'shopping',
    'youtube_cost' : 'youtube',
    'demandgen_cost' : 'demandgen',
    'meta_cost' : 'meta',
}

media_to_channel = {
    'pmax_cost' : 'pmax',
    'search_cost' : 'search',
    'shopping_cost' : 'shopping',
    'youtube_cost' : 'youtube',
    'demandgen_cost' : 'demandgen',
    'meta_cost' : 'meta',
}
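
Because `media_spend_to_channel` and `media_to_channel` are identical here, both can be derived from a single channel list, and the column names can be checked against the DataFrame before loading. This is a minimal sketch under the assumption that every spend column is named `<channel>_cost`, as in this dataset; `missing_columns` is a hypothetical helper, not a Meridian function.

```python
channels = ["pmax", "search", "shopping", "youtube", "demandgen", "meta"]

# Assumption: every spend column is named "<channel>_cost", as in this dataset.
spend_columns = [f"{channel}_cost" for channel in channels]
column_to_channel = dict(zip(spend_columns, channels))


def missing_columns(available, required):
    """Return the required column names that are absent from `available`."""
    available = set(available)
    return [column for column in required if column not in available]


# Column names copied from the DataFrame preview above.
df_columns = ["date", "revenue", "pmax_cost", "search_cost", "shopping_cost",
              "youtube_cost", "demandgen_cost", "meta_cost", "seppromo"]
required = ["date", "revenue", "seppromo", *spend_columns]
print(missing_columns(df_columns, required))  # []
```

In the notebook, `column_to_channel` could be passed for both mapping arguments, and `df_columns` could be replaced by `list(wilsonpdf.columns)`.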

2\. Load the data using `DataFrameDataLoader`. Note that `df` is the pandas DataFrame that holds the data.

In [5]:
loader = load.DataFrameDataLoader(
    df=wilsonpdf,
    kpi_type='revenue',
    coord_to_columns=coord_to_columns,
    media_spend_to_channel=media_spend_to_channel,
    media_to_channel=media_to_channel,
)
data = loader.load()



Note that the data used here does not contain reach and frequency. We recommend including reach and frequency data whenever they are available. For information about the advantages of utilizing reach and frequency, see [Bayesian Hierarchical Media Mix Model Incorporating Reach and Frequency Data](https://research.google/pubs/bayesian-hierarchical-media-mix-model-incorporating-reach-and-frequency-data/#:~:text=By%20incorporating%20R%26F%20into%20MMM,based%20on%20optimal%20frequency%20recommendations.). For a code snippet that loads reach and frequency data, see [Load geo-level data with reach and frequency](https://developers.google.com/meridian/docs/user-guide/load-geo-data-with-rf).

The documentation provides guidance for instances where reach and frequency data is accessible for specific channels. Additionally, for information about how to load other data types and formats, including data with reach and frequency, see [Supported data types and formats](https://developers.google.com/meridian/docs/user-guide/supported-data-types-formats).

<a name="configure-model"></a>
## Step 2: Configure the model

Meridian uses a Bayesian framework and Markov Chain Monte Carlo (MCMC) algorithms to sample from the posterior distribution.

1\. Initialize the `Meridian` class by passing the loaded data and the customized model specification. One advantage of Meridian lies in its capacity to calibrate the model directly through ROI priors, as described in [Media Mix Model Calibration With Bayesian Priors](https://research.google/pubs/media-mix-model-calibration-with-bayesian-priors/). In this particular example, the custom ROI prior (Lognormal(0.2, 0.9) for every media channel) is commented out, so the ROI priors keep their defaults. The specification instead sets a Normal(0.5, 0.1) prior on the control coefficient `gamma_c` for the September promotion variable and uses 24 knots.
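
To see what such priors imply on the original scale, the sketch below summarizes a Lognormal(0.2, 0.9) distribution and a Normal(0.5, 0.1) distribution (the `gamma_c` prior set in the next cell) using only the standard library. `lognormal_summary` is a hypothetical helper written for this illustration; it does not use TensorFlow Probability or Meridian.

```python
import math
from statistics import NormalDist


def lognormal_summary(mu, sigma):
    """Median, mean, and 5%/95% quantiles of a Lognormal(mu, sigma) distribution."""
    standard_normal = NormalDist()
    return {
        "median": math.exp(mu),
        "mean": math.exp(mu + sigma**2 / 2),
        "q05": math.exp(mu + sigma * standard_normal.inv_cdf(0.05)),
        "q95": math.exp(mu + sigma * standard_normal.inv_cdf(0.95)),
    }


roi_prior = lognormal_summary(0.2, 0.9)
print({name: round(value, 2) for name, value in roi_prior.items()})

# Central 95% interval of the Normal(0.5, 0.1) prior on the control coefficient.
gamma_c_prior = NormalDist(mu=0.5, sigma=0.1)
print(round(gamma_c_prior.inv_cdf(0.025), 3), round(gamma_c_prior.inv_cdf(0.975), 3))
```

The lognormal median is exp(0.2), roughly 1.22, while the mean is pulled higher by the long right tail. The Normal(0.5, 0.1) prior, by comparison, is quite narrow, so it constrains the control coefficient much more tightly.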

In [6]:
#roi_mu = 0.2     # Mu for ROI prior for each media channel.
#roi_sigma = 0.9  # Sigma for ROI prior for each media channel.

# Set a custom prior for the September promotion period
gamma_c_mu = 0.5  # Expected impact (mean) for control variable
gamma_c_sigma = 0.1  # Uncertainty (standard deviation) for control variable
prior = prior_distribution.PriorDistribution(
#    roi_m=tfp.distributions.LogNormal(roi_mu, roi_sigma, name=constants.ROI_M)
    gamma_c=tfp.distributions.Normal(gamma_c_mu, gamma_c_sigma, name="GAMMA_C")
    )

model_spec = spec.ModelSpec(
    prior=prior,
    knots=24
    )

mmm = model.Meridian(input_data=data, model_spec=model_spec)

I0000 00:00:1744780634.487854    3839 service.cc:148] XLA service 0x2d0cdc40 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1744780634.487914    3839 service.cc:156]   StreamExecutor device (0): Host, Default Version
I0000 00:00:1744780634.515833    3839 device_compiler.h:188] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.


2\. Use the `sample_prior()` and `sample_posterior()` methods to obtain samples from the prior and posterior distributions of the model parameters. If you are using the T4 GPU runtime, this step may take about 10 minutes for the provided data set.

In [7]:
%%time
mmm.sample_prior(500)
mmm.sample_posterior(n_chains=7, n_adapt=500, n_burnin=500, n_keep=1000, seed=1)

2025-04-16 05:19:13.600041: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:268] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
W0000 00:00:1744780756.864700    3839 assert_op.cc:38] Ignoring Assert operator mcmc_retry_init/assert_equal_1/Assert/AssertGuard/Assert


CPU times: user 6min 12s, sys: 11 s, total: 6min 23s
Wall time: 6min 22s
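
The arguments passed to `sample_posterior()` above determine how many MCMC steps are run and how many draws are kept. The following arithmetic sketch assumes that adaptation and burn-in draws are discarded and that only `n_keep` draws per chain are retained; `mcmc_budget` is a hypothetical helper, not a Meridian function.

```python
def mcmc_budget(n_chains, n_adapt, n_burnin, n_keep):
    """Summarize the sampling effort implied by the MCMC settings."""
    steps_per_chain = n_adapt + n_burnin + n_keep
    return {
        "steps_per_chain": steps_per_chain,
        "total_steps": n_chains * steps_per_chain,
        "kept_draws": n_chains * n_keep,
        "kept_fraction": n_keep / steps_per_chain,
    }


# Settings copied from the sample_posterior() call above.
print(mcmc_budget(n_chains=7, n_adapt=500, n_burnin=500, n_keep=1000))
```

With these settings, each chain runs 2,000 steps and half of them are kept, which gives 7,000 posterior draws in total under the stated assumption.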


For more information about configuring the parameters and using a customized model specification, such as setting different ROI priors for each media channel, see [Configure the model](https://developers.google.com/meridian/docs/user-guide/configure-model).

<a name="model-diagnostics"></a>
## Step 3: Run model diagnostics

After the model is built, you must assess convergence, debug the model if needed, and then assess the model fit.

1\. Assess convergence. Run the following code to generate R-hat statistics. R-hat values close to 1.0 indicate convergence. An R-hat below 1.2 indicates approximate convergence and is a reasonable threshold for many problems.

In [8]:
model_diagnostics = visualizer.ModelDiagnostics(mmm)
model_diagnostics.plot_rhat_boxplot()
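
As a rough illustration of what the R-hat statistic above measures, the sketch below implements the classic (non-split) Gelman-Rubin formula for a single parameter using only the standard library. It is not Meridian's implementation, which may use a refined variant; `gelman_rubin_rhat` and the toy chains are hypothetical.

```python
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Classic (non-split) potential scale reduction factor for one parameter.

    `chains` is a list of equal-length lists of draws, one list per chain.
    """
    n = len(chains[0])  # draws per chain
    within = mean([variance(chain) for chain in chains])  # W: mean within-chain variance
    between = n * variance([mean(chain) for chain in chains])  # B: between-chain variance
    pooled = (n - 1) / n * within + between / n  # pooled variance estimate
    return (pooled / within) ** 0.5


# Toy chains: the first pair explores the same region, the second pair does not.
agreeing = [[0, 1, 2, 3], [3, 2, 1, 0]]
disagreeing = [[0, 1, 2, 3], [10, 11, 12, 13]]
print(round(gelman_rubin_rhat(agreeing), 3))
print(round(gelman_rubin_rhat(disagreeing), 3))
```

When the chains disagree about where the parameter lives, the between-chain variance dominates and the statistic rises well above the 1.2 threshold.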

2\. Assess the model's fit by comparing the expected sales against the actual sales.

In [9]:
model_fit = visualizer.ModelFit(mmm)
model_fit.plot_model_fit()



In [10]:
import arviz as az

# Assuming `mmm` is your Meridian model object
posterior = mmm.inference_data.posterior

# Calculate the posterior means for all parameters
posterior_means = az.summary(posterior, kind="stats", stat_funcs={"mean": np.mean})["mean"]

print(posterior_means.to_string())


                                    mean    mean
alpha_m[pmax]                      0.429   0.429
alpha_m[search]                    0.086   0.086
alpha_m[shopping]                  0.495   0.495
alpha_m[youtube]                   0.475   0.475
alpha_m[demandgen]                 0.504   0.504
alpha_m[meta]                      0.038   0.038
beta_gm[national_geo, pmax]        0.268   0.268
beta_gm[national_geo, search]      2.814   2.814
beta_gm[national_geo, shopping]    0.028   0.028
beta_gm[national_geo, youtube]     0.110   0.110
beta_gm[national_geo, demandgen]   0.007   0.007
beta_gm[national_geo, meta]        6.747   6.747
beta_m[pmax]                       0.268   0.268
beta_m[search]                     2.814   2.814
beta_m[shopping]                   0.028   0.028
beta_m[youtube]                    0.110   0.110
beta_m[demandgen]                  0.007   0.007
beta_m[meta]                       6.747   6.747
ec_m[pmax]                         1.046   1.046
ec_m[search]        

For more information and additional model diagnostics checks, see [Modeling diagnostics](https://developers.google.com/meridian/docs/user-guide/model-diagnostics).

<a name="generate-summary"></a>
## Step 4: Generate model results & two-page output

To export the two-page HTML summary output, initialize the `Summarizer` class with the model object. Then pass in the filename, filepath, start date, and end date to `output_model_results_summary` to run the summary for that time duration and save it to the specified file.

In [11]:
mmm_summarizer = summarizer.Summarizer(mmm)

In [12]:
filepath = '/home/user'
start_date = '2023-01-01'
end_date = '2025-01-28'
mmm_summarizer.output_model_results_summary('wilson_mmm_seppromo_summary.html', filepath, start_date, end_date)



Here is a preview of the two-page output based on the simulated data:

In [13]:
IPython.display.HTML(filename='/home/user/wilson_mmm_seppromo_summary.html')

Dataset,R-squared,MAPE,wMAPE
All Data,0.38,23%,24%
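
The fit statistics in the table above can be reproduced for any pair of actual and predicted series. The sketch below assumes the usual textbook definitions of R-squared, MAPE, and wMAPE (absolute errors weighted by the actual values); Meridian's exact formulas may differ in detail. `fit_metrics` is a hypothetical helper, and the numbers are made up for illustration.

```python
def fit_metrics(actual, predicted):
    """Return R-squared, MAPE, and wMAPE for two equal-length numeric sequences."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    return {
        "r_squared": 1 - ss_res / ss_tot,
        "mape": sum(e / abs(a) for e, a in zip(abs_err, actual)) / n,
        "wmape": sum(abs_err) / sum(abs(a) for a in actual),
    }


# Hypothetical values, not taken from the model output.
metrics = fit_metrics(actual=[100, 200], predicted=[110, 180])
print({name: round(value, 3) for name, value in metrics.items()})
```

MAPE treats every time period equally, whereas wMAPE gives more weight to periods with larger actual values, which is why the two can differ on real data.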


For a customized two-page report, model results summary table, and individual visualizations, see [Model results report](https://developers.google.com/meridian/docs/user-guide/generate-model-results-report) and [plot media visualizations](https://developers.google.com/meridian/docs/user-guide/plot-media-visualizations).





<a name="generate-optimize"></a>
## Step 5: Run budget optimization & generate an optimization report

You can choose which scenario to run for the budget allocation. In the default scenario, you find the optimal allocation across channels for a given budget to maximize the return on investment (ROI).

1\. Instantiate the `BudgetOptimizer` class and run the `optimize()` method without any customization to run the library's default fixed-budget scenario, which maximizes ROI.

In [34]:
%%time
budget_optimizer = optimizer.BudgetOptimizer(mmm)
optimization_results = budget_optimizer.optimize()



CPU times: user 9min 51s, sys: 1.88 s, total: 9min 53s
Wall time: 5min 25s


2\. Export the 2-page HTML optimization report, which contains optimized spend allocations and ROI.

In [15]:
filepath = '/home/user'
optimization_results.output_optimization_summary('wilson_seppromo_optimization_output.html', filepath)

NameError: name 'optimization_results' is not defined

In [36]:
IPython.display.HTML(filename='/home/user/wilson_seppromo_optimization_output.html')

Channel,Non-optimized spend,Optimized spend
meta,34%,44%
pmax,44%,31%
search,17%,22%
youtube,4%,3%
shopping,1%,1%
demandgen,0%,0%
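
To translate the percentage shares in the table above into spend amounts, the sketch below applies them to a total budget. The budget figure is hypothetical and is not taken from the notebook; `reallocation` is a hypothetical helper. Because the shares are rounded (the optimized column sums to 101%), each column is normalized by its own total.

```python
# (non-optimized %, optimized %) per channel, copied from the table above.
shares = {
    "meta": (34, 44),
    "pmax": (44, 31),
    "search": (17, 22),
    "youtube": (4, 3),
    "shopping": (1, 1),
    "demandgen": (0, 0),
}


def reallocation(total_budget, shares):
    """Convert rounded percentage shares into spend before, spend after, and the change."""
    before_total = sum(before for before, _ in shares.values())
    after_total = sum(after for _, after in shares.values())
    rows = {}
    for channel, (before, after) in shares.items():
        spend_before = total_budget * before / before_total
        spend_after = total_budget * after / after_total
        rows[channel] = (spend_before, spend_after, spend_after - spend_before)
    return rows


budget = 100_000  # hypothetical total budget, for illustration only
for channel, (before, after, delta) in reallocation(budget, shares).items():
    print(f"{channel:<10} {before:>10.2f} {after:>10.2f} {delta:>+10.2f}")
```

Under a fixed budget the changes sum to zero, so the increases for meta and search are funded by the reductions for pmax and youtube.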


For information about customized optimization scenarios, such as flexible budget scenarios, see [Budget optimization scenarios](https://developers.google.com/meridian/docs/user-guide/budget-optimization-scenarios). For more information about optimization results summary and individual visualizations, see [optimization results output](https://developers.google.com/meridian/docs/user-guide/generate-optimization-results-output) and [optimization visualizations](https://developers.google.com/meridian/docs/user-guide/plot-optimization-visualizations).

<a name="save-model"></a>
## Step 6: Save the model object

We recommend that you save the model object for future use. This helps you avoid repetitive model runs and saves time and computational resources. After the model object is saved, you can load it at a later stage to continue the analysis or visualizations without having to re-run the model.


Run the following code to save the model object:

In [14]:
file_path = '/home/user/wilson_seppromo_24knot_saved_mmm.pkl'
model.save_mmm(mmm, file_path)

Run the following code to load the saved model:

In [None]:
mmm = model.load_mmm(file_path)
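
Saving and loading can be combined into a small caching pattern so that the model is fitted only when no saved copy exists. This is a minimal sketch: `load_or_fit` is a hypothetical helper, and the pickle-based functions below are stand-ins for `model.load_mmm` and `model.save_mmm` so that the example runs without Meridian.

```python
import os
import pickle
import tempfile


def load_or_fit(path, load_fn, fit_fn, save_fn):
    """Return the saved object if `path` exists; otherwise fit, save, and return it."""
    if os.path.exists(path):
        return load_fn(path)
    fitted = fit_fn()
    save_fn(fitted, path)
    return fitted


# Stand-ins for model.save_mmm / model.load_mmm, used only for this illustration.
def _save(obj, path):
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def _load(path):
    with open(path, "rb") as f:
        return pickle.load(f)


fit_calls = []


def _fit():
    fit_calls.append("fit")  # record each (expensive) fit
    return {"knots": 24}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    first = load_or_fit(path, _load, _fit, _save)   # fits and saves
    second = load_or_fit(path, _load, _fit, _save)  # loads from disk
print(first == second, len(fit_calls))  # True 1
```

In the notebook, the same helper could be called with `model.load_mmm` and `model.save_mmm`, with `fit_fn` wrapping the model construction and sampling steps from Step 2.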