<a href="https://colab.research.google.com/github/jackbowley/MMM/blob/main/MeridianTests/Meridian.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>




# **Running test log-linear national models through Meridian**


<a name="install"></a>

## Step 0: Install


1\. Make sure you are using one of the available GPU Colab runtimes, which is **required** to run Meridian. You can change your notebook's runtime under `Runtime > Change runtime type` in the menu. The T4 GPU runtime is available to all users free of charge and is sufficient to run the demo Colab. Users on one of Colab's paid plans have access to premium GPUs (such as the V100, A100, or L4 Nvidia GPUs).


2\. Install the latest version of Meridian, and verify that a GPU is available.


In [None]:
# Install meridian: from PyPI @ latest release
!pip install --upgrade google-meridian[colab,and-cuda]

# Install meridian: from PyPI @ specific version
# !pip install google-meridian[colab,and-cuda]==1.1.1

# Install meridian: from GitHub @HEAD
# !pip install --upgrade "google-meridian[colab,and-cuda] @ git+https://github.com/google/meridian.git@main"

In [None]:
import arviz as az
import IPython
from meridian import constants
from meridian.analysis import analyzer
from meridian.analysis import formatter
from meridian.analysis import optimizer
from meridian.analysis import summarizer
from meridian.analysis import visualizer
from meridian.data import data_frame_input_data_builder as data_builder
from meridian.data import test_utils
from meridian.model import model
from meridian.model import prior_distribution
from meridian.model import spec
import numpy as np
import pandas as pd
from psutil import virtual_memory
import tensorflow as tf
import tensorflow_probability as tfp

# Report the runtime's total RAM and check whether a GPU is available.
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of total RAM\n'.format(ram_gb))
print(
    'Num GPUs Available: ',
    len(tf.config.list_physical_devices('GPU')),
)
print(
    'Num CPUs Available: ',
    len(tf.config.list_physical_devices('CPU')),
)



### Helper functions


In [None]:

from google.colab import drive
import pandas as pd


def import_df_from_drive(filepath):
  """Imports a CSV file from Google Drive into a pandas DataFrame.

  Args:
    filepath: The full path to the CSV file in Google Drive.

  Returns:
    A pandas DataFrame containing the data from the CSV file, or None if an error
    occurs.
  """
  try:
    drive.mount('/content/drive', force_remount=False)  # No forced remount; reuses an existing mount.
    df = pd.read_csv(filepath)
    print(f"DataFrame loaded successfully from {filepath}")
    return df
  except FileNotFoundError:
    print(f"Error: The file was not found at {filepath}")
    return None
  except Exception as e:
    print(f"An error occurred while reading the CSV file: {e}")
    return None

<a name="load-data"></a>

## Step 1: Load the data


1\. Load the data from Google Drive.


In [None]:

# Mount Google Drive.
drive.mount('/content/drive')

# Define the folder and file name of your CSV file in Google Drive.
# Note: `dir` shadows the Python built-in of the same name; it is kept
# because later cells reference it.
dir = '/content/drive/MyDrive/work/MMM/'
filename = 'data.csv'
df_data = import_df_from_drive(dir + filename)
# df_data.head()


In [None]:
df = df_data.copy()
df = df.dropna(subset=['svol_XF'])
df = df.rename(columns={'Date': 'time'})

display(df['time'].dtype)
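
The `dtype` printed above matters because Meridian's documentation describes the time column as `yyyy-mm-dd` strings at a regular interval. The following is a minimal sketch of one way to normalize such a column with pandas. `normalize_time_column` is a hypothetical helper (not part of Meridian), and the toy DataFrame is made up for illustration.

```python
import pandas as pd


def normalize_time_column(df: pd.DataFrame, time_col: str = "time") -> pd.DataFrame:
  """Returns a sorted copy of `df` with `time_col` as 'YYYY-MM-DD' strings."""
  out = df.copy()
  out[time_col] = pd.to_datetime(out[time_col])
  out = out.sort_values(time_col).reset_index(drop=True)
  # Consecutive periods should be evenly spaced (for example, always 7 days).
  gaps = out[time_col].diff().dropna().unique()
  if len(gaps) > 1:
    raise ValueError(f"Irregular spacing in '{time_col}': {sorted(gaps)}")
  out[time_col] = out[time_col].dt.strftime("%Y-%m-%d")
  return out


# Toy weekly data (illustrative values only, deliberately out of order).
toy = pd.DataFrame({
    "time": ["2021-01-12", "2021-01-05", "2021-01-19"],
    "svol_XF": [10.0, 12.0, 11.0],
})
toy_clean = normalize_time_column(toy)
print(toy_clean["time"].tolist())
```

If your CSV stores dates in a day-first or otherwise ambiguous format, pass an explicit `format=` argument to `pd.to_datetime` rather than relying on automatic parsing.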


In [None]:
list(df.columns)

In [None]:
filename = 'var_spec.csv'
df_var_spec = import_df_from_drive(dir+filename)
df_var_spec.head()

2\. Load the data using `DataFrameInputDataBuilder`.


In [None]:
builder = data_builder.DataFrameInputDataBuilder(
    kpi_type='non_revenue',
    default_kpi_column='svol_XF',
    default_revenue_per_kpi_column='Price_SE_XF',
)

controls = ["D_JAN","D_FEB","D_MAR","D_APR","D_MAY","D_SEP","D_OCT","D_NOV","D_DEC",
            "SCHOOL_EASTER","SCHOOL_HT_FEB","SCHOOL_HT_MAY","SCHOOL_HT_OCT","PAYDAY_25",
            "BH_NY","BH_XMAS","DAY_VALENTINE","WW_NAT_DLTA_MAXTEMP","WW_NAT_DLTA_RAIN","WW_NAT_DLTA_SUN","RSI_NFOOD_VOL_SA","Dist_XF",
            "Price_SE_XF","Prom_TFT","POS_FSDU","comp_Lor_Tot","comp_no7_Tot"]

channels = ["m_Wow_TV","m_Wow_OLV","m_Wow_Social","m_Amaze_Tot","m_Celeb_TV","m_Celeb_Outdoor","m_Celeb_Display"]

builder = (
    builder
        .with_kpi(df)
        .with_revenue_per_kpi(df)
        .with_controls(df, control_cols=controls)
)

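# The same columns are passed as both media exposure and media spend, so they
# need to hold spend values for the ROI estimates to be meaningful.
builder = builder.with_media(
    df,
    media_cols=channels,
    media_spend_cols=channels,
    media_channels=channels,
)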

data = builder.build()
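
Because the builder is given long lists of column names, a typo in `controls` or `channels` is easy to make. The sketch below shows one way to check a DataFrame before building. Both helpers are hypothetical (not part of Meridian), and the toy frame and column list stand in for your own `df`, `controls`, and `channels`.

```python
import pandas as pd


def find_missing_columns(df: pd.DataFrame, required: list[str]) -> list[str]:
  """Returns the names in `required` that are not columns of `df`."""
  return [col for col in required if col not in df.columns]


def find_columns_with_nans(df: pd.DataFrame, cols: list[str]) -> list[str]:
  """Returns the names in `cols` that exist in `df` and contain missing values."""
  return [col for col in cols if col in df.columns and df[col].isna().any()]


# Toy frame (illustrative values only).
toy = pd.DataFrame({
    "time": ["2021-01-05", "2021-01-12"],
    "svol_XF": [1.0, 2.0],
    "m_Wow_TV": [0.5, None],
})
required = ["time", "svol_XF", "m_Wow_TV", "m_Wow_OLV"]

missing = find_missing_columns(toy, required)
with_nans = find_columns_with_nans(toy, required)
print("Missing columns:", missing)
print("Columns with NaNs:", with_nans)
```

In this notebook the check would be run on `df` with `controls + channels` plus the KPI and price columns, before any `with_*` call.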


<a name="configure-model"></a>

## Step 2: Configure the model

1\. Initialize the `Meridian` class by passing the loaded data and a model specification. The specification below sets a common `LogNormal` ROI prior for every media channel.


In [None]:
roi_mu = 0.2  # Mu for ROI prior for each media channel.
roi_sigma = 0.9  # Sigma for ROI prior for each media channel.
prior = prior_distribution.PriorDistribution(
    roi_m=tfp.distributions.LogNormal(roi_mu, roi_sigma, name=constants.ROI_M)
)
model_spec = spec.ModelSpec(prior=prior)

mmm = model.Meridian(input_data=data, model_spec=model_spec)
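
To see what the ROI prior above implies, recall that the two `LogNormal` parameters describe the underlying normal distribution, not the ROI scale itself. The following sketch uses only the Python standard library to translate `roi_mu` and `roi_sigma` into a median, a mean, and a central 90% interval. `lognormal_quantile` is a hypothetical helper introduced for illustration.

```python
import math
from statistics import NormalDist

roi_mu = 0.2     # Location of the underlying normal distribution.
roi_sigma = 0.9  # Scale of the underlying normal distribution.


def lognormal_quantile(p: float, mu: float, sigma: float) -> float:
  """Quantile of LogNormal(mu, sigma): exp of the matching normal quantile."""
  return math.exp(NormalDist(mu, sigma).inv_cdf(p))


prior_median = math.exp(roi_mu)
prior_mean = math.exp(roi_mu + roi_sigma**2 / 2)
lower = lognormal_quantile(0.05, roi_mu, roi_sigma)
upper = lognormal_quantile(0.95, roi_mu, roi_sigma)

print(f"median={prior_median:.2f}, mean={prior_mean:.2f}")
print(f"central 90% interval=({lower:.2f}, {upper:.2f})")
```

With these settings the prior median ROI is about 1.22 and the central 90% interval runs from roughly 0.28 to 5.37, which is a fairly wide prior.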

2\. Use the `sample_prior()` and `sample_posterior()` methods to obtain samples from the prior and posterior distributions of the model parameters. If you are using the T4 GPU runtime, this step may take about 10 minutes for the provided data set.


In [None]:
%%time
mmm.sample_prior(500)
mmm.sample_posterior(
    n_chains=10, n_adapt=2000, n_burnin=500, n_keep=1000, seed=0
)

For more information about configuring the parameters and using a customized model specification, such as setting different ROI priors for each media channel, see [Configure the model](https://developers.google.com/meridian/docs/user-guide/configure-model).


<a name="model-diagnostics"></a>

## Step 3: Run model diagnostics


After the model is built, you must assess convergence, debug the model if needed, and then assess the model fit.

1\. Assess convergence. Run the following code to generate R-hat statistics. R-hat values close to 1.0 indicate convergence. An R-hat below 1.2 indicates approximate convergence and is a reasonable threshold for many problems.


In [None]:
model_diagnostics = visualizer.ModelDiagnostics(mmm)
model_diagnostics.plot_rhat_boxplot()
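
For intuition about what the R-hat boxplot summarizes, the sketch below computes the classic Gelman-Rubin statistic for a single parameter with NumPy. It is a simplified illustration, not Meridian's or ArviZ's implementation (which may use split or rank-normalized variants). `basic_rhat` and the simulated chains are assumptions made for the example.

```python
import numpy as np


def basic_rhat(chains: np.ndarray) -> float:
  """Gelman-Rubin R-hat for one parameter; `chains` has shape (n_chains, n_draws)."""
  n_draws = chains.shape[1]
  within = chains.var(axis=1, ddof=1).mean()           # W: mean within-chain variance.
  between = n_draws * chains.mean(axis=1).var(ddof=1)  # B: between-chain variance.
  pooled = (n_draws - 1) / n_draws * within + between / n_draws
  return float(np.sqrt(pooled / within))


rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1000))          # Four chains sampling the same target.
stuck = mixed + np.arange(4).reshape(4, 1)  # Each chain centred on a different value.

rhat_mixed = basic_rhat(mixed)
rhat_stuck = basic_rhat(stuck)
print(f"well-mixed chains: {rhat_mixed:.3f}")
print(f"disagreeing chains: {rhat_stuck:.3f}")
```

Chains that explore the same distribution give a value near 1.0, while chains that settle in different regions push the statistic well above the 1.2 threshold.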

2\. Assess the model's fit by comparing the expected sales against the actual sales.


In [None]:
model_fit = visualizer.ModelFit(mmm)
model_fit.plot_model_fit()
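
Alongside the plot, a few summary numbers can make the comparison of expected and actual sales more concrete. The sketch below computes R-squared, MAPE, and weighted MAPE with NumPy on made-up series. `fit_metrics` is a hypothetical helper, and this is not necessarily how Meridian computes its own accuracy statistics.

```python
import numpy as np


def fit_metrics(actual, expected) -> dict[str, float]:
  """Returns R-squared, MAPE, and weighted MAPE for two equal-length series."""
  actual = np.asarray(actual, dtype=float)
  expected = np.asarray(expected, dtype=float)
  residual = actual - expected
  r_squared = 1.0 - np.sum(residual**2) / np.sum((actual - actual.mean()) ** 2)
  mape = np.mean(np.abs(residual / actual))  # Assumes no zero values in `actual`.
  wmape = np.sum(np.abs(residual)) / np.sum(np.abs(actual))
  return {"r_squared": float(r_squared), "mape": float(mape), "wmape": float(wmape)}


# Illustrative values only.
actual = [100.0, 120.0, 90.0, 110.0]
expected = [98.0, 123.0, 93.0, 108.0]
metrics = fit_metrics(actual, expected)
print(metrics)
```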

For more information and additional model diagnostics checks, see [Modeling diagnostics](https://developers.google.com/meridian/docs/user-guide/model-diagnostics).


<a name="generate-summary"></a>

## Step 4: Generate model results & two-page output


To export the two-page HTML summary output, initialize the `Summarizer` class with the model object. Then pass in the filename, filepath, start date, and end date to `output_model_results_summary` to run the summary for that time duration and save it to the specified file.


In [None]:
mmm_summarizer = summarizer.Summarizer(mmm)

In [None]:
from google.colab import drive

drive.mount('/content/drive')

In [None]:
filepath = '/content/drive/MyDrive'
start_date = '2021-01-05'
end_date = '2024-08-27'
mmm_summarizer.output_model_results_summary(
    'mock_model_summary_output.html', filepath, start_date, end_date
)

Here is a preview of the two-page output:


In [None]:
IPython.display.HTML(filename='/content/drive/MyDrive/mock_model_summary_output.html')

For a customized two-page report, model results summary table, and individual visualizations, see [Model results report](https://developers.google.com/meridian/docs/user-guide/generate-model-results-report) and [plot media visualizations](https://developers.google.com/meridian/docs/user-guide/plot-media-visualizations).


Run the following code to save the model object:


In [None]:
file_path = '/content/drive/MyDrive/saved_mmm.pkl'
model.save_mmm(mmm, file_path)

Run the following code to load the saved model:


In [None]:
mmm = model.load_mmm(file_path)