## Welcome to Griffin

### Introduction

Griffin MMM is a Media Mix Modeling solution designed to empower marketers with advanced analytics and intelligent insights. As part of an evolving suite of tools, Griffin MMM stands at the forefront of marketing technology, enabling users to optimize their strategies across various channels effectively.

At its core, Griffin MMM is a powerful analytical tool that helps navigate the complex marketing landscape. It provides a robust framework for analyzing the effectiveness of different marketing channels, allowing marketers to make data-driven decisions and maximize their return on investment (ROI).

> 💡 Info: Download docs and demo files in the next section of this notebook.

> 📖 See: `docs/Griffin_documentation.pdf` and `docs/Griffin_Quickstart.md` for detailed information about Griffin.

### What is MMM?

Marketing mix modeling (MMM) is a data-driven econometric analysis that quantifies the incremental impact and ROI of marketing and non-marketing activities on a pre-defined KPI, such as sales or subscriptions. Because it works on aggregated data rather than user-level tracking, it is also privacy-friendly.
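
To make "incremental impact" concrete, here is a minimal sketch using a toy additive model. The channel names, baseline, and coefficients are made up for illustration and have nothing to do with Griffin's fitted values; a real MMM also transforms spend before applying coefficients.

```python
# Toy additive model: KPI = baseline + sum(coefficient * spend) per channel.
# All numbers and channel names are hypothetical.
BASELINE = 1000.0                            # KPI with no marketing at all
COEFFICIENTS = {"tv": 0.8, "search": 1.5}    # KPI units gained per unit of spend


def predict_kpi(spend):
    """Predicted KPI for a dict of channel -> spend."""
    return BASELINE + sum(COEFFICIENTS[ch] * amount for ch, amount in spend.items())


def incremental_impact(spend, channel):
    """KPI attributable to one channel: prediction with it minus prediction without it."""
    without = {ch: (0.0 if ch == channel else amount) for ch, amount in spend.items()}
    return predict_kpi(spend) - predict_kpi(without)


spend = {"tv": 500.0, "search": 200.0}
for channel, amount in spend.items():
    lift = incremental_impact(spend, channel)
    print(f"{channel}: incremental KPI = {lift:.1f}, ROI = {lift / amount:.2f}")
```

The same "with versus without" comparison underlies the contribution and ROI outputs shown later in this notebook.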

## Setup

Install necessary packages

In [3]:
!pip install -q "ipywidgets>=7,<8" # Required for the widgets to work in Colab. Other versions don't work in Colab.
!pip install -q requests
!pip install -q -U git+https://@github.com/griffin-analytics/griffin-mmm-demo.git

## Download demo files

In [None]:
import requests
import time
import os

def get_raw_github_url(url):
    """
    Convert GitHub URL to raw content URL
    """
    raw_url = url.replace('github.com', 'raw.githubusercontent.com')
    raw_url = raw_url.replace('/blob/', '/')
    return raw_url

def download_github_file(url, save_path):
    """
    Download a file from GitHub repository.
    Parameters:
    -----------
    url : str
        GitHub URL of the file
    save_path : str
        Local path where the file should be saved
    """
    try:
        # Ensure the directory exists
        directory = os.path.dirname(save_path)
        if directory and not os.path.exists(directory):
            os.makedirs(directory)
            print(f"Created directory: {directory}")

        # Convert to raw content URL
        raw_url = get_raw_github_url(url)
        print(f"Downloading from: {raw_url}")

        # Add retry mechanism
        max_retries = 3
        retry_delay = 1  # seconds
        
        for attempt in range(max_retries):
            try:
                response = requests.get(raw_url, timeout=10)
                response.raise_for_status()
                
                # Save the file
                with open(save_path, 'wb') as f:
                    f.write(response.content)
                print(f"Successfully downloaded: {save_path}")
                return True
                
            except requests.RequestException as e:
                if attempt < max_retries - 1:
                    print(f"Attempt {attempt + 1} failed. Retrying in {retry_delay} seconds...")
                    time.sleep(retry_delay)
                    retry_delay *= 2  # Exponential backoff
                else:
                    raise  # re-raise the last error so the outer handler reports it
                    
    except Exception as e:
        print(f"Error downloading {url}: {str(e)}")
        return False

files_to_download = [
    ("https://github.com/griffin-analytics/griffin-mmm-demo/blob/main/demo/demo_config.yml", "demo_config.yml"),
    ("https://github.com/griffin-analytics/griffin-mmm-demo/blob/main/demo/demo_data.xlsx", "demo_data.xlsx"),
    ("https://github.com/griffin-analytics/griffin-mmm-demo/blob/main/demo/holidays.xlsx", "holidays.xlsx"),
    ("https://github.com/griffin-analytics/griffin-mmm-demo/blob/main/demo/demo_utils.py", "demo_utils.py"),
    ("https://github.com/griffin-analytics/griffin-mmm-demo/blob/main/demo/budget_optimizer.py", "budget_optimizer.py"),
    ("https://github.com/griffin-analytics/griffin-mmm-demo/blob/main/docs/Configuration_Guide.md", "docs/Configuration_Guide.md"),
    ("https://github.com/griffin-analytics/griffin-mmm-demo/blob/main/docs/ELPD_Explainer.md", "docs/ELPD_Explainer.md"),
    ("https://github.com/griffin-analytics/griffin-mmm-demo/blob/main/docs/Griffin_Quickstart.md", "docs/Griffin_Quickstart.md"),
    ("https://github.com/griffin-analytics/griffin-mmm-demo/blob/main/docs/Griffin_documentation.pdf", "docs/Griffin_documentation.pdf"),
    ("https://github.com/griffin-analytics/griffin-mmm-demo/blob/main/docs/MODEL_COMPARISON.md", "docs/MODEL_COMPARISON.md"),
    ("https://github.com/griffin-analytics/griffin-mmm-demo/blob/main/docs/Output_Guide.md", "docs/Output_Guide.md"),
]

print("Starting downloads...")
for url, save_path in files_to_download:
    download_github_file(url, save_path)
print("Download process completed!")
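
Because `download_github_file` returns `False` on failure instead of raising, a failed download is easy to miss in the output. A minimal sketch for double-checking the result is shown below; `find_missing_files` is a hypothetical helper, not part of Griffin, and the list covers only the files that later cells import or load directly.

```python
import os


def find_missing_files(paths):
    """Return the paths that do not exist or are empty (0 bytes)."""
    return [p for p in paths if not os.path.isfile(p) or os.path.getsize(p) == 0]


# A subset of the save paths used in the download cell
expected = [
    "demo_config.yml",
    "demo_data.xlsx",
    "holidays.xlsx",
    "demo_utils.py",
    "budget_optimizer.py",
]

missing = find_missing_files(expected)
print("All files present." if not missing else f"Missing or empty: {missing}")
```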

## Demo

### 1. Inputs

#### What are the input files?

#### Marketing data - `demo_data.xlsx`
This is an Excel file containing sample marketing data for demonstrating the MMM analysis. It includes:
* Sales and conversion metrics (dependent variables); in this demo, the number of "Subscribers"
* Marketing spend across different channels, masked as media_cost_X
* Impressions (response) for each channel, masked as media_imp_X
* Other relevant marketing KPIs
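
Since each channel appears twice (once as `media_cost_X`, once as `media_imp_X`), it can help to confirm that every spend column has a matching impressions column. The sketch below works on a plain list of column names; the header row is hypothetical, and `pair_media_columns` is an illustrative helper rather than a Griffin function.

```python
def pair_media_columns(columns, cost_prefix="media_cost_", imp_prefix="media_imp_"):
    """Match each spend column with its impressions column by shared suffix."""
    cost_ids = {c[len(cost_prefix):] for c in columns if c.startswith(cost_prefix)}
    imp_ids = {c[len(imp_prefix):] for c in columns if c.startswith(imp_prefix)}
    paired = sorted(cost_ids & imp_ids)       # suffixes present in both groups
    unmatched = sorted(cost_ids ^ imp_ids)    # suffixes present in only one group
    return paired, unmatched


# Hypothetical header row; the real file may contain different columns
columns = ["date", "subscribers", "media_cost_1", "media_imp_1", "media_cost_2"]
paired, unmatched = pair_media_columns(columns)
print("Channels with spend and impressions:", paired)
print("Channels missing one of the two:", unmatched)
```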

#### Config - `demo_config.yml`
This is a configuration file in YAML format that contains settings for the Marketing Mix Model (MMM) demo. It includes:
* Model settings
* Data transformation rules
* Analysis configurations for running the MMM analysis

> 📖 See: "docs/Configuration_Guide.md" file for detailed configuration parameters.

#### Holidays data - `holidays.xlsx`
This Excel file contains holiday data that's used to account for seasonal effects.

<hr>

#### Model configuration guide

Griffin MMM uses a YAML configuration file (`demo_config.yml` in this demo) to specify key settings such as model structure, data paths, and hyperparameters. The configuration file is highly customizable: it controls how the model processes input data, applies Bayesian inference, and makes use of the available marketing data.

📖 See: For detailed explanations of config parameters, please refer to the "docs/Griffin_documentation.pdf" file, page 11.


💡 Info: You can edit the demo_config.yml file manually or using UI widgets. Run the next cell to customize the model configuration for your needs. Don't forget to save your changes by clicking the "Save" button at the bottom of the widget.

🚨  Warning: Make sure you have downloaded `demo_utils.py` and placed it at the same level as the sample_data folder.

In [None]:
# If running in Colab, enable custom widget manager
try:
    from google.colab import output
    output.enable_custom_widget_manager()
except ImportError:  # not running in Colab
    pass

from demo_utils import config_widget

display(config_widget)

<div style="border-top: 3px solid #16a085; margin-top: 30px; margin-bottom: 30px"></div>

### Demo vs Pro version

| Demo Version (Free) | Pro Version (Yearly Subscription) |
|--------------------|------------------------------------|
| ✓ Unlimited access to the demo version | ✓ Full access to all features |
| ✓ Support for up to 4 media channels | ✓ Support for unlimited media channels |
| ✓ All features and visualizations | ✓ Advanced analytics and visualizations |
| ✓ Basic / Community support | ✓ Priority email support |
| | ✓ Quarterly strategy consultation |
| | ✓ Regular software updates |
| **Best for:** Businesses or marketers looking to explore MMM capabilities | **Best for:** Agencies of all sizes and mid-size to large businesses serious about optimizing their marketing mix |

<hr><br>

### 2. Preprocessing

Run the next cell to pass the input files to the model driver. The driver class is used to create and train the Griffin model.

📝 Note: If you created your own config file, make sure to specify its name here instead of `demo_config.yml`.

In [None]:
from base_driver import MMMBaseDriver

driver = MMMBaseDriver(
    "demo_config.yml",
    "demo_data.xlsx",
    "holidays.xlsx")

Set up a separate logger to avoid seeing debug messages during the demo.

In [2]:
from base_driver import utils as ut

ut.setup_logger()
ut.set_style()

### 3. Run model

Run the next cell to start training. The driver performs all variable transformations, such as adstock and value standardization.

Please note that, depending on the number of iterations set in the model config, this may take a long time.

In [None]:
driver.main()
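
For intuition about the transformations mentioned above, here is a minimal sketch of a geometric adstock followed by max scaling. Both are common choices in MMM, but they are assumptions here: the exact adstock form and standardization Griffin applies are defined by the library and your config, and the function names below are illustrative.

```python
def geometric_adstock(spend, decay):
    """Carry a share `decay` of the previous period's effect into the current one."""
    carried, out = 0.0, []
    for x in spend:
        carried = x + decay * carried
        out.append(carried)
    return out


def max_scale(values):
    """Divide by the largest absolute value so the series lies in [-1, 1]."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values] if peak else list(values)


weekly_spend = [100.0, 0.0, 0.0, 50.0]
adstocked = geometric_adstock(weekly_spend, decay=0.5)
print(adstocked)             # [100.0, 50.0, 25.0, 62.5]
print(max_scale(adstocked))  # [1.0, 0.5, 0.25, 0.625]
```

Note how the spend in week 1 keeps contributing in weeks 2 and 3 even though nothing was spent in those weeks.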

The model is now created. The driver saves all outputs into the `results/` folder; take a look at the files inside it.

### Understanding Model Evaluation Metrics in Griffin MMM

### ELPD (Expected Log Pointwise Predictive Density)
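The ELPD is a model selection criterion for Bayesian models. It measures how well a model is expected to predict new data, so when comparing models fitted to the same data, higher is better.

> 📖 See: `docs/ELPD_Explainer.md` for warning signs to watch for when evaluating models, and for best practices.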

In [None]:
driver.compute_elpd(model_name=driver.run_id, results_dir="/results")
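
As a rough illustration of what "pointwise predictive density" means, the sketch below computes the in-sample log pointwise predictive density (lppd) from a matrix of log-likelihood values. This is a simplified stand-in, not what `compute_elpd` does internally: ELPD is normally estimated out of sample (for example with leave-one-out cross-validation or WAIC), and the in-sample lppd is an optimistic version of it. The log-likelihood values are made up.

```python
import math


def log_mean_exp(values):
    """Numerically stable log(mean(exp(values)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


def lppd(log_lik):
    """Sum over observations of the log of the posterior-mean likelihood.

    log_lik[s][i] is the log-likelihood of observation i under posterior draw s.
    """
    n_obs = len(log_lik[0])
    return sum(log_mean_exp([draw[i] for draw in log_lik]) for i in range(n_obs))


# Made-up log-likelihoods: 3 posterior draws x 2 observations
model_a = [[-1.0, -1.2], [-0.9, -1.1], [-1.1, -1.3]]
model_b = [[-2.0, -2.5], [-1.8, -2.2], [-2.1, -2.4]]
print(f"model A: {lppd(model_a):.3f}, model B: {lppd(model_b):.3f}")
```

Model A assigns higher likelihood to every observation, so it gets the higher (less negative) score.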

Save the posterior samples to a NetCDF (`.nc`) file. This creates a `trace.nc` file in the results folder.

In [None]:
driver.save_posterior_samples(results_dir='/results')

Plot the posterior distributions for all media channels. This creates a `media_spend_posterior.png` file. You can specify a different name with the function's `filename_prefix` parameter.

The plot helps to understand both the estimated impact of each channel and the confidence level in those estimates. Wider distributions indicate more uncertainty, while narrower ones suggest more precise estimates. This visualization is particularly useful for comparing the relative effectiveness and reliability of different marketing channels' performance.

In [None]:
# With a custom file name:
# driver.plot_posterior_distributions(results_dir='/content/results', filename_prefix='media_spend_posterior')

driver.plot_posterior_distributions(results_dir='/content/results')

Visualize the MCMC sampling traces for key model parameters to assess model convergence and parameter distributions.

Key Parameters Shown:
- intercept: Base level of the target variable
- likelihood_sigma: Model's uncertainty parameter
- beta_channel: Channel-specific coefficient strengths
- alpha: Shape parameter for saturation curves
- lam: Decay rate parameter
- gamma_control: Coefficients for control variables

You can use the plots to evaluate:
1. Convergence: The trace should show good mixing and no obvious trends or patterns
2. Parameter Distributions: The shape and spread of posterior distributions
3. Chain Agreement: Multiple chains should explore similar regions

In [None]:
model_trace = driver.plot_model_trace()
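
Chain agreement can also be checked numerically. The sketch below computes the basic Gelman-Rubin statistic (R-hat) for one parameter; this diagnostic is not mentioned in the Griffin docs quoted here and is offered only as a common complement to eyeballing trace plots. Values close to 1 suggest the chains agree, while values well above 1 suggest they have not converged. The draws are made up.

```python
from statistics import mean, variance


def r_hat(chains):
    """Basic Gelman-Rubin statistic for equal-length chains of one parameter."""
    n = len(chains[0])
    within = mean([variance(c) for c in chains])          # average within-chain variance
    between = n * variance([mean(c) for c in chains])     # variance between chain means
    pooled = (n - 1) / n * within + between / n           # pooled variance estimate
    return (pooled / within) ** 0.5


# Made-up draws: two chains exploring the same region, two exploring different ones
agree = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]]
disagree = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [11.0, 12.0, 13.0, 14.0, 15.0, 16.0]]
print(f"agreeing chains:    R-hat = {r_hat(agree):.2f}")
print(f"disagreeing chains: R-hat = {r_hat(disagree):.2f}")
```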

Calculate the R^2 score for the model and visualize posterior predictive samples from the model.

Sampling the posterior predictive shows how well the model retrodicts the training data.

In [None]:
r2_score = driver.calculate_train_r_squared()
posterior_predictive = driver.plot_posterior_predictive()
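
For reference, R^2 compares the model's squared errors with those of always predicting the mean. The sketch below uses made-up numbers and a single point prediction per observation; how `calculate_train_r_squared` summarizes the posterior predictive into point predictions is not shown here and may differ.

```python
def r_squared(observed, predicted):
    """1 minus the ratio of residual to total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


observed = [10.0, 12.0, 15.0, 11.0]
predicted = [11.0, 12.0, 14.0, 11.0]
print(f"R^2 = {r_squared(observed, predicted):.3f}")
```

A value of 1 means perfect retrodiction; a value of 0 means the model does no better than the mean of the observed data.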

Visualize the model’s predictions against the observed data. The observed data is plotted as a black line.

HDI stands for highest density interval: the narrowest interval that contains a given share of the posterior probability.

In [None]:
components_contribution = driver.plot_components_contributions()

A waterfall plot visualizes how different channels and other factors contribute to the total KPI (target). The plot starts with the baseline performance and shows the incremental impact of each channel, allowing you to understand which channels drove positive or negative changes in the final outcome.

A waterfall plot helps to identify which channels are most and least effective. It makes it easy to explain ROI to stakeholders and to validate marketing spend decisions.

In [None]:
waterfall_plot = driver.plot_waterfall_components_decomposition()
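
The geometry of a waterfall plot is simple: each bar starts where the previous one ended. The sketch below computes those bar positions from made-up contributions; the labels and numbers are hypothetical and `waterfall_steps` is not a Griffin function.

```python
def waterfall_steps(baseline, contributions):
    """Return (label, start, end) bars; each bar starts where the previous one ended."""
    steps = [("baseline", 0, baseline)]
    running = baseline
    for label, value in contributions.items():
        steps.append((label, running, running + value))
        running += value
    steps.append(("total", 0, running))
    return steps


# Hypothetical contributions; negative values pull the total down
bars = waterfall_steps(600, {"tv": 250, "search": 180, "price_increase": -30})
for label, start, end in bars:
    print(f"{label:>15}: {start} -> {end}")
```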

### 4. Check results. Diagnostics.

📖 See: `docs/Output_Guide.md` contains detailed explanations of the files in the results folder.

Run the next cell to see a summary of the model's fit result.

The table shows the model's fit result, including the intercept, likelihood sigma, beta channel, alpha, and lam variables, along with their median values and 90% highest density intervals (HDI).

In [None]:
from mmm import describe as dsc

quick_stats = dsc.quick_stats(driver.model)
quick_stats
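
To see how a median and a 90% HDI are obtained from posterior draws, here is a minimal sample-based sketch. The draws are made up, and `quick_stats` may compute its intervals differently (for example through a statistics library).

```python
import math
from statistics import median


def hdi(samples, prob=0.90):
    """Narrowest interval that contains a share `prob` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    width = max(1, math.ceil(prob * n))    # number of samples inside the interval
    # Slide a window of `width` samples and keep the one with the smallest span
    spans = [(ordered[i + width - 1] - ordered[i], i) for i in range(n - width + 1)]
    _, start = min(spans)
    return ordered[start], ordered[start + width - 1]


# Made-up posterior draws for one coefficient, with one outlying draw
draws = [0.8, 1.0, 1.1, 1.2, 1.2, 1.3, 1.4, 1.5, 1.9, 3.0]
low, high = hdi(draws, prob=0.90)
print(f"median = {median(draws)}, 90% HDI = [{low}, {high}]")
```

Unlike an equal-tailed interval, the HDI drops whichever tail is sparsest, which is why the outlying draw at 3.0 falls outside it.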

### Saturation curves.

You can see where every week's spend falls on each channel's saturation curve.

In [None]:
from mmm import plot as mplt
weekly_spend_curve = mplt.weekly_spend_by_channel(driver.model)
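
A saturation curve encodes diminishing returns: each additional unit of spend adds less than the one before. The sketch below uses a logistic saturation function as an assumed example; Griffin's actual functional form and parameter names may differ, and `steepness` is a hypothetical parameter.

```python
import math


def logistic_saturation(spend, steepness):
    """Map spend >= 0 to [0, 1): roughly linear at first, then flattening out."""
    decay = math.exp(-steepness * spend)
    return (1 - decay) / (1 + decay)


previous = 0.0
for spend in [0, 2, 4, 6, 8]:
    effect = logistic_saturation(spend, steepness=0.5)
    print(f"spend={spend}: effect={effect:.3f}, gain over previous={effect - previous:.3f}")
    previous = effect
```

Weeks that sit on the flat part of the curve are the ones where extra spend bought little additional effect.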

Print the variance over time for each feature. 

Variances below low variance threshold or above high variance threshold are highlighted in red. 
If this happens, consider possible transformations to your data (such as combining channels).

For national-level data the dataframe contains just one column; for geo-level data it contains one column for each geo.

The "`feature_x`" rows refer to the media channels. Spends must be positive.

Default thresholds used in Griffin:

| Parameter | Value |
|-----------|-------|
| Low Variance Threshold | 0.001 |
| High Variance Threshold | 3.0 |
| Low Spend Threshold | 0.01 |
| High VIF Threshold | 7.0 |

In [None]:
driver.check_quality()

high_var = driver.highlight_variances()
high_var

# default thresholds are 0.001 and 3.0, for low and high variances respectively.
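
The sketch below shows the kind of check the variance thresholds imply. It assumes each series is divided by its mean before the variance is taken; that scaling is an assumption for illustration, and `highlight_variances` may scale the data differently. The series are made up.

```python
from statistics import mean, pvariance

LOW_VARIANCE, HIGH_VARIANCE = 0.001, 3.0    # default thresholds quoted above


def flag_variance(series):
    """Scale a series by its mean, then classify its variance against the thresholds."""
    center = mean(series)
    var = pvariance([v / center for v in series])
    if var < LOW_VARIANCE:
        return var, "too low"
    if var > HIGH_VARIANCE:
        return var, "too high"
    return var, "ok"


examples = {
    "steady": [100.0, 100.5, 99.5, 100.0],      # almost no movement
    "typical": [80.0, 120.0, 100.0, 140.0, 60.0],
    "one_burst": [0.0] * 9 + [1000.0],          # a single spike
}
for name, series in examples.items():
    var, verdict = flag_variance(series)
    print(f"{name}: variance={var:.5f} ({verdict})")
```

A nearly constant channel gives the model nothing to learn from, while a single burst makes the estimate depend on one observation.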

VIF: If the variance inflation factor (VIF) is at or above high VIF threshold, consider merging or dropping features.

In [None]:
high_inf = driver.highlight_high_vif_values()
high_inf

# default threshold is 7.0
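
For a feature j, the VIF is 1 / (1 - R_j^2), where R_j^2 comes from regressing that feature on all the others. With exactly two features, R_j^2 equals their squared Pearson correlation, which allows the dependency-free sketch below. The spend series are made up; the general multi-feature case needs a regression routine and is not shown.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def vif_two_features(x, y):
    """VIF when the model has exactly two features: 1 / (1 - r^2)."""
    r = pearson(x, y)
    return 1.0 / (1.0 - r * r)


tv = [10.0, 20.0, 30.0, 40.0, 50.0]
radio = [12.0, 19.0, 33.0, 38.0, 52.0]      # moves almost in lockstep with tv
social = [2.0, 1.0, 3.0, 1.0, 2.0]          # uncorrelated with tv
print(f"tv vs radio:  VIF = {vif_two_features(tv, radio):.1f}")
print(f"tv vs social: VIF = {vif_two_features(tv, social):.1f}")
```

When two channels are always bought together, the model cannot tell their effects apart, which is why merging or dropping one of them is the suggested remedy.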

The fraction of the total spend in each channel.

In [None]:
high_spf = driver.highlight_low_spend_fractions()
high_spf
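
The spend fraction is each channel's total spend divided by the total across all channels. The sketch below applies the default low spend threshold of 0.01 to made-up totals; the channel names follow the `media_cost_X` pattern of the demo data, but the amounts are hypothetical.

```python
LOW_SPEND_FRACTION = 0.01    # default low spend threshold quoted earlier


def spend_fractions(total_spend):
    """Share of the overall spend that went to each channel."""
    grand_total = sum(total_spend.values())
    return {ch: amount / grand_total for ch, amount in total_spend.items()}


totals = {"media_cost_1": 60000.0, "media_cost_2": 39500.0, "media_cost_3": 500.0}
fractions = spend_fractions(totals)
for channel, fraction in fractions.items():
    flag = "  <- below threshold" if fraction < LOW_SPEND_FRACTION else ""
    print(f"{channel}: {fraction:.3f}{flag}")
```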

Plot **correlation** between media spends

In [None]:
fig, corr_df = driver.plot_correlation()
fig

In [None]:
corr_df

Plot *the media spend* over time.

Take a close look at the changes in the output metric over time. 
Think about whether things outside of marketing may account for big swings, and be mindful of this going into the analysis.

In [None]:
all_mds_fig = driver.plot_all_media_spend()
all_mds_fig

### Model Structure Visualization

Understanding the structure of your Media Mix Model is crucial for interpreting results and ensuring that all components are correctly mapped. The `plot_model_structure()` function generates a visual representation of the model, showing how media channels, external factors, and transformations (e.g., adstock, saturation) contribute to the target metric. This diagram helps validate the setup and provides an intuitive overview of the model's architecture.




In [None]:
mdl_st = driver.plot_model_structure()
mdl_st

<br>
A "Weekly Media and Baseline contribution" plot visualizes how different marketing activities and baseline factors contribute to business performance (like sales or conversions) over time.

This plot helps to understand:

- The relative contribution of each channel week by week
- Seasonal patterns in baseline and media performance
- How marketing activities stack up against organic performance
- Periods of high/low marketing effectiveness
- The overall composition of business drivers

In [None]:
driver.display_image("weekly_media_and_baseline_contribution.png")

A "Weekly media contribution" plot displays how different marketing channels contribute to business performance over time, excluding the baseline effects. 

In [None]:
driver.display_image("weekly_media_contribution.png")

<br>
Examine the processed dataset used for modeling after data validation and preprocessing steps.

In [None]:
driver.per_observation_df

In [None]:
driver.data_to_fit.to_data_frame()

In [None]:
model = driver.model
model.plot_direct_contribution_curves(
    show_fit=True,
    method="sigmoid",
    export_curves=True,  # save curves as numerical outputs
    results_dir='./results'
)

## Budget Optimization.

### What is Budget Optimization? 

MMM budget optimization is the allocation of a marketing budget across various channels using a fitted marketing mix model and historical data.

<br>

**Griffin budget optimization**  performs budget optimization by leveraging a calibrated marketing mix model to maximize the expected contribution to the overall marketing objective (e.g., sales, conversions) given a specified total budget. The process involves aggregating historical spending data at a chosen frequency (e.g., monthly, quarterly), defining dynamic budget bounds based on historical spending patterns, and optimizing budget allocations using a non-linear optimization approach.

> 📝 Notes:
> - The optimization is based on a sigmoid function fitted to each channel's response curve.
> - Budget bounds are dynamically calculated as a percentage increase or decrease from historical averages.
> - The resulting budget allocation is visualized and saved as a PNG file in the results directory.
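
To illustrate the idea in the notes above, here is a minimal sketch that allocates a fixed budget across channels with sigmoid response curves and bounds set as a percentage around historical spend. It is not the code in `budget_optimizer.py`: it swaps the non-linear optimizer for a simple greedy heuristic, and the curve parameters, the 30% bound, and all names are hypothetical.

```python
import math


def response(spend, scale, steepness, midpoint):
    """Sigmoid response curve: contribution as a function of spend."""
    return scale / (1 + math.exp(-steepness * (spend - midpoint)))


def optimize_budget(total, curves, history, max_change=0.3, step=1.0):
    """Greedy heuristic: repeatedly give `step` budget to the best marginal channel."""
    lower = {ch: h * (1 - max_change) for ch, h in history.items()}
    upper = {ch: h * (1 + max_change) for ch, h in history.items()}
    alloc = dict(lower)                      # start every channel at its lower bound
    remaining = total - sum(alloc.values())
    while remaining >= step:
        # Marginal gain of one more step, for channels still below their upper bound
        gains = {
            ch: response(alloc[ch] + step, *curves[ch]) - response(alloc[ch], *curves[ch])
            for ch in alloc
            if alloc[ch] + step <= upper[ch]
        }
        if not gains:
            break                            # every channel is at its upper bound
        best = max(gains, key=gains.get)
        alloc[best] += step
        remaining -= step
    return alloc


history = {"tv": 100.0, "search": 100.0}                          # average historical spend
curves = {"tv": (50.0, 0.05, 80.0), "search": (80.0, 0.05, 120.0)}  # (scale, steepness, midpoint)
allocation = optimize_budget(200.0, curves, history)
for channel, amount in allocation.items():
    print(f"{channel}: {history[channel]:.0f} -> {amount:.0f}")
```

Because sigmoid curves are S-shaped rather than concave, a greedy search can settle on a suboptimal split, which is one reason a proper non-linear optimizer is the better tool for real budgets.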


🚨  Warning: Make sure you have downloaded `budget_optimizer.py` and placed it at the same level as the sample_data folder.


💡 Info: Griffin's budget optimizer code is located in the budget_optimizer.py file. Feel free to examine it to understand its functionality.

<br>

Run the next cell to import the optimizer function.

In [6]:
from budget_optimizer import optimize_marketing_budget

### Run optimization function

This plot compares the initial and optimized budget allocations (left panel) and their corresponding contributions to the target metric (right panel) across media channels. The optimized scenario reallocates resources to maximize efficiency and ROI, highlighting channels with higher potential returns and reducing spend on less effective ones. Use this visualization to guide budget adjustments and improve overall campaign performance.

In [None]:
optimize_marketing_budget(driver.model, driver.processed_data, driver.config, driver.results_dir)

## Export results

Run this cell to download all the result files to your local machine.

In [None]:
import subprocess
from google.colab import files

def create_downloadable_zip(target_folder, included_files=None, excluded_files=None, zip_name='model_files.zip'):
    """
    Creates a zip archive of the specified content and downloads it.

    Args:
        target_folder (str): The folder containing the files to zip.
        included_files (list): Specific files or folders to zip instead of the
            whole target folder. If None, the whole folder is included.
        excluded_files (list): Specific files or folders to exclude.
        zip_name (str): Name of the output zip file.
    """
    # Pass arguments as a list so paths with spaces need no shell quoting
    command = ["zip", "-r", zip_name]

    # Zip only the listed paths if provided, otherwise the whole folder
    command += list(included_files) if included_files else [target_folder]

    # Exclude specific files if provided
    for file in excluded_files or []:
        command += ["-x", f"*/{file}*"]

    # check=True raises an error if zip fails, so a broken archive is never downloaded
    subprocess.run(command, check=True)

    files.download(zip_name)

In [None]:
# Define the folder and files to include/exclude
target_folder = "/content/results"
excluded_files = ["trace.nc", "other_folder", "unnecessary_file"]
zip_name = "model_files.zip"

# Create and download the zip file
create_downloadable_zip(target_folder, excluded_files=excluded_files, zip_name=zip_name)

## Next steps

### This demo shows only the basic features of Griffin. In the Pro version, you can model an unlimited number of media channels.

| Demo Version (Free) | Pro Version (Yearly Subscription) |
|--------------------|------------------------------------|
| ✓ Unlimited access to the demo version | ✓ Full access to all features |
| ✓ Support for up to 4 media channels | ✓ Support for unlimited media channels |
| ✓ All features and visualizations | ✓ Advanced analytics and visualizations |
| ✓ Basic / Community support | ✓ Priority email support |
| | ✓ Quarterly strategy consultation |
| | ✓ Regular software updates |
| **Best for:** Businesses or marketers looking to explore MMM capabilities | **Best for:** Agencies of all sizes and mid-size to large businesses serious about optimizing their marketing mix |

## Support & Resources

Contact

- info@griffin-analytics.com - for any questions.

<hr>

Copyright © 2024 FIXEDPOINT IO LTD, incorporated and registered in England and Wales with company number 13288661 whose registered office is at 20-22 Wenlock Road, London, England, N1 7GU.