In [None]:
# Copyright 2021 Google LLC.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# 6. Media experiment design

This notebook demonstrates the design of a media experiment by using the
[Experimental Design](https://github.com/google/gps_building_blocks/tree/master/py/gps_building_blocks/analysis/exp_design)
module to activate the predictions from a propensity model. It is vital to design media campaigns and estimate their impact using valid statistical methods, so that the limited experimentation budget is used effectively and the right expectations are set for the campaign outcome.



## Requirements

* An already scored test dataset, or the model and the test dataset to be scored, available in GCP BigQuery.
* This test dataset should contain all the ML instances for at least one snapshot date.

## Install and import required modules

In [None]:
# Install gps_building_blocks package if not installed
# !pip install gps_building_blocks

In [None]:
import pandas as pd
from gps_building_blocks.analysis.exp_design import ab_testing_design
from gps_building_blocks.cloud.utils import bigquery as bigquery_utils
import numpy as np

## Set parameters

In [None]:
# GCP Project ID
PROJECT_ID = 'project-id'
# BigQuery dataset name
DATASET = 'dataset'
# BigQuery table (name) containing the test dataset to be scored. This test
# dataset should contain all the instances at least for one snapshot date
TEST_DATA_TABLE = 'test_table'
# BigQuery model name
MODEL_NAME = 'propensity_model'
# BigQuery table (name) containing the test predictions dataset (if available)
TEST_DATA_PREDICTIONS_TABLE = 'test_prediction_table'
# Selected snapshot date to select the ML instances (reflecting the instances to
# be scored on a given scoring date) to be used for experiment design in
# YYYY-MM-DD format
SELECTED_SNAPSHOT_DATE = '2021-07-01'
# Name of the actual label column
ACTUAL_LABEL_NAME = 'label'
# Name of the prediction column
PREDICTED_LABEL_NAME = 'predicted_label'

# BigQuery client object
bq_client = bigquery_utils.BigQueryUtils(project_id=PROJECT_ID)

## Score the Test Dataset (if not already scored)


In [None]:
# Prediction SQL query
prediction_query = (
    "SELECT * "
    f"FROM ML.PREDICT(MODEL `{PROJECT_ID}.{DATASET}.{MODEL_NAME}`, "
    f"TABLE `{PROJECT_ID}.{DATASET}.{TEST_DATA_TABLE}`)")

# Run prediction
test_pred_data = bq_client.query(prediction_query).to_dataframe()

# Size of the prediction data frame
print(test_pred_data.shape)

## Read the Prediction Test Dataset (if already scored)

In [None]:
# SQL query to read the already scored predictions
read_query = f"SELECT * FROM `{PROJECT_ID}.{DATASET}.{TEST_DATA_PREDICTIONS_TABLE}`"

# Read the predictions
test_pred_data = bq_client.query(read_query).to_dataframe()

# Size of the prediction data frame
print(test_pred_data.shape)

## Select the Relevant Data for Experiment Design

In [None]:
# Select all the instances for one snapshot date, which resembles the scoring
# dataset for one day. This dataset is used to design the media experiment.
selected_snapshot_data = test_pred_data[
    test_pred_data['snapshot_date'] == SELECTED_SNAPSHOT_DATE]

## Experiment Design I: Different Propensity Groups

One way to use the output from a Propensity Model to optimize marketing is to first define different audience groups based on the predicted probabilities (such as High, Medium and Low propensity groups) and then test the same or different marketing strategies with each group. This strategy is most useful for understanding how different propensity groups respond to remarketing campaigns.

The following step estimates the statistical sample sizes required for different groups (bins) of the predicted probabilities, based on different combinations of the expected minimum uplift/effect size, statistical power and statistical confidence level specified as input parameters.

Expected output: a Pandas DataFrame containing the statistical sample size for each bin for each combination of minimum uplift percentage, statistical power and statistical confidence level.

By comparing the estimated sample sizes with the number of instances available in each bin, one can decide which setting (expected minimum uplift/effect size at a given statistical power and confidence level) to select for the experiment. The selected sample sizes can then be used to set Test and Control cohorts from each propensity group to implement the media experiment.
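
To make the idea of propensity bins concrete, the following is a minimal sketch in plain Python. It assumes equal-sized bins formed by ranking instances on predicted probability, which may differ from how the module forms its bins. The helper name `conversion_rate_by_bin` and the toy data are hypothetical and are not part of `gps_building_blocks`.

```python
def conversion_rate_by_bin(labels, probabilities, number_bins=3):
    """Ranks instances by predicted probability (highest first), splits them
    into equal-sized bins and returns (bin_number, size, conversion_rate)."""
    ranked = sorted(zip(probabilities, labels),
                    key=lambda pair: pair[0], reverse=True)
    total = len(ranked)
    summary = []
    for bin_index in range(number_bins):
        start = bin_index * total // number_bins
        end = (bin_index + 1) * total // number_bins
        bin_labels = [label for _, label in ranked[start:end]]
        # Observed conversion rate acts as the baseline rate of the bin.
        rate = sum(bin_labels) / len(bin_labels) if bin_labels else float('nan')
        summary.append((bin_index + 1, len(bin_labels), rate))
    return summary


# Hypothetical toy data: 9 instances with actual labels and predictions.
toy_probabilities = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0]

for bin_number, size, rate in conversion_rate_by_bin(toy_labels,
                                                     toy_probabilities):
    print(f"bin {bin_number}: size={size}, conversion_rate={rate:.3f}")
```

The baseline conversion rate of each bin is what drives the required sample size: bins with a low baseline rate generally need more instances to detect the same relative uplift.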

In [None]:
ab_testing_design.calc_chisquared_sample_sizes_for_bins(
    labels=selected_snapshot_data[ACTUAL_LABEL_NAME].values,
    probability_predictions=selected_snapshot_data[PREDICTED_LABEL_NAME].values,
    number_bins=3, # to have High, Medium and Low bins
    uplift_percentages=(5, 10, 15), # minimum expected effect sizes
    power_percentages=(80, 90),
    confidence_level_percentages=(90, 95))
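
For intuition about how uplift, power and confidence level interact, the sketch below approximates the sample size per cohort for a single bin using the standard normal approximation for comparing two proportions. This is a swapped-in textbook formula, not the module's chi-squared calculation, so the numbers will not match its output exactly. It also assumes the uplift is relative to the baseline conversion rate and that the test is two-sided. The function name `approx_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def approx_sample_size(baseline_rate, uplift_percentage,
                       power_percentage=80, confidence_level_percentage=95):
    """Approximates the instances needed per cohort to detect a relative
    uplift in conversion rate (two-sided normal approximation)."""
    # Assumption: uplift is relative, e.g. 10% uplift on 0.10 gives 0.11.
    treated_rate = baseline_rate * (1 + uplift_percentage / 100)
    alpha = 1 - confidence_level_percentage / 100
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power_percentage / 100)
    variance = (baseline_rate * (1 - baseline_rate)
                + treated_rate * (1 - treated_rate))
    effect = treated_rate - baseline_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Compare settings for a bin with a 10% baseline conversion rate.
for uplift in (5, 10, 15):
    for power in (80, 90):
        size = approx_sample_size(0.10, uplift, power_percentage=power)
        print(f"uplift={uplift}%, power={power}%: {size} per cohort")
```

Smaller expected uplifts and higher power or confidence levels all increase the required sample size, which is why the available size of each bin limits the settings that are feasible.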

## Experiment Design II: Top Propensity Group

Another way to use the output from a Propensity Model to optimize marketing is to target the top X% of users with the highest predicted probability in a remarketing campaign, or in an acquisition campaign with a similar audience strategy.

The following step estimates the statistical sample sizes required for different cumulative groups (bins) of the predicted probabilities (top X%, top 2X% and so on), based on different combinations of the expected minimum uplift/effect size, statistical power and statistical confidence level specified as input parameters.

Expected output: a Pandas DataFrame containing the statistical sample size for each cumulative bin for each combination of minimum uplift percentage, statistical power and statistical confidence level.

By comparing the estimated sample sizes with the number of instances available, one can decide which setting (which top X% of users, with the expected minimum uplift/effect size at a given statistical power and confidence level) to select for the experiment. The selected sample size can then be used to set Test and Control cohorts from the top X% to implement the media experiment.
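
The difference between cumulative bins and the separate bins of Experiment Design I can be illustrated with a short plain-Python sketch. It assumes cumulative bins are formed by ranking instances on predicted probability and taking the top X%, top 2X% and so on. The helper name `conversion_rate_by_cumulative_bin` and the toy data are hypothetical and do not reproduce the module's implementation.

```python
def conversion_rate_by_cumulative_bin(labels, probabilities, number_bins=10):
    """Returns (top_percentage, size, conversion_rate) for each cumulative
    bin of instances ranked by predicted probability (highest first)."""
    ranked = sorted(zip(probabilities, labels),
                    key=lambda pair: pair[0], reverse=True)
    total = len(ranked)
    summary = []
    for bin_index in range(1, number_bins + 1):
        # Each cumulative bin contains all instances of the previous ones.
        cutoff = bin_index * total // number_bins
        top_labels = [label for _, label in ranked[:cutoff]]
        rate = sum(top_labels) / len(top_labels) if top_labels else float('nan')
        summary.append((100 * bin_index // number_bins, len(top_labels), rate))
    return summary


# Hypothetical toy data: 10 instances with actual labels and predictions.
toy_cumulative_probabilities = [1.0, 0.9, 0.8, 0.7, 0.6,
                                0.5, 0.4, 0.3, 0.2, 0.1]
toy_cumulative_labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

for top_pct, size, rate in conversion_rate_by_cumulative_bin(
        toy_cumulative_labels, toy_cumulative_probabilities, number_bins=5):
    print(f"top {top_pct}%: size={size}, conversion_rate={rate:.2f}")
```

As the targeted share of users grows, the cumulative bin gets larger but its conversion rate typically drops, so choosing X is a trade-off between audience size and expected baseline rate.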

In [None]:
ab_testing_design.calc_chisquared_sample_sizes_for_cumulative_bins(
    labels=selected_snapshot_data[ACTUAL_LABEL_NAME].values,
    probability_predictions=selected_snapshot_data[PREDICTED_LABEL_NAME].values,
    number_bins=10, # top 10%, 20%, ..., 100%
    uplift_percentages=(5, 10, 15), # minimum expected effect sizes
    power_percentages=(80, 90),
    confidence_level_percentages=(90, 95))
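
Once a sample size has been selected, Test and Control cohorts can be drawn at random from the chosen audience. The sketch below shows one possible way to do this with the standard library. It assumes the selected sample size applies to each cohort separately, and the function name `split_test_control` and the user IDs are hypothetical.

```python
import random


def split_test_control(user_ids, sample_size_per_group, seed=42):
    """Randomly draws two non-overlapping cohorts of equal size."""
    user_ids = list(user_ids)
    if 2 * sample_size_per_group > len(user_ids):
        raise ValueError(
            'Not enough users available for the requested sample size.')
    # A fixed seed keeps the assignment reproducible across runs.
    rng = random.Random(seed)
    selected = rng.sample(user_ids, 2 * sample_size_per_group)
    test_cohort = selected[:sample_size_per_group]
    control_cohort = selected[sample_size_per_group:]
    return test_cohort, control_cohort


# Hypothetical audience, e.g. the users in the selected top X%.
hypothetical_user_ids = [f"user_{i}" for i in range(1000)]
test_cohort, control_cohort = split_test_control(hypothetical_user_ids, 200)
print(len(test_cohort), len(control_cohort))
```

Random assignment keeps the two cohorts comparable, so that a difference in conversion rate can be attributed to the media intervention rather than to how the users were selected.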