<center><img src="https://storage.googleapis.com/arize-assets/arize-logo-white.jpg" width="200"/></center>

# Targeted Advertising in the Media and Entertainment Industry
Walk through how to use Arize to evaluate the performance of a targeted advertising model using an example dataset.

In [None]:
# Install and import dependencies

!pip install -q arize
from arize.pandas.logger import Client, Schema
from arize.utils.types import ModelTypes, Environments, Metrics

import pandas as pd
import numpy as np
from datetime import datetime

### 🌐 Upload Data to Arize: Download Data
Here are sample Parquet files containing the <strong>training</strong> and <strong>production</strong> data for a targeted advertising model that estimates click-through from features such as domain, device, and keywords.

In addition to features, these datasets also include relevant metadata such as education, state, gender, and age.

In [None]:
df_production = pd.read_parquet(
    "https://storage.googleapis.com/arize-assets/fixtures/Industry_Use_Case/media_and_entertainment_targeted_advertising_production.parquet",
)
df_training = pd.read_parquet(
    "https://storage.googleapis.com/arize-assets/fixtures/Industry_Use_Case/media_and_entertainment_targeted_advertising_training.parquet",
)

### 🤝 Upload Data to Arize: Create Arize Client
Sign up/login to your Arize account <a href="https://app.arize.com/auth/login">here</a>. Find your <a href="https://docs.arize.com/arize/api-reference/arize.pandas/client">Space and API keys</a>. Copy/paste into the cell below.

In [None]:
SPACE_ID = "SPACE_ID"  # update value here with your Space ID
API_KEY = "API_KEY"  # update value here with your API key

arize_client = Client(space_id=SPACE_ID, api_key=API_KEY)

In [None]:
if SPACE_ID == "SPACE_ID" or API_KEY == "API_KEY":
    raise ValueError("❌ CHANGE SPACE_ID AND/OR API_KEY")
else:
    print(
        "✅ Import and Setup Arize Client Done! Now we can start using Arize!"
    )
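
Hardcoding keys in a notebook makes them easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming the keys are stored in environment variables. The variable names `ARIZE_SPACE_ID` and `ARIZE_API_KEY` and the helper `load_arize_credentials` are hypothetical and not part of the Arize SDK.

```python
import os


def load_arize_credentials(env=None):
    """Return (space_id, api_key) read from environment variables.

    The variable names are an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    space_id = env.get("ARIZE_SPACE_ID", "")
    api_key = env.get("ARIZE_API_KEY", "")
    pairs = (("ARIZE_SPACE_ID", space_id), ("ARIZE_API_KEY", api_key))
    missing = [name for name, value in pairs if not value]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return space_id, api_key


# Example with an explicit mapping instead of the real environment
demo_env = {"ARIZE_SPACE_ID": "my-space", "ARIZE_API_KEY": "my-key"}
space_id, api_key = load_arize_credentials(demo_env)
```

Called with no argument, the helper reads from `os.environ`, and the returned values could then be passed to `Client(space_id=..., api_key=...)` as in the cell above.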

### 📋 Upload Data to Arize: Define Schema
Create your <a href="https://docs.arize.com/arize/sending-data-to-arize/model-schema-reference">model schema</a>. First, we'll define the features, SHAP values, and tags.

In [None]:
features_col_names = [
    "position",
    "domain",
    "category",
    "device",
    "keywords",
]
shap_col_names = [f"{x}_shap" for x in features_col_names]
tag_column_names = [
    "Dependents",
    "Partner",
    "EmploymentStatus",
    "LocationCode",
    "Education",
    "State",
    "Gender",
    "Age",
]
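
Before building a schema, it can help to confirm that every listed column actually exists in the DataFrame. A small sketch follows; `find_missing_columns` is a hypothetical helper, and the column names in the example are made up for illustration rather than read from the Parquet files.

```python
def find_missing_columns(available, *required_groups):
    """Return required column names that are absent from `available`, in order."""
    available = set(available)
    missing = []
    for group in required_groups:
        for name in group:
            if name not in available and name not in missing:
                missing.append(name)
    return missing


# Illustrative names only. In the notebook you would pass df_production.columns
# together with features_col_names, shap_col_names, and tag_column_names.
demo_columns = ["position", "domain", "position_shap", "State"]
demo_features = ["position", "domain"]
demo_shap = [f"{x}_shap" for x in demo_features]
demo_tags = ["State", "Age"]

missing = find_missing_columns(demo_columns, demo_features, demo_shap, demo_tags)
print(missing)  # ['domain_shap', 'Age']
```

Note that the training schema in this tutorial does not reference the SHAP columns, so for `df_training` the SHAP group can be left out of the check.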

### 🪵 Upload Data to Arize: Log Training Data
Define the training schema and log the training data to Arize.

In [None]:
# Define a Schema() object for Arize to pick up data from the correct columns for logging
training_schema = Schema(
    prediction_id_column_name="prediction_ids",
    prediction_label_column_name="prediction_labels",
    prediction_score_column_name="prediction_scores",
    actual_label_column_name="actuals",
    actual_score_column_name="actual_scores",
    feature_column_names=features_col_names,
    tag_column_names=tag_column_names,
)
# Logging Training DataFrame
response = arize_client.log(
    dataframe=df_training,
    model_id="targeted-advertisitng-media-and-entertainment",
    model_version="1.0",
    model_type=ModelTypes.BINARY_CLASSIFICATION,
    environment=Environments.TRAINING,
    schema=training_schema,
)
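
The training cell above does not inspect `response`. A hedged sketch of a check in the same spirit as the one this tutorial applies to the production data; `check_log_response` is a hypothetical helper, and the stand-in objects below only imitate the `status_code` and `text` attributes of a `requests` response.

```python
from types import SimpleNamespace


def check_log_response(response, dataset_name):
    """Return True if the logging call returned HTTP 200, printing a short summary."""
    if response.status_code != 200:
        print(
            f"logging {dataset_name} failed with response code "
            f"{response.status_code}, {response.text}"
        )
        return False
    print(f"Successfully logged the {dataset_name} set to Arize")
    return True


# Stand-in responses for illustration. In the notebook, pass the value
# returned by arize_client.log instead.
ok = check_log_response(SimpleNamespace(status_code=200, text=""), "training")
failed = check_log_response(
    SimpleNamespace(status_code=401, text="unauthorized"), "training"
)
```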

### 🪵 Upload Data to Arize: Log Production Data
Similarly, we will use `arize.pandas.logger` to log the production dataset. First, we need to shift the timestamps so that the most recent prediction aligns with the current date and time. This ensures that the sample data shows up as recent in Arize.
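
To make the shift concrete, here is a toy version on three made-up timestamps (Unix seconds). It sketches the same arithmetic the next cell applies to `prediction_ts`, using a fixed `now_ts` instead of the real clock so the result is reproducible.

```python
import pandas as pd

# Made-up Unix timestamps (seconds), one hour apart
toy = pd.DataFrame(
    {"prediction_ts": [1_600_000_000.0, 1_600_003_600.0, 1_600_007_200.0]}
)

now_ts = 1_700_000_000.0  # fixed stand-in for datetime.now().timestamp()
delta_ts = now_ts - toy["prediction_ts"].max()

# Every row moves forward by the same amount, so the spacing between rows is preserved
shifted = toy["prediction_ts"] + delta_ts

print(shifted.max() == now_ts)           # True
print(shifted.diff().dropna().tolist())  # [3600.0, 3600.0]
```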


In [None]:
# Shift timestamps so the most recent prediction lands at the current time
last_ts = max(df_production['prediction_ts'])
now_ts = datetime.timestamp(datetime.now())
delta_ts = now_ts - last_ts

df_production['prediction_ts'] = (df_production['prediction_ts'] + delta_ts).astype(float)

# Define a Schema() object for Arize to pick up data from the correct columns for logging
production_schema = Schema(
    prediction_id_column_name="prediction_ids",
    timestamp_column_name="prediction_ts",
    prediction_label_column_name="prediction_labels",
    prediction_score_column_name="prediction_scores",
    actual_label_column_name="actuals",
    actual_score_column_name="actual_scores",
    feature_column_names=features_col_names,
    shap_values_column_names=dict(zip(features_col_names, shap_col_names)),
    tag_column_names=tag_column_names,
)

# arize_client.log returns a Response object from the requests library
response = arize_client.log(
    dataframe=df_production,
    model_id="targeted-advertisitng-media-and-entertainment",
    model_version="1.0",
    model_type=ModelTypes.BINARY_CLASSIFICATION,
    environment=Environments.PRODUCTION,
    schema=production_schema,
)

# If successful, the server will return a status_code of 200
if response.status_code != 200:
    print(f"logging failed with response code {response.status_code}, {response.text}")
else:
    print("✅ You have successfully logged the production set to Arize")

## 🏃 Follow 'Success!' Link To Arize
Once you've successfully logged your model to Arize, follow the link to set up monitors, uncover problem areas, and more!

<strong>Note</strong>: It might take a few minutes for all the data to index in Arize. If you don't see all 5000 rows immediately, sit back and relax, the data is on its way!

### 🔍 In Arize: Model Setup
Now that we can see our model data in Arize, let's set up our model with some basic configurations.
* Navigate to the 'Config' tab. Select 'Click' as the positive class and 'Accuracy' as the default metric.
* On the Config page, select 'Configure Baseline' and set the baseline to 'Pre-Production Data'

<img src="https://storage.googleapis.com/arize-assets/fixtures/Industry_Use_Case/media_advertising_setup.png" width="850"/>

### 🔍 In Arize: Monitor Setup

Let's set up some monitors to get alerted when our model deviates from expected behavior.
* Navigate to the 'Monitors' tab and click 'Enable' on the 'Accuracy' card within the Performance Monitors.
* Scroll through the list of other metrics and monitor types and enable a few that seem interesting. We suggest enabling 1 performance, 1 drift, and 1 data quality monitor to get started.

<img src="https://storage.googleapis.com/arize-assets/fixtures/Industry_Use_Case/media_advertising_monitors.png" width="850"/>

### 📈 In Arize: Performance Tracing
Now, let's take a look at the 'Performance Tracing' tab to identify areas to improve and better understand the impact of each feature on our model performance.

* Navigate to the 'Performance Tracing' tab
* Since we uploaded training data, click 'Add Comparison' and select 'Training' in the first drop-down menu
* Zoom in on an area of poor performance in dataset A (production) by clicking and dragging your cursor directly on the graph

<img src="https://storage.googleapis.com/arize-assets/fixtures/Industry_Use_Case/media_advertising_performance.png" width="850"/>

### 📈 In Arize: Performance Tracing

* Scroll down to the performance insights to surface the features that impact your model performance the most. In this case, the top slice is `domain = new_site.com`.
* Click on the feature to uncover a comparative histogram

It looks like we're missing training data for this new site! Click on 'Exclude Slice' to see what happens to our model performance.

<img src="https://storage.googleapis.com/arize-assets/fixtures/Industry_Use_Case/media_advertising_performance_2.png" width="850"/>
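
The effect of 'Exclude Slice' can be approximated offline by recomputing a metric with and without the slice. A sketch on made-up rows: the label values `click` and `no_click` and the domain `known_site.com` are invented for illustration, and accuracy here is plain label agreement, which may differ from how Arize aggregates it.

```python
import pandas as pd

# Made-up rows: four from a familiar domain, four from the new one
toy = pd.DataFrame(
    {
        "domain": ["known_site.com"] * 4 + ["new_site.com"] * 4,
        "prediction_labels": [
            "click", "no_click", "click", "no_click",
            "click", "click", "click", "click",
        ],
        "actuals": [
            "click", "no_click", "click", "click",
            "no_click", "no_click", "click", "no_click",
        ],
    }
)


def accuracy(df):
    """Fraction of rows where the predicted label equals the actual label."""
    return float((df["prediction_labels"] == df["actuals"]).mean())


overall = accuracy(toy)
without_slice = accuracy(toy[toy["domain"] != "new_site.com"])

print(overall)        # 0.5
print(without_slice)  # 0.75
```

In this toy data the new domain drags accuracy down, so removing it raises the metric, which is the pattern the tutorial points to for `new_site.com`.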

### 📈 In Arize: Drift Tracing

Another great way to identify areas to improve within your model is to investigate model drift. Model drift compares your current dataset against a baseline (in this case our training dataset).

* Navigate to the 'Drift Tracing' tab
* Click on the Prediction Drift Over Time graph on an area of high drift
* Scroll down to uncover a distribution comparison and features that impact drift
* Click on Domain, select an area of high drift, and scroll down to view the distribution comparison

Notice anything familiar? The feature slice we uncovered earlier, `new_site.com`, is the culprit behind our model issues! Use this information to target rebuilding and retraining efforts so you don't miss out on an opportunity.

<img src="https://storage.googleapis.com/arize-assets/fixtures/Industry_Use_Case/media_advertising_drift.png" width="850"/>
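
One common way to quantify the shift in a categorical feature such as `domain` is the population stability index (PSI). The sketch below uses only the standard library and made-up domain values. Whether the number matches what Arize displays depends on its binning and smoothing choices, which are not described here, and the `epsilon` floor for unseen categories is an assumption of this sketch.

```python
import math
from collections import Counter


def psi(baseline, current, epsilon=1e-6):
    """Population stability index between two lists of categorical values."""
    base_counts, cur_counts = Counter(baseline), Counter(current)
    total = 0.0
    for category in sorted(set(base_counts) | set(cur_counts)):
        # Floor each proportion at epsilon so unseen categories do not produce log(0)
        p_base = max(base_counts[category] / len(baseline), epsilon)
        p_cur = max(cur_counts[category] / len(current), epsilon)
        total += (p_cur - p_base) * math.log(p_cur / p_base)
    return total


# Made-up domain values: the production sample contains a site absent from training
training_domains = ["a.com"] * 50 + ["b.com"] * 50
production_domains = ["a.com"] * 40 + ["b.com"] * 40 + ["new_site.com"] * 20

print(psi(training_domains, training_domains))           # 0.0
print(psi(training_domains, production_domains) > 0.25)  # True
```

The 0.25 cutoff is a frequently quoted rule of thumb for a large shift, not a value taken from this tutorial. A category that appears only in production, like `new_site.com` here, dominates the score.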

## 🚀 Continue Exploring Arize
This tutorial just scratches the surface of what Arize can do. Continue to explore the world of ML Observability with Arize to monitor, troubleshoot, and fine-tune your models!

<strong>Recommended Resources:</strong>
* [Arize Community Slack](https://join.slack.com/t/arize-ai/shared_invite/zt-1is2wp3xv-SQgwwszCEeS06Sm1q4xFFw)
* [Arize Documentation](https://docs.arize.com/arize/)
* [ML Observability Course](https://courses.arize.com/)