<img src="https://storage.googleapis.com/arize-assets/arize-logo-white.jpg" width="200"/>

# Getting Started with the Arize Platform - Churn Prediction in the Telecom Industry


At a large telecom organization, you are responsible for managing the model that predicts which customers will churn so marketing efforts can be targeted to save their accounts!

In this walkthrough, we are going to investigate your churn model in production. We will validate degradation in model performance, troubleshoot the root cause, and set up proactive monitors to mitigate the impact of future degradations.

We will set up monitors to proactively identify when our churn model is not performing as expected, troubleshoot why we're seeing this deviation in production, and come up with actionable steps to improve the model.

Our steps to resolving this issue will be:

1. Get our model onto the Arize platform to investigate
2. Set up a performance dashboard to look at prediction performance
3. Review a recent instance of model drift
4. Understand where the model is underperforming
5. Discover the root cause of why a slice (grouping) of predictions is underperforming

The production data contains 1 month of data in which 2 main issues exist. You will work on identifying these issues over the course of this exercise.

1. Customers whose `SupportEligibility` has expired
2. A segment of customers for whom the model predicts churn poorly

# Step 0. Setup and Getting the Data

The first step is to load our preexisting dataset which includes training and production environments for our churn predicting model. Using a preexisting dataset illustrates how simple it is to get started with the Arize platform.

## Install Dependencies and Import Libraries 📚


In [None]:
!pip install arize --upgrade -q

import datetime
import uuid
from datetime import timedelta

import numpy as np
import pandas as pd
from arize.pandas.logger import Client, Schema
from arize.utils.types import Environments, ModelTypes

print("✅ Dependencies installed and libraries imported!")

## 🌐 **Download the Data**
In this walkthrough, we’ll be sending real historical data (with privacy-conscious changes to feature names and values). Note that while feature names and values are made explicit in this dataset, you can achieve the same level of ML Observability using obfuscated features.

For this historical evaluation case, the best approach to send data into Arize is via the [Python SDK pandas logger](https://docs.arize.com/arize/api-reference/python-sdk/arize.pandas). Therefore, you will need to have a Pandas dataframe for each dataset/environment. 

We have **2 Environments: training and production**. Training and production are two different datasets that correspond to their respective parts of the training/production pipeline. We download each of them, storing them in a dictionary `datasets` for later use. 

In [None]:
datasets = {}
environments = ["production", "training"]
for environment in environments:

    filepath = (
        "https://storage.googleapis.com/arize-assets/fixtures/Tags-Demo-Data/churn_prediction_"
        + environment
        + ".csv"
    )

    # Create the dataframe and store in dictionary
    datasets[environment] = pd.read_csv(filepath)

print("✅ Data successfully downloaded!")

## Features Description 
**Tenure** - How long the customer has been with our organization

**PhoneService** - Does this customer have a phone plan?

**MultipleLines** - Does this customer have multiple lines?

**InternetSpeed** - How fast is the customer's Internet speed?

**SupportEligibility** - How can this customer receive service for troubleshooting?

**Contract** - What sort of contract terms does this customer have?

**MonthlyStreamingTime** - How much does the customer stream per month?

**PremiumHDStreaming** - Does this customer stream HD content?
	
**Prediction_Labels:** predicted label, churn or not_churn

**Prediction_Scores:** predicted probability of churn

**Actual_Labels:** actual value, churn or not_churn

**Actual_Scores:** 1.0 if churn, 0.0 if not_churn

## Inspect the Data 

Take a quick look at the dataset. The data represents a model designed and trained to evaluate the probability of a customer churning based on features such as contract, internet speed, and support eligibility. The dataset contains about one month of data, and performance will be evaluated by comparing:

*   **PREDICTION**: The probability that a customer will churn, as predicted by the model
*   **ACTUAL**: Churn or not churn based on ground truth collected by your company

In [None]:
datasets["production"].head()

In [None]:
features_column_names = [
    "Tenure",
    "PhoneService",
    "MultipleLines",
    "InternetSpeed",
    "SupportEligibility",
    "Contract",
    "MonthlyStreamingTime",
    "PremiumHDStreaming",
]
shap_column_names = [f"{x}_shap" for x in features_column_names]
tag_column_names = [
    "Gender",
    "Age",
    "Dependents",
    "Partner",
    "EmploymentStatus",
    "LocationCode",
    "Education",
]
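
Before logging, it can help to confirm that every column named in these lists actually exists in the dataframe. The following is a minimal sketch on a toy dataframe; `find_missing_columns` is a hypothetical helper written for illustration and is not part of the Arize SDK, and the toy values are made up.

```python
import pandas as pd


def find_missing_columns(df: pd.DataFrame, expected: list) -> list:
    """Return the expected column names that are absent from df, in order."""
    present = set(df.columns)
    return [name for name in expected if name not in present]


# Toy dataframe standing in for datasets["production"] (values are made up)
toy_df = pd.DataFrame(
    {
        "Tenure": [3, 48],
        "Contract": ["Month-to-month", "Two year"],
        "Tenure_shap": [0.12, -0.30],
    }
)

toy_features = ["Tenure", "Contract"]
toy_shap = [f"{x}_shap" for x in toy_features]

# "Contract_shap" is deliberately left out of toy_df, so it is reported
missing = find_missing_columns(toy_df, toy_features + toy_shap)
print(missing)  # ['Contract_shap']
```

In the notebook itself, the same check could be run against `datasets["production"]` with `features_column_names + shap_column_names + tag_column_names`.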

# Step 1. Sending Data into Arize 💫

Now that we have our dataset imported, we are ready to integrate into Arize. We do this by logging (sending) important data we want to analyze to the platform. There, the data will be easily visualized and troubleshooting workflows will help us find the source of our problem.

For our model, we are going to log:
*   feature data
*   predictions
*   actuals

## Import and Setup Arize Client

The first step is to set up our Arize client. After that we will log the data.

First, use your Arize account credentials to log in. Thereafter, retrieve the Arize `API_KEY` and `SPACE_KEY` from your admin page shown below! Copy those over to the set-up section. We will also be setting up some metadata to use across all logging.




<img src="https://storage.googleapis.com/arize-assets/fixtures/copy-keys.png" width="700">

In [None]:
SPACE_KEY = "SPACE_KEY"
API_KEY = "API_KEY"

arize_client = Client(space_key=SPACE_KEY, api_key=API_KEY)

model_id = (
    "churn-prediction-demo-model"  # This is the model name that will show up in Arize
)

model_version = "v1.0"  # Version of model - can be any string

if SPACE_KEY == "SPACE_KEY" or API_KEY == "API_KEY":
    raise ValueError("❌ NEED TO CHANGE SPACE AND/OR API_KEY")
else:
    print("✅ Arize setup complete!")

## Log Training Data

Now that our Arize client is set up, let's go ahead and log all of our data to the platform. For more details on how **`arize.pandas.logger`** works, visit our documentation page below.

[![Buttons_OpenOrange.png](https://storage.googleapis.com/arize-assets/fixtures/Buttons_OpenOrange.png)](https://docs.arize.com/arize/sdks-and-integrations/python-sdk/arize.pandas)

Key parameters:

*   **prediction_label_column_name**: tells Arize which column contains the predicted labels
*   **prediction_score_column_name**: tells Arize which column contains the prediction scores
*   **actual_label_column_name**: tells Arize which column contains the actual labels from field data
*   **actual_score_column_name**: tells Arize which column contains the actual scores from field data

Given that our model is predicting between categories, we will use [ModelTypes.SCORE_CATEGORICAL](https://docs.arize.com/arize/product-guides-1/models/model-types) to perform this analysis.
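
To make the relationship between the label and score columns concrete, here is a small sketch on made-up data. It assumes a decision threshold of 0.5 and the label strings `churn` / `not_churn`; the threshold actually used to produce the labels in this dataset is not stated, so treat it as an illustration only.

```python
import pandas as pd

# Toy rows using the same column names as the Schema in this notebook
toy = pd.DataFrame(
    {
        "Prediction_Scores": [0.91, 0.12, 0.55],
        "Actual_Labels": ["churn", "not_churn", "not_churn"],
    }
)

THRESHOLD = 0.5  # assumed decision boundary, not taken from the dataset

# A predicted label is the class chosen once the score crosses the threshold
toy["Prediction_Labels"] = (
    toy["Prediction_Scores"].ge(THRESHOLD).map({True: "churn", False: "not_churn"})
)

# The actual score encodes the ground-truth label as 1.0 (churn) or 0.0 (not_churn)
toy["Actual_Scores"] = toy["Actual_Labels"].eq("churn").astype(float)

print(toy)
```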



In [None]:
df_training = datasets["training"]


def simulate_timestamps(X, days=30, add_days=0):
    t = datetime.datetime.now() + timedelta(days=add_days)
    current_ts, earlier_ts = t.timestamp(), (t - timedelta(days=days)).timestamp()
    return pd.Series(np.linspace(earlier_ts, current_ts, num=len(X)), index=X.index)


df_training["prediction_ts"] = simulate_timestamps(df_training)
df_training["prediction_id"] = [str(uuid.uuid4()) for _ in range(df_training.shape[0])]


# Define a Schema() object for Arize to pick up data from the correct columns for logging
training_schema = Schema(
    prediction_id_column_name="prediction_id",
    timestamp_column_name="prediction_ts",
    prediction_label_column_name="Prediction_Labels",
    prediction_score_column_name="Prediction_Scores",
    actual_label_column_name="Actual_Labels",
    actual_score_column_name="Actual_Scores",
    feature_column_names=features_column_names,
    tag_column_names = tag_column_names,
)

# Logging Training DataFrame
training_response = arize_client.log(
    dataframe=df_training,
    model_id=model_id,
    model_version=model_version,  # reuse the version defined during client setup
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.TRAINING,
    schema=training_schema,
)

# If successful, the server will return a status_code of 200
if training_response.status_code != 200:
    print(
        f"logging failed with response code {training_response.status_code}, {training_response.text}"
    )
else:
    print("✅ You have successfully logged training set to Arize")
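
The `simulate_timestamps` helper in the cell above spreads rows evenly across a trailing window so that the historical data looks recent. The sketch below shows the same idea in isolation; `spread_timestamps` and the fixed end date are assumptions introduced here so the result is reproducible.

```python
from datetime import datetime, timedelta, timezone

import numpy as np
import pandas as pd


def spread_timestamps(index, end, days=30):
    """Evenly space one Unix timestamp per row between (end - days) and end."""
    start = end - timedelta(days=days)
    values = np.linspace(start.timestamp(), end.timestamp(), num=len(index))
    return pd.Series(values, index=index)


# Fixed end time (an arbitrary choice) instead of datetime.now()
fixed_end = datetime(2024, 1, 31, tzinfo=timezone.utc)
ts = spread_timestamps(pd.RangeIndex(5), fixed_end, days=30)

# Convert to readable datetimes to inspect the spacing
print(pd.to_datetime(ts, unit="s"))
```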

## Log Production Data 

In [None]:
df_production = datasets["production"]


def simulate_timestamps(X, days=30):
    t = datetime.datetime.now()
    current_ts, earlier_ts = t.timestamp(), (t - timedelta(days=days)).timestamp()
    return pd.Series(np.linspace(earlier_ts, current_ts, num=len(X)), index=X.index)


df_production["prediction_ts"] = simulate_timestamps(df_production)
df_production["prediction_id"] = [
    str(uuid.uuid4()) for _ in range(df_production.shape[0])
]

production_schema = Schema(
    prediction_id_column_name="prediction_id",
    timestamp_column_name="prediction_ts",
    prediction_label_column_name="Prediction_Labels",
    prediction_score_column_name="Prediction_Scores",
    actual_label_column_name="Actual_Labels",
    actual_score_column_name="Actual_Scores",
    feature_column_names=features_column_names,
    shap_values_column_names=dict(zip(features_column_names, shap_column_names)),
    tag_column_names = tag_column_names,
)

production_response = arize_client.log(
    dataframe=df_production,
    model_id=model_id,
    model_version=model_version,
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    schema=production_schema,
)

if production_response.status_code != 200:
    print(
        f"logging failed with response code {production_response.status_code}, {production_response.text}"
    )
else:
    print("✅ You have successfully logged production set to Arize")

# Step 2. Confirm Data in Arize ✅

Note that Arize takes about 10 minutes to index the data. While the model should appear immediately, the data will not show up until the indexing is complete. Feel free to head over to the **Data Ingestion** tab for your model to watch Arize work its magic!🔮

**⚠️ DON'T SKIP:**
In order to move on to the next step, make sure your actuals and training/production sets are loaded into the platform. To check:
1. Navigate to models from the left bar, then locate and click on the model **churn-prediction-demo-model** or whatever you used as the `model_id`
2. On the **Overview Tab**, make sure you can see Predictions and Actuals under the **Model Health** section. Once production actuals have been fully recorded on Arize, the row title will change from **0 Actuals** to **Actuals** with summary statistics such as cardinality listed in the tables.
3. Verify the list of **Features** below **Actuals**.

![image.png](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-confirmingest.gif)

# Step 3. Set up Model Baseline & Managed Monitors

Now that our data has been logged into the [Arize platform](https://app.arize.com/) we can begin our investigation into our poorly performing churn prediction model.

Arize will guide you through setting up a **Baseline** (reference environment for comparison) and automatically create **Monitors** for your model in just a few clicks. Just follow the blue banner at the top of the page titled "Finish setting up your model".

![image.png](https://storage.googleapis.com/arize-assets/fixtures/Click-Through%20Rate%20Use-Case/images/initial_setup_banner.png)

Arize can automatically configure monitors that are best suited to your data. From the banner at the top of the screen, select the following configurations after clicking the 'Set up Model' button: 

1. Datasets: `Training Version 1.0`
2. Default Metric: `Accuracy`, Trigger Alert When: `Accuracy is below .8`, Positive Class: `churn`
3. Turn On Monitoring: Drift ✅, Data Quality ✅, Performance ✅ 

You will now see that the baseline has been set and **Drift**, **Data Quality**, and **Performance** monitors have been created!!! 

![image](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-initalsetup.gif)

These monitors will help your team proactively address performance, drift, or data quality spikes before they grow into larger issues. From the **Monitors** tab you can filter monitors by category, edit evaluation windows and thresholds, and create custom monitors.
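
To illustrate what a performance monitor with an "accuracy below 0.8" trigger is checking, here is a rough offline sketch on made-up data. It computes accuracy per day and flags the days under the threshold; how Arize evaluates its monitors internally (windows, aggregation) is not described here, so this is only an approximation of the idea.

```python
import pandas as pd

ACCURACY_THRESHOLD = 0.8  # same value entered in the setup dialog above

# Made-up predictions for two days
toy = pd.DataFrame(
    {
        "day": ["2024-01-01"] * 5 + ["2024-01-02"] * 5,
        "predicted": [
            "churn", "not_churn", "churn", "not_churn", "not_churn",
            "churn", "churn", "not_churn", "churn", "not_churn",
        ],
        "actual": [
            "churn", "not_churn", "churn", "not_churn", "not_churn",
            "not_churn", "churn", "churn", "churn", "churn",
        ],
    }
)

# Accuracy is the share of rows where the predicted label matches the actual
toy["correct"] = toy["predicted"] == toy["actual"]
daily_accuracy = toy.groupby("day")["correct"].mean()

# Days that would trigger the alert
alerts = daily_accuracy[daily_accuracy < ACCURACY_THRESHOLD]
print(alerts)
```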


# Step 4. Setting up a Custom Monitor

During the initial setup process that we just went through, Arize automatically created global monitors to track performance, drift and data quality but every model needs custom monitors to ensure that all relevant performance is tracked. 

In addition to accuracy, it is important to track other performance metrics like "False Negative Rate" which indicates how many accounts churned without any effort to save them. 

The gif below outlines the process to create a monitor utilizing that metric. There are many more configurations that you can apply, so try applying a few filters, changing the evaluation window, or even changing the metric type itself!

![image](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-setupmonitor.gif)
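
For reference, False Negative Rate is the share of actual churners that the model predicted as not churning: FN / (FN + TP). A minimal sketch on made-up labels, assuming `churn` is the positive class:

```python
def false_negative_rate(actual, predicted, positive="churn"):
    """FN / (FN + TP): the fraction of actual positives the model missed."""
    pairs = list(zip(actual, predicted))
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    if fn + tp == 0:
        return float("nan")  # undefined when there are no actual positives
    return fn / (fn + tp)


# Made-up labels: 4 customers actually churned, the model caught 2 of them
actual = ["churn", "churn", "churn", "churn", "not_churn", "not_churn"]
predicted = ["churn", "not_churn", "not_churn", "churn", "not_churn", "churn"]

fnr = false_negative_rate(actual, predicted)
print(fnr)  # 0.5
```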

# Step 5. Exploring Drift 

Navigate to the **Drift** tab, where model drift is visualized across the selected time period. You will see that the model experienced a spike in drift a few weeks ago, so let's look into it!

![image](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-drifttab.gif)

You can select a period in time on the graph and the prediction distributions below the graph will adjust accordingly. Compared to our baseline, predicted "no_churn" has increased and there is much less predicted churn.

Scroll further down the page and you will find the features of the model listed in order of their prediction drift impact score; a higher score means the feature contributed more to the model drift. Arize calculates this score by weighting feature drift with SHAP values, so you don't need to search for the top features that are causing drift.
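
The exact formula behind the prediction drift impact score is not given in this walkthrough. The sketch below only illustrates the general idea of weighting drift by importance. It assumes population stability index (PSI) as the drift measure and a made-up mean absolute SHAP value as the importance measure; neither choice is confirmed to be what Arize uses.

```python
import numpy as np
import pandas as pd


def psi(baseline: pd.Series, current: pd.Series, eps: float = 1e-6) -> float:
    """Population stability index between two categorical distributions."""
    categories = sorted(set(baseline) | set(current))
    p = baseline.value_counts(normalize=True).reindex(categories, fill_value=0.0)
    q = current.value_counts(normalize=True).reindex(categories, fill_value=0.0)
    # Clip to avoid log(0) when a category is missing on one side
    p = np.clip(p.to_numpy(), eps, None)
    q = np.clip(q.to_numpy(), eps, None)
    return float(np.sum((q - p) * np.log(q / p)))


# Made-up categorical feature values
baseline = pd.Series(["yes"] * 8 + ["no"] * 2)
stable = pd.Series(["yes"] * 8 + ["no"] * 2)
shifted = pd.Series(["yes"] * 4 + ["no"] * 2 + ["expired"] * 4)

drift = {
    "Contract": psi(baseline, stable),
    "SupportEligibility": psi(baseline, shifted),
}
importance = {"Contract": 0.30, "SupportEligibility": 0.20}  # made-up mean |SHAP|

# Illustrative impact score: drift weighted by importance
impact = {name: drift[name] * importance[name] for name in drift}
ranked = sorted(impact, key=impact.get, reverse=True)
print(ranked)
```

Even though the toy `Contract` feature has the higher importance, it ranks last because its distribution did not move.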

When you select the top feature, `SupportEligibility`, you will see a familiar page that visualizes the drift of the feature, with the distribution of the feature's input values below it. If you select a period of time where there was high drift, you will see that a new input value appeared: expired.

This means that many customers' support eligibility expired, and because the model was not trained with this input value, the model drifted.

![image](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-driftnewinput.png)
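
The same kind of finding can be checked offline by comparing the categories seen in training with those seen in production. This is a minimal sketch on made-up values; only `expired` comes from the walkthrough, and the other category names are placeholders.

```python
import pandas as pd

# Made-up stand-ins for the SupportEligibility column in each environment
train_values = pd.Series(["eligible", "eligible", "not_eligible"])
prod_values = pd.Series(["eligible", "expired", "expired", "not_eligible"])

# Categories that appear in production but were never seen in training
unseen = sorted(set(prod_values.dropna()) - set(train_values.dropna()))

# Share of production rows affected by unseen categories
unseen_share = prod_values.isin(unseen).mean()

print(unseen)        # ['expired']
print(unseen_share)  # 0.5
```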




# Step 6. Exploring Performance Degradation

Navigate to the **Performance** tab within Arize. You will see that accuracy (our default performance metric) is plotted over the last 30 days, overlaid on bars that measure the volume of predictions. Our model is doing pretty well, but there have been a few dips in accuracy down to ~60%, so let's look into what could be driving performance down.

If you scroll down, the **Output Segmentation** section includes a confusion matrix, which is useful for our model because it assigns a class to each prediction.

Let's scroll down even further to the **Performance Breakdown by Feature** section, which is very useful for uncovering low-performing cohorts within a feature. Since this model produces SHAP values for every prediction, Arize is able to use those SHAP values to weight performance within each feature to create a **Performance Impact Score**. By sorting by this score instead of just feature importance or min/max performance, Arize surfaces the features that contribute most to decreased performance.

At the top of the **Performance Breakdown by Feature** list is "MonthlyStreamingTime", so let's expand this feature. Now we see a list of the inputs to this feature, which are a few categorical values. When you hover over a bar, Arize displays the volume of predictions that used this input and the performance metric for that cohort.

You can see that 100hrs+ stands out a bit with accuracy at ~72% which is much lower than our global accuracy metric of ~88%. Click on the bar and then select "add cohort as filter". This now applies a conditional filter across the entire data set.

![image](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-streamingtimeperf.png)

Now you will see that InternetSpeed is at the top of the feature list and another low-performing segment, "Low Speed", exposes itself, so let's add that as another filter.

![image](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-internetspeedperf.png)

Once that filter is applied, you will see that the histograms across the board update from yellow/white, which indicates high accuracy, to darker red hues, which indicate low accuracy.

![image](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-finalperf.png)

In summary, we decomposed our model's global accuracy metric of ~88% and identified a segment of users for which our model is worse than a coin flip at predicting churn: customers who stream over 100 hours and have slow internet speeds!
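
The cohort breakdown described in this step can be approximated offline with a pandas `groupby`. The sketch below uses made-up rows; `100hrs+` and `Low Speed` come from the walkthrough, while the other category values and the `predicted` / `actual` column names are placeholders.

```python
import pandas as pd

# Made-up predictions across two features
toy = pd.DataFrame(
    {
        "MonthlyStreamingTime": ["100hrs+"] * 4 + ["Under 100hrs"] * 4,
        "InternetSpeed": [
            "Low Speed", "Low Speed", "High Speed", "High Speed",
            "Low Speed", "High Speed", "High Speed", "Low Speed",
        ],
        "predicted": [
            "churn", "not_churn", "churn", "not_churn",
            "not_churn", "not_churn", "churn", "churn",
        ],
        "actual": [
            "not_churn", "churn", "churn", "churn",
            "not_churn", "not_churn", "churn", "churn",
        ],
    }
)

toy["correct"] = toy["predicted"] == toy["actual"]
global_accuracy = toy["correct"].mean()

# Accuracy per cohort, lowest first
by_cohort = (
    toy.groupby(["MonthlyStreamingTime", "InternetSpeed"])["correct"]
    .mean()
    .sort_values()
)
worst_cohort = by_cohort.index[0]

print(global_accuracy)
print(by_cohort)
```

The global number hides the fact that one cohort is wrong on every row, which is the same pattern the filters in the Performance tab exposed.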


# Step 7. Dashboarding Overview

As we continue to check in and improve our model's performance, we want to be able to quickly and efficiently view all our important model metrics in a single pane. In the same way we set up a **Feature Performance Heatmap** we will create a **Model Performance Dashboard** to view our model's most important metrics in a single configurable layout. 

Navigate to the **Templates** section on the left sidebar and scroll down to click on the **Scored Model**. From there select your model, features you care to investigate, and positive class `Churn`. 

![image](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-creatingdashboard.gif)

In addition to the default widgets Arize sets up for your dashboard, you can configure custom metrics your team cares about. In only a few clicks, I added a few widgets to give me a single-glance view of my model's **Accuracy**, **False Positive Rate**, and **False Negative Rate** as standalone statistics widgets. To visualize these metrics over time, I also created a timeseries widget and overlaid three plots to showcase how my metrics fluctuate.

![image](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-customdashboard.png)


# Step 8. Business Impact
Complementary to traditional statistical measures of model performance, business impact is another way to measure performance and obtain insights. By using Arize's Business Impact tool, you are able to experiment with different thresholds (the decision boundary for a scored model). The Arize platform allows you to enter custom formulas, mapping model performance to your own definition of performance, such as profit or loss. Navigate to the **Business Impact** tab to set up a custom formula used to calculate the business impact of our model's performance on the overall profit/loss of your company.

For example, the diagram below estimates the profit/loss of a decision made by our model.

![image](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-payoffdiagram.png)


Capturing this in our **Business Impact** tab visualizes the profit/loss based on our model's prediction threshold for churn classification.

![image](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-payoffcurve.png)
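
To show how a payoff formula interacts with the classification threshold, here is a small sketch that sweeps a few thresholds over made-up scores. The payoff values per outcome are hypothetical and are not taken from the diagram above.

```python
import numpy as np

# Hypothetical payoff per decision, in arbitrary currency units
PAYOFF = {"tp": 80.0, "fp": -20.0, "fn": -100.0, "tn": 0.0}


def total_payoff(scores, actual_churn, threshold):
    """Sum the payoff of every decision made at the given threshold."""
    predicted = scores >= threshold
    tp = np.sum(predicted & actual_churn)
    fp = np.sum(predicted & ~actual_churn)
    fn = np.sum(~predicted & actual_churn)
    tn = np.sum(~predicted & ~actual_churn)
    return float(
        tp * PAYOFF["tp"] + fp * PAYOFF["fp"] + fn * PAYOFF["fn"] + tn * PAYOFF["tn"]
    )


# Made-up churn scores and ground truth
scores = np.array([0.95, 0.80, 0.60, 0.40, 0.30, 0.10])
actual_churn = np.array([True, True, False, True, False, False])

thresholds = [0.25, 0.5, 0.75]
payoffs = {t: total_payoff(scores, actual_churn, t) for t in thresholds}
best_threshold = max(payoffs, key=payoffs.get)

print(payoffs)         # {0.25: 200.0, 0.5: 40.0, 0.75: 60.0}
print(best_threshold)  # 0.25
```

With these made-up payoffs a missed churner costs far more than an unnecessary retention offer, so the lowest threshold comes out ahead.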

# Step 9. Explainability

This churn model sent SHAP values for each prediction to Arize. Earlier in this notebook, we saw how Arize can utilize feature importance across the platform, but you are also able to explore explainability by itself within the **Explainability** tab.

Within this page, you are able to view the global feature importance across all your predictions as well as apply cohort-based analysis to obtain insights.

![image](https://storage.googleapis.com/arize-assets/tutorials/use-cases/churn/visuals/churn-explainability.gif)
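
A common way to summarize per-prediction SHAP values into a global importance is the mean absolute SHAP value per feature; the same aggregation can be restricted to a cohort. The sketch below uses made-up values and assumes this aggregation for illustration; the exact computation in the Explainability tab may differ.

```python
import pandas as pd

# Made-up SHAP values, following the "<feature>_shap" naming used in this notebook
toy = pd.DataFrame(
    {
        "Contract": ["Month-to-month", "Month-to-month", "Two year", "Two year"],
        "Tenure_shap": [0.10, -0.30, 0.20, -0.40],
        "InternetSpeed_shap": [-0.50, 0.70, 0.05, -0.05],
    }
)
shap_cols = [c for c in toy.columns if c.endswith("_shap")]

# Global importance: mean absolute SHAP value across all predictions
global_importance = toy[shap_cols].abs().mean().sort_values(ascending=False)

# Cohort importance: the same aggregation on a filtered subset
cohort = toy[toy["Contract"] == "Two year"]
cohort_importance = cohort[shap_cols].abs().mean().sort_values(ascending=False)

print(global_importance)
print(cohort_importance)
```

In this toy data the top feature differs between the global view and the cohort view, which is the kind of insight cohort analysis is meant to surface.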

# 📚 Conclusion 
In this walkthrough we've shown how Arize can be used to log prediction data for a model, pinpoint model performance degradations, set up monitors to proactively catch future issues, create dashboards for at a glance model understanding, and calculate business impact metrics through custom formulas. 

We completed the following tasks: 
1. Uploaded data from a model predicting who will churn 
2. Set up a model baseline (Training V1.0) and managed Performance, Data Quality, and Drift monitors
3. Uncovered low-performing segments of our population within the Performance tab: customers who have high `MonthlyStreamingTime` and slow `InternetSpeed`.
4. Identified correlations between our model's degrading performance and individual feature drift and distribution variance. We found that there were many customers whose `SupportEligibility` had expired, which was a new input value that caused the model to drift. It could make sense to retrain the model on this input.
5. Created a model Performance Dashboard to visualize important metrics at a glance with custom timeseries metrics for ongoing analysis 
6. Captured our potential profit/loss based on our model's classification threshold using the Business Impact tab
7. Reviewed global and cohort explainability of our model

Though we covered a lot of ground, this is just scratching the surface of what the Arize platform can do. We urge you to explore more of Arize, either on your own or through one of our many other tutorials.

# About Arize
Arize is an end-to-end ML observability and model monitoring platform. The platform is designed to help ML engineers and data science practitioners surface and fix issues with ML models in production faster with:
- Automated ML monitoring and model monitoring
- Workflows to troubleshoot model performance
- Real-time visualizations for model performance monitoring, data quality monitoring, and drift monitoring
- Model prediction cohort analysis
- Pre-deployment model validation
- Integrated model explainability

### Website
Visit Us At: https://arize.com/model-monitoring/

### Additional Resources
- [What is ML observability?](https://arize.com/what-is-ml-observability/)
- [Playbook to model monitoring in production](https://arize.com/the-playbook-to-monitor-your-models-performance-in-production/)
- [Using statistical distance metrics for ML monitoring and observability](https://arize.com/using-statistical-distance-metrics-for-machine-learning-observability/)
- [ML infrastructure tools for data preparation](https://arize.com/ml-infrastructure-tools-for-data-preparation/)
- [ML infrastructure tools for model building](https://arize.com/ml-infrastructure-tools-for-model-building/)
- [ML infrastructure tools for production](https://arize.com/ml-infrastructure-tools-for-production-part-1/)
- [ML infrastructure tools for model deployment and model serving](https://arize.com/ml-infrastructure-tools-for-production-part-2-model-deployment-and-serving/)
- [ML infrastructure tools for ML monitoring and observability](https://arize.com/ml-infrastructure-tools-ml-observability/)

Visit the [Arize Blog](https://arize.com/blog) and [Resource Center](https://arize.com/resource-hub/) for more resources on ML observability and model monitoring.