# Fiddler Simple Monitoring Quick Start Guide

Fiddler is the pioneer in enterprise Model Performance Management (MPM), offering a unified platform that enables Data Science, MLOps, Risk, Compliance, Analytics, and other LOB teams to **monitor, explain, analyze, and improve ML deployments at enterprise scale**.
Obtain contextual insights at any stage of the ML lifecycle, improve predictions, increase transparency and fairness, and optimize business revenue.

---

You can start using Fiddler ***in minutes*** by following these quick steps:

1. Imports
2. Connect to Fiddler
3. Load a data sample
4. Define your model specifications
5. Pick a model task
6. Add your model
7. Set up alerts **(Optional)**
8. Configure a custom metric **(Optional)**
9. Configure a segment **(Optional)**
10. Publish a static baseline **(Optional)**
11. Configure a rolling baseline **(Optional)**
12. Publish production events
13. Get insights

## 1. Imports

In [None]:
!pip install -q fiddler-client

import numpy as np
import pandas as pd
import time
import fiddler as fdl

print(f"Running client version {fdl.__version__}")

## 2. Connect to Fiddler

Before you can add information about your model to Fiddler, you'll need to connect using our API client.


---


**We need a few pieces of information to get started.**
1. The URL you're using to connect to Fiddler
2. Your authorization token

These can be found by navigating to the **Settings** page of your Fiddler environment.

In [None]:
URL = ''  # Make sure to include the full URL (including https://).
TOKEN = ''
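
Because the URL must include the `https://` prefix, it can help to check the value before connecting. The following is a minimal sketch using only the standard library; `is_full_https_url` is a hypothetical helper, not part of the Fiddler client, and the example hosts are placeholders.

```python
from urllib.parse import urlparse


def is_full_https_url(url: str) -> bool:
    """Return True if `url` has an https scheme and a host (hypothetical helper)."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(parsed.netloc)


# Placeholder values for illustration only, not real Fiddler endpoints.
print(is_full_https_url("https://example.fiddler.test"))  # scheme and host present
print(is_full_https_url("example.fiddler.test"))          # scheme missing
```

In the notebook, you could call `is_full_https_url(URL)` right after filling in `URL` above.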

Now just run the following to connect to your Fiddler environment.

In [None]:
fdl.init(
    url=URL,
    token=TOKEN
)

Once you connect, you can create a new project by constructing a `Project` object with a unique name and calling its `create` method.

In [None]:
PROJECT_NAME = 'quickstart_example'

project = fdl.Project(
    name=PROJECT_NAME
)

project.create()

You should now be able to see the newly created project on the UI.

<table>
    <tr>
        <td>
            <img src="https://raw.githubusercontent.com/fiddler-labs/fiddler-examples/main/quickstart/images/simple_monitoring_1.png" />
        </td>
    </tr>
</table>

## 3. Load a data sample

In this example, we'll be considering the case where we're a bank and we have **a model that predicts churn for our customers**.
  
In order to get insights into the model's performance, **Fiddler needs a small sample of data** to learn the schema of incoming data.

In [None]:
PATH_TO_SAMPLE_CSV = 'https://raw.githubusercontent.com/fiddler-labs/fiddler-examples/main/quickstart/data/v3/churn_data_sample.csv'

sample_df = pd.read_csv(PATH_TO_SAMPLE_CSV)
sample_df

## 4. Define your model specifications

To add your model to Fiddler, create a ModelSpec object that describes what each column of your data sample should be used for.

Fiddler supports four column types:
1. **Inputs**
2. **Outputs** (Model predictions)
3. **Targets** (Ground truth values)
4. **Metadata**

In [None]:
model_spec = fdl.ModelSpec(
    inputs=[
        'creditscore',
        'geography',
        'gender',
        'age',
        'tenure',
        'balance',
        'numofproducts',
        'hascrcard',
        'isactivemember',
        'estimatedsalary'
    ],
    outputs=['predicted_churn'],
    targets=['churn'],
    metadata=['customer_id', 'timestamp']
)
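
Every column named in the spec should exist in the data sample. The sketch below checks this with plain lists; `missing_columns` is a hypothetical helper, and the sample header shown is illustrative (it deliberately omits one column) rather than the real CSV header.

```python
def missing_columns(spec_columns, sample_columns):
    """Return the spec columns that are absent from the sample, in spec order."""
    available = set(sample_columns)
    return [col for col in spec_columns if col not in available]


# Columns referenced by the spec above (inputs, outputs, targets, metadata).
spec_columns = [
    "creditscore", "geography", "gender", "age", "tenure", "balance",
    "numofproducts", "hascrcard", "isactivemember", "estimatedsalary",
    "predicted_churn", "churn", "customer_id", "timestamp",
]

# Illustrative header with 'timestamp' left out on purpose.
sample_columns = [c for c in spec_columns if c != "timestamp"]

print(missing_columns(spec_columns, sample_columns))
```

In the notebook, passing `list(sample_df.columns)` as the second argument would run the same check against the loaded sample.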

If you have columns in your ModelSpec which denote **prediction IDs or timestamps**, then Fiddler can use these to power its analytics accordingly.

Let's call them out here.

In [None]:
id_column = 'customer_id'
timestamp_column = 'timestamp'

## 5. Pick a model task

Fiddler supports a variety of model tasks. In this case, we're adding a binary classification model.

For this, we'll create a ModelTask object and an additional ModelTaskParams object to specify the ordering of our positive and negative labels.

*For a detailed breakdown of all supported model tasks, see the Fiddler documentation.*

In [None]:
model_task = fdl.ModelTask.BINARY_CLASSIFICATION

task_params = fdl.ModelTaskParams(target_class_order=['no', 'yes'])
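
The labels in the target column presumably need to match the entries in `target_class_order` exactly. The following sketch flags any values that do not; `unexpected_labels` is a hypothetical helper and the target values are made up for illustration.

```python
def unexpected_labels(target_values, class_order):
    """Return the sorted set of target values not listed in class_order."""
    allowed = set(class_order)
    return sorted({value for value in target_values if value not in allowed})


target_class_order = ["no", "yes"]

# Made-up target values; 'Yes' differs from 'yes' by case and is flagged.
sample_targets = ["no", "yes", "no", "Yes"]

print(unexpected_labels(sample_targets, target_class_order))
```

In the notebook, `sample_df['churn'].unique()` could be passed as the first argument.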

## 6. Add your model

Create a Model object and publish it to Fiddler, passing in:
1. Your data sample
2. Your ModelSpec object
3. Your ModelTask and ModelTaskParams objects
4. Your ID and timestamp columns

In [None]:
MODEL_NAME = 'my_model'

model = fdl.Model.from_data(
    name=MODEL_NAME,
    project_id=project.id,
    source=sample_df,
    spec=model_spec,
    task=model_task,
    task_params=task_params,
    event_id_col=id_column,
    event_ts_col=timestamp_column
)

model.create()

On the project page, you should now be able to see the newly onboarded model with its model schema.

<table>
    <tr>
        <td>
            <img src="./simple_monitoring_3.png" />
        </td>
    </tr>
</table>

<table>
    <tr>
        <td>
            <img src="./simple_monitoring_4.png" />
        </td>
    </tr>
</table>

## 7. Set up alerts (Optional)

Fiddler lets you create alert rules that fire when your data or model predictions deviate from their expected behavior.

The rules can compare metrics to **absolute** or **relative** values.

Please refer to [our documentation](https://docs.fiddler.ai/docs/alerts) for more information on Alert Rules.

---
  
Let's set up a few alert rules.

The following API call sets up a Data Integrity rule that triggers an email notification when published events have more than 2 range violations in any 1-day bin for the `numofproducts` column (more than 3 raises the alert to critical).

In [None]:
alert_rule_1 = fdl.AlertRule(
    name="Bank Churn Range Violation Alert",
    model_id=model.id,
    metric_id="range_violation_count",
    bin_size=fdl.BinSize.DAY,
    compare_to=fdl.CompareTo.RAW_VALUE,
    priority=fdl.Priority.HIGH,
    warning_threshold=2,
    critical_threshold=3,
    condition=fdl.AlertCondition.GREATER,
    columns=["numofproducts"],
)

alert_rule_1.create()

alert_rule_1.set_notification_config(emails=["name@google.com"])
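
To make the two thresholds concrete, here is a small sketch of how a raw-value rule with a "greater than" condition could classify a daily count. It illustrates the rule's semantics only and is not Fiddler's implementation; it assumes the condition is a strict comparison.

```python
def alert_severity(value, warning_threshold, critical_threshold):
    """Classify a metric value under a strict 'greater than' condition."""
    if value > critical_threshold:
        return "critical"
    if value > warning_threshold:
        return "warning"
    return "ok"


# Daily range-violation counts checked against the thresholds used above.
for count in (2, 3, 4):
    print(count, alert_severity(count, warning_threshold=2, critical_threshold=3))
```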

Let's add a second alert rule.

This one sets up a Performance rule that triggers an email notification when precision in a 1-hour bin is more than 5% higher than in the corresponding bin one day (24 bins) earlier.

In [None]:
alert_rule_2 = fdl.AlertRule(
    name="Bank Churn Performance Alert",
    model_id=model.id,
    metric_id="precision",
    bin_size=fdl.BinSize.HOUR,
    compare_to=fdl.CompareTo.TIME_PERIOD,
    compare_bin_delta=24, # Multiple of the bin size
    condition=fdl.AlertCondition.GREATER,
    warning_threshold=0.05,
    critical_threshold=0.1,
    priority=fdl.Priority.HIGH,
)

alert_rule_2.create()

alert_rule_2.set_notification_config(emails=["name@google.com"])
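
A time-period comparison can be read as a relative change between the current bin and an earlier one. The sketch below assumes the thresholds (0.05 and 0.1) are fractions of the earlier value; that interpretation and the precision values are assumptions for illustration, not taken from the Fiddler documentation.

```python
def relative_change(current, previous):
    """Return (current - previous) / previous as a fraction."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous


# Made-up precision values: the current hour versus the same hour 24 bins ago.
change = relative_change(current=0.86, previous=0.80)

print(change > 0.05)  # exceeds the warning threshold
print(change > 0.1)   # does not exceed the critical threshold
```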

## 8. Configure a custom metric (Optional)

Fiddler's client API lets you create custom metrics through the `CustomMetric` object (see the [add_custom_metric](https://docs.fiddler.ai/reference/clientadd_custom_metric) reference). Custom metrics are tracked over time and can be charted and alerted on just like any out-of-the-box metric the Fiddler platform offers. They can also be configured through the Fiddler UI.

Please refer to [our documentation](https://docs.fiddler.ai/docs/custom-metrics) for more information on Custom Metrics.

---
  
Let's set up a custom metric.

In [None]:
custom_metric = fdl.CustomMetric(
    name='Lost Revenue',
    model_id=model.id,
    description='A metric to track revenue lost for each false positive prediction.',
    definition="""sum(if(fp(),1,0) * -100)"""  # Excel-like formula: adds -$100 for each false positive the model predicts
)

custom_metric.create()
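
The formula can be mirrored in plain Python to see what it computes. This sketch assumes a prediction counts as positive when its score is at least 0.5 and that `'yes'` is the positive label; how `fp()` actually decides is not stated in this guide, and the scores and labels are made up.

```python
def lost_revenue(predicted_scores, actual_labels, threshold=0.5,
                 positive_label="yes", cost_per_fp=-100):
    """Sum cost_per_fp over every false positive (predicted positive, actually negative)."""
    total = 0
    for score, label in zip(predicted_scores, actual_labels):
        is_false_positive = score >= threshold and label != positive_label
        total += (1 if is_false_positive else 0) * cost_per_fp
    return total


# Made-up predictions: the third and fourth rows are false positives.
scores = [0.9, 0.2, 0.7, 0.6]
labels = ["yes", "no", "no", "no"]

print(lost_revenue(scores, labels))
```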

## 9. Configure a segment (Optional)

Fiddler's client API lets you create cohorts, or sub-segments, of your production data through the `Segment` object. These segments can be tracked over time, added to charts, and alerted on. Segments can also be configured through the Fiddler UI.

Please refer to our documentation for more information on the creation and management of segments.

Let's define a segment that tracks customers from Hawaii between the ages of 30 and 60.

In [None]:
segment = fdl.Segment(
    name='Hawaii Customers between 30 and 60',
    model_id=model.id,
    description='Hawaii Customers between 30 and 60',
    definition="(age<60 and age>30) and geography=='Hawaii'"
)

segment.create()
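
The membership rule the segment name describes can be sketched as a plain Python predicate. This is an illustration on made-up rows, not how Fiddler evaluates segment definitions.

```python
def in_segment(row):
    """True for Hawaii customers whose age is strictly between 30 and 60."""
    return 30 < row["age"] < 60 and row["geography"] == "Hawaii"


# Made-up rows for illustration.
rows = [
    {"age": 45, "geography": "Hawaii"},      # inside the segment
    {"age": 65, "geography": "Hawaii"},      # too old
    {"age": 45, "geography": "California"},  # wrong geography
    {"age": 30, "geography": "Hawaii"},      # boundary age is excluded
]

print([in_segment(row) for row in rows])
```

Note that joining the two age bounds with `or` instead of `and` would accept every age, since any number is either below 60 or above 30.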

## 10. Publish a static baseline (Optional)

Since Fiddler already knows how to process data for your model, we can now add a **baseline dataset**.

You can think of this as a static dataset which represents **"golden data,"** or the kind of data your model expects to receive.

Then, once we start sending production data to Fiddler, you'll be able to see **drift scores** that tell you when the production data starts to diverge from this static baseline.

***

Let's publish our **original data sample** as a pre-production dataset. This will automatically add it as a baseline for the model.


*For more information on how to design your baseline dataset, [click here](https://docs.fiddler.ai/docs/creating-a-baseline-dataset).*

In [None]:
STATIC_BASELINE_NAME = 'baseline_dataset'

output = model.publish(
    source=sample_df,
    environment=fdl.EnvType.PRE_PRODUCTION,
    dataset_name=STATIC_BASELINE_NAME
)
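
To give a feel for what a drift score measures, here is a sketch of the Jensen-Shannon divergence between a baseline histogram and a production histogram. It is one common drift measure; this guide does not say which metric Fiddler computes, so treat the choice as an assumption. The bin counts are made up.

```python
import math


def normalize(counts):
    """Turn raw bin counts into a probability distribution."""
    total = sum(counts)
    return [count / total for count in counts]


def jensen_shannon_divergence(p, q):
    """Base-2 JSD between two discrete distributions of equal length (0 = identical, 1 = disjoint)."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mixture = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


# Made-up bin counts for one feature in the baseline and in production.
baseline = normalize([50, 30, 20])
production = normalize([20, 30, 50])

print(jensen_shannon_divergence(baseline, baseline))    # identical distributions
print(jensen_shannon_divergence(baseline, production))  # shifted distribution
```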

## 11. Configure a rolling baseline (Optional)

Fiddler also allows you to configure a baseline based on **past production data.**

This means that instead of comparing against a static slice of data, Fiddler uses a sliding window of earlier production data for drift calculation.

Please refer to [our documentation](https://docs.fiddler.ai/docs/fiddler-baselines) for more information on Baselines.

---
  
Let's set up a rolling baseline that will allow us to calculate drift relative to production data from 1 week back.

In [None]:
ROLLING_BASELINE_NAME = 'rolling_baseline_1week'

baseline = fdl.Baseline(
    model_id=model.id,
    name=ROLLING_BASELINE_NAME,
    type_=fdl.BaselineType.ROLLING,
    environment=fdl.EnvType.PRODUCTION,
    window_bin_size=fdl.WindowBinSize.DAY, # Size of the sliding window
    offset_delta=7, # How far back to set our window (multiple of window_bin_size)
)

baseline.create()
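
One way to picture the two parameters is as a one-day window positioned seven days before the time being evaluated. The sketch below follows that reading of the code comments; the exact alignment of the window (whether it starts or ends at the offset) is an assumption, and `rolling_baseline_window` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def rolling_baseline_window(now, bin_size=timedelta(days=1), offset_delta=7):
    """Return (start, end) of a window of width bin_size that begins offset_delta bins before now."""
    start = now - offset_delta * bin_size
    return start, start + bin_size


# Fixed reference time so the output is reproducible.
start, end = rolling_baseline_window(datetime(2024, 1, 15))
print(start, end)
```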

## 12. Publish production events

Finally, let's send in some production data!


Fiddler will **monitor this data and compare it to your baseline to generate powerful insights into how your model is behaving**.


---


Each record sent to Fiddler is called **an event**.
  
Let's load in some sample events from a CSV file.

In [None]:
PATH_TO_EVENTS_CSV = 'https://raw.githubusercontent.com/fiddler-labs/fiddler-examples/main/quickstart/data/v3/churn_production_data.csv'

production_df = pd.read_csv(PATH_TO_EVENTS_CSV)

# Shift the timestamps of the production events to be as recent as today
production_df['timestamp'] = production_df['timestamp'] + (int(time.time() * 1000) - production_df['timestamp'].max())
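
The shift above adds a single offset to every timestamp so that the latest event lands on the current time while the spacing between events is preserved. A plain-Python sketch of the same arithmetic, assuming timestamps are Unix epoch milliseconds and using made-up values:

```python
def shift_to_now(timestamps_ms, now_ms):
    """Shift all timestamps by one offset so the latest equals now_ms."""
    offset = now_ms - max(timestamps_ms)
    return [ts + offset for ts in timestamps_ms]


# Made-up millisecond timestamps and a made-up 'now'.
shifted = shift_to_now([1000, 4000, 2500], now_ms=10_000)
print(shifted)
```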

You can use a Model's `publish` method to start pumping data into Fiddler!
  
*Just pass the DataFrame (or path to CSV) containing your events. The event timestamp column was already set when the model was created in step 6.*

In [None]:
output = model.publish(production_df)

## 13. Get insights
  
Return to your Fiddler environment to get enhanced observability into your model's performance.

<table>
    <tr>
        <td>
            <img src="https://raw.githubusercontent.com/fiddler-labs/fiddler-examples/main/quickstart/images/simple_monitoring_5.png" />
        </td>
    </tr>
</table>

**What's Next?**

Try the [NLP Monitoring - Quickstart Notebook](https://docs.fiddler.ai/docs/simple-nlp-monitoring-quick-start)

---


**Questions?**  
  
Check out [our docs](https://docs.fiddler.ai/) for a more detailed explanation of what Fiddler has to offer.

Join our [community Slack](http://fiddler-community.slack.com/) to ask any questions!

If you're still looking for answers, fill out a ticket on [our support page](https://fiddlerlabs.zendesk.com/) and we'll get back to you shortly.