# Fiddler Quick Start Class Imbalance Guide

Many ML use cases, such as fraud detection and facial recognition, suffer from what is known as the class imbalance problem.  The problem arises when the vast majority of the inferences seen by the model belong to a single class, known as the majority class.  This makes drift in the minority class very difficult to detect, because its "signal" is outweighed by the large number of inferences in the majority class.  This notebook shows how Fiddler uses a class weighting parameter to deal with the problem. It onboards two identical models, one without class imbalance weighting and one with it, to illustrate how drift signals in the minority class become easier to detect once they are amplified by Fiddler's class weighting approach.

1. Connect to Fiddler
2. Load a Data Sample
3. Create Both Model Versions
4. Publish Static Baselines
5. Publish Production Events
6. Compare the Two Models

# 0. Imports

In [1]:
%pip install -q fiddler-client;

import time

import sklearn
import numpy as np
import pandas as pd
import fiddler as fdl

print(f"Running Fiddler Python client version {fdl.__version__}")

Note: you may need to restart the kernel to use updated packages.
Running Fiddler Python client version 3.4.0


# 1. Connect to Fiddler

Before you can add information about your model with Fiddler, you'll need to connect using the Fiddler Python client.


---


**We need a couple of pieces of information to get started.**
1. The URL you're using to connect to Fiddler
2. Your authorization token

Your authorization token can be found by navigating to the **Credentials** tab on the **Settings** page of your Fiddler environment.

In [2]:
URL = ''  # Use the full URL, including https:// (e.g. 'https://your_company_name.fiddler.ai').
TOKEN = ''
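
If you would rather not paste credentials into the notebook, one option is to read them from environment variables. The sketch below is only an illustration: `FIDDLER_URL` and `FIDDLER_TOKEN` are hypothetical variable names chosen for this example, not names defined by the Fiddler client, and `load_credentials` is a helper invented here.

```python
import os


def load_credentials(env=None):
    """Return (url, token) read from environment variables.

    FIDDLER_URL and FIDDLER_TOKEN are hypothetical names used for this sketch.
    """
    env = os.environ if env is None else env
    url = env.get('FIDDLER_URL', '').rstrip('/')
    token = env.get('FIDDLER_TOKEN', '')
    # The guide asks for the full URL, including the https:// scheme.
    if not url.startswith('https://'):
        raise ValueError('FIDDLER_URL must be a full URL starting with https://')
    if not token:
        raise ValueError('FIDDLER_TOKEN is empty')
    return url, token
```

With both variables set, `URL, TOKEN = load_credentials()` could stand in for the two assignments in the cell above.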

Constants for this example notebook; change them as needed to create your own versions.

In [3]:
PROJECT_NAME = 'quickstart_examples'
MODEL_NAME = 'imbalance_cc_fraud'
MODEL_NAME_WEIGHTED = 'imbalance_cc_fraud_weighted'
STATIC_BASELINE_NAME = 'baseline_dataset'

PATH_TO_SAMPLE_CSV = 'https://raw.githubusercontent.com/fiddler-labs/fiddler-examples/main/quickstart/data/v3/imbalance_data_sample.csv'
PATH_TO_EVENTS_CSV = 'https://raw.githubusercontent.com/fiddler-labs/fiddler-examples/main/quickstart/data/v3/imbalance_production_data.csv'

Now just run the following to connect to your Fiddler environment.

In [4]:
fdl.init(url=URL, token=TOKEN)

#### 1.a Create New or Load Existing Project

Once you connect, you can create a new project by passing a unique project name to the `fdl.Project` constructor and calling the `create()` method. If the project already exists, the code below loads it instead.

In [5]:
try:
    # Create project
    project = fdl.Project(name=PROJECT_NAME).create()
    print(f'New project created with id = {project.id} and name = {project.name}')
except fdl.Conflict:
    # Get project by name
    project = fdl.Project.from_name(name=PROJECT_NAME)
    print(f'Loaded existing project with id = {project.id} and name = {project.name}')

Loaded existing project with id = 70b74177-c712-44b1-b431-2377c1b908ab and name = quickstart_examples


# 2. Load a Data Sample

In this example, we'll be looking at a fraud detection use case.
  
In order to get insights into the model's performance, **Fiddler needs a small sample of data** to learn the schema of incoming data.

In [6]:

sample_data_df = pd.read_csv(PATH_TO_SAMPLE_CSV)
sample_data_df

Unnamed: 0,scaled_amount,scaled_time,V1,V2,V3,V4,V5,V6,V7,V8,...,V22,V23,V24,V25,V26,V27,V28,prediction_score,Class,timestamp
0,-0.095717,-0.181205,1.186638,0.174374,0.192811,1.283202,0.085843,0.100202,0.068697,0.012306,...,-0.149293,-0.130386,-0.417261,0.727474,-0.282047,0.038151,0.012069,0.00,0,1712237379664
1,-0.167680,-0.177481,1.338086,-0.262908,-1.224386,-1.229997,1.879685,3.131865,-0.616145,0.769305,...,-0.551317,-0.007326,1.067888,0.410250,0.994295,-0.081192,-0.001858,0.00,0,1712237406210
2,-0.182212,-0.843161,-0.771166,1.397387,1.472145,0.065873,0.057350,-0.736374,0.687517,-0.192999,...,-0.781973,0.021033,0.401564,-0.175494,-0.013405,0.262646,-0.015297,0.00,0,1712237432756
3,-0.143506,0.579824,-0.753790,-0.004463,-0.227110,-2.410126,0.327045,-1.134313,0.105330,0.100710,...,0.187384,0.447314,0.752543,-2.317540,-0.287516,-0.016333,0.175144,0.00,0,1712237459302
4,-0.293440,-0.417909,0.079154,1.237330,0.263247,1.191461,0.357007,-0.837919,0.659904,-0.102563,...,0.628542,0.105825,0.279335,-1.079106,-0.466951,0.113149,0.206482,0.01,0,1712237485848
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
22779,3.814714,0.934985,1.572848,-1.624485,-1.657009,-0.227630,-0.689607,-0.388462,-0.098925,-0.162253,...,0.376052,-0.188393,0.763261,0.065768,-0.102552,-0.056184,-0.007921,0.00,0,1712842073479
22780,-0.181793,1.014474,1.980962,-0.185912,-0.224454,0.454859,-0.598241,-0.726282,-0.345986,-0.176870,...,-0.241314,0.354358,0.042540,-0.379893,-0.629999,0.044624,-0.027026,0.00,0,1712842100025
22781,-0.289387,0.628931,2.058796,-0.116165,-1.111531,0.399652,-0.147601,-1.107611,0.114704,-0.259936,...,-0.752434,0.348093,-0.105295,-0.330408,0.208114,-0.072686,-0.062984,0.00,0,1712842126571
22782,-0.220219,0.445929,0.085238,0.852603,0.004921,-0.746662,0.759221,-0.722260,1.021523,-0.177373,...,-0.679752,-0.010588,-0.655569,-0.387611,0.180256,0.246599,0.091219,0.00,0,1712842153117


In [7]:
sample_data_df['Class'].value_counts()

print(
    'Percentage of minority class: {}%'.format(
        round(
            sample_data_df['Class'].value_counts()[1] * 100 / sample_data_df.shape[0], 4
        )
    )
)

Percentage of minority class: 0.1536%


# 3. Create Both Model Versions

Now, we will create two models:
1. One model without class weight parameters
2. One model with class weight parameters

Below, we first create a `ModelSpec` object which is common between the two. 

In [8]:
model_spec = fdl.ModelSpec(
    inputs=set(sample_data_df.columns) - set(['Class', 'prediction_score', 'timestamp']),
    outputs=['prediction_score'],
    targets=['Class'],
    metadata=['timestamp']
)
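
The set difference used for `inputs` above does not preserve column order. If a stable order matters to you, a list comprehension performs the same filtering while keeping the DataFrame's order. This is a minimal sketch: `select_input_columns` is a hypothetical helper, and the column list is shortened for illustration (the real sample has V1 through V28).

```python
def select_input_columns(columns, excluded):
    """Return the columns not in `excluded`, keeping their original order."""
    excluded = set(excluded)
    return [name for name in columns if name not in excluded]


# Shortened, illustrative column list.
columns = ['scaled_amount', 'scaled_time', 'V1', 'V2', 'prediction_score', 'Class', 'timestamp']
inputs = select_input_columns(columns, ['Class', 'prediction_score', 'timestamp'])
print(inputs)  # ['scaled_amount', 'scaled_time', 'V1', 'V2']
```

In the notebook, passing `sample_data_df.columns` as the first argument should yield the same set of input columns as the original expression, just in a deterministic order.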

If you have columns in your ModelSpec which denote **prediction IDs or timestamps**, then Fiddler can use these to power its analytics accordingly.

Let's call them out here and use them when configuring the Model.

In [9]:
# id_column = '' # Optional: Specify the name of the ID column if you have one
timestamp_column = 'timestamp'

Define the weighted and unweighted versions of the model task parameters.

In [10]:
model_task = fdl.ModelTask.BINARY_CLASSIFICATION

# Weighted Model Task Params
task_params_weighted = fdl.ModelTaskParams(
    target_class_order=[0, 1],
    binary_classification_threshold=0.4,
    class_weights=sklearn.utils.class_weight.compute_class_weight(
        class_weight="balanced",
        classes=np.unique(sample_data_df["Class"]),
        y=sample_data_df["Class"],
    ).tolist(),
)

# Unweighted Model Task Params aka default Model Task Params
task_params_unweighted = fdl.ModelTaskParams(
    target_class_order=[0, 1],
    binary_classification_threshold=0.4,
)
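
To see what the `"balanced"` option passed to `compute_class_weight` produces, it helps to reproduce the heuristic by hand: each class gets the weight `n_samples / (n_classes * count_of_that_class)`, so the rarer the class, the larger its weight. The sketch below re-implements that formula in plain Python on toy labels; `balanced_class_weights` is a hypothetical helper written for illustration, not part of scikit-learn or Fiddler.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weights per class, in sorted class order: n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return [n_samples / (n_classes * counts[c]) for c in sorted(counts)]


# Toy labels: 8 majority (0) and 2 minority (1) examples.
print(balanced_class_weights([0] * 8 + [1] * 2))  # [0.625, 2.5]
```

Applied to the sample data, where the minority class makes up about 0.1536% of the rows, the same formula gives a minority weight of roughly 325 against a majority weight of about 0.5.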

Now, we onboard (create) the two models to Fiddler -- the first without any class weights and the second with defined class weights.

In [11]:
model = fdl.Model.from_data(
    name=MODEL_NAME,
    project_id=project.id,
    source=sample_data_df,
    spec=model_spec,
    task=model_task,
    task_params=task_params_unweighted,
    event_ts_col=timestamp_column
)

model.create()
print(f'New unweighted model created with id = {model.id} and name = {model.name}')

weighted_model = fdl.Model.from_data(
    name=MODEL_NAME_WEIGHTED,
    project_id=project.id,
    source=sample_data_df,
    spec=model_spec,
    task=model_task,
    task_params=task_params_weighted,
    event_ts_col=timestamp_column
)

weighted_model.create()
print(f'New weighted model created with id = {weighted_model.id} and name = {weighted_model.name}')


New unweighted model created with id = 53854447-0c51-4f50-93f2-34691227bdfb and name = imbalance_cc_fraud
New weighted model created with id = b74173cf-a0ca-4d32-a71b-ca4567b53f4c and name = imbalance_cc_fraud_weighted


# 4. Publish Static Baselines

Since Fiddler already knows how to process data for your models, we can now add a **baseline dataset**.

You can think of this as a static dataset which represents **"golden data,"** or the kind of data your model expects to receive.

Then, once we start sending production data to Fiddler, you'll be able to see **drift scores** telling you whenever it starts to diverge from this static baseline.

***

Let's publish our **original data sample** as a pre-production dataset. This will automatically add it as a baseline for each model.


*For more information on how to design your baseline dataset, [click here](https://docs.fiddler.ai/client-guide/creating-a-baseline-dataset).*

In [12]:
baseline_publish_job = model.publish(
    source=sample_data_df,
    environment=fdl.EnvType.PRE_PRODUCTION,
    dataset_name=STATIC_BASELINE_NAME,
)
print(
    f'Initiated pre-production environment data upload with Job ID = {baseline_publish_job.id}'
)

baseline_publish_job_weighted = weighted_model.publish(
    source=sample_data_df,
    environment=fdl.EnvType.PRE_PRODUCTION,
    dataset_name=STATIC_BASELINE_NAME,
)
print(
    f'Initiated pre-production environment data upload with Job ID = {baseline_publish_job_weighted.id}'
)

# Uncomment the lines below to wait for the jobs to finish, otherwise they will run in the background.
# You can check the statuses on the Jobs page in the Fiddler UI or use the job IDs to query the job statuses via the API.
# baseline_publish_job.wait()
# baseline_publish_job_weighted.wait()

Initiated pre-production environment data upload with Job ID = ad0f1c91-d6d5-4ce6-a899-e5071b1243fb
Initiated pre-production environment data upload with Job ID = 32a1170a-3ea4-4458-94bd-47152df98eb2


# 5. Publish Production Events 

Publish the same events, which contain synthetic drift in the minority class, to both models.

In [13]:
production_data_df = pd.read_csv(PATH_TO_EVENTS_CSV)

# Shift the timestamps of the production events to be as recent as today
production_data_df['timestamp'] = production_data_df['timestamp'] + (
    int(time.time() * 1000) - production_data_df['timestamp'].max()
)
production_data_df

Unnamed: 0,scaled_amount,scaled_time,V1,V2,V3,V4,V5,V6,V7,V8,...,V22,V23,V24,V25,V26,V27,V28,prediction_score,Class,timestamp
0,-0.293440,-0.028231,-1.464897,1.975528,-1.077145,2.819191,0.069850,-0.789044,-1.196101,0.673654,...,-0.272505,-0.031549,-0.406166,0.157769,-0.104393,0.073796,-0.041570,0.58,1,1731084423383
1,3.030811,-0.145267,-4.198735,0.194121,-3.917586,3.920748,-1.875486,-2.118933,-3.614445,1.687884,...,-0.183001,-0.440387,0.292539,-0.144967,-0.251744,1.249414,-0.131525,0.93,1,1731084433964
2,0.996996,0.514891,-4.599447,2.762540,-4.656530,5.201403,-2.470388,-0.357618,-3.767189,0.061466,...,0.261333,0.621415,0.994110,-0.687853,-0.337531,-1.612791,1.231425,0.62,1,1731084444545
3,-0.275554,0.195268,-25.825982,19.167239,-25.390229,11.125435,-16.682644,3.933699,-37.060311,-28.759799,...,5.703684,3.510019,0.054330,-0.671983,-0.209431,-4.950022,-0.448413,0.87,1,1731084455127
4,5.899113,-0.325016,-2.335655,2.225380,-3.379450,2.178538,-3.568264,0.316814,-1.734948,1.449139,...,0.297412,0.308536,-0.598416,-0.121850,-0.491018,0.701606,0.206966,0.97,1,1731084465708
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
57153,0.750227,-0.334461,-1.994348,1.503076,-0.365560,0.780223,-0.957956,0.038648,-0.453702,1.553565,...,0.319275,-0.081356,-0.366704,-0.269380,-0.278170,0.082042,-0.015071,0.00,0,1731689181057
57154,-0.167819,0.834526,-0.234567,0.733694,0.486250,-0.718186,0.782227,-0.788837,1.056307,-0.175016,...,-0.574857,-0.024845,-0.428558,-0.563551,0.159926,0.094924,0.163736,0.00,0,1731689191638
57155,-0.200796,0.679038,0.040441,-0.109737,-1.266430,1.004783,2.223390,-0.670372,0.490662,-0.033739,...,0.930041,0.162391,-1.180279,-1.484172,-0.619133,0.357845,0.354379,0.00,0,1731689202220
57156,-0.257249,-0.299992,-0.495048,0.991481,1.671584,-0.342474,0.470012,-0.348503,0.996077,-0.351891,...,-0.474178,-0.145562,-0.011279,-0.162997,0.020511,0.040529,-0.269775,0.00,0,1731689212801
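
The timestamp shift in the cell above adds one constant offset to every row, so the newest event lands at the current time while the spacing between events stays unchanged. The sketch below shows the same arithmetic on a plain list of made-up epoch-millisecond values; `shift_to_now` is a hypothetical helper, and `now_ms` is passed explicitly only so the result is reproducible.

```python
import time


def shift_to_now(timestamps_ms, now_ms=None):
    """Shift epoch-millisecond timestamps so the latest one equals `now_ms`."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    offset = now_ms - max(timestamps_ms)
    return [ts + offset for ts in timestamps_ms]


# Made-up timestamps spaced 10 seconds apart.
original = [1_000_000, 1_010_000, 1_020_000]
shifted = shift_to_now(original, now_ms=5_000_000)
print(shifted)  # [4980000, 4990000, 5000000]
```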


In [14]:
print(
    "Percentage of minority class: {}%".format(
        round(
            production_data_df["Class"].value_counts()[1] * 100 / production_data_df.shape[0], 4
        )
    )
)

Percentage of minority class: 0.5144%


We see that the percentage of the minority class in the production data (0.5144%) is more than three times that of the baseline data (0.1536%). This should create a large drift in the predictions.

We will now publish the same production/event data for both of the models -- the one with class weights and the one without class weights.

In [15]:
production_publish_job = model.publish(production_data_df)

print(f'For Model: {model.name} - initiated production environment data upload with Job ID = {production_publish_job.id}')

production_publish_job_weighted = weighted_model.publish(production_data_df)

print(f'For Model: {weighted_model.name} - initiated production environment data upload with Job ID = {production_publish_job_weighted.id}')

# Uncomment the lines below to wait for the jobs to finish, otherwise they will run in the background.
# You can check the statuses on the Jobs page in the Fiddler UI or use the job IDs to query the job statuses via the API.
# production_publish_job.wait()
# production_publish_job_weighted.wait()

For Model: imbalance_cc_fraud - initiated production environment data upload with Job ID = dbc327c4-fefa-48f5-bf37-5c3e907c0869
For Model: imbalance_cc_fraud_weighted - initiated production environment data upload with Job ID = d914d40c-75f8-461a-94d8-2e8f11fcd2d6


# 6. Compare the Two Models

**You're all done!**


In the Fiddler UI, we can see that for the model without class weights, the output/prediction drift in the minority class is very hard to detect (`<=0.05`) because it is obscured by the overwhelming volume of events in the majority class.  Once class weights are declared, we see a higher drift score, which is a more accurate representation of the production data, where the share of the minority class is more than 3x that of the baseline.
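
The intuition can be reproduced with a little arithmetic. The sketch below is not Fiddler's drift computation, which this notebook does not show; it is a simplified illustration that compares the class distributions of baseline and production using Jensen-Shannon divergence, once with raw counts and once after reweighting each class with balanced weights derived from the baseline. The minority shares are the percentages printed earlier in this notebook; the function names are hypothetical.

```python
import math


def jsd(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def class_distribution(minority_share, weights=(1.0, 1.0)):
    """Normalized [majority, minority] mass after applying per-class weights."""
    mass = [(1 - minority_share) * weights[0], minority_share * weights[1]]
    total = sum(mass)
    return [x / total for x in mass]


baseline_share = 0.001536    # 0.1536% minority class in the baseline sample
production_share = 0.005144  # 0.5144% minority class in the production events

# Balanced weights expressed in terms of class shares: 1 / (n_classes * share).
weights = (1 / (2 * (1 - baseline_share)), 1 / (2 * baseline_share))

unweighted = jsd(class_distribution(baseline_share),
                 class_distribution(production_share))
weighted = jsd(class_distribution(baseline_share, weights),
               class_distribution(production_share, weights))
print(f'unweighted JSD = {unweighted:.4f}, weighted JSD = {weighted:.4f}')
```

Under these assumptions the reweighted baseline is split evenly between the two classes, so the same change in the minority share moves far more probability mass and yields a much larger divergence than the unweighted comparison.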

<table>
    <tr>
        <td>
            <img src="https://raw.githubusercontent.com/fiddler-labs/fiddler-examples/main/quickstart/images/imabalance_data_1.png" />
        </td>
    </tr>
</table>

**What's Next?**

Try the [LLM Monitoring - Quick Start Notebook](https://docs.fiddler.ai/quickstart-notebooks/simple-llm-monitoring)

---


**Questions?**  
  
Check out [our docs](https://docs.fiddler.ai/) for a more detailed explanation of what Fiddler has to offer.

Join our [community Slack](http://fiddler-community.slack.com/) to ask any questions!

If you're still looking for answers, fill out a ticket on [our support page](https://fiddlerlabs.zendesk.com/) and we'll get back to you shortly.