# About

- Connects to the local Evidently service, seeds the Iris and German credit projects, and prepares reference/current datasets.
- Generates rolling report snapshots and updates dashboards stored in the mounted `workspace/` directory.


# 1 Prerequisites

_Prepare a local Evidently service and connect the notebook to it._

Set up an Evidently instance where this notebook can store reports and dashboards.

1. Start the Evidently container locally:

   ```bash
   docker run -p 8000:8000 \
     -v $(pwd)/workspace:/app/workspace \
     --name evidently-service \
     --detach \
     evidently/evidently-service:latest
   ```

2. Point your Evidently project at the local instance:

   ```python
   from evidently.ui.workspace import RemoteWorkspace

   ws = RemoteWorkspace("http://localhost:8000")
   ```
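
Before running the remaining cells, it can help to confirm that something is listening on the mapped port. The sketch below is a plain TCP check, not an Evidently API call; `port_is_open` is a hypothetical helper name, and a `True` result only means a process accepted the connection.

```python
import socket
from urllib.parse import urlparse


def port_is_open(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the host and port in `url` succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unresolved host, and similar failures.
        return False


# Example: port_is_open("http://localhost:8000")
```

If this returns `False`, `docker ps` and `docker logs evidently-service` are the usual places to look next.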


# 2 Connect to the remote workspace

_Create the workspace handle used in later steps._

Instantiate `RemoteWorkspace` with the Evidently endpoint so subsequent cells can create projects, datasets, and dashboards through the `ws` client.


In [1]:
# Connect to remote workspace
from evidently.ui.workspace import RemoteWorkspace

ws = RemoteWorkspace("http://localhost:8000")

# 3 Project 1: Iris Monitoring

_Build and monitor the sample Iris project end to end._

Follow the Iris workflow from dataset preparation through report generation and dashboard configuration.


In [2]:
# Check projects
def get_project(project_name: str, *args, **kwargs):
    """Return the first project matching `project_name`, creating it if none exists."""

    existing_projects = ws.search_project(project_name)
    if existing_projects:
        return existing_projects[0]

    # Extra arguments (for example the description) are passed through on creation
    return ws.create_project(project_name, *args, **kwargs)
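
The get-or-create logic can be tried without a running service. The sketch below uses `FakeWorkspace`, a hypothetical in-memory stand-in that implements only the two calls the helper needs. It is not part of Evidently, and `get_or_create` takes the workspace as an argument purely so it can be tested in isolation.

```python
class FakeWorkspace:
    """Hypothetical in-memory stand-in with just search_project and create_project."""

    def __init__(self):
        self.projects = []

    def search_project(self, name):
        return [p for p in self.projects if p["name"] == name]

    def create_project(self, name, description=None):
        project = {"name": name, "description": description}
        self.projects.append(project)
        return project


def get_or_create(ws, name, *args, **kwargs):
    """Return the first project with this name, creating it only if none exists."""
    existing = ws.search_project(name)
    if existing:
        return existing[0]
    return ws.create_project(name, *args, **kwargs)


fake_ws = FakeWorkspace()
first = get_or_create(fake_ws, "Iris Monitoring", "demo project")
second = get_or_create(fake_ws, "Iris Monitoring", "demo project")
print(first is second, len(fake_ws.projects))
```

The second call finds the project created by the first, so re-running the notebook does not pile up duplicate projects.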


In [3]:
# If project does not exist, then create project
project_name = "Iris Monitoring"
description = "The purpose of this project is to demonstrate the capabilities of the Evidently monitoring platform."

project = get_project(project_name, description)
project.save()

In [4]:
from datetime import datetime, timedelta

project.date_from = datetime.now() - timedelta(days=30)
project.date_to = datetime.now()
project.save()
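
The sign of the `timedelta` decides which way the window moves: subtracting a negative offset lands in the future. A small check with a fixed, made-up instant (used here only so the result is reproducible):

```python
from datetime import datetime, timedelta

now = datetime(2025, 1, 31, 12, 0)  # fixed instant, chosen for illustration

window_start = now - timedelta(days=30)   # 30 days in the past
not_a_start = now - timedelta(days=-30)   # 30 days in the future

print(window_start)  # 2025-01-01 12:00:00
print(not_a_start)   # 2025-03-02 12:00:00
```

A dashboard window should satisfy `date_from < date_to`, which only the first form does.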

In [5]:
# purge workspace snapshot artifacts
!rm -rf workspace/{project.id}/snapshots/*

zsh:1: no matches found: workspace/01983113-e714-70b7-9f6c-453adbbd6046/snapshots/*
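
The `no matches found` message comes from zsh refusing to expand a glob that matches nothing, which happens whenever the snapshots directory is empty or missing. A shell-independent alternative is sketched below. It assumes the same `workspace/<project id>/snapshots` layout as the command above, and `purge_snapshots` is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def purge_snapshots(workspace_dir, project_id) -> int:
    """Delete everything under <workspace_dir>/<project_id>/snapshots.

    Returns the number of entries removed. A missing or empty directory
    is not an error and simply yields 0.
    """
    snapshots_dir = Path(workspace_dir) / str(project_id) / "snapshots"
    if not snapshots_dir.is_dir():
        return 0

    removed = 0
    for entry in snapshots_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove sub-directories recursively
        else:
            entry.unlink()        # remove files and symlinks
        removed += 1
    return removed


# Example: purge_snapshots("workspace", project.id)
```

Unlike the shell glob, `iterdir()` also returns hidden entries, so this removes dotfiles as well.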


## 3.1 Prepare Iris Monitoring Dataset

_Set up the Iris sample data that feeds the initial monitoring walkthrough._

Load the sklearn Iris dataset, convert it into pandas structures, and inspect the distribution we will monitor with Evidently.


In [6]:
import sklearn.datasets

data = sklearn.datasets.load_iris()

In [7]:
data.keys()

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module'])

In [8]:
data.feature_names

['sepal length (cm)',
 'sepal width (cm)',
 'petal length (cm)',
 'petal width (cm)']

In [9]:
import pandas as pd

features = pd.DataFrame(data.data, columns=data.feature_names)
target = pd.Series(data.target, name="target")

print("No. of samples:", len(features))
print(target.value_counts(normalize=True))

No. of samples: 150
target
0    0.333333
1    0.333333
2    0.333333
Name: proportion, dtype: float64


### 3.1.1 Split Iris data into reference and current samples

_Partition the dataset into baseline and comparison slices for drift checks._


In [10]:
from sklearn.model_selection import train_test_split

features_ref, features_curr, target_ref, target_curr = train_test_split(
    features, target, test_size=0.2, random_state=1
)

## 3.2 Convert to Evidently Datasets

_Build the structures Evidently expects when evaluating the Iris data._

Show how feature and target frames are combined and how dataset metadata is prepared before any reports run.


### 3.2.1 Create dataset definitions for Evidently

_Instantiate dataset objects with explicit schema information._

Combine feature and target columns, then wrap them in `Dataset` objects. Passing an empty `DataDefinition()` leaves column-type detection to Evidently.


In [11]:
from evidently import DataDefinition, Dataset

# Create Evidently Dataset objects for explicit clarity and control
ref_data = Dataset.from_pandas(
    data=pd.concat([features_ref, target_ref], axis=1),
    data_definition=DataDefinition(),
)
curr_data = Dataset.from_pandas(
    data=pd.concat([features_curr, target_curr], axis=1),
    data_definition=DataDefinition(),
)


## 3.3 Generate Report snapshot

_Run Evidently presets to produce and store baseline reports._

Configure drift and summary presets, execute them against the reference and current data, and upload each snapshot to the Iris project timeline.


In [12]:
# Make a sample report snapshot
from evidently import Report
from evidently.presets import DataSummaryPreset, DataDriftPreset


# 1. Prepare report template
report = Report([DataDriftPreset(), DataSummaryPreset()])

# 2. Run the report by comparing the current dataset against the reference dataset
snapshot = report.run(curr_data, ref_data)
# display(snapshot)

In [13]:
date_range = pd.date_range("2025-01-01", "2025-01-10", freq="1D")

The report run above produces only a single snapshot. For a metric to show up as a time series on the dashboard, the project needs several snapshots with different timestamps, which the next cell simulates.


In [14]:
# Simulate running multiple daily snapshots and adding to the workspace
from tqdm import tqdm

for date in tqdm(date_range):
    # Overwrite the snapshot timestamp for demo purposes (relies on a private attribute)
    snapshot._timestamp = date.to_pydatetime()
    ws.add_run(project.id, snapshot, include_data=False)

100%|██████████| 10/10 [00:01<00:00,  6.46it/s]
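
A variant that avoids the private `_timestamp` attribute is sketched below. It assumes `Report.run` accepts a `timestamp` keyword, as used in the credit risk section of this notebook, and it re-runs the report for every day instead of re-uploading one snapshot. `backfill_daily_runs` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def backfill_daily_runs(ws, project_id, report, curr_data, ref_data, start, days):
    """Run `report` once per day and upload each snapshot with its own timestamp."""
    uploaded = []
    for offset in range(days):
        ts = start + timedelta(days=offset)
        # Assumption: the timestamp keyword sets the snapshot time.
        snapshot = report.run(curr_data, ref_data, timestamp=ts)
        ws.add_run(project_id, snapshot, include_data=False)
        uploaded.append(ts)
    return uploaded


# Example, matching the ten-day range above:
# backfill_daily_runs(ws, project.id, report, curr_data, ref_data,
#                     start=datetime(2025, 1, 1), days=10)
```

Re-running the report costs more than mutating one snapshot, but each upload is then a distinct object with its own timestamp.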


## 3.4 Configure Dashboard Layout

_Define dashboard scaffolding for the Iris project in the OSS workspace._

Use the project dashboard API to script the layout because the open-source edition lacks a UI editor.


### 3.4.1 Populate Iris dashboard panels

_Register the panels that surface key metrics for the Iris project._

Clear any existing layout, add the headline text panel, and wire up metric panels before saving the dashboard configuration.


In [15]:
from evidently.sdk.models import PanelMetric
from evidently.sdk.panels import DashboardPanelPlot


# Dashboard Title
def init_dashboard_panel():
    # Clear the dashboard first; otherwise duplicate panels are added on every run

    project.dashboard.clear_dashboard()
    project.dashboard.add_panel(
        DashboardPanelPlot(
            title="Iris Monitoring Dashboard",
            size="full",
            values=[],
            plot_params={"plot_type": "text"},
        )
    )

    # line chart
    project.dashboard.add_panel(
        DashboardPanelPlot(
            title="Number of drifted columns",
            description="Number of columns flagged as drifted in each snapshot",
            size="half",
            values=[
                PanelMetric(
                    metric="DriftedColumnsCount",
                    metric_labels={"value_type": "count"},
                )
            ],
            plot_params={"plot_type": "line"},
        )
    )

    # line chart
    project.dashboard.add_panel(
        DashboardPanelPlot(
            title="Number of rows",
            description="The number of samples in each snapshot",
            size="half",
            values=[PanelMetric(legend="Number of rows", metric="RowCount")],
            plot_params={"plot_type": "line"},
        )
    )

In [16]:
# Make dashboard
init_dashboard_panel()

In [17]:
# Save project to retain changes
project.save()

# 4 Project 2: German Credit Risk Monitoring

_Apply the Evidently workflow to the German credit risk use case._

Spin up a dedicated project, prepare model outputs, simulate production batches, and configure dashboards for the credit default scenario.


## 4.1 Create Credit Risk Monitoring Project

_Set up a dedicated Evidently project for the German credit risk use case._

Load the public German credit dataset and register a workspace project so the monitoring assets stay isolated from the Iris example.


In [18]:
project = get_project("German Credit Risk")
project

Project ID: 019835f4-9809-7ed3-9714-5306d66a82c3
Project Name: German Credit Risk
Project Description: None
        

In [19]:
# purge workspace snapshot artifacts
!rm -rf workspace/{project.id}/snapshots/*

zsh:1: no matches found: workspace/019835f4-9809-7ed3-9714-5306d66a82c3/snapshots/*


### 4.1.1 Load data and initialize the project

_Fetch the German credit dataset._

Normalize column names and map target labels so downstream steps operate on clean inputs while persisting assets to the correct project.


In [20]:
# Load data
credit_risk_data = sklearn.datasets.fetch_openml("German-Credit-Risk-with-Target")

print(f"{credit_risk_data.keys()=}")
print(f"{credit_risk_data.feature_names=}")
print(f"{credit_risk_data.target_names=}")

display("credit_risk_data.frame", credit_risk_data.frame.head())

credit_risk_data.keys()=dict_keys(['data', 'target', 'frame', 'categories', 'feature_names', 'target_names', 'DESCR', 'details', 'url'])
credit_risk_data.feature_names=['Age', 'Sex', 'Job', 'Housing', 'Saving accounts', 'Checking account', 'Credit amount', 'Duration', 'Purpose']
credit_risk_data.target_names=['Risk']


'credit_risk_data.frame'

| | Age | Sex | Job | Housing | Saving accounts | Checking account | Credit amount | Duration | Purpose | Risk |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 67 | male | 2 | own | | little | 1169 | 6 | radio/TV | good |
| 1 | 22 | female | 2 | own | little | moderate | 5951 | 48 | radio/TV | bad |
| 2 | 49 | male | 1 | own | little | | 2096 | 12 | education | good |
| 3 | 45 | male | 2 | free | little | little | 7882 | 42 | furniture/equipment | good |
| 4 | 53 | male | 2 | free | little | little | 4870 | 24 | car | bad |


In [21]:
# variable setting
features = credit_risk_data.data
features.columns = [col.lower().replace(" ", "_") for col in features.columns]
_target = credit_risk_data.target
target = _target.map({"good": 1, "bad": 0}).astype(int)

## 4.2 Train a LogReg Classifier

_Fit a logistic regression pipeline to emulate a production scoring workflow._

Split the dataset, build preprocessing for categorical and numerical columns, and train a model that mirrors the batch inference we plan to monitor.


In [22]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    features, target, stratify=target, random_state=42
)

In [23]:
# Build the pipeline: preprocessing first, then a logistic regression
# classifier as the final step
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.linear_model import LogisticRegression


cat_features = [
    "sex",
    "job",
    "housing",
    "saving_accounts",
    "checking_account",
    "purpose",
]

num_features = ["age", "credit_amount", "duration"]


def concat_dunder(input_feature, category):
    return input_feature + "__" + str(category)


col_transformer = ColumnTransformer(
    [
        (
            "ohe",
            OneHotEncoder(
                drop="first",
                handle_unknown="infrequent_if_exist",
                sparse_output=False,
                feature_name_combiner=concat_dunder,
            ),
            cat_features,
        ),
        (
            "scaler",
            StandardScaler(),
            num_features,
        ),
    ]
)


pipeline = make_pipeline(col_transformer, LogisticRegression()).set_output(
    transform="pandas"
)


# Fit the preprocessing steps and the classifier on the training data
pipeline.fit(X_train, y_train)

In [24]:
pipeline[:1].transform(X_test).head()

Unnamed: 0,ohe__sex__male,ohe__job__1,ohe__job__2,ohe__job__3,ohe__housing__own,ohe__housing__rent,ohe__saving_accounts__moderate,ohe__saving_accounts__quite rich,ohe__saving_accounts__rich,ohe__saving_accounts__nan,...,ohe__purpose__business,ohe__purpose__car,ohe__purpose__education,ohe__purpose__furniture/equipment,ohe__purpose__radio/TV,ohe__purpose__repairs,ohe__purpose__vacation/others,scaler__age,scaler__credit_amount,scaler__duration
561,1.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,-0.997768,-0.599196,0.24655
262,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.062184,0.745608,-0.243501
477,1.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,-0.909439,0.691901,0.24655
721,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,...,0.0,0.0,1.0,0.0,0.0,0.0,0.0,-0.997768,-0.997696,-1.223603
370,1.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.062184,-0.050318,1.226652
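
The column names in this output follow from two layers of naming. The `feature_name_combiner` joins feature and category with a double underscore, and `ColumnTransformer` then prefixes the transformer name, assuming its default `verbose_feature_names_out=True`. The sketch below reproduces the naming in plain Python; `output_column` is a hypothetical helper used only for illustration.

```python
def concat_dunder(input_feature, category):
    # Same combiner as in the pipeline cell above.
    return input_feature + "__" + str(category)


def output_column(transformer_name, input_feature, category):
    # Assumption: default verbose_feature_names_out=True, which prefixes
    # every output column with "<transformer name>__".
    return transformer_name + "__" + concat_dunder(input_feature, category)


print(output_column("ohe", "sex", "male"))                    # ohe__sex__male
print(output_column("ohe", "job", 1))                         # ohe__job__1
print(output_column("ohe", "saving_accounts", "quite rich"))  # ohe__saving_accounts__quite rich
```

Because the encoder uses `drop="first"`, the first category of each feature gets no column, which is why the output has `ohe__sex__male` but no `ohe__sex__female`.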


## 4.3 Convert to Evidently Datasets

_Package model inputs and outputs into Evidently-friendly reference and current datasets._

Define helper utilities that merge features, targets, and prediction probabilities, then build dataset objects.


In [25]:
def make_evidently_dataset(X: pd.DataFrame, y: pd.Series, proba: np.ndarray):
    """
    Combine features, target, and predicted probabilities into one DataFrame
    in the layout expected by the Evidently API.

    X - feature matrix
    y - target
    proba - predicted probability of the positive class
    """

    return pd.concat(
        [
            X.reset_index(drop=True),
            y.reset_index(drop=True),
            pd.Series(proba, name="proba"),
        ],
        axis=1,
    )


proba = pipeline.predict_proba(X_test)[:, 1]

# Example usage of dataset constructor
make_evidently_dataset(X_test, y_test, proba).head()

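| | age | sex | job | housing | saving_accounts | checking_account | credit_amount | duration | purpose | Risk | proba |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 24 | male | 1 | rent | little | little | 1546 | 24 | radio/TV | 0 | 0.512232 |
| 1 | 36 | male | 3 | free | little | little | 5302 | 18 | car | 1 | 0.522807 |
| 2 | 25 | male | 2 | own | little | rich | 5152 | 24 | radio/TV | 1 | 0.73872 |
| 3 | 24 | female | 2 | rent | rich | moderate | 433 | 6 | education | 0 | 0.541705 |
| 4 | 36 | male | 2 | own | | | 3079 | 36 | car | 1 | 0.891785 |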
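
The `reset_index(drop=True)` calls matter because `pd.concat(..., axis=1)` aligns rows by index label. The test split keeps its shuffled original labels, while a Series built from a NumPy array starts at 0. The sketch below shows the effect on a made-up three-row frame; the values are illustrative only.

```python
import pandas as pd

# Hypothetical frame with a shuffled index, like the output of train_test_split.
X = pd.DataFrame({"age": [24, 36, 25]}, index=[561, 262, 477])
proba = pd.Series([0.51, 0.52, 0.74], name="proba")  # default index 0, 1, 2

misaligned = pd.concat([X, proba], axis=1)
aligned = pd.concat([X.reset_index(drop=True), proba], axis=1)

print(misaligned.shape)  # (6, 2): no labels in common, so every row is half NaN
print(aligned.shape)     # (3, 2)
```

Without the reset, the frame would double in length and carry missing values into every metric computed from it.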


In [26]:
from evidently import DataDefinition, BinaryClassification
from evidently import Dataset

# Define schema
datadef = DataDefinition(
    categorical_columns=cat_features,
    numerical_columns=num_features,
    classification=[BinaryClassification(target="Risk", prediction_probas="proba")],
)

# Make reference (training) and current (snapshot) dataset
ref_data = Dataset.from_pandas(
    make_evidently_dataset(
        X_train, y_train, proba=pipeline.predict_proba(X_train)[:, 1]
    ),
    data_definition=datadef,
)

curr_data = Dataset.from_pandas(
    make_evidently_dataset(X_test, y_test, proba=pipeline.predict_proba(X_test)[:, 1]),
    data_definition=datadef,
)

## 4.4 Generate single report snapshot

_Run classification and drift presets on the reference and current splits to form the initial report._

Execute Evidently presets and push the resulting snapshot to the workspace to establish a baseline for the credit risk project.


In [27]:
from evidently import Report
from evidently.presets import DataDriftPreset, DataSummaryPreset, ClassificationPreset
from evidently.metrics import RocAuc

report = Report(
    [DataDriftPreset(), DataSummaryPreset(), ClassificationPreset(), RocAuc()]
)
snapshot = report.run(current_data=curr_data, reference_data=ref_data)

In [28]:
# Display project metadata
project

Project ID: 019835f4-9809-7ed3-9714-5306d66a82c3
Project Name: German Credit Risk
Project Description: None
        

In [29]:
# Add snapshot to credit risk workspace
ws.add_run(project.id, snapshot)

## 4.5 Simulate Production Batches

_Resample the test set to mimic successive current batches for monitoring._

Define utilities that randomly over- or under-sample the holdout data so each synthetic batch reflects fresh production activity.


In [30]:
import random
from imblearn.under_sampling import RandomUnderSampler
from imblearn.over_sampling import RandomOverSampler


def simulate_test_data(X, y):
    # Randomly pick over- or under-sampling for this batch
    if random.random() < 0.5:
        resampler_cls = RandomOverSampler
    else:
        resampler_cls = RandomUnderSampler

    X_resampled, y_resampled = resampler_cls().fit_resample(X, y)

    return X_resampled, y_resampled


def simulate_curr_data():
    X_resampled, y_resampled = simulate_test_data(X_test, y_test)
    proba = pipeline.predict_proba(X_resampled)[:, 1]

    evidently_df = make_evidently_dataset(X_resampled, y_resampled, proba)

    curr_data = Dataset.from_pandas(evidently_df, data_definition=datadef)

    return curr_data
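
To make the effect of the two resamplers concrete, here is a dependency-free sketch of what random over- and under-sampling do to class counts. It is not imblearn's implementation; `balance_labels` is a hypothetical helper, and the 7/3 label split is made up for illustration.

```python
import random
from collections import Counter


def balance_labels(labels, mode, rng):
    """Return row indices after random over- or under-sampling to equal class counts."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    sizes = [len(rows) for rows in by_class.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    chosen = []
    for rows in by_class.values():
        if len(rows) >= target:
            # Under-sampling (or already at target): draw without replacement.
            chosen.extend(rng.sample(rows, target))
        else:
            # Over-sampling: keep every row, then add duplicates drawn with replacement.
            chosen.extend(rows)
            chosen.extend(rng.choices(rows, k=target - len(rows)))
    return chosen


labels = [1] * 7 + [0] * 3  # made-up batch: 7 "good" rows and 3 "bad" rows
rng = random.Random(0)      # seeded so the sketch is reproducible

for mode in ("over", "under"):
    picked = balance_labels(labels, mode, rng)
    print(mode, len(picked), Counter(labels[i] for i in picked))
```

Over-sampling grows the batch to 14 rows and under-sampling shrinks it to 6, with both classes equal in each case. In the notebook this means both the row count and the class balance of each simulated batch differ from the reference data. `simulate_test_data` is also unseeded, so each run produces different batches.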

## 4.6 Simulate Multiple/Running Snapshots

_Create multiple snapshots from the simulated batches to emulate ongoing reporting._

Iterate across synthetic dates, run the Evidently report with each resampled batch, and upload the snapshots to the credit risk project.


In [31]:
# simulate daily snapshots
# for each day, simulate current data and generate report against the ref data

from datetime import datetime, timedelta

for date in pd.date_range(
    datetime.now().date() + timedelta(days=-10), datetime.now().date(), freq="1D"
):
    ts = date.to_pydatetime()
    snapshot = report.run(simulate_curr_data(), ref_data, timestamp=ts)
    ws.add_run(project.id, snapshot)

## 4.7 Configure Dashboard Layout

_Set up the dashboard panels that surface the credit risk monitoring metrics._

Provide a helper that clears the existing layout and registers the key panels used to visualize performance and drift over time.


In [32]:
# Dashboard Title
def init_dashboard_panel_german_credit():
    # Clear the dashboard first; otherwise duplicate panels are added on every run

    project.dashboard.clear_dashboard()
    project.dashboard.add_panel(
        DashboardPanelPlot(
            title="German Credit Risk Monitoring Dashboard",
            size="full",
            values=[],
            plot_params={"plot_type": "text"},
        )
    )

    # line chart
    project.dashboard.add_panel(
        DashboardPanelPlot(
            title="Accuracy",
            description="Share of correct predictions in each snapshot",
            size="half",
            values=[
                PanelMetric(
                    metric="Accuracy",
                )
            ],
            plot_params={"plot_type": "line"},
        )
    )

    project.dashboard.add_panel(
        DashboardPanelPlot(
            title="AUROC",
            description="Area under the ROC curve in each snapshot",
            size="half",
            values=[
                PanelMetric(
                    metric="RocAuc",
                    # metric_labels={"column": "1", "value_type": "value"},
                    legend="AUROC",
                )
            ],
            plot_params={"plot_type": "line"},
        )
    )
    project.dashboard.add_panel(
        DashboardPanelPlot(
            title="TPR",
            description="True positive rate in each snapshot",
            size="half",
            values=[
                PanelMetric(
                    metric="TPR",
                    legend="TPR",
                )
            ],
            plot_params={"plot_type": "line"},
        )
    )

    project.dashboard.add_panel(
        DashboardPanelPlot(
            title="FPR",
            description="False positive rate in each snapshot",
            size="half",
            values=[
                PanelMetric(
                    metric="FPR",
                    legend="FPR",
                )
            ],
            plot_params={"plot_type": "line"},
        )
    )

    project.dashboard.add_panel(
        DashboardPanelPlot(
            title="Row Count",
            description="Number of rows in each snapshot",
            size="half",
            values=[
                PanelMetric(
                    metric="RowCount",
                    # metric_labels={"value_type": "value"},
                    # legend="AUROC",
                )
            ],
            plot_params={"plot_type": "line"},
        )
    )

    project.dashboard.add_panel(
        DashboardPanelPlot(
            title="% Drifted Columns",
            description="Share of columns flagged as drifted in each snapshot",
            size="half",
            values=[
                PanelMetric(
                    metric="DriftedColumnsCount",
                    metric_labels={"value_type": "share"},
                    legend="% of Columns Drifted",
                )
            ],
            plot_params={"plot_type": "line"},
        )
    )

In [33]:
# Make dashboard
init_dashboard_panel_german_credit()

# Save
project.save()
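
The six metric panels differ only in title and metric arguments, so the layout can also be driven from a small table. The sketch below is one possible refactor, not an Evidently feature. `CREDIT_PANELS` and `add_line_panels` are hypothetical names, and the panel and metric classes are passed in as arguments so the helper can be exercised without a running workspace.

```python
CREDIT_PANELS = [
    {"title": "Accuracy", "metric": "Accuracy"},
    {"title": "AUROC", "metric": "RocAuc", "legend": "AUROC"},
    {"title": "TPR", "metric": "TPR", "legend": "TPR"},
    {"title": "FPR", "metric": "FPR", "legend": "FPR"},
    {"title": "Row Count", "metric": "RowCount"},
    {
        "title": "% Drifted Columns",
        "metric": "DriftedColumnsCount",
        "metric_labels": {"value_type": "share"},
        "legend": "% of Columns Drifted",
    },
]


def add_line_panels(dashboard, panel_cls, metric_cls, specs):
    """Add one half-width line panel per spec.

    Every key except "title" is forwarded to `metric_cls` as a keyword argument.
    """
    for spec in specs:
        metric_kwargs = {k: v for k, v in spec.items() if k != "title"}
        dashboard.add_panel(
            panel_cls(
                title=spec["title"],
                size="half",
                values=[metric_cls(**metric_kwargs)],
                plot_params={"plot_type": "line"},
            )
        )


# Example with the classes imported earlier in this notebook:
# add_line_panels(project.dashboard, DashboardPanelPlot, PanelMetric, CREDIT_PANELS)
```

Panel descriptions are left out here for brevity; they could be added as another key in each spec and forwarded to the panel constructor.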