First, we must import the TabularDrift detector from the alibi-detect package, as well
as the relevant packages for loading and splitting the data:

In [2]:
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from alibi_detect.cd import TabularDrift


Next, we must get and split the data:

In [3]:
wine_data = load_wine()
feature_names = wine_data.feature_names
X, y = wine_data.data, wine_data.target
X_ref, X_test, y_ref, y_test = train_test_split(
    X, y, test_size=0.50, random_state=42
)

Next, we must initialize our drift detector with the reference data and the p-value
threshold (p_val) to be used by the statistical significance tests. A larger p_val makes
the detector more sensitive, so if you want it to trigger on smaller differences in the
data distribution, you must select a larger p_val:

In [4]:
cd = TabularDrift(x_ref=X_ref, p_val=0.05)
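
To make the role of p_val more concrete, the sketch below shows one way a single drift decision can be derived from per-feature p-values. It assumes a Bonferroni correction, which to our understanding is the default multiple-testing correction in TabularDrift. The helper `bonferroni_drift` and the p-values are hypothetical, made up for illustration, and are not part of alibi-detect.

```python
from typing import Sequence, Tuple


def bonferroni_drift(p_values: Sequence[float], p_val: float = 0.05) -> Tuple[bool, float]:
    """Hypothetical helper: flag drift if any feature-level p-value falls
    below the Bonferroni-corrected threshold p_val / n_features."""
    threshold = p_val / len(p_values)
    is_drift = any(p < threshold for p in p_values)
    return is_drift, threshold


# Made-up p-values for three features, purely for illustration.
example_p_values = [0.40, 0.02, 0.75]

# With p_val=0.05 the corrected threshold is 0.05 / 3, so 0.02 does not trigger.
strict = bonferroni_drift(example_p_values, p_val=0.05)
# With a larger p_val=0.10 the threshold rises to 0.10 / 3 and 0.02 now triggers.
loose = bonferroni_drift(example_p_values, p_val=0.10)
print(strict[0], loose[0])
```

The same set of feature-level p-values yields no drift under the stricter setting and drift under the looser one, which is the sense in which a larger p_val makes the detector more sensitive.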



We can now check for drift in the test dataset against the reference dataset:

In [5]:
preds = cd.predict(X_test)
labels = ['No', 'Yes']
print('Drift: {}'.format(labels[preds['data']['is_drift']]))

Drift: No


Although there was no drift in this case, we can easily simulate a scenario where the
chemical apparatus being used for measuring the chemical properties experienced a
calibration error, and all the values are recorded as 7% higher than their true values. In
this case, if we run drift detection again against the same reference dataset, we will get
the following output:

In [6]:
X_test_error = X_test * 1.07
preds = cd.predict(X_test_error)
labels = ['No', 'Yes']
print('Drift: {}'.format(labels[preds['data']['is_drift']]))

Drift: Yes


This returns 'Drift: Yes', showing that the drift has been successfully detected.
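
As a rough illustration of why a uniform scaling error like this is picked up, the sketch below applies a two-sample Kolmogorov-Smirnov test, the kind of test TabularDrift runs on numerical features, to a single synthetic feature. The data are randomly generated stand-ins (not the wine dataset), and the mean, spread, and sample size are assumptions chosen only for illustration.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Synthetic stand-in for one numerical feature (all values are assumptions).
x_ref = rng.normal(loc=13.0, scale=0.8, size=200)
x_new = rng.normal(loc=13.0, scale=0.8, size=200)

# Simulate a calibration error that inflates every reading by 7%.
x_err = x_new * 1.07

# Compare the reference sample against the clean and the scaled batch.
res_clean = ks_2samp(x_ref, x_new)
res_err = ks_2samp(x_ref, x_err)

print(f"clean batch:  statistic={res_clean.statistic:.3f}, p-value={res_clean.pvalue:.3g}")
print(f"scaled batch: statistic={res_err.statistic:.3f}, p-value={res_err.pvalue:.3g}")
```

Because a 7% inflation moves this synthetic feature by roughly one standard deviation, the scaled batch should produce a much larger test statistic and a far smaller p-value than the clean batch drawn from the same distribution.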

The first drift detection example was very simple and showed us how to detect a basic case of
one-off data drift, specifically feature drift. We will now show an example of detecting label
drift, which works in much the same way, except that the labels now serve as the reference and
comparison datasets.