[source](../api/alibi_detect.cd.margindensity.rst)

# MarginDensity

## Overview

The margin density drift detector (MD3) ([Sethi and Kantardzic, 2015](https://www.sciencedirect.com/science/article/pii/S1877050915017871)) measures the fraction of predictions that a probabilistic binary classifier makes close to its decision boundary, within a region defined by a margin. This fraction is known as the margin density. Drift is detected if the margin density calculated for a given batch falls outside an allowable margin density range. Care should be taken when setting this range. It can be guided, for example, by calculating the margin density mean and variance on out-of-fold instances when performing k-fold cross-validation, or on data batches from an additional holdout set that are characteristically similar to the data on which the binary classifier was trained. Margin densities below or above the allowable range can indicate virtual drift, concept drift and/or general changes in model performance.
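The calculation described above can be sketched in plain Python. This is only an illustration, not the detector's implementation: it assumes the margin is a band of total width `margin` centred on a 0.5 decision threshold, and the helper names `margin_density` and `check_drift`, the direction labels, and the example probabilities are all hypothetical.

```python
def margin_density(probs, margin):
    """Fraction of predictions whose positive-class probability lies in the margin.

    Assumption: the margin is a band of total width `margin` centred on the
    0.5 decision boundary, i.e. |p - 0.5| <= margin / 2.
    """
    if not probs:
        raise ValueError("probs must contain at least one prediction")
    half_width = margin / 2
    in_margin = sum(1 for p in probs if abs(p - 0.5) <= half_width)
    return in_margin / len(probs)


def check_drift(density, density_range):
    """Compare a batch-level margin density with the allowable range.

    Returns (is_drift, direction); the direction labels are illustrative only.
    """
    lower, upper = density_range
    if density < lower:
        return 1, "below"
    if density > upper:
        return 1, "above"
    return 0, "within"


# Hypothetical positive-class probabilities for a batch of 10 instances.
batch_probs = [0.02, 0.97, 0.51, 0.10, 0.88, 0.47, 0.93, 0.05, 0.54, 0.99]

density = margin_density(batch_probs, margin=0.1)
is_drift, direction = check_drift(density, density_range=(0.08, 0.12))
print(density, is_drift, direction)
```

In this made-up batch, three of the ten predictions fall inside the margin, so the margin density exceeds the upper bound of the range and the batch is flagged.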

Many alternative drift detection methods focus on tracking changes in the distribution of the data inputs. These approaches can be prone to false positives because they implicitly give equal importance to all features, even those that matter very little to the classifier. The MD3 approach instead uses the change in the percentage of samples that fall within the classifier's margin around the decision boundary as a proxy for changes in the probability distribution of the labels given the data inputs, _without actually requiring any labeled data_. This approach tends to be more robust against false positives because the classifier accounts for differences in feature importance and gives little emphasis to features that do not affect classification performance.


## Usage

### Initialize


* `margin`: Width of margin at decision boundary.

* `model`: Trained binary classification model.

* `density_range`: Tuple of length 2 that defines margin density lower and upper bounds.

* `data_type`: Optionally specify the data type (tabular or image). Added to metadata.

Initialized drift detector example:

```python
from alibi_detect.cd import MarginDensityDrift

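# `model` is an already trained probabilistic binary classifier
cd = MarginDensityDrift(
    margin=0.1,  # width of the margin at the decision boundary
    model=model,
    density_range=(0.08, 0.12)  # allowable (lower, upper) margin density
)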
```
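One possible way to choose `density_range`, following the guidance in the Overview, is to measure the margin density on several reference batches (for example the out-of-fold predictions of each cross-validation fold) and take a band around their mean. The sketch below is an assumption-laden illustration: the helper `estimate_density_range`, the choice of two standard deviations, and the example densities are hypothetical and not part of the library.

```python
from statistics import mean, stdev


def estimate_density_range(reference_densities, n_std=2.0):
    """Return (lower, upper) as mean +/- n_std standard deviations, clipped to [0, 1].

    `n_std` is a tuning choice assumed for illustration, not a library default.
    """
    if len(reference_densities) < 2:
        raise ValueError("need at least two reference margin densities")
    mu = mean(reference_densities)
    sigma = stdev(reference_densities)  # sample standard deviation
    lower = max(0.0, mu - n_std * sigma)
    upper = min(1.0, mu + n_std * sigma)
    return lower, upper


# Hypothetical margin densities measured on five out-of-fold batches.
fold_densities = [0.09, 0.11, 0.10, 0.12, 0.08]

density_range = estimate_density_range(fold_densities)
print(density_range)
```

A wider band (larger `n_std`) trades fewer false alarms for slower detection, so the multiplier is best chosen with the cost of missed drift in mind.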

### Detect Drift

We detect drift by calling `predict` on a batch of instances `X`. Setting `return_metric` to *True* also returns the drift metric (here, the margin density) and the threshold used by the detector (here, the allowable density range).

The prediction takes the form of a dictionary with `meta` and `data` keys. `meta` contains the detector's metadata while `data` is also a dictionary which contains the actual predictions stored in the following keys:

* `is_drift`: 1 if the margin density of the batch tested falls outside the allowable density range and 0 otherwise.

* `margin`: user-defined margin width that is used to decide whether or not a prediction falls within the margin.

* `margin_density`: calculated value defined by the number of in-margin predictions divided by the total number of samples for a given batch.

* `density_range`: user-defined lower and upper bounds of the allowable margin density.

* `direction`: indicates whether the calculated margin density lies below or above the allowable density range.


```python
preds_drift = cd.predict(X)
```
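The returned dictionary can then be inspected key by key. The snippet below uses a hand-written stand-in for the output of `cd.predict(X)` so that it runs on its own: the keys under `data` follow the list above, while the `meta` entries, all values, and the `summarize` helper are hypothetical.

```python
# Hypothetical stand-in for the dictionary returned by `cd.predict(X)`.
preds_drift = {
    "meta": {"name": "MarginDensityDrift", "data_type": "tabular"},
    "data": {
        "is_drift": 1,
        "margin": 0.1,
        "margin_density": 0.3,
        "density_range": (0.08, 0.12),
    },
}


def summarize(preds):
    """Build a one-line, human-readable summary of a prediction dictionary."""
    data = preds["data"]
    lower, upper = data["density_range"]
    status = "drift detected" if data["is_drift"] == 1 else "no drift"
    return (
        f"{status}: margin density {data['margin_density']:.2f} "
        f"(allowable range {lower:.2f} to {upper:.2f})"
    )


print(summarize(preds_drift))
```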

### Saving and loading

The drift detector can be saved and loaded in the same way as other detectors:

```python
from alibi_detect.utils.saving import save_detector, load_detector

filepath = 'my_path'
save_detector(cd, filepath)
cd = load_detector(filepath)
```

## Examples

[Drift detection on CIFAR10](../examples/cd_clf_cifar10.nblink)