# Multivariate Time Series Anomaly Detection Example

Multivariate time series anomaly detection works in largely the same way as univariate time series anomaly detection (covered in the univariate anomaly detection tutorials).

To begin, we will load the multivariate MSL dataset for time series anomaly detection.

In [4]:
from merlion.utils import TimeSeries
from ts_datasets.anomaly import MSL

time_series, metadata = MSL()[0]
train_data = TimeSeries.from_pd(time_series[metadata.trainval])
test_data = TimeSeries.from_pd(time_series[~metadata.trainval])
test_labels = TimeSeries.from_pd(metadata.anomaly[~metadata.trainval])

print(f"Time series is {train_data.dim}-dimensional")

display(train_data)

Time series is 55-dimensional


                            0    1    2    3    4    5    6    7    8    9  \
time                                                                         
1970-01-01 00:00:00  2.146646  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   
1970-01-01 00:01:00  2.146646  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   
1970-01-01 00:02:00  2.146646  0.0  0.0  0.0  0.0  1.0  0.0  0.0  0.0  0.0   
1970-01-01 00:03:00  2.151326  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   
1970-01-01 00:04:00  2.163807  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   
...                       ...  ...  ...  ...  ...  ...  ...  ...  ...  ...   
1970-02-10 11:52:00  0.333338  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   
1970-02-10 11:53:00  0.333338  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   
1970-02-10 11:54:00  0.333338  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   
1970-02-10 11:55:00  0.333338  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0   
1970-02-10 11:56:00  0.333338  0.0  0.0  0.0  0.0  0.0  0.0  0.0

In [2]:
# We initialize models using the model factory in this tutorial
# We manually set the detection threshold to 2 (in standard deviation units) for all models
from merlion.models.factory import ModelFactory
from merlion.post_process.threshold import AggregateAlarms

model = ModelFactory.create("IsolationForest",
                             threshold=AggregateAlarms(alm_threshold=2))

train_scores = model.train(train_data)

In [5]:
from merlion.evaluate.anomaly import TSADMetric

labels = model.get_anomaly_label(test_data)
precision = TSADMetric.PointAdjustedPrecision.value(ground_truth=test_labels, predict=labels)
recall = TSADMetric.PointAdjustedRecall.value(ground_truth=test_labels, predict=labels)
f1 = TSADMetric.PointAdjustedF1.value(ground_truth=test_labels, predict=labels)
mttd = TSADMetric.MeanTimeToDetect.value(ground_truth=test_labels, predict=labels)
print(f"{type(model).__name__}")
print(f"Precision: {precision:.4f}")
print(f"Recall:    {recall:.4f}")
print(f"F1:        {f1:.4f}")
print(f"MTTD:      {mttd}")
print()

IsolationForest
Precision: 0.9638
Recall:    0.8192
F1:        0.8856
MTTD:      0 days 01:40:57



AssertionError: Plotting only supported for univariate time series, but got atime series of dimension 55

For the purposes of this tutorial, we will be using 3 models:

- DefaultDetector (which automatically detects whether the input time series is univariate or multivariate);
- IsolationForest (a classic algorithm); and
- A DetectorEnsemble which takes the maximum anomaly score returned by either model.
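
To make the "maximum anomaly score" idea concrete, here is a minimal sketch in plain Python. It is not Merlion's combiner: the function name `max_combine` and the score values are hypothetical, and it assumes both models have already produced calibrated scores on the same timestamps.

```python
from typing import Dict, List


def max_combine(scores_by_model: Dict[str, List[float]]) -> List[float]:
    """Element-wise maximum over per-model anomaly scores (illustrative only)."""
    series = list(scores_by_model.values())
    if not series:
        raise ValueError("need scores from at least one model")
    length = len(series[0])
    if any(len(s) != length for s in series):
        raise ValueError("all score series must have the same length")
    # zip(*series) groups the scores that share a timestamp
    return [max(values) for values in zip(*series)]


# Hypothetical calibrated scores from two detectors on four timestamps
scores = {
    "DefaultDetector": [0.1, 2.5, 0.3, 1.9],
    "IsolationForest": [0.4, 1.0, 2.2, 0.2],
}
combined = max_combine(scores)
print(combined)  # [0.4, 2.5, 2.2, 1.9]

# With a threshold of 2, the ensemble fires wherever either model does
alarms = [s >= 2 for s in combined]
```

Compared with averaging, taking the maximum lets a single confident model raise an alarm on its own, which tends to favor recall.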

Note that every multivariate anomaly detection model can also be applied to univariate time series, whereas some Merlion models (e.g. WindStats, ZMS, StatThreshold) only support univariate time series. In either case, the API is identical to that of the univariate anomaly detection models.

In [2]:
# We initialize models using the model factory in this tutorial
# We manually set the detection threshold to 2 (in standard deviation units) for all models
from merlion.models.factory import ModelFactory
from merlion.post_process.threshold import AggregateAlarms

model1 = ModelFactory.create("DefaultDetector",
                             threshold=AggregateAlarms(alm_threshold=2))

model2 = ModelFactory.create("IsolationForest",
                             threshold=AggregateAlarms(alm_threshold=2))

# Here, we create a _max ensemble_ that takes the maximal anomaly score
# returned by any individual model (rather than the mean).
model3 = ModelFactory.create("DetectorEnsemble", models=[model1, model2],
                             threshold=AggregateAlarms(alm_threshold=2),
                             combiner={"name": "Max"})

for model in [model1, model2, model3]:
    print(f"Training {type(model).__name__}...")
    train_scores = model.train(train_data)

Training DefaultDetector...


Inferred granularity <Minute>


Training IsolationForest...
Training DetectorEnsemble...


Inferred granularity <Minute>




As with univariate models, we can call get_anomaly_label() to obtain a sequence of post-processed (calibrated and thresholded) anomaly scores, here for the test data.

We can then use these to evaluate the model’s performance.
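
As a rough picture of what "calibrated and thresholded" means, the sketch below rescales raw scores into standard deviation units and zeroes out everything below a threshold of 2. It is a simplified stand-in, not Merlion's post-processing: the helper names `calibrate` and `apply_threshold` are hypothetical, a plain z-score is assumed in place of the library's calibrator, and any alarm aggregation or suppression the library may perform is ignored.

```python
from statistics import mean, pstdev
from typing import List


def calibrate(train_scores: List[float], scores: List[float]) -> List[float]:
    """Express scores in standard deviation units of the training scores.

    Assumption: a plain z-score stands in for a real calibrator.
    """
    mu = mean(train_scores)
    sigma = pstdev(train_scores)
    if sigma == 0:
        raise ValueError("training scores have zero variance")
    return [(s - mu) / sigma for s in scores]


def apply_threshold(z_scores: List[float], alm_threshold: float = 2.0) -> List[float]:
    """Keep scores at or above the threshold; set the rest to 0 (no alarm)."""
    return [z if z >= alm_threshold else 0.0 for z in z_scores]


# Hypothetical raw scores (mean 1.0, standard deviation 1.0 on the training set)
train_raw = [0.0, 2.0, 0.0, 2.0]
test_raw = [1.0, 3.5, 2.5]

z = calibrate(train_raw, test_raw)
labels = apply_threshold(z, alm_threshold=2.0)
print(z)       # [0.0, 2.5, 1.5]
print(labels)  # [0.0, 2.5, 0.0]
```

Only the second test point exceeds 2 standard deviations, so it is the only one that survives as a nonzero label.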

In [3]:
from merlion.evaluate.anomaly import TSADMetric

for model in [model1, model2, model3]:
    labels = model.get_anomaly_label(test_data)
    precision = TSADMetric.PointAdjustedPrecision.value(ground_truth=test_labels, predict=labels)
    recall = TSADMetric.PointAdjustedRecall.value(ground_truth=test_labels, predict=labels)
    f1 = TSADMetric.PointAdjustedF1.value(ground_truth=test_labels, predict=labels)
    mttd = TSADMetric.MeanTimeToDetect.value(ground_truth=test_labels, predict=labels)
    print(f"{type(model).__name__}")
    print(f"Precision: {precision:.4f}")
    print(f"Recall:    {recall:.4f}")
    print(f"F1:        {f1:.4f}")
    print(f"MTTD:      {mttd}")
    print()

DefaultDetector
Precision: 0.9611
Recall:    0.8325
F1:        0.8922
MTTD:      0 days 01:25:12

IsolationForest
Precision: 0.9638
Recall:    0.8192
F1:        0.8856
MTTD:      0 days 01:40:57

DetectorEnsemble
Precision: 0.9638
Recall:    0.8322
F1:        0.8932
MTTD:      0 days 01:34:28
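
For readers unfamiliar with the point-adjusted metrics reported above, the following sketch shows one common definition: if any point inside a contiguous ground-truth anomaly segment is flagged, the whole segment counts as detected, while false positives are still counted point by point. This is an illustration under that assumption, not Merlion's implementation, and the helper names and toy labels are hypothetical.

```python
from typing import List, Tuple


def point_adjust(truth: List[int], pred: List[int]) -> List[int]:
    """Mark a whole true anomaly segment as predicted if any point in it is flagged."""
    adjusted = list(pred)
    i, n = 0, len(truth)
    while i < n:
        if truth[i]:
            j = i
            while j < n and truth[j]:
                j += 1  # j ends one past the last point of the segment
            if any(pred[i:j]):
                adjusted[i:j] = [1] * (j - i)
            i = j
        else:
            i += 1
    return adjusted


def precision_recall_f1(truth: List[int], pred: List[int]) -> Tuple[float, float, float]:
    """Plain point-wise precision, recall and F1."""
    tp = sum(1 for t, p in zip(truth, pred) if t and p)
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy labels: two true segments, one partially detected, one missed,
# plus a single false alarm at index 5
truth = [0, 1, 1, 1, 0, 0, 1, 1, 0, 0]
pred = [0, 0, 1, 0, 0, 1, 0, 0, 0, 0]

adjusted = point_adjust(truth, pred)
precision, recall, f1 = precision_recall_f1(truth, adjusted)
print(adjusted)           # [0, 1, 1, 1, 0, 1, 0, 0, 0, 0]
print(precision, recall)  # 0.75 0.6
```

Point adjustment rewards detecting an anomalous event at all rather than flagging every one of its points, which is why the adjusted scores are usually higher than raw point-wise ones.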



We can also use a TSADEvaluator to evaluate a model in a manner that simulates live deployment.

Here, we train an initial model on the training data, and we obtain its predictions on the test data using a sliding window of 1 week (cadence="1w").

However, we only retrain the model every 4 weeks (retrain_freq="4w").
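
The interaction between the prediction cadence and the retraining frequency can be pictured with a small scheduling sketch. It only approximates the idea described above and is not how TSADEvaluator is implemented; the function `sliding_schedule`, the start date, and the 10-week test span are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def sliding_schedule(
    test_start: datetime,
    test_end: datetime,
    cadence: timedelta,
    retrain_freq: timedelta,
) -> List[Tuple[datetime, datetime, bool]]:
    """Return (window_start, window_end, retrained_before_window) tuples."""
    plan = []
    last_train = test_start  # assumption: the initial model is trained just before the test period
    t = test_start
    while t < test_end:
        # Retrain only once at least retrain_freq has passed since the last training
        retrain = (t - last_train) >= retrain_freq
        if retrain:
            last_train = t
        window_end = min(t + cadence, test_end)
        plan.append((t, window_end, retrain))
        t = window_end
    return plan


# Hypothetical 10-week test period, predicted weekly, retrained every 4 weeks
start = datetime(2024, 1, 1)
end = start + timedelta(weeks=10)
plan = sliding_schedule(start, end, timedelta(weeks=1), timedelta(weeks=4))

retrain_windows = [i for i, (_, _, retrained) in enumerate(plan) if retrained]
print(len(plan))        # 10
print(retrain_windows)  # [4, 8]
```

In this toy schedule the model predicts ten weekly windows but is retrained only twice, before the windows starting at weeks 4 and 8.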

In [4]:
from merlion.evaluate.anomaly import TSADEvaluator, TSADEvaluatorConfig
for model in [model1, model2, model3]:
    print(f"{type(model).__name__} Sliding Window Evaluation")
    evaluator = TSADEvaluator(model=model, config=TSADEvaluatorConfig(
        cadence="1w", retrain_freq="4w"))
    train_result, test_pred = evaluator.get_predict(train_vals=train_data, test_vals=test_data)
    precision = evaluator.evaluate(ground_truth=test_labels, predict=test_pred,
                                   metric=TSADMetric.PointAdjustedPrecision)
    recall = evaluator.evaluate(ground_truth=test_labels, predict=test_pred,
                                metric=TSADMetric.PointAdjustedRecall)
    f1 = evaluator.evaluate(ground_truth=test_labels, predict=test_pred,
                            metric=TSADMetric.PointAdjustedF1)
    mttd = evaluator.evaluate(ground_truth=test_labels, predict=test_pred,
                              metric=TSADMetric.MeanTimeToDetect)
    print(f"Precision: {precision:.4f}")
    print(f"Recall:    {recall:.4f}")
    print(f"F1:        {f1:.4f}")
    print(f"MTTD:      {mttd}")
    print()

DefaultDetector Sliding Window Evaluation


Inferred granularity <Minute>




TSADEvaluator:  55%|█████▍    | 2419200/4423680 [00:36<00:30, 65251.04it/s]Inferred granularity <Minute>
TSADEvaluator:  55%|█████▍    | 2419200/4423680 [00:50<00:30, 65251.04it/s]



TSADEvaluator: 100%|██████████| 4423680/4423680 [06:16<00:00, 11739.40it/s]


Precision: 0.9537
Recall:    0.7741
F1:        0.8546
MTTD:      0 days 01:39:26

IsolationForest Sliding Window Evaluation


TSADEvaluator: 100%|██████████| 4423680/4423680 [00:10<00:00, 429062.53it/s]


Precision: 0.9666
Recall:    0.8321
F1:        0.8943
MTTD:      0 days 01:40:42

DetectorEnsemble Sliding Window Evaluation


Inferred granularity <Minute>




TSADEvaluator:  55%|█████▍    | 2419200/4423680 [00:43<00:36, 54897.78it/s]Caught an exception while training model 1/2 (DefaultDetector). Model will not be used. Traceback (most recent call last):
  File "/root/miniconda3/envs/sys843_env/lib/python3.12/site-packages/merlion/models/ensemble/anomaly.py", line 162, in _train
    train_scores, valid_scores = TSADEvaluator(model=model, config=eval_cfg).get_predict(
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/sys843_env/lib/python3.12/site-packages/merlion/evaluate/anomaly.py", line 443, in get_predict
    train_result, result = super().get_predict(
                           ^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/sys843_env/lib/python3.12/site-packages/merlion/evaluate/base.py", line 202, in get_predict
    train_result = self._train_model(train_vals, **full_train_kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/roo

Precision: 0.9619
Recall:    0.8128
F1:        0.8811
MTTD:      0 days 01:22:36

