Power Consumption Anomaly Detection

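| ML.NET version | API type    | Status     | App Type    | Data type  | Scenario                      | ML Task                         | Algorithms        |
|----------------|-------------|------------|-------------|------------|-------------------------------|---------------------------------|-------------------|
| v1.3.1         | Dynamic API | Up-to-date | Console app | .csv files | Power Meter Anomaly Detection | Time Series - Anomaly Detection | SsaSpikeDetection |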

In this sample, you'll see how to use ML.NET to detect anomalies in time series data.

Problem

This problem is focused on finding spikes in power consumption based on daily readings from a smart electric meter.

To solve this problem, we will build an ML model that takes as inputs:

  • date and time
  • meter reading difference, normalized by the time span between readings (ConsumptionDiffNormalized)

and generates an alert when it detects an anomaly.
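The sample's CSV already contains the normalized value, so the following is only a minimal sketch of how such a column could be derived from raw cumulative readings. The MeterReading type, the ConsumptionNormalizer helper, and the choice of "per day" as the normalization unit are assumptions made for illustration; they are not part of the sample.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical raw reading: a timestamp and a cumulative meter value.
public sealed class MeterReading
{
    public MeterReading(DateTime time, double cumulativeValue)
    {
        Time = time;
        CumulativeValue = cumulativeValue;
    }

    public DateTime Time { get; }
    public double CumulativeValue { get; }
}

public static class ConsumptionNormalizer
{
    // For each pair of consecutive readings, divides the difference in the
    // meter value by the elapsed time in days (an assumed unit).
    public static List<(DateTime Time, float ConsumptionDiffNormalized)> Normalize(
        IReadOnlyList<MeterReading> readings)
    {
        var result = new List<(DateTime Time, float ConsumptionDiffNormalized)>();
        for (int i = 1; i < readings.Count; i++)
        {
            double elapsedDays = (readings[i].Time - readings[i - 1].Time).TotalDays;
            if (elapsedDays <= 0)
            {
                continue; // skip duplicate or out-of-order timestamps
            }

            double diff = readings[i].CumulativeValue - readings[i - 1].CumulativeValue;
            result.Add((readings[i].Time, (float)(diff / elapsedDays)));
        }
        return result;
    }
}
```

Dividing by the elapsed time keeps readings taken at irregular intervals comparable, so a long gap between readings is not mistaken for a spike.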

ML task - Time Series

The goal is the identification of rare items, events or observations which raise suspicions by differing significantly from the majority of the time series data.
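To make "differing significantly from the majority" concrete, here is a deliberately naive sketch that flags values lying more than a given number of standard deviations from the mean. This is not the SSA-based method the sample uses, and the NaiveSpikeDetector name and the default threshold of 3 are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NaiveSpikeDetector
{
    // Returns the indices of values whose distance from the mean exceeds
    // `threshold` population standard deviations.
    public static List<int> FindSpikes(IReadOnlyList<double> values, double threshold = 3.0)
    {
        var spikes = new List<int>();
        if (values.Count < 2)
        {
            return spikes;
        }

        double mean = values.Average();
        double variance = values.Sum(v => (v - mean) * (v - mean)) / values.Count;
        double stdDev = Math.Sqrt(variance);
        if (stdDev == 0)
        {
            return spikes; // all values identical: nothing stands out
        }

        for (int i = 0; i < values.Count; i++)
        {
            if (Math.Abs(values[i] - mean) / stdDev > threshold)
            {
                spikes.Add(i);
            }
        }
        return spikes;
    }
}
```

A global threshold like this ignores trend and seasonality, which is one reason a dedicated time series detector is used in the sample instead.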

Solution

A typical ML.NET workflow builds and trains a model on existing training data, evaluates how good it is by analyzing the resulting metrics, and finally consumes the model to make predictions on new input data.

Build -> Train -> Evaluate -> Consume

In this sample, however, we only build and train the model: the Time Series anomaly detection library runs its detection on the actual data and does not have an evaluate method. We then review the detected anomalies in the Prediction output column.

1. Build model

Building a model includes:

  • Preparing and loading the data with LoadFromTextFile

  • Choosing a time series Estimator and setting its parameters

The initial code is similar to the following:

// Create a common ML.NET context.
var ml = new MLContext();

[...]

// Create a class for the dataset
class MeterData
{
    [LoadColumn(0)]
    public string name { get; set; }
    [LoadColumn(1)]
    public DateTime time { get; set; }
    [LoadColumn(2)]
    public float ConsumptionDiffNormalized { get; set; }
}

[...]

// Load the data
[...]

var dataView = ml.Data.LoadFromTextFile<MeterData>(
                TrainingData,
                separatorChar: ',',
                hasHeader: true);

[...]

// Prepare the Prediction output column for the model
class SpikePrediction
{
    [VectorType(3)]
    public double[] Prediction { get; set; }
}

[...]

// Configure the Estimator
const int PValueSize = 30;
const int SeasonalitySize = 30;
const int TrainingSize = 90;
const int ConfidenceInterval = 98;

string outputColumnName = nameof(SpikePrediction.Prediction);
string inputColumnName = nameof(MeterData.ConsumptionDiffNormalized);  

var trainigPipeLine = ml.Transforms.DetectSpikeBySsa(
                outputColumnName,
                inputColumnName,
                confidence: ConfidenceInterval,
                pvalueHistoryLength: PValueSize,
                trainingWindowSize: TrainingSize,
                seasonalityWindowSize: SeasonalitySize);

2. Train model

Training the model means running the chosen algorithm on the training data to tune the model's parameters. It is implemented in the Fit() method of the Estimator object.

To train the model, call the Fit() method and pass it the training dataset (power-export_min.csv) loaded into a DataView object.

ITransformer trainedModel = trainigPipeLine.Fit(dataView);

3. View the anomalies

You can view the detected anomalies from the Time Series model by accessing the output column.

var transformedData = trainedModel.Transform(dataView);
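Each row of the Prediction column is a three-element vector. The sketch below assumes the layout documented for ML.NET's SSA spike detector: element 0 is the alert flag (non-zero means a spike), element 1 is the raw score, and element 2 is the p-value. The SpikeReport helper and its output format are hypothetical, not part of the sample.

```csharp
using System;

public static class SpikeReport
{
    // Assumed layout: [0] alert (non-zero = spike), [1] raw score, [2] p-value.
    public static bool IsSpike(double[] prediction)
    {
        if (prediction == null || prediction.Length != 3)
        {
            throw new ArgumentException(
                "Expected a 3-element prediction vector.", nameof(prediction));
        }
        return prediction[0] != 0;
    }

    // Formats one row as tab-separated text and marks rows flagged as spikes.
    public static string Format(DateTime time, double[] prediction)
    {
        string flag = IsSpike(prediction) ? "<-- Spike detected" : "";
        string line = FormattableString.Invariant(
            $"{time:yyyy-MM-dd}\t{prediction[0]:0}\t{prediction[1]:0.00}\t{prediction[2]:0.00}\t{flag}");
        return line.TrimEnd(); // drop the trailing tab when there is no flag
    }
}
```

In the sample itself, the rows could be read with ml.Data.CreateEnumerable<SpikePrediction>(transformedData, reuseRowObject: false) and each row's Prediction array passed to a helper like this one.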