# Advanced Univariate Modelling

`ML.NET` offers an additional package, `Microsoft.ML.TimeSeries`, with functionality tailored to time series analysis. Its models behave a bit differently from the "usual" ML models, but the overall usage is rather similar.
We will use the Singular Spectrum Analysis (SSA) trainer to try out a different forecasting approach.

## Loading and Preparation

The loading and preparation steps don't differ a lot from the approach in `2_a`.

In [None]:
#r "nuget: Deedle, 2.3.0"
#r "nuget: Plotly.NET, 2.0.0-beta9"
#r "nuget: Plotly.NET.Interactive, 2.0.0-beta9"
#r "nuget: Microsoft.ML, 1.5.5"
#r "nuget: Microsoft.ML.TimeSeries, 1.5.5"
#r "nuget: FSharp.Stats, 0.4.1"

#i "nuget:https://www.myget.org/F/gregs-experimental-packages/api/v3/index.json"
#r "nuget:Deedle.DotNet.Interactive.Extension, 0.1.0-alpha6"

In [None]:
open System
open Deedle
open Plotly.NET
open Microsoft.ML
open Microsoft.ML.Data
open Microsoft.ML.Trainers
open Microsoft.ML.Transforms
open Microsoft.ML.Transforms.TimeSeries
open FSharp.Stats.Correlation

let data =
    Frame.ReadCsv("../data/at_load_hourly_mw.csv", hasHeaders = true, culture = "en-US", inferTypes = true, inferRows = 5_000)
    |> Frame.dropCol "Ticks"
    |> Frame.indexRowsDate "TimeStamp"

let dataTrain =
    data
    |> Frame.filterRows (fun key _ -> key.Year < 2019)

let dataTest =
    data
    |> Frame.filterRows (fun key _ -> key.Year >= 2019)

dataTrain
|> Frame.skip 3

## A New Modelling Approach

Compared with the linear model, the data preparation has fewer steps. All we need are the values we want to base our forecast on, ordered in time. A forecast always covers a fixed horizon, and for every point estimate it also returns the lower and upper bounds of the specified confidence interval.

In [None]:
[<CLIMutable>]
type AlternativeForecastInput =
    { Load: float32
      TimeStamp: DateTime }

[<CLIMutable>]
type AlternativeLoadForecast =
  { Forecast: float32 array
    LowerBound: float32 array
    UpperBound: float32 array }

let altForecastInputs =
  dataTrain?Value
  |> Series.observations
  |> Seq.map (fun (k, v) -> { Load = float32 v; TimeStamp = k})

let testKeys, testRows =
    dataTest?Value
    |> Series.observations
    |> Seq.unzip

An SSA forecasting trainer has a couple of hyperparameters to tune, which depend on what you want to pick up (such as your seasonalities and trends). It also needs a bit more time to fit a model, as the process is far more involved than fitting an OLS.

In [None]:
let mlContext = MLContext(seed = 42)

let pipeline =
  mlContext.Forecasting.ForecastBySsa(
    "Forecast",
    nameof Unchecked.defaultof<AlternativeForecastInput>.Load,
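    windowSize = 24 * 30,           // 30 days of hourly values
    seriesLength = 24 * 30 * 2,     // 60 days of hourly values kept in the buffer
    trainSize = dataTrain.RowCount, // use the whole training set
    horizon = 24 * 30,              // default forecast length: 30 days
    confidenceLevel = 0.90f,
    confidenceLowerBoundColumn = "LowerBound",
    confidenceUpperBoundColumn = "UpperBound"
  )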

let altForecastData = mlContext.Data.LoadFromEnumerable(altForecastInputs)

let model = pipeline.Fit(altForecastData)
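
Because a fit takes a while, it can be worth checking the size-related hyperparameters before training. The following is a minimal sketch, not part of ML.NET: `checkSsaSettings` is a hypothetical helper, and the three rules it encodes (window size of at least 2, series length greater than the window size, training size greater than twice the window size) are assumptions about what the SSA trainer expects, so please verify them against the ML.NET documentation. The row count of 26_280 is a placeholder for `dataTrain.RowCount`.

```fsharp
// Hypothetical helper, not part of ML.NET: collects the violated rules
// before a (slow) fit is attempted. The three rules are assumptions
// about what the SSA trainer expects.
let checkSsaSettings (windowSize: int) (seriesLength: int) (trainSize: int) =
    [ if windowSize < 2 then
          yield "windowSize should be at least 2"
      if seriesLength <= windowSize then
          yield "seriesLength should be greater than windowSize"
      if trainSize <= 2 * windowSize then
          yield "trainSize should be greater than 2 * windowSize" ]

// Settings from the pipeline above: a 30-day window and a 60-day buffer
// on hourly data. 26_280 is an assumed stand-in for dataTrain.RowCount.
let problems = checkSsaSettings (24 * 30) (24 * 30 * 2) 26_280
printfn "%A" problems
```

An empty list means none of the assumed rules is violated; any returned message names the setting to revisit.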

In contrast to our linear model, time series models don't work with the normal prediction engine; they offer their own `TimeSeriesPredictionEngine` instead. This special engine allows us to request horizons of arbitrary length and even to update the model with new data (which I wouldn't do without evaluation).

Evaluating the forecast, we see a pretty close fit for values that do not reach too far into the future (from the perspective of our known data). Also, the degree of uncertainty increases quickly, which is exactly what we would expect!

In [None]:
let forecastingEngine = model.CreateTimeSeriesEngine<AlternativeForecastInput, AlternativeLoadForecast>(mlContext)

let horizon = 24 * 5
let forecast = forecastingEngine.Predict(horizon = horizon)

let predChart =
    Seq.zip (testKeys |> Seq.take horizon) forecast.Forecast
    |> fun xy -> Chart.Range(xy,
                             forecast.LowerBound,
                             forecast.UpperBound,
                             mode = StyleParam.Mode.Lines,
                             Color = Colors.toWebColor Colors.Table.Office.blue,
                             RangeColor = Colors.toWebColor Colors.Table.Office.lightBlue)
    |> Chart.withTraceName "Forecast_CI"

let actualChart =
    Seq.zip (testKeys |> Seq.take horizon) (testRows |> Seq.take horizon)
    |> fun xy -> Chart.Line(xy,
                            Color = Colors.toWebColor Colors.Table.Office.orange,
                            UseWebGL = true,
                            Name = "Actual")

[ actualChart; predChart ]
|> Chart.Combine
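
The chart shows the confidence band visually; a number can complement it. The following is a minimal sketch of an empirical coverage check, that is, the share of actual values that fall inside the band. `intervalCoverage` is a hypothetical helper and the numbers are toy values for illustration, not taken from the load data.

```fsharp
// Hypothetical helper: fraction of actual values inside [lower, upper].
// Inputs are truncated to the shortest array; an empty input yields nan.
let intervalCoverage (lower: float32[]) (upper: float32[]) (actual: float32[]) =
    let n = Array.min [| lower.Length; upper.Length; actual.Length |]
    if n = 0 then nan
    else
        let hits =
            Seq.init n (fun i -> lower.[i] <= actual.[i] && actual.[i] <= upper.[i])
            |> Seq.filter id
            |> Seq.length
        float hits / float n

// Toy numbers for illustration only.
let coverage =
    intervalCoverage
        [| 90.0f; 95.0f; 100.0f; 80.0f |]
        [| 110.0f; 115.0f; 120.0f; 100.0f |]
        [| 100.0f; 120.0f; 101.0f; 99.0f |]
printfn "coverage = %.2f" coverage
```

In this notebook the call could look like `intervalCoverage forecast.LowerBound forecast.UpperBound (testRows |> Seq.take horizon |> Seq.map float32 |> Seq.toArray)`. With a confidence level of 0.90, a coverage far below 0.9 would suggest that the band is too narrow.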

On a small, granular scale this looks similar to our linear model, but if we plot the predicted values against the actual values in a scatter plot, we can see the immense difference our SSA trainer made in short-term forecasts.

In [None]:
// Take exactly as many actual values as were forecast, so both series cover the same period.
let actualVals = testRows |> Seq.take horizon |> Seq.map float32
let predVals = forecast.Forecast

let minVal =
    min (Seq.min predVals) (Seq.min actualVals)
    |> float
    |> fun v -> v - 100.

let largestVal =
    max (Seq.max predVals) (Seq.max actualVals)
    |> float
    |> fun v -> v + 100.

let diagonalLine =
    [ (minVal, minVal); (largestVal, largestVal) ]
    |> fun xy -> Chart.Line(xy, Name = "Diagonal")

let predActualScatter =
    Seq.zip predVals actualVals
    |> fun xy -> Chart.Point(xy, UseWebGL = true, Name = "Pred/Actual")
    |> Chart.withX_AxisStyle ("predictions", MinMax = (minVal, largestVal))
    |> Chart.withY_AxisStyle ("actual", MinMax = (minVal, largestVal))

[ predActualScatter; diagonalLine ]
|> Chart.Combine
|> display

Seq.pearson actualVals predVals
|> display
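
The Pearson correlation tells us how well predictions and actual values move together, but it does not reveal a constant offset or a scaling error. Error metrics in the unit of the data (here MW) fill that gap. The following is a minimal sketch with two hypothetical helpers, `mae` and `rmse`, shown on toy numbers that are not taken from the load data.

```fsharp
// Hypothetical helpers: mean absolute error and root mean squared error.
// Both truncate to the shorter input (as Seq.zip does) and fail on empty input.
let mae (actual: seq<float>) (predicted: seq<float>) =
    Seq.zip actual predicted
    |> Seq.averageBy (fun (a, p) -> abs (a - p))

let rmse (actual: seq<float>) (predicted: seq<float>) =
    Seq.zip actual predicted
    |> Seq.averageBy (fun (a, p) -> (a - p) ** 2.0)
    |> sqrt

// Toy numbers for illustration only.
let toyActual = [ 100.0; 110.0; 120.0 ]
let toyPred = [ 98.0; 113.0; 120.0 ]
printfn "MAE = %.3f, RMSE = %.3f" (mae toyActual toyPred) (rmse toyActual toyPred)
```

In this notebook the values are `float32`, so a call could look like `mae (Seq.map float actualVals) (Seq.map float predVals)`. RMSE penalises large misses more strongly than MAE, so a big gap between the two hints at a few large errors.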

At this point, if we are happy with the model, we can save it for later use.

In [None]:
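let modelDirectory = "../models"
// Make sure the target directory exists; saving fails if it is missing.
System.IO.Directory.CreateDirectory(modelDirectory) |> ignore
let forecastModel = System.IO.Path.Combine(modelDirectory, "forecast_model.zip")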

mlContext.Model.Save(model, altForecastData.Schema, forecastModel)