### Forecasting the next-hour load with a regressor on the Sonar dataset

This example shows how to do time-series forecasting on the Sonar dataset. First, we show how to transform the forecasting task into a regression problem. Then we show how to run hyperparameter optimization and search for the best regression model using the built-in `AutoMLExperiment` class.

## Install the NuGet packages needed for training the ML.NET model and plotting

In [None]:
// using nightly-build

#i "nuget:https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet5/nuget/v3/index.json" 
#i "nuget:https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-tools/nuget/v3/index.json"
#i "nuget:https://mlnetcli.blob.core.windows.net/mlnetcli/index.json"
#i "nuget:https://pkgs.dev.azure.com/dnceng/public/_packaging/MachineLearning/nuget/v3/index.json"
#r "nuget:MLNetAutoML.InteractiveExtension,0.1.1"
#r "nuget:XPlot.Plotly.Interactive,4.0.6"
#r "nuget:Microsoft.ML.AutoML,0.20.0-preview.22259.2"
#r "nuget:Microsoft.Data.Analysis,0.20.0-preview.22259.2"

In [None]:

// Import common usings.
using static Microsoft.DotNet.Interactive.Formatting.PocketViewTags;
using Microsoft.Data.Analysis;
using System;
using System.IO;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.AutoML;
using Microsoft.ML.Data;
using MLNetAutoML.InteractiveExtension;

### Import Dataset

Sonar is a time-series dataset that records the hourly-active usage metric of an internally used service on Azure. It has two columns, `DateTime` and `load`, where `load` records the hourly-active usage metric. Sonar shows a very strong weekly seasonal pattern, given its nature as an Azure service, and a slight trend (growth over time) as well. These features allow us to build a forecasting model that predicts the next hour's load, which we can then use to adjust the amount of computing power accordingly.

In the code block below, we show how to
- load the dataset
- transform a time-series forecasting problem into a regression problem by using the `load` values of the previous _N_ hours as features and the current `load` as the label.

In [None]:
var dataPath = @"C:\Users\xiaoyuz\Desktop\forecasting\us_data_6_month.csv";

var df = DataFrame.LoadCsv(dataPath);
// Use the previous two weeks of data as features: one lag column per hour (24 * 7 * 2 = 336 columns).
var prevWindows = Enumerable.Range(1, 24 * 7 * 2);
var loads = df["load"].Cast<float?>();
foreach(var i in prevWindows)
{
    var columnName = $"prev_{i}h";
    df[columnName] = DataFrameColumn.Create(columnName, Enumerable.Repeat<float?>(null, i).Concat(loads).SkipLast(i));
}
df

index,load,prev_2h,prev_3h,prev_4h,prev_5h,prev_6h,prev_7h,prev_166h,prev_167h,prev_168h,prev_169h,prev_170h,prev_171h,prev_172h
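
The lag-column construction in the cell above can be checked on a toy series. The following is a minimal sketch and not part of the original notebook: `MakeLag` is a hypothetical helper name and the four-element series is made-up illustration data. It uses the same `Repeat`/`Concat`/`SkipLast` idiom, so row _t_ of the lag-_i_ column holds the value observed _i_ hours earlier, with `null` where no history exists yet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: shift a series down by `lag` rows, padding the top with nulls.
static float?[] MakeLag(IEnumerable<float?> series, int lag) =>
    Enumerable.Repeat<float?>(null, lag)
              .Concat(series)
              .SkipLast(lag)
              .ToArray();

// Made-up toy series standing in for the `load` column.
var toyLoad = new float?[] { 10f, 20f, 30f, 40f };

var prev1h = MakeLag(toyLoad, 1); // null, 10, 20, 30
var prev2h = MakeLag(toyLoad, 2); // null, null, 10, 20

// Each row now pairs the current load (label) with its lagged values (features).
for (var t = 0; t < toyLoad.Length; t++)
{
    Console.WriteLine($"t={t} load={toyLoad[t]} prev_1h={prev1h[t]} prev_2h={prev2h[t]}");
}
```

The first _i_ rows of each lag column are `null` because the series has no earlier observations to fill them with. With 336 lag columns, as in the notebook, the first two weeks of rows therefore contain at least one missing feature.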


In [None]:
var rowCount = df.Rows.Count();
// Hold out the most recent week (24 * 7 hours) for the final evaluation.
var evaluateCount = 24 * 7;
var trainDf = df.Head(rowCount - evaluateCount);
var evaluateDf = df.Tail(evaluateCount);
var mlContext = new MLContext();

// Use every column except the label as a feature.
var featureColumns = df.Columns.Select(c => c.Name)
                        .Where(c => c != "load");
// Concatenate the feature columns, then append the AutoML regression trainer (LightGBM only).
var pipeline = mlContext.Transforms.Concatenate("Features", featureColumns.ToArray())
                    .Append(mlContext.Auto().Regression(labelColumnName: "load", useLbfgs: false, useSdca: false, useFastForest: false, useFastTree: false, useLgbm: true));


In [None]:

// Configure AutoML
var trainTestSplit = mlContext.Data.TrainTestSplit(trainDf, 0.1);
var monitor = new NotebookMonitor();

var experiment = mlContext.Auto().CreateExperiment()
                    .SetPipeline(pipeline)
                    .SetTrainingTimeInSeconds(120)
                    .SetDataset(trainTestSplit.TrainSet, trainTestSplit.TestSet)
                    .SetEvaluateMetric(RegressionMetric.RootMeanSquaredError, "load", "Score")
                    .SetMonitor(monitor);

// Configure Visualizer
monitor.SetUpdate(monitor.Display());

// Start Experiment
var res = experiment.Run().Result;


index,Trial,Metric,Pipeline
0,0,357709.0,Unknown=>LightGbmRegression
1,1,435646.25,Unknown=>LightGbmRegression
2,2,349325.84,Unknown=>LightGbmRegression
3,3,493939.28,Unknown=>LightGbmRegression


## Evaluate the model using the test dataset

In [None]:
var model = res.Model;
var eval = model.Transform(evaluateDf); // we should use an unseen dataset though.
var metric = mlContext.Regression.Evaluate(eval, "load");
var predictedPath = @"C:\Users\xiaoyuz\Desktop\forecasting\predicted_notebook.csv";
evaluateDf["predicted"] = DataFrameColumn.Create("predicted", eval.GetColumn<float>("Score"));
DataFrame.WriteCsv(evaluateDf, predictedPath);

// print metric
metric

MeanAbsoluteError,MeanSquaredError,RootMeanSquaredError,LossFunction,RSquared
80090.70740327382,12901849094.1507,113586.30680742596,12901849124.083542,0.9897081887521064
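
To help read the metric columns above, the sketch below recomputes four of them from their standard textbook definitions. It runs on made-up toy values: the arrays and the `Metrics` helper are hypothetical, and this is neither the notebook's data nor ML.NET's implementation. It also illustrates a relationship visible in the output above: `RootMeanSquaredError` is the square root of `MeanSquaredError` (the square root of 12901849094.15 is approximately 113586.31).

```csharp
using System;
using System.Linq;

// Hypothetical helper: regression metrics from their standard definitions.
static (double Mae, double Mse, double Rmse, double R2) Metrics(double[] actual, double[] predicted)
{
    var errors = actual.Zip(predicted, (a, p) => a - p).ToArray();
    var mae = errors.Average(e => Math.Abs(e));   // mean absolute error
    var mse = errors.Average(e => e * e);         // mean squared error
    var mean = actual.Average();
    var variance = actual.Average(a => (a - mean) * (a - mean));
    // R^2 = 1 - (residual sum of squares / total sum of squares)
    return (mae, mse, Math.Sqrt(mse), 1 - mse / variance);
}

// Made-up toy values, not the Sonar data.
var actual = new double[] { 100, 200, 300, 400 };
var predicted = new double[] { 110, 190, 310, 390 };

var m = Metrics(actual, predicted);
Console.WriteLine($"MAE={m.Mae} MSE={m.Mse} RMSE={m.Rmse} R2={m.R2:F3}");
```

Because RMSE and MAE are expressed in the same unit as `load`, they are easiest to interpret relative to the typical magnitude of the series, while `RSquared` is unitless.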
