# Weather forecast model

Data is from NOAA. 

The stations were found using https://www.ncdc.noaa.gov/cdo-web/datatools/findstation

The dataset used was the Daily Summaries Dataset -> Air Temperature

## Getting the data

1. Find a station https://www.ncdc.noaa.gov/cdo-web/datatools/findstation

![image](https://user-images.githubusercontent.com/46974588/116326383-50f17f80-a792-11eb-9c66-3dabef398889.png)

2. View cart and export in the **Custom GHCN-Daily CSV** format

![image](https://user-images.githubusercontent.com/46974588/116326449-78484c80-a792-11eb-8061-9c87bb6fc856.png)

3. Submit the data request with the following options:
    - [x] Station Name
    - Units: Standard
    - [x] Air Temperature
        - [ ] Average Temperature
        - [x] Maximum Temperature (TMAX)
        - [x] Minimum Temperature (TMIN)

![image](https://user-images.githubusercontent.com/46974588/116326560-bd6c7e80-a792-11eb-8050-18f7f85193a5.png)

## Install NuGet packages

In [1]:
#r "nuget:Microsoft.ML,1.5.5"
#r "nuget:Microsoft.ML.TimeSeries,1.5.5"

Installed package Microsoft.ML.TimeSeries version 1.5.5

Installed package Microsoft.ML version 1.5.5

## Import namespaces

In [2]:
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Transforms.TimeSeries;
using System.Linq;

## Define model input and output schemas

In [3]:
public class ModelInput
{
    [LoadColumn(6)]
    public DateTime Date { get; set; }
    
    [LoadColumn(7)]
    public float MaxTemp { get; set; }

    [LoadColumn(8)]
    public float MinTemp { get; set; }
}

The output schema keeps the original `MaxTemp` and `MinTemp` columns so that forecasts can be compared with the actual values.

In [4]:
public class ModelOutput
{
    public DateTime Date { get; set; }

    public float MaxTemp { get; set; }

    public float MinTemp { get; set; }

    public float[] ForecastTemp { get; set; }

    public float[] LowerBoundTemp { get; set; }

    public float[] UpperBoundTemp { get; set; }
}

## Initialize MLContext

In [5]:
var mlContext = new MLContext();

## Load data into IDataView

The 5-year dataset starts on 4/1/2015.  
The 10-year dataset starts on 4/2/2010.

In [6]:
var seattle5yr = "Data/seattle-5yr.csv";
var seattle10yr = "Data/seattle-10yr.csv";

var trainingDataView5yr = mlContext.Data.LoadFromTextFile<ModelInput>(seattle5yr, hasHeader: true, separatorChar:',');
var trainingDataView10yr = mlContext.Data.LoadFromTextFile<ModelInput>(seattle10yr, hasHeader: true, separatorChar:',');
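
The `seriesLength` and `trainSize` arguments in the models below are hard-coded and presumably track the number of rows in each file. The following is a minimal sketch of how a row count could be read from an `IDataView` instead, written as a standalone notebook cell. The `TempRow` type, the `ctx` variable, and the ten in-memory rows are stand-ins invented for illustration; in this notebook the same `CreateEnumerable` call would take `ModelInput` and `trainingDataView5yr`.

```csharp
#r "nuget:Microsoft.ML,1.5.5"

using System;
using System.Linq;
using Microsoft.ML;

// Hypothetical row type used only in this sketch.
public class TempRow
{
    public float MinTemp { get; set; }
}

var ctx = new MLContext();

// Ten made-up rows standing in for a loaded CSV file.
var rows = Enumerable.Range(0, 10).Select(i => new TempRow { MinTemp = 40f + i });
IDataView view = ctx.Data.LoadFromEnumerable(rows);

// Enumerate the view once and count the rows.
int rowCount = ctx.Data.CreateEnumerable<TempRow>(view, reuseRowObject: true).Count();
Console.WriteLine($"Rows loaded: {rowCount}");
```

The resulting count could then be passed as `seriesLength` and `trainSize` in place of a literal.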

## Models (5 year data)

- Minimum Temperature
- Maximum Temperature

### Minimum Temperature

In [29]:
IEstimator<ITransformer> minTempEstimator5yr = mlContext.Forecasting.ForecastBySsa(
    outputColumnName: "ForecastTemp",
    inputColumnName: "MinTemp",
    windowSize: 7,
    seriesLength: 2202,
    trainSize: 2202,
    horizon: 7,
    confidenceLevel: 0.85f,
    confidenceLowerBoundColumn: "LowerBoundTemp",
    confidenceUpperBoundColumn: "UpperBoundTemp");

In [30]:
var minTempModel5yr = minTempEstimator5yr.Fit(trainingDataView5yr);

### Maximum Temperature

In [31]:
IEstimator<ITransformer> maxTempEstimator5yr = mlContext.Forecasting.ForecastBySsa(
    outputColumnName: "ForecastTemp",
    inputColumnName: "MaxTemp",
    windowSize: 7,
    seriesLength: 2202,
    trainSize: 2202,
    horizon: 7,
    confidenceLevel: 0.85f,
    confidenceLowerBoundColumn: "LowerBoundTemp",
    confidenceUpperBoundColumn: "UpperBoundTemp");

In [32]:
var maxTempModel5yr = maxTempEstimator5yr.Fit(trainingDataView5yr);

## Test Models (5 year)

In [33]:
var testInput = new ModelInput { Date= new DateTime(2020,04,01), MaxTemp = 51, MinTemp = 38};

In [34]:
TimeSeriesPredictionEngine<ModelInput, ModelOutput> minForecastEngine5yr = minTempModel5yr.CreateTimeSeriesEngine<ModelInput, ModelOutput>(mlContext);
var minPrediction5yr = minForecastEngine5yr.Predict(testInput,horizon:7);

In [35]:
TimeSeriesPredictionEngine<ModelInput, ModelOutput> maxForecastEngine5yr = maxTempModel5yr.CreateTimeSeriesEngine<ModelInput, ModelOutput>(mlContext);
var maxPrediction5yr = maxForecastEngine5yr.Predict(testInput, horizon:7);

In [36]:
new [] {minPrediction5yr, maxPrediction5yr}

Row 0 is the minimum-temperature forecast and row 1 is the maximum-temperature forecast:

index,Date,MaxTemp,MinTemp,ForecastTemp,LowerBoundTemp,UpperBoundTemp
0,2020-04-01 00:00:00Z,51,38,"[ 45.64229, 45.505856, 45.42607, 45.48796, 45.574562, 45.579506, 45.473053 ]","[ 39.421925, 38.72968, 37.953438, 37.29155, 36.686584, 36.047966, 35.26731 ]","[ 51.86265, 52.282032, 52.898705, 53.684372, 54.46254, 55.111046, 55.678795 ]"
1,2020-04-01 00:00:00Z,51,38,"[ 73.780716, 73.05658, 72.64575, 72.69919, 72.88619, 72.93948, 72.638275 ]","[ 65.4993, 64.08209, 62.78885, 61.9087, 61.187195, 60.394077, 59.195263 ]","[ 82.06213, 82.03107, 82.502655, 83.48968, 84.58519, 85.484886, 86.08128 ]"
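
The forecast arrays carry no dates of their own, so a sketch like the one below could attach a date to each step. The values are copied from row 0 of the output above. The mapping itself is an assumption: it treats step `i` as the day `i + 1` days after the input row's date, which only holds if that row is the observation immediately before the forecast window. The variable names are introduced here for illustration.

```csharp
using System;
using System.Globalization;

// Values copied from row 0 of the output above (5-year minimum-temperature model).
var inputDate = new DateTime(2020, 4, 1);
float[] forecast = { 45.64229f, 45.505856f, 45.42607f, 45.48796f, 45.574562f, 45.579506f, 45.473053f };
float[] lower = { 39.421925f, 38.72968f, 37.953438f, 37.29155f, 36.686584f, 36.047966f, 35.26731f };
float[] upper = { 51.86265f, 52.282032f, 52.898705f, 53.684372f, 54.46254f, 55.111046f, 55.678795f };

// Assumption: step i of the horizon is the day (i + 1) days after the input date.
for (int i = 0; i < forecast.Length; i++)
{
    DateTime day = inputDate.AddDays(i + 1);
    Console.WriteLine(string.Format(
        CultureInfo.InvariantCulture,
        "{0:yyyy-MM-dd}: {1:F1} ({2:F1} to {3:F1})",
        day, forecast[i], lower[i], upper[i]));
}
```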


## Models (10 year data)

- Minimum Temperature
- Maximum Temperature

### Minimum Temperature

In [37]:
IEstimator<ITransformer> minTempEstimator10yr = mlContext.Forecasting.ForecastBySsa(
    outputColumnName: "ForecastTemp",
    inputColumnName: "MinTemp",
    windowSize: 7,
    seriesLength: 4027,
    trainSize: 4027,
    horizon: 7,
    confidenceLevel: 0.85f,
    confidenceLowerBoundColumn: "LowerBoundTemp",
    confidenceUpperBoundColumn: "UpperBoundTemp");

In [38]:
var minTempModel10yr = minTempEstimator10yr.Fit(trainingDataView10yr);

### Maximum Temperature

In [39]:
IEstimator<ITransformer> maxTempEstimator10yr = mlContext.Forecasting.ForecastBySsa(
    outputColumnName: "ForecastTemp",
    inputColumnName: "MaxTemp",
    windowSize: 7,
    seriesLength: 4027,
    trainSize: 4027,
    horizon: 7,
    confidenceLevel: 0.85f,
    confidenceLowerBoundColumn: "LowerBoundTemp",
    confidenceUpperBoundColumn: "UpperBoundTemp");

In [40]:
var maxTempModel10yr = maxTempEstimator10yr.Fit(trainingDataView10yr);

## Test Models (10 year)

In [41]:
var testInput = new ModelInput { Date= new DateTime(2020,04,01), MaxTemp = 51, MinTemp = 38};

In [42]:
TimeSeriesPredictionEngine<ModelInput, ModelOutput> minForecastEngine10yr = minTempModel10yr.CreateTimeSeriesEngine<ModelInput, ModelOutput>(mlContext);
var minPrediction10yr = minForecastEngine10yr.Predict(testInput,horizon:7);

In [43]:
TimeSeriesPredictionEngine<ModelInput, ModelOutput> maxForecastEngine10yr = maxTempModel10yr.CreateTimeSeriesEngine<ModelInput, ModelOutput>(mlContext);
var maxPrediction10yr = maxForecastEngine10yr.Predict(testInput, horizon:7);

In [44]:
new [] {minPrediction10yr, maxPrediction10yr}

Row 0 is the minimum-temperature forecast and row 1 is the maximum-temperature forecast:

index,Date,MaxTemp,MinTemp,ForecastTemp,LowerBoundTemp,UpperBoundTemp
0,2020-04-01 00:00:00Z,51,38,"[ 45.442406, 45.264603, 45.15645, 45.19384, 45.254196, 45.226856, 45.13159 ]","[ 39.717957, 39.014584, 38.26309, 37.649364, 37.10386, 36.508354, 35.83022 ]","[ 51.166855, 51.51462, 52.04981, 52.738316, 53.404533, 53.94536, 54.432964 ]"
1,2020-04-01 00:00:00Z,51,38,"[ 73.52904, 72.769394, 72.34601, 72.38786, 72.55228, 72.53551, 72.297585 ]","[ 66.13588, 64.76329, 63.576366, 62.831036, 62.2506, 61.534508, 60.570606 ]","[ 80.922195, 80.7755, 81.11565, 81.944695, 82.85395, 83.53651, 84.02457 ]"
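
One way to compare the two result sets beyond reading the arrays is the width of the confidence interval. The sketch below averages `UpperBoundTemp - LowerBoundTemp` across the horizon for the minimum-temperature rows of both outputs. `MeanWidth` and the array names are helpers introduced here, not part of ML.NET. A narrower interval indicates less model uncertainty, not necessarily a more accurate forecast.

```csharp
using System;
using System.Linq;

// Mean width of the confidence interval across the horizon.
double MeanWidth(float[] lower, float[] upper) =>
    lower.Zip(upper, (lo, hi) => (double)(hi - lo)).Average();

// Bounds copied from row 0 (minimum temperature) of the 5-year output.
float[] lower5yr = { 39.421925f, 38.72968f, 37.953438f, 37.29155f, 36.686584f, 36.047966f, 35.26731f };
float[] upper5yr = { 51.86265f, 52.282032f, 52.898705f, 53.684372f, 54.46254f, 55.111046f, 55.678795f };

// Bounds copied from row 0 (minimum temperature) of the 10-year output.
float[] lower10yr = { 39.717957f, 39.014584f, 38.26309f, 37.649364f, 37.10386f, 36.508354f, 35.83022f };
float[] upper10yr = { 51.166855f, 51.51462f, 52.04981f, 52.738316f, 53.404533f, 53.94536f, 54.432964f };

Console.WriteLine($"5-year mean interval width:  {MeanWidth(lower5yr, upper5yr):F2}");
Console.WriteLine($"10-year mean interval width: {MeanWidth(lower10yr, upper10yr):F2}");
```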


## Conclusion

The 5-year models, because they are trained on more recent data, appear to make better predictions. I recommend experimenting with hyperparameters such as `windowSize`, `seriesLength`, and `confidenceLevel`.
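
The comparison above is qualitative. A minimal sketch of how it could be made quantitative is to compute the mean absolute error between each model's `ForecastTemp` array and the temperatures actually observed on those days. `MeanAbsoluteError` is a helper introduced here, and the numbers in the usage lines are placeholders, not real temperatures; observed values would have to come from the NOAA data.

```csharp
using System;
using System.Linq;

// Mean absolute error between forecast and observed values of equal length.
double MeanAbsoluteError(float[] forecast, float[] actual)
{
    if (forecast.Length != actual.Length)
        throw new ArgumentException("forecast and actual must have the same length");
    return forecast.Zip(actual, (f, a) => Math.Abs((double)f - a)).Average();
}

// Placeholder numbers for illustration only; not real temperatures.
float[] exampleForecast = { 1f, 2f, 4f };
float[] exampleActual = { 1f, 2f, 3f };
Console.WriteLine($"MAE: {MeanAbsoluteError(exampleForecast, exampleActual):F3}");
```

Running the same function on the 5-year and 10-year forecasts against the same observed days would give one number per model, and the lower one indicates the closer forecast.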