# Problems:

- Underfitting refers to not capturing enough patterns in the data. The model performs poorly on both the training set and the test set.

- Overfitting refers to: a) capturing noise and b) capturing patterns which do not generalize well to unseen data. The model performs extremely well on the training set but poorly on the test set.

![image.png](attachment:image.png)
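
The two failure modes can be reproduced on a toy problem. The sketch below is only an illustration: the data-generating line `y = 2x + 1`, the noise level, and the three hand-written models are assumptions made for this example. A mean predictor ignores `x` (underfits), a least-squares line matches the true structure, and a 1-nearest-neighbour model memorizes every training point, noise included (overfits).

```python
import random

random.seed(0)

def make_data(n):
    """Noisy samples from y = 2x + 1 (a made-up ground truth for illustration)."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + 1 + random.gauss(0, 2) for x in xs]
    return xs, ys

def mse(model, xs, ys):
    """Mean squared error of a model (a callable x -> prediction)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_mean(xs, ys):
    """Underfits: ignores x and always predicts the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Ordinary least-squares line, which matches the true structure."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_1nn(xs, ys):
    """Overfits: returns the target of the closest memorized training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

x_train, y_train = make_data(50)
x_test, y_test = make_data(50)

results = {}
for name, fit in [("mean", fit_mean), ("line", fit_line), ("1-NN", fit_1nn)]:
    model = fit(x_train, y_train)
    results[name] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    print(f"{name:>5}: train MSE = {results[name][0]:.2f}, "
          f"test MSE = {results[name][1]:.2f}")
```

The mean predictor scores poorly on both sets, while the 1-NN model reaches zero training error yet still makes errors on the test set, which is the signature of overfitting described above.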

## Train-Test Split approach

- Train the model on the training set and use the test set for validation, typically splitting the data 70:30 or 80:20. This approach is acceptable when the dataset is large and the training and test samples follow the same distribution.
![image.png](attachment:image.png)

=> With limited data there is a risk of high bias, because the model never sees the information contained in the portion that was held out of training.


Since there is rarely enough data to train a model, setting part of it aside for validation introduces a risk of underfitting. With less training data, we may lose important patterns or trends in the dataset, which in turn increases the error due to bias. What we need is a method that leaves ample data for training the model and also ample data for validation.
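
The hold-out split described in this section can be sketched in a few lines. This is a minimal illustration: `holdout_split`, its parameters, and the toy `data` list are hypothetical names introduced here. In practice a library routine such as scikit-learn's `train_test_split` does the same job.

```python
import random

def holdout_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` and cut it into (train, test) lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                      # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)      # seeded, so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(100))                        # 100 made-up observations
train, test = holdout_split(data, test_fraction=0.2)
print(len(train), len(test))                   # 80 20
```

Every observation lands in exactly one of the two sets, so the 20 test rows are never available for fitting. That is the information loss discussed above.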

## Cross-validation with K-fold: the number of groups = k

It can be viewed as repeated holdout: we simply average the scores from k different holdouts. Every data point is in a validation set exactly once and in a training set k-1 times.

This significantly reduces underfitting, since most of the data is used for fitting in each round, and makes overfitting much easier to detect, since every observation is eventually used for validation.
![image.png](attachment:image.png)
As a general rule, we choose k=5 or k=10, as these values have been shown empirically to yield test error estimates that suffer neither from excessively high bias nor high variance.

### This method follows the steps below:
1. Split the entire dataset randomly into k folds (k should be neither too small nor too large; we typically choose 5 to 10, depending on the data size). A higher k gives a less biased estimate (but the larger variance may lead to overfitting), whereas a lower k behaves more like the train-test split approach we saw before.
2. Fit the model on k-1 folds and validate it on the remaining fold. Note down the scores/errors.
3. Repeat this process until every fold has served as the test set. Then take the average of the recorded scores; that is the performance metric for the model.
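
The steps above can be sketched in plain Python. The helper names (`k_fold_indices`, `cross_validate`) and the placeholder mean-predictor model are assumptions made for this illustration, not a reference implementation. scikit-learn's `KFold` and `cross_val_score` provide the same functionality.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each index is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)           # step 1: random split into k folds
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f_no, fold in enumerate(folds) if f_no != i for j in fold]
        yield train_idx, val_idx

def cross_validate(xs, ys, k, fit, score):
    """Fit on k-1 folds, score on the held-out fold, and average the k scores."""
    fold_scores = []
    for train_idx, val_idx in k_fold_indices(len(xs), k):
        model = fit([xs[j] for j in train_idx], [ys[j] for j in train_idx])   # step 2
        fold_scores.append(
            score(model, [xs[j] for j in val_idx], [ys[j] for j in val_idx]))
    return sum(fold_scores) / len(fold_scores), fold_scores                   # step 3

# Placeholder model and metric, purely for demonstration.
def fit_mean(xs, ys):
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = list(range(20))
ys = [3.0 * x + 1.0 for x in xs]                   # made-up targets
avg_score, fold_scores = cross_validate(xs, ys, k=5, fit=fit_mean, score=mse)
print(f"per-fold MSE: {[round(s, 1) for s in fold_scores]}")
print(f"average MSE:  {avg_score:.1f}")
```

Any model can be plugged in through the `fit` and `score` callables; the fold bookkeeping stays the same.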

# Evaluating a time series model with Amazon Forecast

To evaluate the accuracy of an algorithm for various forecasting scenarios and to tune the predictor, use
predictor metrics. Amazon Forecast uses backtesting to produce metrics.

Forecast automatically splits your input data into two datasets, training and test, as shown in the
following figure. Forecast decides how to split the input data by using the BackTestWindowOffset
parameter.
![image.png](attachment:image.png)
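
The idea behind this split can be illustrated in a few lines. The sketch below is not Forecast's internal code; `backtest_split` and the `daily_demand` series are hypothetical names for this example. Unlike a random hold-out split, the series is cut chronologically, so the test set is always the most recent stretch of data.

```python
def backtest_split(series, back_test_window_offset):
    """Hold out the last `back_test_window_offset` points as the test set."""
    if not 0 < back_test_window_offset < len(series):
        raise ValueError("offset must be positive and shorter than the series")
    split_at = len(series) - back_test_window_offset
    return series[:split_at], series[split_at:]    # no shuffling: order is preserved

daily_demand = [float(v) for v in range(1, 31)]    # 30 made-up observations
train, test = backtest_split(daily_demand, back_test_window_offset=7)
print(len(train), len(test))                       # 23 7
```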

To evaluate the metrics in multiple backtest scenarios with different virtual forecast start
dates, as shown in the following figure, use the NumberOfBacktestWindows parameter in the
CreatePredictor operation. The default for the NumberOfBacktestWindows parameter is 1. If you
use the default, Forecast uses the simple splitting method shown in the preceding figure.
![image.png](attachment:image.png)

### EvaluationParameters

##### Contents
1. BackTestWindowOffset

The point from the end of the dataset where you want to split the data for model training and testing (evaluation). Specify the value as the number of data points. The default is the value of the forecast horizon. BackTestWindowOffset can be used to mimic a past virtual forecast start date. This value must be greater than or equal to the forecast horizon and less than half of the TARGET_TIME_SERIES dataset length.

`ForecastHorizon <= BackTestWindowOffset < 1/2 * TARGET_TIME_SERIES dataset length`

Type: Integer

Required: No

2. NumberOfBacktestWindows

The number of times to split the input data.

The default is 1. 

Valid values are 1 through 5.
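
The constraints listed above are easy to check before submitting a request. The helper below is a sketch: `make_evaluation_parameters` is a hypothetical function written for this example, and only the two field names and their limits come from the description above. The returned dictionary has the shape one would expect to supply as `EvaluationParameters` when creating a predictor.

```python
def make_evaluation_parameters(forecast_horizon, series_length,
                               back_test_window_offset=None,
                               number_of_backtest_windows=1):
    """Validate the documented limits and build an EvaluationParameters-style dict."""
    if back_test_window_offset is None:
        back_test_window_offset = forecast_horizon   # documented default
    # ForecastHorizon <= BackTestWindowOffset < 1/2 * dataset length
    if not forecast_horizon <= back_test_window_offset < series_length / 2:
        raise ValueError(
            "BackTestWindowOffset must be >= the forecast horizon and "
            "< half of the TARGET_TIME_SERIES length")
    # Valid values are 1 through 5
    if not 1 <= number_of_backtest_windows <= 5:
        raise ValueError("NumberOfBacktestWindows must be between 1 and 5")
    return {
        "NumberOfBacktestWindows": number_of_backtest_windows,
        "BackTestWindowOffset": back_test_window_offset,
    }

params = make_evaluation_parameters(forecast_horizon=14, series_length=365,
                                    number_of_backtest_windows=3)
print(params)
```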

After training, Amazon Forecast calculates the root mean square error (RMSE) and weighted quantile
losses to determine how well the model predicted the test data in each backtest window and the
average value over all the backtest windows. These metrics measure the difference between the values
predicted by the model and the actual values in the test dataset.

### Root Mean Square Error