-----
# Baseline Forecasting
-----

### Notebook Overview

In this notebook, I’ll explore four different types of baseline forecasting methods:

**1. Mean Method**

Predicts that future values will be the mean of all past observed values.

**2. Naive Method**

Predicts that the next value will be the same as the last observed value.

**3. Seasonal Naive Method**

Predicts that the next value will be the same as the last observed value from the same point in the previous seasonal cycle. This method is effective when the data has a clear seasonal pattern; this notebook assumes a 7-day cycle.

**4. Drift Method**

Predicts future values by extending the trend line between the first and last observed data points.

These methods serve as benchmarks for time series forecasting models. They can be used to assess whether more advanced models, such as ARIMA, actually forecast better or are only as good as the simplest approach.
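For reference, the four methods can be written as the following textbook forecast equations. The notation is an assumption introduced here, not taken from the notebook: $y_1, \dots, y_T$ are the training observations, $h$ is the number of steps ahead, and $m$ is the seasonal period (7 in this notebook).

```latex
\begin{align*}
\text{Mean:} \quad & \hat{y}_{T+h \mid T} = \frac{1}{T} \sum_{t=1}^{T} y_t \\
\text{Naive:} \quad & \hat{y}_{T+h \mid T} = y_T \\
\text{Seasonal naive:} \quad & \hat{y}_{T+h \mid T} = y_{T+h-m(k+1)}, \quad k = \left\lfloor \frac{h-1}{m} \right\rfloor \\
\text{Drift:} \quad & \hat{y}_{T+h \mid T} = y_T + h \, \frac{y_T - y_1}{T-1}
\end{align*}
```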

## Set Up
---

In [137]:
import numpy as np
import pandas as pd
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objs as go
import matplotlib.pyplot as plt
import seaborn as sns
from statsmodels.api import tsa # time series analysis
import statsmodels.api as sm
from sklearn.metrics import mean_squared_error, mean_absolute_error
from sklearn.model_selection import TimeSeriesSplit

## Utility Functions
-----

In [138]:
def plt_forecast(predictions, fc_method, train, test):
    """
    Description:
    Plots the training data, validation data (actual), and baseline predictions on a single graph.

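    Parameters:
    - predictions: A Series containing the predicted values with date indices.
    - fc_method: A string describing the forecasting method used.
    - train: A DataFrame of training data with a 'Close' column and a date index.
    - test: A DataFrame of validation data with a 'Close' column and a date index.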

    Output:
    The function creates a plot using Plotly to visualise:
        - Training data
        - Validation data 
        - Baseline forecast predictions

    """
    
    # Plot to visualise the training data, test data and baseline prediction
    fig = go.Figure()
    fig.add_trace(go.Scatter(x=train.index, y=train['Close'], mode='lines', name="Train"))
    fig.add_trace(go.Scatter(x=test.index, y=test['Close'], mode='lines', name="Validation"))
    fig.add_trace(go.Scatter(x=predictions.index, y=predictions, mode='lines', name="Baseline Forecast"))

    fig.update_layout(
        yaxis_title='Close', 
        xaxis_title='Date',
        title= f'Baseline Forecasting using {fc_method}'
    )
    fig.show()

In [139]:
def fcast_evaluation(predicted, actual):
    """
    Description:
    To evaluate forecasting performance using multiple metrics.

    Parameters:
    predicted: Forecasted values.
    actual: Actual observed values.

    Output:
    A dictionary containing the evaluation metrics:
        - 'MSE': Mean Squared Error
        - 'MAE': Mean Absolute Error
        - 'RMSE': Root Mean Squared Error
        - 'MAPE': Mean Absolute Percentage Error
    """

    # Forecast errors (actual minus predicted), used for MAPE below
    err = actual - predicted

    # Calculating MSE
    mse = mean_squared_error(actual, predicted)

    # Calculating MAE
    mae = mean_absolute_error(actual, predicted)

    # Calculating RMSE
    rmse = np.sqrt(mse)

    # Calculating MAPE
    abs_percent_err = np.abs(err/actual)
    mape = abs_percent_err.mean() * 100

    return {'MSE': mse,
            'MAE': mae,
            'RMSE': rmse,
            'MAPE': mape
            }

## Data Loading
---

**Note:** Raw data is used in baseline forecasting to evaluate how well simple methods perform with real-world data.


In [140]:
raw_data = pd.read_csv('../../data/msft_cleaned.csv', index_col=0)

In [141]:
raw_data.index = pd.to_datetime(raw_data.index)

## Cross Validation Train/Test Split 
----

In [142]:
tscv = TimeSeriesSplit(n_splits=5)

----
**Comment:**

I use a cross-validation train/test split via scikit-learn's `TimeSeriesSplit` to avoid data leakage.

With time series, a standard random train/test split cannot be used because the data must remain in chronological order. `TimeSeriesSplit` keeps every training window strictly earlier than its test window.
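To make the splitting behaviour concrete, here is a minimal sketch on a made-up 12-point series. The names `toy_series`, `tscv_demo` and `folds` are illustrative and not part of the notebook. It prints the index positions that land in each fold, so you can see the training window grow while the test window always comes after it.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Stand-in for 12 chronologically ordered observations (hypothetical data)
toy_series = np.arange(12)
tscv_demo = TimeSeriesSplit(n_splits=5)

# Each fold is a (train positions, test positions) pair
folds = list(tscv_demo.split(toy_series))
for fold_number, (train_idx, test_idx) in enumerate(folds, start=1):
    print(f"Fold {fold_number}: train={train_idx.tolist()} test={test_idx.tolist()}")
```

Because the training window expands from fold to fold, later folds train on more history than earlier ones, which is worth keeping in mind when comparing metrics across folds.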

## Mean Forecasting 
----

In [94]:
# Creating a Dataframe to store the results
mean_results_df = pd.DataFrame(columns = ['MSE', 'MAE', 'RMSE', 'MAPE'], index = [1,2,3,4,5])

In [144]:
for i, (train_index, test_index) in enumerate(tscv.split(raw_data)):
    
    # Create train/test dataframes using the indexes from tscv
    train_df = raw_data.iloc[train_index, :]
    test_df = raw_data.iloc[test_index, :]

    # Calculating predicted Close price values using the mean of training data
    baseline_pred = np.full(test_df.shape[0], np.mean(train_df['Close']))
    mean_predictions = pd.Series(data=baseline_pred, index=test_df.index)

    # Plotting the forecasted values alongside actual data and previous data (training data) 
    print(f"Fold {i+1}:") 
    plt_forecast(mean_predictions, fc_method='Mean', train=train_df, test= test_df)
    
    # Calculating results for each fold using function and adding results to dataframe
    results_dict = fcast_evaluation(mean_predictions.values, test_df['Close'].values)
    mean_results_df.iloc[i] = results_dict
    

Fold 1:


Fold 2:


Fold 3:


Fold 4:


Fold 5:


----
**Plot Description:**

The plots show baseline forecasting using the mean: future values are predicted as the mean of the training set.

In the final fold, this method clearly fails to capture the upward trend in the data because the earlier, lower values pull the mean down.

### Evaluation

In [145]:
mean_results_df

| Fold | MSE | MAE | RMSE | MAPE (%) |
|---|---|---|---|---|
| 1 | 3726.701436 | 58.843224 | 61.046715 | 26.722194 |
| 2 | 11248.583131 | 102.930125 | 106.059338 | 34.932037 |
| 3 | 1576.842832 | 33.760136 | 39.70948 | 12.655583 |
| 4 | 6744.80812 | 74.672001 | 82.126781 | 23.461699 |
| 5 | 27586.404698 | 164.27581 | 166.091555 | 39.930852 |


-----
**Comment:**

The results for the mean baseline vary across the folds.

- **Fold 1:** The method performs relatively well with a low MSE (3726.70), MAE (58.84), RMSE (61.05) and MAPE (26.72). This indicates that the mean forecast is reasonably accurate for the first fold.

- **Fold 2:** The metrics are much worse, with high values of MSE (11248.58), MAE (102.93), RMSE (106.06) and MAPE (34.93). This suggests that the mean method fails to capture the underlying patterns or trends in the data for this fold.

- **Fold 3:** Improvement in metrics with low MSE (1576.84), MAE (33.76), RMSE (39.71) and MAPE (12.66). This indicates that the mean method aligns better with the data in this fold, providing more accurate forecasts.

- **Fold 4:** Dip in performance with higher values of MSE (6744.81), MAE (74.67), RMSE (82.13) and MAPE (23.46). While the mean method performs better than in Fold 2, it still does not capture the data's characteristics effectively.

- **Fold 5:** The method shows poor performance again, with very high values of MSE (27586.40), MAE (164.28), RMSE (166.09) and MAPE (39.93). This suggests that the mean method is unable to adapt to significant changes or trends in the overall data.

The evaluation metrics show that mean forecasting struggles to deal with the trend and patterns in the data, which could explain the fluctuations in the metrics across the folds.

## Naive Forecasting
-----

In [102]:
naive_results_df = pd.DataFrame(columns = ['MSE', 'MAE', 'RMSE', 'MAPE'], index = [1,2,3,4,5])

In [146]:
for i, (train_index, test_index) in enumerate(tscv.split(raw_data)):
    
    train_df = raw_data.iloc[train_index, :]
    test_df = raw_data.iloc[test_index, :]
    
    # Creating baseline predictions, filling array with last observed value
    # Assuming future predictions are equal to the last observed value in the training set  
    baseline_pred = np.full(test_df.shape[0], train_df['Close'].iloc[-1])
    naive_predictions = pd.Series(data=baseline_pred, index=test_df.index)

    print(f"Fold {i+1}:")
    plt_forecast(naive_predictions, fc_method='Naive', train = train_df, test = test_df)
    
    results_dict = fcast_evaluation(naive_predictions.values, test_df['Close'].values)
    naive_results_df.iloc[i] = results_dict

Fold 1:


Fold 2:


Fold 3:


Fold 4:


Fold 5:


----
**Plot Description:**

The plots show the baseline forecast where every future value is set to the last value of the training set.

The assumption is that future values equal the last historical observation.

Because the forecast starts from the most recent level, it sits closer to the validation data than the mean forecast does, although a flat line still cannot follow a trend inside the validation window.

### Evaluation

In [147]:
naive_results_df

| Fold | MSE | MAE | RMSE | MAPE (%) |
|---|---|---|---|---|
| 1 | 527.060095 | 17.288096 | 22.957789 | 7.519303 |
| 2 | 3100.608776 | 49.603119 | 55.68311 | 16.429679 |
| 3 | 683.332776 | 22.344498 | 26.140635 | 9.117295 |
| 4 | 6006.401491 | 69.602684 | 77.500977 | 21.779957 |
| 5 | 2557.903452 | 44.262815 | 50.57572 | 10.495777 |


----
**Comment:**

Overall, the naive method improves on the mean method in every fold, but its results are still inconsistent: MAPE ranges from 7.5% in Fold 1 to 21.8% in Fold 4. This again shows the limitations of a flat forecast in capturing the trend and patterns in the data.

## Seasonal Forecasting
----

In [148]:
snaive_results_df = pd.DataFrame(columns = ['MSE', 'MAE', 'RMSE', 'MAPE'], index = [1,2,3,4,5])

In [150]:
for i, (train_index, test_index) in enumerate(tscv.split(raw_data)):

    train_df = raw_data.iloc[train_index, :]
    test_df = raw_data.iloc[test_index, :]

    snaive_fcasts = []

    last_7_days_values = train_df['Close'].iloc[-7:].values
    for idx in range(len(test_index)):
        # forecasts using values from the last 7 days of the training data
        forecast_value = last_7_days_values[idx % 7]
        snaive_fcasts.append(forecast_value)

    print(f"Fold {i + 1}:")
    snaive_predictions = pd.Series(snaive_fcasts, index=test_df.index, name='Predicted')

    plt_forecast(snaive_predictions, 'Seasonal Naive', train=train_df, test=test_df)

    results_dict = fcast_evaluation(snaive_predictions.values, test_df['Close'].values)

    snaive_results_df.iloc[i] = results_dict
    

Fold 1:


Fold 2:


Fold 3:


Fold 4:


Fold 5:


----
**Plot Description:**

The plots show baseline forecasting using the seasonal naive method: future values are predicted by repeating the last 7 values of the training data.
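The repetition can also be written without an explicit loop. The following is a minimal sketch, assuming a 1-D array of past values; `seasonal_naive_forecast` and the toy `history` are hypothetical names introduced here, not part of the notebook.

```python
import numpy as np

def seasonal_naive_forecast(history, horizon, period=7):
    """Repeat the last `period` observations of `history` for `horizon` steps."""
    history = np.asarray(history, dtype=float)
    if len(history) < period:
        raise ValueError("history must contain at least one full seasonal period")
    last_cycle = history[-period:]
    # Forecast step k (0-based) takes last_cycle[k % period], cycling through the period
    return last_cycle[np.arange(horizon) % period]

# Made-up history: 1, 2, ..., 14
history = np.arange(1.0, 15.0)
forecast = seasonal_naive_forecast(history, horizon=10)
print(forecast)
```

Indexing with `np.arange(horizon) % period` gives the same values as appending inside a loop, while keeping the forecast as a single NumPy array.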


### Evaluation

In [107]:
snaive_results_df

| Fold | MSE | MAE | RMSE | MAPE (%) |
|---|---|---|---|---|
| 1 | 456.460888 | 15.772506 | 21.364945 | 6.86071 |
| 2 | 3306.40842 | 51.468427 | 57.501378 | 17.073634 |
| 3 | 836.509908 | 24.533766 | 28.922481 | 10.058682 |
| 4 | 7332.282201 | 78.402998 | 85.628746 | 24.70866 |
| 5 | 2929.199646 | 48.08961 | 54.122081 | 11.435274 |


-----
**Comment:**

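Overall, the seasonal naive method improves on the mean method in four of the five folds (all except Fold 4), but it beats the naive method only in Fold 1. In Folds 2 to 5 its errors are somewhat higher than the naive method's.

The method predicts each future value from the value seven observations earlier, so it can only help when the series has a genuine 7-step seasonal pattern. The results above suggest that any such pattern in this data is weak, since repeating the last seven values adds little over simply repeating the last one.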

## Drift Model
----

In [151]:
drift_results_df = pd.DataFrame(columns = ['MSE', 'MAE', 'RMSE', 'MAPE'], index = [1,2,3,4,5])

In [153]:
for i, (train_index, test_index) in enumerate(tscv.split(raw_data)):
   
    train_df = raw_data.iloc[train_index, :]
    test_df = raw_data.iloc[test_index, :]

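    # Average change per time step between the first and last training observations
    const = (train_df['Close'].iloc[-1] - train_df['Close'].iloc[0])/(train_df.shape[0] -1)
    # Step counter as a NumPy array so it multiplies element-wise with the slope
    fcast_range = np.arange(len(test_index))

    # Extend the line forward from the last training value
    drift_pred = train_df['Close'].iloc[-1] + (fcast_range*const)
    drift_predictions =  pd.Series(drift_pred, index=test_df.index)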

    print(f'Fold {i + 1}:')
    plt_forecast(drift_predictions, 'Drift', train=train_df, test=test_df)

    results_dict = fcast_evaluation(drift_predictions.values, test_df['Close'].values)

    drift_results_df.iloc[i] = results_dict
    

Fold 1:


Fold 2:


Fold 3:


Fold 4:


Fold 5:


----
**Plot Description:**

The plots show forecasting using the drift method: future values are predicted by taking the average trend of the training data (the line between its first and last points) and extending it forward from the last training data point.
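As a standalone illustration, here is a minimal sketch of the drift calculation on a made-up series. `drift_forecast` and `history` are hypothetical names, not part of the notebook. Note one assumption that differs from the loop above: the sketch uses the textbook step count `h = 1, ..., horizon`, whereas the loop starts its step counter at 0, so its first forecast equals the last training value.

```python
import numpy as np

def drift_forecast(history, horizon):
    """Extend the line through the first and last observations for `horizon` steps."""
    history = np.asarray(history, dtype=float)
    if len(history) < 2:
        raise ValueError("history needs at least two observations to define a slope")
    # Average change per step between the first and last observation
    slope = (history[-1] - history[0]) / (len(history) - 1)
    # Textbook convention: the first forecast is one step past the last observation
    steps = np.arange(1, horizon + 1)
    return history[-1] + steps * slope

# Made-up history: rises by 4 over 3 steps, so the slope is 4/3
history = np.array([10.0, 11.0, 13.0, 14.0])
forecast = drift_forecast(history, horizon=3)
print(forecast)
```

Only the first and last observations affect the slope, which is why the method can be thrown off when either endpoint of the training window is unrepresentative.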

### Evaluation


In [154]:
drift_results_df

| Fold | MSE | MAE | RMSE | MAPE (%) |
|---|---|---|---|---|
| 1 | 429.749281 | 18.65794 | 20.730395 | 8.603766 |
| 2 | 883.413406 | 24.53632 | 29.722271 | 8.094027 |
| 3 | 2536.851253 | 42.933352 | 50.367164 | 17.670646 |
| 4 | 3992.444331 | 56.700692 | 63.185792 | 17.748309 |
| 5 | 740.83888 | 21.837498 | 27.218356 | 5.153977 |


----
**Comment:**

The Drift forecasting method generally performs well when the data trends are stable and align with the linear trend observed in the training data, as seen in Folds 1 and 5. However, it struggles when there are irregular patterns in the training data which could be why we see higher errors in Folds 2, 3 and 4.

So far, this method seems to be the best baseline forecasting approach. To be sure, I will calculate the average metrics across all folds for each method and compare them.

## Evaluation of Baseline Forecasting Methods
----

To evaluate each of the baseline models, I will be using the following metrics:

**1. MSE (Mean Squared Error)**

MSE measures the average of the squared errors, where an error is the difference between a predicted and an actual value.

This metric gives more weight to larger errors and outliers due to the squaring.

**2. MAE (Mean Absolute Error)**

MAE measures the average magnitude of errors in predictions regardless of direction (over/under estimation). It is the average of the absolute differences between the predicted and actual values.

**3. RMSE (Root Mean Squared Error)**

RMSE measures how much the predictions differ from the actual values. It is the square root of the average squared error, so it is on the same scale as the original data, which makes it more interpretable than MSE.

**4. MAPE (Mean Absolute Percentage Error)**

MAPE measures the average percentage error between predicted and actual values.
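A small worked example may help in reading the four metrics side by side. This is a sketch with made-up numbers (`actual` and `predicted` are hypothetical), computing each metric directly with NumPy rather than through the helper function used in this notebook.

```python
import numpy as np

# Made-up values chosen so the arithmetic is easy to follow
actual = np.array([100.0, 200.0, 400.0])
predicted = np.array([110.0, 180.0, 400.0])

errors = actual - predicted                      # [-10, 20, 0]
mse = np.mean(errors ** 2)                       # (100 + 400 + 0) / 3
mae = np.mean(np.abs(errors))                    # (10 + 20 + 0) / 3
rmse = np.sqrt(mse)                              # back on the scale of the data
mape = np.mean(np.abs(errors / actual)) * 100    # (10% + 10% + 0%) / 3

print(f"MSE={mse:.2f}  MAE={mae:.2f}  RMSE={rmse:.2f}  MAPE={mape:.2f}%")
```

The two non-zero errors have different absolute sizes (10 and 20) but the same percentage size (10%), which shows how MAPE weights errors relative to the level of the series while MSE and MAE do not. MAPE is undefined when an actual value is zero.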

In [120]:
overall_df = pd.DataFrame(index=['Mean', 'Naive', 'Seasonal Naive', 'Drift'], columns = ['MSE', 'MAE', 'RMSE', 'MAPE'])

In [131]:
mean_metrics = mean_results_df[['MSE', 'MAE', 'RMSE', 'MAPE']].mean()
naive_metrics = naive_results_df[['MSE', 'MAE', 'RMSE', 'MAPE']].mean()
snaive_metrics = snaive_results_df[['MSE', 'MAE', 'RMSE', 'MAPE']].mean()
drift_metrics = drift_results_df[['MSE', 'MAE', 'RMSE', 'MAPE']].mean()

In [132]:
overall_df.loc['Mean'] = mean_metrics
overall_df.loc['Naive'] = naive_metrics
overall_df.loc['Seasonal Naive'] = snaive_metrics
overall_df.loc['Drift'] = drift_metrics

In [136]:
overall_df.sort_values(by = ['MSE', 'MAE', 'RMSE', 'MAPE'], ascending= True)

| Method | MSE | MAE | RMSE | MAPE (%) |
|---|---|---|---|---|
| Drift | 1716.65943 | 32.93316 | 38.244796 | 11.454145 |
| Naive | 2575.061318 | 40.620242 | 46.571646 | 13.068402 |
| Seasonal Naive | 2972.172213 | 43.653461 | 49.507926 | 14.027392 |
| Mean | 10176.668043 | 86.896259 | 91.006774 | 27.540473 |
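The averaging steps can also be condensed into a single expression. The following is a sketch under stated assumptions: `fold_results` holds small made-up per-fold tables standing in for the per-method results frames, and the method names are placeholders.

```python
import pandas as pd

# Hypothetical per-fold results standing in for the per-method results frames
fold_results = {
    "Method A": pd.DataFrame({"MSE": [4.0, 16.0], "MAE": [2.0, 4.0]}, index=[1, 2]),
    "Method B": pd.DataFrame({"MSE": [1.0, 9.0], "MAE": [1.0, 3.0]}, index=[1, 2]),
}

# Average every metric over the folds, then transpose so each row is a method
summary = pd.DataFrame({name: df.mean() for name, df in fold_results.items()}).T
summary = summary.sort_values(by="MSE")
print(summary)
```

Building the summary from a dictionary keeps the method names and their results together, so adding another baseline only needs one more dictionary entry.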


#### **Best Baseline: Drift Forecasting Model**

The Drift method seems to be the best performing baseline forecasting method with the lowest average errors. 

The Naive method is better than the Seasonal Naive and Mean methods but still less accurate than Drift.

## Conclusion
----

These baseline models serve as a benchmark for assessing the more advanced forecasting methods I will move on to next.

By comparing advanced models against these simple methods, we can evaluate whether they offer a significant improvement in predictions.

After comparing the evaluation metrics of all four methods, the drift model showed the best performance. Therefore, I will use the drift model as the baseline for comparing more advanced forecasting methods moving forward.