### [Baseline Models in Time Series](https://medium.com/data-science/baseline-models-in-time-series-c76d44a826b3)

> Creating a baseline model before training the final model

#### What is a baseline model?

A baseline model is a simple model that serves as a benchmark, or point of reference, against which you compare your final, more complex machine learning model.

Data scientists create baseline models because:

- Baseline models give you a reference score, so you can judge how much a more complex model actually improves on a simple approach.
- If a baseline model does badly, it could be a sign of an issue with the data quality that needs addressing.
- If a baseline model performs better than the final model, it could indicate issues with the final model's algorithm, features, hyperparameters, or data preprocessing.
- If the baseline and the complex model perform more or less the same, this could indicate that the complex model needs more fine-tuning (in features, architecture, or hyperparameters). It could also show that a more complex model isn’t necessary, and a simpler model will suffice.

Typically, a baseline model is a statistical model, such as a moving average model. Alternatively, it can be a simpler version of the target model: for example, if you will be training a Random Forest model, you can first train a Decision Tree model as a baseline.

In [None]:
import warnings
warnings.filterwarnings('ignore')

In [None]:
!pip install -q numpy pandas plotly scikit-learn

##### Naive forecast

A naive forecast predicts that the next data point equals the most recently observed data point.

In [None]:
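import numpy as np

# `df` is assumed to be a pandas DataFrame that is already loaded,
# with a 'TemperatureF' column (target) and a 'Date' column
# define the split index
split_time = 1000

# separate the target array and
# the time/date array
series = np.array(df['TemperatureF'])
time = np.array(df['Date'])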

# train test split
time_train = time[:split_time]
time_test = time[split_time:]

series_train = series[:split_time]
series_test = series[split_time:]

# the naive forecast shifts the series by 1: each test point
# is predicted as the value observed one step earlier
naive_fcst = series[split_time - 1:-1]
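
To make the alignment of the shifted slice concrete, here is a minimal sketch on a made-up six-point series. The names `toy_series`, `toy_split`, `toy_test`, and `toy_naive` are hypothetical and are not part of the notebook's data; the slicing mirrors the `series[split_time - 1:-1]` expression above.

```python
import numpy as np

# Toy series (made-up values, for illustration only)
toy_series = np.array([10.0, 12.0, 11.0, 13.0, 14.0, 15.0])
toy_split = 4  # hypothetical split index: the first 4 points are "train"

toy_test = toy_series[toy_split:]         # values at positions 4 and 5
toy_naive = toy_series[toy_split - 1:-1]  # values at positions 3 and 4

# The forecast and the test set have the same length, and each
# forecast is the value observed one step before its test point
assert len(toy_naive) == len(toy_test)
print(toy_test)   # [14. 15.]
print(toy_naive)  # [13. 14.]
```

Starting the slice at `toy_split - 1` and stopping before the last element keeps the forecast the same length as the test set without using any value from the future.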

In [None]:
import plotly.graph_objects as go

fig = go.Figure([
        go.Scatter(x=time_test, y=series_test, text='true', name='true'),
        go.Scatter(x=time_test, y=naive_fcst, text='pred', name='pred'),
    ])

fig.show()

In [None]:
from sklearn.metrics import mean_squared_error

mse = mean_squared_error(series_test, naive_fcst)
# RMSE is the square root of MSE; the `squared=False` argument
# was deprecated in scikit-learn 1.4 and removed in later releases
rmse = np.sqrt(mse)

print("MSE:", mse)
print("RMSE:", rmse)
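
For reference, MSE is the mean of the squared errors and RMSE is its square root, so both can be computed with NumPy alone. The sketch below uses a hypothetical helper, `mse_rmse`, and made-up values; it is not the notebook's data or scikit-learn's implementation.

```python
import numpy as np

def mse_rmse(y_true, y_pred):
    """Return (MSE, RMSE) for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    mse = np.mean(errors ** 2)  # mean of the squared errors
    return mse, np.sqrt(mse)    # RMSE is the square root of MSE

# Made-up values: the errors are 1, 1, and -2
toy_mse, toy_rmse = mse_rmse([14.0, 15.0, 13.0], [13.0, 14.0, 15.0])
print("MSE:", toy_mse)    # (1 + 1 + 4) / 3 = 2.0
print("RMSE:", toy_rmse)  # sqrt(2.0), about 1.414
```

RMSE is expressed in the same units as the target (here, degrees Fahrenheit for the temperature series), which usually makes it easier to interpret than MSE.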

##### Moving average forecast

A moving average (MA) baseline model predicts that the next data point is the average of the last `n` data points, where `n` is the window size.
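
As a minimal sketch of this definition, the example below computes a moving average forecast on a made-up six-point series. The function `moving_average_forecast` and the `toy_` names are hypothetical, chosen for illustration only.

```python
import numpy as np

def moving_average_forecast(values, window):
    """Forecast each point as the mean of the `window` points before it.

    Element i of the result is the forecast for position i + window.
    """
    values = np.asarray(values, dtype=float)
    return np.array([values[i:i + window].mean()
                     for i in range(len(values) - window)])

# Toy series (made-up values, for illustration only)
toy_series = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
toy_window = 3
toy_forecast = moving_average_forecast(toy_series, toy_window)
print(toy_forecast)  # [2. 3. 4.] -> forecasts for positions 3, 4, 5

# Keep only the forecasts that line up with a test set starting at position 4
toy_split = 4
toy_test = toy_series[toy_split:]
toy_moving_avg = toy_forecast[toy_split - toy_window:]
assert len(toy_moving_avg) == len(toy_test)
print(toy_moving_avg)  # [3. 4.]
```

Because the first forecast corresponds to position `window`, the forecast array is offset by `window` relative to the series, which is why the slice starts at `toy_split - toy_window`.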

In [None]:
# Initialize a list
forecast = []
window_size = 30

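# Compute the moving average based on the window size;
# forecast[i] is the prediction for position i + window_size.
# The loop variable is named `start` so it does not overwrite the `time` array.
for start in range(len(series) - window_size):
    forecast.append(series[start:start + window_size].mean())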

# Convert to a numpy array
forecast = np.array(forecast)

In [None]:
# keep only the forecasts that line up with the test period
moving_avg = forecast[split_time - window_size:]

In [None]:
fig = go.Figure([
        go.Scatter(x=time_test, y=series_test, text='true', name='true'),
        go.Scatter(x=time_test, y=moving_avg, text='pred', name='pred'),
    ])

fig.show()

In [None]:
mse = mean_squared_error(series_test, moving_avg)
# RMSE is the square root of MSE; the `squared=False` argument
# was deprecated in scikit-learn 1.4 and removed in later releases
rmse = np.sqrt(mse)

print("MSE:", mse)
print("RMSE:", rmse)