# Moving Averages and Smoothing Methods (1)

*(Hanke & Wichern "Business Forecasting" Ch.3)*


We learn three simple methods for forecasting a time series:
1. *Naive method*: a simple model that assumes that very recent data provide the best predictors of the future
1. *Averaging method*: forecasts based on an average of past observations
1. *Smoothing method*: forecasts by averaging past values of a series with a decreasing (exponential) series of weights

## A procedure for forecasting

Step 1. A method is selected based on the forecaster's <u>analysis of and intuition about the nature of the data.</u>

Step 2. The data set is divided into two sections --- <u>a fitting section and a test section.</u>

Step 3. The selected technique is used to develop fitted values.

Step 4. The technique is used to forecast the test part of the data.

Step 5. The forecasting error is determined and evaluated.

Step 6. A decision is made.

Past data | You are here ($t$) | Periods to be forecast
:-----:|:-----:|:-----:
$\ldots$, $Y_{t-3}$, $Y_{t-2}$, $Y_{t-1}$ | $Y_t$ | $\hat{Y}_{t+1}$, $\hat{Y}_{t+2}$, $\hat{Y}_{t+3}$, $\ldots$
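
The six steps above can be sketched on a toy series. This is only an illustration under stated assumptions: the numbers are made up (they are not one of the course data sets), the naive "most recent value" rule from the list at the top stands in for the method chosen in Step 1, and the mean absolute error is just one possible way to summarize the errors in Step 5.

```python
import pandas as pd

# Hypothetical toy series (made-up numbers, not course data)
y = pd.Series([10.0, 12.0, 11.0, 13.0, 14.0, 13.0, 15.0, 16.0])

n_fit = 6                   # Step 2: first 6 points = fitting section, rest = test section
forecast = y.shift(1)       # Steps 3-4: one-step-ahead values, each equal to the previous observation
error = y - forecast        # Step 5: forecast error, actual minus forecast

# Summarize the errors separately for each section (assumed metric: mean absolute error)
fit_mae = error[:n_fit].abs().mean()    # the NaN in the first row is skipped
test_mae = error[n_fit:].abs().mean()
print(fit_mae, test_mae)
```

Comparing the two summaries is one input to the decision in Step 6: a method whose test-section error is much larger than its fitting-section error may not generalize well.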

## Naive Models

Naive forecasts assume that recent periods are the best predictors of the future:

\begin{equation}
\hat{Y}_{t+1} = Y_t \tag{1}
\end{equation}

**Properties:**
- One hundred percent of the weight is given to the current value of the series.
- The naive forecast is sometimes called the "no change" forecast.
- Since the naive forecast discards all other observations, this scheme tracks changes very rapidly.

**<font color='red'>In what context would this method be a good forecasting technique?</font>**

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('https://raw.githubusercontent.com/dongyakoh/business_forecasting/main/data/sales_of_saws.csv')
plt.plot(df['sales'])
plt.show()

Let's use the data up to the fourth quarter of 2005 as the fitting section and forecast the first quarter of 2006.
That is, $\hat{Y}_{24} = Y_{23}$, where the subscripts are the zero-based row indices of `df`.

In [None]:
# Naive method: the forecast for each quarter is the previous quarter's sales
df['naive'] = df.sales.shift(1)
df['e_naive'] = df['sales'] - df['naive']  # forecast error
plt.plot(df.loc[df.index[:24],'sales'],'k-',label='fitting section')
plt.plot(df.loc[df.index[23:25],'sales'],'k--',label='test section')
plt.plot(df.loc[df.index[:25],'naive'],color='red',label='naive')
plt.legend()
plt.show()
df[24:]

The forecasting error of the first quarter of 2006 is calculated as

$e_{24} = Y_{24} - \hat{Y}_{24} = 850 - 650 = 200$

In the same manner, you can forecast the second to fourth quarters of 2006 and compute the forecast errors.

Visual inspection shows that the series has an upward trend and a seasonal pattern. Because sales are rising over time, the naive forecast will be consistently too low.

We can modify the naive method to account for trend by adding the most recent change, that is, the difference between this period and the last period:

\begin{equation}
\hat{Y}_{t+1} = Y_t + (Y_t - Y_{t-1}) \tag{2}
\end{equation}

where the difference on the right-hand side takes into account the amount of change that occurred between quarters.

In [None]:
# Modified naive model: previous value plus the most recent change (Eq. 2)
df['naive2'] = df.sales.shift(1) + (df.sales.shift(1) - df.sales.shift(2))
df['e_naive2'] = df['sales'] - df['naive2']  # forecast error
plt.plot(df.loc[df.index[:24],'sales'],'k-',label='fitting section')
plt.plot(df.loc[df.index[23:25],'sales'],'k--',label='test section')
plt.plot(df.loc[df.index[:25],'naive2'],color='red',label='naive + trend')
plt.legend()
plt.show()
df[24:]

The forecast error with the modified naive method is

$e_{24} = Y_{24} - \hat{Y}_{24} = 850 - 900 = -50$

which is smaller in absolute value than the error of 200 from the basic naive forecast.

For some purposes, the rate of change might be more appropriate than the amount of change.

\begin{equation}
\hat{Y}_{t+1} = Y_t \frac{Y_t}{Y_{t-1}} \tag{3}
\end{equation}
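
As a rough worked example for the same quarter, take $Y_{23} = 650$ from above and $Y_{22} = 400$, the value implied by the modified naive forecast of 900 (since $900 = 650 + (650 - Y_{22})$). Under those values, Eq. (3) gives

\begin{equation}
\hat{Y}_{24} = 650 \times \frac{650}{400} = 1056.25, \qquad e_{24} = 850 - 1056.25 = -206.25
\end{equation}

so for this particular quarter the rate-of-change forecast overshoots by more than the amount-of-change forecast does.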

In [None]:
# Another modified naive model: previous value times the most recent growth ratio (Eq. 3)
df['naive3'] = df.sales.shift(1) * (df.sales.shift(1) / df.sales.shift(2))
df['e_naive3'] = df['sales'] - df['naive3']  # forecast error
plt.plot(df.loc[df.index[:24],'sales'],'k-',label='fitting section')
plt.plot(df.loc[df.index[23:25],'sales'],'k--',label='test section')
plt.plot(df.loc[df.index[:25],'naive3'],color='red',label='naive, rate of change')
plt.legend()
plt.show()
df[24:]

Visual inspection of the data also suggests seasonal variation: sales in the first and fourth quarters are typically larger than those in the second and third quarters. For quarterly data with this kind of pattern, an appropriate modification of the naive model is

\begin{equation}
\hat{Y}_{t+1} = Y_{t-3} \tag{4}
\end{equation}

This equation says that the next quarter's forecast will take the same value as the corresponding quarter a year ago.

In [None]:
# Naive method with seasonal pattern: same quarter one year earlier (Eq. 4)
df['naive4'] = df.sales.shift(4)
df['e_naive4'] = df['sales'] - df['naive4']  # forecast error
plt.plot(df.loc[df.index[:24],'sales'],'k-',label='fitting section')
plt.plot(df.loc[df.index[23:25],'sales'],'k--',label='test section')
plt.plot(df.loc[df.index[:25],'naive4'],color='red',label='seasonal naive')
plt.legend()
plt.show()
df[24:]

The major weakness of the previous approach is that it ignores any trend. We can add trend estimates to the previous approach:

\begin{equation}
\hat{Y}_{t+1} = Y_{t-3} + \frac{Y_t - Y_{t-4}}{4} \tag{5}
\end{equation}
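
One way to read the trend term in Eq. (5), offered here as an interpretation: it is the average of the four most recent quarter-to-quarter changes, in which the intermediate terms cancel:

\begin{equation}
\frac{(Y_t - Y_{t-1}) + (Y_{t-1} - Y_{t-2}) + (Y_{t-2} - Y_{t-3}) + (Y_{t-3} - Y_{t-4})}{4} = \frac{Y_t - Y_{t-4}}{4}
\end{equation}

So the forecast adds one quarter's worth of the average change over the past year to the seasonal naive value $Y_{t-3}$.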


In [None]:
# Naive method with seasonal pattern and trend:
# same quarter one year earlier plus the average quarterly change over the past year (Eq. 5)
df['naive5'] = df.sales.shift(4) + (df.sales.shift(1)-df.sales.shift(5))/4
df['e_naive5'] = df.sales - df.naive5  # forecast error
plt.plot(df.loc[df.index[:24],'sales'],'k-',label='fitting section')
plt.plot(df.loc[df.index[23:25],'sales'],'k--',label='test section')
plt.plot(df.loc[df.index[:25],'naive5'],color='red',label='seasonal naive + trend')
plt.legend()
plt.show()
df[24:]

## Averaging Methods

The naive methods may not be appropriate if the past values represent random departures from some underlying structure. In that case, managers are likely to use an averaging or smoothing technique to identify the structure. These techniques use a form of weighted average of past observations.

### Simple Averages

We can smooth historical data in many ways. One simple technique is to average all past values (the initialization part of the data) and use that average as the forecast for the next period:

\begin{equation}
\hat{Y}_{t+1} = \frac{1}{t} \sum^{t}_{i=1} Y_{i} \tag{6}
\end{equation}

When a new observation becomes available, $\hat{Y}_{t+2}$ can be calculated from the forecast value, $\hat{Y}_{t+1}$, and the new observation, $Y_{t+1}$:

\begin{equation}
\hat{Y}_{t+2} = \frac{1}{t+1} \sum^{t+1}_{i=1} Y_{i} = \frac{1}{t+1} (Y_{t+1} + \sum^{t}_{i=1} Y_{i}) = \frac{t}{t+1} \left( \frac{1}{t}Y_{t+1} + \frac{1}{t} \sum^{t}_{i=1} Y_{i}\right) = \frac{Y_{t+1} + t\hat{Y}_{t+1}}{t+1}
\end{equation}
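
The update rule above can be checked numerically. The snippet below is a minimal sketch on made-up numbers (not the gasoline data): it computes the simple-average forecast from the first $t$ values, applies the recursive update once a new observation arrives, and compares the result with the mean recomputed from scratch.

```python
import numpy as np

# Hypothetical toy observations (made-up numbers)
y = np.array([20.0, 24.0, 22.0, 26.0, 28.0])

t = 4
forecast_t1 = y[:t].mean()          # Eq. (6): forecast for period t+1 from the first t values
new_obs = y[t]                      # the observation for period t+1 becomes available
forecast_t2 = (new_obs + t * forecast_t1) / (t + 1)   # recursive update for period t+2

direct = y[:t + 1].mean()           # same forecast recomputed from all t+1 values
print(forecast_t1, forecast_t2, direct)
```

The recursive form only needs the previous forecast, the new observation, and $t$, so the full history does not have to be stored.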

<font color='red'>For what data patterns is this technique appropriate?</font>

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
gas = pd.read_csv('https://raw.githubusercontent.com/dongyakoh/business_forecasting/main/data/gasoline_sale.csv')
plt.plot(gas['sales'])
plt.show()

In [None]:
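# Simple averages
t,i = 28,27  # t: number of weeks in the initialization section; i: zero-based index of week t
# Mean of weeks 1-28, broadcast to every row; it is the forecast for week 29 (index 28)
gas['sa'] = gas.sales[:t].mean()
# Recursive update with the week-29 observation: forecast for week 30 (index 29)
gas.loc[i+2,'sa']  = (gas.sales[i+1] + t*gas.loc[i+1,'sa'])/(t+1)
gas['e_sa']  = gas.sales - gas.sa  # forecast error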
plt.plot(gas.loc[gas.index[:28],'sales'],'k-',label='init section')
plt.plot(gas.loc[gas.index[27:30],'sales'],'k--',label='test section')
plt.plot(gas.loc[gas.index[:29],'sa'],color='red',label='simple averages')
plt.legend()
plt.show()
gas[28:]

### Moving Averages

The simple average uses all past values to forecast. But what if the analyst is more concerned with recent observations?

A *moving average* is a smoothing technique in which a new mean is computed each period from the most recent observations, by adding the newest value and dropping the oldest.

A moving average of order $k$, MA(k), is computed by

\begin{equation}
\hat{Y}_{t+1} = \frac{1}{k} \sum^{t}_{i=t-k+1} Y_{i} \tag{7}
\end{equation}

Note that if you believe only the most recent (lag 1) observation is relevant for the next value, the number of terms in the moving average is $k=1$, and Eq. (7) reduces to the naive method:

\begin{equation}
\hat{Y}_{t+1} = Y_{t}
\end{equation}

**Features:**
- The MA for time period $t$ is the arithmetic mean of the $k$ most recent observations.
- In a moving average, equal weights are assigned to each observation.
- The moving average technique deals only with the latest $k$ periods of known data.
- The moving average model does not handle trend or seasonality very well, although it does better than the simple average method.

**Caveat**
- The analyst must choose the number of terms in the moving average, $k$. The smaller the number, the larger the weight given to recent periods.
- A small number is most desirable when there are sudden shifts in the level of the series.
- A large number is desirable when there are wide, infrequent fluctuations in the series.
- For quarterly data, a four-quarter moving average, MA(4), averages over a full year, so each quarter is represented once and seasonal effects are averaged out.
- For monthly data, a 12-month moving average, MA(12), eliminates the seasonal effects.
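
The effect of $k$ can be illustrated with a small sketch. The helper name `ma_forecast` and the toy series (a level shift from 10 to 20) are assumptions made for illustration only. With $k=1$ the forecast coincides with the naive forecast, while $k=3$ needs three periods to catch up with the shift.

```python
import pandas as pd

def ma_forecast(y: pd.Series, k: int) -> pd.Series:
    """One-step-ahead MA(k) forecasts: row t holds the mean of the k values before t."""
    return y.rolling(k).mean().shift(1)

# Hypothetical toy series with a sudden level shift after the fourth point
y = pd.Series([10.0, 10.0, 10.0, 10.0, 20.0, 20.0, 20.0, 20.0])

ma1 = ma_forecast(y, 1)   # k = 1: same as the naive forecast y.shift(1)
ma3 = ma_forecast(y, 3)   # k = 3: smoother, but slower to react to the shift

print(pd.DataFrame({'y': y, 'ma1': ma1, 'ma3': ma3}))
```

The first $k$ rows of each forecast column are missing because fewer than $k$ past observations are available there.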

In [None]:
# MA(5): mean of the previous five weeks; shift(1) aligns each forecast with the week it predicts
gas['ma'] = gas.sales.rolling(5).mean().shift(1)
gas['e_ma']  = gas.sales - gas.ma
plt.plot(gas.loc[gas.index[:28],'sales'],'k-',label='init section')
plt.plot(gas.loc[gas.index[27:30],'sales'],'k--',label='test section')
plt.plot(gas.loc[gas.index[:29],'ma'],color='red',label='moving averages')
plt.legend()
plt.show()
gas[28:]

### Double Moving Averages

Although I mentioned that the moving average method doesn't handle trend well, one way to handle a linear trend is to use double moving averages. As the name implies, one set of moving averages is computed, and then a second set is computed as a moving average of the first set.

**First MA:**
\begin{equation}
M_t = \hat{Y}_{t+1} = \frac{1}{k} \sum^{t}_{i=t-k+1} Y_{i}
\end{equation}

**Second MA:**
\begin{equation}
M'_t = \frac{1}{k} \sum^{t}_{i=t-k+1} M_{i}
\end{equation}

Now, we add the difference between the first and second moving averages to the first moving average, just like adding a trend to the naive method:

\begin{equation}
a_t = M_{t} + (M_t - M'_t)
\end{equation}

In addition, we compute an adjustment factor, $b_t$, which acts like a slope that can change over the series:
\begin{equation}
b_t = \frac{2}{k-1}(M_t - M'_t)
\end{equation}

Then, the equation that we use to make the forecast $p$ periods into the future is
\begin{equation}
\hat{Y}_{t+p} = a_t + b_t p
\end{equation}
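
To see why these formulas handle a linear trend, here is a minimal sketch on a made-up series that lies exactly on a straight line. The function name `double_ma_forecast` is hypothetical, and it assumes $k \geq 2$ so that the slope term is defined. On such a series, $a_t$ should recover the latest value and $b_t$ the slope, so the forecasts continue the line.

```python
import pandas as pd

def double_ma_forecast(y: pd.Series, k: int, p: int = 1) -> float:
    """Forecast p periods past the last observation using double moving averages (k >= 2)."""
    m1 = y.rolling(k).mean()          # first moving average, M_t
    m2 = m1.rolling(k).mean()         # second moving average, M'_t
    a = m1 + (m1 - m2)                # level term, a_t
    b = 2 / (k - 1) * (m1 - m2)       # slope term, b_t
    return float(a.iloc[-1] + b.iloc[-1] * p)

# Hypothetical series with an exact linear trend: 5, 8, 11, ..., 32 (slope 3)
y = pd.Series([5.0 + 3.0 * t for t in range(10)])

f1 = double_ma_forecast(y, k=3, p=1)
f2 = double_ma_forecast(y, k=3, p=2)
print(f1, f2)
```

Up to floating-point rounding, the two forecasts come out at 35 and 38, the next two points on the line. With real data the trend is only approximately linear, so the fit will not be exact.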



In [None]:
import pandas as pd
import matplotlib.pyplot as plt
vsale = pd.read_csv('https://raw.githubusercontent.com/dongyakoh/business_forecasting/main/data/video_sales.csv')
vsale.columns = 'time', 'sales'
plt.plot(vsale['sales'])
plt.show()

In [None]:
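k,p = 3,1  # k: order of both moving averages, p: forecast horizon
vsale['ma1'] = vsale.sales.rolling(k).mean()  # first moving average, M_t
vsale['ma2'] = vsale.ma1.rolling(k).mean()    # second moving average, M'_t
vsale['a'] = vsale.ma1 + (vsale.ma1 - vsale.ma2)  # level term, a_t
vsale['b'] = 2/(k-1) * (vsale.ma1 - vsale.ma2)    # slope term, b_t
# Forecast for each row made p periods earlier; shift(p) keeps the alignment correct for any p
vsale['dbma'] = vsale.a.shift(p) + vsale.b.shift(p)*p
vsale['e_dbma'] = vsale.sales - vsale.dbma  # forecast error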
plt.plot(vsale.loc[vsale.index,'sales'],'k-',label='data')
plt.plot(vsale.loc[vsale.index,'dbma'],color='red',label='db moving averages')
plt.legend()
plt.show()

vsale