In [1]:
# https://www.kaggle.com/code/ryanholbrook/time-series-as-features

## What is Serial Dependence?

In earlier lessons, we investigated properties of time series that were most easily modeled as *time dependent* properties, that is, with features we could derive directly from the time index. Some time series properties, however, can only be modeled as *serially dependent* properties, that is, using past values of the target series as features. The structure of these time series may not be apparent from a plot over time; plotted against past values, however, the structure becomes clear -- as we see in the figure below.

![image.png](attachment:5b6c088a-01cf-4d30-8f78-ad0b5535823d.png)

*These two series have serial dependence, but not time dependence. Points on the right have coordinates (value at time t-1, value at time t).*

With trend and seasonality, we trained models to fit curves to plots like those on the left in the figure above -- the models were learning time dependence. The goal in this lesson is to train models to fit curves to plots like those on the right -- we want them to learn serial dependence.
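
To make the distinction concrete, here is a minimal sketch using simulated data rather than the series in the figure. It assumes a simple process in which each value is `phi` times the previous value plus random noise; the names `phi`, `time_corr`, and `lag_corr` and the specific settings are illustrative choices, not part of the lesson.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

n = 500
phi = 0.9  # assumed strength of dependence on the previous value

# Build y[t] = phi * y[t-1] + noise[t]; nothing here depends on t itself.
noise = rng.normal(size=n)
values = np.zeros(n)
for t in range(1, n):
    values[t] = phi * values[t - 1] + noise[t]
y = pd.Series(values, name="y")

# Time dependence: correlation between the series and its time step.
time_corr = y.corr(pd.Series(np.arange(n), index=y.index))

# Serial dependence: correlation between the series and its first lag.
lag_corr = y.corr(y.shift(1))

print(f"correlation with time step: {time_corr:.2f}")
print(f"correlation with lag 1:     {lag_corr:.2f}")
```

With settings like these, the correlation with the first lag should come out close to `phi`, while the correlation with the time step is typically much weaker. That is the pattern of serial dependence without time dependence.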

## Cycles

One especially common way for serial dependence to manifest is in **cycles**. Cycles are patterns of growth and decay in a time series associated with how the value in a series at one time depends on values at previous times, but not necessarily on the time step itself. Cyclic behavior is characteristic of systems that can affect themselves or whose reactions persist over time. Economies, epidemics, animal populations, volcano eruptions, and similar natural phenomena often display cyclic behavior.

![image.png](attachment:16770c43-0223-43cf-ae28-f8b0e6c25d6f.png)

What distinguishes cyclic behavior from seasonality is that cycles are not necessarily time dependent, as seasons are. What happens in a cycle is less about the particular date of occurrence, and more about what has happened in the recent past. The (at least relative) independence from time means that cyclic behavior can be much more irregular than seasonality.
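
The lesson does not commit to a particular model of cycles. As one possible illustration, the sketch below simulates a series whose value depends only on its two previous values plus noise (a second-order autoregression), and then recovers that dependence by regressing the series on its own lags. The coefficients `a1` and `a2` are assumed values chosen to produce irregular rises and falls; they are not taken from any dataset in this lesson.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is repeatable

n = 1000
a1, a2 = 1.6, -0.8  # assumed coefficients, chosen to give damped oscillations

# y[t] depends on the two previous values and noise, never on t itself.
noise = rng.normal(size=n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = a1 * y[t - 1] + a2 * y[t - 2] + noise[t]

# Regress y[t] on its two previous values with ordinary least squares.
X = np.column_stack([y[1:-1], y[:-2]])  # columns: lag 1, lag 2
target = y[2:]
coef, *_ = np.linalg.lstsq(X, target, rcond=None)

print("estimated lag coefficients:", np.round(coef, 2))
```

The estimated coefficients should land near `a1` and `a2`, which suggests that lag features alone can capture this kind of cycle even though its peaks do not fall on fixed dates.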

## Lagged Series and Lag Plots

To investigate possible serial dependence (like cycles) in a time series, we need to create "lagged" copies of the series. **Lagging** a time series means shifting its values forward one or more time steps, or equivalently, shifting the times in its index backward one or more steps. In either case, the effect is that the observations in the lagged series will appear to have happened later in time.

The following example shows the monthly unemployment rate in the US (`y`) together with its first and second lagged series (`y_lag_1` and `y_lag_2`, respectively). Notice how the values of the lagged series are shifted forward in time.

In [2]:
import pandas as pd

# Federal Reserve dataset: https://www.kaggle.com/federalreserve/interest-rates
reserve = pd.read_csv(
    "DataSources/reserve.csv",
    parse_dates={'Date': ['Year', 'Month', 'Day']},
    index_col='Date',
)

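# Monthly unemployment rate, indexed by period (one row per month)
y = reserve.loc[:, 'Unemployment Rate'].dropna().to_period('M')

# shift(k) moves each value k rows down, so row t holds the value from t-k
df = pd.DataFrame({
    'y': y,
    'y_lag_1': y.shift(1),
    'y_lag_2': y.shift(2),
})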

df.head()

| Date    | y   | y_lag_1 | y_lag_2 |
|---------|-----|---------|---------|
| 1954-07 | 5.8 | NaN     | NaN     |
| 1954-08 | 6.0 | 5.8     | NaN     |
| 1954-09 | 6.1 | 6.0     | 5.8     |
| 1954-10 | 5.7 | 6.1     | 6.0     |
| 1954-11 | 5.3 | 5.7     | 6.1     |
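
The first rows of each lagged column are missing because there is no earlier observation to shift into them. As a rough sketch of how lag columns might be prepared as model features, the example below rebuilds the five values shown above by hand and drops the incomplete rows. The helper name `make_lag_frame` is hypothetical, and dropping rows is only one option (filling the gaps is another).

```python
import pandas as pd

# The first five unemployment values from the output above, entered by hand.
y = pd.Series(
    [5.8, 6.0, 6.1, 5.7, 5.3],
    index=pd.period_range("1954-07", periods=5, freq="M"),
    name="y",
)


def make_lag_frame(series, n_lags):
    """Hypothetical helper: one column per lag, from 1 to n_lags."""
    lags = {f"y_lag_{k}": series.shift(k) for k in range(1, n_lags + 1)}
    return pd.DataFrame(lags)


X = make_lag_frame(y, n_lags=2)

# Rows without a complete history contain NaN; drop them here.
X = X.dropna()

# Keep only the target values that line up with the remaining feature rows.
target = y.loc[X.index]

print(X)
print(target)
```

After dropping, the features and target share the same index, so row `1954-09` pairs the target 6.1 with the lags 6.0 and 5.8.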
