## Advanced Techniques for Complex Time Series

Time series data can contain complex seasonality. For example, hourly data can exhibit daily, weekly, and yearly seasonal patterns. With the rise of connected devices, such as IoT devices and sensors, data is being recorded more frequently. Many classical time series datasets used in research papers were small and recorded at a low frequency, such as annually or monthly, and such data typically contains at most one seasonal pattern. More recent datasets and research use higher-frequency data, recorded in hours or minutes.

Here we will explore algorithms that can model a time series with multiple seasonal patterns, both for forecasting and for decomposing the series into its components.

We will cover the following topics:

* Decomposing time series with multiple seasonal patterns using **MSTL**
* Forecasting with multiple seasonal patterns using the **Unobserved Components Model (UCM)**
* Forecasting time series with multiple seasonal patterns using **Prophet**
* Forecasting time series with multiple seasonal patterns using **NeuralProphet**


In [1]:
import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt 
from pathlib import Path
from statsmodels.tools.eval_measures import rmse, rmspe 
import warnings 
warnings.filterwarnings('ignore')


In [2]:
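# Path to the hourly AEP dataset (note: this points to the CSV file itself)
folder = Path('../TimeSeriesAnalysisWithPythonCookbook/Data/AEP_hourly.csv')

# Use the Datetime column as a DatetimeIndex
df = pd.read_csv(folder, index_col='Datetime', parse_dates=True)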

In [3]:
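# Clean up the data

# Put the observations in chronological order
df.sort_index(inplace=True)
# Enforce a regular hourly index; duplicate timestamps collapse to their max
df = df.resample('H').max()
# Fill the gaps introduced by resampling with the last known value
df.ffill(inplace=True)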
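
To make the effect of these three cleanup steps concrete, here is a minimal sketch on a tiny synthetic frame. The column name `value` and the timestamps are made up for illustration and are not taken from the AEP dataset. The sketch uses the `'60min'` alias, which is equivalent to hourly resampling, because the `'H'` alias is deprecated in recent pandas versions in favor of `'h'`.

```python
import pandas as pd

# Hypothetical toy data: unsorted, with one duplicate timestamp (02:00)
# and two missing hours (01:00 and 03:00)
idx = pd.to_datetime([
    "2020-01-01 02:00",
    "2020-01-01 00:00",
    "2020-01-01 02:00",
    "2020-01-01 04:00",
])
toy = pd.DataFrame({"value": [130.0, 100.0, 120.0, 150.0]}, index=idx)
toy.index.name = "Datetime"

toy = toy.sort_index()             # chronological order
toy = toy.resample("60min").max()  # regular hourly grid; duplicates -> max
toy = toy.ffill()                  # missing hours take the previous value

print(toy)
```

After these steps, the index is a regular, gap-free hourly grid: the duplicate at 02:00 is reduced to a single row holding the larger value, and the missing hours are filled from the preceding observation.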

## Understanding state-space models

**State-space models (SSMs)** have their roots in the field of engineering and offer a generic approach to modeling dynamic systems and how they evolve over time. SSMs are also widely used in other fields, such as economics, neuroscience, and electrical engineering.

In time series data, the central idea behind SSMs is that of **latent variables**, also called **states**, which are continuous and evolve sequentially through time. For example, in a univariate time series, we have a response variable at time **t**; this is the observed value, termed **$Y_{t}$**, which depends on the true variable, termed **$X_{t}$**. The **$X_{t}$** variable is the latent variable that we are interested in estimating: it is **unobserved**, either because it is hidden or because it cannot be measured directly. An SSM provides a system of equations to estimate these unobserved states from the observed values and is represented mathematically in vector-matrix form. In general, there are two equations: a _state equation_ and an _observation equation_. One key reason for their popularity is their flexibility: they can handle complex time series data that is multivariate, non-stationary, or non-linear, or that contains multiple seasonal patterns, gaps, or irregularities.
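
As an illustration, one common special case is the linear Gaussian state-space model. The matrices $F$, $H$, $Q$, and $R$ below are notation assumed here for the sketch; only $X_{t}$ and $Y_{t}$ come from the description above.

$$
\begin{aligned}
X_{t} &= F\,X_{t-1} + w_{t}, & w_{t} &\sim \mathcal{N}(0, Q) && \text{(state equation)} \\
Y_{t} &= H\,X_{t} + v_{t}, & v_{t} &\sim \mathcal{N}(0, R) && \text{(observation equation)}
\end{aligned}
$$

The state equation describes how the latent state evolves from one time step to the next, while the observation equation describes how each observed value is generated from the current state plus measurement noise.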

In addition, SSMs can be generalized and come in various forms, several of which make use of **Kalman filters**. The benefit of using SSMs with time series data is that they can be used for _filtering, smoothing, or forecasting_, as we will explore.

**KALMAN FILTERS**

The Kalman filter is an algorithm for extracting signals from data that is either noisy or contains incomplete measurements. The premise behind Kalman filters is that not every state within a system is directly observable; instead, we can estimate the state indirectly, using observations that may be contaminated, incomplete, or noisy.

For example, sensor devices produce time series data known to be incomplete due to interruptions or unreliable due to noise. Kalman filters are well suited to time series data with a considerable amount of noise relative to the signal (a low signal-to-noise ratio), as they smooth and denoise the data to make it more reliable.
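
The idea can be sketched with a minimal one-dimensional Kalman filter for a local-level model (a random walk observed with noise), written from scratch with NumPy. The function name `kalman_filter_local_level`, the simulated sensor data, and the variance values are all hypothetical choices for illustration; in practice the variances would be estimated rather than assumed known.

```python
import numpy as np

def kalman_filter_local_level(y, obs_var, state_var, x0=0.0, p0=1e6):
    """Filter a 1-D series under a random-walk-plus-noise model.

    NaN entries in y are treated as missing: the update step is skipped
    and the prediction is carried forward.
    """
    n = len(y)
    x_filt = np.empty(n)  # filtered state estimates
    p_filt = np.empty(n)  # variances of those estimates
    x_pred, p_pred = x0, p0  # vague prior on the initial state
    for t in range(n):
        if np.isnan(y[t]):
            # No measurement: keep the prediction as the estimate
            x_filt[t], p_filt[t] = x_pred, p_pred
        else:
            # Update: blend prediction and measurement via the Kalman gain
            gain = p_pred / (p_pred + obs_var)
            x_filt[t] = x_pred + gain * (y[t] - x_pred)
            p_filt[t] = (1.0 - gain) * p_pred
        # Predict: a random-walk state keeps its mean, uncertainty grows
        x_pred = x_filt[t]
        p_pred = p_filt[t] + state_var
    return x_filt, p_filt

# Simulated noisy sensor readings with an interruption (assumed data)
rng = np.random.default_rng(0)
n = 300
true_state = np.cumsum(rng.normal(0, 0.1, n))
y = true_state + rng.normal(0, 1.0, n)
y[50:60] = np.nan  # ten missing measurements

x_filt, p_filt = kalman_filter_local_level(y, obs_var=1.0, state_var=0.01)
```

During the gap, the filter keeps producing estimates while `p_filt` grows, which reflects the increasing uncertainty when no measurements arrive.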

## Decomposing time series with multiple seasonal patterns using MSTL

