Reference: Wikipedia

# Time Series Decomposition

* Additive model: $y(t) = \text{Level} + \text{Trend} + \text{Seasonality} + \text{Noise} $


* Multiplicative model: $y(t) = \text{Level} \times \text{Trend}  \times \text{Seasonality}  \times \text{Noise} $
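
The two models are related by a log transform. As a short worked step, and assuming every component is strictly positive, taking logarithms of the multiplicative model gives an additive one:

$$\log y(t) = \log \text{Level} + \log \text{Trend} + \log \text{Seasonality} + \log \text{Noise}$$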


```python
from statsmodels.tsa.seasonal import seasonal_decompose

# x should be a pandas object with a time-series index whose freq is not None.
# Otherwise, pass the seasonal period explicitly via period=
# (this argument was named freq= in older statsmodels releases).
result = seasonal_decompose(x, model='additive')

result.plot()

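# We can get the following data:
result.observed   # the original series
result.trend      # estimated trend component
result.seasonal   # estimated seasonal component
result.resid      # what remains after removing trend and seasonality
result.nobs       # number of observations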
```
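
To make the additive model concrete, here is a minimal NumPy sketch of a classical additive decomposition on synthetic data. It is not the statsmodels implementation. The series, the period of 12, and all variable names are assumptions made for illustration: the trend is estimated with a centered moving average, and the seasonal pattern with per-position averages of the detrended values.

```python
import numpy as np

rng = np.random.default_rng(0)
period = 12                      # assumed seasonal length
n = 120
t = np.arange(n)

# Synthetic additive series: level + trend + seasonality + noise.
true_seasonal = 3.0 * np.sin(2 * np.pi * t / period)
x = 10.0 + 0.5 * t + true_seasonal + rng.normal(0.0, 0.2, n)

# Centered moving average for an even period (end weights are halved).
weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
half = period // 2
trend = np.convolve(x, weights, mode="valid")      # covers t = half .. n - half - 1

# Remove the trend, then average the detrended values by position in the cycle.
detrended = x[half:n - half] - trend
positions = t[half:n - half] % period
seasonal_pattern = np.array(
    [detrended[positions == k].mean() for k in range(period)]
)
seasonal_pattern -= seasonal_pattern.mean()        # seasonal effects sum to zero

# Whatever is left is the residual (noise) estimate.
resid = detrended - seasonal_pattern[positions]

print(np.round(seasonal_pattern, 2))
```

The moving average loses `period // 2` points at each end, so `trend` and `resid` are shorter than `x`.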


## Detrending

* By differencing: $\nabla x_t = x_t - x_{t-1}$, more generally, $\nabla^d x_t = (1-B)^d x_t$, where $B$ is the backshift operator. 

Example. For a random walk model given as $x_t = x_{t-1} + w_t$ with $w_t\sim N(0,\sigma^2)$, we have $\nabla x_t = x_t - x_{t-1} = w_t$.

* By fitting a model: If a model fitted to the trend predicts $\hat{x}_t$ for $x_t$, the residuals $x_t - \hat{x}_t$ may form a detrended series.
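
Both detrending approaches can be sketched with NumPy on simulated data. The random walk, the straight-line trend model, and the parameter values below are illustrative assumptions. Any other trend model could take the place of the line fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Random walk x_t = x_{t-1} + w_t, started at x_0 = w_0.
w = rng.normal(0.0, 1.0, n)
random_walk = np.cumsum(w)

# Differencing: x_t - x_{t-1} recovers the noise w_t for t >= 1.
diffed = np.diff(random_walk)
print(np.allclose(diffed, w[1:]))   # True

# Fitting a model: a straight line fitted by least squares (assumed trend shape).
t = np.arange(n)
y = 2.0 + 0.3 * t + rng.normal(0.0, 1.0, n)
slope, intercept = np.polyfit(t, y, deg=1)
detrended = y - (intercept + slope * t)
```

Higher-order differences $\nabla^d x_t$ are available as `np.diff(x, n=d)`.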


## Deseasonalizing

* By differencing: $x_t - x_{t-p}$, where $p$ is the seasonal period (the number of observations per cycle).

    Resampling to a coarser frequency first can give a more stable result. For example, if $x$ is a temperature dataset measured daily, we can make a monthly dataset and then apply the above differencing method with $p=12$:
    ```python
    # 'M' is the month-end alias; pandas 2.2 and later prefer 'ME'.
    x = x.resample('M').mean()
    ```
    
* By fitting a model: If a model fitted to the seasonal pattern predicts $\hat{x}_t$ for $x_t$, the residuals $x_t - \hat{x}_t$ may form a deseasonalized series.
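
Seasonal differencing can be sketched in NumPy as follows. The synthetic monthly series and its parameters are assumptions chosen only to show the effect.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 12                       # seasonal period, e.g. monthly data with a yearly cycle
n = 10 * p
t = np.arange(n)

# Synthetic series: constant level + seasonal cycle + noise.
seasonal = 5.0 * np.sin(2 * np.pi * t / p)
x = 20.0 + seasonal + rng.normal(0.0, 0.5, n)

# Seasonal difference x_t - x_{t-p}; the first p values have no partner and are dropped.
seasonal_diff = x[p:] - x[:-p]

print(np.std(x), np.std(seasonal_diff))
```

The differenced series has a much smaller spread than the original, because the repeating cycle cancels and only differences of the noise remain.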

# White Noise

A time series is white noise if the variables are independent and identically distributed with a mean of zero.

If the variance changes over time, or if the values correlate with their lagged values, the time series is not white noise. `pandas.plotting.autocorrelation_plot()` can be used to inspect the correlation with lagged values.
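
The lag-correlation check can also be done numerically. This sketch uses a hypothetical helper, `sample_autocorr`, and simulated data. It compares white noise with a random walk built from the same noise.

```python
import numpy as np

def sample_autocorr(x, lag):
    """Sample autocorrelation of x at a positive integer lag."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return float(np.dot(centered[:-lag], centered[lag:]) / np.dot(centered, centered))

rng = np.random.default_rng(0)
n = 5000
white_noise = rng.normal(0.0, 1.0, n)
random_walk = np.cumsum(white_noise)

# For i.i.d. data, sample autocorrelations lie roughly within +/- 1.96 / sqrt(n).
band = 1.96 / np.sqrt(n)
wn_acf = [sample_autocorr(white_noise, k) for k in range(1, 6)]
rw_acf = [sample_autocorr(random_walk, k) for k in range(1, 6)]

print("band:", band)
print("white noise:", np.round(wn_acf, 3))
print("random walk:", np.round(rw_acf, 3))
```

The white-noise autocorrelations stay close to zero. Those of the random walk stay close to one, so the random walk is clearly not white noise.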

# AutoRegressive (AR) model

AR(p) = an autoregressive model of order p

$X_t = c + \varphi_1 X_{t-1} + \cdots + \varphi_p X_{t-p} + \epsilon_t = c + (\varphi_1 B + \cdots + \varphi_p B^p)X_t + \epsilon_t$, where $B$ is the backshift operator. 

Equivalently, $\phi[B]X_t = c + \epsilon_t$, where $\phi[B] = 1 - \varphi_1 B - \cdots -\varphi_p B^p$.
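
As an illustration, the sketch below simulates an AR(2) process and recovers its parameters by least squares. The parameter values are hypothetical. It also checks the standard stationarity condition that all roots of $\phi[z] = 0$ lie outside the unit circle.

```python
import numpy as np

rng = np.random.default_rng(0)
c, phi = 1.0, np.array([0.6, -0.2])      # hypothetical AR(2) parameters
p, n, burn_in = len(phi), 5000, 200

# Simulate X_t = c + phi_1 X_{t-1} + ... + phi_p X_{t-p} + eps_t.
eps = rng.normal(0.0, 1.0, n + burn_in)
x = np.zeros(n + burn_in)
for t in range(p, n + burn_in):
    # x[t - p:t][::-1] is (X_{t-1}, ..., X_{t-p})
    x[t] = c + phi @ x[t - p:t][::-1] + eps[t]
x = x[burn_in:]                          # drop the start-up transient

# Least squares: regress X_t on (1, X_{t-1}, ..., X_{t-p}).
lags = np.column_stack([x[p - k:len(x) - k] for k in range(1, p + 1)])
design = np.column_stack([np.ones(len(lags)), lags])
coef, *_ = np.linalg.lstsq(design, x[p:], rcond=None)
c_hat, phi_hat = coef[0], coef[1:]

# Stationarity: roots of 1 - phi_1 z - ... - phi_p z^p must have modulus > 1.
roots = np.roots(np.r_[-phi[::-1], 1.0])
is_stationary = bool(np.all(np.abs(roots) > 1.0))

print(c_hat, phi_hat, is_stationary)
```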

# Moving-Average (MA) model

MA(q) = a moving-average model of order q

$X_t = \mu + \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots +\theta_q \epsilon_{t-q} = \mu + (1 + \theta_1 B + \cdots +\theta_q B^q) \epsilon_t$, where $B$ is the backshift operator. 

Unlike the AR model, the finite MA model is always stationary.
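
A quick simulation, with hypothetical parameter values, can be compared against the known moments of an MA(1) process: mean $\mu$, variance $\sigma^2(1+\theta_1^2)$, lag-1 autocorrelation $\theta_1/(1+\theta_1^2)$, and zero autocorrelation beyond lag 1.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, theta, sigma = 2.0, 0.5, 1.0         # hypothetical MA(1) parameters
n = 20000

# X_t = mu + eps_t + theta * eps_{t-1}
eps = rng.normal(0.0, sigma, n + 1)
x = mu + eps[1:] + theta * eps[:-1]

def sample_autocorr(series, lag):
    """Sample autocorrelation at a positive integer lag."""
    centered = series - series.mean()
    return float(np.dot(centered[:-lag], centered[lag:]) / np.dot(centered, centered))

acf1 = sample_autocorr(x, 1)
acf2 = sample_autocorr(x, 2)

print("mean:", x.mean(), "theory:", mu)
print("variance:", x.var(), "theory:", sigma**2 * (1 + theta**2))
print("lag-1 autocorrelation:", acf1, "theory:", theta / (1 + theta**2))
print("lag-2 autocorrelation:", acf2, "theory: 0")
```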

# AutoRegressive–Moving-Average (ARMA) model

ARMA(p,q) = a model with p autoregressive terms and q moving-average terms

$X_t = c+ \epsilon_t  + (\varphi_1 B + \cdots + \varphi_p B^p)X_t + (\theta_1 B + \cdots +\theta_q B^q) \epsilon_t$

The error terms $\epsilon_t$ are assumed to be i.i.d. sampled from $N(0,\sigma^2)$.
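
The ARMA recursion can be written directly as a loop. In this sketch, `arma_filter` is a hypothetical helper, not a library function, and values before $t=0$ are assumed to be zero. Feeding it a unit impulse shows how the AR and MA parts combine.

```python
import numpy as np

def arma_filter(eps, phi=(), theta=(), c=0.0):
    """Apply X_t = c + eps_t + sum(phi_i X_{t-i}) + sum(theta_j eps_{t-j}) to a noise sequence."""
    eps = np.asarray(eps, dtype=float)
    x = np.zeros_like(eps)
    for t in range(len(eps)):
        # Terms reaching before t = 0 are skipped, i.e. treated as zero.
        ar = sum(phi[i] * x[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        x[t] = c + eps[t] + ar + ma
    return x

# Impulse response of an ARMA(1,1) with phi_1 = 0.5 and theta_1 = 0.4.
impulse = np.array([1.0, 0.0, 0.0, 0.0])
response = arma_filter(impulse, phi=[0.5], theta=[0.4])
print(response)   # [1.    0.9   0.45  0.225]

# Simulated path with Gaussian errors, as assumed for epsilon_t.
rng = np.random.default_rng(0)
simulated = arma_filter(rng.normal(0.0, 1.0, 300), phi=[0.5], theta=[0.4], c=1.0)
```

With `phi` and `theta` left empty, the function returns `c + eps`, which is the pure noise case.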

# Dickey–Fuller test

The Dickey–Fuller test tests the null hypothesis that a unit root is present in an autoregressive model.

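The AR(1) model is $x_t = \rho x_{t-1} + \epsilon_t$. A unit root is present if $\rho=1$. 

Subtracting $x_{t-1}$ from both sides, the model can be written as $\Delta x_t = (\rho - 1) x_{t-1} + \epsilon_t = \gamma x_{t-1} + \epsilon_t$, where $\Delta x_t = x_t - x_{t-1}$ and $\gamma = \rho -1$. 

The null hypothesis of a unit root then corresponds to $\gamma = 0$, and the test statistic is the t-ratio of the estimated $\gamma$. Since the test is done over the residual term rather than raw data, the standard t-distribution cannot be used to provide critical values. Instead, this statistic $t$ has a specific distribution whose critical values are tabulated in the Dickey–Fuller table.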

There are three main versions of the test:

1. Test for a unit root: $\Delta x_t = \gamma x_{t-1} + \epsilon_t$
1. Test for a unit root with drift: $\Delta x_t = \alpha + \gamma x_{t-1} + \epsilon_t$
1. Test for a unit root with drift and deterministic time trend: $\Delta x_t = \alpha + \beta t + \gamma x_{t-1} + \epsilon_t$
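
The first version of the test (no drift, no time trend) amounts to a regression through the origin. The sketch below is an illustration on simulated data; `dickey_fuller_stat` is a hypothetical helper and the series are assumptions. It computes $\hat{\gamma}$ and its t-ratio, but does not look up Dickey–Fuller critical values.

```python
import numpy as np

def dickey_fuller_stat(x):
    """Estimate gamma and its t-ratio in  dx_t = gamma * x_{t-1} + eps_t  (no constant, no trend)."""
    x = np.asarray(x, dtype=float)
    dx, lagged = np.diff(x), x[:-1]
    gamma_hat = np.dot(lagged, dx) / np.dot(lagged, lagged)
    resid = dx - gamma_hat * lagged
    sigma2 = np.dot(resid, resid) / (len(dx) - 1)      # residual variance
    t_ratio = gamma_hat / np.sqrt(sigma2 / np.dot(lagged, lagged))
    return gamma_hat, t_ratio

rng = np.random.default_rng(0)
n = 2000
eps = rng.normal(0.0, 1.0, n)

random_walk = np.cumsum(eps)             # rho = 1: unit root
stationary = np.zeros(n)                 # rho = 0.5: no unit root
for t in range(1, n):
    stationary[t] = 0.5 * stationary[t - 1] + eps[t]

gamma_rw, t_rw = dickey_fuller_stat(random_walk)
gamma_st, t_st = dickey_fuller_stat(stationary)

print("random walk:", gamma_rw, t_rw)
print("stationary: ", gamma_st, t_st)
```

For the stationary series, $\hat{\gamma}$ is close to $\rho - 1 = -0.5$ and the t-ratio is strongly negative. For the random walk, $\hat{\gamma}$ is close to zero.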

# Augmented Dickey–Fuller test

The testing procedure for the ADF test is the same as for the Dickey–Fuller test but it is applied to the model

$\Delta x_t = \alpha + \beta t + \gamma x_{t-1} + \delta_1\Delta x_{t-1} + \cdots + \delta_{p-1} \Delta x_{t-p+1} + \epsilon_t$, where $p$ is the lag order of the autoregressive process.

The ADF statistic is usually a negative number. The more negative it is, the stronger the rejection of the null hypothesis that there is a unit root, at a given level of confidence.

```python
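import numpy as np
from statsmodels.tsa.stattools import adfuller

# random_walk is a sequence of numbers; here, a simulated random walk for illustration.
rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.normal(size=500))
result = adfuller(random_walk)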

result[0]      # ADF statistic
result[1]      # p-value
result[4]      # Critical values at the 1 %, 5 %, and 10 % levels (dict keyed '1%', '5%', '10%')
```