# Time-Series Analysis: The Basics

<hr>

**Basic Understanding**<br>
A time series is a collection of observations (realizations of random variables) $x_1, \dots, x_n \in \mathbb {R}$ indexed by time, typically at fixed intervals.

- Components: $X_t = T_t + S_t + Z_t$, where $T_t$, $S_t$, $Z_t$ are trend, seasonality and noise respectively while subscript $t$ denotes a deterministic sequence of time stamps that are regularly spaced with equal intervals between any two adjacent stamps

- Smoothness: Correlation in time (previous values predict the current value). Less smoothness means lower correlation in time
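
As a rough illustration of the two points above, the sketch below builds a synthetic series from the three components and uses the lag-1 correlation as a crude measure of smoothness. The slope, period, noise level and the helper `lag1_correlation` are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

n = 200
t = np.arange(n)

trend = 0.05 * t                           # T_t: slow linear drift (assumed slope)
season = 2.0 * np.sin(2 * np.pi * t / 12)  # S_t: cycle with an assumed period of 12
noise = rng.normal(0.0, 1.0, size=n)       # Z_t: independent Gaussian noise

x = trend + season + noise                 # X_t = T_t + S_t + Z_t


def lag1_correlation(series):
    """Correlation between the series and itself shifted by one step."""
    return np.corrcoef(series[:-1], series[1:])[0, 1]


print("lag-1 correlation of X_t:", lag1_correlation(x))
print("lag-1 correlation of Z_t:", lag1_correlation(noise))
```

The composed series should show a clearly higher lag-1 correlation than the pure noise, since the trend and seasonal parts change slowly from one time stamp to the next.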

The most important feature of time series data is that we do not assume $iid$ random variables. Most time series data are dependent, as past realizations of the RV influence future observations.

A general probabilistic model for a real-world phenomenon that evolves in time is called a stochastic process, which is simply a collection of random variables $X_t$ indexed by either continuous or discrete time.

A time series data set can be thought of as a single realization of a stochastic process. Each random variable $X_t$ has a marginal distribution $P_t$, and the stochastic process, as the collection of all {$X_t$}, is described by the joint distribution of all the $X_t$'s.
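
To make the distinction between one realization and the marginal distribution $P_t$ concrete, here is a minimal sketch that simulates many realizations of a random walk. The random walk is an assumed example process, not one taken from these notes. Each row is one realization (what we would observe in practice), while each column samples the marginal distribution at a fixed time stamp.

```python
import numpy as np

rng = np.random.default_rng(1)

n_realizations, n_steps = 5000, 100

# Each row is one realization of a random walk X_t = X_{t-1} + W_t, W_t ~ N(0, 1).
steps = rng.normal(0.0, 1.0, size=(n_realizations, n_steps))
paths = np.cumsum(steps, axis=1)

one_series = paths[0]          # what we observe in practice: a single realization

# Marginal distribution P_t at a fixed t, seen across realizations (one column).
var_t10 = paths[:, 9].var()    # X_10: theoretical variance is 10
var_t100 = paths[:, 99].var()  # X_100: theoretical variance is 100

print(var_t10, var_t100)
```

For this particular process the marginal variance grows with $t$, so the marginal distributions differ from one time stamp to the next.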

<hr>

**Deterministic dependencies: trend, seasonality**

Decomposition of the deterministic dependence of a time series: let $\mu_X (t)$ denote the mean function of the time series.

$X_t = \mu_X (t) + W_t, \quad \mu_X (t) = m_X (t) + s_X (t)$

where $m_X (t)$, $s_X (t)$ and $W_t$ denote the trend (*non-constant, non-cyclical*), seasonality (*cyclical, periodic*) and white noise (iid, $N(0, \sigma^2)$) components respectively. The mean function collects the deterministic parts, since $W_t$ has mean zero.

<hr>

**Stationarity / Stochastic Dependence**

- Marginal mean per time stamp: $\mu_X (t) = \mathbb {E}[X_t]$

- Marginal variance: $var(X_t) = \mathbb {E} [(X_t - \mu_X (t))^2]$

- Autocovariance: $\gamma_X (s, t)$, where $s$, $t$ are different time points

    $\gamma_X (s, t) = cov(X_s, X_t) = \mathbb {E} [(X_s - \mu_X (s))(X_t - \mu_X (t))]$
    
    To estimate the covariance at a given time gap (*say, 50*), use all pairs of stationarized observations that are separated by that gap. Under stationarity the covariance depends only on the gap: for example, the covariance between $X_1$ and $X_5$ should be the same as that between $X_{51}$ and $X_{55}$.
    
    Typically, the autocovariance decays as the time distance between observations increases, while the variance of $X_t$ stays constant over time.
    

The idea of stationarity is to remove the trend and seasonality such that observations of the process are representative of all possible realizations of the random variable.

$\therefore$ We need conditions that allow us to estimate population parameters of the whole process (e.g. expectations, variances, correlations) with time averages over a single realization of the process. Weak stationarity requires the first and second moments (mean, variance, autocovariance) to be the same at every time stamp. Strong stationarity requires the whole joint distribution, and hence all moments, to be invariant under time shifts.

A time series {$X_t$} is strongly stationary if the joint distribution of $X_t, \dots, X_{t+n}$ is the same as the joint distribution of $X_{t+h}, \dots, X_{t+n+h}$ for all integers $n$, time stamps $t$ and time shifts $h$.

Weak stationarity only requires that the first two moments (mean, variance/covariance) be constant in time:

$\mathbb {E} [X_t] = \mu_X$

$Var(X_t) = \sigma_X^2$

$Cov(X_s, X_t) = \gamma_X (\lvert s - t \rvert)$ for all time stamps, $s$, $t$
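
The three conditions above can be checked numerically on a simulated process. The sketch below assumes a simple example process $X_t = W_t + 0.5 W_{t-1}$ with standard normal $W_t$ (not taken from these notes), for which $\gamma_X(0) = 1.25$, $\gamma_X(1) = 0.5$ and $\gamma_X(h) = 0$ for $h \geq 2$. The helper `ensemble_cov` is hypothetical and averages over many realizations rather than over time.

```python
import numpy as np

rng = np.random.default_rng(2)

n_realizations, n_steps = 20000, 60
w = rng.normal(0.0, 1.0, size=(n_realizations, n_steps + 1))

# Assumed example process: X_t = W_t + 0.5 * W_{t-1} (each row is one realization).
x = w[:, 1:] + 0.5 * w[:, :-1]


def ensemble_cov(s, t):
    """Estimate Cov(X_s, X_t) by averaging over realizations (0-based indices)."""
    xs = x[:, s] - x[:, s].mean()
    xt = x[:, t] - x[:, t].mean()
    return np.mean(xs * xt)


# Same gap |s - t| = 1 at two different positions: both should be near 0.5.
print(ensemble_cov(4, 5), ensemble_cov(50, 51))
# Gap 0 gives the variance, near 1.25 at any t; gap 2 should be near 0.
print(ensemble_cov(10, 10), ensemble_cov(10, 12))
```

The estimates depend on the gap $\lvert s - t \rvert$ but not on where the pair sits in time, which is what weak stationarity asks for.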

Detect non-stationarity using the following:

- Visualizations
- Autocovariance: trends/seasonal effects, changes in distribution
    - If the stochastic dependencies (i.e. correlations) in the time series decay sufficiently fast as the time distance gets larger, then the series satisfies mild conditions for asymptotic behaviour as in the CLT and LLN for iid data, and we have good estimators for the mean, variance and autocovariance
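
One crude numerical companion to a visual check is to compare summary statistics across different parts of the series. The sketch below is only an informal diagnostic under assumed data (white noise with and without a linear trend), and `half_summary` is a hypothetical helper, not a formal test.

```python
import numpy as np

rng = np.random.default_rng(3)


def half_summary(series):
    """Return (mean, variance) of the first and second half of a series."""
    first, second = np.array_split(np.asarray(series, dtype=float), 2)
    return (first.mean(), first.var()), (second.mean(), second.var())


n = 400
noise = rng.normal(0.0, 1.0, size=n)     # stationary reference series
trending = 0.02 * np.arange(n) + noise   # same noise plus an assumed linear trend

(m1, v1), (m2, v2) = half_summary(noise)
(tm1, tv1), (tm2, tv2) = half_summary(trending)

print("noise    means:", m1, m2)    # similar across halves
print("trending means:", tm1, tm2)  # second half clearly higher
```

A large gap between the two halves hints at a trend or a change in distribution. A small gap does not prove stationarity.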


How to stationarize time series:

- Removing trend: 
    - Regression, estimate $X_t' = X_t - \hat{T}_t$ where $\hat{T}_t = \hat{\beta}_1 t + \hat{\beta}_0$
    
    
- Remove seasonality:
    - Regression, estimate $X_t' = X_t - \hat{S_t}$
    - Subtract weekly/monthly/yearly averages, estimate a vector of $\hat\mu_{weekday}$ and subtract this from the time series
    - Fourier analysis
    - Smoothing, e.g. by a weighted moving average, $Y_t = \sum_{h = -k}^{k} \alpha_h X_{t+h}$; exponential moving averages are a one-sided variant that gives more weight to recent data
    

- Non-linear transformations (e.g. log or square-root transformations when the variance increases over time), $Y_t = \log (X_t - \mu_t)$, $Y_t = \sqrt{X_t - \mu_t}$


- Differencing between previous values, $Y_t = \nabla X_t = X_t - X_{t-1}$ (removes linear trend)
    - Removing quadratic trend by differencing twice, $\nabla^2 X_t = \nabla X_t - \nabla X_{t-1}$
    - {$X_t$} is integrated of order $p$ if $\nabla^p X_t$ is stationary, where $p$ is limited by the length of the time series. The higher $p$ is, the fewer data points remain in the stationarized data
    - Differencing the data is also a standard method to transform a time series with persistent stochastic dependencies (seasonality) into a stationary time series
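
A minimal sketch of three of the steps listed above (regression detrending, subtracting per-period averages, and differencing), applied to a synthetic series. The slope, the period of 12 and the noise level are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

n, period = 240, 12                 # assumed: 240 observations, cycle length 12
t = np.arange(n)
x = 0.1 * t + 3.0 * np.sin(2 * np.pi * t / period) + rng.normal(0.0, 0.5, size=n)

# 1) Remove the trend by regression: fit T_t = beta_1 * t + beta_0, then subtract.
beta_1, beta_0 = np.polyfit(t, x, deg=1)
detrended = x - (beta_1 * t + beta_0)

# 2) Remove seasonality by subtracting the average of each position in the cycle.
phase = t % period
seasonal_means = np.array([detrended[phase == k].mean() for k in range(period)])
residual = detrended - seasonal_means[phase]

# 3) Alternative to step 1: first differences remove a linear trend.
diffed = np.diff(x)                 # X_t - X_{t-1}, one observation shorter
twice_diffed = np.diff(x, n=2)      # second differences, two observations shorter

print("estimated slope:", beta_1)
print("std before / after removing seasonality:", detrended.std(), residual.std())
print("lengths:", len(x), len(diffed), len(twice_diffed))
```

Each round of differencing shortens the series by one observation, which is the practical limit on the order of differencing mentioned above.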
    
    
****

**Properties of autocovariance, $\gamma_X (s, t)$**

Autocovariance: $\gamma_X (s,t) = cov(X_s, X_t) = \mathbb {E} [(X_s - \mu_X (s)) (X_t - \mu_X (t))]$

Autocorrelation (ACF): $\rho_X (s,t) = \frac{\gamma_X (s, t)}{\sqrt{\gamma_X (s,s) \cdot \gamma_X (t,t)}}$

- Symmetric, $\rho_X (s, t) = \rho_X (t, s)$
- Measures linear dependence of $X_t$, $X_s$
- Relates to smoothness (if decay of autocovariance is slow then time series is smooth)
- For weakly stationary series: $\gamma_X (t, t+h) = \gamma_X (0, h) =: \gamma_X(h)$

<img alt="ACF Decay (Smooth)" src="assets/acf_smoothness.png" width="300">


****

**Sample estimates for stationary series**

- Mean: $\hat\mu = \bar{x} = \frac{1}{n} \sum_{t=1}^{n} x_t$


- Variance: $\hat\sigma_X^2 = \frac{1}{n} \sum_{t=1}^{n} (x_t - \bar{x})^2$


- Autocovariance: $\hat\gamma_X (h) = \frac{1}{n} \sum_{t=1}^{n-\lvert h \rvert} (x_t - \bar{x}) (x_{t+\lvert h \rvert} - \bar{x})$ for $-n < h < n$ where $h$ is the time gap


- Autocorrelation: $\hat\rho_X (h) = \hat\gamma_X (h) / \hat\gamma_X (0)$ for $-n < h < n$ where $\hat\gamma_X (0)$ is the variance
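
The estimators above translate almost directly into NumPy. The following is a minimal sketch, with hypothetical helper names `sample_autocovariance` and `sample_autocorrelation`, that follows the $\frac{1}{n}$ normalisation used in the formulas.

```python
import numpy as np


def sample_autocovariance(x, h):
    """Sample autocovariance at lag h, using the 1/n normalisation."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    h = abs(h)                      # the estimator is symmetric in h
    centered = x - x.mean()
    return np.sum(centered[: n - h] * centered[h:]) / n


def sample_autocorrelation(x, h):
    """Sample autocorrelation at lag h: gamma_hat(h) / gamma_hat(0)."""
    return sample_autocovariance(x, h) / sample_autocovariance(x, 0)


data = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
for h in range(3):
    print(h, sample_autocovariance(data, h), sample_autocorrelation(data, h))
```

For this small data set the lag-0 value is $60/9 \approx 6.667$, which is simply the sample variance, and the lag-1 value is $40/9 \approx 4.444$.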

# Basic code
A minimal, reproducible example:

```python
# Compute autocovariances of a linearly increasing series
import numpy as np
from statsmodels.tsa.stattools import acovf

data = np.array([-4, -3, -2, -1, 0, 1, 2, 3, 4])
auto_covariances = acovf(data)  # one value per lag h = 0, 1, ..., n - 1

# Print each lag h next to its sample autocovariance
for h, gamma in enumerate(auto_covariances, start=0):
    print(h, gamma)
```

```
0 6.666666666666667
1 4.444444444444445
2 2.3333333333333335
3 0.4444444444444444
4 -1.1111111111111112
5 -2.2222222222222223
6 -2.7777777777777777
7 -2.6666666666666665
8 -1.7777777777777777
```


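```python
# Compute autocovariances of a periodic series (the pattern repeats every 4 steps)
import numpy as np
from statsmodels.tsa.stattools import acovf

data = np.array([-1, 0, 1, 0, -1, 0, 1, 0])
auto_covariances = acovf(data)  # one value per lag h = 0, 1, ..., n - 1

# Print each lag h next to its sample autocovariance
for h, gamma in enumerate(auto_covariances, start=0):
    print(h, gamma)
```

```
0 0.5
1 0.0
2 -0.375
3 0.0
4 0.25
5 0.0
6 -0.125
7 0.0
```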
