# Time-Series Analysis: The Basics

<hr>

**Basic Understanding**<br>
A time series is a collection of observations (realizations of random variables) $x_1, \dots, x_n \in \mathbb {R}$ indexed by time and recorded at fixed time intervals.

- Components: $X_t = T_t + S_t + Z_t$, where $T_t$, $S_t$ and $Z_t$ are the trend, seasonality and noise respectively, and the subscript $t$ runs over a deterministic sequence of regularly spaced time stamps

- Smoothness: Correlation in time (previous values predict the current value). A less smooth series has lower correlation in time
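
The additive components can be sketched by simulating each one separately and summing them. This is a minimal illustration only: the linear trend, the sinusoidal seasonality, the Gaussian noise and every parameter value (`slope`, `period`, `amplitude`, `noise_sd`) are assumptions chosen for the example, not part of the notes above.

```python
import math
import random

def simulate_series(n=120, slope=0.05, period=12, amplitude=2.0, noise_sd=0.5, seed=0):
    """Return (trend, seasonal, noise, x) lists for X_t = T_t + S_t + Z_t."""
    rng = random.Random(seed)
    # T_t: a linear trend (assumed shape, any slowly varying function would do)
    trend = [slope * t for t in range(n)]
    # S_t: a sinusoid that repeats every `period` time stamps
    seasonal = [amplitude * math.sin(2 * math.pi * t / period) for t in range(n)]
    # Z_t: independent Gaussian noise with mean zero
    noise = [rng.gauss(0.0, noise_sd) for _ in range(n)]
    # X_t: the observed series is the sum of the three components
    x = [tr + s + z for tr, s, z in zip(trend, seasonal, noise)]
    return trend, seasonal, noise, x

trend, seasonal, noise, x = simulate_series()
```

Raising `noise_sd` relative to `slope` and `amplitude` makes the simulated series less smooth, since adjacent values then share less of their variation.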

The most important feature of time series data is that we do not assume the random variables are $iid$. Most time series data are dependent, as past realizations of the random variables influence future observations.

A general probabilistic model for a real-world phenomenon that evolves in time is called a stochastic process, which is simply a collection of random variables $X_t$ indexed by either continuous or discrete time.

A time series data set can be thought of as a single realization of a stochastic process. Each random variable $X_t$ has a marginal distribution $P_t$, and the stochastic process, i.e. the collection {$X_t$}, is described by the joint distribution of all the $X_t$'s.
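
To make the distinction between one realization and the marginal distribution $P_t$ concrete, the sketch below draws many realizations of the same process and looks across them at a fixed time stamp. The random walk, the number of paths and the seed are assumptions chosen for illustration; in practice only one such path is observed.

```python
import random
import statistics

def random_walk(n, rng):
    """One realization of X_t = X_{t-1} + e_t with X_0 = 0 and e_t ~ N(0, 1)."""
    path, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

rng = random.Random(42)
paths = [random_walk(100, rng) for _ in range(2000)]  # 2000 realizations

# Marginal distribution P_t: fix t and look across realizations.
var_t10 = statistics.pvariance([p[9] for p in paths])    # after 10 steps, theory gives 10
var_t100 = statistics.pvariance([p[99] for p in paths])  # after 100 steps, theory gives 100
```

Each element of `paths` plays the role of one possible data set, while `var_t10` and `var_t100` summarize two different marginal distributions of the same process.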

<hr>

**Deterministic dependencies: trend, seasonality**

The deterministic dependence of a time series can be decomposed through its mean function $\mu_X (t) = \mathbb {E}[X_t]$:

$X_t = \mu_X (t) + W_t$, with $\mu_X (t) = m_X (t) + s_X (t)$

where $m_X (t)$, $s_X (t)$ and $W_t$ denote the trend (*non-constant, non-cyclical*), seasonality (*cyclical, periodic*) and white noise time series (iid, $N(0, \sigma^2)$) components respectively. The noise $W_t$ has mean zero, so it is part of the series $X_t$ but not of the mean function itself.
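
One common way to estimate the trend and seasonality components is the classical moving-average decomposition, sketched below. It is a swapped-in textbook technique rather than a method stated in these notes, and the helper names (`centered_moving_average`, `seasonal_means`), the known `period` and the simulated data are all assumptions made for the example.

```python
import math
import random

def centered_moving_average(x, period):
    """Trend estimate: average over one full period centered at t.

    For an even period the two end points get weight 1/2 so the window
    stays centered. Positions too close to either edge are left as None.
    """
    half = period // 2
    out = [None] * len(x)
    for t in range(half, len(x) - half):
        window = x[t - half : t + half + 1]
        if period % 2 == 0:
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        out[t] = total / period
    return out

def seasonal_means(x, trend, period):
    """Seasonal estimate: mean detrended value per phase, centered to sum to zero."""
    buckets = [[] for _ in range(period)]
    for t, (value, level) in enumerate(zip(x, trend)):
        if level is not None:
            buckets[t % period].append(value - level)
    raw = [sum(b) / len(b) for b in buckets]
    offset = sum(raw) / period
    return [r - offset for r in raw]

# Simulated data (assumed): linear trend + period-12 sinusoid + Gaussian noise.
rng = random.Random(0)
period = 12
x = [0.1 * t + 2.0 * math.sin(2 * math.pi * t / period) + rng.gauss(0.0, 0.3)
     for t in range(120)]

trend_hat = centered_moving_average(x, period)
seasonal_hat = seasonal_means(x, trend_hat, period)
# What is left after removing both estimates should resemble the noise term.
residual = [x[t] - trend_hat[t] - seasonal_hat[t % period]
            for t in range(len(x)) if trend_hat[t] is not None]
```

Averaging over exactly one period cancels the seasonal term, which is why the moving average isolates the trend; the sketch assumes the period is known and that the series covers several full periods.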

<hr>

**Stationarity / Stochastic Dependence**

- Marginal mean per time stamp: $\mu_X (t) = \mathbb {E}[X_t]$

- Marginal variance: $var(X_t) = \mathbb {E} [(X_t - \mu_X (t))^2]$

- Autocovariance: $\gamma_X (s, t)$, where $s$, $t$ are different time points

    $\gamma_X (s, t) = cov(X_s, X_t) = \mathbb {E} [(X_s - \mu_X (s))(X_t - \mu_X (t))]$
    
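    To estimate the covariance at a given time gap (*say, 50*), average over all pairs of stationarized observations that are that many steps apart. This works because, after stationarizing, the covariance depends only on the gap and not on where the pair sits in time. For example, the covariance between $X_1$ and $X_5$ should be the same as that between $X_{51}$ and $X_{55}$.
    
    Typically, for a stationary series, the autocovariance decays as the time distance between observations increases, while the variance of $X_t$ stays constant over time.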
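
The estimation step described above can be sketched as follows. The function name `sample_autocovariance`, the AR(1) test series and its coefficient 0.7 are assumptions made for the example; dividing by $n$ rather than by the number of pairs is a common convention, not something the notes prescribe.

```python
import random

def sample_autocovariance(x, lag):
    """Estimate gamma(lag) by averaging products of deviations `lag` steps apart."""
    n = len(x)
    mean = sum(x) / n
    # Every pair (x[t], x[t + lag]) contributes, wherever it sits in time.
    # Dividing by n (not n - lag) is the usual convention; it keeps the
    # estimated autocovariance sequence non-negative definite.
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n

# Hypothetical stationary example: AR(1), X_t = 0.7 * X_{t-1} + e_t.
rng = random.Random(0)
x, level = [], 0.0
for _ in range(5000):
    level = 0.7 * level + rng.gauss(0.0, 1.0)
    x.append(level)

# Lag 0 is the variance; larger lags should give progressively smaller values.
gammas = {h: sample_autocovariance(x, h) for h in (0, 1, 5, 50)}
```

For this series the estimates shrink as the lag grows, matching the decay described above, and the lag-50 value is close to zero.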
    

The idea of stationarity is to remove the trend and seasonality such that observations of the process are representative of all possible realizations of the random variable.

$\therefore$ We need conditions that allow us to estimate population parameters of the whole process (e.g. expectations, variances, correlations) with time averages over a single realization of the process. Weak stationarity requires that the first and second moments (mean, variance, autocovariance) do not change over time. Strong stationarity requires that the entire joint distribution is unchanged under shifts in time.

A time series {$X_t$} is strongly stationary if the joint distribution of $X_t, \dots, X_{t+n}$ is the same as the joint distribution of $X_{t+h}, \dots, X_{t+n+h}$ for all integers $n$, time stamps $t$ and time shifts $h$.

Weak stationarity only requires that the first two moments (mean, variance/covariance) be invariant to shifts in time:

$\mathbb {E} [X_t] = \mu_X$

$Var(X_t) = \sigma_X^2$

$Cov(X_s, X_t) = \gamma_X (\lvert s - t \rvert)$ for all time stamps, $s$, $t$
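
As a worked check of these three conditions, compare white noise $W_t$ (uncorrelated, mean $0$, variance $\sigma^2$) with the random walk built from it, $R_t = W_1 + \dots + W_t$. This is a standard textbook example rather than one taken from these notes, and the symbol $R_t$ is introduced here only for illustration.

$\mathbb {E} [W_t] = 0$, $Var(W_t) = \sigma^2$, $Cov(W_s, W_t) = 0$ for $s \neq t$

$\mathbb {E} [R_t] = 0$, $Var(R_t) = t \sigma^2$, $Cov(R_s, R_t) = \min(s, t) \, \sigma^2$

White noise satisfies all three conditions. The random walk has a constant mean, but its variance grows with $t$ and its covariance depends on $\min(s, t)$ rather than on $\lvert s - t \rvert$, so it is not weakly stationary.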

Detect non-stationarity using the following:

- Visualizations
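- Autocovariance: a slow decay across lags suggests a trend, peaks that repeat at a fixed lag suggest seasonal effects, and estimates that differ between parts of the series suggest changes in distribution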
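
The autocovariance check can be sketched by comparing the sample autocorrelation (autocovariance divided by the lag-0 value) of a trended series with that of its first difference. The simulated series, the chosen lags and the use of differencing to remove the trend are assumptions made for this example.

```python
import random

def sample_autocorrelation(x, lag):
    """Sample autocovariance at `lag` divided by the sample variance."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ch = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return ch / c0

rng = random.Random(1)
# Non-stationary example (assumed): linear trend plus Gaussian noise.
trended = [0.1 * t + rng.gauss(0.0, 1.0) for t in range(300)]
# First difference X_t - X_{t-1} removes the linear trend.
differenced = [b - a for a, b in zip(trended, trended[1:])]

lags = (1, 5, 20)
acf_trended = [sample_autocorrelation(trended, h) for h in lags]
acf_diff = [sample_autocorrelation(differenced, h) for h in lags]
# acf_trended stays high at every lag: the slow decay is the signature of a trend.
# acf_diff is near zero at lags 5 and 20; its negative lag-1 value is a side
# effect of differencing independent noise, not evidence of a trend.
```

A plot of the two series would show the same thing visually, but the autocorrelation values give a quick numeric summary when many series have to be screened.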



# Basic code
A `minimal, reproducible example`