# Time Series Analysis
## Introduction

#### Time Series
- a sequence of random variables $x_1, x_2, x_3, ...$, where the subscript denotes the time point at which the data is taken
- a collection of random variables {$x_t$}, where $t = 0, \pm1, \pm2, ...$ denotes the time point at which the data is taken
    - usually $t$ is discrete, meaning the data is sampled at constant intervals


#### Time Series Model  
- A mathematical/statistical model that
    - provides a plausible description of sample time series data
    - describes how the data is correlated through time

## Measure of Dependence  

#### Mean
$$\mu_t = E[x_t] = \int_{-\infty}^{\infty}{xf_x(x)dx}$$
- Describes the average value of the series at time $t$

#### Autocovariance Function (ACVF)
$$\gamma_x(s,t) = cov(x_s, x_t) = E[(x_s - \mu_s)(x_t - \mu_t)]$$
- Measures the *linear* dependence between 2 points on the same series
    - $\gamma_x(s, t) = 0 \Rightarrow x_s, x_t$ are not linearly related; it does not imply that they are independent
    - If $x_s, x_t$ are independent, then $\gamma_x(s, t) = 0$
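
Zero covariance does not imply independence. The sketch below checks this exactly on a made-up toy distribution (an assumption for illustration, not from these notes): $x_s$ takes the values $-1, 0, 1$ with equal probability and $x_t = x_s^2$, so $x_t$ is completely determined by $x_s$, yet the covariance is zero.

```python
from fractions import Fraction

# Hypothetical toy distribution: x_s is -1, 0 or 1 with equal probability,
# and x_t = x_s**2, so x_t is fully determined by x_s (clearly dependent).
values = [-1, 0, 1]
prob = Fraction(1, 3)

def expectation(g):
    """E[g(x_s)] under the toy distribution, computed exactly."""
    return sum(prob * g(v) for v in values)

mu_s = expectation(lambda v: v)        # E[x_s]
mu_t = expectation(lambda v: v * v)    # E[x_t] with x_t = x_s**2
# cov(x_s, x_t) = E[(x_s - mu_s)(x_t - mu_t)]
cov = expectation(lambda v: (v - mu_s) * (v * v - mu_t))

print("E[x_s] =", mu_s, " E[x_t] =", mu_t, " cov =", cov)
```

The covariance only picks up the linear part of the relationship, which is absent here because the dependence is purely quadratic.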
        
#### Autocorrelation Function (ACF)
$$\rho(s,t) = \frac{\gamma(s,t)}{\sqrt{\gamma(s,s)\gamma(t,t)}}$$
- A normalized version of the ACVF, with values in $[-1, 1]$

#### Cross-covariance function
$$\gamma_{xy}(s,t) = cov(x_s, y_t) = E[(x_s - \mu_{xs})(y_t - \mu_{yt})]$$
- Measures the *linear* dependence between 2 points on 2 different series

#### Cross-correlation Function (CCF)
$$\rho_{xy}(s,t) = \frac{\gamma_{xy}(s,t)}{\sqrt{\gamma_x(s,s)\gamma_y(t,t)}}$$
- A normalized version of the cross-covariance, with values in $[-1, 1]$

## Stationary
#### Strictly Stationary
A time series is strictly stationary if, for all $h \in \mathbb{Z}$ and any set of time points $t_1, ..., t_n$, the collection {$x_{t_1}, x_{t_2}, ..., x_{t_n}$} and the time-shifted collection {$x_{t_1+h}, x_{t_2+h}, ..., x_{t_n+h}$} have the same joint distribution, i.e. the same probabilistic behavior/statistical properties  

#### Weakly Stationary
The strict definition is (nearly) impossible to verify from data, which makes it too restrictive for most applications. The weak version of stationarity only places conditions on the first 2 moments:
- $E[x_t] = \mu, \forall t \in \mathbb{Z}$
- $Var[x_t] < \infty, \forall t \in \mathbb{Z}$
- $\gamma_x(s, t) = \gamma_x(s+h, t+h), \forall s,t,h \in \mathbb{Z}$
    - The ACVF only depends on the difference between s and t, $|s - t|$, not on the values of s and t themselves
    - The difference between s and t is referred to as the lag
    - The ACVF can be rewritten as a function of the lag only

The term stationary will typically refer to weakly stationary. A process that is stationary in the strict sense will be referred to as strictly stationary.

##### ACVF and ACF for Stationary Time Series
Both the ACVF and the ACF are defined using only the lag h instead of s and t. 
$$h = |s-t|$$
$$\text{ACVF } =\gamma(h) = \gamma(t, t+h) = E[(x_t-\mu)(x_{t+h}-\mu)]$$
$$\text{ACF } =\rho(h) = \frac{\gamma(h)}{\gamma(0)}$$

Sample ACVF and ACF
$$\text{ACVF } = \hat \gamma (h) = \frac{1}{n}\sum_{t=1}^{n-h}(x_{t+h} - \bar x)(x_t - \bar x)$$
- Biased estimate
- Non-negative definite, so preferred over versions that divide by $n-h$ or $n-h-1$
$$\text{ACF } = \hat \rho(h) = \frac{\hat \gamma(h)}{\hat \gamma(0)}$$
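
The sample estimators can be sketched in a few lines. This is a minimal version using only the standard library; the function names `sample_acvf` and `sample_acf` and the data are made up for illustration. It divides the lagged sum of products of deviations from $\bar x$ by $n$, then normalizes by the lag-0 value.

```python
def sample_acvf(x, h):
    """Sample autocovariance at lag h, using the 1/n (biased) normalization."""
    n = len(x)
    h = abs(h)                      # the estimate is symmetric in h
    x_bar = sum(x) / n              # sample mean
    return sum((x[t + h] - x_bar) * (x[t] - x_bar) for t in range(n - h)) / n

def sample_acf(x, h):
    """Sample autocorrelation at lag h: sample ACVF scaled by its lag-0 value."""
    return sample_acvf(x, h) / sample_acvf(x, 0)

x = [1.0, 2.0, 3.0, 4.0, 5.0]       # made-up data for illustration
acf_values = [sample_acf(x, h) for h in range(3)]
print(acf_values)
```

For this toy series the deviations from the mean are $-2, -1, 0, 1, 2$, so by hand $\hat\gamma(0) = 2$, $\hat\gamma(1) = 0.8$ and $\hat\rho(1) = 0.4$, which is what the code should reproduce.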

Sample Mean
$$\bar x = \frac{1}{n} \sum_{t=1}^{n}x_t$$

- $E[\bar x] = \mu$  
- $Var[\bar x] = \frac{1}{n}\sum_{h=-n}^n(1-\frac{|h|}{n})\gamma_x(h)$
    - Derivation [here](./derivation/var_x_bar.ipynb)
    - When estimating the variance, set the bounds of the summation to $\pm \sqrt n$: $\hat \gamma(h)$ is inaccurate for $|h| \approx n$ because few pairs of observations are available at those lags.
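
A rough sketch of this estimate, assuming the sample ACVF with $1/n$ normalization is plugged in for $\gamma_x(h)$ and the sum is cut off at $|h| \le \lfloor \sqrt n \rfloor$. The function names and the data are hypothetical.

```python
import math

def sample_acvf(x, h):
    """Sample autocovariance at lag h, using the 1/n normalization."""
    n = len(x)
    h = abs(h)
    x_bar = sum(x) / n
    return sum((x[t + h] - x_bar) * (x[t] - x_bar) for t in range(n - h)) / n

def var_of_sample_mean(x):
    """Estimate Var[x_bar] by summing (1 - |h|/n) * gamma_hat(h) over |h| <= sqrt(n)."""
    n = len(x)
    max_lag = int(math.sqrt(n))     # truncation point: floor(sqrt(n))
    total = sum(
        (1 - abs(h) / n) * sample_acvf(x, h)
        for h in range(-max_lag, max_lag + 1)
    )
    return total / n

x = [1.0, 2.0, 3.0, 4.0, 5.0]       # made-up data for illustration
print(var_of_sample_mean(x))
```

With real data the series would be much longer; the five-point example is only small enough to check by hand.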
    
##### Jointly Stationary
Two time series $x_t$ and $y_t$ are jointly stationary if both are stationary and their cross-covariance is a function of the lag h only  
$$\text{cross-covariance} = \gamma_{xy}(h) = \gamma_{xy}(t, t+h)$$
$$\text{CCF} = \rho_{xy}(h) = \frac{\gamma_{xy}(h)}{\sqrt{\gamma_x(0)\gamma_y(0)}}$$

## White Noise
Denoted as $w_t$  
White noise is a special type of time series in which the random variables are uncorrelated with mean $0$ and variance $\sigma_w^2$.  
There are 3 types of white noise:
- white noise
- independent white noise (iid noise)
- Gaussian white noise

#### White Noise
- Collection of
    - uncorrelated random variables   
    - mean $0$
    - finite variance $\sigma_w^2$
- $w_t$ ~ $wn(0, \sigma_w^2)$

#### IID Noise
- Collection of
    - independent and identically distributed (iid) random variables
    - mean $0$
    - finite variance $\sigma_w^2$
- Subset of white noise
- $w_t$ ~ $iid(0, \sigma_w^2)$

#### Gaussian white noise
- Collection of
    - independent normal/Gaussian random variables
    - mean $0$
    - finite variance $\sigma_w^2$
- A specific type of iid noise
- $w_t$ ~ $iid N(0, \sigma_w^2)$
- This is the special case of white noise we will refer to when simulating
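
A minimal simulation sketch, assuming the standard library's `random.Random.gauss` as the normal generator. The helper name `gaussian_white_noise`, the seed and the value of $\sigma_w$ are arbitrary choices for illustration.

```python
import random
import statistics

def gaussian_white_noise(n, sigma_w=1.0, seed=None):
    """Draw n independent N(0, sigma_w^2) variables."""
    rng = random.Random(seed)       # seeded generator for reproducibility
    return [rng.gauss(0.0, sigma_w) for _ in range(n)]

w = gaussian_white_noise(10_000, sigma_w=2.0, seed=0)
w_mean = statistics.mean(w)         # should be close to 0
w_var = statistics.pvariance(w)     # should be close to sigma_w^2 = 4
print(f"sample mean = {w_mean:.3f}, sample variance = {w_var:.3f}")
```

The sample mean and variance will not match $0$ and $\sigma_w^2$ exactly, but they should get closer as $n$ grows.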

## Test for White Noise
For white noise and large $n$, the sample ACF at lags $h \geq 1$ is approximately normal: $\hat \rho(h) \dot \sim N(0, \frac{1}{n})$  

#### Confidence Interval
95% CI = $\pm \frac{1.96}{\sqrt n} \approx \pm \frac{2}{\sqrt n}$  
- Plot the sample ACF (normally a bar chart with lags on the x axis and ACF values on the y axis). If about 95% of the ACF values at nonzero lags fall within the CI, the data is likely to be white noise
- Visual test
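
The visual check can also be done numerically by counting how many sample ACF values at lags $h \ge 1$ fall inside $\pm 1.96/\sqrt n$. The sketch below does this on simulated Gaussian white noise; the sample size, number of lags and seed are arbitrary assumptions, and `sample_acf` is a hypothetical helper.

```python
import math
import random

def sample_acf(x, max_lag):
    """Sample ACF at lags 1..max_lag, using the 1/n normalization."""
    n = len(x)
    x_bar = sum(x) / n
    d = [v - x_bar for v in x]                  # deviations from the mean
    gamma0 = sum(v * v for v in d) / n          # lag-0 sample autocovariance
    return [
        sum(d[t + h] * d[t] for t in range(n - h)) / n / gamma0
        for h in range(1, max_lag + 1)
    ]

rng = random.Random(1)
n = 500
w = [rng.gauss(0.0, 1.0) for _ in range(n)]     # simulated Gaussian white noise

bound = 1.96 / math.sqrt(n)                     # half-width of the 95% band
acf = sample_acf(w, max_lag=40)
inside = sum(abs(r) <= bound for r in acf) / len(acf)
print(f"bound = +/-{bound:.3f}, fraction of lags inside = {inside:.2f}")
```

A fraction near 0.95 is consistent with white noise. A few values outside the band are expected by chance, so a single exceedance is not evidence against it.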

#### Hypothesis Test
$H_0$: Data is white noise  
$H_1$: Data is not white noise

Test statistic:
Under $H_0$, $Q \dot \sim \chi^2_h$, where $h$ is the number of lags tested  
- Box-Pierce test
    - $Q = n\sum_{j=1}^{h}{\hat\rho}^2(j)$
    - Original test statistic
    - Derivation [here](./derivation/Box-Pierce.ipynb)
- Ljung-Box test
    - $Q = n(n+2)\sum_{j=1}^{h}\frac{{\hat\rho}^2(j)}{n-j}$
    - Improved test statistic. Its distribution is closer to $\chi^2_h$ in finite samples
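
Both statistics can be sketched directly from the sample ACF. The code below uses the standard Ljung-Box form, with the factor $n(n+2)$ and a denominator $n-j$ for each lag $j$. The helper names, simulated data, seed and choice of $h = 10$ are assumptions for illustration, and the critical value 18.307 is the tabulated 95% quantile of $\chi^2_{10}$.

```python
import random

def sample_acf(x, max_lag):
    """Sample ACF at lags 1..max_lag, using the 1/n normalization."""
    n = len(x)
    x_bar = sum(x) / n
    d = [v - x_bar for v in x]
    gamma0 = sum(v * v for v in d) / n
    return [
        sum(d[t + h] * d[t] for t in range(n - h)) / n / gamma0
        for h in range(1, max_lag + 1)
    ]

def box_pierce(x, h):
    """Q = n * sum of squared sample autocorrelations at lags 1..h."""
    n = len(x)
    return n * sum(r * r for r in sample_acf(x, h))

def ljung_box(x, h):
    """Q = n (n + 2) * sum over j = 1..h of rho_hat(j)^2 / (n - j)."""
    n = len(x)
    rho = sample_acf(x, h)
    return n * (n + 2) * sum(rho[j - 1] ** 2 / (n - j) for j in range(1, h + 1))

rng = random.Random(2)
w = [rng.gauss(0.0, 1.0) for _ in range(300)]   # simulated Gaussian white noise
h = 10
CHI2_95_DF10 = 18.307                           # 95% quantile of chi-square, 10 d.o.f.

for name, q in [("Box-Pierce", box_pierce(w, h)), ("Ljung-Box", ljung_box(w, h))]:
    verdict = "reject H0" if q > CHI2_95_DF10 else "do not reject H0"
    print(f"{name}: Q = {q:.2f} -> {verdict} at the 5% level")
```

If SciPy is available, `scipy.stats.chi2.sf(Q, h)` gives the p-value instead of a comparison against a fixed critical value.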

## Linear Process
A linear process $x_t$ is defined as a linear combination of white noise variates $w_t$:
$$x_t = \mu + \sum_{i = -\infty}^\infty\psi_i w_{t-i}, \quad \sum_{i = -\infty}^\infty |\psi_i| < \infty$$
Its ACVF is $\gamma_x(h)=\sigma_w^2 \sum_{i = -\infty}^{\infty}{\psi_{i+h} \psi_i}$
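
As a small illustration, the sketch below restricts the infinite sums to finitely many nonzero weights, which is a simplifying assumption, and uses the hypothetical weights $\psi_0 = 1, \psi_1 = 0.5$. It evaluates the ACVF formula and simulates the process from Gaussian white noise; the function names and seed are made up for the example.

```python
import random

def linear_process_acvf(psi, sigma2_w, h):
    """gamma_x(h) = sigma_w^2 * sum_i psi_{i+h} * psi_i for finitely many weights.

    psi maps an index i to the weight psi_i; missing indices count as 0.
    """
    h = abs(h)
    return sigma2_w * sum(c * psi.get(i + h, 0.0) for i, c in psi.items())

def simulate_linear_process(psi, n, mu=0.0, sigma_w=1.0, seed=None):
    """Simulate x_t = mu + sum_i psi_i * w_{t-i} with Gaussian white noise w."""
    rng = random.Random(seed)
    lo, hi = min(psi), max(psi)                 # smallest and largest weight index
    # w[k] holds w at time (k - hi), so every w_{t-i} needed for t = 0..n-1 exists.
    w = [rng.gauss(0.0, sigma_w) for _ in range(n + hi - lo)]
    return [mu + sum(c * w[t - i + hi] for i, c in psi.items()) for t in range(n)]

psi = {0: 1.0, 1: 0.5}                          # hypothetical weights: x_t = w_t + 0.5 w_{t-1}
theory = [linear_process_acvf(psi, 1.0, h) for h in range(3)]
print("theoretical ACVF at lags 0, 1, 2:", theory)

x = simulate_linear_process(psi, 20_000, seed=3)
x_bar = sum(x) / len(x)
sample_var = sum((v - x_bar) ** 2 for v in x) / len(x)
print(f"sample variance = {sample_var:.3f}")
```

By hand, $\gamma_x(0) = 1 + 0.25 = 1.25$, $\gamma_x(1) = 0.5$ and $\gamma_x(h) = 0$ for $|h| \ge 2$, and the sample variance of the simulated series should land near 1.25.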