# __Chapter 2: Linear Time Series Analysis and Its Applications__

<br>

Finance 5330: Financial Econometrics <br>
Tyler J. Brough <br>
First Date: January 19, 2019 <br>
Last Date: January 21, 2019 <br>
<br>

## Introduction 

These notes are based on Chapter 2 of the book _Analysis of Financial Time Series 3rd Ed_ by Ruey Tsay. 

Understanding the simple time series models introduced here will go a long way toward appreciating the more sophisticated financial econometric models of later chapters.

<br>

Treating an asset return (e.g. log return $r_{t}$ of a stock) as a collection of random variables over time, we have a time series $\{r_{t}\}$. The linear time series models of this chapter are a natural first attempt at modeling such dynamic behavior. 

<br>

The theories of linear time series discussed include:

- stationarity

- dynamic dependence

- autocorrelation function

- modeling

- forecasting


<br>

The econometric models introduced include: 

- (a) simple autoregressive (AR) models

- (b) simple moving-average (MA) models

- (c) mixed autoregressive moving-average (ARMA) models

- (d) unit-root nonstationarity

- (e) regression models with time series errors

- (f) fractionally differenced models for long-range dependence



## Section 2.1 Stationarity

The foundation of time series analysis is stationarity. A time series $\{r_{t}\}$ is said to be _strictly stationary_ if the joint distribution of $(r_{t_{1}}, \ldots, r_{t_{k}})$ is identical to that of
$(r_{t_{1} + t}, \ldots, r_{t_{k} + t})$ for all $t$, where $k$ is an arbitrary positive integer and $(t_{1}, \ldots, t_{k})$ is a collection of $k$ positive integers. 

<br>

Strict stationarity requires that the joint distribution of $(r_{t_{1} + t}, \ldots, r_{t_{k} + t})$ is invariant under time shift. This is a very strong requirement that is challenging to verify empirically. For this reason, we often employ a simpler form of stationarity. 

<br>

A time series $\{r_{t}\}$ is _weakly stationary_ if both the mean of $r_{t}$ and the covariance between $r_{t}$ and $r_{t-l}$ are time invariant, where $l$ is an arbitrary integer.

<br>

More specifically, $\{r_{t}\}$ is weakly stationary if:

- (a) $E(r_{t}) = \mu$, which is constant

- (b) $Cov(r_{t}, r_{t-l}) = \gamma_{l}$, which only depends on $l$

<br>

In practice, suppose that we have observed $T$ data points $\{r_{t} | t = 1, \ldots, T\}$. Weak stationarity implies that a time plot of the data would show that the $T$ values fluctuate with constant variation around a fixed level. In applications, weak stationarity enables one to make inferences concerning future observations (e.g. prediction).

<br>

Implicitly, in the condition of weak stationarity, we assume that the first two moments of $r_{t}$ are finite. From the definitions, if $r_{t}$ is strictly stationary and its first two moments are finite, then $r_{t}$ is also weakly stationary. The converse is not true in general. 

<br>

If the time series $r_{t}$ is normally distributed, then weak stationarity is equivalent to strict stationarity. 

<br>

We will be mainly concerned with weakly stationary time series.

<br>

The covariance $\gamma_{l} = Cov(r_{t}, r_{t-l})$ is called the lag-$l$ autocovariance of $r_{t}$. It has two important properties: 

- (a) $\gamma_{0} = Var(r_{t})$

- (b) $\gamma_{-l} = \gamma_{l}$

The second property holds because $Cov(r_{t}, r_{t-(-l)}) = Cov(r_{t-(-l)}, r_{t}) = Cov(r_{t+l}, r_{t}) = Cov(r_{t_{1}}, r_{t_{1} - l})$, where $t_{1} = t + l$. 
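
Both properties can be checked numerically. The following is a minimal sketch, not a definitive implementation: the helper `sample_autocov`, its $1/T$ divisor, and the simulated data are assumptions made for illustration.

```python
import numpy as np

def sample_autocov(r, lag):
    """Sample analogue of Cov(r_t, r_{t-lag}), using a 1/T divisor (assumed convention)."""
    r = np.asarray(r, dtype=float)
    T = r.size
    d = r - r.mean()                       # deviations from the sample mean
    if lag >= 0:
        # pair r_t with r_{t-lag} for t = lag, ..., T-1
        current, shifted = d[lag:], d[:T - lag]
    else:
        # pair r_t with r_{t+|lag|} for t = 0, ..., T-1-|lag|
        current, shifted = d[:T + lag], d[-lag:]
    return np.sum(current * shifted) / T

rng = np.random.default_rng(1)
r = rng.normal(size=1000)                  # simulated data, for illustration only

print(sample_autocov(r, 0), np.var(r))                 # property (a): gamma_0 = Var(r_t)
print(sample_autocov(r, 3), sample_autocov(r, -3))     # property (b): gamma_{-l} = gamma_l
```

The negative-lag branch pairs the observations directly rather than taking the absolute value of the lag, so the symmetry comes out of the computation rather than being imposed.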

<br>

In the finance literature, it is common to assume that an asset return series is weakly stationary. We can check this empirically given a sufficient number of historical return observations. In particular, we can divide the historical returns into subsamples and check the consistency of the results obtained across subsamples. 
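
The subsample check can be sketched as follows. This is an informal diagnostic under stated assumptions: the helper `subsample_moments`, the choice of four subsamples, and the simulated series (iid noise and its cumulative sum) are illustrative and not part of the original notes.

```python
import numpy as np

def subsample_moments(r, n_splits=4):
    """Return (mean, variance) for each of n_splits consecutive subsamples."""
    r = np.asarray(r, dtype=float)
    chunks = np.array_split(r, n_splits)   # consecutive, nearly equal-sized blocks
    return [(chunk.mean(), chunk.var(ddof=1)) for chunk in chunks]

rng = np.random.default_rng(0)
stationary = rng.normal(loc=0.0, scale=1.0, size=4000)  # iid noise: weakly stationary
random_walk = np.cumsum(stationary)                     # cumulative sum: not stationary

for name, series in [("iid noise", stationary), ("random walk", random_walk)]:
    print(name)
    for mean, var in subsample_moments(series):
        print(f"  mean = {mean: .3f}, variance = {var: .3f}")
```

For the iid series the subsample means and variances should be close to one another, while for the cumulative sum they typically drift apart. With real returns one would replace the simulated arrays with the observed series.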

## Section 2.2 Correlation and Autocorrelation Function

Recall that the correlation between two random variables $X$ and $Y$ can be defined as:

$$
\rho_{x,y} = \frac{Cov(X,Y)}{\sqrt{Var(X) Var(Y)}} = \frac{E[(X - \mu_{x}) (Y - \mu_{y})]}{\sqrt{E[(X - \mu_{x})^{2}] E[(Y - \mu_{y})^{2}]}}
$$

<br>

This coefficient measures the strength of linear dependence between $X$ and $Y$. It can be shown that $-1 \le \rho_{x,y} \le +1$ and that $\rho_{x,y} = \rho_{y,x}$. The two random variables are uncorrelated if $\rho_{x,y} = 0$. In addition, if $X$ and $Y$ are jointly normally distributed random variables, then the condition $\rho_{x,y} = 0$ also implies that they are independent. 

<br>

When the sample $\{(x_{t}, y_{t})\}_{t=1}^{T}$ is available, the population parameter can be estimated by its sample counterpart: 

$$
\hat{\rho}_{x,y} = \frac{\sum_{t=1}^{T} (x_{t} - \tilde{x}) (y_{t} - \tilde{y})}{\sqrt{\sum_{t=1}^{T} (x_{t} - \tilde{x})^{2} \sum_{t=1}^{T} (y_{t} - \tilde{y})^{2}}}
$$

where $\tilde{x} = \frac{1}{T}\sum_{t=1}^{T} x_{t}$ and $\tilde{y} = \frac{1}{T}\sum_{t=1}^{T} y_{t}$ are the sample means of $X$ and $Y$, respectively. 
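
The sample formula can be coded directly and compared with NumPy's built-in estimate. This is a minimal sketch: the function name `sample_corr` and the simulated data are assumptions for illustration.

```python
import numpy as np

def sample_corr(x, y):
    """Sample correlation computed term by term from the formula above."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()                      # deviations from the sample mean of x
    dy = y - y.mean()                      # deviations from the sample mean of y
    return np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

rng = np.random.default_rng(42)
x = rng.normal(size=500)                   # simulated data, for illustration only
y = 0.8 * x + rng.normal(size=500)

print(sample_corr(x, y))                   # hand-coded estimate
print(np.corrcoef(x, y)[0, 1])             # NumPy's estimate, for comparison
```

The two printed values should agree up to floating-point error, since the normalizing constants cancel in the ratio.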


#### Simulating Correlated Data

We can simulate correlated data with the following algorithm:

1. Draw $z_{1} \sim N(0,1)$
2. Draw $z_{2} \sim N(0,1)$
3. Set $\epsilon_{1} = z_{1}$
4. Set $\epsilon_{2} = \rho z_{1} + \sqrt{1 - \rho^{2}} z_{2}$, where $\rho$ is the desired value of the correlation coefficient. 
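
A short check of why step 4 yields the desired correlation, assuming $z_{1}$ and $z_{2}$ are independent with unit variance (as the two separate standard normal draws in steps 1 and 2 suggest):

$$
Var(\epsilon_{2}) = \rho^{2} Var(z_{1}) + (1 - \rho^{2}) Var(z_{2}) = \rho^{2} + (1 - \rho^{2}) = 1
$$

$$
Cov(\epsilon_{1}, \epsilon_{2}) = Cov(z_{1}, \rho z_{1} + \sqrt{1 - \rho^{2}} z_{2}) = \rho Var(z_{1}) + \sqrt{1 - \rho^{2}} Cov(z_{1}, z_{2}) = \rho
$$

Since $Var(\epsilon_{1}) = Var(\epsilon_{2}) = 1$, the correlation between $\epsilon_{1}$ and $\epsilon_{2}$ equals their covariance, which is $\rho$.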

<br>

We can do this in Python as follows:

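    import numpy as np

    M = 10000                      # number of simulated draws
    z1 = np.random.normal(size=M)  # step 1: z1 ~ N(0, 1)
    z2 = np.random.normal(size=M)  # step 2: z2 ~ N(0, 1), drawn independently of z1
    rho = 0.5                      # desired correlation coefficient
    e1 = z1                        # step 3
    e2 = rho * z1 + np.sqrt(1 - rho**2) * z2  # step 4

    np.corrcoef(e1, e2)            # sample correlation matrix of (e1, e2)

The draws are unseeded, so the output varies from run to run. One run produced:

    array([[1.        , 0.50372534],
           [0.50372534, 1.        ]])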

##### __Autocorrelation Function (ACF)__

Consider a weakly stationary return series $r_{t}$. When the linear dependence between $r_{t}$ and its past values $r_{t-i}$ is of interest, the concept of correlation is generalized to autocorrelation. 

<br>

The correlation coefficient between $r_{t}$ and $r_{t-l}$ is called the lag-$l$ _autocorrelation_ of $r_{t}$ and is commonly denoted by $\rho_{l}$.

<br>

Specifically, we define

$$
\rho_{l} = \frac{Cov(r_{t}, r_{t-l})}{\sqrt{Var(r_{t}) Var(r_{t-l})}} = \frac{Cov(r_{t}, r_{t-l})}{Var(r_{t})} = \frac{\gamma_{l}}{\gamma_{0}}
$$

<br>

For a given sample of returns $\{r_{t}\}_{t=1}^{T}$, let $\tilde{r}$ be the sample mean (i.e. $\tilde{r} = \frac{1}{T} \sum_{t=1}^{T} r_{t}$). Then the lag-$1$ sample autocorrelation of $r_{t}$ is

$$
\hat{\rho}_{1} = \frac{\sum_{t=2}^{T} (r_{t} - \tilde{r}) (r_{t-1} - \tilde{r})}{\sum_{t=1}^{T} (r_{t} - \tilde{r})^{2}} 
$$

<br>

Under general conditions, $\hat{\rho}_{1}$ is a consistent estimate of $\rho_{1}$. For example:

- If $\{r_{t}\}_{t=1}^{T}$ is an independent and identically distributed (iid) sequence 
- And $E(r_{t}^{2}) < \infty$
- Then $\hat{\rho}_{1}$ is asymptotically normal with mean $0$ and variance $1/T$

<br>

We can use this to test the hypothesis $H_{0}: \rho_{1} = 0$ against the alternative hypothesis $H_{a}: \rho_{1} \ne 0$. The test statistic is the usual $t$ ratio, which is $\sqrt{T} \hat{\rho}_{1}$ and follows asymptotically the standard normal distribution. The null hypothesis $H_{0}$ is rejected if the $t$ ratio is large in magnitude, or if the $p$-value of the $t$ ratio is small, say less than $0.05$. 
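
The test can be sketched in a few lines. This is an illustration under assumptions: the function name `lag1_autocorr_test`, the simulated series, and the use of `math.erfc` for the two-sided normal $p$-value are choices made here, not part of the original notes.

```python
import math
import numpy as np

def lag1_autocorr_test(r):
    """Return (rho1_hat, t_ratio, two-sided p-value) for H0: rho_1 = 0."""
    r = np.asarray(r, dtype=float)
    T = r.size
    d = r - r.mean()
    rho1 = np.sum(d[1:] * d[:-1]) / np.sum(d**2)   # lag-1 sample autocorrelation
    t_ratio = math.sqrt(T) * rho1                  # asymptotically N(0, 1) under H0
    # two-sided p-value: 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2))
    p_value = math.erfc(abs(t_ratio) / math.sqrt(2))
    return rho1, t_ratio, p_value

rng = np.random.default_rng(123)
a = rng.normal(size=2001)                  # simulated shocks, for illustration only
iid_series = a[1:]                         # no serial dependence
dependent_series = a[1:] + 0.5 * a[:-1]    # hypothetical serially correlated series

for name, series in [("iid", iid_series), ("dependent", dependent_series)]:
    rho1, t_ratio, p_value = lag1_autocorr_test(series)
    print(f"{name}: rho1 = {rho1:.3f}, t ratio = {t_ratio:.2f}, p-value = {p_value:.4f}")
```

For the dependent series the $t$ ratio should be large and the $p$-value near zero, so $H_{0}$ is rejected. For the iid series the test rejects in roughly 5% of samples at the $0.05$ level.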

<br>

In general, the lag-$l$ sample autocorrelation of $r_{t}$ is defined as

$$
\hat{\rho}_{l} = \frac{\sum_{t=l+1}^{T} (r_{t} - \tilde{r}) (r_{t-l} - \tilde{r})}{\sum_{t=1}^{T} (r_{t} - \tilde{r})^{2}}, \quad\quad 0 \le l < T-1.
$$
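
The lag-$l$ formula translates directly into code. The following is a minimal sketch: the function name `sample_acf` and the two simulated series are assumptions for illustration.

```python
import numpy as np

def sample_acf(r, max_lag):
    """Sample autocorrelations rho_hat_0, ..., rho_hat_{max_lag}."""
    r = np.asarray(r, dtype=float)
    T = r.size
    d = r - r.mean()
    denom = np.sum(d**2)                   # same denominator for every lag
    # numerator for lag l sums (r_t - mean)(r_{t-l} - mean) over t = l+1, ..., T
    return np.array([np.sum(d[l:] * d[:T - l]) / denom for l in range(max_lag + 1)])

rng = np.random.default_rng(7)
a = rng.normal(size=5001)                  # simulated shocks, for illustration only
iid = a[1:]                                # no serial dependence
dependent = a[1:] + 0.5 * a[:-1]           # hypothetical series with lag-1 dependence

print(np.round(sample_acf(iid, 5), 3))
print(np.round(sample_acf(dependent, 5), 3))
```

By construction $\hat{\rho}_{0} = 1$. For the dependent series the lag-1 value should be near $0.5 / (1 + 0.5^{2}) = 0.4$, with higher lags near zero.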

<br>

If $\{r_{t}\}$ is an iid sequence satisfying $E(r_{t}^{2}) < \infty$, then $\hat{\rho}_{l}$ is asymptotically normal with mean zero and variance $1/T$ for any fixed positive integer $l$.

<br>

More generally, if $r_{t}$ is a weakly stationary time series satisfying $r_{t}= \mu + \sum_{i=0}^{q} \psi_{i} a_{t-i}$, where $\psi_{0} = 1$ and $\{a_{j}\}$ is a sequence of iid random variables with mean zero, then $\hat{\rho}_{l}$ is asymptotically normal with mean zero and variance $(1 + 2 \sum_{i=1}^{q} \rho_{i}^{2})/T$ for $l > q$. This is known as Bartlett's formula. In practice, the unknown $\rho_{i}$ are replaced by their sample estimates $\hat{\rho}_{i}$.

##### __Testing Individual ACF__

For a given positive integer $l$, the previous result can be used to test $H_{0}: \rho_{l} = 0$ vs $H_{a}: \rho_{l} \ne 0$. The test statistic is

$$
t\mbox{-ratio} = \frac{\hat{\rho}_{l}}{\sqrt{(1 + 2 \sum_{i=1}^{l-1} \hat{\rho}_{i}^{2})/T}}
$$

<br>

If $\{r_{t}\}$ is a stationary Gaussian series satisfying $\rho_{j} = 0$ for $j > l$, the $t$ ratio is asymptotically distributed as a standard normal random variable. 

<br>

The decision rule is to reject $H_{0}$ if $|t\mbox{-ratio}| > z_{\alpha/2}$, where $z_{\alpha/2}$ is the $100(1 - \alpha/2)$th percentile of the standard normal distribution. 
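
The test statistic and decision rule can be sketched as follows. This is an illustration under assumptions: the names `acf_t_ratio` and `reject_h0`, the simulated series, and the hard-coded critical value $1.96$ (approximately $z_{0.025}$, i.e. $\alpha = 0.05$) are choices made here.

```python
import math
import numpy as np

def acf_t_ratio(r, lag):
    """t-ratio for H0: rho_lag = 0, with Bartlett's variance built from lags 1..lag-1."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    r = np.asarray(r, dtype=float)
    T = r.size
    d = r - r.mean()
    denom = np.sum(d**2)
    # sample autocorrelations rho_hat_0, ..., rho_hat_lag
    rho = np.array([np.sum(d[k:] * d[:T - k]) / denom for k in range(lag + 1)])
    # standard error: sqrt((1 + 2 * sum_{i=1}^{lag-1} rho_hat_i^2) / T)
    se = math.sqrt((1.0 + 2.0 * np.sum(rho[1:lag] ** 2)) / T)
    return float(rho[lag] / se)

def reject_h0(t_ratio, z_crit=1.96):
    """Decision rule: reject H0 when |t-ratio| exceeds the critical value."""
    return abs(t_ratio) > z_crit

rng = np.random.default_rng(2019)
a = rng.normal(size=3001)                  # simulated shocks, for illustration only
r = a[1:] + 0.5 * a[:-1]                   # hypothetical series with lag-1 dependence

for lag in (1, 2, 3):
    t = acf_t_ratio(r, lag)
    print(f"lag {lag}: t-ratio = {t:.2f}, reject H0: {reject_h0(t)}")
```

For `lag=1` the sum in the standard error is empty, so the statistic reduces to $\sqrt{T} \hat{\rho}_{1}$.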

## Section 2.3 White Noise and Linear Time Series

## Section 2.4 Simple AR Models

#### Properties of AR Models

#### Identifying AR Models in Practice

#### Goodness of Fit

#### Forecasting

## Section 2.5 Simple MA Models

#### Properties of MA Models

#### Identifying MA Order

#### Estimation

#### Forecasting Using MA Models

## Section 2.6 Simple ARMA Models

#### Properties of ARMA(1,1) Models

#### General ARMA Models

#### Identifying ARMA Models

#### Forecasting Using an ARMA Model

#### Three Model Representations for an ARMA Model

## Section 2.7 Unit-Root Nonstationarity

#### Random Walk

#### Random Walk with Drift

#### Trend-Stationary Time Series

#### General Unit-Root Nonstationary Models

#### Unit-Root Test

## Section 2.8 Seasonal Models

#### Seasonal Differencing

#### Multiplicative Seasonal Models

## Section 2.9 Regression Models with Time Series Errors

## Section 2.10 Consistent Covariance Matrix Estimation

## Section 2.11 Long-Memory Models