Analysis of Financial Time Series, Third Edition Chapter 8

# Chapter 8 Multivariate Time Series Analysis and Its Applications


Price movements in one market affect another market. One must consider them **jointly** to better understand the **dynamic** structure of global finance. One market may **lead** the other. The next few chapters introduce econometric models and methods that belong to **vector or multivariate time series analysis** and are useful for **studying jointly** multiple return series.

A [vector or] **multivariate time series** consists of **multiple single series** [each] referred to as **components**. 

**Boldface** indicates vectors and matrices. 

Appendix A discusses vector and matrix operations and properties. 

Appendix B discusses the **multivariate normal distribution**, which is widely used in multivariate statistical analysis (Johnson and Wichern, 1998).

Let:   
- $\large \boldsymbol{r}_t =(r_{1t},r_{2t}, \ldots ,r_{kt})' \text{ = log returns of k assets at time t}$ 

where:   
- $\large \boldsymbol{a}'$ denotes the transpose of $\boldsymbol{a}$

For example:  
$r_{1t}$ might denote the daily log return of IBM stock and $r_{2t}$ might denote the daily log return of Microsoft [at day or time t].

[So the $\boldsymbol{r}_t$ above is the transpose of the row vector as written; i.e., $\boldsymbol{r}_t$ is a column vector holding the k returns at time t only, not the entire series.]

This chapter's goals are 
- (a) to explore the basic properties of $\boldsymbol{r}_t$ 
- (b) to study **econometric models** for analyzing the **multivariate data** $\{r_t | t = 1, \ldots, T \}$.
- (c) to discuss the direct generalization of previous chapters' models and methods to the multivariate case. 
- (d) to discuss new models and methods required for complicated relationships between multiple series in order to form generalizations. 
- (e) to discuss these issues with emphasis on intuition and applications. 

For statistical theory of multivariate time series analysis, readers are referred to Lutkepohl (2005) and Reinsel (1993).

## 8.1 WEAK STATIONARITY AND CROSS-CORRELATION MATRICES

A k-dimensional time series is denoted [as a vector that is transposed when represented in text where horizontal alignment is better formatting].

$\large \boldsymbol{r}_t = (r_{1t}, \ldots, r_{kt})'$  

The series $\boldsymbol{r}_t$ is **weakly stationary** if its first and second moments [mean vector of all components and covariance matrix of all components] are time invariant: the **mean vector** and **covariance matrix** of a weakly stationary series are **constant over** time, and this book **assumes** that return series of financial assets are weakly stationary.

https://en.wikipedia.org/wiki/Moment_(mathematics)
[In mathematics, the moments of a function are quantitative measures related to the shape of the function's graph. If the function represents mass, then the first moment is the center of the mass, and the second moment is the rotational inertia. **If** the function is a **probability distribution**, then the **first moment** is the expected value, the **second central moment** is the variance, the **third standardized moment** is the skewness, and the **fourth standardized moment** is the kurtosis. The mathematical concept is closely related to the concept of moment in physics.] [Note: those are the exact terms. It isn't the "first central moment" or the "second standardized moment".]

For a weakly stationary time series $\boldsymbol{r}_t$, Tsay defines a **mean vector** and **covariance matrix** [as the expected value of the time series vector $r_t$ and the outerproduct of (a) the difference between $r_t \text{ and } \mu$ and (b) the same vector difference transposed:]

(8.1)
$$\large \boldsymbol{\mu} = E( \boldsymbol{r}_t ), \;\; \boldsymbol{\Gamma}_0 = E[( \boldsymbol{r}_t −  \boldsymbol{\mu} )( \boldsymbol{r}_t −  \boldsymbol{\mu} )']$$

![image.png](attachment:image.png)

where 
- Tsay writes $\large \boldsymbol{\mu} = (\mu_1,\ldots,\mu_k)' \text{ and } \boldsymbol{\Gamma}_0 = [\Gamma_{ij}(0)]$ when the **elements** are needed.
- the expectation is taken **element** by element over the **joint distribution** of $\boldsymbol{r}_t$.  

[Element by element means component by component, which implies that each element's PDF is used to compute its own expected value. What 'over the joint distribution' adds is not entirely clear: it could mean that the expectations use joint probabilities over all combinations of component values. I think I cleared this up later.]

[See expected value here: https://en.wikipedia.org/wiki/Multivariate_random_variable, which says "components are random variables on the same probability space ($\large \Omega, F, P$)", where $\Omega$ is the **sample space** that I guess defines the ranges for each component, F is the **sigma-algebra** or **collection of all events**, and P is the **probability measure** that returns every event's probability. Again this would imply joint probability, but it then goes on to say $E[X] = (E[X_1], \ldots, E[X_n])^T$, so each component's expected value is evaluated separately. It is possible that the probabilities used to evaluate the separate expectations come from the joint PDF, which would make sense.]



The mean $\boldsymbol{\mu}$ is a k-dimensional vector of the **unconditional expectations** of the components of $\boldsymbol{r}_t$. [Unconditional means not conditioned on what? It could mean as it does elsewhere not conditioned on prior values which would be confusing since we might expect to use all values to compute the mean but when e.g. lags are introduced the mean is computed from $\ell$ to T which means it could have been conditioned on the values in the series from t=zero to t=$\ell$, but it isn't.]


The covariance matrix $\boldsymbol{\Gamma}_0$ is a k × k matrix. [k = number of assets.]
- The ith diagonal element [= (i,i)th element] of $\boldsymbol{\Gamma}_0$ is the variance of $r_{it}$. 
- The (i,j)th element of $\boldsymbol{\Gamma}_0$ is the covariance between $r_{it}$ and $r_{jt}$. [Recall that $\{r_{it}\}$ is the whole time series of component i.] 
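
The definitions in (8.1) can be illustrated numerically. The following is a minimal sketch, assuming NumPy and simulated data (the array `r` and its values are hypothetical, not real returns); sample averages stand in for the expectations:

```python
import numpy as np

# Hypothetical data: T observations of k = 2 simulated "log return" series.
rng = np.random.default_rng(0)
T, k = 500, 2
r = rng.standard_normal((T, k))      # row t holds r_t' (one value per asset)

mu_hat = r.mean(axis=0)              # sample stand-in for mu, shape (k,)
dev = r - mu_hat                     # r_t - mu for every t
gamma0_hat = dev.T @ dev / T         # average of the outer products, shape (k, k)

print(mu_hat.shape, gamma0_hat.shape)
```

The diagonal of `gamma0_hat` holds the component variances and the off-diagonal entries hold the covariances, matching the two bullets above.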

## 8.1.1 Cross-Correlation Matrices

Cross-correlation matrices are used to measure the strength of linear dependence between time series.

Let D be a k × k diagonal matrix consisting of the standard deviations of $\large r_{it}$ for i = 1, ..., k: 

$$\large \boldsymbol{D} = \text{diag}\{\sqrt{\Gamma_{11}(0)}, \ldots,\sqrt{\Gamma_{kk}(0)}\}$$

![image-12.png](attachment:image-12.png)

Then, the **concurrent**, or lag-zero, **cross-correlation matrix** of $\large \boldsymbol{r}_t$ is defined:

$$\large \boldsymbol{\rho}_0 \equiv [\rho_{ij}(0)] = \boldsymbol{D}^{-1} \boldsymbol{\Gamma_0}\boldsymbol{D}^{-1}$$ 

![image-10.png](attachment:image-10.png)

[Equivalently, pre- and post-multiplying $\boldsymbol{\rho}_0$ by $\boldsymbol{D}$ scales each correlation by the standard deviations of its row and column components, which recovers the covariance matrix:]

$$\large \boldsymbol{D} \boldsymbol{\rho}_0 \boldsymbol{D} =  \boldsymbol{\Gamma_0}$$ 



More specifically, the (i,j)th element of $\boldsymbol{\rho}_0$ is the correlation coefficient between $r_{it} \text{ and } r_{jt}$:

$$\large \rho_{ij}(0) = \frac{\Gamma_{ij}(0)}{\sqrt{\Gamma_{ii}(0)\Gamma_{jj}(0)}} = \frac{\text{Cov}(r_{it}, r_{jt})}{std(r_{it})std(r_{jt})}$$

![image-9.png](attachment:image-9.png)

[where $std(r_{it})$ is the standard deviation of the i-th component's time series]

In **time series analysis**, such a correlation coefficient $\large \rho_{ij}(0)$ is referred to as a **concurrent**, or **contemporaneous**, correlation coefficient because it is the correlation of the two series **at time t**. 

$$\large \rho_{ij}(0)=\rho_{ji}(0),\;\; −1 ≤ \rho_{ij}(0) ≤ 1, \;\; \rho_{ii}(0) = \rho_{jj}(0) = 1 \text{ for } 1 ≤ i,j ≤ k$$ 

![image-11.png](attachment:image-11.png)

Thus, $\boldsymbol{\rho}_0$ is a **symmetric** matrix with **unit diagonal** elements.
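
As a small numerical check of the lag-zero definition, here is a sketch assuming NumPy and a made-up 2 × 2 covariance matrix (`gamma0` is hypothetical, chosen only so the arithmetic is easy to follow):

```python
import numpy as np

# Hypothetical covariance matrix: variances 4.0 and 1.0, covariance 1.2.
gamma0 = np.array([[4.0, 1.2],
                   [1.2, 1.0]])

D = np.diag(np.sqrt(np.diag(gamma0)))   # diag of standard deviations (2.0, 1.0)
D_inv = np.linalg.inv(D)
rho0 = D_inv @ gamma0 @ D_inv           # concurrent cross-correlation matrix

print(rho0)                             # off-diagonal is 1.2 / (2.0 * 1.0) = 0.6
```

Multiplying back, `D @ rho0 @ D` reproduces `gamma0`, and `rho0` is symmetric with ones on the diagonal.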

**Lead–lag relationships between component series** are important in multivariate time series analysis [and are represented in **lag-$\ell$ cross correlation matrices**].

[Start with] the **lag-$\ell$ cross-covariance matrix** of $\boldsymbol{r}_t$, which is defined [by an outer product]:

(8.2)
$$\large \boldsymbol{\Gamma}_{\ell} \equiv [ \Gamma_{ij}(\ell)] = E[(\boldsymbol{r}_t − \boldsymbol{\mu})(\boldsymbol{r}_{t-\ell} - \boldsymbol{\mu})']$$

![image-8.png](attachment:image-8.png)

where: 
- $\boldsymbol{\mu}$ is the mean vector of $\boldsymbol{r}_t$. 

Therefore, the (i,j)th element of $\boldsymbol{\Gamma}_{\ell}$ is the covariance between $r_{it} \text{ and } r_{j,t-\ell}$. 

The cross-covariance matrix is [and its entries are] a function of the lag = $\ell$ and not a function of the time index t **for a weakly stationary series**. [Obviously, the entries are a function of the components, i and j, as well.]

The lag-$\ell$ cross-correlation matrix (**CCM**) of $r_t$ is defined as:

(8.3)
$$\large \boldsymbol{\rho}_{\ell} \equiv [\rho_{ij}(\ell)]= \boldsymbol{D}^{-1} \boldsymbol{\Gamma}_{\ell} \boldsymbol{D}^{-1}$$

![image-7.png](attachment:image-7.png)

where, as before with the no-lag cross correlation matrix:
- $\boldsymbol{D}$ is the diagonal matrix of standard deviations of the individual series $r_{it}$

From the definition [which is the same as before except that $\ell$ was zero there], $\rho_{ij}(\ell)$ is the correlation coefficient between $r_{it} \text{ and } r_{j,t-\ell}$:

(8.4)

$$\large \rho_{ij}(\ell) = \frac{\Gamma_{ij}(\ell)}{\sqrt{\Gamma_{ii}(0)\Gamma_{jj}(0)}} = \frac{\text{Cov}(r_{it}, r_{j,t-\ell})}{std(r_{it})std(r_{jt})}$$

![image-6.png](attachment:image-6.png)

[Implications:]
- [In the notation, the component with the second subscript, j, comes first in time and thus leads; the component with the first subscript, i, lags (follows) and thus is the dependent one.]    
- When $\ell > 0$, this correlation coefficient $\rho_{ij}(\ell)$ measures the linear dependence of $r_{it} \text{ on } r_{j,t-\ell}$, which occurred prior to time t. [It seems $\rho_{ij}(\ell)$ measures the linear relationship between $r_{it} \text{ and } r_{j,t-\ell}$ even when $\ell = 0$; Tsay's point is about the order of the subscripts: the lag $\ell$ goes with the second subscript, j, here.]   

- Consequently, if $\rho_{ij}(\ell) ≠ 0 \text{ and } \ell > 0$, the series $r_{jt}$ **"leads"** the series $r_{it}$ at lag $\ell$. 

- Similarly, the correlation coefficient $\rho_{ji}(\ell)$ measures the linear dependence of $r_{jt} \text{ on } r_{i,t-\ell}$. [The $\ell$ lag goes with the second subscript, i, here.]

- If $\rho_{ji}(\ell) ≠ 0 \text{ and } \ell > 0$, the series $r_{it}$ **"leads"** the series $r_{jt}$ at lag $\ell$. 

[To be clear, the **second** and column subscript **leads** the first **row** subscript and it is the linear dependence of the **first** row subscript **on** the second **column** subscript of the lag-$\ell$ cross correlation matrix.  But there are other conditions for these to be exclusively true.]

- (8.4) also shows that the **diagonal** element $\rho_{ii}(\ell)$ is the lag-$\ell$ **autocorrelation** coefficient of $r_{it}$.
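
The lead-lag reading of the subscripts can be checked on simulated data. This is a sketch under stated assumptions: NumPy, a hypothetical helper `ccm`, and a made-up model in which component j (column 1) leads component i (column 0) by one period with coefficient 0.8:

```python
import numpy as np

def ccm(r, lag):
    """Sample lag-`lag` cross-correlation matrix of a (T, k) array, lag >= 0."""
    T = r.shape[0]
    dev = r - r.mean(axis=0)
    gamma = dev[lag:].T @ dev[:T - lag] / T      # (i, j): r_{it} with r_{j,t-lag}
    sd = np.sqrt(np.diag(dev.T @ dev / T))
    return gamma / np.outer(sd, sd)

rng = np.random.default_rng(1)
T = 5000
x = rng.standard_normal(T)                       # plays the role of r_{jt}
e = rng.standard_normal(T)
y = np.empty(T)
y[0] = e[0]
y[1:] = 0.8 * x[:-1] + e[1:]                     # r_{it} = 0.8 * r_{j,t-1} + noise
r = np.column_stack([y, x])                      # column 0 = i, column 1 = j

rho1 = ccm(r, 1)
print(rho1[0, 1], rho1[1, 0])
```

In this toy model the population value of $\rho_{ij}(1)$ is $0.8/\sqrt{1.64}$, about 0.62, while $\rho_{ji}(1) = 0$, so the (i,j) entry should be clearly nonzero and the (j,i) entry close to zero: the column series leads the row series.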

Important [lag-$\ell$] **cross correlation properties** [descend from these implications] when $\ell > 0$: 

$$\rho_{ij} (\ell) ≠ \rho_{ji} (\ell) \text{ for i ≠ j}$$ 

![image-5.png](attachment:image-5.png)

- First, **the matrices** $\boldsymbol{\Gamma}_{\ell} \text{ and } \boldsymbol{\rho}_{\ell}$ are **in general** not symmetric because the two correlation coefficients measure [two] different linear relationships [that exist] between [four different subsets of the two time series] {$r_{it}$} and {$r_{jt}$}:
    - $r_{i,t}$
    - $r_{j,t}$
    - $r_{i,t-\ell}$
    - $r_{j,t-\ell}$

$$\large \text{Cov}(r_{it},r_{j,t-\ell}) = \text{Cov}(r_{j,t-\ell},r_{it}) = \text{Cov}(r_{jt},r_{i,t+\ell}) = \text{Cov}(r_{jt},r_{i,t-(-\ell)})$$

![image.png](attachment:image.png)

- Second, using [the property that] **Cov(x,y) = Cov(y,x)** and the [assumption of] weak stationarity, [notice that Tsay is showing that switching the order of subscripts i and j in the 2nd, 3rd and 4th covariances keeps the same covariance quantity as long as component i is always observed $\ell$ periods after component j]:
    - [if $\ell$ remains with the same subscript: $\ell$ keeps the same sign and remains with j in the second covariance equality where i and j change places relative to their position in the first equality.]
    - [if $\ell$ changes sign: $\ell$ changes sign as $\ell$ associates with i instead of j in the third and fourth covariance equalities where i and j change places relative to their position in the first equality.] 
    
[In all 4 cases covariance is measuring the dependence between the **same** subsets of components i and j time series; only the notated order is changing.]

Thus, ... 

$$\large \Gamma_{ij}(\ell) = \Gamma_{ji}(−\ell)$$

![image-2.png](attachment:image-2.png)

Because $\Gamma_{ji}(−\ell)$ is the (j,i)th element of the matrix $\boldsymbol{\Gamma}_{−\ell}$ and the equality holds for 1 ≤ i, j ≤ k, also true are:

$$\large \boldsymbol{\Gamma}_{\ell} = \boldsymbol{\Gamma}_{-\ell}'$$

$$\large \boldsymbol{\rho}_{\ell} = \boldsymbol{\rho}_{-\ell}'$$  

![image-3.png](attachment:image-3.png)

Consequently, **unlike the univariate case**, for a **general** **vector** time series when
$\ell > 0$:

$$\boldsymbol{\rho}_{\ell} \neq \boldsymbol{\rho}_{-\ell}$$

![image-4.png](attachment:image-4.png)

Because $\boldsymbol{\rho}_{\ell} = \boldsymbol{\rho}_{-\ell}'$, it suffices in practice to consider the cross-correlation matrices $\boldsymbol{\rho}_{\ell} \text{ for } \ell ≥ 0$. [That is **don't** fuss with **negative** values for **lags** and **transposed** lagged **correlation** matrices (or **covariance matrices** for that matter).]
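
The transpose identity can be verified numerically. The sketch below assumes NumPy, simulated data, and a hypothetical helper `cross_cov` that extends the sample analogue of (8.2) to negative lags in the natural way (that extension is an assumption made here for illustration):

```python
import numpy as np

def cross_cov(r, lag):
    """Sample cross-covariance of a (T, k) array; `lag` may be negative."""
    T = r.shape[0]
    dev = r - r.mean(axis=0)
    if lag >= 0:
        return dev[lag:].T @ dev[:T - lag] / T   # pairs r_t with r_{t-lag}
    return dev[:T + lag].T @ dev[-lag:] / T      # pairs r_t with r_{t+|lag|}

rng = np.random.default_rng(3)
r = rng.standard_normal((400, 3))                # hypothetical 3-component series

g_pos = cross_cov(r, 2)
g_neg = cross_cov(r, -2)
print(np.allclose(g_pos, g_neg.T))               # Gamma_l equals Gamma_{-l} transposed
print(np.allclose(g_pos, g_pos.T))               # but Gamma_l itself is not symmetric
```

With these definitions the identity holds exactly in the sample, not just approximately, because both matrices sum the same products in a different order.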

## 8.1.2 Linear Dependence

Considered **jointly**, 
the cross-correlation matrices ...

$$\large \{\boldsymbol{\rho}_{\ell} | \ell = 0, 1, \ldots\}$$ 

... of a **weakly stationary** vector time series contain the following information:

1. The diagonal elements $\large \{\rho_{ii}(\ell)| \ell= 0, 1, \ldots\}$ are the **autocorrelation function** of $r_{it}$. [Remember that the autocorrelation function describes a single component i across the full set of lags $\ell$ that label each matrix.]
2. The off-diagonal element $\large \rho_{ij}(0)$ measures the **concurrent linear relationship** between $r_{it} \text{ and } r_{jt}$ [as lag $\ell$ is held constant = zero].
3. For $\ell > 0$, the off-diagonal element $\large \rho_{ij}(\ell)$ measures the **linear dependence** of $r_{it} \text{ on the past value } r_{j,t−\ell}$.  [here "dependence" for $\ell > 0$ and in #2 "relationship" for $\ell = 0$.]

Therefore, ...

$\large \text{if } \rho_{ij}(\ell) = 0 \text{ for all } \ell > 0$, then $r_{it}$ does not **depend** linearly on any past value $r_{j,t−\ell}$ of the $r_{jt}$ series.

[May be worth noting that dependence can run the other way, expectation for a future event can cause an earlier event.]

In **general**, the linear **relationship** between two time series $\{r_{it}\}$ and $\{r_{jt}\}$ can be summarized as follows:
1. $\{r_{it}\} \text{ and } \{r_{jt}\}$ have no linear **relationship** if 

$$\large \rho_{ij}(\ell)=\rho_{ji}(\ell) = 0 \text{ for all } \ell ≥ 0$$

2. $\{r_{it}\} \text{ and } \{r_{jt}\}$ are **concurrently correlated** if 

$$\large \rho_{ij}(0) ≠ 0$$

3. $\{r_{it}\} \text{ and } \{r_{jt}\}$ are **"uncoupled"** defined by no **lead–lag relationship** if 

$$\large \rho_{ij}(\ell) = 0 \text{ and } \rho_{ji}(\ell) = 0 \text{ for all } \ell > 0$$

[Notice $\ell ≠ 0$ here.]

4. There is a **unidirectional relationship** from $\{r_{it}\} \text{ to } \{r_{jt}\}$, meaning that
    - $r_{it}$ does not depend on any past value of $r_{jt}$
    - $r_{jt}$ does depend on some past values of $r_{it}$,

    if ...

$$\large \rho_{ij}(\ell) = 0 \text{ for all } \ell > 0 \text{, but } \rho_{ji}(v) \neq 0 \text{ for some } v > 0.$$ 

5. There is a feedback relationship between $\{r_{it}\} \text{ and } \{r_{jt}\}$ if 

$$\large \rho_{ij}(\ell) \neq 0 \text{ for some } \ell > 0 \text{, and } \rho_{ji}(v) \neq 0 \text{ for some } v > 0.$$

The **conditions stated** earlier are **sufficient** conditions. A **more informative approach** to study the relationship between time series is to build a **multivariate model** for the series because a properly specified model considers **simultaneously the serial and cross correlations** among the series [though we are asked to untangle these combined metrics]. [So i may have autocorrelation and lead j in cross correlation. Which is the strongest factor?]
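
The five cases above can be turned into a simple rule of thumb. The sketch below is an illustration only: `classify_pair` is a hypothetical helper, the CCM values are made up, and treating $|\rho| >$ `crit` as "nonzero" is a sample stand-in (an assumption) for the population conditions:

```python
import numpy as np

def classify_pair(ccms, i, j, crit):
    """ccms[l] is the lag-l CCM; |rho| > crit is treated as 'nonzero'."""
    concurrent = abs(ccms[0][i, j]) > crit
    j_leads_i = any(abs(m[i, j]) > crit for m in ccms[1:])   # rho_ij(l) != 0, l > 0
    i_leads_j = any(abs(m[j, i]) > crit for m in ccms[1:])   # rho_ji(l) != 0, l > 0
    if j_leads_i and i_leads_j:
        dynamic = "feedback"
    elif i_leads_j:
        dynamic = "unidirectional from i to j"
    elif j_leads_i:
        dynamic = "unidirectional from j to i"
    else:
        dynamic = "uncoupled"
    return {"concurrent": bool(concurrent), "dynamic": dynamic}

# Hypothetical lag-0 and lag-1 CCMs for two series (values are made up).
ccms = [np.array([[1.0, 0.6], [0.6, 1.0]]),
        np.array([[0.0, 0.3], [0.01, 0.0]])]
result = classify_pair(ccms, i=0, j=1, crit=0.1)
print(result)
```

Here only $\rho_{ij}(1)$ exceeds the cutoff, so the pair is concurrently correlated with a unidirectional relationship from series j to series i. A pair with `concurrent` false and `dynamic` equal to "uncoupled" corresponds to case 1, no linear relationship.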

## 8.1.3 Sample Cross-Correlation Matrices

["Cross" connotes correlation across different component series; the lag is indicated separately by $\ell$.]

Given the data ...

$$\large \{\boldsymbol{r}_t | t = 1,\ldots, T \}$$

![image.png](attachment:image.png)

... the $\large \boldsymbol{\Gamma}_{\ell}$ **cross-covariance** matrix is estimated by:

(8.5)
$$\large \boldsymbol{\widehat{\Gamma}}_{\ell} = \frac{1}{T} \sum_{t=\ell+1}^T (\boldsymbol{r}_t − \bar{\boldsymbol{r}})(\boldsymbol{r}_{t-\ell} − \bar{\boldsymbol{r}})'$$

![image-2.png](attachment:image-2.png)

where 
- the vector of sample means is:

$$\large \bar{\boldsymbol{r}} = \frac{\left(\sum_{t=1}^T \boldsymbol{r}_t \right)}{T}$$

![image-3.png](attachment:image-3.png)

- [each $\boldsymbol{r}_t \text{ and } \boldsymbol{r}_{t-\ell}$ is a vector of component log return values, one component log return value for each asset at the subscripted time t or t-$\ell$.]
- [each of the summed items is a matrix created from the outer product of the unlagged vector's components' differences from vector of sample means and lagged vector's components' differences from vector of sample means.]


The $\large \boldsymbol{\rho}_{\ell}$ **cross-correlation** matrix is estimated by:

(8.6)
$$\large \boldsymbol{\widehat{\rho}}_{\ell} = \boldsymbol{\widehat{D}}^{-1} \boldsymbol{\widehat{\Gamma}}_{\ell} \boldsymbol{\widehat{D}}^{-1}, \;\; \ell \geq 0$$

![image-4.png](attachment:image-4.png)

where: 
- $\boldsymbol{\widehat{D}}$ is the k × k **diagonal** matrix of the **sample** standard deviations of the component series.
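
Equations (8.5) and (8.6) translate almost literally into code. The following is a minimal sketch, assuming NumPy and simulated data; the function names `sample_cross_cov` and `sample_ccm` are hypothetical, and the explicit loop is kept to mirror the sum of outer products rather than for speed:

```python
import numpy as np

def sample_cross_cov(r, lag):
    """Equation (8.5): (1/T) * sum of outer products, for lag >= 0."""
    T, k = r.shape
    r_bar = r.mean(axis=0)                        # vector of sample means
    gamma = np.zeros((k, k))
    for t in range(lag, T):                       # t = lag+1, ..., T in 1-based notation
        gamma += np.outer(r[t] - r_bar, r[t - lag] - r_bar)
    return gamma / T

def sample_ccm(r, lag):
    """Equation (8.6): D_hat^{-1} Gamma_hat_lag D_hat^{-1}."""
    d_inv = np.diag(1.0 / np.sqrt(np.diag(sample_cross_cov(r, 0))))
    return d_inv @ sample_cross_cov(r, lag) @ d_inv

rng = np.random.default_rng(2)
r = rng.standard_normal((300, 2))                 # hypothetical bivariate series
rho_hat = [sample_ccm(r, lag) for lag in range(3)]
print(rho_hat[1])
```

At lag 0 the result agrees with `np.corrcoef(r, rowvar=False)`; for lags above 0 the divisor stays T even though only T − lag products are summed, as in (8.5).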

Like the univariate case, 
- **asymptotic properties** of the sample cross-correlation matrix 
$\boldsymbol{\widehat{\rho}}_{\ell}$ have been obtained under various assumptions; see *Fuller (1976, Chapter 6)* for details. 
- The estimate $\boldsymbol{\widehat{\rho}}_{\ell}$ is **consistent** but **biased** in a finite sample. For asset return series, the **finite sample distribution** of $\boldsymbol{\widehat{\rho}}_{\ell}$ is complicated by the presence of conditional **heteroscedasticity and high kurtosis**. If the finite-sample distribution of the cross correlations is needed, proper **bootstrap resampling** methods are recommended to approximate it. For many applications, a crude approximation of the variance of $\widehat{\rho}_{ij}(\ell)$ is sufficient.

[From the Wikipedia "Consistent estimator" article, paragraph on bias versus consistency. Biased but consistent: consistent means that the estimate converges asymptotically to the true (population) value. Biased means that the estimates are systematically off from the true (population) value in the same direction. Biased but consistent is the case where the bias wanes asymptotically, as for the mean estimate $\frac{1}{n} \sum x_i + \frac{1}{n}$.]
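
[A short worked version of the example in the note, assuming $x_1, \ldots, x_n$ are i.i.d. with mean $\mu$ and finite variance (an assumption added here, with $\tilde{\mu}_n$ a label introduced only for this illustration): the bias equals $\frac{1}{n}$ for every finite n, yet it vanishes as n grows, so the estimator is biased but consistent.]

$$\large \tilde{\mu}_n = \frac{1}{n}\sum_{i=1}^n x_i + \frac{1}{n}, \;\; E(\tilde{\mu}_n) - \mu = \frac{1}{n} \neq 0, \;\; \tilde{\mu}_n \rightarrow \mu \text{ in probability as } n \rightarrow \infty$$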

### Example 8.1. 

![image-3.png](attachment:image-3.png)

**Figure 8.1** Time plots of monthly log returns, in percentages, for (a) IBM stock and (b) the S&P 500 index from January 1926 to December 2008.

![image-2.png](attachment:image-2.png)

**Figure 8.2** Some scatterplots for monthly log returns of IBM stock and S&P 500 index: (a) concurrent plot of IBM vs. S&P 500, (b) S&P 500 vs. lag-1 IBM, (c) IBM vs. lag-1 S&P 500, and (d) S&P 500 vs. lag-1 S&P 500.

Consider 996 observations:
- percent log returns including dividend payments
    - $r_{1t}$ = IBM stock 
    - $r_{2t}$ = S&P 500 index 
- monthly Jan 1926 to Dec 2008 
- $r_t = (r_{1t},r_{2t})'$ is a **bivariate time series**
- $r_t$ is shown in a time plot in Fig. 8.1
- $r_{1t},r_{2t}$ are shown in scatterplots in Fig. 8.2
    - the plots show that the two return series $r_{1t},r_{2t}$ are **concurrently correlated**. 
    - the **sample concurrent correlation coefficient** [$\widehat{\rho}_{12}(0) = 0.65$] between the two returns is statistically significant at the 5% level. 
    - the **cross correlations at lag 1** are weak, if present at all.
- Table 8.1 provides 
    - summary statistics 
    - cross-correlation matrices (**CCM**) of $r_{1t},r_{2t}$ the two series [at lagged times]. 
        - For a **bivariate** series, each CCM is a 2 × 2 matrix with 4 correlations. 
        - **Tiao and Box**'s (1981) simplified CCM notation defines a cross-correlation matrix  with “+”, “−”, and “.”:
            1. (+) denotes a correlation coefficient $\widehat{\rho}_{ij} \geq \frac{2}{\sqrt{T}}$
            2. (-) denotes a correlation coefficient $\widehat{\rho}_{ij} \leq -\frac{2}{\sqrt{T}}$
            3. (.) denotes a correlation coefficient $-\frac{2}{\sqrt{T}} < \widehat{\rho}_{ij} < \frac{2}{\sqrt{T}}$
        - $\frac{2}{\sqrt{T}}$ = the approximate **asymptotic** 5% critical value of the **sample** correlation under the **assumption** that $r_t$ is a white noise series. [Why do Tiao and Box use $\frac{2}{\sqrt{T}}$? Under white noise the sample correlation has asymptotic standard error $\frac{1}{\sqrt{T}}$, so $\pm\frac{2}{\sqrt{T}}$ is roughly a two-standard-error band.]
    - Table 8.1(c) shows simplified CCM for IBM stock and S&P 500 index monthly log returns
        - significant cross correlations at the approximate 5% level appear mainly at lags 1 and 3.  [The only CCM elements ≥ 0.10 are at lags 1 and 3.]
        - the sample CCMs at these two lags indicate that:   
            (a) S&P 500 index returns have marginal [marginal means weak here?] autocorrelations at lags 1, 2, 3, and 5. [I see + or - in the (2,2) element at lags 1, 3, and 5, but not at lag 2?]     
            (b) IBM stock returns depend weakly on the previous returns of the S&P 500 index. [Columns lead rows; rows depend on columns: the 2nd column is the S&P 500 leading the 1st row, IBM; the 1st row, IBM, depends on the 2nd column, the S&P 500.] See the significant cross correlations in the (1,2)th element of the lag-1, lag-2 and lag-5 CCMs.    
            (c) [S&P 500 index returns do not depend on IBM returns at any lag; thus it is a **unidirectional relationship** with the S&P 500 leading.]    
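
The simplified "+", "−", "." notation is easy to reproduce. This is a minimal sketch assuming NumPy; `simplify_ccm` is a hypothetical helper, and the matrix `rho` holds made-up values that are NOT taken from Table 8.1:

```python
import numpy as np

def simplify_ccm(rho_hat, T):
    """Map a sample CCM to '+', '-', '.' using the cutoff 2 / sqrt(T)."""
    crit = 2.0 / np.sqrt(T)
    out = np.full(rho_hat.shape, ".", dtype="<U1")
    out[rho_hat >= crit] = "+"
    out[rho_hat <= -crit] = "-"
    return out

T = 996                                        # sample size used in Example 8.1
rho = np.array([[0.04, 0.10],
                [-0.07, 0.02]])                # hypothetical lag-l CCM
symbols = simplify_ccm(rho, T)
print(symbols)
```

With T = 996 the cutoff is about 0.063, so 0.10 maps to "+", −0.07 maps to "-", and the two small entries map to ".".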


![image.png](attachment:image.png)

Figure 8.3 shows the sample autocorrelations [UL and LR] and cross correlations [UR and LL] of the two series $r_{1t},r_{2t}$. 
- UL plot shows IBM stock returns sample ACF
- UR plot shows IBM stock returns dependence on S&P 500 index lagged returns. 
- Dashed lines represent asymptotic two standard error limits for sample auto- and cross-correlation coefficients. 
- Dynamic relationship is weak between the two return series, but their contemporaneous correlation is statistically significant.  [Where is contemporaneous correlation shown?]


![image-4.png](attachment:image-4.png)

**Figure 8.3** Sample auto- and cross-correlation functions (CCF) of two monthly log return series: (a) sample ACF of IBM stock returns, (b) cross-correlations between S&P 500 index and lagged IBM stock returns (lower left), (c) cross correlations between IBM stock and lagged S&P 500 index returns, and (d) sample ACF of S&P 500 index returns. Dashed lines denote 95% limits.