# 8 Multivariate Time Series

In this section, we consider $m$ time series $\{ X_{t i}, t = 0, \pm 1, \pm 2, \ldots \}$, $i = 1, 2, \ldots, m$ jointly.

## 8.1 Second order properties of multivariate time series

Denote

$$
X_t = (X_{t1}, \ldots, X_{tm})^T, \quad t = 0, \pm 1, \pm 2, \ldots
$$

The second-order properties of the multivariate time series $\{ X_t \}$ are then specified by the mean vectors

$$
\mu_t = E X_t = (\mu_{t1}, \ldots, \mu_{tm})^T,
$$

and covariance matrices

$$
\Gamma(t+h, t) = \text{Cov}(X_{t+h}, X_t) = E \{ (X_{t+h} - \mu_{t+h})(X_t - \mu_t)^T \} = [\gamma_{ij}(t+h, t)]_{i,j=1}^m.
$$

**(Stationary Multivariate Time Series).** The series $\{ X_t \}$ is said to be stationary if $\mu_t$ and $\Gamma(t+h, t)$, $h = 0, \pm 1, \pm 2, \ldots$, are independent of $t$.

For a stationary series, we shall use the notation

$$
\mu = E X_t
$$

and

$$
\Gamma(h) = E \{ (X_{t+h} - \mu)(X_t - \mu)^T \} = [\gamma_{ij}(h)]_{i,j=1}^m
$$

to represent the mean of the series and the covariance matrix at lag $h$, respectively.

Note that, for each $i$, the component series $\{ X_{t i} \}$ is stationary with mean $\mu_i$ and autocovariance function $\gamma_{ii}(\cdot)$.

The function $\gamma_{ij}(\cdot)$ where $i \neq j$ is called the **cross-covariance function** of the two series $\{ X_{t i} \}$ and $\{ X_{t j} \}$. It should be noted that $\gamma_{ij}(\cdot)$ is not in general the same as $\gamma_{ji}(\cdot)$.

Further, the correlation matrix function $R(\cdot)$ is defined by

$$
R(h) = \left[ \frac{\gamma_{ij}(h)}{\sqrt{\gamma_{ii}(0) \gamma_{jj}(0)}} \right]_{i,j=1}^m = [\rho_{ij}(h)]_{i,j=1}^m.
$$

The function $R(\cdot)$ is the covariance matrix function of the normalized series obtained by subtracting $\mu$ from $X_t$ and then dividing each component by its standard deviation.

---

**Lemma 8.1.** The covariance matrix function $\Gamma(\cdot) = [\gamma_{ij}(\cdot)]_{i,j=1}^m$ of a stationary time series $\{ X_t \}$ has the properties

1. $\Gamma(h) = \Gamma^T(-h)$;
2. $|\gamma_{ij}(h)| \leq \sqrt{\gamma_{ii}(0) \gamma_{jj}(0)}$, $i, j = 1, \ldots, m$;
3. $\gamma_{ii}(\cdot)$ is an autocovariance function, $i = 1, \ldots, m$;
4. $\sum_{j,k=1}^n a_j^T \Gamma(j-k) a_k \geq 0$ for all $n \in \{1, 2, \ldots\}$ and $a_1, \ldots, a_n \in \mathbb{R}^m$.

The correlation matrix function $R(\cdot)$ satisfies the same four properties and, in addition,

$$
\rho_{ii}(0) = 1.
$$

---

**Example 8.1.** Consider the bivariate stationary process $\{ X_t \}$ defined by

$$
\begin{cases}
X_{t1} = W_t \\
X_{t2} = W_t + 0.75 W_{t-10}
\end{cases}
$$

where $\{ W_t \} \sim WN(0,1)$ (white noise with mean 0 and variance 1). Then

$$
\mu = 0,
$$

and

$$
\Gamma(-10) = \text{Cov}\left(
\begin{pmatrix} X_{t-10,1} \\ X_{t-10,2} \end{pmatrix},
\begin{pmatrix} X_{t,1} \\ X_{t,2} \end{pmatrix}
\right)
= \text{Cov}\left(
\begin{pmatrix} W_{t-10} \\ W_{t-10} + 0.75 W_{t-20} \end{pmatrix},
\begin{pmatrix} W_t \\ W_t + 0.75 W_{t-10} \end{pmatrix}
\right)
=
\begin{pmatrix}
0 & 0.75 \\
0 & 0.75
\end{pmatrix}
$$

$$
\Gamma(0) = \text{Cov}\left(
\begin{pmatrix} X_{t,1} \\ X_{t,2} \end{pmatrix},
\begin{pmatrix} X_{t,1} \\ X_{t,2} \end{pmatrix}
\right)
= \text{Cov}\left(
\begin{pmatrix} W_t \\ W_t + 0.75 W_{t-10} \end{pmatrix},
\begin{pmatrix} W_t \\ W_t + 0.75 W_{t-10} \end{pmatrix}
\right)
=
\begin{pmatrix}
1 & 1 \\
1 & 1.5625
\end{pmatrix}
$$

$$
\Gamma(10) = \text{Cov}\left(
\begin{pmatrix} X_{t+10,1} \\ X_{t+10,2} \end{pmatrix},
\begin{pmatrix} X_{t,1} \\ X_{t,2} \end{pmatrix}
\right)
= \text{Cov}\left(
\begin{pmatrix} W_{t+10} \\ W_{t+10} + 0.75 W_t \end{pmatrix},
\begin{pmatrix} W_t \\ W_t + 0.75 W_{t-10} \end{pmatrix}
\right)
=
\begin{pmatrix}
0 & 0 \\
0.75 & 0.75
\end{pmatrix}
$$


For every other lag $j$, $\Gamma(j) = 0$. The correlation matrix function is given by

$$
R(-10) = \begin{pmatrix}
0 & 0.6 \\
0 & 0.48
\end{pmatrix}, \quad
R(0) = \begin{pmatrix}
1 & 0.8 \\
0.8 & 1
\end{pmatrix}, \quad
R(10) = \begin{pmatrix}
0 & 0 \\
0.6 & 0.48
\end{pmatrix},
$$

and $R(j) = 0$ otherwise.
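The matrices in Example 8.1 can be checked numerically. The sketch below assumes NumPy and writes the process as $X_t = c_0 W_t + c_{10} W_{t-10}$ with $c_0 = (1, 1)^T$ and $c_{10} = (0, 0.75)^T$; because $\{W_t\}$ is uncorrelated with unit variance, $\text{Cov}(X_{t+h}, X_t)$ reduces to $\sum_j c_{j+h} c_j^T$. The names `coeffs`, `gamma` and `corr` are illustrative only.

```python
import numpy as np

# Coefficient vectors of Example 8.1: X_t = c[0] * W_t + c[10] * W_{t-10},
# driven by scalar white noise {W_t} ~ WN(0, 1).
coeffs = {0: np.array([1.0, 1.0]), 10: np.array([0.0, 0.75])}

def gamma(h):
    """Gamma(h) = Cov(X_{t+h}, X_t) = sum_j c_{j+h} c_j^T (unit noise variance)."""
    total = np.zeros((2, 2))
    for j, c_j in coeffs.items():
        c_jh = coeffs.get(j + h)      # c_{j+h}, absent means zero
        if c_jh is not None:
            total += np.outer(c_jh, c_j)
    return total

def corr(h):
    """R(h): divide gamma_ij(h) by sqrt(gamma_ii(0) * gamma_jj(0))."""
    sd = np.sqrt(np.diag(gamma(0)))
    return gamma(h) / np.outer(sd, sd)

for h in (-10, 0, 10):
    print(h, gamma(h).tolist(), corr(h).round(2).tolist())
```

The printed matrices should agree with $\Gamma(\pm 10)$, $\Gamma(0)$ and $R(\cdot)$ above, and `gamma(h)` returns the zero matrix at every other lag.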

---

**(Multivariate White Noise).** The $m$-variate series $\{ W_t, t = 0, \pm 1, \pm 2, \ldots \}$ is said to be white noise with mean 0 and covariance matrix $\Sigma$, written as

$$
\{ W_t \} \sim WN(0, \Sigma),
$$

if and only if $\{ W_t \}$ is stationary with mean vector 0 and covariance matrix function

$$
\Gamma(h) =
\begin{cases}
\Sigma & \text{if } h = 0, \\
0 & \text{otherwise}.
\end{cases}
$$

If, in addition, the random vectors $W_t$ are independent and identically distributed, then we write

$$
\{ W_t \} \sim IID(0, \Sigma).
$$

Multivariate white noise is used as a building block from which an enormous variety of multivariate time series can be constructed. The **linear processes** are those of the form

$$
X_t = \sum_{j=-\infty}^{\infty} C_j W_{t-j}, \quad \{ W_t \} \sim WN(0, \Sigma),
$$

where $\{ C_j \}$ is a sequence of matrices whose components are absolutely summable. It is easy to see that this linear process has mean 0 and covariance matrix function

$$
\Gamma(h) = \sum_{j=-\infty}^{\infty} C_{j+h} \, \Sigma \, C_j^T, \quad h = 0, \pm 1, \pm 2, \ldots
$$
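As an illustration of the covariance formula, the following sketch evaluates $\Gamma(h) = \sum_j C_{j+h} \Sigma C_j^T$ when only finitely many $C_j$ are nonzero. It assumes NumPy; the function name `linear_process_acvf` and the bivariate MA(1) coefficients `Theta` and `Sigma` are hypothetical choices made for the example, not taken from the text.

```python
import numpy as np

def linear_process_acvf(C, Sigma, h):
    """Gamma(h) = sum_j C_{j+h} Sigma C_j^T for finitely many nonzero C_j.

    C maps an integer index j to the m x m coefficient matrix C_j.
    """
    m = Sigma.shape[0]
    total = np.zeros((m, m))
    for j, C_j in C.items():
        C_jh = C.get(j + h)           # C_{j+h}, absent means the zero matrix
        if C_jh is not None:
            total += C_jh @ Sigma @ C_j.T
    return total

# Hypothetical bivariate MA(1): X_t = W_t + Theta W_{t-1}.
Theta = np.array([[0.5, 0.2], [0.0, -0.3]])
Sigma = np.array([[1.0, 0.4], [0.4, 2.0]])
C = {0: np.eye(2), 1: Theta}

G0 = linear_process_acvf(C, Sigma, 0)    # Sigma + Theta Sigma Theta^T
G1 = linear_process_acvf(C, Sigma, 1)    # Theta Sigma
Gm1 = linear_process_acvf(C, Sigma, -1)  # Sigma Theta^T
```

For this MA(1) the result at lag $-1$ is the transpose of the result at lag $1$, in line with property 1 of Lemma 8.1, and all lags beyond $\pm 1$ give the zero matrix.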

---

**Estimation of $\mu$.** Based on the observations $X_1, \ldots, X_n$, an unbiased estimate of $\mu$ is given by the vector of sample means

$$
\bar{X}_n = \frac{1}{n} \sum_{t=1}^n X_t.
$$

Under suitable conditions on the dependence in $\{ X_t \}$, this estimator is consistent and asymptotically normal, converging at rate $\sqrt{n}$.


### Estimation of $\Gamma(h)$

Based on the observations $X_1, \ldots, X_n$, as in the univariate case, a natural estimate of the covariance matrix $\Gamma(h)$ is

$$
\hat{\Gamma}(h) = 
\begin{cases}
\frac{1}{n-1} \sum_{t=1}^{n-h} (X_{t+h} - \bar{X}_n)(X_t - \bar{X}_n)^T, & 0 \leq h \leq n-1, \\
\\
\frac{1}{n-1} \sum_{t=-h+1}^n (X_{t+h} - \bar{X}_n)(X_t - \bar{X}_n)^T, & -n+1 \leq h < 0,
\end{cases}
$$

where $\bar{X}_n = \frac{1}{n} \sum_{t=1}^n X_t$.

Denoting $\hat{\Gamma}(h) = [\hat{\gamma}_{ij}(h)]_{i,j=1}^m$, we estimate the cross-correlation function by

$$
\hat{\rho}_{ij}(h) = \frac{\hat{\gamma}_{ij}(h)}{\sqrt{\hat{\gamma}_{ii}(0) \hat{\gamma}_{jj}(0)}}.
$$

If $i = j$, this reduces to the sample autocorrelation function of the $i$-th series.
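A minimal NumPy sketch of these estimators is given below. It assumes the observations are stored as an array of shape $(n, m)$ with row $t$ holding $X_t^T$, and it keeps the $1/(n-1)$ factor used in the formula above. The function names `sample_acvf` and `sample_ccf` and the simulated data are illustrative assumptions.

```python
import numpy as np

def sample_acvf(X, h):
    """Sample covariance matrix at lag h for data X of shape (n, m)."""
    n = X.shape[0]
    if abs(h) >= n:
        raise ValueError("lag must satisfy |h| <= n - 1")
    D = X - X.mean(axis=0)             # row t holds X_t minus the sample mean
    if h >= 0:
        lead, lag = D[h:], D[:n - h]   # pairs (X_{t+h}, X_t), t = 1, ..., n-h
    else:
        lead, lag = D[:n + h], D[-h:]  # pairs (X_{t+h}, X_t), t = -h+1, ..., n
    # Entry (i, j) sums the products of component i at t+h and component j at t.
    return lead.T @ lag / (n - 1)

def sample_ccf(X, h):
    """Sample cross-correlation matrix [rho_hat_ij(h)] at lag h."""
    sd = np.sqrt(np.diag(sample_acvf(X, 0)))
    return sample_acvf(X, h) / np.outer(sd, sd)

# Simulated data purely for demonstration: 200 observations of a bivariate series.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
R3 = sample_ccf(X, 3)
```

With this normalisation the lag-0 estimate coincides with the usual sample covariance matrix, and the estimates satisfy $\hat{\Gamma}(-h) = \hat{\Gamma}(h)^T$ by construction.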

---

### Theorem 8.1

Let $\{X_t\}$ be the bivariate time series

$$
X_t = \sum_{k=-\infty}^{\infty} C_k W_{t-k}, \quad \{ W_t = (W_{t1}, W_{t2})^T \} \sim IID(0, \Sigma),
$$

where $\{ C_k = [C_k(i,j)]_{i,j=1}^2 \}$ is a sequence of matrices with

$$
\sum_{k=-\infty}^{\infty} |C_k(i,j)| < \infty, \quad i,j=1,2.
$$

Then as $n \to \infty$,

$$
\hat{\gamma}_{ij}(h) \xrightarrow{p} \gamma_{ij}(h),
$$

and

$$
\hat{\rho}_{ij}(h) \xrightarrow{p} \rho_{ij}(h),
$$

for each fixed $h \geq 0$ and for $i,j = 1,2$.

---

### Theorem 8.2

Suppose that

$$
X_{t1} = \sum_{j=-\infty}^\infty \alpha_j W_{t-j,1}, \quad \{ W_{t1} \} \sim IID(0, \sigma_1^2),
$$

and

$$
X_{t2} = \sum_{j=-\infty}^\infty \beta_j W_{t-j,2}, \quad \{ W_{t2} \} \sim IID(0, \sigma_2^2),
$$

where the two sequences $\{ W_{t1} \}$ and $\{ W_{t2} \}$ are independent, and

$$
\sum_j |\alpha_j| < \infty, \quad \sum_j |\beta_j| < \infty.
$$

Then if $h \geq 0$,

$$
\hat{\rho}_{12}(h) \sim AN\left(0, n^{-1} \sum_{j=-\infty}^\infty \rho_{11}(j) \rho_{22}(j) \right),
$$

where $AN$ denotes asymptotic normality.

If $h, k \geq 0$ and $h \neq k$, then

$$
\begin{pmatrix}
\hat{\rho}_{12}(h) \\
\hat{\rho}_{12}(k)
\end{pmatrix}
\sim AN \left(
0,
\begin{pmatrix}
n^{-1} \sum_{j=-\infty}^\infty \rho_{11}(j) \rho_{22}(j) & n^{-1} \sum_{j=-\infty}^\infty \rho_{11}(j) \rho_{22}(j + k - h) \\
n^{-1} \sum_{j=-\infty}^\infty \rho_{11}(j) \rho_{22}(j + k - h) & n^{-1} \sum_{j=-\infty}^\infty \rho_{11}(j) \rho_{22}(j)
\end{pmatrix}
\right).
$$


This theorem plays an important role in testing for correlation between two processes. If one of the two processes is white noise, then

$$
\hat{\rho}_{12}(h) \sim AN\left(0, n^{-1}\right),
$$

in which case it is straightforward to test the hypothesis that $\rho_{12}(h) = 0$. The rejection region is

$$
|\hat{\rho}_{12}(h)| > \frac{z_{\alpha/2}}{\sqrt{n}}.
$$
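The threshold above is simple to compute. The sketch below uses only the Python standard library; the function names `ccf_critical_value` and `reject_zero_ccf` are hypothetical, and the test is only valid under the assumptions of Theorem 8.2 with one of the two series being white noise.

```python
from math import sqrt
from statistics import NormalDist

def ccf_critical_value(n, alpha=0.05):
    """Threshold z_{alpha/2} / sqrt(n) for |rho_hat_12(h)|."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 normal quantile
    return z / sqrt(n)

def reject_zero_ccf(rho_hat, n, alpha=0.05):
    """True if the hypothesis rho_12(h) = 0 is rejected at level alpha."""
    return abs(rho_hat) > ccf_critical_value(n, alpha)

# With n = 100 observations the 5% threshold is close to 0.196.
threshold = ccf_critical_value(100)
```

For instance, with $n = 100$ an estimate of $0.25$ would be rejected at the 5% level, whereas $0.10$ would not.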

However, if neither process is white noise, then a value of $|\hat{\rho}_{12}(h)|$ which is large relative to $n^{-1/2}$ does not necessarily indicate that $\rho_{12}(h)$ is different from zero. For example, suppose that $\{ X_{t1} \}$ and $\{ X_{t2} \}$ are two independent and identical AR(1) processes with

$$
\rho_{11}(h) = \rho_{22}(h) = 0.8^{|h|}.
$$

Then the asymptotic variance of $\hat{\rho}_{12}(h)$ is

$$
n^{-1} \left(1 + 2 \sum_{k=1}^\infty (0.8)^{2k} \right) = 4.556 \, n^{-1}.
$$

Thus, the rejection region is

$$
|\hat{\rho}_{12}(h)| > z_{\alpha/2} \frac{\sqrt{4.556}}{\sqrt{n}}.
$$

It would not be surprising to observe a value of $\hat{\rho}_{12}(h)$ as large as $3 n^{-1/2}$ even though $\{ X_{t1} \}$ and $\{ X_{t2} \}$ are independent. On the other hand, if

$$
\rho_{11}(h) = 0.8^{|h|} \quad \text{and} \quad \rho_{22}(h) = (-0.8)^{|h|},
$$

then the asymptotic variance of $\hat{\rho}_{12}(h)$ is $0.2195 n^{-1}$, and an observed value of $3 n^{-1/2}$ for $\hat{\rho}_{12}(h)$ would be very unlikely.
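Both variance factors can be reproduced from the geometric series $\sum_{j=-\infty}^{\infty} r^{|j|} = (1 + r)/(1 - r)$ with $r = \phi_1 \phi_2$, where $\rho_{ii}(j) = \phi_i^{|j|}$. The sketch below is a plain-Python check; the function names are illustrative, and the truncated sum is included only as a cross-check of the closed form.

```python
def ccf_variance_factor(phi1, phi2):
    """n times the asymptotic variance of rho_hat_12(h) for two independent
    series with autocorrelations phi1**|j| and phi2**|j|."""
    r = phi1 * phi2
    if abs(r) >= 1:
        raise ValueError("need |phi1 * phi2| < 1 for the series to converge")
    return (1 + r) / (1 - r)

def ccf_variance_factor_truncated(phi1, phi2, J=200):
    """Same quantity from the truncated sum over j = -J, ..., J."""
    return sum((phi1 ** abs(j)) * (phi2 ** abs(j)) for j in range(-J, J + 1))

same_sign = ccf_variance_factor(0.8, 0.8)       # approximately 4.556
opposite_sign = ccf_variance_factor(0.8, -0.8)  # approximately 0.2195
```

In terms of standard deviations, $3 n^{-1/2}$ is roughly $1.4$ asymptotic standard deviations from zero in the first case and roughly $6.4$ in the second, which is why the same observed value is unremarkable in one setting and very unlikely in the other.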

---

## 8.2 Multivariate ARMA processes

**(Multivariate ARMA(p, q) process).** The process $\{ X_t, t = 0, \pm 1, \pm 2, \ldots \}$ is an $m$-variate ARMA(p, q) process if $\{ X_t \}$ is a stationary solution of the difference equations

$$
X_t - \Phi_1 X_{t-1} - \cdots - \Phi_p X_{t-p} = W_t + \Theta_1 W_{t-1} + \cdots + \Theta_q W_{t-q},
$$

where $\Phi_1, \ldots, \Phi_p, \Theta_1, \ldots, \Theta_q$ are real $m \times m$ matrices, and $\{ W_t \} \sim WN(0, \Sigma)$.

We can write this more compactly as

$$
\Phi(B) X_t = \Theta(B) W_t, \quad \{ W_t \} \sim WN(0, \Sigma),
$$

where

$$
\Phi(z) = I - \Phi_1 z - \cdots - \Phi_p z^p,
$$

and

$$
\Theta(z) = I + \Theta_1 z + \cdots + \Theta_q z^q,
$$

are matrix-valued polynomials, and $I$ is the $m \times m$ identity matrix.

---

**Example 8.2. (Multivariate AR(1) process).**

This process satisfies

$$
X_t = \Phi X_{t-1} + W_t, \quad \{ W_t \} \sim WN(0, \Sigma).
$$


We have

$$
X_t = \sum_{j=0}^{\infty} \Phi^j W_{t-j},
$$

provided all the eigenvalues of $\Phi$ are less than 1 in absolute value; that is,

$$
\det(I - z\Phi) \ne 0 \quad \text{for all } z \in \mathbb{C} \text{ such that } |z| \leq 1.
$$
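The eigenvalue condition is easy to check numerically. The following sketch assumes NumPy and uses a hypothetical coefficient matrix `Phi` with $\Sigma = I$; it also approximates $\Gamma(0) = \sum_{j \geq 0} \Phi^j \Sigma (\Phi^j)^T$, which follows from the linear-process covariance formula of Section 8.1 with $C_j = \Phi^j$.

```python
import numpy as np

def is_causal_var1(Phi):
    """True if every eigenvalue of Phi lies strictly inside the unit circle."""
    return bool(np.max(np.abs(np.linalg.eigvals(Phi))) < 1)

def var1_gamma0(Phi, Sigma, terms=500):
    """Truncated sum Gamma(0) = sum_{j < terms} Phi^j Sigma (Phi^j)^T."""
    G = np.zeros_like(Sigma, dtype=float)
    P = np.eye(Phi.shape[0])          # holds Phi^j, starting at j = 0
    for _ in range(terms):
        G += P @ Sigma @ P.T
        P = Phi @ P
    return G

# Hypothetical coefficients, chosen so that the eigenvalues are about 0.63 and 0.27.
Phi = np.array([[0.5, 0.3], [0.1, 0.4]])
Sigma = np.eye(2)

causal = is_causal_var1(Phi)
G0 = var1_gamma0(Phi, Sigma)
```

For a causal solution the resulting matrix satisfies $\Gamma(0) = \Phi \Gamma(0) \Phi^T + \Sigma$ up to truncation error, which gives a quick sanity check on the computation.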

---

### Theorem 8.3 (Causality Criterion)

If

$$
\det \Phi(z) \ne 0 \quad \text{for all } z \in \mathbb{C} \text{ such that } |z| \leq 1,
$$

then the ARMA equations have exactly one **stationary solution**, and it is causal:

$$
X_t = \sum_{j=0}^{\infty} \Psi_j W_{t-j},
$$

where the matrices $\Psi_j$ are determined uniquely by the generating function

$$
\Psi(z) = \sum_{j=0}^{\infty} \Psi_j z^j = \Phi^{-1}(z) \Theta(z), \quad |z| \leq 1.
$$
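Matching the coefficients of $z^j$ on both sides of $\Phi(z) \Psi(z) = \Theta(z)$ gives the recursion $\Psi_0 = I$ and $\Psi_j = \Theta_j + \sum_{k=1}^{\min(j,p)} \Phi_k \Psi_{j-k}$, with $\Theta_j = 0$ for $j > q$. A minimal NumPy sketch of this recursion follows; the function name `psi_weights`, its argument layout and the example matrices are assumptions made for illustration.

```python
import numpy as np

def psi_weights(Phis, Thetas, m, n_terms):
    """Return [Psi_0, ..., Psi_{n_terms-1}] for an m-variate ARMA(p, q).

    Phis = [Phi_1, ..., Phi_p] and Thetas = [Theta_1, ..., Theta_q]
    are lists of m x m arrays (either list may be empty).
    """
    p, q = len(Phis), len(Thetas)
    Psi = []
    for j in range(n_terms):
        if j == 0:
            term = np.eye(m)                            # Theta_0 = I
        elif j <= q:
            term = np.array(Thetas[j - 1], dtype=float)
        else:
            term = np.zeros((m, m))                     # Theta_j = 0 for j > q
        for k in range(1, min(j, p) + 1):
            term = term + Phis[k - 1] @ Psi[j - k]
        Psi.append(term)
    return Psi

# Hypothetical bivariate ARMA(1, 1) coefficients.
Phi1 = np.array([[0.5, 0.3], [0.1, 0.4]])
Theta1 = np.array([[0.2, 0.0], [0.1, -0.3]])
Psi = psi_weights([Phi1], [Theta1], m=2, n_terms=5)
```

For a pure AR(1) model the recursion returns $\Psi_j = \Phi^j$, matching Example 8.2; for the ARMA(1, 1) case it gives $\Psi_1 = \Theta_1 + \Phi_1$ and $\Psi_j = \Phi_1 \Psi_{j-1}$ for $j \geq 2$.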

---

### Theorem 8.4 (Invertibility Criterion)

If

$$
\det \Theta(z) \ne 0 \quad \text{for all } z \in \mathbb{C} \text{ such that } |z| \leq 1,
$$

and $\{X_t\}$ is a stationary solution of the ARMA equation, then the white noise sequence $\{W_t\}$ can be recovered as

$$
W_t = \sum_{j=0}^{\infty} \Pi_j X_{t-j},
$$

where the matrices $\Pi_j$ are uniquely determined by

$$
\Pi(z) = \sum_{j=0}^{\infty} \Pi_j z^j = \Theta^{-1}(z) \Phi(z), \quad |z| \leq 1.
$$
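In the same way, matching the coefficients of $z^j$ in $\Theta(z) \Pi(z) = \Phi(z)$ gives $\Pi_0 = I$ and $\Pi_j = -\Phi_j - \sum_{k=1}^{\min(j,q)} \Theta_k \Pi_{j-k}$, with $\Phi_j = 0$ for $j > p$. The NumPy sketch below implements this recursion; the function name `pi_weights` and the example matrices are hypothetical.

```python
import numpy as np

def pi_weights(Phis, Thetas, m, n_terms):
    """Return [Pi_0, ..., Pi_{n_terms-1}] for an m-variate ARMA(p, q).

    Phis = [Phi_1, ..., Phi_p] and Thetas = [Theta_1, ..., Theta_q]
    are lists of m x m arrays (either list may be empty).
    """
    p, q = len(Phis), len(Thetas)
    Pi = []
    for j in range(n_terms):
        if j == 0:
            term = np.eye(m)                            # constant term of Phi(z)
        elif j <= p:
            term = -np.array(Phis[j - 1], dtype=float)  # coefficient -Phi_j
        else:
            term = np.zeros((m, m))                     # Phi_j = 0 for j > p
        for k in range(1, min(j, q) + 1):
            term = term - Thetas[k - 1] @ Pi[j - k]
        Pi.append(term)
    return Pi

# Hypothetical bivariate MA(1): the weights should equal (-Theta1)^j.
Theta1 = np.array([[0.2, 0.0], [0.1, -0.3]])
Pi = pi_weights([], [Theta1], m=2, n_terms=5)
```

For an MA(1) model the recursion reduces to $\Pi_j = (-\Theta_1)^j$, the matrix analogue of the familiar univariate expansion.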
