# 3.4 Ergodic Markov Processes

Chapter 1 studied special statistical models that, because they are ergodic, are affiliated with a Law of Large Numbers in which limit points are constant across sample points $\omega \in \Omega$. Section 1.8 described other statistical models that are not ergodic and that are components of more general probability specifications that we used to express the idea that a statistical model is unknown. We now explore ergodicity in the context of Markov processes.

From Proposition 3.3.2 we know that time-series averages of an eigenfunction $\mathbb{T} \tilde{f}=\tilde{f}$ are invariant over time, so
$$
\frac{1}{N} \sum_{t=1}^N \tilde{f}\left(X_t\right)=\tilde{f}(X)
$$

However, when $\tilde{f}(x)$ varies across sets of states $x$ that occur with positive probability under $Q$, a time series average $\frac{1}{N} \sum_{t=1}^N \tilde{f}\left(X_t\right)$ can differ from $\int \tilde{f}(x) Q(d x)$. This happens when observations of $\tilde{f}\left(X_t\right)$ along a sample path for $\left\{X_t\right\}$ convey an inaccurate impression of how $\tilde{f}(X)$ varies across the stationary distribution $Q(d x)$. See Example 3.6.4 below. We can exclude the possibility of such inaccurate impressions by imposing a restriction on the eigenfunction equation $\mathbb{T} f=f$.
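A small simulation can make this gap concrete. The sketch below assumes a hypothetical four-state chain whose states split into two classes that never communicate. The matrix `P`, the function `f_tilde`, and the mixture weights in `Q` are illustrative choices, not objects taken from the text. The indicator of one class solves $\mathbb{T} f=f$, so its time series average along any path equals its value at the initial state, while its integral under the mixed stationary distribution is strictly between zero and one.

```python
import numpy as np

# Hypothetical transition matrix: states {0, 1} and {2, 3} never communicate.
P = np.array([
    [0.9, 0.1, 0.0, 0.0],
    [0.4, 0.6, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.2, 0.8],
])

# Indicator of the class {0, 1}; it satisfies P @ f_tilde == f_tilde.
f_tilde = np.array([1.0, 1.0, 0.0, 0.0])

# One stationary distribution (an assumption for illustration): an equal-weight
# mixture of the class-specific stationary distributions (0.8, 0.2) and (2/7, 5/7).
Q = 0.5 * np.array([0.8, 0.2, 0.0, 0.0]) + 0.5 * np.array([0.0, 0.0, 2 / 7, 5 / 7])


def time_average(P, f, x0, N, rng):
    """Simulate X_1, ..., X_N starting from X_0 = x0; return (1/N) * sum_t f(X_t)."""
    x, total = x0, 0.0
    for _ in range(N):
        x = rng.choice(len(f), p=P[x])  # draw next state from row x of P
        total += f[x]
    return total / N


rng = np.random.default_rng(0)
avg_from_0 = time_average(P, f_tilde, x0=0, N=1_000, rng=rng)
avg_from_2 = time_average(P, f_tilde, x0=2, N=1_000, rng=rng)
population_mean = Q @ f_tilde  # the integral of f_tilde with respect to Q

print(avg_from_0, avg_from_2, population_mean)
```

Because a path that starts in one class never leaves it, the two time averages are one and zero regardless of the random draws, and neither equals the population mean of one half.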

Proposition 3.4.1. When every solution to the equation
$$
\mathbb{T} f=f
$$
is a constant function (with $Q$ measure one), then it is possible to construct $\left\{X_t: t=0,1,2, \ldots\right\}$ as a stationary and ergodic Markov process with $\mathbb{T}$ as the one-period conditional expectation operator and $Q$ as the initial distribution for $X_0$.

Evidently, ergodicity is a property that obtains relative to a stationary distribution $Q$ of the Markov process. If there are multiple stationary distributions, it is possible that only constant functions $f$ solve $\mathbb{T} f=f$ (with $Q$ measure one) for one stationary distribution while non-constant solutions exist for other stationary distributions.
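The dependence on $Q$ can be checked directly in a finite-state setting, where $\mathbb{T}$ acts on a vector $f$ as multiplication by a transition matrix. The following is a minimal sketch under that assumption. The three-state matrix `P`, the distributions `Q_a`, `Q_b`, `Q_mix`, and the helper names are hypothetical. The code computes a basis for the solutions of $\mathbb{T} f=f$ and asks whether every solution is constant on the support of a given stationary distribution.

```python
import numpy as np

# Hypothetical chain: {0, 1} is a closed class and state 2 is absorbing.
P = np.array([
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
])

# Three stationary distributions of P (each satisfies Q @ P == Q).
Q_a = np.array([0.5, 0.5, 0.0])
Q_b = np.array([0.0, 0.0, 1.0])
Q_mix = 0.5 * Q_a + 0.5 * Q_b


def unit_eigenfunctions(P, tol=1e-10):
    """Orthonormal basis (as columns) of {f : P f = f}, the null space of P - I."""
    _, s, vt = np.linalg.svd(P - np.eye(P.shape[0]))
    return vt[s < tol].T


def constant_on_support(P, Q, tol=1e-8):
    """True if every solution of P f = f is constant on {x : Q(x) > 0}.

    It suffices to check the basis vectors, because linear combinations of
    vectors that are constant on the support are constant on the support.
    """
    on_support = unit_eigenfunctions(P)[Q > 0]
    spread = on_support.max(axis=0) - on_support.min(axis=0)
    return bool(np.all(spread < tol))


for name, Q in [("Q_a", Q_a), ("Q_b", Q_b), ("Q_mix", Q_mix)]:
    print(name, constant_on_support(P, Q))
```

In this example the check succeeds for `Q_a` and `Q_b` and fails for `Q_mix`, so the same operator is paired with an ergodic process under the first two initial distributions but not under the third.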

## Invariant events for a Markov process

Consider an eigenfunction $\tilde{f}$ of $\mathbb{T}$ associated with a unit eigenvalue. Let $\varphi: \mathbb{R} \rightarrow \mathbb{R}$ be a bounded Borel measurable function. Since $\left\{\tilde{f}\left(X_t\right): t=0,1,2, \ldots\right\}$ is invariant over time, so is $\left\{\varphi\left[\tilde{f}\left(X_t\right)\right]: t=0,1,2, \ldots\right\}$, and it is necessarily true that
$$
\mathbb{T}(\varphi \circ \tilde{f})=\varphi \circ \tilde{f}
$$
Therefore, from an eigenfunction $\tilde{f}$ associated with a unit eigenvalue, we can construct other eigenfunctions, for example
$$
\varphi[\tilde{f}(x)]= \begin{cases}1 & \text { if } \tilde{f}(x) \in \tilde{\mathfrak{b}} \\ 0 & \text { if } \tilde{f}(x) \notin \tilde{\mathfrak{b}}\end{cases} \tag{3.3}
$$
for some Borel set $\tilde{\mathfrak{b}}$ in $\mathbb{R}$. It follows that
$$
\Lambda=\left\{\omega \in \Omega: \tilde{f}\left[X_0(\omega)\right] \in \tilde{\mathfrak{b}}\right\}
$$
is an invariant event in $\Omega$. Note that by constructing the Borel set $\mathfrak{b}$ in $\mathcal{X}$,
$$
\mathfrak{b}=\{x: \tilde{f}(x) \in \tilde{\mathfrak{b}}\}
$$
we can represent $\Lambda$ as
$$
\Lambda=\left\{\omega \in \Omega: X_0(\omega) \in \mathfrak{b}\right\} . \tag{3.4}
$$

Thus we have shown how to construct many non-degenerate eigenfunctions, starting from an initial such function.

For Markov processes, all invariant events can be represented as in (3.4), which is expressed in terms of the initial state $X_0$. See Doob (1953, p. 460, Theorem 1.1). Thus, associated with an invariant event is a Borel set in $\mathcal{X}$. Let $\mathfrak{J}$ denote the collection of Borel subsets of $\mathcal{X}$ for which $\Lambda$ constructed as in (3.4) is an invariant event. From these invariant events, we can also construct many non-degenerate eigenfunctions as indicator functions of sets in $\mathfrak{J}$. Formally, if $\mathfrak{b} \in \mathfrak{J}$, then the indicator function
$$
f(x)= \begin{cases}1 & \text { if } x \in \mathfrak{b} \\ 0 & \text { if } x \notin \mathfrak{b}\end{cases} \tag{3.5}
$$
satisfies
$$
\mathbb{T} f=f
$$
with $Q$ probability one. Provided that the probability of $\Lambda$ is neither zero nor one, we have constructed a nonnegative function $f$ that is strictly positive on a set of positive $Q$ measure and zero on a set with strictly positive $Q$ measure.

More generally, when a Markov process $\left\{X_t: t \geq 0\right\}$ is not ergodic, there exist bounded eigenfunctions with unit eigenvalues that are not constant with $Q$ measure one. For every eigenfunction $\tilde{f}$ with unit eigenvalue to be constant with $Q$ measure one, it must not be possible for the Markov process to get stuck permanently in a subset of the state space whose probability is neither zero nor one. Suppose now that we consider any Borel set $\mathfrak{b}$ of $\mathcal{X}$ whose $Q$ measure is neither zero nor one. Let $f$ be constructed as in (3.5) without restricting $\mathfrak{b}$ to be in $\mathfrak{J}$. Then $\mathbb{T}^j f(x)$ is the probability of $\left\{X_j \in \mathfrak{b}\right\}$ conditional on $X_0=x$. If we want time series averages to converge to unconditional expectations, we must require that the set $\mathfrak{b}$ be visited eventually with positive probability. To account properly for all possible future dates, we use a mathematically convenient resolvent operator defined by
$$
\mathbb{M} f(x)=(1-\lambda) \sum_{j=0}^{\infty} \lambda^j \mathbb{T}^j f(x)
$$
for some constant discount factor $0<\lambda<1$. Notice that if $\tilde{f}$ is an eigenfunction of $\mathbb{T}$ associated with a unit eigenvalue, then the same is true for $\mathbb{T}^j$ and hence for $\mathbb{M}$. We translate the requirement that $\mathfrak{b}$ be eventually visited into a restriction that applying $\mathbb{M}$ to the indicator function $f$ yields a strictly positive function. The following statement extends this restriction to all nonnegative functions that are distinct from zero.

Proposition 3.4.2. Suppose that for any $f \geq 0$ such that $\int f(x) Q(d x)>0$, $\mathbb{M} f(x)>0$ for all $x \in \mathcal{X}$ with $Q$ measure one. Then any solution $\tilde{f}$ to $\mathbb{T} f=f$ is necessarily constant with $Q$ measure one.

Proof. Consider an eigenfunction $\tilde{f}$ associated with a unit eigenvalue. The function $f=\varphi \circ \tilde{f}$ necessarily satisfies:
$$
\mathbb{M} f=f
$$
for any $\varphi$ of the form (3.3). If such an $f$ also satisfies $\int f(x) Q(d x)>0$, then by hypothesis $f=\mathbb{M} f>0$ with $Q$ probability one. Because $f$ takes only the values zero and one, $f(x)=1$ with $Q$ probability one. Since this holds for any Borel set $\tilde{\mathfrak{b}}$ in $\mathbb{R}$, $\tilde{f}$ must be constant with $Q$ probability one.
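For a finite-state chain the resolvent has the closed form $\mathbb{M}=(1-\lambda)(I-\lambda P)^{-1}$, because the geometric series of matrices converges when $0<\lambda<1$. The sketch below uses that formula to check the hypothesis of Proposition 3.4.2 numerically. It assumes a finite state space, in which case the hypothesis is equivalent to every entry of $\mathbb{M}$ between states in the support of $Q$ being strictly positive. The example matrices, the choice $\lambda=0.9$, and the function names are hypothetical.

```python
import numpy as np


def resolvent(P, lam=0.9):
    """M = (1 - lam) * sum_j lam**j * P**j, computed as (1 - lam) * inv(I - lam * P)."""
    n = P.shape[0]
    return (1.0 - lam) * np.linalg.inv(np.eye(n) - lam * P)


def satisfies_positivity(P, Q, lam=0.9, tol=1e-12):
    """Check M[x, y] > 0 for all states x, y that carry positive Q mass."""
    M = resolvent(P, lam)
    support = Q > 0
    return bool(np.all(M[np.ix_(support, support)] > tol))


# Hypothetical irreducible chain: a deterministic cycle 0 -> 1 -> 2 -> 0.
P_cycle = np.array([
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
])
Q_cycle = np.full(3, 1 / 3)

# Hypothetical reducible chain: {0, 1} and {2} never communicate.
P_split = np.array([
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
])
Q_split = np.array([0.25, 0.25, 0.5])

print(satisfies_positivity(P_cycle, Q_cycle))
print(satisfies_positivity(P_split, Q_split))
```

The cycle passes the check because discounting sums over all future dates, so every state is reached at some horizon. The split chain fails because the entries of $\mathbb{M}$ linking the two classes are zero.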

Proposition 3.4.2 supplies a sufficient condition for ergodicity. A more restrictive sufficient condition is that there exists an integer $m \geqslant 1$ such that
$$
\mathbb{T}^m f(x)>0
$$

on a set with $Q$ measure one for any $f \geq 0$ such that $\int f(x) Q(d x)>0$.

Remark 3.4.3. The sufficient conditions imposed in Proposition 3.4.2 imply a property called irreducibility relative to the probability measure $Q$. While this proposition presumes that $Q$ is a stationary distribution, irreducibility allows for a more general specification of $Q$.
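To see why the condition based on a single power $\mathbb{T}^m$ is more restrictive than the one based on $\mathbb{M}$, consider the finite-state sketch below. It searches for the smallest $m$ such that every entry of $P^m$ between states in the support of $Q$ is strictly positive. The two transition matrices, the search bound `max_m`, and the function name are hypothetical.

```python
import numpy as np


def first_positive_power(P, Q, max_m=50):
    """Smallest m <= max_m with (P**m)[x, y] > 0 for all x, y in the support of Q, else None."""
    support = Q > 0
    Pm = np.eye(P.shape[0])
    for m in range(1, max_m + 1):
        Pm = Pm @ P  # Pm now holds P raised to the power m
        if np.all(Pm[np.ix_(support, support)] > 0):
            return m
    return None


# Hypothetical "lazy" cycle: stay put or move one step forward, each with probability 1/2.
P_lazy = np.array([
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
])

# Hypothetical deterministic cycle: every power of P_cycle is a permutation matrix.
P_cycle = np.array([
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
])

Q_uniform = np.full(3, 1 / 3)  # stationary for both chains

m_lazy = first_positive_power(P_lazy, Q_uniform)
m_cycle = first_positive_power(P_cycle, Q_uniform)

# The resolvent of the deterministic cycle is nevertheless strictly positive.
lam = 0.9
M_cycle = (1 - lam) * np.linalg.inv(np.eye(3) - lam * P_cycle)
resolvent_positive = bool(np.all(M_cycle > 0))

print(m_lazy, m_cycle, resolvent_positive)
```

The lazy cycle satisfies the stronger condition with $m=2$. The deterministic cycle never does, since each power of its transition matrix contains zeros, yet it still satisfies the resolvent condition of Proposition 3.4.2.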

Proposition 3.4.2 provides a way to verify ergodicity. As discussed in Chapter 1, ergodicity is a property of a statistical model. As statisticians or econometricians we often entertain a set of Markov models, each of which is ergodic. For each model we can build a probability $\operatorname{Pr}$ using the canonical construction given at the outset of Chapter 1. Convex combinations of these probabilities are measure-preserving but not necessarily ergodic when used in conjunction with the shift transformation $\mathbb{S}$. We can take the ergodic Markov models to be the building blocks for a specification to be used in a statistical investigation. There can be a finite number of these building blocks or even a continuum of them represented in terms of an unknown parameter vector.
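The building-block idea can be illustrated with a simulation. The sketch below assumes two hypothetical ergodic two-state models, labeled `"A"` and `"B"`, and imagines that one of them governs the data with probability one half each. The transition matrices, the labels, the sample size, and the seed are illustrative assumptions. Along a sample path from either model, the fraction of time spent in state 1 should settle near that model's own stationary probability rather than near the average across models.

```python
import numpy as np

# Two hypothetical ergodic two-state models, indexed by an unknown label.
models = {
    "A": np.array([[0.9, 0.1], [0.4, 0.6]]),
    "B": np.array([[0.5, 0.5], [0.2, 0.8]]),
}
# Their stationary distributions (each satisfies q @ P == q).
stationary = {
    "A": np.array([0.8, 0.2]),
    "B": np.array([2 / 7, 5 / 7]),
}


def simulate_average(P, q0, N, rng):
    """Draw X_0 from q0, simulate X_1, ..., X_N, and return the fraction of dates with X_t = 1."""
    x = rng.choice(2, p=q0)
    visits = 0
    for _ in range(N):
        x = rng.choice(2, p=P[x])
        visits += x  # x is 0 or 1, so this counts visits to state 1
    return visits / N


rng = np.random.default_rng(1)
N = 20_000
averages = {name: simulate_average(P, stationary[name], N, rng) for name, P in models.items()}

# Unconditional probability of state 1 under the equal-weight convex combination.
mixture_mean = 0.5 * stationary["A"][1] + 0.5 * stationary["B"][1]

print(averages, mixture_mean)
```

Each model taken alone obeys a Law of Large Numbers with a constant limit, but the limit differs across models. Under the convex combination the limit therefore varies across sample points, which is the sense in which the combination is measure-preserving without being ergodic.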
