We call a random vector $X_t$ the state because it completely describes the position of a dynamic system at time $t$ from the perspective of a model builder or an econometrician. We construct a consistent sequence of probability distributions $\operatorname{Pr}_{\ell}$ for a sequence of random vectors
$$
X^{[\ell]} \doteq\left[\begin{array}{c}
X_0 \\
X_1 \\
\vdots \\
X_{\ell}
\end{array}\right]
$$
for all nonnegative integers $\ell$ by specifying the following two elementary components of a Markov process: (i) a probability distribution for $X_0$, and (ii) a time-invariant distribution for $X_{t+1}$ conditional on $X_t$ for $t \geqslant 0$. All other probabilities are functions of these two distributions. By creatively defining the state vector $X_t$, a Markov specification includes many models used in applied research.

# 3.1 Constituents

Assume a state space $\mathcal{X}$ and a transition distribution $P\left(d x^* \mid x\right)$. For example, $\mathcal{X}$ could be $\mathbb{R}^n$ or a subset of $\mathbb{R}^n$. The transition distribution $P$ is a conditional probability measure for each $X_t=x$ in the state space, so it satisfies $\int_{\left\{x^* \in \mathcal{X}\right\}} P\left(d x^* \mid x\right)=1$ for every $x$ in the state space. If in addition we specify a marginal distribution $Q_0$ over $\mathcal{X}$ for the initial state $X_0$, then we have completely specified all joint distributions for the stochastic process $\left\{X_t, t=0,1, \ldots\right\}$.
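
To make the last claim concrete, the joint distribution of $X^{[\ell]}$ can be written as a product of the two constituents (a standard factorization, stated here in the notation above):
$$
\operatorname{Pr}_{\ell}\left(d x_0, d x_1, \ldots, d x_{\ell}\right)=P\left(d x_{\ell} \mid x_{\ell-1}\right) \cdots P\left(d x_1 \mid x_0\right) Q_0\left(d x_0\right) .
$$
Because $P\left(d x_{\ell} \mid x_{\ell-1}\right)$ integrates to one, integrating out $x_{\ell}$ returns $\operatorname{Pr}_{\ell-1}$, which is the sense in which the sequence of distributions is consistent.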

The notation $P\left(d x^* \mid x\right)$ denotes a conditional probability measure; integration is over $x^*$ and conditioning is captured by $x$. Thus, $x^*$ is a possible realization of next period's state and $x$ is a realization of this period's state. The conditional probability measure $P\left(d x^* \mid x\right)$ assigns conditional probabilities to next period's state given that this period's state is $x$. Often, but not always, the conditional distributions have densities with respect to a common measure $\lambda\left(d x^*\right)$ that is used to integrate over states. That lets us use a transition density to represent the conditional probability measure.
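
When such densities exist, one way to write the representation is
$$
P\left(d x^* \mid x\right)=p\left(x^* \mid x\right) \lambda\left(d x^*\right), \qquad \int_{\left\{x^* \in \mathcal{X}\right\}} p\left(x^* \mid x\right) \lambda\left(d x^*\right)=1,
$$
where the symbol $p$ for the transition density is notation introduced here for illustration. With $\mathcal{X}=\mathbb{R}^n$, $\lambda$ is typically Lebesgue measure; with a finite state space, it is typically counting measure.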

Example 3.1.1. A first-order vector autoregression is a Markov process. Here $Q_0(x)$ is a normal distribution with mean $\mu_0$ and covariance matrix $\Sigma_0$ and $P\left(d x^* \mid x\right)$ is a normal distribution with mean $A x$ and covariance matrix $B B^{\prime}$ for a square matrix $A$ and a matrix $B$ with full column rank. These assumptions imply the vector autoregressive (VAR) representation
$$
X_{t+1}=A X_t+B W_{t+1},
$$
for $t \geqslant 0$, where $W_{t+1}$ is a multivariate standard normally distributed random vector that is independent of $X_t$. $\square$
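
The VAR representation can be simulated directly from its two constituents. The following is a minimal sketch, assuming NumPy; the function name `simulate_var1` and the numerical values of $A$, $B$, $\mu_0$, and $\Sigma_0$ are illustrative choices, not taken from the text.

```python
import numpy as np

def simulate_var1(A, B, mu0, Sigma0, T, seed=0):
    """Simulate X_0, ..., X_T from X_{t+1} = A X_t + B W_{t+1}.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    n, k = B.shape
    X = np.empty((T + 1, n))
    # Draw the initial state from Q_0 = N(mu0, Sigma0).
    X[0] = rng.multivariate_normal(mu0, Sigma0)
    for t in range(T):
        # W_{t+1} is standard normal and drawn independently of X_t.
        W = rng.standard_normal(k)
        X[t + 1] = A @ X[t] + B @ W
    return X

# Illustrative (assumed) parameter values.
A = np.array([[0.9, 0.1],
              [0.0, 0.5]])
B = np.array([[1.0, 0.0],
              [0.3, 0.8]])
path = simulate_var1(A, B, mu0=np.zeros(2), Sigma0=np.eye(2), T=200)
```

Each row of `path` is one realization of $X_t$; conditional on `X[t]`, the next row is normal with mean `A @ X[t]` and covariance `B @ B.T`, matching the transition distribution in the example.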

Example 3.1.2. A discrete-state Markov chain consists of a $Q_0$ represented as a row vector and a transition probability $P\left(d x^* \mid x\right)$ represented as a matrix with one row and one column for each possible value of the state $x$. Rows contain vectors of probabilities of next period's state conditioned on a realized value of this period's state.
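
A small numerical illustration of this example may help. The sketch below assumes NumPy and a two-state chain with made-up probabilities; the helper names `marginal` and `path_probability` are hypothetical. It shows how $Q_0$ and $P$ together determine both the marginal distribution of $X_t$ and the probability of a whole path.

```python
import numpy as np

# Illustrative (assumed) two-state chain.
Q0 = np.array([0.5, 0.5])        # row vector: distribution of X_0
P = np.array([[0.9, 0.1],        # row i: distribution of next state given state i
              [0.2, 0.8]])

# Each row of P must be a probability vector.
assert np.allclose(P.sum(axis=1), 1.0)

def marginal(Q0, P, t):
    """Distribution of X_t as a row vector: Q0 P^t."""
    return Q0 @ np.linalg.matrix_power(P, t)

def path_probability(Q0, P, path):
    """Probability of observing the state sequence `path` from date 0 on."""
    prob = Q0[path[0]]
    for i, j in zip(path[:-1], path[1:]):
        prob *= P[i, j]
    return prob

Q1 = marginal(Q0, P, 1)                        # distribution of X_1
p_001 = path_probability(Q0, P, [0, 0, 1])     # 0.5 * 0.9 * 0.1
```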

It is useful to construct an operator by applying a one-step conditional expectation operator to functions of a Markov state. Let $f: \mathcal{X} \rightarrow \mathbb{R}$. For bounded $f$, define:
$$
\mathbb{T} f(x)=E\left[f\left(X_{t+1}\right) \mid X_t=x\right]=\int_{\left\{x^* \in \mathcal{X}\right\}} f\left(x^*\right) P\left(d x^* \mid x\right) .
$$

The Law of Iterated Expectations justifies iterating on $\mathbb{T}$ to form conditional expectations of the function $f$ of the Markov state over longer horizons:
$$
\mathbb{T}^j f(x)=E\left[f\left(X_{t+j}\right) \mid X_t=x\right] .
$$
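
For a finite-state chain, a function $f$ is a vector of values, one per state, and $\mathbb{T}$ reduces to multiplication by the transition matrix. The sketch below, assuming NumPy and an illustrative two-state chain, applies $\mathbb{T}$ repeatedly and compares the result with the $j$-step transition matrix; the names `T` and `T_power` are hypothetical.

```python
import numpy as np

# Illustrative (assumed) transition matrix and test function.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
f = np.array([1.0, -1.0])    # f evaluated at state 0 and state 1

def T(f, P):
    """One-step conditional expectation: (T f)(x) = sum_x* P[x, x*] f(x*)."""
    return P @ f

def T_power(f, P, j):
    """Apply T j times, giving E[f(X_{t+j}) | X_t = x] for each x."""
    for _ in range(j):
        f = T(f, P)
    return f

Tf = T(f, P)                 # [0.9 - 0.1, 0.2 - 0.8]
T3f = T_power(f, P, 3)

# Iterating T agrees with using the 3-step transition matrix directly.
assert np.allclose(T3f, np.linalg.matrix_power(P, 3) @ f)
```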

We can use the operator $\mathbb{T}$ to characterize a Markov process. Indeed, by applying $\mathbb{T}$ to a suitable range of test functions $f$, we can construct a conditional probability measure.

Fact 3.1.3. Start with a conditional expectation operator $\mathbb{T}$ that maps a space of bounded functions into itself. We can use $\mathbb{T}$ to construct a conditional probability measure $P\left(d x^* \mid x\right)$ provided that $\mathbb{T}$ (a) is well defined on the space of bounded functions, (b) preserves the bound, (c) maps nonnegative functions into nonnegative functions, and (d) maps the unit function into the unit function.
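
In the finite-state case the construction can be carried out explicitly: applying $\mathbb{T}$ to the indicator function of a state returns the probabilities of moving to that state. The following sketch, assuming NumPy, treats `T` as a black box built from an illustrative matrix `P_hidden` (a made-up name and made-up values), recovers the transition matrix from indicator test functions, and spot-checks properties (b) through (d) on sample functions.

```python
import numpy as np

# Illustrative (assumed) chain; T is treated as a black-box operator.
P_hidden = np.array([[0.9, 0.1],
                     [0.2, 0.8]])

def T(f):
    """Conditional expectation operator on functions of a two-state chain."""
    return P_hidden @ f

n = 2
indicators = np.eye(n)       # column k is the indicator function of state k

# (T 1_k)(x) is the probability of moving from x to k, i.e. column k of P.
P_recovered = np.column_stack([T(indicators[:, k]) for k in range(n)])

# Spot checks of the properties in the fact, on sample functions only.
f = np.array([3.0, -2.0])
g = np.array([0.0, 5.0])
bound_preserved = np.max(np.abs(T(f))) <= np.max(np.abs(f))   # (b)
nonnegativity_preserved = bool(np.all(T(g) >= 0.0))           # (c)
unit_preserved = np.allclose(T(np.ones(n)), np.ones(n))       # (d)
```

Checking a few sample functions does not prove the properties; here they hold for every $f$ because each row of `P_hidden` is a probability vector.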
