# Gibbs Sampling

Gibbs sampling is a simple and widely applicable Markov chain Monte Carlo (MCMC) algorithm that can be seen as a special case of the Metropolis-Hastings algorithm.

Consider the distribution $p(\mathbf{z})=p(z_1,\ldots,z_M)$ from which we wish to sample, and suppose that we have chosen some initial state for the Markov chain. Each step of the Gibbs sampling procedure replaces the value of one of the variables by a value drawn from the distribution of that variable conditioned on the values of the remaining variables. Thus we replace $z_i$ by a value drawn from the distribution $p(z_i|\mathbf{z}_{\backslash i})$, where $z_i$ denotes the $i$th component of $\mathbf{z}$, and $\mathbf{z}_{\backslash i}$ denotes $z_1,\ldots,z_M$ but with $z_i$ omitted. This procedure is repeated either by cycling through the variables in some particular order or by choosing the variable to be updated at each step at random from some distribution.
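
The relationship to Metropolis-Hastings can be checked with a short calculation. As a sketch, treat the Gibbs update of $z_i$ as a Metropolis-Hastings proposal $q_i(\mathbf{z}^\star|\mathbf{z}) = p(z_i^\star|\mathbf{z}_{\backslash i})$ that leaves the remaining variables unchanged, so that $\mathbf{z}^\star_{\backslash i} = \mathbf{z}_{\backslash i}$. The symbols $q_i$, $\mathbf{z}^\star$ and $A$ are introduced here for illustration. Factorizing $p(\mathbf{z}) = p(z_i|\mathbf{z}_{\backslash i})\,p(\mathbf{z}_{\backslash i})$, the acceptance ratio becomes

$$
A(\mathbf{z}^\star, \mathbf{z}) = \frac{p(\mathbf{z}^\star)\, q_i(\mathbf{z}|\mathbf{z}^\star)}{p(\mathbf{z})\, q_i(\mathbf{z}^\star|\mathbf{z})} = \frac{p(z_i^\star|\mathbf{z}^\star_{\backslash i})\, p(\mathbf{z}^\star_{\backslash i})\, p(z_i|\mathbf{z}^\star_{\backslash i})}{p(z_i|\mathbf{z}_{\backslash i})\, p(\mathbf{z}_{\backslash i})\, p(z_i^\star|\mathbf{z}_{\backslash i})} = 1,
$$

where the last step uses $\mathbf{z}^\star_{\backslash i} = \mathbf{z}_{\backslash i}$. Under this view, every Gibbs proposal is accepted.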

For example, suppose we have a distribution $p(z_1, z_2, z_3)$ over three variables, and at step $\tau$ of the algorithm we have selected values $z_1^{(\tau)}$, $z_2^{(\tau)}$ and $z_3^{(\tau)}$. We first replace $z_1^{(\tau)}$ by a new value $z_1^{(\tau+1)}$ obtained by sampling from the conditional distribution

$$
p(z_1|z_2^{(\tau)}, z_3^{(\tau)}).
$$

Next we replace $z_2^{(\tau)}$ by a value $z_2^{(\tau+1)}$ obtained by sampling from the conditional distribution
$$
p(z_2|z_1^{(\tau+1)}, z_3^{(\tau)})
$$
so that the new value for $z_1$ is used straight away in subsequent sampling steps. Then we update $z_3$ with a sample $z_3^{(\tau+1)}$ drawn from
$$
p(z_3| z_1^{(\tau+1)}, z_2^{(\tau+1)})
$$
and so on, cycling through the three variables in turn.

> ### Gibbs Sampling
> 1. Initialize $\{z_i: i=1,\ldots,M\}$.
> 2. For $\tau = 1,\ldots,T$:
>     - Sample $z_1^{(\tau+1)} \sim p(z_1|z_2^{(\tau)}, z_3^{(\tau)}, \ldots,z_M^{(\tau)})$.
>     - Sample $z_2^{(\tau+1)} \sim p(z_2|z_1^{(\tau+1)}, z_3^{(\tau)}, \ldots,z_M^{(\tau)})$.
>     - $\vdots$
>     - Sample $z_{j}^{(\tau+1)} \sim p(z_j|z_1^{(\tau+1)},\ldots, z_{j-1}^{(\tau+1)},z_{j+1}^{(\tau)},\ldots,z_M^{(\tau)})$.
>     - $\vdots$
>     - Sample $z_M^{(\tau+1)} \sim p(z_M|z_1^{(\tau+1)}, z_2^{(\tau+1)}, \ldots,z_{M-1}^{(\tau+1)})$.
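
The steps above can be sketched in code for a case where the conditionals are known in closed form. The example below is a minimal illustration, not a general implementation: it assumes a zero-mean, unit-variance bivariate Gaussian target with correlation `rho`, for which $p(z_1|z_2)$ is Gaussian with mean $\rho z_2$ and variance $1-\rho^2$. The function names, the choice of target, and the `burn_in` parameter are assumptions made for this sketch.

```python
import math
import random


def gibbs_bivariate_gaussian(rho, num_samples, burn_in=500, seed=0):
    """Gibbs sampler for a zero-mean, unit-variance bivariate Gaussian
    with correlation `rho` (an illustrative target, not from the text)."""
    rng = random.Random(seed)
    # For this target, p(z1 | z2) = N(rho * z2, 1 - rho**2), and symmetrically for z2.
    cond_std = math.sqrt(1.0 - rho ** 2)
    z1, z2 = 0.0, 0.0  # step 1: arbitrary initial state
    samples = []
    for step in range(burn_in + num_samples):
        z1 = rng.gauss(rho * z2, cond_std)  # sample z1 given the current z2
        z2 = rng.gauss(rho * z1, cond_std)  # sample z2 given the *new* z1
        # Discarding early sweeps is an assumption of this sketch,
        # not a step of the algorithm as stated above.
        if step >= burn_in:
            samples.append((z1, z2))
    return samples


def empirical_correlation(samples):
    """Sample correlation coefficient of a list of (z1, z2) pairs."""
    n = len(samples)
    mean1 = sum(a for a, _ in samples) / n
    mean2 = sum(b for _, b in samples) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in samples) / n
    var1 = sum((a - mean1) ** 2 for a, _ in samples) / n
    var2 = sum((b - mean2) ** 2 for _, b in samples) / n
    return cov / math.sqrt(var1 * var2)


samples = gibbs_bivariate_gaussian(rho=0.8, num_samples=20000)
print(round(empirical_correlation(samples), 2))  # should be close to 0.8
```

Note that the update of `z2` uses the freshly sampled `z1`, mirroring the way each step of the algorithm conditions on the most recent values of the other variables.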

To show that this procedure samples from the required distribution, we first note that the distribution $p(\mathbf{z})$ is an invariant of each of the Gibbs sampling steps individually and hence of the whole Markov chain. This follows from the fact that when we sample from $p(z_i|\mathbf{z}_{\backslash i})$, the marginal distribution $p(\mathbf{z}_{\backslash i})$ is clearly invariant because the value of $\mathbf{z}_{\backslash i}$ is unchanged. Also, each step by definition samples from the correct conditional distribution $p(z_i|\mathbf{z}_{\backslash i})$. Because the conditional and marginal distributions together specify the joint distribution, we see that the joint distribution is itself invariant.
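
The invariance argument can also be written out explicitly. As a sketch, write $z_i'$ for the newly sampled value (a symbol introduced here for illustration) and suppose the current state is distributed according to $p(\mathbf{z})$. The distribution of the state after one update of $z_i$ is then

$$
\int p(z_i'|\mathbf{z}_{\backslash i})\, p(z_i, \mathbf{z}_{\backslash i})\,\mathrm{d}z_i = p(z_i'|\mathbf{z}_{\backslash i})\, p(\mathbf{z}_{\backslash i}) = p(z_i', \mathbf{z}_{\backslash i}),
$$

which is the original joint distribution evaluated at the new state. For discrete variables the integral is replaced by a sum.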

The second requirement to be satisfied in order that the Gibbs sampling procedure samples from the correct distribution is that it be ergodic. A sufficient condition for ergodicity is that none of the conditional distributions be anywhere zero. If this is the case, then any point in $\mathbf{z}$ space can be reached from any other point in a finite number of steps involving one update of each of the component variables.
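
To see why zeros in the conditionals can matter, consider a small discrete illustration. The sketch below uses two hypothetical joint distributions over a pair of binary variables: one places all of its mass on $(0,0)$ and $(1,1)$, so some conditional probabilities are zero, and the other is strictly positive. The tables and function names are assumptions made for this example. Because the condition is only sufficient, zeros do not always break ergodicity, but in this particular case they do.

```python
import random

# Hypothetical joint with all mass on (0, 0) and (1, 1): some conditionals are zero.
DEGENERATE = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}
# Hypothetical strictly positive joint: no conditional is zero anywhere.
POSITIVE = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def sample_conditional(joint, index, state, rng):
    """Draw component `index` from its conditional given the other component."""
    weights = []
    for value in (0, 1):
        candidate = list(state)
        candidate[index] = value
        # The conditional is proportional to the joint with the other variable held fixed.
        weights.append(joint[tuple(candidate)])
    return rng.choices((0, 1), weights=weights)[0]


def visited_states(joint, start, num_sweeps, seed=0):
    """Run Gibbs sweeps from `start` and return the set of states seen after each sweep."""
    rng = random.Random(seed)
    state = list(start)
    visited = {tuple(state)}
    for _ in range(num_sweeps):
        for index in range(2):
            state[index] = sample_conditional(joint, index, state, rng)
        visited.add(tuple(state))
    return visited


print(visited_states(DEGENERATE, (0, 0), 1000))     # the chain never leaves its start state
print(len(visited_states(POSITIVE, (0, 0), 1000)))  # the chain can reach every state
```

In the degenerate case, starting at $(0,0)$ forces $z_1 = 0$ given $z_2 = 0$ and vice versa, so the state $(1,1)$ is never reached even though it carries half of the probability mass.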

Because the basic Gibbs sampling technique considers one variable at a time, there are strong dependencies between successive samples. We can hope to improve on the simple Gibbs sampler by adopting an intermediate strategy in which we sample successively from groups of variables rather than individual variables. This is achieved in the *blocking Gibbs* sampling algorithm by choosing blocks of variables, not necessarily disjoint, then sampling jointly from the variables in each block in turn, conditioned on the remaining variables.
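
A blocked update can be sketched for a small discrete model, where sampling jointly from a block amounts to enumerating the block's configurations. The example below assumes a hypothetical joint over three binary variables in which $z_1$ and $z_2$ are strongly coupled, and uses the overlapping blocks $\{z_1, z_2\}$ and $\{z_2, z_3\}$. The table, the block choice, and the function names are illustrative assumptions, and exhaustive enumeration is only practical for small blocks.

```python
import itertools
import random

# Hypothetical joint: z1 and z2 agree with probability 0.9, and p(z3 = 1) = 2/3.
_weights = {
    z: (9.0 if z[0] == z[1] else 1.0) * (2.0 if z[2] == 1 else 1.0)
    for z in itertools.product((0, 1), repeat=3)
}
_total = sum(_weights.values())
JOINT = {z: w / _total for z, w in _weights.items()}


def sample_block(joint, state, block, rng):
    """Jointly resample the variables indexed by `block`, conditioned on the rest."""
    candidates, weights = [], []
    for values in itertools.product((0, 1), repeat=len(block)):
        candidate = list(state)
        for index, value in zip(block, values):
            candidate[index] = value
        candidates.append(tuple(candidate))
        # The block conditional is proportional to the joint with the other variables fixed.
        weights.append(joint[tuple(candidate)])
    return rng.choices(candidates, weights=weights)[0]


def blocked_gibbs(joint, blocks, num_sweeps, seed=0):
    """Cycle through `blocks` in turn, recording the state after each full sweep."""
    rng = random.Random(seed)
    state = max(joint, key=joint.get)  # start from a state with nonzero probability
    samples = []
    for _ in range(num_sweeps):
        for block in blocks:
            state = sample_block(joint, state, block, rng)
        samples.append(state)
    return samples


samples = blocked_gibbs(JOINT, blocks=[(0, 1), (1, 2)], num_sweeps=20000)
agree = sum(z[0] == z[1] for z in samples) / len(samples)
print(round(agree, 2))  # should be close to 0.9
```

Passing single-variable blocks such as `[(0,), (1,), (2,)]` recovers the basic one-variable-at-a-time sampler, which makes it easy to compare the two strategies on the same target.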