# Statistical Ensembles and Liouville's Theorem

## What is an Ensemble?

In **classical mechanics**, we describe the state of a system (like a single particle) by its position and momentum. For a system with many particles, say $N$ particles, the state would be given by $3N$ position coordinates and $3N$ momentum coordinates. This $6N$-dimensional space is called **phase space**. Each point in phase space represents a unique microscopic state of the system.

However, when we deal with macroscopic systems (like a gas in a room), we have an enormous number of particles (on the order of Avogadro's number, $10^{23}$). It's practically impossible to know the exact microscopic state of such a system at any given time. Even if we could, solving the equations of motion for all those particles would be computationally intractable.

This is where the concept of an **ensemble** comes in. An ensemble is a **collection of a large number of identical systems**, all prepared in the same macroscopic way, but each in a different possible microscopic state consistent with the macroscopic conditions.

Imagine you have a box of gas. You know its volume, temperature, and the number of particles. An ensemble would be a vast collection of identical boxes, each with the same volume, temperature, and number of particles, but the individual gas molecules within each box would be in different positions and have different momenta.

The idea is that if we observe a single macroscopic system over a very long time, it will, on average, visit all the microscopic states consistent with its macroscopic properties. This is called the **ergodic hypothesis**. For practical purposes, instead of waiting for one system to explore all its possible states, we consider a large number of identical systems at a single instant in time. The average over this ensemble is then equivalent to the time average of a single system.

> **[Ergodic hypothesis](https://en.wikipedia.org/wiki/Ergodic_hypothesis)**. The ergodic hypothesis, in physics and thermodynamics, essentially states that for a system in thermal equilibrium, the time a system spends in a particular region of its phase space is proportional to the volume of that region. In simpler terms, over a long period, the system will visit all accessible states with equal probability. This allows us to calculate the average behavior of a system by averaging over its states, rather than over time. 
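The equivalence of time and ensemble averages can be illustrated with a minimal sketch for a 1D harmonic oscillator, where the trajectory is known in closed form. The amplitude `A`, frequency `omega`, and the function names are illustrative choices, not part of the text above; the ensemble is assumed to be spread uniformly in phase angle over a single energy shell.

```python
import math

def time_average_q2(amplitude, omega, total_time, steps):
    """Time average of q(t)^2 along one trajectory q(t) = A cos(omega t)."""
    dt = total_time / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt                      # midpoint rule in time
        q = amplitude * math.cos(omega * t)
        total += q * q
    return total / steps

def ensemble_average_q2(amplitude, members):
    """Average of q^2 at one instant over many oscillators that share the
    same energy (same amplitude) but have different phase angles."""
    total = 0.0
    for k in range(members):
        phase = 2.0 * math.pi * (k + 0.5) / members
        q = amplitude * math.cos(phase)
        total += q * q
    return total / members

A, omega = 1.5, 2.0                             # assumed example values
t_avg = time_average_q2(A, omega, total_time=200.0, steps=200_000)
e_avg = ensemble_average_q2(A, members=10_000)
print("time average    :", t_avg)
print("ensemble average:", e_avg)
print("analytic A^2/2  :", A * A / 2)
```

The two averages agree up to a small finite-time correction in the time average. A single oscillator is a very special case (its energy shell is one closed curve), so this is only a consistency check, not evidence for ergodicity in general.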

### Types of Ensembles

The type of ensemble used depends on which macroscopic quantities are kept constant:

* **Microcanonical Ensemble (NVE):** This ensemble represents an isolated system where the **number of particles ($N$), volume ($V$), and total energy ($E$) are fixed**. All systems in this ensemble have exactly the same energy.
* **Canonical Ensemble (NVT):** This ensemble represents a closed system in thermal contact with a heat reservoir. Here, the **number of particles ($N$), volume ($V$), and temperature ($T$) are fixed**. Energy can fluctuate between the system and the reservoir.
* **Grand Canonical Ensemble ($\mu$VT):** This ensemble represents an open system that can exchange both energy and particles with a reservoir. Here, the **chemical potential $\mu$, volume $V$, and temperature $T$ are fixed**. Both energy and particle number can fluctuate.

We will review all these different ensembles in detail a bit later.

---

## Liouville's Theorem

Liouville's theorem states that the **density of points in phase space remains constant along the trajectories of a Hamiltonian system**. Equivalently, the phase-space "volume" occupied by a group of representative points (members of the ensemble, not physical particles in real space) remains unchanged as they evolve in time, so the flow in **phase space** is analogous to that of an **incompressible fluid**. This makes the theorem a cornerstone of statistical mechanics.

**Phase space** is a $6N$-dimensional mathematical space where each point represents a unique **microstate** of the system, defined by all its position and momentum coordinates. As the system evolves over time, this point traces a path, or **trajectory**, in phase space.

Liouville's theorem has two equivalent interpretations:

1.  **Conservation of Phase Space Volume:** If you take a collection of initial microstates that occupy a certain volume in phase space, as these microstates evolve according to Hamilton's equations, the *volume* of the region they occupy in phase space remains constant. The shape of this region might distort significantly, but its total volume does not change.
2.  **Conservation of Phase Space Density:** Consider a "fluid" of representative points in phase space, where the density of this fluid, $\rho(q, p, t)$, represents the probability of finding the system in a particular microstate at a given time $t$. Liouville's theorem states that if you follow any specific point in this fluid (i.e., a specific system's trajectory), the density of points *around that specific point* remains constant. In other words, the fluid is incompressible.

This theorem applies to **Hamiltonian systems**, meaning systems whose dynamics follow from Hamilton's equations, with no dissipative forces like friction. Here we focus on conservative systems, where the total energy is also conserved.

The mathematical statement of Liouville's theorem is often expressed through the **Liouville equation**, which describes the time evolution of the phase space distribution function $\rho(q, p, t)$.

We start by considering the conservation of "probability" (or the number of representative points) in phase space. Imagine a small, fixed volume $dV$ in phase space. The change in the number of points within this volume over time must be due to the net flow of points across its boundaries. This is analogous to the continuity equation in fluid dynamics.

The **continuity equation** in phase space is:

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0$$

Here:
* $\rho(q, p, t)$ is the phase space probability density.
* $\mathbf{v}$ is the "velocity" vector in phase space, whose components are the time derivatives of the generalized coordinates and momenta: $\mathbf{v} = (\dot{q}_1, \ldots, \dot{q}_{3N}, \dot{p}_1, \ldots, \dot{p}_{3N})$.
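* $\nabla \cdot$ is the divergence operator in phase space. Acting on a vector field $\mathbf{u} = (u_{q_1}, \ldots, u_{q_{3N}}, u_{p_1}, \ldots, u_{p_{3N}})$, it gives:

    $$\nabla \cdot \mathbf{u} = \sum_{i=1}^{3N} \left( \frac{\partial u_{q_i}}{\partial q_i} + \frac{\partial u_{p_i}}{\partial p_i} \right)$$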

Expanding the divergence term:

$$\nabla \cdot (\rho \mathbf{v}) = \sum_{i=1}^{3N} \left( \frac{\partial (\rho \dot{q}_i)}{\partial q_i} + \frac{\partial (\rho \dot{p}_i)}{\partial p_i} \right)$$

Using the product rule:

$$\nabla \cdot (\rho \mathbf{v}) = \sum_{i=1}^{3N} \left( \frac{\partial \rho}{\partial q_i} \dot{q}_i + \rho \frac{\partial \dot{q}_i}{\partial q_i} + \frac{\partial \rho}{\partial p_i} \dot{p}_i + \rho \frac{\partial \dot{p}_i}{\partial p_i} \right)$$

Rearranging terms:

$$\nabla \cdot (\rho \mathbf{v}) = \sum_{i=1}^{3N} \left( \dot{q}_i \frac{\partial \rho}{\partial q_i} + \dot{p}_i \frac{\partial \rho}{\partial p_i} \right) + \rho \sum_{i=1}^{3N} \left( \frac{\partial \dot{q}_i}{\partial q_i} + \frac{\partial \dot{p}_i}{\partial p_i} \right)$$

Now, we introduce **Hamilton's equations of motion** for a conservative system, which describe how the generalized coordinates and momenta evolve over time:

$$\dot{q}_i = \frac{\partial H}{\partial p_i}$$
$$\dot{p}_i = -\frac{\partial H}{\partial q_i}$$

where $H(q, p)$ is the Hamiltonian (the total energy) of the system.

Let's look at the second sum in the expanded divergence term:

$$\sum_{i=1}^{3N} \left( \frac{\partial \dot{q}_i}{\partial q_i} + \frac{\partial \dot{p}_i}{\partial p_i} \right) = \sum_{i=1}^{3N} \left( \frac{\partial}{\partial q_i} \left( \frac{\partial H}{\partial p_i} \right) + \frac{\partial}{\partial p_i} \left( -\frac{\partial H}{\partial q_i} \right) \right)$$

Since the order of differentiation for mixed partial derivatives of a well-behaved function like the Hamiltonian doesn't matter (assuming $H$ is sufficiently smooth), we have:

$$\frac{\partial^2 H}{\partial q_i \partial p_i} = \frac{\partial^2 H}{\partial p_i \partial q_i}$$

Therefore, the terms in the sum cancel out:

$$\sum_{i=1}^{3N} \left( \frac{\partial^2 H}{\partial q_i \partial p_i} - \frac{\partial^2 H}{\partial p_i \partial q_i} \right) = 0$$

This means the second term in the expanded divergence, $\rho \sum (\ldots)$, is zero. This is a crucial result, implying that the phase space "fluid" is **incompressible**. Let's review this statement in more detail.

We can conclude that the phase space "fluid" is incompressible because the term $\sum_{i=1}^{3N} \left( \frac{\partial \dot{q}_i}{\partial q_i} + \frac{\partial \dot{p}_i}{\partial p_i} \right) = \nabla \cdot \mathbf{v}$ represents the **divergence of the phase space velocity field**, and for Hamiltonian systems, this divergence is identically zero.

Here's why that implies incompressibility:

In fluid dynamics, the **divergence of a velocity field** ($\nabla \cdot \mathbf{v}$) at a point tells you about the net outflow of fluid from an infinitesimally small volume around that point.
* If $\nabla \cdot \mathbf{v} > 0$, there's a net outflow, meaning the fluid is expanding.
* If $\nabla \cdot \mathbf{v} < 0$, there's a net inflow, meaning the fluid is compressing at that point.
* If $\nabla \cdot \mathbf{v} = 0$, there is no net outflow or inflow; the fluid density at that point is conserved as it moves. This is the definition of **incompressibility**.

In the context of phase space, our "velocity" vector $\mathbf{v}$ has components $(\dot{q}_1, \ldots, \dot{q}_{3N}, \dot{p}_1, \ldots, \dot{p}_{3N})$. The sum $\sum_{i=1}^{3N} \left( \frac{\partial \dot{q}_i}{\partial q_i} + \frac{\partial \dot{p}_i}{\partial p_i} \right)$ is precisely the **divergence of this phase space velocity field**.

As derived using Hamilton's equations:
$$\sum_{i=1}^{3N} \left( \frac{\partial \dot{q}_i}{\partial q_i} + \frac{\partial \dot{p}_i}{\partial p_i} \right) = \sum_{i=1}^{3N} \left( \frac{\partial}{\partial q_i} \left( \frac{\partial H}{\partial p_i} \right) + \frac{\partial}{\partial p_i} \left( -\frac{\partial H}{\partial q_i} \right) \right) = \sum_{i=1}^{3N} \left( \frac{\partial^2 H}{\partial q_i \partial p_i} - \frac{\partial^2 H}{\partial p_i \partial q_i} \right) = 0$$

Since this divergence is zero for Hamiltonian systems, it means that the phase space velocity field has no net sources or sinks. Therefore, if you consider a small "parcel" of phase space points, as it moves along the trajectories defined by Hamilton's equations, its volume in phase space will not change. This is the hallmark of an **incompressible flow**. The phase space "fluid" can stretch and deform, but its volume remains constant, just like an incompressible physical fluid might change shape while its total volume stays the same.
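The vanishing divergence can be checked numerically with a short sketch. The Hamiltonian $H = p^2/2 - \cos q + \tfrac{1}{2} q p$ used here is a hypothetical example chosen so that both mixed derivatives are non-zero, and the friction coefficient `gamma` is likewise an assumed value, added only to show what a non-Hamiltonian flow looks like.

```python
import math

def hamiltonian_velocity(q, p):
    """(dq/dt, dp/dt) from Hamilton's equations for the example
    H = p^2/2 - cos(q) + 0.5*q*p (hypothetical, for illustration)."""
    q_dot = p + 0.5 * q                 #  dH/dp
    p_dot = -math.sin(q) - 0.5 * p      # -dH/dq
    return q_dot, p_dot

def damped_velocity(q, p, gamma=0.3):
    """Same flow with an added friction force -gamma*p (not Hamiltonian)."""
    q_dot, p_dot = hamiltonian_velocity(q, p)
    return q_dot, p_dot - gamma * p

def divergence(velocity, q, p, h=1e-5):
    """Central-difference estimate of d(q_dot)/dq + d(p_dot)/dp."""
    dqdot_dq = (velocity(q + h, p)[0] - velocity(q - h, p)[0]) / (2 * h)
    dpdot_dp = (velocity(q, p + h)[1] - velocity(q, p - h)[1]) / (2 * h)
    return dqdot_dq + dpdot_dp

points = [(0.3, -1.2), (2.0, 0.7), (-1.1, 2.5)]
for q, p in points:
    print(q, p,
          divergence(hamiltonian_velocity, q, p),
          divergence(damped_velocity, q, p))
```

For the Hamiltonian flow the two contributions ($+0.5$ and $-0.5$) cancel at every point, while the damped flow has divergence $-\gamma$ everywhere, so phase space volume shrinks and Liouville's theorem no longer applies.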

Substituting this back into the continuity equation:

$$\frac{\partial \rho}{\partial t} + \sum_{i=1}^{3N} \left( \dot{q}_i \frac{\partial \rho}{\partial q_i} + \dot{p}_i \frac{\partial \rho}{\partial p_i} \right) = 0$$

The left-hand side is precisely the **convective derivative** (or total time derivative) of $\rho$, which accounts for both the explicit dependence of $\rho$ on $t$ and its implicit dependence through $q(t)$ and $p(t)$. It is denoted $d\rho/dt$:

$$\frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \sum_{i=1}^{3N} \left( \frac{\partial \rho}{\partial q_i} \frac{dq_i}{dt} + \frac{\partial \rho}{\partial p_i} \frac{dp_i}{dt} \right)$$

So, the Liouville equation can be compactly written as:

$$\frac{d\rho}{dt} = 0$$

This equation implies that the **phase space probability density $\rho$ is constant along the trajectories of the system**. If you follow a specific trajectory in phase space (i.e., you are "riding along" with a specific phase space point), the density $\rho$ that you observe at your current location will **not change with time**, even though the density at a fixed location in phase space may change.

Imagine a fluid with dye markers. If you are standing still in the fluid, you might see the dye concentration change as different parts of the fluid flow past you ($\frac{\partial \rho}{\partial t} \neq 0$). However, if you attach yourself to one of the dye markers and move with it, you will always be within the same "parcel" of fluid, and the concentration of dye *within that parcel* (and thus around you) would remain constant, assuming the fluid is incompressible ($\frac{d\rho}{dt} = 0$).

In the context of Liouville's theorem, the phase space "fluid" is incompressible. Therefore, an observer moving with a specific system's trajectory in phase space will always perceive the local density of other nearby system trajectories (or representative points) to be constant.
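A minimal numerical sketch of this distinction, assuming a 1D harmonic oscillator with example values `M = 1`, `K = 4` and an initial Gaussian density centred at $(q, p) = (1, 0)$ (all illustrative choices). The density at later times is obtained by tracing each point back along its exact trajectory, and the derivatives are estimated by central differences.

```python
import math

M, K = 1.0, 4.0                      # assumed mass and spring constant
OMEGA = math.sqrt(K / M)

def rho0(q, p):
    """Initial density: a Gaussian blob that is NOT a function of H alone."""
    return math.exp(-((q - 1.0) ** 2 + p ** 2) / 2.0)

def rho(q, p, t):
    """Density at time t: trace (q, p) back to its position at t = 0
    along the exact oscillator trajectory and read off rho0 there."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    q_start = q * c - p / (M * OMEGA) * s
    p_start = p * c + M * OMEGA * q * s
    return rho0(q_start, p_start)

def derivatives(q, p, t, h=1e-4):
    """Return (partial rho / partial t, total d rho / dt) at (q, p, t)."""
    d_t = (rho(q, p, t + h) - rho(q, p, t - h)) / (2 * h)
    d_q = (rho(q + h, p, t) - rho(q - h, p, t)) / (2 * h)
    d_p = (rho(q, p + h, t) - rho(q, p - h, t)) / (2 * h)
    q_dot, p_dot = p / M, -K * q     # Hamilton's equations
    return d_t, d_t + q_dot * d_q + p_dot * d_p

partial_t, total = derivatives(0.8, 0.6, 0.3)
print("partial rho / partial t:", partial_t)
print("d rho / dt along flow  :", total)
```

The partial derivative is clearly non-zero (the standing observer sees the density change), while the total derivative vanishes up to finite-difference error (the co-moving observer does not).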

Liouville's theorem is crucial for understanding statistical ensembles, particularly the microcanonical ensemble where all accessible microstates on a given energy hypersurface are equally probable.


## Liouville's Theorem and Equilibrium Statistical Mechanics

Liouville's theorem states that the **total time derivative** of the probability density function in phase space, $\frac{d\rho}{dt}$, is zero. This is a general result that holds for any Hamiltonian system, whether it is in equilibrium or not. It also means that $\rho$, evaluated along a trajectory, behaves like a constant of motion, because its total time derivative is zero.

In statistical mechanics, a system is said to be in **equilibrium** when its macroscopic properties (like temperature, pressure, energy, volume, etc.) are **time-independent**. This doesn't mean that the individual particles have stopped moving; quite the opposite, they are in constant, dynamic motion. Instead, it means that the *average* behavior of the system, when observed over time, remains constant.

For this macroscopic constancy to hold, the **probability distribution** of the system's microstates must also be time-independent. Recall the Liouville equation:

$$\frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \sum_{i=1}^{3N} \left( \dot{q}_i \frac{\partial \rho}{\partial q_i} + \dot{p}_i \frac{\partial \rho}{\partial p_i} \right) = 0$$

This equation, as we've established, tells us that the phase space density $\rho$ is constant along the trajectories of the system.

Now, for a system to be in **statistical equilibrium**, we require that the probability density function $\rho$ is **stationary**. Stationary means that the density does not explicitly depend on time. In mathematical terms, this means:

$$\frac{\partial \rho}{\partial t} = 0$$

If $\frac{\partial \rho}{\partial t} = 0$, then the Liouville equation simplifies to:

$$\sum_{i=1}^{3N} \left( \dot{q}_i \frac{\partial \rho}{\partial q_i} + \dot{p}_i \frac{\partial \rho}{\partial p_i} \right) = 0$$

This sum can also be written using the **Poisson bracket**:

$$\{\rho, H\} = \sum_{i=1}^{3N} \left( \frac{\partial \rho}{\partial q_i} \frac{\partial H}{\partial p_i} - \frac{\partial \rho}{\partial p_i} \frac{\partial H}{\partial q_i} \right)$$

Since $\dot{q}_i = \frac{\partial H}{\partial p_i}$ and $\dot{p}_i = -\frac{\partial H}{\partial q_i}$, the condition for equilibrium ($\frac{\partial \rho}{\partial t} = 0$) translates to:

$$\{\rho, H\} = 0$$

This condition, $\{\rho, H\} = 0$, means that the probability density function $\rho$ must be a **constant of motion**. In simpler terms, $\rho$ must be a function of only those quantities that are conserved during the system's evolution, such as the total energy $H$.
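As a quick numerical check of this condition, the sketch below evaluates the Poisson bracket by central differences for two densities: one that depends on $(q, p)$ only through $H$, and one that does not. The anharmonic Hamiltonian, the value of `beta`, and the function names are assumptions made for illustration.

```python
import math

def hamiltonian(q, p):
    """Hypothetical anharmonic oscillator: H = p^2/2 + q^2/2 + q^4/4."""
    return 0.5 * p * p + 0.5 * q * q + 0.25 * q ** 4

def poisson_bracket(f, g, q, p, h=1e-5):
    """{f, g} = df/dq * dg/dp - df/dp * dg/dq for one degree of freedom."""
    df_dq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return df_dq * dg_dp - df_dp * dg_dq

def rho_of_H(q, p, beta=0.7):
    """Density that depends on (q, p) only through H: stationary."""
    return math.exp(-beta * hamiltonian(q, p))

def rho_blob(q, p):
    """Off-centre Gaussian blob: not a function of H, so not stationary."""
    return math.exp(-((q - 1.0) ** 2 + p ** 2) / 2.0)

bracket_stationary = poisson_bracket(rho_of_H, hamiltonian, 0.8, 0.6)
bracket_blob = poisson_bracket(rho_blob, hamiltonian, 0.8, 0.6)
print("{rho(H), H}  =", bracket_stationary)
print("{rho_blob, H} =", bracket_blob)
```

The first bracket vanishes up to numerical error, while the second does not, so the blob would start to move and deform as soon as the dynamics are switched on.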





To consolidate these results, we now restate the theorem in the language of ensembles. Liouville's theorem describes the evolution of the phase space density of an ensemble over time.

Consider a large number of identical systems, each represented by a point in phase space. As time evolves, these points move through phase space according to Hamilton's equations of motion. Liouville's theorem states that the **density of points in phase space remains constant along any trajectory**. In simpler terms, if you take a small "blob" of phase space points, as these points evolve in time, the volume of that blob remains constant, even though its shape might distort.

Let $\rho(\mathbf{q}, \mathbf{p}, t)$ be the **phase space density function**, where $\mathbf{q} = (q_1, ..., q_{3N})$ are the generalized coordinates and $\mathbf{p} = (p_1, ..., p_{3N})$ are the generalized momenta. The density $\rho$ is defined such that $\rho(\mathbf{q}, \mathbf{p}, t) d\mathbf{q}d\mathbf{p}$ gives the number of systems in the ensemble whose state lies within the infinitesimal phase space volume $d\mathbf{q}d\mathbf{p}$ at time $t$.

The total number of systems in the ensemble, $\mathcal{N}$, is constant:
$$\mathcal{N} = \int \rho(\mathbf{q}, \mathbf{p}, t) d\mathbf{q}d\mathbf{p} = \text{constant}$$

For the density to be conserved along the flow, the substantial derivative (or material derivative) of $\rho$ must be zero:

$$\frac{D\rho}{Dt} = \frac{\partial\rho}{\partial t} + \sum_{i=1}^{3N} \left( \frac{\partial\rho}{\partial q_i}\dot{q}_i + \frac{\partial\rho}{\partial p_i}\dot{p}_i \right) = 0$$

This equation states that the "probability fluid" in phase space flows incompressibly: the density seen by an observer moving with the flow does not change.

Now, let's use Hamilton's equations of motion:
$$\dot{q}_i = \frac{\partial H}{\partial p_i}$$
$$\dot{p}_i = -\frac{\partial H}{\partial q_i}$$
where $H(\mathbf{q}, \mathbf{p})$ is the Hamiltonian of the system.

Substitute these into the substantial derivative equation:

$$\frac{\partial\rho}{\partial t} + \sum_{i=1}^{3N} \left( \frac{\partial\rho}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial\rho}{\partial p_i}\frac{\partial H}{\partial q_i} \right) = 0$$

The summation term is the **Poisson bracket** of $\rho$ and $H$, denoted as $\{\rho, H\}$:

$$\frac{\partial\rho}{\partial t} + \{\rho, H\} = 0$$

This is **Liouville's equation**.

**Derivation of Liouville's Theorem (Alternative Perspective):**

Consider a small volume element $d\Gamma = d\mathbf{q}d\mathbf{p}$ in phase space. The number of points in this volume at time $t$ is $\rho(\mathbf{q}, \mathbf{p}, t) d\Gamma$. As the system evolves, the points that were in $d\Gamma$ at time $t$ will move to a new volume element $d\Gamma'$ at time $t + \Delta t$.

For the number of points to be conserved, we must have:
$$\rho(\mathbf{q}, \mathbf{p}, t) d\Gamma = \rho(\mathbf{q} + \dot{\mathbf{q}}\Delta t, \mathbf{p} + \dot{\mathbf{p}}\Delta t, t + \Delta t) d\Gamma'$$

The change in volume element $d\Gamma'$ can be related to $d\Gamma$ using the Jacobian of the transformation from $(\mathbf{q}, \mathbf{p})$ to $(\mathbf{q}', \mathbf{p}')$.
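For a Hamiltonian system, the Jacobian of the transformation from $(\mathbf{q}, \mathbf{p})$ to $(\mathbf{q}(t+\Delta t), \mathbf{p}(t+\Delta t))$ is 1, so $d\Gamma' = d\Gamma$ and the phase space volume is conserved. Combined with the conservation of points above, this gives $\rho(\mathbf{q}', \mathbf{p}', t + \Delta t) = \rho(\mathbf{q}, \mathbf{p}, t)$, which is Liouville's theorem. (Volume conservation is a key ingredient of **Poincaré's recurrence theorem**, but the recurrence theorem itself is a separate statement: a bounded system returns arbitrarily close to its initial state.)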

Let's illustrate with a simpler analogue: imagine an incompressible fluid flowing. The density of the fluid at any point might change if you're looking at a fixed point in space, but if you follow a "particle" of fluid, its density remains constant.
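As a sketch of why the Jacobian equals 1, one can expand the map to first order in $\Delta t$. The matrix $M$ below is notation introduced here for illustration: it collects the derivatives of $(\dot{\mathbf{q}}, \dot{\mathbf{p}})$ with respect to $(\mathbf{q}, \mathbf{p})$.

$$q_i' = q_i + \dot{q}_i \Delta t, \qquad p_i' = p_i + \dot{p}_i \Delta t$$

$$J = \det\left( \frac{\partial(\mathbf{q}', \mathbf{p}')}{\partial(\mathbf{q}, \mathbf{p})} \right) = \det\left( I + M \Delta t \right) = 1 + \Delta t \, \mathrm{tr}\, M + O(\Delta t^2)$$

$$\mathrm{tr}\, M = \sum_{i=1}^{3N} \left( \frac{\partial \dot{q}_i}{\partial q_i} + \frac{\partial \dot{p}_i}{\partial p_i} \right) = \sum_{i=1}^{3N} \left( \frac{\partial^2 H}{\partial q_i \partial p_i} - \frac{\partial^2 H}{\partial p_i \partial q_i} \right) = 0$$

Hence $J = 1 + O(\Delta t^2)$, so $dJ/dt = 0$ at every instant and the volume element is preserved over finite times as well.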

**Consequence of Liouville's Theorem:**

For a system in thermal equilibrium, the phase space density $\rho$ must be time-independent, i.e., $\partial\rho/\partial t = 0$.
In this case, Liouville's equation simplifies to:

$$\{\rho, H\} = 0$$

This implies that $\rho$ must be a function of conserved quantities (integrals of motion) only. The simplest and most important choice is a function of the Hamiltonian alone, $\rho = \rho(H)$. This is a crucial result for defining the equilibrium distributions for different ensembles. For instance, in the canonical ensemble, $\rho \propto e^{-\beta H}$.

---

## Ensemble Average

Since we cannot know the exact microscopic state of a macroscopic system, we are interested in the average values of macroscopic observables. The **ensemble average** of a physical quantity is the average of that quantity over all the systems in the ensemble, weighted by their phase space density.

Let $A(\mathbf{q}, \mathbf{p})$ be a physical observable (a function of the phase space coordinates). The ensemble average of $A$, denoted by $\langle A \rangle$, is given by:

$$\langle A \rangle = \frac{\int A(\mathbf{q}, \mathbf{p}) \rho(\mathbf{q}, \mathbf{p}, t) d\mathbf{q}d\mathbf{p}}{\int \rho(\mathbf{q}, \mathbf{p}, t) d\mathbf{q}d\mathbf{p}}$$

If $\rho$ is normalized such that $\int \rho(\mathbf{q}, \mathbf{p}, t) d\mathbf{q}d\mathbf{p} = 1$, then:

$$\langle A \rangle = \int A(\mathbf{q}, \mathbf{p}) \rho(\mathbf{q}, \mathbf{p}, t) d\mathbf{q}d\mathbf{p}$$

In statistical mechanics, we typically deal with equilibrium ensembles, where $\rho$ is time-independent.

**Example: Ensemble Average of Energy in the Canonical Ensemble**

In the canonical ensemble, the probability density function is given by:
$$\rho(\mathbf{q}, \mathbf{p}) = \frac{1}{Z} e^{-\beta H(\mathbf{q}, \mathbf{p})}$$
where $Z$ is the **canonical partition function** (a normalization constant) and $\beta = 1/(k_B T)$ ($k_B$ is Boltzmann's constant, $T$ is temperature).

The average energy $\langle E \rangle$ is then:
$$\langle E \rangle = \frac{\int H(\mathbf{q}, \mathbf{p}) e^{-\beta H(\mathbf{q}, \mathbf{p})} d\mathbf{q}d\mathbf{p}}{\int e^{-\beta H(\mathbf{q}, \mathbf{p})} d\mathbf{q}d\mathbf{p}}$$

We can also express this in terms of the partition function:
$$Z = \int e^{-\beta H(\mathbf{q}, \mathbf{p})} d\mathbf{q}d\mathbf{p}$$
Notice that:
$$\frac{\partial Z}{\partial \beta} = \int (-H) e^{-\beta H} d\mathbf{q}d\mathbf{p} = - \int H e^{-\beta H} d\mathbf{q}d\mathbf{p}$$
So,
$$\langle E \rangle = - \frac{1}{Z} \frac{\partial Z}{\partial \beta} = - \frac{\partial \ln Z}{\partial \beta}$$
This is a fundamental relationship in statistical mechanics, connecting the macroscopic observable (average energy) to the partition function, which encapsulates all the microscopic information.
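The relation can be checked numerically with a minimal sketch for a 1D harmonic oscillator, where the exact answers are $Z = 2\pi/(\beta\omega)$ and $\langle E \rangle = 1/\beta$. The parameter values, grid size, and function names are illustrative assumptions, and constant prefactors of $Z$ are ignored because they drop out of $\langle E \rangle$.

```python
import math

def hamiltonian(q, p, m=1.0, k=1.0):
    return p * p / (2 * m) + 0.5 * k * q * q

def boltzmann_integral(weight, beta, half_width=10.0, n=400):
    """Midpoint-rule estimate of the double integral over (q, p) of
    weight(H) * exp(-beta * H) on a square grid."""
    step = 2 * half_width / n
    total = 0.0
    for i in range(n):
        q = -half_width + (i + 0.5) * step
        for j in range(n):
            p = -half_width + (j + 0.5) * step
            energy = hamiltonian(q, p)
            total += weight(energy) * math.exp(-beta * energy)
    return total * step * step

def partition_function(beta):
    return boltzmann_integral(lambda energy: 1.0, beta)

def mean_energy_direct(beta):
    """<E> as the Boltzmann-weighted average of H."""
    return boltzmann_integral(lambda energy: energy, beta) / partition_function(beta)

def mean_energy_from_ln_z(beta, d_beta=1e-4):
    """<E> = -d(ln Z)/d(beta), estimated by a central difference."""
    upper = math.log(partition_function(beta + d_beta))
    lower = math.log(partition_function(beta - d_beta))
    return -(upper - lower) / (2 * d_beta)

beta = 2.0
z_value = partition_function(beta)
e_direct = mean_energy_direct(beta)
e_from_ln_z = mean_energy_from_ln_z(beta)
print("Z          :", z_value, "(exact:", 2 * math.pi / beta, ")")
print("<E> direct :", e_direct)
print("<E> -dlnZ  :", e_from_ln_z)
```

Both routes give $\langle E \rangle = 1/\beta = k_B T$ for this oscillator, which is the equipartition result for two quadratic degrees of freedom.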

---

## Simple Examples to Illustrate Theoretical Material

### Example 1: Liouville's Theorem - Simple Harmonic Oscillator

Consider a 1D simple harmonic oscillator with Hamiltonian:
$$H(q, p) = \frac{p^2}{2m} + \frac{1}{2}kq^2$$
The phase space is 2-dimensional ($q, p$).

Hamilton's equations are:
$$\dot{q} = \frac{\partial H}{\partial p} = \frac{p}{m}$$
$$\dot{p} = -\frac{\partial H}{\partial q} = -kq$$
The trajectories in phase space are ellipses.

Let's assume an initial distribution $\rho(q, p, 0)$. According to Liouville's theorem, the density $\rho$ will remain constant along these elliptical trajectories.
Suppose you have a cluster of points in phase space representing an ensemble of oscillators. As they evolve, this cluster will deform (stretch and rotate along the ellipses), but its area (volume in 2D phase space) will remain constant.

**Visualizing Liouville's Theorem:**
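Imagine drawing a small circle in the $(q,p)$ phase space at $t=0$. As time progresses, each point on that circle follows its own elliptical trajectory, and the circle deforms into an ellipse. Because the oscillator's flow is linear and periodic, the shape does not keep elongating: it oscillates and returns to the original circle after one period $2\pi\sqrt{m/k}$. (In nonlinear systems, by contrast, a blob can stretch into ever thinner filaments.) In either case, the **area** enclosed by this evolving shape remains precisely the same. This is the essence of Liouville's theorem for Hamiltonian systems.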
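This picture can be turned into a short numerical sketch: approximate the circle by a polygon, move every vertex with the exact oscillator solution, and measure the enclosed area with the shoelace formula. The values `M = 1`, `K = 4`, the circle's centre and radius, and the sample times are all assumed for illustration.

```python
import math

M, K = 1.0, 4.0                      # assumed mass and spring constant
OMEGA = math.sqrt(K / M)

def flow(q0, p0, t):
    """Exact solution of Hamilton's equations for the oscillator."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (q0 * c + p0 / (M * OMEGA) * s,
            p0 * c - M * OMEGA * q0 * s)

def polygon_area(points):
    """Shoelace formula for the area enclosed by a closed polygon."""
    twice_area = 0.0
    for (q1, p1), (q2, p2) in zip(points, points[1:] + points[:1]):
        twice_area += q1 * p2 - q2 * p1
    return abs(twice_area) / 2

def circle(center_q, center_p, radius, n=360):
    """Vertices of a regular n-gon approximating a circle."""
    return [(center_q + radius * math.cos(2 * math.pi * k / n),
             center_p + radius * math.sin(2 * math.pi * k / n))
            for k in range(n)]

blob = circle(1.0, 0.0, 0.2)
times = [0.0, 0.4, 1.0, 2 * math.pi / OMEGA]
areas = []
for t in times:
    evolved = [flow(q, p, t) for q, p in blob]
    areas.append(polygon_area(evolved))
    print(f"t = {t:.3f}  area = {areas[-1]:.12f}")
```

The printed areas agree to floating-point precision even though the blob changes shape between the sample times. For this linear flow the polygon stays a polygon, so no extra discretization error enters; a nonlinear system would need a finer boundary and a numerical integrator.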

### Example 2: Ensemble Average - A System of Two-State Particles

Consider a system of $N$ identical, non-interacting particles. Each particle can be in one of two energy states:
* State 0: Energy $E_0 = 0$
* State 1: Energy $E_1 = \epsilon$ (where $\epsilon > 0$)

This is a simplified system, so we won't use continuous phase space integrals. Instead, we'll use a discrete sum over possible microstates.

**Canonical Ensemble:**
Let the system be in thermal contact with a reservoir at temperature $T$. We are in the canonical ensemble.
The probability of a single particle being in a state with energy $E_i$ is proportional to $e^{-\beta E_i}$.

For a single particle:
* Probability of being in State 0 ($P_0$): $P_0 = \frac{e^{-\beta E_0}}{e^{-\beta E_0} + e^{-\beta E_1}} = \frac{e^0}{e^0 + e^{-\beta\epsilon}} = \frac{1}{1 + e^{-\beta\epsilon}}$
* Probability of being in State 1 ($P_1$): $P_1 = \frac{e^{-\beta E_1}}{e^{-\beta E_0} + e^{-\beta E_1}} = \frac{e^{-\beta\epsilon}}{1 + e^{-\beta\epsilon}}$

The **partition function for a single particle** is $Z_1 = e^{-\beta E_0} + e^{-\beta E_1} = 1 + e^{-\beta\epsilon}$.

The **ensemble average energy for a single particle** $\langle E \rangle_1$:
$$\langle E \rangle_1 = E_0 P_0 + E_1 P_1$$
$$\langle E \rangle_1 = (0) \frac{1}{1 + e^{-\beta\epsilon}} + (\epsilon) \frac{e^{-\beta\epsilon}}{1 + e^{-\beta\epsilon}} = \frac{\epsilon e^{-\beta\epsilon}}{1 + e^{-\beta\epsilon}}$$

Alternatively, using the partition function method:
$$\langle E \rangle_1 = - \frac{\partial \ln Z_1}{\partial \beta}$$
$$\ln Z_1 = \ln(1 + e^{-\beta\epsilon})$$
$$\frac{\partial \ln Z_1}{\partial \beta} = \frac{1}{1 + e^{-\beta\epsilon}} \cdot (-\epsilon e^{-\beta\epsilon}) = -\frac{\epsilon e^{-\beta\epsilon}}{1 + e^{-\beta\epsilon}}$$
So,
$$\langle E \rangle_1 = - \left( -\frac{\epsilon e^{-\beta\epsilon}}{1 + e^{-\beta\epsilon}} \right) = \frac{\epsilon e^{-\beta\epsilon}}{1 + e^{-\beta\epsilon}}$$
The results match, illustrating the power of the partition function approach.

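For a system of $N$ non-interacting particles that can be told apart (for example, because each one is localized at its own site), the total partition function factorizes as $Z_N = (Z_1)^N$. The total average energy is then $\langle E \rangle_N = -\partial \ln Z_N / \partial \beta = N \langle E \rangle_1$.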
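The factorization can be verified by brute force for a small system. The sketch below sums over all $2^N$ microstates explicitly, assuming distinguishable particles; the values of `beta`, `eps`, and `n` are arbitrary examples.

```python
import math
from itertools import product

def mean_energy_single(beta, eps):
    """Closed-form <E> for one two-state particle."""
    return eps * math.exp(-beta * eps) / (1 + math.exp(-beta * eps))

def brute_force(n, beta, eps):
    """Enumerate all 2^n microstates of n distinguishable two-state
    particles; return (<E>_N, Z_N) from explicit Boltzmann sums."""
    z = 0.0
    weighted_energy = 0.0
    for state in product((0, 1), repeat=n):   # each entry: state 0 or 1
        energy = eps * sum(state)
        weight = math.exp(-beta * energy)
        z += weight
        weighted_energy += energy * weight
    return weighted_energy / z, z

beta, eps, n = 1.3, 0.8, 6                    # assumed example values
e_n, z_n = brute_force(n, beta, eps)
z_1 = 1 + math.exp(-beta * eps)
print("<E>_N brute force:", e_n)
print("N * <E>_1        :", n * mean_energy_single(beta, eps))
print("Z_N brute force  :", z_n)
print("(Z_1)^N          :", z_1 ** n)
```

The enumeration cost grows as $2^N$, which is exactly why the factorized form is useful for anything beyond a handful of particles.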

Ensembles provide a powerful framework for connecting the microscopic properties of a system to its macroscopic thermodynamic behavior. Liouville's theorem ensures the consistency of this approach by describing the conservation of phase space density, and ensemble averages allow us to calculate measurable quantities from the microscopic probability distributions.