# 09 - Sampling and Ergodicity

**Overview** 

This notebook guides you through the problem of sampling phase space using either a stochastic process (Monte Carlo) or deterministic dynamics (Molecular Dynamics).

In [None]:
# @title Modules Setup { display-mode: "form" }
import numpy as np
import matplotlib.pyplot as plt
! pip install -q plotly > /dev/null
import plotly.graph_objects as go


In [None]:
#@title Utilities { display-mode: "form" }
def rational_theta(Lx, Ly, p=1, q=2):
    # Direction so that tan(theta) = (p*Ly)/(q*Lx)
    th = np.arctan2(p*Ly, q*Lx)
    return float(th)

def _rotate(vx, vy, delta):
    c, s = np.cos(delta), np.sin(delta)
    return c*vx - s*vy, s*vx + c*vy

def step_finite(x, y, vx, vy, Lx, Ly, noise, dt):
    x_new = x + vx*dt
    y_new = y + vy*dt
    hit = False

    # vertical walls
    if x_new < 0:
        x_new = -x_new
        vx = -vx
        hit = True
    elif x_new > Lx:
        x_new = 2*Lx - x_new
        vx = -vx
        hit = True

    # horizontal walls
    if y_new < 0:
        y_new = -y_new
        vy = -vy
        hit = True
    elif y_new > Ly:
        y_new = 2*Ly - y_new
        vy = -vy
        hit = True

    # apply one random tilt if any wall was hit
    if hit and noise > 0.0:
        delta = np.random.normal(0.0, noise)     # radians
        vx, vy = _rotate(vx, vy, delta)

        # If the tilt points outward at a wall or corner, nudge the position
        # back inside the cell by a hair:
        x_new = min(max(x_new, 0.0 + 1e-12), Lx - 1e-12)
        y_new = min(max(y_new, 0.0 + 1e-12), Ly - 1e-12)
        
    return x_new, y_new, vx, vy

# property: indicator of a disk centered in the cell
def indicator_disk(xs, ys, Lx, Ly, radius):
    cx, cy = Lx/2.0, Ly/2.0
    r2 = (xs - cx)**2 + (ys - cy)**2
    return (r2 <= radius**2).astype(float)

def running_mean(x):
    c = np.cumsum(x, dtype=float)
    n = np.arange(1, len(x)+1)
    return c / n

def simulate_md(Lx=1.0, Ly=1.0, x0=0.13, y0=0.29, speed=1.0,
                dt=0.01, nsteps=10000, p=1, q=2, noise=0.0):
    theta = rational_theta(Lx, Ly, p, q)
    vx, vy = speed*np.cos(theta), speed*np.sin(theta)
    x, y = x0, y0
    total_time = 0.0

    # Store the position at every time step (the starting point is sample 0)
    xs = [x]; ys = [y]; ts = [total_time]

    # Advance with a fixed time step dt; step_finite handles the wall reflections
    # and, if noise > 0, the random tilt applied after a wall hit.
    for _ in range(nsteps):
        x, y, vx, vy = step_finite(x, y, vx, vy, Lx, Ly, noise, dt)
        total_time += dt
        xs.append(x); ys.append(y); ts.append(total_time)

    return np.array(xs), np.array(ys), np.array(ts)


def simulate_mc(Lx=2.0, Ly=1.0, x0=0.13, y0=0.29, speed=1.0, dt=0.01, nsteps=10000):
    """
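    Monte Carlo-like random-walk simulation in a 2D cell with reflecting walls.
    Each step uses two random numbers (Δx, Δy) drawn uniformly from
    [-step_size, step_size], where step_size = speed * dt.

    Returns:
        xs, ys: positions after each step (starting point included)
        ts: accumulated step_size, used as the horizontal axis of the running average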
    """
    # initial position (can randomize if desired)
    x, y = x0, y0
    step_size = speed * dt
    total_step = 0.0
    xs, ys, ts = [x], [y], [total_step]

    for _ in range(nsteps):
        # two random numbers → random displacement
        dx = np.random.uniform(-step_size, step_size)
        dy = np.random.uniform(-step_size, step_size)

        x_new = x + dx
        y_new = y + dy

        # reflect on walls
        if x_new < 0: x_new = -x_new
        elif x_new > Lx: x_new = 2*Lx - x_new

        if y_new < 0: y_new = -y_new
        elif y_new > Ly: y_new = 2*Ly - y_new

        x, y = x_new, y_new
        total_step += step_size
        xs.append(x); ys.append(y); ts.append(total_step)

    return np.array(xs), np.array(ys), np.array(ts)

**Problem** 

Let's consider a molecule constrained to move on a two-dimensional surface. This could be a molecule adsorbed on the surface of a material, confined in a slit pore, or embedded in a lipid bilayer membrane. There is a 'reactive' patch on the substrate that affects a property $A$ of the molecule, for example by making the molecule emit light or through a catalytic process that generates some chemical products. We want to characterize the average value of the property, $\left<A\right>$, assuming that the molecule-substrate interaction is uniform in space.

**Model**

We model the system as a closed system in the microcanonical (NVE) ensemble. Interactions with the boundaries of the cell are treated as hard-wall reflections, which preserve the total kinetic energy of the system. Since the potential energy is uniform, there are no forces inside the cell and the molecule moves as a free particle between collisions. At equilibrium, the properties of the system should not depend on the starting point of the simulation. Moreover, if the system is ergodic, the ensemble average of a property is equivalent to its time average over the system's dynamics.

>Do you expect the simple system of one molecule behaving like a billiard ball to be ergodic?
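
The hard-wall reflection described in the model can be illustrated with a minimal sketch. The helper `reflect_1d` and the numbers below are hypothetical and only mirror the logic of one coordinate in `step_finite`: the position is folded back into the cell and the corresponding velocity component changes sign, so the speed (and hence the kinetic energy) is unchanged.

```python
import numpy as np

def reflect_1d(pos, vel, length):
    """Fold a position back into [0, length]; flip the velocity on a wall hit."""
    if pos < 0.0:
        return -pos, -vel
    if pos > length:
        return 2.0 * length - pos, -vel
    return pos, vel

# Hypothetical state right after a step carried the particle past the wall at x = 1
vx_in, vy_in = 0.8, -0.6
x, vx = reflect_1d(1.03, vx_in, 1.0)   # crosses the wall, gets reflected
y, vy = reflect_1d(0.40, vy_in, 1.0)   # stays inside, unchanged

speed_before = np.hypot(vx_in, vy_in)
speed_after = np.hypot(vx, vy)
print(x, vx, speed_before, speed_after)
```

Only the sign of one velocity component changes, which is why this kind of wall is compatible with the NVE ensemble.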

Since the energy of the molecule is the same at any point in space, every position is equally likely, and the macroscopic (ensemble-averaged) property is just the integral of the property over the x and y coordinates, normalized by the area of the cell. To simplify the analysis we assume that $A(x,y)\equiv A_R(x,y)$ is piecewise constant in space, with a value of 1 inside a circle of radius $R$ centered in the cell and a value of 0 everywhere else.
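
Written out, and assuming the cell spans $[0, L_x]\times[0, L_y]$ with the disk centered at $(L_x/2, L_y/2)$ as in the `indicator_disk` utility, the property and its ensemble average for a uniform distribution $p(x,y) = 1/(L_x L_y)$ can be expressed as:

$$A_R(x,y) = \begin{cases} 1 & \text{if } \left(x - \frac{L_x}{2}\right)^2 + \left(y - \frac{L_y}{2}\right)^2 \le R^2 \\ 0 & \text{otherwise} \end{cases}
\qquad
\left<A\right> = \frac{1}{L_x L_y}\int_0^{L_y}\int_0^{L_x} A_R(x,y)\, dx\, dy$$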

**Questions**

Before you run any simulation, answer the following question(s):

1. Given the above definition of $A_R(x,y)$, what do you expect to be the value of $\left<A\right>$ as a function of the parameter $R$? 

In the following cell you can run two different simulations to compute the ensemble average of $A$ numerically. The first approach moves the molecule using random steps (a stochastic method, also known as the Monte Carlo or MC approach). The second approach, Molecular Dynamics (MD), evolves the molecule according to Newton's laws (Euler integrator) and approximates the ensemble average with a time average over the trajectory.

$$\int \int A(x,y)\, p(x,y)\, dx\, dy \approx \lim_{t^{\mathrm{total}}\rightarrow\infty}\frac{1}{t^{\mathrm{total}}}\int_0^{t^{\mathrm{total}}} A(x(t),y(t))\, dt$$
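
In practice the time integral above is evaluated on a trajectory sampled with a fixed time step, where it reduces to a plain mean over the samples. The sketch below illustrates this on a deliberately simpler, hypothetical case: a particle bouncing in a one-dimensional box, with an observable that is 1 in the left half of the box and 0 elsewhere. The variable names and parameter values are assumptions made for illustration only and are not part of the notebook's utilities.

```python
import numpy as np

L, v, x0 = 1.0, 1.0, 0.13       # hypothetical box length, speed, starting point
dt, nsteps = 0.01, 100_000      # fixed time step and number of samples

# Free flight with reflections at 0 and L: fold the unbounded motion back into the box
t = np.arange(nsteps) * dt
u = (x0 + v * t) % (2.0 * L)
x = L - np.abs(u - L)

# Hypothetical observable: 1 in the left half of the box, 0 elsewhere
a = (x < 0.5 * L).astype(float)

# (1/T) * sum_k A(t_k) * dt is identical to the mean of the samples
total_time = nsteps * dt
riemann = np.sum(a * dt) / total_time
estimate = a.mean()
print(riemann, estimate)
```

For this one-dimensional motion the estimate should approach 0.5, the fraction of the box covered by the left half. The `running_mean` utility applies the same idea cumulatively along the trajectory.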

Run the simulation, change the parameters, and run the simulation again as many times as needed to answer the following question(s):

2. Consider the MC approach. What are the numerical parameters of the simulation? Check the convergence of the results with respect to these parameter(s). Instead of moving the molecule in steps, can you think of a smarter approach that works well for this simple system?
3. Consider the MD approach without any noise. You can change the random seed to start from different initial conditions. Are the results independent of the initial conditions? Is the system ergodic?
4. Now consider the MD approach, but add noise to the interaction of the molecule with the walls. This noise can be interpreted as an interaction with the atoms of the walls that only affects the direction of the particle, not its energy (so the system is still in the NVE ensemble). Do the results depend on the initial conditions? Is the system ergodic in this case?
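
The claim that the wall noise changes only the direction of the particle, not its energy, can be checked with a minimal sketch. It assumes the noise is applied as a rotation of the velocity by a random angle, as the `_rotate` utility does; the function name `rotate`, the noise amplitude, and the starting velocity below are illustrative choices.

```python
import numpy as np

def rotate(vx, vy, delta):
    """Rotate the velocity vector by an angle delta (radians)."""
    c, s = np.cos(delta), np.sin(delta)
    return c * vx - s * vy, s * vx + c * vy

rng = np.random.default_rng(0)
noise = 0.1                      # hypothetical standard deviation of the tilt (radians)
vx, vy = 0.6, 0.8                # unit speed

speeds = []
for _ in range(1000):            # 1000 successive random tilts
    vx, vy = rotate(vx, vy, rng.normal(0.0, noise))
    speeds.append(np.hypot(vx, vy))

max_drift = max(abs(s - 1.0) for s in speeds)
print(max_drift)
```

Because a rotation preserves the length of a vector, the speed drifts only by floating-point round-off, so the kinetic energy stays constant while the direction performs a random walk.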

In [None]:
# @title Simulation Parameters { display-mode: "form" }
Lx, Ly = 1.0, 1.0 # square unit box
radius = 0.15 # @param {type:"number"}
simulation = "mc" # @param ["md", "mc"]
step_size = 0.05 # @param {type:"number"}
nsteps = 5000 # @param {type:"integer"}
skip_step = 10 # @param {type:"integer"}
noise = 0.0 # @param {type:"number"}
random_seed = 1 # @param {type:"integer"}
rng = np.random.default_rng(random_seed)
np.random.seed(random_seed)  # the utilities draw from NumPy's global generator

if simulation == "md":
    p = rng.integers(1,10)
    q = rng.integers(1,10)
    x0 = rng.uniform(0, Lx)
    y0 = rng.uniform(0, Ly)
    xs, ys, ts = simulate_md(Lx, Ly, x0=x0, y0=y0, dt=step_size, nsteps=nsteps, p=p, q=q, noise=noise)
else:
    x0 = rng.uniform(0, Lx)
    y0 = rng.uniform(0, Ly)
    xs, ys, ts = simulate_mc(Lx, Ly, x0, y0, dt=step_size, nsteps=nsteps)

a = radius * min(Lx, Ly)           # disk radius, given as a fraction of the shortest side
s_t = indicator_disk(xs, ys, Lx, Ly, a)       # 0/1 values along the trajectory
s_bar = running_mean(s_t)                      # running average

# Reference: if the trajectory samples positions uniformly,
# the expected long-time average equals the area fraction:
area_fraction = np.pi * a**2 / (Lx * Ly)

# --- make frames ---
frames = []
# draw one frame every skip_step steps to keep the playback light
for k in range(2, len(xs), max(1, skip_step)):
    frames.append(go.Frame(
        data=[
            # path + ball
            go.Scatter(x=xs[:k], y=ys[:k], xaxis='x1', yaxis='y1',
                       mode='lines', line=dict(width=2, color='royalblue')),
            go.Scatter(x=[xs[k-1]], y=[ys[k-1]], xaxis='x1', yaxis='y1',
                       mode='markers', marker=dict(size=8, color='crimson')),
            # running average curve up to time k
            go.Scatter(x=ts[:k], y=s_bar[:k], xaxis='x2', yaxis='y2',
                       mode='lines', line=dict(color='royalblue')),
            go.Scatter(x=[ts[k-1]], y=[s_bar[k-1]], xaxis='x2', yaxis='y2',
                       mode='markers', marker=dict(size=6, color='crimson'))
        ],
        name=f'frame{k}'
    ))

# --- layout with two panels ---
fig = go.Figure(
    data=[
        go.Scatter(x=[xs[0]], y=[ys[0]], xaxis='x1', yaxis='y1',
                   mode='lines', line=dict(width=2, color='royalblue')),
        go.Scatter(x=[xs[0]], y=[ys[0]], xaxis='x1', yaxis='y1',
                   mode='markers', marker=dict(size=8, color='crimson')),
        go.Scatter(x=[0], y=[s_bar[0]], xaxis='x2', yaxis='y2',
                   mode='lines', line=dict(color='royalblue')),
        go.Scatter(x=[0], y=[s_bar[0]], xaxis='x2', yaxis='y2',
                   mode='markers', marker=dict(size=6, color='crimson'))
    ],
    frames=frames
)

# persistent cell + disk shapes
fig.update_layout(
    shapes=[
        dict(type='rect', xref='x1', yref='y1',
             x0=0, y0=0, x1=Lx, y1=Ly,
             line=dict(color='black', width=4), fillcolor='rgba(0,0,0,0)'),
        dict(type='circle', xref='x1', yref='y1',
             x0=Lx/2 - a, x1=Lx/2 + a, y0=Ly/2 - a, y1=Ly/2 + a,
             line=dict(color='black', width=2, dash='dot'),
             fillcolor='rgba(0,0,0,0.5)'),
    ],
    # --- left panel (cell) ---
    xaxis1=dict(domain=[0.0, 0.45], range=[-0.05, Lx+0.05],
                visible=False, scaleanchor='x1', scaleratio=1),
    yaxis1=dict(domain=[0.0, 1.0], range=[-0.05, Ly+0.05],
                visible=False, scaleanchor='x1', scaleratio=1),
    # --- right panel (average) ---
    xaxis2=dict(domain=[0.55, 1.0], range=[0, ts[-1]],
                title='time', mirror=True, showline=True, linewidth=1,
                anchor='y2'),
    yaxis2=dict(domain=[0.0, 1.0], range=[-0.05, 1.05],
                title='⟨s⟩(t)', mirror=True, showline=True, linewidth=1,
                side='right', position=1.0,  # << move axis to the right edge
                anchor='x2'),
    plot_bgcolor='white',
    width=900, height=400,
    title='Ensemble Average vs. Time Average in 2D System',
    showlegend=False,          # ⬅ removes legend
    updatemenus=[dict(type='buttons', x=0.4, y=-0.05,
                      buttons=[
                          dict(label='Play', method='animate',
                               args=[None, {"frame": {"duration": 30, "redraw": True},
                                            "fromcurrent": True, "mode": "immediate"}]),
                          dict(label='Pause', method='animate',
                               args=[[None], {"frame": {"duration": 0, "redraw": False},
                                              "mode": "immediate"}])
                      ])]
)

# add reference line for uniform area fraction
fig.add_shape(
    type='line', xref='x2', yref='y2',
    x0=0, x1=ts[-1], y0=area_fraction, y1=area_fraction,
    line=dict(dash='dash', color='gray')
)

fig.show()