# Non-equilibrium Systems, Transport Coefficients, and Coupling Methods

In [None]:
using LinearAlgebra
using Random
using Plots
using LaTeXStrings
using Statistics
using StatsBase

## I) Non-equilibrium Dynamics and Transport Coefficients

As a model for non-equilibrium, we consider overdamped Langevin dynamics with potential $U:\mathbb{R}^d \to \mathbb{R}$ and inverse temperature $\beta > 0$, perturbed by a non-gradient vector field $F:\mathbb{R}^d \to \mathbb{R}^d$ whose strength is modulated by a parameter $\eta \in \mathbb{R}$:
$$dX_t^\eta = \left(-\nabla U\left(X_t^\eta\right) + \eta F\left(X_t^\eta\right)\right)dt + \sqrt{\frac{2}{\beta}}dW_t. \tag{Non-eq Dynamics}$$
Under appropriate assumptions on $U$ and $F$ (sufficient contractivity of $-\nabla U$, boundedness of $F$, ...), the above dynamics admits a unique invariant measure $\nu_\eta$. When $\eta = 0$, $\nu_0$ is the Gibbs measure $$\nu_0(dx) = Z_\beta^{-1} \exp\left(-\beta U(x)\right)dx,$$ and the dynamics of $X^0$ is reversible with respect to $\nu_0$. If $\eta \neq 0$, then $\nu_\eta$ is no longer of Gibbs form and the dynamics of $X^\eta$ is non-reversible. 

For an observable $R$, we define the transport coefficient $\alpha_R$ to be the linear response at $\eta = 0$ of the expectation of the observable,
$$ \alpha_R = \lim_{\eta \to 0} \frac{\nu_\eta(R) - \nu_0(R)}{\eta},$$
i.e. $\nu_\eta(R) \approx \nu_0(R) + \alpha_R \eta$ for $|\eta| \ll 1$.

Suppose that $\nu_0(R) = 0$. We can approximate $\alpha_R$ with the finite difference $\alpha_{\eta, R} = \eta^{-1}\nu_{\eta}(R) = \alpha_R + \mathrm{O}(\eta)$. In practice, we do not have access to $\nu_\eta$ and instead estimate $\nu_\eta(R)$ via a time average. Denote by $\widehat{\Phi}_{\eta, t}$ the estimator for $\alpha_R$, given by 
$$\widehat{\Phi}_{\eta, t} = \frac{1}{\eta t}\int_0^t R\left(X_s^\eta\right) ds \xrightarrow[\text{a.s.}]{t \to \infty} \alpha_{\eta, R} = \alpha_R + \mathrm{O}(\eta). \tag{NEMD Estimator}$$
The discrete-time version of this estimator uses the time average of the Euler-Maruyama discretization in lieu of the continuous-time solution. The statistical error of this estimator is dictated by the central limit theorem
$$\sqrt{t}\left(\widehat{\Phi}_{\eta, t} - \alpha_{\eta, R}\right) \xrightarrow[\text{in law}]{t\to \infty} \mathcal{N}\left(0, \frac{\sigma_{R,\eta}^2}{\eta^2}\right),$$
where $\sigma_{R,\eta}^2$ is the asymptotic variance of the time average of $R\left(X^\eta\right)$, and therefore $\widehat{\Phi}_{\eta, t} = \alpha_R + \mathrm{O}(\eta) + \mathrm{O}_P\left(\frac{1}{\eta \sqrt{t}}\right)$.
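
As a rough guide to choosing $\eta$ for a given runtime $t$, one can balance the two error terms. Assuming (hypothetically) that the bias behaves like $C_1\eta$ and the statistical error like $C_2/(\eta\sqrt{t})$, for constants $C_1, C_2 > 0$ that are not specified in the text,
$$ \frac{d}{d\eta}\left(C_1 \eta + \frac{C_2}{\eta \sqrt{t}}\right) = 0 \quad \Longleftrightarrow \quad \eta_\star = \sqrt{\frac{C_2}{C_1}}\, t^{-1/4}, \qquad C_1 \eta_\star + \frac{C_2}{\eta_\star \sqrt{t}} = 2\sqrt{C_1 C_2}\, t^{-1/4}.$$
Under these assumptions the total error decays only like $t^{-1/4}$ rather than the usual Monte Carlo rate $t^{-1/2}$.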

Our two running examples throughout this notebook will be a harmonic potential in two dimensions and the entropic switch potential, each perturbed by the rotational force $F$
$$ F(x) = Jx = \left[ \begin{matrix} 0 & 1\\ -1 & 0\end{matrix} \right]x.$$
Observe that this force is non-gradient since $\mathrm{curl}(F) \equiv -2$.

In [None]:
J = [0. 1
    -1 0]

#Defining the rotational force
rotation(x) = J*x
;

#### 1) Harmonic potential (Ornstein-Uhlenbeck process)
As a potential, we take $ U(x) = \frac{x^T A x}{2}$ with
$$ A = \left[ \begin{matrix} 2 & -1\\ -1 & 2\end{matrix} \right].$$
Then the perturbed overdamped Langevin dynamics is given by a linear SDE
$$ dX_t = \left(-A + \eta J\right)X_t dt + \sqrt{\frac{2}{\beta}}dW_t.$$
In this case we can solve everything explicitly. The invariant measure $\nu_\eta$ is a centered Gaussian with covariance matrix, $\Sigma_{\eta, \beta}$, satisfying
$$ \left(A -\eta J\right)\Sigma_{\eta, \beta} + \Sigma_{\eta, \beta}\left(A -\eta J\right)^T = \frac{2}{\beta}\mathrm{Id}, $$
specifically
$$\Sigma_{\eta, \beta} = \frac{1}{\left(3+\eta^2\right)\beta} \left[\begin{matrix} 2 + (\eta^2 +\eta)/2 & 1\\ 1 & 2 +(\eta^2 - \eta)/2\end{matrix}\right].$$
For $|\eta| < \sqrt{3}$, we can write $\Sigma_{\eta, \beta}$ as a power expansion in $\eta$,
$$\begin{aligned}
\Sigma_{\eta, \beta} &= \frac{1}{3\beta}\left[\begin{matrix} 2 & 1\\ 1 & 2\end{matrix}\right] + \frac{\eta}{6\beta}\left[\begin{matrix} 1 & 0\\ 0 & -1 \end{matrix}\right] + \frac{1}{6\beta}\sum_{k\geq 1} \left(-3\right)^{-k}\left(\eta^{2k}\left[\begin{matrix} 1 & 2\\ 2 & 1\end{matrix}\right] + \eta^{2k+1}\left[\begin{matrix}1 & 0\\ 0 & -1 \end{matrix}\right]\right).
\end{aligned}$$
This expansion indicates that the difference between the variances of the first and second components exhibits a linear response. If we choose as our observable $R(x) = x_1^2 - x_2^2$, then we get that 
$$\nu_\eta\left(R\right) = \frac{\eta}{3\beta} + \frac{1}{3\beta}\sum_{k\geq 1}\left(-3\right)^{-k}\eta^{2k+1}, \qquad |\eta | < \sqrt{3}.$$
Consequently, the transport coefficient is given by
$$ \alpha_R = \lim_{\eta \to 0} \frac{\nu_\eta(R)}{\eta} = \frac{1}{3\beta}$$ 
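
The closed-form covariance above can be sanity-checked numerically. The following is a minimal sketch, not part of the original notebook: the helper names `Σ_closed_form` and `Σ_from_lyapunov` are hypothetical, and the Lyapunov equation is solved by vectorisation with `kron`, which is only one of several possible approaches.

```julia
using LinearAlgebra

# Closed-form stationary covariance quoted in the text (hypothetical helper name)
function Σ_closed_form(η, β)
    a = 2 + (η^2 + η) / 2
    d = 2 + (η^2 - η) / 2
    return [a 1.0; 1.0 d] / ((3 + η^2) * β)
end

# Numerical solution of M Σ + Σ Mᵀ = (2/β) Id with M = A - ηJ,
# using vec(M Σ) = (Id ⊗ M) vec(Σ) and vec(Σ Mᵀ) = (M ⊗ Id) vec(Σ)
function Σ_from_lyapunov(η, β)
    A = [2.0 -1.0; -1.0 2.0]
    J = [0.0 1.0; -1.0 0.0]
    M = A - η * J
    Id = Matrix{Float64}(I, 2, 2)
    K = kron(Id, M) + kron(M, Id)
    return reshape(K \ vec((2 / β) * Id), 2, 2)
end

# Usage: compare both expressions and the finite-difference response
let η = 0.1, β = 2.0
    S = Σ_from_lyapunov(η, β)
    println(isapprox(S, Σ_closed_form(η, β); atol = 1e-12))
    println((S[1, 1] - S[2, 2]) / η, " compared with α_R = ", 1 / (3β))
end
```

Repeating the comparison for a few values of `η` gives a quick feel for the size of the finite-difference bias in this example.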

In [None]:
"""
Defining the harmonic potential and its gradient
"""
A = [2 -1. 
    -1 2]

Harmonic(x) = x'*A*x/2
Harmonic(x,y) = [x, y]' *A *[x,y]/2
∇Harmonic(x) = A*x


"""
Difference of variances response function
"""
function R(x,y)
    return x^2 - y^2
end

R(x::Vector{<:Real}) = R(x...)


"""
Plotting harmonic potential
"""
xlimits = -2., 2.
ylimits = -2., 2.
xrange = range(xlimits..., 200)
yrange = range(ylimits..., 200)
contourf(xrange, yrange, Harmonic, levels=50, cmap=:hsv)

Below we plot a sample trajectory of the dynamics.

In [None]:
"""
    em_integration_step!(x::Vector{<:Real}, G::Vector{<:Real}, ∇U, F, η, β, dt; RNG =Random.default_rng())

One step of the Euler-Maruyama discretization of the SDE (Non-eq Dynamics). The configuration x is updated in place and then
returned by the function. The noise vector G is also updated in place. The random number generator can be specified
via the optional keyword argument RNG.

"""
function em_integration_step!(x::Vector{<:Real}, G::Vector{<:Real}, ∇U, F, η, β, dt; RNG =Random.default_rng())
    x .+= (.-∇U(x) .+ η*F(x)).* dt + sqrt(2*dt/β)*randn!(RNG, G)
    return x
end


#Parameters of the sample trajectory; feel free to modify them and observe what changes
x_0 = [1., 0]
G = [0., 0]
η = 0.1
β = 2.
dt = 0.01
T = 10
N = floor(Int64, T/dt)
RNG = Xoshiro(24092023) #fixing RNG



x_traj = zeros(2, N + 1)
x_traj[:, 1] .= x_0
x_n = copy(x_0)
for i in 1:N
    x_traj[:, i+1] .= em_integration_step!(x_n, G, ∇Harmonic, rotation, η, β, dt; RNG = RNG)
end

contourf(xrange, yrange, Harmonic, levels=50, cmap=:hsv)
plot!(x_traj[1,:], x_traj[2, :], color=:black, label = "")

In the next cell, we plot a trajectory of the discretized estimator $\widehat{\Phi}_{\eta, t}$.
Vary the total runtime $T$ and strength of the perturbation $\eta$ and comment on the convergence of $\widehat{\Phi}_{\eta, t}$

In [None]:
x_0 = [1., 0]
G = [0., 0]
η = 0.1
β = 2.
dt = 0.01
T = 5000
N = floor(Int64, T/dt)
RNG = Xoshiro(24092023) #fixing RNG

α_R = 1/(3β) #analytic value of the transport coefficients

x_traj = zeros(2, N + 1)
x_traj[:, 1] .= x_0
x_n = copy(x_0)
for i in 1:N
    x_traj[:, i+1] .= em_integration_step!(x_n, G, ∇Harmonic, rotation, η, β, dt; RNG = RNG)
end

phi_hat = cumsum(R.(x_traj[1,:], x_traj[2,:])) ./ (1:(N+1)) ./η
plot(0:10dt:T, phi_hat[1:10:end], label = L"\hat{\Phi}_{\eta, t}")
plot!([0, T], [α_R, α_R], label = L"\alpha_R", legendfontsize = 15)
plot!(ylims = (0, 0.5))

#### 2) Entropic switch potential 

As our second example we consider the entropic switch potential introduced yesterday
$$
\tag{Entropic switch}
V(x,y)=3{\rm e}^{-x^2}({\rm e}^{-(y-1/3)^2}-{\rm e}^{-(y-5/3)^2})-5{\rm e}^{-y^2}({\rm e}^{-(x-1)^2}+{\rm e}^{-(x+1)^2})+0.2x^4+0.2(y-1/3)^4.
$$

As an observable, we consider the first component, $R(x) = x_1$. By the symmetry of the two deep wells, $\nu_0(R) = 0$. 

In [None]:
"""
    entropic_switch(x, y)

Potential energy function.
"""
function entropic_switch(x, y)
    tmp1 = x^2
    tmp2 = (y - 1 / 3)^2
    return 3 * exp(-tmp1) * (exp(-tmp2) - exp(-(y - 5 / 3)^2)) - 5 * exp(-y^2) * (exp(-(x - 1)^2) + exp(-(x + 1)^2)) + 0.2 * tmp1^2 + 0.2 * tmp2^2
end

function entropic_switch(q::Vector{<:Real})
    return entropic_switch(q...)
end

"""
    ∇entropic_switch(x, y)

Gradient of the potential energy function.
"""
function ∇entropic_switch(x, y)

    tmp1 = exp(4*x)
    tmp2 = exp(-x^2 - 2*x - y^2 - 1)
    tmp3 = exp(-x^2)
    tmp4 = exp(-(y-1/3)^2)
    tmp5 = exp(-(y-5/3)^2)

    dx = 0.8*x^3 + 10*(tmp1*(x - 1) + x + 1)*tmp2 - 6*tmp3*x*(tmp4 - tmp5)

    dy = 10*(tmp1 + 1)*y*tmp2 + 3*tmp3*(2*tmp5*(y - 5/3) - 2*tmp4*(y - 1/3)) + 0.8*(y - 1/3)^3

    return [dx, dy]
end

function ∇entropic_switch(q::Vector{<:Real})
    return ∇entropic_switch(q...)
end

xlims = -2.5, 2.5
ylims = -1.75, 2.5
xrange = range(xlims..., 200)
yrange = range(ylims..., 200)
qₗ = (-1.048, -0.04210, Plots.text(L"$q_{\rm L}$", pointsize=25))
qᵣ = (1.048, -0.04210, Plots.text(L"$q_{\rm R}$", pointsize=25))
qₛ = (0., 1.5371, Plots.text(L"$q_{\rm S}$", pointsize=25))
pl_V = contourf(xrange, yrange, entropic_switch, levels=50, cmap=:hsv)
annotate!(qₗ)
annotate!(qᵣ)
annotate!(qₛ)

In the next cell, we plot a trajectory of the discretized estimator $\widehat{\Phi}_{\eta, t}$.
Vary the total runtime $T$ and strength of the perturbation $\eta$ and comment on the convergence of $\widehat{\Phi}_{\eta, t}$. For this potential, it is also interesting to vary the inverse temperature $\beta$.

In [None]:
x_0 = [1., 0]
G = [0., 0]
η = 0.01
β = 1.
dt = 0.01
T = 50000
burn = 1000
N = floor(Int64, T/dt)
dt = T/N
N_burn = floor(Int64, burn/dt)
RNG = Xoshiro(24092023) #fixing RNG


x_traj = zeros(2, N)
x_n = copy(x_0)

for i in 1:N_burn
    em_integration_step!(x_n, G, ∇entropic_switch, rotation, η, β, dt; RNG = RNG)
end
for i in 1:N
    x_traj[:, i] .= em_integration_step!(x_n, G, ∇entropic_switch, rotation, η, β, dt; RNG = RNG)
end

Φ = cumsum(x_traj[1,:]) ./ (1:N) /η

plot(dt:50dt:T, Φ[1:50:end], label = L"\hat{\Phi}_{\eta, t}")
plot!(legendfontsize = 15)

In [None]:
plot!(ylims = (-3., 3.))

## II) Couplings

In this section, we consider couplings of diffusions and their discretization via their driving noise. These couplings are Markovian in the sense that the coupled process is a Markov process.


### a) Starting from different initial conditions
Assuming deterministic initial conditions, a pair of coupled solutions to an overdamped Langevin equation would satisfy 
$$\tag{Same Drift}
\begin{aligned}
    dX_t &= -\nabla U\left(X_t\right)dt + \sqrt{\frac{2}{\beta}}dW_t, \qquad X_0 = x,\\
    dY_t &= -\nabla U\left(Y_t\right)dt + \sqrt{\frac{2}{\beta}}d\widetilde{W}_t, \qquad Y_0 = y,
\end{aligned}
$$
with $x, y \in \mathbb{R}^d$, $x\neq y$, and $(W, \widetilde{W})$ a cleverly coupled pair of Brownian motions.

We can use this coupling to estimate the contraction rate of the diffusion semi-group, $\left(P_t\right)_{t\geq 0}$, corresponding to the SDE in Wasserstein or total variation distance. Each of these distances can be defined as an infimum over couplings, so we have either
$$\mathcal{W}_p\left(\delta_xP_t, \delta_yP_t\right) \leq \mathbb{E}_{(x,y)}\left[\left|X_t - Y_t\right|^p\right]^{1/p} $$
or 
$$ d_{\mathrm{TV}}\left(\delta_xP_t, \delta_yP_t\right) \leq \mathbb{P}_{(x,y)}\left(X_t\neq Y_t\right)$$


### b) Different drifts
We can alternatively couple solutions to two SDEs with different drifts. To improve upon the estimator from the previous section we can couple the perturbed and equilibrium processes 
$$\tag{Different Drifts}
\begin{aligned}
    dX_t^\eta &= \left(-\nabla U\left(X_t^\eta\right) + \eta F\left(X_t^\eta\right)\right)dt + \sqrt{\frac{2}{\beta}}dW_t, \qquad &X_0^\eta = x,\\
    dY_t^0 &= -\nabla U\left(Y_t^0\right)dt + \sqrt{\frac{2}{\beta}}d\widetilde{W}_t, &Y_0^0 = y,
\end{aligned}
$$
with $x, y \in \mathbb{R}^d$, and consider the following coupling-based estimator
$$\widehat{\Psi}_{\eta, t} = \frac{1}{\eta t}\int_0^t\left[R\left(X_s^\eta\right) - R\left(Y_s^0\right)\right]ds.$$
If $(W, \widetilde{W})$ are coupled in such a way that $X^\eta$ and $Y^0$ stay close for a long time, then we can hope that the estimator $\widehat{\Psi}_{\eta, t}$ has lower variance than $\widehat{\Phi}_{\eta, t}$.
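
A heuristic way to see the potential gain, under two assumptions that are not made in the text above: suppose that $R$ is Lipschitz with constant $L_R$ and that the coupling keeps the two trajectories within distance $C|\eta|$ of each other at all times, for some constant $C > 0$. Then
$$\left|\widehat{\Psi}_{\eta, t}\right| \leq \frac{1}{|\eta| t}\int_0^t \left|R\left(X_s^\eta\right) - R\left(Y_s^0\right)\right| ds \leq \frac{1}{|\eta| t}\int_0^t L_R \left|X_s^\eta - Y_s^0\right| ds \leq L_R C,$$
so the estimator, and hence its variance, stays bounded uniformly in $\eta$, whereas the statistical error of $\widehat{\Phi}_{\eta, t}$ scales like $1/(\eta\sqrt{t})$.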

In [None]:
"""
   coupled_traj(drift_x, drift_y, coupled_integrator, β, x_0, y_0, T, burn, dt; RNG = Random.default_rng())

A wrapper function for simulating trajectories of two discretized diffusions coupled via their noises. The 
argument `coupled_integrator` should be a function that implements the one step iteration of the coupled dynamics.

"""
function coupled_traj(drift_x, drift_y, coupled_integrator, β, x_0, y_0, T, burn, dt; RNG = Random.default_rng()) 
    N = floor(Int64, T/dt)
    dt = T/N
    K = floor(Int64, burn/dt)
    
    x_traj = zeros(length(x_0), N+1)
    y_traj = zeros(length(x_0), N+1)

    x = copy(x_0)
    y = copy(y_0)
    
    for _ in 1:K
        #"burning-in" the dynamics. Set `burn` to zero if you do not 
        #want a burn-in
        coupled_integrator(x, y, drift_x, drift_y, β, dt, RNG)
    end
    
    x_traj[:, 1] .= x
    y_traj[:, 1] .= y
    for i in 2:(N+1)
        coupled_integrator(x, y, drift_x, drift_y, β, dt, RNG)
        x_traj[:, i] .= x
        y_traj[:, i] .= y
    end
    
    return x_traj, y_traj
end

### 1) Synchronous coupling

A simple way to couple the dynamics is to <i>synchronously</i> couple them, i.e. to drive them with the same Brownian motion, $W = \widetilde{W}$. 

<b>Instructive optional exercise: </b> Consider the system (Same Drift) given above with an $m$-strongly convex potential $U$, $m > 0$, i.e. $\nabla^2 U(x) \geq m \mathrm{Id}$ in the sense of positive definite matrices for all $x \in \mathbb{R}^d$. Compute the differential of $\left|X_t - Y_t\right|^2$ when $X$ and $Y$ are synchronously coupled. What can we say about the behavior of $\left|X_t - Y_t\right|^2$ with time?

<i>Hint 1:</i> If you do not know Itô calculus, don't worry! In this case the Itô corrections cancel. If you do know Itô calculus, why does this happen?
    
<i>Hint 2:</i> Where do you use the fact that the two equations are synchronously coupled? Where do you use the fact that $U$ is $m$-strongly convex?

<u>Extension</u>: What happens when we consider the system (Different Drifts) and compute the differential of $\left|X_t^\eta - Y_t^0\right|^2$?

<br>

<b>Task:</b> Implement one integration step of synchronously coupled dynamics in the next cell.

In [None]:
"""
    synchronous_integration_step!(x,y, drift_x, drift_y, β, dt, RNG)
One step of the synchronously coupled dynamics
"""
function synchronous_integration_step!(x,y, drift_x, drift_y, β, dt, RNG)
    #implement here
    return nothing
end
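
If you get stuck on the task above, here is one possible way to write the step (spoiler). It is a minimal sketch under stated assumptions rather than the reference solution: the name `synchronous_step_sketch!` is hypothetical, `x` and `y` are assumed to be mutable vectors, and `drift_x`, `drift_y` are assumed to return the full drift, minus sign included, as in `drift(x) = -∇Harmonic(x)`.

```julia
using LinearAlgebra, Random

# Hypothetical sketch; same argument order as the stub above.
function synchronous_step_sketch!(x, y, drift_x, drift_y, β, dt, RNG)
    G = randn(RNG, length(x))          # a single Gaussian increment ...
    σ = sqrt(2 * dt / β)
    x .+= dt .* drift_x(x) .+ σ .* G
    y .+= dt .* drift_y(y) .+ σ .* G   # ... shared by both marginals
    return nothing
end

# Usage: with a common linear drift the noise cancels in the difference x - y
let A = [2.0 -1.0; -1.0 2.0], dt = 0.01
    drift(z) = -A * z
    x, y = [1.0, 0.0], [0.0, 1.0]
    d0 = x - y
    synchronous_step_sketch!(x, y, drift, drift, 1.0, dt, Xoshiro(1))
    println(norm((x - y) - (I - dt * A) * d0))   # should be at the level of round-off
end
```

The sketch allocates a fresh noise vector at every step for readability; a preallocated buffer filled with `randn!` would avoid this.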

#### a) Strongly Convex Harmonic Potential

Let us test your implementation of synchronous coupling on the overdamped Langevin dynamics with a harmonic potential introduced in the first section.

We start by looking at the convergence towards one another of two coupled solutions to the equilibrium dynamics with different initial conditions. 

After that, we will look at the convergence of the transport coefficient estimator based on synchronous coupling.

In [None]:
drift(x) = -∇Harmonic(x)

B = [sqrt(2) 0
    sqrt(2)/2 sqrt(6)/2]

β = 1. #inverse temperature, set before it is used to sample x_0 and y_0
RNG = Xoshiro(24092023) #fixing RNG

###################################
#two samples from the invariant measure of the continuous-time process
x_0 = sqrt(1/(3β)) * B * randn(RNG, 2)
y_0 = sqrt(1/(3β)) * B * randn(RNG, 2)
####################################


# ###################################
# #Difference in the second eigendirection (eigenvalue 3)
# x_0 = [-1., 1]
# y_0 = zeros(2)
# ####################################

# ###################################
# #Difference in the first eigendirection (eigenvalue 1)
# x_0 = [1., 1]
# y_0 = zeros(2)
# ####################################

T = 10.
dt = 0.01
β = 1.
burn = 0.
x_traj, y_traj = coupled_traj(drift, drift, synchronous_integration_step!, β, x_0, y_0, T, burn, dt; RNG = RNG)
;

Below we plot the decay of the difference between the two trajectories in the two eigen-directions of $A$. The decay of the difference in each eigen-direction should behave like $\mathrm{e}^{-\lambda t}$ with $\lambda$ the corresponding eigenvalue.
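
To see where these rates come from, note that under synchronous coupling with the common drift $-Ax$ the noise cancels in the difference, so that for a unit eigenvector $w$ of $A$ with $Aw = \lambda w$,
$$d\left(X_t - Y_t\right) = -A\left(X_t - Y_t\right)dt \quad \Longrightarrow \quad \left\langle w, X_t - Y_t\right\rangle = \mathrm{e}^{-\lambda t}\left\langle w, x - y\right\rangle.$$
For the Euler-Maruyama discretization with time step $\Delta t$ the same computation gives
$$\left\langle w, X_k - Y_k\right\rangle = \left(1 - \lambda \Delta t\right)^k\left\langle w, x - y\right\rangle,$$
which is close to $\mathrm{e}^{-\lambda k \Delta t}$ when $\lambda \Delta t \ll 1$. For the matrix $A$ used here, the eigenpairs are $\lambda = 1$ with $w = (1, 1)/\sqrt{2}$ and $\lambda = 3$ with $w = (1, -1)/\sqrt{2}$.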

In [None]:
#Change of coordinates 
u(x, y) = (x + y)/sqrt(2)
v(x, y) = (x - y)/sqrt(2)
u(x::Vector{<:Real}) = u(x...)
v(x::Vector{<:Real}) = v(x...)

plot(yscale = :log10)
scatter!(0.:10dt:T, [abs(u(x_traj[:, i] - y_traj[:, i])) for i in axes(x_traj, 2)][1:10:end],
    ms = 2, msw = 0.1, label = "Difference in 1st Eigendirection")
plot!(0.:dt:T, abs(u(x_0 - y_0))*exp.(.-(0.:dt:T)), label = L"\exp{(-t)}", lw = 1.5)
scatter!(0.:10dt:T, [abs(v(x_traj[:, i] - y_traj[:, i])) for i in axes(x_traj, 2)][1:10:end], 
    ms = 2, msw = 0.1, label = "Difference in 2nd Eigendirection")
plot!(0.:dt:T, abs(v(x_0 - y_0))*exp.(-3 .*(0.:dt:T)), label = L"\exp{(-3t)}", lw = 1.5)
plot!(legend = :bottomleft, legendfontsize = 12)

We next look at the behavior of the synchronous coupling based estimator of the transport coefficient.

<b> Task:</b> In the next cell vary the intensity of the perturbation $\eta$, the inverse temperature $\beta$, and runtime $T$ and comment on the convergence of $\widehat{\Psi}_{\eta,t}^{\mathrm{sync}}$. You may also look at the plots of the trajectories in the cell after and the distribution of distances between the trajectories in the cell after that. 

In [None]:
x_0 = [1., 0]
η = 0.1
β = 1.
dt = 0.01
T = 5000
burn = 1000
N = floor(Int64, T/dt)
dt = T/N
N_burn = floor(Int64, burn/dt)
RNG = Xoshiro(24092023) #fixing RNG

drift_x(x) = -∇Harmonic(x) + η*rotation(x) #perturbed drift
drift_y(x) = -∇Harmonic(x) #equilibrium drift

x_traj, y_traj = coupled_traj(drift_x, drift_y, synchronous_integration_step!, β, x_0, y_0, T, burn, dt; RNG = RNG)
Ψ_sync = cumsum(R.(x_traj[1, :], x_traj[2, :]) - R.(y_traj[1, :], y_traj[2, :])) ./ (1:N+1) ./η
α_R = 1/(3β)

plot(0:10dt:T, Ψ_sync[1:10:end], label = L"\widehat{\Psi}_{\eta, t}")
plot!([0, T], [α_R, α_R], label = L"\alpha_R", legendfontsize = 15)

In [None]:
#Plotting the trajectories of the equilibrium and nonequilibrium processes 
#if you can't tell the two trajectories apart, increase η and rerun the previous cell
contourf(xrange, yrange, Harmonic, levels=50, cmap=:hsv)
plot!(x_traj[1, 1:500], x_traj[2, 1:500], color = :black, lw = 1.5, label = L"X^\eta")
plot!(y_traj[1, 1:500], y_traj[2, 1:500], color = :lightgray, lw = 1.5, label = L"Y^0")
plot!(legendfontsize = 15)

In [None]:
#Plotting the empirical distribution of distance between the equilibrium and nonequilibrium processes
plot()
plot!(title = latexstring("Distribution of \$\\left|X_t - Y_t\\right|\$"))
plot!([norm(x_traj[:, i] - y_traj[:, i]) for i in axes(x_traj, 2)], normed = true, label = "", 
    seriestype = :stephist)
plot!(xlabel = "Distance between trajectories")

#### b) Non-convex Entropic Switch Potential

We repeat what we did above for the entropic switch potential.

<i> Question for those who did the exercise:</i> Does the result of the exercise still hold in this case?

In [None]:
drift(x) = -∇entropic_switch(x)

q_l = [-1.048, -0.04210]
q_r = [1.048, -0.04210]
x_0 = [1., 1]
y_0 = [0., 0.]
β = 1.
T = 7.
burn = 0.
dt = 0.01
RNG = Xoshiro(24092023) #fixing RNG
x_traj, y_traj = coupled_traj(drift, drift, synchronous_integration_step!, β, q_l, q_r, T, burn, dt; RNG = RNG)

plot()
plot!(0:dt:T, [norm(x_traj[:, i] - y_traj[:, i]) for i in axes(x_traj, 2)], label = L"\left|X_t - Y_t\right|")
plot!(yscale = :log, legendfontsize = 15)

As before, we look at the behavior of the synchronous coupling based estimator of the transport coefficient.

<b> Task:</b> In the next cell vary the intensity of the perturbation $\eta$, the inverse temperature $\beta$, and runtime $T$ and comment on the convergence of $\widehat{\Psi}_{\eta,t}^{\mathrm{sync}}$. You may also look at the plots of the trajectories in the cell after and the distribution of distances between the trajectories in the cell after that. 

In [None]:
x_0 = [0., 0]
η = 0.1
β = 1.
dt = 0.01
T = 50000
burn = 10000
N = floor(Int64, T/dt)
dt = T/N
N_burn = floor(Int64, burn/dt)
RNG = Xoshiro(24092023) #fixing RNG

drift_x(x) = -∇entropic_switch(x) + η*rotation(x) #perturbed drift
drift_y(x) = -∇entropic_switch(x) #equilibrium drift

x_traj, y_traj = coupled_traj(drift_x, drift_y, synchronous_integration_step!, β, x_0, y_0, T, burn, dt; RNG = RNG)
Ψ_sync = cumsum(x_traj[1, :] - y_traj[1, :]) ./ (1:N+1) ./η

plot(0:10dt:T, Ψ_sync[1:10:end], label = L"\widehat{\Psi}_{\eta, t}")
plot!(legendfontsize = 15, ylims = (0, 1))

In [None]:
contourf(xrange, yrange, entropic_switch, levels=50, cmap=:hsv)
plot!(x_traj[1, 1:500], x_traj[2, 1:500], color = :black, lw = 1.5, label = L"X")
plot!(y_traj[1, 1:500], y_traj[2, 1:500], color = :lightgray, lw = 1.5, label = L"Y")
plot!(legendfontsize = 15)

In [None]:
#Plotting the empirical distribution of distance between the equilibrium and nonequilibrium processes
plot()
plot!(title = latexstring("Distribution of \$\\left|X_t - Y_t\\right|\$"))
plot!([norm(x_traj[:, i] - y_traj[:, i]) for i in axes(x_traj, 2)], normed = true, label = "", 
    seriestype = :stephist)
plot!(xlabel = "Distance between trajectories")

### 2) Sticky coupling

It can be shown that synchronous coupling works exceedingly well if $U$ is strongly convex; you may have already proven the key inequality for this result in the optional exercise from the previous subsection. However, if $U$ fails to be strongly convex in some regions of the phase space, synchronous coupling can fail spectacularly (particularly in large dimensions), since this coupling relies solely on the drift part of the dynamics to bring the two trajectories together.

This flaw in synchronous coupling leads us to consider an alternative: sticky coupling. In this notebook, we only consider sticky couplings of the discretized dynamics (also called maximal-reflection couplings). 

Consider two Euler-Maruyama discretizations of SDEs with drifts $b_1$ and $b_2$ and additive noise
$$\begin{aligned}
    X_{k+1} &= X_k + b_1\left(X_k\right)\Delta t + \sqrt{\frac{2\Delta t}{\beta}}G_{k+1},\\
    Y_{k+1} &= Y_k + b_2\left(Y_k\right)\Delta t + \sqrt{\frac{2\Delta t}{\beta}}\widetilde{G}_{k+1},
\end{aligned}
$$
where $G_{k+1}$ and $\widetilde{G}_{k+1}$ are two standard normal random variables on $\mathbb{R}^d$. We make no assumption for the moment on their joint distribution, since that is what we want to play with. 

Denote by $\mathbf{E}(x,y) = x - y + \Delta t \left(b_1(x) - b_2(y)\right)$ the difference vector between the two trajectories after the drift update but before the stochastic update. We write $\mathbf{e}(x,y)$ its normalization which is set to zero if $\mathbf{E}(x,y) = 0$, 
$$\mathbf{e}(x,y) = \left\{ 
\begin{aligned}
    &\frac{\mathbf{E}(x,y)}{\left|\mathbf{E}(x,y)\right|} && \text{ if } \left|\mathbf{E}(x,y)\right| \neq 0,\\
    &0 && \text{ otherwise}.
\end{aligned}
\right.$$
At each step, we draw $G_{k+1}$ from a standard Gaussian and update the first marginal as usual,
$$ X_{k+1} = X_k + b_1\left(X_k\right)\Delta t + \sqrt{\frac{2\Delta t}{\beta}}G_{k+1}.$$
For the second marginal, we want to maximize the probability that $Y_{k+1} = X_{k+1}$. Assuming $X_k = x$ and $Y_k = y$, this probability of collision is upper bounded by the coupling inequality
$$\mathbb{P}\left(X_{k+1} = Y_{k+1} \left| X_k = x, \, Y_k = y \right. \right) \leq 1 -  d_{\mathrm{TV}}\left(\mathcal{N}\left(x + \Delta t b_1(x), \frac{2\Delta t}{\beta}\mathrm{Id}\right), \mathcal{N}\left(y + \Delta t b_2(y), \frac{2\Delta t}{\beta}\mathrm{Id}\right)\right).$$
We can saturate this inequality via the following construction: denote by $\varphi_d$ the density of the standard $d$-dimensional Gaussian distribution and define
$$p_{\Delta t, \beta} (x,y, g) = \frac{\varphi_d\left(g + \sqrt{\frac{\beta}{2\Delta t}}\mathbf{E}(x,y)\right)}{\varphi_d(g)}.$$
One can check that 
$$\int_{\mathbb{R}^d} \min\left\{p_{\Delta t, \beta}(x, y, g), 1\right\} \varphi_d(g) dg = 1 - d_{\mathrm{TV}}\left(\mathcal{N}\left(x + \Delta t b_1(x), \frac{2\Delta t}{\beta}\mathrm{Id}\right), \mathcal{N}\left(y + \Delta t b_2(y), \frac{2\Delta t}{\beta}\mathrm{Id}\right)\right).$$
So at each step we draw $U_{k+1}$ from the uniform distribution on $[0,1]$ independently of $G_{k+1}$, and if $U_{k+1} \leq p_{\Delta t, \beta} \left(X_k,Y_k, G_{k+1}\right)$ then we set $$Y_{k+1} = X_{k+1},$$ otherwise we update the second marginal using $G_{k+1}$ reflected across the hyperplane orthogonal to $\mathbf{e}\left(X_k, Y_k\right)$ as the driving noise
$$Y_{k+1} = Y_k + b_2\left(Y_k\right)\Delta t + \sqrt{\frac{2\Delta t}{\beta}}\left[\mathrm{Id} - 2\mathbf{e}\left(X_k, Y_k\right)\mathbf{e}\left(X_k, Y_k\right)^T\right]G_{k+1}.$$

Putting this all together, one step of the sticky coupled dynamics is given by
$$
\begin{aligned}
    X_{k+1} &= X_k + b_1\left(X_k\right)\Delta t + \sqrt{\frac{2\Delta t}{\beta}}G_{k+1}\\
    Y_{k+1} &= \left\{
    \begin{aligned}
        &X_{k+1} && \text{if } U_{k+1} \leq p_{\Delta t, \beta} \left(X_k,Y_k, G_{k+1}\right),\\
        &Y_k + b_2\left(Y_k\right)\Delta t + \sqrt{\frac{2\Delta t}{\beta}}\left[\mathrm{Id} - 2\mathbf{e}\left(X_k, Y_k\right)\mathbf{e}\left(X_k, Y_k\right)^T\right]G_{k+1} && \text{otherwise,}
    \end{aligned}\right.
\end{aligned}
$$
with $\left\{U_k\right\}_{k\geq 1}$ and $\left\{G_k\right\}_{k\geq 1}$ mutually independent sequences of $[0,1]$-uniform random variables and $d$-dimensional standard Gaussian random variables respectively.
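
For implementation purposes it can help to work with the logarithm of the acceptance ratio. Expanding the Gaussian densities in the definition of $p_{\Delta t, \beta}$ gives
$$\log p_{\Delta t, \beta}(x, y, g) = -\sqrt{\frac{\beta}{2\Delta t}}\, g\cdot\mathbf{E}(x,y) - \frac{\beta}{4\Delta t}\left|\mathbf{E}(x,y)\right|^2,$$
and, by the standard formula for the total variation distance between two Gaussians with the same isotropic covariance, the probability of coalescing in one step is
$$\int_{\mathbb{R}^d} \min\left\{p_{\Delta t, \beta}(x, y, g), 1\right\} \varphi_d(g)\, dg = 2\,\Phi_{\mathcal{N}}\left(-\sqrt{\frac{\beta}{8\Delta t}}\left|\mathbf{E}(x,y)\right|\right),$$
where $\Phi_{\mathcal{N}}$ (a symbol introduced only for this remark, unrelated to the estimator $\widehat{\Phi}_{\eta, t}$) denotes the cumulative distribution function of the standard one-dimensional normal distribution. In particular, this probability equals $1$ when $\mathbf{E}(x,y) = 0$ and decays rapidly once $\left|\mathbf{E}(x,y)\right| \gg \sqrt{\Delta t/\beta}$.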

<br>

<b>Task:</b> Implement one integration step of sticky coupled dynamics in the next cell.

In [None]:
#The signature matches the call made inside `coupled_traj`
function sticky_integration_step!(x, y, drift_x, drift_y, β, dt, RNG)
    #implement here
    return nothing
end
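
If you get stuck on the task above, here is one possible way to write the step (spoiler). It is a minimal sketch under stated assumptions rather than the reference solution: the name `sticky_step_sketch!` is hypothetical, the argument order is the one used by `coupled_traj`, and `drift_x`, `drift_y` are assumed to return the full drift with the minus sign included. The acceptance test is done on the log scale, using $\log p = -G\cdot\mathbf{E}/\sigma - |\mathbf{E}|^2/(2\sigma^2)$ with $\sigma = \sqrt{2\Delta t/\beta}$.

```julia
using LinearAlgebra, Random

# Hypothetical sketch of one step of the sticky (maximal-reflection) coupling.
function sticky_step_sketch!(x, y, drift_x, drift_y, β, dt, RNG)
    σ = sqrt(2 * dt / β)
    G = randn(RNG, length(x))
    mx = x .+ dt .* drift_x(x)        # position of X after the drift update
    my = y .+ dt .* drift_y(y)        # position of Y after the drift update
    E = mx .- my                      # difference vector E(x, y)
    nE = norm(E)
    x .= mx .+ σ .* G                 # first marginal: plain Euler-Maruyama step
    if nE == 0
        y .= x                        # p = 1: the trajectories stay together
        return nothing
    end
    log_p = -dot(G, E) / σ - nE^2 / (2 * σ^2)
    if log(rand(RNG)) <= log_p
        y .= x                        # coalescence
    else
        e = E ./ nE
        y .= my .+ σ .* (G .- 2 * dot(e, G) .* e)   # reflected noise
    end
    return nothing
end

# Usage: two copies of the harmonic dynamics started from different points
let A = [2.0 -1.0; -1.0 2.0], rng = Xoshiro(2023)
    drift(z) = -A * z
    x, y = [1.0, 0.0], [-1.0, 0.5]
    for _ in 1:20_000
        sticky_step_sketch!(x, y, drift, drift, 1.0, 0.01, rng)
    end
    println(norm(x - y))   # distance after the run
end
```

Because coalescence is implemented as an exact copy `y .= x`, two trajectories with the same drift remain identical once they have met.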

As we did with synchronous coupling, we will test your implementation of sticky coupling on the harmonic and entropic switch examples.

#### a) Strongly Convex Harmonic Potential

In [None]:
#Simulating coupled trajectories with the same drift and different initial conditions

drift(x) = -∇Harmonic(x)

B = [sqrt(2) 0
    sqrt(2)/2 sqrt(6)/2]
β = 1. #inverse temperature, set before it is used to sample x_0 and y_0
RNG = Xoshiro(24092023) #fixing RNG

###################################
#two samples from the invariant measure of the continuous-time process
x_0 = sqrt(1/(3β)) * B * randn(RNG, 2)
y_0 = sqrt(1/(3β)) * B * randn(RNG, 2)
####################################


# ###################################
# #Difference in the second eigendirection (eigenvalue 3)
# x_0 = [-1., 1]
# y_0 = zeros(2)
# ####################################

# ###################################
# #Difference in the first eigendirection (eigenvalue 1)
# x_0 = [1., 1]
# y_0 = zeros(2)
# ####################################

T = 1.
dt = 0.01
β = 1.
burn = 0.
x_traj, y_traj = coupled_traj(drift, drift, sticky_integration_step!, β, x_0, y_0, T, burn, dt; RNG = RNG);

In [None]:
#Plotting the distance between the two trajectories with the same drift and different initial conditions
plot(0:dt:T, [norm(x_traj[:, i] - y_traj[:, i]) for i in axes(x_traj, 2)], label = L"\left|X_t - Y_t\right|",
marker= :circle)
plot!(legendfontsize = 15)

In [None]:
#Plotting the harmonic dynamics with same drift and different initial conditions
contourf(xrange, yrange, Harmonic, levels=50, cmap=:hsv)
plot!(x_traj[1, 1:20], x_traj[2, 1:20], color = :black, lw = 1.5, label = L"X")
scatter!(x_traj[1, 1:1], x_traj[2, 1:1], color = :black, lw = 1.5, label = L"X_0")
plot!(y_traj[1, 1:20], y_traj[2, 1:20], color = :lightgray, lw = 1.5, label = L"Y")
scatter!(y_traj[1, 1:1], y_traj[2, 1:1], color = :lightgray, lw = 1.5, label = L"Y_0")
plot!(legendfontsize = 12)

We now look at the behavior of the sticky coupling based estimator of the transport coefficient.

<b> Task:</b> In the next cell vary the intensity of the perturbation $\eta$, the inverse temperature $\beta$, and runtime $T$ and comment on the convergence of $\widehat{\Psi}_{\eta,t}^{\mathrm{sticky}}$. You may also look at the plots of the trajectories in the cell after and the distribution of distances between the trajectories in the cell after that. 

In [None]:
x_0 = [1., 0]
η = 0.1
β = 1.
dt = 0.01
T = 50000
burn = 1000
N = floor(Int64, T/dt)
dt = T/N
N_burn = floor(Int64, burn/dt)
RNG = Xoshiro(24092023) #fixing RNG

drift_x(x) = -∇Harmonic(x) + η*rotation(x) #perturbed drift
drift_y(x) = -∇Harmonic(x) #equilibrium drift

x_traj, y_traj = coupled_traj(drift_x, drift_y, sticky_integration_step!, β, x_0, y_0, T, burn, dt; RNG = RNG)
Ψ_sticky = cumsum(R.(x_traj[1, :], x_traj[2, :]) - R.(y_traj[1, :], y_traj[2, :])) ./ (1:N+1) ./η
α_R = 1/(3β)

plot(0:10dt:T, Ψ_sticky[1:10:end], label = L"\widehat{\Psi}_{\eta, t}")
plot!([0, T], [α_R, α_R], label = L"\alpha_R", legendfontsize = 15)

In [None]:
#Plotting the trajectories of the equilibrium and nonequilibrium processes 
#if you can't tell the two trajectories apart, either increase η and rerun the previous cell
#or try a different range of indices
contourf(xrange, yrange, Harmonic, levels=50, cmap=:hsv)
plot!(x_traj[1, 2000:2500], x_traj[2, 2000:2500], color = :black, lw = 1.5, label = L"X^\eta")
plot!(y_traj[1, 2000:2500], y_traj[2, 2000:2500], color = :lightgray, lw = 1.5, label = L"Y^0")
plot!(legendfontsize = 15)

In [None]:
#Plotting the empirical distribution of distance between the equilibrium and nonequilibrium processes
sticky_hist = normalize(fit(Histogram, [norm(x_traj[:, i] - y_traj[:, i]) for i in axes(x_traj, 2)], 
    vcat(0, 100*eps(Float64), LinRange(0, 4,81)[2:end])))

plot()
plot!(sticky_hist.edges[1][2:end-1], sticky_hist.weights[2:end], label = "")
scatter!([0], [sticky_hist.edges[1][2] * sticky_hist.weights[1]], label = "Sticky mass at 0")
plot!(title = latexstring("Distribution of \$\\left|X_t - Y_t\\right|\$"), 
    xlabel = "Distance between trajectories", seriestype = :barhist)

#### b) Non-convex Entropic Switch Potential

In [None]:
#Simulating coupled trajectories with the same drift and different initial conditions

drift(x) = -∇entropic_switch(x)

q_l = [-1.048, -0.04210]
q_r = [1.048, -0.04210]
x_0 = [1., 1]
y_0 = [0., 0.]
β = 1.
T = 7.
burn = 0.
dt = 0.01
RNG = Xoshiro(24092023) #fixing RNG
x_traj, y_traj = coupled_traj(drift, drift, sticky_integration_step!, β, q_l, q_r, T, burn, dt; RNG = RNG)

plot()
plot!(0:dt:T, [norm(x_traj[:, i] - y_traj[:, i]) for i in axes(x_traj, 2)], label = L"\left|X_t - Y_t\right|")
plot!(legendfontsize = 15)

In the following cells we look at the behavior of the sticky coupling based estimator of the transport coefficient.

<b> Task:</b> In the next cell vary the intensity of the perturbation $\eta$, the inverse temperature $\beta$, and runtime $T$ and comment on the convergence of $\widehat{\Psi}_{\eta,t}^{\mathrm{sticky}}$. You may also look at the plots of the trajectories in the cell after and the distribution of distances between the trajectories in the cell after that. 

In [None]:
x_0 = [0., 0]
η = 0.2
β = 1.
dt = 0.01
T = 50000
burn = 10000
N = floor(Int64, T/dt)
dt = T/N
N_burn = floor(Int64, burn/dt)
RNG = Xoshiro(24092023) #fixing RNG

drift_x(x) = -∇entropic_switch(x) + η*rotation(x) #perturbed drift
drift_y(x) = -∇entropic_switch(x) #equilibrium drift

x_traj, y_traj = coupled_traj(drift_x, drift_y, sticky_integration_step!, β, x_0, y_0, T, burn, dt; RNG = RNG)
Ψ_sticky = cumsum(x_traj[1, :] - y_traj[1, :]) ./ (1:N+1) ./η

plot(0:10dt:T, Ψ_sticky[1:10:end], label = L"\widehat{\Psi}_{\eta, t}")
plot!(legendfontsize = 15, ylims = (0, 1))

In [None]:
contourf(xrange, yrange, entropic_switch, levels=50, cmap=:hsv)
plot!(x_traj[1, 1:500], x_traj[2, 1:500], color = :black, lw = 1.5, label = L"X")
plot!(y_traj[1, 1:500], y_traj[2, 1:500], color = :lightgray, lw = 1.5, label = L"Y")
plot!(legendfontsize = 15)

In [None]:
#Plotting the empirical distribution of distance between the equilibrium and nonequilibrium processes
sticky_hist = normalize(fit(Histogram, [norm(x_traj[:, i] - y_traj[:, i]) for i in axes(x_traj, 2)], 
    vcat(0, 100*eps(Float64), LinRange(0, 4,81)[2:end])))

plot()
plot!(sticky_hist.edges[1][2:end-1], sticky_hist.weights[2:end], label = "")
scatter!([0], [sticky_hist.edges[1][2] * sticky_hist.weights[1]], label = "Sticky mass at 0")
plot!(title = latexstring("Distribution of \$\\left|X_t - Y_t\\right|\$"), 
    xlabel = "Distance between trajectories", seriestype = :barhist)

## III) Some Additional Topics

### 1) Failure of Synchronous Coupling for Kinetic Langevin Dynamics
It is known that for synchronous coupling to work on kinetic Langevin dynamics, we need to make additional assumptions on the potential $U$ and the friction parameter $\gamma$. Specifically, we need to assume that there exist $ \Lambda \geq \lambda > 0 $ such that 
$$ \lambda \mathrm{Id} \leq \nabla^2U \leq \Lambda \mathrm{Id},$$ 
and that $\gamma > \sqrt{\Lambda} - \sqrt{\lambda}$.

We propose that you implement a synchronous coupling of two discretized kinetic Langevin dynamics with the same quartic plus quadratic potential, started from different initial conditions. This potential cannot satisfy the above assumptions. What happens in this case? 

We refer to yesterday's notebook for guidance on how to implement discretizations of kinetic Langevin dynamics.

In [None]:
#quartic plus quadratic potential
ϕ(x) = @. x^2/2 + x^4/4 
∇ϕ(x) = @. x + x^3
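
As a possible starting point for this experiment, here is a minimal sketch of a synchronously coupled kinetic Langevin simulation. Everything in it is an assumption made for illustration: the function names are hypothetical, the discretization is a simple symplectic-Euler-type scheme for $dq = p\,dt$, $dp = -\nabla\phi(q)\,dt - \gamma p\,dt + \sqrt{2\gamma/\beta}\,dW$ (yesterday's notebook may use a different, better scheme), and the default parameters are arbitrary.

```julia
using LinearAlgebra, Random

∇ϕ_sketch(q) = q .+ q .^ 3     # same gradient as ∇ϕ above, redefined to keep the sketch self-contained

# Hypothetical helper: one synchronously coupled step (momentum update, then position update)
function kinetic_sync_step!(q1, p1, q2, p2, γ, β, dt, RNG)
    G = randn(RNG, length(q1))              # noise shared by the two copies
    σ = sqrt(2 * γ * dt / β)
    p1 .+= dt .* (-∇ϕ_sketch(q1) .- γ .* p1) .+ σ .* G
    p2 .+= dt .* (-∇ϕ_sketch(q2) .- γ .* p2) .+ σ .* G
    q1 .+= dt .* p1
    q2 .+= dt .* p2
    return nothing
end

# Hypothetical helper: phase-space distance between the two copies along the trajectory
function kinetic_sync_distance(q1_0, q2_0; γ = 1.0, β = 1.0, dt = 0.01, N = 1000,
                               RNG = Random.default_rng())
    q1, q2 = copy(q1_0), copy(q2_0)
    p1, p2 = zero(q1), zero(q2)             # both copies start at rest
    dist = zeros(N + 1)
    dist[1] = sqrt(norm(q1 - q2)^2 + norm(p1 - p2)^2)
    for k in 1:N
        kinetic_sync_step!(q1, p1, q2, p2, γ, β, dt, RNG)
        dist[k + 1] = sqrt(norm(q1 - q2)^2 + norm(p1 - p2)^2)
    end
    return dist
end

dist = kinetic_sync_distance([1.0], [-1.0]; RNG = Xoshiro(24092023))
```

Plotting `dist` for several values of `γ`, and for initial conditions further apart, is one way to explore the question above.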

### 2) L-Lag Couplings
In [arxiv:1905.09971](https://arxiv.org/abs/1905.09971) the authors propose using an $L$-lag coupling to empirically estimate the rate of convergence of a Markov chain to its stationary probability measure. See page 2 for the algorithm.

We propose that you use your implementation of sticky coupling to estimate the convergence rate of the Euler-Maruyama discretization of overdamped Langevin dynamics for our two example potentials.

### 3) Unbiased Estimators Based on Couplings

In the same vein as the previous point, we can also use $L$-lag couplings to construct unbiased estimators (unbiased with respect to the invariant measure of the discretized dynamics). See section 2 of [arXiv:1708.03625](https://arxiv.org/abs/1708.03625).

We propose you implement a method that generates unbiased samples from the invariant measure of a discretized overdamped Langevin equation using your implementation of various coupling methods in the previous section.

### 4) Couplings of Metropolised Dynamics

In this notebook, we largely concerned ourselves with unadjusted overdamped Langevin dynamics, i.e. we had no Metropolis-Hastings step. Naively using the couplings we have implemented above may result in a submaximal coupling due to the Metropolis-Hastings step. It is however possible to construct couplings that take into account the Metropolis-Hastings adjustment and hopefully perform better.

We propose you implement the maximal-reflection (sticky) coupling of Metropolised kernels in [arxiv:2010.08573](https://arxiv.org/abs/2010.08573).