# Numerical Langevin
### By Carlos A.C.C. Perello

## Preliminaries
The only preliminaries are importing the `LinearAlgebra.jl` package and defining a function that creates a square grid centered at $(0,0)$ with side length $2x$, whose points are evenly spaced by $h$.

In [5]:
using LinearAlgebra
function square_grid(x, h)
    grid = [[i j] for i in -x:h:x, j in -x:h:x]
    grid
end

square_grid (generic function with 1 method)
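
Each entry of the grid above is a 1×2 row matrix. As a small aside, the sketch below shows a variant that stores each point as a length-2 column vector, which can be more convenient for matrix-vector products. The name `square_grid_vec` is hypothetical and not part of the original notebook. It also illustrates that the grid has `length(-x:h:x)` points per side.

```julia
# Hypothetical variant of `square_grid`: each point is a column vector [i, j]
# instead of a 1×2 row matrix.
square_grid_vec(x, h) = [[i, j] for i in -x:h:x, j in -x:h:x]

g = square_grid_vec(1, 0.5)   # -1:0.5:1 has 5 points per side
println(size(g))              # (5, 5)
println(g[1, 1])              # [-1.0, -1.0]
println(g[end, end])          # [1.0, 1.0]
```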

### Test square grid

In [6]:
square_grid(10, 0.01)

2001×2001 Matrix{Matrix{Float64}}:
 [-10.0 -10.0]  [-10.0 -9.99]  [-10.0 -9.98]  …  [-10.0 9.99]  [-10.0 10.0]
 [-9.99 -10.0]  [-9.99 -9.99]  [-9.99 -9.98]     [-9.99 9.99]  [-9.99 10.0]
 [-9.98 -10.0]  [-9.98 -9.99]  [-9.98 -9.98]     [-9.98 9.99]  [-9.98 10.0]
 [-9.97 -10.0]  [-9.97 -9.99]  [-9.97 -9.98]     [-9.97 9.99]  [-9.97 10.0]
 [-9.96 -10.0]  [-9.96 -9.99]  [-9.96 -9.98]     [-9.96 9.99]  [-9.96 10.0]
 [-9.95 -10.0]  [-9.95 -9.99]  [-9.95 -9.98]  …  [-9.95 9.99]  [-9.95 10.0]
 [-9.94 -10.0]  [-9.94 -9.99]  [-9.94 -9.98]     [-9.94 9.99]  [-9.94 10.0]
 [-9.93 -10.0]  [-9.93 -9.99]  [-9.93 -9.98]     [-9.93 9.99]  [-9.93 10.0]
 [-9.92 -10.0]  [-9.92 -9.99]  [-9.92 -9.98]     [-9.92 9.99]  [-9.92 10.0]
 [-9.91 -10.0]  [-9.91 -9.99]  [-9.91 -9.98]     [-9.91 9.99]  [-9.91 10.0]
 [-9.9 -10.0]   [-9.9 -9.99]   [-9.9 -9.98]   …  [-9.9 9.99]   [-9.9 10.0]
 [-9.89 -10.0]  [-9.89 -9.99]  [-9.89 -9.98]     [-9.89 9.99]  [-9.89 10.0]
 [-9.88 -10.0]  [-9.88 -9.99]  [-9.88 -9.98]     [-9.8

## Computing the derivative of the Brenier map

We are interested in transporting the following distributions:

$$
\mathcal{N}(m_1,\Sigma_1)\to\mathcal{N}(m_2,\Sigma_2)
$$

Where $\mathcal{N}(\mathbf{\mu}, \Sigma)$ is a multivariate normal with mean $\mathbf{\mu}\in \mathbb{R}^n$ and covariance matrix $\Sigma\in\mathbb{R}^{n\times n}$.

As we are interested in transporting a Gaussian distribution into another Gaussian distribution, the problem of computing the Brenier map between these 2 distributions simplifies to computing the map [1]:

$$
T_{\text{Brenier}}:x\mapsto m_2 + A(x-m_1) \quad \text{with } A = \Sigma_1^{-1/2}(\Sigma_1^{1/2}\Sigma_2\Sigma_1^{1/2})^{1/2}\Sigma_1^{-1/2}
$$

Therefore, the derivative of the Brenier map is $DT_{\text{Brenier}} = A$, which is easy to compute; this is done below. We also compute the singular values of $DT_{\text{Brenier}}$, as they can be used to bound its operator norm (the largest singular value is its spectral norm).

In [7]:
function OT_derivative(Σ₁, Σ₂)
    #=
    Returns a tuple consisting of 
    the derivative of the Brenier map in the first entry
    and a vector containing the singular values
    of the derivative of the Brenier map in the second entry.
    =#
    A = inv(sqrt(Σ₁))*sqrt(sqrt(Σ₁)*Σ₂*sqrt(Σ₁))*inv(sqrt(Σ₁))
    A, svdvals(A)
end

OT_derivative (generic function with 1 method)

### Testing the computation of the derivative of Brenier map

In [8]:
n = 2 # Dimension of multivariate Gaussian
X₁ = rand(n, n) 
X₂ = rand(n, n) # Generate 2 random n×n matrices
C₁ = X₁'*X₁
C₂ = X₂'*X₂ # Use the random matrices to generate 2 random 
# n×n pos. def. matrices, as covariance matrices are pos. def.

OT_map = OT_derivative(C₁, C₂)[1]

2×2 Matrix{Float64}:
 0.0689161  0.218858
 0.218858   1.07786
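
One way to sanity-check the matrix $A$ is to note that the linear map $x\mapsto m_2 + A(x-m_1)$ sends the covariance $\Sigma_1$ to $A\Sigma_1A^T$, which should equal $\Sigma_2$. The sketch below checks this on fixed illustrative matrices. The helpers `psd_sqrt` and `brenier_matrix` are hypothetical names introduced here, and the symmetrisation step is an assumption added to guard against rounding errors, not something the original code does.

```julia
using LinearAlgebra

# Hypothetical helper: symmetrise before taking the matrix square root,
# so that rounding errors cannot break symmetry.
psd_sqrt(M) = sqrt(Symmetric((M + M') / 2))

# Same formula as `OT_derivative`, written with the helper above.
function brenier_matrix(Σ₁, Σ₂)
    R = psd_sqrt(Σ₁)
    Rinv = inv(R)
    Rinv * psd_sqrt(R * Σ₂ * R) * Rinv
end

Σ₁ = [2.0 0.5; 0.5 1.0]   # illustrative positive definite matrices
Σ₂ = [1.0 0.2; 0.2 3.0]
A = brenier_matrix(Σ₁, Σ₂)

# Residual of A Σ₁ Aᵀ = Σ₂; should be close to machine precision.
println(norm(A * Σ₁ * A' - Σ₂))
# The operator 2-norm coincides with the largest singular value.
println(opnorm(A) ≈ maximum(svdvals(A)))
```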

## Computing the Langevin map
We want to numerically compute the Langevin map between two Gaussians. To do so, we first need to introduce $P_t^A(f)$ and $B_t$. We define $P_t^A(f)$ as [2]:
$$
P_t^A(f)(x) = \int_{\mathbb{R}^n}f\left(\exp(-tA)x + \sqrt{\text{Id}-\exp(-2tA)} y\right)d\mu(y)
$$

In the case of transport between two Gaussians, $B_t$ is then defined by [2]:
$$
P_t^A\left(c_0\exp\left(-\frac{1}{2}y^TBy\right)\right)(x) = c_t\exp\left(-\frac{1}{2}x^T B_t x\right)
$$
And $B_t$ can be computed by explicitly computing and rearranging the LHS of the equation above [2].
After doing this with $A=\Sigma_1$ and $B=\Sigma_2$, $B_t$ is:
$$
B_t = (e^{-t\Sigma_1})^T\Sigma_2e^{-t\Sigma_1} - I
$$

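Note that the expression for $B_t$ above contains a mistake that has not yet been corrected. The implementation below omits the $-I$ term, so that $B_0 = \Sigma_2$.

The constant $c_t$ is given by: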
$$
c_t = \frac{\sqrt{\pi}c_0c}{2\left|\det\left({\sqrt{(\sqrt{I-e^{-2t\Sigma_1}})^T \Sigma_2 (\sqrt{I-e^{-2t\Sigma_1}})+\Sigma_1}}\right)\right|}
$$

We implement $B_t$ as a function below and check that it reduces to $\Sigma_2$ at $t=0$.

In [10]:
function Bₜ(Σ₁, Σ₂, t)
    E = exp(-t*Σ₁) # matrix exponential of -tΣ₁
    E'*Σ₂*E
end

Bₜ(C₁, C₂, 0) - C₂

2×2 Matrix{Float64}:
 0.0  0.0
 0.0  0.0

### Testing $B_t$
We reuse the random positive definite matrices generated above:

In [11]:
Bₜ(C₁, C₂, 0)

2×2 Matrix{Float64}:
 0.0473826  0.228384
 0.228384   1.10555
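
A few properties of the expression as implemented can be checked numerically. The sketch below inspects the code cell's formula only (not a corrected version of the equation for $B_t$) on fixed illustrative matrices. The name `B_impl` is hypothetical, used to keep the example self-contained. Since $\Sigma_1$ is positive definite, $e^{-t\Sigma_1}\to 0$ as $t\to\infty$, so the implemented matrix decays to zero for large $t$.

```julia
using LinearAlgebra

# Same expression as the notebook's `Bₜ`, renamed for a self-contained sketch.
B_impl(Σ₁, Σ₂, t) = exp(-t * Σ₁)' * Σ₂ * exp(-t * Σ₁)

Σ₁ = [2.0 0.5; 0.5 1.0]   # illustrative positive definite matrices
Σ₂ = [1.0 0.2; 0.2 3.0]

# At t = 0 both exponentials are the identity, so the result is Σ₂.
println(B_impl(Σ₁, Σ₂, 0) ≈ Σ₂)
# The result stays symmetric for t > 0 (up to rounding).
println(norm(B_impl(Σ₁, Σ₂, 1.0) - B_impl(Σ₁, Σ₂, 1.0)'))
# For large t the matrix is numerically negligible.
println(norm(B_impl(Σ₁, Σ₂, 20.0)))
```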

Once we have $B_t$, the Langevin map is given by solving the following PDE:
$$
\frac{\partial S_t(x)}{\partial t} = B_t S_t(x)
$$

Fixing $x$, this becomes:

$$
\frac{d S_t(x)}{dt} = B_t S_t(x)
$$

which we can solve using a basic ODE solver, such as the Forward Euler method [3].

## Solving for the Langevin map using Forward Euler
We now fix $x$ and solve for $S_t(x)$. Since $x$ is fixed, it is easier to write $S_t(x) = S_x(t)$. We first discretise the time interval into $n$ evenly spaced time points $t_1,\dots,t_n$ between $0$ and $T$, with step size $h$.

In [12]:
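T = 20 # final time, matching the step size shown below
n = 10_000 # number of time points (overwrites the dimension n used above)
t = range(0, T, length=n)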
h = step(t)

0.002000200020002

We now use the Forward Euler method, given by the following recurrence, to solve for $S_x(t)$:
$$
\begin{aligned}
S_x(t_1) &= x, \quad t_1 = 0,\\
S_x(t_{k+1}) &= \left(\text{Id} + hB_{t_k}\right)S_x(t_{k}), \quad 1\leq k\leq n-1
\end{aligned}
$$
This is implemented below as a function that takes $x$ as its initial condition and returns the time evolution of the Langevin map at the discrete time steps between $t=0$ and $t=T$.

In [13]:
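function solve_system(x)
    # Forward Euler for dS/dt = Bₜ S with S(0) = x.
    # Uses the globals n, h, t, C₁ and C₂ defined above.
    Sₜx = zeros(2, n) # column k holds S_x(t[k])
    Sₜx[:, 1] = x
    for k = 1:n-1
        Sₜx[:, k+1] = (I + h.*Bₜ(C₁, C₂, t[k]))*Sₜx[:, k]
    end
    Sₜx
end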

solve_system (generic function with 1 method)
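
As a rough check of the Forward Euler scheme itself, it can be run on a problem with a known solution. For a constant matrix $B$, the ODE $\frac{dS}{dt} = BS$ with $S(0)=x$ has the exact solution $S(t) = e^{tB}x$. The sketch below compares the two. The function `euler_linear`, its signature, and the test matrix `B₀` are assumptions made for illustration and are not part of the original notebook.

```julia
using LinearAlgebra

# Forward Euler for dS/dt = B(t) S with S(ts[1]) = x on the time grid `ts`.
# `B` is any function t -> matrix (an assumed interface for this sketch).
function euler_linear(B, x, ts)
    S = zeros(length(x), length(ts))
    S[:, 1] = x
    for k in 1:length(ts)-1
        h = ts[k+1] - ts[k]
        S[:, k+1] = (I + h * B(ts[k])) * S[:, k]
    end
    S
end

B₀ = [-1.0 0.3; 0.3 -2.0]   # constant test matrix
x₀ = [1.0, -1.0]
exact = exp(B₀) * x₀        # exact solution at t = 1

# The error at t = 1 should shrink roughly in proportion to the step size.
for m in (100, 1_000, 10_000)
    ts = range(0, 1, length=m + 1)
    err = norm(euler_linear(t -> B₀, x₀, ts)[:, end] - exact)
    println("steps = $m, error = $err")
end
```

Because Forward Euler is first-order accurate, multiplying the number of steps by ten is expected to reduce the error by about a factor of ten.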

We now broadcast the function over the square grid created in the preliminaries to get the values of the Langevin map for all points in our grid:

In [14]:
A = solve_system.(square_grid(1, 0.1));
A[1]

2×10000 Matrix{Float64}:
 -1.0  -1.00055  -1.0011   -1.00165  …  -1.25485  -1.25485  -1.25485
 -1.0  -1.00267  -1.00533  -1.00799     -2.03031  -2.03031  -2.03031

## References
[1] - https://djalil.chafai.net/blog/2010/04/30/wasserstein-distance-between-two-gaussians/

[2] - Anastasiya Tanana (2020). Comparison of transport map generated by heat flow interpolation and the optimal transport Brenier map. Communications in Contemporary Mathematics, 23(06), 2050025.

[3] - https://github.com/Imperial-MATH50003/MATH50003NumericalAnalysis/blob/main/notes/MATH50003_numerical_analysis_lecture_notes.pdf