**Mapping closure for $f_B$ in 1D using a Monte-Carlo Implementation**

This notebook introduces the mapping closure developed by (Chen, H. 1989; Pope, S.B. 1991) and discusses how it can be applied to the problem of turbulent scalar mixing. In contrast to these works, which consider a one-point PDF, usually in the context of homogeneous isotropic turbulence, we consider the *global PDF* describing the contents of an arbitrary control volume.


**Content**

We first import all the packages we need to run this example

In [50]:
import numpy as np
import matplotlib.pyplot as plt

**Evolution equation for the PDF**

Following [Pope et al. 1985](https://www.sciencedirect.com/science/article/abs/pii/0360128585900024), the evolution equations for the CDF $F(y,t)$ and the PDF $f(y,t) = \partial F / \partial y$ of the scalar $Y$ are given by

\begin{equation}
\frac{\partial F}{\partial t} = -\mathbb{E}_Y[ \Gamma \nabla^2 Y ] \frac{\partial F}{\partial y},
\end{equation}

and

\begin{equation}
\frac{\partial f}{\partial t} = -\frac{\partial }{\partial y} \left( \mathbb{E}_Y[ \Gamma \nabla^2 Y ] f \right).
\end{equation}

Because $f(y,t)$ contains no spatial information, the conditional expectation $\mathbb{E}_Y[ \Gamma \nabla^2 Y ]$ is unknown and this equation is unclosed. We can, however, apply the mapping closure developed by [Pope et al. 1991](https://link.springer.com/article/10.1007/BF00271466) to model the molecular mixing term.
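To illustrate what the unclosed term looks like when the spatial field is known, the sketch below estimates $\mathbb{E}_Y[\Gamma \nabla^2 Y]$ by binning. The field $Y(x) = \cos(x)$, the value of $\Gamma$ and the number of bins are assumptions chosen for illustration only. For this field $\nabla^2 Y = -Y$, so the conditional expectation should be close to $-\Gamma y$.

```python
import numpy as np

# Assumed example field (not from the notebook): Y(x) = cos(x) on [0, 2π)
Gamma = 0.1
x = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)
Y = np.cos(x)
lap_Y = -np.cos(x)  # exact Laplacian of Y

# Bin the domain by the value of Y and average Γ∇²Y within each bin
edges = np.linspace(-1.0, 1.0, 21)
y_mid = 0.5 * (edges[1:] + edges[:-1])
idx = np.clip(np.digitize(Y, edges) - 1, 0, len(y_mid) - 1)
counts = np.bincount(idx, minlength=len(y_mid))
sums = np.bincount(idx, weights=Gamma * lap_Y, minlength=len(y_mid))
cond_mean = sums / counts  # estimate of E[Γ∇²Y | Y = y] on the bin centres

# Largest deviation from the exact result -Γ y (limited by the bin width)
print(np.max(np.abs(cond_mean + Gamma * y_mid)))
```

Without access to $Y(x)$ this average cannot be formed from $f(y,t)$ alone, which is why a closure is needed.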

The idea behind this closure relies on Gaussian random fields. What makes these random fields particularly useful is that:
- They are completely defined by their mean and covariance:
\begin{equation}
    \mu = \mathbb{E}[\theta(\mathbf{z})], \quad \rho(r) = \mathbb{E}[\theta(\mathbf{z})\theta(\mathbf{z} + \mathbf{e}r)] - \mu^2,
\end{equation}
- They allow the conditional expectation to be calculated explicitly, using the fact that in 1D the curvature of the covariance at the origin gives the mean-square gradient:
\begin{equation}
-\lim_{r \to 0} \frac{\partial^2 \rho(r)}{\partial r^2} = \left\langle \frac{\partial \theta}{\partial z_i} \frac{\partial \theta}{\partial z_i} \right\rangle.
\end{equation}

A further useful fact is that many different fields can have the same global PDF.
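The relation between the covariance and the mean-square gradient can be checked numerically. The sketch below builds a periodic 1D Gaussian random field from a few Fourier modes; the number of modes and the mode variances `sigma2` are hypothetical choices, not taken from the references. For a statistically homogeneous field $\rho(r) = \sum_k \sigma_k^2 \cos(kr)$, so that $\rho''(0) = -\langle (\partial \theta / \partial z)^2 \rangle$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical set-up: periodic field on [0, 2π) built from a few Fourier modes
n_modes, n_grid, n_samples = 8, 256, 4000
z = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
k = np.arange(1, n_modes + 1)

# Assumed mode variances σ_k², normalised so that ρ(0) = Var[θ] = 1
sigma2 = np.exp(-k**2 / 8.0)
sigma2 /= sigma2.sum()

# θ(z) = Σ_k a_k cos(kz) + b_k sin(kz) with independent a_k, b_k ~ N(0, σ_k²)
a = rng.normal(size=(n_samples, n_modes)) * np.sqrt(sigma2)
b = rng.normal(size=(n_samples, n_modes)) * np.sqrt(sigma2)
kz = np.outer(k, z)
theta = a @ np.cos(kz) + b @ np.sin(kz)
dtheta = (b * k) @ np.cos(kz) - (a * k) @ np.sin(kz)  # exact ∂θ/∂z

# ρ(r) = Σ_k σ_k² cos(kr), hence ρ''(0) = -Σ_k k² σ_k²
rho_dd0 = -np.sum(k**2 * sigma2)

var_theta = np.mean(theta**2)
grad_sq = np.mean(dtheta**2)
print(f"Var[θ] ≈ {var_theta:.3f}")
print(f"<(∂θ/∂z)²> ≈ {grad_sq:.3f},  -ρ''(0) = {-rho_dd0:.3f}")
```

The two printed gradient statistics should agree to within the Monte-Carlo sampling error.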


**Mapping Closure**

Based on the properties of the Gaussian random field the idea is therefore to find a mapping 
\begin{equation}
\tilde{Y}(\mathbf{x},t) = \mathscr{Y}(\theta(\mathbf{x} J(t)),t),
\end{equation}
such that we can express the CDF $F$ in terms of the cumulative Gaussian $G$ 
\begin{equation}
F(\mathscr{Y}(\eta,t),t) = G(\eta).
\end{equation}
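When the quantile function $F^{-1}$ of the target distribution is known, the mapping follows directly as $\mathscr{Y}(\eta) = F^{-1}(G(\eta))$. The sketch below assumes, purely for illustration, a target that is uniform on $[-1, 1]$, and checks that standard normal samples pushed through the mapping reproduce the target CDF.

```python
import numpy as np
from scipy.stats import norm, uniform

rng = np.random.default_rng(1)

# Assumed target distribution: Y uniform on [-1, 1] (illustrative choice)
target = uniform(loc=-1.0, scale=2.0)

def mapping(eta):
    """Mapping 𝒴(η) = F^{-1}(G(η)), with G the standard normal CDF."""
    return target.ppf(norm.cdf(eta))

# Push Gaussian samples through the mapping
theta = rng.standard_normal(50_000)
Y = mapping(theta)

# Compare the empirical CDF of Y with the target CDF at a few points
y_check = np.array([-0.5, 0.0, 0.5])
F_emp = np.array([(Y <= y).mean() for y in y_check])
print(F_emp, target.cdf(y_check))
```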

Differentiating the last equation with respect to time at fixed $\eta$ we obtain
\begin{equation}
\frac{ \partial F}{\partial t} = -\frac{ \partial \mathscr{Y}}{\partial t} \frac{ \partial F}{\partial y},
\end{equation}
which, on comparison with the evolution equation for $F$, can be expressed as
\begin{equation}
\frac{\partial \mathscr{Y}}{\partial t} = \mathbb{E}_Y[ \Gamma \nabla^2 Y ] = \underbrace{\Gamma \frac{J^2(t)}{\lambda_{\theta}^2}}_{= 1/\tau} \left( \frac{\partial^2 \mathscr{Y} }{\partial \eta^2} - \eta \frac{\partial \mathscr{Y} }{\partial \eta} \right).
\end{equation}
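As a minimal sketch of how the mapping equation behaves, the code below advances it with explicit finite differences, assuming the prefactor is collected into a single constant rate $1/\tau$. The grid, time step and boundary treatment are assumptions made for this example. A Gaussian initial PDF corresponds to $\mathscr{Y}(\eta, 0) = \eta$, for which the exact solution is $\mathscr{Y} = \eta \, e^{-t/\tau}$.

```python
import numpy as np

def step_mapping(Y, eta, dt, tau):
    """One explicit Euler step of ∂𝒴/∂t = (1/τ)(∂²𝒴/∂η² − η ∂𝒴/∂η)."""
    d_eta = eta[1] - eta[0]
    dY = np.gradient(Y, d_eta)  # central differences, one-sided at the ends
    ddY = np.empty_like(Y)
    ddY[1:-1] = (Y[2:] - 2.0 * Y[1:-1] + Y[:-2]) / d_eta**2
    ddY[0], ddY[-1] = ddY[1], ddY[-2]  # assumed boundary treatment
    return Y + (dt / tau) * (ddY - eta * dY)

eta = np.linspace(-4.0, 4.0, 81)
tau, dt, n_steps = 1.0, 2.0e-3, 500  # dt chosen below d_eta² τ / 2 for stability
Y = eta.copy()                       # Gaussian initial PDF: 𝒴(η, 0) = η
for _ in range(n_steps):
    Y = step_mapping(Y, eta, dt, tau)

# Compare with the exact linear solution at the final time
t_end = n_steps * dt
Y_exact = eta * np.exp(-t_end / tau)
print(np.max(np.abs(Y - Y_exact)))
```

The mapping contracts towards zero, which corresponds to the PDF of $Y$ relaxing towards a delta function at the mean.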

**Particle Implementation**

Although we can solve this equation as a PDE on a grid using a standard method such as finite differences, it is also possible to use a particle method. Let $g(\eta)$ be the PDF of the random variable $\theta_t$ generated by an Ornstein-Uhlenbeck process:

\begin{equation}
  d \theta_{t} = -\frac{\theta_{t}}{T} d t +\left(\frac{2}{T} \right)^{1/2} d W_{t},
\end{equation}

and let $Y_{t}=\mathscr{Y}(\theta_t,t)$. It then follows from Itô's lemma that

\begin{equation}
  d Y_{t} = \frac{\partial \mathscr{Y}_{t} }{\partial t} dt + \frac{1}{T} \left( -\eta \frac{\partial \mathscr{Y}_{t}}{\partial \eta} + \frac{\partial^2 \mathscr{Y}_{t}}{\partial \eta^{2}} \right) dt + \frac{\partial \mathscr{Y}_{t}}{\partial \eta} \left( \frac{2}{T} \right)^{1/2} d W_{t},
\end{equation}

where $\eta$ is the sample-space (or dummy) variable corresponding to the random variable $\theta_t$. Substitution for $\frac{\partial \mathscr{Y}_{t} }{\partial t}$ then gives the full system as

\begin{align*} 
 d \theta_{t} &= -\frac{\theta_{t}}{T} d t + \left(\frac{2}{T} \right)^{1/2} d W_{t},\\
 d      Y_{t} &= \left( \frac{1}{\tau} + \frac{1}{T} \right) \left( -\eta \frac{\partial \mathscr{Y}_{t}}{\partial \eta} + \frac{\partial^2 \mathscr{Y}_{t}}{\partial \eta^{2}} \right) dt + \frac{\partial \mathscr{Y}_{t}}{\partial \eta} \left( \frac{2}{T} \right)^{1/2} d W_{t},
\end{align*}

where it is understood that $\eta$, $\mathscr{Y}_t(\eta)$ and its derivatives are to be evaluated at $\eta = \theta_t$.
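The coupled system can be checked in a case where the mapping is known in closed form. The sketch below assumes a Gaussian initial PDF for $Y$, for which the mapping stays linear, $\mathscr{Y}(\eta,t) = e^{-t/\tau}\eta$, so that $\partial \mathscr{Y}/\partial \eta = e^{-t/\tau}$ and $\partial^2 \mathscr{Y}/\partial \eta^2 = 0$. The particle number, time step and parameter values are illustrative. Both equations are driven by the same Brownian increment, as the shared $dW_t$ in the system indicates.

```python
import numpy as np

rng = np.random.default_rng(2)

n_particles, dt, n_steps = 20_000, 1.0e-3, 1000
T, tau = 1.0, 1.0

theta = rng.standard_normal(n_particles)  # stationary OU state, θ ~ N(0, 1)
Y = theta.copy()                           # Y_0 = 𝒴(θ_0, 0) = θ_0

for n in range(n_steps):
    a = np.exp(-n * dt / tau)  # ∂𝒴/∂η for the linear mapping; ∂²𝒴/∂η² = 0
    dW = np.sqrt(dt) * rng.standard_normal(n_particles)  # shared by θ and Y
    # Euler-Maruyama step; Y is updated first so that it uses θ at step n
    Y = Y + (1 / tau + 1 / T) * (0.0 - theta * a) * dt + a * np.sqrt(2 / T) * dW
    theta = theta - theta / T * dt + np.sqrt(2 / T) * dW

t_end = n_steps * dt
print(np.std(theta), np.std(Y), np.exp(-t_end / tau))
```

The standard deviation of $\theta_t$ should stay close to one while that of $Y_t$ decays like $e^{-t/\tau}$, and the two variables remain almost perfectly correlated.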

Determining $\mathscr{Y}_t$ and its derivatives requires that $F(\mathscr{Y}(\eta,t),t) = G(\eta)$ be satisfied. Using the inverse CDF $F^{-1}(p,t)$ we can determine

\begin{equation}
\mathscr{Y}(\eta,t) = F^{-1}( G(\eta), t),
\end{equation}

as a function of $\eta$. Differentiating $F(\mathscr{Y}(\eta,t),t) = G(\eta)$ then gives 

\begin{equation}
\frac{\partial \mathscr{Y}_{t}}{\partial \eta} = \frac{g(\eta)}{f(\mathscr{Y}(\eta,t),t)},
\end{equation}

which, differentiated again with respect to $\eta$ using $\partial g / \partial \eta = -\eta g$ and the chain rule for $f(\mathscr{Y}(\eta,t),t)$, gives

\begin{equation}
\frac{\partial^2 \mathscr{Y}_{t}}{\partial \eta^2} = - \eta \frac{g}{f} - \frac{g}{f^2} \frac{\partial f}{\partial y} \frac{\partial \mathscr{Y}_{t}}{\partial \eta} = - \frac{g}{f} \left( \eta + \frac{g}{f^2} \frac{\partial f}{\partial y} \right),
\end{equation}

an expression that is a function of $g$, $f$, $\partial f / \partial y$ and $\eta$ only. Because the coefficients of the system depend on the global PDF of $Y_t$ at each time step, it corresponds to a McKean-Vlasov equation.
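The derivative expressions can be verified against finite differences for a target with a known quantile function. The sketch below assumes a standard logistic target, chosen only because $F$, $f$ and $\partial f/\partial y = f(1 - 2F)$ are available in closed form. It compares $\partial \mathscr{Y}/\partial \eta = g/f$ and the chain-rule result $\partial^2 \mathscr{Y}/\partial \eta^2 = -(g/f)\left(\eta + (g/f^2)\,\partial f/\partial y\right)$ with numerical derivatives of $\mathscr{Y}(\eta) = F^{-1}(G(\eta))$.

```python
import numpy as np
from scipy.stats import norm, logistic

eta = np.linspace(-3.0, 3.0, 601)
d_eta = eta[1] - eta[0]

# Assumed target PDF: standard logistic (illustrative choice)
Y = logistic.ppf(norm.cdf(eta))          # 𝒴(η) = F^{-1}(G(η))
g = norm.pdf(eta)
f = logistic.pdf(Y)
df = f * (1.0 - 2.0 * logistic.cdf(Y))   # ∂f/∂y for the logistic PDF

# Closed-form derivatives of the mapping
dY_formula = g / f
ddY_formula = -(g / f) * (eta + (g / f**2) * df)

# Finite-difference reference values
dY_fd = np.gradient(Y, d_eta)
ddY_fd = np.gradient(dY_fd, d_eta)

# Exclude a few end points, where the one-sided differences are less accurate
interior = slice(5, -5)
err_d = np.max(np.abs(dY_formula - dY_fd)[interior])
err_dd = np.max(np.abs(ddY_formula - ddY_fd)[interior])
print(err_d, err_dd)
```

In a particle simulation $f$ and $\partial f/\partial y$ are not known exactly and must be estimated from the samples, for example by a histogram or a kernel density estimate.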

In [None]:
# from KDEpy import FFTKDE

# def Y_map_KDE(data_x,data_y):
#     """
#     Calculates E[Y|X=x] = int f_Y|X(y|x)*y dy = int f_XY(x,y)*y dy / f_X(x)
#     """
#     # Grid points in the x and y direction
#     grid_points_x, grid_points_y = 2**8, 2**8

#     # Stack the data for 2D input, compute the KDE
#     data = np.vstack((data_x, data_y)).T
#     kde = FFTKDE(kernel='gaussian', bw=.1).fit(data)
#     grid, points = kde.evaluate((grid_points_x, grid_points_y))

#     # Retrieve grid values, reshape output and plot boundaries
#     x, y = np.unique(grid[:, 0]), np.unique(grid[:, 1])
#     f_XY = points.reshape(grid_points_x, grid_points_y)
    
#     EY_cX =  np.sum(  (f_XY.T / np.sum(f_XY, axis=1) ).T  * y , axis=1)

#     return EY_cX, x

def inverse_cdf(X, p):
    """
    Compute the inverse CDF (quantile function) for a given dataset.
    
    Parameters:
    - X: numpy array of float values
    - p: array of probabilities for which to compute the quantile function
    
    Returns:
    - quantiles: values corresponding to the given probabilities
    """
    x = np.sort(X)  # Sort data for empirical CDF
    n = len(x)
    
    # Create an empirical CDF
    F = np.linspace(1/n, 1, n)  # Avoid zero probability
    
    # Interpolate to get the quantiles
    return np.interp(x=p, xp=F, fp=x)

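def Y_map(Y, θ, N=32):
    """
    Estimate the mapping 𝒴(η) = F^{-1}(G(η)) and its η-derivatives.

    Returns
    - f:   histogram estimate of the PDF of Y (N bins)
    - η:   right-hand bin edges of the θ histogram (the η grid)
    - Y_η: mapping 𝒴 evaluated on the η grid
    - dY:  ∂𝒴/∂η interpolated to the particle positions θ
    - ddY: ∂²𝒴/∂η² interpolated to the particle positions θ
    """

    # PDFs
    f, y = np.histogram(Y, bins=N, density=True)
    g, η_edges = np.histogram(θ, bins=N, density=True)

    # CDF of θ at the right-hand bin edges (uniform bins, so a cumsum suffices)
    η = η_edges[1:]
    G = np.cumsum(g)
    G /= G[-1]

    # Inverse CDF of Y evaluated at G(η) gives the mapping on the η grid
    Y_η = inverse_cdf(Y, G)

    # Finite-difference derivatives on the grid, interpolated to the particles
    dY_η = np.gradient(Y_η, η)
    ddY_η = np.gradient(dY_η, η)
    dY = np.interp(θ, η, dY_η)
    ddY = np.interp(θ, η, ddY_η)

    return f, η, Y_η, dY, ddY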

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde, norm

# Parameters
num_particles = 10**3  # Number of Monte Carlo samples
num_steps = 10     # Time steps
dt = 0.01         # Time step size
T  = 1
τ  = 1

Y_min = -2
Y_max =  2

# Brownian increments
dW_t = np.sqrt(dt) * norm.rvs(loc=0, scale=1, size=(num_particles, num_steps, 2))  

# Container
θ = np.zeros((num_particles, num_steps))
Y = np.zeros((num_particles, num_steps))

# Initial conditions
θ[:,0] = np.random.normal(0, 1, num_particles)  # Initial condition (Normal distribution)

Y[:,0] = np.random.normal(0, 1, num_particles)  

# Half of the particles to +1 and the other to -1
#Y[:num_particles//2,0] = 1
#Y[num_particles//2:,0] =-1

# Euler Maruyama
for n in range(1, num_steps):

    # if n%(num_steps // 10) == 0:
    #     plt.plot(η, Y_η, 'k')
    #     plt.show()
    #     plt.hist(Y[:, n-1], bins=50, density=True, alpha=0.6)
    #     #plt.xlim([-1.1, 1.1])
    #     plt.show()

    # Estimate the mapping and its derivatives from the current particles
    f, η, Y_η, dY, ddY = Y_map(Y[:, n-1], θ[:, n-1])

    # Update particles (θ and Y are driven by the same Brownian increment)
    θ[:, n] = θ[:, n-1] -  (      1/T) * (         θ[:, n-1]) * dt +    np.sqrt(2/T) * dW_t[:,n-1,0]
    Y[:, n] = Y[:, n-1] +  (1/τ + 1/T) * (ddY - dY*θ[:, n-1]) * dt + dY*np.sqrt(2/T) * dW_t[:,n-1,0]

    # Apply Reflecting/bcs
    # Y[:, n] = np.where(Y[:, n] > Y_max, Y_max - (Y[:, n] - Y_max), Y[:, n]) # Reflect back inside
    # Y[:, n] = np.where(Y[:, n] < Y_min, Y_min + (Y_min - Y[:, n]), Y[:, n]) # Reflect back inside


# # Estimate the probability density function at final time step using KDE
# Y_values = np.linspace(-1.2, 1.2, 100)
# kde = gaussian_kde(Y[:, -1])
# pdf_values = kde(Y_values)

# # Plot the Monte Carlo histogram and estimated density
# plt.figure(figsize=(8, 5))
# plt.hist(Y[:, -1], bins=50, density=True, alpha=0.6, label="Monte Carlo Histogram")
# plt.plot(Y_values, pdf_values, 'r-', label="Kernel Density Estimation (KDE)")
# plt.xlabel("x")
# plt.ylabel("Probability Density")
# plt.title("Monte Carlo Solution of Fokker-Planck")
# plt.legend()
# plt.show()