# Inference for a Generalized Langevin Equation

**Feiyu Zhu, Martin Lysy, University of Waterloo**

**February 10, 2022**

## The GLE and its Quasi-Markov Representation

Consider a generalized Langevin equation (GLE) for a position process $X(t)$ given by

$$
\ddot X(t) = - \pot'_{\pph}(X(t)) - \int_0^t \gamma(t-s) \dot X(s) \ud s + F(t),
$$

where $V(t) = \frac{\ud}{\ud t} X(t) = \dot X(t)$ is the velocity process, $\ddot X(t) = \dot V(t)$ is the acceleration process, $\pot'_{\pph}(X)$ is the derivative of the potential energy $\pot_{\pph}(X)$, and $F(t)$ is a stationary Gaussian process with autocovariance

$$
\cov(F(t), F(t+h)) = \beta^{-1} \gamma(h) = \beta^{-1} \sum_{k=1}^K \mu_k^2 e^{-\rho_k h}.
$$
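As a small illustration, the kernel $\gamma(h)$ and the implied force autocovariance can be evaluated directly from $(\mmu, \rrh, \beta)$.  The sketch below is plain Python; the function names `memory_kernel` and `force_autocov` are hypothetical and do not belong to any package, and `abs(h)` is used on the assumption that the stationary autocovariance is symmetric in the lag.

```python
import math

def memory_kernel(h, mu, rho):
    """gamma(h) = sum_k mu_k^2 * exp(-rho_k * |h|)."""
    return sum(m**2 * math.exp(-r * abs(h)) for m, r in zip(mu, rho))

def force_autocov(h, mu, rho, beta):
    """cov(F(t), F(t+h)) = gamma(h) / beta."""
    return memory_kernel(h, mu, rho) / beta

# Example with K = 2 exponential modes (illustrative values only).
mu, rho, beta = [1.0, 2.0], [0.5, 1.5], 2.0
print(memory_kernel(0.0, mu, rho), force_autocov(0.0, mu, rho, beta))
```

At lag zero the kernel reduces to $\sum_k \mu_k^2$, which gives a quick sanity check on the parameter values.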

In most applications $\beta$ is known, so that the model parameters are $\tth = (\pph, \mmu, \rrh)$, where $\mmu = (\mu_1, \ldots, \mu_K)$ and $\rrh = (\rho_1, \ldots, \rho_K)$ with $\mu_k, \rho_k > 0$.  The goal is to estimate these parameters from discrete observations $\XX = (X(t_0), \ldots, X(t_N))$.  To do this, it is useful to note that $X(t)$ has the same distribution as the first component of the solution to the stochastic differential equation (SDE) given by

$$
\begin{aligned}
\ud X(t) &= V(t) \ud t \\
\ud V(t) &= -\pot'_{\pph}(X(t)) \ud t + \sum_{k=1}^K \mu_k Z_k(t)  \ud t \\
\ud Z_k(t) &= -\left(\mu_k V(t) + \rho_k Z_k(t) \right) \ud t + \sigma_k \ud B_k(t),
\end{aligned}
$$

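where $B_1(t), \ldots, B_K(t)$ are independent standard Brownian motions and $\sigma_k = \sqrt{2\rho_k/\beta}$, so that the noise component of $\mu_k Z_k(t)$ has stationary autocovariance $\mu_k^2 \tfrac{\sigma_k^2}{2\rho_k} e^{-\rho_k h} = \beta^{-1} \mu_k^2 e^{-\rho_k h}$, in agreement with the autocovariance of $F(t)$.  Moreover, let $\ZZ(t) = (Z_1(t), \ldots, Z_K(t))$ and $\WW(t) = (X(t), V(t), \ZZ(t))$.  Then the stationary distribution of $\WW(t)$ is [TBD].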


## Modified Euler Discretization Scheme

Let $X_n = X(t_n)$, $V_n = V(t_n)$, $\ZZ_n = \ZZ(t_n)$, etc., and let $\Delta X_n(s) = X(t_n+s) - X(t_n)$, and similarly for $\Delta V_n(s)$, $\Delta \WW_n(s)$, $\Delta B_{nk}(s)$, etc.  Then for small $s$, the SDE above may be approximated on the time interval $t \in (t_n, t_n+s)$ by

$$
\begin{aligned}
\Delta X_n(s)    & = \int_{0}^{s} \left(V_n + \Delta V_n(h)\right) \ud h \\
\Delta V_n(s)    & = -\pot'_{\pph}(X_n) s + \sum_{k=1}^K \mu_k \int_{0}^{s} \left(Z_{nk} + \Delta Z_{nk}(h)\right) \ud h \\
\Delta Z_{nk}(s) & = -\left(\mu_k V_n + \rho_k Z_{nk}\right)s + \sigma_k \Delta B_{nk}(s).
\end{aligned}
$$
 

Let $U_{nk}^{(j)}(s)$ for $j \in \{0,1,2\}$ be defined as

$$
\begin{aligned}
U^{(2)}_{nk}(s) & = \sigma_k \Delta B_{nk}(s), \\
U^{(1)}_{nk}(s) & = \int_0^s U^{(2)}_{nk}(h) \ud h, \\
U^{(0)}_{nk}(s) & = \int_0^s U^{(1)}_{nk}(h) \ud h.
\end{aligned}
$$
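Because $U^{(2-a)}_{nk}(s) = \sigma_k \int_0^s \frac{(s-u)^a}{a!} \ud B_k(t_n + u)$ for $a = 0, 1, 2$ (the Cauchy formula for repeated integration), the Itô isometry gives $\cov(U^{(2-a)}_{nk}(s), U^{(2-b)}_{nk}(s)) = \sigma_k^2 \, s^{a+b+1} / \{(a+b+1)\, a!\, b!\}$.  The following Python sketch tabulates these coefficients exactly; the function name `iterated_cov_coef` is hypothetical.

```python
from fractions import Fraction
from math import factorial

def iterated_cov_coef(a, b):
    """Coefficient c such that the covariance of the a-fold and b-fold
    integrals of sigma * Brownian motion at time s is sigma^2 * c * s**(a+b+1)."""
    return Fraction(1, (a + b + 1) * factorial(a) * factorial(b))

# Rows and columns ordered as (U^(0), U^(1), U^(2)), i.e. (2, 1, 0)-fold integrals.
folds = (2, 1, 0)
coefs = [[iterated_cov_coef(a, b) for b in folds] for a in folds]
for row in coefs:
    print([str(c) for c in row])
```

The resulting table has $1/20$, $1/8$, $1/6$ in its first row, $1/3$ and $1/2$ on the remainder of the second row, and $1$ in the last diagonal entry, which are the coefficients of the powers of $h$ in $\SSi(h)$.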

Let $\UU_n^{(j)}(s) = (U^{(j)}_{n1}(s), \ldots, U^{(j)}_{nK}(s))$ and consider the $3K$-dimensional process $\UU_n(s) = (\UU^{(0)}_n(s), \UU^{(1)}_n(s), \UU^{(2)}_n(s))$.  Then upon letting $A_{nk} = \mu_k V_n + \rho_k Z_{nk}$ and $\AA_{n} = (A_{n1}, \ldots, A_{nK})$, we have

$$
\Delta \WW_n(s) = \lla(\WW_n, s) + \tilde \ZZ_n(s),
$$

where each of the three terms above is a $(K+2)$-dimensional process with 

$$
\begin{aligned}
\lla(\WW_n, s) & = 
\begin{bmatrix}
V_n s -\tfrac 1 2 \pot'_{\pph}(X_n) s^2 + \mmu'\left(\tfrac 1 2 \ZZ_n s^2 - \tfrac 1 6 \AA_n s^3\right) \\
-\pot'_{\pph}(X_n) s + \mmu' \left(\ZZ_n s - \tfrac 1 2 \AA_n s^2 \right) \\
-\AA_n s
\end{bmatrix}, & 
\tilde \ZZ_n(s) & = 
\begin{bmatrix} 
\mmu' & \bz & \bz \\
\bz & \mmu' & \bz \\
\bz & \bz & \Id_{K\times K}
\end{bmatrix}\UU_n(s).
\end{aligned}
$$

Note that $\UU_n(s)$ is a Markov process, with $\UU_n(0) = \bz$ and

$$
\UU_n(s+h) \mid \UU_n(s) \sim \N\left\{\left(\RR(h) \otimes \Id_{K\times K}\right) \UU_n(s), \SSi(h) \otimes \diag(\ssi^2) \right\},
$$

where 

$$
\begin{aligned}
\RR(h) & = 
\begin{bmatrix}
1 & h & \tfrac 1 2 h^2 \\
0 & 1 & h \\
0 & 0 & 1
\end{bmatrix}, & 
\SSi(h) & =
\begin{bmatrix}
\tfrac{1}{20} h^5 & \tfrac{1}{8} h^4 & \tfrac{1}{6} h^3 \\
\tfrac{1}{8} h^4 & \tfrac{1}{3} h^3 & \tfrac{1}{2} h^2 \\
\tfrac{1}{6} h^3 & \tfrac{1}{2} h^2 & h
\end{bmatrix},
\end{aligned}
$$

and where $\otimes$ is the Kronecker matrix product.  It follows that $\tilde \ZZ_n(s)$ is also a Markov process with $\tilde \ZZ_n(0) = \bz$ and 

$$
\tilde \ZZ_n(s+h) \mid \tilde \ZZ_n(s) \sim 
\N\left( \tilde \RR(h) \tilde \ZZ_n(s), \tilde \SSi(h) \right),
$$

where

$$
\begin{aligned}
\tilde \RR(h) & = \begin{bmatrix}
1 & h & \tfrac 1 2 h^2 \mmu' \\
0 & 1 & h \mmu' \\
0 & 0 & \Id_{K\times K}
\end{bmatrix}, & 
\tilde \SSi(h) & = \begin{bmatrix} 
\tfrac{1}{20} \gamma h^5 & \tfrac {1}{8} \gamma h^4 & \tfrac{1}{6}h^3 \mmu' \diag(\ssi^2) \\
\tfrac {1}{8} \gamma h^4 & \tfrac{1}{3} \gamma h^3 & \tfrac{1}{2}h^2 \mmu' \diag(\ssi^2) \\
\tfrac{1}{6}h^3 \diag(\ssi^2) \mmu & \tfrac{1}{2}h^2 \diag(\ssi^2) \mmu  & h \diag(\ssi^2)
\end{bmatrix},
\end{aligned}
$$

and where $\gamma = \sum_{k=1}^K \mu_k^2 \sigma_k^2$ is a scalar constant (not the memory kernel $\gamma(t)$).  Now let $t_n = n \dt$.  The following algorithm then simulates $\WW_0, \ldots, \WW_N$, where $\WW_n = \WW(t_n) = \WW(n \dt)$:

- Fix the value of $\WW_0$, or draw it from its stationary distribution.

- Given $\WW_n$, generate $\WW_{n+1}$ via

    $$
    \WW_{n+1} = \WW_n + \lla(\WW_n, \dt) + \tilde \ZZ_n,
    $$
    
    where $\tilde \ZZ_n \iid \N(\bz, \tilde \SSi(\dt))$.
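The two steps above can be sketched in plain Python as follows.  This is only a sketch under stated assumptions, not a reference implementation: all function names are hypothetical, `dpot` is a user-supplied derivative of the potential, and the position drift is taken to be $V_n \dt - \tfrac 1 2 \pot'_{\pph}(X_n) \dt^2 + \cdots$, which is what integrating $V_n + \Delta V_n(h)$ over $(0, \dt)$ gives.  Rather than factorizing $\tilde \SSi(\dt)$, the noise is drawn by sampling each triplet $(U^{(0)}_{nk}, U^{(1)}_{nk}, U^{(2)}_{nk}) \sim \N(\bz, \sigma_k^2 \SSi(\dt))$ and applying the linear map that defines $\tilde \ZZ_n$, which yields the same distribution.

```python
import math
import random

def cholesky(a):
    """Lower-triangular Cholesky factor of a small positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][p] * low[j][p] for p in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def iwp_cov(h):
    """Sigma(h): covariance of (U0, U1, U2) for a unit diffusion coefficient."""
    return [[h**5 / 20, h**4 / 8, h**3 / 6],
            [h**4 / 8, h**3 / 3, h**2 / 2],
            [h**3 / 6, h**2 / 2, h]]

def drift(w, dt, mu, rho, dpot):
    """Deterministic increment lambda(W_n, dt) for w = [x, v, z_1, ..., z_K]."""
    x, v, z = w[0], w[1], w[2:]
    a = [m * v + r * zk for m, r, zk in zip(mu, rho, z)]  # A_n
    mu_z = sum(m * zk for m, zk in zip(mu, z))            # mu' Z_n
    mu_a = sum(m * ak for m, ak in zip(mu, a))            # mu' A_n
    g = dpot(x)
    dx = v * dt - 0.5 * g * dt**2 + 0.5 * mu_z * dt**2 - mu_a * dt**3 / 6
    dv = -g * dt + mu_z * dt - 0.5 * mu_a * dt**2
    return [dx, dv] + [-ak * dt for ak in a]

def noise(dt, mu, sigma, rng):
    """Draw tilde Z_n by sampling the K triplets U_nk(dt) and mapping them."""
    low = cholesky(iwp_cov(dt))
    triplets = []
    for sk in sigma:
        e = [rng.gauss(0.0, 1.0) for _ in range(3)]
        triplets.append([sk * sum(low[i][j] * e[j] for j in range(i + 1))
                         for i in range(3)])
    nx = sum(m * u[0] for m, u in zip(mu, triplets))  # mu' U^(0)
    nv = sum(m * u[1] for m, u in zip(mu, triplets))  # mu' U^(1)
    return [nx, nv] + [u[2] for u in triplets]        # U^(2)

def step(w, dt, mu, rho, sigma, dpot, rng):
    """One transition W_n -> W_{n+1} of the modified Euler scheme."""
    lam = drift(w, dt, mu, rho, dpot)
    eps = noise(dt, mu, sigma, rng)
    return [wi + li + ei for wi, li, ei in zip(w, lam, eps)]

# Illustrative run: harmonic potential U(x) = x^2 / 2, K = 2, fixed W_0.
rng = random.Random(0)
mu, rho, sigma = [1.0, 2.0], [0.5, 1.5], [0.3, 0.6]
path = [[0.0, 1.0, 0.0, 0.0]]
for _ in range(5):
    path.append(step(path[-1], 0.01, mu, rho, sigma, lambda x: x, rng))
```

Passing `sigma` explicitly keeps the sketch independent of how $\sigma_k$ is parameterized in terms of $\rho_k$ and $\beta$.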

## Particle Filtering

As described in the SDE document, parameter inference is based on noisy GLE observations $\YY = (\YY_0, \ldots, \YY_N)$ with

$$
\YY_n \ind \N(\AA \WW(t_n), \OOm),
$$

where for simplicity we assume that $t_n = n \dt$.  In fact, we are most interested in the case where $\AA = \left[\begin{smallmatrix} 1 & \bz \\ \bz & \bz \end{smallmatrix}\right]$ and $\OOm = \varepsilon^2 \AA$, which corresponds to observing $Y_n \ind \N(X_n, \varepsilon^2)$, and letting $\varepsilon \to 0$.
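In a particle filter the measurement model enters through its log-density.  For the scalar special case $Y_n \ind \N(X_n, \varepsilon^2)$ this can be sketched as below; the function name `meas_lpdf` is hypothetical and is not claimed to match the PFJAX interface.

```python
import math

def meas_lpdf(y, w, eps):
    """Log-density of Y_n ~ N(X_n, eps^2), where X_n = w[0] is the position."""
    resid = (y - w[0]) / eps
    return -0.5 * resid**2 - math.log(eps) - 0.5 * math.log(2.0 * math.pi)

# Only the position component of the state affects the weight.
print(meas_lpdf(0.1, [0.0, 1.0, 0.2], 0.05))
```

As $\varepsilon \to 0$ this density concentrates on $y = X_n$, so particle weights become increasingly uneven unless the proposal is guided by the observation.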

For resolution number $m \ge 1$, let $\WW\up m_n$ denote the value of the SDE solution at time $t = n \dt/m$, so that $\WW\up m_{mn} = \WW_n = \WW(t_n)$.  Then in the PFJAX notation we have

$$
\begin{aligned}
\xx_n & = \WW\up m_{(n-1)m+1:nm}, & \yy_n & = \YY_n.
\end{aligned}
$$
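To make the indexing concrete, the sketch below lists the fine-grid indices and times that make up $\xx_n$, assuming the block runs through index $nm$ so that its last element is $\WW\up m_{nm} = \WW_n$.  The helper names are hypothetical.

```python
def latent_block(n, m):
    """Fine-grid indices (n-1)m+1, ..., nm whose states make up x_n."""
    return range((n - 1) * m + 1, n * m + 1)

def fine_times(n, m, dt):
    """Times i * dt / m at which the states in x_n are evaluated."""
    return [i * dt / m for i in latent_block(n, m)]

print(list(latent_block(2, 3)))  # indices 4, 5, 6
print(fine_times(2, 4, 1.0))     # ends at t_2 = 2.0
```

Each block contains $m$ states, so the latent dimension per observation grows linearly with the resolution number.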

*[TBD]* A bridge proposal for $\xx_n$ is constructed as follows:

- Suppose that $\WW\up m_{(n-1)m+i}$ is given and we wish to draw the proposal for $\WW\up m_{(n-1)m+i+1}$.

- Suppose that the Euler approximation above holds over the time interval $t \in (t_{n-1} + i \dt_m, t_n)$, where $\dt_m = \dt/m$.

## Scratch

Everything below here is obsolete.

## Simplified QM-GLE 

After applying an orthogonal transformation to diagonalize the positive definite matrix in the memory kernel, we can simplify the original QM-GLE to the following SDE:

$$
\begin{aligned}
\ud X(t) &= V(t) \ud t \\
\ud V(t) &= -U'_{\pph}(X(t)) \ud t + \sum_{i=1}^m \mu_i Z_i(t)  \ud t \\
\ud Z_i(t) &= -\left(\mu_i V(t) + \rho_i Z_i(t) \right) \ud t + \sqrt{2\rho_i}/\beta \ud B_i(t), & i = 1,\ldots,m
\end{aligned}
$$

where $X(t)$ is the observed (continuous-time) process, whereas $V(t)$ and $\ZZ(t) = (Z_1(t), \ldots, Z_m(t))$ are latent; $U_{\pph}(X)$ is the potential energy function, which depends on parameters $\pph$, and $U_{\pph}'(X)$ is its first derivative with respect to $X$; $\beta$, $\mu_i$, and $\rho_i$ are unknown positive parameters; and $\BB(t) = (B_1(t), \ldots, B_m(t))$ is an $m$-dimensional Brownian motion.

## Modified Euler Discretization

Let $t_n = n \dt$, $\WW(t) = (X(t), V(t), \ZZ(t))$, $X_n = X(t_n)$, $\Delta X_n = X_{n+1} - X_n$, and similarly for $\Delta \WW_n$, $V_n$, $\Delta V_n$, $\Delta Z_{ni}$, etc.  Then the SDE above becomes

$$
\begin{aligned}
\Delta X_n    & = \int_{t_n}^{t_n+\dt} V(t) \ud t \\
\Delta V_n    & = \int_{t_n}^{t_n+\dt} \left[-U'_{\pph}(X(t)) + \sum_{i=1}^m \mu_i Z_i(t)\right]  \ud t \\
              & \approx -U'_{\pph}(X_n) \dt + \int_{t_n}^{t_n+\dt} \sum_{i=1}^m \mu_i Z_i(t) \ud t \\
\Delta Z_{ni} & = - \int_{t_n}^{t_n+\dt} \left(\mu_i V(t) + \rho_i Z_i(t) \right) \ud t + \sqrt{2\rho_i}/\beta \int_{t_n}^{t_n+\dt} \ud B_i(t) \\
              & \approx -\left(\mu_i V_n + \rho_i Z_{ni}\right)\dt + \sigma_i \Delta B_{ni},
\end{aligned}
$$

where $\sigma_i = \sqrt{2\rho_i}/\beta$.  Note that the last approximation can be obtained by requiring $Z_i(t)$ to be Brownian motion with drift $-\lambda_{ni}$, where $\lambda_{ni} = \mu_i V_n + \rho_i Z_{ni}$, and diffusion $\sigma_i$ over the interval $t \in [t_n, t_n+\dt]$.  Using this approximation, the stochastic parts of $\Delta V_n$ and $\Delta X_n$ become single and double integrals of Brownian motion, for which the joint distribution is analytically available:

- Let $B^{(0)}(t)$ be standard Brownian motion, and let $B^{(1)}(t)$ and $B^{(2)}(t)$ denote its first and second integrals.  Then the joint distribution of $\BB(t) = (B^{(2)}(t), B^{(1)}(t), B^{(0)}(t))$ is

    $$
    \BB(t) \sim \N\left\{\bz, \SSi(t) = \begin{pmatrix}
    \frac{t^5}{20}  & \frac{t^4}{8} & \frac{t^3}{6} \\
      \frac{t^4}{8}   & \frac{t^3}{3}  & \frac{t^2}{2} \\
      \frac{t^3}{6}   & \frac{t^2}{2}  & t
    \end{pmatrix}\right\}.
    $$
    
- Let $\BB_i(t) = (B_i^{(2)}(t), B_i^{(1)}(t), B_i^{(0)}(t))$ denote iid triplets as above, with $i = 1, \ldots, m$.  Then $\BB(t) = (\BB_1(t), \ldots, \BB_m(t))$ is a multivariate normal with mean 0 and variance $\SSi = \diag(\SSi(t), \ldots, \SSi(t))$.
 
- Note that $\Delta \WW_n$ is a linear transformation of $\Delta \BB_n = \BB(\dt)$.  Use this to find its variance.

- At the end of the day, we should have something like

    $$
    \WW(t_n + s) = \WW_n + \bm{g}(\WW_n, \tth, s) + \bm{h}(\WW_n, \tth) \mathcal{\bm{Z}}, 
    $$
    
    where $\mathcal{\bm{Z}}$ is a multivariate normal with mean zero and variance depending on $s$.  Use this to try to figure out the bridge proposal.
    
Some further derivation:

Let $\mmu = (\mu_1, \ldots, \mu_m)$, $\dr_n = (\lambda_{n1}, \ldots, \lambda_{nm})$, $\ssi = (\sigma_1, \ldots, \sigma_m)$, and $\Delta \BB_n^{(k)} = (B_1^{(k)}(\dt), \ldots, B_m^{(k)}(\dt))$ for $k=0,1,2$.

$$
\begin{aligned}
\Delta \ZZ_n &= -\dr_{n} \dt + \ssi \odot \Delta \BB_n^{(0)} \\
\Delta V_n &= -U'_{\pph}(X_n) \dt + \int_{t_n}^{t_n+\dt} \mmu' (\ZZ_n + \Delta \ZZ(t)) \ud t \\
           &= -U'_{\pph}(X_n) \dt + \mmu' \left(\ZZ_n \dt - \dr_n \frac{\dt^2}{2} \right) + \mmu' (\ssi \odot \Delta \BB_n^{(1)}) \\
\Delta X_n &= V_n \dt - U'_{\pph}(X_n) \frac{\dt^2}{2} + \mmu' \ZZ_n \frac{\dt^2}{2} - \mmu' \dr_n \frac{\dt^3}{6} + \mmu' (\ssi \odot \Delta \BB_n^{(2)}).
\end{aligned}
$$

where $\odot$ denotes the element-wise matrix multiplication.

Starting from the $3m \times 1$ vector 

$$
\BB_n(\dt) = \begin{pmatrix} \BB_n^{(2)}(\dt) \\ \BB_n^{(1)}(\dt) \\ \BB_n^{(0)}(\dt) \end{pmatrix}
$$

we can obtain

$$
\Delta \WW_n = \begin{pmatrix} 
V_n \dt - U'_{\pph}(X_n) \frac{\dt^2}{2} + \mmu' \left(\ZZ_n \frac{\dt^2}{2} - \dr_n \frac{\dt^3}{6}\right) \\ 
-U'_{\pph}(X_n) \dt + \mmu' \left(\ZZ_n \dt - \dr_n \frac{\dt^2}{2} \right) \\
-\dr_{n} \dt
\end{pmatrix} + \begin{pmatrix} \mmu' & & \\ & \mmu' & \\ & & \bm{I}_{m\times m} \end{pmatrix} \begin{pmatrix} \diag(\ssi) & & \\ & \diag(\ssi) & \\ & & \diag(\ssi) \end{pmatrix}_{3m \times 3m} \BB_n(\dt).
$$

### Approach to derivation

Let $\Delta X_n(s) = X(t_n + s) - X(t_n)$, etc. Let $\Delta \tilde \BB_n(s) = (\diag(\ssi) \Delta \BB_n^{(0)}(s), \diag(\ssi) \Delta \BB_n^{(1)}(s), \diag(\ssi) \Delta \BB_n^{(2)}(s))$.  Let $\SSi(s) = \cov(\Delta B_{ni}^{(0)}(s), \Delta B_{ni}^{(1)}(s), \Delta B_{ni}^{(2)}(s))$.  Then 

$$
\cov(\Delta \tilde \BB_n(s)) = \SSi(s) \otimes \diag(\ssi^2)
$$

where $\otimes$ is the Kronecker matrix product.  What we want is the distribution of the noise term of $\Delta \WW_n(s)$, namely $(\diag(\ssi) \Delta \BB^{(0)}_n(s), \mmu' \diag(\ssi) \Delta \BB^{(1)}_n(s), \mmu' \diag(\ssi) \Delta \BB^{(2)}_n(s)) = \AA \Delta \tilde \BB_n(s)$, where 

$$
\AA_{(m+2) \times 3m} = \begin{bmatrix} \bm{I}_{m\times m} & 0 & 0 \\ 0 & \mmu' & 0 \\ 0 & 0 & \mmu' \end{bmatrix}.
$$

This gives us a way to numerically evaluate $\cov(\Delta \WW_n(s))$ on a computer, which we can then check against the theoretical result:

$$
\cov(\Delta \WW_n(s)) = 
\begin{bmatrix} s \diag(\ssi^2) & \tfrac{s^2}{2} \diag(\ssi^2) \mmu  & \tfrac{s^3}{6} \diag(\ssi^2) \mmu  \\
                                & \tfrac{s^3}{3}\sum_{i=1}^m \mu_i^2 \sigma_i^2 & \tfrac {s^4}{8}\sum_{i=1}^m \mu_i^2 \sigma_i^2 \\
                                & & \tfrac{s^5}{20} \sum_{i=1}^m \mu_i^2 \sigma_i^2 
\end{bmatrix}
$$

### Todo

- [ ] Check theoretical derivation on this $(m+2)\times(m+2)$ variance matrix against numerical version, which is $\AA (\SSi(s) \otimes \diag(\ssi^2)) \AA'$.

- [ ] Derive the bridge proposal for the modified Euler scheme.  It's very similar to what we did for usual SDEs, except instead of having $\XX \mid \WW \sim \N(\WW + \mmu_{X|W}, \SSi_{X|W})$, we now need $E[\XX \mid \WW] = \bm{C} \WW + \mmu_{X|W}$.
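The first item can be checked with a few lines of plain Python.  The sketch below forms $\AA (\SSi(s) \otimes \diag(\ssi^2)) \AA'$ numerically and compares it with the closed-form blocks, taking $\SSi(s)$ in the order $(\Delta B^{(0)}, \Delta B^{(1)}, \Delta B^{(2)})$ used in this subsection.  All helper names are hypothetical and the parameter values are arbitrary.

```python
def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def sigma_s(s):
    """Covariance of a Brownian increment and its first two integrals."""
    return [[s, s**2 / 2, s**3 / 6],
            [s**2 / 2, s**3 / 3, s**4 / 8],
            [s**3 / 6, s**4 / 8, s**5 / 20]]

def selection_matrix(mu):
    """Block matrix with rows [I, 0, 0], [0, mu', 0], [0, 0, mu']."""
    m = len(mu)
    rows = [[1.0 if j == i else 0.0 for j in range(3 * m)] for i in range(m)]
    rows.append([0.0] * m + list(mu) + [0.0] * m)
    rows.append([0.0] * (2 * m) + list(mu))
    return rows

def numerical_cov(s, mu, sigma):
    m = len(mu)
    d = [[sigma[i] ** 2 if i == j else 0.0 for j in range(m)] for i in range(m)]
    a = selection_matrix(mu)
    return matmul(matmul(a, kron(sigma_s(s), d)), transpose(a))

def theoretical_cov(s, mu, sigma):
    m = len(mu)
    g = sum((mk * sk) ** 2 for mk, sk in zip(mu, sigma))  # sum of mu_i^2 sigma_i^2
    cov = [[0.0] * (m + 2) for _ in range(m + 2)]
    for i in range(m):
        cov[i][i] = s * sigma[i] ** 2
        cov[i][m] = cov[m][i] = s**2 / 2 * sigma[i] ** 2 * mu[i]
        cov[i][m + 1] = cov[m + 1][i] = s**3 / 6 * sigma[i] ** 2 * mu[i]
    cov[m][m] = s**3 / 3 * g
    cov[m][m + 1] = cov[m + 1][m] = s**4 / 8 * g
    cov[m + 1][m + 1] = s**5 / 20 * g
    return cov

mu, sigma, s = [1.0, 2.0], [0.5, 1.5], 0.3
num = numerical_cov(s, mu, sigma)
theo = theoretical_cov(s, mu, sigma)
max_err = max(abs(x - y) for rn, rt in zip(num, theo) for x, y in zip(rn, rt))
print(max_err)
```

Agreement up to floating-point rounding for a few choices of $s$, $\mmu$, and $\ssi$ is reasonable evidence that the closed-form blocks are right, though it is not a proof.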

## Old Materials Below (should be removed later)

## Euler-Maruyama Discretization

$$
\begin{aligned}
X_n - X_{n-1} &= V_{n-1} \dt \\
V_{n} - V_{n-1} &= -U'(X_{n-1})\dt + \sum_{i=1}^m \mu_i Z_{i,n-1} \dt \\
Z_{i,n} - Z_{i,n-1} &= - \left(\mu_i V_{n-1} + \rho_i Z_{i,n-1} \right)\dt + \sqrt{2\rho_i}/\beta \Delta W_{i,n-1} & i=1,\ldots,m
\end{aligned}
$$

where $\Delta W_{i,n-1} = W_{i,n} - W_{i, n-1} \iid \N(0, \dt)$ and $m \ge 1$ is predetermined.

## Idea 1: Bridge Proposal 

Let $\WW_{n} = (W_{1,n}, \ldots, W_{m,n})$.

$$
\begin{aligned}
\Delta \WW_{n-1} &\sim \N(\bz, \dt \II)\\
Z_{i,n} \mid \WW & \sim \N(W_{i,n-1} + \mu_{Z_{i,n-1}|W}, 2\rho_i \dt/\beta^2 \II), & \mu_{Z_{i,n-1}|W} = - \left(\mu_i V_{n-1} + \rho_i Z_{i,n-1} \right)\dt,  \\
V_n \mid X_{n-1}, Z_{n-1}, \WW &\sim \N(V_{n-1} + \mu_{V_{n-1}|Z}, \bz), & \mu_{V_{n-1}|Z} = -U'(X_{n-1})\dt + \sum_{i=1}^m \mu_i \mu_{Z_{i,n-1}|W} \dt \\
X_n \mid V_{n-1}, Z_{n-1}, \WW & \sim \N(X_{n-1} + \mu_{X|V}, \bz), & \mu_{X|V} = \dt \mu_{V_{n-1}|Z}
\end{aligned}
$$

## Idea 2: Simulation based on Integrated Wiener Process ($m=1$)



The reason why we try to apply the integrated Wiener process is that we want to introduce more randomness to the first and second equations (which are deterministic) in the QM-GLE SDE.

Draw the initial values $Z_0, V_0, X_0$ from some priors.

Generate a multivariate normal random vector $\WW_n = (\Delta W^{(2)}_n, \Delta W^{(1)}_n, \Delta W_n)$ 

$$
\WW_n \sim \N\left(\bm{0}, \SSi \right), \qquad 
\SSi = \begin{pmatrix}
  \frac{(\Delta t)^5}{20}  & \frac{(\Delta t)^4}{8} & \frac{(\Delta t)^3}{6} \\
  \frac{(\Delta t)^4}{8}   & \frac{(\Delta t)^3}{3}  & \frac{(\Delta t)^2}{2} \\
  \frac{(\Delta t)^3}{6}   & \frac{(\Delta t)^2}{2}  & \Delta t
\end{pmatrix}
$$

Given samples at time $n\Delta t$ ($n=0,1,\ldots$), we can generate $Z_{n+1}, V_{n+1}, X_{n+1}$ as follows

$$
\begin{aligned}
  Z_{n+1} &= Z_n - \left(\mu V_n + \rho Z_n \right)\Delta t + \sigma \Delta W_n \\
  V_{n+1} &= V_n - U'(X_n)\Delta t + \mu Z_n\Delta t - \mu \left(\mu V_n + \rho Z_n\right)\frac{(\Delta t)^2}{2} + \mu\sigma\Delta W^{(1)}_n \\
  X_{n+1} &= X_n + V_n \Delta t - U'(X_n)\frac{(\Delta t)^2}{2} + \mu Z_n\frac{(\Delta t)^2}{2} - \mu \left(\mu V_n + \rho Z_n \right)\frac{(\Delta t)^3}{6} + \mu\sigma \Delta W^{(2)}_n
\end{aligned}
$$

We only need to calculate $Z_n, V_n$ for $n=1,\ldots,N-1$ to obtain the sequence $X_1, \ldots, X_N$.
