# "Gaussian Processes 2 - Application"
> "We look at how to use GPs to learn non-linear partial differential equations"

- toc:true
- branch: master
- badges: true
- comments: true
- author: John J. Molina
- categories: [Gaussian Processes]

We reproduce a recent paper by Raissi and Karniadakis:

[Hidden physics models: Machine learning of nonlinear partial differential equations](https://doi.org/10.1016/j.jcp.2017.11.039)<br>
M. Raissi and G. E. Karniadakis, J. Comp. Phys. 357, 125 (2018)

```python
import jax.numpy as  np
import pandas    as pd
import numpy     as onp
import pymc3     as pm
import theano    as th
import theano.tensor as tt
import matplotlib    as mpl
import matplotlib.pyplot      as plt
import matplotlib.patheffects as PathEffects

from scipy import optimize
from jax   import grad, jit, vmap, jacfwd, jacrev, random
from jax.numpy.lax_numpy import _wraps
from jax.config    import config; config.update("jax_enable_x64", True)
from functools     import partial,reduce
from pymc3.gp.util import plot_gp_dist

mpl.style.use(['seaborn-poster', 'seaborn-muted'])

# betanalpha's colormap
colors = ["#DCBCBC","#C79999","#B97C7C","#A25050","#8F2727", "#7C0000","#DCBCBC20", "#8F272720","#00000060"]
color  = {i[0]:i[1] for i in zip(['light','light_highlight','mid','mid_highlight','dark','dark_highlight','light_trans','dark_trans','superfine'],colors)}
fancycolors = [mpl.colors.to_hex(c) for c in [[0.6, 0.6, 0.6],[0.7, 0.3, 1],[0.3, 0.7, 1],[0.2, 0.9, 0.9],
                                              [0.3, 1, 0.7],[0.7, 1, 0.3],[0.9, 0.9, 0.2],[1, 0.7, 0.3],[1, 0.3, 0.7],
                                              [0.9, 0.2, 0.9],[1.0, 1.0, 1.0]]]
threecolors = [mpl.colors.to_hex(c) for c in [[0.1, 0.15, 0.4],[1, 0.2, 0.25],[1.0, 0.775, 0.375]]]
fourcolors  = [mpl.colors.to_hex(c) for c in [[0.9, 0.6, 0.3],[0.9, 0.4, 0.45],[0.5, 0.65, 0.75],[0.42, 0.42, 0.75]]]

def addtxt(ax, x, y, txt, fs=8, lw=3, clr='k', bclr='w', rot=0):
    """Add text to figure axis"""
    return ax.text(x, y, txt, color=clr, ha='left', transform=ax.transAxes, rotation=rot, weight='bold',
                   path_effects=[PathEffects.withStroke(linewidth=lw, foreground=bclr)], fontsize=fs)
```



# Preliminaries

## Solving nonlinear ODEs: Exponential time-difference methods

Consider the following differential equation

\begin{align}
  \partial_{t}u &= \mathcal{L}u + \mathcal{G}(t, u)\label{e:uLN}
\end{align}

where $\mathcal{L}$ is a time-independent linear operator (with constant coefficients), whose
contribution can be integrated exactly, and $\mathcal{G}$ is a non-linear operator. With the following change
of variables

\begin{align*}
  v &= e^{-\mathcal{L} t} u \\
  \partial_{t} v &= e^{-\mathcal{L} t}\partial_{t} u - e^{-\mathcal{L}t}\mathcal{L}u
\end{align*}

Eq.\eqref{e:uLN} becomes
\begin{align*}
  \partial_{t}v &= e^{-\mathcal{L} t} \mathcal{G}(t, u)
\end{align*}

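Measuring time from $t_n$ in the change of variables (so that $v(t_n) = u(t_n)$), this can be integrated formally over one time step $h$ to yield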

\begin{align}
  v(t_n + h) &= v(t_n) + \int_0^h \textrm{d}{\tau}e^{-\mathcal{L} \tau}
  \mathcal{G}\left({t_n+\tau, e^{\mathcal{L}\tau}v(t_n + \tau)}\right) \notag\\
  u(t_n + h) &= e^{h\mathcal{L}}\Big[{u(t_n) + \int_0^h \textrm{d}{\tau}
    e^{-\mathcal{L}\tau} \mathcal{G}\big({t_n+\tau, u(t_n + \tau)}\big)}\Big]\label{e:un_int}
\end{align}
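As a quick sanity check of Eq. \eqref{e:un_int}, the sketch below evaluates the integral by trapezoidal quadrature for a scalar problem and compares the result with a closed-form solution. The scalar value of `L`, the forcing $\mathcal{G}(t,u) = \cos t$ (chosen to be independent of $u$ so that the integrand is known in advance), and the helper names `exact` and `step` are all illustrative assumptions, not part of the original notebook. We use the notebook's `onp` alias for plain NumPy.

```python
import numpy as onp

L = -2.0                              # scalar stand-in for the linear operator
G = lambda t, u: onp.cos(t)           # illustrative forcing, independent of u

def exact(t, u0):
    """Closed-form solution of u' = L u + cos(t) with u(0) = u0."""
    up = lambda s: (onp.sin(s) - L * onp.cos(s)) / (1 + L**2)   # particular solution
    return up(t) + (u0 - up(0.0)) * onp.exp(L * t)

def step(un, tn, h, m=2001):
    """u(tn + h) = exp(hL) [u(tn) + int_0^h exp(-L tau) G(tn + tau) dtau]."""
    tau = onp.linspace(0.0, h, m)
    f = onp.exp(-L * tau) * G(tn + tau, None)
    integral = onp.sum(0.5 * (f[1:] + f[:-1])) * (tau[1] - tau[0])  # trapezoid rule
    return onp.exp(h * L) * (un + integral)

u0, tn, h = 1.0, 0.3, 0.5
un  = exact(tn, u0)
err = abs(step(un, tn, h) - exact(tn + h, u0))
print(f"difference between the integral formula and the exact solution: {err:.2e}")
```

For a genuinely non-linear $\mathcal{G}$ the integrand depends on the unknown $u(t_n + \tau)$, which is exactly why the integral has to be approximated.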

The problem, then, is how to treat the integral appearing on the
right-hand side of the equation. For what follows, it will be useful
to introduce the $\varphi_n$ functions, which can be defined in integral,
series, or recursion form as {% cite Schmelzer2007 %}

\begin{align}
  \varphi_n(z) &= \begin{cases} e^{z} & n = 0\\
  \frac{1}{\left({n-1}\right)!}\int_0^1\textrm{d}{s} e^{z(1-s)}
  s^{n-1}& n > 0\end{cases} \label{e:phin_int}\\
  \varphi_n(z) &= \sum_{k=0}^{\infty} \frac{z^k}{\left({k +
      n}\right)!} \label{e:phin_sum} \\
  \varphi_n(z) &= \frac{\varphi_{n-1}(z) - \frac{1}{\left({n-1}\right)!}}{z}, \qquad
  \varphi_0(z) = e^{z}\label{e:phin_rec}
\end{align}

The first four $\varphi$ functions are given by

\begin{align*}
  \varphi_0(z) &= e^{z} &  \varphi_2(z) &= \frac{e^z - 1 - z}{z^2}\\
\varphi_1(z) &= \frac{e^z - 1}{z}  
   & \varphi_3(z) &=\frac{e^z - 1 -z
    - \frac{1}{2}z^2}{z^3}
\end{align*}
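The series and recursion forms can be compared numerically with a few lines of Python. This is a minimal sketch for scalar (real or complex) $z$; the function names `phi_series` and `phi_recursive` and the truncation at 30 terms are our own choices, not taken from the references.

```python
import numpy as onp
from math import factorial

def phi_series(n, z, terms=30):
    """phi_n(z) from the truncated series sum_k z^k / (k + n)!."""
    return sum(z**k / factorial(k + n) for k in range(terms))

def phi_recursive(n, z):
    """phi_n(z) from the recursion, starting at phi_0(z) = exp(z); requires z != 0."""
    phi = onp.exp(z)
    for m in range(1, n + 1):
        phi = (phi - 1.0 / factorial(m - 1)) / z
    return phi

for z in (0.5, 0.3 + 0.4j, 1e-6):
    for n in range(4):
        print(f"z={z}, n={n}: series={phi_series(n, z)}, recursion={phi_recursive(n, z)}")
```

Each level of the recursion subtracts nearly equal numbers and divides by $z$, so rounding errors are amplified for small $|z|$. The truncated series does not have this problem, but it is only appropriate for moderate $|z|$.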

for $z$ a complex scalar. In the case of a (non-diagonal) invertible matrix
operator $\Lambda$, the $\varphi$ functions become

\begin{align*}
  \varphi_0(\Lambda) &= e^{\Lambda} & \varphi_2(\Lambda) &=
  \Lambda^{-2}\left({e^{\Lambda} - I - \Lambda}\right) \\
  \varphi_1(\Lambda) &= \Lambda^{-1}\left({e^{\Lambda} - I}\right) &
  \varphi_3(\Lambda) &= \Lambda^{-3}\bigg[{e^{\Lambda} - I - \Lambda - \frac{1}{2}\Lambda^{2}}\bigg]
\end{align*}
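The matrix expressions can be checked on a small example. The sketch below assumes a symmetric, invertible $2\times 2$ matrix (an arbitrary choice of ours), builds $e^{\Lambda}$ from the eigendecomposition returned by `numpy.linalg.eigh`, and compares the matrix formulas with the scalar formulas applied to each eigenvalue. Linear solves are used instead of explicit inverses.

```python
import numpy as onp

A = onp.array([[-2.0, 1.0],
               [ 1.0, -3.0]])               # symmetric, invertible example matrix
I = onp.eye(2)

w, V = onp.linalg.eigh(A)                   # A = V diag(w) V^T
expA = (V * onp.exp(w)) @ V.T               # phi_0(A) = exp(A)

# Matrix formulas: solve with A rather than forming A^{-1}
phi1 = onp.linalg.solve(A, expA - I)
phi2 = onp.linalg.solve(A @ A, expA - I - A)

# Cross-check: scalar formulas applied to each eigenvalue
phi1_eig = (V * ((onp.exp(w) - 1) / w)) @ V.T
phi2_eig = (V * ((onp.exp(w) - 1 - w) / w**2)) @ V.T

print(onp.max(onp.abs(phi1 - phi1_eig)), onp.max(onp.abs(phi2 - phi2_eig)))
```

The eigendecomposition route only applies to diagonalizable matrices; for a general matrix one would need a dedicated matrix exponential routine.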

### ETD-RK

Exponential time-difference Runge-Kutta methods up to order four have
been developed to solve Eq. \eqref{e:un_int}. They start from the
following variation of constants formula
\begin{align}
  u(t_n + h) &= e^{h \mathcal{L}} u_n + \int_{0}^h \textrm{d}{\tau}
  e^{-\mathcal{L}(\tau - h)} \mathcal{G}\left({t_n + \tau, u(t_n + \tau)}\right)\notag
\end{align}
and introduce the following general class of one-step methods {% cite Hochbruck2010  %}:
\begin{align}
  u_{n+1} &= \chi(h\mathcal{L})u_n + h\sum_{i=1}^s b_i(h\mathcal{L}) G_{ni} \notag\\
  G_{nj}  &= \mathcal{G}(t_n + c_j h, U_{nj})  \label{e:etdrkn}\\
  U_{ni}  &= \chi_{i}(h\mathcal{L})u_n + h\sum_{j=1}^{s} a_{ij}(h\mathcal{L})  G_{nj}\notag
\end{align}
where the method coefficients $\chi$, $\chi_i$, $a_{ij}$ and $b_i$ are
constructed from exponential functions (or their rational
approximation) evaluated at the operator $h\mathcal{L}$. In the formal limit
$\mathcal{L}\to 0$, where $b_i = b_i(0)$ and $a_{ij} = a_{ij}(0)$, we recover the underlying standard
Runge-Kutta method. It is assumed that the coefficients satisfy the
following properties
\begin{align*}
  \sum_{j=1}^s b_j(0)  &= 1,
  & \sum_{j=1}^{s} a_{ij}(0) &= c_i\qquad (i = 1,\ldots,s)\\
  \chi(0) = \chi_i(0) &= 1
\end{align*}
Hochbruck and Ostermann list the conditions that the coefficients should
satisfy to guarantee the convergence of the
methods {% cite Hochbruck2005 Hochbruck2010 %}. Following them, we
focus on "explicit" methods with
\begin{align}
  \sum_{j=1}^s b_j(z) &= \varphi_1(z),
  &\sum_{j=1}^s a_{ij}(z)&= c_i \varphi_1 (c_i z) \label{e:etdrkn_ab}\\
  \chi(z) &= e^{z} = \varphi_0(z),
  \qquad &\chi_i(z) &= e^{c_i z} = \varphi_0(c_i z)\qquad 1\le i
  \le s  \label{e:etdrkn_chi}
\end{align}
The coefficients are most easily represented in tableau form as
\begin{align}
  \begin{array}{c|cccc|c}
    c_1 & & & & & \chi_1(h\mathcal{L}) \\
    c_2 & a_{21}(h\mathcal{L}) & & & & \chi_2(h\mathcal{L}) \\
    \vdots & \vdots & \ddots & & & \vdots \\
    c_s & a_{s1}(h\mathcal{L}) & \ldots & a_{s,s-1}(h\mathcal{L}) & & \chi_s(h\mathcal{L}) \\
    \hline
    & b_{1}(h\mathcal{L}) & \ldots & b_{s-1}(h\mathcal{L}) & b_s(h\mathcal{L}) & \chi(h\mathcal{L})
  \end{array}
  \label{e:rktable}
\end{align}
As with all Runge-Kutta methods, the internal stages are of order one
only, which makes the construction of higher-order methods quite
involved. For what follows, we adopt the following simplifying
notation
\begin{align}
  \varphi_j    &= \varphi_j\left({h \mathcal{L}}\right)\label{e:phijk}\\
  \varphi_{j,k} &= \varphi_j\left({c_k h \mathcal{L}}\right)\notag
\end{align}
A detailed discussion of explicit ETDRK methods up to order four
can be found in Refs. {% cite Hochbruck2005a Cox2002 %}.


#### ETDRK1 

For $s=1$, we obtain the exponential version of Euler's method

\begin{align}
  \begin{array}{c|c}
    0 & \\
    \hline
    & \varphi_1
  \end{array}
  \label{e:etdrk1}
\end{align}

or 

\begin{align*}
  u_{n+1} &= \varphi_0 u_n + h \varphi_1 \mathcal{G}(t_n, u_n)
\end{align*}

Writing out the $\varphi$ functions, factoring out $e^{h\mathcal{L}}$, and expanding
the term in parentheses to first order in $h$, we obtain the following approximate
integrator (Euler's method applied to the transformed variable $v$)

\begin{align*}
  u_{n+1} &= e^{h\mathcal{L}}\big({u_n + \mathcal{L}^{-1}\left({1 - e^{-h\mathcal{L}}}\right)\mathcal{G}(t_n,
    u_n)}\big) \\
  &\approx e^{h\mathcal{L}}\big({u_n + h \mathcal{G}(t_n, u_n)}\big)
\end{align*}
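Both update rules are easy to try on a scalar problem. The sketch below assumes the test equation $\partial_t u = \mathcal{L}u - u^2$ with $\mathcal{L} = -10$, which has a closed-form (Bernoulli) solution; the problem, the helper names, and the use of a truncated series for $\varphi_n$ (adequate only for moderate $|h\mathcal{L}|$) are our own illustrative choices.

```python
import numpy as onp
from math import factorial

def phi(n, z, terms=25):
    """phi_n(z) from its truncated series (adequate for moderate |z|)."""
    return sum(z**k / factorial(k + n) for k in range(terms))

def etdrk1_step(u, t, h, L, G):
    """Exponential Euler: u_{n+1} = phi_0(hL) u_n + h phi_1(hL) G(t_n, u_n)."""
    return phi(0, h * L) * u + h * phi(1, h * L) * G(t, u)

def if_euler_step(u, t, h, L, G):
    """First-order expansion: u_{n+1} = exp(hL) (u_n + h G(t_n, u_n))."""
    return onp.exp(h * L) * (u + h * G(t, u))

def integrate(step, u0, T, N, L, G):
    """Take N equal steps of size T/N starting from u0 at t = 0."""
    u, h = u0, T / N
    for n in range(N):
        u = step(u, n * h, h, L, G)
    return u

# Assumed test problem: u' = L u - u^2, u(0) = u0
L, u0, T = -10.0, 1.0, 0.2
G = lambda t, u: -u**2
exact = 1.0 / ((1.0 / u0 - 1.0 / L) * onp.exp(-L * T) + 1.0 / L)

for N in (100, 200):
    e1 = abs(integrate(etdrk1_step, u0, T, N, L, G) - exact)
    e2 = abs(integrate(if_euler_step, u0, T, N, L, G) - exact)
    print(f"N={N:4d}  ETDRK1 error={e1:.2e}  expanded form error={e2:.2e}")
```

One difference between the two forms is worth noting: when $\mathcal{G}$ is constant, the integral in Eq. \eqref{e:un_int} can be done in closed form and ETDRK1 reproduces it exactly, whereas the expanded form does not.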

#### ETDRK2

For a second-order method we need at least two internal
stages {% cite Hochbruck2010 Cox2002 %},

\begin{align}
  \begin{array}{c|cc}
    0 & & \\
    c_2 & c_2 \varphi_{1,2} & \\
    \hline
      & \varphi_{1} - \frac{1}{c_2}\varphi_{2} & \frac{1}{c_2}\varphi_2
  \end{array}\qquad
  \begin{array}{c|cc}
    0 & & \\
    \frac{1}{2} & \frac{1}{2}\varphi_{1,2} & \\
    \hline
      & \varphi_{1} - 2\varphi_{2} & 2\varphi_2
  \end{array}
  \label{e:etdrk2}
\end{align}

or

\begin{align*}
  U_{n1} &= u_n  &G_{n1} &= \mathcal{G}(t_n, U_{n1})\\
  U_{n2} &= \varphi_{0}\left({\tfrac{h}{2}\mathcal{L}}\right) u_n + h\bigg[{\frac{1}{2}\varphi_1\left({\tfrac{h}{2} \mathcal{L}}\right) G_{n1}}\bigg] &
  G_{n2} &= \mathcal{G}(t_n + h/2, U_{n2}) \\
  u_{n+1} &= \varphi_{0} u_n + h\big[{\left({\varphi_1 -
      2\varphi_2}\right)G_{n1} + 2\varphi_2G_{n2}}\big]
\end{align*}
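The three update equations for $c_2 = 1/2$ translate almost line by line into code. The following is a minimal sketch for a scalar $\mathcal{L}$, again assuming the Bernoulli test problem $\partial_t u = \mathcal{L}u - u^2$ and a truncated series for $\varphi_n$; neither choice comes from the references, and the function names are ours.

```python
import numpy as onp
from math import factorial

def phi(n, z, terms=25):
    """phi_n(z) from its truncated series (adequate for moderate |z|)."""
    return sum(z**k / factorial(k + n) for k in range(terms))

def etdrk2_step(u, t, h, L, G):
    """One ETDRK2 step with c_2 = 1/2 for a scalar operator L."""
    z = h * L
    G1 = G(t, u)                                          # stage 1: U_{n1} = u_n
    U2 = phi(0, z / 2) * u + h * 0.5 * phi(1, z / 2) * G1 # stage 2 at t_n + h/2
    G2 = G(t + h / 2, U2)
    return phi(0, z) * u + h * ((phi(1, z) - 2 * phi(2, z)) * G1 + 2 * phi(2, z) * G2)

def integrate(u0, T, N, L, G):
    """Take N equal steps of size T/N starting from u0 at t = 0."""
    u, h = u0, T / N
    for n in range(N):
        u = etdrk2_step(u, n * h, h, L, G)
    return u

# Assumed test problem: u' = L u - u^2, u(0) = u0
L, u0, T = -10.0, 1.0, 0.2
G = lambda t, u: -u**2
exact = 1.0 / ((1.0 / u0 - 1.0 / L) * onp.exp(-L * T) + 1.0 / L)

errors = {N: abs(integrate(u0, T, N, L, G) - exact) for N in (50, 100, 200)}
for N, e in errors.items():
    print(f"N={N:4d}  error={e:.3e}")
```

For a second-order scheme the error should drop by roughly a factor of four each time the step size is halved, which gives a simple way to validate an implementation.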

Relaxing some of the conditions on the coefficients, one can obtain an alternative
one-parameter family of methods,

\begin{align}
  \begin{array}{c|cc}
    0 & & \\
    c_2 & c_2\varphi_{1,2} & \\
    \hline
    & \big({1 - \frac{1}{2 c_2}}\big)\varphi_1 & \frac{1}{2 c_2}\varphi_1
  \end{array}\qquad
  \begin{array}{c|cc}
    0 & & \\
    \frac{1}{2} & \frac{1}{2}\varphi_{1,2} & \\
    \hline
    & 0 & \varphi_1
  \end{array}
  \label{e:etdrk2b}
\end{align}

such that, for $c_2 = 1/2$ and with $G_{n2}$ defined as above,

\begin{align*}
  u_{n+1} &= \varphi_0 u_n + h\varphi_1G_{n2}
\end{align*}
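To see what the change of weights does in practice, the sketch below takes a single step with both $c_2 = 1/2$ schemes for a forcing that is linear in time and independent of $u$, $\mathcal{G}(t,u) = a + bt$. This forcing, the parameter values, and the `weakened` flag are illustrative assumptions on our part. For such a forcing the integral in the variation of constants formula equals $h\varphi_1 G_{n1} + h^2\varphi_2 b$, so the two updates can be compared against a closed-form solution.

```python
import numpy as onp
from math import factorial

def phi(n, z, terms=25):
    """phi_n(z) from its truncated series (adequate for moderate |z|)."""
    return sum(z**k / factorial(k + n) for k in range(terms))

def step_c2_half(u, t, h, L, G, weakened=False):
    """One two-stage step with c_2 = 1/2; weakened=True uses the weights (0, phi_1)."""
    z = h * L
    G1 = G(t, u)
    U2 = phi(0, z / 2) * u + 0.5 * h * phi(1, z / 2) * G1
    G2 = G(t + h / 2, U2)
    if weakened:
        return phi(0, z) * u + h * phi(1, z) * G2
    return phi(0, z) * u + h * ((phi(1, z) - 2 * phi(2, z)) * G1 + 2 * phi(2, z) * G2)

# Illustrative forcing, linear in t and independent of u
L, a, b = -10.0, 1.5, -0.7
G = lambda t, u: a + b * t

def exact(t, t0, u_t0):
    """Solution of u' = L u + a + b t passing through (t0, u_t0)."""
    beta = -b / L
    alpha = (beta - a) / L
    up = lambda s: alpha + beta * s          # particular solution
    return up(t) + (u_t0 - up(t0)) * onp.exp(L * (t - t0))

tn, un, h = 0.3, 2.0, 0.1
ref = exact(tn + h, tn, un)
err_full = abs(step_c2_half(un, tn, h, L, G) - ref)
err_weak = abs(step_c2_half(un, tn, h, L, G, weakened=True) - ref)
print(f"weights (phi_1 - 2 phi_2, 2 phi_2): {err_full:.2e}")
print(f"weights (0, phi_1):                {err_weak:.2e}")
```

In this special case the first set of weights is exact up to rounding, while the second leaves a one-step defect of $h^2 b\,(\varphi_1/2 - \varphi_2)$.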

Both methods (\ref{e:etdrk2} and \ref{e:etdrk2b}) are B-consistent of
order one.

Cox and Matthews proposed using $c_2 = 1$ in Eq. \eqref{e:etdrk2},

\begin{align}
  \begin{array}{c|cc}
    0 & & \\
    1 & \varphi_{1,2} & \\
    \hline
    & \varphi_1 - \varphi_2 & \varphi_2
  \end{array}
  \label{e:etdrk2Cox}
\end{align}

such that

\begin{align*}
  U_{n1} &= u_n & G_{n1} &= \mathcal{G}(t_n, U_{n1}) \\
  U_{n2} &= \varphi_{0}u_n + h\varphi_1G_{n1} & G_{n2} &= \mathcal{G}(t_n + h,
  U_{n2}) \\
  u_{n+1} &= \varphi_0 u_n + h\big[{\left({\varphi_1 - \varphi_2}\right) G_{n1} + \varphi_2G_{n2}}\big]
\end{align*}
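The $c_2 = 1$ scheme is sketched below for a diagonal operator, stored as a vector so that the $\varphi$ functions act elementwise (as they would on Fourier modes). The decay rates in `L`, the decoupled non-linearity $\mathcal{G} = -u^2$, and the helper names are assumptions made for illustration. The observed order is estimated from the max-norm error at successive step sizes.

```python
import numpy as onp
from math import factorial

def phi(n, z, terms=30):
    """phi_n from its truncated series, applied elementwise (moderate |z| only)."""
    return sum(z**k / factorial(k + n) for k in range(terms))

def etd2rk_step(u, t, h, L, G):
    """One step of the c_2 = 1 scheme; L holds the diagonal of the linear operator."""
    z = h * L
    G1 = G(t, u)
    U2 = phi(0, z) * u + h * phi(1, z) * G1          # full exponential Euler predictor
    G2 = G(t + h, U2)
    return phi(0, z) * u + h * ((phi(1, z) - phi(2, z)) * G1 + phi(2, z) * G2)

def solve(u0, T, N, L, G):
    """Take N equal steps of size T/N starting from u0 at t = 0."""
    u, h = u0.copy(), T / N
    for n in range(N):
        u = etd2rk_step(u, n * h, h, L, G)
    return u

# Assumed test problem: decoupled equations u_i' = L_i u_i - u_i^2
L  = onp.array([-1.0, -10.0, -100.0])
u0 = onp.ones(3)
T  = 0.2
G  = lambda t, u: -u**2
exact = 1.0 / ((1.0 / u0 - 1.0 / L) * onp.exp(-L * T) + 1.0 / L)

err = [onp.max(onp.abs(solve(u0, T, N, L, G) - exact)) for N in (100, 200, 400)]
order = onp.log2(onp.array(err[:-1]) / onp.array(err[1:]))
print("max-norm errors:", err)
print("observed orders:", order)
```

A single test with moderate $|h\mathcal{L}|$ like this one only probes the non-stiff regime, so it cannot by itself settle how the scheme behaves for stiff problems.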

However, in this case the method is no longer guaranteed to be second order.


{% bibliography --cited %}