*This notebook shows that it is advisable to use SVD for the calculation of the nlml as it provides a more stable method to invert K.*

## PDE 1 - 2D with 4 parameters


#### Problem Setup

$\phi u + u_{x} + u_{y,y} = f(x,y)$

For the generation of our initial data samples we use:

$\phi = 2$ <br>
$X_i := (x_i, y_i) \in [0,1] \times [0,1]$ for $i \in \{1, \dotsc, n\}$ <br>
$u(x,y) = x^2 + y$ <br>
$f(x,y) = 2(x^2 + x + y)$


We assume that $u$ can be represented as a Gaussian process with SE kernel.

$u \sim \mathcal{GP}(0, k_{uu}(X_i, X_j; \theta))$, where $\theta = \{\sigma, l_x, l_y\}$.

And the linear operator:

$\mathcal{L}_X^{\phi} = \phi + \partial_x + \partial_{y,y}$

so that

$\mathcal{L}_X^{\phi} u = f$

Problem at hand: Estimate $\phi$ (the closer to $\phi = 2$, the better).


#### Step 1: Simulate data

In [1]:
import time
import numpy as np
import sympy as sp
from scipy.optimize import minimize

*Parameters, that can be modified:*

In [2]:
# Number of data samples:
n = 40

In [3]:
# np.random.seed(int(time.time()))
def simulate_data():
    x = np.random.rand(n)
    y = np.random.rand(n)
    y_u = np.multiply(x,x) + y
    y_f = 2*(np.multiply(x,x) + x + y)
    return (x,y,y_u,y_f)
(x,y,y_u,y_f) = simulate_data()

#### Step 2: Evaluate kernels

$k_{uu}(X_i, X_j; \theta) = \sigma exp(-\frac{1}{2l_x}(x_i-x_j)^2 - \frac{1}{2l_y}(y_i-y_j)^2)$

In [4]:
x_i, x_j, y_i, y_j, sigma, l_x, l_y, phi = sp.symbols('x_i x_j y_i y_j sigma l_x l_y phi')
kuu_sym = sigma*sp.exp(-1/(2*l_x)*((x_i - x_j)**2) - 1/(2*l_y)*((y_i - y_j)**2))
kuu_fn = sp.lambdify((x_i, x_j, y_i, y_j, sigma, l_x, l_y), kuu_sym, "numpy")
def kuu(x, y, sigma, l_x, l_y):
    k = np.zeros((x.size, x.size))
    for i in range(x.size):
        for j in range(x.size):
            k[i,j] = kuu_fn(x[i], x[j], y[i], y[j], sigma, l_x, l_y)
    return k

$k_{ff}(X_i,X_j;\theta,\phi) \\
= \mathcal{L}_{X_i}^{\phi} \mathcal{L}_{X_j}^{\phi} k_{uu}(X_i, X_j; \theta) \\
= \phi^2k_{uu} + \phi \frac{\partial}{\partial x_i}k_{uu} + \phi \frac{\partial^2}{\partial y_i^2}k_{uu} + \phi \frac{\partial}{\partial x_j}k_{uu} + \frac{\partial^2}{\partial x_i, x_j}k_{uu} + \frac{\partial^3}{\partial y_i^2 \partial x_j}k_{uu} + \phi \frac{\partial^2}{\partial y_j^2}k_{uu} + \frac{\partial^3}{\partial x_i \partial y_j^2}k_{uu} + \frac{\partial^4}{\partial y_i^2 \partial y_j^2}k_{uu}$

In [5]:
kff_sym = phi**2*kuu_sym \
        + phi*sp.diff(kuu_sym, x_i) \
        + phi*sp.diff(kuu_sym, y_i, y_i) \
        + phi*sp.diff(kuu_sym, x_j) \
        + sp.diff(kuu_sym, x_i, x_j) \
        + sp.diff(kuu_sym, y_i, y_i, x_j) \
        + phi*sp.diff(kuu_sym, y_j, y_j) \
        + sp.diff(kuu_sym, x_i, y_j, y_j) \
        + sp.diff(kuu_sym, y_i, y_i, y_j, y_j)
kff_fn = sp.lambdify((x_i, x_j, y_i, y_j, sigma, l_x, l_y, phi), kff_sym, "numpy")
def kff(x, y, sigma, l_x, l_y, phi):
    k = np.zeros((x.size, x.size))
    for i in range(x.size):
        for j in range(x.size):
            k[i,j] = kff_fn(x[i], x[j], y[i], y[j], sigma, l_x, l_y, phi)
    return k

$k_{fu}(X_i,X_j;\theta,\phi) \\
= \mathcal{L}_{X_i}^{\phi} k_{uu}(X_i, X_j; \sigma) \\
= \phi k_{uu} + \frac{\partial}{\partial x_i}k_{uu} + \frac{\partial^2}{\partial y_i^2}k_{uu}$

In [6]:
kfu_sym = phi*kuu_sym \
        + sp.diff(kuu_sym, x_i) \
        + sp.diff(kuu_sym, y_i, y_i)
kfu_fn = sp.lambdify((x_i, x_j, y_i, y_j, sigma, l_x, l_y, phi), kfu_sym, "numpy")
def kfu(x, y, sigma, l_x, l_y, phi):
    k = np.zeros((x.size, x.size))
    for i in range(x.size):
        for j in range(x.size):
            k[i,j] = kfu_fn(x[i], x[j], y[i], y[j], sigma, l_x, l_y, phi)
    return k

In [7]:
def kuf(x, t, sigma, l_x, l_y, phi):
    return kfu(x, t, sigma, l_x, l_y, phi).T

#### Step 3: Compute NLML

In [8]:
def nlml(params, x, y, y1, y2, s):
    sigma_exp = np.exp(params[0]) 
    l_x_exp = np.exp(params[1])
    l_y_exp = np.exp(params[2]) 
    # phi = params[3]
    K = np.block([
        [kuu(x, y, sigma_exp, l_x_exp, l_y_exp) + s*np.identity(x.size), kuf(x, y, sigma_exp, l_x_exp, l_y_exp, params[3])],
        [kfu(x, y, sigma_exp, l_x_exp, l_y_exp, params[3]), kff(x, y, sigma_exp, l_x_exp, l_y_exp, params[3]) + s*np.identity(x.size)]
    ])
    y = np.concatenate((y1, y2))  
    
    val = np.linalg.slogdet(K) + np.mat(y) * np.linalg.inv(K) * np.mat(y).T
    return val.item(0)

With SVD:

In [9]:
def nlml_svd(params, x, y, y1, y2, s):
    sigma_exp = np.exp(params[0])
    l_x_exp = np.exp(params[1])
    l_y_exp = np.exp(params[2])
    # phi = params[3]
    K = np.block([
        [kuu(x, y, sigma_exp, l_x_exp, l_y_exp) + s*np.identity(x.size), kuf(x, y, sigma_exp, l_x_exp, l_y_exp, params[3])],
        [kfu(x, y, sigma_exp, l_x_exp, l_y_exp, params[3]), kff(x, y, sigma_exp, l_x_exp, l_y_exp, params[3]) + s*np.identity(x.size)]
    ])
    y = np.concatenate((y1, y2))
    
    u, s, vt = np.linalg.svd(K)
    log_sum = 0
    for i in range(s.size):
        log_sum = log_sum + np.log(np.abs(s[i]))
    K_inv = (vt.T).dot(np.linalg.inv(np.diag(s))).dot(u.T)
    
    val = log_sum + np.mat(y) * K_inv * np.mat(y).T
    return val.item(0)

Result:

In [10]:
nlml_wp = lambda params: nlml(params, x, y, y_u, y_f, 1e-7)
t0 = time.time()
out1 = minimize(nlml_wp, np.random.rand(4), method="Nelder-Mead", options={'maxiter':5000, 'fatol':0.001})
t1 = time.time() - t0
print("The time needed for standard nlml was", t1)
print(out1)
print(out1.x[3], '\n')

nlml_sv = lambda params: nlml_svd(params, x, y, y_u, y_f, 1e-7)
t0_svd = time.time()
out_sv = minimize(nlml_sv, np.random.rand(4), method="Nelder-Mead", options={'maxiter':5000, 'fatol':0.001})
t1_svd = time.time() - t0_svd
print("The time needed for nlml with svd was", t1_svd)
print(out_sv)
print(out_sv.x[3])

The time needed for standard nlml was 71.26787805557251
 final_simplex: (array([[  42.73194289, -131.08758055,   15.03694152,   31.07984587],
       [  42.73193848, -131.0875649 ,   15.03693967,   31.07984231],
       [  42.73195669, -131.08761938,   15.0369454 ,   31.07985499],
       [  42.73196569, -131.08764995,   15.0369489 ,   31.07986224],
       [  42.73196977, -131.08766351,   15.03695061,   31.07986538]]), array([1., 1., 1., 1., 1.]))
           fun: 1.0
       message: 'Optimization terminated successfully.'
          nfev: 177
           nit: 57
        status: 0
       success: True
             x: array([  42.73194289, -131.08758055,   15.03694152,   31.07984587])
31.07984587406263 

The time needed for nlml with svd was 320.26781940460205
 final_simplex: (array([[1461.25648885,   46.44656343, 1706.75057846,    1.99999081],
       [1461.25648885,   46.44656343, 1706.75057846,    1.99999081],
       [1461.25648885,   46.44656343, 1706.75057846,    1.99999081],
       [1461