# Stability of Single Column Models

In this section I evaluate the stability of the single layer perceptron models (SLP) in a single column setting.

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
import numpy as np
import xarray as xr
import torch

from lib.models.torch_models import predict
from lib.models.torch_models import train_euler_network

This is the coarse sampling time step, 3/24 (3 hours if time is measured in days).

In [None]:
dt = 3/24

Let's load the data.

In [None]:
data = np.load("../data/ml/ngaqua/time_series_data.npz")


X = data['X']
G = data['G']
scale = data['scales']
w = data['w']

# we need to grab the pressure field from a different path
p = xr.open_dataset("../data/raw/ngaqua/stat.nc").p.values
t = dt * np.arange(X.shape[0])

Now we extract qt, sl and the advection terms for one specific horizontal location.

In [None]:
x = X[:-1,8,0,:]
xp = X[1:, 8,0,:]
g = G[:-1,8,0,:]
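The slicing above can be sanity-checked on a synthetic array. This is only a sketch: the axis order `(time, y, x, feature)` is an assumption, as is the layout of the feature axis (34 levels of sl followed by 34 levels of qt, which is how the plotting cells later in this notebook index it). The names `X_demo` and `split_state` are hypothetical and not part of `lib`.

```python
import numpy as np

# Synthetic stand-in for X; the shape (time, y, x, feature) is an assumption.
nt, ny, nx, nz = 5, 16, 2, 34
X_demo = np.arange(nt * ny * nx * 2 * nz, dtype=float).reshape(nt, ny, nx, 2 * nz)

x_demo = X_demo[:-1, 8, 0, :]   # state at step n
xp_demo = X_demo[1:, 8, 0, :]   # state at step n+1

def split_state(x, nz=34):
    """Hypothetical helper: split a stacked profile into (sl, qt)."""
    return x[..., :nz], x[..., nz:]

sl_demo, qt_demo = split_state(x_demo)
print(x_demo.shape, sl_demo.shape, qt_demo.shape)
```

The two time slices are offset by one step, so `xp_demo[i]` is the state that follows `x_demo[i]`.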

In [None]:
# some simple plotting routines

def plot_t(t, x):
    plt.figure(figsize=(10,2))
    plt.pcolormesh(t, p, x[:,:34].T, cmap='inferno')
    plt.gca().invert_yaxis()
    plt.colorbar()

Here, I plot the observed $Q_{1}$, which is defined as 

$$\frac{s_l^{n+1} - s_l^{n}}{\Delta t} - g_{s_l}^n, $$
where $g_{s_l}^n$ is the sum of the horizontal and vertical advection of $s_l$, which we approximate using centered differences.

In [None]:
plot_t(t[:-1], (xp-x)/dt-g)

Outside of this notebook, I have trained a neural network model with this data, and I load it here.

In [None]:
net = torch.load("../data/ml/ngaqua/time_series_fit.torch")

Let's see what the network's predicted $Q_1$ looks like.

In [None]:
predicted_q1 = predict(net, x)
plot_t(t[:-1], predicted_q1)

It seems to do a pretty good job compared to the truth above. So we seem to be reaching some sort of **consistency**. However, we also have to check the **stability** of the scheme.

# Single column tests of the model

## Initial value problem

Let's start from some reference profile and integrate the model forward without any advection terms. If the scheme is unstable then we will need to rethink the model.
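For intuition about what instability means for a forward Euler integration, here is a toy case where the answer is known. It is a sketch on the scalar equation $dx/dt = \lambda x$, not on the network: forward Euler multiplies $x$ by $1 + \lambda \Delta t$ each step, so the iteration decays only if $|1 + \lambda \Delta t| < 1$, even when the exact solution is damped. The names `euler_amplitudes` and `dt_demo` and the values of `lam` are made up for illustration.

```python
import numpy as np

def euler_amplitudes(lam, dt, nsteps):
    """Integrate dx/dt = lam * x from x = 1 with forward Euler; return |x| at each step."""
    x = 1.0
    out = [abs(x)]
    for _ in range(nsteps):
        x = x + dt * lam * x
        out.append(abs(x))
    return np.array(out)

dt_demo = 0.125  # same value as the notebook's time step, chosen for illustration
stable = euler_amplitudes(lam=-4.0, dt=dt_demo, nsteps=20)     # |1 + dt*lam| = 0.5
unstable = euler_amplitudes(lam=-20.0, dt=dt_demo, nsteps=20)  # |1 + dt*lam| = 1.5
print(stable[-1], unstable[-1])
```

Both values of `lam` are negative, so the exact solutions decay, yet the second integration grows by a factor of 1.5 per step.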

Here is a simple function which we will use to integrate our neural network

In [None]:
def run_time_series(predict, x0, nsteps):
    # out[0] holds the initial condition; out[i] holds the state after i steps
    out = np.empty((nsteps+1, x0.shape[0]))
    out[0] = x0
    x = x0
    for i in range(1, nsteps+1):
        x = predict(x)
        out[i] = x

    return out

What happens when we take 20 forward Euler time steps using our scheme?

In [None]:
ts = run_time_series(lambda x: x + predict(net, x)*dt,  x[0], 20)

In [None]:
plot_t(t[1:21], ts[1:])

We can see that the scheme is very unstable and just gives NaNs after 8 time steps (the white region of the plot), which is not good. Can we assess the instability of a scheme somehow without actually running it? One solution is to use a global fit.

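Some sort of Rayleigh quotient should work. Here is one definition: $\frac{x' \cdot f(x' + \bar{x})}{x'\cdot x'}$, where $x$ is the concatenated vertical profiles of $q_T$ and $s_L$ as usual, $\bar{x}$ is the time mean, $x' = x - \bar{x}$ is the perturbation about that mean, and $f$ is the tendency predicted by the network. Since $x' \cdot f$ is the rate of change of $\frac{1}{2} x' \cdot x'$ when $x$ evolves according to $f$, negative values mean the perturbation is damped and positive values mean it grows.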

In [None]:
def rayleigh_quotient(f, x, axis=-1):
    fx = f(x)
    return np.sum(x * fx, axis=axis)/np.sum(x * x, axis=axis)
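To see how the sign of this quotient should be read, it helps to apply it to a case with a known answer. The sketch below repeats the function so that it runs on its own and applies it to a hypothetical linear tendency $f(x) = Ax$ with a diagonal matrix. The matrix `A` and the test vectors are made up for illustration and have nothing to do with the trained network.

```python
import numpy as np

def rayleigh_quotient(f, x, axis=-1):
    # same definition as in the cell above, repeated so this block is self-contained
    fx = f(x)
    return np.sum(x * fx, axis=axis) / np.sum(x * x, axis=axis)

# Hypothetical linear tendency f(x) = A x with one damped and one growing direction
A = np.array([[-1.0, 0.0],
              [0.0, 2.0]])
f_linear = lambda x: x @ A.T

x_batch = np.array([[1.0, 0.0],    # along the damped direction
                    [0.0, 1.0],    # along the growing direction
                    [1.0, 1.0]])   # a mixture of both
r_demo = rayleigh_quotient(f_linear, x_batch)
print(r_demo)
```

The quotient is -1 along the damped direction, 2 along the growing one, and 0.5 for the mixture. For a symmetric matrix it always lies between the smallest and largest eigenvalues.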

In [None]:
x_mean = x.mean(axis=0)
x_pert = x-x_mean

r = rayleigh_quotient(lambda x: predict(net, x+x_mean), x_pert)
plt.plot(r)
plt.axhline(0.0, c='k')

It appears that the scheme is mostly stable over the observed training dataset.

What happened when we used the neural network model in a predictive mode?

In [None]:
rayleigh_quotient(lambda x: predict(net, x+x_mean), ts-x_mean)[:10]

We can see that the scheme somehow became extremely unstable. How was this possible after only 1 time step? Let's look at just the initial state and the state after one step.

In [None]:
x0 = x[0]
x1 = x0 + predict(net, x0) * dt

In [None]:
import holoviews as hv
hv.extension('matplotlib')

In [None]:
%%opts Curve[invert_yaxis=True, ] {+axiswise}

hv.Curve((x0[34:], p)) * hv.Curve((x1[34:], p))

holomap = hv.HoloMap({
    ('qt', 0): hv.Curve((x0[34:], p), vdims=['p']),
    ('qt', 1): hv.Curve((x1[34:], p), vdims=['p']),
    ('sl', 0): hv.Curve((x0[:34], p), vdims=['p']),
    ('sl', 1): hv.Curve((x1[:34], p), vdims=['p'])},
kdims=['variable', 'step'])

holomap.overlay("step").layout("variable")

We can see that the initial profile and the profile after 1 step are nearly overlapping. What are the Rayleigh quotients of each profile?

In [None]:
rayleigh_quotient(lambda x: predict(net, x+x_mean), x0-x_mean)

In [None]:
rayleigh_quotient(lambda x: predict(net, x+x_mean), x1-x_mean)

How is it that the Rayleigh quotient is so drastically different for such a small change? Is it because our network is overfitting? Is this an artifact of the code in this particular section? Let's look at the difference in the predicted heating profiles.

In [None]:
%%opts Curve[invert_yaxis=True, ] {+axiswise}


hv.HoloMap({
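    # the first tuple entry (Q1) is the key dimension, pressure is the value dimension
    0: hv.Curve((predict(net, x0)[:34], p), kdims=['Q1'], vdims=['p']),
    1: hv.Curve((predict(net, x1)[:34], p), kdims=['Q1'], vdims=['p'])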
}, kdims=['step']).layout()


Indeed the profiles are drastically different! It is very strange that two such similar input profiles can lead to such radically different predicted heating profiles.

Overfitting might cause a network to generalize this poorly, but the cross-validation score for this network is actually pretty good.
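One way to put a number on this sensitivity is to compare the size of the change in the output with the size of the change in the input. The sketch below is a hypothetical diagnostic, not something that exists in `lib`. It uses toy functions in place of the network so that it runs on its own.

```python
import numpy as np

def sensitivity_ratio(f, xa, xb):
    """Hypothetical diagnostic: ||f(xb) - f(xa)|| / ||xb - xa||."""
    return np.linalg.norm(f(xb) - f(xa)) / np.linalg.norm(xb - xa)

# Toy stand-ins for the network (assumptions, for illustration only)
smooth = lambda x: -0.5 * x              # gentle linear damping
steep = lambda x: np.tanh(1000.0 * x)    # nearly discontinuous at x = 0

xa = np.zeros(4)
xb = xa + 1e-3
ratio_smooth = sensitivity_ratio(smooth, xa, xb)
ratio_steep = sensitivity_ratio(steep, xa, xb)
print(ratio_smooth, ratio_steep)
```

In this notebook the same ratio could be evaluated with something like `sensitivity_ratio(lambda x: predict(net, x), x0, x1)`. A large value would indicate that the fitted function is very steep near `x0`, which is a different question from how well it scores on held-out samples.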

In any case, this example demonstrates that **consistency**, as measured by the $R^2$ score of predicting $Q_1$ and $Q_2$, does not ensure that the scheme is **accurate**. We also need to make sure the network is **stable**.