The data we are using are only evaluated with a coarse sampling time step of 3 hours. On the other hand, we will probably use a 10-20 minute time step for the coarse-resolution model. This means that the dynamical model we are trying to fit is 
$$ x^i_{n+1} = \underbrace{f(f(\ldots f}_{\text{m times}}(x^i_n))) + \int_{t_n}^{t_{n+1}} g(x(t), t) dt$$ 
where $i$ is the horizontal spatial index and $n$ is the time step. The function $f$ is applied $m=\frac{\Delta t}{h}$ times, where $h$ is the GCM's time step and $\Delta t$ is the sampling interval of the stored output. The integral on the right represents the approximately known terms such as advection, and $f$ represents the unknown source terms.

We solve a minimization problem to find $f$. This is given by 
$$
\min_{a} \lim_{m \rightarrow \infty} \sum_{i,n} ||x^{i}_{n+1} - F^{(m)} x^i_{n} - g_n^{i}||_W^2 \quad \text{s.t.}\quad F^{(m)}(\cdot) = \underbrace{f(f(\ldots f}_{\text{m times}}(\cdot))),\ f(x) = x +  \frac{ \Delta t}{m} a(x).
$$
Intuitively, the forward operator $F^{(m)}$ is the result of applying $m$ forward Euler steps to the system $a$.
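
The forward operator and the objective can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in `lib.models.torch_models`: the names `euler_forward`, `weighted_loss`, and the linear test tendency `a_lin` are hypothetical, $W$ is assumed to be a diagonal weight, and `g` is taken to be the already integrated forcing $g_n^i$ from the formula above.

```python
import numpy as np

def euler_forward(a, x, dt, m):
    """Apply m forward Euler steps of size dt/m to dx/dt = a(x)."""
    h = dt / m
    for _ in range(m):
        x = x + h * a(x)
    return x

def weighted_loss(a, x, xp, g, w, dt, m):
    """Sum over samples of ||xp - F^(m)(x) - g||_W^2 with diagonal weights w."""
    resid = xp - euler_forward(a, x, dt, m) - g
    return np.sum(w * resid**2)

# Hypothetical linear tendency a(x) = -x, whose exact flow is x * exp(-dt).
a_lin = lambda x: -x
dt = 3 / 24                      # 3-hour sampling interval, in days
x0 = np.ones((5, 4))             # (samples, features)

# m = 18 and m = 9 correspond to h = 10 and h = 20 minutes.
for m in (1, 9, 18):
    err = np.abs(euler_forward(a_lin, x0, dt, m) - x0 * np.exp(-dt)).max()
    print(m, err)
```

The error against the exact flow shrinks as $m$ grows, which is the sense in which the limit $m \rightarrow \infty$ in the objective recovers the continuous-time system.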

Let's try performing this fit. First, we need to import the appropriate modules and load the data.

# TODO

[ ] Revise this introduction. It does not reflect what I actually did in this notebook.

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
import numpy as np
import xarray as xr
import torch

In [None]:
from lib.models.torch_models import predict
from lib.models.torch_models import train_euler_network

This is the coarse sampling time step in days (3 hours).

In [None]:
dt = 3/24

Let's now define a torch module for the function $a$. It will just be a single-layer perceptron, which appropriately scales the inputs first. Let's first compute the appropriate scaling.

In [None]:
data = np.load("../data/ml/ngaqua/time_series_data.npz")

X = data['X']
G = data['G']
scale = data['scales']
w = data['w']

In [None]:
x = X[:-1,8,0,:]
xp = X[1:, 8,0,:]
g = G[:-1,8,0,:]

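# `a` is never defined in this notebook; as an assumed stand-in for a.mu,
# subtract the time-mean profile of the first 34 columns
plt.pcolormesh(x[:,:34] - x[:,:34].mean(axis=0))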

In [None]:
def plot_q(x):
    plt.pcolormesh(x[:,34:].T)
    
def plot_t(x):
    plt.figure(figsize=(12,2))
    plt.pcolormesh(x[:,:34].T)

In [None]:
plot_t((xp-x)/dt-g)
plt.colorbar()

In [None]:
def torch_net_file_plot(fname):
    net = torch.load(fname)
    plot_t(predict(net, x))
    plt.colorbar()

Now let's use it to make a prediction.

In [None]:
torch_net_file_plot("../data/ml/ngaqua/time_series_fit.torch")

It seems to do a pretty good job compared to the diagnosed tendency plotted above.

# Single column tests of the model.

## Initial value problem

Let's start from some reference profile and integrate the model forward without any advection terms. If the scheme is unstable then we will need to rethink the model.

Let's load the neural network we fit using `lib/scripts/fit_torch_cli.py`.

In [None]:
net = torch.load("../data/ml/ngaqua/time_series_fit.torch")

In [None]:
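def run_time_series(predict, x0, nsteps):
    # `predict` maps the current state to the next state
    out = np.empty((nsteps+1, x0.shape[0]))
    out[0] = x0
    x = x0
    # start at 1 so the initial condition stored in out[0] is not overwritten
    for i in range(1, nsteps+1):
        x = predict(x)
        out[i] = x
        
    return out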
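
A quick way to sanity-check an integration loop like this is to run it on a problem with a known answer. The sketch below is self-contained and uses a hypothetical linear damping tendency rather than the trained network; `integrate`, `lam`, and `tendency` are names introduced for illustration only. Forward Euler applied to $dx/dt = -\lambda x$ decays only when $h \lambda < 2$.

```python
import numpy as np

def integrate(step, x0, nsteps):
    """Store x0 followed by nsteps applications of the one-step map `step`."""
    out = np.empty((nsteps + 1, x0.shape[0]))
    out[0] = x0
    for i in range(1, nsteps + 1):
        out[i] = step(out[i - 1])
    return out

lam = 10.0                       # hypothetical damping rate (1/day)
tendency = lambda x: -lam * x    # stands in for predict(net, x)
x0 = np.ones(3)

# h * lam = 0.125 < 2: each step multiplies the state by 0.875
stable = integrate(lambda x: x + 0.125 / 10 * tendency(x), x0, 100)
# h * lam = 2.5 > 2: each step multiplies the state by -1.5
unstable = integrate(lambda x: x + 0.25 * tendency(x), x0, 100)

print(np.abs(stable[-1]).max())    # decays toward zero
print(np.abs(unstable[-1]).max())  # grows by a factor of 1.5 per step
```

If the same loop blows up with the network tendency at a step size where a simple damped system stays bounded, the problem lies in the fitted tendency rather than in the integrator.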

This is what the predicted time derivative looks like. It looks pretty good!

In [None]:
plot_t(predict(net, x))

What happens when we use this scheme in a predictive mode? 

In [None]:
# predict returns the tendency, so take a forward Euler step of size dt/10
ts = run_time_series(lambda x: x + predict(net, x)*dt/10,  x[0], 100)

In [None]:
plot_t(ts)

We can see that the scheme is very unstable, which is not good. Can we assess the instability of a scheme without actually running it? One solution is to use a global fit.

The modified Rayleigh quotient should work: $\frac{x' \cdot f(x' + \bar{x})}{x'\cdot x'}$, where $x' = x - \bar{x}$ is the perturbation from the mean state $\bar{x}$.

In [None]:
def rayleigh_quotient(f, x, axis=-1):
    fx = f(x)
    return np.sum(x * fx, axis=axis)/np.sum(x * x, axis=axis)
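
To see why the sign of this quotient is informative, consider a linear tendency $f(x) = Ax$ with symmetric $A$. The quotient then lies between the smallest and largest eigenvalues of $A$, so negative values indicate that perturbations are damped and positive values indicate growth. The sketch below repeats the definition so that it runs on its own; the matrix `A` is a hypothetical example, not something derived from the network.

```python
import numpy as np

def rayleigh_quotient(f, x, axis=-1):
    # same definition as in the notebook cell
    fx = f(x)
    return np.sum(x * fx, axis=axis) / np.sum(x * x, axis=axis)

rng = np.random.default_rng(0)
A = np.diag([-3.0, -1.0, 0.5])       # hypothetical symmetric operator
x = rng.standard_normal((200, 3))    # each row is one perturbation

r = rayleigh_quotient(lambda x: x @ A.T, x)
print(r.min(), r.max())              # bounded by the eigenvalues -3 and 0.5

# An eigenvector returns its own eigenvalue.
r_eig = rayleigh_quotient(lambda x: x @ A.T, np.array([0.0, 0.0, 1.0]))
print(r_eig)
```

For a nonlinear network the quotient is only a local, state-dependent diagnostic, so it should be read as a hint about growth or damping rather than a stability proof.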

In [None]:
x_mean = x.mean(axis=0)
x_pert = x-x_mean

r = rayleigh_quotient(lambda x: predict(net, x+x_mean), x_pert)
plt.plot(r)
plt.axhline(0.0, c='k')

What happened when we used the neural network model in a predictive mode?

In [None]:
rayleigh_quotient(lambda x: predict(net, x+x_mean), ts-x_mean)[:10]

We can see that the scheme somehow became extremely stiff. But how was this possible after only one time step?

In [None]:
x0 = x[0]
x1 = x0 + predict(net, x0) * .125

In [None]:
plt.plot(x1[34:])
plt.plot(x0[34:])

In [None]:
rayleigh_quotient(lambda x: predict(net, x+x_mean), x0-x_mean)

In [None]:
rayleigh_quotient(lambda x: predict(net, x+x_mean), x1-x_mean)

How is it that the Rayleigh quotient is so drastically different for such a small change? Let's look at the difference in the predicted heating profiles. Is it because our network is overfitting? Is this an artifact of the code in this particular section?

In [None]:
plt.plot(predict(net, x0))
plt.figure()
plt.plot(predict(net, x1).T)

Indeed the profiles are drastically different! We definitely need some kind of penalization here.