# Multistep Neural Network for Learning Dynamics from Data

_Kevin Siswandi | May 2020_

In this notebook, I demonstrate how the Multistep Neural Network works to reconstruct the dynamics of 2-D Yeast Glycolysis. First of all, we need to make sure that `nodepy` and `tensorflow 2.x` are installed.

In [6]:
# Install nodepy in the current Jupyter kernel
import sys
!pip install --user nodepy

Collecting nodepy
  Downloading nodepy-0.9-py3-none-any.whl (818 kB)
[K     |████████████████████████████████| 818 kB 3.1 MB/s eta 0:00:01
[?25hCollecting networkx
  Downloading networkx-2.4-py3-none-any.whl (1.6 MB)
[K     |████████████████████████████████| 1.6 MB 21.4 MB/s eta 0:00:01
[?25hCollecting matplotlib
  Downloading matplotlib-3.2.1-cp36-cp36m-manylinux1_x86_64.whl (12.4 MB)
[K     |████████████████████████████████| 12.4 MB 23.8 MB/s eta 0:00:01
Collecting sympy
  Downloading sympy-1.5.1-py2.py3-none-any.whl (5.6 MB)
[K     |████████████████████████████████| 5.6 MB 28.9 MB/s eta 0:00:01
Collecting cycler>=0.10
  Downloading cycler-0.10.0-py2.py3-none-any.whl (6.5 kB)
Collecting kiwisolver>=1.0.1
  Downloading kiwisolver-1.2.0-cp36-cp36m-manylinux1_x86_64.whl (88 kB)
[K     |████████████████████████████████| 88 kB 13.5 MB/s eta 0:00:01
Collecting mpmath>=0.19
  Downloading mpmath-1.1.0.tar.gz (512 kB)
[K     |████████████████████████████████| 512 kB 30.3 MB/s eta 0:00

In [3]:
# upgrade to tensorflow to 2
!pip install --user --upgrade tensorflow

Collecting tensorflow
  Downloading tensorflow-2.2.0-cp36-cp36m-manylinux2010_x86_64.whl (516.2 MB)
[K     |████████████████████████████████| 516.2 MB 5.3 kB/s  eta 0:00:01     |█████████████                   | 211.4 MB 64.9 MB/s eta 0:00:05███████████████████████       | 401.9 MB 84.2 MB/s eta 0:00:02
Collecting opt-einsum>=2.3.2
  Downloading opt_einsum-3.2.1-py3-none-any.whl (63 kB)
[K     |████████████████████████████████| 63 kB 3.7 MB/s  eta 0:00:01
[?25hCollecting absl-py>=0.7.0
  Downloading absl-py-0.9.0.tar.gz (104 kB)
[K     |████████████████████████████████| 104 kB 73.3 MB/s eta 0:00:01
[?25hCollecting tensorboard<2.3.0,>=2.2.0
  Downloading tensorboard-2.2.1-py3-none-any.whl (3.0 MB)
[K     |████████████████████████████████| 3.0 MB 53.0 MB/s eta 0:00:01
[?25hCollecting tensorflow-estimator<2.3.0,>=2.2.0
  Downloading tensorflow_estimator-2.2.0-py2.py3-none-any.whl (454 kB)
[K     |████████████████████████████████| 454 kB 54.6 MB/s eta 0:00:01
[?25hCollecting termc

Successfully installed absl-py-0.9.0 astunparse-1.6.3 cachetools-4.1.0 gast-0.3.3 google-auth-1.14.2 google-auth-oauthlib-0.4.1 google-pasta-0.2.0 grpcio-1.28.1 h5py-2.10.0 keras-preprocessing-1.1.0 markdown-3.2.2 oauthlib-3.1.0 opt-einsum-3.2.1 pyasn1-modules-0.2.8 requests-oauthlib-1.3.0 tensorboard-2.2.1 tensorboard-plugin-wit-1.6.0.post3 tensorflow-2.2.0 tensorflow-estimator-2.2.0 termcolor-1.1.0 werkzeug-1.0.1 wrapt-1.12.1
You should consider upgrading via the '/home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin/python -m pip install --upgrade pip' command.[0m


In [1]:
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model
import nodepy.linear_multistep_method as lm
from scipy.integrate import odeint
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import timeit

print(tf.__version__)
np.random.seed(1234)
tf.random.set_seed(1234)

2.2.0


Training is done to find the optimal parameters via minimising loss function:

$$ \arg \min_w \frac{1}{N - M + 1} \sum_{n = M}^{N} |\textbf{y}_n|^2 $$

where N is the number of data points and M multi steps. The linear difference/residual operator is defined as

$$ \textbf{y}_n = \sum_{n=0}^M \left(\alpha_m x_{n-m} + h \beta_m \textbf{f}(\textbf{x}_{n-m}) \right)$$

for $n=M,...,N$.

In [11]:
class lmmNet:
    """
    Implementation of the LMMNet
    version 1.2
    Fixes/updates:
        * number of hidden layer units is no longer hardcoded
        * fixed bug for wrong indexing of the coefficients in computing linear diff operator
        * loss printed every 100 epochs
        * optimizer now declared in constructor

    """
    
    def __init__(self, h, X, M, scheme, hidden_units):
        """
        Args:
        h -- step size
        X -- data array with shape S x N x D 
        M -- number of LMM steps
        scheme -- the LMM scheme (either AB, AM, or BDF)
        hidden_units -- number of units for the hidden layer
        
        """
        self.h = h
        self.X = X
        self.M = M # number of time steps
        
        # get the number of trajectories, discrete time instances, and number of feature dimensions
        self.S = X.shape[0]
        self.N = X.shape[1]
        self.D = X.shape[2]
        
        # load LMM coefficients from NodePy
        # https://nodepy.readthedocs.io/en/latest/
        if scheme == 'AB':
            coefs = lm.Adams_Bashforth(M)
        elif scheme == 'AM':
            coefs = lm.Adams_Moulton(M)
        elif scheme == 'BDF':
            coefs = lm.backward_difference_formula(M)
        else:
            raise Exception('Please choose a valid LMM scheme')
        
        self.alpha = np.float32(-coefs.alpha[::-1])
        self.beta = np.float32(coefs.beta[::-1])
        
        class DenseModel(Model):
            """
            A simple feed-forward network with 1 hidden layer
            
            Arch:
            * 256 hidden units
            * input units and output units correspond to the dimensionality
            """
            def __init__(self, D):
                super(DenseModel, self).__init__()
                self.D = D

                self.d1 = tf.keras.layers.Dense(units=hidden_units, activation=tf.nn.tanh, input_shape=(self.D,))
                self.d2 = tf.keras.layers.Dense(units=self.D, activation=None)

            def call(self, X1):
                A = self.d1(X1)
                A = self.d2(A)
                return A
        
        self.nn = DenseModel(self.D)
                
        self.opt = tf.keras.optimizers.Adam()
        
    def get_F(self, X):
        """
        Output of the NN/ML model.
        
        Args:
        - X: the data matrix with shape S x (N-M) x D
        
        Output:
        - F: the output dynamics with shape S x (N-M) x D
        """

        assert X.shape == (self.S, self.N - self.M, self.D)
        
        X1 = tf.reshape(X, [-1, self.D])
        F1 = self.nn(X1)
        
        return tf.reshape(F1, [self.S, -1, self.D])
    
    def get_Y(self, X):
        """
        The linear difference (residual) operator.
        
        Args:
        - X: the data matrix with shape S x N x D
        """
        
        M = self.M
        
        # compute the difference operator
        # broadcasting from M to N to get an array for all n
        # Y has shape S x (N - M) x D
        Y = self.alpha[0] * X[:, M: ,:] + self.h * self.beta[0] * self.get_F(X[:, M:, :]) # for m = 0
        
        # sum over m from m = 1
        for m in range(1, M+1):
            Y += self.alpha[m] * X[:, M-m:-m, :] + self.h * self.beta[m] * self.get_F(X[:, M-m:-m, :])
        
        return self.D * tf.reduce_mean(tf.square(Y))
    
    def train(self, epochs):
        """
        Fit the model PyTorch-style
        """
        
        start_time = timeit.default_timer()
        
        for epoch in range(epochs):
            with tf.GradientTape() as tape:
                self.loss = self.get_Y(self.X)
            grads = tape.gradient(self.loss, self.nn.trainable_variables)
            self.opt.apply_gradients(zip(grads, self.nn.trainable_variables))
            
            if epoch % 100 == 0:
                elapsed_time = timeit.default_timer() - start_time
                print('Epoch: %d, Time: %.2f, Loss: %.4e' %(epoch, elapsed_time, self.loss))
                #tf.print(self.loss)

        
    def predict(self, X_reshaped):
        """
        Args:
        - X_reshaped with shape S(N-M+1) x D
        """
        return self.nn(X_reshaped)


In [3]:
help(lmmNet)

Help on class lmmNet in module __main__:

class lmmNet(builtins.object)
 |  Implementation of the LMMNet
 |  version 1.2
 |  Fixes/updates:
 |      * number of hidden layer units is no longer hardcoded
 |      * fixed bug for wrong indexing of the coefficients in computing linear diff operator
 |      * loss printed every 100 epochs
 |      * optimizer now declared in constructor
 |  
 |  Methods defined here:
 |  
 |  __init__(self, h, X, M, scheme, hidden_units)
 |      Args:
 |      h -- step size
 |      X -- data array with shape S x N x D 
 |      M -- number of LMM steps
 |      scheme -- the LMM scheme (either AB, AM, or BDF)
 |      hidden_units -- number of units for the hidden layer
 |  
 |  get_F(self, X)
 |      Output of the NN/ML model.
 |      
 |      Args:
 |      - X: the data matrix with shape S x (N-M) x D
 |      
 |      Output:
 |      - F: the output dynamics with shape S x (N-M) x D
 |  
 |  get_Y(self, X)
 |      The linear difference (residual) operator.
 | 

In [5]:
# testing to see if it works on dummy data

step_size = 1
data = tf.ones((2,2,7))
steps = 1
n_units = 256

net = lmmNet(step_size, data, steps, 'AM', n_units)
loss = net.D * tf.reduce_mean(tf.square(net.get_Y(data)))
print( '%.3e' %loss)

1.916e+00


In [7]:
# Load training data
import pickle
with open('bier_damped.pkl', 'rb') as file:
    bier = pickle.load(file)
    
bier_data = bier['data']
time_points = bier['t']
print(bier_data.shape)

(1, 2500, 2)


In [12]:
step_size = time_points[1] - time_points[0]

net = lmmNet(step_size, bier_data, M = 1, scheme='AM', hidden_units=256) # use trapezoidal rule (smallest error constant that is also stable)

In [13]:
epochs = 10000
net.train(epochs)



To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

Epoch: 0, Time: 0.03, Loss: 0.0543
Epoch: 100, Time: 1.85, Loss: 0.0002
Epoch: 200, Time: 3.68, Loss: 0.0001
Epoch: 300, Time: 5.51, Loss: 0.0001
Epoch: 400, Time: 7.40, Loss: 0.0001
Epoch: 500, Time: 9.23, Loss: 0.0000
Epoch: 600, Time: 11.06, Loss: 0.0000
Epoch: 700, Time: 12.91, Loss: 0.0000


KeyboardInterrupt: 

In [8]:
def ml_f(x):
    """
    Define the derivatives (RHS of the ODE) learned by ML
    """
    return np.ravel(net.predict(x.reshape(1,-1)))
    
# testing to see if it works
ml_f(gly_data[0,:])

array([ 0.89327437, -9.029732  ,  5.0351768 , -5.3296275 , -4.2815266 ,
       -2.1680076 ,  0.03295352], dtype=float32)

In [9]:
# Solve the IVP

time_points = np.arange(0, 10, step_size)

predicted_traj = odeint(lambda x, t: ml_f(x), gly_data[0,:], time_points)

In [10]:
# save the predictions for analysis in other notebooks

with open('gly_pred.npy', 'wb') as file:
    np.save(file, predicted_traj)

predicted_traj

array([[ 1.36122456,  1.4866053 ,  0.05739308, ...,  0.29558962,
         1.87303717,  0.06480158],
       [ 1.37076952,  1.40535577,  0.09402223, ...,  0.26416342,
         1.8735414 ,  0.06500119],
       [ 1.38215587,  1.33628228,  0.11167621, ...,  0.24782192,
         1.90445386,  0.06505066],
       ...,
       [-0.57196215,  3.23297342, -0.36502048, ..., -1.53900435,
         2.07049885, -1.89131542],
       [-0.58596219,  3.24259846, -0.35796348, ..., -1.53307254,
         2.09790285, -1.89438169],
       [-0.59513727,  3.23922149, -0.35071086, ..., -1.5277799 ,
         2.13665988, -1.89741348]])