## GRU model testing
OK, we have done the carry-forward control, simple linear models, and ARIMA. The last thing to try before going on a data hunt is a true neural-network-based model. We could try regression with vanilla dense layers, but I think we should cut to the chase and go straight to a GRU-based network. Let's get a minimal example set up to get an idea of what we are working with, and then set up a more rigorous hyperparameter optimization with cross-validation/bootstrapping.

In [1]:
# Add notebook parent dir to path so we can import from functions/
import sys
sys.path.append('..')

import shelve
import numpy as np

# Import project config file
import config as conf

# Import notebook-specific helper functions
import functions.notebook_helper_functions.notebook15 as funcs

# Instantiate paths and model parameters
paths = conf.DataFilePaths()
params = conf.GRU_model_parameters()

# Load data column index
index = shelve.open(paths.PARSED_DATA_COLUMN_INDEX)

# Set number of counties to include in our testing
num_counties = 100

# Set data block size 
block_size = 9

In [2]:
for column, num in index.items():
    print(f'{column}: {num}')

cfips: 0
first_day_of_month: 1
microbusiness_density: 2
active: 3
microbusiness_density_change: 4
microbusiness_density_change_change: 5


In [3]:
# Load data with block size
input_file = f'{paths.PARSED_DATA_PATH}/{params.input_file_root_name}{block_size}.npy'
timepoints = np.load(input_file)

OK, so here we go. The hardest part about working with this type of model is getting the data into the correct shape for input. A couple of considerations here:
1. Do the training/validation split with the data kept in sequential time order: older data is used for training and newer data for validation.
2. Start with only one input feature: the microbusiness density (or detrended microbusiness density).
3. Forecast one point into the future.
4. Be ready to standardize/unstandardize the data using statistics from the training set only.
5. Input data is formatted as (batch, timesteps, feature).

The last part is a little complicated to think about. We have > 3k time series with the same time axis, one for each county. We could treat this like 3k features, but I think the better idea is to think of it as one feature with 3k counties * 37 timepoints of input data. The trick is, how do we batch it and make timesteps out of it? We don't want to present the model with time series from different counties as if one comes after the next. I think the most obvious way to do this is to use a stateless GRU layer and present each county as a batch. Within that batch, we then have the blocks from our sliding-window data parse.
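
As a rough sketch of that idea, the snippet below takes a small synthetic array laid out like the sliding-window parse (blocks, counties, block width), moves the county axis to the front, and slices out one county as a batch shaped (batch, timesteps, feature). The array sizes and the helper name `county_batch` are illustrative assumptions, not part of the project code.

```python
import numpy as np

def county_batch(data, county):
    """Return (X, y) for one county from data shaped (counties, blocks, block_size)."""
    blocks = data[county]              # (blocks, block_size)
    X = blocks[:, :-1, np.newaxis]     # (blocks, block_size - 1, 1): inputs, one feature
    y = blocks[:, -1:]                 # (blocks, 1): the point to forecast
    return X, y

# Synthetic stand-in for the parsed data: 20 blocks, 5 counties, block size 9
rng = np.random.default_rng(0)
blocks_first = rng.normal(size=(20, 5, 9))

# Move counties to the first axis: (counties, blocks, block_size)
counties_first = np.swapaxes(blocks_first, 0, 1)

X, y = county_batch(counties_first, county=0)
print(f'Input shape: {X.shape}, target shape: {y.shape}')
```

Each block within the batch is then an independent 8-step sequence with a single feature, so a stateless GRU never sees one county's series run into another's.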

OK, I think that sounds like as good a place to start as any. Let's take a look at the data:

In [4]:
print(f'Input data shape: {timepoints.shape}')

Input data shape: (29, 3135, 9, 6)


The dimensions here are:

0. The timepoint block: the size of this axis depends on the width of the block used to scan the data. Smaller blocks give more timepoint blocks, with num_timepoint_blocks = total_timepoints - block_size + 1. This is also the axis we need to do our training/validation split on: the first part becomes training, the last part becomes validation.
1. The counties: each element here is a county. For the purposes of our first experiment, we will treat each county as a batch.
2. The timepoints in the timepoint block (~row in pandas dataframe).
3. The features (~column in pandas dataframe).
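
To make the block-count relationship on axis 0 concrete, here is a small sketch using NumPy's `sliding_window_view` on a dummy series. It assumes 37 total timepoints (the figure mentioned earlier) and a block size of 9; it is only an illustration of the arithmetic, not necessarily how the project's parser builds the blocks.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

total_timepoints = 37
block_size = 9

# Dummy series for a single county: 0, 1, ..., 36
series = np.arange(total_timepoints)

# Every contiguous window of length block_size, stepping one timepoint at a time
blocks = sliding_window_view(series, block_size)

num_timepoint_blocks = total_timepoints - block_size + 1
print(f'Blocks shape: {blocks.shape}, expected block count: {num_timepoint_blocks}')
```

With these numbers the window view has 29 rows, which matches the first axis of the loaded data.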

First up - training/validation split:

In [5]:
# Choose split
split_index = int(timepoints.shape[0] * params.training_split_fraction)

# Before we split, choose just the data we want and drop everything else
input_data = timepoints[:,:,:,index[params.input_data_type]]

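# Older blocks go to training, newer blocks go to validation
training_data = input_data[0:split_index]

# Note: the -1 stop index excludes the final (most recent) block;
# use input_data[split_index:] to keep it
validation_data = input_data[split_index:-1]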

print(f'Input data shape: {input_data.shape}')
print(f'Split fraction: {params.training_split_fraction}')
print(f'Split index: {split_index}')
print(f'Training data shape: {training_data.shape}')
print(f'Validation data shape: {validation_data.shape}')

Input data shape: (29, 3135, 9)
Split fraction: 0.7
Split index: 20
Training data shape: (20, 3135, 9)
Validation data shape: (8, 3135, 9)


OK, looks good - let's try converting everything to a z-score, using mean and standard deviation from the training sample only.

In [6]:
training_mean = np.mean(training_data)
training_deviation = np.std(training_data)

print(f'Mean: {training_mean:.2f}, standard deviation: {training_deviation:.2f}')

training_data = (training_data - training_mean) / training_deviation

print(f'New mean: {np.mean(training_data):.2f}, New standard deviation: {np.std(training_data):.2f}')

Mean: 3.77, standard deviation: 4.52
New mean: -0.00, New standard deviation: 1.00
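
Only the training data is converted above. A minimal sketch of the other half, applying the training statistics to validation data and then undoing the transform, might look like the following. The arrays are synthetic and the helper names `standardize` and `unstandardize` are hypothetical, chosen just for this illustration.

```python
import numpy as np

def standardize(data, mean, deviation):
    """Convert data to z-scores using the supplied statistics."""
    return (data - mean) / deviation

def unstandardize(data, mean, deviation):
    """Invert standardize(), returning values in the original units."""
    return data * deviation + mean

# Synthetic stand-ins for the training and validation arrays
rng = np.random.default_rng(0)
training = rng.normal(loc=3.8, scale=4.5, size=(20, 5, 9))
validation = rng.normal(loc=4.0, scale=4.5, size=(8, 5, 9))

# Statistics come from the training set only
mean = np.mean(training)
deviation = np.std(training)

validation_z = standardize(validation, mean, deviation)
recovered = unstandardize(validation_z, mean, deviation)

print(f'Round trip recovers validation data: {np.allclose(recovered, validation)}')
```

The same `unstandardize` call would be applied to model predictions to get them back into microbusiness density units.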


In [32]:
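# Swap the first two axes so counties come first:
# (blocks, counties, block width) -> (counties, blocks, block width)
training_data = np.swapaxes(training_data, 1, 0)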
print(f'New shape: {training_data.shape}')

New shape: (3135, 20, 9)


Looks good! Let's build the model. The only additional thing to mention here is that for each time block in each county, the first block_size - 1 datapoints (8 here) are the time-ordered input and the last one is the value we are trying to predict. With that in mind, let's go!

In [33]:
import os

# Silence TensorFlow's C++ logging; this must be set before TensorFlow is imported
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

from tensorflow import keras
from tensorflow.keras import layers

In [35]:
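def build_GRU(
    units: int = 64,
    learning_rate: float = 0.002
):
    '''Builds and compiles a minimal GRU regression model:
    Input -> GRU -> single-unit linear Dense output.'''

    # Input layer. Keras reads shape as (timesteps, features), so (20, 8)
    # means 20 timesteps of 8 features each
    input = layers.Input(
        name = 'Input',
        shape = (20,8)
    )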

    # GRU layer
    gru = layers.GRU(
        units,
        activation="tanh",
        recurrent_activation="sigmoid",
        use_bias=True,
        kernel_initializer="glorot_uniform",
        recurrent_initializer="orthogonal",
        bias_initializer="zeros",
        kernel_regularizer=None,
        recurrent_regularizer=None,
        bias_regularizer=None,
        activity_regularizer=None,
        kernel_constraint=None,
        recurrent_constraint=None,
        bias_constraint=None,
        dropout=0.0,
        recurrent_dropout=0.0,
        return_sequences=False,
        return_state=False,
        go_backwards=False,
        stateful=False,
        unroll=False,
        time_major=False,
        reset_after=True,
        name='GRU'
    )(input)

    # output layer
    output = layers.Dense(
        name = 'Output',
        units = 1,
        activation = 'linear'
    )(gru)

    # Next, we will build the complete model and compile it.
    model = keras.models.Model(
        input, 
        output,
        name = 'Simple_GRU_model'
    )

    model.compile(
        loss = keras.losses.MeanSquaredError(name = 'MSE'), 
        optimizer = keras.optimizers.Adam(learning_rate = learning_rate),
        metrics = [keras.metrics.MeanAbsoluteError(name = 'MAE')]
    )

    print(model.summary())

    return model

In [36]:
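def data_generator(data):
    '''Yields (X, Y) batches forever, one per index along axis 1 of data.
    X holds every value in a block except the last; Y holds the last value.'''

    while True:
        for i in range(data.shape[1]):
            # Inputs: all but the final timepoint of each block
            X = data[:,i,:-1]

            # Target: the final timepoint of each block
            Y = data[:,i,-1:]

            yield (X, Y)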

In [37]:
training_data_generator = data_generator(training_data)
input_sample = next(training_data_generator)

print(f'Input shape: {input_sample[0].shape}')
print(f'Target shape: {input_sample[1].shape}')

Input shape: (3135, 8)
Target shape: (3135, 1)


In [38]:
model = build_GRU()

Model: "Simple_GRU_model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 Input (InputLayer)          [(None, 20, 8)]           0         
                                                                 
 GRU (GRU)                   (None, 64)                14208     
                                                                 
 Output (Dense)              (None, 1)                 65        
                                                                 
Total params: 14,273
Trainable params: 14,273
Non-trainable params: 0
_________________________________________________________________
None
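
The parameter counts in the summary can be checked by hand. A Keras GRU layer holds three sets of weights (update gate, reset gate, candidate state), each with an input kernel, a recurrent kernel, and, with `reset_after=True`, two bias vectors. The sketch below uses hypothetical helper names to reproduce the numbers. It only gives 14,208 when the feature count is 8, which suggests the model is treating the last axis of the (20, 8) input as 8 features rather than 8 timesteps of one feature.

```python
def gru_param_count(units, features, reset_after=True):
    """Trainable parameters in a single GRU layer."""
    kernel = features * 3 * units                   # input weights for the 3 gates
    recurrent = units * 3 * units                   # recurrent weights for the 3 gates
    bias = (2 if reset_after else 1) * 3 * units    # input bias (+ recurrent bias)
    return kernel + recurrent + bias

def dense_param_count(inputs, units):
    """Trainable parameters in a Dense layer with bias."""
    return inputs * units + units

print(f'GRU, 8 features: {gru_param_count(units=64, features=8)}')
print(f'Dense output: {dense_param_count(inputs=64, units=1)}')
print(f'GRU, 1 feature: {gru_param_count(units=64, features=1)}')
```

If the intent is one feature observed over 8 timesteps, the input would need a trailing feature axis of size 1, and the GRU parameter count would drop accordingly.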


In [39]:
# Fit the model to the training data.
training_data_generator = data_generator(training_data)

history = model.fit(
    x=training_data[:,:,:-1],
    y=training_data[:,:,-1:],
    #batch_size = 20,
    epochs = 10,
    #steps_per_epoch = steps_per_epoch,
    #validation_data = validation_data,
    #validation_steps = validation_steps,
    verbose = True
    #callbacks = callbacks
)

Epoch 1/10


InternalError: Graph execution error:

Detected at node 'StatefulPartitionedCall_1' defined at (most recent call last):
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/runpy.py", line 196, in _run_module_as_main
      return _run_code(code, main_globals, None,
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/runpy.py", line 86, in _run_code
      exec(code, run_globals)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/ipykernel_launcher.py", line 17, in <module>
      app.launch_new_instance()
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/traitlets/config/application.py", line 992, in launch_instance
      app.start()
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/ipykernel/kernelapp.py", line 711, in start
      self.io_loop.start()
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/tornado/platform/asyncio.py", line 215, in start
      self.asyncio_loop.run_forever()
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/asyncio/base_events.py", line 595, in run_forever
      self._run_once()
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/asyncio/base_events.py", line 1881, in _run_once
      handle._run()
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/asyncio/events.py", line 80, in _run
      self._context.run(self._callback, *self._args)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/ipykernel/kernelbase.py", line 510, in dispatch_queue
      await self.process_one()
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/ipykernel/kernelbase.py", line 499, in process_one
      await dispatch(*args)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/ipykernel/kernelbase.py", line 406, in dispatch_shell
      await result
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/ipykernel/kernelbase.py", line 729, in execute_request
      reply_content = await reply_content
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/ipykernel/ipkernel.py", line 411, in do_execute
      res = shell.run_cell(
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/ipykernel/zmqshell.py", line 531, in run_cell
      return super().run_cell(*args, **kwargs)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 2945, in run_cell
      result = self._run_cell(
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3000, in _run_cell
      return runner(coro)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner
      coro.send(None)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3203, in run_cell_async
      has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3382, in run_ast_nodes
      if await self.run_code(code, result, async_=asy):
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3442, in run_code
      exec(code_obj, self.user_global_ns, self.user_ns)
    File "/tmp/ipykernel_59692/1745903755.py", line 4, in <module>
      history = model.fit(
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/keras/engine/training.py", line 1650, in fit
      tmp_logs = self.train_function(iterator)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/keras/engine/training.py", line 1249, in train_function
      return step_function(self, iterator)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/keras/engine/training.py", line 1233, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/keras/engine/training.py", line 1222, in run_step
      outputs = model.train_step(data)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/keras/engine/training.py", line 1027, in train_step
      self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 527, in minimize
      self.apply_gradients(grads_and_vars)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1140, in apply_gradients
      return super().apply_gradients(grads_and_vars, name=name)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 634, in apply_gradients
      iteration = self._internal_apply_gradients(grads_and_vars)
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1166, in _internal_apply_gradients
      return tf.__internal__.distribute.interim.maybe_merge_call(
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1216, in _distributed_apply_gradients_fn
      distribution.extended.update(
    File "/home/siderealyear/anaconda3/envs/microbusiness/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1211, in apply_grad_to_update_var
      return self._update_step_xla(grad, var, id(self._var_key(var)))
Node: 'StatefulPartitionedCall_1'
libdevice not found at ./libdevice.10.bc
	 [[{{node StatefulPartitionedCall_1}}]] [Op:__inference_train_function_6875]
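
The last lines of the traceback are the real problem: XLA, which the optimizer uses to compile its update step (`_update_step_xla`), could not find CUDA's `libdevice.10.bc`. One commonly suggested workaround is to point XLA at the CUDA directory through the `XLA_FLAGS` environment variable before TensorFlow is imported. The sketch below assumes XLA looks for the file under `nvvm/libdevice/` inside that directory. The candidate paths and the helper name `find_cuda_data_dir` are guesses to adapt to the local install, not verified locations on this machine.

```python
import os
from pathlib import Path

def find_cuda_data_dir(candidates):
    """Return the first directory containing nvvm/libdevice/libdevice.10.bc, or None."""
    for candidate in candidates:
        if (Path(candidate) / 'nvvm' / 'libdevice' / 'libdevice.10.bc').is_file():
            return str(candidate)
    return None

# Hypothetical candidate locations; adjust for the local CUDA/conda install
conda_prefix = os.environ.get('CONDA_PREFIX', '')
candidates = [
    conda_prefix,
    os.path.join(conda_prefix, 'lib') if conda_prefix else '',
    '/usr/local/cuda',
    '/usr/lib/cuda',
]

cuda_data_dir = find_cuda_data_dir([c for c in candidates if c])

if cuda_data_dir is not None:
    # This must run before TensorFlow is imported to take effect
    os.environ['XLA_FLAGS'] = f'--xla_gpu_cuda_data_dir={cuda_data_dir}'
    print(f'XLA_FLAGS set to: {os.environ["XLA_FLAGS"]}')
else:
    print('libdevice.10.bc not found in any candidate directory')
```

If the file is not installed anywhere, another option to try is constructing the optimizer with `jit_compile=False`, which should keep the update step out of XLA, probably at some cost in speed.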