## Side Notes for Lesson 6

Note: these notes have not been reviewed or edited. Treat them as a quick-and-dirty toy example written live.

Today, we look at a very simple RNN (LSTM) model in Keras. Since the purpose of the notebook is to familiarize ourselves with the various inputs, outputs, dimensions, etc., we don't even bother to train the model - we simply define it, study the parameters/options and look at the model output for a specific example. 

And hopefully, everything makes sense...


In [None]:
import json
import pandas as pd
import numpy as np
import os
import sys
import tensorflow as tf
from time import time

import matplotlib.pyplot as plt
from matplotlib import colors
from matplotlib.ticker import PercentFormatter

from tensorflow.keras import backend as K
from tensorflow.keras import layers
from tensorflow.keras.layers import Lambda, Dense
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.backend import sparse_categorical_crossentropy

#### Define the model with the Functional API formalism in Keras.

This is very useful as it allows us to construct non-sequential models: models with multiple inputs, multiple outputs, branches, etc.

Key aspects:

* start with definition of inputs (input sequence and initial state(s))
* define the actions of layers as:
$$\rm layer\_activations = layer(layer\_parameters) (previous\_layer\_activations)$$  
* define the model by specifying input and output

In [None]:
def build_model(): 
    
    # Inputs, for each example in the batch:
    #   - a sequence of four x_t, each of dimension 3
    #   - the initial state: two 3-dimensional vectors (h and c)
    
    in_x = tf.keras.layers.Input(shape=(4,3), name="in_id")

    in_state_h = tf.keras.layers.Input(shape=(3,), name="in_state_h")
    in_state_c = tf.keras.layers.Input(shape=(3,), name="in_state_c")
    
    # define a very simple LSTM layer, ACTING on the input
    
    lstm_output = tf.keras.layers.LSTM(3, return_sequences=True, return_state=True,
                              kernel_initializer=tf.keras.initializers.RandomNormal(stddev=0.1),
                              recurrent_initializer=tf.keras.initializers.RandomNormal(stddev=0.1))\
            (in_x, initial_state=[in_state_h, in_state_c])

    model = tf.keras.models.Model([in_x, in_state_h, in_state_c], lstm_output)
    
    model.summary()
    
    return model

In [None]:
myModel = build_model()

### Questions:

* is the parameter count of '84' reported in the model summary correct?
* what would change if we set return_sequences=False?
* what would happen if we set return_state=False? 
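
One way to sanity-check these questions without running the model is to do the bookkeeping by hand. The sketch below is plain Python, not Keras code. The helper names `lstm_param_count` and `lstm_output_shapes` are made up for illustration, and the sketch assumes a standard LSTM layer with a bias term and the usual Keras conventions for `return_sequences` and `return_state`.

```python
def lstm_param_count(input_dim: int, units: int) -> int:
    """Trainable parameters of a standard LSTM layer with bias (hypothetical helper)."""
    # Four blocks (input gate, forget gate, cell candidate, output gate), each with
    # an input kernel (input_dim x units), a recurrent kernel (units x units)
    # and a bias vector (units).
    per_block = input_dim * units + units * units + units
    return 4 * per_block


def lstm_output_shapes(batch, timesteps, units, return_sequences, return_state):
    """Shapes of the tensors an LSTM layer returns (hypothetical helper)."""
    # With return_sequences=True we get one hidden state per time step,
    # otherwise only the hidden state of the last step.
    main = (batch, timesteps, units) if return_sequences else (batch, units)
    shapes = [main]
    if return_state:
        # The final h and c are appended, one vector per example each.
        shapes += [(batch, units), (batch, units)]
    return shapes


# The toy model above: 4 time steps, 3-dimensional inputs, 3 LSTM units.
print(lstm_param_count(input_dim=3, units=3))
print(lstm_output_shapes(1, 4, 3, return_sequences=True, return_state=True))
print(lstm_output_shapes(1, 4, 3, return_sequences=False, return_state=True))
print(lstm_output_shapes(1, 4, 3, return_sequences=True, return_state=False))
```

Note that with `return_state=False` the layer returns a single tensor, so code that unpacks three outputs from the model would have to change accordingly.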


### Model Output
 
Now we create some very simple sample input: the fake 'embeddings' of four words, plus the initial $h$ and $c$ that define the initial state. We then let the model 'predict' the output for this input (i.e., calculate it from the random initializations, as we didn't train the model).

Do the results make sense?

In [None]:
lstm_input = np.array([[[1.1,2,3], [4,5,6], [7,8,9], [10,11,12]]])
initial_h = np.array([[1., 4, 3]])
initial_c = np.array([[1., 2, 6]])

out, state_h, state_c = myModel.predict([lstm_input,initial_h, initial_c],batch_size=4)


print('output_vector', out)
print('out_state_h ', state_h)
print('out_state_c ', state_c)



print('')

### Questions:
* if we wondered whether we confused out_state_h and out_state_c, how could we tell?
* if we change the initial states, do we get different results? (Of course we do, but still good to see.)
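
To make the first question concrete, here is a minimal NumPy sketch of an LSTM forward pass. It is not the Keras implementation: the function `lstm_forward`, the weight names `W`, `U`, `b`, the random seed and the zero bias are all assumptions made for illustration, and the block order (input, forget, candidate, output) is assumed to follow the Keras convention. The point is only the structure of the update: since $h = o \cdot \tanh(c)$, every entry of $h$ has magnitude below 1, while $c$ has no such bound, and with `return_sequences=True` the last row of the output sequence is exactly the final $h$.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_forward(x, h0, c0, W, U, b):
    """Run a single-example LSTM over x of shape (timesteps, input_dim).

    W: (input_dim, 4*units), U: (units, 4*units), b: (4*units,), with the
    four blocks assumed to be ordered input, forget, candidate, output.
    Returns all hidden states plus the final h and c.
    """
    units = h0.shape[0]
    h, c = h0, c0
    outputs = []
    for x_t in x:
        z = x_t @ W + h @ U + b
        i = sigmoid(z[:units])                 # input gate
        f = sigmoid(z[units:2 * units])        # forget gate
        g = np.tanh(z[2 * units:3 * units])    # cell candidate
        o = sigmoid(z[3 * units:])             # output gate
        c = f * c + i * g                      # cell state: not squashed
        h = o * np.tanh(c)                     # hidden state: bounded by 1
        outputs.append(h)
    return np.stack(outputs), h, c


# Random weights with the same scale as in the toy model (seed is arbitrary).
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(3, 12))
U = rng.normal(scale=0.1, size=(3, 12))
b = np.zeros(12)

x = np.array([[1.1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
out, state_h, state_c = lstm_forward(
    x, np.array([1., 4, 3]), np.array([1., 2, 6]), W, U, b
)

print(np.allclose(out[-1], state_h))   # does the last output row match the final h?
print(np.abs(state_h).max())           # largest magnitude in h
print(state_c)                         # c, for comparison
```

The same two checks can be applied to the Keras output above: compare the last time step of `out` with `state_h`, and look at which of the two state vectors stays inside $(-1, 1)$.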
