## Side Notes for Lesson 6

Note: this notebook has not been reviewed or edited. Treat it as a quick-and-dirty toy example written live.

In [1]:
import json
import pandas as pd
import numpy as np
import os
import sys
import tensorflow as tf
import tensorflow_hub as hub
from time import time

import matplotlib.pyplot as plt
from matplotlib import colors
from matplotlib.ticker import PercentFormatter

from tensorflow.keras import backend as K
from tensorflow.keras import layers
from tensorflow.keras.layers import Lambda, Dense
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.backend import sparse_categorical_crossentropy

#### Define the model with the Functional API formalism in Keras.

This is very useful because it lets us construct non-sequential models: models with multiple inputs, multiple outputs, branches, etc.

Key aspects:

* start with definition of inputs
* define the model as:
$$\rm next\_layer\_activations = layer(layer\_parameters) (previous\_layer\_activations)$$
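
A minimal sketch of this call pattern, assuming the TensorFlow Keras API imported above. The names (`in_a`, `in_b`, `toy_model`) and layer sizes are made up for illustration and are not part of the lesson's model.

```python
import tensorflow as tf

# Start with the definition of the inputs (two of them here).
in_a = tf.keras.layers.Input(shape=(3,), name="in_a")
in_b = tf.keras.layers.Input(shape=(5,), name="in_b")

# next_activations = layer(layer_parameters)(previous_activations)
hidden_a = tf.keras.layers.Dense(4, activation="relu")(in_a)
hidden_b = tf.keras.layers.Dense(4, activation="relu")(in_b)

# Merge the two branches, then add a single output layer.
merged = tf.keras.layers.Concatenate()([hidden_a, hidden_b])
output = tf.keras.layers.Dense(1)(merged)

# The model is defined by its inputs and outputs.
toy_model = tf.keras.models.Model([in_a, in_b], output)
```

Each layer is first constructed with its parameters and then called on the previous activations, which is what makes branching and merging straightforward.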

In [2]:
def build_model(): 
    
    # Inputs, for each example in the batch:
    #   - four time steps x_t, each of dimension 3
    #   - the initial state: two 3-d vectors (h and c)
    
    in_x = tf.keras.layers.Input(shape=(4,3), name="in_id")
    in_state_h = tf.keras.layers.Input(shape=(3,), name="in_state_h")
    in_state_c = tf.keras.layers.Input(shape=(3,), name="in_state_c")
        
        
    # combine in_state_h and in_state_c into the in_state
    in_state = [in_state_h, in_state_c]
    
    # define a very simple LSTM layer, acting on the input
    
    lstm_output = tf.keras.layers.LSTM(3, return_sequences=True, return_state=True,
                              kernel_initializer=tf.keras.initializers.RandomNormal(stddev=0.1),
                              recurrent_initializer=tf.keras.initializers.RandomNormal(stddev=0.1))\
            (in_x, initial_state=in_state)

    model = tf.keras.models.Model([in_x, in_state_h, in_state_c], lstm_output)
    
    model.summary()
    
    return model

In [3]:
myModel = build_model()

W1003 14:59:54.957829 4661106112 deprecation.py:506] From /anaconda3/envs/tf1_14/lib/python3.7/site-packages/tensorflow/python/keras/initializers.py:143: calling RandomNormal.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor


Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
in_id (InputLayer)              [(None, 4, 3)]       0                                            
__________________________________________________________________________________________________
in_state_h (InputLayer)         [(None, 3)]          0                                            
__________________________________________________________________________________________________
in_state_c (InputLayer)         [(None, 3)]          0                                            
__________________________________________________________________________________________________
lstm (LSTM)                     [(None, 4, 3), (None 84          in_id[0][0]                      
                                                                 in_state_h[0][0]             

### Questions:

* is '84' correct?
* what would change if we set return_sequences=False?
* what would happen if we set return_state=False? 
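
One way to check the first question by hand is sketched below. It assumes the standard LSTM parameterization with a bias term (the Keras default): four gates, each with an input kernel, a recurrent kernel, and a bias. The helper `lstm_param_count` is a hypothetical name introduced here, not a Keras function.

```python
def lstm_param_count(input_dim, units):
    """Count trainable parameters of a standard LSTM layer with biases."""
    kernel = input_dim * units      # weights applied to x_t, per gate
    recurrent = units * units       # weights applied to h_{t-1}, per gate
    bias = units                    # one bias vector per gate
    return 4 * (kernel + recurrent + bias)


# Input dimension 3 and 3 units, as in build_model().
n_params = lstm_param_count(input_dim=3, units=3)
print(n_params)  # 84
```

Note that the count does not depend on the number of time steps, because the same weights are reused at every step.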

In [5]:
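# one example (batch of 1) with 4 time steps, each of dimension 3
lstm_input = np.array([[[1.1,2,3], [4,5,6], [7,8,9], [10,11,12]]])
# initial hidden state h and cell state c, one row per example
initial_h = np.array([[1.,2,3]]*1)
initial_c = np.array([[1,2,1]]*1)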


out, state_h, state_c = myModel.predict([lstm_input,initial_h,initial_c],batch_size=4)


print('output_vector', out)
print('out_state_h ', state_h)
print('out_state_c ', state_c)



print('')

output_vector [[[0.18031003 0.52822155 0.17851658]
  [0.12420709 0.40717268 0.08703903]
  [0.         0.35303497 0.00636824]
  [0.         0.31276196 0.        ]]]
out_state_h  [[0.         0.31276196 0.        ]]
out_state_c  [[1.5253313  1.3554394  0.24917921]]



### Questions:
* if we wondered whether we confused out_state_h and out_state_c, how could we tell?
* if we change the initial states, do we get different results? (Of course we do, but still good to see.)
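
A possible check for the first question, sketched with the values printed above: with `return_sequences=True`, the last time step of the output sequence is the final hidden state, so it should match `out_state_h` and not `out_state_c`. The arrays are copied from the printed output and the variable names are chosen here for illustration.

```python
import numpy as np

# Values copied from the printed output of the predict() call.
out = np.array([[[0.18031003, 0.52822155, 0.17851658],
                 [0.12420709, 0.40717268, 0.08703903],
                 [0.0,        0.35303497, 0.00636824],
                 [0.0,        0.31276196, 0.0]]])
state_h = np.array([[0.0, 0.31276196, 0.0]])
state_c = np.array([[1.5253313, 1.3554394, 0.24917921]])

# Last time step of the output sequence, shape (batch, units).
last_step = out[:, -1, :]

h_matches = bool(np.allclose(last_step, state_h))
c_matches = bool(np.allclose(last_step, state_c))
print(h_matches, c_matches)  # True False
```

If the two states had been swapped, the comparison against `state_c` would be the one that matches.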
