In [1]:
from keras.models import Model
from keras.layers import Input, LSTM, GRU
import numpy as np

In [5]:
timesteps = 8
num_features = 2
num_nodes = 3

# Create some random data
data = np.random.randn(1, timesteps, num_features)
print(data)
print(data.shape)

[[[ 0.42617047  2.30064248]
  [ 0.1203544   1.24432363]
  [-0.64570541  0.46940303]
  [ 0.81961845 -2.4221855 ]
  [-0.4573022   0.1680888 ]
  [-0.11746385 -0.3638399 ]
  [ 0.58524095  0.86710422]
  [ 0.19481456  0.92405435]]]
(1, 8, 2)


Q1: What do the dimensions of the data represent? E.g., what do the 1st dimension, the 2nd, and the 3rd represent here?
A1: The shape is (batch size, timesteps, features): 1 sample in the batch, 8 time steps in that sample, and 2 features at each time step.
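As a quick illustration of A1, the sketch below indexes an array of the same shape along each axis. The variable names (first_sample, step_3, feature_0_series) are made up for this example and do not appear in the notebook.

```python
import numpy as np

timesteps, num_features = 8, 2
data = np.random.randn(1, timesteps, num_features)

# Unpack the three axes: (batch size, timesteps, features)
batch_size, n_steps, n_feats = data.shape

first_sample = data[0]            # the only sample in the batch, shape (8, 2)
step_3 = data[0, 3]               # feature vector at time step index 3, shape (2,)
feature_0_series = data[0, :, 0]  # feature 0 across all time steps, shape (8,)

print(batch_size, n_steps, n_feats)
print(first_sample.shape, step_3.shape, feature_0_series.shape)
```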

In [6]:
inputs = Input(shape=(timesteps, num_features))
rnn = LSTM(num_nodes, return_state=True, return_sequences=False)
x = rnn(inputs)
model = Model(inputs=inputs, outputs=x)
output, h, c = model.predict(data)
print("output:", output, "dimensions:", output.shape)
print("h:", h, "dimensions:", h.shape)
print("c:", c, "dimensions:", c.shape)

output: [[-0.06304687  0.10006328 -0.01493886]] dimensions: (1, 3)
h: [[-0.06304687  0.10006328 -0.01493886]] dimensions: (1, 3)
c: [[-0.13980696  0.21100064 -0.02947333]] dimensions: (1, 3)


Q2: What do these "predict" return values (output, h, and c) represent? Describe what they mean.
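A2: output is the layer's output; with return_sequences=False it is the hidden state at the final time step, one value per node. h is that same final hidden state, returned separately because return_state=True. c is the cell state at the final time step, the LSTM's internal memory that is carried from step to step and from which h is computed.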

Q3: What are the dimensions of output, h, and c?
A3: All three have shape (1, 3), i.e. (batch size, num_nodes).

Q4: Is there any duplication among the values of (output, h, c)? If so, explain.
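A4: Yes, output and h are identical. An LSTM layer's output is its hidden state, and with return_sequences=False only the last time step is returned, so output is the final hidden state, which is exactly what h holds. c differs because it is the internal cell state; h is derived from it as h = o * tanh(c), where o is the output gate.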
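To make the relationship between output, h, and c concrete, here is a minimal NumPy sketch of an LSTM forward pass over one sequence. It is not the Keras implementation: the function name lstm_forward, the random weights, the gate ordering in the weight matrices, and the omission of the batch dimension are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_forward(x, W, U, b, units):
    """Run a single-layer LSTM over one sequence (hypothetical helper).

    x: (timesteps, features), W: (features, 4*units),
    U: (units, 4*units), b: (4*units,)
    Returns the hidden state at every step, the final h, and the final c.
    """
    h = np.zeros(units)
    c = np.zeros(units)
    hidden_states = []
    for x_t in x:
        z = x_t @ W + h @ U + b
        i = sigmoid(z[:units])                   # input gate
        f = sigmoid(z[units:2 * units])          # forget gate
        g = np.tanh(z[2 * units:3 * units])      # candidate cell values
        o = sigmoid(z[3 * units:])               # output gate
        c = f * c + i * g                        # cell state update
        h = o * np.tanh(c)                       # hidden state, also the step's output
        hidden_states.append(h)
    return np.stack(hidden_states), h, c

rng = np.random.default_rng(0)
timesteps, num_features, num_nodes = 8, 2, 3
x = rng.standard_normal((timesteps, num_features))
W = rng.standard_normal((num_features, 4 * num_nodes))
U = rng.standard_normal((num_nodes, 4 * num_nodes))
b = np.zeros(4 * num_nodes)

all_h, h_last, c_last = lstm_forward(x, W, U, b, num_nodes)
print("last output row equals h:", np.array_equal(all_h[-1], h_last))
```

In this sketch the per-step output is simply h, so the last output row and the final h coincide by construction, while c is a separate vector.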

In [8]:
inputs = Input(shape=(timesteps, num_features))
rnn = LSTM(num_nodes, return_state=False, return_sequences=False)
x = rnn(inputs)
model = Model(inputs=inputs, outputs=x)
output = model.predict(data)
print("output:", output, "dimensions:", output.shape)

output: [[-0.01948409  0.20419614 -0.1382258 ]] dimensions: (1, 3)


Q5: Change the LSTM to use "return_state=False" and modify the code accordingly to avoid errors. What does the LSTM return now?
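A5: Only the output: a single array of shape (1, 3) containing the final hidden state. The hidden and cell states are no longer returned separately. The values differ from the previous cell because the new layer starts with freshly initialized random weights.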

In [12]:
inputs = Input(shape=(timesteps, num_features))
rnn = LSTM(num_nodes, return_state=True, return_sequences=True)
x = rnn(inputs)
model = Model(inputs=inputs, outputs=x)
output, h, c = model.predict(data)
print("output:", output, "dimensions:", output.shape)
print("h:", h, "dimensions:", h.shape)
print("c:", c, "dimensions:", c.shape)

output: [[[-0.08331491  0.09349339 -0.04164696]
  [-0.13099748  0.1851472  -0.04789009]
  [-0.12819001  0.2980693  -0.06047511]
  [ 0.07423662 -0.03120806  0.00975953]
  [ 0.08014288  0.04498491 -0.01247792]
  [ 0.0872453  -0.00790709 -0.00197999]
  [ 0.03592693  0.02363204  0.00746143]
  [-0.03782576  0.09020159 -0.00870624]]] dimensions: (1, 8, 3)
h: [[-0.03782576  0.09020159 -0.00870624]] dimensions: (1, 3)
c: [[-0.06392667  0.22244294 -0.01704968]] dimensions: (1, 3)


Q6: Now, what are the dimensions of output, h, and c?
A6: (1,8,3) for output, (1,3) for hidden state and (1,3) for cell state

Q7: What does the output variable represent now? Explain the dimensions of the returned matrix (e.g., what does the 1st dimension represent? What does the 2nd represent? The 3rd? etc.)
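A7: output now holds the hidden state at every time step instead of only the last one. Its shape (1, 8, 3) is (batch size, timesteps, nodes): 1 sample, 8 time steps, and 3 LSTM nodes, so output[0, t] is the hidden state after time step t (zero-indexed).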
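The axes can be checked directly on the values printed by the cell above. The sketch below copies those printed numbers into NumPy arrays (rounded as displayed) and indexes them; the names step_0, node_1_over_time, and last_row_matches_h are made up for this example.

```python
import numpy as np

# Values copied from the printed LSTM output above (return_sequences=True)
output = np.array([[[-0.08331491,  0.09349339, -0.04164696],
                    [-0.13099748,  0.1851472,  -0.04789009],
                    [-0.12819001,  0.2980693,  -0.06047511],
                    [ 0.07423662, -0.03120806,  0.00975953],
                    [ 0.08014288,  0.04498491, -0.01247792],
                    [ 0.0872453,  -0.00790709, -0.00197999],
                    [ 0.03592693,  0.02363204,  0.00746143],
                    [-0.03782576,  0.09020159, -0.00870624]]])
h = np.array([[-0.03782576, 0.09020159, -0.00870624]])

step_0 = output[0, 0]               # hidden state after the first time step, shape (3,)
node_1_over_time = output[0, :, 1]  # one node's value at every time step, shape (8,)

# The last time step of the sequence output is the final hidden state
last_row_matches_h = np.allclose(output[:, -1, :], h)
print(output.shape, step_0.shape, node_1_over_time.shape, last_row_matches_h)
```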

In [14]:
inputs = Input(shape=(timesteps, num_features))
rnn = GRU(num_nodes, return_state=True)
x = rnn(inputs)
model = Model(inputs=inputs, outputs=x)
output, h = model.predict(data)
print("o:", output, "dimensions:", output.shape)
print("h:", h, "dimensions:", h.shape)

o: [[ 0.13488288 -0.35273033  0.31479627]] dimensions: (1, 3)
h: [[ 0.13488288 -0.35273033  0.31479627]] dimensions: (1, 3)


Q8: What happened to the variable "c"? Why does the GRU only return a tuple of 2 values as opposed to 3?
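A8: A GRU has no separate cell state. It keeps a single hidden state that serves as both its memory and its output, so return_state=True yields only (output, h).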
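For comparison with the LSTM, here is a minimal NumPy sketch of a GRU forward pass using a common textbook formulation. It is a simplification, not the Keras implementation: the function name gru_forward, the random weights, the gate layout in the weight matrices, and the missing batch dimension are assumptions for illustration.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_forward(x, W, U, b, units):
    """Run a single-layer GRU over one sequence (hypothetical helper).

    x: (timesteps, features), W: (features, 3*units),
    U: (units, 3*units), b: (3*units,)
    Returns the hidden state at every step and the final hidden state.
    """
    h = np.zeros(units)  # the only state the GRU carries between steps
    states = []
    for x_t in x:
        xw = x_t @ W + b
        z = sigmoid(xw[:units] + h @ U[:, :units])                    # update gate
        r = sigmoid(xw[units:2 * units] + h @ U[:, units:2 * units])  # reset gate
        h_cand = np.tanh(xw[2 * units:] + (r * h) @ U[:, 2 * units:]) # candidate state
        h = z * h + (1.0 - z) * h_cand                                # blend old and new
        states.append(h)
    return np.stack(states), h

rng = np.random.default_rng(0)
timesteps, num_features, num_nodes = 8, 2, 3
x = rng.standard_normal((timesteps, num_features))
W = rng.standard_normal((num_features, 3 * num_nodes))
U = rng.standard_normal((num_nodes, 3 * num_nodes))
b = np.zeros(3 * num_nodes)

gru_states, gru_h = gru_forward(x, W, U, b, num_nodes)
print(gru_states.shape, gru_h.shape)
```

Unlike the LSTM, the loop carries only h from one step to the next, which is why there is no third value to return.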

Q9: What are the dimensions of output and h?
A9: Both have shape (1, 3), i.e. (batch size, num_nodes).

Q10: Is there any duplication between the values of output and h? If so, explain.
A10: Yes, output and h are identical. A GRU layer's output is its hidden state, and with return_sequences left at its default of False only the last time step is returned, so both variables hold the final hidden state.

In [15]:
inputs = Input(shape=(timesteps, num_features))
rnn = GRU(num_nodes, return_state=True, return_sequences=True)
x = rnn(inputs)
model = Model(inputs=inputs, outputs=x)
output, h = model.predict(data)
print("o:", output, "dimensions:", output.shape)
print("h:", h, "dimensions:", h.shape)

o: [[[ 0.5338172   0.12903047 -0.35914803]
  [ 0.5023333   0.18508166 -0.47603512]
  [ 0.5234302   0.32511708 -0.31396922]
  [ 0.14841181 -0.5738381   0.1537786 ]
  [ 0.18573833 -0.15820807  0.11780378]
  [ 0.0477335  -0.11498538  0.13743234]
  [ 0.03314649 -0.15249561 -0.23270506]
  [ 0.1755483  -0.0050875  -0.34945416]]] dimensions: (1, 8, 3)
h: [[ 0.1755483  -0.0050875  -0.34945416]] dimensions: (1, 3)


Q11: Now, what are the dimensions of output and h?
A11: (1,8,3) for output and (1,3) for hidden state

Q12: Is there any duplication within the values of output and h? If so, explain.
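A12: Yes. The last row of output (the 8th time step) is identical to h, because output contains the hidden state at every time step and h is the hidden state at the final one.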