In [1]:
%matplotlib inline

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import tensorflow as tf
from tensorflow.keras.layers import SimpleRNN

# RNN IO Shapes

Example dataset:
- #1: 'I like tomatoes in pasta.' - 6 tokens (5 words plus the final punctuation mark)
- #2: 'My hair is black.' - 5 tokens
- ...
- #1000: 'Do you know what the weather will be tomorrow?' - 10 tokens


Notes:
- The corpus contains 1000 sentences. One sentence = one document.
- The batch size is 32: 32 of the 1000 sentences are processed before each gradient descent (GD) update. `batchsize=32`
- Each sentence consists of words; the maximum sentence length is 10. `timestep=sequence=10`
- Each word is represented as an 8x1 vector (that is, the number of features is 8). `features=8`
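
The batching described in the notes can be sketched with NumPy alone. The array name `corpus` and its random values are placeholders standing in for real word vectors (an assumption for illustration); only the shapes matter here.

```python
import numpy as np

# Hypothetical stand-in for the corpus: 1000 sentences, each padded or
# truncated to 10 time steps, each word an 8-dimensional vector.
n_sentences, timesteps, features = 1000, 10, 8
batch_size = 32

rng = np.random.default_rng(0)
corpus = rng.random((n_sentences, timesteps, features)).astype(np.float32)

# Slice the corpus into consecutive batches of (up to) 32 sentences.
batches = [corpus[i:i + batch_size] for i in range(0, n_sentences, batch_size)]

n_batches = len(batches)
first_batch_shape = batches[0].shape   # a full batch
last_batch_shape = batches[-1].shape   # the remainder batch
print(n_batches, first_batch_shape, last_batch_shape)
```

Since 1000 is not a multiple of 32, this gives 31 full batches plus a final batch holding the 8 remaining sentences.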

In [3]:
inputs = np.random.random([32, 10, 8]).astype(np.float32)
inputs.shape

(32, 10, 8)

- The layer has 4 units (neurons).
- Each unit produces a single number at each time step (= when a word is 'fed').
- The number of time steps equals the length of the sequence/sentence, capped at the maximum length of 10 (any remaining words are not used).
- With the default settings, only the values from the last time step are returned for each of the 32 sentences -> the output shape is 32 sentences (in the batch) * 4 values (one from each unit in the `SimpleRNN` layer).
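
To make these points concrete, the recurrence can be written out by hand in NumPy. This is a minimal sketch, assuming the default `tanh` activation and a zero initial state; the weight names `W`, `U`, and `b` and their random values are illustrative, not the weights Keras would initialize.

```python
import numpy as np

batch, timesteps, features, units = 32, 10, 8, 4
rng = np.random.default_rng(0)

x = rng.random((batch, timesteps, features)).astype(np.float32)

# Illustrative weights (random here; a real layer learns them).
W = rng.normal(size=(features, units)).astype(np.float32)  # input -> hidden
U = rng.normal(size=(units, units)).astype(np.float32)     # hidden -> hidden
b = np.zeros(units, dtype=np.float32)                      # bias

h = np.zeros((batch, units), dtype=np.float32)  # initial hidden state
all_states = []
for t in range(timesteps):
    # One time step: every unit emits one number per sentence.
    h = np.tanh(x[:, t, :] @ W + h @ U + b)
    all_states.append(h)

last_output = h                                  # shape (32, 4)
sequence_output = np.stack(all_states, axis=1)   # shape (32, 10, 4)
n_params = W.size + U.size + b.size              # 8*4 + 4*4 + 4
print(last_output.shape, sequence_output.shape, n_params)
```

Here `last_output` plays the role of what the layer returns by default, while stacking every intermediate state yields one 4-vector per time step. Under these shapes the layer has 8*4 + 4*4 + 4 = 52 weights.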

In [7]:
simple_rnn = SimpleRNN(units=4)
output = simple_rnn(inputs)
output.shape

TensorShape([32, 4])

We can also get the outputs after each word (each element/token of the sequence) by setting `return_sequences=True`:

In [6]:
simple_rnn = SimpleRNN(4, return_sequences=True)
simple_rnn(inputs).shape

TensorShape([32, 10, 4])
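
As a quick check, the last time step of the full sequence output should match the layer's final hidden state. A minimal sketch, assuming the `return_state=True` argument of `SimpleRNN`, which returns the final state alongside the output from the same layer call (so both results share the same weights):

```python
import numpy as np
from tensorflow.keras.layers import SimpleRNN

inputs = np.random.random([32, 10, 8]).astype(np.float32)

# return_state=True additionally returns the final hidden state.
layer = SimpleRNN(4, return_sequences=True, return_state=True)
sequence_output, final_state = layer(inputs)

sequence_shape = tuple(sequence_output.shape)  # per-step outputs
state_shape = tuple(final_state.shape)         # final hidden state

# Compare the last time step of the sequence with the final state.
same = np.allclose(np.asarray(sequence_output)[:, -1, :],
                   np.asarray(final_state))
print(sequence_shape, state_shape, same)
```

Comparing two separately constructed layers would not work for this check, since each one gets its own randomly initialized weights.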