# 0. Imports

In [1]:
import numpy as np
import tensorflow as tf

# 1. Setup Dummy Data for Demonstration
## 1.1 create data in numpy arrays

First we create some dummy data from which we want to generate a dataset. The dummy images are in channels-first order, which makes the printed arrays more compact and easier to read. In this example the network takes ```2``` input images, one per time step (so no stereo images and no ```sequence_length > 2```).

In [2]:
width = height = 8
channels = 1
n_images = 6
inputs_0 = []
inputs_1 = []
for i in range(n_images):
    inputs_0.append(np.ones((channels, width, height))*i)
    inputs_1.append(np.ones((channels, width, height))*(i+1))

inputs_0 = np.array(inputs_0)
inputs_1 = np.array(inputs_1)
labels   = np.array([['tx01','ty01','tz01','r01','p01','y01'],
                     ['tx12','ty12','tz12','r12','p12','y12'],
                     ['tx23','ty23','tz23','r23','p23','y23'],
                     ['tx34','ty34','tz34','r34','p34','y34'],
                     ['tx45','ty45','tz45','r45','p45','y45'],
                     ['tx56','ty56','tz56','r56','p56','y56']])

print('inputs_0:\n', inputs_0)
print('inputs_1:\n', inputs_1)
print('labels:\n', labels)

inputs_0:
 [[[[0. 0. 0. 0. 0. 0. 0. 0.]
   [0. 0. 0. 0. 0. 0. 0. 0.]
   [0. 0. 0. 0. 0. 0. 0. 0.]
   [0. 0. 0. 0. 0. 0. 0. 0.]
   [0. 0. 0. 0. 0. 0. 0. 0.]
   [0. 0. 0. 0. 0. 0. 0. 0.]
   [0. 0. 0. 0. 0. 0. 0. 0.]
   [0. 0. 0. 0. 0. 0. 0. 0.]]]


 [[[1. 1. 1. 1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1. 1. 1. 1.]]]


 [[[2. 2. 2. 2. 2. 2. 2. 2.]
   [2. 2. 2. 2. 2. 2. 2. 2.]
   [2. 2. 2. 2. 2. 2. 2. 2.]
   [2. 2. 2. 2. 2. 2. 2. 2.]
   [2. 2. 2. 2. 2. 2. 2. 2.]
   [2. 2. 2. 2. 2. 2. 2. 2.]
   [2. 2. 2. 2. 2. 2. 2. 2.]
   [2. 2. 2. 2. 2. 2. 2. 2.]]]


 [[[3. 3. 3. 3. 3. 3. 3. 3.]
   [3. 3. 3. 3. 3. 3. 3. 3.]
   [3. 3. 3. 3. 3. 3. 3. 3.]
   [3. 3. 3. 3. 3. 3. 3. 3.]
   [3. 3. 3. 3. 3. 3. 3. 3.]
   [3. 3. 3. 3. 3. 3. 3. 3.]
   [3. 3. 3. 3. 3. 3. 3. 3.]
   [3. 3. 3. 3. 3. 3. 3. 3.]]]


 [[[4. 4. 4. 4. 4. 4. 4. 4.]
   [4. 4. 4. 4. 

## 1.2 put numpy data into tf.data.Dataset objects

In [3]:
## prepare inputs
ds_inputs_0 = tf.data.Dataset.from_tensor_slices(inputs_0)
ds_inputs_1 = tf.data.Dataset.from_tensor_slices(inputs_1)
# create a dataset that returns a tuple (image_0, image_1)
ds_inputs   = tf.data.Dataset.zip((ds_inputs_0, ds_inputs_1))

## prepare labels
ds_labels = tf.data.Dataset.from_tensor_slices(labels)

## zip together the input images and the labels so that a tuple ((image_0, image_1), labels) is returned
ds_zip = tf.data.Dataset.zip((ds_inputs, ds_labels))

# print output shapes of ds_zip
print(ds_zip)

<ZipDataset shapes: (((1, 8, 8), (1, 8, 8)), (6,)), types: ((tf.float64, tf.float64), tf.string)>


If we wanted to train on non-sequence data, ```ds_zip``` could now be shuffled and batched to obtain a finalized dataset for training with the ```tf.keras.Model.fit()``` training loop, i.e. the output structure of ```ds_zip``` matches what the ```tf.keras``` module expects.
But since we want to learn from sequences, the dataset pipeline needs some further processing first.
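
As a rough illustration of the non-sequence case mentioned above, the following sketch shuffles and batches a small paired dataset. It is self-contained and therefore rebuilds the dummy images; the names ```images_0```, ```images_1```, ```poses``` and ```ds_train``` are made up for this example, and the numeric ```poses``` array is only an assumed stand-in for the string labels used in this notebook.

```python
import numpy as np
import tensorflow as tf

n_images, channels, width, height = 6, 1, 8, 8

# Image i is filled with the value i, image i+1 with the value i+1
images_0 = np.stack([np.full((channels, width, height), float(i))
                     for i in range(n_images)])
images_1 = images_0 + 1.0
# Assumption: numeric placeholder poses instead of the string labels above
poses = np.zeros((n_images, 6), dtype=np.float32)

# Each element has the structure ((image_0, image_1), labels)
ds_pairs = tf.data.Dataset.from_tensor_slices(((images_0, images_1), poses))

# Shuffle over the whole dataset, then group the elements into batches of 2
ds_train = ds_pairs.shuffle(buffer_size=n_images).batch(2)

(batch_0, batch_1), batch_labels = next(iter(ds_train))
print(batch_0.shape, batch_1.shape, batch_labels.shape)
```

Shuffling happens per element, so the two images of a pair and their labels stay together.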

# 2. Slice Dataset into Subsequences
What we want is a transformation that takes an array ```[1,2,3,4]``` and outputs, for example, an array of the form ```[[1,2], [2,3], [3,4]]```. Such a transformation can be achieved with the ```tf.data.Dataset.window()``` function:

In [4]:
ds_window_unmapped = ds_zip.window(size=3, shift=1, stride=1, drop_remainder=True)
print(ds_window_unmapped)

<WindowDataset shapes: ((DatasetSpec(TensorSpec(shape=(1, 8, 8), dtype=tf.float64, name=None), TensorShape([])), DatasetSpec(TensorSpec(shape=(1, 8, 8), dtype=tf.float64, name=None), TensorShape([]))), DatasetSpec(TensorSpec(shape=(6,), dtype=tf.string, name=None), TensorShape([]))), types: ((DatasetSpec(TensorSpec(shape=(1, 8, 8), dtype=tf.float64, name=None), TensorShape([])), DatasetSpec(TensorSpec(shape=(1, 8, 8), dtype=tf.float64, name=None), TensorShape([]))), DatasetSpec(TensorSpec(shape=(6,), dtype=tf.string, name=None), TensorShape([])))>


You can see that ```tf.data.Dataset.window()``` returns a dataset of datasets rather than a dataset of arrays, so we need to map the inner datasets back to arrays by hand. This can be done with ```tf.data.Dataset.flat_map()```, which takes a function that maps each element of the dataset to a new dataset and flattens the results. We therefore implement a class ```mapper``` that holds a member ```window_size``` and defines the mapping function ```map_to_batch```. The wrapping class gives us a parameterized mapping function, since ```flat_map()``` only passes the dataset element to the function and we cannot hand it additional arguments directly. ```map_to_batch``` builds an array from each inner dataset by simply using the ```tf.data.Dataset.batch()``` function.

In [5]:
## define class that holds parameterized function to map on dataset
class mapper():
    def __init__(self, window_size):
        self.window_size = window_size
        
    def map_to_batch(self, *sub):
        tmp = tf.data.Dataset.zip(
                (tf.data.Dataset.zip((sub[0][0].batch(self.window_size), sub[0][1].batch(self.window_size))),
                sub[1].batch(self.window_size)))
        return tmp
    
## map entries of windowed dataset into flat arrays of sequences
ds_window = ds_window_unmapped.flat_map(mapper(3).map_to_batch)
print(ds_window)

<FlatMapDataset shapes: (((None, 1, 8, 8), (None, 1, 8, 8)), (None, 6)), types: ((tf.float64, tf.float64), tf.string)>


You can see that the entries of the dataset are now of the shape ```((None, channels, width, height), (None, labels))```, where the ```None``` dimension represents the time dimension of the sequence.
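
The ```[1,2,3,4]``` example from the beginning of this section can be reproduced in a few lines. This is only a minimal sketch on a one-dimensional dataset; it uses a lambda that captures ```window_size``` instead of the ```mapper``` class, which is one possible alternative for parameterizing the mapping function. The names ```ds_windows``` and ```ds_sequences``` are made up for this example.

```python
import tensorflow as tf

window_size = 2  # length of each subsequence

# Dataset yielding the scalars 1, 2, 3, 4
ds = tf.data.Dataset.range(1, 5)

# window() yields one small dataset per window
ds_windows = ds.window(window_size, shift=1, drop_remainder=True)

# flat_map() + batch() turns each inner dataset into a single tensor;
# the lambda captures window_size from the enclosing scope
ds_sequences = ds_windows.flat_map(lambda window: window.batch(window_size))

sequences = [seq.numpy().tolist() for seq in ds_sequences]
print(sequences)  # [[1, 2], [2, 3], [3, 4]]
```

Because the elements here are plain scalars, each window is a single dataset. For the nested ```((image_0, image_1), labels)``` elements used above, each component arrives as its own dataset, which is why ```map_to_batch``` batches and zips them separately.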

# 3. Further process Dataset to get finalized Dataset
## 3.1 Map Inputs to Layernames
Using ```tf.keras.Model.fit()``` as the training loop allows us to precisely map each input to a given input layer by passing a dictionary instead of an array as inputs. This makes it unambiguous which input is passed to which layer. The function ```map_to_dict()``` therefore maps the first element of the tuple returned by the dataset (the input sequences) to a dictionary using ```tf.data.Dataset.map()```:

In [6]:
## define mapping of input images to input layernames s.t. dataset returns dictionaries as inputs
def map_to_dict(*sub):
    layernames = ['in_t0', 'in_t1'] # dummy layernames
    return ({ layernames[i] : sequence for i, sequence in enumerate(sub[0]) }, sub[1])

## remap inputs of dataset
ds_final = ds_window.map(map_to_dict)
print(ds_final)

## print entries of dataset
print("iterate dataset:")
for data in ds_final:
    print("next data:")
    print(data)

<MapDataset shapes: ({in_t0: (None, 1, 8, 8), in_t1: (None, 1, 8, 8)}, (None, 6)), types: ({in_t0: tf.float64, in_t1: tf.float64}, tf.string)>
iterate dataset:
next data:
({'in_t0': <tf.Tensor: id=46, shape=(3, 1, 8, 8), dtype=float64, numpy=
array([[[[0., 0., 0., 0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0., 0., 0., 0.]]],


       [[[1., 1., 1., 1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1., 1., 1., 1.]]],


       [[[2., 2., 2., 2., 2., 2., 2., 2.],
         [2., 2., 

You can see that the dataset now returns tuples of the form ```({'in_t0' : input_sequence_0, 'in_t1' : input_sequence_1}, labels_sequence)```.
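
To see what ```map_to_dict()``` does independently of TensorFlow, it can be called on a plain Python tuple. ```tf.data.Dataset.map()``` passes the components of a tuple element as separate positional arguments, so ```sub``` receives ```((sequence_0, sequence_1), labels)```. The sketch below repeats the function from above and adds a variant with explicit argument names; ```map_to_dict_named``` and the placeholder strings are made up for this example.

```python
def map_to_dict(*sub):
    layernames = ['in_t0', 'in_t1']  # dummy layernames
    return ({layernames[i]: sequence for i, sequence in enumerate(sub[0])}, sub[1])

def map_to_dict_named(inputs, labels):
    # Same mapping, but with the two tuple components named explicitly
    layernames = ['in_t0', 'in_t1']
    return dict(zip(layernames, inputs)), labels

# Placeholder strings stand in for the tensors of one dataset element
element = (('sequence_0', 'sequence_1'), 'labels_sequence')

inputs_dict, labels_out = map_to_dict(*element)
print(inputs_dict)  # {'in_t0': 'sequence_0', 'in_t1': 'sequence_1'}
print(labels_out)   # labels_sequence
```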

## 3.2 Finalize Dataset
Now the dataset returns sequences as required, so we can finalize it for training by shuffling (```tf.data.Dataset.shuffle()```) and batching (```tf.data.Dataset.batch()```). We will use batches of size ```2``` and a shuffle-buffer size of ```100```. Because the buffer is larger than the number of sequences in the dataset (```4``` here), all sequences are shuffled uniformly.

In [7]:
batch_size = 2
ds_final_shuffled_batched = ds_final.shuffle(100).batch(batch_size)
print(ds_final_shuffled_batched)

<BatchDataset shapes: ({in_t0: (None, None, 1, 8, 8), in_t1: (None, None, 1, 8, 8)}, (None, None, 6)), types: ({in_t0: tf.float64, in_t1: tf.float64}, tf.string)>


You can see that each input and the labels are expanded by another ```None``` dimension, which represents the batch dimension.
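
Why the shuffle-buffer size in section 3.2 matters can be illustrated with a simplified model of a shuffle buffer in plain Python. This is only a sketch of the general idea (fill a buffer, repeatedly emit a random buffer entry and refill its slot), not TensorFlow's actual implementation; the function name ```buffered_shuffle``` is made up for this example.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Simplified model of a shuffle buffer (not TensorFlow's actual code)."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    source = iter(items)

    # Fill the buffer with the first buffer_size elements
    buffer = []
    for item in source:
        buffer.append(item)
        if len(buffer) == buffer_size:
            break

    # Emit a random buffer entry and refill its slot with the next element
    result = []
    for item in source:
        idx = rng.randrange(len(buffer))
        result.append(buffer[idx])
        buffer[idx] = item

    # The source is exhausted: emit the remaining entries in random order
    rng.shuffle(buffer)
    result.extend(buffer)
    return result

data = list(range(20))
print(buffered_shuffle(data, buffer_size=1))    # order is unchanged
print(buffered_shuffle(data, buffer_size=3))    # only locally shuffled
print(buffered_shuffle(data, buffer_size=100))  # shuffled over the whole list
```

With a buffer smaller than the dataset, an element can only move forward by a limited number of positions, whereas a buffer at least as large as the dataset shuffles over all elements.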

## 3.3 Investigate Elements of Dataset

We now want to investigate the data inside the dataset. So far we have only looked at the shapes of the entries, but we have not verified that the data is in the right order.

In [8]:
for data in ds_final_shuffled_batched:
    print("next data:")
    print(data)

next data:
({'in_t0': <tf.Tensor: id=72, shape=(2, 3, 1, 8, 8), dtype=float64, numpy=
array([[[[[2., 2., 2., 2., 2., 2., 2., 2.],
          [2., 2., 2., 2., 2., 2., 2., 2.],
          [2., 2., 2., 2., 2., 2., 2., 2.],
          [2., 2., 2., 2., 2., 2., 2., 2.],
          [2., 2., 2., 2., 2., 2., 2., 2.],
          [2., 2., 2., 2., 2., 2., 2., 2.],
          [2., 2., 2., 2., 2., 2., 2., 2.],
          [2., 2., 2., 2., 2., 2., 2., 2.]]],


        [[[3., 3., 3., 3., 3., 3., 3., 3.],
          [3., 3., 3., 3., 3., 3., 3., 3.],
          [3., 3., 3., 3., 3., 3., 3., 3.],
          [3., 3., 3., 3., 3., 3., 3., 3.],
          [3., 3., 3., 3., 3., 3., 3., 3.],
          [3., 3., 3., 3., 3., 3., 3., 3.],
          [3., 3., 3., 3., 3., 3., 3., 3.],
          [3., 3., 3., 3., 3., 3., 3., 3.]]],


        [[[4., 4., 4., 4., 4., 4., 4., 4.],
          [4., 4., 4., 4., 4., 4., 4., 4.],
          [4., 4., 4., 4., 4., 4., 4., 4.],
          [4., 4., 4., 4., 4., 4., 4., 4.],
          [4., 4., 4., 4.,

First, we can see that the shape of each input is ```(2,3,1,8,8)```, which means each batch holds ```2``` sequences of length ```3``` and each element of a sequence is a ```(1,8,8)``` image. Furthermore, the shape of the labels is ```(2,3,6)```, meaning ```2``` sequences of length ```3``` per batch, where each element of a sequence has ```6``` values (dummy 6-DoF poses). Also note that each element returned by the dataset is consistently ordered. For example, if a batch element for ```in_t0``` is ```[[2,...], [3,...], [4,...]]```, then for ```in_t1``` we should get ```[[3,...], [4,...], [5,...]]``` and the labels should be ```[[tx23, ..., y23], [tx34, ..., y34], [tx45, ..., y45]]```. (The explanation is kept this generic because the dataset is shuffled, so the order of the output changes every time the kernel is run.)
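
The ordering property described above can also be checked programmatically. The following NumPy-only sketch does not use the dataset itself: as an assumption for brevity, each image is represented by its fill value and each label row by the two-digit suffix of its entries (e.g. ```'23'``` for ```tx23```). The helper ```is_well_ordered``` is made up for this example.

```python
import numpy as np

n_images, window_size = 6, 3

# Stand-ins for the data above: each image is represented by its fill value
frame_t0 = np.arange(n_images)   # 0, 1, ..., 5
frame_t1 = frame_t0 + 1          # 1, 2, ..., 6
pair_ids = np.array(['{}{}'.format(a, b) for a, b in zip(frame_t0, frame_t1)])

# Sliding windows of length window_size with shift 1, dropping the remainder
windows = [(frame_t0[s:s + window_size],
            frame_t1[s:s + window_size],
            pair_ids[s:s + window_size])
           for s in range(n_images - window_size + 1)]

def is_well_ordered(seq_t0, seq_t1, seq_ids):
    """Check that a sequence is consecutive and consistently paired."""
    consecutive = np.all(np.diff(seq_t0) == 1)
    paired = np.array_equal(seq_t1, seq_t0 + 1)
    labelled = all(pid == '{}{}'.format(a, b)
                   for pid, a, b in zip(seq_ids, seq_t0, seq_t1))
    return bool(consecutive and paired and labelled)

# Visit the windows in random order, as a shuffled dataset would
rng = np.random.default_rng(0)
for idx in rng.permutation(len(windows)):
    seq_t0, seq_t1, seq_ids = windows[idx]
    print(seq_t0, seq_t1, seq_ids, is_well_ordered(seq_t0, seq_t1, seq_ids))
```

The same kind of check could be applied to the tensors returned by ```ds_final_shuffled_batched``` by comparing the fill values of ```in_t0``` and ```in_t1``` for each sequence in a batch.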

# 4. Further Reading
If you want to use such a sequenced dataset to train a model similar to DeepVO (https://www.cs.ox.ac.uk/files/9026/DeepVO.pdf), this Jupyter Notebook will be interesting: http://www.cs.virginia.edu/~vicente/recognition/2016/notebooks/kerasLSTM.html .
That notebook shows how to train an LSTM-based RNN on sequenced data in general. It trains on sequences of characters in order to generate English sentences. It does not use the ```tf.data.Dataset``` interface and works with simple numpy arrays instead. But the example shows well how RNNs need to be trained on sequences and can later be used to infer from single instances (which is essentially how it is done in DeepVO).