Having efficient data pipelines is of paramount importance for any machine learning model. In this blog we will learn how to use TensorFlow's Dataset module tf.data to build efficient data pipelines.

Most introductory articles on TensorFlow introduce you to the feed_dict method of feeding data to the model. feed_dict processes the input data in a single thread: while the data is being loaded and processed on the CPU, the GPU sits idle, and while the GPU is training on a batch, the CPU sits idle. The TensorFlow developers advise against using this method during training or repeated validation on the same dataset.

tf.data improves performance by prefetching the next batch of data asynchronously, so that the GPU does not have to wait for data. You can also parallelize the loading and preprocessing of the dataset.
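
To make the idea of asynchronous prefetching concrete, here is a minimal pure-Python sketch. It is not how tf.data is implemented; the function name `prefetching_iterator` and its `buffer_size` parameter are made up for this illustration. A background thread keeps a small buffer filled while the consumer works on the current batch. In tf.data itself, the corresponding knobs are `Dataset.prefetch` and the `num_parallel_calls` argument of `Dataset.map`.

```python
import queue
import threading

def prefetching_iterator(batches, buffer_size=1):
    """Yield items from `batches`, loading the next ones in a background thread."""
    buffer = queue.Queue(maxsize=buffer_size)
    sentinel = object()  # marks the end of the input

    def producer():
        for batch in batches:
            buffer.put(batch)  # blocks while the buffer is full
        buffer.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        batch = buffer.get()  # blocks only if the producer is behind
        if batch is sentinel:
            return
        yield batch

# The "training step" here is just a sum over each batch.
consumed = [sum(batch) for batch in prefetching_iterator([[1, 2], [3, 4], [5, 6]])]
print(consumed)  # [3, 7, 11]
```

The consumer sees the batches in their original order; only the loading overlaps with the processing.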

In [0]:
import tensorflow as tf 
import numpy as np

#Creating Datasets

TensorFlow provides several methods to create datasets from NumPy arrays, tensors, text files, CSV files, etc. Let's have a look at a few of them.

1. **from_tensor_slices** Accepts one or more NumPy arrays or tensors and slices them along the first dimension. It emits only one element at a time when the iterator's get_next is called.

In [0]:
# source data - numpy array 
data = np.arange(0,10)

# create a dataset from numpy array
dataset = tf.data.Dataset.from_tensor_slices(data)

The object **dataset** is a tensorflow Dataset object. We need to create an iterator that will extract data from this dataset.

In [0]:
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()

The iterator is not aware of the number of elements in the dataset. Each time **next_element** is run, get_next returns the next element of the dataset until the dataset is exhausted. The code below prints the integers from 0 to 9. If we run **next_element** again after that, it raises a **tf.errors.OutOfRangeError**, because every slice has already been extracted and the dataset is now out of range or "empty".


In [4]:
with tf.Session() as sess:
  for i in range(10):
    val = sess.run(next_element)
    print(val)

0
1
2
3
4
5
6
7
8
9


Observe that get_next() returns only one element at a time. To fetch all the data from the dataset, we need to run it as many times as there are elements in the dataset.

You can also pass multiple NumPy arrays to the dataset. For example, when training a model we need pairs of features and labels.

In [5]:
# size= is required here: a tuple passed positionally is interpreted as the lower bound
features, labels = (np.random.uniform(size=(100,2)), np.random.uniform(size=(100,1)))
dataset = tf.data.Dataset.from_tensor_slices((features, labels))
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()
with tf.Session() as sess:
  val = sess.run(next_element)
  # each element is one (feature row, label row) pair
  print(val[0].shape, val[1].shape)

(2,) (1,)


2. **from_tensors** This method also accepts multiple NumPy arrays and tensors, but emits all the data at once as a single element.

In [0]:
data = np.arange(10,15)
dataset = tf.data.Dataset.from_tensors(data)

In [7]:
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()
with tf.Session() as sess:
  val = sess.run(next_element)
  print(val)

[10 11 12 13 14]


Observe that all the values in the dataset are emitted at once

3. **from_generator** This method takes a generator function as input and emits one element at a time.

In [8]:
def generator():
  for i in range(10):
    yield 2*i
    
dataset = tf.data.Dataset.from_generator(generator, (tf.int32))

iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()
with tf.Session() as sess:
  for i in range(10):
    val = sess.run(next_element)
    print(val)

0
2
4
6
8
10
12
14
16
18
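
To contrast the first two constructors, here is a small pure-Python analogy. The helper names `slices_like` and `tensors_like` are made up for this sketch and are not part of TensorFlow; they only mimic how many elements each method puts into the dataset.

```python
def slices_like(rows):
    """Mimic from_tensor_slices: one element per entry along the first axis."""
    for row in rows:
        yield row

def tensors_like(rows):
    """Mimic from_tensors: the whole input becomes a single element."""
    yield rows

data = [10, 11, 12, 13, 14]
sliced = list(slices_like(data))
whole = list(tensors_like(data))
print(len(sliced), sliced)  # 5 [10, 11, 12, 13, 14]
print(len(whole), whole)    # 1 [[10, 11, 12, 13, 14]]
```

With the same input, the first dataset has five elements and the second has exactly one.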


#Transforming Datasets

**Batches** - Combines consecutive elements of the dataset into a single batch. Useful when the full dataset does not fit in memory and you want to train the model on smaller batches of data to avoid out-of-memory errors.

In [9]:
data = np.arange(10,40)
dataset = tf.data.Dataset.from_tensor_slices(data).batch(10)
iterator = dataset.make_one_shot_iterator()
next_ele = iterator.get_next()

with tf.Session() as sess:
  try:
    while True:
      val = sess.run(next_ele)
      print(val)
  except tf.errors.OutOfRangeError:
    pass

[10 11 12 13 14 15 16 17 18 19]
[20 21 22 23 24 25 26 27 28 29]
[30 31 32 33 34 35 36 37 38 39]
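
In the example above, 30 elements divide evenly into batches of 10. When they don't, the last batch is simply smaller, unless you pass `drop_remainder=True` to `batch`. A rough pure-Python sketch of that behaviour follows; the `batch` function here is a stand-in written for this post, not TensorFlow's implementation.

```python
def batch(elements, batch_size, drop_remainder=False):
    """Group consecutive elements into lists of length `batch_size`."""
    current = []
    for element in elements:
        current.append(element)
        if len(current) == batch_size:
            yield current
            current = []
    if current and not drop_remainder:
        yield current  # final, smaller batch

sizes_kept = [len(b) for b in batch(range(25), 10)]
sizes_dropped = [len(b) for b in batch(range(25), 10, drop_remainder=True)]
print(sizes_kept)     # [10, 10, 5]
print(sizes_dropped)  # [10, 10]
```

Keep the smaller final batch in mind whenever you compute per-example statistics from per-batch values.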


**Zip** - Creates a dataset by zipping together datasets. Useful in scenarios where you have features and labels and you need to provide the pair of feature and label for training the model.

In [10]:
datax = np.arange(10,20)
datay = np.arange(11,21)

datasetx = tf.data.Dataset.from_tensor_slices(datax)
datasety = tf.data.Dataset.from_tensor_slices(datay)

dcombined = tf.data.Dataset.zip((datasetx, datasety)).batch(2)
iterator = dcombined.make_one_shot_iterator()
next_ele = iterator.get_next()

with tf.Session() as sess:
  try:
    while True:
      val = sess.run(next_ele)
      print(val)
  except tf.errors.OutOfRangeError:
    pass

(array([10, 11]), array([11, 12]))
(array([12, 13]), array([13, 14]))
(array([14, 15]), array([15, 16]))
(array([16, 17]), array([17, 18]))
(array([18, 19]), array([19, 20]))


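**Repeat** - Repeats the dataset the given number of times. Useful when you want to iterate over the same data for several epochs.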

In [11]:
dataset = tf.data.Dataset.from_tensor_slices(tf.range(10))
dataset = dataset.repeat(2)
iterator = dataset.make_one_shot_iterator()
next_ele = iterator.get_next()

with tf.Session() as sess:
  try:
    while True:
      val = sess.run(next_ele)
      print(val)
  except tf.errors.OutOfRangeError:
    pass

0
1
2
3
4
5
6
7
8
9
0
1
2
3
4
5
6
7
8
9
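
The order in which you chain repeat and batch matters. The following pure-Python sketch uses the made-up helpers `repeat` and `batch` as stand-ins for the Dataset methods, and shows the difference on a toy "epoch" of five elements.

```python
from itertools import chain, islice

def repeat(make_elements, count):
    """Yield the elements produced by `make_elements()` `count` times in a row."""
    return chain.from_iterable(make_elements() for _ in range(count))

def batch(elements, batch_size):
    """Group consecutive elements into lists of at most `batch_size` items."""
    iterator = iter(elements)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk

def epoch():
    return range(5)

# repeat, then batch: the second batch mixes the end of epoch 1 with the start of epoch 2
repeat_then_batch = list(batch(repeat(epoch, 2), 3))
# batch, then repeat: every epoch ends with its own short batch
batch_then_repeat = list(repeat(lambda: batch(epoch(), 3), 2))
print(repeat_then_batch)  # [[0, 1, 2], [3, 4, 0], [1, 2, 3], [4]]
print(batch_then_repeat)  # [[0, 1, 2], [3, 4], [0, 1, 2], [3, 4]]
```

If you need clean epoch boundaries, batch first and repeat afterwards.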


**Map** - Used to transform the elements of the dataset. Useful in cases where you want to transform your raw data before feeding into the model.

In [12]:
def map_fnc(x):
  return x*2

data = np.arange(10)
dataset = tf.data.Dataset.from_tensor_slices(data)
dataset = dataset.map(map_fnc)
iterator = dataset.make_one_shot_iterator()
next_ele = iterator.get_next()

with tf.Session() as sess:
  try:
    while True:
      val = sess.run(next_ele)
      print(val)
  except tf.errors.OutOfRangeError:
    pass

0
2
4
6
8
10
12
14
16
18


#Creating Iterators

**One-shot iterator:** This is the most basic form of iterator. It requires no explicit initialization and iterates over the data only once; once it is exhausted, it cannot be re-initialized.

In [13]:
data = np.arange(10,15)
#create the dataset
dataset = tf.data.Dataset.from_tensor_slices(data)
#create the iterator
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()
with tf.Session() as sess:
  val = sess.run(next_element)
  print(val)

10


**Initializable iterator:** This iterator requires you to initialize it explicitly by running iterator.initializer. You can define a tf.placeholder and pass data to it dynamically each time you run the initializer operation.

In [14]:
# define two placeholders to accept min and max value
min_val = tf.placeholder(tf.int32, shape=[])
max_val = tf.placeholder(tf.int32, shape=[])
data = tf.range(min_val, max_val)
dataset = tf.data.Dataset.from_tensor_slices(data)
iterator = dataset.make_initializable_iterator()
next_ele = iterator.get_next()
with tf.Session() as sess:
  
  print("------Dataset1-----")
  # initialize an iterator with range of values from 10 to 15
  sess.run(iterator.initializer, feed_dict={min_val:10, max_val:15})
  try:
    while True:
      val = sess.run(next_ele)
      print(val)
  except tf.errors.OutOfRangeError:
    pass
     
  print("\n")  
  print("------Dataset2-----")
  # initialize an iterator with range of values from 1 to 10
  sess.run(iterator.initializer, feed_dict={min_val:1, max_val:10})
  try:
    while True:
      val = sess.run(next_ele)
      print(val)
  except tf.errors.OutOfRangeError:
    pass

------Dataset1-----
10
11
12
13
14


------Dataset2-----
1
2
3
4
5
6
7
8
9


**Reinitializable iterator:** This iterator can be initialized from different Dataset objects that have the same structure. Each dataset can pass through its own transformation pipeline.

In [15]:
def map_fnc(ele):
  return ele*2
min_val = tf.placeholder(tf.int32, shape=[])
max_val = tf.placeholder(tf.int32, shape=[])
data = tf.range(min_val, max_val)
#Define separate datasets for training and validation
train_dataset =  tf.data.Dataset.from_tensor_slices(data)
val_dataset = tf.data.Dataset.from_tensor_slices(data).map(map_fnc)
#create an iterator 
iterator = tf.data.Iterator.from_structure(train_dataset.output_types, train_dataset.output_shapes)
train_initializer = iterator.make_initializer(train_dataset)
val_initializer = iterator.make_initializer(val_dataset)
next_ele = iterator.get_next()
with tf.Session() as sess:
  
  # initialize an iterator with range of values from 10 to 15
  print("-----Training-----")
  sess.run(train_initializer, feed_dict={min_val:10, max_val:15})
  try:
    while True:
      val = sess.run(next_ele)
      print(val)
  except tf.errors.OutOfRangeError:
    pass
  
  print("\n")
  print("-------Validation------")
  # initialize an iterator with range of values from 1 to 10
  sess.run(val_initializer, feed_dict={min_val:1, max_val:10})
  try:
    while True:
      val = sess.run(next_ele)
      print(val)
  except tf.errors.OutOfRangeError:
    pass

-----Training-----
10
11
12
13
14


-------Validation------
2
4
6
8
10
12
14
16
18


**Feedable iterator:** Can be used to switch between Iterators for different datasets. Useful when you have different datasets and you want to have more control over which iterator to use over the dataset.

In [16]:
def map_fnc(ele):
  return ele*2
min_val = tf.placeholder(tf.int32, shape=[])
max_val = tf.placeholder(tf.int32, shape=[])
data = tf.range(min_val, max_val)
train_dataset = tf.data.Dataset.from_tensor_slices(data)
val_dataset = tf.data.Dataset.from_tensor_slices(data).map(map_fnc)
train_val_iterator = tf.data.Iterator.from_structure(train_dataset.output_types, train_dataset.output_shapes)
train_initializer = train_val_iterator.make_initializer(train_dataset)
val_initializer = train_val_iterator.make_initializer(val_dataset)
test_dataset = tf.data.Dataset.from_tensor_slices(tf.range(10,15))
test_iterator = test_dataset.make_one_shot_iterator()
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(handle, train_dataset.output_types, train_dataset.output_shapes)
next_ele = iterator.get_next()
with tf.Session() as sess:
  
  train_val_handle = sess.run(train_val_iterator.string_handle())
  test_handle = sess.run(test_iterator.string_handle())
  
  print("-----Training------")
  # training
  sess.run(train_initializer, feed_dict={min_val:10, max_val:15})
  try:
    while True:
      val = sess.run(next_ele, feed_dict={handle:train_val_handle})
      print(val)
  except tf.errors.OutOfRangeError:
    pass
  
  print("\n")
  print("------Validation-----")
  #validation
  sess.run(val_initializer, feed_dict={min_val:1, max_val:10})
  try:
    while True:
      val = sess.run(next_ele, feed_dict={handle:train_val_handle})
      print(val)
  except tf.errors.OutOfRangeError:
    pass
  
  print("\n")
  print("------Testing-------")
  #testing
  try:
    while True:
      val = sess.run(next_ele, feed_dict={handle:test_handle})
      print(val)
  except tf.errors.OutOfRangeError:
    pass

-----Training------
10
11
12
13
14


------Validation-----
2
4
6
8
10
12
14
16
18


------Testing-------
10
11
12
13
14
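
One property of the feedable iterator is worth spelling out: switching between handles does not reset the underlying iterators, so each one continues from where it left off. Here is a minimal pure-Python sketch of that idea. The dict keys stand in for string handles and the plain Python iterators stand in for tf.data iterators; none of these names are TensorFlow objects.

```python
# Hypothetical stand-ins: each "handle" names an independent Python iterator.
iterators = {
    "train_val": iter(range(10, 15)),
    "test": iter(range(100, 105)),
}

def get_next(handle):
    """Fetch the next element from whichever iterator `handle` selects."""
    return next(iterators[handle])

# Alternate between the two handles on every call.
sequence = [get_next("train_val"), get_next("test"),
            get_next("train_val"), get_next("test")]
print(sequence)  # [10, 100, 11, 101]
```

Each iterator keeps its own position, which is what lets you interleave, say, a few validation batches into training without restarting either dataset.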


#Building the LeNet5 Model

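Let's import the MNIST data from the TensorFlow library. The MNIST database contains 60,000 training images and 10,000 testing images. read_data_sets holds out 5,000 of the training images as a validation set, which leaves 55,000 for training.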

In [17]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", reshape=False, one_hot = True)

X_train, y_train = mnist.train.images, mnist.train.labels
X_val, y_val = mnist.validation.images, mnist.validation.labels
X_test, y_test = mnist.test.images, mnist.test.labels


Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


In [0]:
# LeNet-5 expects 32x32 inputs, so pad each 28x28 image with 2 zero pixels on every side
X_train = np.pad(X_train, ((0,0), (2,2), (2,2), (0,0)), 'constant')
X_val = np.pad(X_val, ((0,0), (2,2), (2,2), (0,0)), 'constant')
X_test = np.pad(X_test, ((0,0), (2,2), (2,2), (0,0)), 'constant')

In [19]:
print(X_train.shape)
print(y_train.shape)
print(X_val.shape)
print(y_val.shape)
print(X_test.shape)
print(y_test.shape)

(55000, 32, 32, 1)
(55000, 10)
(5000, 32, 32, 1)
(5000, 10)
(10000, 32, 32, 1)
(10000, 10)
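
If the nested tuples in the np.pad call look cryptic, here is a small self-contained check on dummy data. The array `images` is made up for this sketch and simply stands in for a handful of MNIST images. Each inner tuple is the (before, after) padding for one axis, so only the height and width axes grow.

```python
import numpy as np

# Three dummy 28x28 single-channel images filled with ones, standing in for MNIST.
images = np.ones((3, 28, 28, 1), dtype=np.float32)

# Pad width per axis as (before, after): batch 0, height 2+2, width 2+2, channels 0.
padded = np.pad(images, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')

print(padded.shape)                            # (3, 32, 32, 1)
print(float(padded[0, :2, :, 0].sum()))        # 0.0   -> the new border rows are zeros
print(float(padded[0, 2:30, 2:30, 0].sum()))   # 784.0 -> the original 28x28 pixels are intact
```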


In [0]:
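def forward_pass(X):
    # filters for conv layer 1: 5x5 kernels, 1 input channel, 6 output channels
    W1 = tf.get_variable("W1", [5,5,1,6], initializer = tf.contrib.layers.xavier_initializer(seed=0))
    # filters for conv layer 2: 5x5 kernels, 6 input channels, 16 output channels
    W2 = tf.get_variable("W2", [5,5,6,16], initializer = tf.contrib.layers.xavier_initializer(seed=0))
    # conv -> ReLU -> 2x2 max pool: 32x32x1 -> 28x28x6 -> 14x14x6
    Z1 = tf.nn.conv2d(X, W1, strides = [1,1,1,1], padding='VALID')
    A1 = tf.nn.relu(Z1)
    P1 = tf.nn.max_pool(A1, ksize = [1,2,2,1], strides = [1,2,2,1], padding='VALID')
    # conv -> ReLU -> 2x2 max pool: 14x14x6 -> 10x10x16 -> 5x5x16
    Z2 = tf.nn.conv2d(P1, W2, strides = [1,1,1,1], padding='VALID')
    A2 = tf.nn.relu(Z2)
    P2 = tf.nn.max_pool(A2, ksize = [1,2,2,1], strides = [1,2,2,1], padding='VALID')
    # flatten to a vector of 5*5*16 = 400 values per image
    P2 = tf.contrib.layers.flatten(P2)

    # fully connected layers: 120 and 84 units (ReLU by default), then 10 logits
    Z3 = tf.contrib.layers.fully_connected(P2, 120)
    Z4 = tf.contrib.layers.fully_connected(Z3, 84)
    Z5 = tf.contrib.layers.fully_connected(Z4, 10, activation_fn = None)
    return Z5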
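
To see where the layer sizes in forward_pass come from, you can work out the shape arithmetic by hand. With 'VALID' padding, a convolution or pooling layer produces floor((input - kernel) / stride) + 1 outputs along each spatial axis. The helper `valid_output_size` below is a name made up for this sketch.

```python
def valid_output_size(input_size, kernel_size, stride):
    """Spatial output size of a convolution or pooling layer with 'VALID' padding."""
    return (input_size - kernel_size) // stride + 1

size = 32                              # padded MNIST image: 32x32x1
size = valid_output_size(size, 5, 1)   # conv 5x5, 6 filters  -> 28x28x6
size = valid_output_size(size, 2, 2)   # max pool 2x2         -> 14x14x6
size = valid_output_size(size, 5, 1)   # conv 5x5, 16 filters -> 10x10x16
size = valid_output_size(size, 2, 2)   # max pool 2x2         -> 5x5x16
flattened = size * size * 16           # length of the vector fed to the dense layers
print(size, flattened)  # 5 400
```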
  

In [0]:
def model(X,Y):
    
    logits = forward_pass(X)
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=Y))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.0009)
    learner = optimizer.minimize(cost)
    correct_predictions = tf.equal(tf.argmax(logits,1), tf.argmax(Y,1))
    accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
    
    return (learner, accuracy)



We have now created the model. Before deciding which iterator to use for our model, let's look at the typical requirements of a machine learning model.


*   **Training the data over batches:** A dataset can be very large. To prevent out-of-memory errors, we need to train on the dataset in small batches.
*   **Train the model over n passes of the dataset:** Typically you want to train your model over multiple passes (epochs) of the dataset.
*   **Validate the model at each epoch:** You would need to validate your model at each epoch to check your model's performance.
*   **Finally test your model on unseen data:** After the model is trained, you would like to test your model on unseen data.



Let's see the pros and cons of each iterator.

##One Shot Iterator

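A one-shot iterator can't be reinitialized once the Dataset is exhausted. To train for more epochs, you need to repeat the Dataset before creating the iterator. Because the NumPy arrays are passed directly to from_tensor_slices here, they are embedded in the graph as constants, which requires a lot of memory if the data is large. A one-shot iterator also doesn't provide any option to validate the model.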

In [22]:
epochs = 10 
batch_size = 64 
iterations = len(y_train) * epochs  # total number of training examples seen over all epochs

tf.reset_default_graph()
dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))

# repeat the dataset once per epoch, because a one-shot iterator
# cannot be re-initialized to start another pass over the data
dataset = dataset.repeat(epochs).batch(batch_size)
iterator = dataset.make_one_shot_iterator()

X_batch , Y_batch = iterator.get_next()


(learner, accuracy) = model(X_batch, Y_batch)

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  
  total_accuracy = 0
  try:
    while True:
      temp_accuracy, _ = sess.run([accuracy, learner])
      total_accuracy += temp_accuracy
      
  except tf.errors.OutOfRangeError:
    pass
  
  
print('Avg training accuracy is {}'.format((total_accuracy * batch_size) / iterations ))



Avg training accuracy is 0.9822563636363636
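
One caveat about averages like this: multiplying every batch accuracy by batch_size treats the final batch as full, even when it holds fewer examples, so the result is slightly off. A more careful version weights each batch by its real size. Here is a small sketch with made-up numbers; the function `average_accuracy` is hypothetical, and in the TensorFlow code you could fetch the real batch size with something like tf.shape(X_batch)[0].

```python
def average_accuracy(batch_accuracies, batch_sizes):
    """Average per-batch accuracies, weighting each by its real batch size."""
    correct = sum(acc * n for acc, n in zip(batch_accuracies, batch_sizes))
    return correct / sum(batch_sizes)

# Made-up numbers: two full batches of 64 and a final batch of only 8 examples.
accuracies = [0.5, 0.5, 1.0]
sizes = [64, 64, 8]

weighted = average_accuracy(accuracies, sizes)
# What you get by assuming every batch holds 64 examples.
assumed_full = sum(acc * 64 for acc in accuracies) / sum(sizes)
print(round(weighted, 3))      # 0.529
print(round(assumed_full, 3))  # 0.941
```

The toy numbers exaggerate the effect on purpose. With thousands of batches per epoch the difference is small, but it is more visible on a small validation set.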


##Initializable iterator

You can dynamically switch between the training and validation data when initializing the iterator. However, both have to go through the same transformation pipeline.

In [23]:
epochs = 10 
batch_size = 64 

tf.reset_default_graph()

X_data = tf.placeholder(tf.float32, [None, 32,32,1])
Y_data = tf.placeholder(tf.float32, [None, 10])


dataset = tf.data.Dataset.from_tensor_slices((X_data, Y_data))
dataset = dataset.batch(batch_size)
iterator = dataset.make_initializable_iterator()

X_batch , Y_batch = iterator.get_next()


(learner, accuracy) = model(X_batch, Y_batch)

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  for epoch in range(epochs):
    
    # train the model
    sess.run(iterator.initializer, feed_dict={X_data:X_train, Y_data:y_train})
    total_train_accuracy = 0
    no_train_examples = len(y_train)
    try:
      while True:
        temp_train_accuracy, _ = sess.run([accuracy, learner])
        total_train_accuracy += temp_train_accuracy*batch_size
    except tf.errors.OutOfRangeError:
      pass
    
    # validate the model
    sess.run(iterator.initializer, feed_dict={X_data:X_val, Y_data:y_val})
    total_val_accuracy = 0
    no_val_examples = len(y_val)
    try:
      while True:
        temp_val_accuracy = sess.run(accuracy)
        total_val_accuracy += temp_val_accuracy*batch_size
    except tf.errors.OutOfRangeError:
      pass
    
    print('Epoch {}'.format(str(epoch+1)))
    print("---------------------------")
    print('Training accuracy is {}'.format(total_train_accuracy/no_train_examples))
    print('Validation accuracy is {}'.format(total_val_accuracy/no_val_examples))

Epoch 1
---------------------------
Training accuracy is 0.9227272727272727
Validation accuracy is 0.9748
Epoch 2
---------------------------
Training accuracy is 0.9742181818181819
Validation accuracy is 0.9914
Epoch 3
---------------------------
Training accuracy is 0.9823454545454545
Validation accuracy is 0.9954
Epoch 4
---------------------------
Training accuracy is 0.9876
Validation accuracy is 0.997
Epoch 5
---------------------------
Training accuracy is 0.9903454545454545
Validation accuracy is 0.999
Epoch 6
---------------------------
Training accuracy is 0.9924727272727273
Validation accuracy is 0.998
Epoch 7
---------------------------
Training accuracy is 0.9940727272727272
Validation accuracy is 0.999
Epoch 8
---------------------------
Training accuracy is 0.9940909090909091
Validation accuracy is 0.997
Epoch 9
---------------------------
Training accuracy is 0.9954
Validation accuracy is 0.9988
Epoch 10
---------------------------
Training accuracy is 0.9958
Validation

##Reinitializable Iterator

This iterator overcomes the limitation of the initializable iterator by using two separate Datasets. Each dataset can go through its own preprocessing pipeline.

In [24]:
epochs = 10 
batch_size = 64 

tf.reset_default_graph()

X_data = tf.placeholder(tf.float32, [None, 32,32,1])
Y_data = tf.placeholder(tf.float32, [None, 10])


train_dataset = tf.data.Dataset.from_tensor_slices((X_data, Y_data)).batch(batch_size)
val_dataset =  tf.data.Dataset.from_tensor_slices((X_data, Y_data)).batch(batch_size)


iterator = tf.data.Iterator.from_structure(train_dataset.output_types, train_dataset.output_shapes)
X_batch , Y_batch = iterator.get_next()
(learner, accuracy) = model(X_batch, Y_batch)

train_initializer = iterator.make_initializer(train_dataset)
val_initializer =  iterator.make_initializer(val_dataset)

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  for epoch in range(epochs):
    
    # train the model
    sess.run(train_initializer, feed_dict={X_data:X_train, Y_data:y_train})
    total_train_accuracy = 0
    no_train_examples = len(y_train)
    try:
      while True:
        temp_train_accuracy, _ = sess.run([accuracy, learner])
        total_train_accuracy += temp_train_accuracy*batch_size
    except tf.errors.OutOfRangeError:
      pass
    
    # validate the model
    sess.run(val_initializer, feed_dict={X_data:X_val, Y_data:y_val})
    total_val_accuracy = 0
    no_val_examples = len(y_val)
    try:
      while True:
        temp_val_accuracy = sess.run(accuracy)
        total_val_accuracy += temp_val_accuracy*batch_size
    except tf.errors.OutOfRangeError:
      pass
    
    print('Epoch {}'.format(str(epoch+1)))
    print("---------------------------")
    print('Training accuracy is {}'.format(total_train_accuracy/no_train_examples))
    print('Validation accuracy is {}'.format(total_val_accuracy/no_val_examples))

Epoch 1
---------------------------
Training accuracy is 0.9220909090909091
Validation accuracy is 0.9682
Epoch 2
---------------------------
Training accuracy is 0.9752545454545455
Validation accuracy is 0.9916
Epoch 3
---------------------------
Training accuracy is 0.9830181818181818
Validation accuracy is 0.9928
Epoch 4
---------------------------
Training accuracy is 0.9871818181818182
Validation accuracy is 0.9964
Epoch 5
---------------------------
Training accuracy is 0.9907454545454546
Validation accuracy is 0.9968
Epoch 6
---------------------------
Training accuracy is 0.9925818181818182
Validation accuracy is 0.9972
Epoch 7
---------------------------
Training accuracy is 0.9937272727272727
Validation accuracy is 0.9966
Epoch 8
---------------------------
Training accuracy is 0.9942909090909091
Validation accuracy is 0.9958
Epoch 9
---------------------------
Training accuracy is 0.9953636363636363
Validation accuracy is 0.9988
Epoch 10
---------------------------
Training 

##Feedable Iterator

This iterator lets you switch between iterators. In my opinion, this is the best option. You can create a reinitializable iterator for training and validation, and for inference/testing, where you need only one pass over the dataset, you can use a one-shot iterator.

In [25]:
epochs = 10
batch_size = 64 

tf.reset_default_graph()

X_data = tf.placeholder(tf.float32, [None, 32,32,1])
Y_data = tf.placeholder(tf.float32, [None, 10])



train_dataset = tf.data.Dataset.from_tensor_slices((X_data, Y_data)).batch(batch_size)
val_dataset =  tf.data.Dataset.from_tensor_slices((X_data, Y_data)).batch(batch_size)

test_dataset =  tf.data.Dataset.from_tensor_slices((X_test, y_test.astype(np.float32))).batch(batch_size)


handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(handle, train_dataset.output_types, train_dataset.output_shapes)
X_batch , Y_batch = iterator.get_next()
(learner, accuracy) = model(X_batch, Y_batch)

train_val_iterator = tf.data.Iterator.from_structure(train_dataset.output_types, train_dataset.output_shapes)
train_iterator = train_val_iterator.make_initializer(train_dataset)
val_iterator = train_val_iterator.make_initializer(val_dataset)

test_iterator = test_dataset.make_one_shot_iterator()

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  train_val_string_handle = sess.run(train_val_iterator.string_handle())
  test_string_handle = sess.run(test_iterator.string_handle())
  for epoch in range(epochs):
    
    # train the model
    sess.run(train_iterator, feed_dict={X_data:X_train, Y_data:y_train})
    total_train_accuracy = 0
    no_train_examples = len(y_train)
    try:
      while True:
        temp_train_accuracy, _ = sess.run([accuracy, learner], feed_dict={handle:train_val_string_handle})
        total_train_accuracy += temp_train_accuracy*batch_size
    except tf.errors.OutOfRangeError:
      pass
    
    # validate the model
    sess.run(val_iterator, feed_dict={X_data:X_val, Y_data:y_val})
    total_val_accuracy = 0
    no_val_examples = len(y_val)
    try:
      while True:
        temp_val_accuracy = sess.run(accuracy, feed_dict={handle:train_val_string_handle})
        total_val_accuracy += temp_val_accuracy*batch_size
    except tf.errors.OutOfRangeError:
      pass
    
    print('Epoch {}'.format(str(epoch+1)))
    print("---------------------------")
    print('Training accuracy is {}'.format(total_train_accuracy/no_train_examples))
    print('Validation accuracy is {}'.format(total_val_accuracy/no_val_examples))
  
  
  print("Testing the model --------")
 
  total_test_accuracy = 0
  no_test_examples = len(y_test)
  try:
    while True:
        temp_test_accuracy = sess.run(accuracy, feed_dict={handle:test_string_handle})
        total_test_accuracy += temp_test_accuracy*batch_size
  except tf.errors.OutOfRangeError:
    pass
    
  print('Testing accuracy is {}'.format(total_test_accuracy/no_test_examples)) 
  

Epoch 1
---------------------------
Training accuracy is 0.9239272727272727
Validation accuracy is 0.9648
Epoch 2
---------------------------
Training accuracy is 0.9762181818181818
Validation accuracy is 0.9874
Epoch 3
---------------------------
Training accuracy is 0.9831090909090909
Validation accuracy is 0.9934
Epoch 4
---------------------------
Training accuracy is 0.9873272727272727
Validation accuracy is 0.9932
Epoch 5
---------------------------
Training accuracy is 0.9900909090909091
Validation accuracy is 0.9946
Epoch 6
---------------------------
Training accuracy is 0.9920727272727272
Validation accuracy is 0.9966
Epoch 7
---------------------------
Training accuracy is 0.9931636363636364
Validation accuracy is 0.996
Epoch 8
---------------------------
Training accuracy is 0.9941272727272727
Validation accuracy is 0.9988
Epoch 9
---------------------------
Training accuracy is 0.9947454545454546
Validation accuracy is 0.9966
Epoch 10
---------------------------
Training a

**References**

* https://www.tensorflow.org/api_docs/python/tf/data/Iterator#from_string_handle
* https://www.tensorflow.org/guide/datasets
* https://docs.google.com/presentation/d/16kHNtQslt-yuJ3w8GIx-eEH6t_AvFeQOchqGRFpAD7U/edit#slide=id.g254d08e080_0_141
* https://github.com/tensorflow/tensorflow/issues/2919