# How to transfer raw image data to TFRecords
----

Hello everyone! This tutorial, like the previous one, focuses on automating the data input pipeline.

Most of the time, our datasets are too big to fit in memory, so we have to prepare a pipeline that reads the data in batches from disk. I always convert my raw data (text, images, tabular) to TFRecords as it makes my life so much easier hehe :).

### Tutorial flowchart
----
![img](tutorials_graphics/images2tfrecords.png)

This tutorial will cover the following parts:
* *create a function that reads raw images and transfers them to TFRecords.*
* *create a function that parses the TFRecords to TF tensors.*

So without further ado, let's get started.

### Import useful libraries
----

In [2]:
import tensorflow as tf
import tensorflow.contrib.eager as tfe
import glob

In [3]:
# Enable eager mode. Once activated it cannot be reversed! Run just once.
tfe.enable_eager_execution()

### Transfer raw images to TFRecords
----

For this task, we will use a few images from the FER2013 dataset, which you can find in the **datasets/dummy_images** folder. The emotion label is encoded in the filename of each image.
For example, picture **id7_3.jpg** has the emotion label **3**, which corresponds to the state **'Happy'**, as you can see in the dictionary below.

In [4]:
# Get the meaning of each emotion index
emotion_cat = {0:'Angry', 1:'Disgust', 2:'Fear', 3:'Happy', 4:'Sad', 5:'Surprise', 6:'Neutral'}
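
As a minimal sketch of the naming convention described above, the label can be recovered from a filename with plain string operations. The helper name `label_from_filename` is hypothetical and not part of this tutorial's code; it only assumes filenames of the form `<id>_<label>.<extension>`.

```python
import os

# Same mapping as in the tutorial
emotion_cat = {0: 'Angry', 1: 'Disgust', 2: 'Fear', 3: 'Happy',
               4: 'Sad', 5: 'Surprise', 6: 'Neutral'}


def label_from_filename(filename):
    '''Hypothetical helper: extract the integer label from '<id>_<label>.<ext>'.'''
    # Drop the directory and the extension, e.g. 'id7_3'
    stem = os.path.splitext(os.path.basename(filename))[0]
    # The label is whatever follows the last underscore
    return int(stem.rsplit('_', 1)[-1])


label = label_from_filename('datasets/dummy_images/id7_3.jpg')
print(label, emotion_cat[label])  # 3 Happy
```

Working on the basename rather than the full path keeps underscores in folder names (such as `dummy_images`) from interfering with the split.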

In [5]:
def img2tfrecords(path_data='datasets/dummy_images/', image_format='jpeg'):
    ''' Function to transfer raw images, along with their 
        target labels, to TFRecords.
        Original source code for helper functions: https://goo.gl/jEhp2B
        
        Args:
            path_data: the location of the raw images
            image_format: the format of the raw images (e.g. 'png', 'jpeg')
    '''
    
    def _int64_feature(value):
        '''Helper function.'''
        return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
    
    def _bytes_feature(value):
        '''Helper function.'''
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
    
    # Get the filename of each image within the directory
    filenames = glob.glob(path_data + '*' + image_format)
    
    # Create a TFRecordWriter
    writer = tf.python_io.TFRecordWriter(path_data + 'dummy.tfrecords')
    
    # Iterate through each image and write it to the TFRecords file.
    for filename in filenames:
        # Read raw image
        img = tf.read_file(filename).numpy()
        # Parse its label from the filename
        label = int(filename.split('_')[-1].split('.')[0])
        # Create an example (image, label)
        example = tf.train.Example(features=tf.train.Features(feature={
            'label': _int64_feature(label),
            'image': _bytes_feature(img)}))
        # Write serialized example to TFRecords
        writer.write(example.SerializeToString())

    # Close the writer so that all records are flushed to disk
    writer.close()

In [6]:
# Transfer raw data to TFRecords
img2tfrecords()
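
One detail worth checking before running the conversion: the pattern built inside `img2tfrecords` is matched literally, so the default `image_format='jpeg'` only picks up files whose names end in `jpeg`. The quick sketch below uses the standard-library `fnmatch` module, which applies the same shell-style matching rules as `glob`, on a few made-up filenames. Only `id7_3.jpg` comes from this tutorial; the other names are hypothetical.

```python
import fnmatch

# Made-up filenames for illustration; only 'id7_3.jpg' appears in the tutorial
names = ['id7_3.jpg', 'id8_0.jpeg', 'id9_5.png']

# The same kind of pattern that img2tfrecords builds: '*' + image_format
print(fnmatch.filter(names, '*' + 'jpeg'))  # ['id8_0.jpeg']
print(fnmatch.filter(names, '*' + 'jpg'))   # ['id7_3.jpg']
```

So if your images use the `.jpg` extension, calling `img2tfrecords(image_format='jpg')` should pick them up.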

### Parse TFRecords to TF tensors
----

In [7]:
def parser(record):
    '''Function to parse a TFRecords example'''
    
    # Define here the features you would like to parse
    features = {'image': tf.FixedLenFeature((), tf.string),
                'label': tf.FixedLenFeature((), tf.int64)}
    
    # Parse example
    parsed = tf.parse_single_example(record, features)

    # Decode image 
    img = tf.image.decode_image(parsed['image'])
   
    return img, parsed['label']


If you want me to add anything to this tutorial, please let me know and I will be happy to further enhance it :).