# Gap Data Engineering Framework - CV Code Along with TensorFlow

## Sign-Language Dataset

We will do the following in this tutorial:

    1. Install the VISION module of the Gap Framework.
    2. Install TensorFlow for the machine learning model.
    3. Use the VISION module to preprocess the image data and make mini-batches for training.
    4. Train the model.

### Install Vision Module

We start by importing the **Gap** <b style='color:saddlebrown'>Vision</b> module

In [None]:
# import the Gap Vision module
from vision import Images

### Install TensorFlow

Next we import the TensorFlow package

In [None]:
# Import TensorFlow and print its version
import tensorflow as tf
print( tf.__version__ )

### Data Source

Github Public Repository - https://github.com/EvilPort2/Sign-Language

<i>"The first thing I did was, I created 10 gesture samples using OpenCV. For each gesture I captured 1200 images which were 50x50 pixels. All theses images were in grayscale which is stored in the gestures/ folder. The gestures/0/ folder contains 1200 blank images which signify "none" gesture. Also I realised that keeping this category increased my model's accuracy to 99% from a laughable 82%."</i>

Let's go to the downloaded images and see what's there. Let's first check that we are in the correct directory.

In [None]:
pwd

Under the gestures directory, we have folders 0 through 26. Folder 0 holds the blank "none" images, while folders 1 through 26 hold the images for the letters A through Z, respectively.
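The folder-to-letter mapping described above can be sketched as a small helper. The name `label_to_letter` is hypothetical (it is not part of Gap or of the dataset repository); it only assumes the layout stated here, where 0 means "none" and 1 through 26 mean A through Z.

```python
import string

def label_to_letter(label):
    """Map a gestures/ subfolder label (0..26) to the symbol it represents."""
    if label == 0:
        return "none"                              # folder 0 holds the blank images
    if 1 <= label <= 26:
        return string.ascii_uppercase[label - 1]   # 1 -> A, 2 -> B, ..., 26 -> Z
    raise ValueError("label must be in the range 0..26")

print([label_to_letter(n) for n in (0, 1, 2, 26)])  # ['none', 'A', 'B', 'Z']
```

A helper like this is handy later for turning a predicted class index back into a readable letter.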

Let's look under one of the folders (e.g., folder 1 for the letter A):

<img src='gestures.jpg'/>

### Create Images Collections

We will use the <b style='color:saddlebrown'>Images</b> class to create a collection for each set (i.e., letter in alphabet) of training images.

To load and process the images, we do:
    1. Each subfolder under gestures corresponds to a letter, where the subfolder name is the label.
    2. Get a list of all files (images) under each subfolder.
    3. Create an Images collection per subfolder, where the files in the subfolder are the images, and the subfolder name is the label.

For the config(uration) options:
    - Process as grayscale 
    - Flatten images as 1D vectors
    - To reduce storage space, we are not going to store the raw pixel data.


In [None]:
import os

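# There are 27 sets of images under gestures, each in its own subfolder.
# Keep only directories, so a stray file (e.g., .DS_Store) does not break int(label) below.
subfolders = [ d for d in os.listdir("gestures") if os.path.isdir(os.path.join("gestures", d)) ]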

# Make list of all the sign language subfolder collections
imagedirs = [ "gestures/" + label for label in subfolders ]
# Make list of all corresponding labels per subfolder
labels = [ int(label) for label in subfolders ]

# Create the Images collection for all the subfolders
images = Images( imagedirs, labels, name='signlang', config=['grayscale', 'flatten'])

print("Total Time:", images.time)

### Splitting into Training and Test

Let's now split the collection of preprocessed images into Training and Test data. We will use the property *split* to split the collection.

In [None]:

# Hold out 20% of the images for testing, so 80% are used for training (note that the list is randomized)
images.split = 0.2, 42

# When used as a getter, the split property returns the training / test data and labels in the same order as
# the scikit-learn function train_test_split()
X_train, X_test, Y_train, Y_test = images.split
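To build intuition for what a randomized, repeatable split does, here is a minimal sketch in plain Python. It is not Gap's or scikit-learn's implementation; the function `seeded_split` is hypothetical, and it assumes the second value passed to the split (42 above) acts as a random seed.

```python
import random

def seeded_split(items, test_fraction, seed):
    """Shuffle a copy of items with a fixed seed, then cut off the test portion."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)        # same seed -> same order on every run
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

train, test = seeded_split(range(100), test_fraction=0.2, seed=42)
print(len(train), len(test))  # 80 20
```

Fixing the seed makes the split reproducible, so accuracy numbers can be compared across runs.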

## Build a Graph

In [None]:
# Let's first reset our graph, so our neural network components are all declared within the same graph
from tensorflow.python.framework import ops
ops.reset_default_graph() 

### Placeholders

In [None]:
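X = tf.placeholder(tf.float32, shape=[2500, None])  # flattened 50x50 images, one image per column
Y = tf.placeholder(tf.float32, shape=[27, None])    # labels for the 27 classes, one image per column
D = tf.placeholder(tf.float32, [])                  # scalar: keep probability for the dropout layer
L = tf.placeholder(tf.float32, [])                  # scalar: learning rate for the optimizer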
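The shape of Y, [27, None], suggests that each label is fed as a one-hot column of 27 values. Assuming that is the case, the sketch below shows how integer labels would turn into that layout. The helpers `one_hot` and `transpose` are hypothetical and written in plain Python for illustration only.

```python
def one_hot(label, num_classes=27):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector

def transpose(rows):
    """Turn a (batch, classes) list of rows into a (classes, batch) list of rows."""
    return [list(column) for column in zip(*rows)]

batch = [one_hot(label) for label in (0, 3, 26)]   # shape (3, 27): one row per image
fed_to_Y = transpose(batch)                        # shape (27, 3): one column per image
print(len(fed_to_Y), len(fed_to_Y[0]))             # 27 3
```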

### Input Layer

Let's now design our input layer. We need two things: weights and biases.

Each input feature (pixel) needs a weight, which our model will learn during training. The weight is multiplied by the value of the input (pixel), which we symbolically represent as Wx.

Each output from the layer needs a bias, which our model will also learn during training. The bias is added to the result of the weight multiplied by the pixel value (Wx).

Let's create two TensorFlow variables for our weights and biases. The weights (which we call W1) will need to be a 2D matrix. Because each image is fed as a column of X, the rows are the number of outputs to the hidden layer, which will be 64, and the columns are the number of inputs, which is 2500.

The bias will be a column vector of size 64 (one for each output).

We need to initialize our weights and biases to some initial value. We will initialize the weights using a random value initializer (Xavier) and initialize the biases to zero.

In [None]:
tf.set_random_seed(1)   # Set the same seed to get the same initialization as in this demo.

# The weights for the input layer
W1 = tf.get_variable("W1", [64, 2500], initializer=tf.contrib.layers.xavier_initializer(seed=1))

# The bias for the output from the input layer
b1 = tf.get_variable("b1", [64, 1], initializer=tf.zeros_initializer())
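For reference, the uniform variant of Xavier (Glorot) initialization draws each weight from the range [-limit, limit], where limit = sqrt(6 / (fan_in + fan_out)). The sketch below computes that limit for the 2500-to-64 layer in plain Python. The function names are hypothetical, and Python's random generator will not reproduce the exact values TensorFlow produces for the same seed.

```python
import math
import random

def xavier_uniform_limit(fan_in, fan_out):
    """Half-width of the uniform range used by Xavier (Glorot) uniform initialization."""
    return math.sqrt(6.0 / (fan_in + fan_out))

def xavier_uniform(rows, cols, seed=1):
    """Return a rows x cols matrix (list of rows) drawn uniformly from [-limit, limit]."""
    rng = random.Random(seed)
    bound = xavier_uniform_limit(cols, rows)
    return [[rng.uniform(-bound, bound) for _ in range(cols)] for _ in range(rows)]

limit = xavier_uniform_limit(2500, 64)   # fan_in = 2500 pixels, fan_out = 64 outputs
W1_sketch = xavier_uniform(64, 2500)     # same shape as W1 above
print(round(limit, 4))
```

Scaling the range by the layer sizes keeps the initial signals from growing or shrinking too much as they pass through the layer.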

In [None]:
# The first layer (input layer)
Z1 = tf.add(tf.matmul(W1, X), b1)

# Let's add the activation function to the output signal from the first layer
A1 = tf.nn.relu(Z1)
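To make the shapes concrete, here is the same computation, relu(W·X + b), on a toy layer with 3 inputs, 2 outputs, and a batch of 2 images. This is a plain-Python sketch with hypothetical helper names, not what TensorFlow runs internally.

```python
def matmul(A, B):
    """Multiply an (m, k) matrix by a (k, n) matrix, both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def dense_relu(W, X, b):
    """Compute relu(W @ X + b), broadcasting the (m, 1) bias across the batch columns."""
    Z = matmul(W, X)
    return [[max(0.0, z + bias[0]) for z in row] for row, bias in zip(Z, b)]

# Toy sizes: 3 input features, 2 outputs, batch of 2 images (one image per column).
W = [[1.0, 0.0, -1.0],
     [0.5, 0.5, 0.5]]     # shape (2, 3): outputs x inputs
X = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]          # shape (3, 2): inputs x batch
b = [[0.0], [1.0]]        # shape (2, 1): one bias per output

A = dense_relu(W, X, b)
print(A)  # [[0.0, 0.0], [5.5, 7.0]]
```

The first output row is negative before the activation (-4.0 for both images), so RELU clips it to zero.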

### First Hidden Layer

The first hidden layer will have 64 inputs (outputs from the input layer) and 32 outputs. Each input will need a weight and each output a bias (which we will train). Each output will be passed through the rectified linear unit (RELU) activation function.

We will initialize the weights using a random value initializer (Xavier) and initialize the biases to zero.

In [None]:
W2 = tf.get_variable("W2", [32, 64], initializer=tf.contrib.layers.xavier_initializer(seed=1))
b2 = tf.get_variable("b2", [32, 1], initializer=tf.zeros_initializer())

Let's construct the first hidden layer:

    1. Create a node that will multiply the weights (W2) against the outputs of the input layer (A1).
    2. Create a node that adds the bias to the above node (W2 * A1).
    3. Pass the outputs from the (first) hidden layer through a dropout layer.
    4. Pass the outputs from the dropout layer through a RELU activation function.


In [None]:
# The second layer (first hidden layer)
Z2 = tf.add(tf.matmul(W2, A1), b2) 

# Let's add the dropout layer to the output signal from the second layer
D2 = tf.nn.dropout(Z2, keep_prob=D)

# Let's add the activation function to the output signal from the dropout layer
A2 = tf.nn.relu(D2)
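Dropout with a keep probability keeps each value with that probability and zeroes the rest; the kept values are scaled up by 1 / keep_prob so the expected sum stays the same. The sketch below illustrates the idea in plain Python. The function `dropout` here is a hypothetical stand-in, not TensorFlow's implementation.

```python
import random

def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob (scaled by 1/keep_prob), else zero it."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]

print(dropout(activations, keep_prob=0.9, rng=rng))  # some values may be zeroed
print(dropout(activations, keep_prob=1.0, rng=rng))  # nothing is dropped
```

This is why the keep probability is set to 1.0 when evaluating accuracy: every unit stays active and no scaling is applied.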

### Second Hidden Layer

The second hidden layer will have 32 inputs (outputs from the first hidden layer) and 20 outputs. Each input will need a weight and each output a bias (which we will train). Each output will be passed through the rectified linear unit (RELU) activation function.

We will initialize the weights using a random value initializer (Xavier) and initialize the biases to zero.

In [None]:
W3 = tf.get_variable("W3", [20, 32], initializer=tf.contrib.layers.xavier_initializer(seed=1))
b3 = tf.get_variable("b3", [20, 1], initializer=tf.zeros_initializer())

Let's construct the second hidden layer:

    1. Create a node that will multiply the weights (W3) against the outputs of the first hidden layer (A2).
    2. Create a node that adds the bias to the above node (W3 * A2).
    3. Pass the outputs from the second hidden layer through a RELU activation function.

In [None]:
# The third layer (second hidden layer)
Z3 = tf.add(tf.matmul(W3, A2), b3) 

# Let's add the activation function to the output signal from the third layer
A3 = tf.nn.relu(Z3)

### Output Layer

The output layer will have 20 inputs (outputs from the second hidden layer) and 27 outputs (one for each letter and None). Each input will need a weight and each output a bias (which we will train). The 27 outputs will be passed through a softmax activation function.

We will initialize the weights using a random value initializer (Xavier) and initialize the biases to zero.

In [None]:
W4 = tf.get_variable("W4", [27, 20], initializer=tf.contrib.layers.xavier_initializer(seed=1))
b4 = tf.get_variable("b4", [27, 1], initializer=tf.zeros_initializer())

Let's construct the output layer:

    1. Create a node that will multiply the weights (W4) against the outputs of the second hidden layer (A3).
    2. Create a node that adds the bias to the above node (W4 * A3).
    3. Pass the outputs from the output layer through a SOFTMAX squashing function (done inside the cost function, so Z4 holds the raw scores).



In [None]:
# The fourth layer (output layer)
Z4 = tf.add(tf.matmul(W4, A3), b4)
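With all four layers defined, we can count how many values the model has to learn. The sketch below derives the weight shapes and the total parameter count from the layer sizes used in this tutorial (2500, 64, 32, 20, 27); the helper names are hypothetical.

```python
layer_sizes = [2500, 64, 32, 20, 27]   # inputs, then the outputs of each layer

def weight_shapes(sizes):
    """Shape of each weight matrix as (outputs, inputs), matching W1..W4 above."""
    return [(n_out, n_in) for n_in, n_out in zip(sizes, sizes[1:])]

def count_parameters(sizes):
    """Total number of trainable weights plus biases (one bias per output)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

print(weight_shapes(layer_sizes))     # [(64, 2500), (32, 64), (20, 32), (27, 20)]
print(count_parameters(layer_sizes))  # 163371
```

Almost all of the parameters (160,064 of them) sit in the first layer, because every one of the 2500 pixels connects to each of the 64 outputs.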

### Optimizer

Now it's time to design our optimizer. Let's start by designing our cost function. We will use the mean value of the softmax cross entropy between the predicted labels and actual labels. This value (aka the cost) is what we want to reduce on each batch.

In [None]:
# softmax_cross_entropy_with_logits_v2 expects the classes on the last axis, i.e., shape (batch, classes),
# so we transpose Z4 and Y from (classes, batch)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=tf.transpose(Z4), labels=tf.transpose(Y)))
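To see what this cost measures, here is a plain-Python sketch of softmax followed by cross entropy, averaged over a batch. It assumes one-hot labels and uses hypothetical helper names; it is meant as an illustration, not as a replacement for the TensorFlow op, which is more careful numerically.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)                            # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, one_hot_label):
    """Negative log of the probability assigned to the true class."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs) if t > 0)

def mean_cost(batch_logits, batch_labels):
    """Mean cross entropy over a batch, where each row is one image."""
    losses = [cross_entropy(z, y) for z, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

num_classes = 27
label = [0.0] * num_classes
label[1] = 1.0                                    # true class is 1 (the letter A)
uniform_logits = [0.0] * num_classes              # an untrained model that has no preference

print(cross_entropy(uniform_logits, label), math.log(num_classes))
```

A model that guesses uniformly over 27 classes has a cost of ln(27), roughly 3.3, which is a useful reference point for the first epochs.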

Let's design our optimizer. This is the method that adjusts the values of the weights and biases, based on minimizing the cost value during training.

We also need to set a learning rate. This is multiplied by the gradient to scale each update of the weights. It's used to prevent huge swings in the weights, which can result in either converging at a local (instead of global) optimum, or not converging at all (the cost diverges). We will set the learning rate when we run the graph using the placeholder L.


In [None]:
# Gradient descent optimizer; the learning rate is fed at run time through the placeholder L
optimizer = tf.train.GradientDescentOptimizer(L).minimize(cost)
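The effect of the learning rate is easiest to see on a toy problem. The sketch below runs plain gradient descent on f(w) = (w - 3)^2, whose minimum is at w = 3. The function and the two learning rates are made up for illustration and are unrelated to the values used for the sign-language model.

```python
def gradient_descent(learning_rate, steps=50, start=0.0):
    """Minimize f(w) = (w - 3)^2 with the update w <- w - learning_rate * f'(w)."""
    w = start
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)       # derivative of (w - 3)^2
        w -= learning_rate * grad    # step against the gradient
    return w

print(gradient_descent(0.1))   # settles very close to 3
print(gradient_descent(1.1))   # overshoots further on every step and diverges
```

With the small rate, each step shrinks the distance to the minimum; with the large rate, each step lands farther away on the opposite side.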

### Run the Graph

In [None]:
init = tf.global_variables_initializer()

#### Hyperparameter Tuning

In [None]:
import time

epochs = 25                                    # run 25 epochs
batch_size = 100                               # for each epoch, train in batches of 100 images
number_of_images = len(X_train)                # number of images in training data

# Feed Dictionary Parameters
keep_prob = 0.9                                # fraction of outputs to keep in dropout layer
learning_rate = 0.004                          # the learning rate for gradient descent
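The batch size determines how the training data is sliced in each epoch. The sketch below lists the begin and end indices of each batch when the number of batches is computed with integer division; the image count of 1050 is a made-up number for illustration, and `batch_ranges` is a hypothetical helper.

```python
def batch_ranges(number_of_images, batch_size):
    """Return the (begin, end) slice indices of each full batch in one epoch."""
    batches = number_of_images // batch_size      # a partial last batch is dropped
    return [(i * batch_size, i * batch_size + batch_size) for i in range(batches)]

ranges = batch_ranges(1050, 100)
print(len(ranges), ranges[0], ranges[-1])  # 10 (0, 100) (900, 1000)
```

Note that with integer division the last 50 images in this example are never used for training within an epoch.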

In [None]:
def train():
    start = time.time()

    with tf.Session() as sess:
        # Initialize the variables
        sess.run(init)
        
        # number of batches in an epoch
        batches = number_of_images // batch_size

        # Run our training data through the neural network for each epoch
        for epoch in range(epochs):

            epoch_cost = 0

            # Run the training data through the neural network, one batch at a time
            for batch in range(batches):

                # Calculate the start and end indices for the next batch
                begin = (batch * batch_size)
                end   = (batch * batch_size) + batch_size

                # Get the next sequential batch from the training data
                batch_xs, batch_ys = X_train[begin:end], Y_train[begin:end]

                # Feed this batch through the neural network.
                _, batch_cost = sess.run([optimizer, cost], feed_dict={X: batch_xs.T, Y: batch_ys.T, D: keep_prob, L: learning_rate})

                epoch_cost += batch_cost

            # Report the mean cost per batch for this epoch
            print("Epoch: ", epoch, epoch_cost / batches)

        end = time.time()

        print("Training Time:", end - start)

        # Test the Model

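        # Let's select the class with the highest score per image (column) as the prediction. Softmax preserves
        # the ordering of the scores, so the largest value in Z4 is also the highest softmax probability.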
        prediction = tf.equal(tf.argmax(Z4), tf.argmax(Y))

        # Let's create another node for calculating the accuracy
        accuracy = tf.reduce_mean(tf.cast(prediction, tf.float32))

        # Now let's run our training images through the model to calculate our accuracy on the training data
        # Note how we set the keep probability for the dropout layer to 1.0 (no dropout) when we are evaluating the accuracy.
        print ("Train Accuracy:", accuracy.eval({X: X_train.T, Y: Y_train.T, D: 1.0}))

        # Now let's run our test images through the model to calculate our accuracy on the test data
        print ("Test Accuracy:", accuracy.eval({X: X_test.T, Y: Y_test.T, D: 1.0}))
        
train()
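The accuracy calculation above compares, for each image (column), the index of the largest score with the index of the true class. Here is the same idea on a toy example with 3 classes and 4 images, written in plain Python with hypothetical helper names.

```python
def argmax_per_column(matrix):
    """For a (classes, batch) matrix, return the row index of the largest value in each column."""
    return [max(range(len(col)), key=col.__getitem__) for col in zip(*matrix)]

def accuracy(logits, labels):
    """Fraction of columns where the predicted class equals the true class."""
    predicted = argmax_per_column(logits)
    actual = argmax_per_column(labels)
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(predicted)

# 3 classes (rows) and 4 images (columns)
logits = [[2.0, 0.1, 0.3, 0.0],
          [1.0, 3.0, 0.2, 5.0],
          [0.5, 0.2, 0.9, 1.0]]
labels = [[1, 0, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 1, 1]]    # one-hot columns for the classes 0, 1, 2, 2

print(argmax_per_column(logits))  # [0, 1, 2, 1]
print(accuracy(logits, labels))   # 0.75
```

The last image is predicted as class 1 but belongs to class 2, so 3 of the 4 predictions are correct.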