<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Deep-Learning-with-the-Edge-Based-Transformation" data-toc-modified-id="Deep-Learning-with-the-Edge-Based-Transformation-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Deep Learning with the Edge-Based Transformation</a></span><ul class="toc-item"><li><span><a href="#A-Simple-Network" data-toc-modified-id="A-Simple-Network-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>A Simple Network</a></span></li><li><span><a href="#A-More-Complex-Network" data-toc-modified-id="A-More-Complex-Network-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>A More Complex Network</a></span><ul class="toc-item"><li><span><a href="#Adding-More-Nodes" data-toc-modified-id="Adding-More-Nodes-1.2.1"><span class="toc-item-num">1.2.1&nbsp;&nbsp;</span>Adding More Nodes</a></span></li><li><span><a href="#Adding-More-Layers" data-toc-modified-id="Adding-More-Layers-1.2.2"><span class="toc-item-num">1.2.2&nbsp;&nbsp;</span>Adding More Layers</a></span></li><li><span><a href="#Adding-More-Nodes-and-More-Layers-to-the-Network" data-toc-modified-id="Adding-More-Nodes-and-More-Layers-to-the-Network-1.2.3"><span class="toc-item-num">1.2.3&nbsp;&nbsp;</span>Adding More Nodes and More Layers to the Network</a></span></li></ul></li><li><span><a href="#Discussion" data-toc-modified-id="Discussion-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Discussion</a></span></li></ul></li></ul></div>

# Deep Learning with the Edge-Based Transformation
Note that the main explanation of the deep learning approach is in the deep_learning_raw_pixel notebook. This notebook contains only the additional description of the edge-based transformation.

In this notebook, we apply the edge-based transformation to the images and train the model on the transformed images.

In [1]:
import tensorflow as tf

In [2]:
import pandas as pd
from autograd import numpy as np

In [3]:
df_train = pd.read_csv("./traindata.csv", dtype=np.uint8)

In [4]:
df_train.head(3)

Unnamed: 0,id,0,1,2,3,4,5,6,7,8,...,775,776,777,778,779,780,781,782,783,label
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,5
1,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,2,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,4


In [5]:
df_train.shape

(60000, 786)

In [6]:
df_imgs = df_train.drop(['label', 'id'], axis=1)
x_original = df_imgs.values.T  # .as_matrix() was removed in pandas 1.0; .values is equivalent

print("x_original shape: ", x_original.shape)

x_original shape:  (784, 60000)


  


Here, we extract the edge-based features from the data.

In [7]:
# extract edge-based features
import data_transformer
x_original_edgebased_features = data_transformer.edge_transformer(x_original)   

print('shape of original input ', x_original.shape)
print('shape of transformed input ', x_original_edgebased_features.shape)

shape of original input  (784, 60000)
shape of transformed input  (1352, 60000)
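
The `data_transformer` module is not shown in this notebook, so the following is only a rough sketch of how an edge-based transformation could turn 784 pixels into 1352 features. It assumes 28×28 images, eight 3×3 compass-style edge kernels, a ReLU, and 2×2 max pooling (8 × 13 × 13 = 1352). The kernel choice and the names `compass_kernels` and `edge_features_sketch` are hypothetical and may differ from what `edge_transformer` actually does.

```python
import numpy as np

# Positions of the outer ring of a 3x3 kernel, listed clockwise from the top-left.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def compass_kernels():
    """Build eight 3x3 edge kernels by rotating one kernel in 45-degree steps."""
    ring_vals = np.array([-1, -1, -1, 0, 1, 1, 1, 0], dtype=float)
    kernels = []
    for k in range(8):
        kern = np.zeros((3, 3))
        for (r, c), v in zip(RING, np.roll(ring_vals, k)):
            kern[r, c] = v
        kernels.append(kern)
    return np.stack(kernels)  # shape (8, 3, 3)

def edge_features_sketch(x, side=28):
    """Map pixel columns of shape (side*side, n) to edge features of shape (8*13*13, n) for side=28."""
    n = x.shape[1]
    imgs = x.T.reshape(n, side, side).astype(float)
    out = side - 2  # size after a "valid" 3x3 filter
    kernels = compass_kernels()

    # Valid 3x3 cross-correlation, accumulated over the nine kernel offsets.
    resp = np.zeros((n, 8, out, out))
    for i in range(3):
        for j in range(3):
            patch = imgs[:, i:i + out, j:j + out]  # (n, out, out)
            resp += kernels[None, :, i, j, None, None] * patch[:, None, :, :]

    resp = np.maximum(resp, 0)  # keep only positive edge responses (ReLU)

    # 2x2 max pooling with stride 2: (n, 8, 26, 26) -> (n, 8, 13, 13).
    half = out // 2
    pooled = resp.reshape(n, 8, half, 2, half, 2).max(axis=(3, 5))
    return pooled.reshape(n, -1).T  # (1352, n) when side == 28

# Toy check on random "images"; this is not the notebook's data.
x_demo = np.random.RandomState(0).randint(0, 256, size=(784, 5))
print(edge_features_sketch(x_demo).shape)
```

Because every kernel sums to zero, a constant image produces all-zero features under this sketch; only intensity changes (edges) yield nonzero responses.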


In [8]:
y_original = df_train['label'].values  # .as_matrix() was removed in pandas 1.0

print("y_original shape: ", y_original.shape)

y_original shape:  (60000,)


In [9]:
# For testing purposes: training on a random subsample shortens the training time
# while debugging the code/model.

# num_sample = 5000
# inds = np.random.permutation(y_original.shape[0])[:num_sample]
# x_sample = x_original_edgebased_features[:,inds].T
# y_sample = y_original[inds]

In [10]:
# print("x_sample shape: ", x_sample.shape)
# print("y_sample shape: ", y_sample.shape)

x_sample shape:  (5000, 1352)
y_sample shape:  (5000,)


In [36]:
x = x_original_edgebased_features.T
y = y_original

print("x shape: ", x.shape)
print("y shape: ", y.shape)

x shape:  (60000, 1352)
y shape:  (60000,)


In [37]:
xmax = np.max(x)
x_scale = x/xmax

In [51]:
input_dim = x.shape[1]
nb_classes = 10

print("input_dim: ", input_dim)
print("nb_classes: ", nb_classes)

input_dim:  1352
nb_classes:  10


## A Simple Network
This model has only three layers: an input layer, one hidden layer, and an output layer.

In [39]:
model = tf.keras.models.Sequential()
# model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512, 
                                input_dim=input_dim, 
                                activation=tf.nn.relu))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
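
To give a sense of the size of this network, the sketch below counts the trainable parameters of a fully connected network from its layer widths. The helper name `dense_param_count` is hypothetical; it counts only the weights and biases of `Dense` layers, since `Dropout` adds no parameters.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the width of every layer, input first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total

# 1352 input features, one hidden layer of 512 units, 10 output classes.
print(dense_param_count([1352, 512, 10]))  # 697866
```

Most of these parameters sit in the first layer (1352 × 512 weights), so widening the hidden layer or the input grows the model quickly.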

In [40]:
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam', 
              metrics=['accuracy'])

In [41]:
model.fit(x_scale, y, epochs=10, verbose=2)

Epoch 1/10
 - 31s - loss: 0.1394 - acc: 0.9567
Epoch 2/10
 - 30s - loss: 0.0707 - acc: 0.9783
Epoch 3/10
 - 32s - loss: 0.0543 - acc: 0.9838
Epoch 4/10
 - 31s - loss: 0.0462 - acc: 0.9851
Epoch 5/10
 - 31s - loss: 0.0408 - acc: 0.9870
Epoch 6/10
 - 34s - loss: 0.0341 - acc: 0.9889
Epoch 7/10
 - 37s - loss: 0.0322 - acc: 0.9898
Epoch 8/10
 - 32s - loss: 0.0317 - acc: 0.9900
Epoch 9/10
 - 33s - loss: 0.0290 - acc: 0.9914
Epoch 10/10
 - 34s - loss: 0.0261 - acc: 0.9912


<tensorflow.python.keras._impl.keras.callbacks.History at 0x7fe3cd500048>

Next, we load the test data to evaluate the model.

In [42]:
df_testdata = pd.read_csv("./testdata.csv", dtype=np.uint8)
df_testdata.head(3)

Unnamed: 0,id,0,1,2,3,4,5,6,7,8,...,775,776,777,778,779,780,781,782,783,label
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,7
1,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,2
2,2,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1


In [43]:
df_testimgs = df_testdata.drop(['label', 'id'], axis=1)
xtest = df_testimgs.values.T  # .as_matrix() was removed in pandas 1.0; .values is equivalent

print("xtest shape:", xtest.shape)

xtest shape: (784, 10000)


  


We also need to extract the edge-based features from the test images and scale them with the same factor (`xmax`) that was computed on the training data.

In [44]:
xtest_edgebased_features = data_transformer.edge_transformer(xtest)   

print('shape of transformed test input ', xtest_edgebased_features.shape)

shape of transformed test input  (1352, 10000)


In [45]:
ytest = df_testdata['label'].values  # .as_matrix() was removed in pandas 1.0
# ytest = ytest.reshape((10000, 1)).T
print("ytest shape:", ytest.shape)

ytest shape: (10000,)


In [46]:
xtest_scale = xtest_edgebased_features/xmax

xtest_scale = xtest_scale.T
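
The scaling step above divides the test features by `xmax`, the maximum computed on the training features, so that both sets share one scale. A minimal sketch of that pattern on toy arrays follows; `fit_max_scale` and `apply_max_scale` are hypothetical helper names, not part of the notebook.

```python
import numpy as np

def fit_max_scale(train_features):
    """Return the scaling constant, learned from the training data only."""
    return float(np.max(train_features))

def apply_max_scale(features, scale):
    """Scale (n_features, n_samples) data and return it as (n_samples, n_features)."""
    return (features / scale).T

train = np.array([[0.0, 2.0, 4.0],
                  [1.0, 3.0, 8.0]])   # 2 features, 3 samples
test = np.array([[2.0, 10.0],
                 [4.0, 6.0]])         # 2 features, 2 samples

scale = fit_max_scale(train)                 # taken from the training data
train_scaled = apply_max_scale(train, scale)
test_scaled = apply_max_scale(test, scale)   # can exceed 1 if the test data has larger values
print(train_scaled.max(), test_scaled.max())
```

Reusing the training constant keeps the test inputs on the scale the network saw during training, even though the scaled test values are then not guaranteed to lie in [0, 1].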

In [47]:
model.evaluate(xtest_scale, ytest)




[0.071228929195043741, 0.9849]

With a simple network, we achieve about 98.5 percent test accuracy (0.9849).
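
`model.evaluate` returns the loss followed by the metrics passed to `compile`, so the second number above is the test accuracy. As an illustration of what that metric measures, the sketch below computes accuracy from class probabilities with NumPy. The probability matrix is made-up toy data, not output of the model, and `accuracy_from_probs` is a hypothetical helper name.

```python
import numpy as np

def accuracy_from_probs(probs, labels):
    """Fraction of samples whose most probable class equals the true label."""
    predicted = np.argmax(probs, axis=1)
    return float(np.mean(predicted == labels))

# Toy softmax outputs for 4 samples and 3 classes.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.2, 0.5, 0.3]])
labels = np.array([1, 0, 1, 1])

print(accuracy_from_probs(probs, labels))  # 0.75
```

With the trained model, the same function could be applied to `model.predict(xtest_scale)` and `ytest`.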

## A More Complex Network

### Adding More Nodes
We increase the number of nodes in the hidden layer from 512 to 1352, the same as the input dimension.

In [52]:
model = tf.keras.models.Sequential()
# model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(input_dim, 
                                input_dim=input_dim, 
                                activation=tf.nn.relu))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(nb_classes, activation=tf.nn.softmax))

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam', 
              metrics=['accuracy'])

In [53]:
model.fit(x_scale, y, epochs=10, verbose=2)

Epoch 1/10
 - 66s - loss: 0.1420 - acc: 0.9553
Epoch 2/10
 - 61s - loss: 0.0732 - acc: 0.9770
Epoch 3/10
 - 65s - loss: 0.0613 - acc: 0.9803
Epoch 4/10
 - 65s - loss: 0.0509 - acc: 0.9843
Epoch 5/10
 - 66s - loss: 0.0420 - acc: 0.9875
Epoch 6/10
 - 65s - loss: 0.0402 - acc: 0.9876
Epoch 7/10
 - 67s - loss: 0.0361 - acc: 0.9888
Epoch 8/10
 - 58s - loss: 0.0331 - acc: 0.9903
Epoch 9/10
 - 59s - loss: 0.0338 - acc: 0.9908
Epoch 10/10
 - 65s - loss: 0.0310 - acc: 0.9914


<tensorflow.python.keras._impl.keras.callbacks.History at 0x7fe41e0bfe48>

In [54]:
model.evaluate(xtest_scale, ytest)




[0.049635707115774128, 0.98750000000000004]

The evaluation output shows a test accuracy of 98.75 percent, which is not a significant improvement over the simple model (98.49 percent).

### Adding More Layers
We try to improve the accuracy by adding a second hidden layer to the network.

In [61]:
model = tf.keras.models.Sequential()
# model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512, 
                                input_dim=input_dim, 
                                activation=tf.nn.relu))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(512,
                                activation=tf.nn.relu))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(nb_classes, activation=tf.nn.softmax))

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam', 
              metrics=['accuracy'])

In [62]:
model.fit(x_scale, y, epochs=10, verbose=2)

Epoch 1/10
 - 45s - loss: 0.1588 - acc: 0.9514
Epoch 2/10
 - 43s - loss: 0.0860 - acc: 0.9732
Epoch 3/10
 - 46s - loss: 0.0683 - acc: 0.9792
Epoch 4/10
 - 42s - loss: 0.0612 - acc: 0.9815
Epoch 5/10
 - 42s - loss: 0.0509 - acc: 0.9846
Epoch 6/10
 - 42s - loss: 0.0501 - acc: 0.9851
Epoch 7/10
 - 40s - loss: 0.0489 - acc: 0.9856
Epoch 8/10
 - 41s - loss: 0.0468 - acc: 0.9865
Epoch 9/10
 - 39s - loss: 0.0409 - acc: 0.9888
Epoch 10/10
 - 42s - loss: 0.0419 - acc: 0.9883


<tensorflow.python.keras._impl.keras.callbacks.History at 0x7fe41d12a860>

In [63]:
model.evaluate(xtest_scale, ytest)




[0.066465662567854814, 0.98460000000000003]

With the additional layer, the test accuracy is 98.46 percent, essentially the same as the simple network.

### Adding More Nodes and More Layers to the Network 
In this section, we add more layers and more nodes to the network, and measure the accuracy of the model.

In [66]:
model = tf.keras.models.Sequential()
# model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(input_dim, 
                                input_dim=input_dim, 
                                activation=tf.nn.relu))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(input_dim,
                                activation=tf.nn.relu))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(nb_classes, activation=tf.nn.softmax))

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam', 
              metrics=['accuracy'])

In [67]:
model.fit(x_scale, y, epochs=10, verbose=2)

Epoch 1/10
 - 97s - loss: 0.1601 - acc: 0.9508
Epoch 2/10
 - 99s - loss: 0.0900 - acc: 0.9737
Epoch 3/10
 - 97s - loss: 0.0751 - acc: 0.9782
Epoch 4/10
 - 99s - loss: 0.0655 - acc: 0.9810
Epoch 5/10
 - 98s - loss: 0.0602 - acc: 0.9829
Epoch 6/10
 - 99s - loss: 0.0538 - acc: 0.9843
Epoch 7/10
 - 101s - loss: 0.0488 - acc: 0.9868
Epoch 8/10
 - 98s - loss: 0.0507 - acc: 0.9868
Epoch 9/10
 - 106s - loss: 0.0456 - acc: 0.9879
Epoch 10/10
 - 100s - loss: 0.0511 - acc: 0.9871


<tensorflow.python.keras._impl.keras.callbacks.History at 0x7fe3c9ec7e10>

In [68]:
model.evaluate(xtest_scale, ytest)




[0.0613382150465899, 0.98509999999999998]

The accuracy of this model is about 98.5 percent (0.9851).

## Discussion
Because the test accuracies of the four models differ only slightly (98.46 to 98.75 percent), we cannot conclude that
adding more nodes, more layers, or both improves accuracy on this classification problem.
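
To put these differences in perspective, the sketch below lists the test accuracies reported above and converts each gap into a number of test images. It also computes a rough binomial standard error for each accuracy, assuming the 10000 test images are independent samples. This is only an approximation and not a formal significance test between models.

```python
import math

n_test = 10000  # number of test images

# Test accuracies reported by model.evaluate in the sections above.
test_acc = {
    "simple (512)": 0.9849,
    "more nodes (1352)": 0.9875,
    "more layers (512, 512)": 0.9846,
    "more nodes and layers (1352, 1352)": 0.9851,
}

def binomial_std_error(acc, n):
    """Approximate standard error of an accuracy measured on n independent samples."""
    return math.sqrt(acc * (1.0 - acc) / n)

baseline = test_acc["simple (512)"]
for name, acc in test_acc.items():
    diff_images = round((acc - baseline) * n_test)  # gap to the simple model, in images
    se = binomial_std_error(acc, n_test)
    print(f"{name}: acc={acc:.4f}, diff vs simple={diff_images:+d} images, std err={se:.4f}")
```

Under this approximation each accuracy carries a standard error of roughly 0.1 percentage point, which is the same order of magnitude as the gaps between the models.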