<center><img src="https://github.com/insaid2018/Term-1/blob/master/Images/INSAID_Full%20Logo.png?raw=true" width="240" height="180" /></center>

### Table of Contents

1. [MNIST Overview](#section1)<br>
2. [MNIST using Keras](#section2)<br>
3. [MNIST using Eager Execution](#section3)<br>

<a id=section1></a>

# MNIST Handwritten Digits Dataset

* MNIST - __Dataset__ for evaluating machine learning models __on the handwritten digit classification problem.__
* Contains __28 x 28 pixel grayscale images__ which are __size-normalized and centred.__
* __60,000 images__ are used to __train a model__ and a separate set of __10,000 images are used to test it.__

<a id=section2></a>

# MNIST Using Keras

Let's tackle the MNIST dataset using Keras - 

## 1. Import the Libraries

In [None]:
# Import tensorflow 2.x
# This code block will only work in Google Colab.
try:
    # %tensorflow_version only exists in Colab.
    %tensorflow_version 2.x
except Exception:
    pass

TensorFlow 2.x selected.


In [None]:
import numpy as np
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.utils import to_categorical

## 2. Set Random Seed for Reproducibility

In [None]:
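import tensorflow as tf
seed = 7
np.random.seed(seed)
tf.random.set_seed(seed)   # Keras draws initial weights from TensorFlow's generator, not NumPy's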

## 3. Load MNIST

Keras provides a very __convenient means of loading the dataset__ and __slicing it into training and test sets__, as shown below.

Note - It is customary to name the attributes X (matrix) in upper-case and the label y (vector) in lower-case.

In [None]:
# load (downloaded if needed) the MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()

## 4. Manually Observe the Dataset

__Plot a few images__ to see what the dataset actually looks like - 

In [None]:
import matplotlib.pyplot as plt
plt.close('all')
plt.subplot(221)                                             # Used to plot more than one figure in a graph
plt.imshow(X_train[0], cmap=plt.get_cmap('gray'))
plt.subplot(222)
plt.imshow(X_train[1], cmap=plt.get_cmap('gray'))
plt.subplot(223)
plt.imshow(X_train[2], cmap=plt.get_cmap('gray'))
plt.subplot(224)
plt.imshow(X_train[3], cmap=plt.get_cmap('gray'))
# show the plot
plt.show()

## 5. Flatten Images

The training dataset is structured as a __3-D array of shape (number of images, image height, image width)__ - 

* Since our network will be a __simple MLP, we need a vectorized input.__
* For this, we __flatten 28 x 28 pixels into 784 linear values.__
* For instance, a 3 x 3 image matrix flattens into a vector of 9 values.
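
The 3 x 3 case mentioned above can be sketched with NumPy alone. The array `image` and its pixel values are made up for illustration; the second half shows the same reshape applied to a small batch, which is the form the MNIST arrays need.

```python
import numpy as np

# A hypothetical 3 x 3 "image" with pixel values 1..9
image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

# reshape(-1) reads the matrix row by row into a single 1-D vector
flat = image.reshape(-1)
print(flat)               # [1 2 3 4 5 6 7 8 9]

# The same idea for a batch: (n_images, 3, 3) -> (n_images, 9)
batch = np.stack([image, image * 10])
flat_batch = batch.reshape(batch.shape[0], -1)
print(flat_batch.shape)   # (2, 9)
```

Each row of `flat_batch` is one image, so the first axis still indexes examples and only the two spatial axes are merged.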

In [None]:
# flatten 28*28 images to a 784 vector for each image
num_pixels = X_train.shape[1] * X_train.shape[2]
test_images = X_test
X_train = X_train.reshape(X_train.shape[0], num_pixels)
X_test = X_test.reshape(X_test.shape[0], num_pixels)

## 6. Normalize Features and One-Hot Encode Labels

* Normalization is almost always needed for __faster convergence__ and to __scale all attributes to a comparable range.__
* Here, __each of the 784 pixels is an attribute.__
* Since a gray pixel value varies from __0-255, dividing by 255 is a good method for normalizing.__
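
As a quick check of the scaling described above, here is a minimal sketch on a made-up array of pixel intensities. The explicit cast to `float32` is an assumption added for illustration (the notebook cell divides directly, which yields `float64`).

```python
import numpy as np

# Hypothetical 2 x 3 patch of 8-bit gray pixel values
pixels = np.array([[0, 51, 102],
                   [153, 204, 255]], dtype=np.uint8)

# Cast first so the division happens in floating point, then scale to [0, 1]
scaled = pixels.astype("float32") / 255.0

print(scaled.min(), scaled.max())   # 0.0 1.0
```

Every pixel is divided by the same constant, so the relative brightness between pixels is preserved.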

In [None]:
# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255

Converting labels to categorical one-hot arrays - 

In [None]:
# one hot encode outputs
y_train = to_categorical(y_train)
test_labels = y_test
y_test = to_categorical(y_test)
num_classes = y_test.shape[1]
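
To see what one-hot encoding produces, the following NumPy-only sketch mimics the effect of `to_categorical` on a handful of made-up labels. It is an illustration of the idea, not the Keras implementation; `demo_labels` and `n_classes` are hypothetical names.

```python
import numpy as np

demo_labels = np.array([5, 0, 4, 1])   # hypothetical digit labels
n_classes = 10

# Row k of the identity matrix is the one-hot vector for class k
one_hot = np.eye(n_classes, dtype="float32")[demo_labels]
print(one_hot[0])        # a 1.0 at index 5, zeros elsewhere

# argmax along each row recovers the original integer labels
recovered = one_hot.argmax(axis=1)
print(recovered)
```

This is why the evaluation cells later call `np.argmax` on rows of `y_test` to get the digit back.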

## 7. Define Neural Network

We define a simple Neural Network with __784 input units, two hidden layers of 784 units each and one output layer of 10 units -__

In [None]:
# create model
model = Sequential()
model.add(Dense(num_pixels, input_dim=num_pixels, kernel_initializer='normal', activation='relu'))
model.add(Dense(num_pixels, kernel_initializer='normal', activation='relu'))
model.add(Dense(num_classes, kernel_initializer='normal', activation='softmax'))
# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
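
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes the layer sizes used above (784 inputs, two hidden layers of 784 units, 10 outputs); `dense_params` is a hypothetical helper, and the total should agree with what `model.summary()` reports.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

layer_sizes = [784, 784, 784, 10]   # input, hidden 1, hidden 2, output

# Pair each layer size with the next one to get (inputs, units) per Dense layer
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)   # [615440, 615440, 7850]
print(total)       # 1238730
```

Almost all of the parameters sit in the two 784 x 784 hidden layers; the output layer contributes fewer than 8,000.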

## 8. Fit the Model

In [None]:
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Baseline Error: %.2f%%" % (100-scores[1]*100))

Once trained, the model predicts digits in this manner - 

<center><img src="https://thumbs.gfycat.com/WeepyConcreteGemsbok-size_restricted.gif"/></center>

## 9. Evaluate Performance

Finally, we evaluate the model by __looking at the correctly and incorrectly classified images.__

In [None]:
loss_and_metrics = model.evaluate(X_test, y_test, verbose=2)

print("Test Loss", loss_and_metrics[0])
print("Test Accuracy", loss_and_metrics[1])

In [None]:
# take the i-th image
i = 0
predictions = model.predict(X_test)
print(predictions[i])
print("Model predicts {}, for the actual digit {}".format(np.argmax(predictions[i]), np.argmax(y_test[i])))

In [None]:
# Check incorrect predictions among the first 1000 test images.
# Sequential.predict_classes was removed in TensorFlow 2.6, so take the argmax of predict() instead.
predicted_classes = np.argmax(model.predict(X_test), axis=1)
for i in range(1000):
    if predicted_classes[i] != test_labels[i]:
        print("Model predicts {}, for the actual digit {}".format(predicted_classes[i], test_labels[i]))
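
The same comparison can be done without an explicit loop. The sketch below uses two short made-up label arrays in place of the model's output, so `true_labels` and `pred_labels` are hypothetical.

```python
import numpy as np

true_labels = np.array([7, 2, 1, 0, 4, 1])   # hypothetical ground truth
pred_labels = np.array([7, 2, 1, 6, 4, 7])   # hypothetical predictions

# Indices where prediction and ground truth disagree
wrong = np.flatnonzero(pred_labels != true_labels)
print(wrong)   # [3 5]

# The fraction of matching entries is the accuracy
accuracy = np.mean(pred_labels == true_labels)
print(accuracy)
```

The indices in `wrong` can then be used to pull out the misclassified images for plotting.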

<a id=section3></a>

# MNIST Using TensorFlow Eager 

* Eager execution runs operations immediately instead of building a graph first, which makes it useful for __rapidly building and checking models.__
* However, using TensorFlow instead of Keras means we need to __declare our own loss and accuracy functions.__

Note - Restart the kernel before running this section. Eager execution must be enabled at the very start of the program.

## 1. Import the Libraries

- Using **TensorFlow 1**.

- The next cell will only work on **Google Colab**, because it allows us to choose between the 1.x or 2.x versions of TensorFlow.

In [None]:
%tensorflow_version 1.x

In [None]:
import tensorflow as tf

In [None]:
from __future__ import absolute_import, division, print_function

## 2. Enable Eager Execution

In [None]:
# Set Eager API
tf.enable_eager_execution()
tfe = tf.contrib.eager

## 3. Load the Data

In [None]:
# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=False)

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz


## 4. Initialize Hyperparameters

In [None]:
# Parameters
learning_rate = 0.001
num_steps = 10000
batch_size = 128
display_step = 100

# Network Parameters
n_hidden_1 = 256 # 1st layer number of neurons
n_hidden_2 = 256 # 2nd layer number of neurons
num_input = 784 # MNIST data input (img shape: 28*28)
num_classes = 10 # MNIST total classes (0-9 digits)

## 5. Make Eager Iterator for Dataset

In [None]:
# Iterator for the dataset
dataset = tf.data.Dataset.from_tensor_slices(
    (mnist.train.images, mnist.train.labels))
dataset = dataset.repeat().batch(batch_size).prefetch(batch_size)
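
To make the effect of `repeat()` followed by `batch()` concrete, here is a plain-Python sketch of the same ordering on a toy dataset of five items. `repeat_and_batch` is a hypothetical helper written for illustration and is not part of TensorFlow.

```python
from itertools import cycle, islice

def repeat_and_batch(examples, batch_size):
    """Cycle over the examples forever and cut the stream into batches."""
    stream = cycle(examples)
    while True:
        yield list(islice(stream, batch_size))

batches = repeat_and_batch(range(5), batch_size=2)
first_four = [next(batches) for _ in range(4)]
print(first_four)   # [[0, 1], [2, 3], [4, 0], [1, 2]]
```

Because repetition happens before batching, a batch can straddle the boundary between two passes over the data (the third batch above), and every batch has the full size.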
dataset_iter = tfe.Iterator(dataset)

## 6. Define Neural Network

In [None]:
# Define the neural network. To use the eager API and the tf.layers API together,
# we subclass tfe.Network as follows:
class NeuralNet(tfe.Network):
    def __init__(self):
        # Define each layer
        super(NeuralNet, self).__init__()
        # Hidden fully connected layer with 256 neurons
        self.layer1 = self.track_layer(
            tf.layers.Dense(n_hidden_1, activation=tf.nn.relu))
        # Hidden fully connected layer with 256 neurons
        self.layer2 = self.track_layer(
            tf.layers.Dense(n_hidden_2, activation=tf.nn.relu))
        # Output fully connected layer with a neuron for each class
        self.out_layer = self.track_layer(tf.layers.Dense(num_classes))

    def call(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        return self.out_layer(x)


neural_net = NeuralNet()


Please inherit from `tf.keras.Model`, and see its documentation for details. `tf.keras.Model` should be a drop-in replacement for `tfe.Network` in most cases, but note that `track_layer` is no longer necessary or supported. Instead, `Layer` instances are tracked on attribute assignment (see the section of `tf.keras.Model`'s documentation on subclassing). Since the output of `track_layer` is often assigned to an attribute anyway, most code can be ported by simply removing the `track_layer` calls.

`tf.keras.Model` works with all TensorFlow `Layer` instances, including those from `tf.layers`, but switching to the `tf.keras.layers` versions along with the migration to `tf.keras.Model` is recommended, since it will preserve variable names. Feel free to import it with an alias to avoid excess typing :).


## 7. Specify Loss and Accuracy Functions

In [None]:
# Cross-Entropy loss function
def loss_fn(inference_fn, inputs, labels):
    # Using sparse_softmax cross entropy
    return tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=inference_fn(inputs), labels=labels))

# Calculate accuracy
def accuracy_fn(inference_fn, inputs, labels):
    prediction = tf.nn.softmax(inference_fn(inputs))
    correct_pred = tf.equal(tf.argmax(prediction, 1), labels)
    return tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# SGD Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)

# Compute gradients
grad = tfe.implicit_gradients(loss_fn)
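
For intuition about what the two functions above compute, here is a NumPy sketch of the same quantities. It is an illustration under the usual definitions, not TensorFlow's implementation, and the logits and labels are made up.

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, labels):
    """Mean cross-entropy between raw logits and integer class labels."""
    # Subtract the row maximum before exponentiating for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class in each row
    return -log_probs[np.arange(len(labels)), labels].mean()

def accuracy(logits, labels):
    """Fraction of rows whose largest logit sits at the true class."""
    return np.mean(logits.argmax(axis=1) == labels)

logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, 0.5],
                   [0.1, 0.2, 3.0]])
labels = np.array([0, 2, 1])

print(sparse_softmax_cross_entropy(logits, labels))
print(accuracy(logits, labels))
```

Softmax preserves the ordering of the logits, so taking the argmax of the raw logits gives the same predicted class as taking it after softmax.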

## 8. Train the Model

In [None]:
# Training
average_loss = 0.
average_acc = 0.
for step in range(num_steps):

    # Iterate through the dataset
    d = dataset_iter.next()

    # Images
    x_batch = d[0]
    # Labels
    y_batch = tf.cast(d[1], dtype=tf.int64)

    # Compute the batch loss
    batch_loss = loss_fn(neural_net, x_batch, y_batch)
    average_loss += batch_loss
    # Compute the batch accuracy
    batch_accuracy = accuracy_fn(neural_net, x_batch, y_batch)
    average_acc += batch_accuracy

    if step == 0:
        # Display the initial cost, before optimizing
        print("Initial loss= {:.9f}".format(average_loss))

    # Update the variables following gradients info
    optimizer.apply_gradients(grad(neural_net, x_batch, y_batch))

    # Display info
    if (step + 1) % display_step == 0 or step == 0:
        if step > 0:
            average_loss /= display_step
            average_acc /= display_step
        print("Step:", '%04d' % (step + 1), " loss=",
              "{:.9f}".format(average_loss), " accuracy=",
              "{:.4f}".format(average_acc))
        average_loss = 0.
        average_acc = 0.

Initial loss= 2.345647335
Step: 0001  loss= 2.345647335  accuracy= 0.0781
Step: 0100  loss= 2.274475336  accuracy= 0.1259
Step: 0200  loss= 2.230574608  accuracy= 0.2231
Step: 0300  loss= 2.170125008  accuracy= 0.3077
Step: 0400  loss= 2.111117840  accuracy= 0.4073
Step: 0500  loss= 2.038504839  accuracy= 0.5015
Step: 0600  loss= 1.983657122  accuracy= 0.5513
Step: 0700  loss= 1.921869159  accuracy= 0.5946
Step: 0800  loss= 1.863393307  accuracy= 0.6259
Step: 0900  loss= 1.771777630  accuracy= 0.6801
Step: 1000  loss= 1.718756676  accuracy= 0.6741
Step: 1100  loss= 1.663859725  accuracy= 0.6901
Step: 1200  loss= 1.586413980  accuracy= 0.7123
Step: 1300  loss= 1.505777597  accuracy= 0.7384
Step: 1400  loss= 1.462781787  accuracy= 0.7294
Step: 1500  loss= 1.395831466  accuracy= 0.7437
Step: 1600  loss= 1.339035392  accuracy= 0.7602
Step: 1700  loss= 1.276643872  accuracy= 0.7653
Step: 1800  loss= 1.204934478  accuracy= 0.7790
Step: 1900  loss= 1.168129086  accuracy= 0.7791
Step: 2000  lo

## 9. Test the Model

In [None]:
# Evaluate model on the test image set
testX = mnist.test.images
testY = mnist.test.labels

test_acc = accuracy_fn(neural_net, testX, testY)
print("Testset Accuracy: {:.4f}".format(test_acc))

Testset Accuracy: 0.9010
