# Deep Neural Net on MNIST Data

This notebook implements the deep neural network discussed in the bitfusion blog post "Intro to Tensorflow." More details about the code can be found there.

What follows assumes you are set up on the bitfusion TensorFlow 1.0 AMI and that you have already run the `./setup.sh` script to download the required data.

## Set up the Environment

First we import the required packages and adjust the logging of TensorFlow (TF). Because `tf.contrib.learn` is still part of contrib and not yet core, it emits a large number of warning messages. The custom handler below passes only INFO-level records and drops everything else. If one of those warnings turns out to require a code change, we will change the code. The commented-out line at the end of the cell is what should be used once TF no longer produces so much warning text.

In [1]:
# Read in required packages
import tensorflow as tf
import tensorflow.contrib.layers as layers
import tensorflow.contrib.learn as learn
from tensorflow.contrib.learn.python.learn.metric_spec import MetricSpec
import tensorflow.contrib.metrics as tfmetrics
import cPickle
import gzip

# The following code is to over-write the logging information outputted by tf.contrib.learn
from logging import StreamHandler, INFO, getLogger

logger = getLogger('tensorflow')
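# Remove the handlers TF installs by default (safe even if none are registered)
for handler in list(logger.handlers):
    logger.removeHandler(handler)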

logger.setLevel(INFO)


class DebugFileHandler(StreamHandler):
    def __init__(self):
        StreamHandler.__init__(self)

    def emit(self, record):
        if not record.levelno == INFO:
            return
        StreamHandler.emit(self, record)

logger.addHandler(DebugFileHandler())

# Once TF stops emitting the extra warnings, this single line should replace the workaround above:
# tf.logging.set_verbosity(tf.logging.INFO)
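
The same "INFO only" behaviour can also be expressed with a standard-library `logging.Filter` instead of a handler subclass. The block below is a minimal sketch of that idea, written for Python 3 and independent of TensorFlow; the names `OnlyInfoFilter`, `make_info_only_logger`, and the logger name `demo_info_only` are hypothetical and chosen only for illustration.

```python
import io
import logging


class OnlyInfoFilter(logging.Filter):
    """Let INFO records through and drop every other level."""

    def filter(self, record):
        return record.levelno == logging.INFO


def make_info_only_logger(name, stream):
    """Return a logger that writes only INFO records to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records away from the root logger
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
    handler = logging.StreamHandler(stream)
    handler.addFilter(OnlyInfoFilter())
    logger.addHandler(handler)
    return logger


# Demonstration on an in-memory stream (hypothetical logger name)
buffer = io.StringIO()
demo_logger = make_info_only_logger('demo_info_only', buffer)
demo_logger.info('training step finished')
demo_logger.warning('deprecated call')  # dropped by the filter
captured = buffer.getvalue()
```

Here `captured` contains only the INFO message. Attaching a filter keeps the stock `StreamHandler` untouched, which may be easier to remove later than a custom handler class.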

## Reading the Data

For this project we read in the MNIST dataset that the `./setup.sh` script downloaded. The pickled file holds three splits (training, validation, and test), each a pair of features and labels.

In [2]:
# Read Data
f = gzip.open('mnist.pkl.gz', 'rb')
train_set, valid_set, test_set = cPickle.load(f)
f.close()
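
The cell above uses `cPickle`, which exists only in Python 2. As a hedged sketch, the same loading step in Python 3 might look like the following. The helper name `load_pickled_splits` is hypothetical, and `encoding='latin1'` is the setting commonly needed to read pickles written by Python 2; whether it is required depends on how `mnist.pkl.gz` was produced. The demonstration writes a tiny stand-in file rather than touching the real dataset.

```python
import gzip
import os
import pickle
import tempfile


def load_pickled_splits(path):
    """Load a gzipped pickle holding (train, valid, test) splits."""
    # The context manager closes the file even if unpickling fails.
    with gzip.open(path, 'rb') as f:
        # 'latin1' lets Python 3 read byte strings pickled by Python 2.
        return pickle.load(f, encoding='latin1')


# Tiny stand-in data with the same (features, labels) layout; not real MNIST
demo_splits = (
    ([[0.0, 1.0]], [1]),
    ([[1.0, 0.0]], [0]),
    ([[0.5, 0.5]], [1]),
)

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.pkl.gz')
    with gzip.open(demo_path, 'wb') as f:
        pickle.dump(demo_splits, f)
    train_demo, valid_demo, test_demo = load_pickled_splits(demo_path)
```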

## Defining the Model Using `tf.contrib.learn`

We will now define the model structure for a deep neural network (DNN). The `tf.contrib.learn` framework provides pre-packaged models that can be run in a very similar manner to a scikit-learn model.

In [3]:
# Infer the feature columns
feature_columns = tf.contrib.learn.infer_real_valued_columns_from_input(train_set[0])

# Define the DNN classifier
classifier = tf.contrib.learn.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[512, 128],
    n_classes=10
)
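
To get a feel for the size of this network, the sketch below counts the weights and biases of the fully connected layers implied by `hidden_units=[512, 128]` and `n_classes=10`. It assumes each MNIST example is a flattened 28x28 image (784 features) and that every layer has one weight matrix and one bias vector. The function name `count_dense_parameters` is hypothetical, and the count ignores any extra variables the estimator may keep internally.

```python
def count_dense_parameters(n_inputs, hidden_units, n_classes):
    """Count weights and biases in a stack of fully connected layers."""
    sizes = [n_inputs] + list(hidden_units) + [n_classes]
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += fan_in * fan_out  # weight matrix
        total += fan_out           # bias vector
    return total


# Assumption: 784 input features (28 x 28 pixels per image)
n_params = count_dense_parameters(784, [512, 128], 10)
print(n_params)  # 468874
```

Most of the parameters sit in the first layer (784 x 512 weights), which is typical when a wide hidden layer is attached directly to raw pixels.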

## Training the `tf.contrib.learn` Model

Next we train the model by calling the `fit` function on the `DNNClassifier` object. Training runs for 10,000 steps with a batch size of 100.

In [4]:
classifier.fit(x=train_set[0],
               y=train_set[1],
               batch_size=100,
               steps=10000)
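
It can help to translate steps into passes over the data. The arithmetic below is a small sketch that assumes each step consumes one batch and that the training split holds 50,000 examples. That size is an assumption about this copy of `mnist.pkl.gz`; check `len(train_set[0])` to confirm it. The function name `epochs_seen` is hypothetical.

```python
def epochs_seen(steps, batch_size, n_examples):
    """Approximate number of passes over the training data."""
    return steps * batch_size / float(n_examples)


# Assumption: 50,000 training examples
n_epochs = epochs_seen(steps=10000, batch_size=100, n_examples=50000)
print(n_epochs)  # 20.0
```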

## Testing the Trained Model

Finally, we measure the accuracy of this model on the held-out test set.

In [5]:
score = classifier.evaluate(x=test_set[0], y=test_set[1])
print('Accuracy: {0:f}'.format(score['accuracy']))
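
For reference, the accuracy reported here is simply the fraction of test examples whose predicted class matches the true label. The sketch below shows that calculation in plain Python; it is an illustration of the metric, not the code TensorFlow runs internally, and the function name `accuracy` is hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that equal the corresponding label."""
    if len(predictions) != len(labels):
        raise ValueError('predictions and labels must have the same length')
    if not labels:
        raise ValueError('cannot compute accuracy on an empty set')
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / float(len(labels))


# Three of the four made-up predictions match their labels
demo_accuracy = accuracy([7, 2, 1, 0], [7, 2, 1, 4])
print(demo_accuracy)  # 0.75
```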

## Defining Our Own Model Using `tf.contrib.layers`

Next we illustrate how to implement your own DNN architecture. For this example we implement exactly the same model that we already trained. The purpose is to get comfortable with defining models using `tf.contrib.layers`.

In [6]:
# Model Definition
def fully_connected_model(features, labels):
    # Flatten each example to a vector and one-hot encode the integer labels
    features = layers.flatten(features)
    labels = tf.one_hot(tf.cast(labels, tf.int32), 10, 1, 0)

    # Two ReLU hidden layers (512 and 128 units) followed by a linear 10-way output
    layer1 = layers.fully_connected(features, 512, activation_fn=tf.nn.relu, scope='fc1')
    layer2 = layers.fully_connected(layer1, 128, activation_fn=tf.nn.relu, scope='fc2')
    logits = layers.fully_connected(layer2, 10, activation_fn=None, scope='out')

    # Mean softmax cross-entropy over the batch
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))

    # Plain SGD with a fixed learning rate
    train_op = layers.optimize_loss(
        loss,
        tf.contrib.framework.get_global_step(),
        optimizer='SGD',
        learning_rate=0.01
    )

    # Return the predicted class IDs, the loss, and the training op
    return tf.argmax(logits, 1), loss, train_op

custom_classifier = learn.Estimator(model_fn=fully_connected_model)
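
To make the label encoding and the loss in `fully_connected_model` more concrete, here is a plain-Python sketch of one-hot encoding, softmax cross-entropy, and argmax for a single example. It mirrors the math only; the helper names `one_hot`, `softmax`, `cross_entropy`, and `argmax` are hypothetical stand-ins and not the TensorFlow ops themselves.

```python
import math


def one_hot(label, n_classes):
    """Encode an integer label as a vector with a single 1.0."""
    return [1.0 if i == label else 0.0 for i in range(n_classes)]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Softmax cross-entropy between logits and a one-hot target."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


def argmax(values):
    """Index of the largest value, i.e. the predicted class."""
    return max(range(len(values)), key=lambda i: values[i])


# With equal logits every class gets probability 1/3, so the loss is log(3)
uniform_loss = cross_entropy([0.0, 0.0, 0.0], one_hot(1, 3))
predicted_class = argmax([0.1, 2.0, -1.0])
```

The model function averages this per-example loss over the batch with `tf.reduce_mean` and returns the argmax of the logits as its prediction.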

## Train the Custom Model

We train the custom model with the same number of steps and the same batch size as before.

In [None]:
custom_classifier.fit(x=train_set[0],
                      y=train_set[1],
                      batch_size=100,
                      steps=10000)

## Test the Model

In [None]:
# Model Testing
score = custom_classifier.evaluate(x=test_set[0], y=test_set[1],
                                   metrics={'accuracy': MetricSpec(tfmetrics.streaming_accuracy)})
print('Accuracy: {0:f}'.format(score['accuracy']))

## Summary and Note on Accuracy

This is a great starting point for going deeper with TensorFlow. We have implemented a DNN twice: once with a pre-built model from `tf.contrib.learn` and once as our own model built with `tf.contrib.layers`. Moving forward, we will expand on the custom model. The raw code of the final model is in the `model.py` file in this repository.

One thing I want to address is that the accuracy of the custom model may be lower than that of the pre-built model. We will cover optimizers in future posts, but the short version is that the optimizer used by the `DNNClassifier` class converges better on this dataset. You may also have noticed that we never specified a learning rate when defining the `DNNClassifier`. That is because the `DNNClassifier` used an Adagrad optimizer with its default settings, whereas our custom model uses traditional stochastic gradient descent (SGD) with an explicit learning rate of 0.01. We will go over these a little more in later posts.
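
As a rough preview of the difference, the sketch below shows textbook single-parameter update rules for SGD and Adagrad. It is an illustration only: the function names `sgd_step` and `adagrad_step` and the `eps` term are assumptions made for this example, and TensorFlow's own optimizers differ in details such as how the accumulator is initialised.

```python
import math


def sgd_step(weight, grad, learning_rate):
    """Plain SGD: move against the gradient by a fixed step size."""
    return weight - learning_rate * grad


def adagrad_step(weight, grad, accum, learning_rate, eps=1e-8):
    """Adagrad: scale the step by the history of squared gradients."""
    accum = accum + grad * grad
    weight = weight - learning_rate * grad / (math.sqrt(accum) + eps)
    return weight, accum


# One update from weight 1.0 with gradient 2.0
w_sgd = sgd_step(1.0, 2.0, learning_rate=0.01)
w_ada, accum = adagrad_step(1.0, 2.0, accum=0.0, learning_rate=0.01)
```

Because Adagrad divides each step by the accumulated gradient magnitude, its effective step size adapts per parameter, which is one reason the two optimizers can behave differently when run for the same number of steps.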