# VGG-like CNN Custom Estimator for MNIST, built with Keras layers

In this example, we'll look at how to build a Custom Estimator (a CNN model) using Keras layers to define the model. This version of the model is based on
[this blog post](http://www.sas-programming.com/2017/09/a-vgg-like-cnn-for-fashion-mnist-with.html), which describes
a [VGG-like](http://www.robots.ox.ac.uk/~vgg) CNN. (VGG is a deep convolutional network for object
recognition developed and trained by Oxford's Visual Geometry Group, which achieved
good performance on the ImageNet dataset.) We'll train the model on Fashion-MNIST, which uses the same file format as MNIST.


First, do some imports and define some variables.

**If you're running this notebook on Colab**, download the `dataset.py` file from the repo:

In [None]:
%%bash
wget https://raw.githubusercontent.com/amygdala/tensorflow-workshop/master/workshop_sections/mnist_series/mnist_cnn_custom_estimator/dataset.py
ls -l dataset.py

In [None]:
"""Convolutional Neural Network Custom Estimator for MNIST, built with Keras layers."""

from __future__ import absolute_import, division, print_function
import argparse
import os
import numpy as np
import time
import dataset

import tensorflow as tf
from tensorflow import keras

from tensorflow.python.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.python.keras.layers import Conv2D, MaxPooling2D, LeakyReLU
from tensorflow.python.keras import backend as K

BATCH_SIZE = 100

MODEL_DIR = os.path.join("/tmp/tfmodels/mnist_cnn_estimator",
                          "keras_vgglike" + str(int(time.time())))
# This is too short for proper training (especially with 'Fashion-MNIST'), 
# but we'll use it here to make the notebook quicker to run.
NUM_STEPS = 5000

print("using model dir: %s" % MODEL_DIR)
# TensorFlow version should be >= 1.4.
print(tf.__version__)

### Download Fashion-MNIST

Next, download Fashion-MNIST if you haven't already done so.  
If you have, skip the next two cells and just edit `DATA_DIR` to point to the correct directory.

In [None]:
%%bash
mkdir -p fashion_mnist
cd fashion_mnist
wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
gunzip *
cd ..

In [None]:
%%bash
ls -l fashion_mnist
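
If the listing looks right but you'd like to confirm that the files decompressed cleanly, you can inspect the header of each file. The sketch below assumes the standard IDX layout used by MNIST and Fashion-MNIST: a big-endian 32-bit magic number (2051 for image files, 2049 for label files) followed by one 32-bit size per dimension. `read_idx_header` is a hypothetical helper written for illustration; it is not part of `dataset.py`.

```python
import struct

# Magic numbers from the IDX format used by MNIST-style files.
IMAGE_MAGIC = 2051  # 0x00000803: unsigned bytes, 3 dimensions
LABEL_MAGIC = 2049  # 0x00000801: unsigned bytes, 1 dimension


def read_idx_header(raw):
    """Parse an IDX header from the first bytes of a file.

    Returns (magic, dims), where dims is a tuple of dimension sizes.
    """
    (magic,) = struct.unpack(">i", raw[:4])
    num_dims = magic & 0xFF  # the low byte stores the number of dimensions
    dims = struct.unpack(">" + "i" * num_dims, raw[4:4 + 4 * num_dims])
    return magic, dims


# Hypothetical usage against the files downloaded above:
# with open("fashion_mnist/train-images-idx3-ubyte", "rb") as f:
#     print(read_idx_header(f.read(16)))
```

For the training images, a healthy file should report the image magic number and dimensions of 60000 x 28 x 28.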

Edit the following value as appropriate.

In [None]:
DATA_DIR = 'fashion_mnist'
# DATA_DIR = '/tmp/MNIST_data'

Define the model function that will be used in constructing the Estimator.

Note use of the Keras layers.
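
The model function in the next cell stacks four convolutions and two max-pooling layers before flattening. As a reading aid, the sketch below traces the spatial size through that stack in plain Python. It assumes the Keras defaults that the model relies on (stride-1 convolutions, pooling stride equal to the pool size, no pooling padding); `LAYERS` and `trace_shapes` are hypothetical names used only for this illustration.

```python
# Layer specs mirror the model function:
# (name, kind, kernel or pool size, padding, output channels).
LAYERS = [
    ("conv1", "conv", 3, "same", 32),
    ("conv2", "conv", 3, "same", 64),
    ("pool1", "pool", 2, "valid", 64),
    ("conv3", "conv", 3, "same", 128),
    ("conv4", "conv", 3, "valid", 256),
    ("pool2", "pool", 3, "valid", 256),
]


def trace_shapes(size=28):
    """Return [(name, (height, width, channels)), ...] for a square input."""
    shapes = []
    for name, kind, k, padding, channels in LAYERS:
        if kind == "conv":
            # Stride-1 convolution: "same" keeps the size, "valid" trims k - 1.
            size = size if padding == "same" else size - k + 1
        else:
            # Max pooling with stride equal to the pool size and no padding.
            size = (size - k) // k + 1
        shapes.append((name, (size, size, channels)))
    return shapes


shapes = trace_shapes()
for name, shape in shapes:
    print(name, shape)

height, width, channels = shapes[-1][1]
flattened_size = height * width * channels
print("flattened:", flattened_size)
```

The trace ends at a 4 x 4 x 256 feature map, so the first `Dense` layer sees 4096 inputs per example.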

In [None]:
def cnn_model_fn(features, labels, mode):
  """Model function for CNN."""

  # Input Layer
  # Reshape X to 4-D tensor: [batch_size, height, width, channels]
  # MNIST images are 28x28 pixels, and have one color channel
  input_layer = tf.reshape(features["pixels"], [-1, 28, 28, 1])

  # Tell Keras whether we are training, so that the Dropout layers are only
  # active in TRAIN mode.
  if mode == tf.estimator.ModeKeys.TRAIN:
    K.set_learning_phase(True)
  else:
    K.set_learning_phase(False)

  conv1 = Conv2D(filters=32, kernel_size=(3, 3), padding="same",
            input_shape=(28,28,1), activation='relu')(input_layer)
  conv2 = Conv2D(filters=64, kernel_size=(3, 3), padding="same", activation='relu')(conv1)
  pool1 = MaxPooling2D(pool_size=(2, 2))(conv2)
  dropout1 = Dropout(0.5)(pool1)
  conv3 = Conv2D(filters=128, kernel_size=(3, 3), padding="same", activation='relu')(dropout1)
  conv4 = Conv2D(filters=256, kernel_size=(3, 3), padding="valid", activation='relu')(conv3)
  pool2 = MaxPooling2D(pool_size=(3, 3))(conv4)
  dropout2 = Dropout(0.5)(pool2)
  pool2_flat = Flatten()(dropout2)
  dense1 = Dense(256)(pool2_flat)
  lrelu = LeakyReLU()(dense1)
  dropout3 = Dropout(0.5)(lrelu)
  dense2 = Dense(256)(dropout3)
  lrelu2 = LeakyReLU()(dense2)
  logits = Dense(10, activation='linear')(lrelu2)

  predictions = {
      # Generate predictions (for PREDICT and EVAL mode)
      "classes": tf.argmax(input=logits, axis=1),
      # Add `softmax_tensor` to the graph. It is used for PREDICT and by the
      # `logging_hook`.
      "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
  }
  # Reuse the same tensors for the exported serving signature, rather than
  # adding a second softmax op (which would be renamed `softmax_tensor_1`).
  prediction_output = tf.estimator.export.PredictOutput(predictions)

  if mode == tf.estimator.ModeKeys.PREDICT:
    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions,
        export_outputs={tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: prediction_output})

  # Calculate Loss (for both TRAIN and EVAL modes)
  onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10)
  loss = tf.losses.softmax_cross_entropy(
      onehot_labels=onehot_labels, logits=logits)
  # Generate some summary info
  tf.summary.scalar('loss', loss)
  tf.summary.histogram('conv1', conv1)
  tf.summary.histogram('dense', dense1)

  # Configure the Training Op (for TRAIN mode)
  if mode == tf.estimator.ModeKeys.TRAIN:
    optimizer = tf.train.AdamOptimizer(learning_rate=1e-4)
    train_op = optimizer.minimize(
        loss=loss,
        global_step=tf.train.get_global_step())

    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

  # Add evaluation metrics (for EVAL mode)
  eval_metric_ops = {
      "accuracy": tf.metrics.accuracy(
          labels=labels, predictions=predictions["classes"])}
  return tf.estimator.EstimatorSpec(
      mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)
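
To make the loss in `cnn_model_fn` concrete, here is a minimal sketch of softmax cross-entropy for a single example in plain Python. It illustrates the arithmetic only and is not TensorFlow's implementation; `softmax` and `cross_entropy` are hypothetical helpers, and the TensorFlow op additionally reduces the per-example values across the batch.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label, depth=10):
    """Cross-entropy between softmax(logits) and a one-hot label."""
    onehot = [1.0 if i == label else 0.0 for i in range(depth)]
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(onehot, probs))


# An untrained model that is equally unsure about all 10 classes
# has a loss of log(10), roughly 2.3026.
print(cross_entropy([0.0] * 10, label=3))

# The predicted class is the index of the largest probability.
probs = softmax([0.1, 2.0, -1.0])
print(probs.index(max(probs)))
```

A loss near 2.3 at the start of training is therefore expected for a 10-class problem; it should fall as the logits for the correct classes grow.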

Define input functions for reading in data.

In [None]:
def train_input_fn(data_dir, batch_size=100):
  """Prepare data for training."""

  # When choosing shuffle buffer sizes, larger sizes result in better
  # randomness, while smaller sizes use less memory. MNIST is a small
  # enough dataset that a large buffer (here 50,000 examples) is affordable.
  ds = dataset.train(data_dir)
  ds = ds.cache().shuffle(buffer_size=50000).batch(batch_size=batch_size)

  # Iterate through the dataset a set number of times
  # during each training session.
  ds = ds.repeat(40)
  features = ds.make_one_shot_iterator().get_next()
  return {'pixels': features[0]}, features[1]


def eval_input_fn(data_dir, batch_size=100):
  """Prepare data for evaluation."""
  features = dataset.test(data_dir).batch(
      batch_size=batch_size).make_one_shot_iterator().get_next()
  return {'pixels': features[0]}, features[1]
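
Because `train_input_fn` batches before it repeats, the number of batches the pipeline can supply is easy to work out. The sketch below does that arithmetic, assuming `dataset.train()` yields the standard 60,000 training examples (an assumption; check your copy of `dataset.py`). `batches_available` is a hypothetical helper.

```python
import math


def batches_available(num_examples, batch_size, repeats):
    """Batches produced by `.batch(batch_size).repeat(repeats)`.

    Batching happens before repeating, so each pass over the data ends
    with one (possibly smaller) final batch.
    """
    return math.ceil(num_examples / batch_size) * repeats


NUM_TRAIN_EXAMPLES = 60000  # assumption: standard (Fashion-)MNIST training set size

total = batches_available(NUM_TRAIN_EXAMPLES, batch_size=100, repeats=40)
# Passes over the data consumed by a 5000-step run at batch size 100.
epochs_used = 5000 / (NUM_TRAIN_EXAMPLES / 100)
print(total, round(epochs_used, 2))
```

Under these assumptions the pipeline can supply 24,000 batches, while a 5000-step run consumes only about 8.3 passes over the data, so training stops on the step limit rather than on running out of input.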

Create the Estimator object.

In [None]:
# Create the Estimator
mnist_classifier = tf.estimator.Estimator(
    model_fn=cnn_model_fn, model_dir=MODEL_DIR)

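Define the `TrainSpec` and `EvalSpec` that configure training and evaluation. Each takes the input function to use and some other config; the `EvalSpec` also takes an exporter, which writes out a SavedModel when training finishes.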

In [None]:
# Input functions for training and evaluation
train_input = lambda: train_input_fn(
    DATA_DIR,
    batch_size=BATCH_SIZE
)
eval_input = lambda: eval_input_fn(
    DATA_DIR,
    batch_size=BATCH_SIZE
)

# Set up logging for predictions
# Log the values of the "softmax_tensor" op under the label "probabilities"
tensors_to_log = {"probabilities": "softmax_tensor"}
logging_hook = tf.train.LoggingTensorHook(
    tensors=tensors_to_log, every_n_iter=2000)

train_spec = tf.estimator.TrainSpec(train_input,
                                  max_steps=NUM_STEPS,
                                  hooks=[logging_hook]
                                  )

# The exported model accepts a batch of flattened 28x28 images (784 floats each).
def serving_input_receiver_fn():
  feature_tensor = tf.placeholder(tf.float32, [None, 784])
  return tf.estimator.export.ServingInputReceiver(
      {'pixels': feature_tensor}, {'pixels': feature_tensor})

exporter = tf.estimator.FinalExporter('cnn_mnist', serving_input_receiver_fn)

# Add the model exporter to the EvalSpec, so that a SavedModel is written
# when training finishes.
eval_spec = tf.estimator.EvalSpec(eval_input,
                                steps=NUM_STEPS,
                                exporters=[exporter],
                                name='cnn_mnist_keras'
                                )


Now pass the Estimator, the `TrainSpec`, and the `EvalSpec` to `tf.estimator.train_and_evaluate()` to train and evaluate the model.

In [None]:
tf.estimator.train_and_evaluate(mnist_classifier,
                                train_spec,
                                eval_spec)

We can look at the characteristics of the exported model.

In [None]:
%env MODEL_DIR=$MODEL_DIR

In [None]:
%%bash
exported_model_dir=$(ls ${MODEL_DIR}/export/cnn_mnist)
saved_model_cli show --dir ${MODEL_DIR}/export/cnn_mnist/${exported_model_dir} --tag serve --all
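
The exported signature takes a `pixels` input of shape `[None, 784]`, as declared in `serving_input_receiver_fn`. The sketch below shows one way to turn a 28x28 image into a row of that shape in plain Python. It assumes the model was trained on pixel values scaled to the range 0 to 1 (check how `dataset.py` preprocesses images before relying on this); `to_pixels_row` is a hypothetical helper.

```python
def to_pixels_row(image_rows, scale=255.0):
    """Flatten a 28x28 grid of 0-255 pixel values into 784 floats.

    Rows are concatenated in order (row-major), which matches the
    reshape to [-1, 28, 28, 1] at the top of the model function.
    """
    if len(image_rows) != 28 or any(len(row) != 28 for row in image_rows):
        raise ValueError("expected a 28x28 image")
    return [value / scale for row in image_rows for value in row]


# A blank image with a single bright pixel at row 0, column 1.
image = [[0] * 28 for _ in range(28)]
image[0][1] = 255
row = to_pixels_row(image)
print(len(row), row[1])
```

A batch for the exported model is then a list of such rows, one per image.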

Now let's look at info about our training run in TensorBoard. 

**If you're running this notebook on Colab**, you'll need to skip this step.

Start up TensorBoard as follows in a new terminal window, pointing it to `MODEL_DIR` (if you get a 'not found' error, make sure you've activated your virtual environment in that new window):
```
$ tensorboard --logdir=<model_dir>
```

Try the following to compare across runs:

```
$ tensorboard --logdir=/tmp/tfmodels
```

Or run the following (select Kernel --> Interrupt from the menu when you're done):

In [None]:
!tensorboard --logdir=/tmp/tfmodels

Copyright 2017 The TensorFlow Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.