## STATS 507 HW10 Problem 3: Build a Convolutional Neural Network using Estimators  
Code adapted from https://www.tensorflow.org/tutorials/estimators/cnn  
The __`tf.layers`__ module provides a high-level API that makes it easy to construct a neural network. It provides methods that facilitate the creation of dense (fully connected) layers and convolutional layers, adding activation functions, and applying dropout regularization. In this tutorial, you'll learn how to use layers to build a convolutional neural network model to recognize the handwritten digits in the MNIST data set.

### Building the CNN MNIST Classifier

Let's build a model to classify the images in the MNIST dataset using the following CNN architecture:

* __Convolutional Layer #1__: Applies 32 5x5 filters (extracting 5x5-pixel subregions), with ReLU activation function
* __Pooling Layer #1__: Performs max pooling with a 2x2 filter and stride of 2 (which specifies that pooled regions do not overlap)
* __Convolutional Layer #2__: Applies 64 5x5 filters, with ReLU activation function
* __Pooling Layer #2__: Again, performs max pooling with a 2x2 filter and stride of 2
* __Dense Layer #1__: 1,024 neurons, with dropout regularization rate of 0.4 (probability of 0.4 that any given element will be dropped during training)
* __Dense Layer #2 (Logits Layer)__: 10 neurons, one for each digit target class (0–9).
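
To see how the tensor shape changes through this architecture, the layers above can be traced with a few lines of plain Python. This is only a sketch: it assumes `"same"` padding with stride 1 for both convolutions (as the model function in this notebook uses), and the helper names `conv_same` and `max_pool` are made up for illustration.

```python
def conv_same(shape, filters):
    """'same' padding with stride 1 keeps height and width; channels become `filters`."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool(shape, pool=2, stride=2):
    """Unpadded pooling: each spatial dimension becomes (n - pool) // stride + 1."""
    height, width, channels = shape
    return ((height - pool) // stride + 1, (width - pool) // stride + 1, channels)


shape = (28, 28, 1)  # one grayscale MNIST image: height, width, channels
trace = [("input", shape)]
for name, step in [
    ("conv1", lambda s: conv_same(s, 32)),
    ("pool1", max_pool),
    ("conv2", lambda s: conv_same(s, 64)),
    ("pool2", max_pool),
]:
    shape = step(shape)
    trace.append((name, shape))

# Number of features per image once the last pooling output is flattened
flat_size = shape[0] * shape[1] * shape[2]

for name, layer_shape in trace:
    print(name, layer_shape)
print("flattened features:", flat_size)
```

The final pooling output is 7x7x64, which is where the `7 * 7 * 64` in the flattening step of the model function comes from.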

The `tf.layers` module contains methods to create each of the three layer types above:

* `conv2d()`. Constructs a two-dimensional convolutional layer. Takes number of filters, filter kernel size, padding, and activation function as arguments.
* `max_pooling2d()`. Constructs a two-dimensional pooling layer using the max-pooling algorithm. Takes pooling filter size and stride as arguments.
* `dense()`. Constructs a dense layer. Takes number of neurons and activation function as arguments.

Each of these methods accepts a tensor as input and returns a transformed tensor as output. This makes it easy to connect one layer to another: just take the output from one layer-creation method and supply it as input to another.
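
The "output of one layer is the input of the next" pattern, and what 2x2 max pooling with stride 2 does, can be illustrated without TensorFlow. The sketch below uses nested Python lists in place of tensors; `relu` and `max_pool_2x2` are hypothetical stand-ins, not the `tf.layers` implementations, and the input values are invented.

```python
def relu(image):
    """Element-wise ReLU: negative values become 0."""
    return [[max(0, value) for value in row] for row in image]


def max_pool_2x2(image):
    """Max pooling with a 2x2 window and stride 2, so windows never overlap."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


# Hypothetical 4x4 feature map (made-up values)
feature_map = [
    [1, -3, 2, 0],
    [4, 2, -1, -5],
    [-2, -2, 0, 7],
    [-6, -1, 3, 3],
]

# Chain the layers: the output of one call is the input to the next
activated = relu(feature_map)
pooled = max_pool_2x2(activated)
print(pooled)  # [[4, 2], [0, 7]]
```

Each 2x2 block of the input contributes exactly one value to the output, which is why the height and width are halved.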

Add the following __`cnn_model_fn`__ function, which conforms to the interface expected by TensorFlow's Estimator API (more on this later in Create the Estimator). This function takes MNIST feature __data, labels, and mode__ (one of the `tf.estimator.ModeKeys` values `TRAIN`, `EVAL`, or `PREDICT`) as arguments; configures the CNN; and returns an `EstimatorSpec` carrying the __predictions, loss, and training operation__ appropriate to that mode:

In [16]:
from __future__ import absolute_import, division, print_function

import tensorflow as tf
import numpy as np

tf.logging.set_verbosity(tf.logging.INFO)

### Model function for CNN

In [17]:
def cnn_model_fn(features, labels, mode):
    
    '''Input Layer: [batch_size, image_height, image_width, channels]
    batch size: -1, which specifies that this dimension should be dynamically computed based on the number of input values in features["x"], holding the size of all other dimensions constant.'''
    
    input_layer = tf.reshape(features["x"], [-1, 28, 28, 1])

    # Convolutional Layer #1:
    # filters, kernel_size: apply 32 5x5 filters to the input layer, with ReLU activation.
    # padding="same": the output tensor keeps the same height and width as the input tensor;
    # TensorFlow adds 0 values to the edges of the input tensor to preserve height and width of 28.
    
    conv1 = tf.layers.conv2d(
      inputs=input_layer,
      filters=32,
      kernel_size=[5, 5],
      padding="same",
      activation=tf.nn.relu)

    # Pooling Layer #1
    pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

    # Convolutional Layer #2 and Pooling Layer #2
    conv2 = tf.layers.conv2d(
      inputs=pool1,
      filters=64,
      kernel_size=[5, 5],
      padding="same",
      activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

    # Dense Layer
    pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
    dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
    dropout = tf.layers.dropout(
        inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)

    # Logits Layer
    logits = tf.layers.dense(inputs=dropout, units=10)
    
    predictions = {
        # Generate predictions (for PREDICT and EVAL mode)
        "classes": tf.argmax(input=logits, axis=1),
        # Add `softmax_tensor` to the graph. It is used for PREDICT and by the
        # `logging_hook`.
        "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
    }
    
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    # Calculate Loss (for both TRAIN and EVAL modes)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

    # Configure the Training Op (for TRAIN mode)
    if mode == tf.estimator.ModeKeys.TRAIN:
        optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
        train_op = optimizer.minimize(
            loss=loss,
            global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    # Add evaluation metrics (for EVAL mode)
    eval_metric_ops = {
      "accuracy": tf.metrics.accuracy(
          labels=labels, predictions=predictions["classes"])
    }
    
    return tf.estimator.EstimatorSpec(
        mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)
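
The model function turns the 10 logits into class probabilities with softmax and scores them against an integer label with sparse softmax cross-entropy. The following is a minimal pure-Python sketch of that arithmetic for a single example, not TensorFlow's implementation; the three-class logits are invented for illustration, and the real `tf.losses.sparse_softmax_cross_entropy` additionally reduces the per-example values over the batch to one scalar.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_softmax_cross_entropy(label, logits):
    """Negative log-probability assigned to the true class index `label`."""
    return -math.log(softmax(logits)[label])


# Hypothetical logits for a 3-class problem (MNIST would have 10)
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

# Equivalent of tf.argmax(logits, axis=1) for one example
predicted_class = max(range(len(probs)), key=probs.__getitem__)

loss_if_correct = sparse_softmax_cross_entropy(label=0, logits=logits)
loss_if_wrong = sparse_softmax_cross_entropy(label=2, logits=logits)

print("probabilities:", probs)
print("predicted class:", predicted_class)
print("loss when true label is 0:", loss_if_correct)
print("loss when true label is 2:", loss_if_wrong)
```

The loss is small when the model puts high probability on the true class and grows as that probability shrinks, which is the quantity the gradient descent optimizer minimizes.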

### Training and Evaluating the CNN MNIST Classifier
#### Set up

In [18]:
# Load training and eval data
((train_data, train_labels),
 (eval_data, eval_labels)) = tf.keras.datasets.mnist.load_data()

train_data = train_data/np.float32(255) # 60,000 images
train_labels = train_labels.astype(np.int32)
eval_data = eval_data/np.float32(255) # 10,000 images
eval_labels = eval_labels.astype(np.int32)

# Create the Estimator: model_fn = model function, model_dir = directory where the model data (checkpoints) will be saved.
mnist_classifier = tf.estimator.Estimator(
    model_fn=cnn_model_fn, model_dir="/tmp/mnist_convnet_model")

# Set up logging for predictions: track progress during training
tensors_to_log = {"probabilities": "softmax_tensor"}

logging_hook = tf.train.LoggingTensorHook(
    tensors=tensors_to_log, every_n_iter=50)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/mnist_convnet_model', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x1301e02e8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


#### Training

In [19]:
# Train the model
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": train_data},
    y=train_labels,
    batch_size=100,
    num_epochs=None,
    shuffle=True)

# Train one step and display the probabilities
mnist_classifier.train(
    input_fn=train_input_fn,
    steps=1,
    hooks=[logging_hook])

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/mnist_convnet_model/model.ckpt-2002
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 2002 into /tmp/mnist_convnet_model/model.ckpt.
INFO:tensorflow:probabilities = [[0.01930306 0.00022307 0.94605637 0.01312365 0.00112087 0.0065746
  0.01026562 0.0001486  0.0015208  0.00166354]
 [0.00051872 0.00049125 0.02507609 0.12930958 0.22346169 0.1163171
  0.13362776 0.05573069 0.05072848 0.2647387 ]
 [0.00530184 0.00005542 0.9848638  0.00452356 0.00088718 0.00171237
  0.00150962 0.00007353 0.00095083 0.00012193]
 [0.01906956 0.02903073 0.15976055 0.36301047 0.01674925 0.17639239
  0.11490732 0.01679629 0.07522941 0.02905399]
 [0.9013267  0.00002989 0.01757689 0.00195442 0.00026924 0.06277401
  0.00237916 0.00113978 0.01226654 0

INFO:tensorflow:loss = 0.91013736, step = 2003
INFO:tensorflow:Saving checkpoints for 2003 into /tmp/mnist_convnet_model/model.ckpt.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
INFO:tensorflow:Loss for final step: 0.91013736.


<tensorflow_estimator.python.estimator.estimator.Estimator at 0x1301e0048>

In [20]:
mnist_classifier.train(input_fn=train_input_fn, steps=1000)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/mnist_convnet_model/model.ckpt-2003
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 2003 into /tmp/mnist_convnet_model/model.ckpt.
INFO:tensorflow:loss = 0.71231604, step = 2004
INFO:tensorflow:global_step/sec: 6.42397
INFO:tensorflow:loss = 0.6226307, step = 2104 (15.568 sec)
INFO:tensorflow:global_step/sec: 6.72287
INFO:tensorflow:loss = 0.6071113, step = 2204 (14.874 sec)
INFO:tensorflow:global_step/sec: 6.688
INFO:tensorflow:loss = 0.60804164, step = 2304 (14.953 sec)
INFO:tensorflow:global_step/sec: 6.5446
INFO:tensorflow:loss = 0.61811644, step = 2404 (15.279 sec)
INFO:tensorflow:global_step/sec: 6.67699
INFO:tensorflow:loss = 0.3932787, step = 2504 (14.977 sec)
INFO:tensorflow:global_step/sec: 6.71637
INFO:te

<tensorflow_estimator.python.estimator.estimator.Estimator at 0x1301e0048>
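
Because `num_epochs=None` lets the input function cycle through the data indefinitely, the amount of training is controlled by `steps`. A rough sketch of how steps translate into passes over the data is shown below. It assumes the 60,000 training images returned by `tf.keras.datasets.mnist.load_data()`, and, for the cumulative figure, that every earlier step restored from the checkpoint also used a batch size of 100 (the log does not confirm this). The helper `epochs_covered` is hypothetical.

```python
def epochs_covered(steps, batch_size, num_examples):
    """Approximate number of passes over the data after `steps` mini-batches."""
    return steps * batch_size / num_examples


BATCH_SIZE = 100      # batch_size passed to numpy_input_fn
NUM_TRAIN = 60000     # size of the Keras MNIST training split

# The call above: steps=1000
this_call = epochs_covered(1000, BATCH_SIZE, NUM_TRAIN)

# Global step 3003, the checkpoint this run ends on
# (assumes all earlier steps also used BATCH_SIZE examples)
cumulative = epochs_covered(3003, BATCH_SIZE, NUM_TRAIN)

print("epochs in this call: %.2f" % this_call)
print("epochs since step 0: %.2f" % cumulative)
```

Note that the log starts at step 2003 rather than 1 because the Estimator restores the latest checkpoint found in `model_dir` and continues from it.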

#### Evaluate model: accuracy

In [21]:
eval_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": eval_data},
    y=eval_labels,
    num_epochs=1,
    shuffle=False)

eval_results = mnist_classifier.evaluate(input_fn=eval_input_fn)
print(eval_results)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2019-05-02T08:29:24Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/mnist_convnet_model/model.ckpt-3003
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2019-05-02-08:29:29
INFO:tensorflow:Saving dict for global step 3003: accuracy = 0.8974, global_step = 3003, loss = 0.38864443
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 3003: /tmp/mnist_convnet_model/model.ckpt-3003
{'accuracy': 0.8974, 'loss': 0.38864443, 'global_step': 3003}
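
The reported accuracy is the fraction of evaluation images whose predicted class matches the label. A minimal sketch of that calculation follows; the toy labels and predictions are invented, and `accuracy` here is a plain-Python illustration rather than `tf.metrics.accuracy`, which accumulates the same ratio across batches.

```python
def accuracy(labels, predicted_classes):
    """Fraction of positions where the predicted class equals the label."""
    if len(labels) != len(predicted_classes):
        raise ValueError("labels and predictions must have the same length")
    correct = sum(1 for y, p in zip(labels, predicted_classes) if y == p)
    return correct / len(labels)


# Hypothetical labels and predictions for five images
toy_labels = [7, 2, 1, 0, 4]
toy_predicted = [7, 2, 1, 0, 9]
toy_accuracy = accuracy(toy_labels, toy_predicted)

# Reading the result above: 0.8974 accuracy on the 10,000 evaluation images
correct_on_eval = round(0.8974 * 10000)

print("toy accuracy:", toy_accuracy)
print("correctly classified evaluation images:", correct_on_eval)
```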
