In [None]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

In [None]:
# Imports
import numpy as np
import tensorflow as tf

tf.logging.set_verbosity(tf.logging.INFO)

Let's build a model to classify the images in the MNIST dataset using the following CNN architecture:

* __Convolutional Layer #1:__ Applies 32 5x5 filters (extracting 5x5-pixel subregions), with ReLU activation function
* __Pooling Layer #1:__ Performs max pooling with a 2x2 filter and stride of 2 (which specifies that pooled regions do not overlap)
* __Convolutional Layer #2:__ Applies 64 5x5 filters, with ReLU activation function
* __Pooling Layer #2:__ Again, performs max pooling with a 2x2 filter and stride of 2
* __Dense Layer #1:__ 1,024 neurons, with dropout regularization rate of 0.4 (probability of 0.4 that any given element will be dropped during training)
* __Dense Layer #2 (Logits Layer):__ 10 neurons, one for each digit target class (0–9).
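
The shapes that flow through this architecture can be traced with plain arithmetic. The sketch below is pure Python rather than TensorFlow code. The helper names `conv_same` and `pool` are hypothetical, and it assumes a 28x28 single-channel input and "same" padding for both convolutions.

```python
def conv_same(shape, filters):
    """Shape after a stride-1 convolution with "same" padding: only channels change."""
    height, width, _ = shape
    return (height, width, filters)


def pool(shape, size=2, stride=2):
    """Shape after max pooling without padding."""
    height, width, channels = shape
    return ((height - size) // stride + 1, (width - size) // stride + 1, channels)


shape = (28, 28, 1)  # assumed MNIST input: height, width, channels
shapes = [shape]
for step in (lambda s: conv_same(s, 32), pool, lambda s: conv_same(s, 64), pool):
    shape = step(shape)
    shapes.append(shape)

# Number of values per example fed into Dense Layer #1 after flattening
flat_features = shape[0] * shape[1] * shape[2]
print(shapes, flat_features)
```

The batch dimension is left out because none of these layers changes it.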

The tf.layers module contains methods to create each of the three layer types above:

* `conv2d()`. Constructs a two-dimensional convolutional layer. Takes number of filters, filter kernel size, padding, and activation function as arguments.
* `max_pooling2d()`. Constructs a two-dimensional pooling layer using the max-pooling algorithm. Takes pooling filter size and stride as arguments.
* `dense()`. Constructs a dense layer. Takes number of neurons and activation function as arguments.

Each of these methods accepts a tensor as input and returns a transformed tensor as output. This makes it easy to connect one layer to another: just take the output from one layer-creation method and supply it as input to another.

### Convolutional Layer #1
In our first convolutional layer, we want to apply 32 5x5 filters to the input layer, with a ReLU activation function. We can use the conv2d() method in the layers module to create this layer as follows:

In [None]:
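# Reshape the input to [batch_size, height, width, channels]; -1 lets TensorFlow infer the batch size
input_layer = tf.reshape(features["x"], [-1, 28, 28, 1])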

In [3]:
conv1 = tf.layers.conv2d(
    inputs=input_layer,
    filters=32,
    kernel_size=[5, 5],
    padding="same",
    activation=tf.nn.relu)


Our output tensor produced by conv2d() has a shape of [batch_size, 28, 28, 32]: the same height and width dimensions as the input, but now with 32 channels holding the output from each of the filters.
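
To see why "same" padding keeps the height and width unchanged, here is a minimal pure-Python sketch of a single filter applied with zero padding, stride 1, and ReLU. It illustrates the idea only and is not how `conv2d()` is implemented. The function name `conv2d_same_relu` and the toy all-ones image and filter are assumptions made for the example, and the bias term is omitted.

```python
def conv2d_same_relu(image, kernel):
    """Apply one odd-sized square filter to a 2-D image with zero padding, stride 1, and ReLU."""
    height, width = len(image), len(image[0])
    size = len(kernel)
    pad = size // 2
    output = []
    for row in range(height):
        out_row = []
        for col in range(width):
            total = 0.0
            for i in range(size):
                for j in range(size):
                    y, x = row + i - pad, col + j - pad
                    # Positions outside the image count as zero ("same" padding).
                    if 0 <= y < height and 0 <= x < width:
                        total += image[y][x] * kernel[i][j]
            out_row.append(max(0.0, total))  # ReLU
        output.append(out_row)
    return output


image = [[1.0] * 28 for _ in range(28)]  # toy 28x28 input of ones
kernel = [[1.0] * 5 for _ in range(5)]   # toy 5x5 filter of ones
feature_map = conv2d_same_relu(image, kernel)
print(len(feature_map), len(feature_map[0]))
```

Each filter yields one 28x28 feature map, so 32 filters stacked along the last axis give the 32 channels described above.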

### Pooling Layer #1
Next, we connect our first pooling layer to the convolutional layer we just created. We can use the max_pooling2d() method in layers to construct a layer that performs max pooling with a 2x2 filter and stride of 2:

In [None]:
pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

Our output tensor produced by max_pooling2d() (pool1) has a shape of [batch_size, 14, 14, 32]: the 2x2 filter reduces height and width by 50% each.
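
The halving can be checked on a small example. The following is a minimal pure-Python sketch of 2x2 max pooling with stride 2 on a single channel. The function name `max_pool_2x2` and the 4x4 input are made up for illustration.

```python
def max_pool_2x2(feature_map):
    """Max pooling with a 2x2 window and stride 2 on one 2-D channel (even height and width)."""
    pooled = []
    for row in range(0, len(feature_map) - 1, 2):
        pooled_row = []
        for col in range(0, len(feature_map[0]) - 1, 2):
            # Each output value is the maximum of one non-overlapping 2x2 block.
            pooled_row.append(max(
                feature_map[row][col], feature_map[row][col + 1],
                feature_map[row + 1][col], feature_map[row + 1][col + 1]))
        pooled.append(pooled_row)
    return pooled


toy_map = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [9, 2, 1, 0],
    [3, 4, 5, 6],
]
pooled = max_pool_2x2(toy_map)
print(pooled)  # [[6, 8], [9, 6]]
```

A 4x4 input becomes 2x2, just as a 28x28 feature map becomes 14x14.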

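### Convolutional Layer #2
Our second convolutional layer takes pool1 as input and applies 64 5x5 filters with ReLU activation: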

In [None]:
conv2 = tf.layers.conv2d(
    inputs=pool1,
    filters=64,
    kernel_size=[5, 5],
    padding="same",
    activation=tf.nn.relu)

### Pooling Layer #2

In [None]:
pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

Pooling layer #2 takes conv2 as input and produces pool2 as output. pool2 has shape [batch_size, 7, 7, 64]: a 50% reduction in height and width from conv2.
Note that the last dimension (the channel count) grew from 32 to 64 because the second convolutional layer applies 64 filters, while the height and width were halved by max pooling.

### Dense Layer
Next, we add a dense layer with 1,024 neurons and ReLU activation. Before connecting it, we flatten our feature map (pool2) to shape [batch_size, features], so that the tensor has only two dimensions:

In [None]:
pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])

In [None]:
dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
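
As a rough sanity check on the size of this layer, the arithmetic below counts its trainable parameters. It is a sketch that assumes one weight per input-unit pair plus one bias per unit. The helper name `dense_parameter_count` is hypothetical.

```python
def dense_parameter_count(inputs, units, use_bias=True):
    """Weights (inputs x units) plus one bias per unit for a fully connected layer."""
    return inputs * units + (units if use_bias else 0)


flat_features = 7 * 7 * 64  # values per example after flattening pool2
dense_params = dense_parameter_count(flat_features, 1024)
print(flat_features, dense_params)  # 3136 3212288
```

Under these assumptions the dense layer holds roughly 3.2 million parameters.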

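We apply dropout to the output of the dense layer to reduce overfitting. With a rate of 0.4, each element has a probability of 0.4 of being dropped, and the `training` argument ensures that this happens only when the model runs in TRAIN mode: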

In [None]:
dropout = tf.layers.dropout(
    inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)
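
The effect of dropout can be sketched in pure Python. The version below is the common "inverted dropout" formulation, in which surviving elements are scaled by 1 / (1 - rate) so that the expected sum stays unchanged. The function name `dropout`, the seeded generator, and the all-ones activations are assumptions for illustration, not the library's implementation.

```python
import random


def dropout(values, rate, training, rng):
    """Zero each element with probability `rate` during training; scale survivors by 1 / (1 - rate)."""
    if not training:
        return list(values)  # dropout is a no-op outside training
    keep_prob = 1.0 - rate
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)       # seeded so the sketch is repeatable
activations = [1.0] * 10000  # toy activations
dropped = dropout(activations, rate=0.4, training=True, rng=rng)
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
print(kept_fraction, sum(dropped) / len(dropped))
```

About 60% of the elements should survive, and the mean of the output should stay close to the mean of the input.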

### Logits Layer
The final layer in our neural network is the logits layer, which returns the raw values for our predictions. We create a dense (fully connected) layer with 10 neurons (one for each target class 0–9), with linear activation (the default):



In [None]:
logits = tf.layers.dense(inputs=dropout, units=10)

### Calculating Loss
Our labels tensor contains the target class for each example, e.g. [1, 9, ...]. To calculate cross-entropy, we first convert the labels to vectors of 0s and 1s using one-hot encoding.
Then we compute the softmax cross-entropy between the one-hot labels and the logits:

In [None]:
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10)
loss = tf.losses.softmax_cross_entropy(
    onehot_labels=onehot_labels, logits=logits)
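
The two steps can be written out by hand. The sketch below is pure Python with hypothetical helper names (`one_hot`, `softmax`, `softmax_cross_entropy`) and made-up logits. It averages the per-example losses over the batch, which is an assumption about the reduction rather than a statement of what the library call does.

```python
import math


def one_hot(label, depth):
    """Vector of zeros with a single 1.0 at index `label`."""
    return [1.0 if i == label else 0.0 for i in range(depth)]


def softmax(logits):
    """Turn raw logits into probabilities; subtracting the max avoids overflow."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(onehot, logits):
    """Negative log-probability that softmax assigns to the true class."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(onehot, probs) if t > 0)


labels = [1, 9]                                  # target classes, as in the text
logits_batch = [[0.0] * 10, [0.0] * 9 + [5.0]]   # made-up logits for two examples
per_example = [softmax_cross_entropy(one_hot(y, 10), z)
               for y, z in zip(labels, logits_batch)]
batch_loss = sum(per_example) / len(per_example)
print(per_example, batch_loss)
```

The first example has uniform logits, so its loss is log(10). The second puts most of its probability on the correct class and has a much smaller loss.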

### Training
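Configure the optimizer, its learning rate, and the training operation. This cell is a fragment of the model function (`cnn_model_fn`), so it only runs inside that function: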

In [None]:
if mode == tf.estimator.ModeKeys.TRAIN:
  optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
  train_op = optimizer.minimize(
      loss=loss,
      global_step=tf.train.get_global_step())
  return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
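
Each training step moves every weight a small distance against its gradient. The sketch below shows that update rule on a toy one-parameter loss, (w - 3)^2, using the same learning rate of 0.001. The functions `loss_fn`, `grad_fn`, and `gradient_descent_step` are hypothetical stand-ins, not part of the model above.

```python
def gradient_descent_step(weight, gradient, learning_rate):
    """Plain gradient descent update: w <- w - learning_rate * dL/dw."""
    return weight - learning_rate * gradient


def loss_fn(weight):
    """Toy loss with its minimum at weight = 3."""
    return (weight - 3.0) ** 2


def grad_fn(weight):
    """Derivative of the toy loss with respect to the weight."""
    return 2.0 * (weight - 3.0)


weight = 0.0
learning_rate = 0.001
losses = []
for _ in range(1000):
    losses.append(loss_fn(weight))
    weight = gradient_descent_step(weight, grad_fn(weight), learning_rate)
print(losses[0], losses[-1], weight)
```

With such a small learning rate the loss falls steadily but slowly, which is why training typically needs many steps.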

### Evaluation Metrics

In [None]:
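# Predicted class and per-class probabilities derived from the logits.
# The name "softmax_tensor" lets the logging hook below refer to this tensor.
predictions = {
    "classes": tf.argmax(input=logits, axis=1),
    "probabilities": tf.nn.softmax(logits, name="softmax_tensor")}
eval_metric_ops = {
    "accuracy": tf.metrics.accuracy(
        labels=labels, predictions=predictions["classes"])}
return tf.estimator.EstimatorSpec(
    mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)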
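
Accuracy is the fraction of examples whose predicted class (the index of the largest logit) matches the label. The sketch below computes it for a single batch in pure Python. The helper names `argmax` and `accuracy` and the three-class logits are made up for illustration, and unlike a metric accumulated over many batches it keeps no running totals.

```python
def argmax(values):
    """Index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(labels, logits_batch):
    """Fraction of examples whose highest-scoring class equals the label."""
    predicted = [argmax(logits) for logits in logits_batch]
    correct = sum(1 for y, p in zip(labels, predicted) if y == p)
    return correct / len(labels)


toy_labels = [1, 0, 2]
toy_logits = [
    [0.1, 2.0, 0.3],  # predicts class 1 (correct)
    [1.5, 0.2, 0.1],  # predicts class 0 (correct)
    [0.9, 0.8, 0.1],  # predicts class 0 (wrong, label is 2)
]
toy_accuracy = accuracy(toy_labels, toy_logits)
print(toy_accuracy)
```

Two of the three toy examples are classified correctly, so the result is 2/3.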

### Create the estimator

In [None]:
# Create the Estimator
mnist_classifier = tf.estimator.Estimator(
    model_fn=cnn_model_fn, model_dir="/tmp/mnist_convnet_model")

### Configure Logging

In [None]:
# Set up logging for predictions
tensors_to_log = {"probabilities": "softmax_tensor"}
logging_hook = tf.train.LoggingTensorHook(
  tensors=tensors_to_log, every_n_iter=50)

### Load training and test data

In [None]:
# Load training and eval data
mnist = tf.contrib.learn.datasets.load_dataset("mnist")
train_data = mnist.train.images # Returns np.array
train_labels = np.asarray(mnist.train.labels, dtype=np.int32)
eval_data = mnist.test.images # Returns np.array
eval_labels = np.asarray(mnist.test.labels, dtype=np.int32)

### References
* tf.layers tutorial: https://www.tensorflow.org/tutorials/layers
* Cross entropy: https://www.youtube.com/watch?v=tRsSi_sqXjI
* Estimator: https://www.tensorflow.org/get_started/custom_estimators