# Using the TensorFlow Estimator APIs

In this lab, we'll explore TensorFlow's high-level [`tf.estimator`](https://www.tensorflow.org/programmers_guide/estimators) APIs, which make it easy to build, train, evaluate, and use neural network (NN) models.

We'll do this via both the 'MNIST' dataset, and [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist/blob/master/doc/img/fashion-mnist-sprite.png), which is a direct drop-in replacement for the original MNIST dataset.
(You can read more about it, and why it was created, [here](https://github.com/zalandoresearch/fashion-mnist). It is a more challenging dataset than 'regular' MNIST, which has become too easy these days.)

The lab starts with a [`LinearClassifier`](https://www.tensorflow.org/api_docs/python/tf/estimator/LinearClassifier), then uses a [`DNNClassifier`](https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier) with multiple hidden layers.

As part of the lab, we'll explore what [TensorBoard](https://www.tensorflow.org/get_started/summaries_and_tensorboard) can do.



**If you're running this notebook on colab**, download the dataset.py file from the repo:

In [None]:
%%bash
wget https://raw.githubusercontent.com/amygdala/tensorflow-workshop/master/workshop_sections/high_level_APIs/dataset.py
ls -l dataset.py

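Do some imports and check your version of TensorFlow. This notebook uses TensorFlow 1.x APIs such as `tf.Session` and `tf.contrib`, so the version must be >=1.4, and ideally >=1.7; it will not run unmodified on TensorFlow 2.x.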

In [None]:
from __future__ import absolute_import, division, print_function
import numpy as np
import os
import time

import tensorflow as tf
import dataset

print(tf.__version__)

# define a utility function for generating a new directory in which to save 
# model information, so multiple training runs don't stomp on each other.
def get_new_path(name=""):
    base = os.path.abspath("/tmp/tfmodels/mnist_estimators")
    logpath = os.path.join(base, name + "_" + str(int(time.time())))
    print("Logging to {}".format(logpath))
    return logpath

## Getting started: A Linear Classifier

First, let's build a LinearClassifier. 

We'll first build the model's input functions.

We'll use [Datasets](https://www.tensorflow.org/get_started/datasets_quickstart) to manage the input to our model. The [`tf.data`](https://www.tensorflow.org/api_docs/python/tf/data) module contains a collection of classes that allows you to easily load data, manipulate it, and pipe it into your model. 
[Datasets support highly scalable and performant input pipelines](https://www.tensorflow.org/performance/datasets_performance), and it is best practice to use them where possible.
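
To make the shuffle, batch, and repeat stages used in the next cell more concrete, here is a minimal pure-Python sketch of the same ideas. It is only an illustration and not how `tf.data` is implemented; the helper names `shuffle`, `batch`, and `repeat` are hypothetical stand-ins chosen to mirror the `Dataset` method names.

```python
import random


def shuffle(items, buffer_size, seed=0):
    """Yield items in pseudo-random order using a bounded buffer."""
    rng = random.Random(seed)
    buf = []
    for item in items:
        buf.append(item)
        if len(buf) >= buffer_size:
            # Buffer is full: emit one randomly chosen element.
            yield buf.pop(rng.randrange(len(buf)))
    # Input exhausted: drain whatever is left, in random order.
    rng.shuffle(buf)
    for item in buf:
        yield item


def batch(items, batch_size):
    """Group consecutive items into lists of at most batch_size."""
    current = []
    for item in items:
        current.append(item)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current  # final, possibly smaller, batch


def repeat(make_items, count):
    """Replay the iterable returned by make_items() count times."""
    for _ in range(count):
        for item in make_items():
            yield item


# Ten toy "examples", shuffled with a small buffer, batched, repeated twice.
pipeline = repeat(lambda: batch(shuffle(range(10), buffer_size=4), 3), 2)
for b in pipeline:
    print(b)
```

With `buffer_size=1` this sketch emits items in their original order, and with a buffer at least as large as the input it shuffles everything. That is the randomness versus memory trade-off behind the choice of shuffle buffer size.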


In [None]:
DATA_DIR = "/tmp/MNIST_data"
NUM_STEPS = 5000
BATCH_SIZE = 100

def train_input_fn(data_dir, batch_size=100):
  """Prepare data for training."""

  # When choosing shuffle buffer sizes, larger sizes result in better
  # randomness, while smaller sizes use less memory. MNIST is small
  # enough that a buffer covering most of an epoch fits easily in memory.
  ds = dataset.train(data_dir)
  ds = ds.cache().shuffle(buffer_size=50000).batch(batch_size=batch_size)

  # Repeat the dataset for up to 40 epochs. Training stops earlier if the
  # `steps` limit passed to train() is reached first.
  ds = ds.repeat(40)
  features = ds.make_one_shot_iterator().get_next()
  return {'pixels': features[0]}, features[1]


def eval_input_fn(data_dir, batch_size=100):
  features = dataset.test(data_dir).batch(
      batch_size=batch_size).make_one_shot_iterator().get_next()
  return {'pixels': features[0]}, features[1]

Now, we'll define and train the LinearClassifier model.
Note that we didn't need to explicitly define a model graph or a training loop ourselves.


In [None]:
feature_columns = [tf.feature_column.numeric_column(
    "pixels", shape=784)]

linear_classifier = tf.estimator.LinearClassifier(
        feature_columns=feature_columns, 
        n_classes=10,
        model_dir=get_new_path("linear")
    )

train_input = lambda: train_input_fn(
    DATA_DIR,
    batch_size=BATCH_SIZE
)
linear_classifier.train(input_fn=train_input, steps=NUM_STEPS)

Once we've trained the model, we'll run the evaluate() method, which uses the trained model. To do this, it loads the most recent checkpointed model info available. The model checkpoint(s) are generated during the training process.


In [None]:
# Evaluate
eval_input = lambda: eval_input_fn(
    DATA_DIR,
    batch_size=BATCH_SIZE
)
results = linear_classifier.evaluate(input_fn=eval_input)
print(results)

(Note that the model accuracy is not great; we'll get back to that.)

We can also use the model to make a few predictions.   
Note: If you wanted to actually deploy and serve the model, in order to support scalable predictions, you'd want to export it in a specific `SavedModel` format.  We'll get to that in a later example.

In [None]:
# predictions

def predict_input_fn():
  features = dataset.test(DATA_DIR).take(5).batch(batch_size=1).make_one_shot_iterator().get_next()
  return {'pixels': features[0]}, features[1]

predictions = linear_classifier.predict(input_fn=predict_input_fn)

for prediction in predictions:
    print("Predictions:    {} with probabilities {}\n".format(
        prediction["classes"], prediction["probabilities"]))  


In [None]:
# Bonus: What are the labels for these predictions?
# This will fail if matplotlib is not installed. You can just skip it if so.

import matplotlib.pyplot as plt
%matplotlib inline

pred_next_item = dataset.test(DATA_DIR).take(5).batch(batch_size=1).make_one_shot_iterator().get_next()
# Use a context manager so the session is closed when we're done.
with tf.Session() as sess:
  while True:
    try:
      item = sess.run(pred_next_item)
      pred_label = item[1]
      pred_image = item[0]
      print("label: %s" % pred_label)
      sample = np.reshape(pred_image, (28,28))
      plt.figure()
      plt.imshow(sample, 'gray')
    except tf.errors.OutOfRangeError:
      # The iterator is exhausted after the 5 examples.
      break

## DNNClassifier: try a Deep Neural Net on the same task

Next, let's see if a Deep Neural Net, with multiple hidden layers, does better at classification of these images.
We'll use a [`DNNClassifier`](https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier) with multiple hidden layers.

First, set some variables (note that we'll train for more steps this time):

In [None]:
DATA_DIR = "/tmp/MNIST_data"
NUM_STEPS = 15000
BATCH_SIZE = 100

Next, we'll define a `DNNClassifier`, and run its `train()` method, which will train the model. Again, note that we didn't need to explicitly define a model graph or a training loop ourselves.

You'll notice that this code looks much the same as that above, aside from a couple of additional parameters when defining the model.
We can use the same train and eval input functions as above.
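
Since the training input function repeats the data 40 times, it can help to check how the step counts relate to epochs. The arithmetic below is a small sketch; it assumes the standard 60,000-example training split (an assumption about what `dataset.train` returns), and the helper names are hypothetical.

```python
TRAIN_EXAMPLES = 60000  # assumption: standard MNIST / Fashion-MNIST training split
BATCH_SIZE = 100
REPEAT_COUNT = 40       # from ds.repeat(40) in train_input_fn


def steps_per_epoch(num_examples, batch_size):
    """Number of batches in one pass over the data (ceiling division)."""
    return -(-num_examples // batch_size)


def epochs_covered(num_steps, num_examples=TRAIN_EXAMPLES,
                   batch_size=BATCH_SIZE):
    """How many passes over the data a given number of steps amounts to."""
    return num_steps / steps_per_epoch(num_examples, batch_size)


# Total batches the input function can supply before it runs out.
max_batches = steps_per_epoch(TRAIN_EXAMPLES, BATCH_SIZE) * REPEAT_COUNT
print("batches available: {}".format(max_batches))

for name, num_steps in [("linear", 5000), ("dnn", 15000)]:
    print("{}: {} steps is about {:.1f} epochs".format(
        name, num_steps, epochs_covered(num_steps)))
```

Under these assumptions one epoch is 600 steps and the input function can supply 24,000 batches, so both `NUM_STEPS` values used in this lab stop training before the input runs out.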


First, let's try training the DNNClassifier with a .1 learning rate.

In [None]:
feature_columns = [tf.feature_column.numeric_column(
    "pixels", shape=784)]

LR = .1

dnn_classifier = tf.estimator.DNNClassifier(
        feature_columns=feature_columns,
        n_classes=10,
        hidden_units=[128, 32],
        optimizer=tf.train.ProximalAdagradOptimizer(learning_rate=LR),
        model_dir=get_new_path("dnn")
    )

dnn_classifier.train(input_fn=train_input, steps=NUM_STEPS)

Now we'll evaluate the trained model. Note the accuracy.

In [None]:
# Evaluate

results = dnn_classifier.evaluate(input_fn=eval_input)
print(results)

Next, let's try using a .5 learning rate.

In [None]:
LR5 = .5

dnn_classifier5 = tf.estimator.DNNClassifier(
        feature_columns=feature_columns,
        n_classes=10,
        hidden_units=[128, 32],
        optimizer=tf.train.ProximalAdagradOptimizer(learning_rate=LR5),
        model_dir=get_new_path("dnn5")
    )

dnn_classifier5.train(input_fn=train_input, steps=NUM_STEPS)

In [None]:
# Evaluate
results = dnn_classifier5.evaluate(input_fn=eval_input)
print(results)
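
To build some intuition for what the learning rate changes between the two runs, here is a toy sketch of the basic Adagrad update rule applied to a one-dimensional quadratic. It is not the `ProximalAdagradOptimizer` implementation, the function `adagrad_minimize` is hypothetical, and the starting accumulator value of 0.1 is an assumption. A real loss surface is far more complicated, so this says nothing about which rate will score better on MNIST.

```python
import math


def adagrad_minimize(learning_rate, x0=1.0, num_steps=20,
                     initial_accumulator=0.1):
    """Minimize f(x) = x**2 with a basic Adagrad update; return x history."""
    x = x0
    accumulator = initial_accumulator  # running sum of squared gradients
    history = [x]
    for _ in range(num_steps):
        grad = 2.0 * x  # derivative of x**2
        accumulator += grad * grad
        # The step is scaled by the learning rate and shrinks as the
        # accumulated squared gradient grows.
        x -= learning_rate * grad / math.sqrt(accumulator)
        history.append(x)
    return history


for lr in (0.1, 0.5):
    history = adagrad_minimize(lr)
    print("lr={}: first step moved x by {:.3f}, final x = {:.4f}".format(
        lr, history[0] - history[1], history[-1]))
```

The first step with the larger rate is five times as long as with the smaller one, because the update is directly proportional to the learning rate.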

<hr>
To compare your results, let's start up TensorBoard! 

**Note**: If you're running this notebook on **colab**, you will not be able to run TensorBoard from the notebook, so you will need to skip this step.

You can start it as follows in a new terminal window. (If you get a 'not found' error, make sure you've activated your virtual environment in that new window):

```sh
$ tensorboard --logdir=/tmp/tfmodels/mnist_estimators
```
Then open http://localhost:6006 in your browser.

Alternatively, run the following (select Kernel --> Interrupt from the menu when you're done).

In [None]:
!tensorboard --logdir=/tmp/tfmodels/mnist_estimators

## Fashion MNIST! and `tf.estimator.train_and_evaluate()`

Next, let's look at our results with a data set that's harder: [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist#get-the-data).

<img src="https://storage.googleapis.com/amy-jo/images/fashion-mnist-sprite%20_sm.png" width="40%"
         alt="Fashion MNIST">

If you haven't already downloaded the Fashion-MNIST files, you can do so as follows. **If you've already downloaded them, you don't need to do so again.**

In [None]:
%%bash
mkdir -p fashion_mnist
cd fashion_mnist
wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
gunzip *
cd ..

If wget is not installed on your machine, try **replacing** the `wget` lines with:
```
curl -o train-images-idx3-ubyte.gz http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
curl -o train-labels-idx1-ubyte.gz http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
curl -o t10k-images-idx3-ubyte.gz http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
curl -o t10k-labels-idx1-ubyte.gz http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
```
or [download directly from the site](https://github.com/zalandoresearch/fashion-mnist#get-the-data).

Confirm that everything looks okay. You want the files to be **unzipped**.

In [None]:
%%bash
ls -l fashion_mnist

### `tf.estimator.train_and_evaluate()`

TensorFlow’s version 1.4 release [introduced](https://cloud.google.com/blog/big-data/2018/02/easy-distributed-training-with-tensorflow-using-tfestimatortrain-and-evaluate-on-cloud-ml-engine) the [`tf.estimator.train_and_evaluate`](https://www.tensorflow.org/api_docs/python/tf/estimator/train_and_evaluate) function, which simplifies training, evaluation, and exporting of Estimator models. It abstracts away the details of distributed execution for training and evaluation, while also supporting consistent behavior across local/non-distributed and distributed configurations.

For this example, we'll use `tf.estimator.train_and_evaluate` instead of making separate 'train' and 'evaluate' calls.
To keep this example simple, we're not including model export.
We'll show that in a later lab.

In [None]:
# edit path to directory as necessary
FASHION_DATA_DIR = "fashion_mnist" 

train_input_fashion = lambda: train_input_fn(
    FASHION_DATA_DIR,
    batch_size=BATCH_SIZE
)
eval_input_fashion = lambda: eval_input_fn(
    FASHION_DATA_DIR,
    batch_size=BATCH_SIZE
)

feature_columns = [tf.feature_column.numeric_column(
    "pixels", shape=784)]

LR = .1

run_config = tf.estimator.RunConfig()
run_config = run_config.replace(model_dir=get_new_path("fashion_dnn"))

fashion_dnn_classifier = tf.estimator.DNNClassifier(
        feature_columns=feature_columns,
        n_classes=10,
        hidden_units=[128, 32],
        optimizer=tf.train.ProximalAdagradOptimizer(learning_rate=LR),
        config=run_config
    )

train_spec = tf.estimator.TrainSpec(train_input_fashion,
                                  max_steps=NUM_STEPS
                                  )

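# While not shown here, we can also add a model 'exporter' to the EvalSpec.
# The eval input is not repeated, so each evaluation ends when the test set
# is exhausted, well before `steps` batches have been read.
eval_spec = tf.estimator.EvalSpec(eval_input_fashion,
                                steps=NUM_STEPS,
                                name='fashion-eval'
                                )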


Add another metric to the estimator -- *recall* -- using TensorFlow's built-in metrics.

In [None]:
def my_recall(labels, predictions):
  return {'recall': tf.metrics.recall(labels, predictions['class_ids'])}

In [None]:
# add the recall metric to the estimator
fashion_dnn_classifier = tf.contrib.estimator.add_metrics(fashion_dnn_classifier, my_recall)
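
One thing to keep in mind when reading this metric: `tf.metrics.recall` casts both labels and predictions to `bool`, so with integer class ids it behaves like a binary recall in which class 0 is "negative" and every other class is "positive". The pure-Python sketch below mimics that behavior on made-up values; `binary_recall` is a hypothetical helper, not a TensorFlow function.

```python
def binary_recall(labels, predictions):
    """Recall = TP / (TP + FN), after casting both inputs to bool."""
    labels = [bool(x) for x in labels]
    predictions = [bool(x) for x in predictions]
    true_pos = sum(1 for l, p in zip(labels, predictions) if l and p)
    false_neg = sum(1 for l, p in zip(labels, predictions) if l and not p)
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


# Made-up class ids for five examples.
labels = [0, 3, 7, 0, 9]
class_ids = [0, 5, 7, 2, 0]

# The second example (label 3, predicted 5) still counts as a true
# positive, because both values are non-zero after the cast.
print(binary_recall(labels, class_ids))
```

So a high value here means the model rarely predicts class 0 for examples of other classes; it is not a per-class or averaged multi-class recall.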

## Exercise

Before running `train_and_evaluate`, try adding a *precision* metric too.

In [None]:
# your new code here

In [None]:
tf.estimator.train_and_evaluate(fashion_dnn_classifier,
                                train_spec,
                                eval_spec)

You can see that the accuracy is significantly worse than with 'regular' MNIST. This dataset is harder! 


We can again make some predictions using our trained model:

In [None]:
# predictions

def predict_input_fn():
  features = dataset.test(FASHION_DATA_DIR).skip(5575).take(5).batch(batch_size=1).make_one_shot_iterator().get_next()
  return {'pixels': features[0]}, features[1]

predictions = fashion_dnn_classifier.predict(input_fn=predict_input_fn)

for prediction in predictions:
    print("Predictions:    {} with probabilities {}\n".format(
        prediction["classes"], prediction["probabilities"]))
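
The predicted class ids are easier to read with names attached. The sketch below maps an id to the label names listed in the Fashion-MNIST README; the helper `describe` and the probability vector are made up for illustration and do not come from a real model run.

```python
# Label names as listed in the Fashion-MNIST README (index = class id).
FASHION_LABELS = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def describe(probabilities):
    """Return (class_id, label_name, probability) for the most likely class."""
    class_id = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_id, FASHION_LABELS[class_id], probabilities[class_id]


# A made-up probability vector standing in for prediction["probabilities"].
example = [0.01, 0.02, 0.05, 0.02, 0.10, 0.0, 0.75, 0.0, 0.05, 0.0]
class_id, name, prob = describe(example)
print("class {} ({}) with probability {:.2f}".format(class_id, name, prob))
```

In the loop above, the same helper could be applied to each `prediction["probabilities"]` array.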

In [None]:
# Bonus: What are the labels for these predictions?
# This will fail if matplotlib is not installed. You can just skip it if so.

import matplotlib.pyplot as plt
%matplotlib inline

pred_next_item = dataset.test(FASHION_DATA_DIR).skip(5575).take(5).batch(batch_size=1).make_one_shot_iterator().get_next()
# Use a context manager so the session is closed when we're done.
with tf.Session() as sess:
  while True:
    try:
      item = sess.run(pred_next_item)
      pred_label = item[1]
      pred_image = item[0]
      print("label: %s" % pred_label)
      sample = np.reshape(pred_image, (28,28))
      plt.figure()
      plt.imshow(sample, 'gray')
    except tf.errors.OutOfRangeError:
      # The iterator is exhausted after the 5 examples.
      break

<hr>
Let's compare results again using TensorBoard.

**Note**: If you're running this notebook on **colab**, you will not be able to run TensorBoard from the notebook, so you will need to skip this step.

If TensorBoard is still running in a terminal window from before, it should pick up the new data automatically, since we pointed it to the parent directory of all the training runs we're doing. (If it doesn't seem to have done so, just reload).
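
Each training run shows up separately because `get_new_path` gives every model its own timestamped subdirectory under the shared parent. The sketch below recreates that layout in a throwaway temp directory and lists it; the run names and timestamps are hypothetical, and `list_runs` is an illustrative helper rather than anything TensorBoard provides.

```python
import os
import tempfile


def list_runs(logdir):
    """Return the sorted names of the immediate subdirectories of logdir."""
    return sorted(
        name for name in os.listdir(logdir)
        if os.path.isdir(os.path.join(logdir, name)))


# Stand-in for /tmp/tfmodels/mnist_estimators, with made-up run names.
parent = tempfile.mkdtemp()
for run_name in ["linear_1520000000", "dnn_1520000100",
                 "fashion_dnn_1520000200"]:
    os.makedirs(os.path.join(parent, run_name))

for run in list_runs(parent):
    print(run)
```

Running `list_runs("/tmp/tfmodels/mnist_estimators")` on your own machine should show one entry per call to `get_new_path` made so far.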

Otherwise, start up TensorBoard as follows in a new terminal window. (If you get a 'not found' error, make sure you've activated your virtual environment in that new window):

```sh
$ tensorboard --logdir=/tmp/tfmodels/mnist_estimators
```
Then open http://localhost:6006 in your browser.

Or run the following (select Kernel --> Interrupt from the menu when you're done):

In [None]:
!tensorboard --logdir=/tmp/tfmodels/mnist_estimators

## Exercise

Try training a DNNClassifier model, using Fashion MNIST, with a .5 learning rate. 
Does this training do better or worse than the .1 learning rate on the Fashion MNIST dataset?

In [None]:
LR5 = .5
# Your edits here
fashion_dnn_classifier5 = ...

...

tf.estimator.train_and_evaluate(fashion_dnn_classifier5,
                                train_spec,
                                eval_spec)

Did this training run do better or worse than the .1 learning rate on Fashion MNIST?

Copyright 2018 Google Inc. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
