#A Guide to Training Models Using ```tf.keras``` and ```tf.estimator```

In TensorFlow, we can train models with either ```tf.keras``` or ```tf.estimator```. In this guide, we will examine how to train a model with each of them, as well as how to convert a ```tf.keras``` model into a ```tf.estimator``` model. Lastly, we will compare the advantages and disadvantages of both approaches.

##Setting up

First, let's set up our TensorFlow environment.

###Importing

In [0]:
# __future__ imports must be the first statement in the cell
from __future__ import absolute_import, division, print_function, unicode_literals

import warnings
warnings.filterwarnings('ignore')

import numpy as np
np.random.seed(123) # for reproducibility

import tensorflow as tf

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, MaxPool2D, Conv2D, Dense, Reshape, Dropout
from tensorflow.keras.utils import to_categorical

from tensorflow.keras.datasets import mnist

###Loading data

In [0]:
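# load MNIST as uint8 images of shape (N, 28, 28) with integer labels
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# add a trailing channel axis, since Conv2D expects (height, width, channels)
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
# convert to float32 and scale pixel values from [0, 255] to [0, 1]
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
# one-hot encode the labels for the 10 digit classes
Y_train = to_categorical(y_train, 10)
Y_test = to_categorical(y_test, 10)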

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
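
To make the preprocessing steps concrete, here is a small NumPy-only sketch that applies the same reshape, scaling, and one-hot encoding to a fake batch. The `one_hot` helper is a hypothetical stand-in for ```to_categorical```, and the random "images" are made up for illustration; they are not MNIST data.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical stand-in for to_categorical: integer labels -> one-hot rows."""
    encoded = np.zeros((labels.shape[0], num_classes), dtype='float32')
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

# Fake batch of four 28x28 "images" with pixel values in 0..255 (not real MNIST data)
rng = np.random.RandomState(123)
fake_images = rng.randint(0, 256, size=(4, 28, 28)).astype('uint8')
fake_labels = np.array([3, 0, 9, 3])

# Same steps as the cell above: add a channel axis, convert to float32, scale to [0, 1]
X = fake_images.reshape(fake_images.shape[0], 28, 28, 1).astype('float32')
X /= 255
Y = one_hot(fake_labels, 10)

print(X.shape, X.dtype)   # image batch now has a trailing channel axis
print(Y.shape, Y[0])      # each label becomes a row with a single 1.0
```

Each row of `Y` has exactly one non-zero entry, at the index of the digit it encodes, which is the format the ```categorical_crossentropy``` loss expects.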


##Training with ```tf.keras```

Our first scenario is training a model for this dataset with ```tf.keras```. First, we'll define the model architecture and then we will compile and train the model. Let's get started!

The model architecture we will be using is based on the tutorial found [here](https://www.tutorialspoint.com/tensorflow/tensorflow_keras.htm). I previously modified it in another GCI task to support TensorFlow 2.x and have modified it again here, so it should work well as an example Keras model for this guide.

###Defining model architecture

In [0]:
keras_model = Sequential()
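# pass the 3x3 kernel as a tuple: in tf.keras, Conv2D(32, 3, 3) means kernel_size=3 with strides=3
keras_model.add(Conv2D(32, (3, 3), activation = 'relu', input_shape = (28,28,1)))
keras_model.add(Conv2D(32, (3, 3), activation = 'relu'))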
keras_model.add(MaxPool2D(pool_size = (2,2)))
keras_model.add(Dropout(0.25))
keras_model.add(Flatten())
keras_model.add(Dense(128, activation = 'relu'))
keras_model.add(Dropout(0.5))
keras_model.add(Dense(10, activation = 'softmax'))

Instructions for updating:
If using Keras pass *_constraint arguments to layers.
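
Because the third positional argument of ```Conv2D``` in ```tf.keras``` is `strides`, it is worth checking the feature-map sizes by hand. The sketch below follows a 28x28 input through two unpadded 3x3 convolutions and a 2x2 max pool, once with stride 1 and once with stride 3. The helper names `valid_output_size` and `trace_shapes` are hypothetical, and the arithmetic assumes the default `'valid'` padding.

```python
def valid_output_size(size, window, stride):
    """Output length of a 'valid' (unpadded) convolution or pooling window."""
    return (size - window) // stride + 1

def trace_shapes(conv_stride):
    """Follow a 28x28 input through two 3x3 convolutions (32 filters) and a 2x2 max pool."""
    size = 28
    size = valid_output_size(size, 3, conv_stride)  # first Conv2D
    size = valid_output_size(size, 3, conv_stride)  # second Conv2D
    size = valid_output_size(size, 2, 2)            # MaxPool2D(pool_size=(2, 2))
    flattened = size * size * 32                    # Flatten over height, width, and 32 channels
    return size, flattened

print(trace_shapes(conv_stride=1))  # (12, 4608)
print(trace_shapes(conv_stride=3))  # (1, 32)
```

With stride 1 the ```Flatten``` layer passes 4608 values to the first ```Dense``` layer; with stride 3 it passes only 32, so very little spatial information survives.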


###Compiling model

In [0]:
keras_model.compile(loss = 'categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])

###Fitting model

In [0]:
keras_model.fit(X_train, Y_train, batch_size = 32, epochs = 10, verbose = 1)

Train on 60000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fdd88c043c8>

##Training with ```tf.estimator```

Next, we'll take a look at how to train a model for this dataset with ```tf.estimator```. We can take advantage of the premade estimators (specifically ```DNNClassifier```) and adapt one to our needs. Note that ```DNNClassifier``` builds a fully connected network, so it cannot reproduce the convolutional and pooling layers of the Keras model; for the sake of comparison, we reuse the layer sizes from before as hidden-layer widths. We will also use the same training set as before, as well as the same batch size and number of epochs. The number of training steps can be calculated as (total number of images)/(batch size) * (number of epochs) = 60000/32 * 10 = 18750.
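
The step count can be checked with a few lines of plain Python. The variable names are only for illustration, and `math.ceil` is used on the assumption that a final partial batch would still count as a step (here 60000 divides evenly by 32, so it makes no difference).

```python
import math

num_images = 60000   # size of the MNIST training set
batch_size = 32
num_epochs = 10

# One step processes one batch, so an epoch takes ceil(images / batch size) steps
steps_per_epoch = math.ceil(num_images / batch_size)
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, total_steps)  # 1875 18750
```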

###Reload data

In [0]:
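# reload the raw arrays: images of shape (N, 28, 28) with pixel values in [0, 255] and integer labels
(X_train, y_train), (X_test, y_test) = mnist.load_data()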

###Defining model architecture

In [0]:
estimator_model = tf.estimator.DNNClassifier(
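    feature_columns=[tf.feature_column.numeric_column("x", shape=[28, 28])], # the "x" feature holds one 28x28 image
    hidden_units=[32, 32, 128, 10], # widths of the fully connected hidden layers, reusing the sizes from the Keras model
    optimizer=tf.train.AdamOptimizer(), # TensorFlow 1.x name; in 2.x it is tf.compat.v1.train.AdamOptimizer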
    n_classes=10,
    dropout=0.25,
)
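
To get a feel for the size of this network, the sketch below counts the weights and biases of a plain stack of dense layers with these widths. It assumes that the 28x28 input is flattened to 784 values, that every layer has a bias, and that a final logits layer with `n_classes` units follows the hidden layers; `dense_param_count` is a hypothetical helper, not part of ```tf.estimator```.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total

input_size = 28 * 28              # flattened image
hidden_units = [32, 32, 128, 10]  # same widths as in the estimator above
n_classes = 10                    # assumed size of the final logits layer

layer_sizes = [input_size] + hidden_units + [n_classes]
print(dense_param_count(layer_sizes))  # 31800
```

Most of these parameters sit in the first layer, which connects all 784 pixels to 32 units.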

###Defining training inputs

In [0]:
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'x': X_train},
    y=y_train.astype('int32'),
    num_epochs=10,
    batch_size=32,
    shuffle=True,
)
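
Conceptually, an input function like this hands the estimator a stream of shuffled `(features, labels)` batches until the requested number of epochs is exhausted. The generator below is a simplified NumPy sketch of that idea, not how ```numpy_input_fn``` is actually implemented; `batch_generator`, its `seed` parameter, and the toy arrays are all hypothetical.

```python
import numpy as np

def batch_generator(features, labels, batch_size, num_epochs, shuffle=True, seed=123):
    """Yield ({'x': batch_features}, batch_labels) pairs, reshuffling once per epoch."""
    rng = np.random.RandomState(seed)
    num_examples = features.shape[0]
    for _ in range(num_epochs):
        order = rng.permutation(num_examples) if shuffle else np.arange(num_examples)
        for start in range(0, num_examples, batch_size):
            idx = order[start:start + batch_size]
            yield {'x': features[idx]}, labels[idx]

# Toy stand-in for MNIST: 64 blank images with labels cycling through 0..9
toy_features = np.zeros((64, 28, 28), dtype='uint8')
toy_labels = (np.arange(64) % 10).astype('int32')

batches = list(batch_generator(toy_features, toy_labels, batch_size=32, num_epochs=10))
print(len(batches))                                      # 64/32 * 10 = 20 batches
print(batches[0][0]['x'].shape, batches[0][1].shape)
```

The dictionary key `'x'` matches the name given to the feature column, which is how the estimator knows which array feeds which feature.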

###Training model

In [0]:
estimator_model.train(input_fn=train_input_fn, steps=18750)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmp2b8x8ooy/model.ckpt-18750
Instructions for updating:
Use standard file utilities to get mtimes.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 18750 into /tmp/tmp2b8x8ooy/model.ckpt.
INFO:tensorflow:loss = 34.545532, step = 18751
INFO:tensorflow:global_step/sec: 274.076
INFO:tensorflow:loss = 36.36592, step = 18851 (0.366 sec)
INFO:tensorflow:global_step/sec: 357.32
INFO:tensorflow:loss = 27.184296, step = 18951 (0.280 sec)
INFO:tensorflow:global_step/sec: 350.188
INFO:tensorflow:loss = 26.193417, step = 19051 (0.285 sec)
INFO:tensorflow:global_step/sec: 392.717
INFO:tensorflow:loss = 45.261253, step = 19151 (0.257 sec)
INFO:tensorflow:global_step/sec: 362.822
INFO:tensorflow:loss = 34.116203, step = 19251 (0.27

<tensorflow_estimator.python.estimator.canned.dnn.DNNClassifier at 0x7fdd88c17a90>

##Converting ```tf.keras``` model to ```tf.estimator``` model

We can convert a ```tf.keras``` model to a ```tf.estimator``` model by using ```tf.keras.estimator.model_to_estimator()``` as demonstrated below. We will pass the ```keras_model``` previously compiled as an argument to this function.

In [0]:
converted_model = tf.keras.estimator.model_to_estimator(keras_model)

##Comparison and conclusion

TensorFlow is incredibly diverse and flexible. Both ```tf.keras``` and ```tf.estimator``` are great to use for training models.

```tf.keras``` is a very high-level API that abstracts deep learning model components such as layers, loss functions, and optimizers, which makes it very easy for developers to quickly get a neural net up and running.

```tf.estimator``` is also quite high level: like ```tf.keras```, it provides a high-level abstraction over the low-level TensorFlow core operations.

When deciding which one to use, pick the one that best suits your specific needs. For example, some factors that might help you decide are:



*   ```tf.keras``` is often easier for beginners creating their own models, as the syntax is very easy to understand. An example of this is the ```Sequential``` model used in this guide, which makes it very easy to add layers.
*   ```tf.estimator``` comes with some very nice classifiers built in, such as ```DNNClassifier``` (the one used in this guide), ```LinearClassifier```, and more!
*   ```tf.estimator``` was designed with distributed training across multiple servers in mind. Recent versions of ```tf.keras``` can also distribute training through ```tf.distribute.Strategy```, so check what your TensorFlow version supports.
*   ```tf.estimator``` writes TensorBoard summaries (graphs, statistics, etc.) to its model directory automatically, where they can be saved and viewed. With ```tf.keras```, you get the same by adding the ```tf.keras.callbacks.TensorBoard``` callback.

Ultimately, ```tf.keras``` is well suited to beginners and quick experiments, while ```tf.estimator``` is a more complete package out of the box, with premade classifiers, distributed training, and TensorBoard logging that developers will want to take advantage of.

For more information, check out the documentation for ```tf.keras``` [here](https://www.tensorflow.org/api_docs/python/tf/keras), and the documentation for ```tf.estimator``` [here](https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator?version=stable).

