#  Build a Deep Learning Model with Keras vs tf.estimator API

Here we will use the [MNIST](https://www.tensorflow.org/tutorials/) dataset to show how to build a classification model with [Keras](https://keras.io/) and with the [tf.estimator API](https://www.tensorflow.org/api_docs/python/tf/estimator). We will compare the two models, and at the end I will share my opinion of both.

There are 4 sections:
- [Load dataset](#section1)
- [Keras model](#section2)
- [tf.estimator API](#section3)
- [tf.estimator on large dataset](#section4)

#### update tensorflow to the newest version

In [1]:
!pip install --upgrade tensorflow 

Collecting tensorflow
  Downloading https://files.pythonhosted.org/packages/b1/ad/48395de38c1e07bab85fc3bbec045e11ae49c02a4db0100463dd96031947/tensorflow-1.12.0-cp35-cp35m-manylinux1_x86_64.whl (83.1MB)
    100% |████████████████████████████████| 83.1MB 11kB/s eta 0:00:01
Collecting protobuf>=3.6.1 (from tensorflow)
  Downloading https://files.pythonhosted.org/packages/bf/d4/db7296a1407cad69f043537ba1e05afab3646451a066ead7a314d8714388/protobuf-3.6.1-cp35-cp35m-manylinux1_x86_64.whl (1.1MB)
    100% |████████████████████████████████| 1.1MB 803kB/s eta 0:00:01
Collecting absl-py>=0.1.6 (from tensorflow)
  Downloading https://files.pythonhosted.org/packages/0c/63/f505d2d4c21db849cf80bad517f0065a30be6b006b0a5637f1b95584a305/absl-py-0.6.1.tar.gz (94kB)
    100% |████████████████████████████████| 102kB 9.6MB/s eta 0:00:01
Collecting grpcio>=1.8.6 (from tensorflow)
  Downloading https://files.pythonhosted.org/packages/d9/94/7c634ccc859169ceebca7140532b5267a0b0ed5583

#### import libraries

In [2]:
import tensorflow as tf
from keras.datasets import mnist
from keras.layers import Flatten,Dense,Dropout
from keras.models import Sequential
import numpy as np
import pandas as pd
tf.__version__

Using TensorFlow backend.


'1.12.0'

#  <a name="section1"></a> Load data

In [3]:
(x_train, y_train),(x_test, y_test) = mnist.load_data()

Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz


#### Inspect data

Let's see how many examples are in the training set and the testing set. Each picture contains 28*28 pixels.

In [4]:
print("Training set feature shape: ", x_train.shape, "\n")
print("Training set label shape: ", y_train.shape, "\n")
# The test set has its own arrays; print those rather than the training ones again
print("Testing set feature shape: ", x_test.shape, "\n")
print("Testing set label shape: ", y_test.shape, "\n")

Training set feature shape:  (60000, 28, 28) 

Training set label shape:  (60000,) 

Testing set feature shape:  (10000, 28, 28) 

Testing set label shape:  (10000,) 



In [5]:
y_train[0:10]

array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4], dtype=uint8)

So each feature is a 28*28 grid of pixels with grey-scale values 0-255, and the labels are digits 0-9. Now let's normalize the features to between 0 and 1.

In [6]:
# normalize features to the range 0-1
x_train, x_test = x_train / 255.0, x_test / 255.0

In [7]:
# define input_size
dim=28

# <a name="section2"></a> Keras model
Let's run a baby example using Keras.

*Sequential* is defined in the keras.models module, and the layers come from [keras.layers](https://keras.io/layers/about-keras-layers/). Once we define **"model=Sequential()"**, we can use the model.add() function to add layers and so set the model's complexity.  
One more thing to notice: the first layer needs an **input_shape** argument to define the shape of the input tensor.  
How easy is that?!  
The last line of code, **"model.summary()"**, prints a table of the parameters of the model you just built.

In [8]:
model = Sequential()
model.add(Flatten(input_shape=(dim,dim,)))
model.add(Dense(512, activation=tf.nn.relu))
model.add(Dropout(0.2))
model.add(Dense(10, activation=tf.nn.softmax))
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
flatten_1 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 512)               401920    
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                5130      
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
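The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-output pair plus one bias per output unit. A quick sketch in plain Python (the helper name `dense_params` is only for illustration, it is not part of Keras):

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

dim = 28
flattened = dim * dim                   # Flatten turns 28x28 into 784 values
hidden = dense_params(flattened, 512)   # dense_1
output = dense_params(512, 10)          # dense_2

print(hidden, output, hidden + output)  # 401920 5130 407050
```

Flatten and Dropout have no weights, which is why they show 0 parameters.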


Once the model has been defined, we need to [compile](https://keras.io/models/model/#compile) it and pass arguments like *optimizer*, *loss* and *metrics*.

In [9]:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
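
To see what *sparse_categorical_crossentropy* measures, here is a minimal plain-Python sketch, not the Keras implementation. The label is an integer class index (which is why we did not one-hot encode `y_train`), and the loss for one example is the negative log of the probability the model assigns to that class. The function and the example probabilities are made up for illustration.

```python
import math

def sparse_categorical_crossentropy(label, probs):
    """Loss for one example: -log(probability of the true class)."""
    return -math.log(probs[label])

# Hypothetical softmax output over the 10 digits for one image
probs = [0.01, 0.01, 0.01, 0.01, 0.01, 0.90, 0.01, 0.01, 0.02, 0.01]

confident_right = sparse_categorical_crossentropy(5, probs)  # true digit is 5
confident_wrong = sparse_categorical_crossentropy(3, probs)  # true digit is 3

print(round(confident_right, 4))  # small loss
print(round(confident_wrong, 4))  # large loss
```

The loss reported during training is the average of this quantity over the examples in a batch.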

Then call the **fit** function and pass it the **feature tensors (or numpy arrays)** and the **labels**. 

In [10]:
model.fit(x_train, y_train, batch_size= 32, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f7966ae4898>

In [11]:
model.evaluate(x_test, y_test)



[0.068943760562455284, 0.9788]

It returns the loss and the metrics we defined before.  
I got an accuracy of 0.9788 on my eval set after 5 epochs of training from scratch.
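
The accuracy metric is simply the fraction of examples whose most probable class matches the label. A minimal plain-Python sketch with made-up probabilities for three classes (an illustration, not Keras code):

```python
def accuracy(labels, probs):
    """Fraction of rows where the argmax of probs equals the label."""
    correct = 0
    for label, row in zip(labels, probs):
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax
        correct += int(predicted == label)
    return correct / len(labels)

# Hypothetical softmax outputs for four examples and three classes
probs = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
]
labels = [0, 1, 2, 1]

print(accuracy(labels, probs))  # 0.75
```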

# <a name="section3"></a> tf.estimator API

#### DNNClassifier estimator API 
The following code in this section is modified from a [blog](https://codeburst.io/use-tensorflow-dnnclassifier-estimator-to-classify-mnist-dataset-a7222bf9f940) by *Macro Lanaro*.

To use the estimator API, we need to define an *"input function"*, *"feature columns"* and a *"classifier"*. 

In [49]:
# Define the training inputs
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": x_train},
    y=y_train.astype(np.int32),
    num_epochs=None,
    batch_size=50,
    shuffle=True
)

**tf.feature_column** is a powerful tool for defining different types of features. With its functions we can easily define different types of columns such as **bucketized_column, categorical_column_with_hash_bucket, categorical_column_with_identity**, so if the dataset needs feature engineering before training, this would be a good choice.  (See the other *feature_column* functions [here](https://www.tensorflow.org/api_docs/python/tf/feature_column))

In [45]:
# Specify feature
feature_columns = [tf.feature_column.numeric_column("x", shape=[28, 28])]
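
To give a feel for what a feature column such as **bucketized_column** does, here is a plain-Python sketch of bucketing a numeric value by boundaries and one-hot encoding the result. It only illustrates the idea and is not TensorFlow code; the function names and the boundaries are made up.

```python
import bisect

def bucketize(value, boundaries):
    """Return the index of the bucket that value falls into.

    With boundaries [b0, b1, ...] the buckets are
    (-inf, b0), [b0, b1), ..., [b_last, +inf).
    """
    return bisect.bisect_right(boundaries, value)

def one_hot(index, depth):
    """Encode a bucket index as a one-hot list."""
    return [1 if i == index else 0 for i in range(depth)]

boundaries = [0.25, 0.5, 0.75]  # hypothetical grey-level cut points
for pixel in (0.0, 0.25, 0.6, 1.0):
    idx = bucketize(pixel, boundaries)
    print(pixel, idx, one_hot(idx, len(boundaries) + 1))
```

For MNIST the raw grey-scale values work fine, so a plain numeric column is enough here.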

There are some pre-defined models (Regressors and Classifiers) in the estimator API. I will list some of them with short explanations here:  
[BaselineClassifier](https://www.tensorflow.org/api_docs/python/tf/estimator/BaselineClassifier): This classifier ignores feature values and will learn to predict the average value of each label. For single-label problems, this will predict the probability distribution of the classes as seen in the labels. For multi-label problems, this will predict the fraction of examples that are positive for each class. (**It basically does nothing.**)  
[BoostedTreesClassifier](https://www.tensorflow.org/api_docs/python/tf/estimator/BoostedTreesClassifier): A Classifier for Tensorflow Boosted Trees models. (I have not used TFBT yet, but if you want to know more about TFBT, you can find some sources [here](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/boosted_trees) and [there](http://ecmlpkdd2017.ijs.si/papers/paperID705.pdf))  
[DNNClassifier](https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier): A classifier for TensorFlow DNN models. (**It is easy to define a simple DNN model with an argument like "hidden_units=[1024, 512, 256]"**)  
[DNNLinearCombinedClassifier](https://www.tensorflow.org/api_docs/python/tf/estimator/DNNLinearCombinedClassifier): An estimator for TensorFlow Linear and DNN joined classification models. (**This is a more customized model. If you want a linear model for some columns and a DNN for others, this model would be great: "linear_feature_columns=[...], dnn_feature_columns=[..]"**)  
[LinearClassifier](https://www.tensorflow.org/api_docs/python/tf/estimator/LinearClassifier): Train a linear model to classify instances into one of multiple possible classes. When number of possible classes is 2, this is binary classification. (**It is logistic regression for binary problems, or a softmax layer for multiple classes; simple and easy**)



In [53]:
# Build 2 layer DNN classifier
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[256],
    optimizer=tf.train.AdamOptimizer(1e-4),
    n_classes=10,
    dropout=0.2,
    model_dir="./tmp/mnist_model"
)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_save_summary_steps': 100, '_save_checkpoints_steps': None, '_global_id_in_cluster': 0, '_task_id': 0, '_task_type': 'worker', '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f780c5a7278>, '_num_worker_replicas': 1, '_protocol': None, '_eval_distribute': None, '_master': '', '_is_chief': True, '_tf_random_seed': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_num_ps_replicas': 0, '_keep_checkpoint_every_n_hours': 10000, '_save_checkpoints_secs': 600, '_train_distribute': None, '_experimental_distribute': None, '_model_dir': './tmp/mnist_model', '_log_step_count_steps': 100, '_device_fn': None, '_keep_checkpoint_max': 5, '_service': None, '_evaluation_master': ''}


In [54]:
import shutil
shutil.rmtree("./tmp/mnist_model", ignore_errors = True)
classifier.train(input_fn=train_input_fn, steps=30000)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into ./tmp/mnist_model/model.ckpt.
INFO:tensorflow:loss = 121.802, step = 1
INFO:tensorflow:global_step/sec: 61.4317
INFO:tensorflow:loss = 68.5095, step = 101 (1.702 sec)
INFO:tensorflow:global_step/sec: 62.6354
INFO:tensorflow:loss = 49.4702, step = 201 (1.514 sec)
INFO:tensorflow:global_step/sec: 70.9626
INFO:tensorflow:loss = 37.7748, step = 301 (1.405 sec)
INFO:tensorflow:global_step/sec: 67.2957
INFO:tensorflow:loss = 31.825, step = 401 (1.485 sec)
INFO:tensorflow:global_step/sec: 52.5849
INFO:tensorflow:loss = 28.4717, step = 501 (1.902 sec)
INFO:tensorflow:global_step/sec: 62.6429
INFO:tensorflow:loss = 21.2868, step = 601 (1.596 sec)
INFO:tensorflow:global_step/sec: 62.5158
INFO:tensorflow:loss = 33.1921

<tensorflow.python.estimator.canned.dnn.DNNClassifier at 0x7f780c5a71d0>
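
Unlike Keras's `epochs`, the estimator counts training in steps, where one step processes one batch. A rough conversion in plain Python, using the numbers from the cells above:

```python
n_train = 60000      # training examples in MNIST
batch_size = 50      # from train_input_fn
steps = 30000        # from classifier.train(...)

epochs = steps * batch_size / n_train
print(epochs)  # 25.0

# For comparison, the Keras run used batch_size=32 for 5 epochs
keras_steps = 5 * n_train // 32
print(keras_steps)  # 9375
```

So this estimator run passes over the training data about 25 times, versus 5 times for the Keras model; keep that in mind when comparing the two accuracies.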

To estimate the accuracy of the model, we need to define another "input function", this time over the test set.

In [56]:
# Define the test inputs
test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": x_test},
    y=y_test.astype(np.int32),
    num_epochs=1,
    shuffle=False
)


In [57]:
# Evaluate accuracy
accuracy_score = classifier.evaluate(input_fn=test_input_fn)["accuracy"]
print("\nTest Accuracy: {0:f}%\n".format(accuracy_score*100))

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2019-01-15-01:08:47
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from ./tmp/mnist_model/model.ckpt-30000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2019-01-15-01:08:50
INFO:tensorflow:Saving dict for global step 30000: accuracy = 0.9809, average_loss = 0.0659267, global_step = 30000, loss = 8.34515
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 30000: ./tmp/mnist_model/model.ckpt-30000

Test Accuracy: 98.089999%



I got an accuracy of about 98.09% on the eval set.
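
The evaluation log reports both `average_loss` and `loss`. As far as I can tell, `average_loss` is the mean loss per example, while `loss` is the summed loss of a batch, averaged over the batches. The sketch below checks that reading against the logged numbers, assuming `test_input_fn` used a batch size of 128 (we did not set `batch_size`, and I am assuming 128 is the default).

```python
import math

n_test = 10000            # MNIST test examples
batch_size = 128          # assumed: numpy_input_fn default
average_loss = 0.0659267  # from the evaluation log

n_batches = math.ceil(n_test / batch_size)   # the last batch is smaller
total_loss = average_loss * n_test
loss_per_batch = total_loss / n_batches

print(n_batches)                  # 79
print(round(loss_per_batch, 3))   # 8.345
```

This agrees with the logged `loss = 8.34515`, so the two numbers describe the same result at different scales.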

# <a name="section4"></a> Create TensorFlow model using TensorFlow's Estimator API
tf.estimator

In previous sections, we learned how to use the **Keras** and **estimator** APIs to build a simple model. There, the dataset was downloaded and held in numpy arrays. When you need to handle a large dataset, say 10GB, it may not fit in memory as a pandas DataFrame or a numpy array. In that case the TF API needs to read the data directly from the lines of a CSV file, batch by batch. This approach is good for distributed learning as well.  
This section is modified from the Google Cloud Platform repository [here](https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/machine_learning/deepdive/06_structured/3_tensorflow_dnn.ipynb). 

**Create and save train.csv and test.csv**

In [13]:
pd.DataFrame(x_train.reshape(-1,dim*dim)).join(pd.DataFrame(y_train,columns=['label'])).to_csv('train.csv',index=False,header=False)

It seems to take about 1 minute to assemble these files and save them with pandas.

In [14]:
pd.DataFrame(x_test.reshape(-1,dim*dim)).join(pd.DataFrame(y_test,columns=['label'])).to_csv('test.csv',index=False,header=False)

Let us see what the DataFrames look like.

In [15]:
temp=pd.read_csv('train.csv',header=None)

In [12]:
temp.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,775,776,777,778,779,780,781,782,783,784
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,9


In [55]:
temp2=pd.read_csv('test.csv',header=None)
temp2.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,775,776,777,778,779,780,781,782,783,784
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4


Columns 0-783 are the pixels of the image, and column 784 is the label.

#### Define read_dataset
We have to create an input function that reads data from the files batch by batch and packs it into a dict, with each "key" equal to a column name and each "value" equal to that column's values, as the input data source.

In [58]:
# Define columns here
CSV_COLUMNS = ["pixel" + str(i) for i in range(dim * dim)] + ['digit']
LABEL_COLUMN = 'digit'
# One default per column: floats for the pixels, an int for the label
DEFAULTS = [[0.0] for i in range(dim * dim)] + [[0]]

def read_dataset(filename, mode, batch_size=32):
    def _input_fn():
        def decode_csv(value_column):
            # Parse one CSV line into a dict of features plus the label
            columns = tf.decode_csv(value_column, record_defaults=DEFAULTS)
            features = dict(zip(CSV_COLUMNS, columns))
            label = features.pop(LABEL_COLUMN)
            return features, label

        # Create list of files that match pattern
        file_list = tf.gfile.Glob(filename)

        # Create dataset from file list
        dataset = tf.data.TextLineDataset(file_list).map(decode_csv)  # Transform each elem by applying decode_csv fn
        if mode == tf.estimator.ModeKeys.TRAIN:
            num_epochs = None   # indefinitely
            dataset = dataset.shuffle(buffer_size=10 * batch_size)
        else:
            num_epochs = 1   # end-of-input after this
        dataset = dataset.repeat(num_epochs).batch(batch_size)

        return dataset.make_one_shot_iterator().get_next()
    return _input_fn
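
To make the `(features, label)` structure concrete, here is a plain-Python analogue of `decode_csv` that parses one line with the standard `csv` module. It is only a sketch of the data layout, using a tiny 2*2 "image" instead of 28*28, and is not what TensorFlow runs internally; the name `decode_csv_line` is made up.

```python
import csv

dim = 2  # tiny stand-in for 28 so the example stays readable
CSV_COLUMNS = ["pixel" + str(i) for i in range(dim * dim)] + ["digit"]
LABEL_COLUMN = "digit"

def decode_csv_line(line):
    """Parse one CSV line into (features dict, integer label)."""
    values = next(csv.reader([line]))
    features = dict(zip(CSV_COLUMNS, values))
    label = int(features.pop(LABEL_COLUMN))
    features = {name: float(v) for name, v in features.items()}
    return features, label

features, label = decode_csv_line("0.0,0.5,1.0,0.0,7")
print(features)
print(label)
```

In the real input function each dict value is a tensor holding a whole batch of that column rather than a single float.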

#### Define feature columns
Because we have grey-scale values for the pixels, we can just use **numeric_column** for those columns.

In [59]:
# Define feature columns  - not including label
def get_cols():
  # Define column types
  return [tf.feature_column.numeric_column('pixel'+str(i)) for i in range(dim*dim)]


''

#### Define a serving function
By defining a serving input function, we can export the model and serve predictions later. The serving function is used by the **exporter** in **EvalSpec**. Basically, it tells the exported model what format the incoming data has, which should be the same as the training format without the label. This exports your model so it can work with JSON dictionaries.

In [73]:
# Create serving input function to be able to serve predictions later using provided inputs
def serving_input_fn():
    # Each request is a batch of CSV rows holding the pixel values only (no label)
    csv_row = tf.placeholder(shape=[None], dtype=tf.string)
    columns = tf.decode_csv(csv_row, record_defaults=DEFAULTS[:-1])
    features = dict(zip(CSV_COLUMNS[:-1], columns))

    # Alternative: one float placeholder per pixel instead of a CSV string
    # feature_placeholders = dict(zip(['pixel'+str(i) for i in range(dim*dim)],
    #                                 [tf.placeholder(tf.float32, [None]) for i in range(dim*dim)]))
    # features = {key: tf.expand_dims(tensor, -1) for key, tensor in feature_placeholders.items()}

    return tf.estimator.export.ServingInputReceiver(features, {'csv_row': csv_row})

Here we define a **train_and_evaluate** function to put all the parts together.  
It includes three main parts: *estimator*, *train_spec*, and *eval_spec*.
- *estimator*: we just use a simple DNNClassifier here; we could also build convolutional models later.
- *train_spec*: passes the input_fn we defined before, and the number of training steps.
- *eval_spec*: defines the evaluation frequency.

In [75]:
def train_and_evaluate(output_dir):
    EVAL_INTERVAL = 300  # seconds between checkpoints and between evaluations
    TRAIN_STEPS = 300
    run_config = tf.estimator.RunConfig(save_checkpoints_secs=EVAL_INTERVAL)

    estimator = tf.estimator.DNNClassifier(model_dir=output_dir,
                                           feature_columns=get_cols(),
                                           hidden_units=[32],
                                           dropout=0.2,
                                           n_classes=10,
                                           config=run_config)

    exporter = tf.estimator.LatestExporter('exporter', serving_input_fn)

    # Use the same ModeKeys that read_dataset compares against
    train_spec = tf.estimator.TrainSpec(input_fn=read_dataset('train.csv', mode=tf.estimator.ModeKeys.TRAIN),
                                        max_steps=TRAIN_STEPS)

    eval_spec = tf.estimator.EvalSpec(input_fn=read_dataset('test.csv', mode=tf.estimator.ModeKeys.EVAL),
                                      steps=None,
                                      start_delay_secs=60,  # start evaluating after N seconds
                                      throttle_secs=EVAL_INTERVAL,  # evaluate every N seconds
                                      exporters=exporter)

    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

In [76]:
import shutil
shutil.rmtree('mnist', ignore_errors = True)
train_and_evaluate('mnist')

INFO:tensorflow:Using config: {'_save_summary_steps': 100, '_save_checkpoints_steps': None, '_global_id_in_cluster': 0, '_task_id': 0, '_task_type': 'worker', '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f780c2ae198>, '_num_worker_replicas': 1, '_protocol': None, '_eval_distribute': None, '_master': '', '_is_chief': True, '_tf_random_seed': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_num_ps_replicas': 0, '_keep_checkpoint_every_n_hours': 10000, '_save_checkpoints_secs': 300, '_train_distribute': None, '_experimental_distribute': None, '_model_dir': 'mnist', '_log_step_count_steps': 100, '_device_fn': None, '_keep_checkpoint_max': 5, '_service': None, '_evaluation_master': ''}
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will 