# TF Estimator

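The TF Estimator interface is inspired by the popular machine learning library scikit-learn. You create an estimator object from one of several available model types, and every **estimator** then exposes the same **four main functions**:
* `estimator.fit()`
* `estimator.evaluate()`
* `estimator.predict()`
* `estimator.export()`

These are the method names of the older `tf.contrib.learn` estimators. With `tf.estimator.Estimator`, which the code in this notebook uses, training is done with `train()` instead of `fit()`, and export with `export_saved_model()` instead of `export()`.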

The estimator object represents the model, but the model itself is created from the **model definition function** provided to the estimator.

![TF_Estimator](./TF_Estimator.png)
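
To make the split between the estimator object and the model definition function concrete, here is a conceptual sketch in plain Python. It is not TensorFlow code: `ToyEstimator`, `toy_model_fn`, the mode strings, and the explicit `state` dictionary are all hypothetical stand-ins. A real `model_fn` takes `(features, labels, mode)`, returns an `EstimatorSpec`, and keeps its state in variables and checkpoints.

```python
# Conceptual stand-in only: this is NOT the TensorFlow API.
TRAIN, EVAL, PREDICT = "train", "eval", "predict"  # hypothetical mode names


def toy_model_fn(features, labels, mode, state):
    """Hypothetical model function: a 'model' that always predicts the mean label."""
    if mode == TRAIN:
        state["mean"] = sum(labels) / len(labels)
        return None
    predictions = [state["mean"] for _ in features]
    if mode == PREDICT:
        return predictions
    # EVAL: report the mean squared error of the predictions.
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)


class ToyEstimator:
    """Owns the workflow (train/evaluate/predict); the model lives in model_fn."""

    def __init__(self, model_fn):
        self._model_fn = model_fn
        self._state = {}

    def train(self, features, labels):
        self._model_fn(features, labels, TRAIN, self._state)
        return self

    def evaluate(self, features, labels):
        return self._model_fn(features, labels, EVAL, self._state)

    def predict(self, features):
        return self._model_fn(features, None, PREDICT, self._state)


toy = ToyEstimator(toy_model_fn).train([0, 1, 2, 3], [1.0, 2.0, 3.0, 4.0])
toy_predictions = toy.predict([10, 11])
toy_mse = toy.evaluate([0, 1], [2.0, 3.0])
```

The point of the design is that the same estimator class can drive any model, because everything model-specific is behind the one function passed to the constructor.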

Using the Estimator API instead of building everything in core TensorFlow means you do not have to manage graphs, sessions, variable initialization, or other low-level details yourself.

At the time of writing this book, TensorFlow provides the following pre-built estimators:
* `tf.contrib.learn.KMeansClustering`
* `tf.contrib.learn.DNNClassifier`
* `tf.contrib.learn.DNNRegressor`
* `tf.contrib.learn.DNNLinearCombinedRegressor`
* `tf.contrib.learn.DNNLinearCombinedClassifier`
* `tf.contrib.learn.LinearClassifier`
* `tf.contrib.learn.LinearRegressor`
* `tf.contrib.learn.LogisticRegressor`

---
#### The simple workflow in the TF Estimator API is as follows:

1. Find the pre-built Estimator that is relevant to the problem you are trying to solve.
2. Write the function to import the dataset.
3. Define the columns in data that contain features.
4. Create the instance of the pre-built estimator that you selected in step 1.
5. Train the estimator.
6. Use the trained estimator to do evaluation or prediction.
---

**Note:** The Keras library, discussed in the next chapter, provides a convenience function to convert Keras models to Estimators: `tf.keras.estimator.model_to_estimator()`.

In [1]:
import os
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

In [2]:
tf.reset_default_graph()

In [3]:
mnist = input_data.read_data_sets(
    os.path.join('.', 'mnist'),
    one_hot=False
)

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting ./mnist/train-images-idx3-ubyte.gz
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting ./mnist/train-labels-idx1-ubyte.gz
Extracting ./mnist/t10k-images-idx3-ubyte.gz
Extracting ./mnist/t10k-labels-idx1-ubyte.gz
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.


### About the `mnist` tutorial module
See here in the source code: [tensorflow-1.15.5/tensorflow/examples/tutorials/mnist/\_\_init\_\_.py](../tf1_source_code/tensorflow-1.15.5/tensorflow/examples/tutorials/mnist/__init__.py)

In [4]:
print(type(mnist))
# Skip dunder names to keep the listings short.
dir_ = [x for x in dir(mnist) if '__' not in x]
print("dir of `mnist`:")
print(dir_)
dir_train_ = [x for x in dir(mnist.train) if '__' not in x]
print("dir of `mnist.train`:")
print(dir_train_)

<class 'tensorflow.contrib.learn.python.learn.datasets.base.Datasets'>
dir of `mnist`:
['_asdict', '_field_defaults', '_fields', '_fields_defaults', '_make', '_replace', 'count', 'index', 'test', 'train', 'validation']
dir of `mnist.train`:
['_epochs_completed', '_images', '_index_in_epoch', '_labels', '_num_examples', 'epochs_completed', 'images', 'labels', 'next_batch', 'num_examples']


In [15]:
print("mnist.train.images.shape:", mnist.train.images.shape)
print("mnist.train.images[0].shape:", mnist.train.images[0].shape)
print("--------------------------------------")
print("mnist.train.labels.shape:", mnist.train.labels.shape)
print("mnist.train.labels[0].shape:", mnist.train.labels[0].shape)
print("mnist.train.labels[0]", mnist.train.labels[0])

mnist.train.images.shape: (55000, 784)
mnist.train.images[0].shape: (784,)
--------------------------------------
mnist.train.labels.shape: (55000,)
mnist.train.labels[0].shape: ()
mnist.train.labels[0] 7
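
Because the data was loaded with `one_hot=False`, each label is a single integer class ID, which is why `mnist.train.labels` has shape `(55000,)`. The sketch below shows, in plain Python, what the one-hot form of the same label would look like. The helper names `to_one_hot` and `from_one_hot` are hypothetical and only illustrate the encoding.

```python
def to_one_hot(label, n_classes=10):
    """Return a one-hot list for an integer class ID (hypothetical helper)."""
    if not 0 <= label < n_classes:
        raise ValueError("label must be in [0, n_classes)")
    return [1.0 if i == label else 0.0 for i in range(n_classes)]


def from_one_hot(vector):
    """Recover the integer class ID as the index of the largest entry."""
    return max(range(len(vector)), key=vector.__getitem__)


sparse_label = 7  # the value of mnist.train.labels[0] printed above
one_hot_label = to_one_hot(sparse_label)
recovered_label = from_one_hot(one_hot_label)
```

Keeping the labels as integer IDs is what allows the model function further down to use the sparse variant of the softmax cross-entropy loss.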


In [16]:
x_train = mnist.train.images
y_train = mnist.train.labels
x_test = mnist.test.images
y_test = mnist.test.labels

n_classes = 10
batch_size = 100
n_steps = 1000
learning_rate = 0.01
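
As a quick sanity check on these settings, the arithmetic below relates `batch_size` and `n_steps` to the size of the training set. The figure of 55000 training examples is taken from the shape printed earlier; the derived variable names are illustrative only.

```python
n_train_examples = 55000  # from mnist.train.images.shape printed earlier
batch_size = 100
n_steps = 1000

# Total examples consumed over the whole training run.
examples_seen = batch_size * n_steps
# Number of steps needed for one full pass over the training set.
steps_per_epoch = n_train_examples // batch_size
# Fraction of passes over the training set covered by n_steps.
epochs_covered = examples_seen / n_train_examples

print(examples_seen, steps_per_epoch, round(epochs_covered, 2))
```

So 1000 steps at a batch size of 100 corresponds to a little under two passes over the training data.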

### Details
**📚 Read the links!**

#### [Estimator](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/estimator/Estimator):

See all details in the above link.

Signature:
```python
tf.estimator.Estimator(
    model_fn, model_dir=None, config=None, params=None, warm_start_from=None
)
```

#### [EstimatorSpec](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/estimator/EstimatorSpec):
```python
tf.estimator.EstimatorSpec(
    mode, predictions=None, loss=None, train_op=None, eval_metric_ops=None,
    export_outputs=None, training_chief_hooks=None, training_hooks=None,
    scaffold=None, evaluation_hooks=None, prediction_hooks=None
)
```

#### ModeKeys: [tf.contrib.learn.ModeKeys](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/contrib/learn/ModeKeys) (deprecated), [tf.estimator.ModeKeys](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/estimator/ModeKeys):
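`tf.estimator.ModeKeys`, which the code below uses, defines the following standard keys:
```
TRAIN: training mode.
EVAL: evaluation mode.
PREDICT: inference (prediction) mode.
```
The deprecated `tf.contrib.learn.ModeKeys` names the inference mode `INFER` instead of `PREDICT`.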
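
The `EstimatorSpec` documentation linked above states which fields are required in each mode: `loss` and `train_op` for training, `loss` for evaluation, and `predictions` for prediction. The sketch below encodes that rule in plain Python. The mode strings and the `missing_fields` helper are hypothetical stand-ins, not part of TensorFlow.

```python
# Plain-string stand-ins for the tf.estimator.ModeKeys members (illustration only).
TRAIN, EVAL, PREDICT = "TRAIN", "EVAL", "PREDICT"

# Required EstimatorSpec fields per mode, as described in the EstimatorSpec docs.
REQUIRED_FIELDS = {
    TRAIN: ("loss", "train_op"),
    EVAL: ("loss",),
    PREDICT: ("predictions",),
}


def missing_fields(mode, **spec_fields):
    """Hypothetical helper: list required fields that are absent or None."""
    return [name for name in REQUIRED_FIELDS[mode] if spec_fields.get(name) is None]


predict_missing = missing_fields(PREDICT, predictions=[7])
train_missing = missing_fields(TRAIN, loss=0.4)
eval_missing = missing_fields(EVAL, loss=0.4, predictions=[7])
```

This is why a model function typically branches on `mode`: the prediction branch can return early with only `predictions`, while the training branch also has to build a loss and a training op.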

In [17]:
# NOTE.
# This function returns a tf.estimator.EstimatorSpec.
# The contents of this EstimatorSpec depend on the `mode` (PREDICT, TRAIN, etc.)!

def model_fn(
    features, 
    labels, 
    mode  # `mode`, one of ModeKeys: https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/estimator/ModeKeys
):
    """Define the model function
    """
    # /!\ See: 
    # https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/estimator/EstimatorSpec
    espec_op = tf.estimator.EstimatorSpec
    
    # Features is a dict as per Estimator specifications
    x = features['images']
    
    # Define the network
    # (no `activation` is passed, so each dense layer is purely linear)
    layer_1 = tf.layers.dense(x, 32)
    layer_2 = tf.layers.dense(layer_1, 32)
    logits = tf.layers.dense(layer_2, n_classes)

    # Define predicted classes
    predicted_classes = tf.argmax(logits, axis=1)
    if mode == tf.estimator.ModeKeys.PREDICT:
        # If we are in PREDICT mode...
        espec = espec_op(
            mode,
            predictions=predicted_classes
        )
    
    else:
        # Otherwise we are in TRAIN or EVAL mode...
        # Define loss and optimizer
        cross_entropy_op = tf.nn.sparse_softmax_cross_entropy_with_logits
        loss_op = tf.reduce_mean(
            cross_entropy_op(
                logits=logits,
                labels=tf.cast(
                    labels,
                    dtype=tf.int32
                )
                # ^ The cast is needed because the MNIST labels are uint8,
                #   while this op expects int32 or int64 labels.
            )
        )
        optimizer = tf.train.GradientDescentOptimizer(
            learning_rate=learning_rate
        )
        train_op = optimizer.minimize(
            loss_op, 
            global_step=tf.train.get_global_step()  # Global step tensor; incremented once per training step
        )

        # Define accuracy
        accuracy_op = tf.metrics.accuracy(
            labels=labels, 
            predictions=predicted_classes
        )

        espec = espec_op(
            # Estimator mode:
            mode=mode,
            # Predictions:
            predictions=predicted_classes,
            # Loss:
            loss=loss_op,
            # Training op (optimizer.minimize):
            train_op=train_op,
            # Evaluation metrics:
            eval_metric_ops={'accuracy': accuracy_op}
        )

    return espec
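
To see what the loss and the predicted class in `model_fn` compute, here is a plain-Python sketch of the per-example math: the sparse softmax cross-entropy is the negative log of the softmax probability assigned to the true class, and the predicted class is the index of the largest logit. The function names are illustrative and the example logits are made up. This is not the TensorFlow implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_softmax_cross_entropy(logits, label):
    """Loss for one example whose true class is the integer `label`."""
    return -math.log(softmax(logits)[label])


def predicted_class(logits):
    """Index of the largest logit (what tf.argmax does along axis 1)."""
    return max(range(len(logits)), key=logits.__getitem__)


# Uniform logits: the model has no preference among the 10 classes.
uniform_logits = [0.0] * 10
baseline_loss = sparse_softmax_cross_entropy(uniform_logits, label=7)

# Made-up logits that favour class 7.
confident_logits = [0.0] * 10
confident_logits[7] = 5.0
confident_loss = sparse_softmax_cross_entropy(confident_logits, label=7)
confident_class = predicted_class(confident_logits)
```

With uniform logits the loss equals ln(10), about 2.3026. The loss of 2.309273 logged at step 0 of training is close to that value, which is what one would expect from a freshly initialized 10-class classifier.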

In [18]:
# Create estimator object:
model = tf.estimator.Estimator(model_fn)  # Takes `model_fn`

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmph6cqothj', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fdc16a17890>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


### Numpy input function
See: https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/estimator/inputs/numpy_input_fn
```python
tf.estimator.inputs.numpy_input_fn(
    x, y=None, batch_size=128, num_epochs=1, shuffle=None, queue_capacity=1000,
    num_threads=1
)
```
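
The effect of `batch_size`, `num_epochs`, and `shuffle` can be illustrated with a plain-Python generator. This is only a rough analogue under simplifying assumptions: `toy_input_fn` is a hypothetical name, and the queueing, threading, and exact batch-boundary behaviour of the real `numpy_input_fn` are not modeled.

```python
import itertools
import random


def toy_input_fn(x, y, batch_size=128, num_epochs=1, shuffle=False, seed=0):
    """Hypothetical analogue: yield (features, labels) batches.

    num_epochs=None repeats the data indefinitely, so the consumer
    (for example a fixed number of training steps) decides when to stop.
    """
    rng = random.Random(seed)
    epochs = itertools.count() if num_epochs is None else range(num_epochs)
    for _ in epochs:
        order = list(range(len(x)))
        if shuffle:
            rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            yield [x[i] for i in idx], [y[i] for i in idx]


toy_x = list(range(10))
toy_y = [v % 2 for v in toy_x]

# One pass, in order: suitable for evaluation.
one_pass = list(toy_input_fn(toy_x, toy_y, batch_size=4, num_epochs=1))

# Endless shuffled stream: suitable for training with a step limit.
endless = toy_input_fn(toy_x, toy_y, batch_size=4, num_epochs=None, shuffle=True)
first_five = list(itertools.islice(endless, 5))
```

This mirrors how the notebook uses the real function: `num_epochs=None` with `shuffle=True` for training, where `steps` bounds the run, and a single unshuffled pass for evaluation.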

In [19]:
# Train the model:
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'images': x_train},
    y=y_train,
    batch_size=batch_size,
    num_epochs=None,
    shuffle=True
)

model.train(train_input_fn, steps=n_steps)

Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Calling model_fn.
Instructions for updating:
Use keras.layers.Dense instead.
Instructions for updating:
Please use `layer.__call__` method instead.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmph6cqothj/model.ckpt.
INFO:tensorflow:loss = 2.309273, step = 0
INFO:tensorflow:global

<tensorflow_estimator.python.estimator.estimator.Estimator at 0x7fdc16a17490>

In [20]:
# Evaluate the model:
eval_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'images': x_test},
    y=y_test,
    batch_size=batch_size,
    shuffle=False
)

model.evaluate(eval_input_fn)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-07-19T14:25:00Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmph6cqothj/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2021-07-19-14:25:00
INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.8876, global_step = 1000, loss = 0.39507115
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: /tmp/tmph6cqothj/model.ckpt-1000


{'accuracy': 0.8876, 'loss': 0.39507115, 'global_step': 1000}
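
The reported accuracy is accumulated across evaluation batches rather than computed on one batch. `tf.metrics.accuracy` returns a value tensor together with an update op that keeps running totals. The plain-Python class below sketches that idea. `StreamingAccuracy` is a hypothetical name and the labels and predictions are made up.

```python
class StreamingAccuracy:
    """Hypothetical running accuracy: keeps totals across batches."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, labels, predictions):
        """Fold one batch of labels and predictions into the running totals."""
        self.correct += sum(int(l == p) for l, p in zip(labels, predictions))
        self.total += len(labels)

    def result(self):
        """Accuracy over everything seen so far (0.0 if nothing was seen)."""
        return self.correct / self.total if self.total else 0.0


metric = StreamingAccuracy()
metric.update(labels=[7, 3, 4], predictions=[7, 3, 9])  # 2 of 3 correct
metric.update(labels=[1, 0], predictions=[1, 0])        # 2 of 2 correct
streamed_accuracy = metric.result()
```

Assuming the standard 10,000-image MNIST test split, the accuracy of 0.8876 above corresponds to 8876 correctly classified test images.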