# Notes from:
<BR>
    
https://www.tensorflow.org/tutorials/estimator/premade

In [1]:
import tensorflow as tf

In [2]:
tf.__version__

'2.1.0'

In [3]:
help(tf.estimator)

Help on package tensorflow_estimator.python.estimator.api._v2.estimator in tensorflow_estimator.python.estimator.api._v2:

NAME
    tensorflow_estimator.python.estimator.api._v2.estimator - Estimator: High level tools for working with models.

PACKAGE CONTENTS
    experimental (package)
    export (package)
    inputs (package)

FILE
    /Users/horace/Documents/projects/CMS/pyCMS/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/api/_v2/estimator/__init__.py




__All Estimators—whether pre-made or custom—are classes based on the `tf.estimator.Estimator` class.__

`tf.estimator` is a high-level TensorFlow API. Estimators encapsulate the following actions:

   - training
   - evaluation
   - prediction
   - export for serving


In [4]:
help(tf.estimator.Estimator)

Help on class EstimatorV2 in module tensorflow_estimator.python.estimator.estimator:

class EstimatorV2(Estimator)
 |  EstimatorV2(model_fn, model_dir=None, config=None, params=None, warm_start_from=None)
 |  
 |  Estimator class to train and evaluate TensorFlow models.
 |  
 |  The `Estimator` object wraps a model which is specified by a `model_fn`,
 |  which, given inputs and a number of other parameters, returns the ops
 |  necessary to perform training, evaluation, or predictions.
 |  
 |  All outputs (checkpoints, event files, etc.) are written to `model_dir`, or a
 |  subdirectory thereof. If `model_dir` is not set, a temporary directory is
 |  used.
 |  
 |  The `config` argument can be passed `tf.estimator.RunConfig` object containing
 |  information about the execution environment. It is passed on to the
 |  `model_fn`, if the `model_fn` has a parameter named "config" (and input
 |  functions in the same manner). If the `config` parameter is not passed, it is
 |  instantiated 

<br>
<br>


## Pre-made Estimators

<br>

Pre-made Estimators enable you to work at a much higher conceptual level than the base TensorFlow APIs. You no longer have to worry about creating the computational graph or sessions since Estimators handle all the "plumbing" for you. Furthermore, pre-made Estimators let you experiment with different model architectures by making only minimal code changes. `tf.estimator.DNNClassifier`, for example, is a pre-made Estimator class that trains classification models based on dense, feed-forward neural networks.

#### Structure of a pre-made Estimators program

1. Write one or more dataset importing functions.     (-1->)
2. Define the feature columns.                        (-2->)
3. Instantiate the relevant pre-made Estimator.       (-3->)
4. Call a training, evaluation, or inference method.  (-4->)

<br>
<br>
<br>

### EXAMPLE:  Iris Dataset with DNN



Based on this [colab notebook](https://www.tensorflow.org/tutorials/estimator/premade)

### - 0 - > Get the data

In [9]:
from __future__ import absolute_import, division, print_function, unicode_literals

import pandas as pd

In [7]:
CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']
SPECIES = ['Setosa', 'Versicolor', 'Virginica']


In [10]:
train_path = tf.keras.utils.get_file(
    "iris_training.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv")
test_path = tf.keras.utils.get_file(
    "iris_test.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv")

train = pd.read_csv(train_path, names=CSV_COLUMN_NAMES, header=0)
test = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0)


In [12]:
train.describe()

| | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
|---|---|---|---|---|---|
| count | 120.0 | 120.0 | 120.0 | 120.0 | 120.0 |
| mean | 5.845 | 3.065 | 3.739167 | 1.196667 | 1.0 |
| std | 0.868578 | 0.427156 | 1.8221 | 0.782039 | 0.840168 |
| min | 4.4 | 2.0 | 1.0 | 0.1 | 0.0 |
| 25% | 5.075 | 2.8 | 1.5 | 0.3 | 0.0 |
| 50% | 5.8 | 3.0 | 4.4 | 1.3 | 1.0 |
| 75% | 6.425 | 3.3 | 5.1 | 1.8 | 2.0 |
| max | 7.9 | 4.4 | 6.9 | 2.5 | 2.0 |


In [14]:
test.describe()

| | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
|---|---|---|---|---|---|
| count | 30.0 | 30.0 | 30.0 | 30.0 | 30.0 |
| mean | 5.836667 | 3.01 | 3.836667 | 1.206667 | 1.0 |
| std | 0.653628 | 0.463383 | 1.537459 | 0.694775 | 0.742781 |
| min | 4.3 | 2.2 | 1.1 | 0.1 | 0.0 |
| 25% | 5.5 | 2.725 | 2.3 | 0.625 | 0.25 |
| 50% | 5.75 | 3.0 | 4.25 | 1.3 | 1.0 |
| 75% | 6.3 | 3.3 | 4.9 | 1.575 | 1.75 |
| max | 7.1 | 4.2 | 5.9 | 2.5 | 2.0 |


In [15]:
train_y = train.pop('Species')
test_y = test.pop('Species')

# The label column has now been removed from the features.
train.head()

| | SepalLength | SepalWidth | PetalLength | PetalWidth |
|---|---|---|---|---|
| 0 | 6.4 | 2.8 | 5.6 | 2.2 |
| 1 | 5.0 | 2.3 | 3.3 | 1.0 |
| 2 | 4.9 | 2.5 | 4.5 | 1.7 |
| 3 | 4.9 | 3.1 | 1.5 | 0.1 |
| 4 | 5.7 | 3.8 | 1.7 | 0.3 |


In [17]:
train_y

0      2
1      1
2      2
3      0
4      0
      ..
115    1
116    1
117    0
118    0
119    1
Name: Species, Length: 120, dtype: int64

<br>
<br>

### - 1 - > Create input functions

<br>

You must create input functions to supply data for training, evaluation, and prediction.

An input function is a function that returns a `tf.data.Dataset` object which outputs the following two-element tuple:

- features - A Python dictionary in which:
    - Each key is the name of a feature.
    - Each value is an array containing all of that feature's values.
- label - An array containing the label value for every example.

_Your input function may generate the features dictionary and label list any way you like. However, we recommend using TensorFlow's Dataset API, which can parse all sorts of data._
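To make the shape of that two-element tuple concrete, here is a plain-Python stand-in that needs no TensorFlow. It is only a sketch: `batch_examples` is a hypothetical helper, not part of any library, and the values are the first five training rows and labels shown earlier in these notes.

```python
def batch_examples(features, labels, batch_size):
    """Yield (features_dict, labels) batches from column-oriented data.

    Hypothetical helper for illustration; the real pipeline uses tf.data.
    """
    n = len(labels)
    for start in range(0, n, batch_size):
        stop = start + batch_size
        # Slice every feature column to the same range of examples.
        batch_features = {name: values[start:stop]
                          for name, values in features.items()}
        yield batch_features, labels[start:stop]


# Each key is a feature name; each value holds that feature's values.
features = {
    'SepalLength': [6.4, 5.0, 4.9, 4.9, 5.7],
    'SepalWidth': [2.8, 2.3, 2.5, 3.1, 3.8],
    'PetalLength': [5.6, 3.3, 4.5, 1.5, 1.7],
    'PetalWidth': [2.2, 1.0, 1.7, 0.1, 0.3],
}
labels = [2, 1, 2, 0, 0]

batches = list(batch_examples(features, labels, batch_size=2))
first_features, first_labels = batches[0]
print(first_features)
print(first_labels)
```

Every batch keeps the same dictionary keys, and the label list lines up with the feature values position by position.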

In [18]:
def input_fn(features, labels, training=True, batch_size=256):
    """An input function for training or evaluating"""
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))

    # Shuffle and repeat if you are in training mode.
    if training:
        dataset = dataset.shuffle(1000).repeat()
    
    return dataset.batch(batch_size)



<br>
<br>

### - 2 - >  Define the feature columns
<br>
<br>

A feature column is an object describing how the model should use raw input data from the features dictionary. When you build an Estimator model, you pass it a list of feature columns that describes each of the features you want the model to use. The `tf.feature_column` module provides many options for representing data to the model.

<br>

_Feature columns can be far more sophisticated than the plain numeric columns used here._

In [20]:
train.keys()

Index(['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth'], dtype='object')

In [19]:
# Feature columns describe how to use the input.
my_feature_columns = []
for key in train.keys():
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))
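Conceptually, a numeric column tells the model to read one key of the features dictionary as a float value. The sketch below shows only that idea in plain Python. It is not how TensorFlow implements feature columns: `to_dense_rows` is a hypothetical helper, and the column order is simply the order passed in.

```python
def to_dense_rows(features, column_keys):
    """Turn a column-oriented features dict into one row of floats per example.

    Hypothetical helper for illustration only.
    """
    columns = [features[key] for key in column_keys]
    # zip(*columns) regroups the per-feature lists into per-example rows.
    return [[float(value) for value in row] for row in zip(*columns)]


# First two training rows from train.head() above.
features = {
    'SepalLength': [6.4, 5.0],
    'SepalWidth': [2.8, 2.3],
    'PetalLength': [5.6, 3.3],
    'PetalWidth': [2.2, 1.0],
}
column_keys = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']

rows = to_dense_rows(features, column_keys)
print(rows)
```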


<br>
<br>

### - 3 - > Instantiate an estimator

<br>
<br>

For the Iris problem, `tf.estimator.DNNClassifier` seems like the best choice.

In [21]:
# Build a DNN with 2 hidden layers of 30 and 10 nodes, respectively.
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    # Two hidden layers of 30 and 10 nodes respectively.
    hidden_units=[30, 10],
    # The model must choose between 3 classes.
    n_classes=3)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/var/folders/mx/1sp31jld32qb099smm3djqsh0000gq/T/tmplqx7vz9l', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
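As a rough sense of scale for the network configured above, the following sketch counts its trainable parameters. It assumes plain fully connected layers with a bias per unit and no extra layers (for example, no batch normalization); `dense_param_count` is a hypothetical helper written for these notes.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
    return total


# 4 numeric features -> 30 -> 10 hidden units -> 3 class logits
layer_sizes = [4, 30, 10, 3]
n_params = dense_param_count(layer_sizes)
print(n_params)
```

Under those assumptions the three layers contribute 150, 310, and 33 parameters, which is small even for a 120-row training set.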


<br>
<br>

### - 4 - > Train, Evaluate, and Predict

<br>

Now that you have an Estimator object, you can call methods to do the following:

- Train the model.
- Evaluate the trained model.
- Use the trained model to make predictions.


<br>
<br>

__Train the model__

In [22]:
classifier.train(
    input_fn=lambda: input_fn(train, train_y, training=True),
    steps=5000)


Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
INFO:tensorflow:Calling model_fn.


To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /var/folders/mx/1sp31jld32qb099smm3djqsh0000gq/T/tmplqx7vz9l/model

<tensorflow_estimator.python.estimator.canned.dnn.DNNClassifierV2 at 0x13a024390>

_Note that you wrap your `input_fn` call in a `lambda` to capture the arguments while providing an input function that takes no arguments, as the Estimator expects. The `steps` argument tells the method to stop training after that number of training steps._
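The two ideas in the note above can be sketched without TensorFlow: a zero-argument callable that captures its arguments, and a step limit that bounds an endlessly repeating stream. `make_batches` and `run_steps` are hypothetical stand-ins for an input function and a training loop, and `functools.partial` is shown as one possible alternative to the `lambda`.

```python
from functools import partial
from itertools import cycle, islice


def make_batches(examples, training=True, batch_size=2):
    """Hypothetical input function: yields batches, repeating forever if training."""
    stream = cycle(examples) if training else iter(examples)
    while True:
        batch = list(islice(stream, batch_size))
        if not batch:
            return
        yield batch


def run_steps(input_fn, steps):
    """Call a zero-argument input function and consume at most `steps` batches."""
    return list(islice(input_fn(), steps))


examples = [0, 1, 2]

# Both callables take no arguments but remember `examples` and `training`.
via_lambda = lambda: make_batches(examples, training=True)
via_partial = partial(make_batches, examples, training=True)

seen = run_steps(via_lambda, steps=4)
print(seen)

# Arithmetic for the training call above: 5000 steps of 256 examples each.
steps, batch_size, n_train = 5000, 256, 120
examples_seen = steps * batch_size
passes_over_data = examples_seen / n_train
print(examples_seen, round(passes_over_data))
```

Because the training dataset repeats indefinitely, the step limit is what ends training. With the values used in these notes, that is 1,280,000 examples drawn from only 120 rows.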

<br>
<br>

__Evaluate the trained model__

In [23]:
eval_result = classifier.evaluate(
    input_fn=lambda: input_fn(test, test_y, training=False))

print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))


INFO:tensorflow:Calling model_fn.


To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2020-02-13T00:40:08Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /var/folders/mx/1sp31jld32qb099smm3djqsh0000gq/T/tmplqx7vz9l/model.ckpt-5000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 0.38408s
INFO:tensorflow:Finished evaluation at 2020-02-13-00:40:08
INFO:tensorflow:Saving dict for global step 5000: accuracy = 0.9, average_loss = 0.54562896, global_step = 5000, loss = 0.54562896
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 5000: /var/folders/mx/1sp31jld

_The `eval_result` dictionary also contains `average_loss` (mean loss per sample), `loss` (mean loss per mini-batch), and the Estimator's `global_step` (the number of training iterations it underwent)._

In [25]:
eval_result

{'accuracy': 0.9,
 'average_loss': 0.54562896,
 'loss': 0.54562896,
 'global_step': 5000}
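To illustrate what these metrics measure, the sketch below computes accuracy, a per-sample mean cross-entropy, and a mean of per-batch means in plain Python. The probabilities are made up for the example and do not come from the trained model. The loss formula is the standard cross-entropy, and reading `loss` as an average of per-batch means is an assumption based on the description above, not a statement about TensorFlow's internals.

```python
import math


def accuracy(probabilities, labels):
    """Fraction of examples whose most probable class matches the label."""
    correct = 0
    for probs, label in zip(probabilities, labels):
        predicted = max(range(len(probs)), key=lambda i: probs[i])
        correct += int(predicted == label)
    return correct / len(labels)


def average_loss(probabilities, labels):
    """Mean cross-entropy per sample: -log of the true class probability."""
    losses = [-math.log(probs[label])
              for probs, label in zip(probabilities, labels)]
    return sum(losses) / len(losses)


def mean_batch_loss(probabilities, labels, batch_size):
    """Average of the per-batch mean losses (assumed reading of `loss`)."""
    batch_losses = []
    for start in range(0, len(labels), batch_size):
        stop = start + batch_size
        batch_losses.append(
            average_loss(probabilities[start:stop], labels[start:stop]))
    return sum(batch_losses) / len(batch_losses)


# Made-up class probabilities for four examples (illustrative only).
probabilities = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.5, 0.3],
    [0.1, 0.1, 0.8],
]
labels = [0, 1, 2, 2]

acc = accuracy(probabilities, labels)
avg = average_loss(probabilities, labels)
single_batch = mean_batch_loss(probabilities, labels, batch_size=4)
print(acc, avg, single_batch)
```

When all examples fit in one batch, the two loss figures coincide. That would be consistent with the identical `loss` and `average_loss` values above, since the 30 test rows fit within the default `batch_size` of 256.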

<br>
<br>

__Making predictions (inferring) from the trained model__

In [26]:
# Generate predictions from the model
expected = ['Setosa', 'Versicolor', 'Virginica']
predict_x = {
    'SepalLength': [5.1, 5.9, 6.9],
    'SepalWidth': [3.3, 3.0, 3.1],
    'PetalLength': [1.7, 4.2, 5.4],
    'PetalWidth': [0.5, 1.5, 2.1],
}

def predict_input_fn(features, batch_size=256):
    """An input function for prediction."""
    # Convert the inputs to a Dataset without labels.
    return tf.data.Dataset.from_tensor_slices(dict(features)).batch(batch_size)

# A separate name keeps the earlier train/eval `input_fn` from being overwritten.
predictions = classifier.predict(
    input_fn=lambda: predict_input_fn(predict_x))


_The `predict` method returns a Python iterable, yielding a dictionary of prediction results for each example. The following code prints a few predictions and their probabilities:_

In [27]:
for pred_dict, expec in zip(predictions, expected):
    class_id = pred_dict['class_ids'][0]
    probability = pred_dict['probabilities'][class_id]

    print('Prediction is "{}" ({:.1f}%), expected "{}"'.format(
        SPECIES[class_id], 100 * probability, expec))


INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /var/folders/mx/1sp31jld32qb099smm3djqsh0000gq/T/tmplqx7vz9l/model.ckpt-5000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
Prediction is "Setosa" (71.6%), expected "Setosa"
Prediction is "Versicolor" (47.2%), expected "Versicolor"
Prediction is "Virginica" (62.4%), expected "Virginica"
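The printed percentages are the probability the model assigns to its chosen class. As a sketch of how such a prediction dictionary relates class probabilities to a class id, the example below applies a softmax to hypothetical logits and takes the most probable class. The logits are invented for illustration and do not come from the trained model; only the `probabilities` and `class_ids` keys used in the loop above are mirrored here.

```python
import math

SPECIES = ['Setosa', 'Versicolor', 'Virginica']


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    # Subtracting the maximum keeps the exponentials numerically stable.
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one example (not output of the trained model).
logits = [2.0, 0.5, -1.0]
probabilities = softmax(logits)
class_id = max(range(len(probabilities)), key=lambda i: probabilities[i])

# Same access pattern as the prediction loop above.
pred_dict = {'probabilities': probabilities, 'class_ids': [class_id]}
chosen = pred_dict['class_ids'][0]
message = 'Prediction is "{}" ({:.1f}%)'.format(
    SPECIES[chosen], 100 * pred_dict['probabilities'][chosen])
print(message)
```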
