Estimator API

In [2]:
from __future__ import absolute_import, division, print_function, unicode_literals
!pip install tensorflow==2.0.0

import tensorflow as tf
import pandas as pd

Collecting tensorflow==2.0.0
[?25l  Downloading https://files.pythonhosted.org/packages/46/0f/7bd55361168bb32796b360ad15a25de6966c9c1beb58a8e30c01c8279862/tensorflow-2.0.0-cp36-cp36m-manylinux2010_x86_64.whl (86.3MB)
[K     |████████████████████████████████| 86.3MB 122kB/s 
Collecting tensorflow-estimator<2.1.0,>=2.0.0
[?25l  Downloading https://files.pythonhosted.org/packages/fc/08/8b927337b7019c374719145d1dceba21a8bb909b93b1ad6f8fb7d22c1ca1/tensorflow_estimator-2.0.1-py2.py3-none-any.whl (449kB)
[K     |████████████████████████████████| 450kB 33.3MB/s 
Collecting tensorboard<2.1.0,>=2.0.0
[?25l  Downloading https://files.pythonhosted.org/packages/76/54/99b9d5d52d5cb732f099baaaf7740403e83fe6b0cedde940fabd2b13d75a/tensorboard-2.0.2-py3-none-any.whl (3.8MB)
[K     |████████████████████████████████| 3.8MB 35.6MB/s 
Installing collected packages: tensorflow-estimator, tensorboard, tensorflow
  Found existing installation: tensorflow-estimator 1.15.1
    Uninstalling tensorflow-estim

The dataset

We are going to classify Iris flower based on following features


1.   Sepal Length
2.   Sepal Width
3.   Petal Length
4.   Petal Width

Into Species


1.   Setosa
2.   Versicolor
3.   Virginica





In [0]:
CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']
SPECIES = ['Setosa', 'Versicolor', 'Virginica']

Download and parse the iris flower data

In [0]:
train_path = tf.keras.utils.get_file('iris_training.csv', 'https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv')
test_path = tf.keras.utils.get_file('iris_training.csv', 'https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv')

train = pd.read_csv(train_path, names=CSV_COLUMN_NAMES, header=0)
test = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0)

Inspect data

In [13]:
train.head()

Unnamed: 0,SepalLength,SepalWidth,PetalLength,PetalWidth,Species
0,6.4,2.8,5.6,2.2,2
1,5.0,2.3,3.3,1.0,1
2,4.9,2.5,4.5,1.7,2
3,4.9,3.1,1.5,0.1,0
4,5.7,3.8,1.7,0.3,0


Remove the feature to be learned


In [14]:
train_y = train.pop('Species')
test_y = test.pop('Species')

# The label column has now been removed from the features
train_y.head()

0    2
1    1
2    2
3    0
4    0
Name: Species, dtype: int64

**Overview of programming with Estimator**
Derived from *tf.estimator.Estimator*

3 following stpes:

1.   Create one or more Input Function
2.   Defines the model's feature column
3.   Instantiate an Estimator, specifying the feature colummns and various hyperparameters

Then call one or more methods on the Estimator objects, passing the appropriate input function as the soruce of data

Create Input Function: An input function returns a *tf.data.Dataset* object which outputs the two-element tuple


In [0]:
def input_evaluation_set():
  features = {'SepalLength': np.array([6.4, 5.0]),
              'SepalWidth': np.array([2.8, 2.3]),
              'PetalLength': np.array([5.6, 3.3]),
              'PetalWidth': np.array([2.2, 1.0])}
  labels = np.array([2, 1])
  return features, labels

Creating dataset


In [0]:
def input_fn(features, labels, training=True, batch_size=256):
  # dataset
  dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))

  # Shuffle and repeat for training
  if training:
    dataset = dataset.shuffle(1000).repeat()
  
  return dataset.batch(batch_size)

Define Feature Columns

In [0]:
# Feature Column defines how to use the input
my_feature_column = []
for key in train.keys():
  my_feature_column.append(tf.feature_column.numeric_column(key=key))

Instantiate an Estimator:

TensorFlow provides several pre-made classifier Estimators:

1.   tf.estimator.DNNClassifier for deep models that perform multi-class classification
2.   tf.estimator.DNNLinearComibedCLassifier for wide and deep model
3.   tf.estimator.LinearClassifier for classifiers based on linear models


In [18]:
# Build a DNNClassifier with 2 hidden layers
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_column,
    # Two hidden layers
    hidden_units=[30, 10],
    # THe model must choose between 3 classes
    n_classes=3)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmppekr24d4', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f0626d11080>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


Train, Evaluate and Predict

In [19]:
# Train the model
print(train)
print(train_y)
classifier.train(
    input_fn=lambda: input_fn(train, train_y, training=True),
    steps=5000)

     SepalLength  SepalWidth  PetalLength  PetalWidth
0            6.4         2.8          5.6         2.2
1            5.0         2.3          3.3         1.0
2            4.9         2.5          4.5         1.7
3            4.9         3.1          1.5         0.1
4            5.7         3.8          1.7         0.3
..           ...         ...          ...         ...
115          5.5         2.6          4.4         1.2
116          5.7         3.0          4.2         1.2
117          4.4         2.9          1.4         0.2
118          4.8         3.0          1.4         0.1
119          5.5         2.4          3.7         1.0

[120 rows x 4 columns]
0      2
1      1
2      2
3      0
4      0
      ..
115    1
116    1
117    0
118    0
119    1
Name: Species, Length: 120, dtype: int64
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in 

<tensorflow_estimator.python.estimator.canned.dnn.DNNClassifierV2 at 0x7f0626d11ac8>

Evaluate the model

In [0]:
eval_result = classifier.evaluate(
    input_fn=lambda: input_fn(test, test_y, training=False))

print('\n Test set accuracy: {accuracy:0.3f}\n'.format(**eval_result))

INFO:tensorflow:Calling model_fn.


To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2019-11-16T18:40:23Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpvc225qdj/model.ckpt-5000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2019-11-16-18:40:23
INFO:tensorflow:Saving dict for global step 5000: accuracy = 0.90833336, average_loss = 0.41785252, global_step = 5000, loss = 0.41785252
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 5000: /tmp/tmpvc225qdj/model.ckpt-5000

 Test set accuracy: 0.908



Making Predictions

In [0]:
# Generate Predictions from the model
expected = ['Setosa','Versicolor', 'Virginica']
predict_x = {'SepalLength': [5.1, 5.9, 6.9],
            'SepalWidth': [3.3, 3.0, 3.1],
            'PetalLength': [1.7, 4.2, 5.4],
            'PetalWidth': [0.5, 1.5, 2.1]}

def input_fn(features, batch_size=256):
  """An Input Function for Prediction"""
  # Convert Inputs to a dataset withour labels
  return tf.data.Dataset.from_tensor_slices(dict(features)).batch(batch_size)

predictions = classifier.predict(
    input_fn=lambda: input_fn(predict_x)
)

The predict method returns a python iterable, yielding a dictionary of predictions for each example. Following code prints a few predictions and their probabilities.


In [0]:
for pred_dict, expec in zip(predictions, expected):
  class_id = pred_dict['class_ids'][0]
  probability = pred_dict['probabilities'][class_id]

  print('Prediction is "{}" ({:.1f}%), expected "{}"'.format(
      SPECIES[class_id], 100*probability, expec))

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpvc225qdj/model.ckpt-5000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
Prediction is "Setosa" (79.5%), expected "Setosa"
Prediction is "Versicolor" (47.9%), expected "Versicolor"
Prediction is "Virginica" (60.8%), expected "Virginica"


Conclusion