In [1]:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'
# os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
plt.style.use('dark_background')

In [2]:
mnist = pd.read_csv('slmnist.csv', header=None)
mnist.shape

(2000, 785)

In [3]:
mnist[0].value_counts()

1    500
0    500
3    500
2    500
Name: 0, dtype: int64

In [4]:
features = tf.constant(mnist.iloc[:,1:].values, dtype='float32')
features

<tf.Tensor: shape=(2000, 784), dtype=float32, numpy=
array([[142., 143., 146., ...,  65.,  38.,  23.],
       [141., 142., 144., ..., 182., 183., 183.],
       [156., 157., 160., ..., 198., 193., 182.],
       ...,
       [177., 179., 180., ..., 239., 233., 240.],
       [121., 129., 138., ..., 197., 198., 211.],
       [178., 178., 178., ..., 195., 194., 192.]], dtype=float32)>

In [5]:
labels = tf.constant(tf.keras.utils.to_categorical(mnist[0].values))
labels

<tf.Tensor: shape=(2000, 4), dtype=float32, numpy=
array([[0., 1., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.],
       ...,
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 1., 0.]], dtype=float32)>

# Defining neural networks with Keras

## The sequential API
* Input layer
* Hidden layers
* Output layer
* Ordered in sequence

## Building a sequential model

In [6]:
# Import keras from tensorflow
from tensorflow import keras
# Define a sequential model
model = keras.Sequential()
model

<keras.engine.sequential.Sequential at 0x7f5b7dff18b0>

In [7]:
# Define first hidden layer
model.add(keras.layers.Dense(16, activation='relu', input_shape=(28*28,)))
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 16)                12560     
                                                                 
Total params: 12,560
Trainable params: 12,560
Non-trainable params: 0
_________________________________________________________________


In [8]:
# Define second hidden layer
model.add(keras.layers.Dense(8, activation='relu'))
# Define output layer
model.add(keras.layers.Dense(4, activation='softmax'))
# Compile the model
model.compile('adam', loss='categorical_crossentropy')
# Summarize the model
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 16)                12560     
                                                                 
 dense_1 (Dense)             (None, 8)                 136       
                                                                 
 dense_2 (Dense)             (None, 4)                 36        
                                                                 
Total params: 12,732
Trainable params: 12,732
Non-trainable params: 0
_________________________________________________________________
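
The Param # column can be checked by hand. A dense layer with n inputs and m units holds n × m weights plus m biases. The sketch below is plain Python (the helper name dense_params is ours, not part of Keras) and reproduces the counts in the summary above.

```python
# Hypothetical helper (not a Keras function): parameters in one dense layer
def dense_params(n_inputs, n_units):
    # One weight per input-unit pair, plus one bias per unit
    return n_inputs * n_units + n_units

# Layer widths of the sequential model: 784 inputs -> 16 -> 8 -> 4
widths = [28 * 28, 16, 8, 4]
counts = [dense_params(n_in, n_out) for n_in, n_out in zip(widths, widths[1:])]

print(counts)       # [12560, 136, 36]
print(sum(counts))  # 12732
```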


## Using the functional API

The sequential API builds a single stack of layers. But what if you want to train two models jointly to predict the same target? That is what the functional API is for.

In [9]:
# Define model 1 input layer shape
model1_inputs = tf.keras.Input(shape=(28*28,))
# Define model 2 input layer shape
model2_inputs = tf.keras.Input(shape=(10,))

In [10]:
# Define layer 1 for model 1
model1_layer1 = tf.keras.layers.Dense(12, activation='relu')(model1_inputs)
# Define layer 2 for model 1
model1_layer2 = tf.keras.layers.Dense(4, activation='softmax')(model1_layer1)

In [11]:
# Define layer 1 for model 2
model2_layer1 = tf.keras.layers.Dense(8, activation='relu')(model2_inputs)
# Define layer 2 for model 2
model2_layer2 = tf.keras.layers.Dense(4, activation='softmax')(model2_layer1)

In [12]:
# Merge model 1 and model 2
merged = tf.keras.layers.add([model1_layer2, model2_layer2])
# Define a functional model
model = tf.keras.Model(inputs=[model1_inputs, model2_inputs], outputs=merged)
model

<keras.engine.functional.Functional at 0x7f5b700eddf0>

In [13]:
model.summary()

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_1 (InputLayer)           [(None, 784)]        0           []                               
                                                                                                  
 input_2 (InputLayer)           [(None, 10)]         0           []                               
                                                                                                  
 dense_3 (Dense)                (None, 12)           9420        ['input_1[0][0]']                
                                                                                                  
 dense_5 (Dense)                (None, 8)            88          ['input_2[0][0]']                
                                                                                              

In [14]:
# Compile the model
model.compile('adam', loss='categorical_crossentropy')
model

<keras.engine.functional.Functional at 0x7f5b700eddf0>
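
The add merge combines the two branches by summing their outputs element by element. The sketch below mimics that in plain Python with made-up softmax outputs for a single example (the numbers are illustrative, not produced by the model above). Because each softmax output sums to 1, the merged vector sums to 2; whether to sum or average the branches is a design choice.

```python
# Made-up softmax outputs of the two branches for one example (illustrative only)
branch1 = [0.70, 0.10, 0.10, 0.10]
branch2 = [0.40, 0.30, 0.20, 0.10]

# Element-wise sum, which is what an "add" merge computes
merged = [a + b for a, b in zip(branch1, branch2)]

# The predicted class is the position of the largest merged score
predicted = max(range(len(merged)), key=merged.__getitem__)

print(merged)
print(round(sum(merged), 6))  # 2.0
print(predicted)              # 0
```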

## Exercises

### The sequential model in Keras

In chapter 3, we used components of the keras API in tensorflow to define a neural network, but we stopped short of using its full capabilities to streamline model definition and training. In this exercise, you will use the keras sequential model API to define a neural network that can be used to classify images of sign language letters. You will also use the .summary() method to print the model's architecture, including the shape and number of parameters associated with each layer.

In [15]:
# Define a Keras sequential model
model = keras.Sequential()
# Define the first dense layer
model.add(keras.layers.Dense(16, activation='relu', input_shape=(784,)))
# Define the second dense layer
model.add(keras.layers.Dense(8, activation='relu'))
# Define the output layer
model.add(keras.layers.Dense(4, activation='softmax'))
# Print the model architecture (summary() prints directly and returns None)
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_7 (Dense)             (None, 16)                12560     
                                                                 
 dense_8 (Dense)             (None, 8)                 136       
                                                                 
 dense_9 (Dense)             (None, 4)                 36        
                                                                 
Total params: 12,732
Trainable params: 12,732
Non-trainable params: 0
_________________________________________________________________


Notice that we've defined a model, but we haven't compiled it.

The compilation step in keras allows us to set the **optimizer, loss function, and other useful training parameters in a single line of code.**

### Compiling a sequential model

In this exercise, you will work towards classifying letters from the Sign Language MNIST dataset; however, you will adopt a different network architecture than what you used in the previous exercise. There will be fewer layers, but more nodes. You will also apply dropout to prevent overfitting. Finally, you will compile the model to use the adam optimizer and the categorical_crossentropy loss. You will also use a method in keras to summarize your model's architecture. Note that keras has been imported from tensorflow for you and a sequential keras model has been defined as model.
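
Dropout randomly zeroes a fraction of a layer's outputs during training. The sketch below illustrates the idea in plain Python for a rate of 0.25, as in this exercise. It assumes the common "inverted dropout" convention of scaling the surviving values by 1 / (1 - rate), and the function name dropout is ours rather than the Keras layer.

```python
import random

# Hypothetical helper (not the Keras layer): training-time dropout on a list
def dropout(values, rate, rng):
    # Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)
    # so the expected sum of the outputs is unchanged
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

rng = random.Random(0)    # fixed seed so the sketch is repeatable
activations = [1.0] * 16  # stand-in for the 16 outputs of the first layer
dropped = dropout(activations, 0.25, rng)

print(dropped)
```

At inference time no values are dropped and no scaling is applied.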

In [16]:
model = keras.Sequential()
# Define the first dense layer
model.add(keras.layers.Dense(16, activation='sigmoid', input_shape=(784,)))
# Apply dropout to the first layer's output
model.add(keras.layers.Dropout(0.25))
# Define the output layer
model.add(keras.layers.Dense(4, activation='softmax'))
# Compile the model
model.compile('adam', loss='categorical_crossentropy')
# Print a model summary
model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_10 (Dense)            (None, 16)                12560     
                                                                 
 dropout (Dropout)           (None, 16)                0         
                                                                 
 dense_11 (Dense)            (None, 4)                 68        
                                                                 
Total params: 12,628
Trainable params: 12,628
Non-trainable params: 0
_________________________________________________________________


### Defining a multiple input model

In some cases, the sequential API will not be sufficiently flexible to accommodate your desired model architecture and you will need to use the functional API instead. If, for instance, you want to train two models with different architectures jointly, you will need to use the functional API to do this. In this exercise, we will see how to do this. We will also use the .summary() method to examine the joint model's architecture.

Note that keras has been imported from tensorflow for you. Additionally, the input layers of the first and second models have been defined as m1_inputs and m2_inputs, respectively. Note that the two models use the same layer sizes, but one of them uses a sigmoid activation in the first layer and the other uses a relu. In this notebook their inputs also differ in width: 784 features for the first model and 10 for the second.

In [17]:
# Define model 1 input layer shape
m1_inputs = tf.keras.Input(shape=(28*28,))
# Define model 2 input layer shape
m2_inputs = tf.keras.Input(shape=(10,))

In [18]:
# For model 1, pass the input layer to layer 1 and layer 1 to layer 2
m1_layer1 = keras.layers.Dense(12, activation='sigmoid')(m1_inputs)
m1_layer2 = keras.layers.Dense(4, activation='softmax')(m1_layer1)
# For model 2, pass the input layer to layer 1 and layer 1 to layer 2
m2_layer1 = keras.layers.Dense(12, activation='relu')(m2_inputs)
m2_layer2 = keras.layers.Dense(4, activation='softmax')(m2_layer1)
# Merge model outputs and define a functional model
merged = keras.layers.add([m1_layer2, m2_layer2])
model = keras.Model(inputs=[m1_inputs, m2_inputs], outputs=merged)
# Print a model summary
model.summary()

Model: "model_1"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_3 (InputLayer)           [(None, 784)]        0           []                               
                                                                                                  
 input_4 (InputLayer)           [(None, 10)]         0           []                               
                                                                                                  
 dense_12 (Dense)               (None, 12)           9420        ['input_3[0][0]']                
                                                                                                  
 dense_14 (Dense)               (None, 12)           132         ['input_4[0][0]']                
                                                                                            

Notice that the .summary() method now shows an additional column: Connected to. This column tells you how layers connect to each other within the network.

# Training and validation with Keras

## Overview of training and evaluation
1. Load and clean data
2. Define model
3. Train and validate model
4. Evaluate model

## How to train a model

In [19]:
features.shape, labels.shape

(TensorShape([2000, 784]), TensorShape([2000, 4]))

In [20]:
# Define a sequential model
model = tf.keras.Sequential()
# Define the hidden layer
model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784,)))
# Define the output layer
model.add(tf.keras.layers.Dense(4, activation='softmax'))

In [21]:
# Compile model
model.compile('adam', loss='categorical_crossentropy')

In [22]:
# Train model
model.fit(features, labels)



<keras.callbacks.History at 0x7f5b70091280>

## The fit() operation
* Required arguments
    * features
    * labels
* Many optional arguments
    * batch_size
    * epochs
    * validation_split
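
To make validation_split and batch_size concrete, the sketch below works through the arithmetic for this dataset in plain Python. It assumes the documented Keras behavior of taking the validation samples from the end of the data before shuffling, and the default batch_size of 32. The helper name split_sizes is ours, and Keras's exact rounding may differ by one sample in edge cases.

```python
import math

# Hypothetical helper (not a Keras function): sizes of the two parts
def split_sizes(n_samples, validation_split):
    # Held-out samples are taken from the end of the data
    n_val = round(n_samples * validation_split)
    return n_samples - n_val, n_val

n_train, n_val = split_sizes(2000, 0.20)

# Number of batches (gradient updates) per epoch at batch_size=32
steps_per_epoch = math.ceil(n_train / 32)

print(n_train, n_val)   # 1600 400
print(steps_per_epoch)  # 50
```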

## Performing validation

In [23]:
# Train model with validation split
model.fit(features, labels, epochs=10, validation_split=0.20)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f5b4e6fedc0>

## Changing the metric

In [24]:
# Recompile the model with the accuracy metric
model.compile('adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Train model with validation split
model.fit(features, labels, epochs=10, validation_split=0.20)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f5b39794e50>

## Exercises

### Training with Keras

In this exercise, we return to our sign language letter classification problem. We have 2000 images of four letters--A, B, C, and D--and we want to classify them with a high level of accuracy. We will complete all parts of the problem, including the model definition, compilation, and training.

Note that keras has been imported from tensorflow for you. Additionally, the features and targets (sign_language_features and sign_language_labels in the original exercise) are available in this notebook as features and labels.

In [25]:
# Define a sequential model
model = keras.Sequential()
# Define a hidden layer
model.add(keras.layers.Dense(16, activation='relu', input_shape=(784,)))
# Define the output layer
model.add(keras.layers.Dense(4, activation='softmax'))
# Compile the model
model.compile('SGD', loss='categorical_crossentropy')

# Complete the fitting operation
model.fit(features, labels, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f5b30bb0280>

### Metrics and validation with Keras

We trained a model to predict sign language letters in the previous exercise, but it is unclear how successful we were in doing so. In this exercise, we will try to improve upon the interpretability of our results. Since we did not use a validation split, we only observed performance improvements within the training set; however, it is unclear how much of that was due to overfitting. Furthermore, since we did not supply a metric, we only saw decreases in the loss function, which do not have any clear interpretation.

Note that keras has been imported for you from tensorflow.
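
To see why accuracy is easier to interpret than the loss, the sketch below computes both by hand in plain Python for three made-up predictions (the labels and probabilities are illustrative, not model output). The loss averages the negative log-probability assigned to the correct class, while accuracy is simply the share of examples whose most probable class is the correct one.

```python
import math

# Made-up one-hot labels and predicted probabilities for three examples
y_true = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
y_pred = [[0.7, 0.1, 0.1, 0.1], [0.2, 0.5, 0.2, 0.1], [0.4, 0.3, 0.2, 0.1]]

def categorical_crossentropy(true_row, pred_row):
    # Negative log of the probability assigned to the correct class
    return -sum(t * math.log(p) for t, p in zip(true_row, pred_row))

def argmax(row):
    # Index of the largest entry
    return max(range(len(row)), key=row.__getitem__)

n = len(y_true)
loss = sum(categorical_crossentropy(t, p) for t, p in zip(y_true, y_pred)) / n
accuracy = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred)) / n

print(round(loss, 3))      # 1.117
print(round(accuracy, 3))  # 0.667
```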

In [26]:
# Define sequential model
model = keras.Sequential()
# Define the first layer
model.add(keras.layers.Dense(32, activation='sigmoid', input_shape=(784,)))
# Define the output layer with a softmax activation
model.add(keras.layers.Dense(4, activation='softmax'))
# Set the optimizer, loss function, and metrics
model.compile(optimizer='RMSprop', loss='categorical_crossentropy', metrics=['accuracy'])
# Add the number of epochs and the validation split
model.fit(features, labels, epochs=10, validation_split=0.1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f5b30adc070>

### Overfitting detection

In this exercise, we'll work with a small subset of the examples from the original sign language letters dataset. A small sample, coupled with a heavily-parameterized model, will generally lead to overfitting. This means that your model will simply memorize the class of each example, rather than identifying features that generalize to many examples.

You will detect overfitting by checking whether the validation sample loss is substantially higher than the training sample loss and whether it increases with further training. With a small sample and a high learning rate, the model will struggle to converge on an optimum. You will set a low learning rate for the optimizer, which will make it easier to identify overfitting.

Note that keras has been imported from tensorflow.
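
The check described above can be sketched in plain Python. The per-epoch losses below are made up for illustration; with Keras they would come from the History object returned by fit(), for example history.history['loss'] and history.history['val_loss']. The rule used here is only one possible heuristic.

```python
# Made-up per-epoch losses (illustrative only, not from the model below)
train_loss = [1.20, 0.80, 0.50, 0.30, 0.18, 0.10]
val_loss = [1.25, 0.90, 0.70, 0.68, 0.75, 0.85]

# Epoch (0-based) at which the validation loss was lowest
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# Heuristic: the validation loss has risen since its best epoch
# while the training loss has kept falling
overfitting = (
    best_epoch < len(val_loss) - 1
    and val_loss[-1] > val_loss[best_epoch]
    and train_loss[-1] < train_loss[best_epoch]
)

print(best_epoch)   # 3
print(overfitting)  # True
```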

In [27]:
# Define sequential model
model = keras.Sequential()
# Define the first layer
model.add(keras.layers.Dense(1024, activation='relu', input_shape=(784,)))
# Define the output layer with a softmax activation
model.add(keras.layers.Dense(4, activation='softmax'))
# Finish the model compilation
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), 
              loss='categorical_crossentropy', metrics=['accuracy'])
# Complete the model fit operation
model.fit(features, labels, epochs=50, validation_split=0.5)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


<keras.callbacks.History at 0x7f5b2a5196d0>

### Evaluating models

Two models have been trained and are available: large_model, which has many parameters; and small_model, which has fewer parameters. Both models have been trained using train_features and train_labels, which are available to you. A separate test set, which consists of test_features and test_labels, is also available.

Your goal is to evaluate relative model performance and also determine whether either model exhibits signs of overfitting. You will do this by evaluating large_model and small_model on both the train and test sets. For each model, you can do this by applying the .evaluate(x, y) method to compute the loss for features x and labels y. You will then compare the four losses generated.

In [28]:
from sklearn.model_selection import train_test_split
train_features, test_features, train_labels, test_labels = \
    train_test_split(mnist.iloc[:,1:], mnist[0], test_size=0.5)
train_features = tf.constant(train_features, dtype='float32')
test_features = tf.constant(test_features, dtype='float32')
train_labels = tf.constant(tf.keras.utils.to_categorical(train_labels))
test_labels = tf.constant(tf.keras.utils.to_categorical(test_labels))

In [29]:
small_model = keras.Sequential()
small_model.add(keras.layers.Dense(8, activation='relu', input_shape=(784,)))
small_model.add(keras.layers.Dense(4, activation='softmax'))
small_model.compile('SGD', loss='categorical_crossentropy')
small_model.fit(train_features, train_labels, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f5ad4244ca0>

In [30]:
# Note: despite its name, this model has only 2 hidden units, fewer than small_model's 8
large_model = keras.Sequential()
large_model.add(keras.layers.Dense(2, activation='relu', input_shape=(784,)))
large_model.add(keras.layers.Dense(4, activation='softmax'))
large_model.compile('SGD', loss='categorical_crossentropy')
large_model.fit(train_features, train_labels, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f5ad4169340>

In [31]:
# Evaluate the small model using the train data
small_train = small_model.evaluate(train_features, train_labels)
# Evaluate the small model using the test data
small_test = small_model.evaluate(test_features, test_labels)
# Evaluate the large model using the train data
large_train = large_model.evaluate(train_features, train_labels)
# Evaluate the large model using the test data
large_test = large_model.evaluate(test_features, test_labels)
# Print losses
print('\n Small - Train: {}, Test: {}'.format(small_train, small_test))
print('Large - Train: {}, Test: {}'.format(large_train, large_test))


 Small - Train: 1.3860633373260498, Test: 1.386661171913147
Large - Train: 1.3861362934112549, Test: 1.3865036964416504
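
All four losses sit very close to 1.386. For reference, a classifier that assigns equal probability to each of the four classes has a categorical cross-entropy of ln 4, as the short calculation below shows. The values above therefore suggest that neither model has learned much after 5 epochs. One possible contributor, which this notebook does not test, is that the pixel values are fed in unscaled (0 to 255).

```python
import math

n_classes = 4

# Cross-entropy of a model that assigns probability 1/4 to every class
uniform_loss = -math.log(1 / n_classes)

print(round(uniform_loss, 4))  # 1.3863
```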


# Training models with the Estimators API

## What is the Estimators API?
* High level submodule
* Less flexible
* Enforces best practices
* Faster deployment
* Many premade models

## Model specification and training
1. Define feature columns
2. Load and transform data
3. Define an estimator
4. Apply train operation

## Defining feature columns

In [45]:
# Define a numeric feature column
size = tf.feature_column.numeric_column("size")
size

NumericColumn(key='size', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)

In [46]:
# Define a categorical feature column
rooms = tf.feature_column.categorical_column_with_vocabulary_list("rooms",
                                                                  ["1", "2", "3", "4", "5"])
rooms

VocabularyListCategoricalColumn(key='rooms', vocabulary_list=('1', '2', '3', '4', '5'), dtype=tf.string, default_value=-1, num_oov_buckets=0)

In [47]:
rooms = tf.feature_column.indicator_column(rooms)
rooms

IndicatorColumn(categorical_column=VocabularyListCategoricalColumn(key='rooms', vocabulary_list=('1', '2', '3', '4', '5'), dtype=tf.string, default_value=-1, num_oov_buckets=0))
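
Conceptually, the indicator column turns each categorical value into a one-hot vector over the vocabulary. The sketch below shows that encoding in plain Python; the helper one_hot is ours and is not a TensorFlow function. In this sketch, a value outside the vocabulary maps to all zeros.

```python
# Vocabulary from the categorical column above
vocabulary = ["1", "2", "3", "4", "5"]

# Hypothetical helper (not a TensorFlow function): one-hot encode one value
def one_hot(value, vocab):
    return [1.0 if value == item else 0.0 for item in vocab]

# Three example room counts
encoded = [one_hot(v, vocabulary) for v in ["1", "3", "4"]]

for row in encoded:
    print(row)
# [1.0, 0.0, 0.0, 0.0, 0.0]
# [0.0, 0.0, 1.0, 0.0, 0.0]
# [0.0, 0.0, 0.0, 1.0, 0.0]
```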

In [48]:
# Create feature column list
features_list = [size, rooms]
features_list

[NumericColumn(key='size', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 IndicatorColumn(categorical_column=VocabularyListCategoricalColumn(key='rooms', vocabulary_list=('1', '2', '3', '4', '5'), dtype=tf.string, default_value=-1, num_oov_buckets=0))]

In [49]:
# other case: Define a matrix feature column
[tf.feature_column.numeric_column('image', shape=(784,))]

[NumericColumn(key='image', shape=(784,), default_value=None, dtype=tf.float32, normalizer_fn=None)]

## Loading and transforming data

In [50]:
# Define input data function
def input_fn():
    # Define feature dictionary
    features = {"size": [1340, 1690, 2720], "rooms": ["1", "3", "4"]}
    # Define labels
    labels = [221900, 538000, 180000]
    return features, labels

In [51]:
input_fn()

({'size': [1340, 1690, 2720], 'rooms': ['1', '3', '4']},
 [221900, 538000, 180000])

## Define and train a regression estimator

In [52]:
# Define a deep neural network regression
model0 = tf.estimator.DNNRegressor(feature_columns=features_list,
                                   hidden_units=[10, 6, 6, 3])

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmp6hm96loe', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


In [53]:
# Train the regression model
model0.train(input_fn, steps=20)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmp6hm96loe/model.ckpt.
INFO:tensorflow:/tmp/tmp6hm96loe/model.ckpt-0.index
INFO:tensorflow:0
INFO:tensorflow:/tmp/tmp6hm96loe/model.ckpt-0.data-00000-of-00001
INFO:tensorflow:0
INFO:tensorflow:/tmp/tmp6hm96loe/model.ckpt-0.meta
INFO:tensorflow:200
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 123695170000.0, step = 0
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 20...
INFO:tensorflow:Saving checkpoints for 20 into /tmp/tmp6hm96loe/model.ckpt.
INFO:tensorflow:/tmp/tmp6hm96loe/model.ckpt-20.meta
INFO:tensorflow:200
INFO:tensorflow:/tmp/tmp6hm96loe/model.ckp

<tensorflow_estimator.python.estimator.canned.dnn.DNNRegressorV2 at 0x7f5b70062700>

## Define and train a classifier estimator

```
# Define a deep neural network classifier
model1 = tf.estimator.DNNClassifier(feature_columns=features_list,
                                    hidden_units=[32, 16, 8], n_classes=4)

# Train the classifier
model1.train(input_fn, steps=20)
```

https://www.tensorflow.org/guide/estimators

## Exercises

In [54]:
housing = pd.read_csv('kc_house_data.csv')
housing

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
0,7129300520,20141013T000000,221900.0,3,1.00,1180,5650,1.0,0,0,...,7,1180,0,1955,0,98178,47.5112,-122.257,1340,5650
1,6414100192,20141209T000000,538000.0,3,2.25,2570,7242,2.0,0,0,...,7,2170,400,1951,1991,98125,47.7210,-122.319,1690,7639
2,5631500400,20150225T000000,180000.0,2,1.00,770,10000,1.0,0,0,...,6,770,0,1933,0,98028,47.7379,-122.233,2720,8062
3,2487200875,20141209T000000,604000.0,4,3.00,1960,5000,1.0,0,0,...,7,1050,910,1965,0,98136,47.5208,-122.393,1360,5000
4,1954400510,20150218T000000,510000.0,3,2.00,1680,8080,1.0,0,0,...,8,1680,0,1987,0,98074,47.6168,-122.045,1800,7503
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
21608,263000018,20140521T000000,360000.0,3,2.50,1530,1131,3.0,0,0,...,8,1530,0,2009,0,98103,47.6993,-122.346,1530,1509
21609,6600060120,20150223T000000,400000.0,4,2.50,2310,5813,2.0,0,0,...,8,2310,0,2014,0,98146,47.5107,-122.362,1830,7200
21610,1523300141,20140623T000000,402101.0,2,0.75,1020,1350,2.0,0,0,...,7,1020,0,2009,0,98144,47.5944,-122.299,1020,2007
21611,291310100,20150116T000000,400000.0,3,2.50,1600,2388,2.0,0,0,...,8,1600,0,2004,0,98027,47.5345,-122.069,1410,1287


### Preparing to train with Estimators

For this exercise, we'll return to the King County housing transaction dataset from chapter 2. We will again develop and train a machine learning model to predict house prices; however, this time, we'll do it using the estimator API.

Rather than completing everything in one step, we'll break this procedure down into parts. We'll begin by defining the feature columns and loading the data. In the next exercise, we'll define and train a premade estimator. Note that feature_column has been imported for you from tensorflow. Additionally, numpy has been imported as np, and the King County housing dataset is available as a pandas DataFrame: housing.

In [55]:
# Define feature columns for bedrooms and bathrooms
bedrooms = tf.feature_column.numeric_column("bedrooms")
bathrooms = tf.feature_column.numeric_column('bathrooms')

# Define the list of feature columns
feature_list = [bedrooms, bathrooms]

def input_fn():
    # Define the labels
    labels = np.array(housing['price'])
    # Define the features
    features = {'bedrooms': np.array(housing['bedrooms']),
                'bathrooms': np.array(housing['bathrooms'])}
    return features, labels

### Defining Estimators

In the previous exercise, you defined a list of feature columns, feature_list, and a data input function, input_fn(). In this exercise, you will build on that work by defining an estimator that makes use of input data.

Use a deep neural network regressor with 2 nodes in both the first and second hidden layers and 1 training step.

In [56]:
# Define the model and set the number of steps
model = tf.estimator.DNNRegressor(feature_columns=feature_list, hidden_units=[2,2])
model.train(input_fn, steps=1)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmp0yy1oqda', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:ten

<tensorflow_estimator.python.estimator.canned.dnn.DNNRegressorV2 at 0x7f5aa0648400>

Modify the code to use a LinearRegressor(), remove the hidden_units, and set the number of steps to 2.

In [57]:
# Define the model and set the number of steps
model = tf.estimator.LinearRegressor(feature_columns=feature_list)
model.train(input_fn, steps=2)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmp_v7m6m4n', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:ten

<tensorflow_estimator.python.estimator.canned.linear.LinearRegressorV2 at 0x7f5aa018f430>

# What you learned
* Chapter 1
    * Low-level, basic, and advanced operations
    * Graph-based computation
    * Gradient computation and optimization
* Chapter 2
    * Data loading and transformation
    * Predefined and custom loss functions
    * Linear models and batch training
* Chapter 3
    * Dense neural network layers
    * Activation functions
    * Optimization algorithms
    * Training neural networks
* Chapter 4
    * Neural networks in Keras
    * Training and validation
    * The Estimators API

# TensorFlow extensions

In addition to what we covered, there are also two important TensorFlow extensions that did not fit into the course, but may be worthwhile to explore on your own. The first is TensorFlow Hub, which allows users to import pretrained models that can then be used to perform transfer learning. This will be particularly useful when you want to train an image classifier with a small number of images, but want to make use of a feature-extractor trained on a much larger set of different images.

TensorFlow Probability is another exciting extension, which is also currently available as a standalone module. One benefit of using TensorFlow Probability is that it provides additional statistical distributions that can be used for random number generation. It also enables you to incorporate trainable statistical distributions into your models. Finally, TensorFlow Probability provides an extended set of optimizers that are commonly used in statistical research. This gives you additional tools beyond what the core TensorFlow module provides. 

* TensorFlow Hub
    * Pretrained models
    * Transfer learning
* TensorFlow Probability
    * More statistical distributions
    * Trainable distributions
    * Extended set of optimizers

# TensorFlow 2.0

Finally, I will say a few words about the difference between TensorFlow 2 and TensorFlow 1. If you primarily develop in TensorFlow 1, you may have noticed that this course never defined static graphs or enabled eager execution: TensorFlow 2 executes eagerly by default. Furthermore, TensorFlow 2 has substantially tighter integration with Keras. In fact, the core functionality of the TensorFlow 1 train module is handled by tf.keras operations in 2. In addition to the centrality of Keras, the Estimators API was also given a more prominent role in TensorFlow 2.0, although later TensorFlow 2 releases deprecated it in favor of Keras. Finally, TensorFlow 2 still allows you to use static graphs, but they are available through the tf.function operation.

* TensorFlow 2.0
    * eager_execution()
    * Tighter keras integration
    * Estimators
    * function()