# Chapter Four. Defining neural networks with Keras

In this final chapter, you'll use high-level APIs in TensorFlow to train a sign language letter classifier. You will use both the sequential and functional Keras APIs to train, validate, and evaluate models. You will also learn how to use the Estimators API to streamline the model definition and training process and to avoid errors.

> **Topics:**
- 1. Defining neural networks with Keras
    - 1.1 The sequential model in Keras
    - 1.2 Compiling a sequential model
    - 1.3 Defining a multiple input model
- 2. Training and validation with Keras
    - 2.1 Training with Keras
    - 2.2 Metrics and validation with Keras
    - 2.3 Overfitting detection
    - 2.4 Evaluating models
- 3. Training models with the Estimators API
    - 3.1 Preparing to train with Estimators
    - 3.2 Defining Estimators
- 4. Congratulations!

In [15]:
import numpy as np
import pandas as pd
import tensorflow as tf

from tensorflow import keras, Variable, ones, matmul

filepath = '../_datasets/'

## 1. Defining Neural Networks with Keras

### Classifying sign language letters

![][01-sign_language_letters]

### The sequential API

![][02-sequential_API]

- Input layer
- Hidden layers
- Output layer
- Ordered in sequence

### Building a sequential model
```Python
# Import tensorflow
import tensorflow as tf

# Define a sequential model
model = tf.keras.Sequential()

# Define first hidden layer (784 flattened pixel inputs)
model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(28*28,)))

# Define second hidden layer
model.add(tf.keras.layers.Dense(8, activation='relu'))

# Define output layer (one node per letter class)
model.add(tf.keras.layers.Dense(4, activation='softmax'))

# Compile the model
model.compile('adam', loss='categorical_crossentropy')
```
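To make "ordered in sequence" concrete, the following is a minimal NumPy sketch of the forward pass that a 784 → 16 → 8 → 4 network performs. It is not how Keras implements it internally; the random weights, the helper names `relu` and `softmax`, and the batch of five made-up images are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Element-wise max(x, 0)
    return np.maximum(x, 0.0)

def softmax(x):
    # Subtract the row-wise max for numerical stability
    shifted = x - x.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical random weights with the same shapes as the layers above
sizes = [28 * 28, 16, 8, 4]
weights = [rng.normal(scale=0.05, size=(n_in, n_out))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

# A batch of 5 made-up flattened images
batch = rng.random((5, 28 * 28))

# Each layer consumes the output of the previous one, in order
hidden1 = relu(batch @ weights[0] + biases[0])
hidden2 = relu(hidden1 @ weights[1] + biases[1])
probabilities = softmax(hidden2 @ weights[2] + biases[2])

print(probabilities.shape)
print(probabilities.sum(axis=1))
```

Each row of `probabilities` holds one value per class and sums to one, which is what the `softmax` output layer provides.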

### The functional API

![][03-functional_API]

### Using the functional API
```Python
# Import tensorflow
import tensorflow as tf

# Define model 1 input layer shape
model1_inputs = tf.keras.Input(shape=(28*28,))

# Define model 2 input layer shape
model2_inputs = tf.keras.Input(shape=(10,))

# Define layer 1 for model 1
model1_layer1 = tf.keras.layers.Dense(12, activation='relu')(model1_inputs)

# Define layer 2 for model 1
model1_layer2 = tf.keras.layers.Dense(4, activation='softmax')(model1_layer1)

# Define layer 1 for model 2
model2_layer1 = tf.keras.layers.Dense(8, activation='relu')(model2_inputs)

# Define layer 2 for model 2
model2_layer2 = tf.keras.layers.Dense(4, activation='softmax')(model2_layer1)

# Merge model 1 and model 2
merged = tf.keras.layers.add([model1_layer2, model2_layer2])

# Define a functional model
model = tf.keras.Model(inputs=[model1_inputs, model2_inputs], outputs=merged)

# Compile the model
model.compile('adam', loss='categorical_crossentropy')
```

[01-sign_language_letters]:_Docs/01-sign_language_letters.png
[02-sequential_API]:_Docs/02-sequential_API.png
[03-functional_API]:_Docs/03-functional_API.png

### 1.1 The sequential model in Keras
In chapter 3, we used components of the `keras` API in `tensorflow` to define a neural network, but we stopped short of using its full capabilities to streamline model definition and training. In this exercise, you will use the `keras` sequential model API to define a neural network that can be used to classify images of sign language letters. You will also use the `.summary()` method to print the model's architecture, including the shape and number of parameters associated with each layer.

Note that the images were reshaped from (28, 28) to (784,), so that they could be used as inputs to a dense layer. Additionally, note that `keras` has been imported from `tensorflow` for you.

In [2]:
# Define a Keras sequential model
model = keras.Sequential()

# Define the first dense layer
model.add(keras.layers.Dense(16, activation='relu', input_shape=(784,)))

# Define the second dense layer
model.add(keras.layers.Dense(8, activation='relu'))

# Define the output layer
model.add(keras.layers.Dense(4, activation='softmax'))

# Print the model architecture
print(model.summary())

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 16)                12560     
_________________________________________________________________
dense_1 (Dense)              (None, 8)                 136       
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 36        
Total params: 12,732
Trainable params: 12,732
Non-trainable params: 0
_________________________________________________________________
None


Notice that we've defined a model, but we haven't compiled it. ***The compilation step in `keras` allows us to set the optimizer, loss function, and other useful training parameters in a single line of code***. Furthermore, the `.summary()` method allows us to view the model's architecture.
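The `Param #` column in the summary can be reproduced by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The sketch below checks this arithmetic for the 784 → 16 → 8 → 4 model; `dense_params` is a hypothetical helper name, not part of `keras`.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Input size followed by the number of units in each dense layer
layer_sizes = [784, 16, 8, 4]

counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(counts)

print(counts, total)
```

The per-layer counts (12560, 136, 36) and the total (12,732) match the summary printed above.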

### 1.2 Compiling a sequential model
In this exercise, you will work towards classifying letters from the Sign Language MNIST dataset; however, you will adopt a different network architecture than what you used in the previous exercise. There will be fewer layers, but more nodes. Additionally, you will compile the model to use the `adam` optimizer and the `categorical_crossentropy` loss. You will also use a method in `keras` to summarize your model's architecture.

In [3]:
# Define a Keras sequential model
model = keras.Sequential()

# Define the first dense layer
model.add(keras.layers.Dense(16, activation='sigmoid', input_shape=(784,)))

# Define the output layer
model.add(keras.layers.Dense(4, activation='softmax'))

# Compile the model
model.compile('adam', loss='categorical_crossentropy')

# Print a model summary
print(model.summary())

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_3 (Dense)              (None, 16)                12560     
_________________________________________________________________
dense_4 (Dense)              (None, 4)                 68        
Total params: 12,628
Trainable params: 12,628
Non-trainable params: 0
_________________________________________________________________
None


### 1.3 Defining a multiple input model
In some cases, the **sequential API** will not be sufficiently flexible to accommodate your desired model architecture and you will need to use the **functional API** instead. ***If, for instance, you want to train two models with different architectures jointly, you will need to use the functional API to do this***. In this exercise, we will see how to do this. We will also use the `.summary()` method to examine the joint model's architecture.

Note that `keras` has been imported from `tensorflow` for you. Additionally, the input layers of the first and second models have been defined as `m1_inputs` and `m2_inputs`, respectively. Note that the two models have the same architecture, but one of them uses a `sigmoid` activation in the first layer and the other uses a `relu`.

In [4]:
m1_inputs = tf.keras.layers.Input(shape=(28*28,))
m2_inputs = tf.keras.layers.Input(shape=(28*28,))

print(m1_inputs)
print(m2_inputs)

Tensor("input_1:0", shape=(None, 784), dtype=float32)
Tensor("input_2:0", shape=(None, 784), dtype=float32)


In [5]:
# For model 1, pass the input layer to layer 1 and layer 1 to layer 2
m1_layer1 = keras.layers.Dense(12, activation='sigmoid')(m1_inputs)
m1_layer2 = keras.layers.Dense(4, activation='softmax')(m1_layer1)

# For model 2, pass the input layer to layer 1 and layer 1 to layer 2
m2_layer1 = keras.layers.Dense(12, activation='relu')(m2_inputs)
m2_layer2 = keras.layers.Dense(4, activation='softmax')(m2_layer1)

# Merge model outputs and define a functional model
merged = keras.layers.add([m1_layer2, m2_layer2])
model = keras.Model(inputs=[m1_inputs, m2_inputs], outputs=merged)

# Print a model summary
print(model.summary())

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 784)]        0                                            
__________________________________________________________________________________________________
input_2 (InputLayer)            [(None, 784)]        0                                            
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 12)           9420        input_1[0][0]                    
__________________________________________________________________________________________________
dense_7 (Dense)                 (None, 12)           9420        input_2[0][0]                    
______________________________________________________________________________________________

Notice that the `.summary()` method yields a new column: `Connected to`. This column tells you how layers connect to each other within the network. We can see that `dense_7`, for instance, is connected to the `input_2` layer. The `add` layer, which merges the two models, is in turn connected to the second dense layer of each model.
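One consequence of merging with `add` is worth spelling out: the layer sums its inputs element-wise, so adding two softmax outputs gives rows that sum to two rather than one. The NumPy sketch below illustrates this with made-up logits; the `softmax` helper and the averaging step are assumptions for illustration, not part of the exercise.

```python
import numpy as np

def softmax(x):
    # Subtract the row-wise max for numerical stability
    shifted = x - x.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)

# Made-up pre-activation outputs of the two models for 3 examples, 4 classes
logits_m1 = rng.normal(size=(3, 4))
logits_m2 = rng.normal(size=(3, 4))

out_m1 = softmax(logits_m1)
out_m2 = softmax(logits_m2)

# Element-wise sum, which is what an add-style merge computes
merged = out_m1 + out_m2

# Dividing by the number of models restores a probability distribution
averaged = merged / 2

print(merged.sum(axis=1))
print(averaged.sum(axis=1))
```

If the merged output needs to be read as class probabilities, averaging instead of adding is one option to consider.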

## 2. Training and validation with Keras

### Overview of training and evaluation
1. Load and clean data
2. Define model
3. Train and validate model
4. Evaluate model

### How to train a model

```Python
# Import tensorflow
import tensorflow as tf

# Define a sequential model
model = tf.keras.Sequential()

# Define the hidden layer
model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784,)))

# Define the output layer
model.add(tf.keras.layers.Dense(4, activation='softmax'))

# Compile model
model.compile('adam', loss='categorical_crossentropy')

# Train model
model.fit(image_features, image_labels)
```

### The fit() operation
- Required arguments
    - `features`
    - `labels`
- Many optional arguments
    - `batch_size`
    - `epochs`
    - `validation_split`

### Batch size and epochs

- The number of examples in each batch is the **batch size**.
- The number of times you train on the full set of batches is the **number of epochs**.
- In the image, the batch size is 5 and the number of epochs is 2.

![][04-Batches_epochs]
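As a rough worked example of the arithmetic behind batches and epochs, the sketch below counts parameter updates. It assumes 2000 training examples (the size of the sign language dataset used in the exercises), a batch size of 32 (the default in `fit()` when none is given), and 10 epochs; `steps_per_epoch` is a hypothetical helper name.

```python
import math

def steps_per_epoch(n_examples, batch_size):
    """Number of batches needed to see every example once."""
    # The last batch may be smaller than batch_size, hence the ceiling
    return math.ceil(n_examples / batch_size)

n_examples = 2000
batch_size = 32
epochs = 10

per_epoch = steps_per_epoch(n_examples, batch_size)
total_updates = per_epoch * epochs

print(per_epoch, total_updates)
```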

### Performing validation

- The `validation_split` parameter divides the data into two parts.
    - The first part is the training set
    - The second part is the validation set
- Setting `validation_split=0.20` means 20% of the data will be used for validation

![][05-validation]

```Python
# Train model with validation split
model.fit(features, labels, epochs=10, validation_split=0.20)
```
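The sketch below illustrates what a validation split amounts to. Keras takes the validation samples from the end of the arrays, before any shuffling; the NumPy version here mimics that behaviour on a toy array and is only an approximation of the idea, with `split_last_fraction` as a hypothetical helper name.

```python
import numpy as np

def split_last_fraction(features, labels, validation_split):
    """Hold out the last `validation_split` fraction of the rows."""
    split_at = int(len(features) * (1.0 - validation_split))
    train = (features[:split_at], labels[:split_at])
    validation = (features[split_at:], labels[split_at:])
    return train, validation

# Toy data: 10 examples with 2 features each
features = np.arange(20).reshape(10, 2)
labels = np.arange(10)

(train_x, train_y), (val_x, val_y) = split_last_fraction(features, labels, 0.20)

print(len(train_x), len(val_x))
print(val_y)
```

Because the held-out rows come from the end of the data, it can be worth shuffling the dataset beforehand if it is ordered by class.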

- In the next image we can see the training loss and the validation loss separately.
- If the training loss becomes substantially lower than the validation loss, this is a clear indication that the model is **overfitting**. To avoid overfitting we could:
    - terminate the training process before that point, or
    - add regularization, or
    - add dropout.

![][06-validation]

### Changing the metric
```Python
# Recompile the model with the accuracy metric
model.compile('adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train model with validation split
model.fit(features, labels, epochs=10, validation_split=0.20)
```

![][07-metric]

### The evaluate() operation

- It's always a good idea to split off a test set before you begin to train and validate. **This way you can check performance on the test set at the end of the training process**.
- Since you may tune model parameters in response to validation set performance, **using a separate test set will provide you with further assurance that you haven't overfitted**.

![][08-evaluation]

```Python
# Evaluate the model on the held-out test set
model.evaluate(test_features, test_labels)
```

[04-Batches_epochs]:_Docs/04-Batches_epochs.png
[05-validation]:_Docs/05-validation.png
[06-validation]:_Docs/06-validation.png
[07-metric]:_Docs/07-metric.png
[08-evaluation]:_Docs/08-evaluation.png

### 2.1 Training with Keras
In this exercise, we return to our sign language letter classification problem. We have 2000 images of four letters--A, B, C, and D--and we want to classify them with a high level of accuracy. We will complete all parts of the problem, including the model definition, compilation, and training.

Note that `keras` has been imported from `tensorflow` for you. Additionally, the features are available as `sign_language_features` and the targets are available as `sign_language_labels`.

In [6]:
file = filepath+'slmnist.csv'

In [8]:
df_slmnist = pd.read_csv(file, header=None)
df_slmnist.head()

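|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 775 | 776 | 777 | 778 | 779 | 780 | 781 | 782 | 783 | 784 |
|---|---|---|---|---|---|---|---|---|---|---|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| 0 | 1 | 142 | 143 | 146 | 148 | 149 | 149 | 149 | 150 | 151 | ... | 0 | 15 | 55 | 63 | 37 | 61 | 77 | 65 | 38 | 23 |
| 1 | 0 | 141 | 142 | 144 | 145 | 147 | 149 | 150 | 151 | 152 | ... | 173 | 179 | 179 | 180 | 181 | 181 | 182 | 182 | 183 | 183 |
| 2 | 1 | 156 | 157 | 160 | 162 | 164 | 166 | 169 | 171 | 171 | ... | 181 | 197 | 195 | 193 | 193 | 191 | 192 | 198 | 193 | 182 |
| 3 | 3 | 63 | 26 | 65 | 86 | 97 | 106 | 117 | 123 | 128 | ... | 175 | 179 | 180 | 182 | 183 | 183 | 184 | 185 | 185 | 185 |
| 4 | 1 | 156 | 160 | 164 | 168 | 172 | 175 | 178 | 180 | 182 | ... | 108 | 107 | 106 | 110 | 111 | 108 | 108 | 102 | 84 | 70 |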


In [35]:
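from sklearn.preprocessing import OneHotEncoder

# Take the label column (column 0) as a 2-D array, as the encoder expects
sign_language_labels = df_slmnist[df_slmnist.columns[0]].values.reshape(-1, 1)

# One-hot encode the labels
# (scikit-learn 1.2+ renames the `sparse` argument to `sparse_output`)
onehot_encoder = OneHotEncoder(sparse=False, categories='auto')
sign_language_labels = onehot_encoder.fit_transform(sign_language_labels)
sign_language_labels[:5]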

array([[0., 1., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 0., 1.],
       [0., 1., 0., 0.]])
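For intuition, the same encoding can be reproduced in plain NumPy by indexing an identity matrix with the labels. This sketch assumes the labels are the integers 0 to 3 and uses the first five labels from the table above (1, 0, 1, 3, 1); it is an illustration, not a replacement for `OneHotEncoder`.

```python
import numpy as np

# First five labels from the dataset preview
labels = np.array([1, 0, 1, 3, 1])
n_classes = 4

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(n_classes)[labels]

print(one_hot)
```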

In [63]:
# Selecting scaler
from sklearn import preprocessing
max_abs_scaler = preprocessing.MaxAbsScaler()

# Selecting features and applying scaler
sign_language_features = max_abs_scaler.fit_transform(df_slmnist[df_slmnist.columns[1:]].values)

784
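To see what the scaler does to the pixel values, here is a small NumPy sketch of max-abs scaling: each column is divided by its largest absolute value, so the results lie in [-1, 1]. The first and last columns reuse pixel values from the first three rows of the preview above; the all-zero middle column is made up to show how a zero maximum can be handled.

```python
import numpy as np

pixels = np.array([[142.0, 0.0, 23.0],
                   [141.0, 0.0, 183.0],
                   [156.0, 0.0, 182.0]])

# Largest absolute value in each column
max_abs = np.abs(pixels).max(axis=0)

# Avoid dividing by zero for columns that are entirely zero
safe_scale = np.where(max_abs == 0, 1.0, max_abs)

scaled = pixels / safe_scale

print(scaled)
```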

In [61]:
# Define a sequential model
model = keras.Sequential()

# Define a hidden layer
model.add(keras.layers.Dense(16, activation='relu', input_shape=(784,)))

# Define the output layer
model.add(keras.layers.Dense(4, activation='softmax'))

# Compile the model
model.compile('SGD', loss='categorical_crossentropy')

# Complete the fitting operation
model.fit(sign_language_features, sign_language_labels, epochs=10)

Train on 2000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x1e39c30abe0>

### 2.2 Metrics and validation with Keras
We trained a model to predict sign language letters in the previous exercise, but it is unclear how successful we were in doing so. In this exercise, we will try to improve upon the interpretability of our results. Since **we did not use a validation split, we only observed performance improvements within the training set; however, it is unclear how much of that was due to overfitting**. Furthermore, since we did not supply a metric, **we only saw decreases in the loss function, which do not have any clear interpretation**.
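The difference between a loss and a metric can be sketched with a few made-up predictions. The code below computes categorical cross-entropy and accuracy by hand in NumPy; the function names and the four example predictions are assumptions for illustration and only approximate what `keras` computes.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean negative log-probability assigned to the true class."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))

def accuracy(y_true, y_pred):
    """Fraction of examples whose most probable class is the true class."""
    return float(np.mean(y_true.argmax(axis=1) == y_pred.argmax(axis=1)))

# One-hot labels for four made-up examples
y_true = np.eye(4)[[1, 0, 1, 3]]

# Made-up softmax outputs; the third example is misclassified
y_pred = np.array([
    [0.10, 0.70, 0.10, 0.10],
    [0.60, 0.20, 0.10, 0.10],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.05, 0.10, 0.80],
])

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)

print(loss, acc)
```

The accuracy reads directly as "three out of four correct", whereas the loss value on its own says little without a reference point.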

In [69]:
# Define sequential model
model = keras.Sequential()

# Define the first layer
model.add(keras.layers.Dense(32, activation='sigmoid', input_shape=(sign_language_features.shape[1],)))

# Add activation function to classifier
model.add(keras.layers.Dense(4, activation='softmax'))

# Set the optimizer, loss function, and metrics
model.compile(optimizer='RMSprop', loss='categorical_crossentropy', metrics=['accuracy'])

# Add the number of epochs and the validation split
model.fit(sign_language_features, sign_language_labels, epochs=10, validation_split=0.1)

Train on 1800 samples, validate on 200 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x1e3b5524400>

With the `keras` API, you only needed 14 lines of code to define, compile, train, and validate a model. You may have noticed that your model performed quite well. In just 10 epochs, we achieved a classification accuracy of around 99% in the validation sample!

### 2.3 Overfitting detection
In this exercise, we'll work with a **small subset of the examples (50) from the original sign language letters dataset**. ***A small sample, coupled with a heavily-parameterized model, will generally lead to overfitting***. This means that your model will simply memorize the class of each example, rather than identifying features that generalize to many examples.

You will detect overfitting by checking whether the validation sample loss is substantially higher than the training sample loss and whether it increases with further training. With a small sample and a high learning rate, the model will struggle to converge on an optimum. You will set a low learning rate for the optimizer, which will make it easier to identify overfitting.
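The effect of the learning rate on convergence can be illustrated on a toy problem. The sketch below runs plain gradient descent on f(w) = w², which is far simpler than the Keras model and uses a different update rule than Adam; it only shows why an overly high learning rate can prevent convergence.

```python
def gradient_descent(learning_rate, start=1.0, steps=50):
    """Minimize f(w) = w**2, whose gradient is 2*w."""
    w = start
    for _ in range(steps):
        w -= learning_rate * 2 * w
    return w

# Each step multiplies w by (1 - 2 * learning_rate)
w_high = gradient_descent(learning_rate=1.1)  # factor -1.2: overshoots and grows
w_low = gradient_descent(learning_rate=0.1)   # factor 0.8: shrinks toward 0

print(w_high, w_low)
```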

In [121]:
# Generate a random list for selecting a subsample of 50 examples
import random
l_index = random.sample(range(0,200), 50)

subsample_sign_language_labels = sign_language_labels[l_index]
subsample_sign_language_features = sign_language_features[l_index]

In [122]:
# Define sequential model
model = keras.Sequential()

# Define the first layer
model.add(keras.layers.Dense(1024, activation='relu', input_shape=(784,)))

# Add activation function to classifier
model.add(keras.layers.Dense(4, activation='softmax'))

# Finish the model compilation with an explicit, low learning rate
# (0.001 is also the default for Adam)
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy', metrics=['accuracy'])

# Complete the model fit operation
model.fit(subsample_sign_language_features, subsample_sign_language_labels, epochs=200, validation_split=0.5)

Train on 25 samples, validate on 25 samples
Epoch 1/200
...
Epoch 200/200


<tensorflow.python.keras.callbacks.History at 0x1e3cbc3dbe0>

You may have noticed that the validation loss, `val_loss`, **was substantially higher than the training loss, `loss`.** Furthermore, if `val_loss` started to increase before the training process was terminated, then we may have overfitted. When this happens, you will want to **try decreasing the number of epochs.**
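The check described here can be sketched in a few lines. The object returned by `fit()` exposes per-epoch values in its `.history` dictionary under keys such as `loss` and `val_loss`; the curves below are made up, and `best_epoch` and `looks_overfitted` are hypothetical helpers rather than `keras` functions.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1

def looks_overfitted(train_losses, val_losses, patience=3):
    """Flag overfitting if, over the last `patience` epochs,
    validation loss rose while training loss kept falling."""
    if len(val_losses) <= patience:
        return False
    val_rising = val_losses[-1] > val_losses[-1 - patience]
    train_falling = train_losses[-1] < train_losses[-1 - patience]
    return val_rising and train_falling

# Made-up curves in the same layout as a Keras history dictionary
history = {
    "loss":     [1.40, 1.10, 0.80, 0.55, 0.35, 0.20, 0.10, 0.05],
    "val_loss": [1.45, 1.20, 1.00, 0.95, 0.97, 1.05, 1.15, 1.30],
}

stop_at = best_epoch(history["val_loss"])
overfitted = looks_overfitted(history["loss"], history["val_loss"])

print(stop_at, overfitted)
```

In this made-up run, the validation loss bottoms out at epoch 4, which would be a reasonable number of epochs to retrain with.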

### 2.4 Evaluating models
Two models have been trained and are available: `large_model`, which has many parameters; and `small_model`, which has fewer parameters. Both models have been trained using `train_features` and `train_labels`, which are available to you. A separate test set, which consists of `test_features` and `test_labels`, is also available.

Your goal is to evaluate relative model performance and also determine whether either model exhibits signs of overfitting. You will do this by evaluating `large_model` and `small_model` on both the train and test sets. For each model, you can do this by applying the `.evaluate(x, y)` method to compute the loss for features `x` and labels `y`. You will then compare the four losses generated.

```Python
# Evaluate the small model using the train data
small_train = small_model.evaluate(train_features, train_labels)

# Evaluate the small model using the test data
small_test = small_model.evaluate(test_features, test_labels)

# Evaluate the large model using the train data
large_train = large_model.evaluate(train_features, train_labels)

# Evaluate the large model using the test data
large_test = large_model.evaluate(test_features, test_labels)

# Print losses
print('\n Small - Train: {}, Test: {}'.format(small_train, small_test))
print('Large - Train: {}, Test: {}'.format(large_train, large_test))


# Output:
    
 32/100 [========>.....................] - ETA: 0s - loss: 1.0467
100/100 [==============================] - 0s 360us/sample - loss: 1.0176
    
 32/100 [========>.....................] - ETA: 0s - loss: 1.0472
100/100 [==============================] - 0s 54us/sample - loss: 1.0893
    
 32/100 [========>.....................] - ETA: 0s - loss: 0.0621
100/100 [==============================] - 0s 372us/sample - loss: 0.0473
    
 32/100 [========>.....................] - ETA: 0s - loss: 0.1021
100/100 [==============================] - 0s 55us/sample - loss: 0.2126
    
     Small - Train: 1.017621760368347, Test: 1.0893175601959229
    Large - Train: 0.047317686378955844, Test: 0.21255494594573976
```

Notice that the gap between the test and train set losses is **substantially larger** for `large_model`, **suggesting that overfitting may be an issue**. Furthermore, both test and train set performance are better for `large_model`. **This suggests that we may want to use `large_model`, but reduce the number of training epochs**.
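The comparison can be made explicit by computing the train-test gap for each model from the four losses printed above. This is only a small arithmetic sketch; the dictionary layout is an assumption for illustration.

```python
# Losses copied from the evaluation output above
losses = {
    "small": {"train": 1.017621760368347, "test": 1.0893175601959229},
    "large": {"train": 0.047317686378955844, "test": 0.21255494594573976},
}

# A larger test-minus-train gap hints at more overfitting
gaps = {name: values["test"] - values["train"]
        for name, values in losses.items()}

for name, gap in gaps.items():
    print("{} gap: {:.4f}".format(name, gap))
```

The gap is roughly 0.07 for `small_model` and 0.17 for `large_model`, even though `large_model` has the lower loss on both sets.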

## 3. Training models with the Estimators API

### What is the Estimators API?
- High level submodule
- Less flexible
- Enforces best practices
- Faster deployment
- Many premade models

![][09-estimators]

### Model specification and training

1. Define feature columns
2. Load and transform data
3. Define an estimator
4. Apply train operation

### Defining feature columns

```Python
# Import tensorflow under its standard alias
import tensorflow as tf

# Define a numeric feature column
size = tf.feature_column.numeric_column("size")

# Define a categorical feature column
rooms = tf.feature_column.categorical_column_with_vocabulary_list("rooms",["1", "2", "3", "4", "5"])

# Create feature column list
feature_list = [size, rooms]

# Alternatively, define a single vector-valued feature column (e.g. a flattened image)
feature_list = [tf.feature_column.numeric_column('image', shape=(784,))]
```

### Loading and transforming data

```Python
# Define input data function
def input_fn():
    # Define feature dictionary
    features = {"size": [1340, 1690, 2720], "rooms": [1, 3, 4]}
    
    # Define labels
    labels = [221900, 538000, 180000]
    return features, labels
```

### Define and train a regression estimator
```Python
# Define a deep neural network regression
model0 = tf.estimator.DNNRegressor(feature_columns=feature_list,hidden_units=[10, 6, 6, 3])

# Train the regression model
model0.train(input_fn, steps=20)
```

### Define and train a deep neural network
```Python
# Define a deep neural network classifier
model1 = tf.estimator.DNNClassifier(feature_columns=feature_list,hidden_units=[32, 16, 8], n_classes=4)

# Train the classifier
model1.train(input_fn, steps=20)
```

- https://www.tensorflow.org/guide/estimators

[09-estimators]:_Docs/09-estimators.png

### 3.1 Preparing to train with Estimators
For this exercise, we'll return to the King County housing transaction dataset from chapter 2. We will again develop and train a machine learning model to predict house prices; however, this time, we'll do it using the `estimator` API.

Rather than completing everything in one step, we'll break this procedure down into parts. We'll begin by defining the feature columns and loading the data. In the next exercise, we'll define and train a premade estimator. Note that `feature_column` has been imported for you from `tensorflow`. Additionally, `numpy` has been imported as `np`, and the King County housing dataset is available as a `pandas` DataFrame: `housing`.

In [123]:
file = filepath+'kc_house_data.csv'
housing = pd.read_csv(file)
housing.head()

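|   | id | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | ... | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | lat | long | sqft_living15 | sqft_lot15 |
|---|----|------|-------|----------|-----------|-------------|----------|--------|------------|------|-----|-------|------------|---------------|----------|--------------|---------|-----|------|---------------|------------|
| 0 | 7129300520 | 20141013T000000 | 221900.0 | 3 | 1.0 | 1180 | 5650 | 1.0 | 0 | 0 | ... | 7 | 1180 | 0 | 1955 | 0 | 98178 | 47.5112 | -122.257 | 1340 | 5650 |
| 1 | 6414100192 | 20141209T000000 | 538000.0 | 3 | 2.25 | 2570 | 7242 | 2.0 | 0 | 0 | ... | 7 | 2170 | 400 | 1951 | 1991 | 98125 | 47.721 | -122.319 | 1690 | 7639 |
| 2 | 5631500400 | 20150225T000000 | 180000.0 | 2 | 1.0 | 770 | 10000 | 1.0 | 0 | 0 | ... | 6 | 770 | 0 | 1933 | 0 | 98028 | 47.7379 | -122.233 | 2720 | 8062 |
| 3 | 2487200875 | 20141209T000000 | 604000.0 | 4 | 3.0 | 1960 | 5000 | 1.0 | 0 | 0 | ... | 7 | 1050 | 910 | 1965 | 0 | 98136 | 47.5208 | -122.393 | 1360 | 5000 |
| 4 | 1954400510 | 20150218T000000 | 510000.0 | 3 | 2.0 | 1680 | 8080 | 1.0 | 0 | 0 | ... | 8 | 1680 | 0 | 1987 | 0 | 98074 | 47.6168 | -122.045 | 1800 | 7503 |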


In [125]:
# Define feature columns for bedrooms and bathrooms
bedrooms = tf.feature_column.numeric_column("bedrooms")
bathrooms = tf.feature_column.numeric_column("bathrooms")

# Define the list of feature columns
feature_list = [bedrooms, bathrooms]

def input_fn():
    # Define the labels
    labels = np.array(housing.price)
    # Define the features
    features = {'bedrooms': np.array(housing['bedrooms']),
                'bathrooms': np.array(housing['bathrooms'])}
    return features, labels
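Before handing an input function to an estimator, it can help to check that it returns what the feature columns expect: a dictionary keyed by the feature column names, with one value per label. The sketch below does this without TensorFlow. The dictionary of arrays stands in for the `housing` DataFrame and uses the first five rows shown above; `check_input_fn` is a hypothetical helper.

```python
import numpy as np

# Stand-in for the `housing` DataFrame (first five rows of the preview)
housing = {
    "price": np.array([221900.0, 538000.0, 180000.0, 604000.0, 510000.0]),
    "bedrooms": np.array([3, 3, 2, 4, 3]),
    "bathrooms": np.array([1.0, 2.25, 1.0, 3.0, 2.0]),
}

def input_fn():
    # Labels are the house prices
    labels = np.array(housing["price"])
    # Keys must match the names given to the feature columns
    features = {"bedrooms": np.array(housing["bedrooms"]),
                "bathrooms": np.array(housing["bathrooms"])}
    return features, labels

def check_input_fn(fn, expected_keys):
    """Verify feature names and that features align with labels."""
    features, labels = fn()
    assert set(features) == set(expected_keys), "feature names do not match"
    for name, values in features.items():
        assert len(values) == len(labels), name
    return len(labels)

n_rows = check_input_fn(input_fn, ["bedrooms", "bathrooms"])
print(n_rows)
```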

### 3.2 Defining Estimators
In the previous exercise, you defined a list of feature columns, `feature_list`, and a data input function, `input_fn()`. In this exercise, you will build on that work by defining an estimator that makes use of input data.

1. Use a deep neural network regressor with 2 nodes in both the first and second hidden layers and 1 training step.
2. Modify the code to use a `LinearRegressor()`, remove the `hidden_units`, and set the number of steps to 2.

In [127]:
# Define the model and set the number of steps
model = tf.estimator.DNNRegressor(feature_columns=feature_list, hidden_units=[2,2])
model.train(input_fn, steps=1)

# Define the model and set the number of steps
model = tf.estimator.LinearRegressor(feature_columns=feature_list)
model.train(input_fn, steps=2)

W0624 17:25:15.697633 14436 estimator.py:1811] Using temporary folder as model directory: D:\Usuarios\marcgaso\AppData\Local\Temp\tmpuy9nestq
W0624 17:25:15.720728 14436 deprecation.py:323] From C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0624 17:25:16.398309 14436 deprecation.py:323] From D:\Usuarios\marcgaso\AppData\Roaming\Python\Python37\site-packages\tensorflow_estimator\python\estimator\head\base_head.py:574: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0624 17:25:16.475466 14436 deprecation.py:506] From C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\p

<tensorflow_estimator.python.estimator.canned.linear.LinearRegressorV2 at 0x1e3c9ef9e10>

Note that you have other premade estimator options, such as `BoostedTreesRegressor()`, and can also create your own custom `estimators`.

## 4. Congratulations!

### What you learned
- **Chapter 1**
    - Low-level, basic, and advanced operations
    - Graph-based computation
    - Gradient computation and optimization
- **Chapter 2**
    - Data loading and transformation
    - Predefined and custom loss functions
    - Linear models and batch training
- **Chapter 3**
    - Dense neural network layers
    - Activation functions
    - Optimization algorithms
    - Training neural networks
- **Chapter 4**
    - Neural networks in Keras
    - Training and validation
    - The Estimators API
    
### TensorFlow extensions
- **TensorFlow Hub**
    - Pretrained models
    - Transfer learning
- **TensorFlow Probability**
    - More statistical distributions
    - Trainable distributions
    - Extended set of optimizers

### TensorFlow 2.0
- **TensorFlow 2.0**
    - `eager_execution()`
    - Tighter `keras` integration
    - `Estimators`