# High Level APIs
  
In the final chapter, you'll use high-level APIs in TensorFlow 2 to train a sign language letter classifier. You will use both the sequential and functional Keras APIs to train, validate, make predictions with, and evaluate models. You will also learn how to use the Estimators API to streamline the model definition and training process, and to avoid errors.

## Resources
  
**Notebook Syntax**
  
<span style='color:#7393B3'>NOTE:</span>  
- Denotes additional information deemed to be *contextually* important
- Colored in blue, HEX #7393B3
  
<span style='color:#E74C3C'>WARNING:</span>  
- Significant information that is *functionally* critical  
- Colored in red, HEX #E74C3C
  
---
  
**Links**
  
[NumPy Documentation](https://numpy.org/doc/stable/user/index.html#user)  
[Pandas Documentation](https://pandas.pydata.org/docs/user_guide/index.html#user-guide)  
[TensorFlow Documentation](https://www.tensorflow.org)  
[TensorFlow Playground](https://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.20148&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false)  
[TensorFlow Estimators](https://www.tensorflow.org/guide/estimator)  
  
---
  
**Notable Functions**
  
<table>
  <tr>
    <th>Index</th>
    <th>Operator</th>
    <th>Use</th>
  </tr>
  <tr>
    <td>1</td>
    <td>tf.constant()</td>
    <td>Creates a constant tensor with a specified value.</td>
  </tr>
  <tr>
    <td>2</td>
    <td>tf.Variable()</td>
    <td>Creates a mutable tensor variable that can be modified.</td>
  </tr>
  <tr>
    <td>3</td>
    <td>tf.zeros()</td>
    <td>Creates a tensor filled with zeros.</td>
  </tr>
  <tr>
    <td>4</td>
    <td>tf.ones()</td>
    <td>Creates a tensor filled with ones.</td>
  </tr>
  <tr>
    <td>5</td>
    <td>tf.zeros_like()</td>
    <td>Creates a tensor of zeros with the same shape as another tensor.</td>
  </tr>
  <tr>
    <td>6</td>
    <td>tf.ones_like()</td>
    <td>Creates a tensor of ones with the same shape as another tensor.</td>
  </tr>
  <tr>
    <td>7</td>
    <td>tf.fill()</td>
    <td>Creates a tensor filled with a specified scalar value.</td>
  </tr>
  <tr>
    <td>8</td>
    <td>tf.add()</td>
    <td>Performs element-wise addition of two tensors.</td>
  </tr>
  <tr>
    <td>9</td>
    <td>tf.multiply()</td>
    <td>Performs element-wise multiplication of two tensors.</td>
  </tr>
  <tr>
    <td>10</td>
    <td>tf.matmul()</td>
    <td>Performs matrix multiplication of two tensors.</td>
  </tr>
  <tr>
    <td>11</td>
    <td>tf.reduce_sum()</td>
    <td>Computes the sum of elements across specified dimensions of a tensor.</td>
  </tr>
  <tr>
    <td>12</td>
    <td>tf.gradients()</td>
    <td>Computes the gradients of a tensor with respect to another tensor.</td>
  </tr>
  <tr>
    <td>13</td>
    <td>tf.GradientTape</td>
    <td>Records operations for automatic differentiation to compute gradients.</td>
  </tr>
  <tr>
    <td>14</td>
    <td>tf.reshape()</td>
    <td>Reshapes a tensor into a specified shape.</td>
  </tr>
  <tr>
    <td>15</td>
    <td>tf.random</td>
    <td>Module of functions that generate random tensors from a specified distribution.</td>
  </tr>
  <tr>
    <td>16</td>
    <td>tf.random.uniform()</td>
    <td>Generates random values from a uniform distribution.</td>
  </tr>
  <tr>
    <td>17</td>
    <td>tf.GradientTape.watch()</td>
    <td>Ensures that a tensor is traced by the tape.</td>
  </tr>
  <tr>
    <td>18</td>
    <td>tf.cast()</td>
    <td>Casts a tensor to a new datatype</td>
  </tr>
  <tr>
    <td>19</td>
    <td>tensorflow.keras.losses</td>
    <td>Module containing the built-in Keras loss functions and classes.</td>
  </tr>
  <tr>
    <td>20</td>
    <td>tensorflow.keras.losses.mse()</td>
    <td>Computes the mean squared error between labels and predictions.</td>
  </tr>
  <tr>
    <td>21</td>
    <td>tensorflow.keras.losses.mae()</td>
    <td>Computes the mean absolute error between labels and predictions.</td>
  </tr>
  <tr>
    <td>22</td>
    <td>tensorflow.keras.losses.Huber()</td>
    <td>Computes Huber loss value.</td>
  </tr>
  <tr>
    <td>23</td>
    <td>tf.math.log()</td>
    <td>Computes natural logarithm of x element-wise.</td>
  </tr>
  <tr>
    <td>24</td>
    <td>tf.keras.optimizers.Adam</td>
    <td>Optimizer that implements the Adam algorithm. Adam optimization is a stochastic gradient descent method that is based on adaptive estimation of first-order and second-order moments.</td>
  </tr>
  <tr>
    <td>25</td>
    <td>tf.keras.optimizers.Adam.minimize()</td>
    <td>Calling .minimize() takes care of both computing the gradients and applying them to the variables.</td>
  </tr>
  <tr>
    <td>26</td>
    <td>tf.convert_to_tensor()</td>
    <td>Converts a NumPy array or Python list to a TensorFlow tensor.</td>
  </tr>
  <tr>
    <td>27</td>
    <td>tf.gather()</td>
    <td>Gathers slices from a tensor along a specified axis.</td>
  </tr>
  <tr>
    <td>28</td>
    <td>tf.keras.optimizers.SGD</td>
    <td>Optimizer that implements the Stochastic Gradient Descent algorithm.</td>
  </tr>
  <tr>
    <td>29</td>
    <td>tf.keras.optimizers.RMSprop</td>
    <td>Optimizer that implements the RMSprop algorithm.</td>
  </tr>
  <tr>
    <td>30</td>
    <td>tf.keras.activations.sigmoid</td>
    <td>Computes the sigmoid activation function.</td>
  </tr>
  <tr>
    <td>31</td>
    <td>tf.keras.activations.relu</td>
    <td>Computes the ReLU activation function.</td>
  </tr>
  <tr>
    <td>32</td>
    <td>tf.random.normal()</td>
    <td>Generates random values from a normal distribution.</td>
  </tr>
  <tr>
    <td>33</td>
    <td>tf.keras.layers.Dense()</td>
    <td>A fully connected layer in a neural network.</td>
  </tr>
  <tr>
    <td>34</td>
    <td>tf.keras.layers.Dropout()</td>
    <td>Applies dropout regularization to the input.</td>
  </tr>
  <tr>
    <td>35</td>
    <td>tf.keras.losses.binary_crossentropy()</td>
    <td>Computes the binary cross-entropy loss.</td>
  </tr>
  <tr>
    <td>36</td>
    <td>tf.math.confusion_matrix()</td>
    <td>Computes the confusion matrix.</td>
  </tr>
  <tr>
    <td>37</td>
    <td>pd.crosstab()</td>
    <td>Computes a cross-tabulation of two or more factors.</td>
  </tr>
  <tr>
    <td>38</td>
    <td>np.hstack()</td>
    <td>Stacks arrays in sequence horizontally (column-wise).</td>
  </tr>
</table>

  
---
  
**Language and Library Information**  
  
Python 3.11.0  
  
Name: numpy  
Version: 1.24.3  
Summary: Fundamental package for array computing in Python  
  
Name: pandas  
Version: 2.0.3  
Summary: Powerful data structures for data analysis, time series, and statistics  
  
Name: matplotlib  
Version: 3.7.2  
Summary: Python plotting package  
  
Name: seaborn  
Version: 0.12.2  
Summary: Statistical data visualization  
  
Name: tensorflow  
Version: 2.13.0  
Summary: TensorFlow is an open source machine learning framework for everyone.  
  
Name: scikit-learn  
Version: 1.3.0  
Summary: A set of python modules for machine learning and data mining  
  
---
  
**Miscellaneous Notes**
  
<span style='color:#7393B3'>NOTE:</span>  
  
`python3.11 -m IPython` : Starts an IPython interactive shell under Python 3.11 in the terminal.
  
`nohup ./relo_csv_D2S.sh > ./output/relo_csv_D2S.log &` : Runs the CSV data pipeline script in the background and redirects its standard output to a log file.  

In [40]:
import numpy as np                  # Numerical Python:         Arrays and linear algebra
import pandas as pd                 # Panel Datasets:           Dataset manipulation
import matplotlib.pyplot as plt     # MATLAB Plotting Library:  Visualizations
import seaborn as sns               # Seaborn:                  Visualizations
import tensorflow as tf             # TensorFlow:               Deep-Learning Neural Networks


## Defining neural networks with Keras
  
In chapter 3, we saw how to define neural networks in TensorFlow, both using linear algebra and higher level Keras operations. In this lesson, we will introduce the Keras sequential API, and expand on our brief and informal introduction of the Keras functional API.
  
**Classifying sign language letters**
  
Throughout this chapter, we'll focus on using Keras to classify four letters from the Sign Language MNIST dataset: a, b, c, and d. Note that the images appear to be low resolution because each is represented by a 28x28 matrix.
  
<img src='../_images/defining-neural-networks-with-keras.png' alt='img' width='740'>
  
**The sequential API**
  
Now, let's say we experiment with several different architectures and select the one that makes the most accurate predictions. It has an input layer, a first hidden layer with 16 nodes, and a second hidden layer with 8 nodes. We'll have 4 output nodes, since there are 4 letters in the dataset.
  
<img src='../_images/defining-neural-networks-with-keras1.png' alt='img' width='740'>
  
**The sequential API**
  
A good way to construct this model in Keras is to use the sequential API. This API is simpler and makes strong assumptions about how you will construct your model. It assumes that you have an input layer, some number of hidden layers, and an output layer. All of these layers are ordered one after the other in a sequence.
  
- input layer
- hidden layer
- output layer
- ordered in sequence
  
**Building a sequential model**
  
We'll start by importing `tensorflow`. We can then define a `keras.Sequential()` model, which we'll name `model`. Once we have defined this object, we can simply stack layers on top of it sequentially using the `.add()` method. Let's start by adding the first hidden layer, which is a dense layer with 16 nodes. We'll select a `'relu'` activation function and supply an `input_shape=`, which Keras requires for the first layer. This input shape is simply a tuple that contains the dimensions of our data. Since we'll be using 28 by 28 pixel images, reshaped into a vector, we'll supply `(28*28,)` as the input shape.
  
<img src='../_images/defining-neural-networks-with-keras2.png' alt='img' width='740'>
  
**Building a sequential model**
  
Next, we'll define a second hidden layer according to the desired model architecture. Finally, we specify that the model has 4 output nodes and uses a softmax activation function. If we want to check our model's architecture, we can use the `.summary()` method, which we'll return to in the upcoming exercises. The model has now been defined, but it is not yet ready to be trained. We must first perform a compilation step, where we specify the optimizer and loss function. Here, we've selected the adam optimizer and the categorical crossentropy loss function, which we'll use for classification problems with more than 2 classes.
  
<img src='../_images/defining-neural-networks-with-keras3.png' alt='img' width='740'>
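
To make the loss function more concrete, here is a minimal pure-Python sketch of a softmax output followed by categorical cross-entropy for a single example with 4 classes. The logits and the helper names `softmax` and `categorical_crossentropy` are illustrative assumptions, not Keras internals; Keras computes the same quantity in a vectorized way.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_label, probs):
    """Negative log-probability that the model assigns to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))

# Hypothetical raw outputs for the 4 letters (a, b, c, d)
logits = [2.0, 0.5, 0.1, -1.0]
probs = softmax(logits)

label_a = [1.0, 0.0, 0.0, 0.0]  # true class is the model's most likely class
label_d = [0.0, 0.0, 0.0, 1.0]  # true class is the model's least likely class

loss_when_right = categorical_crossentropy(label_a, probs)
loss_when_wrong = categorical_crossentropy(label_d, probs)
print(probs, loss_when_right, loss_when_wrong)
```

The loss is small when the model puts high probability on the true class and grows as that probability shrinks, which is why it suits classification with more than 2 classes.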
  
**The functional API**
  
But what if you want to train two models jointly to predict the same target? The functional API is for that.
  
<img src='../_images/defining-neural-networks-with-keras4.png' alt='img' width='740'>
  
**Using the functional API**
  
As an example, let's say we have a set of 28x28 images and a set of 10 features of metadata. We want to use both to predict the image's class, but restrict how they interact in our model. We'll start by using the `keras.Input()` operation to define the input shapes for model 1 and model 2. Next, we define layer 1 and layer 2 as dense layers for model 1. Note that we have to pass the previous layer as an argument if we use the functional API, but did not need to with the sequential API. You may remember that we did this in chapter 3. We were also using the functional API then.
  
<img src='../_images/defining-neural-networks-with-keras5.png' alt='img' width='740'>
  
**Using the functional API**
  
We now define layers 1 and 2 for model 2 and then use the `add()` layer operation in `keras` to combine the outputs in a layer that merges the two models. Finally, we define a functional model. As inputs, it takes both the model 1 and model 2 inputs. As outputs, it takes the merged layer. The only thing left to do is `.compile()` it and train.
  
<img src='../_images/defining-neural-networks-with-keras6.png' alt='img' width='740'>
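
As a rough sketch of the two-input idea described above, the following assumes flattened 28x28 images (784 values) and 10 metadata features. The layer sizes, activations, and variable names are illustrative assumptions, not the exact code from the slides.

```python
import tensorflow as tf

# Two inputs with different shapes: flattened images and metadata features
image_inputs = tf.keras.Input(shape=(784,))
meta_inputs = tf.keras.Input(shape=(10,))

# Branch 1: layers applied to the image input (sizes are illustrative)
image_hidden = tf.keras.layers.Dense(12, activation='relu')(image_inputs)
image_output = tf.keras.layers.Dense(4, activation='softmax')(image_hidden)

# Branch 2: layers applied to the metadata input
meta_hidden = tf.keras.layers.Dense(12, activation='relu')(meta_inputs)
meta_output = tf.keras.layers.Dense(4, activation='softmax')(meta_hidden)

# Element-wise sum of the branch outputs; both must have the same shape, here (None, 4)
merged = tf.keras.layers.add([image_output, meta_output])

# The functional model takes both inputs and returns the merged layer
model = tf.keras.Model(inputs=[image_inputs, meta_inputs], outputs=merged)
model.compile(optimizer='adam', loss='categorical_crossentropy')
```

The two branches only interact at the merge step, which is one way to restrict how the images and the metadata influence each other.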
  

### The sequential model in Keras
  
In chapter 3, we used components of the `keras` API in `tensorflow` to define a neural network, but we stopped short of using its full capabilities to streamline model definition and training. In this exercise, you will use the `keras.Sequential()` model API to define a neural network that can be used to classify images of sign language letters. You will also use the `.summary()` method to print the model's architecture, including the shape and number of parameters associated with each layer.
  
Note that the images were reshaped from (28, 28) to (784,), so that they could be used as inputs to a dense layer. Additionally, note that `keras` has been imported from `tensorflow` for you.
  
1. Define a `keras.Sequential()` model named `model`.
2. Set the first layer to be `Dense()` and to have 16 nodes and a `relu` activation.
3. Define the second layer to be `Dense()` and to have 8 nodes and a `relu` activation.
4. Set the output layer to have 4 nodes and use a `softmax` activation function.

In [41]:
# Define a Keras sequential model
model = tf.keras.Sequential()

# Define the first dense layer
model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784,)))

# Define the second dense layer
model.add(tf.keras.layers.Dense(8, activation='relu'))

# Define the output layer
model.add(tf.keras.layers.Dense(4, activation='softmax'))

# Print the model architecture
print(model.summary())


Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_27 (Dense)            (None, 16)                12560     
                                                                 
 dense_28 (Dense)            (None, 8)                 136       
                                                                 
 dense_29 (Dense)            (None, 4)                 36        
                                                                 
Total params: 12732 (49.73 KB)
Trainable params: 12732 (49.73 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
None


Notice that we've defined a model, but we haven't compiled it. The compilation step in `keras` allows us to set the optimizer, loss function, and other useful training parameters in a single line of code. Furthermore, the `.summary()` method allows us to view the model's architecture.
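
The `Param #` column can be checked by hand: a dense layer has one weight per input-node pair plus one bias per node. The short sketch below reproduces the counts in the summary above; the helper name `dense_params` is just an illustrative choice.

```python
def dense_params(n_inputs, n_nodes):
    """Weights (n_inputs * n_nodes) plus one bias per node."""
    return n_inputs * n_nodes + n_nodes

# Layer widths: input vector, first hidden layer, second hidden layer, output layer
layer_sizes = [784, 16, 8, 4]

per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [12560, 136, 36]
print(total)      # 12732
```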

### Compiling a sequential model
  
In this exercise, you will work towards classifying letters from the Sign Language MNIST dataset; however, you will adopt a different network architecture than the one you used in the previous exercise. There will be fewer layers, but more nodes. You will also apply `keras.layers.Dropout()` to prevent overfitting. Finally, you will compile the model to use the `'adam'` `optimizer=` and the `'categorical_crossentropy'` `loss=`. You will also use a method in `keras` to summarize your model's architecture. Note that `keras` has been imported from `tensorflow` for you and a sequential `keras` model has been defined as `model`.
  
1. In the first dense layer, set the number of nodes to 16, the `activation=` to `'sigmoid'`, and the `input_shape=` to (784,).
2. Apply `keras.layers.Dropout()` at a rate of 25% to the first layer's output.
3. Set the output layer to be dense, have 4 nodes, and use a `'softmax'` `activation=` function.
4. Compile the model using an `'adam'` `optimizer=` and `'categorical_crossentropy'` `loss=` function.

In [42]:
# Model instantiation
model = tf.keras.Sequential()

# Define the first dense layer
model.add(tf.keras.layers.Dense(16, activation='sigmoid', input_shape=(784,)))

# Apply dropout to the first layer's output
model.add(tf.keras.layers.Dropout(0.25))

# Define the output layer
model.add(tf.keras.layers.Dense(4, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy')

# Print a model summary
print(model.summary())

Model: "sequential_9"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_30 (Dense)            (None, 16)                12560     
                                                                 
 dropout_1 (Dropout)         (None, 16)                0         
                                                                 
 dense_31 (Dense)            (None, 4)                 68        
                                                                 
Total params: 12628 (49.33 KB)
Trainable params: 12628 (49.33 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
None


You've now defined and compiled a neural network using the `keras.Sequential()` model. Notice that calling `model.summary()` shows the layer type, output shape, and number of parameters of each layer.

### Defining a multiple input model
  
In some cases, the sequential API will not be sufficiently flexible to accommodate your desired model architecture and you will need to use the functional API instead. If, for instance, you want to train two models with different architectures jointly, you will need to use the functional API to do this. In this exercise, we will see how to do this. We will also use the `.summary()` method to examine the joint model's architecture.
  
Note that `keras` has been imported from `tensorflow` for you. Additionally, the input layers of the first and second models have been defined as `m1_inputs` and `m2_inputs`, respectively. Note that the two models have the same architecture, but one of them uses a `'sigmoid'` `activation=` in the first layer and the other uses a `'relu'`.
  
1. Pass model 1's input layer to its first layer and model 1's first layer to its second layer.
2. Pass model 2's input layer to its first layer and model 2's first layer to its second layer.
3. Use the `.add()` operation to combine the second layers of model 1 and model 2.
4. Complete the functional model definition.


In [43]:
# Instantiation of input layer for each model
m1_inputs = tf.keras.Input(shape=(784,))
m2_inputs = tf.keras.Input(shape=(784,))

In [44]:
# For model 1, pass the input layer to layer 1 and layer 1 to layer 2
m1_layer1 = tf.keras.layers.Dense(12, activation='sigmoid')(m1_inputs)
m1_layer2 = tf.keras.layers.Dense(4, activation='softmax')(m1_layer1)

# For model 2, pass the input layer to layer 1 and layer 1 to layer 2
m2_layer1 = tf.keras.layers.Dense(12, activation='relu')(m2_inputs)
m2_layer2 = tf.keras.layers.Dense(4, activation='softmax')(m2_layer1)

# Merge model outputs and define a functional model
merged = tf.keras.layers.add([m1_layer2, m2_layer2])
model = tf.keras.Model(inputs=[m1_inputs, m2_inputs], outputs=merged)

# Print a model summary
print(model.summary())

Model: "model_4"
__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
 input_3 (InputLayer)        [(None, 784)]                0         []                            
                                                                                                  
 input_4 (InputLayer)        [(None, 784)]                0         []                            
                                                                                                  
 dense_32 (Dense)            (None, 12)                   9420      ['input_3[0][0]']             
                                                                                                  
 dense_34 (Dense)            (None, 12)                   9420      ['input_4[0][0]']             
                                                                                            

Notice that the `.summary()` method yields a new column: `Connected to`. This column tells you how layers connect to each other within the network. In the output above, for instance, `dense_34` is connected to the `input_4` layer. The `add` layer, which merges the two models, connects to the second dense layer of each model.

## Training with Keras
  
Earlier in the chapter, we defined neural networks in Keras. In this lesson, we will discuss how to train and evaluate them.
  
**Overview of training and evaluation**
  
Whenever we train and evaluate a model in `tensorflow`, we typically use the same set of steps. First, we'll load and clean the data. Second, we'll define a model, specifying an architecture. Third, we'll train and validate the model. And fourth, we perform evaluation.
  
1. Load and clean data
2. Define model
3. Train and validate model
4. Evaluate model
  
**How to train a model**
  
Let's see an example of how this works. We'll start by importing `tensorflow` and defining a `keras.Sequential()` model. We'll then `.add()` a dense layer to the model with 16 nodes and a `'relu'` `activation=` function. Note that our input shape is (784,), since our dataset consists of 28x28 images, reshaped into vectors. We next define the output layer, which has 4 nodes and a `'softmax'` `activation=` function.
  
<img src='../_images/defining-neural-networks-with-keras7.png' alt='img' width='740'>
  
**How to train a model**
  
We next compile the model, using the `'adam'` `optimizer=` and the `'categorical_crossentropy'` `loss=`. Finally, we train the model using the `.fit()` operation.
  
<img src='../_images/defining-neural-networks-with-keras8.png' alt='img' width='740'>
  
**The `.fit()` operation**
  
Notice that we only supplied two arguments to fit: features and labels. These are the only two required arguments; however, there are also many optional arguments, including `batch_size=`, `epochs=`, and `validation_split=`. We will cover each of these.
  
**Batch size and epochs**
  
Let's start with the difference between the `batch_size=` and `epochs=` parameters. The number of examples in each batch is the batch size, which is 32 by default. The number of times you train on the full set of batches is called the number of epochs. Here, the `batch_size=` is 5 and the number of `epochs=` is 2. Using multiple epochs allows the model to revisit the same batches, but with different model weights and possibly different optimizer parameters, since both are updated after each batch.
  
<img src='../_images/defining-neural-networks-with-keras9.png' alt='img' width='740'>
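
The relationship between batch size, epochs, and the number of weight updates is simple arithmetic. The sketch below uses a hypothetical helper, `training_steps`, and assumes the final batch of an epoch may be smaller than the rest when the dataset size is not a multiple of the batch size.

```python
import math

def training_steps(n_examples, batch_size=32, epochs=1):
    """Return (batches per epoch, total weight updates over all epochs)."""
    steps_per_epoch = math.ceil(n_examples / batch_size)  # last batch may be smaller
    return steps_per_epoch, steps_per_epoch * epochs

# 2000 examples (the dataset size used in this chapter's exercises), default batch size, 5 epochs
print(training_steps(2000, batch_size=32, epochs=5))  # (63, 315)

# Batch size 5 and 2 epochs on a hypothetical 10-example dataset
print(training_steps(10, batch_size=5, epochs=2))     # (2, 4)
```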
  
**Performing validation**
  
So what does the `validation_split=` parameter do? It divides the dataset into two parts. The first part is the train set and the second part is the validation set. Selecting a value of `0.20` will put 20% of the data in the validation set.
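
As a sketch of the idea, the split can be pictured as slicing off the tail of the data. Keras takes the validation samples from the end of the arrays passed to `.fit()`, before any shuffling; the helper below, `split_train_validation`, is a simplified illustration and not Keras's own implementation.

```python
def split_train_validation(examples, validation_split):
    """Hold out the last `validation_split` fraction of the examples."""
    split_at = round(len(examples) * (1 - validation_split))
    return examples[:split_at], examples[split_at:]

examples = list(range(10))  # stand-in for 10 rows of data
train, val = split_train_validation(examples, 0.20)

print(train)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(val)    # [8, 9]
```

Because the validation rows come from the end of the data, a dataset that is sorted (for example, by label) should be shuffled before training so that the validation set is representative.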
  
**Performing validation**
  
The benefit of using a `validation_split=` is that you can see how your model performs on both the data it was trained on, the training set, and a separate dataset it was not trained on, the validation set. Here, we can see the first 10 epochs of training. Notice that we can see the training loss and validation loss separately. If the training loss becomes substantially lower than the validation loss, this is an indication that we're overfitting. We should either terminate the training process before that point or add regularization or dropout.
  
<img src='../_images/defining-neural-networks-with-keras10.png' alt='img' width='740'>
  
**Changing the metric**
  
Another benefit of the high level `keras` API is that we can swap less informative metrics, such as the loss, for ones that are easily interpretable, such as the share of accurately classified examples. We can do this by supplying `'accuracy'` to the `metrics=` parameter of `model.compile()`. We then apply `.fit()` to the model again with the same settings.
  
<img src='../_images/defining-neural-networks-with-keras11.png' alt='img' width='740'>
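
For one-hot labels and softmax outputs, accuracy is simply the share of examples whose highest-probability class matches the label. The following pure-Python sketch illustrates that definition; the labels and predicted probabilities are made up for the example.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(one_hot_labels, predicted_probs):
    """Share of examples where the predicted class equals the true class."""
    correct = sum(argmax(t) == argmax(p)
                  for t, p in zip(one_hot_labels, predicted_probs))
    return correct / len(one_hot_labels)

# Four hypothetical examples, one per letter class
labels = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
preds = [[0.7, 0.1, 0.1, 0.1],
         [0.2, 0.5, 0.2, 0.1],
         [0.6, 0.2, 0.1, 0.1],   # wrong: predicts class 0, label is class 2
         [0.1, 0.1, 0.2, 0.6]]

print(accuracy(labels, preds))  # 0.75
```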
  
**Changing the metric**
  
Using the accuracy metric, we can see that the model performs quite well. In just 10 epochs, it goes from an accuracy of 42% to over 99%. Notice that the model performs equally well in the validation set, which means that we're unlikely to be overfitting.
  
<img src='../_images/defining-neural-networks-with-keras12.png' alt='img' width='740'>
  
**The evaluation operation**
  
Finally, it is a good idea to split off a test set before you begin to train and validate. You can use the `.evaluate()` operation to check performance on the test set at the end of the training process. Since you may tune model parameters in response to validation set performance, using a separate test set will provide you with further assurance that you have not overfitted. You now know how to streamline model training and validation in `keras`.
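
A minimal sketch of splitting off a test set before training is shown below. It works on row indices only and uses a hypothetical helper, `train_test_split_indices`; in practice a library routine such as scikit-learn's `train_test_split` does the same job.

```python
import random

def train_test_split_indices(n_examples, test_fraction=0.2, seed=0):
    """Shuffle row indices and reserve a fraction of them for testing."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # shuffle so the test set is not just the first rows
    n_test = round(n_examples * test_fraction)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = train_test_split_indices(2000, test_fraction=0.2)
print(len(train_idx), len(test_idx))  # 1600 400
```

The test rows are set aside once and are never shown to `.fit()`, including through `validation_split=`, so the final `.evaluate()` call measures performance on data the model has not been tuned against.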


### Training with Keras
  
In this exercise, we return to our sign language letter classification problem. We have 2000 images of four letters--A, B, C, and D--and we want to classify them with a high level of accuracy. We will complete all parts of the problem, including the model definition, compilation, and training.
  
Note that `keras` has been imported from `tensorflow` for you. Additionally, the features are available as `sign_language_features` and the targets are available as `sign_language_labels`.
  
1. Define a sequential model named `model`.
2. Set the output layer to be dense, have 4 nodes, and use a `'softmax'` `activation=` function.
3. Compile the model with the `'SGD'` `optimizer=` and `'categorical_crossentropy'` `loss=`.
4. Complete the fitting operation and set the number of `epochs=` to 5.

In [45]:
# Load dataset
df = pd.read_csv('../_datasets/slmnist.csv', header=None)

# X/y split
X = df.iloc[:, 1:]
y = df.iloc[:, 0]

print(df.shape)
df.head()

(2000, 785)


Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,775,776,777,778,779,780,781,782,783,784
0,1,142,143,146,148,149,149,149,150,151,...,0,15,55,63,37,61,77,65,38,23
1,0,141,142,144,145,147,149,150,151,152,...,173,179,179,180,181,181,182,182,183,183
2,1,156,157,160,162,164,166,169,171,171,...,181,197,195,193,193,191,192,198,193,182
3,3,63,26,65,86,97,106,117,123,128,...,175,179,180,182,183,183,184,185,185,185
4,1,156,160,164,168,172,175,178,180,182,...,108,107,106,110,111,108,108,102,84,70


In [46]:
# Normalize features: center each pixel column and scale by its range
sign_language_features = ((X - X.mean()) / (X.max() - X.min())).to_numpy()

# Labels extracted, and one-hot-encoded
sign_language_labels = pd.get_dummies(y).astype(np.float32).to_numpy()

In [47]:
# Define a sequential model
model = tf.keras.Sequential()

# Define a hidden layer
model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784, )))

# Define the output layer
model.add(tf.keras.layers.Dense(4, activation='softmax'))

# Compile the model
model.compile(optimizer='SGD', loss='categorical_crossentropy')

# Complete the fitting operation
model.fit(sign_language_features, sign_language_labels, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.src.callbacks.History at 0x13a305890>

You probably noticed that your only measure of performance improvement was the value of the loss function in the training sample, which is not particularly informative. You will improve on this in the next exercise.

### Metrics and validation with Keras
  
We trained a model to predict sign language letters in the previous exercise, but it is unclear how successful we were in doing so. In this exercise, we will try to improve upon the interpretability of our results. Since we did not use a validation split, we only observed performance improvements within the training set; however, it is unclear how much of that was due to overfitting. Furthermore, since we did not supply a metric, we only saw decreases in the loss function, which do not have any clear interpretation.

Note that `keras` has been imported for you from `tensorflow`.
  
1. Set the first dense layer to have 32 nodes, use a `'sigmoid'` activation function, and have an input shape of (784,).
2. Use the root mean square propagation optimizer, a `'categorical_crossentropy'` loss, and the accuracy metric.
3. Set the number of epochs to 10 and use 10% of the dataset for validation.

In [48]:
# Define sequential model
model = tf.keras.Sequential()

# Define the first layer
model.add(tf.keras.layers.Dense(32, activation='sigmoid', input_shape=(784,)))

# Define the output layer
model.add(tf.keras.layers.Dense(4, activation='softmax'))

# Set the optimizer, loss function, and metrics
model.compile(optimizer='RMSprop', loss='categorical_crossentropy', metrics=['accuracy'])

# Add the number of epochs and the validation split
model.fit(sign_language_features, sign_language_labels, epochs=10, validation_split=0.1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x13a3d7fd0>

With the `keras` API, you only needed 14 lines of code to define, compile, train, and validate a model. You may have noticed that your model performed quite well. In just 10 epochs, we achieved a classification accuracy of over 90% in the validation sample!

### Overfitting detection
  
In this exercise, we'll work with a small subset of the examples from the original sign language letters dataset. A small sample, coupled with a heavily-parameterized model, will generally lead to overfitting. This means that your model will simply memorize the class of each example, rather than identifying features that generalize to many examples.
  
You will detect overfitting by checking whether the validation sample loss is substantially higher than the training sample loss and whether it increases with further training. With a small sample and a high learning rate, the model will struggle to converge on an optimum. You will set a low learning rate for the optimizer, which will make it easier to identify overfitting.
  
Note that `keras` has been imported from `tensorflow`.
  
1. Define a sequential model in `keras` named `model`.
2. Add a first dense layer with 1024 nodes, a `'relu'` activation, and an `input_shape=` of (784,).
3. Set the `learning_rate=` to 0.001.
4. Set the `.fit()` operation to iterate over the full sample 50 times and use 50% of the sample for validation purposes.

In [49]:
# Define sequential model
model = tf.keras.Sequential()

# Define the first layer
model.add(tf.keras.layers.Dense(1024, activation='relu', input_shape=(784, )))

# Define the output layer
model.add(tf.keras.layers.Dense(4, activation='softmax'))

# Finish the model compilation
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy', metrics=['accuracy'])

# Complete the model fit operation
model.fit(sign_language_features, sign_language_labels, epochs=50, validation_split=0.5)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


<keras.src.callbacks.History at 0x13a4de590>

You may have noticed that the validation loss, `val_loss`, was substantially higher than the training loss, `loss`. Furthermore, if `val_loss` started to increase before the training process was terminated, then we may have overfitted. When this happens, you will want to try decreasing the number of epochs.
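
One simple way to choose a smaller number of epochs is to look for the epoch at which `val_loss` was lowest. `.fit()` returns a `History` object whose `.history` attribute is a dictionary of per-epoch values; the sketch below mimics that dictionary with made-up numbers, and the helper `best_epoch` is a hypothetical name.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best_index + 1

# Made-up loss curves for illustration only (not results from this notebook)
history = {
    'loss':     [1.20, 0.80, 0.50, 0.30, 0.18, 0.10],
    'val_loss': [1.25, 0.90, 0.70, 0.65, 0.72, 0.85],
}

stop_at = best_epoch(history['val_loss'])
print(stop_at)  # 4
```

In this illustration the training loss keeps falling after epoch 4 while the validation loss rises, which is the pattern that suggests overfitting.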

### Evaluating models
  
Two models have been trained and are available: `large_model`, which has many parameters; and `small_model`, which has fewer parameters. Both models have been trained using `train_features` and `train_labels`, which are available to you. A separate test set, which consists of `test_features` and `test_labels`, is also available.
  
Your goal is to evaluate relative model performance and also determine whether either model exhibits signs of overfitting. You will do this by evaluating `large_model` and `small_model` on both the train and test sets. For each model, you can do this by applying the `.evaluate(x, y)` method to compute the loss for features `x` and labels `y`. You will then compare the four losses generated.
  
1. Evaluate the small model using the train data.
2. Evaluate the small model using the test data.
3. Evaluate the large model using the train data.
4. Evaluate the large model using the test data.

In [50]:
# Creating the small model
small_model = tf.keras.Sequential()

small_model.add(tf.keras.layers.Dense(8, activation='relu', input_shape=(784,)))
small_model.add(tf.keras.layers.Dense(4, activation='softmax'))

small_model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01), 
                    loss='categorical_crossentropy', 
                    metrics=['accuracy'])


In [51]:
# Creating the large model
large_model = tf.keras.Sequential()

large_model.add(tf.keras.layers.Dense(64, activation='sigmoid', input_shape=(784,)))
large_model.add(tf.keras.layers.Dense(4, activation='softmax'))

large_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001, 
                                                       beta_1=0.9, 
                                                       beta_2=0.999),
                   loss='categorical_crossentropy', 
                   metrics=['accuracy'])


In [52]:
from sklearn.model_selection import train_test_split


# X_train, X_test, y_train, y_test; Train/test split
train_features, test_features, train_labels, test_labels = train_test_split(sign_language_features, 
                                                                            sign_language_labels,
                                                                            test_size=0.5)


In [53]:
# Fitting models to X_train, y_train
small_model.fit(train_features, train_labels, epochs=30, verbose=False)
large_model.fit(train_features, train_labels, epochs=30, verbose=False)

<keras.src.callbacks.History at 0x13a59bf90>

In [54]:
# Evaluate the small model using the train data
small_train = small_model.evaluate(train_features, train_labels)

# Evaluate the small model using the test data
small_test = small_model.evaluate(test_features, test_labels)

# Evaluate the large model using the train data
large_train = large_model.evaluate(train_features, train_labels)

# Evaluate the large model using the test data
large_test = large_model.evaluate(test_features, test_labels)

# Print losses
print('\n Small - Train: {}, Test: {}'.format(small_train, small_test))
print('Large - Train: {}, Test: {}'.format(large_train, large_test))


 Small - Train: [0.14092232286930084, 0.9919999837875366], Test: [0.15205100178718567, 0.9890000224113464]
Large - Train: [0.007727161981165409, 1.0], Test: [0.009323738515377045, 1.0]


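Notice that, relative to its training loss, the gap between the test and train set losses is larger for `large_model` (roughly 0.0093 versus 0.0077) than for `small_model` (roughly 0.152 versus 0.141). This hints that overfitting may be an issue, although the absolute gap in this run is small. Furthermore, both test and train set performance are better for `large_model`. This suggests that we may want to use `large_model`, but reduce the number of training epochs. Because `train_test_split` shuffles the data randomly, the exact values will differ from run to run.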
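
To make the comparison of the four losses concrete, the sketch below computes the absolute and relative gap between test and train loss for each model. It uses values rounded from the output printed above; the helper name `loss_gap` is an assumption made for illustration.

```python
def loss_gap(train_result, test_result):
    """Return the (absolute, relative) gap between test and train loss.

    Each result is a [loss, accuracy] list, as returned by `.evaluate()`
    for a model compiled with metrics=['accuracy'].
    """
    train_loss, test_loss = train_result[0], test_result[0]
    absolute = test_loss - train_loss
    return absolute, absolute / train_loss


# Values rounded from the printed output of the evaluation cell.
small_train, small_test = [0.1409, 0.9920], [0.1521, 0.9890]
large_train, large_test = [0.0077, 1.0], [0.0093, 1.0]

for name, train, test in [("small", small_train, small_test),
                          ("large", large_train, large_test)]:
    absolute, relative = loss_gap(train, test)
    print(f"{name}: absolute gap={absolute:.4f}, relative gap={relative:.1%}")
```

With these rounded values the relative gap is roughly 8% for `small_model` and roughly 21% for `large_model`, while both absolute gaps are small.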

## Training models with the Estimators API
  
In this video, we'll take a look at the high-level Estimators API, which was elevated in importance in TensorFlow 2.0.
  
**What is the Estimators API?**
  
The Estimators API is a high-level TensorFlow submodule. Relative to the core, lower-level TensorFlow APIs and the high-level Keras API, model building in the Estimators API is less flexible. This is because it enforces a set of best practices by placing restrictions on model architecture and training. The upside of using the Estimators API is that it allows for faster deployment. Models can be specified, trained, evaluated, and deployed with less code. Furthermore, there are many premade models that can be instantiated by setting a handful of model parameters.
  
- High-level submodule 
- Less flexible 
- Faster deployment 
- Many premade models
  
<img src='../_images/estimator-api-keras-use.png' alt='img' width='740'>
  
1 Image taken from https://www.tensorflow.org/guide/premade_estimators
  
**Model specification and training**
  
So what does the typical model specification and training process look like in the Estimators API? Well, it starts with the definition of feature columns, which specify the shape and type of your data. Next, you load and transform your data within a function. The output of this function will be a dictionary object of features and your labels. The next step is to define an estimator. In this video, we'll use premade estimators, but you can also define custom estimators with different architectures. Finally, you will train the model you defined. Note that all model objects created through the Estimators API have train, evaluate, and predict operations.
  
1. Define feature columns
2. Load and transform data
3. Define an estimator
4. Apply train operation
  
**Defining feature columns**
  
Let's step through this procedure to get a sense of how it works. We'll first define the feature columns. If we were working with the housing dataset from chapter 2, we might define a numeric feature column for size using `feature_column.numeric_column`. Note that we supplied the dictionary key, "size," to the operation. We will do this for each feature column we create. We may also want a categorical feature column for the number of rooms using `feature_column.categorical_column_with_vocabulary_list`.
  
<img src='../_images/estimator-api-keras-use1.png' alt='img' width='740'>
  
We can then merge these into a list of feature columns. Alternatively, if we were using the sign language MNIST dataset, we'd define a list containing a single vector of features.
  
<img src='../_images/estimator-api-keras-use2.png' alt='img' width='740'>
  
**Loading and transforming data**
  
We next need to define a function that transforms our data, puts the features in a dictionary, and returns both the features and labels. Note that we've simply taken three examples from the housing dataset for the sake of illustration. Using them, we've defined a dictionary with the keys "size" and "rooms," which maps to the feature columns we defined. Next, we define a list or array of labels, which give the price of the house in this case, and then return the features and labels.
  
<img src='../_images/estimator-api-keras-use3.png' alt='img' width='740'>
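
As a plain-Python sketch of the input function described above, the block below builds a feature dictionary with the keys "size" and "rooms" and returns it together with the labels. The numeric values are illustrative placeholders and are not guaranteed to match the slide.

```python
def input_fn():
    """Return (features, labels) for three illustrative houses."""
    # Feature dictionary: the keys must match the names given to the
    # feature columns ("size" and "rooms").
    features = {
        "size": [1340, 1690, 2720],  # illustrative house sizes
        "rooms": [1, 3, 4],          # illustrative room counts
    }
    # Labels: one sale price per example, in the same order as the features.
    labels = [221900, 538000, 180000]
    return features, labels


features, labels = input_fn()
# Sanity check: every feature list has exactly one entry per label.
lengths_match = all(len(values) == len(labels) for values in features.values())
print(sorted(features), lengths_match)
```

Keeping the dictionary keys identical to the feature column names is what lets an estimator match each column definition to its data.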
  
**Define and train a regression estimator**
  
We can now define and train the estimator. But before we do that, we have to define what estimator we actually want to train. If we're predicting house prices, we may want to use a deep neural network with a regression head using `estimator.DNNRegressor`. This allows us to predict a continuous target. Note that all we had to supply was the list of feature columns and the number of nodes in each hidden layer. The rest is handled automatically. We then apply the train function, supply our input function, and train for 20 steps by setting `steps=20`.
  
<img src='../_images/estimator-api-keras-use4.png' alt='img' width='740'>
  
**Define and train a deep neural network**
  
Alternatively, if we want to instead perform a classification task with a deep neural network, we just need to change the estimator to `estimator.DNNClassifier`, add the number of classes, and then train again. You can also use linear classifiers, boosted trees, and other common options. Just check the TensorFlow Estimators documentation for a complete list. Estimators might seem confusing initially, but they're very useful once you master them.
  
<img src='../_images/estimator-api-keras-use5.png' alt='img' width='740'>
  


### Preparing to train with Estimators
  
For this exercise, we'll return to the King County housing transaction dataset from chapter 2. We will again develop and train a machine learning model to predict house prices; however, this time, we'll do it using the `estimator` API.
  
Rather than completing everything in one step, we'll break this procedure down into parts. We'll begin by defining the feature columns and loading the data. In the next exercise, we'll define and train a premade `estimator`. Note that `feature_column` has been imported for you from `tensorflow`. Additionally, `numpy` has been imported as `np`, and the King County housing dataset is available as a `pandas` DataFrame named `housing`.
  
1. Complete the feature column for bedrooms and add another numeric feature column for bathrooms. Use `bedrooms` and `bathrooms` as the keys.
2. Create a list of the feature columns, `feature_list`, in the order in which they were defined.
3. Set `labels` to be equal to the price column in `housing`.
4. Complete the `bedrooms` entry of the `features` dictionary and add another entry for `bathrooms`.


In [55]:
housing = pd.read_csv('../_datasets/kc_house_data.csv')
print(housing.shape)
housing.head()

(21613, 21)


Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
0,7129300520,20141013T000000,221900.0,3,1.0,1180,5650,1.0,0,0,...,7,1180,0,1955,0,98178,47.5112,-122.257,1340,5650
1,6414100192,20141209T000000,538000.0,3,2.25,2570,7242,2.0,0,0,...,7,2170,400,1951,1991,98125,47.721,-122.319,1690,7639
2,5631500400,20150225T000000,180000.0,2,1.0,770,10000,1.0,0,0,...,6,770,0,1933,0,98028,47.7379,-122.233,2720,8062
3,2487200875,20141209T000000,604000.0,4,3.0,1960,5000,1.0,0,0,...,7,1050,910,1965,0,98136,47.5208,-122.393,1360,5000
4,1954400510,20150218T000000,510000.0,3,2.0,1680,8080,1.0,0,0,...,8,1680,0,1987,0,98074,47.6168,-122.045,1800,7503


In [63]:
# NOTE: tf.feature_column is deprecated; yield to Keras preprocessing layers.

# Define feature columns for bedrooms and bathrooms
bedrooms = tf.feature_column.numeric_column("bedrooms")
bathrooms = tf.feature_column.numeric_column("bathrooms")

# Define the list of feature columns
feature_list = [bedrooms, bathrooms]

# NOTE: in new code, yield to tf.keras.layers.Input().
def input_fn():
    # Define the labels
    labels = np.array(housing['price'])
    
    # Define the features
    features = {'bedrooms': np.array(housing['bedrooms']),
                'bathrooms': np.array(housing['bathrooms'])}
    
    return features, labels



<span style='color:#E74C3C'>WARNING DEPRECATION:</span>  

`bedrooms = tf.feature_column.numeric_column("bedrooms")`  
`bathrooms = tf.feature_column.numeric_column("bathrooms")`  
  
> <span style='color:#E74C3C'>WARNING:</span>  tensorflow:From /var/folders/pf/_zjf_55d7mgb5llg516d7fyc0000gn/T/ipykernel_60816/3128264456.py:2: numeric_column (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.  
Instructions for updating:  
Use Keras preprocessing layers instead, either directly or via the `tf.keras.utils.FeatureSpace` utility. Each of `tf.feature_column.*` has a functional equivalent in `tf.keras.layers` for feature preprocessing when training a Keras model.
  


In the next exercise, we'll use the feature columns and data input function to define and train an estimator.

### Defining Estimators
  
In the previous exercise, you defined a list of feature columns, `feature_list`, and a data input function, `input_fn()`. In this exercise, you will build on that work by defining an estimator that makes use of input data.
  
1. Use a deep neural network regressor with 2 nodes in both the first and second hidden layers and 1 training step.
2. Modify the code to use a `LinearRegressor()`, remove the `hidden_units=`, and set the number of `steps=` to 2.

<span style='color:#E74C3C'>WARNING DEPRECATION:</span>  

```python
# Define the model and set the number of steps
model = tf.estimator.DNNRegressor(feature_columns=feature_list, hidden_units=[2,2])
model.train(input_fn, steps=1)
```

<span style='color:#E74C3C'>WARNING OUTPUT:</span>  

> INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /var/folders/pf/_zjf_55d7mgb5llg516d7fyc0000gn/T/tmp3idhkhh4
INFO:tensorflow:Using config: {'_model_dir': '/var/folders/pf/_zjf_55d7mgb5llg516d7fyc0000gn/T/tmp3idhkhh4', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
  
> ---------------------------------------------------------------------------
> AttributeError                            Traceback (most recent call last)
/Users/alexandergursky/Local_Repository/_study-resources/Introduction_To_TensorFlow_In_Python/2023-07-13-4-High-Level-APIs.ipynb Cell 42 in 3
      1 # Define the model and set the number of steps
      2 model = tf.estimator.DNNRegressor(feature_columns=feature_list, hidden_units=[2,2])
----> 3 model.train(input_fn, steps=1)

> File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow_estimator/python/estimator/estimator.py:360, in Estimator.train(self, input_fn, hooks, steps, max_steps, saving_listeners)
    357 hooks.extend(self._convert_train_steps_to_hooks(steps, max_steps))
    359 saving_listeners = _check_listeners_type(saving_listeners)
--> 360 loss = self._train_model(input_fn, hooks, saving_listeners)
    361 tf.compat.v1.logging.info('Loss for final step: %s.', loss)
    362 return self

> File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow_estimator/python/estimator/estimator.py:1188, in Estimator._train_model(self, input_fn, hooks, saving_listeners)
   1186   return self._train_model_distributed(input_fn, hooks, saving_listeners)
   1187 else:
-> 1188   return self._train_model_default(input_fn, hooks, saving_listeners)

> File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow_estimator/python/estimator/estimator.py:1216, in Estimator._train_model_default(self, input_fn, hooks, saving_listeners)
   1213 features, labels, input_hooks = (
   1214     self._get_features_and_labels_from_input_fn(input_fn, ModeKeys.TRAIN))
   1215 worker_hooks.extend(input_hooks)
-> 1216 estimator_spec = self._call_model_fn(features, labels, ModeKeys.TRAIN,
   1217                                      self.config)
   1218 global_step_tensor = tf.compat.v1.train.get_global_step(g)
   1219 return self._train_with_estimator_spec(estimator_spec, worker_hooks,
   1220                                        hooks, global_step_tensor,
   1221                                        saving_listeners)

> File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow_estimator/python/estimator/estimator.py:1176, in Estimator._call_model_fn(self, features, labels, mode, config)
   1173   kwargs['config'] = config
   1175 tf.compat.v1.logging.info('Calling model_fn.')
-> 1176 model_fn_results = self._model_fn(features=features, **kwargs)
   1177 tf.compat.v1.logging.info('Done calling model_fn.')
   1179 if not isinstance(model_fn_results, model_fn_lib.EstimatorSpec):

> File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow_estimator/python/estimator/canned/dnn.py:1159, in DNNRegressorV2.__init__.._model_fn(features, labels, mode, config)
   1157 def _model_fn(features, labels, mode, config):
   1158   """Call the defined shared dnn_model_fn_v2."""
-> 1159   return dnn_model_fn_v2(
   1160       features=features,
   1161       labels=labels,
   1162       mode=mode,
   1163       head=head,
   1164       hidden_units=hidden_units,
   1165       feature_columns=tuple(feature_columns or []),
   1166       optimizer=optimizer,
   1167       activation_fn=activation_fn,
   1168       dropout=dropout,
   1169       config=config,
   1170       batch_norm=batch_norm)

> File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow_estimator/python/estimator/canned/dnn.py:570, in dnn_model_fn_v2(***failed resolving arguments***)
    566 # In TRAIN mode, create optimizer and assign global_step variable to
    567 # optimizer.iterations to make global_step increased correctly, as Hooks
    568 # relies on global step as step counter.
    569 if mode == ModeKeys.TRAIN:
--> 570   optimizer = optimizers.get_optimizer_instance_v2(optimizer)
    571   optimizer.iterations = tf.compat.v1.train.get_or_create_global_step()
    573 # Create EstimatorSpec.

> File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow_estimator/python/estimator/canned/optimizers.py:127, in get_optimizer_instance_v2(opt, learning_rate)
    125 if opt in six.iterkeys(_OPTIMIZER_CLS_NAMES_V2):
    126   if not learning_rate:
--> 127     if _optimizer_has_default_learning_rate(_OPTIMIZER_CLS_NAMES_V2[opt]):
    128       return _OPTIMIZER_CLS_NAMES_V2[opt]()
    129     else:

> File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow_estimator/python/estimator/canned/optimizers.py:90, in _optimizer_has_default_learning_rate(opt)
     89 def _optimizer_has_default_learning_rate(opt):
---> 90   signature = inspect.getargspec(opt.__init__)
     91   default_name_to_value = dict(zip(signature.args[::-1], signature.defaults))
     92   return 'learning_rate' in default_name_to_value

> AttributeError: module 'inspect' has no attribute 'getargspec'

<span style='color:#E74C3C'>WARNING DEPRECATION CODE:</span>  

```python
# Define the model and set the number of steps
model = tf.estimator.LinearRegressor(feature_columns=feature_list)
model.train(input_fn, steps=2)
```

<span style='color:#E74C3C'>WARNING NULL EXERCISE SUGGESTION:</span>  
Note that you have other premade `estimator` options, such as `BoostedTreesRegressor()`, and can also create your own custom estimators.

<span style='color:#E74C3C'>WARNING DEPRECATION ESTIMATORS:</span>  
```python
model = tf.estimator.DNNRegressor(feature_columns=feature_list, hidden_units=[2,2])
model.train(input_fn, steps=1)
```

> <span style='color:#E74C3C'>WARNING:</span>  WARNING:tensorflow:From /var/folders/pf/_zjf_55d7mgb5llg516d7fyc0000gn/T/ipykernel_60816/3006336195.py:2: DNNRegressorV2.__init__ (from tensorflow_estimator.python.estimator.canned.dnn) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
  
> <span style='color:#E74C3C'>WARNING:</span>  ValueError: Received a feature column from TensorFlow v1, but this is a TensorFlow v2 Estimator. Please either use v2 feature columns (accessible via tf.feature_column.* in TF 2.x) with this Estimator, or switch to a v1 Estimator for use with v1 feature columns accessible via tf.compat.v1.estimator.* and tf.compat.v1.feature_column.*, respectively.
  
<span style='color:#E74C3C'>WARNING DEPRECATION:</span>  
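The `AttributeError` in the traceback above is caused by a change to the `inspect` module in Python 3.11, which removed the `getargspec()` function. The installed `tensorflow_estimator` package still calls `inspect.getargspec()`, so this Estimator code is not compatible with Python 3.11.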
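
The failing call in the traceback, `inspect.getargspec(opt.__init__)`, checks whether an optimizer's constructor has a default `learning_rate`. A minimal sketch of the same check using `inspect.signature`, which is available in current Python versions, is shown below. `ToyOptimizer`, `NoDefaultOptimizer`, and `has_default_learning_rate` are hypothetical names used only for illustration; this is not a patch for `tensorflow_estimator`.

```python
import inspect


class ToyOptimizer:
    """Hypothetical stand-in for an optimizer class (not a TensorFlow class)."""

    def __init__(self, learning_rate=0.01, momentum=0.0):
        self.learning_rate = learning_rate
        self.momentum = momentum


class NoDefaultOptimizer:
    """Hypothetical optimizer whose learning_rate has no default value."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate


def has_default_learning_rate(cls):
    """Return True if cls.__init__ declares `learning_rate` with a default."""
    params = inspect.signature(cls.__init__).parameters
    param = params.get("learning_rate")
    return param is not None and param.default is not inspect.Parameter.empty


# On Python 3.11 and later this prints False, which is why the old call fails.
print(hasattr(inspect, "getargspec"))
print(has_default_learning_rate(ToyOptimizer))
print(has_default_learning_rate(NoDefaultOptimizer))
```

In practice, the options are to run the Estimator code on an older Python version or to move the model to `tf.keras`, as the deprecation messages recommend.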

<span style='color:#E74C3C'>WARNING DEPRECATION:</span>  
Warning: Estimators are not recommended for new code. Estimators run v1.Session-style code which is more difficult to write correctly, and can behave unexpectedly, especially when combined with TF 2 code. Estimators do fall under our compatibility guarantees, but will receive no fixes other than security vulnerabilities. See the migration guide for details.
  
<span style='color:#E74C3C'>CRITICAL DEPRECATION WARNING REFER TO DOCUMENTATION: NOTEBOOK>RESOURCE SECTION>Links>Tensorflow Estimators</span>  
  
  

## Chapter Complete
  
You've now completed this course on the fundamentals of the TensorFlow API in Python. In this final video, we'll review what you've learned, talk about two useful TensorFlow extensions, and then wrap up with a discussion of the transition to TensorFlow 2.0.
  
**What you learned**
  
In chapter 1, you learned low-level, basic, and advanced operations in TensorFlow. You learned how to define and manipulate variables and constants. You also learned the graph-based computational model that underlies TensorFlow and how it can be used to compute gradients and solve arbitrary optimization problems. In chapter 2, you learned how to load and transform data for use in your TensorFlow projects. You also saw how to use predefined and custom loss functions. We ended with a discussion of how to train models, and when and how to divide the training into batches.
  
In chapter 3, we moved on to training neural networks. You learned how to define neural network architecture in TensorFlow, both using low-level linear algebra operations and high-level Keras API operations. We talked about how to select activation functions and optimizers, and, ultimately, how to train models. In chapter 4, you learned how to make full use of the Keras API to train models in TensorFlow. We discussed the training and validation process and also introduced the high-level Estimators API, which can be used to streamline the production process.
  
**TensorFlow extensions**
  
In addition to what we covered, there are also two important TensorFlow extensions that did not fit into the course, but may be worthwhile to explore on your own. The first is TensorFlow Hub, which allows users to import pretrained models that can then be used to perform transfer learning. This will be particularly useful when you want to train an image classifier with a small number of images, but want to make use of a feature extractor trained on a much larger set of different images. TensorFlow Probability is another exciting extension, which is also currently available as a standalone module. One benefit of using TensorFlow Probability is that it provides additional statistical distributions that can be used for random number generation. It also enables you to incorporate trainable statistical distributions into your models. Finally, TensorFlow Probability provides an extended set of optimizers that are commonly used in statistical research. This gives you additional tools beyond what the core TensorFlow module provides.
  
**TensorFlow 2.0**
  
Finally, I will say a few words about the difference between TensorFlow 2 and TensorFlow 1. If you primarily develop in TensorFlow 1, you may have noticed that in TensorFlow 2 you do not need to define static graphs or enable eager execution: eager execution is enabled by default. Furthermore, TensorFlow 2 has substantially tighter integration with Keras. In fact, the core functionality of the TensorFlow 1 train module is handled by `tf.keras` operations in TensorFlow 2. In addition to the centrality of Keras, the Estimators API was also given a more prominent role when TensorFlow 2.0 was released, although, as the deprecation warnings earlier in this chapter show, Estimators are no longer recommended for new code. Finally, TensorFlow 2 still allows you to use static graphs, but they are available through the `tf.function` operation.
  
