In [1]:
%load_ext nb_black

- you're no longer going to build everything by hand; you're going to use existing tools
- we're going to show you how to build and train a multilayer convolutional neural network in even fewer lines of code
- Keras, the library we chose, runs on top of Google's TensorFlow and makes building deep networks easy and effective
- we do this because we want you to experience building bigger and better things


- in this lesson, you'll be building a deep neural network using a new set of tools
- you'll still have TensorFlow under the hood, but with an interface that makes testing and prototyping much faster

- to help us with this lesson, we're excited to welcome Drew Gray, who's leading the self-driving car team at Otto, which is now part of Uber
  - we're exploring whether we can get a car to drive itself, using only deep neural networks and nothing else
    - sometimes we call that behavioral cloning because you're training the network to clone human driving behavior
    - sometimes it's called end-to-end learning because the network is learning to predict the correct steering angle and speed, using only the inputs from the sensors


- for a number of years, people have been working on a more traditional sort of robotics approach
  - the robotics approach to building self-driving cars involves a lot of detailed knowledge about sensors, controls and planning
  - with the deep learning approach, we don't have to program all that detailed knowledge into the vehicle
    - we simply feed all the information we have into the network, and then we let the network figure out on its own what's important
    - also, deep learning allows us to build a feedback loop where the more we drive, the more data we can collect, which in turn allows us to learn how to drive even better

# Keras Overview

- [Keras](http://faroit.com/keras-docs/1.2.1/) makes coding deep neural networks simpler
- to demonstrate just how easy it is, you're going to build a simple fully-connected network in a few dozen lines of code
- we’ll be connecting the concepts that you’ve learned in the previous lessons to the methods that Keras provides
- the network you will build is similar to Keras’s [sample network](https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py) that builds out a convolutional neural network for [MNIST](http://yann.lecun.com/exdb/mnist/)
- for the network you will build, however, you're going to use a small subset of the [German Traffic Sign Recognition Benchmark](http://benchmark.ini.rub.de/?section=gtsrb&subsection=news) dataset that you've used previously
- the general idea for this example is that you'll first load the data, then define the network, and then finally train the network

# Neural Networks in Keras

- here are some core concepts you need to know for working with [Keras](https://keras.io/)

## Sequential Model

```python
from keras.models import Sequential

# Create the Sequential model
model = Sequential()
```

- the [keras.models.Sequential](https://keras.io/models/sequential/) class is a wrapper for the neural network model
- it provides common functions like `fit()`, `evaluate()`, and `compile()`
  - we'll cover these functions as we get to them. Let's start looking at the layers of the model


- see the documentation for `keras.models.Sequential` in Keras 2.0.9 [here](https://faroit.github.io/keras-docs/2.0.9/models/sequential/)

## Layers

- a Keras layer is just like a neural network layer
- there are fully connected layers, max pool layers, and activation layers
- you can add a layer to the model using the model's `add()` function
  - for example, a simple model would look like this:

```python
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Flatten

# Create the Sequential model
model = Sequential()

#1st Layer - Add a flatten layer
model.add(Flatten(input_shape=(32, 32, 3)))

#2nd Layer - Add a fully connected layer
model.add(Dense(100))

#3rd Layer - Add a ReLU activation layer
model.add(Activation('relu'))

#4th Layer - Add a fully connected layer
model.add(Dense(60))

#5th Layer - Add a ReLU activation layer
model.add(Activation('relu'))
```

- Keras will automatically infer the shape of all layers after the first layer
  - this means you only have to set the input dimensions for the first layer


- the first layer from above, `model.add(Flatten(input_shape=(32, 32, 3)))`, sets the input dimension to $(32, 32, 3)$ and the output dimension to $(3072)$, since $32 \times 32 \times 3 = 3072$
- the second layer takes in the output of the first layer and sets the output dimensions to $(100)$
- this chain of passing output to the next layer continues until the last layer, which is the output of the model
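
- the shape bookkeeping described above can be sketched in plain Python; `trace_dense_stack` is a hypothetical helper written for this note, not a Keras function, and it assumes each `Dense` layer has one bias per unit (activation layers are left out because they do not change the shape)

```python
import math

# Hypothetical helper (not a Keras function): trace output shapes and
# parameter counts for a Flatten layer followed by Dense layers.
def trace_dense_stack(input_shape, dense_widths):
    flat = math.prod(input_shape)              # Flatten: 32 * 32 * 3 = 3072
    rows = [("Flatten", (flat,), 0)]           # Flatten has no trainable weights
    fan_in = flat
    for width in dense_widths:
        params = fan_in * width + width        # weight matrix plus one bias per unit
        rows.append(("Dense", (width,), params))
        fan_in = width                         # this output feeds the next layer
    return rows

layer_summary = trace_dense_stack((32, 32, 3), [100, 60])
for name, shape, params in layer_summary:
    print(name, shape, params)
```

- only the first layer needs `input_shape`; every later width follows from the layer before it, which is the inference Keras performs for you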

## Quiz: Neural Networks in Keras

- in this quiz you will build a multi-layer feedforward neural network to classify traffic sign images using Keras
  - set the first layer to a `Flatten()` layer with the `input_shape` set to $(32, 32, 3)$
  - set the second layer to a `Dense()` layer with an output width of $128$
  - use a ReLU activation function after the second layer
  - set the output layer width to $5$, because for this data set there are only $5$ classes
  - use a softmax activation function after the output layer
  - train the model for $3$ epochs; you should be able to get over $50\%$ training accuracy
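
- as a rough illustration of the two activations named in the quiz, here is a NumPy sketch of ReLU and softmax; the function names and the sample values are made up for this note, and the code is not how Keras implements them internally

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x): negative activations are clamped to zero
    return np.maximum(0.0, x)

def softmax(logits):
    # Subtract the row-wise max before exponentiating for numerical stability
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# Made-up hidden activations and output scores for one image and 5 classes
hidden = relu(np.array([[-1.5, 0.0, 2.0]]))
logits = np.array([[2.0, -1.0, 0.5, 0.0, -3.0]])
probs = softmax(logits)

print(hidden)
print(probs, probs.sum())
```

- softmax turns the $5$ output scores into probabilities that sum to $1$, which is why it sits after the output layer rather than after a hidden layer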


- to get started, review the Keras documentation about models and layers
- the Keras example of a [Multi-Layer Perceptron](https://github.com/keras-team/keras/blob/master/examples/mnist_mlp.py) network is similar to what you need to do here
  - use that as a guide, but keep in mind that there are a number of differences


- the data set used in these quizzes can be downloaded [here](https://d17h27t6h515a5.cloudfront.net/topher/2017/March/58dbf6d5_small-traffic-set/small-traffic-set.zip)

In [None]:
import pickle
import numpy as np
import tensorflow as tf

# Load pickled data
with open("resources/small_train_traffic.p", mode="rb") as f:
    data = pickle.load(f)

# split data
X_train, y_train = data["features"], data["labels"]

# Setup Keras
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Flatten

# TODO: Build the Fully Connected Neural Network in Keras Here
model = Sequential()
model.add(Flatten(input_shape=(32, 32, 3)))
model.add(Dense(128))
model.add(Activation("relu"))
model.add(Dense(5))
model.add(Activation("softmax"))

# An Alternative Solution
# model = Sequential()
# model.add(Flatten(input_shape=(32, 32, 3)))
# model.add(Dense(128, activation='relu'))
# model.add(Dense(5, activation='softmax'))

# preprocess data
X_normalized = np.array(X_train / 255.0 - 0.5)

from sklearn.preprocessing import LabelBinarizer

label_binarizer = LabelBinarizer()
y_one_hot = label_binarizer.fit_transform(y_train)

model.compile("adam", "categorical_crossentropy", ["accuracy"])
# TODO: change the number of training epochs to 3
history = model.fit(X_normalized, y_one_hot, epochs=3, validation_split=0.2)
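
- the preprocessing step in the cell above can be illustrated on made-up data; this sketch assumes 8-bit pixel values and more than two classes (for which `LabelBinarizer` produces one column per class in sorted order), and the `_demo` arrays are hypothetical stand-ins for the pickled data

```python
import numpy as np

# Made-up stand-ins for the pickled data: 4 tiny "images" and their labels
X_demo = np.array([[0, 255], [128, 64], [255, 0], [32, 32]], dtype=np.uint8)
y_demo = np.array([2, 0, 1, 2])

# Scale pixel values from [0, 255] to [-0.5, 0.5]
X_demo_normalized = X_demo / 255.0 - 0.5

# One-hot encode: row i has a 1 in the column that matches label i
classes = np.unique(y_demo)  # sorted unique labels
y_demo_one_hot = (y_demo[:, None] == classes[None, :]).astype(int)

print(X_demo_normalized.min(), X_demo_normalized.max())
print(y_demo_one_hot)
```

- centering the inputs around zero and one-hot encoding the labels is what lets the model pair a softmax output with the `categorical_crossentropy` loss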

## Quiz: Convolutions in Keras

- build from the previous network
- add a [convolutional layer](https://keras.io/layers/convolutional/#convolution2d) with $32$ filters, a $3 \times 3$ kernel, and valid padding before the flatten layer
- add a ReLU activation after the convolutional layer
- train for $3$ epochs again; you should be able to get over $50\%$ accuracy
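
- to see what the convolutional layer described above does to the tensor shape, here is a small sketch using the standard output-size formulas; both functions are hypothetical helpers for this note (not Keras APIs) and assume a stride of $1$ unless stated otherwise

```python
# Hypothetical helpers (not Keras functions) for a 2D convolution layer.
def conv2d_output_shape(input_shape, filters, kernel_size, padding="valid", stride=1):
    height, width, _ = input_shape
    k_h, k_w = kernel_size
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the image
        out_h = (height - k_h) // stride + 1
        out_w = (width - k_w) // stride + 1
    elif padding == "same":
        # Padded so that the output size is ceil(input / stride)
        out_h = -(-height // stride)
        out_w = -(-width // stride)
    else:
        raise ValueError("padding must be 'valid' or 'same'")
    return (out_h, out_w, filters)

def conv2d_param_count(input_shape, filters, kernel_size):
    channels = input_shape[2]
    k_h, k_w = kernel_size
    # One k_h x k_w x channels kernel per filter, plus one bias per filter
    return k_h * k_w * channels * filters + filters

conv_shape = conv2d_output_shape((32, 32, 3), filters=32, kernel_size=(3, 3))
conv_params = conv2d_param_count((32, 32, 3), filters=32, kernel_size=(3, 3))
flattened = conv_shape[0] * conv_shape[1] * conv_shape[2]

print(conv_shape, conv_params, flattened)
```

- with valid padding the $32 \times 32$ image shrinks to $30 \times 30$, so the flatten layer that follows now sees $30 \times 30 \times 32$ values instead of $3072$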


- hint
  - the Keras example of a [convolutional neural network](https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py) for MNIST would be a good example to review

In [None]:
import pickle
import numpy as np
import tensorflow as tf

# Load pickled data
with open("resources/small_train_traffic.p", mode="rb") as f:
    data = pickle.load(f)

# split data
X_train, y_train = data["features"], data["labels"]

# Setup Keras
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Flatten
from keras.layers.convolutional import Conv2D

# TODO: Build Convolutional Neural Network in Keras Here
model = Sequential()
model.add(
    Conv2D(input_shape=(32, 32, 3), filters=32, kernel_size=(3, 3), padding="valid")
)
model.add(Activation("relu"))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation("relu"))
model.add(Dense(5))
model.add(Activation("softmax"))

# Preprocess data
X_normalized = np.array(X_train / 255.0 - 0.5)

from sklearn.preprocessing import LabelBinarizer

label_binarizer = LabelBinarizer()
y_one_hot = label_binarizer.fit_transform(y_train)

# compile and train model
# Training for 3 epochs should result in > 50% accuracy
model.compile("adam", "categorical_crossentropy", ["accuracy"])
history = model.fit(X_normalized, y_one_hot, epochs=3, validation_split=0.2)

## Quiz: Pooling in Keras

- build from the previous network
- add a $2 \times 2$ [max pooling layer](https://keras.io/layers/pooling/#maxpooling2d) immediately following your convolutional layer
- train for 3 epochs again; you should be able to get over $50\%$ training accuracy
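
- a minimal NumPy sketch of $2 \times 2$ max pooling on a single feature map, assuming a stride of $2$ and even height and width; `max_pool_2x2` is a hypothetical helper, not the Keras implementation

```python
import numpy as np

def max_pool_2x2(feature_map):
    # feature_map has shape (height, width, channels) with even height and width
    h, w, c = feature_map.shape
    # Split rows and columns into non-overlapping 2x2 blocks
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2, c)
    # Keep the largest value inside each block
    return blocks.max(axis=(1, 3))

fm = np.arange(16, dtype=float).reshape(4, 4, 1)
pooled = max_pool_2x2(fm)
print(pooled[:, :, 0])

# Shape after pooling the (30, 30, 32) output of the convolutional layer
pooled_shape = max_pool_2x2(np.zeros((30, 30, 32))).shape
print(pooled_shape)
```

- pooling halves the height and width while keeping the number of channels, which cuts the number of inputs to the dense layer by a factor of $4$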

In [None]:
import pickle
import numpy as np
import tensorflow as tf

# Load pickled data
with open("resources/small_train_traffic.p", mode="rb") as f:
    data = pickle.load(f)

# split the data
X_train, y_train = data["features"], data["labels"]

# Setup Keras
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D

# TODO: Build Convolutional Pooling Neural Network in Keras Here
model = Sequential()
model.add(
    Conv2D(input_shape=(32, 32, 3), filters=32, kernel_size=(3, 3), padding="valid")
)
model.add(MaxPooling2D((2, 2)))
model.add(Activation("relu"))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation("relu"))
model.add(Dense(5))
model.add(Activation("softmax"))

# Preprocess data
X_normalized = np.array(X_train / 255.0 - 0.5)

from sklearn.preprocessing import LabelBinarizer

label_binarizer = LabelBinarizer()
y_one_hot = label_binarizer.fit_transform(y_train)

# compile and fit model
model.compile("adam", "categorical_crossentropy", ["accuracy"])
history = model.fit(X_normalized, y_one_hot, epochs=3, validation_split=0.2)

## Quiz: Dropout in Keras

- build from the previous network
- add a [dropout layer](https://keras.io/layers/core/#dropout) after the pooling layer
  - set the dropout rate to $50\%$
- note from the documentation above that the dropout rate in Keras is the opposite of the `keep_prob` argument of `tf.nn.dropout` in TensorFlow 1.x!
  - TensorFlow uses the probability to keep nodes, while Keras uses the probability to drop them
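
- the relationship between the two conventions can be sketched with a small NumPy version of inverted dropout; this is an illustrative sketch under the assumption that kept activations are rescaled by $1 / (1 - \text{rate})$ during training, and the `dropout` function here is hypothetical rather than a Keras or TensorFlow call

```python
import numpy as np

def dropout(x, rate, rng):
    # rate is the probability of dropping a unit (the Keras convention);
    # the keep probability is its complement
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob
    # Rescale the kept units so the expected activation stays the same
    return x * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones((1000, 10))
dropped = dropout(activations, rate=0.5, rng=rng)

print(np.unique(dropped))
print(dropped.mean())
```

- a rate of $0.5$ in this sketch corresponds to a keep probability of $0.5$, so the two conventions happen to coincide only at $50\%$; for any other value they are complements of each other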

In [None]:
import pickle
import numpy as np
import tensorflow as tf

# Load pickled data
with open("resources/small_train_traffic.p", mode="rb") as f:
    data = pickle.load(f)

# split the data
X_train, y_train = data["features"], data["labels"]

# Setup Keras
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Flatten, Dropout
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D

# TODO: Build Convolutional Pooling Neural Network with Dropout in Keras Here
model = Sequential()
model.add(
    Conv2D(input_shape=(32, 32, 3), filters=32, kernel_size=(3, 3), padding="valid")
)
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.5))
model.add(Activation("relu"))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation("relu"))
model.add(Dense(5))
model.add(Activation("softmax"))

# Preprocess data
X_normalized = np.array(X_train / 255.0 - 0.5)

from sklearn.preprocessing import LabelBinarizer

label_binarizer = LabelBinarizer()
y_one_hot = label_binarizer.fit_transform(y_train)

# compile and fit model
model.compile("adam", "categorical_crossentropy", ["accuracy"])
history = model.fit(X_normalized, y_one_hot, epochs=3, validation_split=0.2)

## Quiz: Testing in Keras

- once you've picked out your best model, it's time to test it!
- try to get the highest validation accuracy possible
  - feel free to use all the previous concepts and train for as many epochs as needed
- select your best model and train it one more time
- use the test data and the Keras [evaluate()](https://keras.io/models/model/#evaluate) method to see how well the model does
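
- to make the numbers reported for the test set easier to read, here is a NumPy sketch of the two quantities configured in `compile()`: categorical cross-entropy and accuracy; the functions and sample predictions are made up for illustration and are not the Keras implementations

```python
import numpy as np

def categorical_accuracy(y_true_one_hot, y_pred_probs):
    # A prediction counts as correct when the most probable class matches the label
    true_classes = np.argmax(y_true_one_hot, axis=1)
    pred_classes = np.argmax(y_pred_probs, axis=1)
    return float(np.mean(true_classes == pred_classes))

def categorical_crossentropy(y_true_one_hot, y_pred_probs, eps=1e-7):
    # Clip to avoid log(0), then average the per-sample losses
    clipped = np.clip(y_pred_probs, eps, 1.0 - eps)
    return float(-np.mean(np.sum(y_true_one_hot * np.log(clipped), axis=1)))

# Made-up labels and predicted probabilities for 4 samples and 3 classes
y_true_demo = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_pred_demo = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.5, 0.3],
])

demo_accuracy = categorical_accuracy(y_true_demo, y_pred_demo)
demo_loss = categorical_crossentropy(y_true_demo, y_pred_demo)
print(demo_accuracy, demo_loss)
```

- accuracy only checks the top prediction, while the loss also reflects how confident the model was, so a model can reach perfect accuracy and still report a non-zero loss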

In [2]:
import pickle
import numpy as np
import tensorflow as tf

# Load pickled data
with open("resources/small_train_traffic.p", mode="rb") as f:
    data = pickle.load(f)

# Split the data
X_train, y_train = data["features"], data["labels"]

# Setup Keras
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Flatten, Dropout
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D

# TODO: Build the Final Test Neural Network in Keras Here
model = Sequential()
model.add(
    Conv2D(input_shape=(32, 32, 3), filters=32, kernel_size=(3, 3), padding="valid")
)
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.5))
model.add(Activation("relu"))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation("relu"))
model.add(Dense(5))
model.add(Activation("softmax"))

# preprocess data
X_normalized = np.array(X_train / 255.0 - 0.5)

from sklearn.preprocessing import LabelBinarizer

label_binarizer = LabelBinarizer()
y_one_hot = label_binarizer.fit_transform(y_train)

# compile and fit the model
# set epochs to 10
model.compile("adam", "categorical_crossentropy", ["accuracy"])
history = model.fit(X_normalized, y_one_hot, epochs=10, validation_split=0.2)

Using TensorFlow backend.



Train on 80 samples, validate on 20 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [3]:
# evaluate model against the test data
with open("resources/small_test_traffic.p", "rb") as f:
    data_test = pickle.load(f)

X_test = data_test["features"]
y_test = data_test["labels"]

# preprocess data
X_normalized_test = np.array(X_test / 255.0 - 0.5)
# reuse the encoder fitted on the training labels so the class columns match
y_one_hot_test = label_binarizer.transform(y_test)

print("Testing")

metrics = model.evaluate(X_normalized_test, y_one_hot_test)
for metric_name, metric_value in zip(model.metrics_names, metrics):
    print("{}: {}".format(metric_name, metric_value))

Testing
loss: 0.2102127969264984
accuracy: 1.0


# Conclusion

- the work we just did, building and training a multi-layer convolutional neural network, would have taken hundreds of lines of code just a few years ago
- as deep neural networks become increasingly important to everything from self-driving cars to voice recognition, new libraries are making it much easier to use deep learning to solve real problems


- next up, you're going to learn how to take networks that have already been trained, and fine-tune them to accelerate your own work
  - this is called transfer learning
  - transfer learning is a standard tool in deep learning
    - it can substantially reduce your training time and improve your results