# Chapter 2: Deep Learning

## Your First Deep Neural Network

In the example below, we use an _artificial neural network_ (ANN) to classify images from the CIFAR-10 dataset.

In [1]:
%tensorflow_version 1.x
import numpy as np
from keras.utils import to_categorical
from keras.datasets import cifar10

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

NUM_CLASSES = 10

# Scale the pixel values from [0, 255] to the interval [0, 1].
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# One-hot encode the integer labels into vectors of length NUM_CLASSES.
y_train = to_categorical(y_train, NUM_CLASSES)
y_test = to_categorical(y_test, NUM_CLASSES)

TensorFlow 1.x selected.


Using TensorFlow backend.


Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
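
To make the two preprocessing steps concrete, the following minimal sketch reproduces them with plain NumPy on a tiny made-up batch. The `one_hot` helper is a hypothetical stand-in written for illustration. It is not Keras' `to_categorical` implementation, though for integer labels it should produce the same kind of array.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical helper: turn integer labels into one-hot rows."""
    labels = np.asarray(labels).reshape(-1)  # CIFAR-10 labels arrive with shape (n, 1)
    encoded = np.zeros((labels.size, num_classes), dtype='float32')
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Made-up stand-ins for a few pixel values and labels.
pixels = np.array([0, 51, 255], dtype='uint8')
labels = np.array([[3], [0], [9]])

scaled = pixels.astype('float32') / 255.0   # values now lie in [0, 1]
encoded = one_hot(labels, num_classes=10)   # shape (3, 10), a single 1.0 per row

print(scaled)
print(encoded.shape)
```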


### Building the Model

The two code cells below show how to define a neural network with Keras' Sequential API and with its functional API, respectively.

In [0]:
from keras.models import Sequential
from keras.layers import Flatten, Dense

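model = Sequential([
  # Flatten each 32x32x3 image into a 3,072-element vector before the dense
  # layers, so this model matches the functional version below.
  Flatten(input_shape=(32, 32, 3)),
  Dense(200, activation='relu'),
  Dense(150, activation='relu'),
  Dense(10, activation='softmax'),
])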

In [0]:
from keras.layers import Input, Flatten, Dense
from keras.models import Model

input_layer = Input(shape=(32, 32, 3))
flatten_layer = Flatten()(input_layer)
hidden1 = Dense(units=200, activation='relu')(flatten_layer)
hidden2 = Dense(units=150, activation='relu')(hidden1)
output_layer = Dense(units=10, activation='softmax')(hidden2)

model = Model(input_layer, output_layer)

As the name suggests, the `Input` layer is the entry point into the network. The `Flatten` layer flattens a multidimensional input into a one-dimensional vector (here, 32 × 32 × 3 = 3,072 values per image). The `Dense` layers are fully connected layers: each unit computes a weighted sum of all its inputs, adds a bias, and then applies a nonlinear _activation_ function to the result.
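
What `Flatten` and a single `Dense` layer compute can be sketched in NumPy as follows. This is an illustration under stated assumptions, not how Keras implements these layers: the batch of four images is random, the weight initialization is arbitrary, and the activation is ReLU (defined in the next paragraph).

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up batch of 4 "images" with the CIFAR-10 shape (32, 32, 3).
batch = rng.random((4, 32, 32, 3)).astype('float32')

# Flatten: keep the batch axis, collapse the rest into one axis of 3,072 values.
flat = batch.reshape(batch.shape[0], -1)

# Dense with 200 units: one weight per (input, unit) pair plus one bias per unit.
# The random initialization here is illustrative only.
weights = rng.normal(scale=0.01, size=(flat.shape[1], 200)).astype('float32')
bias = np.zeros(200, dtype='float32')

pre_activation = flat @ weights + bias      # weighted sum plus bias
hidden = np.maximum(0.0, pre_activation)    # ReLU activation

print(flat.shape, hidden.shape)
```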

The activation functions we are most interested in are the rectified linear unit (ReLU) function:

$$ \text{ReLU}(x) = \max(0,x) $$

The leaky ReLU function:

$$ \text{LeakyReLU}(x) = \begin{cases} x & \text{if}\;x\geq0 \\ \alpha x & \text{otherwise} \end{cases} $$

where $\alpha$ is a positive constant whose value is close to zero.

There is also the sigmoid function:

$$ S(x) = \left(1 + e^{-x} \right)^{-1} $$

Finally, there is the softmax function, where each output, $y_i$, is given by

$$ y_i = \frac{e^{x_i}}{\sum\limits_j e^{x_j}}. $$
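
The four functions above translate almost directly into NumPy. The sketch below is for illustration only and is not the Keras implementation. The default `alpha=0.01` is an assumed value, and subtracting the maximum inside `softmax` is a common numerical-stability step that does not change the result of the formula.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # alpha=0.01 is an illustrative choice, not a value taken from the text.
    return np.where(x >= 0, x, alpha * x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))
print(leaky_relu(x))
print(sigmoid(x))
print(softmax(x))
```

Softmax outputs are positive and sum to one, which is why it is used on the final layer of a classifier like this one.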

You can use the `model.summary()` method to get information about the layers in the model and to check that each layer has the expected output shape:

In [6]:
model.summary()

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 32, 32, 3)         0         
_________________________________________________________________
flatten_4 (Flatten)          (None, 3072)              0         
_________________________________________________________________
dense_10 (Dense)             (None, 200)               614600    
_________________________________________________________________
dense_11 (Dense)             (None, 150)               30150     
_________________________________________________________________
dense_12 (Dense)             (None, 10)                1510      
Total params: 646,260
Trainable params: 646,260
Non-trainable params: 0
_________________________________________________________________
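
The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The short sketch below does that arithmetic. `dense_params` is a hypothetical helper named here for illustration only.

```python
def dense_params(num_inputs, num_units):
    """Weights (one per input-unit pair) plus one bias per unit."""
    return num_inputs * num_units + num_units

flattened = 32 * 32 * 3                  # 3,072 inputs after Flatten
layer_sizes = [flattened, 200, 150, 10]

# Pair each layer's input size with its number of units.
counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(counts)

print(counts)  # [614600, 30150, 1510]
print(total)   # 646260
```

The `Input` and `Flatten` layers contribute no parameters, so the total matches the 646,260 reported by `model.summary()`.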
