In [1]:
import keras
keras.__version__

Using TensorFlow backend.


'2.2.0'

## Getting started with Keras Sequential Model ##

The $Sequential$ model is a linear stack of layers.

You can create a $Sequential$ model by passing a list of layer instances to the constructor, or, as in the cell below, by adding layers one at a time with **model.add()**:

In [2]:
from keras.models import Sequential
from keras import layers
from keras import models

model = Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

In [3]:
model.summary()
# A convnet takes input tensors of shape (image_height, image_width, image_channels),
# not counting the batch dimension.
# Here the convnet is configured to process inputs of shape (28, 28, 1), the format of MNIST images.
# We do this by passing the argument input_shape=(28, 28, 1) to the first layer.

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 3, 3, 64)          36928     
Total params: 55,744
Trainable params: 55,744
Non-trainable params: 0
_________________________________________________________________


The output of every **Conv2D and MaxPooling2D** layer is a 3D tensor of shape (height, width, channels).

The **width and height** dimensions tend to shrink as you go deeper into the network.
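
The shrinking sizes in the summary can be reproduced by hand. The sketch below assumes the Keras defaults used in this notebook: 'valid' padding with stride 1 for **Conv2D**, and non-overlapping windows for **MaxPooling2D**. The helper names are made up for illustration.

```python
def conv_output_size(size, kernel):
    # 'valid' padding, stride 1: the kernel must fit entirely inside the input
    return size - kernel + 1


def pool_output_size(size, pool):
    # non-overlapping windows; leftover rows/columns are dropped
    return size // pool


# Same layer order as the model: conv, pool, conv, pool, conv
layer_specs = [("conv", 3), ("pool", 2), ("conv", 3), ("pool", 2), ("conv", 3)]

size = 28  # MNIST images are 28 x 28
trace = [size]
for kind, k in layer_specs:
    if kind == "conv":
        size = conv_output_size(size, k)
    else:
        size = pool_output_size(size, k)
    trace.append(size)

print(trace)  # [28, 26, 13, 11, 5, 3]
```

The same arithmetic applies to height and width because the images and kernels are square.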

**The number of channels** is controlled by the first argument passed to the **Conv2D** layers (32 or 64).
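
The number of filters also drives the parameter counts shown in the summary. As a rough check, assuming each filter has one weight per kernel position per input channel plus one bias, the counts can be recomputed as follows (the function name is hypothetical):

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # weights per filter = kernel_h * kernel_w * in_channels, plus 1 bias
    return (kernel_h * kernel_w * in_channels + 1) * filters


conv_params = [
    conv2d_params(3, 3, 1, 32),   # conv2d_1: 1 input channel, 32 filters
    conv2d_params(3, 3, 32, 64),  # conv2d_2: 32 input channels, 64 filters
    conv2d_params(3, 3, 64, 64),  # conv2d_3: 64 input channels, 64 filters
]

print(conv_params)       # [320, 18496, 36928]
print(sum(conv_params))  # 55744
```

The pooling layers contribute nothing because they have no trainable weights.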

The next step is to feed the last output tensor (of shape (3, 3, 64)) into a densely connected classifier network:
* The classifier is a stack of **Dense** layers.
* These classifiers process vectors (which are 1D), whereas the current output is a 3D tensor, so it has to be flattened first.

In [4]:
# First, we have to flatten the 3D outputs into 1D, and then add a few Dense layers on top

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

In [5]:
# We'll do a 10-way classification, using a final layer with 10 outputs and a softmax activation
# This is what the network looks like now:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten_1 (Flatten)          (None, 576)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                36928     
__________

The (3, 3, 64) outputs are flattened into vectors of shape (576,) before going through the **two Dense** layers.
* We get 576 from 3 * 3 * 64
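
The flattened size and the sizes of the two **Dense** layers can be checked with a few lines of arithmetic. This is a sketch that assumes each dense unit has one weight per input plus one bias. The helper name is hypothetical, and the 55,744 figure is the convolutional total reported by the first summary.

```python
import math


def dense_params(inputs, units):
    # one weight per input plus one bias, for every unit
    return (inputs + 1) * units


flat_size = math.prod((3, 3, 64))      # elements in one (3, 3, 64) tensor
dense_1 = dense_params(flat_size, 64)  # Dense(64) on the flattened vector
dense_2 = dense_params(64, 10)         # Dense(10) softmax output layer
total = 55744 + dense_1 + dense_2      # conv base + classifier

print(flat_size)         # 576
print(dense_1, dense_2)  # 36928 650
print(total)             # 93322
```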

Now, let's train the convnet on the MNIST digits.

In [6]:
# Training the convnet on MNIST images:

from keras.datasets import mnist
from keras.utils import to_categorical

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=64)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0xb23d75ef0>
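
The preprocessing in the training cell does two things: it scales pixel values from the range 0 to 255 into [0, 1], and it one-hot encodes the integer labels to match the 10-way softmax output. The pure-Python sketch below only illustrates what those two steps produce. `one_hot` is a hypothetical helper, not the Keras implementation of `to_categorical`.

```python
def one_hot(labels, num_classes):
    """Return one row per label, with 1.0 at the label's index and 0.0 elsewhere."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


# Three example digit labels, encoded for a 10-class problem
encoded = one_hot([5, 0, 4], num_classes=10)
print(encoded[0])  # 1.0 at index 5, zeros elsewhere

# Pixel scaling: 0 maps to 0.0 and 255 maps to 1.0
scaled = [pixel / 255 for pixel in (0, 128, 255)]
print(scaled)
```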

In [7]:
# Let's evaluate the model on the test data:

test_loss, test_acc = model.evaluate(test_images, test_labels)
test_acc



0.9906

The densely connected network from Chapter 2 had a test accuracy of 97.8%, whereas this basic convnet has a test accuracy of 99.1%.

This lowers the error rate from 2.2% to about 0.94%, a relative reduction of roughly 57%.
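
A relative error reduction is computed from error rates (1 - accuracy), not from the accuracies themselves. The sketch below plugs in the 97.8% baseline and the 0.9906 test accuracy measured in this notebook. The function name is hypothetical. The result is sensitive to the exact accuracy: a run reaching 99.3% would give a reduction of about 68%.

```python
def relative_error_reduction(acc_before, acc_after):
    # Compare error rates rather than accuracies
    err_before = 1.0 - acc_before
    err_after = 1.0 - acc_after
    return (err_before - err_after) / err_before


measured = relative_error_reduction(0.978, 0.9906)
higher_run = relative_error_reduction(0.978, 0.993)

print(round(measured, 3))    # 0.573
print(round(higher_run, 3))  # 0.682
```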