In [1]:
from keras import layers
from keras import models

# Small convnet: a stack of alternating Conv2D and MaxPooling2D layers.
# The input is a (height, width, channels) tensor: 28x28 grayscale MNIST images.
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928     
Total params: 55,744
Trainable params: 55,744
Non-trainable params: 0
_________________________________________________________________
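
The output shapes and parameter counts in the summary above can be reproduced by hand. The following is a minimal sketch, assuming the Keras defaults for these layers (no padding, stride 1 for `Conv2D`, non-overlapping windows for `MaxPooling2D`). The helper names and the `layers_spec` list are hypothetical, introduced only for this illustration.

```python
def conv2d_output_size(size, kernel):
    """Spatial size after a convolution with no padding and stride 1."""
    return size - kernel + 1


def max_pool_output_size(size, pool):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


def conv2d_param_count(kernel, in_channels, filters):
    """kernel * kernel * in_channels weights plus one bias, per filter."""
    return (kernel * kernel * in_channels + 1) * filters


# Hypothetical description of the stack built above: (layer kind, filters or pool size).
layers_spec = [("conv", 32), ("pool", 2), ("conv", 64), ("pool", 2), ("conv", 64)]

size, channels, total_params = 28, 1, 0
shapes = []
for kind, value in layers_spec:
    if kind == "conv":
        total_params += conv2d_param_count(3, channels, value)
        size, channels = conv2d_output_size(size, 3), value
    else:
        size = max_pool_output_size(size, value)
    shapes.append((size, size, channels))

print(shapes)        # one (height, width, channels) tuple per layer
print(total_params)  # sum of the three Conv2D parameter counts
```

Each 3x3 convolution trims one pixel from every border, and each 2x2 pooling step halves the spatial size (rounding down), which is why the feature map shrinks from 28x28 to 3x3.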


In [2]:
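# Flatten the (3, 3, 64) feature map into a 576-element vector,
# then classify it with a small stack of Dense layers (10 output classes).
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))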

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten (Flatten)            (None, 576)               0         
_________________________________________________________________
dense (Dense)                (None, 64)                3

In [4]:
from keras.datasets import mnist
from keras.utils import to_categorical

# Load MNIST: 60,000 training and 10,000 test images, 28x28 pixels each.
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Add a channel axis and scale pixel values from [0, 255] to [0, 1].
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

# One-hot encode the integer labels to match the categorical_crossentropy loss.
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=64)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x15ac724ba90>

In [5]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(test_acc)

0.9905999898910522


In [None]:
# The example in 2.1 reached a test accuracy of 0.9779. The convnet reaches about 0.9906,
# which reduces the error rate by about 58% in relative terms (from 2.21% to 0.94%).
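
The relative error reduction quoted in this note can be checked with a few lines of arithmetic. This is only a sketch: the variable names are hypothetical, and the convnet accuracy is the test accuracy printed above, rounded to four decimals.

```python
baseline_acc = 0.9779  # accuracy quoted for the example in 2.1
convnet_acc = 0.9906   # test accuracy printed above, rounded

# Error rate is the complement of accuracy.
baseline_err = 1 - baseline_acc
convnet_err = 1 - convnet_acc

# Relative reduction: the share of the baseline error that the convnet removes.
relative_reduction = (baseline_err - convnet_err) / baseline_err
print(f"{relative_reduction:.1%}")
```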

"""
Convolution
The fundamental difference between a fully connected layer and a convolutional layer is this: a Dense layer learns global patterns in
its input feature space, while a convolutional layer learns local patterns. For images, these are patterns found in small 2D windows
of the input.
This characteristic gives ConvNets two interesting properties:
1. The patterns they learn are translation invariant: after learning a pattern in the bottom-left corner of an image, a ConvNet can
detect it anywhere else (for example, in the top-left corner). A fully connected network would have to learn the pattern again at
each new location instead of recognizing it as one it already knows. This makes ConvNets data-efficient on images (the visual world
is fundamentally translation invariant): they need relatively few training samples to learn representations that generalize.
2. They can learn spatial hierarchies of patterns: the first convolutional layer learns small local patterns such as edges, the
second learns larger patterns built from the features of the first layer, and so on. This lets ConvNets efficiently learn
increasingly complex and abstract visual concepts (the visual world is fundamentally spatially hierarchical).

The convolution operation is applied to a 3D tensor called a feature map. This tensor has two spatial axes (height and width) and one
depth axis (also called the channel axis). For an RGB image the depth is 3, because the image has three color channels (red, green,
blue). For a grayscale image such as an MNIST digit the depth is 1 (levels of gray). The convolution operation extracts small patches
from its input feature map and applies the same transformation to every patch, producing an output feature map.

The output feature map is also a 3D tensor with a height and a width. Its depth can be arbitrary, because it is a parameter of the
layer. The channels along the depth axis no longer stand for specific colors as in an RGB input; instead, each one stands for a filter.
A filter encodes a specific aspect of the input data. For example, a single filter could encode whether a face is present in the input.
"""
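
The patch-by-patch description in the notes above can be sketched in plain NumPy. This is a naive illustration, not how Keras implements `Conv2D`: the function `conv2d_valid` and all variable names are hypothetical, the filters are random rather than learned, and bias and activation are left out. It assumes stride 1 and no padding.

```python
import numpy as np


def conv2d_valid(feature_map, kernels):
    """Naive convolution with stride 1, no padding, no bias.

    feature_map: (height, width, in_depth)
    kernels:     (window_h, window_w, in_depth, out_depth)
    returns:     (height - window_h + 1, width - window_w + 1, out_depth)
    """
    height, width, in_depth = feature_map.shape
    window_h, window_w, kernel_depth, out_depth = kernels.shape
    assert in_depth == kernel_depth, "kernel depth must match input depth"

    out_h, out_w = height - window_h + 1, width - window_w + 1
    output = np.zeros((out_h, out_w, out_depth))
    for i in range(out_h):
        for j in range(out_w):
            # Extract one small patch from the input feature map.
            patch = feature_map[i:i + window_h, j:j + window_w, :]
            # Apply the same transformation to every patch: one dot product per filter.
            output[i, j, :] = np.tensordot(patch, kernels, axes=3)
    return output


rng = np.random.default_rng(0)
kernels = rng.normal(size=(3, 3, 1, 32))  # 32 random 3x3 filters over 1 input channel

image = np.zeros((28, 28, 1))
image[20:23, 2:5, 0] = 1.0                   # a small pattern near the bottom-left corner
shifted = np.roll(image, shift=-15, axis=0)  # the same pattern near the top-left corner

response = conv2d_valid(image, kernels)
shifted_response = conv2d_valid(shifted, kernels)

print(response.shape)
# The filters respond identically wherever the pattern appears.
print(np.allclose(response[20, 2], shifted_response[5, 2]))
```

Because the same filters slide over every position, moving the pattern only moves the response. That weight sharing is what the translation invariance property in the notes relies on.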