In [1]:
import keras
from keras import layers

In [3]:
inputs = keras.Input(shape=(28,28,1))
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(inputs)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
outputs = layers.Dense(10, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

In [4]:
model.summary()

#### Explanation:

<ol>
  <li>CNNs expect input tensors of shape <code>(image_height, image_width, image_channels)</code> (excluding the batch dimension).</li>
  <li>For MNIST, the input shape is configured as <code>(28, 28, 1)</code>.</li>
  <li>The output of each <code>Conv2D</code> and <code>MaxPooling2D</code> layer is a rank-3 tensor with shape <code>(height, width, channels)</code>.</li>
  <li>The width and height of feature maps typically decrease as the network goes deeper.</li>
  <li>The number of output channels in a <code>Conv2D</code> layer is set by its <code>filters</code> argument, which is also the first positional argument (here 32, 64, and 128).</li>
  <li>After the final <code>Conv2D</code> layer in the example, the output shape is <code>(3, 3, 128)</code>.</li>
  <li>Densely connected (<code>Dense</code>) layers, used here for classification, operate on one 1D vector per sample.</li>
  <li>A <code>Flatten</code> layer is used to convert the 3D output of the convolutional part into a 1D vector before feeding it to the <code>Dense</code> layers.</li>
  <li>The final <code>Dense</code> layer for 10-way classification (like MNIST digits) has 10 output units and a <code>softmax</code> activation function.</li>
</ol>
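
The shape progression described above can be checked by hand. The sketch below is plain Python, not Keras code: the helper names (`conv2d_output_size`, `max_pool_output_size`, `trace_shapes`) are made up for illustration, and it assumes the Keras defaults used in this model, namely `padding="valid"` with stride 1 for `Conv2D`, and a pooling stride equal to `pool_size` for `MaxPooling2D`.

```python
def conv2d_output_size(size, kernel_size, stride=1):
    # Assumes "valid" padding (no zero-padding), the Conv2D default.
    return (size - kernel_size) // stride + 1


def max_pool_output_size(size, pool_size):
    # Assumes the MaxPooling2D default: stride equal to pool_size, "valid" padding.
    return (size - pool_size) // pool_size + 1


def trace_shapes(height=28, width=28, channels=1):
    """Return (layer_name, output_shape) pairs for the convnet defined above."""
    shapes = [("input", (height, width, channels))]
    # (layer kind, number of filters); pooling layers keep the channel count.
    layer_specs = [("conv", 32), ("pool", None), ("conv", 64),
                   ("pool", None), ("conv", 128)]
    for kind, filters in layer_specs:
        if kind == "conv":
            height = conv2d_output_size(height, kernel_size=3)
            width = conv2d_output_size(width, kernel_size=3)
            channels = filters
        else:
            height = max_pool_output_size(height, pool_size=2)
            width = max_pool_output_size(width, pool_size=2)
        shapes.append((kind, (height, width, channels)))
    # Flatten collapses the three axes into a single vector per sample.
    shapes.append(("flatten", (height * width * channels,)))
    return shapes


for name, shape in trace_shapes():
    print(f"{name:8s} {shape}")
```

The spatial size goes 28, 26, 13, 11, 5, 3, so the last `Conv2D` layer outputs `(3, 3, 128)` and `Flatten` produces a vector of length 1152. These values should match the output shapes reported by `model.summary()`.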

In [7]:
from keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype("float32") / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype("float32") / 255

model.compile(
    optimizer="rmsprop",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"]
)

model.fit(train_images, train_labels, epochs=5, batch_size=64)

Epoch 1/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 10s 10ms/step - accuracy: 0.8798 - loss: 0.3839
Epoch 2/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 9s 10ms/step - accuracy: 0.9851 - loss: 0.0477
Epoch 3/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 9s 10ms/step - accuracy: 0.9904 - loss: 0.0305
Epoch 4/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 9s 10ms/step - accuracy: 0.9924 - loss: 0.0247
Epoch 5/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 9s 10ms/step - accuracy: 0.9957 - loss: 0.0152


<keras.src.callbacks.history.History at 0x1c7a0ed7770>

#### Let's evaluate the model on the test data

In [8]:
test_loss, test_accuracy = model.evaluate(test_images, test_labels)

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9892 - loss: 0.0341


#### The densely connected model from chapter 2 had a test accuracy of 97.8%; the basic convnet reaches 98.92%. The error rate drops from 2.2% to 1.08%, a relative decrease of about 51%.
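
The relative change in error rate can be recomputed from the two test accuracies quoted above. This is a minimal sketch: `relative_error_reduction` is a hypothetical helper written for this check, not part of Keras, and the two accuracies are simply copied from the text.

```python
def relative_error_reduction(baseline_accuracy, new_accuracy):
    """Fraction by which the error rate shrinks when moving to the new model."""
    baseline_error = 1.0 - baseline_accuracy
    new_error = 1.0 - new_accuracy
    return (baseline_error - new_error) / baseline_error


# Accuracies as quoted in the text: dense model vs. basic convnet.
reduction = relative_error_reduction(0.978, 0.9892)
print(f"Relative error reduction: {reduction:.1%}")
```

Comparing error rates rather than accuracies makes the gain easier to judge: a gain of about one percentage point in accuracy removes roughly half of the remaining mistakes.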