```
Title: Simple MNIST convnet
Author: [fchollet](https://twitter.com/fchollet)
Date created: 2015/06/19
Last modified: 2020/04/21
Description: A simple convnet that achieves ~99% test accuracy on MNIST.
```

Small modifications and exercises - James McDermott

* This notebook demonstrates the basic workflow: import libraries, load and preprocess data, create the model, compile, train, then evaluate or predict. Optionally, save the model, load it back, and evaluate or predict again.


In [1]:
## Setup

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

In [2]:
"""
## Prepare the data
"""

# Model / data parameters
num_classes = 10
input_shape = (28, 28, 1)

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# just to make training fast, we train on a small sub-sample 
# REMOVE these lines when you want to train for real
x_train = x_train[:1000]
y_train = y_train[:1000]

# Scale images to the [0, 1] range
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255
# Make sure images have shape (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
print("x_train shape:", x_train.shape) # (n_samples, height, width, n_channels)
print(x_train.shape[0], "train samples")
print(x_test.shape[0], "test samples")


# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)



Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step
x_train shape: (1000, 28, 28, 1)
1000 train samples
10000 test samples
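
The `to_categorical` call above turns each integer label into a one-hot row. A minimal NumPy sketch of the same idea, for illustration only (the helper name `one_hot` is hypothetical and not part of Keras):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (n_samples, num_classes) matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    # For row i, set the column given by labels[i] to 1
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

example = one_hot([5, 0, 4], num_classes=10)
print(example.shape)           # (3, 10)
print(example.argmax(axis=1))  # [5 0 4]
```

Taking `argmax` along the class axis recovers the original integer labels, which is handy when inspecting predictions later.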


In [3]:
"""
## Build the model: a small network so we can look at neuron outputs
"""

# a Sequential model means it is composed of a sequence of layers
model = keras.Sequential(
    [   
        # we specify the layers as a Python list, where each element is a constructor call
        
        # an Input layer is just there to tell Keras the expected shape
        keras.Input(shape=input_shape),
        
        # a typical Conv block
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        
        # another Conv block
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        
        # Flatten converts an "image" (height, width, n_channels) to a flat vector of length height * width * n_channels
        layers.Flatten(),
        
        # Dropout is a regulariser: it randomly zeroes half of its inputs, during training only
        layers.Dropout(0.5),
        
        # a standard classification head
        layers.Dense(num_classes, activation="softmax"),
    ]
)

# very useful for checking you understand what you have written:
model.summary()
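
The shapes and parameter counts that `model.summary()` reports can be checked by hand. The sketch below assumes the Keras defaults used in the cell above ("valid" padding and stride 1 for `Conv2D`, non-overlapping windows for `MaxPooling2D`). The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    # "valid" padding, stride 1: each spatial dimension shrinks by kernel - 1
    return size - kernel + 1

def pool_out(size, pool=2):
    # non-overlapping pooling: integer division, any remainder is dropped
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    # one kernel x kernel x in_channels weight block per filter, plus one bias each
    return kernel * kernel * in_channels * filters + filters

size = 28                                  # input is 28 x 28 x 1
size = pool_out(conv_out(size))            # Conv2D(32) -> 26, MaxPooling2D -> 13
size = pool_out(conv_out(size))            # Conv2D(64) -> 11, MaxPooling2D -> 5
flat = size * size * 64                    # length of the Flatten output
dense_params = flat * 10 + 10              # weights plus biases of the softmax layer
total = conv_params(1, 32) + conv_params(32, 64) + dense_params

print(flat)    # 1600
print(total)   # 34826
```

Almost half of the parameters sit in the final `Dense` layer, which is worth keeping in mind when attempting exercise 1.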

In [5]:
"""
## Train the model
"""

# basic hyperparameters
batch_size = 128
epochs = 3 # enough for a quick demo. increase to 15 for slightly better performance.

# compile: specify the loss and optimizer and any additional metrics we want to see (in addition to the loss)
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

# fit: pass X and y. with a validation split, we'll see both train and validation values for 
# loss and for any additional metrics
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

# save to disk for later
model.save("keras_mnist_32_64.keras")


Epoch 1/3
8/8 ━━━━━━━━━━━━━━━━━━━━ 3s 97ms/step - accuracy: 0.6951 - loss: 1.3189 - val_accuracy: 0.7600 - val_loss: 1.0261
Epoch 2/3
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 44ms/step - accuracy: 0.7229 - loss: 0.9751 - val_accuracy: 0.8200 - val_loss: 0.7644
Epoch 3/3
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 47ms/step - accuracy: 0.7692 - loss: 0.7801 - val_accuracy: 0.8500 - val_loss: 0.5978
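
The `8/8` in the progress bar is the number of batches per epoch. A short sketch of the arithmetic, assuming Keras holds out the given fraction of samples for validation and trains on the rest (the variable names are illustrative):

```python
import math

n_samples = 1000          # size of the sub-sample used above
validation_split = 0.1
batch_size = 128

n_train = int(n_samples * (1 - validation_split))   # samples left for training
n_val = n_samples - n_train                         # samples held out for validation
steps_per_epoch = math.ceil(n_train / batch_size)   # the last batch may be smaller

print(n_train, n_val, steps_per_epoch)   # 900 100 8
```

With only 100 validation samples, each validation accuracy value moves in steps of 0.01, so the figures above are noisy.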


Now we'll evaluate. Note that if we trained on a small subset (see above), test accuracy won't reach the ~99% promised in the description.

In [7]:
"""
## Evaluate the trained model
"""

score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])


Test loss: 0.6077887415885925
Test accuracy: 0.8295000195503235
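
`score[0]` is the categorical cross-entropy loss and `score[1]` is the accuracy. The sketch below shows how both could be computed from softmax outputs, using made-up toy predictions for three samples and three classes. It is an illustration of the formulas, not the Keras implementation (which, for example, also clips probabilities for numerical stability).

```python
import numpy as np

# Toy data (invented for illustration): one-hot targets and predicted probabilities
y_true = np.array([[0, 0, 1],
                   [1, 0, 0],
                   [0, 1, 0]], dtype="float64")
y_pred = np.array([[0.1, 0.2, 0.7],
                   [0.6, 0.3, 0.1],
                   [0.5, 0.4, 0.1]], dtype="float64")

# Cross-entropy: minus the log-probability assigned to the true class, averaged over samples
loss = float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Accuracy: fraction of samples where the most probable class is the true class
accuracy = float(np.mean(y_pred.argmax(axis=1) == y_true.argmax(axis=1)))

print("loss:", round(loss, 4))
print("accuracy:", round(accuracy, 4))
```

The third sample is misclassified but still contributes a finite loss, which is why loss and accuracy can move somewhat independently.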


In [8]:
"""
## Load the saved model and evaluate to be sure it's the same
"""

model2 = keras.models.load_model('keras_mnist_32_64.keras')
score = model2.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])

Test loss: 0.6077887415885925
Test accuracy: 0.8295000195503235


**Exercises**:

1. Convert it to be a dense network (no conv layers). Can we still get good performance?
2. Suppose we don't divide the input pixel values by 255. So, instead of the range [0, 1], the NN sees the range [0, 255]. Will performance be worse or better? Predict, then try it.
3. What happens if we add another few Conv and Pool layers? Predict, then try it.
4. Read Karpathy's "A Recipe for Training Neural Networks": http://karpathy.github.io/2019/04/25/recipe/