Most of the code is adapted from a sample in the book Deep Learning with Python, Second Edition:
https://livebook.manning.com/book/deep-learning-with-python-second-edition


# Use Keras to demonstrate Deep Learning with the MNIST dataset


In [None]:
import keras
from keras import layers
import matplotlib.pyplot as plt

In [None]:
(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()

## Explore the dataset
If you check the shapes, you see that the training set contains 60000 samples, each a 28 by 28 matrix of data points (pixels). Every pixel has a value from 0 to 255. The training labels contain the digits that were written. Look at the shape of the data and explore one image as a matrix of numbers. Can you see the digit?

In [None]:
print(train_images.shape)
print(train_labels.shape)

image_index = 523

# Print the values in the matrix for the image with the specified index
for row in train_images[image_index]:
    for number in row:
        print(f"{number:03}", end=' ')
    print()

If you could not see the digit in the previous printout, try again after some clean-up. All zeros are replaced by spaces, and all other values are replaced by 1, so the shape of the written digit stands out.

In [None]:
# Print all numbers from one row on one line, with 0 as space and others as 1
for row in train_images[image_index]:
    for number in row:
        print(' ' if number == 0 else '1', end=' ')
    print()  # To ensure the next print starts on a new line

In [None]:
# Print the label of the image
print(f"Label: {train_labels[image_index]}")

## Prepare the data
We need to flatten each 28 by 28 image into a 1D array of 784 values and normalize the pixel values to the range 0 to 1 by dividing by 255. The labels are already in the correct format.

In [None]:
# The reshape turns each 28 x 28 matrix into one array of length 784
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype("float32") / 255
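
To make the two preparation steps concrete, here is a minimal sketch on a made-up 2 x 3 "image". It uses plain Python lists instead of NumPy arrays so that every step is visible. The names `tiny_image`, `flatten`, and `normalize` are illustrative and not part of the notebook or of Keras.

```python
# Hypothetical 2 x 3 "image" with pixel values in the range 0-255
tiny_image = [
    [0, 128, 255],
    [64, 0, 32],
]


def flatten(image):
    """Concatenate the rows of a 2D image into one flat list (row by row)."""
    return [pixel for row in image for pixel in row]


def normalize(pixels, max_value=255):
    """Scale pixel values from the range 0..max_value to the range 0.0..1.0."""
    return [pixel / max_value for pixel in pixels]


flat = flatten(tiny_image)   # 2 x 3 matrix becomes one list of 6 values
scaled = normalize(flat)     # every value now lies between 0.0 and 1.0

print(flat)
print([round(value, 3) for value in scaled])
```

The reshape in the notebook cell does the same thing for all images at once: the rows of each 28 x 28 matrix are placed one after another, which gives 784 values per image.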

## Create the model
This architecture defines a simple fully connected feedforward neural network (also known as a multilayer perceptron, or MLP) using Keras. It is typically used for tasks like classification where input data is represented in a flat, non-spatial form (e.g., tabular data or flattened images). The model has two Dense layers, which are densely connected (also called fully connected) neural layers. The first Dense layer has 512 units and uses the ReLU activation function. The second (and last) layer is a 10-way softmax layer, which means it returns an array of 10 probability scores (summing to 1). Each score is the probability that the current digit image belongs to the corresponding digit class.
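
As a rough illustration of what the two layers compute, the sketch below runs one forward pass through a tiny network in plain Python. The sizes (3 inputs, 2 hidden units, 3 classes) and all weights are made up for readability, and the helper names `dense`, `relu`, and `softmax` are illustrative. This is not how Keras implements the layers internally.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: every output unit sees every input value."""
    return [
        sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        for unit_weights, bias in zip(weights, biases)
    ]


def relu(values):
    """ReLU keeps positive values and replaces negative values with 0."""
    return [max(0.0, value) for value in values]


def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


# Made-up numbers: 3 inputs -> 2 hidden units -> 3 output classes
inputs = [0.0, 0.5, 1.0]
hidden_weights = [[0.2, -0.4, 0.6], [-0.5, 0.1, -0.3]]
hidden_biases = [0.1, 0.0]
output_weights = [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]]
output_biases = [0.0, 0.0, 0.0]

hidden = relu(dense(inputs, hidden_weights, hidden_biases))
probabilities = softmax(dense(hidden, output_weights, output_biases))

print(probabilities)
print(sum(probabilities))
```

In the real model the same pattern applies with 784 inputs, 512 hidden units, and 10 output classes, and the weights are learned during training instead of being chosen by hand.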

In [None]:
model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax")
])

model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

In [None]:
history = model.fit(train_images, train_labels, epochs=5, batch_size=128)

## Evaluate the model
The model is trained, and we can evaluate it on the test set. The accuracy on the test set is a good indication of how well the model generalizes to new, unseen data. Before we do that, we check one image to explain the predictions.

In [None]:
# Reconstruct one of the test images to show that the model works
# Don't forget to reshape the array of 784 values back into a 28 x 28 matrix
test_image_index = 17

digit = test_images[test_image_index]
fig = plt.figure()  # Call the function so that a new figure is actually created
plt.imshow(digit.reshape((28, 28)), cmap='gray')
plt.show()

predictions = model.predict(test_images[test_image_index:test_image_index+1])
print("Softmax predictions for the digit above")
for idx, pred in enumerate(predictions[0]):
    print('{} - {:.5f}'.format(idx,pred))
print(f'The most likely digit based on the max value is {predictions[0].argmax()}')
print(f'The label tells us it is a {test_labels[test_image_index]}')

In [None]:
# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(test_images, test_labels)

In [None]:
# Compare the test accuracy with the training accuracy of the last epoch
print(f"test accuracy: {test_acc}, train accuracy: {history.history['accuracy'][-1]}")
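
The accuracy metric used here is the fraction of images for which the class with the highest softmax score matches the label. The sketch below reproduces that calculation on four made-up predictions with three classes each. The names `argmax`, `accuracy`, `toy_predictions`, and `toy_labels` are illustrative and not part of the notebook.

```python
def argmax(values):
    """Return the index of the largest value in a list."""
    return max(range(len(values)), key=lambda index: values[index])


def accuracy(predictions, labels):
    """Fraction of samples whose most likely class equals the label."""
    correct = sum(
        1 for scores, label in zip(predictions, labels) if argmax(scores) == label
    )
    return correct / len(labels)


# Made-up softmax outputs for four samples and three classes
toy_predictions = [
    [0.1, 0.7, 0.2],  # most likely class: 1
    [0.8, 0.1, 0.1],  # most likely class: 0
    [0.3, 0.3, 0.4],  # most likely class: 2
    [0.2, 0.5, 0.3],  # most likely class: 1
]
toy_labels = [1, 0, 2, 2]  # the last sample is misclassified

print(accuracy(toy_predictions, toy_labels))
```

A training accuracy that is clearly higher than the test accuracy suggests that the model fits the training data better than it generalizes to unseen data.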