<a href="https://colab.research.google.com/github/schmelto/machine-learning-with-python/blob/main/Deeplearning/MNIST.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# MNIST-Dataset

In [None]:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

## Loading the MNIST dataset
First of all, we load the MNIST dataset.

In [18]:
data = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = data.load_data()

We normalize the pixel values to the range from 0 to 1 by dividing them by 255.

In [19]:
train_images = train_images / 255
test_images = test_images / 255
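
To see what this division does, here is a minimal sketch on a tiny made-up array. `sample_images` is a hypothetical stand-in for the real images, assuming the pixels are stored as 8-bit integers in the range 0 to 255.

```python
import numpy as np

# Hypothetical stand-in for a batch of images: uint8 pixels in [0, 255].
sample_images = np.array([[0, 51, 255], [102, 204, 153]], dtype=np.uint8)

# Dividing by 255 converts the array to floats and rescales it to [0, 1].
normalized = sample_images / 255

print(normalized.dtype)                     # float64
print(normalized.min(), normalized.max())   # 0.0 1.0
```

Inputs on a small, common scale generally make training with gradient descent better behaved than raw values up to 255.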

* The picture of the handwritten 5 has the label 5.
* We want the label as a vector [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.], which is the format our network needs. This vector has a 1 at position 5 (counting from 0) and zeros everywhere else.

In [20]:
total_classes = 10
train_vec_labels = keras.utils.to_categorical(train_labels, total_classes)
test_vec_labels = keras.utils.to_categorical(test_labels, total_classes)
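
The idea behind this conversion can be sketched in plain NumPy. The helper `one_hot` below is hypothetical and only illustrates the encoding; it is not how `to_categorical` is implemented.

```python
import numpy as np

def one_hot(labels, total_classes):
    """Hypothetical helper: row i has a 1 at column labels[i], zeros elsewhere."""
    # Row k of the identity matrix is exactly the one-hot vector for class k.
    return np.eye(total_classes, dtype=np.float32)[np.asarray(labels)]

vec = one_hot([5, 0, 9], 10)
print(vec[0])  # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
```

Taking `vec.argmax(axis=1)` recovers the original integer labels, so no information is lost by the encoding.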

## Design of a network

Now we have normalized the input data and the labels are available as vectors. So we can finally start building a network for recognizing the handwritten numbers.

We want to define a very simple network with 3 layers (input layer, hidden layer and output layer):

* We use a keras.layers.Flatten layer as the input layer, which distributes the 28x28 matrices that we receive as inputs to 28x28 = 784 neurons
* Next, we use a keras.layers.Dense layer with 128 neurons for the hidden layer
* We use a keras.layers.Dense layer with 10 neurons as the output layer, since we want to recognize 10 classes (digits from 0-9)

In [12]:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='sigmoid'),
    keras.layers.Dense(10, activation='sigmoid')
])
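
As a rough check of how large this network is, the number of trainable parameters can be worked out by hand. This sketch assumes each Dense layer has one weight per input-output pair plus one bias per neuron (the Keras default); `dense_params` is a hypothetical helper.

```python
# Layer sizes taken from the model above: 784 inputs -> 128 hidden -> 10 outputs.
layer_sizes = [28 * 28, 128, 10]

def dense_params(n_in, n_out):
    # One weight per input-output pair plus one bias per output neuron.
    return n_in * n_out + n_out

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
print(per_layer)       # [100480, 1290]
print(sum(per_layer))  # 101770
```

The Flatten layer only reshapes its input and has no parameters of its own. Calling `model.summary()` should report the same counts.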

## Compiling the network

After we have defined our network, we have to compile it before we can start training.

In this step we define important parameters for the training phase:

* The **optimizer** is the learning algorithm used in training to improve the network. Last week we already got to know Gradient Descent and its optimized variant, Stochastic Gradient Descent.
* The **loss** is the cost function used. The aim during training is to minimize it. We already got to know the squared error function in week 1.
* The **metrics** are the quantities evaluated during training. For classification problems we are interested in the "accuracy".

In this example we use:

* The Stochastic Gradient Descent ("sgd") learning algorithm as our optimizer.
* The "mean_squared_error" cost function, which, unlike the plain squared error cost function, calculates the mean rather than the sum of the squared errors of the output neurons.

In [21]:
# sgd = keras.optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(
    optimizer='sgd',
    loss='mean_squared_error',
    metrics=['accuracy'])
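
The difference between the sum and the mean of the squared errors can be illustrated for a single image. The `output` values are made up for illustration, and the sketch assumes the plain squared error is the unscaled sum over the output neurons.

```python
import numpy as np

# One-hot target for the digit 5.
target = np.array([0, 0, 0, 0, 0, 1, 0, 0, 0, 0], dtype=float)
# Hypothetical network output for one image (values made up for illustration).
output = np.array([0.1, 0.0, 0.2, 0.0, 0.0, 0.6, 0.0, 0.1, 0.0, 0.0])

squared_errors = (target - output) ** 2
sum_squared_error = squared_errors.sum()     # sum over the 10 output neurons
mean_squared_error = squared_errors.mean()   # same sum divided by 10

print(round(float(sum_squared_error), 3))    # 0.22
print(round(float(mean_squared_error), 3))   # 0.022
```

Both functions have their minimum at the same outputs; the mean only rescales the loss by the number of output neurons.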

## Train the network

Now we can finally train our network. To do this, we call the fit method and pass our training images as inputs together with the associated label vectors as desired outputs. The number of epochs indicates how many times the network sees the entire training set. If we increase the number of epochs, we let our network learn longer.

In [22]:
model.fit(train_images, train_vec_labels, epochs=10, verbose=True)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f4bc5be44a8>
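
To get a feeling for how much learning happens in these 10 epochs, the number of weight updates can be estimated. This sketch assumes the 60,000 training images of MNIST and the Keras default batch size of 32, neither of which is set explicitly in the call above.

```python
import math

train_size = 60000   # assumed number of MNIST training images
batch_size = 32      # assumed: Keras default when fit() gets no batch_size
epochs = 10

# One weight update is performed per batch.
steps_per_epoch = math.ceil(train_size / batch_size)
print(steps_per_epoch)           # 1875
print(steps_per_epoch * epochs)  # 18750
```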

## Evaluation of the network

So far, the network has only seen the training images and learned from them. But the aim is to recognize new images of handwritten digits. That is why there is separate test data, which we now use to check how accurate our network is on images it has never seen.

In [23]:
eval_loss, eval_accuracy = model.evaluate(test_images, test_vec_labels, verbose=False)
print("Model accuracy: %.2f" % eval_accuracy)

Model accuracy: 0.58
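
For one-hot labels, accuracy is commonly computed by comparing the position of the largest output with the position of the 1 in the label vector. The following is a minimal sketch of that idea with made-up predictions for three images and three classes, not the exact Keras implementation.

```python
import numpy as np

# Hypothetical network outputs for three images over three classes.
predictions = np.array([[0.1, 0.7, 0.2],
                        [0.8, 0.1, 0.1],
                        [0.3, 0.4, 0.3]])
# Matching one-hot labels (classes 1, 0 and 2).
one_hot_labels = np.array([[0, 1, 0],
                           [1, 0, 0],
                           [0, 0, 1]])

predicted = predictions.argmax(axis=1)     # most active output neuron per image
expected = one_hot_labels.argmax(axis=1)   # position of the 1 in each label
accuracy = (predicted == expected).mean()  # fraction of correct predictions

print("Accuracy: %.2f" % accuracy)  # Accuracy: 0.67
```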
