<a href="https://colab.research.google.com/github/danielbauer1979/MSDIA_PredictiveModelingAndMachineLearning/blob/main/GB888_VI_4_ImageClassificationLab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab: Image Data and Image Classification




In this lab, we take a first step into ML with image data. In particular, we build some deep learning models for a very basic image classification problem: predicting digits from images of handwritten digits. This is a classic problem, and the underlying [MNIST dataset](https://en.wikipedia.org/wiki/MNIST_database), which is available in Keras, is pretty famous. The images are 28-by-28 pixels in grey-scale, and the classification task is to predict the digit, 0 through 9.

## Libraries and Data

Let's load the libraries, including the Keras module that provides the dataset:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import random

from keras.models import Sequential
from keras.layers import Dense
from keras.datasets import mnist
from keras.utils import to_categorical

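And let's set a random seed. This fixes NumPy's generator; depending on the Keras backend, weight initialization and batch shuffling may draw from a separate generator, so training results can still vary slightly between runs: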

In [None]:
np.random.seed(42)

### Loading and Exploring the MNIST Dataset

The (famous!) [MNIST](https://en.wikipedia.org/wiki/MNIST_database) dataset consists of 60,000 labeled training examples of handwritten digits, plus 10,000 labeled test examples. It is available in the Keras library, so we can just load it.

In [None]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()

Let's take a look at the first image. The `y` variable contains the actual number:

In [None]:
image_index = 0
y_train[image_index]

So the first image displays a 5. The `x` variable holds the image itself. Here is the corresponding array of pixel values:

In [None]:
x_train[image_index]

So each image in $x$ is a $28 \times 28$ matrix of pixel intensities between 0 and 255. In MNIST, 0 is the empty background and 255 is the strongest pen stroke; with the `Greys` colormap used below, 0 is drawn as white and 255 as black. Let's visualize:

In [None]:
plt.imshow(x_train[image_index], cmap='Greys')
plt.show()

So this is about the simplest image dataset one can imagine.

Let's look at a few more:

In [None]:
image_index = 1
print(y_train[image_index])
plt.imshow(x_train[image_index], cmap='Greys')
plt.show()

In [None]:
image_index = 10
print(y_train[image_index])
plt.imshow(x_train[image_index], cmap='Greys')
plt.show()

Our task is fairly obvious: **We want to build a neural network that takes the images as inputs (our $x$-s) and predicts the corresponding digits (our $y$-s).**

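To do so, we first convert the outcomes to categorical (one-hot encoded) variables, so that each label becomes a vector of length 10 with a single 1 at the position of the digit: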

In [None]:
num_classes = 10

y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)
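
To make the one-hot encoding concrete, here is a minimal NumPy sketch of the same idea. The `labels` array is a hypothetical stand-in for `y_train`, and indexing into an identity matrix is just one way to build the vectors; it is not how `to_categorical` is implemented internally.

```python
import numpy as np

# Hypothetical labels for illustration; the lab uses y_train from MNIST instead
labels = np.array([5, 0, 4, 1, 9])
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(one_hot[0])  # a single 1 at position 5, zeros elsewhere

# argmax along the class axis recovers the original integer labels
recovered = np.argmax(one_hot, axis=1)
print(recovered)
```

The `argmax` step is the same operation used later to turn the network's ten output probabilities back into a predicted digit.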

## Building and Training a DL Model

We start by building a feed-forward neural network. For that, we convert the images to vectors with $28 \times 28 = 784$ resulting features:

In [None]:
x_train = x_train.reshape(x_train.shape[0], 784)
x_test = x_test.reshape(x_test.shape[0], 784)
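
As a small illustration of what this reshape does, the sketch below flattens a hypothetical batch of blank images (a stand-in for `x_train`) and checks where a single marked pixel ends up. It assumes NumPy's default row-major layout.

```python
import numpy as np

# Hypothetical stand-in for x_train: 3 blank "images" of 28 x 28 pixels
images = np.zeros((3, 28, 28), dtype=np.uint8)
images[0, 2, 5] = 255  # mark one pixel: row 2, column 5 of the first image

# Same call pattern as in the lab: one row of 784 features per image
flat = images.reshape(images.shape[0], 784)

# Rows are laid out one after another, so pixel (r, c) lands at index r * 28 + c
position = 2 * 28 + 5
print(flat.shape, flat[0, position])

# The reshape loses no information: we can go back to 28 x 28 for plotting
restored = flat.reshape(-1, 28, 28)
```

Flattening keeps every pixel value but discards the explicit notion of which pixels are neighbors, which is one reason architectures designed for images can do better.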

And we scale the features to the range $[0, 1]$ by dividing by the maximal pixel value, 255:

In [None]:
x_train = x_train / 255
x_test = x_test / 255
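
The effect of the scaling can be checked on a few hypothetical pixel values. Note that dividing an integer array by 255 yields `float64` in NumPy; the cast to `float32` shown at the end is an optional memory-saving step that the lab itself does not use.

```python
import numpy as np

# Hypothetical pixel values covering the full uint8 range
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)

# True division promotes the integers to floating point
scaled = pixels / 255
print(scaled.dtype, scaled.min(), scaled.max())

# Optional: store the scaled values in single precision to halve memory use
scaled32 = (pixels / 255).astype("float32")
print(scaled32.dtype)
```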

Now let's build a feed-forward model, as we did before. An important difference, though, is that we use the softmax function as the activation of the output layer, so the 10 outputs form a probability distribution over the digits, and we use `'categorical_crossentropy'` as the (multi-class) loss function:

In [None]:
model = Sequential()
model.add(Dense(50, input_shape=(784, ), activation='relu', name='dense_1'))
model.add(Dense(25, activation='relu', name='dense_2'))
model.add(Dense(10, activation='softmax', name='dense_output'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
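
Two small checks may help when reading this cell. The sketch below first writes out softmax and categorical cross-entropy in plain NumPy for a single hypothetical output vector, then recomputes the number of trainable parameters from the layer sizes above. The helper names and the clipping constant `eps` are assumptions made for illustration; this is not Keras's own code.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum before exponentiating for numerical stability
    shifted = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    # eps is an assumed safeguard against log(0)
    y_prob = np.clip(y_prob, eps, 1.0)
    return -np.sum(y_true * np.log(y_prob), axis=-1)

# Hypothetical raw outputs (logits) of the last layer for one image
logits = np.array([1.0, 0.5, -1.0, 0.0, 2.0, 3.5, 0.2, -0.5, 1.2, 0.1])
probs = softmax(logits)

# One-hot target for the digit 5
target = np.zeros(10)
target[5] = 1.0
loss = categorical_crossentropy(target, probs)
print(probs.sum(), np.argmax(probs), loss)

# Parameter count: each Dense layer has n_in * n_out weights plus n_out biases
layer_sizes = [784, 50, 25, 10]
n_params = sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))
print(n_params)  # 39250 + 1275 + 260 = 40785
```

With a one-hot target, the loss reduces to the negative log of the probability assigned to the true class, so it is small only when the network is confident and right. The parameter total should match the figure reported by `model.summary()`.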

Let's train it using 20 epochs:

In [None]:
history = model.fit(x_train, y_train, epochs=20, validation_data=(x_test, y_test))

So the validation loss is not really getting better in the later epochs; arguably, we are already overfitting. Note that the validation data here is the test set itself.
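
One simple way to back up this reading is to find the epoch with the lowest validation loss. The sketch below uses made-up numbers shaped like `history.history["val_loss"]`; they are not results from this model.

```python
import numpy as np

# Hypothetical values for illustration only; in the lab one would use
# val_loss = history.history["val_loss"]
val_loss = [0.21, 0.15, 0.12, 0.11, 0.105, 0.11, 0.12, 0.125]

# Epochs are reported starting at 1, while list indices start at 0
best_epoch = int(np.argmin(val_loss)) + 1
epochs_since_best = len(val_loss) - best_epoch

print(f"Lowest validation loss after epoch {best_epoch}; "
      f"{epochs_since_best} later epochs did not improve on it")
```

If the best epoch comes well before the last one while the training loss keeps decreasing, that pattern is consistent with overfitting.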

Let's inspect the model's predictions for 20 randomly sampled test images:

In [None]:
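def plot_digit(image, predicted, actual, i):
    # Draw one test image in a 4-by-5 grid, titled with predicted and true digit
    plt.subplot(4, 5, i + 1)
    plt.imshow(image, cmap='gray')
    plt.title(f"Predicted: {predicted} (true: {actual})")
    plt.xticks([])
    plt.yticks([])

random.seed(5)

plt.figure(figsize=(16, 10))
for i in range(20):
    # Sample an index so that we can look up both the image and its label
    idx = random.randrange(len(x_test))
    image = x_test[idx]
    predicted = np.argmax(model.predict(image.reshape((1, 784)), verbose=0)[0])
    actual = np.argmax(y_test[idx])  # y_test is one-hot encoded at this point
    plot_digit(image.reshape(28, 28), predicted, actual, i)

plt.show()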

So it seems like the model performs pretty well, yet it is not perfectly accurate. We will see that more advanced NN architectures can help!
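
To put a number on "pretty well", one can compare predicted and true classes over a whole set rather than 20 images. The sketch below uses a tiny hypothetical 3-class example; in the lab, `probs` would come from `model.predict(x_test)` and `y_true_one_hot` would be `y_test`.

```python
import numpy as np

# Hypothetical predicted probabilities for 4 examples and 3 classes
probs = np.array([[0.1, 0.8, 0.1],
                  [0.7, 0.2, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.1, 0.1, 0.8]])

# Hypothetical one-hot labels for the true classes 1, 0, 2, 2
y_true_one_hot = np.eye(3)[[1, 0, 2, 2]]

# The most probable class is the prediction; argmax also decodes the labels
pred = np.argmax(probs, axis=1)
true = np.argmax(y_true_one_hot, axis=1)

# Share of examples where prediction and label agree
accuracy = np.mean(pred == true)

# Confusion matrix: rows are true classes, columns are predicted classes
confusion = np.zeros((3, 3), dtype=int)
np.add.at(confusion, (true, pred), 1)

print(accuracy)
print(confusion)
```

Off-diagonal entries of the confusion matrix show which digits get mistaken for which, which is more informative than a single accuracy figure.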