# Image Classification with the MNIST Dataset

## The MNIST Dataset

Accurately classifying the images in the MNIST dataset, a collection of 70,000 grayscale images of handwritten digits from 0 to 9, was a major development in the history of deep learning. Although the problem is considered trivial today, image classification with MNIST has become a kind of "Hello World" for deep learning.

Here are 40 of the images included in the MNIST dataset:

<img src="./images/mnist1.png" style="width: 600px;">

## Loading the Data Into Memory (with Keras)

In [1]:
from tensorflow.keras.datasets import mnist

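In [2]:
# load_data() returns the images and labels already split into
# 60,000 training examples and 10,000 validation examples
(x_train, y_train), (x_valid, y_valid) = mnist.load_data()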

## Exploring the MNIST Data

In [None]:
x_train.shape

In [None]:
x_valid.shape

In [None]:
x_train.dtype

In [6]:
x_train.min()

0

In [7]:
x_train.max()

255

In [None]:
x_train[0]

Using Matplotlib, we can render one of these grayscale images in our dataset:

In [None]:
import matplotlib.pyplot as plt

image = x_train[0]  # the same image whose label we look up next
plt.imshow(image, cmap='gray')

In [None]:
y_train[0]

## Preparing the Data for Training

### Flattening the Image Data

In [11]:
x_train = x_train.reshape(60000, 784)
x_valid = x_valid.reshape(10000, 784)

We can confirm that the image data has been reshaped and is now a collection of 1D arrays containing 784 pixel values each:

In [None]:
x_train.shape

In [None]:
x_train[0]
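
As a rough illustration of what the reshape does, the sketch below flattens a small stand-in array rather than the real dataset. The array `images` and its contents are made up for the example; passing `-1` is an optional alternative to hard-coding the number of images.

```python
import numpy as np

# Hypothetical stand-in for x_train: 3 blank "images" of 28x28 pixels.
images = np.zeros((3, 28, 28), dtype=np.uint8)
images[0, 1, 2] = 255  # mark the pixel at row 1, column 2 of the first image

# -1 lets NumPy infer the number of images instead of hard-coding it.
flat = images.reshape(-1, 28 * 28)

print(flat.shape)           # (3, 784)
# Row-major order places pixel (row, col) at index row * 28 + col.
print(flat[0, 1 * 28 + 2])  # 255
```

Each image becomes one row of 784 values, with the rows of the original image laid end to end.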

### Normalizing the Image Data

In [16]:
x_train = x_train / 255
x_valid = x_valid / 255

We can now see that the values are all floating point values between `0.0` and `1.0`:

In [None]:
x_train.dtype

In [None]:
x_train.min()

In [None]:
x_train.max()
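
The effect of the division can be checked on a tiny array. This is only a sketch: `pixels` is a made-up example, not data from MNIST.

```python
import numpy as np

# Hypothetical pixel values covering the full 8-bit range.
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

# Dividing an integer array by 255 uses true division, so NumPy
# returns a float64 array rather than truncating to integers.
scaled = pixels / 255

print(scaled.dtype)                # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```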

### Categorically Encoding the Labels

In [22]:
import tensorflow.keras as keras
num_categories = 10

y_train = keras.utils.to_categorical(y_train, num_categories)
y_valid = keras.utils.to_categorical(y_valid, num_categories)

Here are the first 10 values of the training labels, which you can see have now been categorically encoded:

In [None]:
y_train[0:10]
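
To make the encoding concrete, here is a NumPy-only sketch that produces the same kind of one-hot vectors. It is not how Keras implements `to_categorical`; the `labels` array is a made-up example, and indexing into an identity matrix is just one convenient way to build the result.

```python
import numpy as np

num_categories = 10
labels = np.array([5, 0, 4])  # hypothetical integer labels

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the labels picks out one row per label.
one_hot = np.eye(num_categories, dtype=np.float32)[labels]

print(one_hot[0])              # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
print(one_hot.argmax(axis=1))  # [5 0 4]
```

Taking `argmax` along each row recovers the original integer label.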

## Creating the Model

### Instantiating the Model

In [10]:
from tensorflow.keras.models import Sequential

model = Sequential()

### Creating the Input Layer

In [11]:
from tensorflow.keras.layers import Dense

In [12]:
model.add(Dense(units=512, activation='relu', input_shape=(784,)))

### Creating the Hidden Layer

In [13]:
model.add(Dense(units=512, activation='relu'))

### Creating the Output Layer

In [14]:
model.add(Dense(units=10, activation='softmax'))

### Summarizing the Model

In [None]:
model.summary()
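
The parameter counts for these three layers can be worked out by hand: a dense layer has one weight per input-output pair plus one bias per output unit. The helper `dense_params` below is a hypothetical name used only for this sketch.

```python
# Layer sizes from the model above: 784 inputs -> 512 -> 512 -> 10 outputs.
layer_sizes = [784, 512, 512, 10]


def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) of one dense layer."""
    return n_in * n_out + n_out


# Pair each layer size with the next one to get (inputs, outputs) per layer.
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [401920, 262656, 5130]
print(sum(counts))  # 669706
```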

### Compiling the Model

In [16]:
# No optimizer is passed, so Keras uses its default (RMSprop)
model.compile(loss='categorical_crossentropy', metrics=['accuracy'])
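
For intuition about what the `softmax` output layer and the `categorical_crossentropy` loss compute, the following NumPy sketch evaluates both on a single made-up example. It is a simplified illustration, not Keras's implementation; the function names, the sample `logits`, and the clipping constant `eps` are assumptions chosen for the example.

```python
import numpy as np


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    # Subtracting the row maximum avoids overflow in exp() without
    # changing the result.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-example loss: -sum(y_true * log(y_pred)) over the classes."""
    y_pred = np.clip(y_pred, eps, 1.0)  # guard against log(0)
    return -(y_true * np.log(y_pred)).sum(axis=-1)


logits = np.array([[2.0, 1.0, 0.1]])  # hypothetical raw scores for 3 classes
y_true = np.array([[1.0, 0.0, 0.0]])  # one-hot label: the true class is 0

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs, loss)
```

Because the label is one-hot, the loss reduces to the negative log of the probability assigned to the true class, so a more confident correct prediction gives a smaller loss.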

## Training the Model

In [None]:
history = model.fit(
    x_train, y_train, epochs=5, verbose=1, validation_data=(x_valid, y_valid)
)

## Clear the Memory

In [None]:
import IPython

# Shut down the notebook kernel (restart=True) to release the memory it holds
app = IPython.Application.instance()
app.kernel.do_shutdown(True)