Image Classification with the MNIST Dataset

x_train: Images used for training the neural network
y_train: Correct labels for the x_train images, used to evaluate the model's predictions during training
x_valid: Images set aside for validating the performance of the model after it has been trained
y_valid: Correct labels for the x_valid images, used to evaluate the model's predictions after it has been trained


Loading the Data Into Memory (with Keras)

In [None]:
from tensorflow.keras.datasets import mnist

In [None]:
# Load the MNIST data with the mnist module. The data comes already partitioned
# into images and labels for both the training and validation sets.
(x_train, y_train), (x_valid, y_valid) = mnist.load_data()

In [None]:
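# Exploring the MNIST Data
# Only the last expression in a notebook cell is displayed, so print the others explicitly
print(x_train.shape)  # (60000, 28, 28): 60,000 images of 28x28 pixels
print(y_train.shape)  # (60000,): one label per image
print(x_train.dtype)  # uint8
print(x_train.min())  # 0
print(x_train.max())  # 255
x_train[0]            # the raw pixel values of the first image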

In [None]:
import matplotlib.pyplot as plt

image = x_train[0]
plt.imshow(image, cmap='gray')
# The plot shows a 28x28 pixel image of a handwritten 5, and the label below confirms it
y_train[0]

In [None]:
# Preparing the Data for Training
# Each image is currently a 2-dimensional array (in our case 28x28 pixels).
# To keep things simple to start, we reshape each image into a single array of 784 contiguous pixels (note: 28x28 = 784).
# This is also called flattening the image.
x_train = x_train.reshape(60000, 784)
x_valid = x_valid.reshape(10000, 784)
print(x_train.shape)  # (60000, 784)
x_train[0]
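
To see what flattening does on a small scale, here is a minimal sketch that uses a hypothetical batch of two 3x3 "images" built with NumPy instead of the MNIST arrays. Passing -1 as the first dimension lets NumPy infer the number of images, which is one way to avoid hard-coding 60000 and 10000.

```python
import numpy as np

# Hypothetical stand-in for image data: 2 images of 3x3 pixels
images = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)

# Flatten each image into one row; -1 lets NumPy infer the number of images
flat = images.reshape(-1, 3 * 3)

print(flat.shape)  # (2, 9)
print(flat[0])     # [0 1 2 3 4 5 6 7 8]
```

Each row of the result holds the pixels of one image, read row by row from the original 2D grid.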

In [None]:
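# Normalizing the Image Data
# Dividing by 255 scales the integer pixel values (0 to 255) into floating point values between 0.0 and 1.0
x_train = x_train / 255
x_valid = x_valid / 255
# Print each value to confirm the new data type and range
print(x_train.dtype)  # float64
print(x_train.min())  # 0.0
print(x_train.max())  # 1.0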
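
The effect of the division can be checked on a few values. The sketch below uses a small hypothetical array of pixel values rather than the MNIST data, and shows that dividing an integer array by 255 yields floats in the range 0.0 to 1.0.

```python
import numpy as np

# Hypothetical pixel values covering the full uint8 range
pixels = np.array([0, 51, 255], dtype=np.uint8)

# True division promotes the integers to floating point values
scaled = pixels / 255

print(scaled.dtype)  # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```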

In [None]:
# Categorically Encoding the Labels
import tensorflow.keras as keras
num_categories = 10

# Convert each integer label (0 to 9) into a vector of 10 values with a 1 at the label's index
y_train = keras.utils.to_categorical(y_train, num_categories)
y_valid = keras.utils.to_categorical(y_valid, num_categories)
# Here are the first 10 values of the training labels, which you can see have now been categorically encoded
y_train[0:10]
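
Categorical (one-hot) encoding can also be written out by hand. The following is a NumPy sketch of the same kind of encoding, not the way Keras implements to_categorical, and the labels array is a hypothetical example.

```python
import numpy as np

num_categories = 10
labels = np.array([5, 0, 4])  # hypothetical integer labels

# Row i of the identity matrix is the one-hot vector for category i,
# so indexing the identity matrix with the labels encodes all of them at once
one_hot = np.eye(num_categories, dtype=np.float32)[labels]

print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
```

Taking the argmax of each row recovers the original integer labels.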

In [None]:
# Creating the Model

# With the data prepared for training, it is now time to create the model that we will train with the data.
# This first basic model is made up of 3 main parts:

# 1) An input layer, which will receive data in some expected format

# 2) One or more hidden layers, each made up of many neurons.
# Each neuron can affect the network's guess through its weights, which are values that are updated over many iterations as the network gets feedback on its performance and learns

# 3) An output layer, which will depict the network's guess for a given image

# Instantiating the Model
from tensorflow.keras.models import Sequential

model = Sequential()

# Creating the Input Layer

# Next, we add the input layer. This layer is densely connected, meaning that each neuron in it, and its weights, affects every neuron in the next layer.
# To do this with Keras, we use Keras's Dense layer class.
from tensorflow.keras.layers import Dense

# The units argument specifies the number of neurons in the layer.
# The activation argument selects the relu activation function, which helps the network learn to make more sophisticated guesses about the data than it could with a strictly linear function.
# The input_shape value specifies the shape of the incoming data, which in our case is a 1D array of 784 values.
model.add(Dense(units=512, activation='relu', input_shape=(784,)))
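
As a rough illustration of what a densely connected layer with relu computes for one input, here is a NumPy sketch. The tiny layer size, the weights, and the input are all hypothetical values chosen for readability. In Keras the weights are initialized and learned automatically.

```python
import numpy as np

def relu(z):
    # ReLU keeps positive values and replaces negative values with 0
    return np.maximum(z, 0)

# Hypothetical tiny layer: 3 inputs feeding 2 neurons
x = np.array([1.0, -2.0, 0.5])   # one input sample
W = np.array([[0.5,  1.0],
              [1.0, -0.5],
              [2.0,  2.0]])      # weights, shape (inputs, units)
b = np.array([0.0, 0.5])         # one bias per neuron

z = x @ W + b   # weighted sum for each neuron: [-0.5, 3.5]
out = relu(z)   # the negative sum is clipped to 0

print(out)  # [0.  3.5]
```

Every input contributes to every neuron's weighted sum, which is what "densely connected" refers to.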


# Creating the Hidden Layer
model.add(Dense(units=512, activation='relu'))

# Creating the Output Layer
# The softmax activation function turns the layer's 10 values into probabilities between 0 and 1 that sum to 1
model.add(Dense(units=10, activation='softmax'))
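
The softmax function itself is short enough to write out. This is a minimal NumPy sketch applied to a hypothetical vector of three raw scores. The output layer above applies the same idea to 10 values.

```python
import numpy as np

def softmax(z):
    # Subtracting the max does not change the result but avoids overflow in exp
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])  # hypothetical raw outputs
probs = softmax(scores)

print(probs)           # three probabilities, largest for the largest score
print(probs.argmax())  # 0
```

The index of the largest probability is taken as the network's guess.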

# Summarizing the Model
# model.summary() prints a readable summary of the model, including each layer's output shape and parameter count
model.summary()
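
The parameter counts in the summary can be checked by hand. Each Dense layer has one weight per input-neuron pair plus one bias per neuron. The helper name dense_params below is a hypothetical name used only for this sketch.

```python
def dense_params(inputs, units):
    # One weight per (input, neuron) pair, plus one bias per neuron
    return inputs * units + units

# (inputs, units) for the three Dense layers defined above
layer_sizes = [(784, 512), (512, 512), (512, 10)]
counts = [dense_params(i, u) for i, u in layer_sizes]

print(counts)       # [401920, 262656, 5130]
print(sum(counts))  # 669706
```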

# Compiling the Model
# The loss function tells the model how well it is performing during training.
# We also specify that we would like to track accuracy while the model trains.
# No optimizer is given, so Keras falls back to its default (rmsprop).
model.compile(loss='categorical_crossentropy', metrics=['accuracy'])
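
To give a feel for what categorical cross-entropy measures, here is a NumPy sketch for a single sample. It is a simplified illustration rather than the Keras implementation (which, among other things, averages over a batch). The eps value and the example predictions are assumptions chosen for the sketch.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions to avoid log(0); eps is an illustrative choice
    y_pred = np.clip(y_pred, eps, 1.0)
    # Only the term for the correct category is non-zero
    return -np.sum(y_true * np.log(y_pred))

y_true = np.array([0.0, 1.0, 0.0])        # one-hot label: category 1
confident = np.array([0.05, 0.90, 0.05])  # mostly right
unsure = np.array([0.40, 0.30, 0.30])     # mostly wrong

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)

print(loss_confident < loss_unsure)  # True
```

The loss shrinks toward 0 as the probability assigned to the correct category approaches 1.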

# Training the Model
# model.fit takes the following arguments:

# The training data
# The labels for the training data
# The number of epochs, where one epoch is one full pass over the entire training dataset
# The validation or test data, and its labels

history = model.fit(
    x_train, y_train, epochs=5, verbose=1, validation_data=(x_valid, y_valid)
)


# Observing Accuracy
# In the output printed for each epoch, the accuracy on both the training and validation data should quickly climb close to 100%
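
For reference, accuracy is the fraction of samples whose most probable predicted category matches the label. The sketch below computes it with NumPy on hypothetical labels and predictions for four samples. The values are made up for illustration and are not results from the model above.

```python
import numpy as np

# Hypothetical one-hot labels and predicted probabilities for 4 samples
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])
y_pred = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.3, 0.4, 0.3],
                   [0.1, 0.6, 0.3]])

# A prediction is correct when its most probable category matches the label
correct = y_pred.argmax(axis=1) == y_true.argmax(axis=1)
accuracy = correct.mean()

print(accuracy)  # 0.75
```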