# Classification
Two of the most common supervised learning tasks are regression and classification. Classification predicts discrete classes, and it comes in two types: binary and multiclass. An example of binary classification is deciding whether an email is spam or not spam. Multiclass classification would be distinguishing between several kinds of animals, plants, etc. Regression, by contrast, predicts continuous numerical values, such as the temperature of a flame or the height of an individual. In this notebook we will focus on classification. We will be using the MNIST dataset, a set of 70,000 images of handwritten digits. Each image has a corresponding label representing the actual digit.

Thankfully, Scikit-Learn and TensorFlow both provide functions for downloading popular datasets, and MNIST is one of them. You can download it like so:

In [1]:
import tensorflow as tf
import numpy as np
import pandas as pd
mnist = tf.keras.datasets.mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()

The code block above handles the imports and loads the data. `load_data()` returns two tuples, which we unpack into a training set and a test set. Next we normalize the data. `tf.keras.utils.normalize` rescales the values along the given axis to unit L2 norm, and because pixel values are never negative, every value ends up between 0 and 1:

In [2]:
X_train = tf.keras.utils.normalize(X_train, axis=1)
X_test = tf.keras.utils.normalize(X_test, axis=1)
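To make the normalization step less of a black box, here is a minimal sketch in plain Python of L2 normalization on a tiny 2x2 "image". The helper name `l2_normalize_columns` and the sample values are made up for illustration. It assumes that `axis=1` on a batch of shape (N, 28, 28) means each column of each image is scaled by its own L2 norm. It is not the library's implementation.

```python
import math

def l2_normalize_columns(image):
    """Scale each column of a 2D grid to unit L2 norm (hypothetical helper)."""
    n_rows, n_cols = len(image), len(image[0])
    norms = []
    for c in range(n_cols):
        norm = math.sqrt(sum(image[r][c] ** 2 for r in range(n_rows)))
        # An all-zero column has norm 0; divide by 1 instead to leave it unchanged.
        norms.append(norm if norm != 0 else 1.0)
    return [[image[r][c] / norms[c] for c in range(n_cols)] for r in range(n_rows)]

# Made-up 2x2 example: the second column has L2 norm sqrt(3^2 + 4^2) = 5.
tiny_image = [[0, 3],
              [0, 4]]
normalized = l2_normalize_columns(tiny_image)
print(normalized)  # [[0.0, 0.6], [0.0, 0.8]]
```

A commonly used alternative is to divide every pixel by 255.0, which also maps the values into the range 0 to 1 but keeps the relative brightness of pixels identical across the whole image.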

Now we can begin to work on our model. We will use a basic sequential neural network model:

In [3]:
model = tf.keras.models.Sequential()

This creates the model, but we still have to add the layers to it. We start with a Flatten layer, which takes the 28x28 grid and flattens it into an array of 784 pixels (28 × 28 = 784), each with a value that has been normalized to be between 0 and 1:

In [4]:
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
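As a rough illustration of what flattening means, the sketch below does the same thing to a nested Python list. The `flatten` helper and the sample image are hypothetical and only mimic the row-by-row ordering; the real layer operates on tensors.

```python
def flatten(grid):
    """Concatenate the rows of a 2D grid into one flat list (hypothetical helper)."""
    return [pixel for row in grid for pixel in row]

# A made-up 28x28 image that is black except for one bright pixel.
image = [[0.0] * 28 for _ in range(28)]
image[2][5] = 1.0

flat = flatten(image)
print(len(flat))        # 784
print(flat.index(1.0))  # 61, i.e. row 2 * 28 columns + column 5
```

The pixel at row `r` and column `c` ends up at position `r * 28 + c`, so no information is lost; only the 2D arrangement is dropped.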

Once we have our first layer we will have to introduce a Dense layer. This represents one of our 'hidden' layers:

In [5]:
model.add(tf.keras.layers.Dense(128, activation='relu'))

Notice that the code above uses an activation function called 'relu', short for rectified linear unit. There are many activation functions, such as the sigmoid, which squashes the output to between 0 and 1. What relu does is apply a threshold at 0: if the input is negative it outputs 0, and if the input is positive it outputs the value unchanged.
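The two activation functions mentioned above are short enough to write out by hand. This is a minimal sketch on single numbers using only the standard library; Keras applies the same formulas element-wise to whole tensors.

```python
import math

def relu(x):
    """Rectified linear unit: 0 for negative inputs, the input itself otherwise."""
    return max(0.0, x)

def sigmoid(x):
    """Logistic sigmoid: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

print(relu(-2.5))    # 0.0
print(relu(3.0))     # 3.0
print(sigmoid(0.0))  # 0.5
```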

Let's add another Dense layer:

In [6]:
model.add(tf.keras.layers.Dense(128, activation='relu'))

Now we will add our output layer, which will be of size 10: one unit for each digit from 0 to 9. The final layer will also be a Dense layer:

In [7]:
model.add(tf.keras.layers.Dense(10, activation='softmax'))

As you can probably tell, we have used the softmax activation function for our final layer. It makes the final 10 values sum to 1, which is essentially a way of expressing confidence in the prediction. For example, if the first value is 0.01, the neural network does not think the digit is 0. However, if that first value were something like 0.95, the neural network would be highly confident that the digit is 0.
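To see how ten raw scores become ten confidences, here is a small softmax sketch in plain Python. The scores are invented for illustration, and subtracting the maximum before exponentiating is a common numerical-stability trick rather than something the text above prescribes.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    top = max(scores)
    # Subtracting the largest score avoids overflow and does not change the result.
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up raw outputs for the digits 0 through 9; index 5 has the largest score.
scores = [1.0, 2.0, 0.5, 0.1, 0.0, 6.0, 0.3, 1.5, 0.2, 0.7]
probs = softmax(scores)

# The predicted digit is the index of the largest probability.
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted)  # 5
```

Because the exponential grows quickly, the largest score receives most of the probability mass, which is why one entry usually dominates the output.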

Now we can compile the model:

In [8]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
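The loss named in `compile` can be sketched for a single example as follows. "Sparse" means the label is a plain integer (such as 7) rather than a one-hot vector, and the loss is the negative log of the probability the model assigned to that label. The function below is a simplified illustration; the `eps` clipping is an assumption added to avoid `log(0)`, not a claim about the library's exact internals.

```python
import math

def sparse_categorical_crossentropy(label, probs, eps=1e-7):
    """Negative log-probability of the true class for one example (simplified sketch)."""
    # eps clipping is an assumed safeguard against log(0).
    p = min(max(probs[label], eps), 1.0 - eps)
    return -math.log(p)

# Made-up prediction over three classes, strongly favouring class 1.
probs = [0.05, 0.90, 0.05]

loss_if_right = sparse_categorical_crossentropy(1, probs)  # true class is 1
loss_if_wrong = sparse_categorical_crossentropy(0, probs)  # true class is 0
print(loss_if_right < loss_if_wrong)  # True
```

A confident correct prediction gives a loss near 0, while a confident wrong prediction is penalized heavily. During training, this value is averaged over the examples in each batch.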

After that, we can fit the model to the training data:

In [9]:
model.fit(X_train, y_train, epochs=3)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<keras.callbacks.History at 0x1d912448310>

The epochs value we provided is essentially how many times we want our model to look at the training data. We can also save the model like so:

In [10]:
model.save('mnist_classification.model')

INFO:tensorflow:Assets written to: mnist_classification.model\assets


Now let's try loading in the model for practice:

In [11]:
model = tf.keras.models.load_model('mnist_classification.model')

Once your model is trained on the training data, you will want to evaluate it on the test data. `evaluate` returns the loss followed by the metrics we asked for in `compile`, so we can unpack the results using Python's tuple unpacking:

In [12]:
loss, accuracy = model.evaluate(X_test, y_test)



In [13]:
loss, accuracy

(0.10032467544078827, 0.9703999757766724)
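For intuition about the second number, accuracy is the fraction of test images whose most probable class matches the true label. The sketch below computes it by hand on four invented predictions; `argmax` and `accuracy` are hypothetical helpers, not Keras functions.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)

def accuracy(predictions, labels):
    """Fraction of examples where the most probable class equals the label."""
    correct = sum(1 for probs, label in zip(predictions, labels)
                  if argmax(probs) == label)
    return correct / len(labels)

# Made-up softmax outputs for four examples over three classes.
predictions = [
    [0.1, 0.8, 0.1],  # predicts class 1
    [0.7, 0.2, 0.1],  # predicts class 0
    [0.2, 0.3, 0.5],  # predicts class 2
    [0.6, 0.3, 0.1],  # predicts class 0
]
labels = [1, 0, 2, 1]  # the last example is misclassified

acc = accuracy(predictions, labels)
print(acc)  # 0.75
```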

What we want from the evaluation is a low loss and a high accuracy. Now we can test our model on our own handwritten digits. You can draw them in Paint, or scan them in and scale them down. Paint will be easier since you can manually set the image to be 28x28 pixels, just like our dataset. Assuming `image` is a NumPy array holding a batch with a single 28x28 image, you can then predict using:

`import matplotlib.pyplot as plt`

`prediction = model.predict(image)`

`print(f"This digit is most likely: {np.argmax(prediction)}")`

`plt.imshow(image[0], cmap=plt.cm.binary)`

`plt.show()`
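One detail worth checking before predicting on your own drawing: MNIST stores digits as bright strokes on a black (zero) background, whereas a digit drawn in Paint is usually dark ink on a white page. The sketch below is one possible way to prepare such a drawing, assuming you already have the 28x28 grayscale values (0 to 255) as nested lists. The `prepare_digit` helper and the sample page are hypothetical.

```python
def prepare_digit(gray_rows):
    """Invert a 28x28 dark-on-light image and wrap it in a batch of one (hypothetical helper)."""
    if len(gray_rows) != 28 or any(len(row) != 28 for row in gray_rows):
        raise ValueError("expected a 28x28 grid of pixel values")
    # Invert so that ink becomes bright and the background becomes 0, as in MNIST.
    inverted = [[255 - value for value in row] for row in gray_rows]
    # The model predicts on batches, so wrap the single image: shape (1, 28, 28).
    return [inverted]

# Made-up drawing: a white page with a single black pixel in the middle.
white_page = [[255] * 28 for _ in range(28)]
white_page[14][14] = 0

batch = prepare_digit(white_page)
print(len(batch), len(batch[0]), len(batch[0][0]))  # 1 28 28
```

Before calling `model.predict`, the batch would still need to be converted with `np.array(batch)` and normalized the same way as the training data, for example with `tf.keras.utils.normalize(..., axis=1)`.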