##Introduction
This project will look into training a model to analyze 60,000 images of handwritten numbers and classify them into the correct number that the handwritten number is intended to be. I will specifically be looking and the numbers 4 and 9 out of random. The goal of the project is to create a neural network that trains and tests data and analyze photos similar to how our brains would.

The data set is called MNIST and is a popular data set used for image analysis that can be found here https://www.kaggle.com/datasets/hojjatk/mnist-dataset


Below is the code used to install the packacges and tools we need to complete the process.

In [1]:
!pip install keras_tuner

Collecting keras_tuner
  Downloading keras_tuner-1.4.7-py3-none-any.whl (129 kB)
[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/129.1 kB[0m [31m?[0m eta [36m-:--:--[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m129.1/129.1 kB[0m [31m3.9 MB/s[0m eta [36m0:00:00[0m
Collecting kt-legacy (from keras_tuner)
  Downloading kt_legacy-1.0.5-py3-none-any.whl (9.6 kB)
Installing collected packages: kt-legacy, keras_tuner
Successfully installed keras_tuner-1.4.7 kt-legacy-1.0.5


In [2]:
import keras
import keras_tuner
from keras import layers, regularizers
from keras.optimizers.legacy import Adam
from keras.callbacks import EarlyStopping, TensorBoard

In [3]:
from keras.datasets import mnist

Next I will load in my data known as " MNIST" ( populat image data set) and create a mask that filters specifically for my numbers of choice which are 4 and 9.

In [7]:
(train_X, y_train), (test_X, y_test) = mnist.load_data()
#loading in the data

In [8]:
mask_train = (y_train == 4) | (y_train == 9)
mask_test = (y_test == 4) | (y_test == 9)
# filtering data for only numbers 4 and 9

In [9]:
X_train_filtered = train_X[mask_train, :]
X_test_filtered = test_X[mask_test, :]
y_train_filtered = y_train[mask_train]
y_test_filtered = y_test[mask_test]
# assigning images to a test and train set for X and Y

In [10]:
print('X_train: ' + str(X_train_filtered.shape))
print('Y_train: ' + str(y_train_filtered.shape))
print('X_test:  '  + str(X_test_filtered.shape))
print('Y_test:  '  + str(y_test_filtered.shape))
# this is what are values look like printed

X_train: (11791, 28, 28)
Y_train: (11791,)
X_test:  (1991, 28, 28)
Y_test:  (1991,)


Above we can see that we are left only with our filtered images as the dataset has been significantly reduced from the origianl 60000 we started with.

Below we are loading in the written number data set and are going to process the images to be easier to interpret. Below we are flattening the images into a 1D format instead of its original 2D format.

In [21]:
X_train_filtered = X_train_filtered.reshape(X_train_filtered.shape[0], -1)

In [22]:
X_test_filtered = X_test_filtered.reshape(X_test_filtered.shape[0], -1)

In [24]:
X_train_filtered_norm = X_train_filtered / 255
X_test_filtered_norm = X_test_filtered / 255

##Training
Below we are going to begin our classifier training which is going to assign the handwritten number immages to its corresponding labels.


In [25]:
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(
                        fit_intercept=True,
                        multi_class='auto',
                        penalty='l2',
                        solver='saga',
                        max_iter=100,
                        C=50
                      )

In [26]:
model.fit(X_train_filtered_norm, y_train_filtered)



Below we are going to make the prediction on the test set, calculating accuracy (% of correct classifications), and showing the confusion matrix

In [27]:
y_pred = model.predict(X_test_filtered_norm)

In [28]:
accuracy = sum(y_pred == y_test_filtered) / len(y_test_filtered)
print(accuracy)
#percent of accuracy

0.9723756906077348


In [29]:
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test_filtered, y_pred)
print(cm)
# are confusion matrix

[[957  25]
 [ 30 979]]


##Neuronal Network Steps

Below is a helper function used to build a keras `model` object.

In [30]:
def define_model(units, num_layers, activation, lr, l2):
    model_layers = [
        layers.Dense(units, activation=activation, kernel_regularizer=regularizers.L2(l2=l2), kernel_initializer=keras.initializers.HeNormal())
        ] * num_layers
    model_layers += [layers.Dense(10)]
    model = keras.Sequential(model_layers)
    model.compile(loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True), metrics=['accuracy'], optimizer=Adam(learning_rate=lr))

    return model

Below is the code used to specify our parameters that we want our sweep to go over.

In [34]:
def build_model(hp):
    units = hp.Choice("units", [256, 512])
    activation = "relu"   # We'll just use ReLu for now
    lr = hp.Float("lr", min_value=1e-5, max_value=1e-1, sampling="log")
    l2 = hp.Float("l2", min_value=1e-5, max_value=1e-1, sampling="log")
    num_layers = hp.Choice("num_layers", [2, 3, 4])

    model = define_model(units=units, num_layers=num_layers, activation=activation, lr=lr, l2=l2)
    return model
        # call existing model-building code with the hyperparameter values.

Below I am going to link my drive

In [35]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


Below is the code that is going to initialize our tuner object. I am going to provide the folder for my sweep data to load and store into.

In [36]:
tuner = keras_tuner.BayesianOptimization(
    hypermodel= build_model,
    objective="val_loss",
    max_trials=8,
executions_per_trial=1,
    overwrite=False,
    directory="/content/Sienna MNIST Sweep"
)

Below is the code to do callbacks which is gonna chech for accuracy and other variables at the end of every trial run.

In [37]:
callbacks = [EarlyStopping(monitor="val_loss", patience=2, restore_best_weights=True, start_from_epoch=4)]
# call back

In [38]:
tuner.search(
    X_train_filtered_norm,
    y_train_filtered,
    epochs=8,
    verbose=1,
    validation_data=(X_test_filtered_norm, y_test_filtered),
    callbacks=callbacks
)

Trial 8 Complete [00h 00m 22s]
val_loss: 0.03786658123135567

Best val_loss So Far: 0.03786658123135567
Total elapsed time: 00h 03m 16s


In [39]:
tuner.results_summary(5)
# looking at the results of our best of the 8 trials

Results summary
Results in /content/Sienna MNIST Sweep/untitled_project
Showing 5 best trials
Objective(name="val_loss", direction="min")

Trial 7 summary
Hyperparameters:
units: 256
lr: 0.0010488706051187259
l2: 2.632986692186152e-05
num_layers: 4
Score: 0.03786658123135567

Trial 5 summary
Hyperparameters:
units: 512
lr: 0.0023592579903517264
l2: 1.828289374791153e-05
num_layers: 4
Score: 0.042577698826789856

Trial 0 summary
Hyperparameters:
units: 512
lr: 0.0036164835628463956
l2: 9.50349274910234e-05
num_layers: 3
Score: 0.05388177931308746

Trial 6 summary
Hyperparameters:
units: 256
lr: 3.2137021600125945e-05
l2: 3.199557829068188e-05
num_layers: 2
Score: 0.09823992848396301

Trial 1 summary
Hyperparameters:
units: 512
lr: 0.006596180810736199
l2: 0.002686256410508101
num_layers: 2
Score: 0.102649986743927


According to the Var loss factors our 7th trial is consedered our best one so we are going to store the data from that one into the linked file and this is what we are going to use to make our predicitons.

In [40]:
import json
from sklearn.metrics import confusion_matrix

trial_num = "7"
with open(f"/content/Sienna MNIST Sweep/untitled_project/trial_{trial_num}/trial.json", "r") as f:
    trial = json.load(f)
hp = trial["hyperparameters"]["values"]
model = define_model(units=hp["units"], num_layers=hp["num_layers"], activation="relu", lr=hp["lr"],
                     l2=hp["l2"])
model.load_weights(f"/content/Sienna MNIST Sweep/untitled_project/trial_{trial_num}/checkpoint")

<tensorflow.python.checkpoint.checkpoint.CheckpointLoadStatus at 0x79485c9b2710>

Below is the final confusion matrix for checking our dataset based on our most significant test trial done from or neural network.

In [41]:
import numpy as np
y_pred = np.argmax(model.predict(X_test_filtered_norm), axis=1)
cm = confusion_matrix(y_test_filtered, y_pred)
accuracy = sum(y_pred == y_test_filtered) / len(y_test_filtered)
print(accuracy)
print(cm)

0.9899547965846308
[[ 970   12]
 [   8 1001]]


##Analysis
In my analysis, I successfully developed a neural network-based image classification algorithm specifically designed to distinguish between the digits 4 and 9 in the MNIST dataset. By carefully filtering the data, defining the model, and training it, we achieved high accuracy in recognizing these particular handwritten digits. This achievement highlights the power of neural networks in focused image classification tasks and showcases their practical applicability in various domains.