Model Performance: 
~98.9% accuracy on the test data set, 99.11% val_accuracy


Model: 
- Convolutional Neural Network (Sequential, 11 layers)
- Input: 28x28 pixel array (single channel)
- 2 Convolutional layers with 64 then 32 filters
- 2 Pooling layers
- 2 Dropout Layers
- 1 Flatten layer
- 3 Dense Layers
- 1 Output Layer (more details below)
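
As a sanity check on the layer list above, the sketch below walks the tensor shape through the network and counts trainable parameters. It assumes stride-1 "same" convolutions with 3x3 and 2x2 kernels and 2x2 pooling, matching the model cell further down. The helper names (`conv2d_same`, `max_pool`, `dense`) are hypothetical and not part of Keras.

```python
def conv2d_same(shape, filters, kernel):
    """Shape and parameter count of a stride-1, 'same'-padded Conv2D."""
    h, w, channels = shape
    # one kernel x kernel x channels weight block plus one bias per filter
    params = (kernel * kernel * channels + 1) * filters
    return (h, w, filters), params


def max_pool(shape, pool=2):
    """Non-overlapping pooling divides each spatial dimension by `pool`."""
    h, w, channels = shape
    return (h // pool, w // pool, channels)


def dense(n_in, n_out):
    """Output width and parameter count (weights plus biases) of a Dense layer."""
    return n_out, (n_in + 1) * n_out


shape = (28, 28, 1)
total_params = 0

shape, p = conv2d_same(shape, filters=64, kernel=3)
total_params += p
shape = max_pool(shape)  # Dropout does not change the shape
shape, p = conv2d_same(shape, filters=32, kernel=2)
total_params += p
shape = max_pool(shape)

flat = shape[0] * shape[1] * shape[2]  # what Flatten produces
units = flat
for n_out in (128, 128, 128, 10):
    units, p = dense(units, n_out)
    total_params += p

print(shape, flat, total_params)  # (7, 7, 32) 1568 244010
```

Most of the parameters sit in the first Dense layer, because it connects all 1568 flattened values to 128 neurons.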

In [68]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf # machine learning part
import keras
from keras import layers
from keras import ops
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping



ModuleNotFoundError: No module named 'tf'

In [40]:
training_data = pd.read_csv('./TestData/train.csv')
x_train_dataframe = training_data.iloc[:, training_data.columns != 'label']
x_train = x_train_dataframe.values
y_train = training_data.label.values

testing_data = pd.read_csv('./TestData/test.csv')

x_test = testing_data.values


In [5]:
# L2-normalizes each row (axis=-1 by default), so every flattened image has unit length
x_train = tf.keras.utils.normalize(x_train)
x_test = tf.keras.utils.normalize(x_test)
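
To make the normalization step concrete, here is a minimal plain-Python sketch of what `tf.keras.utils.normalize` does to one row with its default arguments (L2 norm along the last axis), as far as I understand its behavior. The second function shows the common divide-by-255 scaling for comparison only; it is not what this notebook uses. Both function names are hypothetical.

```python
import math


def l2_normalize_row(row):
    """Divide a row by its L2 norm; an all-zero row is left unchanged."""
    norm = math.sqrt(sum(v * v for v in row))
    if norm == 0:
        norm = 1.0
    return [v / norm for v in row]


def scale_to_unit_range(row, max_value=255.0):
    """Alternative (not used in the notebook): map pixel values to [0, 1]."""
    return [v / max_value for v in row]


pixels = [0, 0, 255, 128]  # hypothetical 4-pixel "image"
unit_norm = l2_normalize_row(pixels)
scaled = scale_to_unit_range(pixels)

print(unit_norm)
print(scaled)
```

With L2 normalization the scale of each image depends on how much ink it contains, whereas dividing by 255 applies the same factor to every image.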

Convolutional Layers: 
- Each convolutional layer learns a set of "filters" (kernels), each of which is a small n x n matrix of weights
- Each filter slides along the grid of pixel values; at every position the layer multiplies the filter element-wise with the window beneath it and sums the result, which becomes one pixel of a new image (a feature map)
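
The sliding-window computation can be sketched in plain Python. This is a simplified illustration, assuming a single channel, stride 1 and no padding (the notebook's layers use padding="same" plus a bias and ReLU, which are omitted here). The function name and the example values are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image`; each output is the sum of element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[-4, -4, 2], [-4, -4, 4], [6, 6, 6]]
```

Without padding a 4x4 input and a 2x2 filter give a 3x3 output; padding="same" would add a border of zeros so the output stays 4x4.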

Pooling Layers: 
- Same sliding-window idea as convolution; the "stride" is the number of pixels the window moves at each step
- At each valid position, reduces the window to a single output value, usually its max or average
- Reduces the size of the intermediate feature maps
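
A minimal sketch of max pooling in plain Python, assuming a 2x2 window with stride 2 as in the model below. The function name and the input values are hypothetical.

```python
def max_pool2d(feature_map, pool=2, stride=2):
    """Keep the maximum of each pool x pool window, moving `stride` pixels per step."""
    output = []
    for i in range(0, len(feature_map) - pool + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - pool + 1, stride):
            window = [
                feature_map[i + di][j + dj]
                for di in range(pool)
                for dj in range(pool)
            ]
            row.append(max(window))
        output.append(row)
    return output


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 5], [8, 9]]
```

Each 2x2 block collapses to one value, so the height and width are halved.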

Dropout Layers:
- A layer to prevent overfitting: during training it randomly sets a fraction of its input units to 0 (here 20%)
- Ensures the network doesn't rely on just one neuron
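
The idea can be sketched as "inverted dropout" in plain Python: units are zeroed with probability `rate` during training and the survivors are scaled by 1 / (1 - rate) so the expected sum is unchanged, which is also how the Keras Dropout layer is documented to behave. The function below is a hypothetical illustration, not the Keras implementation.

```python
import random


def dropout(values, rate=0.2, training=True, rng=None):
    """Zero each value with probability `rate`; scale the rest by 1 / (1 - rate)."""
    if not training or rate == 0:
        return list(values)  # dropout is a no-op at inference time
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


activations = [1.0] * 10000
dropped = dropout(activations, rate=0.2, rng=random.Random(0))
zero_fraction = dropped.count(0.0) / len(dropped)

print(round(zero_fraction, 2))
```

At prediction time (`training=False`) the values pass through unchanged, so no rescaling is needed there.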

Dense Layers: 
- Information from all neurons in the previous layer is used to determine the output of each neuron in the next
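
In other words, each output neuron computes a weighted sum over every input, adds a bias, and applies an activation. A minimal sketch, with hypothetical weights and a per-neuron weight layout chosen for readability (Keras stores the kernel with inputs along the first axis instead):

```python
def relu(x):
    return max(0.0, x)


def dense_forward(inputs, weights, biases, activation=relu):
    """weights[j][i] connects input i to output neuron j."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


inputs = [1.0, 2.0, 3.0]
weights = [
    [0.5, -1.0, 0.0],  # neuron 0
    [1.0, 1.0, 1.0],   # neuron 1
]
biases = [0.0, -1.0]

outputs = dense_forward(inputs, weights, biases)
print(outputs)  # [0.0, 5.0]
```

Neuron 0 sums to -1.5, which ReLU clips to 0; neuron 1 sums to 6 and the bias brings it to 5.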


In [52]:
model = tf.keras.models.Sequential()
model.add(keras.Input(shape=(28, 28, 1)))

# Conv layer with 64 3x3 filters; padding="same" keeps the image size
# Pooling layer downsamples by a factor of 2

model.add(layers.Conv2D(64, 3, padding="same", activation="relu"))
model.add(layers.MaxPool2D(pool_size=(2, 2)))
model.add(layers.Dropout(0.2))

model.add(layers.Conv2D(32, 2, padding="same", activation="relu"))
model.add(layers.MaxPool2D(pool_size=(2, 2)))
model.add(layers.Dropout(0.2))
# Flatten the 7x7x32 feature maps into one vector for the dense layers
model.add(layers.Flatten())

model.add(layers.Dense(128, activation="relu"))
model.add(layers.Dense(128, activation="relu"))
model.add(layers.Dense(128, activation="relu"))



model.add(layers.Dense(10, activation="softmax"))

In [63]:
optimizer = Adam(learning_rate=0.0008, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False)

model.compile(optimizer= optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Note: with patience=20 and epochs=20 below, early stopping cannot trigger within this run
early_stopping = EarlyStopping(monitor='val_accuracy', patience=20, verbose=1, restore_best_weights=True)

x_train = x_train.reshape(-1, 28, 28, 1)  # add the channel axis expected by Input(shape=(28, 28, 1))

model.fit(x_train, y_train, shuffle=True, validation_split=0.2, validation_data=None, epochs=20, callbacks=[early_stopping])

model.save('CNN.model.keras')

saved_model = tf.keras.models.load_model('CNN.model.keras')

Epoch 1/20
1050/1050 ━━━━━━━━━━━━━━━━━━━━ 9s 8ms/step - accuracy: 0.9963 - loss: 0.0189 - val_accuracy: 0.9886 - val_loss: 0.1325
Epoch 2/20
1050/1050 ━━━━━━━━━━━━━━━━━━━━ 9s 8ms/step - accuracy: 0.9955 - loss: 0.0234 - val_accuracy: 0.9890 - val_loss: 0.0963
Epoch 3/20
1050/1050 ━━━━━━━━━━━━━━━━━━━━ 9s 8ms/step - accuracy: 0.9960 - loss: 0.0173 - val_accuracy: 0.9875 - val_loss: 0.1180
Epoch 4/20
1050/1050 ━━━━━━━━━━━━━━━━━━━━ 9s 8ms/step - accuracy: 0.9957 - loss: 0.0233 - val_accuracy: 0.9900 - val_loss: 0.0874
Epoch 5/20
1050/1050 ━━━━━━━━━━━━━━━━━━━━ 9s 8ms/step - accuracy: 0.9965 - loss: 0.0210 - val_accuracy: 0.9865 - val_loss: 0.0872
Epoch 6/20
1050/1050 ━━━━━━━━━━━━━━━━━━━━ 9s 8ms/step - accuracy: 0.9958 - loss: 0.0159 - val_accuracy: 0.9911 - val_loss: 0.0766
Epoch 7/20
...

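The patience rule configured for this run can be sketched in a few lines. This is a simplified model of patience-based early stopping for a "higher is better" metric, not the Keras callback itself (which also supports options such as `min_delta` and `mode`). The function name is hypothetical; the scores are the six `val_accuracy` values logged above.

```python
def early_stopping_epoch(val_scores, patience):
    """Return (epochs_run, best_epoch), both 1-indexed.

    Training stops once `patience` consecutive epochs pass without a new best score.
    """
    best = float("-inf")
    best_epoch = 0
    wait = 0
    for epoch, score in enumerate(val_scores, start=1):
        if score > best:
            best, best_epoch, wait = score, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_scores), best_epoch


logged = [0.9886, 0.9890, 0.9875, 0.9900, 0.9865, 0.9911]

print(early_stopping_epoch(logged, patience=20))  # (6, 6)
print(early_stopping_epoch(logged, patience=1))   # (3, 2)
```

With patience equal to the total number of epochs the rule never fires, so a smaller patience (a hypothetical 3 to 5, for instance) would be needed for training to actually stop early.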
In [66]:
x_test = x_test.reshape(-1, 28, 28, 1)  # same channel axis as the training data
predictions = saved_model.predict(x_test)
predicted_labels = predictions.argmax(axis=1)

875/875 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step


In [67]:
results = pd.DataFrame({
    'ImageId': range(1, len(predicted_labels) + 1), 
    'Label': predicted_labels
})

results.to_csv('CNN_predictions.csv', index=False)
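
For reference, the last two cells amount to taking the index of the largest softmax probability per image and writing it next to a 1-based ImageId. The sketch below reproduces that with only the standard library on a hypothetical 3-class, 3-image example; the probabilities are made up for illustration and the CSV is written to an in-memory buffer rather than a file.

```python
import csv
import io


def argmax(row):
    """Index of the largest value in `row` (first one wins on ties)."""
    return max(range(len(row)), key=row.__getitem__)


# Hypothetical softmax outputs: one row per image, one column per class
probabilities = [
    [0.05, 0.90, 0.05],
    [0.70, 0.20, 0.10],
    [0.10, 0.15, 0.75],
]
labels = [argmax(row) for row in probabilities]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["ImageId", "Label"])
for image_id, label in enumerate(labels, start=1):
    writer.writerow([image_id, label])

csv_text = buffer.getvalue()
print(csv_text)
```

The header and 1-based ids match the columns built in the `results` DataFrame above.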