Kaggle competition Digit Recognizer https://www.kaggle.com/competitions/digit-recognizer/overview

Public score: 0.98946.

The simple dense-layer approach (https://www.kaggle.com/code/garfield2021/digit-recognizer-keras-simple-layers/notebook) overfits, so this notebook uses a convolutional neural network (convnet) instead. The test accuracy is ~0.989, compared with ~0.97 from the simple layers.

For other convnet architectures, see YASSINE GHOUZAM's Introduction to CNN Keras - 0.997 (top 6%) https://www.kaggle.com/code/yassineghouzam/introduction-to-cnn-keras-0-997-top-6 and CANIP PAÇACI's MNIST Digit Recognizer Easy %99.5 Accuracy https://www.kaggle.com/code/canippacaci/mnist-digit-recognizer-easy-99-5-accuracy.

Other main references include Deep Learning with Python by François Chollet.

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split

# Import everything from the same Keras namespace to avoid mixing packages.
from tensorflow.keras import Input, Model
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D

# 1. Load data

In [None]:
# Load the train and test datasets.
train = pd.read_csv('/kaggle/input/digit-recognizer/train.csv')
test = pd.read_csv('/kaggle/input/digit-recognizer/test.csv')

In [None]:
Y_train = train["label"]
X_train = train.drop(labels=["label"], axis=1)

del train

In [None]:
# The counts for the 10 digits are close to uniform.
g = sns.histplot(data=Y_train)
Y_train.value_counts()

In [None]:
# No missing values.
X_train.isnull().any().describe()

In [None]:
test.isnull().any().describe()

In [None]:
print("X_train shape: ", X_train.shape)
print("Y_train shape: ", Y_train.shape)
print("test shape: ", test.shape)

# 2. Normalize and reshape

In [None]:
X_train, test = X_train / 255.0, test / 255.0
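
Dividing by 255 maps the 8-bit pixel intensities (0 to 255) into the range [0, 1]. A minimal sketch on a made-up array (the names `pixels` and `scaled` are illustrative and not part of the notebook's data):

```python
import numpy as np

# Hypothetical 8-bit pixel values; the real ones come from train.csv.
pixels = np.array([[0, 128, 255], [64, 32, 16]], dtype=np.uint8)

# Dividing by a float converts the integers to floating point.
scaled = pixels / 255.0

print(scaled.min(), scaled.max())
```

Values in a small, common range generally make gradient-based training better behaved than raw 0 to 255 intensities.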

In [None]:
X_train = X_train.values.reshape(-1, 28, 28, 1)
test = test.values.reshape(-1, 28, 28, 1)
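
Each CSV row holds one image as 784 flattened pixels, and `Conv2D` expects a (height, width, channels) layout per image. The sketch below shows the same reshape on a small stand-in array (`flat` and `images` are hypothetical names):

```python
import numpy as np

# Stand-in for the flattened pixel table: 5 images x 784 pixels.
flat = np.arange(5 * 784).reshape(5, 784)

# -1 lets NumPy infer the number of images; the trailing 1 is the single gray channel.
images = flat.reshape(-1, 28, 28, 1)

print(images.shape)  # (5, 28, 28, 1)

# Row-major order: pixel k of a row ends up at position (k // 28, k % 28).
k = 1 * 28 + 2
print(images[0, 1, 2, 0] == flat[0, k])
```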

In [None]:
Y_train = to_categorical(Y_train, num_classes = 10)
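
`to_categorical` turns each integer label into a one-hot row of length 10. The NumPy sketch below reproduces that encoding for a few made-up labels; it is an illustrative equivalent, not Keras's own implementation:

```python
import numpy as np

# Hypothetical labels for three images.
labels = np.array([3, 0, 9])

# Row i of the 10x10 identity matrix is the one-hot vector for digit i.
one_hot = np.eye(10, dtype="float32")[labels]

print(one_hot.shape)  # (3, 10)

# argmax inverts the encoding, which is how the labels are read back later.
recovered = np.argmax(one_hot, axis=1)
print(recovered)  # [3 0 9]
```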

In [None]:
# Hold out 25% for validation; a fixed seed makes the split reproducible.
X_train, X_val, Y_train, Y_val = train_test_split(X_train, Y_train, test_size=0.25, random_state=42)

In [None]:
fig = plt.figure(figsize = (11, 12))

for i in range(16):
    plt.subplot(4, 4, 1 + i)
    # Title each image with its label, decoded from the one-hot row.
    plt.title(np.argmax(Y_train[i]), fontweight="bold")
    plt.imshow(X_train[i, :, :, 0], cmap=plt.get_cmap('gray'))
plt.show()


In [None]:
print("X_train shape: ", X_train.shape)
print("Y_train shape: ", Y_train.shape)
print("X_val shape: ", X_val.shape)
print("Y_val shape: ", Y_val.shape)

# 3. Build model

In [None]:
inputs = Input(shape=(28,28,1))

x = Conv2D(filters = 32, kernel_size = (3,3), activation ='relu')(inputs)
x = MaxPooling2D(pool_size=2)(x)
x = Dropout(0.2)(x)

x = Conv2D(filters = 64, kernel_size = (3,3), activation ='relu')(x)
x = MaxPooling2D(pool_size=2)(x)
x = Dropout(0.2)(x)

x = Conv2D(filters = 128, kernel_size = (3,3), activation ='relu')(x)
x = Flatten()(x)

outputs = Dense(10, activation = "softmax")(x)
model = Model(inputs=inputs, outputs=outputs)
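
To see how the feature maps shrink through the network, the arithmetic can be done by hand. The sketch below assumes the Keras defaults used above ("valid" padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`); the helper names are hypothetical:

```python
def conv_out(size, kernel=3):
    # "valid" convolution with stride 1 trims kernel - 1 pixels.
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping pooling divides the size, rounding down.
    return size // pool

def conv_params(in_channels, out_channels, kernel=3):
    # One kernel x kernel x in_channels filter plus one bias per output channel.
    return (kernel * kernel * in_channels + 1) * out_channels

size = 28
size = pool_out(conv_out(size))  # 28 -> 26 -> 13
size = pool_out(conv_out(size))  # 13 -> 11 -> 5
size = conv_out(size)            # 5 -> 3

flat_units = size * size * 128   # input width of the final Dense layer
dense_params = flat_units * 10 + 10
total_params = (conv_params(1, 32) + conv_params(32, 64)
                + conv_params(64, 128) + dense_params)

print(size, flat_units, total_params)  # 3 1152 104202
```

Under those assumptions, these figures should agree with the output shapes and parameter counts that `model.summary()` reports.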

In [None]:
model.summary()

In [None]:
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
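
The `categorical_crossentropy` loss compares the softmax output with the one-hot label and penalizes low probability on the true class. A minimal NumPy sketch of the per-sample formula follows; the clipping constant `eps` and the three-class toy data are assumptions for illustration:

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0); eps is an illustrative choice.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # Only the true class contributes because y_true is one-hot.
    return -np.sum(y_true * np.log(y_pred), axis=1)

# Made-up one-hot labels and predicted probabilities for two samples.
y_true = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.2, 0.7], [0.5, 0.25, 0.25]])

losses = categorical_crossentropy(y_true, y_pred)
print(losses)         # -log(0.7) and -log(0.5)
print(losses.mean())  # batch average
```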

In [None]:
history = model.fit(
    X_train, Y_train,
    validation_data=(X_val, Y_val),
    batch_size=64,
    epochs=10
)

In [None]:
history_df = pd.DataFrame(history.history)

history_df.loc[:, ['loss', 'val_loss']].plot(title="cross-entropy")
print("Minimum validation loss: {}".format(history_df['val_loss'].min()))

history_df.loc[:, ['accuracy', 'val_accuracy']].plot(title="accuracy")
print("Maximum validation accuracy: {}".format(history_df['val_accuracy'].max()))
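
One way to read the history table for overfitting is to find the epoch with the lowest validation loss and to watch the gap between validation and training loss. The sketch below uses made-up numbers (`hypothetical_history` is not output from this notebook):

```python
import pandas as pd

# Made-up per-epoch losses, shaped like the dict Keras stores in history.history.
hypothetical_history = {
    "loss": [0.30, 0.12, 0.07, 0.05],
    "val_loss": [0.15, 0.09, 0.08, 0.10],
}
demo_df = pd.DataFrame(hypothetical_history)

# Epochs are 1-based, the DataFrame index is 0-based.
best_epoch = int(demo_df["val_loss"].idxmin()) + 1

# A gap that keeps growing after the best epoch suggests overfitting.
gap = demo_df["val_loss"] - demo_df["loss"]

print(best_epoch)
print(gap.round(2).tolist())
```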

# 4. Predict and Submit

In [None]:
# Make predictions with the trained model.
predictions = model.predict(test)

In [None]:
# Select the index with the maximum probability
predictions = np.argmax(predictions,axis =1)
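
`model.predict` returns one row of class probabilities per image, and `np.argmax` along `axis=1` picks the most likely class in each row. A toy illustration with three classes instead of ten (the probabilities are made up):

```python
import numpy as np

# Each row is a hypothetical softmax output that sums to 1.
probs = np.array([
    [0.05, 0.90, 0.05],
    [0.60, 0.30, 0.10],
])

# axis=1 searches across the classes of each row.
classes = np.argmax(probs, axis=1)
print(classes)  # [1 0]
```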

In [None]:
predictions = pd.Series(predictions, name='Label')

In [None]:
# Image IDs run from 1 to the number of test images (28000 here).
submission = pd.concat([pd.Series(range(1, len(predictions) + 1), name="ImageId"), predictions], axis=1)
submission.to_csv("submission.csv",index=False)
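
The submission file needs an `ImageId` column starting at 1 and a `Label` column with the predicted digit. The sketch below builds the same layout from a few made-up predictions and prints the CSV text (`predicted` and `submission_demo` are hypothetical names):

```python
import numpy as np
import pandas as pd

# Made-up predicted digits for four test images.
predicted = np.array([2, 0, 9, 4])

submission_demo = pd.DataFrame({
    "ImageId": np.arange(1, len(predicted) + 1),  # 1-based IDs
    "Label": predicted,
})

# With no path given, to_csv returns the CSV content as a string.
csv_text = submission_demo.to_csv(index=False)
print(csv_text)
```

Checking the header and the row count this way before uploading can catch an off-by-one in the IDs.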