# MNIST handwritten digit classification

We tackle the classic MNIST digit classification problem using two neural networks: a fully connected network and a small CNN.

Each sample has shape 28x28x1 and is a grayscale picture of a handwritten digit from 0 to 9. Our task is to classify which digit each picture shows.
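To make the shape concrete, here is a minimal sketch of how one flattened row of 784 pixel values maps to a 28x28x1 image and is scaled to [0, 1]. The array `row` is synthetic placeholder data, not an actual MNIST sample.

```python
import numpy as np

# Hypothetical stand-in for one flattened image: 784 pixel intensities in [0, 255].
row = np.arange(784) % 256

# Reshape to height x width x channels; a single channel means grayscale.
image = row.reshape(28, 28, 1)

# Scale to [0, 1], the same normalization applied to the real data in this notebook.
image = image / 255.0

print(image.shape)
print(image.min(), image.max())
```

Pixel `k` of the flat row ends up at row `k // 28`, column `k % 28` of the image.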

The first network I will try is a simple fully connected network. The setup is a Paperspace Gradient instance with an NVIDIA P4000 GPU.

In [3]:
pip install pandas 

Collecting pandas
Collecting pytz>=2017.2
Installing collected packages: pytz, pandas
Successfully installed pandas-1.0.3 pytz-2020.1
Note: you may need to restart the kernel to use updated packages.


### Simple Dense Model 

In [70]:
import numpy as np
import pandas as pd
from tensorflow.keras import layers, models

train = pd.read_csv("train.csv")
X_test = pd.read_csv("test.csv")
Y_train = train["label"]
X_train = train.drop(labels=["label"], axis=1)

In [71]:
X_train = np.asarray(X_train) 
Y_train = np.asarray(Y_train) 

In [72]:
X_train = X_train / 255.0

In [73]:
from tensorflow.keras.utils import to_categorical
# One-hot encoding is skipped: sparse_categorical_crossentropy takes integer labels directly.
# Y_train = to_categorical(Y_train)

print(X_train.shape)
print(Y_train.shape)

(42000, 784)
(42000,)
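The labels stay as a flat vector of integers because the models below use the sparse variant of cross-entropy. As a rough numpy sketch (not the Keras implementation), the sparse and one-hot forms of the loss give the same value; the probabilities here are randomly generated stand-ins for model output.

```python
import numpy as np

labels = np.array([2, 0, 9])        # integer class labels, as in Y_train
one_hot = np.eye(10)[labels]        # the encoding to_categorical would produce

# Hypothetical predicted probabilities: softmax over random logits.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Sparse form: look up the probability of the true class by index.
sparse_loss = -np.log(probs[np.arange(len(labels)), labels])
# One-hot form: mask the log-probabilities with the one-hot rows and sum.
categorical_loss = -(one_hot * np.log(probs)).sum(axis=1)

print(np.allclose(sparse_loss, categorical_loss))
```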


In [74]:
def FCN(): 
    model = models.Sequential()
    model.add(layers.Dense(512, activation = 'relu', input_shape=(28*28,)))
    model.add(layers.Dense(10, activation = 'softmax'))
    model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
    return model 
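As a quick sanity check on the size of this network, the number of trainable parameters can be worked out by hand. The helper `dense_params` is a hypothetical name introduced only for this sketch.

```python
def dense_params(n_in, n_out):
    # One weight per input-output pair plus one bias per output unit.
    return n_in * n_out + n_out

hidden = dense_params(28 * 28, 512)   # 784 inputs -> 512 hidden units
output = dense_params(512, 10)        # 512 hidden units -> 10 classes
total = hidden + output

print(hidden, output, total)
```

Almost all of the parameters sit in the first layer, because every one of the 784 pixels connects to every hidden unit.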
    

In [75]:
model = FCN() 
model.fit(X_train, Y_train, epochs = 10, batch_size = 128)

Train on 42000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f5f99dc4748>

In [77]:
X_test = np.asarray(X_test) 
X_test = X_test/255.0 

In [79]:
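# predict_classes was removed in TensorFlow 2.6; argmax over the softmax outputs gives the same labels
predicted = np.argmax(model.predict(X_test), axis=-1)
predicted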

array([2, 0, 9, ..., 3, 9, 2])
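Each predicted label is the index of the largest of the 10 softmax outputs for that image. A small sketch with made-up probabilities (not actual model output):

```python
import numpy as np

# Hypothetical softmax outputs for three images; each row sums to 1.
probs = np.array([
    [0.05, 0.05, 0.60, 0.05, 0.05, 0.05, 0.05, 0.04, 0.03, 0.03],
    [0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.02],
    [0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.04, 0.80],
])

# The column index of the maximum in each row is the predicted digit.
predicted = np.argmax(probs, axis=-1)
print(predicted)
```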

In [82]:
submission = pd.read_csv('sample_submission.csv')
submission['Label'] = predicted 
submission.head(10)

| | ImageId | Label |
|---|---|---|
| 0 | 1 | 2 |
| 1 | 2 | 0 |
| 2 | 3 | 9 |
| 3 | 4 | 9 |
| 4 | 5 | 3 |
| 5 | 6 | 7 |
| 6 | 7 | 0 |
| 7 | 8 | 3 |
| 8 | 9 | 0 |
| 9 | 10 | 3 |


In [83]:
submission.to_csv("FCN.csv",index=False) 
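For reference, the saved file has two columns, `ImageId` (starting at 1) and `Label`, with no row index. A minimal sketch of that layout using only the standard library, with made-up predictions and an in-memory buffer standing in for the real file:

```python
import csv
import io

predicted = [2, 0, 9]   # hypothetical predictions for three test images

buffer = io.StringIO()  # in-memory stand-in for a file such as FCN.csv
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["ImageId", "Label"])
for image_id, label in enumerate(predicted, start=1):
    writer.writerow([image_id, label])

print(buffer.getvalue())
```

Passing `index=False` to `to_csv` serves the same purpose: it keeps the DataFrame's row index out of the file.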

Upon submitting our prediction from the dense network, we get an accuracy score of 97.57% on the public leaderboard. 

### Simple CNN Architecture 

This time, I will construct a simple CNN architecture with a few Conv2D and MaxPooling2D layers.

In [84]:
def custom_CNN(): 
    model = models.Sequential()
    model.add(layers.Conv2D(32,(3,3),activation='relu',input_shape=(28,28,1)))
    model.add(layers.MaxPooling2D((2,2))) 
    model.add(layers.Conv2D(64,(3,3),activation='relu'))
    model.add(layers.MaxPooling2D((2,2)))
    model.add(layers.Conv2D(64,(3,3),activation='relu'))
    model.add(layers.Flatten()) 
    model.add(layers.Dense(64,activation='relu'))
    model.add(layers.Dense(10,activation='softmax')) 
    model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
    return model 
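To see what this stack does to the input, the spatial size and parameter count can be traced by hand. This sketch assumes the Keras defaults the code above relies on ('valid' padding and stride 1 for Conv2D, non-overlapping 2x2 pooling); the helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    # 'valid' padding with stride 1: the kernel must fit entirely inside the input.
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping pooling; a leftover row or column is dropped.
    return size // pool

def conv_params(kernel, c_in, c_out):
    # One kernel x kernel filter per (input channel, output channel) pair, plus biases.
    return kernel * kernel * c_in * c_out + c_out

def dense_params(n_in, n_out):
    return n_in * n_out + n_out

s = 28
s = conv_out(s)     # 26
s = pool_out(s)     # 13
s = conv_out(s)     # 11
s = pool_out(s)     # 5
s = conv_out(s)     # 3
flat = s * s * 64   # size of the Flatten output

total = (conv_params(3, 1, 32) + conv_params(3, 32, 64) + conv_params(3, 64, 64)
         + dense_params(flat, 64) + dense_params(64, 10))
print(s, flat, total)
```

Under these assumptions the CNN has far fewer parameters than the dense model, since each convolution filter is shared across all image positions.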

In [87]:
train = pd.read_csv("train.csv")
X_test = pd.read_csv("test.csv")
Y_train = train["label"]
X_train = train.drop(labels = ["label"],axis = 1)  

X_train = np.asarray(X_train) 
X_train = X_train.reshape((-1,28,28,1)) 
X_train = X_train/255.0 
Y_train = np.asarray(Y_train)    

print(X_train.shape) 
print(Y_train.shape)

X_test = np.asarray(X_test) 
X_test = X_test.reshape((-1,28,28,1)) 
X_test = X_test/255.0 

(42000, 28, 28, 1)
(42000,)


In [88]:
model = custom_CNN() 
model.fit(X_train, Y_train, epochs = 10, batch_size = 128)

Train on 42000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f5f983eafd0>

In [89]:
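# predict_classes was removed in TensorFlow 2.6; argmax over the softmax outputs gives the same labels
predicted = np.argmax(model.predict(X_test), axis=-1)
predicted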

array([2, 0, 9, ..., 3, 9, 2])

In [90]:
submission = pd.read_csv('sample_submission.csv')
submission['Label'] = predicted 
submission.head(10)

| | ImageId | Label |
|---|---|---|
| 0 | 1 | 2 |
| 1 | 2 | 0 |
| 2 | 3 | 9 |
| 3 | 4 | 0 |
| 4 | 5 | 3 |
| 5 | 6 | 7 |
| 6 | 7 | 0 |
| 7 | 8 | 3 |
| 8 | 9 | 0 |
| 9 | 10 | 3 |


In [92]:
submission.to_csv('simple_CNN.csv',index=False)

Upon submission, we get an accuracy of 99.06% on the public leaderboard, an improvement of about 1.5 percentage points over the simple dense network (97.57%).
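The gap looks small in accuracy terms but is larger when expressed as an error rate. A short calculation using only the two leaderboard scores reported in this notebook:

```python
fcn_acc, cnn_acc = 97.57, 99.06   # leaderboard accuracies in percent

fcn_err = 100 - fcn_acc           # error rate of the dense network
cnn_err = 100 - cnn_acc           # error rate of the CNN
relative_reduction = (fcn_err - cnn_err) / fcn_err

print(f"{fcn_err:.2f}% -> {cnn_err:.2f}% ({relative_reduction:.0%} fewer errors)")
```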

### Conclusion

From this small experiment, we can see the power of convolutional neural networks on image classification problems.

Other things we could do to improve performance on the MNIST dataset:
1. Try augmenting the data using ImageDataGenerator (for example, small rotations of the handwritten digits).
2. Try established architectures such as LeNet or VGG-16 (pretrained weights exist for the latter).
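To illustrate the idea behind augmentation, here is a numpy-only sketch that applies a small random shift to an image. It is a simplified stand-in for the kind of transform ImageDataGenerator performs, not that class itself, and it shifts rather than rotates. The function `shift_image` and the test image are hypothetical.

```python
import numpy as np

def shift_image(image, dy, dx):
    """Shift a (28, 28, 1) image by dy rows and dx columns, zero-filling the gap."""
    shifted = np.zeros_like(image)
    h, w = image.shape[:2]
    # Source and destination slices covering only the region that stays in frame.
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted

rng = np.random.default_rng(0)
image = np.zeros((28, 28, 1))
image[10:18, 12:16, 0] = 1.0            # made-up "stroke", not real MNIST data

dy, dx = rng.integers(-2, 3, size=2)    # random shift of up to 2 pixels each way
augmented = shift_image(image, int(dy), int(dx))
print(augmented.shape)
```

Each shifted copy keeps its original label, so the model sees more variation in digit position without any new labeling effort.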