# Kannada MNIST Neural Network Challenge

This challenge was my first attempt at building a neural network. The learning curve was steep at first, but I learned a lot in the process.

The following notebook helped guide me along the way:

https://www.kaggle.com/sauravjoshi23/kannada-mnist-comparing-accuracy-of-various-models

In this notebook I get straight to building the neural network and skip the EDA and data visualizations. For those, please see my previous notebook for this challenge, where I used a random forest classifier with PCA.

In [1]:
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

/kaggle/input/Kannada-MNIST/sample_submission.csv
/kaggle/input/Kannada-MNIST/Dig-MNIST.csv
/kaggle/input/Kannada-MNIST/train.csv
/kaggle/input/Kannada-MNIST/test.csv


In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Dropout, Lambda, Flatten
from keras.optimizers import Adam, RMSprop
from sklearn.model_selection import train_test_split
from keras import backend as K
from keras.preprocessing.image import ImageDataGenerator
from keras.utils.np_utils import to_categorical
from keras.preprocessing import image
from keras.layers import Convolution2D, MaxPooling2D
from tensorflow.keras import layers, callbacks
from keras.layers import BatchNormalization



In [3]:
train = pd.read_csv("../input/Kannada-MNIST/train.csv")
test  = pd.read_csv("../input/Kannada-MNIST/test.csv")
sample = pd.read_csv('../input/Kannada-MNIST/sample_submission.csv')


In [4]:
train.head()

Unnamed: 0,label,pixel0,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,...,pixel774,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,2,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,3,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,4,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [5]:
test.head()

Unnamed: 0,id,pixel0,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,...,pixel774,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,2,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,3,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,4,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


## Split features and labels

In [6]:
X_train = (train.iloc[:,1:].values).astype('float32')
y_train = train.iloc[:,0].values.astype('int32')
X_test = (test.iloc[:,1:].values).astype('float32')

In [7]:
X_train

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)

## Reshaping the data to 4 dimensions

In [8]:
X_train = X_train.reshape(X_train.shape[0],28,28,1) 
X_test = X_test.reshape(X_test.shape[0],28,28,1) 

print(X_train.shape) 
print(X_test.shape)

(60000, 28, 28, 1)
(5000, 28, 28, 1)
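
To make the reshape concrete, here is a minimal pure-Python sketch of what happens to a single flattened row of 784 pixels. The helper name `reshape_flat` is hypothetical and only for illustration; it assumes the row-major order that NumPy's `reshape` uses by default.

```python
def reshape_flat(pixels, height=28, width=28):
    """Turn a flat list of height*width pixels into a height x width x 1 nested list."""
    if len(pixels) != height * width:
        raise ValueError("expected %d pixels, got %d" % (height * width, len(pixels)))
    # Row-major order: flat index k lands at row k // width, column k % width.
    # Each pixel is wrapped in a one-element list to form the channel axis.
    return [[[pixels[r * width + c]] for c in range(width)] for r in range(height)]

flat = list(range(784))      # toy stand-in for one image row from the CSV
img = reshape_flat(flat)
print(len(img), len(img[0]), len(img[0][0]))
```

The trailing dimension of size 1 is the single grayscale channel that the convolution layers expect.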


## Standardizing the data

In [9]:
meanpx = X_train.mean().astype(np.float32)
stdpx = X_train.std().astype(np.float32)

def standardize(x):
    return (x-meanpx)/stdpx
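
As a rough illustration of what `standardize` does, the sketch below applies the same formula to a handful of made-up pixel values in plain Python. The values and the name `standardize_list` are assumptions for the example, not taken from the dataset.

```python
from statistics import fmean, pstdev

# Toy stand-in for pixel intensities (hypothetical values).
pixels = [0.0, 0.0, 128.0, 255.0, 64.0, 0.0]

mean_px = fmean(pixels)   # one global mean, like X_train.mean()
std_px = pstdev(pixels)   # population standard deviation, like X_train.std()

def standardize_list(values):
    """Subtract the global mean and divide by the global standard deviation."""
    return [(v - mean_px) / std_px for v in values]

scaled = standardize_list(pixels)
print(scaled)
```

After scaling, the values have a mean of about 0 and a standard deviation of about 1, which tends to make training more stable than feeding raw 0 to 255 intensities.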

In [10]:
y_train = to_categorical(y_train)
num_classes = y_train.shape[1]
num_classes

10
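
For readers new to one-hot encoding, this is a minimal sketch of the transformation `to_categorical` performs on the labels. The helper `one_hot` is hypothetical and written in plain Python for clarity.

```python
def one_hot(labels, num_classes):
    """Encode each integer label as a vector with a single 1.0 at the label's index."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows

encoded = one_hot([0, 3, 9], num_classes=10)
for row in encoded:
    print(row)
```

Each row has length 10, matching the 10-unit softmax output layer and the `categorical_crossentropy` loss used later.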

## CNN

In [11]:
gen = image.ImageDataGenerator()

In [12]:
X = X_train
y = y_train
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.10, random_state=42)
batches = gen.flow(X_train, y_train, batch_size=64)
val_batches=gen.flow(X_val, y_val, batch_size=64)

In [13]:
def cnn_model():
    model = Sequential([
        Lambda(standardize, input_shape=(28,28,1)),
        Convolution2D(32,(3,3), activation='relu'),
        Dropout(0.3),
        MaxPooling2D(),
        Convolution2D(64,(3,3), activation='relu'),
        Dropout(0.3),
        MaxPooling2D(),
        Flatten(),
        Dense(512, activation='relu'),
        Dense(10, activation='softmax')
        ])
    model.compile(Adam(), loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
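
To see how the image shrinks as it passes through the layers above, the following sketch works out the spatial sizes by hand. It assumes the Keras defaults for these layers (unpadded 3x3 convolutions with stride 1, and 2x2 max pooling with stride 2); the helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output width of an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width of unpadded max pooling with stride equal to the pool size."""
    return (size - pool) // pool + 1

size = 28
size = conv_out(size)   # first 3x3 convolution
size = pool_out(size)   # first 2x2 max pooling
size = conv_out(size)   # second 3x3 convolution
size = pool_out(size)   # second 2x2 max pooling

# Dropout does not change shapes, so it is left out of the arithmetic.
flat_features = size * size * 64          # 64 filters in the second convolution
dense_params = flat_features * 512 + 512  # weights plus biases of Dense(512)
print(size, flat_features, dense_params)
```

Under these assumptions, nearly all of the model's parameters sit in the first dense layer, which is worth keeping in mind when tuning the architecture.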

In [14]:
model= cnn_model()
model.optimizer.lr=0.01

In [15]:
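# validation_steps counts batches, not samples, so use len(val_batches);
# val_batches.n is the sample count and would repeat the validation set many times.
history=model.fit_generator(generator=batches, steps_per_epoch=X_train.shape[0]//64, epochs=15, 
                    validation_data=val_batches, validation_steps=len(val_batches))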



Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


Training for 15 epochs took quite a bit of time, so let's add early stopping to cut it down.
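
Part of the cost of each epoch comes from how many batches are drawn for training and validation. The sketch below works through that arithmetic, assuming the 60,000-row training set, the 10% validation split, and the batch size of 64 used above; the variable names are only for illustration.

```python
import math

n_samples = 60000     # rows in train.csv, per the shape printed earlier
val_percent = 10      # test_size=0.10 in train_test_split
batch_size = 64

n_val = n_samples * val_percent // 100
n_train = n_samples - n_val

train_steps = n_train // batch_size           # same formula as steps_per_epoch
val_steps = math.ceil(n_val / batch_size)     # batches needed to see each validation sample once

print(n_train, n_val, train_steps, val_steps)
```

Since step counts are measured in batches, passing the number of validation samples (6000) as the step count would request roughly 64 passes over the validation set per epoch rather than one.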

In [16]:
#with callback

early_stopping = callbacks.EarlyStopping(
    monitor='loss', min_delta=0.001, patience=5, verbose=0,
    mode='auto', baseline=None, restore_best_weights=True
)

# validation_steps is a number of batches, hence len(val_batches) rather than val_batches.n
history=model.fit_generator(generator=batches, steps_per_epoch=X_train.shape[0]//64, epochs=20, 
                    validation_data=val_batches, validation_steps=len(val_batches), callbacks=[early_stopping])

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20


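The log shows that early stopping was triggered: training ended after epoch 17 of the 20 allowed, once the monitored loss had gone 5 epochs (`patience`) without improving by at least 0.001 (`min_delta`). This callback saves time, and when it monitors a validation metric such as `val_loss` it also helps prevent overfitting. As configured here, with `monitor='loss'`, it tracks the training loss only.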
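
The patience rule can be sketched in a few lines of plain Python. This is a simplified illustration of the idea, not the Keras implementation, and both the function name `stopping_epoch` and the loss values are made up for the example.

```python
def stopping_epoch(losses, min_delta=0.001, patience=5):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best - min_delta:
            # Improvement large enough to count: remember it and reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical loss curve that plateaus after the third epoch.
toy_losses = [0.5, 0.4, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]
print(stopping_epoch(toy_losses))
```

With `patience=5`, training continues for five epochs past the best one before stopping, which is why `restore_best_weights=True` is useful: the weights from the best epoch are the ones kept.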

## Batch Normalization

In [17]:
def cnn_model_batch():
    model = Sequential([
        Lambda(standardize, input_shape=(28,28,1)),
        Convolution2D(32,(3,3), activation='relu'),
        Dropout(0.3),
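        BatchNormalization(axis=-1),  # channels-last input: normalize over the channel axis
        MaxPooling2D(),
        Convolution2D(64,(3,3), activation='relu'),
        Dropout(0.3),
        BatchNormalization(axis=-1),  # axis=1 would normalize over image rows instead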
        MaxPooling2D(),
        Flatten(),
        Dense(512, activation='relu'),
        Dense(10, activation='softmax')
        ])
    model.compile(Adam(), loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
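
For intuition about what a batch normalization layer computes during training, here is a minimal sketch for a single feature across one batch. It is a simplification: the real layer also learns `gamma` and `beta` and keeps moving averages for use at inference time. The function name `batch_norm` and the sample values are assumptions for the example.

```python
from math import sqrt
from statistics import fmean

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-3):
    """Normalize one feature over a batch, then scale by gamma and shift by beta."""
    mean = fmean(values)
    var = fmean([(v - mean) ** 2 for v in values])   # biased (population) variance
    return [gamma * (v - mean) / sqrt(var + eps) + beta for v in values]

activations = [1.0, 2.0, 3.0, 4.0]   # hypothetical activations for one channel
normalized = batch_norm(activations)
print(normalized)
```

The small `eps` term guards against division by zero when a feature is constant across the batch.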

In [18]:
model= cnn_model_batch()
model.optimizer.lr=0.01

In [19]:
# validation_steps is a number of batches, hence len(val_batches) rather than val_batches.n
history=model.fit_generator(generator=batches, steps_per_epoch=X_train.shape[0]//64, epochs=20, 
                    validation_data=val_batches, validation_steps=len(val_batches), callbacks=[early_stopping])

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20


## Kaggle Submission

In [20]:
sample["label"] = np.argmax(model.predict(X_test), 1)
sample.head()
sample.to_csv("submission.csv", index=False, header=True)
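
The `np.argmax(..., 1)` call above turns each row of class probabilities into a single predicted digit. A minimal pure-Python sketch of the same step, using made-up probabilities over three classes instead of ten:

```python
# Hypothetical softmax outputs for two images.
probs = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
]

# For each row, pick the index of the largest probability.
labels = [max(range(len(row)), key=row.__getitem__) for row in probs]
print(labels)
```

The resulting list of indices is what gets written to the `label` column of the submission file.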