# Digit Recognition

Digit recognition using a CNN.

I am using data from the Analytics Vidhya digit recognition competition.

## Importing required libraries

In [3]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

* **pandas** - we use pandas to handle our CSV files
* **numpy** - used for array operations
* **matplotlib & seaborn** - used for charting and plotting

In [4]:
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

* **sklearn** - Popular ML library. We will use it for splitting our data and for computing the confusion matrix.

In [5]:
from keras.utils import to_categorical
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense,Conv2D,Flatten,MaxPool2D,Dropout,BatchNormalization
from keras.optimizers import RMSprop,Adam
from keras.callbacks import ReduceLROnPlateau

* **Keras** - Popular deep learning library. We will use it to build our CNN.

## Loading data

In [8]:
import cv2
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import datasets, layers, models

In [10]:
mnist = tf.keras.datasets.mnist
(x_train,y_train),(x_test,y_test) = mnist.load_data()

# Keep the raw 0-255 pixel values here; they are scaled to the 0-1 range
# in the "Normalize pixel values" section.
train_X = x_train.astype('float32')
test_X = x_test.astype('float32')

## Understanding the train and test data

In [11]:
print('Train dataset has {} rows and {} columns'.format(train_X.shape[0],train_X.shape[1]))
print('test dataset has {} rows and {} columns'.format(test_X.shape[0],test_X.shape[1]))


Train dataset has 60000 rows and 28 columns
test dataset has 10000 rows and 28 columns


In [13]:
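# train_X is a NumPy array; flatten each 28 x 28 image into one 784-value row to preview it as a table
pd.DataFrame(train_X.reshape(len(train_X), -1)).head()

In [None]:
pd.DataFrame(test_X.reshape(len(test_X), -1)).head()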

**Pixel 0 to Pixel 783**: These are the pixel values of the image matrix. Each flattened image is a row of 28 * 28 = 784 values (indexed 0 to 783), and the value at index i * 28 + j is the pixel at row i, column j of the image matrix.
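To make the index arithmetic concrete, here is a minimal sketch. It uses a synthetic 28 x 28 array (a stand-in, not an image from the dataset) and checks that position i * 28 + j of the flattened row holds the pixel at row i, column j. The variable names are illustrative.

```python
import numpy as np

# Synthetic 28 x 28 "image" whose pixel values are simply 0..783 (illustrative stand-in)
image = np.arange(28 * 28).reshape(28, 28)

# Row-major flattening, as used when each image is stored as one 784-value row
flat = image.reshape(-1)

i, j = 3, 5                  # row 3, column 5
flat_index = i * 28 + j      # position of that pixel in the flattened row
print(flat_index, flat[flat_index] == image[i, j])
```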

**train_y** contains the target value, i.e. the **label**, for each training image.


In [None]:
# The labels returned by mnist.load_data() are a NumPy array; wrap them in a Series
train_y = pd.Series(y_train, name='label')
train_y.head()


In [None]:

train_y.head()


## Checking Target class distribution.

In [None]:
y = train_y.value_counts()
sns.barplot(x=y.index, y=y.values)


## Normalize pixel values

For most image data, the pixel values are integers with values between 0 and 255.

Neural networks process inputs using small weight values, and inputs with large integer values can disrupt or slow down the learning process. As such it is good practice to normalize the pixel values so that each pixel value has a value between 0 and 1.

It is valid for images to have pixel values in the range 0-1 and images can be viewed normally.

This can be achieved by dividing all pixel values by the largest possible pixel value, which is 255. This is performed across all channels, regardless of the actual range of pixel values present in the image.

In [None]:
train_X = train_X / 255
test_X = test_X / 255
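As a quick check of the scaling described in this section, the sketch below applies the same division to a small synthetic 8-bit array (an illustrative stand-in for image data) and confirms that the result lies between 0 and 1.

```python
import numpy as np

# Synthetic 2 x 3 "image" with 8-bit pixel values (illustrative, not dataset data)
pixels = np.array([[0, 64, 128],
                   [191, 254, 255]], dtype=np.uint8)

# Dividing by the largest possible 8-bit value maps every pixel into [0, 1]
scaled = pixels / 255

print(scaled.min(), scaled.max())  # smallest and largest scaled values
print(scaled.dtype)                # the division promotes the integers to floats
```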

## Reshape

In [None]:
# Add a trailing channel dimension: (samples, 28, 28) -> (samples, 28, 28, 1)
train_X = train_X.reshape(-1, 28, 28, 1)
test_X = test_X.reshape(-1, 28, 28, 1)


In [None]:
print('The shape of train set now is',train_X.shape)
print('The shape of test set now is',test_X.shape)

## Encoding Target Values



Now we will encode our target values. Keras's built-in to_categorical() function is used to do the one-hot encoding.

In [None]:
train_y = to_categorical(train_y)
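To illustrate what one-hot encoding produces, here is a NumPy-only sketch. It is not how to_categorical() is implemented, but for integer labels it should yield the same kind of array. The `labels` values are made up for the example.

```python
import numpy as np

labels = np.array([3, 0, 9])   # example digit labels (illustrative)
num_classes = 10               # digits 0-9

# Row k of the identity matrix is the one-hot vector for class k
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(one_hot.shape)           # one row per label, one column per class
print(one_hot[0])              # a single 1 at index 3, zeros elsewhere
```

Taking the argmax of each row recovers the original label, which is what the evaluation cells later in this notebook rely on.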

## Splitting train and test data

Now we will split our training data into training and validation sets. 20 percent of the training data will be used for validation.

In [None]:
# Note: X_test and y_test hold the 20% validation split, not the separate test set (test_X)
X_train, X_test, y_train, y_test = train_test_split(train_X, train_y, random_state=42, test_size=0.20)

In [None]:
plt.imshow(X_train[0][:,:,0])

## Generating more data

To reduce overfitting, we can expand our dataset artificially.

We can do this with **data augmentation techniques**, such as small random rotations, zooms, and shifts.

By applying these techniques we can double or triple the number of training examples and build a more robust model.

In [None]:
datagen = ImageDataGenerator(
            featurewise_center = False, # set input mean to 0 over the dataset
            samplewise_center = False,  # set each sample mean to 0
            featurewise_std_normalization = False, # divide inputs by std of the dataset
            samplewise_std_normalization = False,  # divide each input by its std
            zca_whitening = False,   # apply ZCA whitening
            rotation_range = 10,     # randomly rotate images in the range (degrees, 0 to 180)
            zoom_range = 0.1,       # Randomly zoom image 
            width_shift_range = 0.1,  # randomly shift images horizontally (fraction of total width)
            height_shift_range = 0.1, # randomly shift images vertically (fraction of total height)
            horizontal_flip = False,  # randomly flip images
            vertical_flip = False     # randomly flip images
)

datagen.fit(X_train)

I did not apply vertical_flip or horizontal_flip, since flipping could lead the model to misclassify digits that are easily confused when mirrored or turned upside down, such as 6 and 9.
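To give a feel for what a shift augmentation does, here is a simplified NumPy sketch. It is a stand-in for width_shift_range and height_shift_range, not ImageDataGenerator's actual implementation (which interpolates and fills the edges differently). The function name `random_shift` and the synthetic image are assumptions made for illustration.

```python
import numpy as np

def random_shift(image, max_fraction=0.1, rng=None):
    """Shift a 2-D image by a random whole number of pixels, padding with zeros."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    max_dy, max_dx = int(h * max_fraction), int(w * max_fraction)
    dy = int(rng.integers(-max_dy, max_dy + 1))   # vertical shift in pixels
    dx = int(rng.integers(-max_dx, max_dx + 1))   # horizontal shift in pixels

    shifted = np.zeros_like(image)
    # Copy the part of the image that is still inside the frame after shifting
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    shifted[max(0, dy):max(0, dy) + src.shape[0],
            max(0, dx):max(0, dx) + src.shape[1]] = src
    return shifted, (dy, dx)

rng = np.random.default_rng(0)
digit = np.zeros((28, 28))
digit[10:18, 12:16] = 1.0     # synthetic bright block standing in for a digit

augmented, (dy, dx) = random_shift(digit, rng=rng)
print("shift (rows, cols):", (dy, dx))
```

With a 10 percent limit on a 28-pixel image, the content moves by at most 2 pixels in each direction, so the digit stays recognisable while its position varies.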

## Modelling

### CNN

In [None]:
model = Sequential()

model.add(Conv2D(filters = 32, kernel_size = (5,5),padding = 'Same', activation ='relu', input_shape = (28,28,1)))
model.add(Conv2D(filters = 32, kernel_size = (5,5),padding = 'Same',activation ='relu'))

model.add(BatchNormalization(momentum = .05))

model.add(MaxPool2D(pool_size=(2,2)))

model.add(Dropout(0.25))


model.add(Conv2D(filters = 64, kernel_size = (3,3),padding = 'Same', activation ='relu'))
model.add(Conv2D(filters = 64, kernel_size = (3,3),padding = 'Same', activation ='relu'))
model.add(BatchNormalization(momentum=0.05))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.25))


model.add(Conv2D(filters = 32, kernel_size = (5,5),padding = 'Same', activation ='relu'))
model.add(Conv2D(filters = 32, kernel_size = (5,5),padding = 'Same', activation ='relu'))
model.add(BatchNormalization(momentum=.05))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(256, activation = 'relu'))
model.add(Dropout(0.4))
model.add(Dense(10, activation = "softmax"))

In [None]:
model.summary()
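As a rough cross-check of what model.summary() should report, the sketch below traces the spatial size of the feature maps through the three blocks by hand. It assumes the layer settings shown in the model cell: 'same'-padded convolutions, 2 x 2 max pooling with its default 'valid' padding, and 32 filters in the last convolution. The helper name `pooled_size` is illustrative.

```python
def pooled_size(size, pool=2):
    """Spatial size after MaxPool2D with pool_size=pool and 'valid' padding."""
    return size // pool

size = 28                     # input images are 28 x 28
for block in range(1, 4):     # three Conv-Conv-BatchNorm-Pool-Dropout blocks
    # 'same'-padded convolutions keep the spatial size; only pooling shrinks it
    size = pooled_size(size)
    print(f"after block {block}: {size} x {size}")

last_filters = 32             # filters in the final Conv2D layer
flat_features = size * size * last_filters
dense_params = flat_features * 256 + 256   # weights + biases of Dense(256)
print("flattened features:", flat_features)
print("Dense(256) parameters:", dense_params)
```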

## Optimizer


In simpler terms, optimizers shape and mold your model into its most accurate possible form by futzing with the weights. The loss function is the guide to the terrain, telling the optimizer when it’s moving in the right or wrong direction.

In [None]:
optimizer = Adam(learning_rate=0.001 , beta_1=0.9 ,beta_2 = 0.999)

In [None]:
model.compile(optimizer=optimizer , loss=['categorical_crossentropy'],metrics = ['accuracy'])
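To show what "adjusting the weights" means in practice, here is a minimal pure-Python sketch of the Adam update rule applied to a single scalar weight. It is a simplified illustration, not the Keras implementation. The toy loss, the function names, and the larger learning rate of 0.1 used in the call are assumptions chosen so the example converges in a few hundred steps.

```python
import math

def adam_minimize(grad, w, learning_rate=0.001, beta_1=0.9, beta_2=0.999,
                  epsilon=1e-7, steps=1000):
    """Run the Adam update rule on a single scalar weight."""
    m, v = 0.0, 0.0                              # first and second moment estimates
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta_1 * m + (1 - beta_1) * g        # moving average of gradients
        v = beta_2 * v + (1 - beta_2) * g * g    # moving average of squared gradients
        m_hat = m / (1 - beta_1 ** t)            # bias correction
        v_hat = v / (1 - beta_2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return w

def toy_loss(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3) ** 2

def toy_grad(w):
    """Gradient of the toy loss."""
    return 2 * (w - 3)

w_final = adam_minimize(toy_grad, w=0.0, learning_rate=0.1, steps=500)
print(w_final, toy_loss(w_final))
```

The loss function supplies the gradient, and the optimizer decides how far to move along it, which is the division of labour described in the paragraph on optimizers.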


## Learning rate reduction

To help the optimizer converge faster and get closer to the global minimum of the loss function, I used an annealing method for the learning rate (LR).

With the ReduceLROnPlateau callback from keras.callbacks, I chose to halve the LR if the validation accuracy has not improved after 5 epochs, down to a minimum of 0.00001.

In [None]:
learning_rate_reduction = ReduceLROnPlateau(monitor = 'val_accuracy',
                                            patience = 5 ,
                                            verbose = 1,
                                            factor = 0.5 , 
                                            min_lr = 0.00001)
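The schedule this callback produces can be sketched in plain Python. The version below is a simplification that ignores the callback's min_delta and cooldown options. It uses the factor, patience, and min_lr values from the cell above, and the validation accuracies are made up for illustration.

```python
def simulate_reduce_on_plateau(val_accuracy, lr=0.001, factor=0.5,
                               patience=5, min_lr=0.00001):
    """Return the learning rate in effect after each epoch."""
    best = float("-inf")
    wait = 0                   # epochs since the last improvement
    schedule = []
    for acc in val_accuracy:
        if acc > best:
            best, wait = acc, 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)   # halve, but never go below min_lr
                wait = 0
        schedule.append(lr)
    return schedule

# Made-up validation accuracies: improvement stalls after the third epoch
val_acc_per_epoch = [0.90, 0.95, 0.97, 0.97, 0.96, 0.97, 0.96, 0.97, 0.98]
lr_schedule = simulate_reduce_on_plateau(val_acc_per_epoch)
print(lr_schedule)
```

In this toy run the learning rate stays at its initial value until five consecutive epochs pass without a new best accuracy, and only then is it halved.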


## Fitting Our Model

In [None]:
epochs = 20
batch_size = 100

In [None]:
history = model.fit(datagen.flow(X_train, y_train, batch_size = batch_size),
                    epochs = epochs,
                    validation_data = (X_test, y_test),
                    verbose = 2,
                    steps_per_epoch = X_train.shape[0] // batch_size,
                    callbacks = [learning_rate_reduction])


## Evaluating our approach using graphs

In [None]:
fig, ax = plt.subplots(2, 1, sharex=True)
x = range(1, 1 + epochs)

# Top panel: training vs. validation loss
ax[0].plot(x, history.history['loss'], color='red')
ax[0].plot(x, history.history['val_loss'], color='blue')
ax[0].legend(['training loss', 'validation loss'])
ax[0].set_ylabel('loss')

# Bottom panel: training vs. validation accuracy
ax[1].plot(x, history.history['accuracy'], color='red')
ax[1].plot(x, history.history['val_accuracy'], color='blue')
ax[1].legend(['training acc', 'validation acc'])
ax[1].set_xlabel('Number of epochs')
ax[1].set_ylabel('accuracy')

## Confusion Matrix

In [None]:
y_pre_test=model.predict(X_test)
y_pre_test=np.argmax(y_pre_test,axis=1)
y_test=np.argmax(y_test,axis=1)

In [None]:
conf=confusion_matrix(y_test,y_pre_test)
conf=pd.DataFrame(conf,index=range(0,10),columns=range(0,10))

In [None]:


conf



In [None]:
plt.figure(figsize=(8,6))
sns.set(font_scale=1.4)  # label size
sns.heatmap(conf, annot=True, fmt='d', annot_kws={"size": 16}, cmap=plt.cm.Blues)  # annotation font size
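For readers new to confusion matrices, the sketch below builds one by hand for a made-up 3-class problem. It follows the same convention as sklearn's confusion_matrix (rows are true classes, columns are predicted classes). The function name `confusion_counts` and the toy labels are illustrative.

```python
import numpy as np

def confusion_counts(y_true, y_pred, num_classes):
    """Count how often each true class (row) was predicted as each class (column)."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        matrix[t, p] += 1
    return matrix

# Made-up labels for a 3-class toy problem
true_labels = [0, 1, 2, 2, 1, 0, 2]
predicted = [0, 2, 2, 2, 1, 0, 1]

toy_conf = confusion_counts(true_labels, predicted, num_classes=3)
print(toy_conf)
print("accuracy:", np.trace(toy_conf) / toy_conf.sum())
```

The diagonal holds the correct predictions, so off-diagonal cells in the heatmap show which digits the model confuses with each other.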

## Some Misclassified Images

In [None]:
x=(y_pre_test-y_test!=0).tolist()
x=[i for i,l in enumerate(x) if l!=False]

In [None]:
fig,ax=plt.subplots(1,4,sharey=False,figsize=(15,15))

for i in range(4):
    ax[i].imshow(X_test[x[i]][:,:,0])
    ax[i].set_xlabel('Real {}, Predicted {}'.format(y_test[x[i]],y_pre_test[x[i]]))
    

## Predicting for test data

In [None]:
y_pre_test

In [None]:
test_y = model.predict(test_X)
test_y =np.argmax(test_y,axis=1)


In [None]:
test_y

In [None]:
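# NOTE: `test` is not defined earlier in this notebook. It is assumed to be the
# competition's test DataFrame (image file names in its first column), loaded separately.
test1 = test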

In [None]:
test1 = test1.iloc[:,0:1]

In [None]:
test1

In [None]:
output = pd.DataFrame({'filename': test1.iloc[1:,0],
                     'label': test_y})
output.to_csv('submission1.csv', index=False)

References and credits:

1. [Analytics Vidhya](http://www.analyticsvidhya.com/blog/2018/12/guide-convolutional-neural-network-cnn/)

2. [Kaggle notebook](http://www.kaggle.com/shahules/indian-way-to-learn-cnn)

