# Sign Language MNIST

The dataset format closely follows the classic MNIST. Each training and test case carries a label (0-25) that maps one-to-one to a letter A-Z; there are no cases for 9=J or 25=Z because those signs involve motion. The training data (27,455 cases) and test data (7,172 cases) are approximately half the size of the standard MNIST but otherwise similar: each file has a header row of label, pixel1, pixel2, ..., pixel784, and each row represents a single 28x28 pixel image with grayscale values between 0 and 255.
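As a small illustration of the label scheme described above, the sketch below maps labels to letters. The helper name `label_to_letter` and the constants are hypothetical, introduced only for this example.

```python
import string

# Labels 0-25 map one-to-one onto A-Z. Labels 9 (J) and 25 (Z) never
# occur in the data because those signs involve motion.
LABEL_TO_LETTER = dict(enumerate(string.ascii_uppercase))
MISSING_LABELS = {9, 25}


def label_to_letter(label):
    """Return the letter for a dataset label (hypothetical helper)."""
    if label in MISSING_LABELS:
        raise ValueError(f"label {label} does not occur in the dataset")
    return LABEL_TO_LETTER[label]


# The labels that can actually appear in the label column.
present_labels = [l for l in range(26) if l not in MISSING_LABELS]

print(len(present_labels))                       # 24 classes
print(label_to_letter(0), label_to_letter(24))   # A Y
```

So although labels run up to 24 in the data, only 24 distinct classes are present.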

In [None]:
import numpy as np
import pandas as pd
#tensorflow version 2.0
import tensorflow as tf
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix,classification_report
from sklearn.model_selection import train_test_split

#reading data
mnist_train = pd.read_csv('../input/sign-language-mnist/sign_mnist_train/sign_mnist_train.csv')
mnist_test = pd.read_csv('../input/sign-language-mnist/sign_mnist_test/sign_mnist_test.csv')

print("Datasets successfully loaded!")
print(f"Training dataset has {mnist_train.shape[0]} rows and {mnist_train.shape[1]} columns.")
print(f"Test dataset has {mnist_test.shape[0]} rows and {mnist_test.shape[1]} columns.")

### Scaling and splitting the data into three parts: train, validation and test sets

Here we don't need an sklearn scaler such as StandardScaler or RobustScaler, because all the features share the same range (0 to 255). We can rescale them to the range 0 to 1 simply by dividing by 255.
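To make the point concrete, here is a minimal sketch on a made-up array of pixel values (not the real dataset). Dividing by 255 is the same as min-max scaling with the known bounds 0 and 255.

```python
import numpy as np

# Toy pixel values (hypothetical), stored as 8-bit grayscale.
pixels = np.array([[0, 51, 255], [102, 204, 153]], dtype=np.uint8)

# Dividing by 255 maps the known range [0, 255] onto [0, 1].
scaled = pixels / 255.0

# The same result written as explicit min-max scaling with fixed bounds.
low, high = 0.0, 255.0
min_max = (pixels - low) / (high - low)

print(scaled.min(), scaled.max())
print(np.allclose(scaled, min_max))
```

Because the bounds are fixed in advance rather than estimated from the training data, the same division can be applied to the test set without fitting anything.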

In [None]:
#scaling pixel values to the range 0-1 while splitting features and target column
#(dividing into new float frames avoids writing floats into integer columns in place)
x_train = mnist_train.iloc[:,1:] / 255.0
y_train = mnist_train.iloc[:,0]
x_test = mnist_test.iloc[:,1:] / 255.0
y_test = mnist_test.iloc[:,0]

#further splitting train set into validation and training set
x_train,x_validate,y_train,y_validate = train_test_split(x_train,y_train,test_size = 0.3)

Let's have a look at the images in our dataset.

In [None]:
plt.figure(figsize=(10, 10))
for i in range(36):
    plt.subplot(6, 6, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(np.array(x_test.iloc[i]).reshape(28,28))
    label_index = int(y_test[i])
    plt.title(label_index)
plt.show()

Let's see if our data is balanced.

In [None]:
sns.countplot(x=y_train)
plt.title('Classes distribution in train set');

In [None]:
sns.countplot(x=y_validate)
plt.title('Classes distribution in validation set');

In [None]:
sns.countplot(x=y_test)
plt.title('Classes distribution in test set');

The class distribution is almost the same in the train and validation sets, though the test set has comparatively few cases of some classes.
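If we wanted the train and validation distributions to match by construction rather than by chance, one option is a stratified split. The sketch below uses toy data (not the real dataset) and the `stratify` and `random_state` arguments of `train_test_split`; this is an alternative, not what the notebook does above.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 30 samples, 3 classes with 10 samples each.
features = [[i] for i in range(30)]
labels = [i % 3 for i in range(30)]

# stratify=labels keeps the class proportions the same in both parts;
# random_state makes the shuffle reproducible.
f_train, f_val, l_train, l_val = train_test_split(
    features, labels, test_size=0.3, stratify=labels, random_state=0
)

print(Counter(l_train))
print(Counter(l_val))
```

With equal class sizes, each class contributes the same number of samples to each part.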

# Convolutional Neural Network

The term deep neural nets refers to any neural network with several hidden layers. Convolutional neural nets are a specific type of deep neural net which are especially useful for image recognition. Specifically, convolutional neural nets use convolutional and pooling layers, which reflect the translation-invariant nature of most images.
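To get a feel for why pooling helps with small translations, here is a minimal NumPy sketch of 2x2 max pooling. The function `max_pool_2x2` is written for this example and is not a Keras API; it only illustrates that a feature shifted within one pooling window produces the same pooled output.

```python
import numpy as np


def max_pool_2x2(image):
    """Non-overlapping 2x2 max pooling of a 2D array with even sides."""
    h, w = image.shape
    # Split rows and columns into (block, offset) pairs, then take the
    # maximum over the two offset axes, i.e. over each 2x2 window.
    return image.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


# One bright pixel at the top-left corner ...
a = np.zeros((4, 4))
a[0, 0] = 1.0

# ... and the same pixel shifted by one row and one column.
b = np.zeros((4, 4))
b[1, 1] = 1.0

print(max_pool_2x2(a))
print(max_pool_2x2(b))
```

Both inputs pool to the same 2x2 map, because the shift stays inside a single window. Larger shifts do change the output, so this tolerance is only local.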

For this, we need to reshape our input data.

In [None]:
image_rows = 28
image_cols = 28
image_shape = (image_rows,image_cols,1)
x_train = tf.reshape(x_train,[x_train.shape[0],*image_shape])
x_test = tf.reshape(x_test,[x_test.shape[0],*image_shape])
x_validate = tf.reshape(x_validate,[x_validate.shape[0],*image_shape])

Now we define our model. It stacks two blocks, each made of a convolutional layer followed by a pooling layer; pooling downsamples the feature maps and so reduces the number of parameters in the layers that follow. Dropout layers after each block and between the dense layers help against overfitting. The keras.layers.Flatten layer then transforms the feature maps from a three-dimensional array into a one-dimensional array. This layer only unstacks the values and lines them up; it has no parameters to learn and only reformats the data.

After the feature maps are flattened, the network consists of a sequence of three keras.layers.Dense layers: two hidden layers of 100 units each and a 25-unit softmax output layer. These are densely connected, or fully connected, neural layers.
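To see what the Flatten layer receives in the model defined below, we can trace the spatial sizes by hand. This sketch assumes the Keras defaults used there: convolutions with stride 1 and no padding, and non-overlapping pooling. The helper names are made up for this example.

```python
def conv_out(size, kernel):
    """Output size of a stride-1 convolution without padding."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output size of non-overlapping pooling without padding."""
    return size // pool


size = 28                 # input images are 28x28
trace = [size]
for _ in range(2):        # two Conv2D + MaxPooling2D blocks
    size = conv_out(size, 3)
    trace.append(size)
    size = pool_out(size, 2)
    trace.append(size)

filters = 32              # filters in the last convolutional layer
flattened = size * size * filters

print(trace)              # [28, 26, 13, 11, 5]
print(flattened)          # 800
```

So the first dense layer sees a vector of 800 values rather than the 784 raw pixels.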

One difficulty in training neural networks is choosing the number of training epochs. Too many epochs can lead to overfitting the training dataset, whereas too few may result in an underfit model. Early stopping lets you specify an arbitrarily large number of training epochs and stop training once the model's performance on a held-out validation dataset stops improving.
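The patience rule behind early stopping can be sketched in a few lines of plain Python. This is a simplified illustration with made-up loss values, not the Keras implementation; the function name `early_stopping_epoch` is hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve on
    its best value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss   # new best value: reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return len(val_losses)  # never triggered: all epochs were run


# Made-up validation losses: the best value is reached at epoch 3.
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.74]

print(early_stopping_epoch(losses, patience=3))    # 6
print(early_stopping_epoch(losses, patience=10))   # 7
```

A larger patience tolerates longer plateaus at the cost of extra epochs, which is why we can set the epoch count generously and let the callback decide.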

In [None]:
cnn_model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(filters=32,kernel_size=3,activation='relu',input_shape = image_shape),
    tf.keras.layers.MaxPooling2D(pool_size=2), # downsampling: the 26*26 feature maps become 13*13
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Conv2D(filters=32,kernel_size=3,activation='relu'), # input_shape is only needed on the first layer
    tf.keras.layers.MaxPooling2D(pool_size=2) ,
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Flatten(), # flatten out the layers
    tf.keras.layers.Dense(100,activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(100,activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(25,activation = 'softmax')
])

cnn_model.compile(loss ='sparse_categorical_crossentropy',
                  optimizer='adam',metrics =['accuracy'])

early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=10)

history = cnn_model.fit(
    x_train,
    y_train,
    batch_size=500,
    epochs=80,
    verbose=1,
    validation_data=(x_validate,y_validate),
    callbacks=[early_stop]
)

In [None]:
score = cnn_model.evaluate(x_test,y_test,verbose=0)
print('Test Loss : {:.4f}'.format(score[0]))
print('Test Accuracy : {:.2f}%'.format(100*score[1]))

Almost 96% accuracy on test data is quite good.

Let us plot the loss and accuracy on the training and validation sets to get a better understanding of the model training.

In [None]:
plt.figure(figsize=(10, 10))

plt.subplot(2, 2, 1)
plt.plot(history.history['loss'], label='Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend()
plt.title('Training - Loss Function')

plt.subplot(2, 2, 2)
plt.plot(history.history['accuracy'], label='Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.legend()
plt.title('Training - Accuracy');

In [None]:
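#predict_classes is not available in later TensorFlow 2.x releases; argmax over the softmax output gives the same labels
cnn_pred = np.argmax(cnn_model.predict(x_test), axis=1)
#names are positional: label 9 (J) never occurs, so "Class 9" onwards are offset by one from the raw labels
target_names = ["Class {}".format(i) for i in range(24)]
print(classification_report(y_test,cnn_pred, target_names=target_names))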

Except for classes 17 and 23, all classes score above 90% in the report.

In [None]:
plt.figure(figsize=(10,10))
sns.heatmap(confusion_matrix(y_test,cnn_pred),cmap='seismic');
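The heatmap shows raw counts; to read per-class performance off a confusion matrix directly, we can divide each diagonal entry by its row sum. Below is a minimal sketch on a made-up 3-class matrix (rows are true labels, columns are predictions, as in sklearn's `confusion_matrix`); the helper `per_class_recall` is hypothetical.

```python
def per_class_recall(matrix):
    """Recall per class: correct predictions divided by true cases."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)                  # all true cases of class i
        recalls.append(row[i] / total if total else 0.0)
    return recalls


# Made-up confusion matrix for 3 classes with 10 true cases each.
toy_matrix = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]

print(per_class_recall(toy_matrix))   # [0.8, 0.9, 1.0]
```

These values correspond to the recall column of the classification report.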
