# Sign Language MNIST Dataset

## ASL
American Sign Language (ASL) is a complete, natural language that has the same linguistic properties as spoken languages, with grammar that differs from English. ASL is expressed by movements of the hands and face. It is the primary language of many North Americans who are deaf and hard of hearing, and is used by many hearing people as well.

## The dataset
The original MNIST image dataset of handwritten digits is a popular benchmark for image-based machine learning methods, but researchers have renewed efforts to update it and develop drop-in replacements that are more challenging for computer vision and closer to real-world applications. The Zalando researchers behind one recent replacement, the Fashion-MNIST dataset, quoted the startling claim that "Most pairs of MNIST digits (784 total pixels per sample) can be distinguished pretty well by just one pixel". To stimulate the community to develop more drop-in replacements, the Sign Language MNIST is presented here and follows the same CSV format with labels and pixel values in single rows. The American Sign Language letter database of hand gestures represents a multi-class problem with 24 classes of letters (excluding J and Z, which require motion).

The dataset format is patterned to match the classic MNIST closely. Each training and test case carries a label (0-25) that maps one-to-one to an alphabetic letter A-Z (with no cases for 9=J or 25=Z because those gestures involve motion). The training data (27,455 cases) and test data (7,172 cases) are approximately half the size of the standard MNIST but otherwise similar, with a header row of label, pixel1, pixel2, …, pixel784 representing a single 28x28 pixel image with grayscale values between 0 and 255. The original hand gesture image data represented multiple users repeating the gesture against different backgrounds. The Sign Language MNIST data came from greatly extending the small number (1,704) of color images, which were not cropped around the hand region of interest. To create new data, an image pipeline based on ImageMagick was used; it included cropping to hands-only, gray-scaling, resizing, and then creating at least 50 variations to enlarge the quantity.
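
The label scheme described above can be sketched in a few lines. This is only an illustration: `label_to_letter`, `LETTERS` and `MOTION_LABELS` are hypothetical names that do not appear elsewhere in the notebook.

```python
import string

# Labels follow alphabet positions: 0 = A, 1 = B, ..., 25 = Z.
LETTERS = string.ascii_uppercase

# J (9) and Z (25) require motion, so the dataset has no cases for them.
MOTION_LABELS = {9, 25}


def label_to_letter(label: int) -> str:
    """Return the letter for a dataset label (hypothetical helper)."""
    if not 0 <= label <= 25:
        raise ValueError(f"label out of range: {label}")
    return LETTERS[label]


# The labels that can actually occur in the CSV files.
static_labels = [i for i in range(26) if i not in MOTION_LABELS]

print(len(static_labels))
print("".join(label_to_letter(i) for i in static_labels))
```

The second line printed is the same 24-letter sequence used as tick labels and class names later in the notebook.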

In [None]:
from IPython.display import Image
Image(filename='../input/sign-language-mnist/amer_sign2.png')


## Content of the notebook:

* Loading dataset.
* Creation and training of the Convolutional Neural Network


--- 

# Loading the datasets 

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os
import seaborn as sns

In [None]:
df_tr = pd.read_csv('../input/sign-language-mnist/sign_mnist_train.csv')
df_tr.head()

In [None]:
# load test set
df_te = pd.read_csv('../input/sign-language-mnist/sign_mnist_test.csv')

# Split the test set into features and labels

X_test = df_te.iloc[:,1:].values
y_test = df_te[['label']].values

print('X_test shape', X_test.shape, X_test.dtype)
print('y_test shape', y_test.shape, y_test.dtype)

X_test = (X_test - 128)/255
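
The scaling `(X - 128)/255` used in the cell above centres the pixel values near zero rather than mapping them to [0, 1]. A small sketch of the resulting range, using a hypothetical helper `scale_pixel` on single values:

```python
def scale_pixel(value: int) -> float:
    """Apply the notebook's centring and scaling to one grayscale value."""
    return (value - 128) / 255


# Extremes of the 0-255 grayscale range after scaling.
low = scale_pixel(0)
high = scale_pixel(255)

print(round(low, 3), round(high, 3))
```

The scaled values span roughly -0.5 to 0.5, and the same transform is applied to the training and validation pixels later on.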

In [None]:
def show_img(img, df):
    
    # Take the label
    label = df['label'][img]
    
    # Take the pixels
    pixels = df.iloc[img, 1:]

    # The pixel intensity values are integers from 0 to 255
    pixels = np.array(pixels, dtype='uint8')

    # Reshape the array into 28 x 28 array (2-dimensional array)
    pixels = pixels.reshape((28, 28))

    # Plot
    plt.title('Label is {label}'.format(label=label))
    plt.imshow(pixels, cmap='gray')
    plt.show()


In [None]:
show_img(90, df_tr)

# Quick EDA

In [None]:
list_data = [df_tr, df_te]
fig,axes = plt.subplots(nrows=1,ncols=2,figsize=(15,6))

for data, ax, names in zip(list_data, axes.ravel(), ['train', 'test']):
    sns.countplot(x=data['label'], palette='rocket', ax=ax)
    ax.set_title("Frequency for each letter in the {} dataset".format(names))
    ax.set_xlabel('Letters')
    ax.set_ylabel('Frequency')
    ax.set_xticklabels(['A','B','C','D','E','F','G','H','I','K','L','M','N','O','P','Q','R','S',
                            'T','U','V','W','X','Y'])

plt.tight_layout()

In [None]:
# Create train/validation split

X = df_tr.iloc[:,1:].values
y = df_tr[['label']].values

from sklearn.model_selection import train_test_split

X_tr, X_v, y_tr, y_v = train_test_split(X, y, test_size=0.2, random_state=14)

print('X_tr shape', X_tr.shape, X_tr.dtype)
print('X_v shape', X_v.shape, X_v.dtype)
print('y_tr shape', y_tr.shape, y_tr.dtype)
print('y_v shape', y_v.shape, y_v.dtype)

X_tr = (X_tr - 128)/255
X_v = (X_v - 128)/255

In [None]:
# X_tr and y_tr to right shape for CNN

train_x = X_tr.reshape(-1,28,28,1) 
train_y = y_tr.reshape(-1,1) 

# val_x and val_y
val_x = X_v.reshape(-1,28,28,1)
val_y = y_v.reshape(-1,1)

X_test = X_test.reshape(-1, 28, 28, 1)

print(train_x.shape)
print(val_x.shape)

In [None]:
from sklearn.preprocessing import LabelBinarizer

lb = LabelBinarizer()
# Fit the encoder on the training labels only, then reuse it for the other splits
y_tr = lb.fit_transform(y_tr)
y_v = lb.transform(y_v)
y_test = lb.transform(y_test)
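
One consequence of one-hot encoding is worth spelling out: because label 9 (J) never occurs, the column index of the encoded target is not always equal to the raw label. The sketch below reproduces the column ordering in plain Python, assuming (as `LabelBinarizer` does) that columns follow the sorted class values. The names `raw_labels`, `label_to_column` and `one_hot` are illustrative only.

```python
# Raw labels present in the data: 0-24 without 9 (J); 25 (Z) never occurs.
raw_labels = [i for i in range(25) if i != 9]

# Columns are ordered by sorted class value, so the column index is the
# position of the raw label in this sorted list.
classes = sorted(set(raw_labels))
label_to_column = {label: col for col, label in enumerate(classes)}


def one_hot(label: int) -> list:
    """Encode one raw label as a 24-element one-hot row."""
    row = [0] * len(classes)
    row[label_to_column[label]] = 1
    return row


print(len(classes))
print(label_to_column[8], label_to_column[10], label_to_column[24])
```

Labels up to 8 keep their index, while labels 10-24 shift down by one. Taking `argmax` of an encoded row therefore returns a 0-23 column index that lines up with the 24-letter class list, not with the raw label.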

In [None]:
# With data augmentation to prevent overfitting
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
        featurewise_center=False,  
        samplewise_center=False,  
        featurewise_std_normalization=False,  
        samplewise_std_normalization=False,  
        zca_whitening=False,  
        rotation_range=10,  
        zoom_range = 0.1, 
        width_shift_range=0.1, 
        height_shift_range=0.1,  
        horizontal_flip=False, 
        vertical_flip=False)  


datagen.fit(train_x)

In [None]:
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.layers import Dense, Conv2D , MaxPool2D , Flatten , Dropout , BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint

# Convolutional network 
model = keras.Sequential()

model.add(keras.layers.Conv2D(filters=32, kernel_size=3, strides=1, activation='relu', input_shape=(28, 28, 1)))
model.add(keras.layers.Conv2D(filters=32, kernel_size=3, strides=1, activation='relu'))

model.add(keras.layers.Conv2D(filters=64, kernel_size=2, strides=1, activation='relu'))
model.add(keras.layers.Conv2D(filters=64, kernel_size=2, strides=1, activation='relu'))
model.add(keras.layers.MaxPool2D(pool_size=2))

model.add(keras.layers.Conv2D(filters=128, kernel_size=2, strides=1, activation='relu'))
model.add(keras.layers.Conv2D(filters=128, kernel_size=2, strides=1, activation='relu'))
model.add(keras.layers.MaxPool2D(pool_size=2))
model.add(Dropout(0.2))

model.add(keras.layers.Conv2D(filters=256, kernel_size=2, strides=1, activation='relu'))
model.add(keras.layers.Conv2D(filters=256, kernel_size=2, strides=1, activation='relu'))
model.add(keras.layers.MaxPool2D(pool_size=2))

model.add(keras.layers.Flatten())

model.add(keras.layers.Dense(512,activation='relu'))
model.add(Dropout(0.25))

model.add(keras.layers.Dense(256,activation='relu'))

model.add(keras.layers.Dense(24, activation='softmax'))

model.summary()

# Compile the model
model.compile(optimizer=keras.optimizers.Adam(), loss='categorical_crossentropy', metrics=['acc'])


early_stopping = [EarlyStopping(patience=3, monitor='val_loss'), ReduceLROnPlateau(patience=2), 
                  ModelCheckpoint(filepath='ASL_MNIST_CNN_temp.h5', save_best_only=True)]
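
To see why the network ends in a small flattened vector, the spatial size can be traced through the layers by hand. The sketch below assumes the Keras defaults that the cell above does not override: `padding='valid'` for `Conv2D`, and a `MaxPool2D` stride equal to the pool size. `conv_out`, `pool_out` and `layers` are hypothetical names used only for this illustration.

```python
def conv_out(size: int, kernel: int, stride: int = 1) -> int:
    """Output width of a convolution with 'valid' padding (no border added)."""
    return (size - kernel) // stride + 1


def pool_out(size: int, pool: int = 2) -> int:
    """Output width of max pooling with stride equal to the pool size."""
    return size // pool


# (layer kind, kernel or pool size), in the order used by the model above.
layers = [
    ("conv", 3), ("conv", 3),
    ("conv", 2), ("conv", 2), ("pool", 2),
    ("conv", 2), ("conv", 2), ("pool", 2),
    ("conv", 2), ("conv", 2), ("pool", 2),
]

size = 28  # input images are 28x28
sizes = [size]
for kind, k in layers:
    size = conv_out(size, k) if kind == "conv" else pool_out(size, k)
    sizes.append(size)

# The last convolutional block has 256 filters.
flattened = size * size * 256

print(sizes)
print(flattened)
```

Under these assumptions the feature map shrinks to 1x1 before `Flatten`, so the first dense layer receives 256 inputs. These numbers can be cross-checked against the output of `model.summary()`.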

In [None]:
history = model.fit(datagen.flow(train_x, y_tr, batch_size = 250), epochs = 100, validation_data = (val_x, y_v) , callbacks = early_stopping)

model.save('ASL_MNIST_CNN.h5') # Saves architecture and weights
print('Model Saved')

In [None]:
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print('Test loss: {:.2f}, accuracy: {:.2f}%'.format(test_loss, test_accuracy*100))

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

# Create two plots: one for the loss value, one for the accuracy
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(12, 4))

# Plot loss values
ax1.plot(history.history['loss'], label='train loss')
ax1.plot(history.history['val_loss'], label='val loss')
ax1.set_title('Validation loss {:.3f} (mean last 3)'.format(
    np.mean(history.history['val_loss'][-3:]) # last three values
))
ax1.set_xlabel('epoch')
ax1.set_ylabel('loss value')
ax1.legend()

# Plot accuracy values
ax2.plot(history.history['acc'], label='train acc')
ax2.plot(history.history['val_acc'], label='val acc')
ax2.set_title('Validation accuracy {:.3f} (mean last 3)'.format(
    np.mean(history.history['val_acc'][-3:]) # last three values
))
ax2.set_xlabel('epoch')
ax2.set_ylabel('accuracy')
ax2.legend()
plt.show()

In [None]:
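# predict_classes is unavailable in newer TensorFlow releases;
# take the argmax of the predicted class probabilities instead
predictions = np.argmax(model.predict(X_test), axis=1)
predictions
# Remove one hot encoding from the target
y_test_ = np.argmax(y_test, axis=1)
y_test_[1]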

In [None]:
from sklearn.metrics import classification_report

print(classification_report(y_test_, predictions, target_names = ['A','B','C','D','E','F','G','H','I','K','L','M','N','O','P','Q','R','S',
                        'T','U','V','W','X','Y']))

In [None]:
from sklearn.metrics import confusion_matrix
import seaborn as sns

matrix = confusion_matrix(y_true=y_test_, y_pred=predictions)

plt.figure(figsize = (20,15))
ax = sns.heatmap(matrix,cmap= "Blues", linecolor = 'black' , linewidth = 0, annot = True, fmt='', xticklabels=['A','B','C','D','E','F','G','H','I','K','L','M','N','O','P','Q','R','S',
                        'T','U','V','W','X','Y'], yticklabels=['A','B','C','D','E','F','G','H','I','K','L','M','N','O','P','Q','R','S',
                        'T','U','V','W','X','Y']);
ax.set(xlabel='Classified as', ylabel='True label')
plt.show()