# Building a neural network to recognize handwritten letters
***Required Packages:*** numpy, pandas, opencv-python, tensorflow, keras, scikit-learn, matplotlib

In [None]:
import numpy as np
import cv2
import matplotlib.pyplot as plt
import pandas as pd

from keras.models import Sequential
from keras.layers import Dense, Flatten, Conv2D, MaxPool2D, Dropout
from keras.optimizers import SGD, Adam
from keras.callbacks import ReduceLROnPlateau, EarlyStopping
from keras.utils import to_categorical

from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle

## Reading in the Data
Dataset available [here](https://www.kaggle.com/datasets/sachinpatel21/az-handwritten-alphabets-in-csv-format)

This dataset consists of images of handwritten letters, each flattened into a single row of pixel values. Below, we read in the dataset using pandas and print the first 10 rows.

In [None]:
data = pd.read_csv(r"archive/A_Z_Handwritten_Data.csv").astype('float32') # Renamed file - _ instead of spaces
print(data.head(10))

Now, we need to separate our dataset into the labels and the images. In our dataset, the first column (named `'0'`) contains the image labels. We first drop that column and assign the remaining pixel columns to the x variable, then we assign the labels to the y variable.

In [None]:
x = data.drop('0', axis = 1)
y = data['0']

When training and testing a neural network, it's useful to split our dataset into a "Training" and a "Testing" portion. The Training portion is what we'll use to train our model to detect letters, and the Testing portion is what we'll use to assess how accurate our model is. Below, we use the `train_test_split()` function to hold out 20% of the data (`test_size = 0.2`) for testing. After we do this, we need to reshape the image data so that each image can be displayed and later fed to the network. In the dataset, each image is stored as a row of 784 pixel columns. Using the `np.reshape()` function, we can convert each row to a 28x28 matrix.

In [None]:
train_x, test_x, train_y, test_y = train_test_split(x, y, test_size = 0.2)

train_x = np.reshape(train_x.values, (train_x.shape[0], 28,28))
test_x = np.reshape(test_x.values, (test_x.shape[0], 28,28))

print("Train data shape: ", train_x.shape)
print("Test data shape: ", test_x.shape)
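
To see what the reshape does to a single image, here is a minimal sketch using a made-up 784-value row (not a row from the dataset). It relies on NumPy's default row-major order, and it assumes the CSV stores each image row by row, so flat pixel `k` ends up at row `k // 28`, column `k % 28`.

```python
import numpy as np

# Hypothetical flat image: pixel k simply holds the value k (illustration only).
flat_row = np.arange(784, dtype="float32")

# Same kind of reshape as above, applied to one image instead of the whole set.
image = flat_row.reshape(28, 28)

# With row-major order, flat index k maps to (k // 28, k % 28).
k = 100
row, col = divmod(k, 28)
print(image.shape)
print(row, col, image[row, col])
```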

In our dataset, the labels (the letters of the alphabet) are stored using numbers from 0-25 with 0 being A and 25 being Z. Below, we create a **dictionary** where the keys correspond to the labels in number form and the values correspond to their letter counterparts.

In [None]:
word_dict = {0:'A',1:'B',2:'C',3:'D',4:'E',5:'F',6:'G',7:'H',8:'I',9:'J',10:'K',11:'L',12:'M',13:'N',14:'O',15:'P',16:'Q',17:'R',18:'S',19:'T',20:'U',21:'V',22:'W',23:'X', 24:'Y',25:'Z'}
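
If typing out all 26 pairs feels error-prone, the same mapping can be generated. This is a small sketch using only the standard library; the name `letter_lookup` is ours, chosen so it doesn't overwrite `word_dict`.

```python
import string

# enumerate() pairs each uppercase letter with its position: 0 -> 'A', ..., 25 -> 'Z'.
letter_lookup = dict(enumerate(string.ascii_uppercase))

print(letter_lookup[0], letter_lookup[10], letter_lookup[25])
```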

It might be useful later on to know some basic characteristics of our dataset, like how frequently each letter of the alphabet is present in the training data. We can visualize this by counting how many times each label appears in the dataset and plotting these values like below: 

In [None]:
y_int = y.astype(int)  # np.int0 is not available in newer NumPy releases, so cast directly
count = np.zeros(26, dtype='int')
for i in y_int:
    count[i] += 1
alphabets = list(word_dict.values())  # 'A'..'Z' in label order

fig, ax = plt.subplots(1,1, figsize=(10,10))
ax.barh(alphabets, count)

plt.xlabel("Number of samples")
plt.ylabel("Letter")
plt.grid()
plt.show()
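
The counting loop can also be written as a single NumPy call. The sketch below uses a handful of made-up labels rather than the real `y`; on the real data, passing the integer labels to `np.bincount` should give the same counts as the loop.

```python
import numpy as np

# Made-up labels for illustration; in the notebook these would come from `y`.
labels = np.array([0, 0, 2, 25, 2, 2], dtype="float32")

# minlength=26 keeps a slot for every letter, even ones that never occur.
counts = np.bincount(labels.astype(int), minlength=26)
print(counts[0], counts[2], counts[25])
```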

In the plot above, some letters of the alphabet appear far more frequently in our dataset than others. One plausible reading is that less frequent letters like F have more distinctive shapes, and are therefore easier for our model to identify. Other letters, like O, resemble other letters and can be written in many ways, so it might be useful to train our model more heavily on letters such as these.

Now that we have an intuitive understanding of our dataset, let's plot some of the entries in the dataset to see what the images of our letters look like. To do this, we first shuffle our training dataset. Then we create a 3x3 grid of plots using the `plt.subplots()` function. Next, we need to perform what's called "thresholding" on our images using the `cv2.threshold()` function. This is done to try to eliminate noise from our data. After our data has been thresholded, we can plot it to get some examples of our training images.

In [None]:
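shuff = shuffle(train_x[:100])  # shuffle the first 100 training images

fig, ax = plt.subplots(3,3, figsize = (10,10))
axes = ax.flatten()

for i in range(9):
    # Pixels brighter than 30 become 200; everything else becomes 0
    _, shu = cv2.threshold(shuff[i], 30, 200, cv2.THRESH_BINARY)
    axes[i].imshow(shu, cmap="Greys")  # plot the thresholded image, not the raw one
plt.show()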
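
To make the thresholding rule concrete, here is a NumPy stand-in for what `cv2.THRESH_BINARY` computes: every pixel strictly above the threshold becomes the maximum value, and every other pixel becomes 0. The helper name `binary_threshold` and the sample pixels are ours, for illustration only.

```python
import numpy as np

def binary_threshold(image, thresh, maxval):
    """Pixels strictly above `thresh` become `maxval`; the rest become 0."""
    return np.where(image > thresh, maxval, 0).astype(image.dtype)

# Made-up pixel values around the threshold of 30 used above.
pixels = np.array([[0, 29, 30], [31, 128, 255]], dtype="float32")
result = binary_threshold(pixels, 30, 200)
print(result)
```

Note that a pixel exactly equal to the threshold is set to 0, since the comparison is strict.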

Before we train our model, we need to do some restructuring of the data so that it can be more easily interpreted by our neural network. First, we add a trailing channel dimension to the images (28x28 becomes 28x28x1), since the convolutional layers expect one. Second, our Y variable is an integer label that indicates which letter is being displayed. For the purposes of our neural network, this is best represented as a one-hot (categorical) vector rather than a single integer. We can convert between the two using the `to_categorical()` function.

In [None]:
train_X = train_x.reshape(train_x.shape[0],train_x.shape[1],train_x.shape[2],1)
print("New shape of train data: ", train_X.shape)
test_X = test_x.reshape(test_x.shape[0], test_x.shape[1], test_x.shape[2],1)
print("New shape of test data: ", test_X.shape)
train_Y = to_categorical(train_y, num_classes = 26)
print("New shape of train labels: ", train_Y.shape)
test_Y = to_categorical(test_y, num_classes = 26)
print("New shape of test labels: ", test_Y.shape)
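
For intuition about the categorical encoding, here is a NumPy-only sketch of one-hot encoding on three made-up labels. It is not how Keras implements `to_categorical()`, but for integer labels in the range 0-25 it produces the same kind of array: one row per label, with a single 1 in the column matching the label.

```python
import numpy as np

labels = np.array([0, 2, 25])  # made-up labels: A, C, Z

# Row i of the 26x26 identity matrix is the one-hot vector for class i.
one_hot = np.eye(26, dtype="float32")[labels]

print(one_hot.shape)
print(one_hot.argmax(axis=1))  # argmax recovers the original integer labels
```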

Now, it's time to set up our neural network! Here, we'll be using a Convolutional Neural Network (CNN) to extract various features of the images and assign them classifying labels. The convolutional layers slide small 3x3 filters over the image to pick up local patterns, the pooling layers shrink the resulting feature maps, and the dense layers at the end turn those features into a probability for each of the 26 letters.

In [None]:
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu', input_shape=(28,28,1)))
model.add(MaxPool2D(pool_size=(2, 2), strides=2))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu', padding = 'same'))
model.add(MaxPool2D(pool_size=(2, 2), strides=2))
model.add(Conv2D(filters=128, kernel_size=(3, 3), activation='relu', padding = 'valid'))
model.add(MaxPool2D(pool_size=(2, 2), strides=2))
model.add(Flatten())
model.add(Dense(64,activation ="relu"))
model.add(Dense(128,activation ="relu"))
model.add(Dense(26,activation ="softmax"))
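
It can help to trace how the image size changes through the layers above. The sketch below does the arithmetic by hand, assuming stride-1 convolutions and Keras's default `'valid'` padding wherever `padding` isn't given; the helper names `conv_out` and `pool_out` are ours.

```python
def conv_out(size, kernel, padding):
    """Output width/height of a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1

def pool_out(size, pool=2, stride=2):
    """Output width/height of max pooling without padding."""
    return (size - pool) // stride + 1

size = 28
size = pool_out(conv_out(size, 3, "valid"))  # 28 -> 26 -> 13
size = pool_out(conv_out(size, 3, "same"))   # 13 -> 13 -> 6
size = pool_out(conv_out(size, 3, "valid"))  # 6 -> 4 -> 2

# The last convolution has 128 filters, so Flatten() sees size * size * 128 values.
flattened = size * size * 128
print(size, flattened)
```

Calling `model.summary()` on the model above is a quick way to check these numbers against what Keras actually builds.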

After we've added all of the layers in our model, it's time to actually train and validate it! To do this, we first compile our model and assign the optimizer and loss function that will be used to improve the fit. Finally, we start training our model using the `model.fit()` function. Since our dataset is fairly large, we'll train our model for a single "epoch" (one full pass over our training data). We also specify that the test portion of our dataset that we set aside earlier will be used to check the accuracy of the model. If we find that our trained model isn't terribly accurate, we could try increasing the number of training epochs.

In [None]:
model.compile(optimizer = Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(train_X, train_Y, epochs=1,  validation_data = (test_X,test_Y))
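
To get a feel for the `categorical_crossentropy` loss, here is a simplified NumPy version applied to made-up predictions over three classes instead of 26. Keras's own implementation differs in details such as how it clips values, so treat this as a sketch of the formula rather than a drop-in replacement.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Batch mean of -sum(y_true * log(y_pred)) over the classes."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Two made-up samples with one-hot labels (class 0 and class 1).
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
confident = np.array([[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]])
unsure = np.array([[0.4, 0.3, 0.3], [0.3, 0.4, 0.3]])

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

The loss only looks at the probability assigned to the correct class, so confident correct predictions give a lower loss than hesitant ones.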

After training our model, we might be interested in knowing how accurate it is. We can output information about that using the below code:

In [None]:
print("The validation accuracy is :", history.history['val_accuracy'])
print("The training accuracy is :", history.history['accuracy'])
print("The validation loss is :", history.history['val_loss'])
print("The training loss is :", history.history['loss'])

Now, we can directly test our model's ability to classify letters using samples from our test dataset! To do this, we first select images from the test dataset and plot them. Then, we feed these selections into our model and ask it to predict the label of each image using the `model.predict()` function. After we've done that, we can add the prediction for each image to its plot's title.

In [None]:
fig, axes = plt.subplots(3,3, figsize=(8,9))
axes = axes.flatten()

test_X, test_Y = shuffle(test_X, test_Y)  # shuffle images and labels together so they stay aligned

for i,ax in enumerate(axes):
    img = np.reshape(test_X[i], (28,28))
    ax.imshow(img, cmap="Greys")
    img_reshape = np.reshape(img, (1,28,28,1))
    pred = word_dict[np.argmax(model.predict(img_reshape))]
    ax.set_title("Prediction: "+pred)
    ax.grid()
    
plt.show()
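
For a single image, `model.predict()` returns one row of 26 probabilities, and `np.argmax` picks the most likely class. The sketch below decodes a made-up probability vector (not real model output) and also shows the runner-up, which can be handy when the model confuses similar letters.

```python
import string
import numpy as np

# Made-up softmax output for one image: mostly 'K', with some weight on 'X'.
probs = np.full(26, 0.01)
probs[10] = 0.60
probs[23] = 0.16

best = int(np.argmax(probs))
top2 = np.argsort(probs)[::-1][:2]  # indices of the two largest probabilities
letters = [string.ascii_uppercase[i] for i in top2]

print(string.ascii_uppercase[best], letters)
```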

We can also use our model to predict the label of images **not** contained in our testing or training datasets. Testing our model against real-world data is a great way to see if it generalizes, or if it's only successful at labeling data that looks like the training set. To do this, feel free to download an image of any letter and replace the path in the code below with the image's path on your computer.

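After our image is read in, it has to go through some processing steps in order to arrive at a format that can be interpreted by our model: we blur it to reduce noise, convert it to grayscale, apply an inverted threshold so the letter becomes bright on a dark background, and resize it to 28x28 with a batch and channel dimension.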

In [None]:
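img = cv2.imread(r'k.png')
if img is None:  # cv2.imread returns None instead of raising when the file can't be read
    raise FileNotFoundError("Could not read 'k.png'; check the path")
img_copy = img.copy()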
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (400,440))

img_copy = cv2.GaussianBlur(img_copy, (7,7), 0)
img_gray = cv2.cvtColor(img_copy, cv2.COLOR_BGR2GRAY)
_, img_thresh = cv2.threshold(img_gray, 100, 255, cv2.THRESH_BINARY_INV)
img_final = cv2.resize(img_thresh, (28,28))
img_final = np.reshape(img_final, (1,28,28,1))

In [None]:
img_pred = word_dict[np.argmax(model.predict(img_final))]
cv2.putText(img, "Prediction: " + img_pred, (20,410), cv2.FONT_HERSHEY_DUPLEX, 1.3, color = (255,0,30))
cv2.imshow("Letter Recognition", img)

while True:
    k = cv2.waitKey(1) & 0xFF
    if k == 27:                 # Press escape to close window
        break
cv2.destroyAllWindows()