# Letter image recognition

![pic](https://imgur.com/qBl14zL.png)

Download

[Kaggle Dataset](https://www.kaggle.com/c/street-view-getting-started-with-julia/data)

## Data Preprocessing

### Image Color
Almost all images in the training and test sets are color images. The first preprocessing step is to convert them all to grayscale. This simplifies the data fed into the network and helps the model generalize, since a blue letter and a red letter belong to the same class. Reducing the image to a single channel should therefore have no negative effect on the final accuracy, as most of the text contrasts strongly with its background.
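
As a rough illustration of what grayscale conversion does to a single pixel, the sketch below applies the common luma weights (0.299, 0.587, 0.114) that OpenCV documents for its RGB-to-gray color conversion. The function name `to_gray` and the sample colors are made up for this example, and the exact values produced when an image file is read directly in grayscale mode may differ slightly, so treat this as an approximation.

```python
def to_gray(r, g, b):
    """Collapse one RGB pixel (0-255 per channel) into a single gray level."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

# Hypothetical sample pixels: a white background and two colored letters.
white_background = to_gray(255, 255, 255)
red_letter = to_gray(255, 0, 0)
blue_letter = to_gray(0, 0, 255)

print(white_background, red_letter, blue_letter)
```

Both colored letters map to dark gray levels, well below the white background, so the shape of the letter survives the conversion.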

### Image Resizing
Since the images have different shapes and sizes, we have to resize them to a common size so that the model's input shape is fixed. Two questions need to be answered in this process: which image size do we choose, and should we keep the image's aspect ratio?

At first, I thought it would be better to keep the aspect ratio, since that avoids distorting the image. Distortion can also lead to confusion between O and 0 (uppercase o and zero). However, after some testing, the model did not seem to perform better when the aspect ratio was kept.
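
To make the aspect-ratio question concrete, here is a minimal sketch of the arithmetic behind an aspect-preserving resize. The helper `fit_within` is hypothetical and is not part of this notebook, which stretches every image to the square target instead. The sketch assumes the longer side is scaled to the target and the shorter side is padded equally on both sides.

```python
def fit_within(width, height, target=32):
    """Return ((new_w, new_h), (pad_x, pad_y)) for an aspect-preserving resize.

    Hypothetical helper: the longer side is scaled to `target`, and the
    shorter side is padded to fill the square.
    """
    scale = target / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    pad_x = (target - new_w) // 2
    pad_y = (target - new_h) // 2
    return (new_w, new_h), (pad_x, pad_y)

# A tall, narrow glyph such as "l" or "1": 20 px wide, 80 px high.
print(fit_within(20, 80))
```

With the aspect ratio kept, a narrow glyph occupies only a thin strip of the 32 × 32 input and the rest is padding. A plain resize stretches it across the whole square.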

With respect to image size, 16 × 16 images allow very fast training but do not give the best results. These small images are a good choice for quickly testing ideas. 32 × 32 images still train quickly and provide good accuracy. Finally, 64 × 64 images made training considerably slower and improved the results only slightly over 32 × 32. I chose 32 × 32 images as the best compromise between speed and accuracy.
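
Part of the speed difference follows from how quickly the number of input values grows with the side length. Training time also depends on the network architecture and the hardware, which are not specified here, so this sketch only counts pixels per grayscale image.

```python
# Number of grayscale input values per image for each candidate size.
inputs = {side: side * side for side in (16, 32, 64)}

for side, pixels in inputs.items():
    ratio = pixels // inputs[16]
    print(f"{side} x {side}: {pixels} values per image ({ratio}x the 16 x 16 case)")
```

Doubling the side length quadruples the amount of data per image.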

In [20]:
import os
import glob
import pandas as pd
import math
import numpy as np
import cv2
from IPython.display import Image

path = "D:/Program/dataset/First Steps With Julia"

# The target size after image conversion
img_height, img_width = 32, 32

# Directories that will hold the converted images
suffix = "Preproc"
trainDataPath = path + "/train" + suffix
testDataPath = path + "/test" + suffix

# create directory
if not os.path.exists(trainDataPath):
    os.makedirs(trainDataPath)

if not os.path.exists(testDataPath):
    os.makedirs(testDataPath)
    
trainDataPath

'D:/Program/dataset/First Steps With Julia/trainPreproc'

In [32]:
%%time
### Image size and image color preprocessing ###

for datasetType in ["train","test"]:
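    # Sort by the numeric ID in the file name so the order follows the image IDs
    # (assumed to be the row order of trainLabels.csv); modification time is not
    # guaranteed to follow the ID order
    imgFiles = sorted(glob.glob(path + "/" + datasetType + "/*.bmp"),
                      key=lambda p: int(os.path.basename(p).split(".")[0]))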
    imgData = np.zeros((len(imgFiles), img_height, img_width))
    
    for i, imgFilePath in enumerate(imgFiles):
        # Read the image directly as grayscale
        img_gray = cv2.imread(imgFilePath, cv2.IMREAD_GRAYSCALE)
        # cv2.resize expects the target size as (width, height)
        imgResized = cv2.resize(img_gray, (img_width, img_height))
        imgData[i] = imgResized
        
        # get the basename, ex: 1.Bmp or 6284.Bmp
        filename = os.path.basename(imgFilePath)
        filenameDotSplit = filename.split(".")
        
        # zfill, ex: 00001.bmp or 06284.bmp
        newFilename = filenameDotSplit[0].zfill(5) + "." + filenameDotSplit[-1].lower()
        newFilepath = path + "/" + datasetType + suffix + "/" + newFilename
        cv2.imwrite(newFilepath, imgResized)

    # Add a trailing "channel" dimension
    print("Before: ", imgData.shape)
    imgData = imgData[:,:,:,np.newaxis] 
    print("After: ", imgData.shape)
    
    # Scale pixel values from [0, 255] to [0, 1]
    imgData = imgData.astype('float32')/255
    
    # Save the converted images as a NumPy array (.npy) on disk
    np.save(path + "/" + datasetType + suffix + ".npy", imgData)
        

Before:  (6283, 32, 32)
After:  (6283, 32, 32, 1)
Before:  (6220, 32, 32)
After:  (6220, 32, 32, 1)
Wall time: 1min 46s
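
A later step would typically reload the saved array instead of repeating the conversion. The sketch below replays the channel-axis and scaling steps on random stand-in data and checks that the result survives a save/load round trip. The file name `demoPreproc.npy`, the temporary directory, and the random batch are assumptions made only for this example.

```python
import os
import tempfile

import numpy as np

# Stand-in for the resized grayscale images: 3 random 32x32 arrays of 0-255 values.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(3, 32, 32)).astype("float32")

# Same two steps as the preprocessing loop: add a channel axis, then scale to [0, 1].
batch = batch[:, :, :, np.newaxis] / 255

with tempfile.TemporaryDirectory() as tmp_dir:
    npy_path = os.path.join(tmp_dir, "demoPreproc.npy")  # hypothetical file name
    np.save(npy_path, batch)
    restored = np.load(npy_path)

print(restored.shape, restored.dtype)
print(float(restored.min()) >= 0.0, float(restored.max()) <= 1.0)
```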


## Label Conversion
We also have to convert the character labels to one-hot encoding, which gives the CNN the class information it needs for training. This process consists of two steps. First, we convert the characters to consecutive integers. Since the characters to be predicted are [0 ~ 9], [A ~ Z] and [a ~ z], 62 characters in total, we assign each character an integer from 0 to 61. Second, we turn each integer into a 62-element vector that is 1 at that index and 0 everywhere else.
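
One way to picture the character-to-integer step is as a position lookup in a 62-character string. This is only an alternative sketch, not the code used in this notebook. The names `CLASSES`, `char_to_index` and `index_to_char` are made up for the example, and the ordering (digits, then uppercase, then lowercase) is chosen to match the `label2int` function defined in this notebook.

```python
import string

# Class order: digits, then uppercase, then lowercase (62 symbols in total).
CLASSES = string.digits + string.ascii_uppercase + string.ascii_lowercase

def char_to_index(ch):
    """Map a single character to its class index in CLASSES."""
    return CLASSES.index(ch)

def index_to_char(i):
    """Map a class index back to its character."""
    return CLASSES[i]

print(len(CLASSES))
print(char_to_index("0"), char_to_index("A"), char_to_index("a"), char_to_index("z"))
```

A lookup string avoids hard-coded ASCII offsets, at the cost of a linear search per character, which is negligible for a few thousand labels.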



In [37]:
import keras

def label2int(ch):
    # Given a string representing one Unicode character, 
    # return an integer representing the Unicode code point of that character. 
    asciiVal = ord(ch)
    if(asciiVal<=57): #0-9
        asciiVal-=48
    elif(asciiVal<=90): #A-Z
        asciiVal-=55
    else: #a-z
        asciiVal-=61
    return asciiVal
    
def int2label(i):
    if(i<=9): #0-9
        i+=48
    elif(i<=35): #A-Z
        i+=55
    else: #a-z
        i+=61
    return chr(i)

Using TensorFlow backend.


In [38]:
# Only keep "label information" column
y_train = pd.read_csv(path + "/trainLabels.csv").values[:,1] 
y_train

array(['n', '8', 'T', ..., 'P', 'N', 'R'], dtype=object)

In [40]:
# One-hot encoding for the label

# A-Z, a-z, 0-9, there are 62 categories
Y_train = np.zeros((y_train.shape[0], 62)) 

for i in range(y_train.shape[0]):
    Y_train[i][label2int(y_train[i])] = 1 # One-hot

# Save the one-hot labels to disk so they can be loaded quickly in later steps
np.save(path + "/" + "labelsPreproc.npy", Y_train)
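
The loop above can also be written without explicit iteration. The sketch below is a vectorized variant under the assumption that the labels have already been mapped to integer class indices. The sample indices 49, 8 and 29 are what `label2int` returns for 'n', '8' and 'T'. It also shows how `argmax` recovers the class index from a one-hot row, which is the same operation used to read a prediction from a vector of class scores.

```python
import numpy as np

# Integer class indices for the example labels "n", "8", "T".
indices = np.array([49, 8, 29])

# Each row of the 62x62 identity matrix is the one-hot vector for that class.
one_hot = np.eye(62)[indices]

# argmax recovers the class index from each one-hot row.
decoded = one_hot.argmax(axis=1)

print(one_hot.shape)
print(decoded)
```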