# Seven segment digit classification

This notebook contains the various methods we considered to identify numbers composed of seven segment digits in photos. 

The following are the methodologies we considered:


  1. Extract LCD screen, extract each digit within the LCD and classify them separately


  2. Extract LCD screen and classify the number as a whole.
  
  
There are several options to extract the LCD screen after pre-processing each image:


> - Detect corners (angles) within each photo; the left-most and right-most corners should approximately correspond to the corners of the LCD screen.

> - Using the cv2 package, find the contours within each photo and keep the largest four-sided one. If this method fails, we simply crop a rectangle from the middle of the photo.
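
The fallback crop mentioned in the second option can be sketched in a few lines. The helper name `center_crop_bounds` and the 50% crop fraction are assumptions made for illustration only; the notebook does not state how large the central rectangle is.

```python
def center_crop_bounds(height, width, fraction=0.5):
    """Return (top, bottom, left, right) of a rectangle centered in the image.

    `fraction` is a hypothetical setting: the share of each side that is kept.
    """
    crop_h = int(height * fraction)
    crop_w = int(width * fraction)
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return top, top + crop_h, left, left + crop_w


# With a NumPy image the crop would then be image[top:bottom, left:right].
top, bottom, left, right = center_crop_bounds(500, 800)
```

For a 500x800 photo this keeps rows 125 to 375 and columns 200 to 600.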


For the first method, we took a simple approach to cropping each digit out of the extracted LCD screen: we divide the LCD into four equal segments. The crops are then passed to a deep neural net, which classifies each digit individually. 


For the second method, the LCD screen is passed to a deep neural net which will both detect and classify the numbers accordingly. 


## Mount drive to load and save data

In [7]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Load packages

In [4]:
import logging
import shutil
import imutils
import glob
import os

import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt 
matplotlib.use('agg')

import cv2
import skimage.filters as ft
import scipy.spatial as sp
from skimage.measure import label, regionprops
from PIL import Image

import tensorflow as tf
from sklearn.model_selection import train_test_split
import keras.backend
from keras.models import Sequential
from keras.models import Model
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, Flatten
from keras.layers.core import Dropout, Activation
from keras.layers import BatchNormalization
from keras import regularizers
from keras.optimizers import Adam
from keras.utils import plot_model
from keras.callbacks import TensorBoard,EarlyStopping
from keras.backend.tensorflow_backend import set_session

Using TensorFlow backend.


## Extracting LCD screen

### Extraction with contour identification

First we define a class which we can apply automatically to all our input images. Note that all the preprocessing is integrated directly in this class and applied to each photo to increase chances of correctly isolating the LCD.

In [0]:
class frameExtractor:

    def __init__(self, image=None, src_file_name=None, dst_file_name=None, return_image=False, output_shape =(400,100)):
        """
        Use this class to extract the frame/LCD screen from the image. This is our step 1 for image preprocessing.
        The final frame is extracted in grayscale.
        Note that it works for the "digital" case and can be used for the "analog" case, but it is more efficient on the "digital" case.
        :param image: RGB image (numpy array NxMx3) with a screen to extract. If image is None, the image will be extracted from src_filename
        :param src_file_name: filename to load the source image where the screen needs to be extracted (e.g. HQ_digital/0a07d2cff5beb0580bca191427e8cd6e1a0eb678.jpg)
        :param dst_file_name: filename to save the preprocessed image (e.g. HQ_digital_frame/0a07d2cff5beb0580bca191427e8cd6e1a0eb678.jpg)
        :param return_image: a boolean, if True extractAndSave returns an image (np. array) / if False it just saves the image.
        :param output_shape: shape (in pxl) of the output image.
        """
        if image is None :
            self.image = cv2.imread(src_file_name)
        else :
            self.image = image
        self.dst_file_name = dst_file_name
        self.return_image = return_image
        self.output_shape = output_shape
        self.raw_frame = None
        self.frame = None
        self.sliced_frame = None


    def distance_from_center(self, rectangle):
        """
        Use this function to measure how far a rectangle is from the center of an image.
        Most of the time the frame is approx. in the middle of the picture.
        Note that the code works for shapes that are approx. rectangles.
        :param rectangle: a 4x2 array with the coordinates of each corner of the rectangle.
        :return: the distance (a float) between the center of the rectangle and the center of the picture.
        """
        center_rc = 0.5*(rectangle[0]+ rectangle[2])
        center_image = 0.5*np.array([self.image.shape[1],self.image.shape[0]])
        distance = np.linalg.norm(center_rc-center_image)
        return distance



    @staticmethod
    def sort_pts_clockwise(A):
        """
        Use this function to sort in clockwise order points in R^2.
        Credit: https://stackoverflow.com/questions/30088697/4-1-2-numpy-array-sort-clockwise
        :param A: a Nx2 array with the 2D coordinates of the points to sort.
        :return: a Nx2 array with the points sorted in clockwise order starting with the top-left point.
        """
        # Sort A based on Y(col-2) coordinates
        sortedAc2 = A[np.argsort(A[:,1]),:]
        # Get top two and bottom two points
        top2 = sortedAc2[0:2,:]
        bottom2 = sortedAc2[2:,:]
        # Sort top2 points to have the first row as the top-left one
        sortedtop2c1 = top2[np.argsort(top2[:,0]),:]
        top_left = sortedtop2c1[0,:]
        # Use top left point as pivot & calculate sq-euclidean dist against
        # bottom2 points & thus get bottom-right, bottom-left sequentially
        sqdists = sp.distance.cdist(top_left[None], bottom2, 'sqeuclidean')
        rest2 = bottom2[np.argsort(np.max(sqdists,0))[::-1],:]
        # Concatenate all these points for the final output
        return np.concatenate((sortedtop2c1,rest2),axis =0)


    @staticmethod
    def adjust_gamma(image, gamma=1.0):
        """
        Use this function to adjust illumination in an image.
        Credit: https://stackoverflow.com/questions/33322488/how-to-change-image-illumination-in-opencv-python
        :param image: A grayscale image (NxM int array in [0, 255]).
        :param gamma: A positive float. If gamma<1 the image is darkened / if gamma>1 the image is brightened / if gamma=1 nothing happens.
        :return: the brightened/darkened version of image.
        """
        invGamma = 1.0 / gamma
        table = np.array([((i / 255.0) ** invGamma) * 255 for i in np.arange(0, 256)])
        return cv2.LUT(image.astype(np.uint8), table.astype(np.uint8))


    def frameDetection(self):
        """
        The core method of the class. Use it to extract the frame in the image.
        The extracted frame is in grayscale.
        The followed steps are :
            1. grayscale + smoothing + gamma to make the frame darker + binary threshold (rationale = the frame is one of the darkest parts of the picture).
            2. extract regions of "interest".
            3. heuristic to find a region of interest that is large enough, in the center of the picture and where length along x-axis > length along y-axis.
            4. make a perspective transform to crop the image and deal with perspective deformations.
        """
        self.image = imutils.resize(self.image, height=500)

        # Step 1: grayscale + smoothing + gamma to make the frame darker + binary threshold
        gray = cv2.cvtColor(self.image, cv2.COLOR_BGR2GRAY)
        blurred = cv2.GaussianBlur(gray, (5, 5), 0)
        gamma = frameExtractor.adjust_gamma(blurred, gamma=0.7)
        shapeMask = cv2.threshold(gamma, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

        # Step 2: extract regions of "interest".
        label_image = label(shapeMask)

        Cnt = None
        position = [0, 0, 0, 0]

        for region in regionprops(label_image):
            # Step 3: heuristic to find a region large enough, in the center & with length along x-axis > length along y-axis.
            minr, minc, maxr, maxc = region.bbox
            c = np.array([[minc, minr], [minc, maxr], [maxc, minr], [maxc, maxr]])

            if Cnt is None:
                Cnt = c
                position = [minr, minc, maxr, maxc]

            old_dist = self.distance_from_center(Cnt)
            new_dist = self.distance_from_center(c)

            Lx = maxc - minc
            Ly = maxr - minr

            c = frameExtractor.sort_pts_clockwise(c)

            if old_dist>new_dist and Ly<Lx and cv2.contourArea(c)>0.05*(shapeMask.shape[0]*shapeMask.shape[1]):
                Cnt = c
                position = [minr, minc, maxr, maxc]

        Cnt = Cnt.reshape(4, 2)
        Cnt = frameExtractor.sort_pts_clockwise(Cnt)


        # Step 4: Make a perspective transform to crop the image and deal with perspective deformations.
        try:
            # Crop the image around the region of interest (but keep a bit of distance with a 30px padding).
            # Darken + Binary threshold + rectangle detection.
            # If this technique fails, raise an error and use basic methods (except part).

            crop_img = self.image[max(0, position[0] - 30):min(position[2] + 30, self.image.shape[0]),\
                       max(0, position[1] - 30):min(self.image.shape[1], position[3] + 30)]

            crop_blurred = cv2.GaussianBlur(crop_img, (5, 5), 0)
            crop_gamma = frameExtractor.adjust_gamma(crop_blurred, gamma=0.4)
            crop_gray = cv2.cvtColor(crop_gamma, cv2.COLOR_BGR2GRAY)
            crop_thresh = cv2.threshold(crop_gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

            cnts = cv2.findContours(crop_thresh.copy(), cv2.RETR_EXTERNAL,
                                    cv2.CHAIN_APPROX_SIMPLE)
            cnts = imutils.grab_contours(cnts)  # handles the differing return values of OpenCV 2, 3 and 4
            cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
            Cnt_bis = None

            for c in cnts:
                peri = cv2.arcLength(c, True)
                approx = cv2.approxPolyDP(c, 0.02 * peri, True)

                if len(approx) == 4:
                    Cnt_bis = approx
                    break

            if cv2.contourArea(Cnt_bis)<0.5*(crop_img.shape[0]*crop_img.shape[1]):
                raise ValueError("Couldn't find the box, so switching to ad hoc method.")

            Cnt_bis = Cnt_bis.reshape(4, 2)
            Cnt_bis = frameExtractor.sort_pts_clockwise(Cnt_bis)
            src_pts = Cnt_bis.copy()
            src_pts = src_pts.astype(np.float32)

            dst_pts = np.array([[0, 0], [400, 0], [400, 100], [0, 100]], dtype=np.float32)
            dst_pts = dst_pts.astype(np.float32)

            persp = cv2.getPerspectiveTransform(src_pts, dst_pts)
            warped = cv2.warpPerspective(crop_img, persp, (400, 100))


        except:
            # More basic techniques that give +/- acceptable results when the first technique fails.
            src_pts = Cnt.copy()
            src_pts = src_pts.astype(np.float32)

            dst_pts = np.array([[0, 0], [400, 0], [400, 100], [0, 100]], dtype=np.float32)
            dst_pts = dst_pts.astype(np.float32)

            persp = cv2.getPerspectiveTransform(src_pts, dst_pts)
            warped = cv2.warpPerspective(gray, persp, (400, 100))

        # Frame is extracted from the initial image in grayscale (no other processing done on the image).
        self.raw_frame = warped


    # TODO : check why they fail
    """
    http://www.amphident.de/en/blog/preprocessing-for-automatic-pattern-identification-in-wildlife-removing-glare.html
    http://people.csail.mit.edu/yichangshih/mywebsite/reflection.pdf
    http://news.mit.edu/2015/algorithm-removes-reflections-photos-0511
    """
    def preprocessFrame(self):
        """
        Final preprocessing that outputs a clean image 'cleaned_img' with more contrasts
        """
        try :
            gray = cv2.cvtColor(self.raw_frame, cv2.COLOR_BGR2GRAY)
        except :
            gray = self.raw_frame
        thresh = cv2.equalizeHist(gray)
        thresh = cv2.threshold(thresh, 45, 255, cv2.THRESH_BINARY_INV)[1]
        cleaned_img = cv2.dilate(thresh, None, iterations=1)
        self.frame = cleaned_img


    def sliceFrame(self):
        """
        Use this method to slice the frame and only keep the integer part (e.g. 123.45 becomes 123).
        Heuristic: the decimal point is approx. at 8/13 of the image width.
        :return: None, the result is stored in self.sliced_frame.
        """
        stop_at = int(np.floor(self.output_shape[0]*8/13))
        self.sliced_frame = np.array(self.frame)[:,:stop_at]


    def extractAndSaveFrame(self):
        """
        Use this method to
                1. detect and select the frame/screen.
                2. preprocessing to only keep numbers (and remove noise).
                3. slice the frame to only keep integer part.
                4. save the sliced frame in dst_file_name.
        :return: the extracted frame (np.array) if it was specified when instantiating the class.
        """
        self.frameDetection()
        self.preprocessFrame()
        self.sliceFrame()
        cv2.imwrite(self.dst_file_name, self.sliced_frame)
        if self.return_image:
            return self.sliced_frame
        else:
            return
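
The gamma step used in `adjust_gamma` is easier to reason about when looking at the lookup table on its own. The sketch below rebuilds that table in plain Python; the helper name `gamma_table` is hypothetical, and truncating with `int` is meant to mirror the cast to `uint8` in the class.

```python
def gamma_table(gamma):
    """Build the 256-entry lookup table used for gamma correction."""
    inv_gamma = 1.0 / gamma
    # Each input intensity i is mapped to 255 * (i / 255) ** (1 / gamma),
    # truncated to an integer.
    return [int(((i / 255.0) ** inv_gamma) * 255) for i in range(256)]


darker = gamma_table(0.7)    # gamma < 1 pushes mid-tones towards black
brighter = gamma_table(1.5)  # gamma > 1 pushes mid-tones towards white
```

Black and white stay fixed in both tables, while a mid-gray input such as 128 is mapped below 128 by `darker` and above 128 by `brighter`. This is why `frameDetection` uses gamma values below 1 to make the dark LCD frame stand out before thresholding.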

We then apply the class to all our input photos and save the extracted LCD images to the Datasets_frames folder (check that the paths match your own directories). Failed extractions are counted for each quality level of photos (HQ, LQ and MQ, in that order) and printed as the "fail" list.

In [6]:
if __name__ == "__main__":

    if os.path.exists('Datasets_frames/'):
        shutil.rmtree('Datasets_frames/')
        os.makedirs('Datasets_frames/')
    else:
        os.makedirs('Datasets_frames/')

    fail = [0, 0, 0]

    for file in glob.glob('/content/drive/My Drive/Data_Analog_Digital/HQ_digital/*jpg'):

        try:
            f = frameExtractor(image=None,
                               src_file_name=file,
                               dst_file_name='Datasets_frames/' + str(file).split('/')[-1],
                               return_image=False,
                               output_shape=(400, 100))
            f.extractAndSaveFrame()
        except:
            fail[0] += 1

    for file in glob.glob('/content/drive/My Drive/Data_Analog_Digital/LQ_digital/*jpg'):
        try:
            f = frameExtractor(image=None,
                               src_file_name=file,
                               dst_file_name='Datasets_frames/' + str(file).split('/')[-1],
                               return_image=False,
                               output_shape=(400, 100))
            f.extractAndSaveFrame()
        except:
            fail[1] += 1

    for file in glob.glob('/content/drive/My Drive/Data_Analog_Digital/MQ_digital/*jpg'):
        try:
            f = frameExtractor(image=None,
                               src_file_name=file,
                               dst_file_name='Datasets_frames/' + str(file).split('/')[-1],
                               return_image=False,
                               output_shape=(400, 100))
            f.extractAndSaveFrame()
        except:
            fail[2] += 1

    print(fail)

[0, 0, 0]
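
The three loops above differ only in the source folder, so the failure count can be written once. The following is a sketch under assumptions: `count_failures` and `fake_process` are hypothetical names, and `fake_process` stands in for `frameExtractor(...).extractAndSaveFrame()` so that the example runs without the photos.

```python
def count_failures(files_by_quality, process):
    """Apply `process` to every file and count the failures per quality level."""
    failures = {quality: 0 for quality in files_by_quality}
    for quality, files in files_by_quality.items():
        for path in files:
            try:
                process(path)
            except Exception:
                failures[quality] += 1
    return failures


# Hypothetical stand-in for the real extraction step.
def fake_process(path):
    if "broken" in path:
        raise ValueError("could not extract frame from " + path)


files = {"HQ": ["a.jpg", "b.jpg"], "LQ": ["broken.jpg"], "MQ": []}
failures = count_failures(files, fake_process)
```

In the notebook, the lists in `files` would come from the three `glob.glob(...)` calls.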


### Extraction with angle identification

Now that we have extracted the LCD from each photo, we can follow either the first or the second methodology briefly described above. 

## Method 1: Individual digit detection

> **1.** We need to separate each digit within the cropped-out LCD screen. To do this we simply divide the extracted LCD into four equal segments (cadran 1, 2, 3, 4). We then save each cropped digit in a folder named after its label, which we read from the csv files.

> **2.** We then need to generate a dataset (array form) to pass to our neural net for training.

> **3.** Finally we define the architecture of our neural net and pass the previously generated dataset to train. 
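
The equal split in step 1 comes down to a little index arithmetic. The sketch below is illustrative: `digit_column_bounds` is a hypothetical helper, and the width of 246 px assumes the sliced frame produced earlier (8/13 of the 400 px output width, rounded down).

```python
def digit_column_bounds(width, n_digits=4):
    """Return the (start, stop) column indices of each equal-width segment."""
    box = width / n_digits
    return [(int(i * box), int((i + 1) * box)) for i in range(n_digits)]


bounds = digit_column_bounds(246)
# bounds == [(0, 61), (61, 123), (123, 184), (184, 246)]
```

Because 246 is not divisible by 4, the crops alternate between 61 and 62 pixels in width, so they need to be brought to a common size before being stacked into one array.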




### 1. Digit detection

We start by constructing a class which we can apply automatically to all the LCD images we extracted.

In [0]:
class cutDigits:

    def __init__(self, image=None, src_file_name=None, dst_folder_name='Datasets_digits', last_digit=4, labels=None):
        """
        The aim of this class is to extract digits from the frame-only preprocessed image.
        We delimit digits by bounding boxes.
        We tried several approaches, but we present here the most successful one, a "dummy" yet efficient approach.
        :param image: RGB image (numpy array NxMx3) of a SLICED SCREEN. If image is None, the image will be extracted from src_filename
        :param src_file_name: filename of a SLICED SCREEN to load the source image (e.g. HQ_digital_preprocessing/0a07d2cff5beb0580bca191427e8cd6e1a0eb678.jpg)
        :param dst_folder_name: home FOLDERname where to save the extracted digits.
        :param last_digit: int, the number of digits you want to extract starting from the left (0 = no digits / 4 = all four digits).
        :param labels: list, list of labels corresponding to the image, e.g. if the image shows 123.45, the labels will be ['x',1,2,3].
        """
        if image is None :
            self.image = cv2.imread(src_file_name)
        else:
            self.image = image
        self.src_file_name = src_file_name
        self.dst_folder_name = dst_folder_name
        self.last_digit=last_digit
        self.labels = labels

        self.box_size = None
        self.boxes = []



    def get_bounding_box_dummy(self):
        """
        Use this method to get bounding boxes and extract numbers by dividing the area in 4 equal parts ("dummy" yet efficient approach).
        """

        self.boxes = []
        self.box_size = self.image.shape[1]/4

        for i in range(self.last_digit):
            inf = i * self.box_size
            sup = (i+1) * self.box_size
            self.boxes += [self.image[:, int(inf):int(sup)]]


    def save_to_folder(self) :
        """
        Use this method to save the extracted bounding boxes.
        """
        if self.dst_folder_name is None :
            return

        for i in range(len(self.boxes)):
            if self.labels :
                box = self.boxes[i]
                label = self.labels[i]
                src_file_name = self.src_file_name.split('/')[-1].split('.')[0]
                dst_file_name = '%s/%s/%s_%s.jpg' % (self.dst_folder_name, label, src_file_name, str(i))
                cv2.imwrite(dst_file_name, box)
                
            else:
                pass

            #else :
          #      box = self.boxes[i]
           #     src_file_name = self.src_file_name.split('/')[-1].split('.')[0]
            #    dst_file_name = 'Datasets_digits/%s/%s_%s.jpg' % ('missing_label', src_file_name, str(i))
            #    cv2.imwrite(dst_file_name, box)


We then apply this class to all the frames extracted and saved in the Datasets_frames folder and save the individual digits in the Datasets_digits folder.

In [0]:
if __name__ == "__main__":
    
    
    if os.path.exists('Datasets_digits/'):
        shutil.rmtree('Datasets_digits/')
        for i in range(0,11):
            os.makedirs('Datasets_digits/%i' %i)
    else:
        for i in range(0,11):
            os.makedirs('Datasets_digits/%i' %i)

    # TODO: check why they fail

    fail = 0

    df = []
    # NB: These 3 datasets were made with Excel
    
    suffix = "csv"
    csv_directory = "/content/drive/My Drive/Data_Analog_Digital/"
    csv_files = [i for i in os.listdir(csv_directory) if i.endswith( suffix )]
    df = []
    for i in range(len(csv_files)):
        data = pd.read_csv(csv_directory +csv_files[i], sep=';', index_col = 0)
        df.append(data)
            
    df = pd.concat(df, axis=0)
    df = df.replace("X", 10)

    for i in range(df.shape[0]):
        line = df.iloc[i]
        labels = [line.cadran_1, line.cadran_2, line.cadran_3, line.cadran_4]
        file_name = line.image
        src_file_name = "content/Datasets_frames/%s" % file_name

        try :
            cutter = cutDigits(src_file_name="/" + src_file_name, labels=labels)
            cutter.get_bounding_box_dummy()
            cutter.save_to_folder()

        except :
            fail += 1

    print(fail)
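
To make the label handling concrete, here is a small sketch of how one row of the label files becomes a list of four class indices. The two sample rows and the name of the first (index) column are invented for illustration; the `;` separator, the `image` and `cadran_1` to `cadran_4` columns, and the mapping of `X` to class 10 follow the cell above.

```python
import csv
import io

# Hypothetical two-row label file; the real files live on Google Drive.
sample = (
    "id;image;cadran_1;cadran_2;cadran_3;cadran_4\n"
    "0;a.jpg;X;1;2;3\n"
    "1;b.jpg;0;4;5;6\n"
)


def read_labels(text):
    """Map each image name to its four labels, with 'X' encoded as class 10."""
    labels = {}
    for row in csv.DictReader(io.StringIO(text), delimiter=';'):
        digits = [row['cadran_%d' % i] for i in range(1, 5)]
        labels[row['image']] = [10 if d == 'X' else int(d) for d in digits]
    return labels


labels = read_labels(sample)
```

Encoding `X` as 10 is what makes 11 output classes necessary in the networks defined later.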

### 2. Dataset generation

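We now need to generate a dataset to train our neural net on. We again use a class to automate this process for all the images we have. Note that the `Dataset_Multi` class defined here loads the whole sliced frames from Datasets_frames together with the four labels of each image; the class that loads the individual digit crops from Datasets_digits is `Dataset_Single`.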

In [0]:
class Dataset:
    
    def __init__(self):
            self.data = self.full_data()
                        
    def full_data(self):
        suffix = ".csv"
        csv_directory = '/content/drive/My Drive/Data_Analog_Digital/'
        csv_files = [i for i in os.listdir(csv_directory) if i.endswith( suffix )]
        full_data = []
        for i in range(len(csv_files)):
            data = pd.read_csv(csv_directory +csv_files[i], sep=';', index_col = 0)
            full_data.append(data)
            
        full_data = pd.concat(full_data, axis=0)
        full_data = full_data.replace("X", 10)
        return full_data
        
class Dataset_Multi(Dataset):
    
    def __init__(self):
        Dataset.__init__(self)
        self.frame_directory = '/content/Datasets_frames/'
        self.frame_data = self.data[self.data["image"].isin(os.listdir(self.frame_directory))]
        
    def convert_to_arrays(self,samples):
        X = []
        for sample in samples:
            ID =  'Datasets_frames/' + "%s" % (sample)
            img = Image.open(ID)
            img = np.array(img)
            img = img.reshape((img.shape[0],img.shape[1],1))
            X.append(img)
        X = np.asarray(X)
        return X

This dataset can then be passed to the neural net for training.
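
A network with four outputs expects one label vector per output rather than one row of four labels per image. `Model_Multi.data_init` does this by selecting the `cadran_1` to `cadran_4` columns; the sketch below shows the same reshaping on plain lists, with `split_by_position` as a hypothetical helper name.

```python
def split_by_position(label_rows):
    """Turn rows of four labels into four per-position label lists."""
    return [list(column) for column in zip(*label_rows)]


rows = [[10, 1, 2, 3], [0, 4, 5, 6]]  # one row per image
per_digit = split_by_position(rows)
# per_digit == [[10, 0], [1, 4], [2, 5], [3, 6]]
```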

### 3. Defining neural net architecture

The architecture of this neural net follows general recommendations found online and in several published articles related to our topic.

In [0]:
class Model(object):
    
    def __init__(self):
        
        self.data_init()
        self.model_init()
    
    def data_init(self):
        pass
    
    def model_init(self):
        pass
    
    def train_predict(self):
        pass

class Model_Multi(Model):
    
    def __init__(self):
        Model.__init__(self)
 
    def data_init(self):
        self.dataset = Dataset_Multi()
        self.data = self.dataset.frame_data
        self.X =  self.data.iloc[:,0]
        self.y = self.data.iloc[:,1:]
        
        self.ids_train, self.ids_val, self.y_train, self.y_val = train_test_split(self.X, self.y, test_size=0.25, random_state=1)        
        self.y_train_vect = [self.y_train["cadran_1"], self.y_train["cadran_2"], self.y_train["cadran_3"], self.y_train["cadran_4"]]
        self.y_val_vect =  [self.y_val["cadran_1"], self.y_val["cadran_2"], self.y_val["cadran_3"], self.y_val["cadran_4"]]
        
        self.X_train = self.dataset.convert_to_arrays(self.ids_train)
        self.X_val = self.dataset.convert_to_arrays(self.ids_val)
              
    def model_init(self):

        model_input = Input((100,246,1))

        x = Conv2D(32, (3, 3), padding='same', name='conv2d_hidden_1', kernel_regularizer=regularizers.l2(0.01))(model_input)
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
        x = MaxPooling2D(pool_size=(2, 2), strides=(3, 3),name='maxpool_2d_hidden_1')(x)
        x = Dropout(0.30)(x)

        x = Conv2D(64, (3, 3), padding='same', name='conv2d_hidden_2', kernel_regularizer=regularizers.l2(0.01))(x)
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
        x = MaxPooling2D(pool_size=(2, 2), strides=(3, 3),name='maxpool_2d_hidden_2')(x)
        x = Dropout(0.30)(x)

        x = Conv2D(128, (3, 3), padding='same', name='conv2d_hidden_3', kernel_regularizer=regularizers.l2(0.01))(x)
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
        x = MaxPooling2D(pool_size=(2, 2), strides=(3, 3),name='maxpool_2d_hidden_3')(x)
        x = Dropout(0.30)(x)

        x = Flatten()(x)

        x = Dense(256, activation ='relu', kernel_regularizer=regularizers.l2(0.01))(x)

        digit1 = (Dense(units=11, activation='softmax', name='digit_1'))(x)
        digit2 = (Dense(units=11, activation='softmax', name='digit_2'))(x)
        digit3 = (Dense(units=11, activation='softmax', name='digit_3'))(x)
        digit4 = (Dense(units=11, activation='softmax', name='digit_4'))(x)

        outputs = [digit1, digit2, digit3, digit4]

        self.model = keras.models.Model(inputs=model_input, outputs=outputs)
        self.model._make_predict_function()
        
    def train(self, lr = 1e-3, epochs=50):
        optimizer = Adam(lr=lr, decay=lr/10)
        self.model.compile(loss="sparse_categorical_crossentropy", optimizer= optimizer, metrics = ['accuracy'])
        keras.backend.get_session().run(tf.initialize_all_variables())
        self.history = self.model.fit(self.X_train, self.y_train_vect, batch_size= 50, epochs=epochs, verbose=1, validation_data=(self.X_val, self.y_val_vect))
        
        
    def plot_loss(self):
        
        for i in range(1,5):
            plt.figure(figsize=[8,6])
            plt.plot(self.history.history['digit_%i_loss' %i],'r',linewidth=0.5)
            plt.plot(self.history.history['val_digit_%i_loss' %i],'b',linewidth=0.5)
            plt.legend(['Training loss', 'Validation Loss'],fontsize=18)
            plt.xlabel('Epochs ',fontsize=16)
            plt.ylabel('Loss',fontsize=16)
            plt.title('Loss Curves Digit %i' %i,fontsize=16)
            plt.show()
        
        
      

    def plot_acc(self):
        
        for i in range(1,5):
            plt.figure(figsize=[8,6])
            plt.plot(self.history.history['digit_%i_acc' %i],'r',linewidth=0.5)
            plt.plot(self.history.history['val_digit_%i_acc' %i],'b',linewidth=0.5)
            plt.legend(['Training Accuracy', 'Validation Accuracy'],fontsize=18)
            plt.xlabel('Epochs ',fontsize=16)
            plt.ylabel('Accuracy',fontsize=16)
            plt.title('Accuracy Curves Digit %i' %i,fontsize=16)
            plt.show()
        

    def predict(self):
        self.y_pred = self.model.predict(self.X_val)
        correct_preds = 0
        
        for i in range(self.X_val.shape[0]):
            pred_list_i = [np.argmax(pred[i]) for pred in self.y_pred]
            val_list_i  = self.y_val.values[i].astype('int')
            if np.array_equal(val_list_i, pred_list_i):
                correct_preds = correct_preds + 1
        print('exact accuracy', correct_preds / self.X_val.shape[0])
            
        mse = 0 
        diff = []
        for i in range(self.X_val.shape[0]):
                pred_list_i = [np.argmax(pred[i]) for pred in self.y_pred]
                pred_number = 1000* pred_list_i[0] + 100* pred_list_i[1] + 10 * pred_list_i[2] + 1* pred_list_i[3]
                val_list_i  = self.y_val.values[i].astype('int')
                val_number = 1000* val_list_i[0] + 100*  val_list_i[1] + 10 *  val_list_i[2] + 1*  val_list_i[3]
                diff.append(val_number - pred_number)
        print('difference label vs. prediction', diff)

    
    def train_predict(self):
        
        self.train()
        self.plot_loss()
        self.plot_acc()
        self.predict()

This class can then be launched with a one-liner to train on all the photos we have. Before doing so, we present the second, simpler methodology we considered, which does not crop each digit. 
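
To get a feel for the size of the network, it helps to track the feature-map shape through the three convolution and pooling blocks. The sketch below assumes Keras defaults for everything not set explicitly in the class: the `padding='same'` convolutions leave height and width unchanged, and `MaxPooling2D` with `pool_size=(2, 2)` and `strides=(3, 3)` uses `'valid'` padding. The helper names are hypothetical.

```python
def pooled_size(size, pool=2, stride=3):
    """Output length of a 'valid' max-pooling layer along one axis."""
    return (size - pool) // stride + 1


def feature_map_shape(height, width, n_blocks=3):
    """Height and width after n_blocks of 'same' convolution + max pooling."""
    for _ in range(n_blocks):
        height, width = pooled_size(height), pooled_size(width)
    return height, width


h, w = feature_map_shape(100, 246)  # 100 -> 33 -> 11 -> 4, 246 -> 82 -> 27 -> 9
flat = h * w * 128                  # 128 filters in the last convolution
```

Under these assumptions the `Flatten` layer receives 4 x 9 x 128 = 4608 values. A stride larger than the pool size means every third row and column is skipped by the pooling windows, which shrinks the maps faster than the more common stride of 2.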

## Method 2: LCD number recognition

In this methodology, we train the neural net to not only classify digits but also find a way to detect them on its own within the extracted LCD. 
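
When a network reads the whole LCD at once, it returns one probability vector per digit position (the four softmax outputs `digit_1` to `digit_4` of `Model_Multi`). The sketch below shows one way to turn those vectors back into a reading. The names `decode_prediction` and `one_hot` are hypothetical, and printing class 10 as `X` is an assumption based on the label encoding used in the csv files.

```python
def decode_prediction(head_probs):
    """Pick the most likely class per position and join them into a string."""
    digits = [max(range(len(p)), key=p.__getitem__) for p in head_probs]
    return ''.join('X' if d == 10 else str(d) for d in digits)


def one_hot(k, n=11):
    """Toy probability vector that puts all mass on class k."""
    return [1.0 if i == k else 0.0 for i in range(n)]


reading = decode_prediction([one_hot(10), one_hot(1), one_hot(2), one_hot(3)])
# reading == 'X123'
```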

### 1. Data generation

In [0]:
class Dataset:
    
    def __init__(self):
            self.data = self.full_data()
                        
    def full_data(self):
        suffix = ".csv"
        csv_directory = '/content/drive/My Drive/Data_Analog_Digital/'
        csv_files = [i for i in os.listdir(csv_directory) if i.endswith( suffix )]
        full_data = []
        for i in range(len(csv_files)):
            data = pd.read_csv(csv_directory +csv_files[i], sep=';', index_col = 0)
            full_data.append(data)
            
        full_data = pd.concat(full_data, axis=0)
        full_data = full_data.replace("X", 10)
        return full_data        
        
class Dataset_Single(Dataset):
    
    def __init__(self):
        Dataset.__init__(self)
        self.digits_directory = '/content/Datasets_digits/'
        self.digits_data = self.digits_data()

    
    def digits_data(self):
        ids = []
        labels = []
        for i in range(11):
            directory = self.digits_directory + '%i/' %i
            for j in os.listdir(directory):
                ids.append(directory+j)
                labels.append(i)
        digits_data = pd.DataFrame(list(zip(ids,labels)))
        
        return digits_data 
                
    def convert_to_arrays(self,samples):
        X = []
        for sample in samples:
            # Load as grayscale and resample to the 256x100 (width x height) network input.
            img = Image.open(sample).convert('L').resize((256, 100))
            img = np.array(img)
            img = img.reshape((img.shape[0],img.shape[1],1))
            X.append(img)
        X = np.asarray(X)
        return X

### 2. Model architecture definition

Again, the architecture of this neural net follows general recommendations found online and in published articles related to our topic.

In [0]:
class Model(object):
    
    def __init__(self):
        
        self.data_init()
        self.model_init()
    
    def data_init(self):
        pass
    
    def model_init(self):
        pass
    
    def train_predict(self):
        pass
        
class Model_Single(Model):
    
    
    def __init__(self):
            Model.__init__(self)

    def data_init(self):
        self.dataset = Dataset_Single()

        self.data = self.dataset.digits_data         
        self.X =  self.data.iloc[:,0]
        self.y = self.data.iloc[:,1]

        self.ids_train, self.ids_val, self.y_train, self.y_val = train_test_split(self.X, self.y, test_size=0.25, random_state=1)
        self.X_train = self.dataset.convert_to_arrays(self.ids_train)
        self.X_val = self.dataset.convert_to_arrays(self.ids_val)

    def model_init(self):


        model_input = Input((100, 256, 1))
        x = Conv2D(32, (3, 3), padding='same', name='conv2d_hidden_1', kernel_regularizer=regularizers.l2(0.01))(model_input)
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
        x = MaxPooling2D(pool_size=(2, 2), strides=(3, 3),name='maxpool_2d_hidden_1')(x)
        x = Dropout(0.30)(x)

        x = Conv2D(64, (3, 3), padding='same', name='conv2d_hidden_2', kernel_regularizer=regularizers.l2(0.01))(x)
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
        x = MaxPooling2D(pool_size=(2, 2), strides=(3, 3),name='maxpool_2d_hidden_2')(x)
        x = Dropout(0.30)(x)

        x = Conv2D(128, (3, 3), padding='same', name='conv2d_hidden_3', kernel_regularizer=regularizers.l2(0.01))(x)
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
        x = MaxPooling2D(pool_size=(2, 2), strides=(3, 3),name='maxpool_2d_hidden_3')(x)
        x = Dropout(0.30)(x)

        x = Flatten()(x)

        x = Dense(1024, activation ='relu', kernel_regularizer=regularizers.l2(0.01))(x)

        output = Dense(units=11, activation='softmax', name='output')(x)

        self.model = keras.models.Model(inputs=model_input, outputs=output)
        self.model._make_predict_function() 

    def train(self, lr = 1e-3, epochs=30):
        optimizer = Adam(lr=lr, decay=lr/10)
        self.model.compile(loss="sparse_categorical_crossentropy", optimizer= optimizer, metrics = ['accuracy'])
        keras.backend.get_session().run(tf.initialize_all_variables())
        self.history = self.model.fit(self.X_train, self.y_train, batch_size= 32, epochs=epochs, verbose=1, validation_data=(self.X_val, self.y_val))


    def plot_acc(self):
        plt.figure(figsize=[8,6])
        plt.plot(self.history.history['acc'],'r',linewidth=0.5)
        plt.plot(self.history.history['val_acc'],'b',linewidth=0.5)
        plt.legend(['Training Accuracy', 'Validation Accuracy'],fontsize=18)
        plt.xlabel('Epochs ',fontsize=16)
        plt.ylabel('Accuracy',fontsize=16)
        plt.title('Accuracy Curves Digit',fontsize=16)
        plt.show()

    def plot_loss(self):        
        plt.figure(figsize=[8,6])
        plt.plot(self.history.history['loss'],'r',linewidth=0.5)
        plt.plot(self.history.history['val_loss'],'b',linewidth=0.5)
        plt.legend(['Training loss', 'Validation Loss'],fontsize=18)
        plt.xlabel('Epochs ',fontsize=16)
        plt.ylabel('Loss',fontsize=16)
        plt.title('Loss Curves Digit',fontsize=16)
        plt.show()

    def predict(self):
        
        self.y_pred = self.model.predict(self.X_val)
        
        ids = []
        pred_list = []
        val_list = []

        for i in range(self.X_val.shape[0]):
            self.val_id = self.ids_val.values[i]
            # Digit crops are saved as <image name>_<position>.jpg, so drop the position suffix.
            ids.append(os.path.basename(self.val_id).rsplit('_', 1)[0])
            pred_list_i = np.argmax(self.y_pred[i]).astype('int')
            pred_list.append(pred_list_i)
            val_list_i  = self.y_val.values[i].astype('int')
            val_list.append(val_list_i) 

        q = []

        for i in np.unique(ids):
            q.append([i, np.where(np.isin(ids,i))[0]])

        correct_count = 0 
        for i in range(len(q)):
            v = []
            p = []
            for j in range(len((q[i][1]))):
                idx = (q[i][1][j])
                val_list_i = val_list[idx]
                pred_list_i = pred_list[idx]
                v.append(val_list_i)
                p.append(pred_list_i)
            if np.array_equal(p, v):
                correct_count = correct_count + 1
        print('real_acc', correct_count / len(q))


    def train_predict(self):

        self.train()
        self.plot_loss()
        self.plot_acc()
        self.predict()

Like the previous one, this class is designed to train on a set of photos automatically with a one-liner.
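
Since a per-digit classifier is scored on single crops, its accuracy is not directly comparable to an accuracy on whole numbers. One way to bridge the gap is to group the digit predictions by source image and count an image as correct only if all of its digits are. The sketch below assumes the `<image name>_<position>.jpg` naming used by `save_to_folder`; the function name and sample paths are hypothetical.

```python
import os
from collections import defaultdict


def per_image_accuracy(paths, y_true, y_pred):
    """Share of source images whose digit crops are all classified correctly."""
    groups = defaultdict(list)
    for path, truth, pred in zip(paths, y_true, y_pred):
        # 'Datasets_digits/1/a_1.jpg' -> 'a'
        image_id = os.path.basename(path).rsplit('_', 1)[0]
        groups[image_id].append(truth == pred)
    correct = sum(all(flags) for flags in groups.values())
    return correct / len(groups)


paths = ['Datasets_digits/1/a_1.jpg', 'Datasets_digits/2/a_2.jpg',
         'Datasets_digits/4/b_1.jpg', 'Datasets_digits/7/b_2.jpg']
acc = per_image_accuracy(paths, [1, 2, 4, 7], [1, 2, 4, 1])
# acc == 0.5: image 'a' is fully correct, image 'b' has one wrong digit
```

Note that a random split at the digit level can leave only some of an image's crops in the validation set, in which case this measure is computed on the crops that are present.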

## Training & Evaluation

In this section we launch and evaluate both our models for comparison. 

### Model 1: Individual digit detection

In [0]:
model_1 = Model_Single()
model_1.train_predict()

### Model 2: LCD number recognition

In [0]:
model_2 = Model_Multi()
model_2.train_predict()