## Image Classification and Text Extraction

### Image Classification
For image classification we use a CNN that assigns each image to one of three classes (samples), defined by the layout of the image. The first sample contains column-based images, the second contains row-based images, and the third contains images with a blank background.

In [2]:
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.preprocessing.image import ImageDataGenerator

Using TensorFlow backend.


In [3]:
# Initialising the CNN
classifier = Sequential()

# Step 1 - Convolution
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))

# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))

# Adding a second convolutional layer
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

# Step 3 - Flattening
classifier.add(Flatten())

# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 3, activation = 'softmax'))

# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])

Instructions for updating:
Colocations handled automatically by placer.
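
To make the layer sizes concrete, the sketch below traces the feature-map size through the network defined above. It assumes the Keras defaults that the cell relies on (`padding='valid'` and stride 1 for `Conv2D`, and a pooling stride equal to the pool size); the helper names `conv_out` and `pool_out` are illustrative and not part of the notebook.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool

size = 64                 # input images are 64 x 64 x 3
size = conv_out(size)     # first Conv2D
size = pool_out(size)     # first MaxPooling2D
size = conv_out(size)     # second Conv2D
size = pool_out(size)     # second MaxPooling2D

flat_units = size * size * 32   # 32 filters in the last convolutional layer
print(size, flat_units)
```

Under these assumptions the `Flatten` layer feeds 6272 values into the 128-unit `Dense` layer, which is where most of the model's parameters sit.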


In [4]:
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

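# class_mode = 'sparse' yields integer labels (0, 1, 2), which is what
# sparse_categorical_crossentropy expects for a three-class problem
training_set = train_datagen.flow_from_directory('images/train/',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'sparse')

test_set = test_datagen.flow_from_directory('images/test/',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'sparse')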

Found 88 images belonging to 3 classes.
Found 32 images belonging to 3 classes.


In [6]:
classifier.fit_generator(training_set,
                         steps_per_epoch = 8000,
                         epochs = 10,
                         validation_data = test_set,
                         validation_steps = 2000)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.callbacks.History at 0x1590a99c0b8>
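
In `fit_generator`, one step is one batch, so `steps_per_epoch = 8000` with only 88 training images makes every epoch cycle through the augmented training set many times. As a rough guide, and assuming the intent is a single pass over the data per epoch, the step counts can be derived from the image counts reported by the generators. The helper name `steps_for` is hypothetical.

```python
import math

def steps_for(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)

train_steps = steps_for(88, 32)   # 88 training images, batch size 32
val_steps = steps_for(32, 32)     # 32 test images, batch size 32
print(train_steps, val_steps)
```

These values could be passed as `steps_per_epoch` and `validation_steps`; whether a single pass per epoch is enough for such a small dataset is a separate tuning decision.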

In [284]:
#accuracy
accuracy=classifier.evaluate_generator(test_set)
print('Accuracy of the model on the test set: ',accuracy[1])

Accuracy of the model on the test set:  1.0


In [195]:
#imports for testing
import numpy as np
from keras.preprocessing import image
training_set.class_indices

{'Sample 1': 0, 'Sample 2': 1, 'Sample 3': 2}

In [196]:
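# Testing on an image from the validation set
# Note: the training generator rescales pixels by 1/255, while this cell feeds
# raw 0-255 values; dividing test_image by 255 would match the training preprocessing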
test_image = image.load_img('images/validation/iCard_021979_1_Daker_Sarah.jpg', target_size = (64, 64))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis = 0)
result = classifier.predict(test_image)
result

array([[0., 0., 1.]], dtype=float32)
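
The prediction is a row of class probabilities, so it can be turned back into a class name by taking the position of the largest value and inverting `class_indices`. The sketch below uses plain Python lists so that it runs on its own; with the notebook's objects, `np.argmax(result[0])` gives the same index. The function name `predicted_label` is illustrative.

```python
# Mapping shown by training_set.class_indices in the cell above
class_indices = {'Sample 1': 0, 'Sample 2': 1, 'Sample 3': 2}
index_to_label = {index: label for label, index in class_indices.items()}

def predicted_label(probabilities):
    """Return the class name with the highest probability."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return index_to_label[best]

print(predicted_label([0.0, 0.0, 1.0]))
print(predicted_label([0.2, 0.7, 0.1]))
```

Taking the largest value also works when the softmax output is not exactly 0 or 1, which an equality test against 1 would miss.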

### Extracting text from Images

Here we normalise each image to improve OCR accuracy, use AWS Textract to extract the text from it, and save the output as a CSV file.

In [307]:
# Import the necessary packages
import os
import shutil

import boto3
import cv2
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pytesseract
from PIL import Image

The helper functions are imported from functions.ipynb. With this approach we can add new functions, or change existing ones as new samples appear, and only need to import the notebook and its functions here.

In [308]:
import import_ipynb
import functions
from functions import image_norm_sample_one,image_norm_sample_two,image_norm_sample_three,ocr
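
The `ocr` helper lives in functions.ipynb and is not shown in this notebook. As a hedged sketch of what a Textract-based version might look like, the code below calls boto3's `detect_document_text` and joins the text of the returned `LINE` blocks. The names `textract_lines` and `ocr_sketch` are hypothetical, and the `{filename: text}` return shape is an assumption based on how the result is later written to the `filename` and `text` columns.

```python
import os

def textract_lines(response):
    """Join the text of every LINE block in a Textract response dict."""
    return ' '.join(
        block['Text']
        for block in response.get('Blocks', [])
        if block.get('BlockType') == 'LINE'
    )

def ocr_sketch(path):
    """Hypothetical stand-in for ocr(): returns {filename: extracted text}."""
    import boto3  # imported lazily; needs AWS credentials and a region configured
    client = boto3.client('textract')
    with open(path, 'rb') as handle:
        response = client.detect_document_text(Document={'Bytes': handle.read()})
    return {os.path.basename(path): textract_lines(response)}

# Offline check of the parsing step with a hand-made response
fake_response = {'Blocks': [
    {'BlockType': 'PAGE'},
    {'BlockType': 'LINE', 'Text': 'first line'},
    {'BlockType': 'WORD', 'Text': 'first'},
    {'BlockType': 'LINE', 'Text': 'second line'},
]}
print(textract_lines(fake_response))
```

Keeping the response parsing separate from the network call makes the parsing step testable without AWS access.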

The classification_move() function classifies an image, normalises it, and extracts its text in a single call.

In [309]:
def classification_move(imgpath):
    """Classify one image, normalise it for its class, and run OCR on it."""
    test_image = image.load_img(imgpath, target_size = (64, 64))
    test_image = image.img_to_array(test_image)
    # Match the 1/255 rescaling applied by the training generator
    test_image = np.expand_dims(test_image, axis = 0) / 255.0
    result = classifier.predict(test_image)

    # Pick the most probable class instead of testing for an exact 1.0,
    # which only holds when the softmax output is fully saturated
    predicted = int(np.argmax(result[0]))
    # Index order follows training_set.class_indices (Sample 1, 2, 3)
    normalisers = (image_norm_sample_one, image_norm_sample_two, image_norm_sample_three)

    print('Normalising the image...')
    path = normalisers[predicted](imgpath)
    print('OCR in progress...')
    d = ocr(path)
    return d

This is the main function to call: ocr_process() runs all the steps above on every image in a folder and writes the results to a CSV file.

In [310]:
# Paths
currentpath = os.getcwd()
basepath = os.path.join(currentpath, 'ocr_test')
csvpath = os.path.join(currentpath, 'csv')
normalisedpath = os.path.join(currentpath, 'normalised')
backupnorm = os.path.join(currentpath, 'backup_normalised')
filename = 'file1'
savepath = os.path.join(csvpath, filename + ".csv")

def ocr_process(basepath):
    """Classify, normalise and OCR every image in basepath, then export a CSV."""
    rows = []
    for entry in os.listdir(basepath):
        imgpath = os.path.join(basepath, entry)
        print('Classification in progress...')
        d = classification_move(imgpath)
        # d maps filename -> extracted text; collect its (filename, text) pairs
        rows.extend(d.items())
    # Build the frame once; DataFrame.append is deprecated and slow in a loop
    df = pd.DataFrame(rows, columns=['filename', 'text'])
    # Optional: title-case the extracted text
    #df.text = df.text.str.title()
    df.to_csv(savepath, index=False)

    # Move the normalised images to the backup folder
    os.makedirs(backupnorm, exist_ok=True)
    for f in os.listdir(normalisedpath):
        shutil.move(os.path.join(normalisedpath, f), backupnorm)

    print('Exported the file contents to csv, path:/csv/')

In [311]:
ocr_process(basepath)

Classification in progress...
Normalising the image...
OCR in progress...
D:\Analytics\Quarter 4\Applications of AI\Final project\normalised\iCard_021979_1_Daker_Sarah.jpg
iCard_021979_1_Daker_Sarah.jpg
Classification in progress...
Normalising the image...
OCR in progress...
D:\Analytics\Quarter 4\Applications of AI\Final project\normalised\iCard_021982_1_Dakin_Hannah_Lois.jpg
iCard_021982_1_Dakin_Hannah_Lois.jpg
Classification in progress...
Normalising the image...
OCR in progress...
D:\Analytics\Quarter 4\Applications of AI\Final project\normalised\iCard_021988_1_Dako_Martha.jpg
iCard_021988_1_Dako_Martha.jpg
Exported the file contents to csv, path:/csv/
