
**Created by Sanskar Hasija**

**Keras-OCR VS EasyOCR VS PYTESSERACT**

**24 August 2021**


# <center> Keras-OCR VS EasyOCR VS PYTESSERACT </center>

### [1. KERAS-OCR](#kerasocr) ###
### [2. EASYOCR](#easyocr) ###
### [3. PYTESSERACT](#pytesseract) ###
## [Conclusions](#conclusions) ##

## Installing Keras-ocr

In [None]:
from IPython.display import clear_output
!pip install keras-ocr
clear_output()

## IMPORTS

In [None]:
import pytesseract
import keras_ocr
import easyocr
import matplotlib.pyplot as plt
import cv2
%matplotlib inline

## TEST IMAGES

In [None]:
url = [
    "https://raw.githubusercontent.com/sanskar-hasija/ocr-comparision/main/test_images/image1.png",
    "https://raw.githubusercontent.com/sanskar-hasija/ocr-comparision/main/test_images/image2.png",
    "https://raw.githubusercontent.com/sanskar-hasija/ocr-comparision/main/test_images/image3.png",
    "https://raw.githubusercontent.com/sanskar-hasija/ocr-comparision/main/test_images/image4.png"
]
# keras_ocr.tools.read accepts a file path or a URL and returns an RGB array
images = [keras_ocr.tools.read(u) for u in url]

In [None]:
# Show the four test images in a 2x2 grid
titles = ["First Image", "Second Image", "Third Image", "Fourth Image"]
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(16, 10))
for ax, image, title in zip(axs.flat, images, titles):
    ax.imshow(image)
    ax.axis('off')
    ax.set_title(title)

<a id="kerasocr"></a>
# KERAS-OCR

In [None]:
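# Create the pipeline (text detector + recognizer with pretrained weights)
pipeline = keras_ocr.pipeline.Pipeline()
# One list of (word, box) predictions per input image
kerasocr_preds = pipeline.recognize(images)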
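
The `recognize` call returns one prediction list per input image, and each entry in that list is a `(word, box)` tuple in which `box` holds the four corner points of the detected word. The sketch below is not part of the original notebook; it shows one possible way to turn such a list into plain text so it can be read next to the output of the other two tools. The helper name `predictions_to_text` and the `line_tolerance` parameter are illustrative, and the row-grouping heuristic is an assumption that only suits roughly horizontal text.

```python
def predictions_to_text(predictions, line_tolerance=10):
    """Join (word, box) pairs into text, top-to-bottom then left-to-right.

    `box` is any sequence of four (x, y) corner points, so both NumPy arrays
    and plain lists work. A word starts a new line when its top edge is more
    than `line_tolerance` pixels below the first word of the current line
    (a simple heuristic, not a layout analysis).
    """
    words = []
    for word, box in predictions:
        left = min(float(point[0]) for point in box)
        top = min(float(point[1]) for point in box)
        words.append((top, left, word))
    words.sort()  # sort by top edge first, then by left edge

    lines = []        # each entry: list of (left, word) on one text line
    line_top = None   # top edge of the first word in the current line
    for top, left, word in words:
        if line_top is None or top - line_top > line_tolerance:
            lines.append([])
            line_top = top
        lines[-1].append((left, word))

    return "\n".join(
        " ".join(word for _, word in sorted(line)) for line in lines
    )


# Hand-made boxes in the same (word, box) layout, for illustration only.
sample = [
    ("world", [[60, 12], [110, 12], [110, 30], [60, 30]]),
    ("hello", [[5, 10], [50, 10], [50, 28], [5, 28]]),
    ("again", [[5, 50], [55, 50], [55, 70], [5, 70]]),
]
print(predictions_to_text(sample))
```

With the notebook's variables this would be called as `predictions_to_text(kerasocr_preds[0])` for the first image.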

### Keras-ocr plots boxes of detected text with annotations on the input image.

## Results of Keras-OCR

In [None]:
# One subplot per image, each with its detected boxes and words drawn on top
fig, axs = plt.subplots(nrows=4, figsize=(30, 30))
for ax, image, prediction in zip(axs, images, kerasocr_preds):
    keras_ocr.tools.drawAnnotations(image, prediction, ax)

<a id="easyocr"></a>
# EASYOCR

In [None]:
text_reader = easyocr.Reader(['en'])  # Initialize the OCR reader for English

## Results of EasyOCR

### First Image

In [None]:
results = text_reader.readtext(images[0])
for (bbox, text, prob) in results:
    print(text)
plt.imshow(images[0])
plt.title("First Image");

### Second Image

In [None]:
results = text_reader.readtext(images[1])
for (bbox, text, prob) in results:
    print(text)
plt.imshow(images[1])
plt.title("Second Image");

### Third Image

In [None]:
results = text_reader.readtext(images[2])
for (bbox, text, prob) in results:
    print(text)
plt.imshow(images[2])
plt.title("Third Image");

### Fourth Image

In [None]:
results = text_reader.readtext(images[3])
for (bbox, text, prob) in results:
    print(text)
plt.imshow(images[3])
plt.title("Fourth Image");
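
Each cell above unpacks the `readtext` results as `(bbox, text, prob)` but prints only `text`. The `prob` value is EasyOCR's confidence for that detection, so it can be used to drop unreliable fragments before comparing tools. A minimal sketch, not part of the original notebook; the helper name `confident_text` and the 0.5 cutoff are arbitrary illustrative choices:

```python
def confident_text(results, min_prob=0.5):
    """Return the text of detections whose confidence is at least `min_prob`."""
    return [text for _bbox, text, prob in results if prob >= min_prob]


# Hand-made detections in the (bbox, text, prob) layout, for illustration only.
sample_results = [
    ([[0, 0], [40, 0], [40, 12], [0, 12]], "TOTAL", 0.98),
    ([[45, 0], [80, 0], [80, 12], [45, 12]], "12.50", 0.91),
    ([[0, 20], [15, 20], [15, 30], [0, 30]], "~|", 0.07),
]
print(confident_text(sample_results))
```

In the notebook this could be applied as `confident_text(text_reader.readtext(images[0]))`; a suitable cutoff would have to be tuned per image set.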

<a id="pytesseract"></a>
# PYTESSERACT

In [None]:
tesseract_preds = []
for img in images:
    tesseract_preds.append(pytesseract.image_to_string(img))
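
`image_to_string` returns a single raw string per image, which usually contains blank lines and, depending on the Tesseract version, a trailing form-feed character (`\x0c`). The sketch below is not part of the original notebook; it shows one way to tidy such a string into a list of non-empty lines. The helper name `clean_ocr_text` and the sample string are made up for illustration.

```python
def clean_ocr_text(raw):
    """Strip surrounding whitespace from each line and drop empty lines."""
    lines = (line.strip() for line in raw.splitlines())
    return [line for line in lines if line]


# Made-up string shaped like typical raw OCR output, for illustration only.
sample_raw = "Invoice 001\n\n  Total: 12.50 \n\x0c"
print(clean_ocr_text(sample_raw))
```

In the notebook this could be applied as `clean_ocr_text(tesseract_preds[0])`.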

## Results of Pytesseract

### Image 1

In [None]:
print(tesseract_preds[0])
plt.imshow(images[0])
plt.title("First Image");

### Image 2

In [None]:
print(tesseract_preds[1])
plt.imshow(images[1])
plt.title("Second Image");

### Image 3

In [None]:
print(tesseract_preds[2])
plt.imshow(images[2])
plt.title("Third Image");

### Image 4

In [None]:
print(tesseract_preds[3])
plt.imshow(images[3])
plt.title("Fourth Image");
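
The outputs of the three tools are compared by eye in this notebook. If a hand-typed reference transcription of an image were available (none is included here), a rough score could be computed with Python's standard `difflib` module. The sketch below is an assumption about how such a check might look, not the method used in this notebook; the helper names, the reference string, and both sample outputs are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse all whitespace runs to single spaces."""
    return " ".join(text.lower().split())


def similarity(ocr_text, reference):
    """Return a similarity ratio in [0, 1] between OCR output and a reference."""
    return SequenceMatcher(None, normalize(ocr_text), normalize(reference)).ratio()


# Hypothetical reference and OCR outputs; none of these come from the notebook.
reference = "Total due: 12.50"
outputs = {
    "tool_a": "TOTAL  DUE: 12.50",
    "tool_b": "Total due 12.5O",
}
for name, text in outputs.items():
    print(name, round(similarity(text, reference), 3))
```

The ratio ignores case and spacing differences because of `normalize`; whether that is appropriate depends on what the OCR output is used for.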

<a id="conclusions"></a>
# CONCLUSIONS

### * Keras-OCR is an image-specific OCR tool, suited to text that sits inside an image with unorganized fonts and colors. It is time-consuming when run on a CPU.
### * EasyOCR is a lightweight model that performs well for receipt and PDF conversion. It gives more accurate results on organized text such as PDF files, receipts, and bills, and it also performs well on noisy images.
### * Pytesseract performs well on high-resolution images. Morphological operations such as dilation and erosion, as well as Otsu binarization, can improve its performance. It also gives better results on handwritten text than EasyOCR.
### * All of these results can be improved further by applying image-specific preprocessing.
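
Otsu binarization, mentioned above, picks the gray-level threshold that best separates dark pixels from bright ones by maximizing the between-class variance of the histogram. OpenCV exposes it through `cv2.threshold` with the `cv2.THRESH_OTSU` flag; the pure-Python sketch below is only meant to show what that flag computes, on a tiny made-up list of pixel values. The function name `otsu_threshold` is illustrative and the code is not part of the original notebook.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form the dark class, the rest the bright class.
    `pixels` is a flat iterable of integer gray values in [0, levels).
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    total = sum(histogram)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    dark_count, dark_sum = 0, 0
    for t in range(levels):
        dark_count += histogram[t]
        dark_sum += t * histogram[t]
        bright_count = total - dark_count
        if dark_count == 0 or bright_count == 0:
            continue  # one class is empty, so there is no split at this level
        dark_mean = dark_sum / dark_count
        bright_mean = (total_sum - dark_sum) / bright_count
        # Between-class variance, up to a constant factor of 1 / total**2
        variance = dark_count * bright_count * (dark_mean - bright_mean) ** 2
        if variance > best_variance:
            best_threshold, best_variance = t, variance
    return best_threshold


# Made-up gray values: dark "ink" near 20-30, bright "paper" near 200-220.
pixels = [20, 25, 30, 22, 28, 200, 210, 220, 205, 215, 208, 212]
t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]
print(t, binary)
```

On a real image the same idea would be applied to the grayscale pixel array before passing the binarized result to `pytesseract.image_to_string`; whether it helps depends on the image.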

# <center>If you find this notebook useful, support with an upvote!</center>