## Improve OCR Results with OpenCV Image Filtering

Image filtering changes the appearance of an image.  In the context of OCR (Optical Character Recognition), filtering can reduce the noise around the characters, which can give a higher OCR success rate than on unfiltered images.

This notebook applies a Gaussian blur filter to an image, then calls the [Azure Computer Vision Read API](https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/vision-api-how-to-topics/call-read-api) to OCR the image.  For challenging cases, this image pre-processing step improves the OCR results.

 **References**:<br/>
 1. https://github.com/RoshanTanisha/OpenCVExamples
 2. https://learnopencv.com/image-filtering-using-convolution-in-opencv/

In [None]:
conda install -c menpo opencv

In [None]:
pip install --upgrade pip

In [None]:
pip install opencv-contrib-python

In [None]:
import os, sys, math
import cv2
import numpy as np
import matplotlib.pyplot as plt
import requests
import json

In [None]:
data_dir_path = os.path.join(os.path.dirname(os.getcwd()), 'code')

In [None]:
# supported image formats, see https://docs.opencv.org/4.5.3/d4/da8/group__imgcodecs.html#ga288b8b3da0892bd651fce07b3bbd3a56

def read_image(image_path):
    image = cv2.imread(image_path)
    return image

In [None]:
def save_image(image, image_name):
    cv2.imwrite(os.path.join(data_dir_path, image_name), image)

In [None]:
# Note: OpenCV uses BGR channel order, so cv2.imread() returns the image in BGR format by default.

def convertBGR2RGB(image):
    return cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
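
To make the BGR versus RGB difference concrete, here is a minimal NumPy-only sketch. It assumes a tiny hand-made two-pixel image (not one of the notebook's files) and shows that the conversion amounts to reversing the channel axis.

```python
import numpy as np

# A 1x2 image in BGR order: a pure-blue pixel and a pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last (channel) axis swaps B and R, giving RGB order.
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # blue pixel in RGB order: [0, 0, 255]
print(rgb[0, 1].tolist())  # red pixel in RGB order: [255, 0, 0]
```

This is why an image read with `cv2.imread()` looks colour-swapped when passed straight to matplotlib, which expects RGB.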

In [None]:
def convertBGR2GRAY(image):
    bw_img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return cv2.cvtColor(bw_img, cv2.COLOR_GRAY2BGR)

In [None]:
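def show_image(image):
    # matplotlib expects RGB, but OpenCV images are BGR, so convert 3-channel images first
    if image.ndim == 3:
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # cmap only affects single-channel (grayscale) images
    plt.imshow(image, cmap='gray')
    plt.show()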

In [None]:
def plot_image(image):
    plt.figure(figsize=(10, 10))
    # use the function argument, not the global variable img
    plt.imshow(image)

In [None]:
# https://learnopencv.com/opencv-threshold-python-cpp/
# pure black=0, pure white=255

def apply_thresholding(image):
    _, thresholded_image = cv2.threshold(image, thresh=40, maxval=255, type=cv2.THRESH_BINARY)
    return thresholded_image
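
As an illustration of what binary thresholding does to pixel values, the sketch below reproduces the rule in plain NumPy: pixels strictly above the threshold become `maxval`, everything else becomes 0. The helper name `threshold_binary` and the sample row are hypothetical and only serve the example.

```python
import numpy as np

def threshold_binary(image, thresh=40, maxval=255):
    """NumPy equivalent of binary thresholding: maxval where pixel > thresh, else 0."""
    return np.where(image > thresh, maxval, 0).astype(np.uint8)

# One row of grayscale pixels around the threshold value.
sample = np.array([[0, 40, 41, 200]], dtype=np.uint8)
result = threshold_binary(sample)

print(result.tolist())  # [[0, 0, 255, 255]]
```

With a threshold of 40, only very dark pixels (such as printed text) stay black and the rest of the page turns white.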

In [None]:
def transform(image, transform_type):
    
    kwargs = {
        'Laplacian': {
            'ddepth': cv2.CV_64F
        }
    }
    
    return getattr(cv2, transform_type)(image, **kwargs[transform_type])

In [None]:
# https://learnopencv.com/image-filtering-using-convolution-in-opencv/#gauss-blur-opencv
# "blurring" is also known as "smoothing" to remove noise from an image

def apply_gaussian_blur(image):
    return cv2.GaussianBlur(image, (5, 5), 0)
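
The `(5, 5)` argument is the kernel size and the trailing `0` asks OpenCV to derive sigma from that size. The sketch below is a NumPy approximation of the kernel this implies, assuming the documented formula `sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8`; OpenCV may use fixed coefficients for small kernels, so treat the exact weights as illustrative. The helper name `gaussian_kernel_1d` is hypothetical.

```python
import numpy as np

def gaussian_kernel_1d(ksize, sigma):
    """Normalized 1-D Gaussian weights centred on the middle tap."""
    offsets = np.arange(ksize) - (ksize - 1) / 2.0
    weights = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    return weights / weights.sum()

ksize = 5
# Sigma derived from the kernel size when sigma is passed as 0 (1.1 for ksize = 5).
sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8

k1d = gaussian_kernel_1d(ksize, sigma)
# A 2-D Gaussian kernel is the outer product of the 1-D kernel with itself.
kernel_2d = np.outer(k1d, k1d)

print(np.round(kernel_2d, 4))
```

The weights sum to 1, so the blur averages each pixel with its neighbours without changing overall brightness; the centre pixel gets the largest weight.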

In [None]:
"""
Apply sharpening using kernel
"""
def apply_filter2D(image):
    kernel3 = np.array([[0, -1,  0],
                        [-1,  5, -1],
                        [0, -1,  0]])
    return cv2.filter2D(src=image, ddepth=-1, kernel=kernel3)
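
To see why this kernel sharpens, the following NumPy-only sketch applies it by hand to two small synthetic patches (both are assumptions made for the example). The helper `correlate_interior` is hypothetical and skips border handling, which the real filter takes care of.

```python
import numpy as np

kernel = np.array([[0, -1, 0],
                   [-1, 5, -1],
                   [0, -1, 0]], dtype=np.float64)

def correlate_interior(image, kernel):
    """Apply a 3x3 kernel to every interior pixel (no border handling)."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for y in range(h - 2):
        for x in range(w - 2):
            out[y, x] = np.sum(image[y:y + 3, x:x + 3] * kernel)
    return out

# A flat patch is left unchanged because the kernel weights sum to 1.
flat = np.full((5, 5), 100.0)
flat_result = correlate_interior(flat, kernel)

# A vertical step edge (50 -> 200) is exaggerated on both sides.
step = np.tile(np.array([50.0, 50.0, 50.0, 200.0, 200.0]), (5, 1))
step_result = correlate_interior(step, kernel)

# 8-bit output saturates to the 0..255 range.
clipped = np.clip(step_result, 0, 255).astype(np.uint8)

print(flat_result[0].tolist())  # [100.0, 100.0, 100.0]
print(step_result[0].tolist())  # [50.0, -100.0, 350.0]
print(clipped[0].tolist())      # [50, 0, 255]
```

The dark side of the edge is pushed darker and the bright side brighter, which increases contrast at character boundaries.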

In [None]:
def apply_bilateral_filter(image):
    return cv2.bilateralFilter(image, 9, 75, 75)

In [None]:
def canny_edges(image):
    return cv2.Canny(image, 100, 200)

In [None]:
# Upscale the image to a new width and height,
# e.g. up_width = 600, up_height = 400
def size_up(image, up_width, up_height):
    up_points = (up_width, up_height)
    return cv2.resize(image, up_points, interpolation=cv2.INTER_LINEAR)

In [None]:
# Scale the image up by specifying both scaling factors,
# e.g. scale_up_x = 1.2, scale_up_y = 1.2.
# Using the same factor for both axes keeps the aspect ratio intact.
def scale_up(image, scale_up_x, scale_up_y):
    return cv2.resize(image, None, fx=scale_up_x, fy=scale_up_y, interpolation=cv2.INTER_LINEAR)

### Apply Image Filter

In [None]:
filename = '10472-7.tif'
img = read_image(os.path.join(data_dir_path, filename))

In [None]:
show_image(img)

In [None]:
filtered_img = apply_gaussian_blur(img)
show_image(filtered_img)

In [None]:
# Read image from file system
#img = cv2.imread(filename)

In [None]:
# encode image as tif
_, img_encoded = cv2.imencode('.tiff', filtered_img)

## Call the [Read API](https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/vision-api-how-to-topics/call-read-api) with image file and process the results by extracting the lines into a text file

In [None]:
# Get environment variables
computer_vision_key = os.getenv('COMPUTER_VISION_KEY')
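
`os.getenv` returns `None` when the variable is missing, which would only surface later as a failed API call. A small guard, sketched here with a hypothetical helper name `require_env`, makes that failure explicit.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value

# In this notebook it could be used as:
# computer_vision_key = require_env('COMPUTER_VISION_KEY')
```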


In [None]:
# Request headers.
headers = {
    'Content-Type': 'image/tiff',
    'Ocp-Apim-Subscription-Key': computer_vision_key
}

In [None]:
vision_url = 'https://westus2.api.cognitive.microsoft.com/vision/v3.2/read/analyze?readingOrder=natural'

### Make the API call

In [None]:
# send http request with image and receive response
response = requests.post(vision_url, data=img_encoded.tobytes(), headers=headers)

In [None]:
#print(response.headers['Operation-Location'])

In [None]:
get_results_url = response.headers['Operation-Location']

###  Get the results

In [None]:
results_response = requests.get(get_results_url, headers=headers)
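
The Read operation is asynchronous, so a single GET issued right after the POST may still report a `running` status and carry no `analyzeResult` yet. Below is a minimal polling sketch. The helper name `poll_read_result`, the attempt limit and the delay are assumptions, and the service is replaced by a stand-in so the example runs without network access.

```python
import time

def poll_read_result(fetch, max_attempts=10, delay_seconds=1.0):
    """Call fetch() until the returned JSON reports 'succeeded' or 'failed'."""
    for _ in range(max_attempts):
        result = fetch()
        status = result.get('status')
        if status == 'succeeded':
            return result
        if status == 'failed':
            raise RuntimeError('Read operation failed')
        # Status is still 'notStarted' or 'running': wait and try again.
        time.sleep(delay_seconds)
    raise TimeoutError('Read operation did not finish in time')

# Stand-in for the service: first 'running', then 'succeeded'.
fake_responses = iter([
    {'status': 'running'},
    {'status': 'succeeded', 'analyzeResult': {'readResults': []}},
])
final = poll_read_result(lambda: next(fake_responses), delay_seconds=0)
print(final['status'])  # succeeded
```

In this notebook, `fetch` could be `lambda: requests.get(get_results_url, headers=headers).json()`, and the returned dictionary would take the place of `data` in the cells that follow.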

In [None]:
json_file = results_response.content.decode('utf-8')

In [None]:
data = json.loads(json_file)

In [None]:
lines = data['analyzeResult']['readResults'][0]['lines']
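
Indexing `readResults[0]` reads only the first page. For multi-page inputs such as multi-page TIFFs, a sketch like the following collects the text of every page; the helper name `extract_text_lines` and the sample payload are hypothetical and only mirror the keys used above.

```python
def extract_text_lines(data):
    """Collect the text of every line on every page of a Read API result."""
    pages = data.get('analyzeResult', {}).get('readResults', [])
    return [line['text'] for page in pages for line in page.get('lines', [])]

# Hand-written sample shaped like the response parsed above.
sample_data = {
    'analyzeResult': {
        'readResults': [
            {'lines': [{'text': 'first page, line 1'}, {'text': 'first page, line 2'}]},
            {'lines': [{'text': 'second page, line 1'}]},
        ]
    }
}

all_text = extract_text_lines(sample_data)
print(all_text)
```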

In [None]:
for line in lines:
    print(line['text'])

### Write the results into a text file

In [None]:
text_filename = filename + '_microsoft.txt'

with open(text_filename, 'w') as f:
    for line in lines:
        print(line['text'])
        f.write(line['text'])
        f.write('\n')

**Author**: Sidney Phoon <br/>
**Date**: Sept 30, 2021