# Digital Meter Reader 
This notebook shows how to create a meter reader with OpenVINO Runtime. We use the pre-trained [PP-OCR](https://github.com/PaddlePaddle/PaddleOCR) model to build an inference pipeline:

1. Configure the screen area of the meter reader.
2. Configure the layout information of the meter reader.
3. Preprocess the image based on the given information.
4. Perform OCR recognition.
5. Structure output information.

This notebook demonstrates how to read meters with different screens. Case 1 is a field strength meter with a general LCD screen, where the screen shows only plain text. Case 2 is a household electricity meter with a segment code LCD screen, where we see not only the text but also the outline of the unlit segments in the background. The model can misread this background outline as content to be recognized, so we add a preprocessing module to eliminate the background information.

The overall process is similar for the two cases; only the preprocessing differs between meters. The following flowchart shows the two types of meter readers.

<img align='center' src= "https://user-images.githubusercontent.com/83450930/239390678-97ad22ad-5275-41f2-bb8a-2af83e8af0af.png" alt="drawing" width="1500"/>

In some cases, the screen area in the image is not in a fixed position. A detection model can be used to dynamically provide the screen area information. Please see [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection) for more details.

The tutorial consists of the following steps:
1. Prepare the PaddleOCR model.
2. Define configuration and helper functions
    - Prepare some configuration (take case 1 as an example)
    - Define a helper function to apply an affine transformation
    - Define helper functions for the preprocessing of text recognition
    - Define helper functions for the postprocessing of text recognition
3. Main Function without Segment code LCD
    - Download Image
    - Apply the affine transformation to the image
    - Define the areas to be recognized
    - Recognize the text on the screen
    - Postprocessing for fixing the errors in recognition
4. Main Function with Segment code LCD
    - Prepare the Image, some configurations, and the special preprocessing of LCD screen
    - Recognize the text on the screen (containing the affine transformation)
    - Postprocessing for fixing the errors in recognition

## Imports

In [None]:
import os
import cv2
import numpy as np
import string
import sys
import math
import matplotlib.pyplot as plt
import tarfile
import copy

from openvino.runtime import Core

sys.path.append("../utils")
import notebook_utils as utils

## PaddleOCR with OpenVINO™

[PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) is an ultra-light OCR model trained with PaddlePaddle deep learning framework, which aims to create multilingual and practical OCR tools. 

The PaddleOCR pre-trained model used in the demo refers to the *"Chinese and English ultra-lightweight PP-OCR model (9.4M)"*. More open source pre-trained models can be downloaded at [PaddleOCR Github](https://github.com/PaddlePaddle/PaddleOCR) or [PaddleOCR Gitee](https://gitee.com/paddlepaddle/PaddleOCR).

A standard PaddleOCR pipeline includes two deep learning models: text detection and text recognition. This notebook only needs the text recognition part. To run the model, we first initialize the runtime for inference, then read the network architecture and model weights from the `.pdmodel` and `.pdiparams` files and load them to the CPU.

More details for running PaddleOCR with OpenVINO™ are shown in [405-paddle-ocr-webcam](../405-paddle-ocr-webcam/405-paddle-ocr-webcam.ipynb).

### Download the Model for Text **Recognition**

The pre-trained models used in the demo are downloaded and stored in the "model" folder.

In [None]:
MODEL_DIR = "model"
RECOGNITION_MODEL_LINK = "https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar"
RECOGNITION_FILE_NAME = RECOGNITION_MODEL_LINK.split("/")[-1]

# download file
os.makedirs(MODEL_DIR, exist_ok=True)
utils.download_file(RECOGNITION_MODEL_LINK, directory=MODEL_DIR, show_progress=True)
# extractall() returns None on success and raises an exception on failure.
with tarfile.open(f"{MODEL_DIR}/{RECOGNITION_FILE_NAME}") as file:
    file.extractall(MODEL_DIR)
print(f"Recognition model extracted to \"./{MODEL_DIR}\".")

### Load the Model for Text **Recognition** with Dynamic Shape

The input to the text recognition model is a set of detected bounding boxes with different image sizes, that is, the input shape is dynamic. Hence:

1. Input dimension with dynamic input shapes needs to be specified before loading the text recognition model.
2. Dynamic shape is specified by assigning -1 to the input dimension or by setting the upper bound of the input dimension using, for example, `Dimension(1, 512)`.

>Note: The text recognition model uses a dynamic input shape, and OpenVINO 2022.2 does not support dynamic shapes on iGPU, so you cannot directly switch the device to iGPU for inference in this case. As a workaround, you can resize the input images to a fixed size and then try running inference on iGPU.

In [None]:
# Initialize OpenVINO Runtime for text recognition.
core = Core()

# Read the model and corresponding weights from a file.
rec_model_file_path = f"{MODEL_DIR}/ch_PP-OCRv3_rec_infer/inference.pdmodel"
rec_model = core.read_model(model=rec_model_file_path)

# Assign dynamic shapes to every input layer on the last dimension.
for input_layer in rec_model.inputs:
    input_shape = input_layer.partial_shape
    input_shape[3] = -1
    rec_model.reshape({input_layer: input_shape})

rec_compiled_model = core.compile_model(model=rec_model, device_name="CPU")

# Get input and output nodes.
rec_input_layer = rec_compiled_model.input(0)
rec_output_layer = rec_compiled_model.output(0)

## Configuration and Helper functions

To structure the output, we first configure some parameters (for example, the coordinates of the corners of the screen).

Then, use the following helper functions for preprocessing and postprocessing frames:

1. Preprocessing the input image: use affine transformations to normalize skewed images.
2. Preprocessing for text recognition: resize and normalize detected box images to the same size (for example, `(3, 48, 320)`, matching the `rec_image_shape` used in the code below) for easy batching in inference.
3. Postprocessing for structure output: fix some errors in recognition.
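
The resize-and-normalize step in item 2 comes down to simple arithmetic. The sketch below is a simplified, dependency-free illustration rather than the notebook's implementation: `target_width` and `normalize_pixel` are hypothetical helper names, and the fixed height of 48 is assumed from the `rec_image_shape` defined later in this notebook.

```python
import math

REC_HEIGHT = 48  # assumed model input height, taken from rec_image_shape


def target_width(h, w, max_width):
    """Width of a (h, w) crop after resizing to REC_HEIGHT, capped at max_width."""
    ratio = w / float(h)
    return min(max_width, int(math.ceil(REC_HEIGHT * ratio)))


def normalize_pixel(value):
    """Map an 8-bit pixel value from [0, 255] to [-1.0, 1.0]."""
    return (value / 255.0 - 0.5) / 0.5


print(target_width(50, 300, max_width=1000))     # 288
print(target_width(50, 300, max_width=200))      # 200 (capped)
print(normalize_pixel(0), normalize_pixel(255))  # -1.0 1.0
```

Crops narrower than the batch width are padded with zeros on the right, so every image in a batch ends up with the same shape.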

### Configuration

There are four pieces of information to configure:
1. POINTS: the edge of the meter's screen, which can be determined by four corners
2. DESIGN_SHAPE: the original shape of the screen
3. RESULT_TEMP: a template of the output, which is a dictionary. The keys in the `dict` are the output features.
4. DESIGN_LAYOUT: the elements and their positions in the screen, which is in a standard DESIGN_SHAPE. Commonly, after the image is converted into a regular image, a frame can be drawn directly on the image to delineate the position of each element. So we define the DESIGN_LAYOUT `dict` after `Preprocessing the input image`. 

> Note: 'DESIGN_SHAPE' and 'DESIGN_LAYOUT' reflect the fact that the layout of a display screen is fixed at design time and is independent of the image. Therefore, you can obtain the 'DESIGN_LAYOUT' information from your device manual.
However, a simpler way to run the program is to read the coordinates of some points in the image and use them as 'DESIGN_LAYOUT'. You can use `Paint`, a Windows application, to obtain the coordinates: move the mouse to a position on the image, and the tool displays the corresponding coordinates.
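
Because layout coordinates are read off by hand, it can help to check them before running recognition. The following is a minimal sketch under stated assumptions: `check_layout` is a hypothetical helper, boxes are `[xmin, ymin, xmax, ymax]`, and the design shape is `(width, height)`. The first two boxes reuse values from case 1; the `"Broken"` entry is invented for illustration.

```python
def check_layout(design_layout, design_shape):
    """Return the names of boxes that are empty or fall outside the screen."""
    width, height = design_shape
    bad = []
    for name, (xmin, ymin, xmax, ymax) in design_layout.items():
        inside = 0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height
        if not inside:
            bad.append(name)
    return bad


design_shape = (1300, 1000)  # (width, height), as in case 1
layout = {
    "Info_Probe": [14, 36, 410, 135],
    "Unit": [1064, 291, 1295, 403],
    "Broken": [1200, 900, 1400, 950],  # hypothetical box exceeding the width
}
print(check_layout(layout, design_shape))  # ['Broken']
```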

In the example image below, there are 8 features to output.
1. Info_Probe: Probe information of the power frequency field strength meter. "探头:---" ("Probe: ---") means there is no information about the probe.
2. Freq_Set: The working frequency. "实时值" ("real-time value") is constant text that tells the user the preceding numbers are real-time.
3. Val_Total: The measured value. "无探头" ("no probe") means there is no information about the probe; otherwise, this field holds a float value.
4. Val_X: Value from the x-axis.
5. Val_Y: Value from the y-axis.
6. Val_Z: Value from the z-axis.
7. Unit
8. Field: One of conventional, electric field, or magnetic field. The Chinese word "电场" means electric field.

<img align='center' src= "https://user-images.githubusercontent.com/83450930/236680146-5751e291-d509-4d71-a2cb-bfbf35609051.jpg" alt="drawing" width="200"/>

In [None]:
# The coordinates of the corners of the screen in case 1
POINTS = [[1121, 56],    # Left top
          [3242, 183],   # right top
          [3040, 1841],  # right bottom
          [1000, 1543]]   # left bottom

# The size of the screen in case 1
DESIGN_SHAPE = (1300, 1000)

# Output template in case 1
RESULT_TEMP = {"Info_Probe":"探头:---", 
               "Freq_Set":"", 
               "Val_Total":"无探头", 
               "Val_X":"", 
               "Val_Y":"", 
               "Val_Z":"", 
               "Unit":"A/m", 
               "Field":"常规"}

### Preprocessing the input image

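Normalize skewed images by warping the screen area to the design shape. Although this notebook calls the step an affine transformation, the code uses OpenCV's perspective transform (`cv2.getPerspectiveTransform` and `cv2.warpPerspective`), which maps the four screen corners onto a rectangle. After the preprocessing, the DESIGN_LAYOUT can be given manually.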
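
To make the corner-to-rectangle mapping concrete, here is a dependency-free sketch that solves for the 3x3 matrix directly from four point pairs. It illustrates the underlying math and is not how OpenCV is implemented; `solve_linear`, `perspective_matrix`, and `warp_point` are hypothetical names, and the corner values are the case 1 `POINTS` and `DESIGN_SHAPE` from this notebook.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def perspective_matrix(src, dst):
    """Flat list of 9 coefficients mapping four src points to four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b) + [1.0]  # last coefficient fixed to 1


def warp_point(mat, x, y):
    """Apply the perspective matrix to a single point."""
    d = mat[6] * x + mat[7] * y + mat[8]
    return ((mat[0] * x + mat[1] * y + mat[2]) / d,
            (mat[3] * x + mat[4] * y + mat[5]) / d)


corners = [[1121, 56], [3242, 183], [3040, 1841], [1000, 1543]]  # case 1 POINTS
target_w, target_h = 1300, 1000  # case 1 DESIGN_SHAPE
target = [[0, 0], [target_w, 0], [target_w, target_h], [0, target_h]]
H = perspective_matrix(corners, target)

# Each screen corner should land on the matching corner of the design rectangle.
for x, y in corners:
    u, v = warp_point(H, x, y)
    print(round(u), round(v))
```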

In [None]:
def pre_processing(img, point_list, target_shape):
    """
    Preprocessing function for normalizing skewed images.
    Parameters:
        img (np.ndarray): Input image.
        point_list (List[List[int]]): Coordinates of the four screen corners, in the order left top, right top, right bottom, left bottom.
        target_shape (Tuple(int, int)): The design shape.
    """
    
    # affine transformations
    # point list is the coordinates of the corners of the screen
    # target shape is the design shape
    
    target_w, target_h = target_shape
    pts1 = np.float32(point_list)
    pts2 = np.float32([[0, 0],[target_w,0],[target_w, target_h],[0,target_h]])
    
    M = cv2.getPerspectiveTransform(pts1, pts2)
    img2 = cv2.warpPerspective(img, M, (target_w,target_h))
    return img2

### Preprocessing Image Functions for Text Recognition

In [None]:
# Preprocess for text recognition.
def resize_norm_img(img, max_wh_ratio):
    """
    Resize input image for text recognition

    Parameters:
        img: bounding box image from text detection 
        max_wh_ratio: value for the resizing for text recognition model
    """
    rec_image_shape = [3, 48, 320]
    imgC, imgH, imgW = rec_image_shape
    assert imgC == img.shape[2]
    character_type = "ch"
    if character_type == "ch":
        imgW = int((32 * max_wh_ratio))
    h, w = img.shape[:2]
    ratio = w / float(h)
    if math.ceil(imgH * ratio) > imgW:
        resized_w = imgW
    else:
        resized_w = int(math.ceil(imgH * ratio))
    resized_image = cv2.resize(img, (resized_w, imgH))
    resized_image = resized_image.astype('float32')
    resized_image = resized_image.transpose((2, 0, 1)) / 255
    resized_image -= 0.5
    resized_image /= 0.5
    padding_im = np.zeros((imgC, imgH, imgW), dtype=np.float32)
    padding_im[:, :, 0:resized_w] = resized_image
    return padding_im


def get_rotate_crop_image(img, points):
    '''
    img_height, img_width = img.shape[0:2]
    left = int(np.min(points[:, 0]))
    right = int(np.max(points[:, 0]))
    top = int(np.min(points[:, 1]))
    bottom = int(np.max(points[:, 1]))
    img_crop = img[top:bottom, left:right, :].copy()
    points[:, 0] = points[:, 0] - left
    points[:, 1] = points[:, 1] - top
    '''
    assert len(points) == 4, "shape of points must be 4*2"
    img_crop_width = int(
        max(
            np.linalg.norm(points[0] - points[1]),
            np.linalg.norm(points[2] - points[3])))
    img_crop_height = int(
        max(
            np.linalg.norm(points[0] - points[3]),
            np.linalg.norm(points[1] - points[2])))
    pts_std = np.float32([[0, 0], [img_crop_width, 0],
                          [img_crop_width, img_crop_height],
                          [0, img_crop_height]])
    M = cv2.getPerspectiveTransform(points, pts_std)
    dst_img = cv2.warpPerspective(
        img,
        M, (img_crop_width, img_crop_height),
        borderMode=cv2.BORDER_REPLICATE,
        flags=cv2.INTER_CUBIC)
    dst_img_height, dst_img_width = dst_img.shape[0:2]
    if dst_img_height * 1.0 / dst_img_width >= 1.5:
        dst_img = np.rot90(dst_img)
    return dst_img


def prep_for_rec(dt_boxes, frame):
    """
    Preprocessing of the detected bounding boxes for text recognition

    Parameters:
        dt_boxes: detected bounding boxes from text detection 
        frame: original input frame 
    """
    ori_im = frame.copy()
    img_crop_list = [] 
    for bno in range(len(dt_boxes)):
        tmp_box = copy.deepcopy(dt_boxes[bno])
        img_crop = get_rotate_crop_image(ori_im, tmp_box)
        img_crop_list.append(img_crop)
        
    img_num = len(img_crop_list)
    # Calculate the aspect ratio of all text bars.
    width_list = []
    for img in img_crop_list:
        width_list.append(img.shape[1] / float(img.shape[0]))
    
    # Sorting can speed up the recognition process.
    indices = np.argsort(np.array(width_list))
    return img_crop_list, img_num, indices


def batch_text_box(img_crop_list, img_num, indices, beg_img_no, batch_num):
    """
    Batch for text recognition

    Parameters:
        img_crop_list: processed detected bounding box images 
        img_num: number of bounding boxes from text detection
        indices: sorting for bounding boxes to speed up text recognition
        beg_img_no: the beginning number of bounding boxes for each batch of text recognition inference
        batch_num: number of images for each batch
    """
    norm_img_batch = []
    max_wh_ratio = 0
    end_img_no = min(img_num, beg_img_no + batch_num)
    for ino in range(beg_img_no, end_img_no):
        h, w = img_crop_list[indices[ino]].shape[0:2]
        wh_ratio = w * 1.0 / h
        max_wh_ratio = max(max_wh_ratio, wh_ratio)
    for ino in range(beg_img_no, end_img_no):
        norm_img = resize_norm_img(img_crop_list[indices[ino]], max_wh_ratio)
        norm_img = norm_img[np.newaxis, :]
        norm_img_batch.append(norm_img)

    norm_img_batch = np.concatenate(norm_img_batch)
    norm_img_batch = norm_img_batch.copy()
    return norm_img_batch

### Postprocessing Image Functions for Text Recognition

The text recognition model outputs text indices, which must be converted to characters. The following code constructs a decoder for the conversion.
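
The core of the conversion is greedy CTC decoding: drop the blank index and collapse consecutive repeats. The sketch below shows that idea in isolation, assuming index 0 is the blank; `ctc_greedy_decode` and the toy `characters` list are hypothetical and much smaller than the real dictionary file.

```python
def ctc_greedy_decode(indices, characters, blank=0):
    """Collapse repeated indices, drop blanks, and map indices to characters."""
    decoded = []
    previous = None
    for idx in indices:
        if idx != blank and idx != previous:
            decoded.append(characters[idx])
        previous = idx
    return "".join(decoded)


# Hypothetical toy dictionary; index 0 is reserved for the CTC blank.
characters = ["blank", "1", "2", ".", "5"]
print(ctc_greedy_decode([1, 1, 0, 2, 0, 3, 3, 4, 0, 4], characters))  # 12.55
```

The blank between the two `4` indices is what allows a genuinely repeated character ("55") to survive the duplicate removal.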

In [None]:
class RecLabelDecode(object):
    """ Convert between text-label and text-index """

    def __init__(self,
                 character_dict_path=None,
                 character_type='ch',
                 use_space_char=False):
        support_character_type = [
            'ch', 'en', 'EN_symbol', 'french', 'german', 'japan', 'korean',
            'it', 'xi', 'pu', 'ru', 'ar', 'ta', 'ug', 'fa', 'ur', 'rs', 'oc',
            'rsc', 'bg', 'uk', 'be', 'te', 'ka', 'chinese_cht', 'hi', 'mr',
            'ne', 'EN', 'latin', 'arabic', 'cyrillic', 'devanagari'
        ]
        assert character_type in support_character_type, "Only {} are supported now but get {}".format(
            support_character_type, character_type)

        self.beg_str = "sos"
        self.end_str = "eos"

        if character_type == "en":
            self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
            dict_character = list(self.character_str)
        elif character_type == "EN_symbol":
            # same with ASTER setting (use 94 char).
            self.character_str = string.printable[:-6]
            dict_character = list(self.character_str)
        elif character_type in support_character_type:
            self.character_str = []
            assert character_dict_path is not None, "character_dict_path should not be None when character_type is {}".format(
                character_type)
            with open(character_dict_path, "rb") as fin:
                lines = fin.readlines()
                for line in lines:
                    line = line.decode('utf-8').strip("\n").strip("\r\n")
                    self.character_str.append(line)
            if use_space_char:
                self.character_str.append(" ")
            dict_character = list(self.character_str)
        else:
            raise NotImplementedError
        self.character_type = character_type
        dict_character = self.add_special_char(dict_character)
        self.dict = {}
        for i, char in enumerate(dict_character):
            self.dict[char] = i
        self.character = dict_character

        
    def __call__(self, preds, label=None, *args, **kwargs):
        preds_idx = preds.argmax(axis=2)
        preds_prob = preds.max(axis=2)
        text = self.decode(preds_idx, preds_prob, is_remove_duplicate=True)
        if label is None:
            return text
        label = self.decode(label)
        return text, label

    
    def add_special_char(self, dict_character):
        dict_character = ['blank'] + dict_character
        return dict_character

    
    def decode(self, text_index, text_prob=None, is_remove_duplicate=False):
        """ convert text-index into text-label. """
        result_list = []
        ignored_tokens = self.get_ignored_tokens()
        batch_size = len(text_index)
        for batch_idx in range(batch_size):
            char_list = []
            conf_list = []
            for idx in range(len(text_index[batch_idx])):
                if text_index[batch_idx][idx] in ignored_tokens:
                    continue
                if is_remove_duplicate:
                    # only for predict
                    if idx > 0 and text_index[batch_idx][idx - 1] == text_index[
                            batch_idx][idx]:
                        continue
                char_list.append(self.character[int(text_index[batch_idx][
                    idx])])
                if text_prob is not None:
                    conf_list.append(text_prob[batch_idx][idx])
                else:
                    conf_list.append(1)
            text = ''.join(char_list)
            result_list.append((text, np.mean(conf_list)))
        return result_list

    
    def get_ignored_tokens(self):
        return [0]  # for ctc blank


# Since the recognition results contain Chinese characters, we use 'ch' as character_type
text_decoder = RecLabelDecode(character_dict_path="../data/text/ppocr_keys_v1.txt",
                              character_type='ch',  
                              use_space_char=True)

### Postprocessing the output information
The results of text recognition may contain errors or unimportant information. To make the recognition results more accurate, we define an auxiliary function that corrects the results in the following three situations:

1. Information to be removed or replaced (keyword: 'RP'): For example, when recognizing the value of Val_X, the label 'X:' is recognized together with it; 'X:' can be replaced by '' in the final recognition result.
2. Information to be mapped (keyword: 'MP'): 'k/m' may be recognized as 'km', so 'km' can be mapped to 'k/m'.
3. Decimal point addition (keyword: 'AD'): Sometimes decimal points are missed. If the number of decimal places is known in advance, the decimal point can be added.
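
Each of the three rules reduces to a plain string operation. The sketch below applies one rule at a time to show the effect; `apply_rule` is a hypothetical helper written for illustration, and the rule lists follow the `[keyword, ...]` format used in this notebook.

```python
def apply_rule(text, rule):
    """Apply a single correction rule to a recognized string."""
    keyword = rule[0]
    if keyword == "RP":  # replace, or remove when the target is ""
        return text.replace(rule[1], rule[2])
    if keyword == "MP":  # map: swap the whole string if the source occurs in it
        return rule[2] if rule[1] in text else text
    if keyword == "AD":  # insert a decimal point at the given position
        return text[:rule[1]] + "." + text[rule[1]:]
    raise ValueError(f"unknown keyword: {keyword}")


print(apply_rule("X:12.5", ["RP", "X:", ""]))  # 12.5
print(apply_rule("km", ["MP", "km", "k/m"]))   # k/m
print(apply_rule("012345", ["AD", -2]))        # 0123.45
```

Note that 'MP' replaces the entire string, not just the matched part, while 'RP' only touches the matched substring.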

In [None]:
# Post-processing, fix some error made in recognition
def post_processing(results, post_configration):
    """
    Postprocessing function for correcting the recognition errors.
    Parameters:
        results (Dict): The result dictionary.
        post_configration (Dict): The configuration dictionary.
    """
    for key in post_configration.keys():
        if len(post_configration[key]) == 0:
            continue  # nothing to do
        for post_item in post_configration[key]:
            key_word = post_item[0]
            if key_word == 'MP':  # mapping
                source_word = post_item[1]
                target_word = post_item[2]
                if source_word in results[key]:
                    results[key] = target_word
            elif key_word == 'RP':  # removing
                source_word = post_item[1]
                target_word = post_item[2]
                results[key] = results[key].replace(source_word, target_word)
            elif key_word == 'AD':  # add point
                add_position = post_item[1]
                results[key] = results[key][:add_position] + '.' + results[key][add_position:]
    return results

After establishing the function, only a configuration dictionary needs to be passed in for post-processing. Each key maps to a list of rules such as `['MP', 'LF', '探头:LF-01']`, `['RP', 'X', '']`, or `['AD', -2]`.

## Main Function without Segment code LCD

### Get Input Image

In [None]:
# Download images
IMG_URL = "https://user-images.githubusercontent.com/83450930/236680146-5751e291-d509-4d71-a2cb-bfbf35609051.jpg"
IMG_FILE_NAME = IMG_URL.split("/")[-1]
utils.download_file(IMG_URL, show_progress=False)

# Read image
img = cv2.imread(IMG_FILE_NAME)

# Show input image (OpenCV loads images as BGR, while Matplotlib expects RGB)
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

### Preprocessing the input image

Cut out the screen area and apply the transformation defined above to normalize the skewed image.

In [None]:
# affine transformations to normalize skewed images
img = pre_processing(img, POINTS, DESIGN_SHAPE)

# The screen part is cut and corrected (convert BGR to RGB for display)
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

### Select the Region to be recognized

In [None]:
# features and the layout information
DESIGN_LAYOUT = {'Info_Probe':[14, 36, 410, 135],  # feature_name: [xmin, ymin, xmax, ymax]
                 'Freq_Set':[5, 290, 544, 406], 
                 'Val_Total':[52, 419, 1256, 741], 
                 'Val_X':[19, 774, 433, 882], 
                 'Val_Y':[433, 773, 874, 884], 
                 'Val_Z':[873, 773, 1276, 883], 
                 'Unit':[1064, 291, 1295, 403], 
                 'Field':[5, 913, 243, 998]}

### Cut the element fields and recognize the text

In [None]:
# Recognition needs the image, the DESIGN information, and the compiled model
def main_for_field_strength_meter(img, DESIGN_LAYOUT, RESULT_TEMP, rec_compiled_model, rec_output_layer, text_decoder):
    """
    Main program of processing the field strength meter.
    Parameters:
        img (np.ndarray): Input image.
        DESIGN_LAYOUT (Dict): The coordinates of elements in the screen.
        RESULT_TEMP (Dict): The template for structure output.
        rec_compiled_model: CompiledModel.
        rec_output_layer: The output of openvino model.
        text_decoder(RecLabelDecode): The decoder of raw-recognition results.
    """
    # copy the structure output template
    struct_result = copy.deepcopy(RESULT_TEMP)

    # structure recognition begins here
    for key in DESIGN_LAYOUT.keys():
        # cut imgs according to the layout information
        xmin, ymin, xmax, ymax = DESIGN_LAYOUT[key]
        cut_img = img[ymin:ymax, xmin:xmax]
        
        h = ymax - ymin  # height of cut_img
        w = xmax - xmin  # width of cut_img
        dt_boxes = [np.array([[0,0],[w,0],[w,h],[0,h]],dtype='float32')]
        batch_num = 1

        # since the input img is cut, we do not need a detection model to find the position of texts
        # Preprocess detection results for recognition.
        img_crop_list, img_num, indices = prep_for_rec(dt_boxes, cut_img)

        # txts are the recognized text results
        rec_res = [['', 0.0]] * img_num
        txts = [] 

        for beg_img_no in range(0, img_num):

            # Recognition starts from here.
            norm_img_batch = batch_text_box(
                img_crop_list, img_num, indices, beg_img_no, batch_num)

            # Run inference for text recognition. 
            rec_results = rec_compiled_model([norm_img_batch])[rec_output_layer]

            # Postprocessing recognition results.
            rec_result = text_decoder(rec_results)
            for rno in range(len(rec_result)):
                rec_res[indices[beg_img_no + rno]] = rec_result[rno]   
            if rec_res:
                txts = [rec_res[i][0] for i in range(len(rec_res))] 

        # record the recognition result
        struct_result[key] = txts[0]
        
    return struct_result

struct_result = main_for_field_strength_meter(img, DESIGN_LAYOUT, RESULT_TEMP, rec_compiled_model, rec_output_layer, text_decoder)
        
# the raw output information
print(struct_result)

### Postprocessing the output information
To correct recognition errors, we only need to configure a `dict`.

In [None]:
# Configuration for postprocessing of the results
RESULT_POST = {"Info_Probe":[['MP', 'LF', '探头:LF-01']],  # words that need to be mapped
               "Freq_Set":[['RP', '实时值', ''], ['RP', ' ', '']],  # words that need to be replaced
               "Val_Total":[['RP', 'H2', 'Hz']],
               "Val_X":[['RP', 'X', ''], ['RP', ':', '']], 
               "Val_Y":[['RP', 'Y', ''], ['RP', ':', '']], 
               "Val_Z":[['RP', 'Z', ''], ['RP', ':', '']], 
               "Unit":[['MP', 'T', 'μT'],['MP', 'kV', 'kV/m'],['MP', 'kv', 'kV/m'],['MP', 'vm', 'V/m'],['MP', 'Vm', 'V/m'],['MP', 'A', 'A/m']], 
               "Field":[]}  # nothing to do

In [None]:
# Postprocessing, to fix some error made in recognition
struct_result = post_processing(struct_result, RESULT_POST)

# Print result
print(struct_result)

## Main Function for segment code LCD screen

This section shows how to extract the value from the screen of a household electricity meter.

<img align='center' src= "https://user-images.githubusercontent.com/83450930/237240032-fa150a14-800c-4525-ad4f-686165d23aa4.png" alt="drawing" width="200"/>


This is a segment code LCD screen, where the outline of the text is visible in the background, but only the highlighted parts need to be recognized. In this case, the model may also treat the background content as something to be recognized. In addition, the information displayed on this screen is not centered, which also hurts the text recognition model.

The following steps can be taken to address the above issues:

1. Perform binary processing on the image. By specifying a threshold, background information can be easily and quickly removed.
2. Separate the digital display area into many small blocks and remove those solid colored blocks to ensure that the text in final image for recognition is centered.
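
The two steps above can be sketched on a tiny grayscale image stored as nested lists. This is an illustrative, dependency-free version under stated assumptions: `binarize` and `crop_leading_blank` are hypothetical names, the pixel values are invented, and the thresholds (64, 128, 0.95) mirror those used in this notebook's LCD preprocessing.

```python
def binarize(gray, threshold=64):
    """Map pixels above the threshold to 255 (background) and the rest to 0."""
    return [[255 if p > threshold else 0 for p in row] for row in gray]


def crop_leading_blank(binary, step=2, white_rate=0.95):
    """Drop leading column blocks that are almost entirely white."""
    width = len(binary[0])
    for start in range(0, width, step):
        block = [row[start:start + step] for row in binary]
        total = sum(len(r) for r in block)
        white = sum(p > 128 for r in block for p in r)
        if white / total <= white_rate:
            return [row[start:] for row in binary]
    return binary


# Invented 2x6 image: bright background, gray unlit outline, dark lit segments.
gray = [
    [200, 210, 90, 100, 20, 200],
    [205, 190, 95, 80, 30, 25],
]
binary = binarize(gray)
print(binary)                      # [[255, 255, 255, 255, 0, 255], [255, 255, 255, 255, 0, 0]]
print(crop_leading_blank(binary))  # [[0, 255], [0, 0]]
```

The gray outline pixels end up above the threshold and merge with the background, so only the lit segments remain dark; the crop then discards the empty blocks to the left of the first digit.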

In addition, we can also add an external detection model to center the recognized text, or train a better recognition model to improve recognition accuracy. But these strategies all come with higher computational overhead.

### Configuration

In [None]:
# Prepare the image, configuration and pre/post-processing

# Download images
IMG_URL = "https://user-images.githubusercontent.com/83450930/237240032-fa150a14-800c-4525-ad4f-686165d23aa4.png"
IMG_FILE_NAME = IMG_URL.split("/")[-1]
utils.download_file(IMG_URL, show_progress=False)

# The coordinates of the corners of the screen
POINTS = [[275, 676],    # Left top, [x, y]
          [1775, 704],   # right top
          [1829, 2439],  # right bottom
          [205, 2469]]   # left bottom

# The size of the screen
DESIGN_SHAPE = (1500, 1800)

# Output template
RESULT_TEMP = {"Value":""}

# features and the layout information
DESIGN_LAYOUT = {'Value':[470, 436, 1050, 592]}

# a special pre-processing for the LCD screen
def pre_processing_for_LCD(cut_img):
    """
    A process function for the Segment code LCD
    Parameters:
        cut_img (np.ndarray): Input image.
    """
    # BGR-image to BINARY-image
    cut_img = cv2.cvtColor(cut_img, cv2.COLOR_BGR2GRAY)
    _, cut_img = cv2.threshold(cut_img, 64, 255, cv2.THRESH_BINARY)
    
    # delete the area without text
    step_size = 30
    for i in range(0, cut_img.shape[1], step_size):
        rate = len(np.where(cut_img[:, i:i + step_size] > 128)[0]) / (cut_img[:, i:i + step_size].shape[0] * cut_img[:, i:i + step_size].shape[1])
        if rate <= 0.95:
            cut_img = cut_img[:, i:]
            break
    
    # [h, w] to [h, w, 3]
    cut_img = np.expand_dims(cut_img, axis=2)
    cut_img = np.concatenate((cut_img, cut_img, cut_img), axis=-1)
    
    return cut_img
    
# Configuration for postprocessing of the results
RESULT_POST = {"Value":[['RP', ' ', ''], ['AD', -2]]}  # add point

### Pre-processing

In [None]:
# read image
img = cv2.imread(IMG_FILE_NAME)

# pre-processing
img = pre_processing(img, POINTS, DESIGN_SHAPE)

# Show pre-processed image (convert BGR to RGB for display)
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

### Recognition

In [None]:
# Recognition needs the image, the DESIGN information, and the compiled model
def main_for_electricity_meter(img, DESIGN_LAYOUT, RESULT_TEMP, rec_compiled_model, rec_output_layer, text_decoder):
    """
    Main program of processing the electricity meter.
    Parameters:
        img (np.ndarray): Input image.
        DESIGN_LAYOUT (Dict): The coordinates of elements in the screen.
        RESULT_TEMP (Dict): The template for structure output.
        rec_compiled_model: CompiledModel.
        rec_output_layer: The output of openvino model.
        text_decoder(RecLabelDecode): The decoder of raw-recognition results.
    """
    # copy the structure output template
    struct_result = copy.deepcopy(RESULT_TEMP)

    # structure recognition begins here
    for key in DESIGN_LAYOUT.keys():
        # cut imgs according to the layout information
        xmin, ymin, xmax, ymax = DESIGN_LAYOUT[key]
        cut_img = img[ymin:ymax, xmin:xmax]

        if key == 'Value':
            # show the image before LCD pre-processing
            print('Left is the Value Part before the pre_processing_for_LCD')
            plt.subplot(1,2,1)
            plt.imshow(cv2.cvtColor(cut_img, cv2.COLOR_BGR2RGB))

            cut_img = pre_processing_for_LCD(cut_img)

            # show the image after LCD pre-processing
            print('Right is the Value Part after the pre_processing_for_LCD')
            plt.subplot(1,2,2)
            plt.imshow(cut_img)

        h, w = cut_img.shape[:2]  # actual height and width; the LCD preprocessing may have cropped cut_img
        dt_boxes = [np.array([[0,0],[w,0],[w,h],[0,h]],dtype='float32')]
        batch_num = 1

        # since the input img is cut, we do not need a detection model to find the position of texts
        # Preprocess detection results for recognition.
        img_crop_list, img_num, indices = prep_for_rec(dt_boxes, cut_img)

        # txts are the recognized text results
        rec_res = [['', 0.0]] * img_num
        txts = [] 

        for beg_img_no in range(0, img_num):

            # Recognition starts from here.
            norm_img_batch = batch_text_box(
                img_crop_list, img_num, indices, beg_img_no, batch_num)

            # Run inference for text recognition. 
            rec_results = rec_compiled_model([norm_img_batch])[rec_output_layer]

            # Postprocessing recognition results.
            rec_result = text_decoder(rec_results)
            for rno in range(len(rec_result)):
                rec_res[indices[beg_img_no + rno]] = rec_result[rno]   
            if rec_res:
                txts = [rec_res[i][0] for i in range(len(rec_res))] 

        # record the recognition result
        struct_result[key] = txts[0]
        
    return struct_result

struct_result = main_for_electricity_meter(img, DESIGN_LAYOUT, RESULT_TEMP, rec_compiled_model, rec_output_layer, text_decoder)
        
# the raw output information
print(struct_result)

### Post-processing of the results

In [None]:
# Post-processing, fix some error made in recognition
struct_result = post_processing(struct_result, RESULT_POST)

# the final output information
print(struct_result)

## Try it with your meter photos!

For your own photos, you only need to modify the `Configuration` and `post_processing` settings to run the pipeline above!