# Handwritten Chinese and Japanese OCR

This tutorial demonstrates optical character recognition (OCR) for handwritten simplified Chinese and Japanese text. OCR for the Roman alphabet is covered in [notebook 208](../208-optical-character-recognition). The models used here can recognize only one line of text at a time.

This notebook uses the [handwritten-japanese-recognition](https://docs.openvinotoolkit.org/latest/omz_models_model_handwritten_japanese_recognition_0001.html) and [handwritten-simplified-chinese](https://docs.openvinotoolkit.org/latest/omz_models_model_handwritten_simplified_chinese_recognition_0001.html) models. To decode the model output into readable text, the [kondate_nakayosi](https://github.com/openvinotoolkit/open_model_zoo/blob/master/data/dataset_classes/kondate_nakayosi.txt) and [scut_ept](https://github.com/openvinotoolkit/open_model_zoo/blob/master/data/dataset_classes/scut_ept.txt) charlists are used. Both models are from [Open Model Zoo](https://github.com/openvinotoolkit/open_model_zoo/).

## Import required modules

In [None]:
import cv2
import matplotlib.pyplot as plt
import numpy as np
import sys
from itertools import groupby
from openvino.inference_engine import IECore
from os import path, makedirs

sys.path.append("../utils")
from notebook_utils import download_file

## Helper class

Running OCR requires both a model and a charlist, so the ```Files``` class groups them together with the name of the demo image.

In [None]:
class Files:
    def __init__(self, model: str, charlist_link: str, demo_image_name: str):
        self.model = model
        self.charlist_link = charlist_link
        self.demo_image_name = demo_image_name
        self.charlist_name = self.charlist_link.split('/')[-1]
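
The class derives the charlist file name from the last segment of its URL. The sketch below illustrates that step in isolation. The helper name `file_name_from_url` is hypothetical and not part of the notebook; it also handles a URL with a query string, which a plain `split('/')` would keep in the name.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def file_name_from_url(url: str) -> str:
    """Return the last path component of a URL, ignoring any query string."""
    return PurePosixPath(urlsplit(url).path).name


charlist_url = "https://raw.githubusercontent.com/openvinotoolkit/open_model_zoo/master/data/dataset_classes/kondate_nakayosi.txt"

# For a plain URL this matches charlist_url.split('/')[-1]
print(file_name_from_url(charlist_url))

# With a query string, the query part is dropped from the name
print(file_name_from_url(charlist_url + "?raw=true"))
```

For the charlist links used in this notebook, both approaches give the same result.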

## Settings

Set up all constants and folders used in this notebook.

In [None]:
# Directories where data will be placed
model_folder = "model"
data_folder = "data"
charlist_folder = f"{model_folder}/charlists"

# Precision used by model
precision = "FP16"

model_extensions = ("bin", "xml")

## Names and links used for Japanese and Chinese

### Japanese

In [None]:
# Name of model that will be used
handwritten_japanese_model_name = "handwritten-japanese-recognition-0001"

# Link to charlist
japanese_charlist_link = "https://raw.githubusercontent.com/openvinotoolkit/open_model_zoo/master/data/dataset_classes/kondate_nakayosi.txt"

# Link to demo image
japanese_image_link = 'https://github.com/openvinotoolkit/open_model_zoo/raw/master/demos/handwritten_text_recognition_demo/python/data/handwritten_japanese_test.png'

# File name under which the demo image is saved
japanese_image_name = 'handwritten_japanese_test.png'

### Chinese

In [None]:
# Name of model that will be used
handwritten_simplified_chinese_model_name = "handwritten-simplified-chinese-recognition-0001"

# Link to charlist
simplified_chinese_charlist_link = "https://raw.githubusercontent.com/openvinotoolkit/open_model_zoo/master/data/dataset_classes/scut_ept.txt"

# Link to demo image
chinese_image_link = 'https://user-images.githubusercontent.com/36741649/140065813-1970cd70-53c6-4d6c-b403-7e6974df34f7.jpg'

# File name under which the demo image is saved
chinese_image_name = 'handwritten_chinese_test.jpg'

## Create directories for data and model

The charlists do not need a separate folder to be created here: the download function itself creates the subfolder inside the model folder.

In [None]:
makedirs(data_folder, exist_ok=True)
makedirs(model_folder, exist_ok=True)

## Download images

Download demo images for both models

In [None]:
download_file(url=chinese_image_link, filename=chinese_image_name, directory=data_folder, show_progress=False)
download_file(url=japanese_image_link, filename=japanese_image_name, directory=data_folder, show_progress=False)

## Group the files for each language into classes

In [None]:
# Japanese files grouped as a class
japanese_files = Files(model=handwritten_japanese_model_name, charlist_link=japanese_charlist_link, demo_image_name=japanese_image_name)

# Chinese files grouped as a class
chinese_files = Files(model=handwritten_simplified_chinese_model_name, charlist_link=simplified_chinese_charlist_link, demo_image_name=chinese_image_name)

## Download models

On the first run, the models are downloaded here. This might take up to ten minutes.

In [None]:
def download_files(language_files: Files):
    # Download model
    for extension in model_extensions:
        path_to_model = f'{model_folder}/intel/{language_files.model}/{precision}/{language_files.model}.{extension}'
        if not path.isfile(path_to_model):
            download_command = f'omz_downloader --name {language_files.model} --output_dir {model_folder} --precision {precision}'
            print(download_command)
            ! $download_command

    # Download charlist            
    if not path.isfile(f'{charlist_folder}/{language_files.charlist_name}'):
        download_file(language_files.charlist_link, directory=charlist_folder, show_progress=False)
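
The existence check above relies on the folder layout `{model_folder}/intel/{model}/{precision}/{model}.{extension}`. As a minimal sketch of the same check in plain Python, the helpers below build the expected paths and list the missing ones. The names `expected_model_files` and `missing_model_files` are hypothetical, and the layout is taken from the cell above rather than from the downloader documentation.

```python
from pathlib import Path


def expected_model_files(model_folder, model_name, precision, extensions=("bin", "xml")):
    """Build the paths where the model files are expected (layout assumed from the notebook)."""
    base = Path(model_folder) / "intel" / model_name / precision
    return [base / f"{model_name}.{extension}" for extension in extensions]


def missing_model_files(model_folder, model_name, precision):
    """Return the expected model files that are not on disk yet."""
    return [p for p in expected_model_files(model_folder, model_name, precision) if not p.is_file()]


missing = missing_model_files("model", "handwritten-japanese-recognition-0001", "FP16")
if missing:
    print("Download needed, missing:", [p.as_posix() for p in missing])
else:
    print("All model files are present")
```

Collecting the missing files first makes it easy to run the downloader at most once per model.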

### Download Japanese files

In [None]:
download_files(language_files=japanese_files)

### Download Chinese files

In [None]:
download_files(language_files=chinese_files)

## Select language

Select the language to use by passing either `'chinese'` or `'japanese'` to ```use_language```. The function returns the matching ```Files``` object.

In [None]:
def use_language(language: str) -> Files:
    # Map each supported language name to its grouped files
    languages = {
        "chinese": chinese_files,
        "japanese": japanese_files
    }
    if language not in languages:
        raise KeyError(f"Invalid language chosen! Please pick one of: {', '.join(languages)}")
    return languages[language]

In [None]:
# Select language by using either use_language(language='chinese') or use_language(language='japanese')

selected_language = use_language(language='chinese')

## Load network and execute it

In [None]:
ie = IECore()

path_to_model = f"{model_folder}/intel/{selected_language.model}/{precision}/{selected_language.model}.xml"

net = ie.read_network(
    model=path_to_model
)

# To check available device names run line below
# print(ie.available_devices)

exec_net = ie.load_network(network=net, device_name="CPU")

## Fetch information about input and output layers 

This information is needed later to provide the input and read the output.

In [None]:
recognition_output_layer = next(iter(exec_net.outputs))
recognition_input_layer = next(iter(exec_net.input_info))

## Load an Image

In [None]:
# Get the demo image file name for the selected language

file_name = selected_language.demo_image_name

# The text recognition model expects an image in grayscale format
# IMPORTANT: this model can read only one line of text at a time

# Read image
image = cv2.imread(filename=f"{data_folder}/{file_name}", flags=cv2.IMREAD_GRAYSCALE)

## Fetch information about image and input layer shape

In [None]:
# Fetch shape
image_height, image_width = image.shape

# B,C,H,W = batch size, number of channels, height, width
_, _, H, W = net.input_info[recognition_input_layer].input_data.shape

# Calculate aspect ratio between image width and height to calculate padding
aspect_ratio = image_width / image_height

# Calculate scale ratio between input shape height and image height to resize image
scale_ratio = H / image_height
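
To make the arithmetic concrete, here is a small worked sketch of how the scale ratio determines the resized width and the padding needed to reach the input width. The function name `plan_resize_and_padding` and the numbers are illustrative assumptions, not values read from the models. The rounded width only approximates what the resize call produces.

```python
def plan_resize_and_padding(image_height, image_width, input_height, input_width):
    """Return (scale_ratio, resized_width, padding) for a height-matched resize."""
    # Scale so that the image height matches the network input height
    scale_ratio = input_height / image_height
    # Width after scaling both axes by the same ratio (approximation of the resize result)
    resized_width = round(image_width * scale_ratio)
    # Columns that must be added on the right to reach the network input width
    padding = input_width - resized_width
    if padding < 0:
        raise ValueError("Resized image is wider than the network input; crop or split the line.")
    return scale_ratio, resized_width, padding


# Illustrative values: a 192x1500 image and a 96x2000 network input
print(plan_resize_and_padding(192, 1500, 96, 2000))
```

With these values the image is halved to a width of 750, leaving 1250 columns of padding. A line that is too long after scaling would give a negative padding, which the sketch reports as an error.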

## Preprocess input

In [None]:
# Resize image so that its height matches the network input height
resized_image = cv2.resize(image, None, fx=scale_ratio, fy=scale_ratio, interpolation=cv2.INTER_AREA)

# Pad image to meet input size
resized_image = np.pad(resized_image, ((0, 0), (0, W - resized_image.shape[1])), mode='edge')

# Reshape to network input shape
input_image = resized_image[None, None, :, :]

## Visualise input

In [None]:
plt.figure(figsize=(20, 1))
plt.axis('off')
plt.imshow(resized_image, cmap='gray', vmin=0, vmax=255);

## Prepare charlist

The charlist differs depending on the selected language.

In [None]:
# Get the charlist used to decode the output, based on model documentation
used_charlist = selected_language.charlist_name

# For both models, a blank symbol must be added at index 0 of the charlist
blank_char = '~'

with open(f"{charlist_folder}/{used_charlist}", 'r', encoding='utf-8') as charlist:
    letters = blank_char + ''.join(line.strip() for line in charlist)
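
The effect of prepending the blank symbol can be seen on a tiny in-memory charlist. This is a minimal sketch: the three characters below stand in for a real charlist file, and `build_letters` is a hypothetical helper that mirrors the cell above.

```python
import io


def build_letters(charlist_lines, blank_char="~"):
    """Join one-character-per-line entries and put the blank symbol at index 0."""
    return blank_char + "".join(line.strip() for line in charlist_lines)


# Stand-in for an opened charlist file, one character per line
fake_charlist = io.StringIO("あ\nい\nう\n")
letters = build_letters(fake_charlist)

print(letters[0])  # the blank symbol
print(letters[1])  # first character of the charlist
```

Because of the shift, index 0 of the model output maps to the blank symbol and index `n` maps to line `n` of the charlist file.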

## Run inference

In [None]:
# Run inference on model
predictions = exec_net.infer(inputs={recognition_input_layer: input_image})[recognition_output_layer]

## Process inferred data

In [None]:
# Remove unnecessary dimension
predictions = np.squeeze(predictions)

# Run argmax to pick the most probable symbols
predictions_indexes = np.argmax(predictions, axis=1)

In [None]:
# Use groupby to collapse consecutive repeated indexes, as required by CTC greedy decoding
# groupby yields (key, group) pairs; only the keys are needed
output_text_indexes = np.array([index for index, _ in groupby(predictions_indexes)])

# Remove blank symbols 
output_text_indexes = output_text_indexes[output_text_indexes != 0] 

# Assign letters to indexes from output array
output_text = [letters[letter_index] for letter_index in output_text_indexes]
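
The decoding steps above can be sketched on toy data without a model. The example below is an illustration only: `ctc_greedy_decode` is a hypothetical helper, and the per-frame indexes and the four-symbol charlist are made up.

```python
from itertools import groupby


def ctc_greedy_decode(frame_indexes, letters, blank_index=0):
    """Collapse consecutive repeats, drop blanks, and map indexes to characters."""
    # Step 1: keep one index per run of identical consecutive indexes
    collapsed = [index for index, _ in groupby(frame_indexes)]
    # Step 2: drop the blank symbol, then look up each remaining index
    return "".join(letters[index] for index in collapsed if index != blank_index)


letters = "~abc"  # blank symbol at index 0
frames = [0, 1, 1, 0, 1, 2, 2, 0, 0, 3]  # most probable index per frame

print(ctc_greedy_decode(frames, letters))  # aabc
```

The blank between the two runs of index 1 is what lets the decoder keep a doubled character. Without it, the two runs would merge into a single `a`.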

## Print output

In [None]:
plt.figure(figsize=(20, 1))
plt.axis('off')
plt.imshow(resized_image, cmap='gray', vmin=0, vmax=255)

print(''.join(output_text))