Add batch img inference support for ocr det with readtext_batched #458

Merged · 3 commits · Jun 24, 2021

Conversation

SamSamhuns (Contributor) commented Jun 12, 2021

Batched image inference for text detection

import easyocr

reader = easyocr.Reader(['en'], cudnn_benchmark=True)
img_path = [
        "https://pytorch.org/tutorials/_static/img/thumbnails/cropped/profiler.png",
        "https://www.tensorflow.org/images/tf_logo_social.png",
        "https://storage.googleapis.com/gd-wagtail-prod-assets/original_images/evolving_google_identity_2x1.jpg"]
reader.readtext_batched(img_path, n_width=800, n_height=600)
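
For reference (illustration only, not part of the diff), each element of the returned list corresponds to one input image and holds the usual (bbox, text, confidence) tuples that readtext returns, so iterating the batched result looks roughly like this:

results = reader.readtext_batched(img_path, n_width=800, n_height=600)
for per_image_result in results:          # one entry per input image
    for bbox, text, conf in per_image_result:
        print(text, conf)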

Caveats:

  1. For batched inference, all input images must be the same size. They can either be resized beforehand, or the n_width and n_height parameters of readtext_batched can be used. readtext_batched also accepts a single image as input, but it still returns a list of results with one element, so an additional result[0] access is required.
  2. Setting cudnn.benchmark to True is better for batched inference, hence I pass cudnn_benchmark=True to easyocr.Reader.
  3. GPU batched inference needs a warmup pass at the same batch size to reach full speed, hence I run dummy = np.zeros([batch_size, 600, 800, 3], dtype=np.uint8); reader.readtext_batched(dummy) before timing the inferences.
  4. Batched inference mode should be used when a large number of frames needs to be processed, e.g. detecting and recognizing text in a video; otherwise, sequential processing is faster when each API call handles a single image.
  5. When running in GPU mode, the user has to manage the batch size themselves to prevent CUDA out-of-memory errors (see the chunking sketch after this list).
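
As a rough illustration of caveat 5 (not part of the diff), a long list of same-sized frames can be split into fixed-size chunks and fed to readtext_batched one chunk at a time; readtext_in_chunks and chunk_size below are hypothetical names, with chunk_size being a knob the caller tunes to fit GPU memory.

import numpy as np
import easyocr

def readtext_in_chunks(reader, frames, chunk_size=16):
    # frames: list or array of same-sized H x W x 3 uint8 images
    results = []
    for start in range(0, len(frames), chunk_size):
        batch = np.asarray(frames[start:start + chunk_size])  # N, H, W, 3
        # note: a smaller final chunk changes the batch size, which re-triggers
        # cuDNN benchmarking (see caveat 3)
        results.extend(reader.readtext_batched(batch))
    return results

# reader = easyocr.Reader(['en'], cudnn_benchmark=True)
# results = readtext_in_chunks(reader, video_frames, chunk_size=16)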

Edited files

Although these changes are major, they should have no backward-compatibility issues, but I would greatly appreciate extensive testing @rkcosmos. I am open to any suggestions or changes.

utils.py

  • Added a new function reformat_input_batched that takes a list of file paths, numpy ndarrays, or byte stream objects

detection.py

  • Changed the get_textbox function to process a list of lists of bboxes and polys (one list per image)
  • Changed the test_net function to accumulate the input images and send them to the CRAFT torch model as a single tensor (see the sketch below)
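
The sketch below is only an illustration of that batching idea, not the actual test_net code; the real implementation also normalizes and resizes each image before the forward pass, which is omitted here.

import numpy as np
import torch

def forward_batch(net, images, device='cuda'):
    # images: list of same-sized H x W x 3 uint8 arrays
    x = np.stack(images).astype(np.float32) / 255.0          # N, H, W, 3
    x = torch.from_numpy(x).permute(0, 3, 1, 2).to(device)   # N, 3, H, W
    with torch.no_grad():
        out = net(x)  # one forward pass yields score maps for the whole batch
    return out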

easyocr.py

  • Added a new function readtext_batched that takes a list of file paths, numpy ndarrays, or byte stream objects and processes them as a batch
  • Changed the detect function to process a list of images (input type examples below)
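
For illustration (placeholder file names, and assuming byte streams are handled the same way as in readtext), all three input types go through the same call; the images still need to end up the same size, either resized beforehand or via n_width/n_height:

import cv2
import easyocr

reader = easyocr.Reader(['en'])

paths = ["image1.png", "image2.png"]              # file paths or URLs (placeholders)
arrays = [cv2.imread(p) for p in paths]           # numpy ndarrays (BGR)
streams = [open(p, 'rb').read() for p in paths]   # raw byte stream objects

for batch in (paths, arrays, streams):
    results = reader.readtext_batched(batch, n_width=800, n_height=600)
    print(len(results))  # one result list per input image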

I have a test script below to verify that the functions work as intended, and I have added results for both CPU and GPU.

As expected, GPU batched inference is almost twice as fast as sequential GPU inference.

GPU results

(screenshot of the GPU timing output)

CPU results

(screenshot of the CPU timing output)

test_batch_easyocr.py, the program used to generate the outputs above:

from __future__ import print_function

import easyocr
import numpy as np
import time
import cv2
import sys
import os

if sys.version_info[0] == 2:
    from six.moves.urllib.request import urlretrieve
else:
    from urllib.request import urlretrieve


def test_single_and_batched_text_detection_and_prediction():
    reader = easyocr.Reader(['en'])
    # test with easy logos to ensure same results
    # test for single image with old api
    result = reader.readtext(
        "https://pytorch.org/tutorials/_static/img/thumbnails/cropped/profiler.png")
    assert len(result) == 1
    assert result[0][1] == 'PyTorch'
    print(result)
    print("Single image test with readtext successful")

    # test for single image with new api
    result = reader.readtext_batched(
        "https://pytorch.org/tutorials/_static/img/thumbnails/cropped/profiler.png")
    assert len(result) == 1
    assert result[0][0][1] == 'PyTorch'
    print(result)
    print("Single image test with readtext_batched successful")

    # test for a list of images in batch
    img_path = [
        "https://pytorch.org/tutorials/_static/img/thumbnails/cropped/profiler.png",
        "https://www.tensorflow.org/images/tf_logo_social.png",
        "https://storage.googleapis.com/gd-wagtail-prod-assets/original_images/evolving_google_identity_2x1.jpg"]

    """
    all images in image list must be of the same size for batched inference
        for eg, result = reader.readtext_batched(img_path) will fail here
        so either resize all images to the same size before passing to readtext_batched
        or call the func like so reader.readtext_batched(img_path, n_width=800, n_height=600)
    """
    # warning, for better results, it is recommended to maintain aspect while resizing
    result = reader.readtext_batched(img_path, n_width=800, n_height=600)
    assert len(result) == 3
    assert result[0][0][1] == 'PyTorch'
    assert result[1][0][1] == 'TensorFlow'
    assert result[2][0][1] == 'Google'
    print(result)
    print("Batched image test with readtext_batched successful")

    ############################################################################
    # inference time test between sequential and batch processing
    # batch processing will be faster when using GPU
    ############################################################################
    # pre-download, load and resize images for inference time test
    img_path = [
        "https://pytorch.org/tutorials/_static/img/thumbnails/cropped/profiler.png",
        "https://www.tensorflow.org/images/tf_logo_social.png",
        "https://storage.googleapis.com/gd-wagtail-prod-assets/original_images/evolving_google_identity_2x1.jpg"]

    cv2_images = []
    for i, path in enumerate(img_path):
        tmp, _ = urlretrieve(path)
        cv2_img = cv2.resize(cv2.imread(tmp), (800, 600))
        cv2_images.append(cv2_img)
        os.remove(tmp)

    img_repeat, num_loop = 5, 1
    cv2_images = np.array(cv2_images)
    # np.repeat to get a batch of 15 images -> array of shape (15, 600, 800, 3)
    cv2_images_repeat1 = np.repeat(cv2_images, repeats=img_repeat, axis=0)
    cv2_images_repeat2 = cv2_images_repeat1.copy()
    print(
        f"Running inference speed test with an image array of shape {cv2_images_repeat1.shape} for {num_loop} iterations")

    # sequential processing
    # run sequential processing test
    reader = easyocr.Reader(['en'])
    itime = time.time()
    for i in range(num_loop):
        for img in cv2_images_repeat1:
            reader.readtext(img)
    print(
        "Single/Sequential image inference time per image: " +
        f"{(time.time()-itime)/(num_loop*cv2_images_repeat1.shape[0]):.3f}s")
    # batched processing
    reader = easyocr.Reader(['en'], cudnn_benchmark=True)

    # warmup for batched inference on GPU, using same batch size for all subsequent inference
    # cudnn benchmark should be set to True
    # see this issue https://discuss.pytorch.org/t/model-inference-very-slow-when-batch-size-changes-for-the-first-time/44911
    dummy = np.zeros([len(img_path) * img_repeat, 600, 800, 3], dtype=np.uint8)
    reader.readtext_batched(dummy)

    # run batch processing test
    itime = time.time()
    for i in range(num_loop):
        reader.readtext_batched(cv2_images_repeat2)
    print(
        "Batched image inference time per image: " +
        f"{(time.time()-itime)/(num_loop*cv2_images_repeat1.shape[0]):.3f}s")


test_single_and_batched_text_detection_and_prediction()

SamSamhuns (Contributor, Author) commented:

@rkcosmos, let me know if you need some tests or a verification pipeline as well. If you think this PR is good, we can later discuss a separate PR for improving the general code formatting along PEP guidelines. Thanks.

SaddamBInSyed commented:

@SamSamhuns, thanks for your PR. Can you advise which GPU model you used?

Thank you

SamSamhuns (Contributor, Author) replied:

Tesla V100 DGX

rkcosmos merged commit 89ec92f into JaidedAI:master on Jun 24, 2021

myxzlpltk commented:

Why does this method load all tensors into GPU memory? I got a memory leak of about 26 GB, and batch_size didn't work.

ash2703 commented Jun 23, 2023

When benchmarking on GPU, did you clear the cache after running sequential inference? GPU inference tends to be much faster after warmup!

thuc-moreh pushed a commit to moreh-dev/EasyOCR that referenced this pull request on Jul 5, 2023: "Add batch img inference support for ocr det with readtext_batched"