Combining word boxes into lines or paragraphs #22

mrm8488 · 2020-01-17T12:08:15Z

Is there any way (maybe a built-in function) to detect EOL characters in a large text? Or must it be done by the client, by comparing the words' position vectors?
Thanks in advance.

faustomorales · 2020-01-19T17:57:23Z

Hi @mrm8488 -- I do not think it is possible to detect the end of a line directly in the detector or recognizer models.

To synthesize word boxes into lines or paragraphs (as I believe you imply), the user would have to apply their own logic for stitching together the pieces. I would very much like to have a starter implementation of that logic in this repository, but I just haven't had a chance to think it through and implement it. That said, if you have thoughts and an approach in progress, a PR for this feature would be very much appreciated! Please post back here if that's something you are interested in working on.

MounaBC · 2020-01-20T16:40:59Z

Hello @mrm8488, just to give you an idea: the predictions are a list of (text, box) tuples, where each item represents a word and its position in the image (starting from the top left).
box is an array of 4 points, each one a corner of the word's box (its X and Y). Its structure is [[startX, startY], [endX, startY], [endX, endY], [startX, endY]].
With that information, you can organise your text the way you want. It depends a lot on the layout of your images, their size, and so on.
For example, if your image has straight lines of text, you can first sort your results by ascending Y. Then, depending on the size of your image and text, you can define a threshold on the difference between Y coordinates to separate your lines.
Once you have that, you just sort by ascending X within each line.
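The steps described above can be sketched roughly as follows. This is only a minimal illustration, assuming predictions is a list of (text, box) tuples with boxes in the layout given above. The names make_box and group_into_lines and the threshold value are hypothetical, not part of the library.

```python
import numpy as np

def make_box(start_x, start_y, end_x, end_y):
    # Hypothetical helper: builds a box in the layout
    # [[startX, startY], [endX, startY], [endX, endY], [startX, endY]].
    return np.array([[start_x, start_y], [end_x, start_y],
                     [end_x, end_y], [start_x, end_y]], dtype=float)

def group_into_lines(predictions, y_threshold):
    """Group (text, box) tuples into lines: top to bottom, then left to right."""
    # Sort words by the Y of their top-left corner (box[0] is [startX, startY]).
    words = sorted(predictions, key=lambda p: p[1][0][1])
    lines = []
    for word in words:
        start_y = word[1][0][1]
        # Stay on the current line while the jump in Y from the previous
        # word (in sorted order) is within the threshold; otherwise start a new line.
        if lines and start_y - lines[-1][-1][1][0][1] <= y_threshold:
            lines[-1].append(word)
        else:
            lines.append([word])
    # Within each line, order the words by ascending X.
    return [sorted(line, key=lambda p: p[1][0][0]) for line in lines]

# Made-up example data: two lines of two words each, given out of order.
predictions = [
    ("world", make_box(60, 12, 100, 30)),
    ("hello", make_box(10, 10, 50, 28)),
    ("line", make_box(70, 50, 100, 68)),
    ("second", make_box(10, 52, 60, 70)),
]
lines = group_into_lines(predictions, y_threshold=10)
text = "\n".join(" ".join(word for word, _ in line) for line in lines)
print(text)
```

The threshold has to be tuned to the text height in your images. A fixed value like the one used here will not suit every image size.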

mrm8488 · 2020-01-20T16:44:42Z

Thank you. I was thinking of doing something like that.

Johndirr · 2020-02-09T09:33:35Z

I would do it like @MounaBC said. First sort the bounding boxes along the y-axis (top to bottom), but then I would simply put every box that overlaps on the y-axis into the same line. After this it's just a matter of sorting along the x-axis within every line.

EDIT 5
I posted my approach here: https://stackoverflow.com/a/60684094/5459124

rawat123 · 2020-05-28T14:02:37Z

Hi, thanks for the code. I am trying to arrange all the text in one line in ascending order.
I am using the code below; my image has multi-line text.
Each box has the form ([x1, y1], [x2, y2], [x3, y3], [x4, y4]).
When I run the code below, I get this error after it arranges two boxes on the same line:
IndexError: arrays used as indices must be of integer (or boolean) type

import numpy as np
def isOnSameLine(boxOne, boxTwo):
    print(boxOne)
    print(boxTwo)
    boxOneStartY = boxOne[0, 1]
    boxOneEndY = boxOne[2, 1]
    boxTwoStartY = boxTwo[0, 1]
    boxTwoEndY = boxTwo[2, 1]
    if ((boxTwoStartY <= boxOneEndY and boxTwoStartY >= boxOneStartY)
            or (boxTwoEndY <= boxOneEndY and boxTwoEndY >= boxOneStartY)
            or (boxTwoEndY >= boxOneEndY and boxTwoStartY <= boxOneStartY)):
        return True
    else:
        return False
# list of indexes
temp = []
i = 0
box_groups = np.array([[[292.17706, 10.344554], [431.73145, 15.781749], [427.96115, 112.55261],
                        [288.40674, 107.11542]],
                       [[292.17706, 10.344554], [431.73145, 15.781749], [427.96115, 112.55261],
                        [288.40674, 107.11542]],
                       [[104.10318, 25.434502], [251.24907, 17.586721], [256.33423, 112.93329],
                        [109.18835, 120.78107]],
                       [[191.8359, 116.40875], [472.261, 113.45691], [473.46985, 228.30032],
                        [193.04477, 231.25217]]])
# TODO: check if there is more than one box_group
sorted_box_group = [4, 4]
while i < len(box_groups):
    for j in range(i + 1, len(box_groups)):
        if (isOnSameLine(box_groups[i], box_groups[j])):
            print(str(i) + " and " + str(j) + " on same line")
            if i not in temp:
                temp.append(i)
            if j not in temp:
                temp.append(j)
        else:
            print(str(i) + " and " + str(j) + " not on same line")
        # append temp with i if the current box (i) is not on the same line with any other box
        if len(temp) == 0:
            temp.append(i)
    print("-----------------")
    print(temp)
    print(sorted_box_group)
    # put boxes on same line into lined_box_group array
    lined_box_group = box_groups[np.array(temp)]
    # sort boxes by startX value
    lined_box_group = lined_box_group[np.argsort(lined_box_group[:, 0, 0])]
    # copy sorted boxes on same line into sorted_box_group
    print(i)
    print(temp[-1] + 1)
    sorted_box_group[i:temp[-1] + 1] = lined_box_group
    # skip to the index of the box that is not on the same line
    i = temp[-1] + 1
    # clear list of indexes
    temp = []
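
A note on the error: it most likely comes from the last pass of the while loop. When i is the final index, the inner for loop does not run at all, so the `if len(temp) == 0` fallback (which sits inside that loop) is never reached and temp stays empty. np.array([]) defaults to float64, which NumPy refuses as an index. The snippet below is a minimal reproduction; select_boxes is a hypothetical helper, not part of the posted code.

```python
import numpy as np

# Placeholder data with the same shape as four boxes of four (x, y) corners.
boxes = np.zeros((4, 4, 2))

# An empty Python list becomes a float64 array, which cannot be used as an index.
empty_index = np.array([])
try:
    boxes[empty_index]
    raised = False
except IndexError as error:
    raised = True
    print(error)

def select_boxes(boxes, indexes):
    # Forcing an integer dtype keeps the empty case valid:
    # it simply selects zero boxes instead of raising.
    return boxes[np.array(indexes, dtype=int)]

print(select_boxes(boxes, []).shape)
print(select_boxes(boxes, [3]).shape)
```

Moving the `if len(temp) == 0` check out of the for loop, so that it runs once per value of i, should also avoid the empty list in the first place.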

Johndirr · 2020-05-28T15:09:00Z

I don't know if I had to do some corrections later on so here is my most recent code:

import numpy as np

def isOnSameLine(boxOne, boxTwo):
    boxOneStartY = boxOne[0,1]
    boxOneEndY = boxOne[2,1]
    boxTwoStartY = boxTwo[0,1]
    boxTwoEndY = boxTwo[2,1]
    if((boxTwoStartY <= boxOneEndY and boxTwoStartY >= boxOneStartY)
    or(boxTwoEndY <= boxOneEndY and boxTwoEndY >= boxOneStartY)
    or(boxTwoEndY >= boxOneEndY and boxTwoStartY <= boxOneStartY)):
        return True
    else:
        return False

def segmentLines(box_group):
    # sort by ascending startY (Y of the first corner; box layout is [startX,startY], [endX,startY], [endX,endY], [startX,endY])
    box_group = box_group[np.argsort(box_group[:, 0, 1])]

    lined_box_group = np.zeros(box_group.shape)
    sorted_box_group = np.zeros(box_group.shape)

    # list of indexes
    temp = []
    i = 0

    # check if there is more than one box in the box_group
    if len(box_group) > 1:
        while i < len(box_group):
            for j in range(i + 1, len(box_group)):
                if(isOnSameLine(box_group[i],box_group[j])):
                    # print(str(i) + " and " + str(j) + " on same line")
                    if i not in temp:
                        temp.append(i)
                    if j not in temp:
                        temp.append(j)
                # else:
                    # print(str(i) + " and " + str(j) + " not on same line")
            # append temp with i if the current box (i) is not on the same line with any other box
            if len(temp) == 0:
                temp.append(i)
            
            # put boxes on same line into lined_box_group array
            lined_box_group = box_group[np.array(temp)]
            # sort boxes by startX value
            lined_box_group = lined_box_group[np.argsort(lined_box_group[:, 0, 0])]
            # copy sorted boxes on same line into sorted_box_group
            sorted_box_group[i:temp[-1]+1] = lined_box_group
            
            # skip to the index of the box that is not on the same line
            i = temp[-1] + 1
            # clear list of indexes
            temp = []
    else:
        # since there is only one box in the boxgroup do nothing but copying the box
        # print("only one box in boxgroup")
        sorted_box_group = box_group
        
    return sorted_box_group

I get the actual frame to analyse for text from a cv2.VideoCapture and then do the following:

box_groups = detector.detect([frame])
box_group = box_groups[0]

if len(box_group) > 0:
    # sort bounding boxes into lines
    sorted_box_group = segmentLines(box_group)
    # recognize text
    recognizedtext = recognizer.recognize_from_boxes([frame], [sorted_box_group])
    text = " ".join(recognizedtext[0])
else:
    text = ""

Duv54 · 2020-09-11T14:18:05Z

Do you have any idea how to group the boxes into blocks / paragraphs? I tried to find algorithms to do this but failed. I only succeeded in improving the line segmenter by introducing the y-distance between the centers of the two boxes.

Thank you
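
For readers wondering what a center-based line test might look like, here is one possible sketch. It is not the implementation referred to above. The function name is_on_same_line_by_center and the max_ratio value are assumptions made for illustration. It covers only the line test and does not attempt paragraph grouping.

```python
import numpy as np

def is_on_same_line_by_center(box_one, box_two, max_ratio=0.5):
    # Vertical center of each box: mean Y over its four corners.
    center_one = box_one[:, 1].mean()
    center_two = box_two[:, 1].mean()
    # Box heights, measured between the lowest and highest corner Y.
    height_one = box_one[:, 1].max() - box_one[:, 1].min()
    height_two = box_two[:, 1].max() - box_two[:, 1].min()
    # Same line if the centers are closer than a fraction of the smaller height.
    # max_ratio is an assumed tuning value, not a recommended constant.
    limit = max_ratio * min(height_one, height_two)
    return bool(abs(center_one - center_two) <= limit)

# Three of the boxes posted earlier in this thread.
box_a = np.array([[292.17706, 10.344554], [431.73145, 15.781749],
                  [427.96115, 112.55261], [288.40674, 107.11542]])
box_c = np.array([[104.10318, 25.434502], [251.24907, 17.586721],
                  [256.33423, 112.93329], [109.18835, 120.78107]])
box_d = np.array([[191.8359, 116.40875], [472.261, 113.45691],
                  [473.46985, 228.30032], [193.04477, 231.25217]])

print(is_on_same_line_by_center(box_a, box_c))
print(is_on_same_line_by_center(box_a, box_d))
```

Unlike a pure overlap test, this one does not merge two lines just because a tall box from one line slightly overlaps the next.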

faustomorales changed the title ~~Query about end of line~~ Combining word boxes into lines or paragraphs Jan 19, 2020

faustomorales mentioned this issue Jun 21, 2020

sort bounding boxes #91

Closed

faustomorales added the enhancement New feature or request label Aug 16, 2020

YC7225 mentioned this issue Jun 11, 2021

how to map line by line text detection and recognition? open-mmlab/mmocr#272

Closed

shreevatsa mentioned this issue Feb 19, 2023

Splitting page into lines shreevatsa/ambuda#33

Closed
