Hi,
I'm using Layout Parser to run OCR on a research paper, but on almost every page of the PDF the detected text boxes are not properly aligned. For example, I input this page:

I perform detection using:
model = lp.Detectron2LayoutModel('lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config',
extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
label_map={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"})
layout = model.detect(image)
# Show the detected layout of the input image
lp.draw_box(image, layout, box_width=3)
The detected layout is shown below:

As can be seen, the bottom-left box is not properly aligned, which causes a problem with the sort script given in the tutorial:
# sort the left and right blocks and assign id to each
h, w = image.size
left_interval = lp.Interval(0, w/2*1.05, axis='x').put_on_canvas(image)
left_blocks = text_blocks.filter_by(left_interval, center=True)
left_blocks.sort(key = lambda b:b.coordinates[1])
right_blocks = [b for b in text_blocks if b not in left_blocks]
right_blocks.sort(key = lambda b:b.coordinates[1])
# And finally combine the two list and add the index
# according to the order
text_blocks = lp.Layout([b.set(id = idx) for idx, b in enumerate(left_blocks + right_blocks)])
# visualize the cleaned text blocks
lp.draw_box(image, text_blocks,
box_width=3,
show_element_id=True)
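To make the ordering logic concrete, here is a minimal sketch that mimics the script above with plain tuples instead of layoutparser objects. The helper name `order_two_columns` and all coordinates are invented for illustration, and it assumes the misaligned box's top edge lies above the other left-column boxes; the real detections may differ.

```python
# Boxes are hypothetical (x1, y1, x2, y2) tuples, not real detections.
def order_two_columns(boxes, page_width, slack=1.05):
    """Left column first, then right column, each sorted top to bottom."""
    split = page_width / 2 * slack
    # A box belongs to the left column if its horizontal center is left of the split.
    left = [b for b in boxes if (b[0] + b[2]) / 2 <= split]
    right = [b for b in boxes if b not in left]
    left.sort(key=lambda b: b[1])   # sort by top edge (y1)
    right.sort(key=lambda b: b[1])
    return left + right

boxes = [
    (50, 100, 290, 400),   # left column, upper paragraph
    (310, 100, 550, 700),  # right column
    (50, 90, 290, 780),    # oversized left box whose top edge is too high
]
ordered = order_two_columns(boxes, page_width=600)
for idx, box in enumerate(ordered):
    print(idx, box)
```

Because each column is sorted only by the top edge, a box whose detected top edge is too high is placed first in its column regardless of where its text actually sits.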

The misaligned box is assigned index 0, which is not correct.
Is there any way to avoid this problem?
Thank you