SEMv2: Table Separation Line Detection Based on Instance Segmentation

This repository contains the source code of SEMv2: Table Separation Line Detection Based on Instance Segmentation.

Introduction

[Figure: SEMv2 pipeline]

In this work, we follow the principle of split-and-merge based methods and propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge). Unlike previous work on the “split” stage, we address the instance-level discrimination problem for table separation lines and introduce a table separation line detection strategy based on conditional convolution. Specifically, we design the “split” stage in a top-down manner: it first detects each table separation line instance and then dynamically predicts a separation line mask for that instance. The final shape of each table separation line is obtained by processing its mask in a row-wise/column-wise manner.
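As a rough illustration of the last step, the sketch below turns the mask of a single row separation line into a line shape by scanning the mask column by column and averaging the row indices of the pixels that belong to the line. This is only one plausible reading of the description above, not the code in this repository; the function name `row_separator_shape`, the `threshold` parameter, and the averaging rule are all assumptions made for the example.

```python
def row_separator_shape(mask, threshold=0.5):
    """Hypothetical helper: estimate the y position of a row separator per column.

    mask is a 2D list (height x width) of scores in [0, 1] for one
    separation line instance. For every column, the mean row index of
    the pixels above `threshold` is returned, or None if the line does
    not cover that column.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    shape = []
    for x in range(width):
        # Row indices in this column that the mask assigns to the line.
        ys = [y for y in range(height) if mask[y][x] > threshold]
        shape.append(sum(ys) / len(ys) if ys else None)
    return shape


# Toy 4x4 mask of a slightly slanted row separator (made-up values).
mask = [
    [0.0, 0.0, 0.0, 0.0],
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.9, 0.9, 0.0],
    [0.0, 0.0, 0.8, 0.0],
]
print(row_separator_shape(mask))  # [1.5, 1.5, 2.5, None]
```

A column separation line would be handled the same way with the roles of rows and columns swapped, that is, by scanning the mask row by row.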

Dataset

[Figure: iFLYTAB statistics]

iFLYTAB collects table images of various styles from different scenarios. Specifically, we collect both wired and wireless tables from digital documents and camera-captured photos. We also host a table structure recognition challenge based on the proposed iFLYTAB dataset, available at http://challenge.xfyun.cn/topic/info?type=structure. The competition is organized by iFLYTEK in conjunction with the China Society of Image and Graphics (CSIG). The training and validation sets can be downloaded from the challenge website. The annotations for the validation set are located in the evaluation/validation_annotation folder.

Metric

We use both the F1-Measure [1] and the Tree-Edit-Distance-based Similarity (TEDS) [2] metrics, which are commonly adopted in the table structure recognition literature and competitions, to evaluate how well our model recognizes table structure. The evaluation code can be found in the evaluation/eval.py file.

[1] M. Hurst, A constraint-based approach to table structure derivation, in: ICDAR, 2003.

[2] X. Zhong, E. ShafeiBavani, A. Jimeno Yepes, Image-based table recognition: Data, model, and evaluation, in: ECCV, 2020.
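For readers unfamiliar with the F1-Measure in this setting, the sketch below computes precision, recall, and F1 over sets of cell adjacency relations. It is a minimal illustration and not the code in evaluation/eval.py; the function name `f1_measure` and the `(cell, neighbour, direction)` tuple format are assumptions made for the example.

```python
def f1_measure(predicted, ground_truth):
    """Hypothetical helper: precision, recall and F1 over adjacency relations.

    Each relation is assumed to be a hashable tuple such as
    (cell_content, neighbour_content, direction).
    """
    predicted, ground_truth = set(predicted), set(ground_truth)
    correct = len(predicted & ground_truth)  # relations found in both sets
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy 2x2 table with cells A, B (top row) and C, D (bottom row).
ground_truth = {
    ("A", "B", "horizontal"),
    ("A", "C", "vertical"),
    ("B", "D", "vertical"),
    ("C", "D", "horizontal"),
}
# The prediction misses one relation and contains one spurious relation.
predicted = {
    ("A", "B", "horizontal"),
    ("A", "C", "vertical"),
    ("C", "D", "horizontal"),
    ("B", "C", "horizontal"),
}
precision, recall, f1 = f1_measure(predicted, ground_truth)
print(precision, recall, f1)  # 0.75 0.75 0.75
```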

Performance

We perform comprehensive experiments on the SciTSR, PubTabNet, cTDaR, and WTW datasets, as well as the proposed iFLYTAB dataset, to verify the effectiveness of SEMv2.

  • Ablation Study

[Figure: ablation study results]

  • SciTSR and PubTabNet

[Figure: results on SciTSR and PubTabNet]

  • cTDaR TrackB1-Historical and WTW

[Figure: results on cTDaR TrackB1-Historical and WTW]

  • iFLYTAB

[Figure: results on iFLYTAB]

Requirements

pip install -r requirements.txt

Training

cd SEMv2
python runner/train.py --cfg default

Citation

If you find SEMv2 useful in your research, please consider citing:

@misc{
  zhang2023semv2,
  title={SEMv2: Table Separation Line Detection Based on Conditional Convolution}, 
  author={Zhenrong Zhang and Pengfei Hu and Jiefeng Ma and Jun Du and Jianshu Zhang and Huihui Zhu and Baocai Yin and Bing Yin and Cong Liu},
  year={2023},
  eprint={2303.04384},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
