TableExtraction
A tool for extracting tables from document images. The method consists of six steps:
- Line segment detection
- Horizontal and vertical segment filtering
- Line segment recovery
- Suppression of segments belonging to text
- Table cell extraction
- Table reconstruction
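As an illustration of the second step (horizontal and vertical segment filtering), here is a minimal sketch in standard C++. It is not the tool's actual implementation: the `Segment` struct, the `classify` and `filterAxisAligned` functions, and the angular tolerance parameter are all hypothetical names introduced for this example. The idea is simply to keep the detected segments whose direction lies within a few degrees of the horizontal or vertical axis.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical segment type; the tool's own data structures may differ.
struct Segment {
    double x1, y1, x2, y2;
};

enum class Orientation { Horizontal, Vertical, Oblique };

// Classify a segment by its angle to the x-axis.
// toleranceDeg is an assumed parameter: the maximum deviation, in degrees,
// from a perfectly horizontal or vertical direction.
// Note: a zero-length segment has angle 0 here and is reported as Horizontal.
Orientation classify(const Segment& s, double toleranceDeg) {
    assert(toleranceDeg >= 0.0 && toleranceDeg < 45.0);
    const double pi = std::acos(-1.0);
    // Fold the direction into [0, 90] degrees: 0 is horizontal, 90 is vertical.
    const double angle =
        std::atan2(std::fabs(s.y2 - s.y1), std::fabs(s.x2 - s.x1)) * 180.0 / pi;
    if (angle <= toleranceDeg) return Orientation::Horizontal;
    if (angle >= 90.0 - toleranceDeg) return Orientation::Vertical;
    return Orientation::Oblique;
}

// Keep only the segments that are close to horizontal or vertical.
std::vector<Segment> filterAxisAligned(const std::vector<Segment>& in,
                                       double toleranceDeg) {
    std::vector<Segment> out;
    for (const Segment& s : in) {
        if (classify(s, toleranceDeg) != Orientation::Oblique) out.push_back(s);
    }
    return out;
}
```

With a tolerance of 5 degrees, for instance, a segment from (0, 0) to (100, 1) would be kept as horizontal, while a diagonal from (0, 0) to (50, 50) would be discarded. A tolerance is used instead of an exact test because scanned pages are often slightly skewed.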
Quick setup
- Requires the OpenCV library.
- A CMakeLists.txt is provided for building with CMake.
To install the program, see the Install.txt file.
Examples
Result (with intermediate steps) for eu-002_page0.png:
./TableExtraction -i ../Samples/eu-002_page0.png -o eu-002_page0_res.png
Figures: input image; FBSD detector; filtering and recovering segments; removing text segments; table extraction; output image.
Result for eu-001_page0.png:
./TableExtraction -i ../Samples/eu-001_page0.png -o eu-001_page0_res.png
Figures: input image; output image.
Result for us-001_page0.png:
./TableExtraction -i ../Samples/us-001_page0.png -o us-001_page0_res.png
Figures: input image; output image.
Result for 1_301.jpg:
./TableExtraction -i ../Samples/1_301.jpg -o 1_301_res.png
Figures: input image; output image.
Limitations
Result for 10.1.1.1.2111_7.jpg:
./TableExtraction -i ../Samples/10.1.1.1.2111_7.jpg -o 10.1.1.1.2111_7_res.png
Figures: input image; output image (graphics are mistakenly recognized as tables).
Result for 1078_082.png:
./TableExtraction -i ../Samples/1078_082.png -o 1078_082_res.png
Figures: input image; output image (the borderless table is not detected).