This project automates table extraction from scanned images or documents using a hybrid deep learning approach:
- Table detection with Hugging Face's Table Transformer.
- Cropping & padding detected regions.
- Table structure recognition with PaddleOCR PPStructureV3.
- Export to Excel for easy downstream use.
- Detects multiple tables in a single page image.
- Automatically crops detected regions with margin padding.
- Recognizes table structure and cell text.
- Exports structured tables into Excel (`.xlsx`).
- Supports both CPU and GPU acceleration.
- Works with JPEG / PNG / scanned document images.
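As a rough illustration of the CPU/GPU point above, device selection could look like the sketch below. `pick_device` is a hypothetical helper, not something taken from `main.py`. It assumes PyTorch's `torch.cuda.is_available()` check and falls back to CPU when PyTorch is not installed.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu".

    Hypothetical helper: the actual script may choose its device differently.
    """
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Running on: {pick_device()}")
```

The returned string can be passed to a PyTorch model's `.to(...)` call. How PaddleOCR is pointed at a GPU depends on the installed PaddlePaddle build and is not covered here.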
project/
├── main.py            # Main pipeline script
├── requirements.txt   # Python dependencies
├── README.md          # Documentation
├── input/             # Place input images here
└── output/            # Cropped tables + extracted Excel files
git clone https://github.com/yourusername/table-extraction.git
cd table-extraction
python3 -m venv venv
source venv/bin/activate # Linux / Mac
venv\Scripts\activate # Windows
pip install -r requirements.txt
- Place your input image (e.g., `try.jpeg`) in the `input/` folder.
- Run the script:
python main.py
- Outputs:
  - Cropped tables: stored as `.png` in `output/`.
  - Extracted structured tables: saved as Excel `.xlsx` in `output/`.
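For readers who want to process every image in `input/` rather than a single file, the sketch below shows one way to collect candidates. `find_input_images` and `IMAGE_SUFFIXES` are hypothetical names, and the extension list is an assumption based on the JPEG / PNG support mentioned above, not a description of what `main.py` actually does.

```python
from pathlib import Path

# Extensions treated as supported input (assumption, see the note above).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_input_images(input_dir: str = "input") -> list[Path]:
    """Return supported image files in input_dir, sorted by path."""
    folder = Path(input_dir)
    if not folder.is_dir():
        return []  # a missing folder simply yields no work
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


if __name__ == "__main__":
    for image_path in find_input_images():
        print(f"Found input image: {image_path}")
```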
- Input Image
  - User provides a scanned image/document (`JPEG` or `PNG`).
- Table Detection
  - Hugging Face Table Transformer locates tables and returns bounding boxes.
- Cropping & Expansion
  - Each bounding box is expanded by a 0.3 cm margin (converted to pixels).
  - The table region is cropped and saved as an image.
- Table Recognition
  - Cropped tables are passed to PaddleOCR PPStructureV3.
  - It recognizes:
    - Table grid structure (rows/columns).
    - Text inside each cell.
- Export Results
  - Each detected table is exported as:
    - `table_XX.png` (cropped table image).
    - `table_XX.xlsx` (structured Excel file).
- Logs & Timing
  - Console prints:
    - Number of detected tables.
    - Confidence scores.
    - Output file paths.
    - Total execution time.
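The margin step in the workflow above (0.3 cm converted to pixels) can be sketched as follows. The conversion needs a resolution, which this README does not state, so the 300 DPI default below is purely an assumption. `cm_to_pixels` and `expand_box` are hypothetical helper names, not functions from `main.py`.

```python
CM_PER_INCH = 2.54


def cm_to_pixels(cm: float, dpi: float = 300.0) -> int:
    """Convert a physical length to pixels. The 300 DPI default is assumed."""
    return round(cm / CM_PER_INCH * dpi)


def expand_box(box, image_size, margin_cm=0.3, dpi=300.0):
    """Grow an (x0, y0, x1, y1) box by a margin and clamp it to the image.

    image_size is (width, height) in pixels, as reported by Pillow's Image.size.
    """
    x0, y0, x1, y1 = box
    width, height = image_size
    pad = cm_to_pixels(margin_cm, dpi)
    return (
        max(0, int(x0) - pad),
        max(0, int(y0) - pad),
        min(width, int(x1) + pad),
        min(height, int(y1) + pad),
    )


if __name__ == "__main__":
    # A 0.3 cm margin at 300 DPI is about 35 pixels on each side.
    print(expand_box((100, 100, 200, 200), (1000, 1000)))
```

The resulting tuple has the shape Pillow's `Image.crop` expects, so it can be passed there directly. Clamping keeps the padded box inside the page when a table sits near an edge.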
python main.py
Example output:
Table 0 cropped: output/table_00.png (score=0.912)
Table 1 cropped: output/table_01.png (score=0.878)
Parsed results saved in output/
Execution time: 1.42 mins
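Log lines in the format shown above could be produced along these lines. This is a minimal sketch: `crop_log_line` and `elapsed_line` are hypothetical helpers, the scores are copied from the example output rather than computed, and the real script may format its messages differently.

```python
import time
from pathlib import Path


def crop_log_line(index: int, score: float, out_dir: str = "output") -> str:
    """Build the per-table message, using a zero-padded table_XX.png name."""
    path = Path(out_dir) / f"table_{index:02d}.png"
    return f"Table {index} cropped: {path.as_posix()} (score={score:.3f})"


def elapsed_line(start: float, end: float) -> str:
    """Report elapsed wall-clock time, given in seconds, as minutes."""
    return f"Execution time: {(end - start) / 60:.2f} mins"


if __name__ == "__main__":
    start = time.perf_counter()
    # Placeholder scores taken from the example output above.
    for i, score in enumerate([0.912, 0.878]):
        print(crop_log_line(i, score))
    print("Parsed results saved in output/")
    print(elapsed_line(start, time.perf_counter()))
```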
- PyTorch: Table Transformer inference.
- Hugging Face Transformers: Pretrained model loading.
- Pillow: Image manipulation.
- PaddleOCR (PPStructureV3): Table structure + OCR.
- NumPy: Array operations.