Inconsistent Table Detection Across Pages

@xavctn Hi!

I'm currently working on implementing table extraction as part of a custom RAG flow and found img2table to be a great tool for handling tabular data. However, since I'm aiming to build a reliable solution that works for 99.9% of my documents, I wanted to clarify the library's behavior as I noticed some inconsistencies.

Specifically, the attached document contains a small table in the header of each page. img2table correctly detects the table on page 1, but fails to detect the same (or very similar) tables on pages 2 and 3. I’m wondering what might be causing this, as the table layout is nearly identical across all pages.

Here’s the basic code I’m using:

```python
from img2table.document import PDF

pdf = PDF(pdf_document_path)
pdf_tables = pdf.extract_tables()
```
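To make the inconsistency easier to reproduce, here is a minimal sketch of how the per-page result could be summarized. It assumes `extract_tables()` returns a dict keyed by 0-based page index with a list of tables per page. The helper names `pages_without_tables` and `report_missing_pages` and the `page_count` argument are hypothetical, introduced only for illustration.

```python
def pages_without_tables(tables_by_page, page_count):
    """Return the 0-based page indices for which no table was detected.

    tables_by_page: dict mapping page index -> list of detected tables
                    (assumed shape of the img2table result).
    page_count:     number of pages expected in the document.
    """
    return [
        page
        for page in range(page_count)
        # A missing key and an empty list are both treated as "no table".
        if not tables_by_page.get(page)
    ]


def report_missing_pages(pdf_document_path, page_count):
    """Run the default extraction and list pages where nothing was found."""
    # Imported lazily so the helper above can be used without img2table installed.
    from img2table.document import PDF

    pdf = PDF(pdf_document_path)
    pdf_tables = pdf.extract_tables()
    return pages_without_tables(pdf_tables, page_count)
```

For the attached three-page document, the behavior described above would correspond to `report_missing_pages(pdf_document_path, 3)` returning `[1, 2]`, i.e. pages 2 and 3.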

Any suggestions or insights would be greatly appreciated!

[example.pdf](https://github.com/user-attachments/files/19430511/example.pdf)


Inconsistent Table Detection Across Pages #248
