This repository was archived by the owner on Apr 11, 2025. It is now read-only.

Description
The network parser runs indefinitely
When parsing a table with many different alignments, the parser never terminates.
The hang occurs in the Network parser. Because the Hybrid parser depends on it, the Hybrid parser hangs as well.
Steps to reproduce the bug
- Parse the 4th page of tabula/schools.pdf with the network or hybrid parser.
- The call never returns.
Expected behavior
The call should terminate: either a parsing error is raised or a table is returned.
Code
import pypdf_table_extraction

pdf_file, kwargs = "tabula/schools.pdf", {"pages": "4"}
tables = pypdf_table_extraction.read_pdf(pdf_file, flavor="network", debug=True, **kwargs)
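
Since the call above never returns, it can be convenient to run the reproduction in a separate interpreter with a time limit, so the hang is reported instead of blocking the session. The following is a minimal sketch of that idea, not part of the library: `run_snippet` and `REPRO` are hypothetical names introduced here, and the 60-second limit in the usage note is an arbitrary assumption.

```python
import subprocess
import sys

# The reproduction from this report, kept as a string so it can run in a
# child interpreter that can be killed. REPRO is a hypothetical name.
REPRO = '''
import pypdf_table_extraction
tables = pypdf_table_extraction.read_pdf(
    "tabula/schools.pdf", flavor="network", debug=True, pages="4"
)
print(tables)
'''


def run_snippet(code, timeout=60.0):
    """Run `code` in a fresh interpreter and return (status, output).

    status is "ok", "error" (non-zero exit code) or "timeout".
    """
    try:
        done = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so no
        # runaway parser process is left behind.
        return ("timeout", None)
    if done.returncode != 0:
        return ("error", done.stderr.strip())
    return ("ok", done.stdout.strip())
```

Calling `run_snippet(REPRO, timeout=60)` would be expected to report a `"timeout"` status while the bug is present, and `"ok"` or `"error"` once the parser terminates.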
PDF
Screenshots

Environment
- OS: [e.g. macOS]
- Python version: 3.10
- Numpy version: 1.5.3
- OpenCV version:
- Ghostscript version: 0.7
- pypdf_table_extraction version: from repo, between release 0.0.2 and 1.0.0
Additional context