cells2table

Parsing tables in document images with cell detection models

Implemented pipelines

PaddlePaddle

Classification model (wired / wireless)
Cell detection model with different weights for each class

Uses ONNX weights downloaded automatically from Hugging Face on first use.
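The two stages listed above can be pictured as a small dispatch step: classify the table first, then run the cell detector that matches the predicted class. The sketch below is only an illustration of that flow under stated assumptions. `parse_table`, `fake_classifier` and the `detectors` mapping are hypothetical names, not part of the cells2table API, and the stand-in models return fixed values instead of running ONNX weights.

```python
from typing import Callable, Dict, List, Tuple

# (x0, y0, x1, y1) in pixels; an assumed box format for this sketch.
Box = Tuple[float, float, float, float]


def parse_table(
    image: object,
    classify: Callable[[object], str],
    detectors: Dict[str, Callable[[object], List[Box]]],
) -> Tuple[str, List[Box]]:
    """Classify the table, then detect cells with the class-specific detector."""
    table_class = classify(image)
    if table_class not in detectors:
        raise ValueError(f"no detector registered for class {table_class!r}")
    return table_class, detectors[table_class](image)


# Hypothetical stand-in models so the sketch runs without any weights.
def fake_classifier(image: object) -> str:
    return "wired"


detectors = {
    "wired": lambda image: [(0.0, 0.0, 50.0, 20.0), (50.0, 0.0, 100.0, 20.0)],
    "wireless": lambda image: [(0.0, 0.0, 100.0, 20.0)],
}

table_class, cells = parse_table(None, fake_classifier, detectors)
print(table_class, len(cells))
```

Keeping one detector per class behind a single mapping means the classification result is the only thing that decides which weights are used.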

Installation

With uv, add to your project with:

uv add cells2table

ONNX models need an ONNX Runtime build installed to run. You can install one yourself or use one of the preconfigured optionals.

Optional	Description
`docling`	For docling usage
`huggingface`	For downloading models
`onnx_cuda`	For NVIDIA GPUs
`onnx_openvino`	For Intel GPUs and CPUs
`onnx_cpu`	Default CPU runtime
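To check which runtime actually ended up in your environment, ONNX Runtime can list its execution providers. The snippet below is a minimal sketch, not something cells2table requires. `pick_provider` and `installed_providers` are hypothetical helpers, the preference order is an assumption, and the mapping of the `onnx_cuda`, `onnx_openvino` and `onnx_cpu` optionals to these provider names is presumed rather than confirmed by the project.

```python
import importlib.util
from typing import List, Sequence

# Assumed preference order: GPU first, then OpenVINO, then plain CPU.
PREFERRED = (
    "CUDAExecutionProvider",
    "OpenVINOExecutionProvider",
    "CPUExecutionProvider",
)


def pick_provider(available: Sequence[str], preferred: Sequence[str] = PREFERRED) -> str:
    """Return the first preferred provider that is present in `available`."""
    for name in preferred:
        if name in available:
            return name
    raise RuntimeError("none of the preferred execution providers is available")


def installed_providers() -> List[str]:
    """List ONNX Runtime providers, or return [] when no runtime is installed."""
    if importlib.util.find_spec("onnxruntime") is None:
        return []
    import onnxruntime

    return onnxruntime.get_available_providers()


providers = installed_providers()
print(providers or "No ONNX Runtime found; install one of the optionals first.")
```

For example, `pick_provider(["CPUExecutionProvider", "CUDAExecutionProvider"])` selects the CUDA provider, while an environment that only has the CPU build falls back to `CPUExecutionProvider`.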

Usage

cells2table only extracts structural information from tables. Another library is needed to extract the content of the cells.
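One common way to combine the two is to take text boxes from an OCR or PDF text layer and assign each word to the cell that contains its center. The sketch below illustrates that idea only. The `(x0, y0, x1, y1)` box format, the `fill_cells` helper and the sample data are assumptions made for the example and do not reflect the data structures cells2table returns.

```python
from typing import Dict, List, Tuple

# (x0, y0, x1, y1) in pixels; an assumed box format for this sketch.
Box = Tuple[float, float, float, float]


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of a box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def contains(box: Box, point: Tuple[float, float]) -> bool:
    """True when the point lies inside the box (right and bottom edges excluded)."""
    x0, y0, x1, y1 = box
    x, y = point
    return x0 <= x < x1 and y0 <= y < y1


def fill_cells(cells: List[Box], words: List[Tuple[str, Box]]) -> Dict[int, str]:
    """Map each cell index to the words whose centers fall inside that cell.

    Words are assumed to arrive in reading order.
    """
    texts: Dict[int, List[str]] = {i: [] for i in range(len(cells))}
    for text, box in words:
        point = center(box)
        for i, cell in enumerate(cells):
            if contains(cell, point):
                texts[i].append(text)
                break  # a word belongs to at most one cell
    return {i: " ".join(parts) for i, parts in texts.items()}


# Two side-by-side cells and three words from a hypothetical text extractor.
cells = [(0, 0, 50, 20), (50, 0, 100, 20)]
words = [
    ("Item", (5, 5, 30, 15)),
    ("Unit", (55, 5, 75, 15)),
    ("price", (78, 5, 98, 15)),
]
print(fill_cells(cells, words))
```

When the Docling plugin described in the next section is used, the surrounding pipeline takes care of this step.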

Docling

A docling plugin is provided so that cells2table can be integrated into a complete document conversion pipeline.

Usage example:

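from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

from cells2table.docling import CustomDoclingTableStructureOptions

# Allow external plugins and select the cells2table table structure model.
pipeline_options = PdfPipelineOptions(
    allow_external_plugins=True,
    table_structure_options=CustomDoclingTableStructureOptions(),
)

# Apply the same pipeline options to PDF and image inputs.
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options),
        InputFormat.IMAGE: PdfFormatOption(pipeline_options=pipeline_options),
    }
)

# Convert the document and print it as Markdown, tables included.
result = converter.convert("path/to/document.pdf")
print(result.document.export_to_markdown())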
