CheckBoxOCR

A CLI tool that detects checked and unchecked checkboxes in PDFs and images.

This model is optimised for, and mainly trained on, computer-generated checkboxes. It may struggle with handwritten ones; this will be improved in a future update.

How to use

Run the executable from the CheckBoxOCR folder with your file as an argument:

./CheckBoxOCR/ocr yourfile.pdf

On Windows:

CheckBoxOCR\ocr.exe yourfile.pdf
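If you call the tool from a script that has to run on more than one platform, the two commands above can be folded into a small helper. This is a minimal sketch: the function name `ocr_binary` and its parameters are hypothetical, and only the two executable paths come from this README.

```python
import os
from pathlib import Path


def ocr_binary(base_dir="CheckBoxOCR", os_name=os.name):
    """Return the path to the OCR executable for the given platform.

    This helper and its arguments are illustrative; only the file names
    `ocr` and `ocr.exe` are taken from the README.
    """
    # os.name is "nt" on Windows, where the executable is named ocr.exe.
    filename = "ocr.exe" if os_name == "nt" else "ocr"
    return Path(base_dir) / filename
```

Calling `ocr_binary()` with no arguments picks the name for the platform the script is running on.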

Supported input file types:

Documents (.pdf)
Images (.png, .jpg, .jpeg, .bmp)
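When feeding many files to the tool, it can help to filter out unsupported types first. The sketch below checks a file name against the extensions listed above; the helper name `is_supported` is hypothetical, and treating upper-case extensions as equivalent is an assumption, since the README does not say how the tool handles them.

```python
from pathlib import Path

# Extensions listed in this README as supported inputs.
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".bmp"}


def is_supported(path):
    """Return True if the file extension is one the README lists as supported."""
    # Assumption: extensions are compared case-insensitively (.PDF == .pdf).
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```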

Output

Annotated images showing the detected checkboxes are saved in the output/ folder (created automatically on first run)
The console output lists the status and position of each checkbox on each page
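To pick up the annotated images from a script, you can list the contents of the output/ folder after a run. This is a rough sketch: the helper name `list_output_files` is hypothetical, and it assumes output/ is created relative to the directory you run the tool from. The README does not specify the names or format of the annotated files, so the sketch simply returns every file it finds.

```python
from pathlib import Path


def list_output_files(output_dir="output"):
    """Return the files currently in the output folder, sorted by path."""
    folder = Path(output_dir)
    if not folder.is_dir():
        # The folder is only created on first run, so it may not exist yet.
        return []
    return sorted(p for p in folder.iterdir() if p.is_file())
```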

Integration

c.pdf is a sample PDF file included so you can check that the model runs.

You can integrate this tool into your own automation or pipelines by calling the binary directly. Replace yourfile.pdf with the image or PDF you want to process.

Example (shell):

./CheckBoxOCR/ocr yourfile.pdf

Example (Python):

import subprocess
# Run the tool on yourfile.pdf; its output is printed to the console
subprocess.run(["./CheckBoxOCR/ocr", "yourfile.pdf"])

This makes it easy to include checkbox detection in document processing systems or form analysis flows.
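In a pipeline you will usually want the console output as a string rather than printed to the terminal. The sketch below wraps the call in a function that captures it. The function name `run_checkbox_ocr` and the `command` parameter are hypothetical, and treating a non-zero exit code as a failure is an assumption, since the README does not document exit codes or the exact output format.

```python
import subprocess


def run_checkbox_ocr(input_path, command=("./CheckBoxOCR/ocr",)):
    """Run the tool on one file and return its console output as text."""
    result = subprocess.run(
        [*command, str(input_path)],
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the captured bytes to str
    )
    # Assumption: a non-zero exit code signals failure. Exit codes are not
    # documented in the README, so adjust this check if needed.
    if result.returncode != 0:
        raise RuntimeError(
            f"ocr failed ({result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout
```

For example, `text = run_checkbox_ocr("yourfile.pdf")` returns whatever the tool wrote to the console, which you can then log or parse. On Windows, pass `command=("CheckBoxOCR\\ocr.exe",)`.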

This is a packaged tool, so there is no need to install any other packages.

Any problems? Feel free to raise them via the Issues tab. Model improvements will be released bi-monthly.
