Python package for using OmniOcr: https://omniocr.ai
pip install omniocr
Get your API key from: https://omniocr.ai/
Then you can run OCR on documents from the command line:
export OMNIOCR_API_KEY=<OMNIOCR_API_KEY>
omniocr examples/resources/sample.pdf \
    --model=lightonocr-2-1b \
    --format=markdown \
    --pages "1-3" > output.md

Alternatively, you can run it programmatically:
from omniocr import OmniOcr
client = OmniOcr()
document = client.process(
    "examples/resources/sample.pdf",
    model="lightonocr-2-1b",
    format="markdown",
    pages="1-3",
)
print(document)

omniocr supports two output formats:
- Markdown conversion: the simplest option. The document is converted to Markdown, typically with placeholders for images.
- Block-based output: if you need bounding boxes showing where each piece of text comes from, use a model that supports bounding-box output.
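
To illustrate why bounding boxes are useful, here is a minimal sketch that selects text blocks inside a page region. It does not call omniocr. The `Block` class, its field names, the coordinate convention, and the sample data are all assumptions made up for this example, so the block format a given model actually returns may look different.

```python
from dataclasses import dataclass

# Hypothetical block schema, for illustration only. The real block-based
# output may use different field names and coordinate conventions.
@dataclass
class Block:
    text: str
    page: int
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1), origin at top-left


def blocks_in_region(blocks, page, region):
    """Return the blocks on `page` whose box lies fully inside `region`."""
    rx0, ry0, rx1, ry1 = region
    return [
        b for b in blocks
        if b.page == page
        and b.bbox[0] >= rx0 and b.bbox[1] >= ry0
        and b.bbox[2] <= rx1 and b.bbox[3] <= ry1
    ]


# Made-up sample data, not real OCR output.
sample = [
    Block("Invoice header", 1, (50, 40, 200, 60)),
    Block("Total: 42.00", 1, (400, 700, 550, 720)),
    Block("Page two text", 2, (50, 40, 300, 60)),
]

# Keep only the text found in the top 100 units of page 1.
header = blocks_in_region(sample, page=1, region=(0, 0, 612, 100))
print([b.text for b in header])
```

With Markdown output this kind of positional filtering is not possible, since the text carries no coordinates.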