Release v0.1.0 · PhilixTheExplorer/lexo

First public release. Lexo (Local EXtraction and OCR) is a local-first desktop
document OCR tool that turns PDFs and images into clean, editable text, with
strong support for Burmese (Myanmar script) using free, high-accuracy Google
Docs OCR. Everything runs on your machine; the only network call is the optional
OCR, on your own Google account.

It is a complete, from-scratch rebuild of the old OCR Text Extractor (the legacy
Tkinter app is gone).

Highlights

- Smart OCR routing: digital PDF pages use their embedded text layer
  (instant, lossless); only scanned pages are OCR'd. --force-ocr overrides.
- Free Burmese-first OCR: Google Docs OCR via the Drive API, on your own
  account, behind a pluggable provider port.
- PDF operations: extract page ranges, split, crop, rotate, merge, and
  split two-up spreads.
- Desktop GUI + CLI: a PySide6 app with a visual crop/split editor and a
  proofread pane, and a scriptable Typer CLI, both over the same engine.
- Burmese-aware text: NFC normalization, zero-width-space-safe cleaning,
  and a bundled Noto Sans Myanmar font.
- Exports: plain text (default), Markdown (YAML frontmatter), and JSONL.
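
As a rough illustration of two of the highlights above (smart OCR routing and Burmese-aware cleaning), here is a minimal Python sketch. It is not Lexo's actual code: the `Page` record, `clean_text`, `extract_page`, and the `ocr` callback are hypothetical names made up for this example, and the cleaning rules are an assumption about what "zero-width-space-safe" might mean in practice (ordinary spaces are tidied, U+200B is left alone because Burmese text commonly uses it to mark break points between words).

```python
import unicodedata
from dataclasses import dataclass
from typing import Callable, Optional

ZWSP = "\u200b"  # zero-width space, often used between Burmese words


@dataclass
class Page:
    """Hypothetical page record; not Lexo's real data model."""
    number: int
    embedded_text: Optional[str]  # None or blank for a scanned page


def clean_text(text: str) -> str:
    """NFC-normalize and collapse ordinary spaces, leaving U+200B intact."""
    text = unicodedata.normalize("NFC", text)
    cleaned = []
    for line in text.splitlines():
        # Split on the plain space only, so zero-width spaces are never touched.
        parts = line.replace("\t", " ").split(" ")
        cleaned.append(" ".join(p for p in parts if p))
    return "\n".join(cleaned).strip("\n")


def extract_page(page: Page, ocr: Callable[[Page], str],
                 force_ocr: bool = False) -> str:
    """Use the embedded text layer when present; otherwise call the OCR provider."""
    has_text_layer = bool(page.embedded_text and page.embedded_text.strip())
    if has_text_layer and not force_ocr:
        raw = page.embedded_text  # digital page: no OCR call needed
    else:
        raw = ocr(page)  # scanned page, or routing overridden by force_ocr
    return clean_text(raw)
```

The `ocr` argument stands in for the pluggable provider: any callable that takes a page and returns text can be passed in, which also makes the routing easy to test without a network call.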

Install

uv tool install lexo

OCR needs a one-time Google Drive API setup (bring your own OAuth client). See the
README.

Full changelog: see CHANGELOG.md.
