Releases: PhilixTheExplorer/lexo

v0.1.1

17 Jun 10:46
85f35fb

Changed

  • Use SPDX license expression for verified PyPI metadata.
  • Expand PyPI classifiers: add environment, intended audience, and per-version Python tags.
  • Add Changelog and Releases links to project URLs.
  • Change the README logo to an absolute URL so it renders on PyPI.

v0.1.0

17 Jun 10:06
81a140d

First public release. Lexo (Local EXtraction and OCR) is a local-first desktop
document OCR tool that turns PDFs and images into clean, editable text, with
strong support for Burmese (Myanmar script) using free, high-accuracy Google
Docs OCR. Everything runs on your machine; the only network call is the optional
OCR step, which goes through your own Google account.

It is a complete, from-scratch rebuild of the old OCR Text Extractor (the legacy
Tkinter app is gone).

Highlights

  • Smart OCR routing — digital PDF pages use their embedded text layer
    (instant, lossless); only scanned pages are OCR'd. --force-ocr overrides.
  • Free Burmese-first OCR — Google Docs OCR via the Drive API, on your own
    account, behind a pluggable provider port.
  • PDF operations — extract page ranges, split, crop, rotate, merge, and
    split two-up spreads.
  • Desktop GUI + CLI — a PySide6 app with a visual crop/split editor and a
    proofread pane, and a scriptable Typer CLI, both over the same engine.
  • Burmese-aware text — NFC normalization, zero-width-space-safe cleaning,
    and a bundled Noto Sans Myanmar font.
  • Exports — plain text (default), Markdown (YAML frontmatter), and JSONL.
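
To illustrate the "Burmese-aware text" highlight, here is a minimal sketch of NFC normalization combined with whitespace cleaning that leaves zero-width spaces alone. It is not Lexo's actual code: the function name `clean_burmese_text` is hypothetical, and it assumes "zero-width-space-safe" means that U+200B, which Burmese text often uses as an invisible word-break hint, is never stripped or collapsed.

```python
import re
import unicodedata

ZWSP = "\u200b"  # zero-width space, commonly used as a word-break hint in Burmese


def clean_burmese_text(text: str) -> str:
    """Hypothetical cleaner: NFC-normalize and tidy whitespace, keeping ZWSP."""
    # Compose canonically equivalent sequences into their NFC form.
    text = unicodedata.normalize("NFC", text)
    # Unify line endings so the line-based steps below behave predictably.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Collapse runs of visible horizontal whitespace only. The character
    # class lists space, tab, and no-break space explicitly, so U+200B
    # is never matched.
    text = re.sub(r"[ \t\u00a0]+", " ", text)
    # Trim ordinary spaces at line edges; strip(" ") cannot remove ZWSP.
    text = "\n".join(line.strip(" ") for line in text.split("\n"))
    # Allow at most one blank line between paragraphs.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip("\n")


if __name__ == "__main__":
    raw = "\u1005\u102c" + ZWSP + "\u1005\u102c   \n\n\n\n\u1005\u102c\n"
    cleaned = clean_burmese_text(raw)
    print(ZWSP in cleaned)  # the word-break hint survives cleaning
```

Listing the whitespace characters explicitly, rather than relying on a generic "strip all invisible characters" pass, is what keeps the word-break hints intact in this sketch.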

Install

uv tool install lexo

OCR needs a one-time Google Drive API setup (bring your own OAuth client). See the
README.

Full changelog: see CHANGELOG.md.