Releases: TrueLipstick/TrueLipstick-pdf-image-ocr-extractor

PDF & Image OCR Text Extractor v1.0.0 - Offline Tesseract OCR for Open WebUI

07 Mar 03:11

PDF & Image OCR Text Extractor for Open WebUI

First release! Features:

  • Text-based PDF extraction via PyMuPDF
  • Scanned/image PDF OCR via Tesseract
  • Standalone image OCR (PNG, JPG, TIFF, etc.)
  • Multi-language support (English + Swedish, configurable)
  • Dual input: chat uploads and file path/URL
  • Multiple output formats (text, markdown, save-to-file)
  • Fully offline — no external API calls
  • Docker-ready with custom Dockerfile
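
The first two features suggest a per-page decision: use the embedded text when PyMuPDF finds some, otherwise fall back to Tesseract. The sketch below is not the extractor's actual code; it is a minimal illustration under stated assumptions. The function names (`tesseract_lang`, `needs_ocr`, `extract_pdf_text`), the 20-character threshold, and the 300 DPI render setting are all hypothetical. It assumes PyMuPDF, pytesseract, and Pillow are installed, along with a local Tesseract binary and the `eng` and `swe` language data.

```python
from typing import Iterable


def tesseract_lang(codes: Iterable[str]) -> str:
    """Join language codes with '+', the form Tesseract accepts for multiple languages."""
    return "+".join(c.strip() for c in codes if c.strip())


def needs_ocr(text: str, min_chars: int = 20) -> bool:
    """Heuristic (an assumption): a page with almost no embedded text is treated as scanned."""
    return len(text.strip()) < min_chars


def extract_pdf_text(path: str, langs=("eng", "swe"), dpi: int = 300) -> str:
    """Return the text of every page, running OCR only on pages that lack embedded text."""
    # Imported lazily so the helpers above work without the OCR stack installed.
    import io

    import fitz  # PyMuPDF
    import pytesseract
    from PIL import Image

    lang = tesseract_lang(langs)
    pages = []
    with fitz.open(path) as doc:
        for page in doc:
            text = page.get_text()
            if needs_ocr(text):
                # Render the page to a PNG in memory and hand it to Tesseract.
                pix = page.get_pixmap(dpi=dpi)
                image = Image.open(io.BytesIO(pix.tobytes("png")))
                text = pytesseract.image_to_string(image, lang=lang)
            pages.append(text)
    return "\n\n".join(pages)
```

Calling `extract_pdf_text("scan.pdf")` with a hypothetical file name would process each page locally, with no network access, which matches the offline design described above.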