v3.5.0 — Image OCR, book scans, map-reduce summaries, source-aware tones

@MKS-01 released this 17 Jun 19:28
c1bb357

What's new

  • Image OCR — drop an image path; Ollama vision extracts the text and reads it aloud (_ocr_via_ollama, auto pick_vision_model)
  • Multi-page / book scans — a folder or glob of page photos is OCR'd in filename order and stitched into one continuous document (fetch_multi_page)
  • Map-reduce summarization — long scans summarize end-to-end instead of truncating; _batches → condense → combine, recursion depth ≤ 3
  • Source-aware tones — URL reads as a livelier article (temp 0.8); image/folder reads as a measured book (temp 0.6) that opens by naming its chapter/topic. Auto by source, no new commands

Details

  • New pipeline/tones.py: Tone dataclass, ARTICLE / BOOK instances, classify_source, tone_for
  • extract.py: _book_title_from_text derives chapter/topic from first OCR lines; HEIC/TIFF/BMP/WebP → JPEG via sips
  • summarize.py: system param threaded through _summarize_once / _map_reduce; per-batch progress via WS
  • set_temperature on Synthesizer + CsmEngine for per-read delivery tuning
  • CLI input guard extended for absolute paths, globs, and tilde paths
  • Tests: test_tones.py, test_summarize_batches.py
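For orientation, the tone selection in pipeline/tones.py might be shaped like the sketch below. Only the names Tone, ARTICLE, BOOK, classify_source, tone_for and the two temperatures come from these notes; the field names, prompt strings, return values of classify_source, suffix list, and the fallback to ARTICLE for other inputs are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from urllib.parse import urlparse

# Assumed suffix list; the notes mention HEIC/TIFF/BMP/WebP conversion to JPEG.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".heic", ".tiff", ".tif", ".bmp", ".webp"}


@dataclass(frozen=True)
class Tone:
    name: str
    temperature: float
    system: str  # hypothetical field: system prompt passed to the summarizer


# Temperatures are from the release notes; the prompt text is made up.
ARTICLE = Tone("article", 0.8, "Read this as a lively article.")
BOOK = Tone("book", 0.6, "Read this as a measured book; open by naming the chapter or topic.")


def classify_source(source: str) -> str:
    """Return 'url', 'image', 'folder', or 'text' (assumed labels)."""
    if urlparse(source).scheme in ("http", "https"):
        return "url"
    # Glob patterns are treated like folders of page photos.
    if any(ch in source for ch in "*?["):
        return "folder"
    path = Path(os.path.expanduser(source))
    if path.suffix.lower() in IMAGE_SUFFIXES:
        return "image"
    if path.is_dir() or source.endswith("/"):
        return "folder"
    return "text"


def tone_for(source: str) -> Tone:
    """Pick the tone from the source kind, with no user-facing command."""
    # Assumption: anything that is not an image or folder reads as an article.
    return BOOK if classify_source(source) in ("image", "folder") else ARTICLE
```

A caller could then pass tone_for(source).temperature to set_temperature before each read, which is one plausible way the per-read delivery tuning fits together.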

Full Changelog: v3.3.0...v3.5.0