v4.2.0

@richardr1126 richardr1126 released this 07 Jun 20:11
· 85 commits to main since this release

✨ What's New

🚀 New

  • Multilingual Reader & Speech Support:
    • Added full support for reading multilingual documents in PDF, EPUB, and HTML viewers, passing the document language down to the segmenters and text tokenizers.
    • Implemented automatic EPUB metadata language detection.
    • Added warning alerts in the reader UI and audiobook creation panel when the document language does not match the selected voice's language.
    • Integrated Intl.Segmenter for word/sentence tokenization, allowing proper parsing of non-space-separated languages (e.g. Japanese, Chinese).
    • Forwarded detected/selected language options dynamically to capable TTS providers (OpenAI, Replicate).
  • Background Tasks Engine: A database-backed task runner featuring atomic claims/leases, execution tracking, an in-process loop for self-hosted deployments, and a Vercel cron route (/api/admin/tasks/tick) that is invoked daily.
  • Admin Tasks UI: An administrative control panel in Settings to view background tasks, trigger runs manually, and configure execution intervals.
  • Document Blob Leasing & Mutation Locks: A concurrency locking mechanism using S3/object storage leases with exponential backoff and jitter to prevent race conditions during upload finalization.
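For readers unfamiliar with the Intl.Segmenter tokenization mentioned under Multilingual Reader & Speech Support, here is a minimal sketch of locale-aware word and sentence splitting. The helper names `splitWords` and `splitSentences` are hypothetical and are not taken from the project's code; only `Intl.Segmenter` itself is a standard API.

```typescript
// Hypothetical helpers for illustration; not the project's actual segmenters.

// Split text into words for the given locale. Unlike splitting on spaces,
// this also works for languages written without spaces between words.
function splitWords(text: string, locale: string): string[] {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  const words: string[] = [];
  for (const part of segmenter.segment(text)) {
    // isWordLike is false for punctuation and whitespace segments.
    if (part.isWordLike) words.push(part.segment);
  }
  return words;
}

// Split text into trimmed, non-empty sentences for the given locale.
function splitSentences(text: string, locale: string): string[] {
  const segmenter = new Intl.Segmenter(locale, { granularity: "sentence" });
  const sentences: string[] = [];
  for (const part of segmenter.segment(text)) {
    const sentence = part.segment.trim();
    if (sentence.length > 0) sentences.push(sentence);
  }
  return sentences;
}

console.log(splitWords("Hello, world!", "en"));
console.log(splitWords("今日は天気がいいです", "ja"));
console.log(splitSentences("Hello there. How are you?", "en"));
```

The exact word boundaries for Japanese or Chinese text depend on the ICU data shipped with the runtime, so the sketch does not assume a particular split.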

⚡ Improved

  • User Data Management (Export & Claiming):
    • Expanded data export to include new metadata types (job events, document settings, auth sessions, linked accounts, and TTS segment cache tables) as well as the generated TTS segment audio files stored in S3/object storage.
    • Upgraded guest-to-account claiming to copy S3 synthesized speech assets and document settings to the new account.
  • Preferences Inheritance & Syncing: Decoupled the preference architecture so that users can inherit global admin configurations; preferences sync automatically between the local Dexie store and server-side profiles.
  • Replicate Parameter Mapping: Added a graph-walking OpenAPI parser that dynamically resolves the voice and language input parameters for custom Replicate models (Kokoro, Gemini, Minimax, Qwen, Inworld), with LRU caching.
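To illustrate the LRU caching mentioned for Replicate parameter mapping, the following is a minimal sketch of a least-recently-used cache built on `Map` insertion order. The class name `LruCache` and the idea of keying it by model name are assumptions for illustration; the release notes do not describe the project's actual cache.

```typescript
// Hypothetical LRU cache for illustration; not the project's implementation.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new RangeError("capacity must be at least 1");
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // A Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }
}

// Usage: cache resolved parameter names per (assumed) model key.
const resolved = new LruCache<string, { voice: string; language: string }>(2);
resolved.set("model-a", { voice: "voice", language: "language" });
resolved.set("model-b", { voice: "speaker", language: "lang" });
resolved.get("model-a"); // touch model-a so model-b becomes the oldest
resolved.set("model-c", { voice: "voice_id", language: "language_code" });
console.log(resolved.has("model-b")); // model-b was evicted
```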

🐛 Fixed

  • Preview Generation: Fixed a document preview generation issue (#106).
  • Highlighting Performance: Added a fast linear scan optimizer that resolves exact highlight matches instantly, preventing UI thread freezes on large documents.
  • TTS API Gateway Fallback: Automatically catches 400/422 errors from OpenAI-compatible gateways that reject custom language parameters, retrying and caching the request without the parameter.
  • User Storage Deletion Cleanup: Overhauled account deletion to purge user-scoped S3 asset prefixes (audiobooks, segments, and temp uploads) before database rows cascade-delete, preventing orphaned storage files.
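The TTS API Gateway Fallback fix above can be pictured roughly as follows. This is a sketch under stated assumptions, not the project's code: the `TtsRequest` shape, the injected `send` function, and the choice to remember rejecting gateways in a `Set` are all hypothetical, and the notes do not say exactly what is cached.

```typescript
// Hypothetical types and names for illustration only.
type TtsRequest = { input: string; voice: string; language?: string };
type TtsResponse = { status: number };
type Send = (req: TtsRequest) => Promise<TtsResponse>;

// Assumption: remember, per gateway, that the language parameter is rejected.
const gatewaysRejectingLanguage = new Set<string>();

async function synthesize(
  gateway: string,
  req: TtsRequest,
  send: Send,
): Promise<TtsResponse> {
  const { language, ...withoutLanguage } = req;

  // Skip the parameter when it is absent or known to be rejected.
  if (language === undefined || gatewaysRejectingLanguage.has(gateway)) {
    return send(withoutLanguage);
  }

  const first = await send(req);
  if (first.status !== 400 && first.status !== 422) return first;

  // Retry once without the language parameter.
  const retry = await send(withoutLanguage);
  // Only remember the gateway if dropping the parameter actually helped,
  // so an unrelated 400 does not disable language forwarding.
  if (retry.status < 400) gatewaysRejectingLanguage.add(gateway);
  return retry;
}
```

Injecting `send` keeps the sketch independent of any particular HTTP client; in practice it would wrap whatever request call the provider integration uses.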

⚠️ Upgrading from v4.1.2

  • On self-hosted deployments, the task scheduler starts in-process automatically; Vercel deployments instead rely on a daily cron route (/api/admin/tasks/tick).
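As a rough picture of what a scheduler tick might do when it claims work under a lease, here is an in-memory sketch. The `Task` shape and the `claimDueTask` function are hypothetical; the actual engine is database-backed, where the same check-and-set would need to happen in a single atomic statement or transaction.

```typescript
// Hypothetical task shape for illustration; not the project's schema.
type Task = {
  id: string;
  nextRunAt: number;        // epoch milliseconds
  leaseOwner: string | null;
  leaseExpiresAt: number;   // epoch milliseconds
};

// Claim the first due task whose lease is free or expired.
// Returns the claimed task, or null if nothing can be claimed.
function claimDueTask(
  tasks: Task[],
  workerId: string,
  now: number,
  leaseMs: number,
): Task | null {
  for (const task of tasks) {
    const due = task.nextRunAt <= now;
    const leaseFree = task.leaseOwner === null || task.leaseExpiresAt <= now;
    if (due && leaseFree) {
      task.leaseOwner = workerId;
      task.leaseExpiresAt = now + leaseMs;
      return task;
    }
  }
  return null;
}

const tasks: Task[] = [
  { id: "cleanup", nextRunAt: 1_000, leaseOwner: null, leaseExpiresAt: 0 },
];
const claimedByA = claimDueTask(tasks, "worker-a", 1_000, 5_000);
const claimedByB = claimDueTask(tasks, "worker-b", 2_000, 5_000);
console.log(claimedByA?.id, claimedByB); // worker-b gets nothing while the lease is held
```

An expired lease lets another worker pick the task up again, which is what allows recovery if the original worker crashes mid-run.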

Full Changelog: v4.1.2...v4.2.0