feat: opt-in auto-OCR on image download + image-download fix (P13b-3) #141
Merged
…13b-3 precursor) classifyDownloadOutputs routed every image extension to `thumb`, so a single-image download (a photo or carousel) produced an empty media list — no library item at all. Now image files are tentative thumbnails: alongside a video/audio download they stay thumbnails, but when a download has no video/audio the images ARE the media (→ image items). Reuses mediaTypeForExt so classification matches the insert loop's type assignment. This unblocks auto-OCR-on-download (P13b-3) and fixes image downloads generally (they now appear in the library, with dimensions/OCR/etc.). https://claude.ai/code/session_013JoYmLCosYt5tQ8qwdbL1T
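The tentative-thumbnail rule in this commit message can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Outputs` class, the `classify` function, and both extension sets are hypothetical stand-ins, since the real `classifyDownloadOutputs` and `mediaTypeForExt` signatures are not shown here.

```dart
/// Hypothetical result type; the real return shape is an assumption.
class Outputs {
  final List<String> media;
  final List<String> thumbs;
  const Outputs(this.media, this.thumbs);
}

// Assumed extension sets, for illustration only.
const _videoAudioExts = {'mp4', 'mkv', 'webm', 'mp3', 'm4a'};
const _imageExts = {'jpg', 'jpeg', 'png', 'webp'};

/// Lower-cased extension without the dot, or '' when there is none.
String _ext(String path) {
  final dot = path.lastIndexOf('.');
  return dot < 0 ? '' : path.substring(dot + 1).toLowerCase();
}

/// Images start as tentative thumbnails and are promoted to media only
/// when the download contains no video/audio file.
Outputs classify(List<String> paths) {
  final media = <String>[];
  final tentativeThumbs = <String>[];
  for (final p in paths) {
    final ext = _ext(p);
    if (_videoAudioExts.contains(ext)) {
      media.add(p);
    } else if (_imageExts.contains(ext)) {
      tentativeThumbs.add(p);
    }
  }
  if (media.isEmpty) {
    // Image-only download: the images are the media, with no thumbnails.
    return Outputs(tentativeThumbs, const []);
  }
  // Video/audio present: images stay thumbnails (unchanged behaviour).
  return Outputs(media, tentativeThumbs);
}
```

Under these assumptions, `classify(['post.jpg'])` yields one media entry and no thumbnails, while `classify(['clip.mp4', 'clip.jpg'])` keeps `clip.jpg` as a thumbnail of the video.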
Follow-up to P13b-1: optionally auto-scan image downloads for text on completion, so search coverage grows automatically. Opt-in (default off); mirrors P13a-2 auto-summarize. OCR is free + offline (bundled ML Kit), so there's no model download or "needs setup" nudge.

- Settings: `autoOcrOnDownload` (default false) + setter.
- Pure `shouldAutoOcr` (enabled & engine-available & image & not-yet-scanned).
- Queue: gated block in `_persistCompleted` (after auto-summary) scans each image item via ocrEngine → `updateOcrText` (FTS reindexes); `ocrCount` in `_PersistResult`; an `ai` success inbox entry when text is found.
- Settings UI: an "Image text (OCR)" auto-scan card in AI & graph settings, shown only where ML Kit OCR runs.
- Tests: shouldAutoOcr truth table, settings round-trip, and queue cases (image+text → ocrText + entry; default-off no-op; video skipped). The realistic image test relies on the precursor classifier fix.
- Docs: P13-PLAN P13b-3 status, VERIFICATION P13b-3 + image-download fix.

No schema/deps change.

https://claude.ai/code/session_013JoYmLCosYt5tQ8qwdbL1T
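The four-way gate named in this commit message (enabled & engine-available & image & not-yet-scanned) might look like the sketch below. The `Item` class, its field names, and the parameter names are assumptions made for illustration; in particular, treating a null `ocrText` as "not yet scanned" is a guess at how the real code tracks scan state.

```dart
/// Hypothetical minimal view of a library item; field names are assumptions.
class Item {
  final String type; // e.g. 'image', 'video', 'audio'
  final String? ocrText; // assumed: null until the item has been scanned
  const Item(this.type, {this.ocrText});
}

/// Pure predicate: true only when all four conditions hold.
bool shouldAutoOcr({
  required bool enabled,
  required bool engineAvailable,
  required Item item,
}) =>
    enabled && engineAvailable && item.type == 'image' && item.ocrText == null;
```

Keeping the predicate free of I/O and settings lookups is what makes a plain truth-table test possible: each of the four inputs can be flipped independently without a queue or an OCR engine.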
…s (P13b-3 sweep)

Pre-merge sweep of the image-download/classification work:

- MediaThumb now falls back to the image FILE for `image` items with a null thumbnail (they were showing a movie-icon placeholder in the grid, dashboard, collections, hero shuttle, and related strips). Typed fallback icon for images is now image_outlined.
- classifyDownloadOutputs collapses an image + its yt-dlp `--write-thumbnail` sidecar to ONE item: with no video/audio, the largest image is the media and the next-largest is its thumbnail (carousels expand to one task/folder per photo, so multiple images here = photo + thumbnail). Prevents a duplicate image item and gives image items a real thumbnail.
- Quick wins in _persistCompleted: auto-transcribe skips image items (no wasted whisper transcode of a photo); durationSec gated to non-image.
- Tests: classifier photo+thumbnail collapse (real temp files), MediaThumb image null-thumb renders Image.file (not the movie icon), queue cases (image+thumbnail → one item with a thumbnail; whisper skipped on images).
- Docs: VERIFICATION (thumbnail rendering, single-item, export), BACKLOG (unconditional --write-thumbnail; non-mediaTypeForExt image formats), P13-PLAN P13b-3 sweep note.

https://claude.ai/code/session_013JoYmLCosYt5tQ8qwdbL1T
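The photo + sidecar collapse from this sweep can be illustrated with a small sketch. Everything here is hypothetical: the `ImagePick` class and `collapseImages` function are invented names, and reading "largest" as byte size is an assumption (the commit message only says the test uses real temp files). Sizes are passed in as a map so the example does not touch the filesystem.

```dart
/// Hypothetical pair: one media image plus its optional thumbnail.
class ImagePick {
  final String media;
  final String? thumb;
  const ImagePick(this.media, this.thumb);
}

/// Picks the largest image as the media and the next-largest as its
/// thumbnail. `sizes` maps path -> byte size (assumed meaning of "largest").
/// Returns null when there are no images at all.
ImagePick? collapseImages(Map<String, int> sizes) {
  if (sizes.isEmpty) return null;
  // Sort paths by size, largest first.
  final sorted = sizes.keys.toList()
    ..sort((a, b) => sizes[b]!.compareTo(sizes[a]!));
  return ImagePick(sorted[0], sorted.length > 1 ? sorted[1] : null);
}
```

With a full-size photo and a smaller sidecar in the map, the photo becomes the single media item and the sidecar becomes its thumbnail; a lone image yields a media item with a null thumbnail, which is the case the MediaThumb file fallback covers.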
What & why
Closes out P13b with opt-in auto-OCR on image downloads — plus a precursor bug fix the maintainer chose to land first, since it's the enabler.
Two commits:
1. fix: image-only downloads become image items (precursor)

   `classifyDownloadOutputs` routed every image extension (.jpg/.png/.webp/…) to `thumb`, so a single-image download (an Instagram/X photo, or a carousel) produced an empty media list → no library item at all. Auto-OCR (which scans `outputs.media`) could never fire, and image downloads were silently dropped generally.

   Fix: image files are tentative thumbnails. Alongside a video/audio download they stay thumbnails, but when a download has no video/audio, the images are the media (→ `image` items). Reuses `mediaTypeForExt` so classification matches the insert loop. The video+thumbnail case is unchanged. This also fixes image downloads app-wide (they now appear with dimensions, on-demand OCR, etc.).

2. feat: opt-in auto-OCR on download (P13b-3)

   Mirrors P13a-2 auto-summarize. OCR is free + offline (bundled ML Kit): no model download, no "needs setup" nudge.

   - `autoOcrOnDownload` (default off) + setter.
   - `shouldAutoOcr` (enabled & engine-available & image & not-yet-scanned).
   - `_persistCompleted` (after auto-summary) scans each image item → `updateOcrText` (FTS reindexes); `ocrCount` in `_PersistResult`; an `ai` success inbox entry when text is found (no nudge, since OCR is always available).

Tests

`dart format` clean · `flutter analyze` No issues · `flutter test` 800 passed: classifier image cases (image-only → media; carousel → all media; video+image → thumbnail unchanged), `shouldAutoOcr` truth table, settings round-trip, and queue cases (image+text → `ocrText` + `ai` entry; default-off no-op; video skipped). The realistic auto-OCR queue test relies on the precursor fix to produce an image item from a `.jpg` download. No schema/deps change.

Honest notes
This completes P13b (OCR + translation + auto-OCR). Next subphase is P13c — smart auto-tagging.
https://claude.ai/code/session_013JoYmLCosYt5tQ8qwdbL1T
Generated by Claude Code