feat: Update All albums button + fast, accurate index scans #342

Merged
lstein merged 2 commits into master from lstein/feature/update-all-albums on Jul 5, 2026

lstein (Owner) commented Jul 5, 2026

Summary

  • "Update All" button in Album Management updates every album's index, two at a time. Backend semaphores serialize the GPU-encoding and file-scan stages across albums, so the pipeline overlaps one album's scan/UMAP with another's encoding without contention. The queue tracks completion by polling the backend directly, so it keeps working through albums when the dialog is closed and repaints its live progress label ("Updating 2/7…") on reopen. Hidden when no albums exist; labels no longer word-wrap.

  • Scan performance rework around the min_image_dimension gate (#269, "fix(indexing): gate image indexing on pixel dimensions, not file bytes"), whose per-file header open made scans crawl on large/NFS libraries (~190 ms/file measured):

    • Update scans traverse by extension only, diff against the index, and dimension-probe only files not already indexed.
    • A scan_rejects.npz sidecar remembers gate-rejected files by size/mtime, so previously rejected files (e.g. NAS thumbnails) are dismissed on a stat alone in every later scan. Invalidated automatically when either gate threshold changes.
    • Byte-size gate bands, calibrated by sampling 5,000 files from a real 122k-file library: files over 500 KB pass without opening (0/1,175 sampled failures) and files under a per-album min_image_bytes floor are rejected without opening (8 KB default = 0.15% measured false negatives vs. 49% at the pre-#269 100 KB cutoff).
    • Traversal prunes hidden directories and known thumbnail caches (@eaDir, __MACOSX, photomap_index). This is a correctness fix too: Shotwell's 360px thumbs (.shotwell/thumbs/thumbs360/) pass both gates and were being indexed as photos; existing entries self-heal (removed as "missing") on the next update.
    • A process-wide scan semaphore prevents concurrent album scans from thrashing disk seeks and the GIL (mirrors the encoder semaphore from #243, "fix(concurrency): close race conditions from code review").
    • The "Checking new image files (N of M)…" phase drives a real progress bar.
  • min_image_bytes is a per-album setting (bytes in YAML, edited in kb next to the pixel gate: [8] kb [256] pixels; 0 disables), threaded through config, routers, and the album editor.

  • "Index updated <when>" on album cards now reports when an update operation last completed (via a last_updated marker file) instead of the .npz mtime, so a no-change update no longer looks stale. The .npz mtime is untouched; it still drives the UMAP cache staleness check.
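
The gate ordering described above (reject cache, then byte-size bands, then a header probe only as a last resort) can be sketched roughly as follows. This is an illustration, not the PR's code: `Verdict`, `RejectCache`, and `classify` are hypothetical names, the in-memory dict only stands in for the scan_rejects.npz sidecar, and treating 500 KB as 500,000 bytes and 8 KB as 8,192 bytes is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum

# Assumption: "500 KB" is read here as 500,000 bytes; the PR does not say.
PASS_WITHOUT_OPEN_BYTES = 500_000


class Verdict(Enum):
    REJECT = "reject"  # dismissed from stat() data alone
    ACCEPT = "accept"  # accepted from stat() data alone
    PROBE = "probe"    # header must be opened to read pixel dimensions


@dataclass
class RejectCache:
    """Hypothetical in-memory stand-in for the scan_rejects.npz sidecar."""

    min_image_bytes: int       # thresholds the cache was built under
    min_image_dimension: int
    entries: dict = field(default_factory=dict)  # path -> (size, mtime_ns)

    def valid_for(self, min_image_bytes, min_image_dimension):
        # The cache is only trustworthy under the thresholds that produced it.
        return (self.min_image_bytes, self.min_image_dimension) == (
            min_image_bytes,
            min_image_dimension,
        )

    def matches(self, path, size, mtime_ns):
        # A changed size or mtime means the file must be re-examined.
        return self.entries.get(path) == (size, mtime_ns)

    def remember(self, path, size, mtime_ns):
        self.entries[path] = (size, mtime_ns)


def classify(path, size, mtime_ns, cache, min_image_bytes):
    """Decide what to do with a not-yet-indexed file using stat() data only."""
    if cache.matches(path, size, mtime_ns):
        return Verdict.REJECT  # rejected by an earlier scan, unchanged since
    if min_image_bytes > 0 and size < min_image_bytes:
        return Verdict.REJECT  # below the per-album byte floor (0 disables)
    if size > PASS_WITHOUT_OPEN_BYTES:
        return Verdict.ACCEPT  # large enough to pass without opening
    return Verdict.PROBE       # middle band: open the header
```

A caller would presumably discard the cache when `valid_for` returns false, and call `remember` for files whose header probe fails the pixel gate, so that later scans reject them on a stat alone.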

Behavior notes

  • Raising min_image_dimension no longer purges now-too-small images on update (only on full rebuild), since updates no longer re-probe indexed files.
  • .picasaoriginals (a hidden dir) is now skipped by the hidden-dir rule; its contents are pre-edit duplicates of edited photos.
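
One common way to implement this kind of pruning is to edit `dirnames` in place during a top-down `os.walk`, so that skipped directories are never entered at all. The sketch below assumes that approach; `iter_candidates`, the exact-match directory set, and the short extension list are illustrative and not taken from the PR.

```python
import os

# Directory names listed in the summary; exact, case-sensitive matching
# is an assumption of this sketch.
PRUNED_DIR_NAMES = {"@eaDir", "__MACOSX", "photomap_index"}

# Hypothetical extension list, for illustration only.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def iter_candidates(root):
    """Yield image paths under root, skipping hidden and thumbnail dirs."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Assigning to dirnames[:] mutates the list os.walk iterates over,
        # so pruned directories are never descended into.
        dirnames[:] = [
            d
            for d in dirnames
            if not d.startswith(".") and d not in PRUNED_DIR_NAMES
        ]
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                yield os.path.join(dirpath, name)
```

Under this rule `.shotwell/thumbs/thumbs360/` and `.picasaoriginals/` are both skipped because their top-level directory name starts with a dot, without needing entries of their own in the set.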

Test plan

  • 452 backend + 396 frontend tests pass; ruff/ESLint/Prettier clean.
  • New backend tests: scan serialization (no overlapping traversals), reject-cache skip/revalidate/invalidate, byte-floor with zero file opens, hidden-dir + thumbnail-dir pruning (incl. the Shotwell case), gate-pass progress callback, min_image_bytes API round-trip, last-updated timestamp advancing on a no-change update.
  • New frontend tests: Update All worker-pool concurrency cap, backend-driven completion waiting with transient-failure tolerance, button repaint across dialog close/reopen, failure isolation, hidden-when-empty.
  • Manually verified against a real 86k-image NFS-mounted library (traversal count dropped from 200k+ to ~122k candidates from pruning alone; author-confirmed "much much faster").
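
The scan-serialization property tested above (no overlapping traversals, while one album's scan may still overlap another's encoding) can be illustrated with two process-wide semaphores. This is a minimal sketch under stated assumptions: the thread pool stands in for the frontend's two-at-a-time queue, the sleeps stand in for real work, and `update_album` and `update_all` are hypothetical names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Assumption: each heavy stage admits one album at a time.
SCAN_SEMAPHORE = threading.Semaphore(1)
ENCODE_SEMAPHORE = threading.Semaphore(1)


def update_album(name, log, log_lock):
    """Run the two serialized stages for one album, recording events."""
    for stage, semaphore in (("scan", SCAN_SEMAPHORE), ("encode", ENCODE_SEMAPHORE)):
        with semaphore:
            with log_lock:
                log.append((f"{stage}-start", name))
            time.sleep(0.01)  # placeholder for the real scan or encode work
            with log_lock:
                log.append((f"{stage}-end", name))


def update_all(albums, max_concurrent=2):
    """Update every album, at most max_concurrent in flight at once."""
    log = []
    log_lock = threading.Lock()
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(lambda album: update_album(album, log, log_lock), albums))
    return log
```

Because each stage logs its start and end while holding its semaphore, the recorded events for a given stage always alternate start/end for the same album, which is the property a serialization test can assert.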

🤖 Generated with Claude Code

Album Management gains an "Update All" button that updates every album's
index, pipelined two at a time (backend semaphores serialize the GPU and
disk-heavy stages, so one album scans/UMAPs while another encodes). The
queue polls the backend directly, so it survives closing and reopening the
dialog; the button label shows live progress and hides when no albums exist.

Index scans are reworked around the min_image_dimension gate from #269,
whose per-file header open made update scans crawl (~190ms/file over NFS):

- Scans traverse by extension only and dimension-probe just the files not
  already indexed; a scan_rejects.npz sidecar remembers rejected files by
  size/mtime so they are dismissed on a stat alone in later scans.
- The gate gains byte-size bands, sampled from a real 122k-file library:
  files over 500KB pass without opening (0/1175 sampled failures) and files
  under a new per-album min_image_bytes floor (default 8KB, measured 0.15%
  false negatives) are rejected without opening. The floor is edited next
  to the pixel gate in the album editor ("[8] kb [256] pixels"); 0 disables.
- Traversal prunes hidden directories and known thumbnail caches (@eadir,
  __MACOSX, photomap_index). Shotwell's 360px thumbs pass both gates, so
  only pruning keeps them out of the index.
- A process-wide scan semaphore stops concurrent album scans from thrashing
  disk seeks and the GIL.
- The "Checking new image files" phase drives a real progress bar (its
  total is known up front) instead of a static message.

The album card's "Index updated <when>" timestamp now reports when an
update operation last completed (a last_updated marker file) rather than
the .npz mtime, so a no-change update no longer looks stale.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
lstein enabled auto-merge (squash) July 5, 2026 01:38
lstein merged commit 816831e into master Jul 5, 2026
10 checks passed