- Add `RescanReason` enum (7 variants) and `index-rescan-notification` Tauri event emitted from every code path that falls back to a full rescan
- Pre-check in `resume_or_scan()` compares stored `last_event_id` with `FSEventsGetCurrentEventId()` before starting the FSEvents stream — prevents the 1024-capacity `try_send` channel in `cmdr-fsevent-stream` from being overwhelmed with millions of replayed events
- Truncate `entries` + `dir_stats` via new `TruncateData` writer message before rescanning a stale DB — `INSERT OR REPLACE` on a populated table with the `platform_case` collation takes ~30 min vs ~2.5 min on empty
- Add `flush_blocking()` to `IndexWriter` for sync contexts
- Add `did_buffer_overflow()` accessor to `EventReconciler`
- Frontend: listen for `index-rescan-notification`, show info toast with reason-specific user-friendly message (8s timeout, deduped by `id: 'index-rescan'`)
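The pre-check described in the bullets above can be sketched as a pure decision function. This is a minimal illustration, not the PR's code: `StartupPlan`, `plan_startup`, and the `max_replay_gap` parameter are hypothetical names, and the real threshold used by `resume_or_scan()` is not stated here. It also treats the difference between two event IDs as a rough proxy for how many events would be replayed.

```rust
/// Hypothetical outcome of the pre-check (not a type from the PR).
#[derive(Debug, PartialEq, Eq)]
enum StartupPlan {
    /// The gap is small enough to replay from the FSEvents journal.
    Replay { from_event_id: u64 },
    /// The gap is too large, or the stored ID is unusable: rescan instead.
    FullRescan { gap: Option<u64> },
}

/// Decide between replay and rescan before any stream is started.
/// `current_event_id` stands for the value returned by
/// `FSEventsGetCurrentEventId()`; `max_replay_gap` is an assumed tunable.
fn plan_startup(
    stored_last_event_id: Option<u64>,
    current_event_id: u64,
    max_replay_gap: u64,
) -> StartupPlan {
    match stored_last_event_id {
        // No stored ID: nothing to resume from.
        None => StartupPlan::FullRescan { gap: None },
        Some(stored) => match current_event_id.checked_sub(stored) {
            Some(gap) if gap <= max_replay_gap => StartupPlan::Replay { from_event_id: stored },
            Some(gap) => StartupPlan::FullRescan { gap: Some(gap) },
            // Stored ID is ahead of the system's current ID: treat as unusable.
            None => StartupPlan::FullRescan { gap: None },
        },
    }
}
```

Because the decision is made before the stream exists, a stale index never reaches the bounded channel at all.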
| |-- Linux: Always full rescan (no event journal; existing DB used for instant enrichment)
+| |-- Incomplete previous scan (has data but no scan_completed_at)? -> notify + fresh scan
| |-- Otherwise -> fresh full scan
|
Full scan:
@@ -118,8 +121,12 @@ Key test files are alongside each module (test functions within `#[cfg(test)]` b
**APFS firmlinks**: Scan from `/` only, skip `/System/Volumes/Data`. Normalize all paths via firmlink prefix map so DB lookups work regardless of how the user navigated to a path.
+**Rescan notification system (`RescanReason` enum)**: Every code path that falls back to a full rescan emits an `index-rescan-notification` event with a `RescanReason` variant and human-readable details. The frontend maps each reason to a user-friendly toast message. Seven reasons: `StaleIndex` (pre-check gap), `JournalGap` (in-loop gap), `ReplayOverflow` (>1M events), `TooManySubdirRescans` (>1K MustScanSubDirs), `WatcherStartFailed`, `ReconcilerBufferOverflow` (>500K buffered events during scan), `IncompletePreviousScan` (has data but no `scan_completed_at`). The pre-check in `resume_or_scan()` catches stale indexes before starting the FSEvents stream, preventing the cmdr-fsevent-stream channel (1024 capacity, `try_send`) from being overwhelmed.
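As a rough sketch of the shape this paragraph describes, the enum below lists the seven variants named above. Everything else is an assumption made for illustration: the `summary` strings are placeholders (not the frontend's toast text), `replay_fallback` is a hypothetical helper, and the constants read ">1M" and ">1K" as 1,000,000 and 1,000.

```rust
/// The seven reasons named in the docs; derives are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RescanReason {
    StaleIndex,
    JournalGap,
    ReplayOverflow,
    TooManySubdirRescans,
    WatcherStartFailed,
    ReconcilerBufferOverflow,
    IncompletePreviousScan,
}

impl RescanReason {
    /// Every variant, handy for exhaustive checks.
    const ALL: [RescanReason; 7] = [
        RescanReason::StaleIndex,
        RescanReason::JournalGap,
        RescanReason::ReplayOverflow,
        RescanReason::TooManySubdirRescans,
        RescanReason::WatcherStartFailed,
        RescanReason::ReconcilerBufferOverflow,
        RescanReason::IncompletePreviousScan,
    ];

    /// Placeholder text only; the real user-facing messages live in the frontend.
    fn summary(self) -> &'static str {
        match self {
            RescanReason::StaleIndex => "stored event ID too far behind (pre-check)",
            RescanReason::JournalGap => "gap detected in the event journal",
            RescanReason::ReplayOverflow => "too many events to replay",
            RescanReason::TooManySubdirRescans => "too many MustScanSubDirs events",
            RescanReason::WatcherStartFailed => "file watcher failed to start",
            RescanReason::ReconcilerBufferOverflow => "reconciler buffer overflowed during scan",
            RescanReason::IncompletePreviousScan => "previous scan never completed",
        }
    }
}

// Assumed exact values for the ">1M" and ">1K" limits quoted above.
const MAX_REPLAY_EVENTS: u64 = 1_000_000;
const MAX_SUBDIR_RESCANS: u64 = 1_000;

/// Hypothetical helper: which replay-time limit, if any, forces a rescan.
fn replay_fallback(replayed_events: u64, subdir_rescans: u64) -> Option<RescanReason> {
    if replayed_events > MAX_REPLAY_EVENTS {
        Some(RescanReason::ReplayOverflow)
    } else if subdir_rescans > MAX_SUBDIR_RESCANS {
        Some(RescanReason::TooManySubdirRescans)
    } else {
        None
    }
}
```

Matching exhaustively in `summary` means a newly added variant fails to compile until it is given a message, which suits a rule of "every fallback path notifies".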
## Gotchas
+**INSERT OR REPLACE on a populated DB is catastrophically slow**: The `platform_case` collation (NFD + case fold on macOS) runs for every B-tree comparison during unique index lookups. On an empty DB a full scan takes ~2.5 min; on a populated DB with 5.5M entries the same scan takes ~30 min because each `INSERT OR REPLACE` triggers ~20 collation calls to traverse the B-tree. The `StaleIndex` path truncates `entries` and `dir_stats` via `TruncateData` + `flush_blocking()` before starting the scan to avoid this. Never do a full rescan into a populated DB without clearing first.
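The truncate-then-flush ordering can be illustrated with a toy writer thread. This is only a sketch under loud assumptions: an in-memory `Vec` stands in for the SQLite `entries` table, and although `IndexWriter`, `TruncateData`, and `flush_blocking` borrow the names used above, their signatures and the `WriteMessage` enum here are invented for the example.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical writer messages; only `TruncateData` is named in the docs.
enum WriteMessage {
    Insert(String),
    TruncateData,
    /// Reply with the current row count once all earlier messages are applied.
    Flush(Sender<usize>),
}

struct IndexWriter {
    tx: Sender<WriteMessage>,
}

impl IndexWriter {
    /// Start the single writer thread that serializes all writes.
    fn spawn() -> (IndexWriter, JoinHandle<()>) {
        let (tx, rx) = mpsc::channel::<WriteMessage>();
        let handle = thread::spawn(move || {
            // Stand-in for the `entries` table.
            let mut entries: Vec<String> = Vec::new();
            for msg in rx {
                match msg {
                    WriteMessage::Insert(path) => entries.push(path),
                    WriteMessage::TruncateData => entries.clear(),
                    WriteMessage::Flush(reply) => {
                        let _ = reply.send(entries.len());
                    }
                }
            }
        });
        (IndexWriter { tx }, handle)
    }

    fn send(&self, msg: WriteMessage) {
        self.tx.send(msg).expect("writer thread stopped");
    }

    /// Block the calling (sync) thread until the writer has drained
    /// everything queued before this call.
    fn flush_blocking(&self) -> usize {
        let (reply_tx, reply_rx) = mpsc::channel();
        self.send(WriteMessage::Flush(reply_tx));
        reply_rx.recv().expect("writer thread stopped")
    }
}
```

Since the channel is FIFO, sending `TruncateData` and then calling `flush_blocking()` guarantees the table is empty before the first insert of the new scan is queued.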
**Cold-start replay enters live mode immediately after flush**: The `run_replay_event_loop` doesn't emit `index-dir-updated` during Phase 1 (replay). It collects affected paths, flushes the writer (ensuring all writes are committed), emits a single batched notification, re-enables micro-scans, and enters live mode right away (~100ms from startup). Post-replay verification (`verify_affected_dirs`) runs in a background task (`run_background_verification`) concurrently with live events. This is safe because the writer serializes all writes. Any corrections found by verification are emitted as a separate `index-dir-updated` batch.
**Live events are deduplicated and batched with a 1s window**: Both `run_live_event_loop` and the Phase 3 live loop in `run_replay_event_loop` collect incoming events into a `HashMap<String, FsChangeEvent>` keyed by normalized path. On each 1s flush tick, only the deduplicated set is processed through `process_live_event`. `merge_fs_events` keeps the most significant flags when events collide: `must_scan_sub_dirs` always wins, then `removed`, then `created`, then `modified`. `UpdateLastEventId` is sent once per batch (in `process_live_batch`) instead of per-event, reducing writer channel pressure during event storms.
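One way to read the merge rule above is sketched below. The field layout of `FsChangeEvent`, the `rank` and `dedupe_batch` helpers, and the choice to keep the highest event ID are all assumptions for illustration; only the priority order (`must_scan_sub_dirs`, then `removed`, then `created`, then `modified`) and the `HashMap<String, FsChangeEvent>` keyed by path come from the text. Paths are assumed to be normalized already.

```rust
use std::collections::HashMap;

/// Assumed shape of an event; the real struct may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
struct FsChangeEvent {
    path: String,
    event_id: u64,
    must_scan_sub_dirs: bool,
    removed: bool,
    created: bool,
    modified: bool,
}

/// Significance of an event, following the priority order in the docs.
fn rank(e: &FsChangeEvent) -> u8 {
    if e.must_scan_sub_dirs {
        4
    } else if e.removed {
        3
    } else if e.created {
        2
    } else if e.modified {
        1
    } else {
        0
    }
}

/// Keep the more significant event's flags; on a tie the newer event wins.
/// The merged event carries the highest event ID seen (an assumption).
fn merge_fs_events(existing: FsChangeEvent, incoming: FsChangeEvent) -> FsChangeEvent {
    let latest_id = existing.event_id.max(incoming.event_id);
    let mut winner = if rank(&incoming) >= rank(&existing) { incoming } else { existing };
    winner.event_id = latest_id;
    winner
}

/// Collapse one flush window's events to a single entry per path.
fn dedupe_batch(events: Vec<FsChangeEvent>) -> HashMap<String, FsChangeEvent> {
    let mut pending: HashMap<String, FsChangeEvent> = HashMap::new();
    for event in events {
        let merged = match pending.remove(&event.path) {
            Some(existing) => merge_fs_events(existing, event),
            None => event,
        };
        pending.insert(merged.path.clone(), merged);
    }
    pending
}
```

With this shape, an event storm touching the same path many times within a window costs one `process_live_event` call per path rather than one per raw event.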