1. Replay flush hang: micro-scans during replay sent nested transactions that broke the writer. Fixed by adding a replay_active suppression flag to MicroScanManager.
2. Live events silently dropped: cmdr-fsevent-stream used strict from_bits(), which rejected events with unknown macOS flag bits. Changed to from_bits_truncate(), which drops the unknown bits instead of rejecting the event.
3. Micro-scans permanently suppressed after replay: set_replay_active(false) was placed after the infinite Phase 3 loop, so it never ran. Moved it inside run_replay_event_loop, before Phase 3 starts.
4. delete_subtree taking 14+ seconds per call: LIKE queries on a 5M-row table caused full table scans. Converted to range queries (path > 'prefix/' AND path < 'prefix0', where '0' is the ASCII character immediately after '/') that use the PRIMARY KEY index. Result: replay dropped from 5+ minutes to 2.3 seconds.
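The prefix-range trick from item 4 can be illustrated without a database. The following is a minimal sketch, assuming paths compare bytewise (as they do under SQLite's default BINARY collation); a `BTreeSet` stands in for the PRIMARY KEY index, and `subtree_bounds` and `descendants` are hypothetical names, not functions from the commit.

```rust
use std::collections::BTreeSet;
use std::ops::Bound;

/// Returns exclusive (lower, upper) bounds that enclose every path under `dir`.
/// '0' (0x30) is the byte right after '/' (0x2F), so "dir0" sorts just above
/// every string that starts with "dir/".
fn subtree_bounds(dir: &str) -> (String, String) {
    (format!("{dir}/"), format!("{dir}0"))
}

/// Collects descendants with one ordered range scan instead of testing every key,
/// which is what a LIKE 'dir/%' full scan would effectively do.
fn descendants<'a>(index: &'a BTreeSet<String>, dir: &str) -> Vec<&'a str> {
    let (lo, hi) = subtree_bounds(dir);
    index
        .range((Bound::Excluded(lo), Bound::Excluded(hi)))
        .map(String::as_str)
        .collect()
}
```

Siblings that merely share the prefix, such as `/a-b` or `/ab` next to `/a`, fall outside the range because they sort below `/a/` or above `/a0`.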
apps/desktop/src-tauri/src/indexing/CLAUDE.md
12 additions & 3 deletions
@@ -10,12 +10,12 @@ Full design: `docs/specs/drive-indexing/plan.md`
-**mod.rs** -- Public API: `init()`, `start_indexing()`, `stop_indexing()`, `clear_index()`, `enrich_entries_with_index()`. `IndexManager` coordinates all subsystems. Global read-only store for enrichment.
-**store.rs** -- SQLite schema (entries, dir_stats, meta), read queries (`get_dir_stats_batch`, `get_index_status`), DB open/migrate. Schema version check: mismatch triggers drop+rebuild.
-
-**writer.rs** -- Single writer thread, owns the write connection, processes `WriteMessage` channel (bounded mpsc). Priority: `UpdateDirStats` before `InsertEntries`.
+
-**writer.rs** -- Single writer thread, owns the write connection, processes `WriteMessage` channel (unbounded mpsc). Priority: `UpdateDirStats` before `InsertEntries`. `Flush` variant + async `flush()` method let callers wait for all prior writes to commit.
-**scanner.rs** -- jwalk-based parallel directory walker. `scan_volume()` for full scan, `scan_subtree()` for micro-scans. Exclusion filter for macOS system paths. Physical sizes (`st_blocks * 512`).
-**micro_scan.rs** -- `MicroScanManager`: bounded task pool (default 3 concurrent), priority queue (`UserSelected` > `CurrentDir`), deduplication, cancellation. Skips after full scan completes.
-**aggregator.rs** -- Dir stats computation. Bottom-up after full scan (O(N) single pass), per-subtree after micro-scan, incremental delta propagation up ancestor chain for watcher events.
-**watcher.rs** -- Drive-level FSEvents watcher via `cmdr-fsevent-stream`. File-level events with event IDs. Supports `sinceWhen` for cold-start replay.
-
-**reconciler.rs** -- Buffers FSEvents during scan, replays after scan completes using event IDs to skip stale events. Processes live events for file creates/removes/modifies.
+
-**reconciler.rs** -- Buffers FSEvents during scan, replays after scan completes using event IDs to skip stale events. Processes live events for file creates/removes/modifies. Key functions (`process_fs_event`, `emit_dir_updated`) are `pub(super)` so `mod.rs` can call them directly during cold-start replay.
@@ -90,6 +91,14 @@ Key test files are alongside each module (test functions within `#[cfg(test)]` b
## Gotchas
+
**Cold-start replay uses two-phase flush**: The `run_replay_event_loop` doesn't emit `index-dir-updated` during Phase 1 (replay). It collects affected paths, flushes the writer (ensuring all writes are committed), then emits a single batched notification. This prevents the frontend from reading stale data.
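The flush-then-notify ordering can be sketched with standard-library channels. This is an illustration under stated assumptions, not the project's code: the real writer is async and owns a SQLite connection, whereas here a `Vec` stands in for the database, `Insert` is a made-up message variant, and `spawn_writer`, `flush`, and `replay` are hypothetical helpers.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

enum WriteMessage {
    Insert(String),
    /// Carries a reply channel; the writer acks once all earlier messages are applied.
    Flush(Sender<()>),
}

/// Spawns the single writer thread. The returned handle yields what was "committed".
fn spawn_writer() -> (Sender<WriteMessage>, thread::JoinHandle<Vec<String>>) {
    let (tx, rx) = channel();
    let handle = thread::spawn(move || {
        let mut committed = Vec::new(); // stand-in for the write connection
        for msg in rx {
            match msg {
                WriteMessage::Insert(path) => committed.push(path),
                WriteMessage::Flush(ack) => {
                    // Messages are handled in order, so everything sent earlier is done.
                    let _ = ack.send(());
                }
            }
        }
        committed
    });
    (tx, handle)
}

/// Blocks until the writer has processed every message sent before this call.
fn flush(tx: &Sender<WriteMessage>) {
    let (ack_tx, ack_rx) = channel();
    tx.send(WriteMessage::Flush(ack_tx)).expect("writer thread is gone");
    ack_rx.recv().expect("writer dropped the flush ack");
}

/// Queues writes while collecting affected paths, flushes, and only then
/// returns the batch that would back a single notification.
fn replay(tx: &Sender<WriteMessage>, events: &[&str]) -> Vec<String> {
    let mut affected = Vec::new();
    for event in events {
        tx.send(WriteMessage::Insert(event.to_string())).expect("writer thread is gone");
        affected.push(event.to_string());
    }
    flush(tx);
    affected
}
```

Because the channel is FIFO and there is one writer, the ack for `Flush` implies every earlier write has been applied, so a reader notified afterwards cannot observe a half-written state.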
+
+
**Live events are batched with a 300 ms window**: Both `run_live_event_loop` and the Phase 3 live loop in `run_replay_event_loop` use `tokio::select!` with a 300 ms `tokio::time::interval` to collect affected paths in a `HashSet` and emit a single `index-dir-updated` per flush. This prevents UI flicker from rapid per-event notifications (FSEvents can fire hundreds of events per second during bulk operations). `process_live_event` collects paths into the caller's `HashSet` instead of emitting directly.
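The batching idea can be shown with a blocking stand-in. The source uses `tokio::select!` with a `tokio::time::interval`; the sketch below swaps that for `std::sync::mpsc::Receiver::recv_timeout` so it runs without an async runtime, and `collect_batch` is a hypothetical name.

```rust
use std::collections::HashSet;
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Collects affected paths from `rx` until `window` has elapsed or the channel
/// closes, and returns them deduplicated so one notification covers the batch.
fn collect_batch(rx: &Receiver<String>, window: Duration) -> HashSet<String> {
    let deadline = Instant::now() + window;
    let mut affected = HashSet::new();
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        if remaining.is_zero() {
            break; // window elapsed: time to emit
        }
        match rx.recv_timeout(remaining) {
            Ok(path) => {
                affected.insert(path); // repeated paths collapse into one entry
            }
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    affected
}
```

A caller would loop over `collect_batch(&rx, Duration::from_millis(300))` and emit one update per non-empty set, so hundreds of events touching the same directory produce a single notification.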
+
+
**Writer-side delete-with-propagation**: `DeleteEntry` and `DeleteSubtree` handlers in the writer automatically read old data before deleting and propagate accurate negative deltas. This means every deletion -- replay, live, verification -- gets correct dir_stats updates without callers needing to send separate `PropagateDelta` messages. `delete_subtree` and `propagate_delta` have no internal transactions, so they're safe inside the replay's `BEGIN IMMEDIATE` transaction.
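A rough in-memory model of "read old data, delete, propagate a negative delta" might look like the following. It assumes Unix-style paths and uses `HashMap`s in place of the `entries` and `dir_stats` tables; `MiniIndex`, `DirStats`, and the field names are hypothetical and do not mirror the real schema.

```rust
use std::collections::HashMap;
use std::path::Path;

#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct DirStats {
    total_size: i64,
    file_count: i64,
}

struct MiniIndex {
    files: HashMap<String, i64>, // file path -> size in bytes
    dir_stats: HashMap<String, DirStats>,
}

impl MiniIndex {
    /// Applies a size/count delta to every ancestor directory of `path`.
    fn propagate_delta(&mut self, path: &str, size_delta: i64, count_delta: i64) {
        let mut current = Path::new(path).parent();
        while let Some(dir) = current {
            let key = dir.to_string_lossy().into_owned();
            let stats = self.dir_stats.entry(key).or_default();
            stats.total_size += size_delta;
            stats.file_count += count_delta;
            current = dir.parent();
        }
    }

    /// Reads the old size before deleting, then propagates the exact negative
    /// delta, so callers never have to send a separate propagate message.
    fn delete_entry(&mut self, path: &str) {
        if let Some(old_size) = self.files.remove(path) {
            self.propagate_delta(path, -old_size, -1);
        }
    }
}
```

Deleting a path that is not in the index is a no-op here, which keeps repeated or stale delete events from driving the totals negative.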
+
+
**Post-replay verification is bidirectional**: `verify_affected_dirs` checks both directions: (1) stale entries in DB but not on disk (sends `DeleteEntry`/`DeleteSubtree`), and (2) missing entries on disk but not in DB (sends `UpsertEntry` + `PropagateDelta` for files, collects directory paths for `scan_subtree`). New directories are scanned and their subtree totals propagated up the ancestor chain. The `GLOBAL_INDEX_STORE` mutex guard is scoped to avoid holding it across `.await` points (the guard is not `Send`).
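The two directions of the check amount to two set differences. The sketch below assumes the directory's children are already available as name sets from the DB and from a directory listing; `VerifyResult` and `diff_dir` are hypothetical names, and the follow-up actions (`DeleteEntry`, `UpsertEntry`, `scan_subtree`) are only noted in comments.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
struct VerifyResult {
    /// In the DB but no longer on disk: candidates for DeleteEntry/DeleteSubtree.
    stale: Vec<String>,
    /// On disk but not in the DB: candidates for UpsertEntry or a subtree scan.
    missing: Vec<String>,
}

/// Compares one directory's children in both directions.
fn diff_dir(in_db: &BTreeSet<String>, on_disk: &BTreeSet<String>) -> VerifyResult {
    VerifyResult {
        stale: in_db.difference(on_disk).cloned().collect(),
        missing: on_disk.difference(in_db).cloned().collect(),
    }
}
```

Checking only the first direction would clean up deleted files but never notice files created while events were being missed, which is why both lists are needed.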
+
**Schema version mismatch drops the DB**: If `schema_version` in meta doesn't match what the code expects, the entire DB is deleted and rebuilt. No migration path (it's a cache, not user data).
**`verifier.rs` is a placeholder**: Per-navigation readdir diff is a future milestone. Currently just a TODO comment.