vdavid
diff --git a/‎apps/desktop/src-tauri/src/indexing/CLAUDE.md‎
Lines changed: 2 additions & 2 deletions b/‎apps/desktop/src-tauri/src/indexing/CLAUDE.md‎
Lines changed: 2 additions & 2 deletions
@@ -17,7 +17,7 @@ Full design: `docs/specs/drive-indexing/plan.md`
 - **micro_scan.rs** -- `MicroScanManager`: bounded task pool (default 3 concurrent), priority queue (`UserSelected` > `CurrentDir`), deduplication, cancellation. Skips after full scan completes.
 - **aggregator.rs** -- Dir stats computation. Bottom-up after full scan (O(N) single pass), per-subtree after micro-scan, incremental delta propagation up ancestor chain for watcher events. Two entry points for full aggregation: `compute_all_aggregates_reported` (loads maps from SQL) and `compute_all_aggregates_with_maps` (accepts pre-built maps from the writer). Both accept an `on_progress: &mut dyn FnMut(AggregationProgress)` callback and delegate to `compute_and_write()` for the shared topological sort + bottom-up computation + batch write. Progress is reported at phase transitions and every ~1% during compute/write loops. `AggregationPhase` enum: `SavingEntries` (flushing writer channel), `LoadingDirectories`, `Sorting`, `Computing`, `Writing`. `backfill_missing_dir_stats` is a catch-up pass that finds directories without `dir_stats` rows and computes their stats bottom-up; triggered after reconciler replay and cold-start replay via `BackfillMissingDirStats` writer message.
 - **watcher.rs** -- Drive-level filesystem watcher. macOS: FSEvents via `cmdr-fsevent-stream` with event IDs and `sinceWhen` replay. Linux: `notify` crate (inotify backend) with recursive watching and synthetic event counter. Other platforms: stub. `supports_event_replay()` lets callers branch on whether journal replay is available.
-- **reconciler.rs** -- Buffers FSEvents during scan (capped at 500K events; overflow sets `buffer_overflow` flag forcing full rescan), replays after scan completes using event IDs to skip stale events. Processes live events for file creates/removes/modifies using integer-keyed write messages (`UpsertEntryV2`, `DeleteEntryById`, `DeleteSubtreeById`, `PropagateDeltaById`). Resolves filesystem paths to entry IDs via `store::resolve_path()` using a read connection passed by callers. Key functions (`process_fs_event`, `emit_dir_updated`) are `pub(super)` so `mod.rs` can call them directly during cold-start replay.
+- **reconciler.rs** -- Buffers FSEvents during scan (capped at 500K events; overflow sets `buffer_overflow` flag forcing full rescan), replays after scan completes using event IDs to skip stale events. Processes live events for file creates/removes/modifies using integer-keyed write messages (`UpsertEntryV2`, `DeleteEntryById`, `DeleteSubtreeById`, `PropagateDeltaById`). Resolves filesystem paths to entry IDs via `store::resolve_path()` using a read connection passed by callers. Key functions (`process_fs_event`, `emit_dir_updated`) are `pub(super)` so `mod.rs` can call them directly during cold-start replay. `reconcile_subtree()` handles MustScanSubDirs by diffing filesystem vs DB directory-by-directory instead of delete-then-reinsert, making it safe to interrupt at any point.
 - **firmlinks.rs** -- Parses `/usr/share/firmlinks`, builds prefix map, normalizes paths. Converts `/System/Volumes/Data/Users/foo` to `/Users/foo`.
 - **verifier.rs** -- Placeholder for per-navigation background readdir diff (future milestone).
 
@@ -112,7 +112,7 @@ Key test files are alongside each module (test functions within `#[cfg(test)]` b
 
 **Physical sizes (`st_blocks * 512`)**: More meaningful for disk usage than logical size. May overcount ~10-20% for APFS clones (shared blocks). Volume usage bar uses `statfs()` for true totals.
 
-**Subtree rescans delete descendants first**: `scan_subtree` sends `DeleteDescendantsById(root_id)` to the writer before inserting fresh entries. This prevents orphaned entries that previously caused DB bloat (4x) and missing dir_stats. The root entry is preserved (its existing ID is reused by `ScanContext`). The delete and subsequent inserts are serialized through the single writer channel, so no race conditions. `ComputeSubtreeAggregates` runs after the scan to recompute stats.
+**MustScanSubDirs uses reconciliation, not delete-then-reinsert**: `reconcile_subtree()` diffs the filesystem against the DB directory-by-directory, only inserting/deleting/updating entries that changed. This is safe to interrupt at any point (no bulk delete phase that could leave the DB empty). For brand-new directories discovered during reconciliation, a `flush_blocking()` + re-resolve cycle ensures their IDs are available before recursing into them. `scanner::scan_subtree` (which uses destructive `DeleteDescendantsById`) is still used by micro-scans but no longer by MustScanSubDirs.
 
 **In-memory accumulation eliminates aggregation SQL queries**: During a full scan, the writer thread accumulates two HashMaps in `AccumulatorMaps` as `InsertEntriesV2` batches arrive: `direct_stats` (parent_id -> file size/count/dir count) and `child_dirs` (parent_id -> child dir IDs). When `ComputeAllAggregates` fires, these maps are passed to `compute_all_aggregates_with_maps()`, skipping the two expensive full-table-scan SQL queries (`bulk_get_children_stats_by_id` and `bulk_get_child_dir_ids`) that previously dominated aggregation time (~70%). Maps are cleared on `TruncateData` and after aggregation completes. Falls back to SQL queries if maps are empty.