Skip to content

Commit 12e98e5

Browse files
committed
Indexing: stop the full reconcile from wedging on a large delta
A local or network full reconcile sent one UpsertEntryV2/Delete per changed entry, and each one propagated its size delta up the entire ancestor chain — O(entries × tree-depth). But the reconcile already finishes with a single ComputeAllAggregates that recomputes every dir_stats from scratch, so that per-entry propagation is pure waste: its results are immediately overwritten. On a large delta (e.g. completing a 270k partial up to a 6M-entry /) the writer can't drain, the walk thread parks on a full channel, and the app appears hung for hours. Add a writer 'bulk reconcile' mode (SetDeltaPropagation): the two full-reconcile walkers bracket their walk with a BulkReconcileGuard (RAII — re-enables propagation on every exit path, including cancel/empty-root/error/panic), and the writer skips ONLY the ancestor propagation while it's on. Entry writes, hardlink dedup, and new-dir dir_stats row init are all kept; the final ComputeAllAggregates makes the result identical. Writes drop from O(N×depth) to O(N). The live FSEvents path is untouched — it has no final aggregate, so it keeps propagating. Latent since the network reconcile landed (a6a2f58); the local port (368d726) exposed it at / scale. The perf gate only ever tested no-op and 1%-changed deltas; added a large-delta guard test that asserts no per-entry propagation happens mid-walk (RED if propagation is forced on). 438 indexing tests green.
1 parent c40448e commit 12e98e5

6 files changed

Lines changed: 347 additions & 30 deletions

File tree

apps/desktop/src-tauri/src/indexing/DETAILS.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -373,6 +373,8 @@ A FIRST scan (empty DB) truncates and bulk-builds — fastest on an empty table.
373373

374374
**The single-aggregate coverage constraint (load-bearing).** After the reconcile walk, the rescan path stamps every re-listed dir (`MarkDirsListed`) and runs ONE bottom-up `ComputeAllAggregates`. It must NOT fire `PropagateMinSubtreeEpoch` per dir: the gate measured per-dir propagation across ~37k dirs at ~2× SLOWER than a truncate rebuild (the ancestor-walk degenerates toward O(dirs × depth) when every dir re-stamps to the same new epoch), while a single bottom-up aggregate is faster than truncate. After a reconcile the writer's accumulator maps are empty (reconcile uses `UpsertEntryV2`/`Delete*`, never `InsertEntriesV2`, and `UpsertEntryV2` doesn't populate the maps), so `ComputeAllAggregates` takes the O(dirs) bulk-SQL `compute_all_aggregates_reported` path — which recomputes coverage AND sizes for the whole tree in one pass. Per-dir `PropagateMinSubtreeEpoch` stays ONLY for the small-scope LIVE reconciles (`reconcile_subtree`: per-navigation verifier, `MustScanSubDirs`, SMB-overflow `FullRefresh`), where the chain is short.
375375

376+
**Decision: the full reconcile suppresses per-entry ancestor propagation (`SetDeltaPropagation`).** The single-aggregate rule above governs the FINISH; this governs the WALK. Each `UpsertEntryV2`/`DeleteEntryById`/`DeleteSubtreeById` the diff emits would otherwise auto-walk the ancestor `dir_stats` chain (`propagate_delta_by_id` + `propagate_min_subtree_epoch` + `propagate_recursive_has_symlinks`) — O(entries × depth) across an entire pass. On a large delta (a 270k→6M partial-completion) that wedged the writer for hours: the channel stays full, so the walk thread parks on `send` and the app can't drain. It's also pure waste, because the FINISH's one `ComputeAllAggregates` recomputes every dir's `dir_stats` from the entries table, overwriting whatever the per-entry walk produced. So both full-reconcile walkers (`local_reconcile::run_local_reconcile`, `volume_scanner::reconcile_volume_via_trait`) bracket their BFS with `reconciler::BulkReconcileGuard` — it sends `WriteMessage::SetDeltaPropagation(false)` before the walk and restores `true` on EVERY exit (clean finish, cancel, empty-root, disconnect, error, panic) via `Drop`, so the shared long-lived writer is never left non-propagating for the live event loop that follows. The writer keeps everything else under suppression: the entry insert/update/delete, the hardlink dedup (`has_sized_entry_for_inode`), and the new-directory zero-valued `dir_stats` row init (enrichment needs a row to read mid-walk; the final aggregate fills it) — ONLY the ancestor PROPAGATION is skipped. **Why the LIVE path keeps propagating:** `reconcile_subtree` and the FSEvents handlers have NO final full aggregate, so their per-entry propagation IS the mechanism that keeps `dir_stats` correct — they must never set this flag. **Don't re-add per-entry propagation to the bulk path** (it reintroduces the hours-long wedge); the guard test `bulk_reconcile_suppresses_per_entry_propagation_until_final_aggregate` (asserts dir_stats stay zero mid-walk, exact after the aggregate) pins this.
377+
376378
**The shared per-dir diff.** `reconciler::diff_dir_against_db(dir_id, live_children, db_children, writer)` is the one place the add/remove/modify/type-change diff lives. THREE walk sources feed it source-agnostic `LiveChild`s: the local live small-scope reconcile (`reconcile_subtree`, `std::fs::read_dir`), the local full-tree rescan (`local_reconcile::run_local_reconcile`, `std::fs::read_dir` BFS), and the network full rescan (`volume_scanner::reconcile_volume_via_trait`, `Volume::list_directory` BFS). It keeps `next_id` from the shared `Arc<AtomicI64>` (never `MAX(id)`); ids growing across rescans is harmless (i64). The epoch is bumped once at the scan-start funnel (continuity break), so the whole tree reads stale-but-visible and each reconciled dir flips fresh as it's re-listed (partial-rescan `/a` fresh / `/b` stale). The shared FINISH (stamp listed dirs → ONE `ComputeAllAggregates`) lives once in `reconciler::finish_reconcile`/`send_marks`, called by both full-rescan walkers so they can't drift on the marks-before-aggregate ordering.
377379

378380
**Recursion set is decoupled from the write decision (load-bearing).** `diff_dir_against_db` returns `matched_child_dirs` for EVERY child dir present in both the live listing and the DB, regardless of whether that dir's own metadata changed — and the BFS recurses into all of them. A child dir being "unchanged" at its parent's level (same mtime → no `UpsertEntryV2`) says NOTHING about whether its OWN subtree was ever listed, so the walk MUST descend into it anyway. Gating recursion on `changed` was the exact prod bug: enabling indexing on an already-partially-indexed share (root + top-level dirs known, subtrees never listed) would match the top dirs, write nothing, recurse nowhere, and "complete" in 0.0s over an unscanned share. The write decision stays change-gated (an unchanged dir emits zero rows, preserving the no-op-cheap property); only the recursion set is unconditional. Regression-locked by `reconcile_descends_into_existing_unchanged_child_dirs` (asserts every dir in a multi-level tree is re-listed, not just the root); the older `reconcile_noop_writes_zero_entry_rows` couldn't catch it because a no-op tree has zero writes AND zero legitimate recursion targets.

apps/desktop/src-tauri/src/indexing/local_reconcile.rs

Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -220,6 +220,13 @@ fn run_local_reconcile(
220220
// after a level's flush, never by absolute path.
221221
let mut new_dirs: Vec<(PathBuf, i64, String)> = Vec::new();
222222

223+
// Suppress per-entry ancestor propagation for the bulk walk; the guard restores
224+
// it on EVERY exit (clean finish, cancel, empty-root, error). The shared finish
225+
// recomputes all dir_stats via one `ComputeAllAggregates`, so the per-entry walk
226+
// would be redundant O(entries × depth) work that wedges the writer on a large
227+
// delta. See `reconciler::BulkReconcileGuard`.
228+
let _bulk_guard = reconciler::BulkReconcileGuard::begin(writer);
229+
223230
while let Some((dir_path, dir_id)) = queue.pop_front() {
224231
if cancelled.load(Ordering::Relaxed) {
225232
// Cancel: leave the prior index intact (no truncate ran) and send
@@ -873,6 +880,52 @@ mod tests {
873880
);
874881
}
875882

883+
/// End-to-end correctness on a large ADD delta through the real
884+
/// `run_local_reconcile` path (which now brackets its walk with
885+
/// `SetDeltaPropagation(false/true)`): reconciling a populated on-disk tree into
886+
/// an index that holds ONLY the root (every entry is new) must complete and
887+
/// produce correct recursive `dir_stats`.
888+
///
889+
/// This proves the bulk-mode suppression is invisible to the final result: the
890+
/// per-entry ancestor propagation is skipped for the whole walk, yet the single
891+
/// `ComputeAllAggregates` in `finish_reconcile` recomputes every dir's stats
892+
/// exactly. (The `_no_op` convergence test pins the unchanged-tree case; this
893+
/// pins the add-everything case the wedge actually hit.)
894+
#[test]
895+
fn reconcile_from_empty_index_builds_correct_aggregates() {
896+
let h = setup();
897+
let root = tree_root();
898+
let rp = root.path();
899+
// Seed ONLY the root dir (no contents): the reconcile adds the whole tree.
900+
ensure_path_in_db(&h, &norm(rp));
901+
902+
// Tree with known totals: 5 files (150 bytes), 3 dirs (a, a/deep, b).
903+
std::fs::create_dir_all(rp.join("a/deep")).unwrap();
904+
std::fs::create_dir(rp.join("b")).unwrap();
905+
std::fs::write(rp.join("a/f1.txt"), vec![b'x'; 10]).unwrap();
906+
std::fs::write(rp.join("a/f2.txt"), vec![b'x'; 20]).unwrap();
907+
std::fs::write(rp.join("a/deep/f3.txt"), vec![b'x'; 30]).unwrap();
908+
std::fs::write(rp.join("b/f4.txt"), vec![b'x'; 40]).unwrap();
909+
std::fs::write(rp.join("top.txt"), vec![b'x'; 50]).unwrap();
910+
911+
let summary = run_reconcile(&h, rp, false).expect("reconcile from empty");
912+
assert!(!summary.was_cancelled, "the reconcile must complete, not cancel");
913+
914+
// Root aggregates: every file and dir, deduped sizes summed.
915+
let root_id = resolve(&h, rp).expect("root resolved");
916+
let root_stats = dir_stats(&h, root_id).expect("root stats");
917+
assert_eq!(root_stats.recursive_file_count, 5, "root must count all 5 files");
918+
assert_eq!(root_stats.recursive_dir_count, 3, "root must count a, a/deep, b");
919+
assert_eq!(root_stats.recursive_logical_size, 150, "root must sum all file sizes");
920+
921+
// A sub-aggregate (a holds f1, f2, and a/deep/f3 = 60 bytes, 3 files, 1 dir).
922+
let a = resolve(&h, &rp.join("a")).expect("a resolved");
923+
let a_stats = dir_stats(&h, a).expect("a stats");
924+
assert_eq!(a_stats.recursive_file_count, 3, "a must count f1, f2, deep/f3");
925+
assert_eq!(a_stats.recursive_dir_count, 1, "a must count deep");
926+
assert_eq!(a_stats.recursive_logical_size, 60, "a must sum 10 + 20 + 30");
927+
}
928+
876929
/// Cancel (data-safety): a cancelled reconcile returns `was_cancelled`
877930
/// (so the completion handler writes no scan_completed_at) and sends no
878931
/// marks/aggregate, leaving the prior index intact and un-restamped.

apps/desktop/src-tauri/src/indexing/reconciler.rs

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -623,6 +623,43 @@ pub(super) fn finish_reconcile(listed_ids: &[i64], epoch: u64, writer: &IndexWri
623623
Ok(())
624624
}
625625

626+
/// RAII bracket for a FULL reconcile's bulk walk: tells the writer to STOP
627+
/// per-entry ancestor `dir_stats` propagation for the walk's duration, then
628+
/// RESTORES it on EVERY scope exit (clean finish, cancel, empty-root, error, or
629+
/// panic) so the shared, long-lived writer is never left non-propagating for the
630+
/// LIVE event loop that runs afterwards.
631+
///
632+
/// Why suppress: the full reconcile emits thousands of `UpsertEntryV2` / `Delete*`;
633+
/// letting each one walk the ancestor chain
634+
/// (`propagate_delta_by_id` / `propagate_min_subtree_epoch` /
635+
/// `propagate_recursive_has_symlinks`) is O(entries × tree-depth) and wedges the
636+
/// writer for hours on a large delta. It's also pure waste: `finish_reconcile`'s
637+
/// single `ComputeAllAggregates` recomputes every dir's stats from the entries
638+
/// table, overwriting whatever per-entry propagation produced. The LIVE path
639+
/// (`reconcile_subtree`, FSEvents) has NO final aggregate, so it MUST keep
640+
/// propagating — which is exactly why this guard restores the default on exit.
641+
///
642+
/// Both sends are best-effort: on a hard writer-gone error the restore send fails
643+
/// and is ignored, matching how the surrounding walk already treats writer sends.
644+
pub(super) struct BulkReconcileGuard {
645+
writer: IndexWriter,
646+
}
647+
648+
impl BulkReconcileGuard {
649+
/// Begin the bracket: disable per-entry propagation on the writer.
650+
pub(super) fn begin(writer: &IndexWriter) -> Self {
651+
let _ = writer.send(WriteMessage::SetDeltaPropagation(false));
652+
Self { writer: writer.clone() }
653+
}
654+
}
655+
656+
impl Drop for BulkReconcileGuard {
657+
fn drop(&mut self) {
658+
// Re-enable per-entry propagation for the subsequent live path.
659+
let _ = self.writer.send(WriteMessage::SetDeltaPropagation(true));
660+
}
661+
}
662+
626663
/// Reconcile a subtree by diffing the filesystem against the DB directory-by-directory.
627664
///
628665
/// Unlike `scanner::scan_subtree` which deletes all descendants then re-inserts,

apps/desktop/src-tauri/src/indexing/volume_scanner.rs

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -496,6 +496,13 @@ pub(crate) async fn reconcile_volume_via_trait(
496496
// completed. See `indexing/DETAILS.md` § "Non-destructive rescan".
497497
let mut new_dirs: Vec<(PathBuf, i64, String)> = Vec::new();
498498

499+
// Suppress per-entry ancestor propagation for the bulk walk; the guard restores
500+
// it on EVERY exit (clean finish, cancel, empty-root, disconnect, error). The
501+
// shared finish recomputes all dir_stats via one `ComputeAllAggregates`, so the
502+
// per-entry walk would be redundant O(entries × depth) work. See
503+
// `reconciler::BulkReconcileGuard`.
504+
let _bulk_guard = reconciler::BulkReconcileGuard::begin(&writer);
505+
499506
loop {
500507
if cancelled.load(Ordering::Relaxed) {
501508
// User cancel: stop, but leave the prior index intact (no truncate ran).

0 commit comments

Comments
 (0)