
Commit b590a54

Indexing: detect stale index, notify user, rescan
- Add `RescanReason` enum (7 variants) and `index-rescan-notification` Tauri event emitted from every code path that falls back to a full rescan
- Pre-check in `resume_or_scan()` compares stored `last_event_id` with `FSEventsGetCurrentEventId()` before starting the FSEvents stream; this prevents the 1024-capacity `try_send` channel in `cmdr-fsevent-stream` from being overwhelmed with millions of replayed events
- Truncate `entries` + `dir_stats` via new `TruncateData` writer message before rescanning a stale DB, since `INSERT OR REPLACE` on a populated table with the `platform_case` collation takes ~30 min vs ~2.5 min on empty
- Add `flush_blocking()` to `IndexWriter` for sync contexts
- Add `did_buffer_overflow()` accessor to `EventReconciler`
- Frontend: listen for `index-rescan-notification`, show info toast with reason-specific user-friendly message (8s timeout, deduped by `id: 'index-rescan'`)
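The pre-check described in the commit message can be sketched in isolation as a pure function. This is only an illustration under stated assumptions: `StartupPlan` and `plan_startup` are hypothetical names, and the threshold value is taken from the "gap > 1M" wording in the indexing notes rather than from the real constant. The sketch uses `saturating_sub` so a stored ID near `u64::MAX` cannot overflow the comparison.

```rust
/// Assumed value: the indexing notes mention a gap of more than 1M events.
const JOURNAL_GAP_THRESHOLD: u64 = 1_000_000;

/// Hypothetical outcome type for the startup decision.
#[derive(Debug, PartialEq, Eq)]
pub enum StartupPlan {
    /// Gap is small enough: replay the journal from the stored event ID.
    Replay { since_event_id: u64 },
    /// Gap is too large: truncate stale data and rescan from scratch.
    FullRescan { gap: u64 },
}

/// Decide between journal replay and a full rescan.
/// A `current_id` of 0 is treated as "unknown", so the function falls
/// through to replay, mirroring the `current_id > 0` guard in the diff.
pub fn plan_startup(last_event_id: u64, current_id: u64) -> StartupPlan {
    // saturating_sub yields 0 when current_id < last_event_id.
    let gap = current_id.saturating_sub(last_event_id);
    if current_id > 0 && gap > JOURNAL_GAP_THRESHOLD {
        StartupPlan::FullRescan { gap }
    } else {
        StartupPlan::Replay { since_event_id: last_event_id }
    }
}
```

Keeping the decision free of I/O makes the boundary cases (unknown current ID, stored ID ahead of the system ID) easy to unit test.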
1 parent 207ddee commit b590a54

6 files changed

Lines changed: 210 additions & 18 deletions


apps/desktop/src-tauri/src/indexing/CLAUDE.md

Lines changed: 8 additions & 1 deletion
@@ -32,8 +32,11 @@ App startup
 |-- init(): register IndexManagerState in Tauri
 |-- start_indexing(): create IndexManager, open SQLite, spawn writer thread
 |-- resume_or_scan():
-|   |-- macOS: Has existing index + last_event_id? -> sinceWhen replay (FSEvents journal)
+|   |-- macOS: Has existing index + last_event_id?
+|   |   |-- Pre-check: event gap > 1M? -> emit index-rescan-notification (StaleIndex), truncate entries+dir_stats, full scan
+|   |   |-- Otherwise -> sinceWhen replay (FSEvents journal)
 |   |-- Linux: Always full rescan (no event journal; existing DB used for instant enrichment)
+|   |-- Incomplete previous scan (has data but no scan_completed_at)? -> notify + fresh scan
 |   |-- Otherwise -> fresh full scan
 |
 Full scan:
@@ -118,8 +121,12 @@ Key test files are alongside each module (test functions within `#[cfg(test)]` b
 
 **APFS firmlinks**: Scan from `/` only, skip `/System/Volumes/Data`. Normalize all paths via firmlink prefix map so DB lookups work regardless of how the user navigated to a path.
 
+**Rescan notification system (`RescanReason` enum)**: Every code path that falls back to a full rescan emits an `index-rescan-notification` event with a `RescanReason` variant and human-readable details. The frontend maps each reason to a user-friendly toast message. Seven reasons: `StaleIndex` (pre-check gap), `JournalGap` (in-loop gap), `ReplayOverflow` (>1M events), `TooManySubdirRescans` (>1K MustScanSubDirs), `WatcherStartFailed`, `ReconcilerBufferOverflow` (>500K buffered events during scan), `IncompletePreviousScan` (has data but no `scan_completed_at`). The pre-check in `resume_or_scan()` catches stale indexes before starting the FSEvents stream, preventing the cmdr-fsevent-stream channel (1024 capacity, `try_send`) from being overwhelmed.
+
 ## Gotchas
 
+**INSERT OR REPLACE on a populated DB is catastrophically slow**: The `platform_case` collation (NFD + case fold on macOS) runs for every B-tree comparison during unique index lookups. On an empty DB a full scan takes ~2.5 min; on a populated DB with 5.5M entries the same scan takes ~30 min because each `INSERT OR REPLACE` triggers ~20 collation calls to traverse the B-tree. The `StaleIndex` path truncates `entries` and `dir_stats` via `TruncateData` + `flush_blocking()` before starting the scan to avoid this. Never do a full rescan into a populated DB without clearing first.
+
 **Cold-start replay enters live mode immediately after flush**: The `run_replay_event_loop` doesn't emit `index-dir-updated` during Phase 1 (replay). It collects affected paths, flushes the writer (ensuring all writes are committed), emits a single batched notification, re-enables micro-scans, and enters live mode right away (~100ms from startup). Post-replay verification (`verify_affected_dirs`) runs in a background task (`run_background_verification`) concurrently with live events. This is safe because the writer serializes all writes. Any corrections found by verification are emitted as a separate `index-dir-updated` batch.
 
 **Live events are deduplicated and batched with a 1s window**: Both `run_live_event_loop` and the Phase 3 live loop in `run_replay_event_loop` collect incoming events into a `HashMap<String, FsChangeEvent>` keyed by normalized path. On each 1s flush tick, only the deduplicated set is processed through `process_live_event`. `merge_fs_events` keeps the most significant flags when events collide: `must_scan_sub_dirs` always wins, then `removed`, then `created`, then `modified`. `UpdateLastEventId` is sent once per batch (in `process_live_batch`) instead of per-event, reducing writer channel pressure during event storms.
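The deduplication rule in the last context paragraph (`must_scan_sub_dirs` beats `removed`, which beats `created`, which beats `modified`) can be illustrated with a small sketch. It is a simplification under stated assumptions: the real `FsChangeEvent` carries several flags, whereas this sketch collapses them into one hypothetical `ChangeKind` enum whose declaration order encodes the priority. `Change` and `dedup_batch` are likewise illustrative names, not the project's API.

```rust
use std::collections::HashMap;

/// Hypothetical single-kind stand-in for the real event flags.
/// Variants are declared from lowest to highest priority, so the
/// derived `Ord` matches the precedence described in the notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ChangeKind {
    Modified,
    Created,
    Removed,
    MustScanSubDirs,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Change {
    pub path: String,
    pub kind: ChangeKind,
    pub event_id: u64,
}

/// Collapse a batch of events to one entry per path, keeping the most
/// significant kind and the highest event ID seen for that path.
pub fn dedup_batch(events: Vec<Change>) -> HashMap<String, Change> {
    let mut pending: HashMap<String, Change> = HashMap::new();
    for ev in events {
        // Copy the fields needed by the closure so `ev` can still be moved.
        let (kind, id) = (ev.kind, ev.event_id);
        pending
            .entry(ev.path.clone())
            .and_modify(|existing| {
                existing.kind = existing.kind.max(kind);
                existing.event_id = existing.event_id.max(id);
            })
            .or_insert(ev);
    }
    pending
}
```

With this shape, a burst of events for one path costs a single downstream write per flush tick, which is the stated motivation for batching.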

apps/desktop/src-tauri/src/indexing/mod.rs

Lines changed: 131 additions & 11 deletions
@@ -247,6 +247,50 @@ pub struct IndexReplayProgressEvent {
     pub estimated_total: Option<u64>,
 }
 
+/// Why a full rescan was triggered instead of incremental replay.
+/// Sent to the frontend as `index-rescan-notification` so the UI can show
+/// a transparent, user-friendly toast.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+#[serde(rename_all = "snake_case")]
+pub enum RescanReason {
+    /// Event ID gap too large — app hasn't run for a long time.
+    StaleIndex,
+    /// FSEvents journal unavailable (gap detected during replay).
+    JournalGap,
+    /// Replay processed too many events (safety limit exceeded).
+    ReplayOverflow,
+    /// Too many MustScanSubDirs events during replay.
+    TooManySubdirRescans,
+    /// DriveWatcher failed to start for replay.
+    WatcherStartFailed,
+    /// Reconciler event buffer overflowed during scan.
+    ReconcilerBufferOverflow,
+    /// Previous scan didn't complete (app crashed or was force-quit).
+    IncompletePreviousScan,
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize)]
+#[serde(rename_all = "camelCase")]
+pub struct IndexRescanNotificationEvent {
+    pub volume_id: String,
+    pub reason: RescanReason,
+    /// Human-readable details for logs (not shown to user directly).
+    pub details: String,
+}
+
+/// Emit an `index-rescan-notification` event and log the reason at INFO level.
+fn emit_rescan_notification(app: &AppHandle, volume_id: &str, reason: RescanReason, details: String) {
+    log::info!("Index rescan triggered ({reason:?}): {details}");
+    let _ = app.emit(
+        "index-rescan-notification",
+        IndexRescanNotificationEvent {
+            volume_id: volume_id.to_string(),
+            reason,
+            details,
+        },
+    );
+}
+
 // ── Response types ───────────────────────────────────────────────────
 
 #[derive(Debug, Clone, Serialize, Deserialize)]
@@ -370,6 +414,36 @@ impl IndexManager {
             if let Some(ref last_event_id_str) = status.last_event_id {
                 let last_event_id: u64 = last_event_id_str.parse().unwrap_or(0);
                 if last_event_id > 0 {
+                    // Pre-check: compare stored event ID with current system event ID.
+                    // If the gap is too large, skip replay entirely — the cmdr-fsevent-stream
+                    // channel (1024 capacity, try_send) would silently drop most events,
+                    // and replaying millions of events is slower than a fresh scan anyway.
+                    let current_id = watcher::current_event_id();
+                    if current_id > 0 && current_id > last_event_id + JOURNAL_GAP_THRESHOLD {
+                        let gap = current_id - last_event_id;
+                        emit_rescan_notification(
+                            &self.app,
+                            &self.volume_id,
+                            RescanReason::StaleIndex,
+                            format!(
+                                "Stored last_event_id={last_event_id}, current system \
+                                 event_id={current_id}, gap={gap} \
+                                 (threshold={JOURNAL_GAP_THRESHOLD}). \
+                                 The app likely hasn't run for a long time."
+                            ),
+                        );
+                        // Truncate entries + dir_stats before scanning. INSERT OR REPLACE on a
+                        // populated DB with the `platform_case` collation is extremely slow
+                        // (30 min vs 2.5 min on empty). The stale data is useless anyway.
+                        if let Err(e) = self.writer.send(WriteMessage::TruncateData) {
+                            log::warn!("Failed to send TruncateData: {e}");
+                        }
+                        if let Err(e) = self.writer.flush_blocking() {
+                            log::warn!("Failed to flush after TruncateData: {e}");
+                        }
+                        return self.start_scan();
+                    }
+
                     log::debug!(
                         "Existing index found (scan_completed_at={}, last_event_id={last_event_id}), \
                          attempting sinceWhen replay",
@@ -381,8 +455,17 @@ impl IndexManager {
             log::debug!("Existing index found but no last_event_id, starting fresh scan");
         } else if status.scan_completed_at.is_some() {
             log::debug!("Existing index found, starting rescan (no event replay on this platform)");
+        } else if status.last_event_id.is_some() {
+            emit_rescan_notification(
+                &self.app,
+                &self.volume_id,
+                RescanReason::IncompletePreviousScan,
+                "Index DB exists but scan_completed_at is not set. Previous scan likely didn't \
+                 finish."
+                    .to_string(),
+            );
         } else {
-            log::debug!("No existing index (scan_completed_at not set), starting fresh scan");
+            log::debug!("No existing index, starting fresh scan");
         }
 
         self.start_scan()
@@ -403,7 +486,12 @@ impl IndexManager {
                 log::debug!("DriveWatcher started for replay (sinceWhen={since_event_id}, current={current_id})");
             }
             Err(e) => {
-                log::warn!("Failed to start DriveWatcher for replay: {e}, falling back to full scan");
+                emit_rescan_notification(
+                    &self.app,
+                    &self.volume_id,
+                    RescanReason::WatcherStartFailed,
+                    format!("DriveWatcher failed to start for replay: {e}"),
+                );
                 return self.start_scan();
             }
         }
@@ -610,6 +698,18 @@ impl IndexManager {
             }
             log::debug!("Reconciler: buffered {buffered_count} events during scan");
 
+            if reconciler.did_buffer_overflow() {
+                emit_rescan_notification(
+                    &app,
+                    &volume_id,
+                    RescanReason::ReconcilerBufferOverflow,
+                    "The filesystem watcher buffered over 500,000 events during the \
+                     scan, exceeding the reconciler's capacity. A lot of filesystem \
+                     activity was happening during the scan."
+                        .to_string(),
+                );
+            }
+
             // Flush the writer to ensure all scan batches are committed
             // before opening the read connection. Without this, the WAL
             // snapshot may not include the latest InsertEntriesV2 batches,
@@ -1134,11 +1234,17 @@ async fn run_replay_event_loop(
             if !first_event_checked {
                 first_event_checked = true;
                 if event.event_id > since_event_id + JOURNAL_GAP_THRESHOLD {
-                    log::warn!(
-                        "Journal gap detected: stored last_event_id={since_event_id}, \
-                         first received event_id={}, gap={}",
-                        event.event_id,
-                        event.event_id - since_event_id,
+                    emit_rescan_notification(
+                        &app,
+                        &volume_id,
+                        RescanReason::JournalGap,
+                        format!(
+                            "Stored last_event_id={since_event_id}, first received event_id={}, \
+                             gap={} (threshold={JOURNAL_GAP_THRESHOLD}). FSEvents journal may \
+                             have been purged.",
+                            event.event_id,
+                            event.event_id - since_event_id,
+                        ),
                     );
                     // Re-enable micro-scans before falling back to full scan
                     micro_scans.set_replay_active(false);
@@ -1212,9 +1318,15 @@ async fn run_replay_event_loop(
             // fall back to a full scan. Handles the FDA-toggle scenario where
             // the app suddenly sees millions of previously hidden paths.
             if event_count >= REPLAY_EVENT_COUNT_LIMIT {
-                log::warn!(
-                    "Replay: event count ({event_count}) exceeded safety limit \
-                     ({REPLAY_EVENT_COUNT_LIMIT}). Aborting replay and falling back to full scan."
+                emit_rescan_notification(
+                    &app,
+                    &volume_id,
+                    RescanReason::ReplayOverflow,
+                    format!(
+                        "Replay processed {event_count} events, exceeding the safety limit of \
+                         {REPLAY_EVENT_COUNT_LIMIT}. This can happen when Full Disk Access was \
+                         toggled."
+                    ),
                 );
                 micro_scans.set_replay_active(false);
                 if let Some(tx) = fallback_tx.take() {
@@ -1321,7 +1433,15 @@ async fn run_replay_event_loop(
     // Queue any MustScanSubDirs rescans that were deferred during replay.
     // If pending_rescans overflowed, trigger a full rescan via fallback.
     if pending_rescans_overflow {
-        log::warn!("Replay: pending rescans overflowed, triggering full rescan");
+        emit_rescan_notification(
+            &app,
+            &volume_id,
+            RescanReason::TooManySubdirRescans,
+            format!(
+                "Replay accumulated more than {MAX_PENDING_RESCANS} directories needing full \
+                 rescans. This typically means a major filesystem reorganization happened."
+            ),
+        );
         if let Some(tx) = fallback_tx.take() {
             let _ = tx.send(());
         }
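The `#[serde(rename_all = "snake_case")]` attribute on `RescanReason` in the diff above determines the strings the frontend receives in the `reason` field. The sketch below mirrors that mapping by hand, without serde, so the wire names are visible at a glance. It is an illustration only: the `wire_name` method and the `ALL` constant are hypothetical helpers that do not appear in the commit.

```rust
/// Stand-alone mirror of the enum from the diff (serde derives omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RescanReason {
    StaleIndex,
    JournalGap,
    ReplayOverflow,
    TooManySubdirRescans,
    WatcherStartFailed,
    ReconcilerBufferOverflow,
    IncompletePreviousScan,
}

impl RescanReason {
    /// Hypothetical helper listing all seven variants.
    pub const ALL: [RescanReason; 7] = [
        RescanReason::StaleIndex,
        RescanReason::JournalGap,
        RescanReason::ReplayOverflow,
        RescanReason::TooManySubdirRescans,
        RescanReason::WatcherStartFailed,
        RescanReason::ReconcilerBufferOverflow,
        RescanReason::IncompletePreviousScan,
    ];

    /// Hand-written equivalent of serde's snake_case renaming.
    pub fn wire_name(self) -> &'static str {
        match self {
            RescanReason::StaleIndex => "stale_index",
            RescanReason::JournalGap => "journal_gap",
            RescanReason::ReplayOverflow => "replay_overflow",
            RescanReason::TooManySubdirRescans => "too_many_subdir_rescans",
            RescanReason::WatcherStartFailed => "watcher_start_failed",
            RescanReason::ReconcilerBufferOverflow => "reconciler_buffer_overflow",
            RescanReason::IncompletePreviousScan => "incomplete_previous_scan",
        }
    }
}
```

Note that the enclosing `IndexRescanNotificationEvent` uses `camelCase` for its field names, so the payload keys are `volumeId`, `reason`, and `details`, while the `reason` value itself stays snake_case.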

apps/desktop/src-tauri/src/indexing/reconciler.rs

Lines changed: 5 additions & 0 deletions
@@ -280,6 +280,11 @@ impl EventReconciler {
         });
     }
 
+    /// Whether the reconciler's event buffer overflowed during the scan.
+    pub(super) fn did_buffer_overflow(&self) -> bool {
+        self.buffer_overflow
+    }
+
     /// Number of buffered events (for diagnostics).
     #[cfg(test)]
     pub fn buffer_len(&self) -> usize {
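The diff only shows the accessor, not how `buffer_overflow` gets set. A minimal sketch of one way a bounded buffer could maintain such a flag follows. Everything except the `did_buffer_overflow` name and the `buffer_overflow` field is an assumption: `BoundedEventBuffer`, the configurable capacity, and the choice to drop events once the buffer is full are illustrative and may not match the real `EventReconciler`.

```rust
/// Hypothetical bounded buffer that records whether it ever overflowed.
pub struct BoundedEventBuffer {
    events: Vec<u64>,
    capacity: usize,
    buffer_overflow: bool,
}

impl BoundedEventBuffer {
    pub fn new(capacity: usize) -> Self {
        Self { events: Vec::new(), capacity, buffer_overflow: false }
    }

    /// Buffer an event ID. Once the capacity is reached, further events
    /// are dropped (an assumption) and the overflow flag is latched.
    pub fn push(&mut self, event_id: u64) {
        if self.events.len() >= self.capacity {
            self.buffer_overflow = true;
            return;
        }
        self.events.push(event_id);
    }

    /// Whether any event was dropped because the buffer was full.
    pub fn did_buffer_overflow(&self) -> bool {
        self.buffer_overflow
    }

    /// Number of events currently buffered.
    pub fn len(&self) -> usize {
        self.events.len()
    }
}
```

A latched flag like this lets the caller check once after the scan, as the `mod.rs` hunk does, instead of reacting to each dropped event.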

apps/desktop/src-tauri/src/indexing/writer.rs

Lines changed: 31 additions & 0 deletions
@@ -63,6 +63,10 @@ pub enum WriteMessage {
     /// Flush: confirms all prior messages have been committed.
     /// The writer responds through the channel after processing this message.
     Flush(oneshot::Sender<()>),
+    /// Truncate `entries` and `dir_stats` tables, preserving `meta`.
+    /// Used before a full rescan on a stale DB to avoid slow `INSERT OR REPLACE`
+    /// on a populated table with the expensive `platform_case` collation.
+    TruncateData,
     /// Begin an explicit SQLite transaction.
     /// All subsequent writes are batched until `CommitTransaction`.
     /// Dramatically reduces fsync overhead for bulk operations (replay).
@@ -138,6 +142,19 @@ impl IndexWriter {
         })
     }
 
+    /// Send a `Flush` and block until all prior messages have been committed.
+    /// Safe to call from synchronous code (no async runtime needed).
+    pub fn flush_blocking(&self) -> Result<(), IndexStoreError> {
+        let (tx, rx) = oneshot::channel();
+        self.send(WriteMessage::Flush(tx))?;
+        rx.blocking_recv().map_err(|_| {
+            IndexStoreError::Io(std::io::Error::new(
+                std::io::ErrorKind::BrokenPipe,
+                "Writer thread dropped flush reply",
+            ))
+        })
+    }
+
     /// Send a `Shutdown` message and wait for the writer thread to finish.
     ///
     /// Joins the thread to ensure all buffered writes are flushed.
@@ -388,6 +405,20 @@ fn process_message(conn: &rusqlite::Connection, msg: WriteMessage, stats: &Write
         } => {
             propagate_delta_by_id(conn, entry_id, size_delta, file_count_delta, dir_count_delta);
         }
+        WriteMessage::TruncateData => {
+            let t = Instant::now();
+            match conn.execute_batch(
+                "DELETE FROM dir_stats; DELETE FROM entries; INSERT OR IGNORE INTO entries (id, parent_id, name, is_directory, is_symlink) VALUES (1, 0, '', 1, 0);",
+            ) {
+                Ok(()) => {
+                    log::info!(
+                        "Writer: truncated entries + dir_stats ({}ms)",
+                        t.elapsed().as_millis(),
+                    );
+                }
+                Err(e) => log::warn!("Writer: truncate failed: {e}"),
+            }
+        }
         WriteMessage::ComputeAllAggregates => {
             let t = Instant::now();
             match aggregator::compute_all_aggregates(conn) {
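The truncate-then-flush sequence relies on the writer thread handling messages strictly in order, so a `Flush` reply proves that an earlier `TruncateData` has already run. The following sketch shows that pattern with only the standard library. It is not the project's implementation: it swaps the `oneshot` channel for `std::sync::mpsc`, replaces SQLite with an in-memory `Vec`, and the `Msg` and `Writer` types are hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical message set, loosely modeled on `WriteMessage`.
pub enum Msg {
    Insert(u64),
    Truncate,
    /// The writer replies on this channel once it reaches the message.
    Flush(mpsc::Sender<()>),
    Shutdown,
}

pub struct Writer {
    tx: mpsc::Sender<Msg>,
    handle: thread::JoinHandle<Vec<u64>>,
}

impl Writer {
    /// Spawn a writer thread that processes messages in arrival order.
    pub fn spawn() -> Self {
        let (tx, rx) = mpsc::channel::<Msg>();
        let handle = thread::spawn(move || {
            let mut rows = Vec::new(); // stands in for the database
            for msg in rx {
                match msg {
                    Msg::Insert(v) => rows.push(v),
                    Msg::Truncate => rows.clear(),
                    Msg::Flush(reply) => {
                        let _ = reply.send(());
                    }
                    Msg::Shutdown => break,
                }
            }
            rows
        });
        Writer { tx, handle }
    }

    pub fn send(&self, msg: Msg) -> Result<(), String> {
        self.tx.send(msg).map_err(|_| "writer thread is gone".to_string())
    }

    /// Block until every message sent before this call has been handled.
    pub fn flush_blocking(&self) -> Result<(), String> {
        let (reply_tx, reply_rx) = mpsc::channel();
        self.send(Msg::Flush(reply_tx))?;
        reply_rx.recv().map_err(|_| "writer dropped flush reply".to_string())
    }

    /// Stop the writer and return the rows it holds.
    pub fn shutdown(self) -> Vec<u64> {
        let _ = self.tx.send(Msg::Shutdown);
        self.handle.join().expect("writer thread panicked")
    }
}
```

Because the reply is sent from inside the message loop, no extra locking is needed: ordering on the channel is the synchronization.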

apps/desktop/src/lib/indexing/CLAUDE.md

Lines changed: 8 additions & 6 deletions
@@ -35,13 +35,14 @@ cancelNavPriority(path: string): Promise<void>
 
 ## Scan state (`index-state.svelte.ts`)
 
-Module-level `$state` variables (`scanning`, `entriesScanned`, `dirsFound`) react to three Tauri events:
+Module-level `$state` variables (`scanning`, `entriesScanned`, `dirsFound`) react to four Tauri events:
 
-| Event                 | Payload                                             | Effect                               |
-| --------------------- | --------------------------------------------------- | ------------------------------------ |
-| `index-scan-started`  | `{ volumeId }`                                      | `scanning = true`, counters reset    |
-| `index-scan-progress` | `{ volumeId, entriesScanned, dirsFound }`           | Update counters                      |
-| `index-scan-complete` | `{ volumeId, totalEntries, totalDirs, durationMs }` | `scanning = false`, set final counts |
+| Event                       | Payload                                             | Effect                                       |
+| --------------------------- | --------------------------------------------------- | -------------------------------------------- |
+| `index-scan-started`        | `{ volumeId }`                                      | `scanning = true`, counters reset            |
+| `index-scan-progress`       | `{ volumeId, entriesScanned, dirsFound }`           | Update counters                              |
+| `index-scan-complete`       | `{ volumeId, totalEntries, totalDirs, durationMs }` | `scanning = false`, set final counts         |
+| `index-rescan-notification` | `{ volumeId, reason, details }`                     | Show info toast with reason-specific message |
 
 **Startup race condition**: The Rust indexer starts in Tauri's `setup()` hook before the frontend registers listeners.
 `initIndexState` uses a "listen first, then query" pattern: registers event listeners, then calls `get_index_status` IPC
@@ -105,4 +106,5 @@ No unit or integration tests exist for this module yet. Manual testing via the R
 
 - `@tauri-apps/api/core`: `invoke`
 - `$lib/tauri-commands`: `listen`, `UnlistenFn`
+- `$lib/ui/toast`: `addToast` (rescan notification toasts)
 - `$lib/file-explorer/selection/selection-info-utils`: `formatNumber` (overlay only)
