- Split `indexing/search.rs` (2361 lines) into `search/{types,index,engine,query}.rs` with clear boundaries: `engine.rs` is pure (no I/O), `types.rs` is pure data, `query.rs` handles DB operations
- Move AI pipeline from `commands/` to `search/ai/{prompt,parser,mappings,query_builder}.rs`
- Split `SearchDialog.svelte` (1552 lines) into orchestrator + `AiSearchRow`, `SearchInputArea`, `SearchResults`
- Update all CLAUDE.md files, `architecture.md`, and path references
|`crash_reporter.rs`| Crash reporting |`check_pending_crash_report`, `dismiss_crash_report`, `send_crash_report`. Delegates to `crash_reporter` module. Send is skipped in dev/CI. |
-|`search.rs`| Drive search |`prepare_search_index`, `search_files`, `release_search_index`, `translate_search_query`, `parse_search_scope`. Thin wrappers over `indexing::search` module. Post-filters directory sizes after `fill_directory_sizes`. AI search uses single-pass classification prompt → `ai_response_parser` → `ai_query_builder` pipeline. |
-|`ai_response_parser.rs`| AI search parser | Key-value line parser for LLM classification responses. Validates enum fields, extracts keywords. Fallback keyword extraction when LLM fails. |
-|`ai_query_builder.rs`| AI search builder | Maps parsed LLM enums (type, time, size, scope) into `SearchQuery` fields. Merges keywords + type into single regex pattern. Deterministic date/size computation. |
+|`search.rs`| Drive search | Thin IPC wrappers over `search` module. `resolve_ai_backend` for AI provider config. Post-filters directory sizes after `fill_directory_sizes`. |
|`sync_status.rs`| Cloud sync status |`get_sync_status` — macOS delegates to `file_system::sync_status`; non-macOS returns empty map via `#[cfg]` on the function itself (not the module). |
note: brief limitation caveat if query involves unfilterable concepts
-
-Rules:
-- \"keywords\" = words likely in FILENAMES. Not descriptions.
-- Use singular forms for keywords (contract, not contracts).
-- \"I name them X\" / \"I mark them as X\" → keywords: X (not the descriptive words)
-- Only set `time` when the user explicitly mentions a time period (yesterday, last week, recent, 2024, etc.). Never default to recent/today.
-- Prefer `type` over `keywords` for well-known file categories. Don't put the type name in keywords.
-- Don't put the file format in keywords when using a type. \"PDF documents\" → type: documents. \"sqlite databases\" → type: databases.
-- If the user wants ONLY a specific format (not all files of that category), use the format as keyword without type: \"HEIC photos I haven't converted\" → keywords: .heic / note: can't determine conversion status
-- \"not in X\" / \"but not in X\" / \"excluding X\" / \"except in X\" → ALWAYS use exclude: X
-- \"ssh keys\"/\"env files\"/\"docker compose\"/\"shell scripts\" → type handles this, no keywords needed
-- For content/semantic queries (\"photos of my cat\"), set type + add a note
-
-Examples:
-\"recent invoices, I mark them rymd\" → keywords: rymd / type: documents / time: recent
-\"\u{5927}\u{304d}\u{306a}\u{52d5}\u{753b}\u{3092}\u{524a}\u{9664}\u{3057}\u{305f}\u{3044}\" → type: videos / size: large / note: can't determine safe to delete
-\"node_modules folders taking up space\" → keywords: node_modules / folders: yes / size: large
-\"screenshots from this week\" → type: screenshots / time: this_week
-\"package.json not in node_modules\" → keywords: package.json / exclude: node_modules
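The removed prompt asks the model for `key: value` pairs, and the table entry for `ai_response_parser.rs` describes a key-value line parser that validates enum fields. The real parser is not shown in this diff, so the following is only a minimal sketch of that idea. `ParsedResponse`, `parse_response`, and both allow-lists are hypothetical names, and the allow-lists contain only values that appear in the examples above.

```rust
/// Hypothetical result type; field names mirror the keys used in the prompt examples.
#[derive(Debug, Default, PartialEq)]
struct ParsedResponse {
    keywords: Vec<String>,
    file_type: Option<String>,
    time: Option<String>,
    exclude: Option<String>,
    note: Option<String>,
}

// Assumed allow-lists, limited to values seen in the prompt examples.
const KNOWN_TYPES: &[&str] = &["documents", "videos", "screenshots", "databases"];
const KNOWN_TIMES: &[&str] = &["recent", "this_week"];

/// Parse `key: value` pairs, one per line or joined by " / " as in the examples.
/// Unknown keys and enum values outside the allow-lists are silently dropped.
fn parse_response(text: &str) -> ParsedResponse {
    let mut out = ParsedResponse::default();
    for pair in text.lines().flat_map(|line| line.split(" / ")) {
        // Split at the first ':' only, so a note may itself contain colons.
        let Some((key, value)) = pair.split_once(':') else { continue };
        let value = value.trim();
        if value.is_empty() {
            continue;
        }
        match key.trim().to_ascii_lowercase().as_str() {
            "keywords" => {
                out.keywords = value.split_whitespace().map(str::to_owned).collect();
            }
            "type" if KNOWN_TYPES.contains(&value) => out.file_type = Some(value.to_owned()),
            "time" if KNOWN_TIMES.contains(&value) => out.time = Some(value.to_owned()),
            "exclude" => out.exclude = Some(value.to_owned()),
            "note" => out.note = Some(value.to_owned()),
            _ => {} // invalid enum value or unrecognized key: ignore
        }
    }
    out
}
```

Validating enum fields against an allow-list keeps a hallucinated value such as `type: spaceships` from reaching the query builder; free-text fields like `note` pass through unchanged.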
apps/desktop/src-tauri/src/indexing/CLAUDE.md
Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Full design: `docs/specs/drive-indexing/plan.md`
-**reconciler.rs** -- Buffers FSEvents during scan (capped at 500K events; overflow sets `buffer_overflow` flag forcing full rescan), replays after scan completes using event IDs to skip stale events. Processes live events for file creates/removes/modifies using integer-keyed write messages (`UpsertEntryV2`, `DeleteEntryById`, `DeleteSubtreeById`, `PropagateDeltaById`). Resolves filesystem paths to entry IDs via `store::resolve_path()` using a read connection passed by callers. Key functions (`process_fs_event`, `emit_dir_updated`) are `pub(super)` so `mod.rs` can call them directly during cold-start replay. `reconcile_subtree()` handles MustScanSubDirs by diffing filesystem vs DB directory-by-directory instead of delete-then-reinsert, making it safe to interrupt at any point.
-**verifier.rs** -- Per-navigation background readdir diff. On each directory navigation, `trigger_verification()` (called from `streaming.rs` and `operations.rs` after enrichment) is fully fire-and-forget: it spawns a task that acquires the `INDEXING` lock (never blocking the navigation thread), checks dedup/debounce via static `VerifierState` (in-flight set + recent timestamps), then spawns a second async task that: (1) reads DB children via `ReadPool`, (2) reads disk via `read_dir` (filtering through `scanner::should_exclude`), (3) diffs by normalized name, sending `UpsertEntryV2`/`DeleteEntryById`/`DeleteSubtreeById`/`PropagateDeltaById` corrections to the writer. New directories are flushed then scanned via `scan_subtree` with delta propagation. Debounce: 30s per path, max 2 concurrent verifications. Only runs after initial scan is complete (checks `scanning` flag). `invalidate()` clears state on shutdown/clear.
-- **search.rs** -- In-memory search index for whole-drive file search. Lazily loads all entries from the index DB into a `Vec<SearchEntry>` for fast parallel scanning with rayon. Filenames are arena-allocated: all names are concatenated into a single `SearchIndex.names: String` buffer, and each `SearchEntry` stores `name_offset: u32` + `name_len: u16` instead of an owned `String`. During load, `row.get_ref(col).as_str()` borrows directly from SQLite's internal buffer (zero per-row heap allocations), then pushes into the arena. `name_folded` is NOT stored in the search index — instead, the search pattern is NFD-normalized at query time on macOS (APFS filenames are already NFD). `SearchIndex::name(&self, entry)` retrieves a `&str` slice from the arena. `search()` is a pure function: compiles glob/regex patterns, parallel-filters entries, sorts by recency. Global `SEARCH_INDEX` state with `Arc<SearchIndex>`, idle timer (5 min after dialog close), backstop timer (10 min with no activity), and load cancellation via `AtomicBool` checked every 100K rows. `WRITER_GENERATION` in writer.rs tracks mutations; stale indexes are detected on search. Scope filtering: `SearchQuery` accepts optional `include_paths` (absolute paths — search only within these subtrees) and `exclude_dir_names` (directory names/patterns to exclude at any depth). Include paths are resolved to entry IDs via `store::resolve_path()` (SQLite indexed lookups, microseconds) at the call site before `search()`, stored in `include_path_ids`. `prepare_scope_filter()` reads pre-resolved IDs and compiles exclude patterns as regexes. `ScopeFilter::matches()` walks the ancestor chain via `id_to_index` (O(1) per level) after all other filters pass. `parse_scope()` parses a user-typed comma-separated scope string (with quoting, escaping, `~` expansion, `!` excludes) into a `ParsedScope` struct. IPC commands in `commands/search.rs`: `prepare_search_index` (emits `search-index-ready` event when load completes), `search_files`, `release_search_index`, `translate_search_query` (AI natural language → structured query), `parse_search_scope` (scope string → structured `ParsedScope`).
+**Search**: Moved to its own top-level module. See `src-tauri/src/search/CLAUDE.md`.
IPC commands in `commands/indexing.rs` -- thin wrappers over `IndexManager` methods.
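The removed `search.rs` description says filenames are arena-allocated: one shared `names: String` buffer, with each `SearchEntry` holding `name_offset: u32` and `name_len: u16`. As a rough illustration of that layout, here is a simplified sketch. The field names and `SearchIndex::name` come from the description; the `push` helper, the `entries` field, and the overflow handling are assumptions, and the real types carry more fields.

```rust
/// Simplified stand-in for the entry type named in the description.
struct SearchEntry {
    name_offset: u32, // byte offset of the name inside the arena
    name_len: u16,    // byte length of the name
}

/// Simplified stand-in for the index: one shared buffer plus compact entries.
#[derive(Default)]
struct SearchIndex {
    names: String,
    entries: Vec<SearchEntry>,
}

impl SearchIndex {
    /// Hypothetical helper: append a name to the arena and record where it lives.
    /// Returns `None` if the offset or length does not fit the compact fields.
    fn push(&mut self, name: &str) -> Option<usize> {
        let name_offset = u32::try_from(self.names.len()).ok()?;
        let name_len = u16::try_from(name.len()).ok()?;
        self.names.push_str(name);
        self.entries.push(SearchEntry { name_offset, name_len });
        Some(self.entries.len() - 1)
    }

    /// Borrow an entry's name as a slice of the arena; no per-entry allocation.
    fn name(&self, entry: &SearchEntry) -> &str {
        let start = entry.name_offset as usize;
        &self.names[start..start + entry.name_len as usize]
    }
}
```

Because whole `&str` values are appended, every recorded offset falls on a UTF-8 character boundary, so slicing in `name` cannot split a multi-byte character. Each entry costs 6 bytes of name bookkeeping (before padding) instead of a 24-byte `String` header plus a separate heap allocation.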