Skip to content

feat(ignore-paths): allow re-including subpaths of ignored directories#187

Merged
ldm0 merged 4 commits intocardisoft:masterfrom
buggerman:feat/include-paths
May 2, 2026
Merged

feat(ignore-paths): allow re-including subpaths of ignored directories#187
ldm0 merged 4 commits intocardisoft:masterfrom
buggerman:feat/include-paths

Conversation

@buggerman
Copy link
Copy Markdown
Contributor

@buggerman buggerman commented Apr 26, 2026

what this does

adds an include_paths exception list. paths under an ignore_paths prefix can be re-included by listing them in include_paths. before this the check was a pure starts_with, so there was no way to say "ignore /a but keep /a/b indexed".

i ran into this with /Volumes. wanted to ignore mounted volumes broadly but keep one subpath where i drop media — there's no way to express that in the current config.

the check

should_ignore_path does two passes now:

  1. is path under any ignore_directories prefix? if no, keep it (unchanged).
  2. does any include_paths entry overlap path in either direction? overlap matters in both directions during traversal — if the include sits below the path, we still need to recurse through this ancestor to reach it; if the include sits above the path, the descendant is explicitly preserved.

empty include_paths is a no-op so existing setups behave identically.

what changed

  • fswalk::WalkData::new and fswalk::should_ignore_path take an extra &[PathBuf]
  • search-cache: FileNodes, SearchCache::{noop, walk_data, try_read_persistent_cache, flush_*} thread the field
  • PersistentStorage schema gains include_paths; LSF_VERSION 5 → 6
  • IPC: set_watch_config / start_logic / WatchConfigUpdate / LogicStartConfig accept include_paths: Vec<String>
  • callers that don't need the feature (lsf, the persistent-roundtrip test) pass an empty list

follow-ups (not in this PR)

  • frontend wiringcardinal/src/hooks/useAppPreferences.ts and cardinal/src/utils/watchConfig.ts still pass two args to start_logic / set_watch_config. with this change those commands now require a third arg, so the simplest fix is includePaths: [] until the preferences UI surfaces it. happy to add the frontend stub here instead if you'd prefer one PR — let me know.
  • doc note in doc/inner/search-cache.md and doc/pub/search-syntax.md describing the include-paths config.

risks

  • cache invalidation: bumping LSF_VERSION to 6 means existing on-disk caches won't load and the next launch triggers a full rescan. expected, the schema actually changed.
  • workspace api break: WalkData::new, should_ignore_path, SearchCache::noop, walk_data, try_read_persistent_cache all changed signature. nothing outside the workspace consumes them.
  • frontend breaks until wiring lands — see the TODO above. without that follow-up, start_logic will fail at startup because Tauri can't deserialize the missing field.

verified

cargo fmt --check                              # clean (workspace + cardinal/src-tauri)
cargo clippy --workspace --all-targets         # 0 warnings
cargo clippy --all-targets   (in cardinal/src-tauri) # 0 warnings
cargo check --workspace --all-targets          # ok
cargo check                  (in cardinal/src-tauri) # ok
cargo test -p fswalk -p search-cache -p lsf    # 1187 passed, 4 ignored (pre-existing)

frontend (cardinal/src) was not touched, so npm test was not run.

new fswalk tests:

  • include_path_carves_out_subtree_from_ignored_dir
  • include_path_does_not_rescue_unrelated_ignored_dirs
  • multiple_include_paths_under_same_ignored_dir

noop_cache_walk_data_propagates_paths_and_filters (renamed from ..._path_and_ignore) now also asserts include-paths thread through walk_data.

The above was drafted with help from Claude

- add include_paths to WalkData, FileNodes, and PersistentStorage
- should_ignore_path keeps an ignored ancestor walkable when an
  include path sits below it
- expose include_paths via set_watch_config and start_logic
- bump persistent storage version to 6
- tests for subtree carve-out, unrelated ignored dirs, and
  multiple includes under one ignore
CI's `Tauri fmt` job uses nightly-2025-05-09; my local machine had the
newer 2025-12-11 which left the include_paths-bearing argument lists on
single lines. wrap them as the older rustfmt wants.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an include_paths exception list to the filesystem ignore logic so subpaths under ignored directories can be re-included, and threads that configuration through cache persistence and the Tauri IPC layer.

Changes:

  • Extend fswalk::should_ignore_path / WalkData to support include_paths overlap semantics and add targeted tests.
  • Thread include_paths through search-cache (cache construction, noop, walk_data, persistence schema/version bump).
  • Extend Tauri commands/config structs to accept include_paths and pass it into cache building/loading.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
search-cache/tests/scan_cancellation.rs Updates noop/walk_data tests for the new include-paths parameter threading.
search-cache/tests/edge_cases_comprehensive.rs Updates WalkData construction to include the new include-paths argument.
search-cache/src/tests/cache_flow.rs Updates persistent roundtrip test to pass include-paths.
search-cache/src/persistent.rs Adds include_paths to persisted schema and bumps LSF_VERSION 5→6.
search-cache/src/file_nodes.rs Stores and exposes include_paths alongside ignore_paths; updates into_parts.
search-cache/src/cache.rs Threads include_paths through cache lifecycle (read/write/noop/walk_data/ignore checks).
lsf/src/main.rs Updates CLI usage to pass empty include_paths.
fswalk/tests/deep_walk.rs Updates tests to pass empty include-paths where appropriate.
fswalk/src/lib.rs Implements include-path exceptions in should_ignore_path and adds new unit tests.
cardinal/src-tauri/src/lib.rs Normalizes and passes include-paths into cache load/build and noop fallback.
cardinal/src-tauri/src/commands.rs Extends IPC commands/config normalization to include include_paths.
cardinal/src-tauri/src/background.rs Threads include-paths through watch config updates and cache rebuild/rescan.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread cardinal/src-tauri/src/commands.rs Outdated
Comment on lines +442 to +448
include_paths: Vec<String>,
) {
if let Some(sender) = LOGIC_START.get() {
let _ = sender.try_send(LogicStartConfig {
watch_root,
ignore_paths,
include_paths,
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start_logic now requires include_paths, but the existing JS invoke('start_logic', { watchRoot, ignorePaths }) calls omit it. This will make app startup fail due to missing argument deserialization. To keep backward compatibility, accept include_paths as optional (or apply a serde default) and treat missing values as an empty list until the frontend wiring lands.

Suggested change
include_paths: Vec<String>,
) {
if let Some(sender) = LOGIC_START.get() {
let _ = sender.try_send(LogicStartConfig {
watch_root,
ignore_paths,
include_paths,
include_paths: Option<Vec<String>>,
) {
if let Some(sender) = LOGIC_START.get() {
let _ = sender.try_send(LogicStartConfig {
watch_root,
ignore_paths,
include_paths: include_paths.unwrap_or_default(),

Copilot uses AI. Check for mistakes.
Comment on lines 401 to 405
pub fn set_watch_config(
watch_root: String,
ignore_paths: Vec<String>,
include_paths: Vec<String>,
state: State<'_, SearchState>,
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set_watch_config now requires include_paths, but the current frontend invoke payload only provides watchRoot/ignorePaths (no includePaths). As-is, Tauri deserialization will fail at runtime and the app won't be able to update watch config. Consider making include_paths optional (e.g., Option<Vec<String>> with unwrap_or_default()), or otherwise providing a backward-compatible default until the frontend is updated.

Copilot uses AI. Check for mistakes.
@hsqStephenZhang
Copy link
Copy Markdown
Collaborator

I like the idea, it looks like linux's directory permissions. In Linux, you might not have read (r) permission on a parent directory, but if you have execute (x), you can "jump over" it to access a child you do have permissions for.

Since we have this analogy/model, we might cover more combinations in the test, for example:

  1. ignore: [/Users/data, /Users/data/work/temp],include: [/Users/data/work]. Will /Users/data/work/temp still be ignored? Should we apply longest prefix match policy here?

addresses the nested-override case Stephen raised in the PR review.
previously any include overlap rescued a path; that broke configs like:

  ignore: [/a, /a/b/c]
  include: [/a/b]

where /a/b/c should stay ignored. now whichever prefix sits deeper wins,
ties keep the path. the ancestor-of-include exception stays so the
walker can still recurse through an ignored ancestor to reach the
include subtree.

- new test `deeper_ignore_re_ignores_subtree_of_an_include` covers the
  three-layer override case
- existing tests unchanged in intent and still pass under the new logic
…nfig

per Copilot's review — the existing JS frontend invokes these commands
with two args (watchRoot, ignorePaths), so requiring `include_paths` as
a non-optional Vec<String> would fail Tauri's serde deserialization at
startup and on every preferences save.

accept Option<Vec<String>> and unwrap_or_default() before threading
through. existing 2-arg JS calls deserialize cleanly; the eventual
frontend wiring just adds includePaths as a third field.
@buggerman
Copy link
Copy Markdown
Contributor Author

buggerman commented Apr 27, 2026

@hsqStephenZhang Nice analogy, and good catch on the edge case. With the original "any overlap rescues" rule, your example would have left /Users/data/work/temp un-ignored, which is wrong. A more specific ignore should beat a broader include.

Switched to longest-prefix-match in 4abc724. Among the entries in ignore_directories and include_paths whose prefix contains the path, whichever sits deeper wins. Ties keep the path. So for your config:

ignore:  [/Users/data, /Users/data/work/temp]
include: [/Users/data/work]
  • /Users/data/work/file.txt → kept (include depth 3 > ignore depth 2)
  • /Users/data/work/temp/file.txt → ignored (ignore depth 4 > include depth 3) ✓
  • /Users/data/file.txt → ignored (only ignore matches)

The ancestor-of-include exception stays so the walker can still recurse through an ignored ancestor to reach an include below it. New test deeper_ignore_re_ignores_subtree_of_an_include covers the three-layer override.

@buggerman
Copy link
Copy Markdown
Contributor Author

Copilot is right on lines 448 (start_logic) and 405 (set_watch_config). Same runtime risk i flagged in the PR description, and Option<Vec<String>> + unwrap_or_default() is the cleanest fix. It keeps the existing 2-arg frontend calls working until the UI wiring lands.

Done in 8b169d5 for both commands. LogicStartConfig and WatchConfigUpdate keep the inner Vec<String> so nothing further down the chain has to learn about the option.

Copy link
Copy Markdown
Member

@ldm0 ldm0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@buggerman Nice addition, overall it looks good. Please add frontend wiring in this PR too, which makes it more complete.

@buggerman
Copy link
Copy Markdown
Contributor Author

@buggerman Nice addition, overall it looks good. Please add frontend wiring in this PR too, which makes it more complete.

Alright. Coming soon.

@ldm0
Copy link
Copy Markdown
Member

ldm0 commented May 2, 2026

Okay, let's merge it first. We would like to add this feature in 0.1.24.

@ldm0 ldm0 merged commit 78b3bde into cardisoft:master May 2, 2026
10 checks passed
ldm0 pushed a commit that referenced this pull request May 2, 2026
* feat(preferences): surface include paths in the preferences UI

ships the frontend half of the include-paths feature merged in #187. the
rust side already accepts an optional `includePaths` arg on start_logic
and set_watch_config; this PR drives them.

- new useIncludePaths hook, mirroring useIgnorePaths (storage key
  `cardinal.includePaths`, defaults to empty)
- thread includePaths through useAppPreferences: read from the new hook,
  pass to start_logic on first launch, send via setWatchConfig when the
  user edits preferences, include in the change-detection equality check
- new textarea section in PreferencesOverlay below the existing ignore
  paths, with the same one-path-per-line shape and absolute-path
  validation
- WatchConfigPayload gains includePaths so the type forces every call
  site to ship it

* feat(i18n): translate include-paths labels into all 15 locales

mirrors the existing ignorePaths block — label, help, errors.absolute —
across en-US, ar-SA, de-DE, es-ES, fr-FR, hi-IN, it-IT, ja-JP, ko-KR,
pt-BR, ru-RU, tr-TR, uk-UA, zh-CN, zh-TW.

* test(preferences): cover include-paths wiring

- new useIncludePaths spec mirroring useIgnorePaths (default empty,
  hydrate, clean+persist updates, fallback on invalid JSON)
- useAppPreferences: mock useIncludePaths, thread includePaths through
  every existing start_logic / setWatchConfig assertion, add a test for
  the include-paths-only change path
- PreferencesOverlay: include the new textarea and field in baseProps,
  add a save-include-paths test, add a validation-blocks-save test, and
  cover the include-paths reset

* fix(preferences): generalize useIncludePaths writeErrorMessage

`useStoredState` uses `writeErrorMessage` for any write failure, not
just the initial default-persist. dropping "default" from the text so
a later user-initiated save that fails doesn't get logged as if the
default write was the problem. per Copilot's review on #195.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants