fix(index): refuse $HOME, system roots, and .lumenignore-blanketed dirs#160
Conversation
Indexing $HOME (or another tree the user has blanketed with a catch-all .lumenignore) walks the whole subtree, ignores every file, produces an empty index, and on macOS triggers TCC prompts for ~/Desktop, ~/Documents, ~/Library, etc. along the way. Multiple Claude sessions starting from $HOME amplified this into many concurrent indexers each repeating the same walk. Add merkle.IsRootUnindexable(dir), checked in two places that pick index roots: - findAncestorIndex now skips candidates that are unindexable, so walking up from a non-git subdirectory can no longer resolve to $HOME or to a user-disabled root. - runIndex refuses such targets before any config/embedder setup, so `lumen index $HOME` fails fast with a clear error. The check combines a hardcoded refusal list (/, $HOME, /Users, /tmp, /var, /etc, /usr, /Applications, /Library and macOS /private/* twins) with a .lumenignore catch-all probe (matches "**", "**/*", "*", etc.). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdds IsRootUnindexable(dir) to detect hardcoded/home/.lumenignore catch-all roots and enforces this check in runIndex and findAncestorIndex; tests exercise detection, index refusal, and ancestor traversal skipping. ChangesUn-indexable root detection and enforcement
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
internal/merkle/ignore_test.go (1)
86-193: ⚡ Quick winUse a table-driven structure for
TestIsRootUnindexable.This block repeats setup/assert logic across many inputs; a table-driven form will reduce duplication and make future case additions safer.
As per coding guidelines,
**/*_test.go: Use table-driven tests for multiple test cases in Go.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/merkle/ignore_test.go` around lines 86 - 193, Refactor TestIsRootUnindexable into a table-driven test: create a slice of test cases (struct with name, ignoreContents string, rootPath string or a flag to use t.TempDir, and expected bool) and iterate with t.Run for each case; move repeated setup (dir := t.TempDir(), writeFile calls, os.UserHomeDir handling) into per-case setup functions and call IsRootUnindexable for assertions; keep special hardcoded-paths and home-dir checks as separate cases in the table (referencing IsRootUnindexable, writeFile, t.TempDir, and os.UserHomeDir) so each scenario is a single table entry and duplicated logic is eliminated.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cmd/index.go`:
- Around line 57-63: The initial unindexable check uses
merkle.IsRootUnindexable(projectPath) on the raw CLI path but projectPath is
later normalized/rewritten (e.g., to a git root), so re-run the same check after
any normalization step that mutates projectPath; specifically, after the code
that adjusts projectPath (the block around where projectPath is
reassigned/normalized) call merkle.IsRootUnindexable(projectPath) again and
return the same fmt.Errorf(...) if true so the protection cannot be bypassed
(apply the same fix to the later spot referenced by lines 83-87).
- Around line 61-63: The refusal message wrongly blames .lumenignore for all
cases; update the code around merkle.IsRootUnindexable and projectPath so the
error text is accurate: either change merkle.IsRootUnindexable to return a
reason (e.g., (bool, string) or an error) and include that reason in the
fmt.Errorf, or replace the hardcoded message with a generic one such as
"refusing to index %s: root is unindexable (possible reasons: .lumenignore,
hardcoded root, or $HOME)"; reference merkle.IsRootUnindexable and projectPath
when making the change.
---
Nitpick comments:
In `@internal/merkle/ignore_test.go`:
- Around line 86-193: Refactor TestIsRootUnindexable into a table-driven test:
create a slice of test cases (struct with name, ignoreContents string, rootPath
string or a flag to use t.TempDir, and expected bool) and iterate with t.Run for
each case; move repeated setup (dir := t.TempDir(), writeFile calls,
os.UserHomeDir handling) into per-case setup functions and call
IsRootUnindexable for assertions; keep special hardcoded-paths and home-dir
checks as separate cases in the table (referencing IsRootUnindexable, writeFile,
t.TempDir, and os.UserHomeDir) so each scenario is a single table entry and
duplicated logic is eliminated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: ac57aeee-be4c-463e-95ba-4d47958a67f1
📒 Files selected for processing (6)
cmd/ancestor.gocmd/ancestor_test.gocmd/index.gocmd/index_test.gointernal/merkle/ignore.gointernal/merkle/ignore_test.go
…ason Addresses CodeRabbit review on PR #160: 1. `git.RepoRoot` could resolve `projectPath` upward into an un-indexable root (e.g. a git repo whose root is $HOME), bypassing the pre- normalization guard. Add a second `IsRootUnindexable` check immediately after normalization. `findAncestorIndex` already filters internally, but the git branch did not. 2. `IsRootUnindexable` now returns `(bool, string)` where the second value is the refusal reason ("hardcoded system root", "user home directory", ".lumenignore catch-all pattern"). `runIndex` includes that reason in its error so the message is accurate regardless of which branch fired. Tests refactored to table-driven form for the .lumenignore scenarios. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
internal/merkle/ignore.go (1)
291-306: ⚡ Quick winConsider adding
/Systemto the macOS refused roots.The
refusedRootsmap includes macOS-specific paths like/Applicationsand/Librarybut is missing/System, which contains macOS system files. While System Integrity Protection (SIP) prevents modification of/System, adding it to the refused list would provide users with a clear, immediate error message rather than letting the indexer attempt a slow walk that would ultimately fail.📝 Suggested addition
var refusedRoots = map[string]bool{ "/": true, "/Users": true, "/tmp": true, "/private/tmp": true, "/var": true, "/private/var": true, "/etc": true, "/usr": true, "/Applications": true, "/Library": true, + "/System": true, }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/merkle/ignore.go` around lines 291 - 306, Update the refusedRoots map in internal/merkle/ignore.go (the variable refusedRoots) to include the macOS system path "/System" with a true value so the indexer immediately refuses that root instead of attempting a slow walk; add the entry alongside the existing macOS paths like "/Library" and "/Applications".
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/merkle/ignore.go`:
- Around line 291-306: refusedRoots currently lists only Unix-style
forward-slash paths but IsRootUnindexable() uses filepath.Clean() on the
incoming path, which on Windows produces backslash paths, so lookups miss
Windows system roots; update the refusedRoots map to include canonical Windows
entries (e.g., "C:\\", "C:\\Windows", "C:\\Program Files", "C:\\Program Files
(x86)") and optionally add "/System" and "/opt" for macOS/Unix, and ensure keys
are created using filepath.Clean or filepath.FromSlash so the map keys use the
same normalization as IsRootUnindexable() (or alternatively normalize the
checked path with filepath.ToSlash before map lookup) to make matching reliable
across platforms.
- Around line 325-332: IsRootUnindexable currently uses filepath.Clean only, so
symlinked paths (e.g. a symlink to $HOME) bypass the root checks; update
IsRootUnindexable to first call filepath.EvalSymlinks on the incoming dir and
use the resolved path for comparisons (falling back to filepath.Clean if
EvalSymlinks returns an error), and also resolve the user's home directory via
filepath.EvalSymlinks after os.UserHomeDir() before comparing; ensure all
subsequent checks in IsRootUnindexable (e.g., refusedRoots lookup and home-dir
equality) use the resolved/cleaned path.
---
Nitpick comments:
In `@internal/merkle/ignore.go`:
- Around line 291-306: Update the refusedRoots map in internal/merkle/ignore.go
(the variable refusedRoots) to include the macOS system path "/System" with a
true value so the indexer immediately refuses that root instead of attempting a
slow walk; add the entry alongside the existing macOS paths like "/Library" and
"/Applications".
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: e09130cb-a90a-416a-b6d4-52b13a7ea091
📒 Files selected for processing (5)
cmd/ancestor.gocmd/index.gocmd/index_test.gointernal/merkle/ignore.gointernal/merkle/ignore_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- cmd/ancestor.go
- cmd/index_test.go
- internal/merkle/ignore_test.go
- cmd/index.go
The refusal map previously held only Unix paths and IsRootUnindexable relied on filepath.Clean alone, so on Windows the system roots were never matched and a symlink to $HOME or a system root bypassed the guard. - Add Windows entries (C:\, C:\Windows, C:\Program Files, etc.) and the missing /opt, /home, /System Unix roots. - Resolve symlinks via filepath.EvalSymlinks (fallback to Clean when the target does not exist) and compare both forms against the map so /etc still matches on macOS where it symlinks to /private/etc. - Test the symlink-to-home case and scope the hardcoded-root test to paths that actually exist on the host OS. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Indexing
$HOME(or any tree the user has blanketed with a catch-all.lumenignore) was walking the whole subtree, ignoring every file, producing an empty index — and on macOS triggering TCC prompts for~/Desktop,~/Documents,~/Library, etc. along the way. Multiple Claude/Conductor sessions starting from$HOMEamplified this into many concurrent indexers each repeating the same walk.This adds
merkle.IsRootUnindexable(dir), called from the two places that pick index roots:findAncestorIndexnow skips candidates that are unindexable, so walking up from a non-git subdirectory can no longer resolve to$HOMEor to a user-disabled root.runIndexrefuses such targets before any config/embedder setup, solumen index $HOMEfails fast with a clear error.The check combines:
/,$HOME,/Users,/tmp,/var,/etc,/usr,/Applications,/Libraryand their macOS/private/*twins..lumenignorecatch-all probe — matches againstlumen-root-probe-…andlumen-root-probe-…/…so patterns like**,**/*,*/*, or bare*register as "ignore everything" but specific patterns likenode_modules/do not.Test plan
go test ./internal/merkle/ -run TestIsRootUnindexable— covers no-file, empty file, specific patterns, doublestar, combined, single-*, comments-only, hardcoded paths,$HOME, and a sibling-of-home regression.go test ./cmd/ -run TestFindAncestorIndex— new sub-test verifies ancestor walk skips a candidate whose.lumenignoredeclares it un-indexable.go test ./cmd/ -run TestRunIndex_RefusesUnindexableRoot— verifiesrunIndexreturns an error mentioning.lumenignorebefore any config/embedder work.make lint— 0 issues.make build-local— clean build.Out of scope (follow-up)
DiscoverNestedGitReposto honor the parent's.lumenignore— only matters when a refused root somehow slips through, which the two new guards prevent.🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Tests