Harden Windows launcher, unsafe root guards, and purge matching#175
Harden Windows launcher, unsafe root guards, and purge matching#175philipgraffshapiro wants to merge 5 commits into
Conversation
|
|
📝 WalkthroughWalkthroughThis PR implements root-guarding for Lumen's index operations: system roots, home directories, and agent session stores are now blocked from indexing. A new ChangesRoot Guard and Lumen Boundary File Enforcement
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/ancestor.go (1)
35-49:⚠️ Potential issue | 🟠 Major | ⚡ Quick winStop the ancestor walk at
.lumenignoreboundaries.
findAncestorIndex()never breaks when it reaches an ancestor containing.lumenignore, soboundary/subdircan still reuse a DB above that boundary. The current callers only checkhasLumenBoundaryFile()on the initialcwd/searchPath, so the boundary contract is bypassed for descendants.🛠️ Suggested fix
func findAncestorIndex(path, model string) string { candidate := filepath.Dir(path) for { unindexable, _ := merkle.IsRootUnindexable(candidate) if !pathCrossesSkipDir(candidate, path) && !unindexable { if _, err := os.Stat(config.DBPathForProject(candidate, model)); err == nil { return candidate } } + if hasLumenBoundaryFile(candidate) { + break + } parent := filepath.Dir(candidate) if parent == candidate { break } candidate = parent🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cmd/ancestor.go` around lines 35 - 49, The ancestor search in findAncestorIndex() must stop walking up when it hits a lumen boundary file; update findAncestorIndex to, at each candidate directory before checking DB presence, call hasLumenBoundaryFile(candidate) and break the loop (return empty string or stop search) if it returns true so descendants can't reuse a DB above a .lumenignore boundary; keep the existing checks (pathCrossesSkipDir, merkle.IsRootUnindexable, os.Stat on config.DBPathForProject) otherwise and ensure the function returns appropriately when a boundary is encountered.
🧹 Nitpick comments (1)
cmd/purge.go (1)
215-223: ⚡ Quick winConsider adding
filepath.Cleanfor defensive path normalization.The
pathIsUnderfunction doesn't clean the paths before comparison, while the similarsameOrUnderRootfunction ininternal/merkle/root_guard.go(lines 161-171) does:func sameOrUnderRoot(path, root string) bool { path = strings.ToLower(filepath.Clean(path)) root = strings.ToLower(filepath.Clean(root)) // ... }Although the paths should already be normalized in practice (from
filepath.Absandfilepath.EvalSymlinksupstream), explicitly cleaning bothpathandrootbefore comparison would be more defensive against edge cases (e.g., stored paths with redundant elements like.\or//) and more consistent with the existing pattern in root_guard.go.♻️ Suggested defensive improvement
func pathIsUnder(path, root string) bool { + path = filepath.Clean(path) + root = filepath.Clean(root) root = strings.TrimRight(root, `\/`) candidate := root + string(filepath.Separator) if runtime.GOOS == "windows" { path = strings.ToLower(path) candidate = strings.ToLower(candidate) } return strings.HasPrefix(path, candidate) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cmd/purge.go` around lines 215 - 223, The pathIsUnder function should defensively normalize inputs like sameOrUnderRoot does: call filepath.Clean on both path and root before trimming and comparison to remove redundant elements; then proceed to trim the trailing separators on root, build candidate (root + string(filepath.Separator)), and perform the case-insensitive conversion only when runtime.GOOS == "windows" as currently implemented so the behavior is unchanged but more robust against unclean paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cmd/hook.go`:
- Around line 142-162: The current logic in
generateSessionContextInternalWithDirective (cmd/hook.go) and resolveIndexRoot
(cmd/resolve.go) checks git.RepoRoot and may adopt the repo root before honoring
.lumenignore boundaries, causing background indexing and DB opens at the wrong
root; update both functions to call hasLumenBoundaryFile(cwd) first and
short-circuit (keep cwd and not adopt git.RepoRoot) when a boundary exists, only
falling back to git.RepoRoot/findAncestorIndex if no boundary is present,
preserving allowBackgroundIndex semantics; after changing the control flow
around hasLumenBoundaryFile, add regression tests in cmd/hook_test.go and
cmd/resolve_test.go for the case "git repo exists + boundary in subdir" to
assert cwd remains the subdir and SessionStart/prewarm and resolveIndexRoot use
the bounded path.
In `@internal/merkle/root_guard.go`:
- Around line 107-120: Refuse roots checks are case-sensitive on Windows: update
the exact-root map lookup and home-directory comparisons to use normalized,
case-insensitive comparisons (e.g., lowercased or strings.EqualFold) rather than
raw string equality so they match the same normalization performed by
sameOrUnderRoot(); specifically change the checks that reference
refusedRoots[clean] and refusedRoots[resolved] and the comparisons inside the
os.UserHomeDir() block (homeClean == clean, homeClean == resolved, homeResolved
== clean, homeResolved == resolved) to compare using the same normalized form
(or EqualFold) of clean/resolved/home paths, ensuring functions like
resolvePath(), isRefusedRootSubtree(), and isAgentSessionStoreRoot() continue to
receive the normalized values.
---
Outside diff comments:
In `@cmd/ancestor.go`:
- Around line 35-49: The ancestor search in findAncestorIndex() must stop
walking up when it hits a lumen boundary file; update findAncestorIndex to, at
each candidate directory before checking DB presence, call
hasLumenBoundaryFile(candidate) and break the loop (return empty string or stop
search) if it returns true so descendants can't reuse a DB above a .lumenignore
boundary; keep the existing checks (pathCrossesSkipDir,
merkle.IsRootUnindexable, os.Stat on config.DBPathForProject) otherwise and
ensure the function returns appropriately when a boundary is encountered.
---
Nitpick comments:
In `@cmd/purge.go`:
- Around line 215-223: The pathIsUnder function should defensively normalize
inputs like sameOrUnderRoot does: call filepath.Clean on both path and root
before trimming and comparison to remove redundant elements; then proceed to
trim the trailing separators on root, build candidate (root +
string(filepath.Separator)), and perform the case-insensitive conversion only
when runtime.GOOS == "windows" as currently implemented so the behavior is
unchanged but more robust against unclean paths.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 47d74599-2ec4-4966-bc2e-23002db39b95
📒 Files selected for processing (18)
cmd/ancestor.gocmd/ancestor_test.gocmd/hook.gocmd/hook_test.gocmd/index.gocmd/purge.gocmd/purge_test.gocmd/resolve.gocmd/resolve_test.gocmd/stdio.gocmd/stdio_test.gointernal/git/worktree_test.gointernal/index/index_test.gointernal/merkle/ignore.gointernal/merkle/ignore_test.gointernal/merkle/merkle_test.gointernal/merkle/root_guard.goscripts/run
| if root, err := git.RepoRoot(cwd); err == nil { | ||
| cwd = root | ||
| } else if ancestor := findAncestorIndex(cwd, modelName); ancestor != "" { | ||
| cwd = ancestor | ||
| if unindexable, _ := merkle.IsRootUnindexable(root); !unindexable { | ||
| cwd = root | ||
| allowBackgroundIndex = true | ||
| } else if !hasLumenBoundaryFile(cwd) { | ||
| if ancestor := findAncestorIndex(cwd, modelName); ancestor != "" { | ||
| cwd = ancestor | ||
| allowBackgroundIndex = true | ||
| } | ||
| } | ||
| } else if !hasLumenBoundaryFile(cwd) { | ||
| if ancestor := findAncestorIndex(cwd, modelName); ancestor != "" { | ||
| cwd = ancestor | ||
| allowBackgroundIndex = true | ||
| } | ||
| } else { | ||
| allowBackgroundIndex = true | ||
| } | ||
| if hasLumenBoundaryFile(cwd) { | ||
| allowBackgroundIndex = true | ||
| } |
There was a problem hiding this comment.
.lumenignore boundaries are still bypassed on the git-root path in cmd/hook.go and cmd/resolve.go. generateSessionContextInternalWithDirective in cmd/hook.go and resolveIndexRoot in cmd/resolve.go only consult hasLumenBoundaryFile(...) on fallback branches. When the current directory is a bounded subdirectory inside a normal git repo, both functions still collapse to the repo root, so SessionStart can prewarm the wrong index and search can open the wrong DB. Please short-circuit on the boundary before any git-root adoption in both files, and add matching regression cases in cmd/hook_test.go and cmd/resolve_test.go for “git repo exists + boundary in subdir”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cmd/hook.go` around lines 142 - 162, The current logic in
generateSessionContextInternalWithDirective (cmd/hook.go) and resolveIndexRoot
(cmd/resolve.go) checks git.RepoRoot and may adopt the repo root before honoring
.lumenignore boundaries, causing background indexing and DB opens at the wrong
root; update both functions to call hasLumenBoundaryFile(cwd) first and
short-circuit (keep cwd and not adopt git.RepoRoot) when a boundary exists, only
falling back to git.RepoRoot/findAncestorIndex if no boundary is present,
preserving allowBackgroundIndex semantics; after changing the control flow
around hasLumenBoundaryFile, add regression tests in cmd/hook_test.go and
cmd/resolve_test.go for the case "git repo exists + boundary in subdir" to
assert cwd remains the subdir and SessionStart/prewarm and resolveIndexRoot use
the bounded path.
| if refusedRoots[clean] || refusedRoots[resolved] { | ||
| return true, "hardcoded system root" | ||
| } | ||
| if isRefusedRootSubtree(clean) || isRefusedRootSubtree(resolved) { | ||
| return true, "hardcoded system root" | ||
| } | ||
| if home, err := os.UserHomeDir(); err == nil { | ||
| homeClean := filepath.Clean(home) | ||
| homeResolved := resolvePath(home) | ||
| if homeClean == clean || homeClean == resolved || homeResolved == clean || homeResolved == resolved { | ||
| return true, "user home directory" | ||
| } | ||
| } | ||
| if isAgentSessionStoreRoot(clean) || isAgentSessionStoreRoot(resolved) { |
There was a problem hiding this comment.
Make the Windows exact-root checks case-insensitive.
sameOrUnderRoot() already normalizes case, but the exact-root map lookup and the home-directory comparisons still use raw string equality. On Windows that lets case variants like c:\users or a differently cased home path slip past this guard, and the downstream callers in cmd/index.go, cmd/hook.go, and cmd/resolve.go all treat IsRootUnindexable() as authoritative.
🛠️ Suggested fix
+func samePath(a, b string) bool {
+ a = filepath.Clean(a)
+ b = filepath.Clean(b)
+ if runtime.GOOS == "windows" {
+ return strings.EqualFold(a, b)
+ }
+ return a == b
+}
+
+func isRefusedExactRoot(path string) bool {
+ for root := range refusedRoots {
+ if samePath(path, root) {
+ return true
+ }
+ }
+ return false
+}
+
func IsRootUnindexable(dir string) (bool, string) {
clean := filepath.Clean(dir)
resolved := resolvePath(dir)
if runtime.GOOS == "windows" && (isWindowsDriveRoot(clean) || isWindowsDriveRoot(resolved)) {
return true, "windows drive root"
}
- if refusedRoots[clean] || refusedRoots[resolved] {
+ if isRefusedExactRoot(clean) || isRefusedExactRoot(resolved) {
return true, "hardcoded system root"
}
if isRefusedRootSubtree(clean) || isRefusedRootSubtree(resolved) {
return true, "hardcoded system root"
}
if home, err := os.UserHomeDir(); err == nil {
homeClean := filepath.Clean(home)
homeResolved := resolvePath(home)
- if homeClean == clean || homeClean == resolved || homeResolved == clean || homeResolved == resolved {
+ if samePath(homeClean, clean) || samePath(homeClean, resolved) ||
+ samePath(homeResolved, clean) || samePath(homeResolved, resolved) {
return true, "user home directory"
}
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/merkle/root_guard.go` around lines 107 - 120, Refuse roots checks
are case-sensitive on Windows: update the exact-root map lookup and
home-directory comparisons to use normalized, case-insensitive comparisons
(e.g., lowercased or strings.EqualFold) rather than raw string equality so they
match the same normalization performed by sameOrUnderRoot(); specifically change
the checks that reference refusedRoots[clean] and refusedRoots[resolved] and the
comparisons inside the os.UserHomeDir() block (homeClean == clean, homeClean ==
resolved, homeResolved == clean, homeResolved == resolved) to compare using the
same normalized form (or EqualFold) of clean/resolved/home paths, ensuring
functions like resolvePath(), isRefusedRootSubtree(), and
isAgentSessionStoreRoot() continue to receive the normalized values.
Summary
Verification
Summary by CodeRabbit
New Features
.lumenignoreboundary files to control indexing scope within repositories.Bug Fixes