Harden Windows launcher, unsafe root guards, and purge matching#175

Closed
philipgraffshapiro wants to merge 5 commits into ory:main from philipgraffshapiro:main
Conversation

@philipgraffshapiro philipgraffshapiro commented Jun 12, 2026

Summary

  • refuse unsafe index roots before indexing or hook prewarming
  • keep agent session stores from being indexed as project roots
  • make the Windows launcher resolution work for Windows-style shells
  • make path-specific purge match Windows project_path metadata case-insensitively

Verification

  • go test ./cmd -count=1
  • go test ./internal/merkle -count=1 -v
  • go test ./... -count=1
  • rebuilt Windows binary locally and verified version 0.0.41
  • verified unsafe roots refuse before indexing

Summary by CodeRabbit

  • New Features

    • Added support for .lumenignore boundary files to control indexing scope within repositories.
    • Improved Windows platform support with case-insensitive path matching for index operations.
  • Bug Fixes

    • Prevented indexing of system directories, home directories, and agent session stores.
    • Fixed background indexing behavior for directories with explicit boundaries.
    • Corrected Windows binary detection and download in the launcher script.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

📝 Walkthrough

This PR implements root-guarding for Lumen's index operations: system roots, home directories, and agent session stores are now blocked from indexing. A new .lumenignore boundary-file mechanism prevents ancestor-index reuse within explicit boundaries. The implementation integrates unindexable checks into all index-root resolution paths (hook, resolve, index, purge) and adds cross-platform path handling for Windows.

Changes

Root Guard and Lumen Boundary File Enforcement

| Layer | File(s) | Summary |
| --- | --- | --- |
| Root guard implementation and path normalization | internal/merkle/root_guard.go, internal/merkle/ignore.go, internal/merkle/ignore_test.go | New IsRootUnindexable checks hardcoded system roots, Windows drive/subtrees, home directory, and agent session stores using symlink resolution and .lumenignore catch-all detection. Path normalization via filepath.Clean and filepath.ToSlash ensures consistent ignore matching across platforms. Windows drive-root and protected-subtree tests verify platform-specific behavior. |
| Lumen boundary file detection and ancestor integration | cmd/ancestor.go, cmd/ancestor_test.go | hasLumenBoundaryFile helper detects .lumenignore files to gate ancestor-index reuse. Tests refactored to use temp directories for cross-platform portability. |
| Hook session context with background indexing guards | cmd/hook.go, cmd/hook_test.go | generateSessionContextInternalWithDirective computes allowBackgroundIndex based on root unindexability, boundary files, and ancestor availability; early returns for unindexable roots. New tests verify spawn/no-spawn at correct boundaries, stale-index triggering, and donor messaging. |
| Resolve index-root selection with unindexable checks | cmd/resolve.go, cmd/resolve_test.go | resolveIndexRoot checks unindexable status on successful git roots; conditionally falls back to ancestor index only when no boundary file present. Test verifies boundary prevents ancestor adoption. |
| Index and stdio effective-root selection with unindexable enforcement | cmd/index.go, cmd/stdio.go, cmd/stdio_test.go | Upward walks skip unindexable candidates; git-root fallback only returns indexable roots; getOrCreate rejects unindexable with error. Tests use temp directories and add resource cleanup. |
| Purge with platform-aware path matching and unindexable filtering | cmd/purge.go, cmd/purge_test.go | Purge skips unindexable git roots; uses exact-match lookup then platform-aware ancestor selection via case-insensitive Windows path comparison and trim-separator normalization. Windows test verifies case-variant path matching. |
| Test infrastructure portability and platform guards | internal/index/index_test.go, internal/merkle/merkle_test.go, internal/git/worktree_test.go | Permission-denial tests skip on Windows; git worktree tests set GIT_CEILING_DIRECTORIES explicitly. |
| Windows binary distribution updates | scripts/run | Detects Windows-like environments and adds .exe suffix to binary candidates, download paths, and GitHub release asset names. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • ory/lumen#160: Both PRs implement the same unindexable-root / .lumenignore catch-all idea. This PR wires merkle.IsRootUnindexable into the index/root selection flow and refactors the IsRootUnindexable implementation (migrating it from internal/merkle/ignore.go toward root_guard.go), so the changes are directly connected.

Suggested reviewers

  • aeneasr
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 19.61% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically summarizes the main changes across the PR: Windows launcher hardening (scripts/run), root safety guards (internal/merkle/root_guard.go), and purge matching improvements (cmd/purge.go). |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/ancestor.go (1)

35-49: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Stop the ancestor walk at .lumenignore boundaries.

findAncestorIndex() never breaks when it reaches an ancestor containing .lumenignore, so boundary/subdir can still reuse a DB above that boundary. The current callers only check hasLumenBoundaryFile() on the initial cwd / searchPath, so the boundary contract is bypassed for descendants.

🛠️ Suggested fix
 func findAncestorIndex(path, model string) string {
 	candidate := filepath.Dir(path)
 	for {
 		unindexable, _ := merkle.IsRootUnindexable(candidate)
 		if !pathCrossesSkipDir(candidate, path) && !unindexable {
 			if _, err := os.Stat(config.DBPathForProject(candidate, model)); err == nil {
 				return candidate
 			}
 		}
+		if hasLumenBoundaryFile(candidate) {
+			break
+		}
 		parent := filepath.Dir(candidate)
 		if parent == candidate {
 			break
 		}
 		candidate = parent
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ancestor.go` around lines 35 - 49, The ancestor search in
findAncestorIndex() must stop walking up when it hits a lumen boundary file;
update findAncestorIndex to, at each candidate directory before checking DB
presence, call hasLumenBoundaryFile(candidate) and break the loop (return empty
string or stop search) if it returns true so descendants can't reuse a DB above
a .lumenignore boundary; keep the existing checks (pathCrossesSkipDir,
merkle.IsRootUnindexable, os.Stat on config.DBPathForProject) otherwise and
ensure the function returns appropriately when a boundary is encountered.
🧹 Nitpick comments (1)
cmd/purge.go (1)

215-223: ⚡ Quick win

Consider adding filepath.Clean for defensive path normalization.

The pathIsUnder function doesn't clean the paths before comparison, while the similar sameOrUnderRoot function in internal/merkle/root_guard.go (lines 161-171) does:

func sameOrUnderRoot(path, root string) bool {
    path = strings.ToLower(filepath.Clean(path))
    root = strings.ToLower(filepath.Clean(root))
    // ...
}

Although the paths should already be normalized in practice (from filepath.Abs and filepath.EvalSymlinks upstream), explicitly cleaning both path and root before comparison would be more defensive against edge cases (e.g., stored paths with redundant elements like .\ or //) and more consistent with the existing pattern in root_guard.go.

♻️ Suggested defensive improvement
 func pathIsUnder(path, root string) bool {
+	path = filepath.Clean(path)
+	root = filepath.Clean(root)
 	root = strings.TrimRight(root, `\/`)
 	candidate := root + string(filepath.Separator)
 	if runtime.GOOS == "windows" {
 		path = strings.ToLower(path)
 		candidate = strings.ToLower(candidate)
 	}
 	return strings.HasPrefix(path, candidate)
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/purge.go` around lines 215 - 223, The pathIsUnder function should
defensively normalize inputs like sameOrUnderRoot does: call filepath.Clean on
both path and root before trimming and comparison to remove redundant elements;
then proceed to trim the trailing separators on root, build candidate (root +
string(filepath.Separator)), and perform the case-insensitive conversion only
when runtime.GOOS == "windows" as currently implemented so the behavior is
unchanged but more robust against unclean paths.
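
The effect of cleaning before the prefix comparison can be seen in a small standalone sketch. This is not the code in cmd/purge.go: the foldCase parameter is a hypothetical replacement for the runtime.GOOS == "windows" check so that both behaviours can be exercised on any platform, and the length check that excludes the root itself is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// pathIsUnder reports whether path lies strictly below root. Both inputs are
// cleaned first, so redundant elements such as "./" or "//" do not affect the
// result. foldCase (hypothetical) enables case-insensitive matching.
func pathIsUnder(path, root string, foldCase bool) bool {
	path = filepath.Clean(path)
	root = filepath.Clean(root)

	// Compare against root plus one separator so "/srv/proj" does not match
	// a sibling like "/srv/project2".
	sep := string(filepath.Separator)
	prefix := root
	if !strings.HasSuffix(prefix, sep) {
		prefix += sep
	}
	if foldCase {
		path = strings.ToLower(path)
		prefix = strings.ToLower(prefix)
	}
	return len(path) > len(prefix) && strings.HasPrefix(path, prefix)
}

func main() {
	fmt.Println(pathIsUnder("/srv/proj/./sub//x", "/srv/proj/", false)) // true
	fmt.Println(pathIsUnder("/srv/project2", "/srv/proj", false))       // false
	fmt.Println(pathIsUnder("/SRV/Proj/sub", "/srv/proj", true))        // true
}
```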
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 47d74599-2ec4-4966-bc2e-23002db39b95

📥 Commits

Reviewing files that changed from the base of the PR and between d0dee0e and a8d1b97.

📒 Files selected for processing (18)
  • cmd/ancestor.go
  • cmd/ancestor_test.go
  • cmd/hook.go
  • cmd/hook_test.go
  • cmd/index.go
  • cmd/purge.go
  • cmd/purge_test.go
  • cmd/resolve.go
  • cmd/resolve_test.go
  • cmd/stdio.go
  • cmd/stdio_test.go
  • internal/git/worktree_test.go
  • internal/index/index_test.go
  • internal/merkle/ignore.go
  • internal/merkle/ignore_test.go
  • internal/merkle/merkle_test.go
  • internal/merkle/root_guard.go
  • scripts/run

Comment thread cmd/hook.go
Comment on lines 142 to +162
 if root, err := git.RepoRoot(cwd); err == nil {
-	cwd = root
-} else if ancestor := findAncestorIndex(cwd, modelName); ancestor != "" {
-	cwd = ancestor
+	if unindexable, _ := merkle.IsRootUnindexable(root); !unindexable {
+		cwd = root
+		allowBackgroundIndex = true
+	} else if !hasLumenBoundaryFile(cwd) {
+		if ancestor := findAncestorIndex(cwd, modelName); ancestor != "" {
+			cwd = ancestor
+			allowBackgroundIndex = true
+		}
+	}
+} else if !hasLumenBoundaryFile(cwd) {
+	if ancestor := findAncestorIndex(cwd, modelName); ancestor != "" {
+		cwd = ancestor
+		allowBackgroundIndex = true
+	}
+} else {
+	allowBackgroundIndex = true
+}
+if hasLumenBoundaryFile(cwd) {
+	allowBackgroundIndex = true
 }

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

.lumenignore boundaries are still bypassed on the git-root path in cmd/hook.go and cmd/resolve.go. generateSessionContextInternalWithDirective in cmd/hook.go and resolveIndexRoot in cmd/resolve.go only consult hasLumenBoundaryFile(...) on fallback branches. When the current directory is a bounded subdirectory inside a normal git repo, both functions still collapse to the repo root, so SessionStart can prewarm the wrong index and search can open the wrong DB. Please short-circuit on the boundary before any git-root adoption in both files, and add matching regression cases in cmd/hook_test.go and cmd/resolve_test.go for “git repo exists + boundary in subdir”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/hook.go` around lines 142 - 162, The current logic in
generateSessionContextInternalWithDirective (cmd/hook.go) and resolveIndexRoot
(cmd/resolve.go) checks git.RepoRoot and may adopt the repo root before honoring
.lumenignore boundaries, causing background indexing and DB opens at the wrong
root; update both functions to call hasLumenBoundaryFile(cwd) first and
short-circuit (keep cwd and not adopt git.RepoRoot) when a boundary exists, only
falling back to git.RepoRoot/findAncestorIndex if no boundary is present,
preserving allowBackgroundIndex semantics; after changing the control flow
around hasLumenBoundaryFile, add regression tests in cmd/hook_test.go and
cmd/resolve_test.go for the case "git repo exists + boundary in subdir" to
assert cwd remains the subdir and SessionStart/prewarm and resolveIndexRoot use
the bounded path.
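
The control flow requested here (check the boundary first, then the git root, then an ancestor index) can be sketched in isolation. Everything below is hypothetical: resolveRoot and the probes struct are illustration-only stand-ins for the real hasLumenBoundaryFile, git.RepoRoot, merkle.IsRootUnindexable, and findAncestorIndex calls, and allowBackgroundIndex handling is omitted.

```go
package main

import "fmt"

// probes bundles the lookups the resolver needs. All fields are hypothetical
// stand-ins for the project's real helpers.
type probes struct {
	hasBoundary func(dir string) bool
	gitRoot     func(dir string) (string, bool)
	unindexable func(dir string) bool
	ancestor    func(dir string) string
}

// resolveRoot picks the index root for cwd. A boundary file in cwd wins over
// everything else, so a bounded subdirectory never collapses to the repo root.
func resolveRoot(cwd string, p probes) string {
	if p.hasBoundary(cwd) {
		return cwd
	}
	if root, ok := p.gitRoot(cwd); ok && !p.unindexable(root) {
		return root
	}
	if a := p.ancestor(cwd); a != "" {
		return a
	}
	return cwd
}

func main() {
	p := probes{
		hasBoundary: func(dir string) bool { return dir == "/repo/bounded" },
		gitRoot:     func(dir string) (string, bool) { return "/repo", true },
		unindexable: func(dir string) bool { return false },
		ancestor:    func(dir string) string { return "" },
	}
	fmt.Println(resolveRoot("/repo/bounded", p)) // /repo/bounded
	fmt.Println(resolveRoot("/repo/other", p))   // /repo
}
```

The two calls in main correspond to the "git repo exists + boundary in subdir" regression case and the ordinary unbounded case.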

Comment on lines +107 to +120
+	if refusedRoots[clean] || refusedRoots[resolved] {
+		return true, "hardcoded system root"
+	}
+	if isRefusedRootSubtree(clean) || isRefusedRootSubtree(resolved) {
+		return true, "hardcoded system root"
+	}
+	if home, err := os.UserHomeDir(); err == nil {
+		homeClean := filepath.Clean(home)
+		homeResolved := resolvePath(home)
+		if homeClean == clean || homeClean == resolved || homeResolved == clean || homeResolved == resolved {
+			return true, "user home directory"
+		}
+	}
+	if isAgentSessionStoreRoot(clean) || isAgentSessionStoreRoot(resolved) {

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make the Windows exact-root checks case-insensitive.

sameOrUnderRoot() already normalizes case, but the exact-root map lookup and the home-directory comparisons still use raw string equality. On Windows that lets case variants like c:\users or a differently cased home path slip past this guard, and the downstream callers in cmd/index.go, cmd/hook.go, and cmd/resolve.go all treat IsRootUnindexable() as authoritative.

🛠️ Suggested fix
+func samePath(a, b string) bool {
+	a = filepath.Clean(a)
+	b = filepath.Clean(b)
+	if runtime.GOOS == "windows" {
+		return strings.EqualFold(a, b)
+	}
+	return a == b
+}
+
+func isRefusedExactRoot(path string) bool {
+	for root := range refusedRoots {
+		if samePath(path, root) {
+			return true
+		}
+	}
+	return false
+}
+
 func IsRootUnindexable(dir string) (bool, string) {
 	clean := filepath.Clean(dir)
 	resolved := resolvePath(dir)
 	if runtime.GOOS == "windows" && (isWindowsDriveRoot(clean) || isWindowsDriveRoot(resolved)) {
 		return true, "windows drive root"
 	}
-	if refusedRoots[clean] || refusedRoots[resolved] {
+	if isRefusedExactRoot(clean) || isRefusedExactRoot(resolved) {
 		return true, "hardcoded system root"
 	}
 	if isRefusedRootSubtree(clean) || isRefusedRootSubtree(resolved) {
 		return true, "hardcoded system root"
 	}
 	if home, err := os.UserHomeDir(); err == nil {
 		homeClean := filepath.Clean(home)
 		homeResolved := resolvePath(home)
-		if homeClean == clean || homeClean == resolved || homeResolved == clean || homeResolved == resolved {
+		if samePath(homeClean, clean) || samePath(homeClean, resolved) ||
+			samePath(homeResolved, clean) || samePath(homeResolved, resolved) {
 			return true, "user home directory"
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/merkle/root_guard.go` around lines 107 - 120, Refuse roots checks
are case-sensitive on Windows: update the exact-root map lookup and
home-directory comparisons to use normalized, case-insensitive comparisons
(e.g., lowercased or strings.EqualFold) rather than raw string equality so they
match the same normalization performed by sameOrUnderRoot(); specifically change
the checks that reference refusedRoots[clean] and refusedRoots[resolved] and the
comparisons inside the os.UserHomeDir() block (homeClean == clean, homeClean ==
resolved, homeResolved == clean, homeResolved == resolved) to compare using the
same normalized form (or EqualFold) of clean/resolved/home paths, ensuring
functions like resolvePath(), isRefusedRootSubtree(), and
isAgentSessionStoreRoot() continue to receive the normalized values.
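
An alternative to looping over the map with a samePath-style comparison is to normalize the keys once, when the set is built, and normalize each query the same way. The sketch below shows that variant. It is a different technique from the suggested fix above, not a restatement of it: rootSet, newRootSet, and the fold flag are hypothetical, and lowercasing with strings.ToLower is only an approximation of how Windows compares file names.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// rootSet is a hypothetical set of refused roots keyed by normalized path.
// fold stands in for a runtime.GOOS == "windows" check.
type rootSet struct {
	fold  bool
	roots map[string]struct{}
}

// key cleans p and, when folding is enabled, lowercases it.
func (s *rootSet) key(p string) string {
	p = filepath.Clean(p)
	if s.fold {
		p = strings.ToLower(p)
	}
	return p
}

// newRootSet builds a set whose keys are already normalized.
func newRootSet(fold bool, roots ...string) *rootSet {
	s := &rootSet{fold: fold, roots: make(map[string]struct{}, len(roots))}
	for _, r := range roots {
		s.roots[s.key(r)] = struct{}{}
	}
	return s
}

// contains normalizes p the same way as the stored keys before the lookup.
func (s *rootSet) contains(p string) bool {
	_, ok := s.roots[s.key(p)]
	return ok
}

func main() {
	folded := newRootSet(true, "/Users", "/Windows/")
	exact := newRootSet(false, "/Users")

	fmt.Println(folded.contains("/users"))   // true
	fmt.Println(folded.contains("/WINDOWS")) // true
	fmt.Println(exact.contains("/users"))    // false
}
```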

@philipgraffshapiro philipgraffshapiro closed this by deleting the head repository Jun 12, 2026