Skip to content

feat: worker isolation stability#4

Merged
php-workx merged 22 commits into
mainfrom
feat/worker-isolation-stability
May 8, 2026
Merged

feat: worker isolation stability#4
php-workx merged 22 commits into
mainfrom
feat/worker-isolation-stability

Conversation

@php-workx
Copy link
Copy Markdown
Owner

@php-workx php-workx commented May 7, 2026

Summary by CodeRabbit

  • New Features
    • Per-ticket isolated worktree support for workers/reviewers, integration worktrees, and safer merge-back flows; explicit worktree routing for ticket runs.
  • Documentation
    • New contributor/agent guides, reviewer prompts, and expanded design docs describing worker isolation, worktree/cache roots, and workflow guidance.
  • Improvements
    • More durable resume/verification/integration behavior, richer changed-file artifacts (raw + effective), and isolated subprocess environments for runs and verifications.
  • Tests
    • Broad new and updated tests covering worktree lifecycle, diffs, verification, resume, and merge scenarios.
  • Misc
    • Updated ignore list to exclude Go build/test cache and adjusted local task wiring.

@php-workx php-workx self-assigned this May 7, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 36b9d149-56ae-4c68-8c6f-81ce07e8291e

📥 Commits

Reviewing files that changed from the base of the PR and between 471ffd0 and 063767b.

📒 Files selected for processing (2)
  • internal/engine/worktree.go
  • internal/engine/worktree_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal/engine/worktree.go
  • internal/engine/worktree_test.go

Walkthrough

Adds per-ticket Git worktree isolation and orchestration: worktree lifecycle APIs, isolated subprocess environments, verification wiring to run commands from ticket worktrees, durable wave integration/transaction state for resume, extensive tests, and documentation updates. Single-ticket CLI runs remain backward compatible.

Changes

Worker Isolation via Git Worktrees

Layer / File(s) Summary
Schemas & Contracts
internal/adapters/runtime/types.go, internal/state/types.go, internal/engine/...
Adds WorktreePath/WorktreeRoot fields to runtime request types and extends artifact types with RawChangedFiles and EffectiveChangedFiles.
Design & Documentation
docs/plans/worker-isolation.md, docs/review-prompts/*, AGENTS.md, CLAUDE.md, README.md
Complete semantic spec and checklist for worker isolation, reviewer prompts for crash/repo-safety review, AGENTS/CLAUDE guidance, and README worktree-cache resolution docs; .verk/config.yaml content removed.
Git Adapter Worktree Support
internal/adapters/repo/git/repo.go, internal/adapters/repo/git/repo_test.go
DiffAgainst emits untracked-file diffs; adds CreateWorktree/RemoveWorktree/ErrWorktreeExists, canonicalization and registry helpers, centralized git env handling, and tests for worktree lifecycle and diffs.
Process Environment Isolation
internal/adapters/runtime/cacheenv.go, internal/adapters/runtime/cacheenv_test.go
BuildIsolatedProcessEnv creates per-state cache dirs under .verk/process-cache, de-duplicates and deterministically sorts env entries, and returns isolated envs; tests validate expected vars and directories.
Runtime Adapter & Prompts
internal/adapters/runtime/claude/adapter.go, internal/adapters/runtime/codex/adapter.go, internal/adapters/runtime/prompt.go, tests
Adapters build isolated envs using WorktreePath, pass workDir to command runners, consider stdout+stderr for status/classification, and update prompts to allow safe local commits in assigned worktree while forbidding commits/edits during review.
Verification Runner
internal/adapters/verify/command/runner.go, tests
RunCommands/RunQualityCommands accept workDir (default repoRoot), run commands with cmd.Dir=workDir, construct env via BuildIsolatedProcessEnv, constrain per-quality command paths, and populate CommandResult.Cwd.
Engine Worktree & Integration
internal/engine/worktree.go, internal/engine/worktree_test.go
Adds validateWorktreeBelongsToRepo, WorktreeManager and integration behaviors, merge-to-main application logic, conflict detection, reconciliation/pruning, and broad tests covering merge semantics and stale cache cleanup.
Epic & Wave Orchestration
internal/engine/epic_run.go, internal/engine/wave_verify.go, internal/engine/wave_scheduler.go
RunEpic/resume reconcile worktree root, persist durable wave artifacts, compute baseline/raw changed files, inject per-ticket worktree paths, manage pending integration transactions, and gate integration/verification with resume support.
Ticket Run & Verification Wiring
internal/engine/ticket_run.go, tests
RunTicket accepts/resolves/validates WorktreePath, threads it through worker/reviewer/verification, records raw+filtered changed files, and runs verification with explicit workDir.
Resume & Pending Integration
internal/engine/resume.go, internal/engine/wave_verify.go, tests
Resume reconstructs integration base from cursor, rehydrates pending integration transactions, replays or completes pending verification/integration with correct worktree context, and updates cursor markers atomically.
Ticketstore Epic Gating
internal/adapters/ticketstore/tkmd/store.go, tests
Adds hasEpicType and gates deps-based child edges on frontmatter type == epic; tests ensure non-epic deps do not create children.
Tests & Helpers
many internal/*_test.go files, internal/e2e/helpers_test.go
Extensive unit/integration tests added/updated across repo, runtime adapters, verification runner, engine epic/resume/worktree flows; e2e helpers standardize git environment via testGitEnv().
Config & Misc
.gitignore, .verk/config.yaml, justfile
Added .gocache/ to .gitignore, removed .verk/config.yaml content, and adjusted justfile to introduce dev-check and change dev target.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 17

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/engine/epic_run.go (1)

586-606: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't publish wave success before diff-artifact persistence finishes.

acceptedWave is saved and EventWaveCompleted is emitted before diffArtifactErr can flip waveFailed. If diff persistence fails, RunEpic returns blocked while wave.json and progress consumers still see the wave as accepted/successful. Move the diff-artifact loop ahead of the save/event, or re-mark and re-save the wave after setting waveFailed.

Based on learnings: Treat ticket files and runtime state as product surfaces, not casual scratch files.

Also applies to: 618-646

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/engine/epic_run.go` around lines 586 - 606, The code currently saves
acceptedWave (via state.SaveJSONAtomic) and emits EventWaveCompleted with
SendProgress before handling diff-artifact persistence errors (diffArtifactErr),
allowing a success event to be published even if diff persistence later fails;
to fix, either move the diff-artifact persistence loop so it runs before saving
acceptedWave and calling SendProgress (so waveFailed/diffArtifactErr can
influence acceptedWave.Status) or, if you must keep the current ordering, detect
diffArtifactErr afterwards, set waveFailed = true and update acceptedWave.Status
accordingly and call state.SaveJSONAtomic( ... , acceptedWave) again to persist
the corrected status before emitting EventWaveCompleted (or re-emit a corrected
progress event); update RunEpic and any surrounding logic to reference
waveFailed/diffArtifactErr prior to publishing success and ensure SendProgress
uses the final persisted acceptedWave.Status.
🧹 Nitpick comments (1)
justfile (1)

23-31: ⚡ Quick win

Reduce gate duplication between dev-check and pre-push.

Right now the check list is duplicated manually, which can drift. Consider factoring shared prerequisites into one intermediate target and reusing it.

♻️ Refactor sketch
+# Shared fast gate without test execution
+pre-commit-core: format-check vet lint-check build-check mod-tidy-check actionlint betterleaks
+
 # Pre-commit: fast local checks + fresh non-race tests
-pre-commit: format-check vet lint-check build-check mod-tidy-check actionlint betterleaks test-fast
+pre-commit: pre-commit-core test-fast
 
 # Pre-push: pre-commit + race tests + vulnerability scan + semgrep
 pre-push: pre-commit test-race vuln semgrep
 
 # Dev gate: same broad checks as pre-push, but skip test-fast because test-race
 # already runs the full package set.
-dev-check: format-check vet lint-check build-check mod-tidy-check actionlint betterleaks test-race vuln semgrep
+dev-check: pre-commit-core test-race vuln semgrep
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@justfile` around lines 23 - 31, The dev-check target duplicates the same
prerequisite list as pre-push, risking drift; create a shared intermediate
target (e.g., common-checks or pre-push-common) that declares the repeated
prerequisites (format-check vet lint-check build-check mod-tidy-check actionlint
betterleaks test-race vuln semgrep) and then change both pre-push and dev-check
to depend on that single shared target (with dev-check adding only its unique
extra if any), so both targets reuse the same canonical prerequisite list.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/adapters/repo/git/repo_test.go`:
- Around line 321-328: The test helper testGitEnv() currently appends
"GIT_OPTIONAL_LOCKS=0" without removing any inherited "GIT_OPTIONAL_LOCKS"
entries, which can leave duplicate keys; update the env-filtering switch in
testGitEnv() (the switch handling keys like "GIT_DIR", "GIT_WORK_TREE", etc.) to
also drop any existing "GIT_OPTIONAL_LOCKS" entries (or filter entries that
start with "GIT_OPTIONAL_LOCKS=") before appending the canonical
"GIT_OPTIONAL_LOCKS=0" so the new value cannot be shadowed by an inherited one.

In `@internal/adapters/repo/git/repo.go`:
- Around line 269-285: The function writeNewFileDiff currently emits a hunk
header even for zero-byte files which produces an invalid patch; change it so
that when content is empty it prints only the diff/file headers and returns
before emitting the "@@ -0,0 +1,%d @@" hunk line. Specifically, in
writeNewFileDiff move the calculation/printing of the "@@ -0,0 +1,%d @@" line
(and any use of lineCount) into the non-empty-content branch (i.e., after the if
len(content) == 0 return) or alternatively return immediately after writing the
file headers, ensuring writeNewFileDiff (the function name) does not emit a hunk
header for empty content.
- Around line 372-384: The current fallback deletes targetPath even when Git
never registered it as a worktree; before calling os.RemoveAll(targetPath) in
the block around gitBytes(worktreeRoot, "worktree", "remove", ...), run
gitBytes(worktreeRoot, "worktree", "list", "--porcelain") and verify targetPath
appears in that listing (or matches the listed worktree path entries) and only
then proceed to remove; if the path is absent, return a clear error instead of
deleting; keep existing use of isWorktreeMissingFromGit/isPathMissing to
short-circuit other cases but add this explicit verification step using git
worktree list --porcelain to avoid removing unrelated directories.

In `@internal/adapters/runtime/cacheenv.go`:
- Around line 81-93: resolveProcessStateRoot currently falls back to
os.MkdirAll(verkPath) when filepath.EvalSymlinks(verkPath) fails, which silently
creates a private .verk under a linked worktree; change this to fail fast
instead. In resolveProcessStateRoot, after computing verkPath, if
filepath.EvalSymlinks(verkPath) returns an error do not call os.MkdirAll —
return a formatted error (including verkPath and the EvalSymlinks error) so
callers know the .verk link is missing or broken; this preserves the shared
cache-root semantics and avoids creating a disposable .verk under the worktree.

In `@internal/adapters/runtime/claude/adapter.go`:
- Around line 747-748: The runtimeCommandEnv function currently passes the
parent environment wholesale to runtime.BuildIsolatedProcessEnv; update it to
first strip inherited GIT-related variables (at minimum GIT_DIR, GIT_WORK_TREE,
GIT_INDEX_FILE and any other GIT_* keys) from os.Environ() before calling
BuildIsolatedProcessEnv to ensure isolation. Add a helper like stripEnvKeys(env
[]string, keys ...string) that removes entries whose key matches any blocked key
(use strings.Cut or strings.SplitN to parse "key=value"), then call
runtime.BuildIsolatedProcessEnv(stripEnvKeys(os.Environ(), "GIT_DIR",
"GIT_WORK_TREE", "GIT_INDEX_FILE", "GIT_OBJECT_DIRECTORY", "GIT_COMMON_DIR",
"GIT_OPTIONS", "GIT_AUTHOR_NAME", "GIT_AUTHOR_EMAIL"), worktreePath) from
runtimeCommandEnv.

In `@internal/adapters/verify/command/runner.go`:
- Around line 238-243: The current prefix check on cmdWorkDir is purely lexical
and can be bypassed by symlinks; before enforcing the worktree boundary for
qc.Path, resolve symlinks on the computed path and on absWorkDir (e.g., use
filepath.EvalSymlinks or equivalent) and then compare their cleaned absolute
forms (or use filepath.Rel on the resolved paths) to ensure the final resolved
cmdWorkDir does not escape absWorkDir; update the logic around cmdWorkDir,
absWorkDir, and qc.Path so the security check runs against the evaluated
filesystem paths rather than the raw joined string.

In `@internal/e2e/helpers_test.go`:
- Around line 55-72: testGitEnv appends a hardcoded "GIT_OPTIONAL_LOCKS=0" but
doesn't filter out any existing GIT_OPTIONAL_LOCKS env var, causing duplicates;
update the switch in testGitEnv to include "GIT_OPTIONAL_LOCKS" (or the exact
key "GIT_OPTIONAL_LOCKS") among the excluded keys so any inherited
GIT_OPTIONAL_LOCKS is skipped and only the appended "GIT_OPTIONAL_LOCKS=0"
remains in the returned slice.

In `@internal/engine/epic_run.go`:
- Around line 1178-1186: RunEpic currently rejects nested sub-epics because
req.WorktreeRoot/req.WorktreePath are prefilled; change the guard so recursive
child waves do not get blocked: detect whether the current run is a
nested/recursive epic (e.g., by checking parentTicketID or an IsSubEpic flag on
the request/context) and, for nested runs, skip or clear
req.WorktreeRoot/req.WorktreePath before this check so isolation is suppressed
for recursive waves until a proper sub-epic integration model exists; update the
logic around the outcome.err assignment (and the conditional that uses
req.WorktreeRoot/req.WorktreePath) to only return an error for true top-level
epic isolation cases, leaving nested/recursive epics to continue using the
shared-workspace code path.
- Around line 382-388: The code is persisting the moving integration ref
(waveBaseRef) into durable fields (wave.WaveBaseCommit and
last_wave_base_commit) so older waves can point to a ref that later advances;
instead resolve the ref once to a concrete commit SHA when creating a wave and
store that SHA in wave.WaveBaseCommit (and persist last_wave_base_commit as the
SHA), while keeping the original ref name in a separate non-durable field (e.g.,
wave.WaveBaseRefName or reuse waveBaseRef only for live integration operations).
Locate the wave initialization/creation code that sets wave.WaveBaseCommit =
waveBaseRef and change it to resolve/ref‑lookup to obtain the commit SHA, assign
that SHA to WaveBaseCommit/last_wave_base_commit, and leave the ref name only
for runtime use.

In `@internal/engine/resume.go`:
- Around line 648-663: The loop currently skips calling
waveManager.Diff/persistWorktreeDiff for outcomes with phase ==
state.TicketPhaseClosed, which causes lost artifacts if the wave later fails;
remove that conditional and always call waveManager.Diff(wave.TicketIDs[i]) and
persistWorktreeDiff(artifacts.RepoRoot, req.RunID, wave.TicketIDs[i], diff)
whenever waveManager != nil so closed-ticket worktrees are persisted too,
keeping the existing error handling that calls cleanupWaveManager() and returns
on error; references: outcomes, waveManager, wave.TicketIDs,
persistWorktreeDiff, cleanupWaveManager, state.TicketPhaseClosed.

In `@internal/engine/ticket_run.go`:
- Around line 113-122: The current canonicalization of WorktreePath in RunTicket
does not verify the worktree actually belongs to RepoRoot; add a validation step
(e.g., implement validateWorktreePath) that resolves both absRepoRoot and the
resolved worktreePath to their git common-dir (or linked-worktree set) and fails
the request if they differ. Invoke validateWorktreePath from RunTicket after
computing worktreePath (and before proceeding) and return an error when the git
common-dir or linked-worktree identity does not match absRepoRoot so external
checkouts cannot be used.
- Around line 532-539: The code currently records rawChangedFiles and then
silently drops engine-owned paths by setting st.implementation.ChangedFiles to
the filtered effectiveChangedFiles; instead, detect engine-owned mutations by
computing the difference between rawChangedFiles and effectiveChangedFiles
(i.e., files removed by filterEngineOwnedFiles) and fail fast: if any
engine-owned paths are present, return a descriptive error (including the
offending paths) rather than proceeding; perform this check after calling
collectRawChangedFiles and filterEngineOwnedFiles and before setting
st.implementation.ChangedFiles so functions like collectRawChangedFiles,
filterEngineOwnedFiles and the fields st.implementation.RawChangedFiles /
EffectiveChangedFiles / ChangedFiles are used to locate and implement the
change.

In `@internal/engine/wave_scheduler.go`:
- Around line 166-169: The wave.Acceptance entries for changed files can be nil
or omitted because uniqueSorted returns nil for empty input; update the code
that writes wave.Acceptance["changed_files_raw"] and ["changed_files_effective"]
so both keys are always present and mapped to explicit empty string slices when
there are no files: call uniqueSorted(req.RawChangedFiles) and
uniqueSorted(req.ChangedFiles), and if the result is nil replace it with an
empty []string{} before assigning to wave.Acceptance; ensure you always assign
both keys (use req.RawChangedFiles and req.ChangedFiles) so the persisted shape
is consistently [] when empty.

In `@internal/engine/wave_verify.go`:
- Around line 1013-1026: The current branch inside the hasTx path hard-fails if
the stored integration worktree is missing; instead rehydrate the integration
worktree from durable refs (the persisted base_commit and accepted_refs from
pending_wave/pending_wave_integration) before giving up so we can finish the
apply/commit transaction without re-verifying. Concretely: in the hasTx branch
(where gitRevParse, completeAlreadyAppliedPendingWaveIntegration,
integrationManagerForPendingTransaction and
completePendingWaveIntegrationTransaction are used) add the same rehydration
logic used in the pre-verification path: obtain an integration manager
(integrationManagerForPendingTransaction), and if the integration worktree/cache
is absent attempt to rebuild it from pendingWave.BaseCommit (tx.BaseCommit /
base_commit) and pendingWave.AcceptedRefs (accepted_refs) into the cache-root
semantics described in README.md; after successful rehydration, proceed to call
completePendingWaveIntegrationTransaction(req, cursor, runPath, run,
&pendingWave, wavePath, tx, integration) so the transaction completes without
re-running verification.
- Around line 1042-1064: resumePendingWaveVerification creates or reopens a
WaveIntegrationManager via integrationManagerForPendingTransaction but never
cleans it up on success or error; after integration, immediately defer its
teardown (e.g., defer
integration.Cleanup()/integration.Close()/integration.CleanupWorktree() —
whichever cleanup method exists on WaveIntegrationManager) so the integration
worktree and cache entries are removed on all exits (including before calling
runWaveVerificationLoop, clearPendingWaveVerificationOnTerminalFailure, and
completePendingWaveIntegrationTransaction); place the defer right after
integrationManagerForPendingTransaction returns and preserve the cache-root
semantics described in README.md.

In `@internal/engine/worktree.go`:
- Around line 84-106: The ticketID is used raw in filesystem paths
(worktreePath) and git refs in WorktreeManager.CreateWorktree which allows path
traversal and invalid ref chars; validate and normalize it immediately after
trimming (e.g., reject empty or any '/' or '..' segments and illegal ref
characters) or replace it with a safe encoded form (e.g., percent-encode or
hex/sha256 of the original) and then use that sanitized variable for all uses
(worktreePath and the generated ref string); update any other uses (including
the refs construction around lines 341-343) to reference the sanitized name to
ensure safe, deterministic on-disk and ref names.
- Around line 1356-1375: resolveWithinRoot and ensureWriteParent must forbid any
symlinked path components because the current lexical check lets symlinked
parents escape the root; change resolveWithinRoot to walk from absRoot through
each path component of rel, performing os.Lstat on each joined prefix and
returning an error if any component is a symlink (and also error if
filepath.EvalSymlinks(absTarget) resolves outside absRoot as a safety check),
and update ensureWriteParent to use os.Lstat instead of os.Stat and reject
symlink parents/destinations before allowing any write/remove/chmod operations
(apply the same symlink-rejection logic to the code referenced around lines
1501-1532).

---

Outside diff comments:
In `@internal/engine/epic_run.go`:
- Around line 586-606: The code currently saves acceptedWave (via
state.SaveJSONAtomic) and emits EventWaveCompleted with SendProgress before
handling diff-artifact persistence errors (diffArtifactErr), allowing a success
event to be published even if diff persistence later fails; to fix, either move
the diff-artifact persistence loop so it runs before saving acceptedWave and
calling SendProgress (so waveFailed/diffArtifactErr can influence
acceptedWave.Status) or, if you must keep the current ordering, detect
diffArtifactErr afterwards, set waveFailed = true and update acceptedWave.Status
accordingly and call state.SaveJSONAtomic( ... , acceptedWave) again to persist
the corrected status before emitting EventWaveCompleted (or re-emit a corrected
progress event); update RunEpic and any surrounding logic to reference
waveFailed/diffArtifactErr prior to publishing success and ensure SendProgress
uses the final persisted acceptedWave.Status.

---

Nitpick comments:
In `@justfile`:
- Around line 23-31: The dev-check target duplicates the same prerequisite list
as pre-push, risking drift; create a shared intermediate target (e.g.,
common-checks or pre-push-common) that declares the repeated prerequisites
(format-check vet lint-check build-check mod-tidy-check actionlint betterleaks
test-race vuln semgrep) and then change both pre-push and dev-check to depend on
that single shared target (with dev-check adding only its unique extra if any),
so both targets reuse the same canonical prerequisite list.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 31e6d930-43e1-4ed1-b8b3-087864ff465f

📥 Commits

Reviewing files that changed from the base of the PR and between 4e9d0f9 and 4d630fe.

📒 Files selected for processing (44)
  • .gitignore
  • .verk/config.yaml
  • AGENTS.md
  • CLAUDE.md
  • README.md
  • docs/plans/INDEX.md
  • docs/plans/worker-isolation.md
  • docs/review-prompts/2026-04-25-worker-isolation-distributed-workflow-crash-recovery-reviewer.md
  • docs/review-prompts/2026-04-25-worker-isolation-git-plumbing-repository-safety-reviewer.md
  • internal/adapters/repo/git/repo.go
  • internal/adapters/repo/git/repo_test.go
  • internal/adapters/runtime/cacheenv.go
  • internal/adapters/runtime/cacheenv_test.go
  • internal/adapters/runtime/claude/adapter.go
  • internal/adapters/runtime/claude/adapter_test.go
  • internal/adapters/runtime/claude/adapter_unix_test.go
  • internal/adapters/runtime/codex/adapter.go
  • internal/adapters/runtime/codex/adapter_test.go
  • internal/adapters/runtime/prompt.go
  • internal/adapters/runtime/prompt_test.go
  • internal/adapters/runtime/types.go
  • internal/adapters/ticketstore/tkmd/store.go
  • internal/adapters/ticketstore/tkmd/store_test.go
  • internal/adapters/verify/command/runner.go
  • internal/adapters/verify/command/runner_test.go
  • internal/cli/run.go
  • internal/e2e/epic_multi_wave_test.go
  • internal/e2e/helpers_test.go
  • internal/engine/epic_gate.go
  • internal/engine/epic_run.go
  • internal/engine/epic_run_test.go
  • internal/engine/resume.go
  • internal/engine/resume_test.go
  • internal/engine/ticket_run.go
  • internal/engine/ticket_run_test.go
  • internal/engine/ticket_validation.go
  • internal/engine/wave_commands_test.go
  • internal/engine/wave_scheduler.go
  • internal/engine/wave_verify.go
  • internal/engine/wave_verify_test.go
  • internal/engine/worktree.go
  • internal/engine/worktree_test.go
  • internal/state/types.go
  • justfile
💤 Files with no reviewable changes (1)
  • .verk/config.yaml

Comment thread internal/adapters/repo/git/repo_test.go
Comment thread internal/adapters/repo/git/repo.go
Comment thread internal/adapters/repo/git/repo.go
Comment thread internal/adapters/runtime/cacheenv.go
Comment thread internal/adapters/runtime/claude/adapter.go Outdated
Comment thread internal/engine/wave_scheduler.go Outdated
Comment thread internal/engine/wave_verify.go
Comment thread internal/engine/wave_verify.go
Comment thread internal/engine/worktree.go
Comment thread internal/engine/worktree.go
Address CodeRabbit review findings on PR #4:
- Fix symlink escape in resolveWithinRoot and ensureWriteParent
- Add ticket ID validation rejecting /, .., \ characters
- Add validateWorktreeBelongsToRepo using git.MainWorktreeRoot()
- Change engine-owned path handling from block to log-only
- Strip GIT_* env vars before BuildIsolatedProcessEnv
- Add GIT_OPTIONAL_LOCKS to test exclusion lists
- Add hasWorktreePath safety check before os.RemoveAll fallback
- Store resolved SHA instead of moving ref for wave base commit
- Move diff persistence before SaveJSONAtomic/SendProgress
- Remove closed-ticket diff skip in epic_run and resume
- Ensure changed_files_raw/effective are never nil
- Fix writeNewFileDiff hunk header for empty content
- Fail fast on linked worktree missing .verk symlink
- Enable allowRehydrate for passed-wave hasTx in resume
- Add defer integration.Cleanup() in resume path
- Use filepath.EvalSymlinks in verify path check
- Extract runSingleQualityCommand to reduce cognitive complexity

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
internal/engine/worktree.go (3)

88-94: ⚠️ Potential issue | 🟠 Major

Sanitize ticket IDs for Git refs, not just filesystem paths.

This guard still allows ref-breaking values like : , @{, trailing .lock, or control characters. Those IDs will survive worktree creation and then fail later when used in refs/verk/runs/.../tickets/<ticketID>, or collide unpredictably. Please validate against Git ref-name rules (or encode once) before using the ID in either namespace.

Also applies to: 344-345

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/engine/worktree.go` around lines 88 - 94, The current ticketID
sanitation only checks filesystem path chars but must enforce Git ref-name rules
before using ticketID in refs/verk/runs/.../tickets/<ticketID>; replace the
ad-hoc checks with a proper validation (or canonical encoding) by adding and
calling a helper like validateGitRefName(ticketID) (or encodeTicketID(ticketID))
and use that in the code paths that set ticketID (the variable named ticketID
around the shown block and the other occurrences around lines 344-345). The
validator should reject control characters, ASCII 0x7f, colon ':', sequences
like '@{', trailing “.lock”, a slash/backslash/back-to-root segments, leading or
trailing dots/spaces, and other characters disallowed by Git refname rules (or
alternatively return an encoded safe string such as a url-safe base64 or
percent-encoding) and propagate an error if invalid so refs are never
constructed with unsafe ticket IDs.

1497-1508: ⚠️ Potential issue | 🔴 Critical

Reject symlink destinations before merge-to-main writes.

The parent walk is hardened now, but the destination path itself is still trusted. If mainRoot/file already exists as a symlink to /outside/file, os.Stat(mainAbs), os.OpenFile(dstAbs, ...), and os.Chmod(mainAbs, ...) will follow it and mutate outside the repo. Preflight needs to fail when the destination already exists as a symlink.

Also applies to: 1516-1561

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/engine/worktree.go` around lines 1497 - 1508, Reject existing
symlink destinations before performing merge-to-main writes: when resolving the
destination (using resolveWithinRoot with mainRoot and change.destRel) use
os.Lstat on mainAbs and check for mode os.ModeSymlink (or
FileMode&os.ModeSymlink != 0); if it is a symlink return an error (e.g.,
"destination is a symlink") instead of proceeding to os.OpenFile/Chmod. Apply
the same Lstat+symlink check in the other merge-to-main path referenced (the
block around lines 1516-1561) so both the existence check and any preflight
ensureWriteParent/Chmod/OpenFile steps abort if the destination is a symlink;
keep using ensureWriteParent and resolveWithinRoot for locating paths but do not
follow symlinks when validating the target.

1842-1849: ⚠️ Potential issue | 🟠 Major

Don't treat a missing .git entry as proof the path belongs to this repo.

A nested directory inside the repo and an arbitrary external directory both lack their own .git. Returning success on os.IsNotExist lets callers point WorktreePath at something like /tmp/outside and still pass validation. Resolve repo identity from worktreePath itself instead of skipping the check.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/engine/worktree.go` around lines 1842 - 1849, The current code
returns success when filepath.Join(worktreePath, ".git") does not exist, which
incorrectly treats any external directory as valid; instead of returning nil on
os.IsNotExist(err), resolve the repository identity from worktreePath itself and
validate it against the expected repo root (do not rely solely on
resolveWithinRoot). Replace the early return with a call to a helper (or
implement inline) that derives the repo root/identity from worktreePath (for
example by walking up to find a git worktree marker or running equivalent logic
used elsewhere to determine repo identity) and compare that identity to the
repository we expect; if they match continue, otherwise return an error.
Reference symbols: worktreePath, resolveWithinRoot, the .git Lstat/gitInfo check
and the surrounding validation flow.
internal/engine/ticket_run.go (1)

539-555: ⚠️ Potential issue | 🟠 Major

Block engine-owned edits instead of just filtering them out.

This still allows a worker to modify .verk, .tickets, or .git, then continue once those paths are removed from ChangedFiles. Because those are durable engine-owned surfaces, this should fail closed and block the ticket rather than logging and proceeding.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/engine/ticket_run.go` around lines 539 - 555, The code currently
filters out engine-owned paths by comparing rawChangedFiles and
effectiveChangedFiles (via filterEngineOwnedFiles) and merely logs the filtered
names; instead, when any engine-owned paths are detected (i.e.,
len(rawChangedFiles) != len(effectiveChangedFiles)), fail the ticket run
immediately: record the offending paths and transition the run into a
blocked/failed state (use the same ticket ID context from st.req.Ticket.ID), set
st.implementation.RawChangedFiles and .EffectiveChangedFiles appropriately, and
return/raise an error so the worker cannot proceed; update any status fields on
st.implementation (e.g., status/state or error/message) and ensure the caller
observes the failure rather than continuing after removing those paths from
ChangedFiles.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@AGENTS.md`:
- Around line 15-98: Remove the embedded transient memory block identified by
the `<claude-mem-context>` tag and the "# Memory Context" section in AGENTS.md
(the large "Legend: ..." / session list block); replace it with a short durable
guideline line that instructs contributors not to inline session/ticket memory
and points them to the proper transient storage or separate memory docs.
Specifically, delete the entire `<claude-mem-context>...</claude-mem-context>`
chunk and add one or two sentences under the AGENTS.md header (e.g., "Do not
include session memory or ticket transcripts here; use [designated memory store]
or mem-search/get_observations for transient context") so the file remains
stable and durable.

In `@internal/engine/worktree.go`:
- Around line 1372-1381: The code fails when the parent directory doesn't exist
because filepath.EvalSymlinks(parent) requires the full path to exist; change
the logic to walk up from parent until you find an existing ancestor, call
filepath.EvalSymlinks on that existing ancestor, then reconstruct resolvedTarget
by joining the resolved ancestor with the relative path from that ancestor to
absTarget (use filepath.Rel to compute the suffix). Reference symbols:
absTarget, parent, filepath.EvalSymlinks, resolvedParent, resolvedTarget, and
ensureWriteParent to allow the later creation of missing directories. Ensure you
handle the case where the loop reaches absRoot (or root) as a fallback before
calling EvalSymlinks.

---

Duplicate comments:
In `@internal/engine/ticket_run.go`:
- Around line 539-555: The code currently filters out engine-owned paths by
comparing rawChangedFiles and effectiveChangedFiles (via filterEngineOwnedFiles)
and merely logs the filtered names; instead, when any engine-owned paths are
detected (i.e., len(rawChangedFiles) != len(effectiveChangedFiles)), fail the
ticket run immediately: record the offending paths and transition the run into a
blocked/failed state (use the same ticket ID context from st.req.Ticket.ID), set
st.implementation.RawChangedFiles and .EffectiveChangedFiles appropriately, and
return/raise an error so the worker cannot proceed; update any status fields on
st.implementation (e.g., status/state or error/message) and ensure the caller
observes the failure rather than continuing after removing those paths from
ChangedFiles.

In `@internal/engine/worktree.go`:
- Around line 88-94: The current ticketID sanitation only checks filesystem path
chars but must enforce Git ref-name rules before using ticketID in
refs/verk/runs/.../tickets/<ticketID>; replace the ad-hoc checks with a proper
validation (or canonical encoding) by adding and calling a helper like
validateGitRefName(ticketID) (or encodeTicketID(ticketID)) and use that in the
code paths that set ticketID (the variable named ticketID around the shown block
and the other occurrences around lines 344-345). The validator should reject
control characters, ASCII 0x7f, colon ':', sequences like '@{', trailing
“.lock”, a slash/backslash/back-to-root segments, leading or trailing
dots/spaces, and other characters disallowed by Git refname rules (or
alternatively return an encoded safe string such as a url-safe base64 or
percent-encoding) and propagate an error if invalid so refs are never
constructed with unsafe ticket IDs.
- Around line 1497-1508: Reject existing symlink destinations before performing
merge-to-main writes: when resolving the destination (using resolveWithinRoot
with mainRoot and change.destRel) use os.Lstat on mainAbs and check for mode
os.ModeSymlink (or FileMode&os.ModeSymlink != 0); if it is a symlink return an
error (e.g., "destination is a symlink") instead of proceeding to
os.OpenFile/Chmod. Apply the same Lstat+symlink check in the other merge-to-main
path referenced (the block around lines 1516-1561) so both the existence check
and any preflight ensureWriteParent/Chmod/OpenFile steps abort if the
destination is a symlink; keep using ensureWriteParent and resolveWithinRoot for
locating paths but do not follow symlinks when validating the target.
- Around line 1842-1849: The current code returns success when
filepath.Join(worktreePath, ".git") does not exist, which incorrectly treats any
external directory as valid; instead of returning nil on os.IsNotExist(err),
resolve the repository identity from worktreePath itself and validate it against
the expected repo root (do not rely solely on resolveWithinRoot). Replace the
early return with a call to a helper (or implement inline) that derives the repo
root/identity from worktreePath (for example by walking up to find a git
worktree marker or running equivalent logic used elsewhere to determine repo
identity) and compare that identity to the repository we expect; if they match
continue, otherwise return an error. Reference symbols: worktreePath,
resolveWithinRoot, the .git Lstat/gitInfo check and the surrounding validation
flow.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 68391e16-7b66-4d13-a3bf-c8ef8c711333

📥 Commits

Reviewing files that changed from the base of the PR and between 4d630fe and 471ffd0.

📒 Files selected for processing (16)
  • .gitignore
  • AGENTS.md
  • internal/adapters/repo/git/repo.go
  • internal/adapters/repo/git/repo_test.go
  • internal/adapters/runtime/cacheenv.go
  • internal/adapters/runtime/claude/adapter.go
  • internal/adapters/runtime/codex/adapter.go
  • internal/adapters/verify/command/runner.go
  • internal/adapters/verify/command/runner_test.go
  • internal/e2e/helpers_test.go
  • internal/engine/epic_run.go
  • internal/engine/resume.go
  • internal/engine/ticket_run.go
  • internal/engine/wave_scheduler.go
  • internal/engine/wave_verify.go
  • internal/engine/worktree.go
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/adapters/verify/command/runner_test.go

Comment thread AGENTS.md Outdated
Comment thread internal/engine/worktree.go Outdated
Test User and others added 2 commits May 8, 2026 09:08
…xity

resolveWithinRoot now walks up from the target parent until it finds an
existing ancestor, resolving symlinks only on the existing portion and
reconstructing the full path. This allows merge-to-main of files in
brand-new directories (e.g. pkg/new/file.go) where the parent doesn't
exist yet.

Also extracts runSingleQualityCommand from RunQualityCommands to bring
cognitive complexity under the gocognit limit, and removes the now-
unnecessary nolint:cyclop directive.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@php-workx php-workx merged commit 8c176cd into main May 8, 2026
11 checks passed
@php-workx php-workx deleted the feat/worker-isolation-stability branch May 8, 2026 09:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant