Skip to content

feat: add shared trust ledger for cross-worktree validation tracking#134

Merged
pacaplan merged 15 commits into
mainfrom
dev
Apr 27, 2026
Merged

feat: add shared trust ledger for cross-worktree validation tracking#134
pacaplan merged 15 commits into
mainfrom
dev

Conversation

@pacaplan
Copy link
Copy Markdown
Collaborator

@pacaplan pacaplan commented Apr 26, 2026

Summary

Adds a shared trust ledger so validated/skipped commits are automatically recognized across git worktrees — no manual intervention or validator-merge skill needed.

What changed

  • New trust ledger module (src/utils/trust-ledger.ts): append-only JSONL file at $(git rev-parse --git-common-dir)/agent-validator/trusted-snapshots.jsonl, shared across all worktrees via git's common directory. Records are written after passing runs, checks, and skips (both clean and dirty trees).
  • Startup reconciliation (src/core/reconciliation.ts): runs after lock acquisition, before logger init. Short-circuits as trusted (exit 0) when HEAD is already trusted. For merge commits with two trusted parents, uses git merge-tree --write-tree to detect conflict resolution and validates only the merge-resolution delta.
  • trusted exit status: new success status (exit 0), no gates run, no logs created. Handled in ConsoleReporter, report output, and exit code mapping.
  • Shared git helpers (src/utils/git.ts): extracted runGit/gitStdout to eliminate duplication between trust-ledger and reconciliation modules.
  • Removed validator-merge skill: the shared ledger makes it vestigial. Deleted skills/validator-merge/, removed docs references.

Key design decisions

  • Tree-based trust from v1: dirty-tree validation writes commit: null with tree SHA, recognized by tree match after commit
  • config_hash/scope_hash stored for audit but not gated on in v1
  • Trust lookup: commit match first, tree match only on clean worktree
  • Unified merge path: diff synthetic merge tree vs actual tree; empty = auto-trust, non-empty = validate delta only
  • Pruning at >1000 ledger lines via reachability check

OpenSpec artifacts

Full proposal, specs, design, and tasks at openspec/changes/better-git-tracking/.

Test plan

  • Run bun test — 612 tests pass including new trust-ledger and reconciliation tests
  • Validate in worktree A, merge into worktree B, run validator in B — should exit trusted
  • Validate on dirty tree, commit, run validator — should recognize via tree match
  • Merge two trusted branches with conflicts — should validate only conflict-resolution files
  • agent-validator skip writes trusted ledger record
  • Ledger write failures are caught and logged, never crash the run

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Validation results now automatically propagate across linked git worktrees using a shared trust ledger
    • New skip command enables bypassing validation gates while advancing baseline state
  • Removed Features

    • validator-merge skill removed (functionality replaced by automatic trust propagation)
  • Documentation

    • Added trusted snapshots documentation explaining cross-worktree result sharing
    • Updated validation guides with new command behaviors
  • Other Improvements

    • Enhanced CLI adapter configuration and ordering
    • Improved release process handling for branch-based workflows

pacaplan and others added 10 commits April 25, 2026 19:09
Unhandled exceptions during ledger pruning (corrupt file, disk error)
would crash the entire validator run. Consistent with all other ledger
operations, errors are now caught and logged to stderr.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract runGit/gitStdout/GitResult to src/utils/git.ts (was duplicated
  verbatim in trust-ledger.ts and reconciliation.ts)
- Replace private gitObjectExists in trust-ledger.ts with import from
  execution-state.ts where it already exists
- Tighten TrustRecord.status from bare string to ValidatorStatus | 'skipped'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 26, 2026

Warning

Rate limit exceeded

@pacaplan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 28 minutes and 33 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7994e5af-277d-4c0b-a53a-e0bea3d53874

📥 Commits

Reviewing files that changed from the base of the PR and between 908eef8 and a8bd97e.

📒 Files selected for processing (21)
  • .claude/commands/release.md
  • openspec/changes/better-git-tracking/design.md
  • openspec/changes/better-git-tracking/specs/validation-trust-ledger/spec.md
  • openspec/changes/better-git-tracking/tasks/reconciliation-and-cleanup.md
  • openspec/changes/better-git-tracking/tasks/trust-ledger-core.md
  • src/cli-adapters/github-copilot.ts
  • src/commands/detect.ts
  • src/commands/gate-command-support.ts
  • src/commands/gate-command.ts
  • src/commands/skip.ts
  • src/core/change-detector.ts
  • src/core/reconciliation.ts
  • src/core/run-executor-helpers.ts
  • src/core/run-executor.ts
  • src/utils/git.ts
  • src/utils/trust-ledger.ts
  • test/cli-adapters/copilot-plugin.test.ts
  • test/core/change-detector.test.ts
  • test/core/reconciliation.test.ts
  • test/utils/trust-ledger-full-tree.test.ts
  • test/utils/trust-ledger.test.ts
📝 Walkthrough

Walkthrough

This PR implements a shared trust ledger for cross-worktree validation state propagation, introducing startup reconciliation to short-circuit validation when HEAD is already trusted, adding a new "trusted" validator status, integrating trust recording into run/check/review/detect commands, removing the validator-merge skill, and documenting the feature via extensive specifications and guides.

Changes

Cohort / File(s) Summary
Documentation & Guides
README.md, docs/skills-guide.md, docs/trusted-snapshots.md, docs/user-guide.md
Updated documentation to explain trusted snapshots feature, removed /validator-merge skill, and clarified detect/skip command behavior with trust propagation.
OpenSpec Specifications
openspec/changes/better-git-tracking/.openspec.yaml, openspec/changes/better-git-tracking/proposal.md, openspec/changes/better-git-tracking/design.md, openspec/changes/better-git-tracking/tasks.md, openspec/changes/better-git-tracking/tasks/*, openspec/changes/better-git-tracking/specs/*
Comprehensive specifications for trust ledger architecture, reconciliation behavior, skip/run lifecycle, merge handling, and validator-merge removal.
Release & Configuration
.claude/commands/release.md, .validator/config.yml
Updated release process to branch based on current branch; updated CLI adapter preference ordering and model pinning.
Core Trust System
src/utils/trust-ledger.ts, src/core/reconciliation.ts
New modules implementing append-only JSONL trust ledger with commit/tree-based trust evaluation, pruning, snapshot trees, and startup reconciliation with merge-parent analysis.
Git Utilities
src/utils/git.ts
New helpers for executing git commands with stdout/stderr capture and error reporting.
Command Support & Integration
src/commands/gate-command-support.ts, src/commands/gate-command.ts, src/commands/detect.ts, src/commands/skip.ts
New shared gate-command startup module for lock/log lifecycle; refactored commands to integrate reconciliation, trust ledger recording, and dynamic change options override.
Run Execution
src/core/run-executor.ts, src/core/run-executor-helpers.ts
Added startup reconciliation and trust recording for run outcomes; extended run context with startup change options and trust source metadata.
Change Detection
src/core/change-detector.ts
Rewritten getFixBaseChangedFiles to directly compare fixBase snapshots against current tree using git diff plus stash untracked hash reconciliation.
CLI Adapters
src/cli-adapters/github-copilot.ts
Refactored to centralize exec timeout, improve health-check control flow distinguishing missing vs unhealthy states.
Status & Output
src/types/validator-status.ts, src/output/console.ts, src/output/report.ts, src/core/trusted-message.ts
Added "trusted" validator status as success-equivalent; updated console/report output formatting; added trusted snapshot documentation URL.
Utilities
src/utils/log-parser.ts
Extended log classification to detect review logs by job ID prefix in addition to content marker.
Skills Removal
skills/validator-merge/SKILL.md, skills/validator-merge/merge-state.sh
Deleted validator-merge skill and merge-state.sh script; functionality replaced by automatic trust ledger reconciliation.
Test Coverage
test/commands/detect-trust.test.ts, test/core/change-detector.test.ts, test/core/reconciliation.test.ts, test/core/run-executor.test.ts, test/cli-adapters/copilot-plugin.test.ts, test/output/console.test.ts, test/output/report.test.ts, test/utils/log-parser.test.ts, test/utils/trust-ledger.test.ts, test/utils/trust-ledger-full-tree.test.ts, test/skills/merge-state-script.test.ts
Comprehensive test additions covering trust reconciliation, dirty-tree snapshots, detect-trust integration, copilot health checks, output formatting, and log parsing; removed validator-merge script tests.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI Command
    participant Executor as Run Executor
    participant Lock as Lock Manager
    participant Ledger as Trust Ledger
    participant Reconcile as Reconciliation
    participant GitRepo as Git Repo
    participant Gates as Gate Runner
    
    CLI->>Executor: executeRun(options)
    Executor->>Lock: acquireLock()
    Lock-->>Executor: lock acquired
    Executor->>Ledger: pruneIfNeeded(1000)
    Executor->>Reconcile: reconcileStartup(config)
    Reconcile->>GitRepo: check HEAD commit/tree
    Reconcile->>Ledger: isTrusted(HEAD.commit, HEAD.tree)
    Ledger-->>Reconcile: trusted status
    alt HEAD is trusted
        Reconcile->>Ledger: appendRecord(source: ledger-reconciled)
        Ledger-->>Reconcile: record written
        Reconcile-->>Executor: { kind: 'trusted' }
        Executor->>GitRepo: write execution state
        Executor->>Executor: return 'trusted' status
        Executor->>Lock: releaseLock()
    else HEAD is not trusted
        Reconcile-->>Executor: { kind: 'continue', changeOptions, trustSourceOnPass }
        Executor->>Executor: initRunContext(changeOptions)
        Executor->>Gates: executeGates()
        Gates-->>Executor: gate results
        Executor->>GitRepo: write execution state
        Executor->>Ledger: appendCurrentTrustRecord(trustSourceOnPass)
        Ledger-->>Executor: record written
        Executor->>Lock: releaseLock()
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • PR #98: Modifies the same stash-based fixBase change-detection logic with helpers for detecting/filtering untracked files captured in stash ^3 parent.
  • PR #100: Changes detect command terminal behavior to exit with code 2 for "no changes" scenarios instead of early text-only signaling.
  • PR #65: Modifies run-executor no-changes/rerun handling and terminal-status detection, overlapping with startup reconciliation/trusted short-circuiting.

Poem

🐰 A ledger of trust spreads through all worktrees wide,
No more merges, no more state to hide!
When HEAD is blessed by the trusted snapshot way,
The validator skips its gates and bounds with glee—
Proof that good work needs not be re-verified today. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.20% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'feat: add shared trust ledger for cross-worktree validation tracking' clearly and concisely summarizes the main change: introducing a shared trust ledger system for cross-worktree validation. The title is specific, relates directly to the core feature additions (trust ledger module, reconciliation logic, and related integration), and would help teammates understand the primary purpose of the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dev

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@pacaplan pacaplan requested a review from Copilot April 26, 2026 02:24
@pacaplan
Copy link
Copy Markdown
Collaborator Author

@greptileai review

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Apr 26, 2026

Greptile Summary

Adds a shared trust ledger (trusted-snapshots.jsonl in the git common dir) so validated commits are automatically recognized across worktrees, removing the need for the now-deleted validator-merge skill. Reconciliation runs after lock acquisition and before logger init, short-circuiting with trusted (exit 0) when HEAD or its merge parents are already trusted.

  • P1 — lockAcquired = true premature in gate-command.ts: the flag is set before acquireAndReconcileGateStartup calls acquireLock; if lock acquisition throws, the error handler incorrectly attempts to release a lock that was never held and writes execution state unnecessarily.
  • P1 — Silent read errors in readRecords(): both branches of the outer catch return [] with no log, so a permission error or I/O failure on the ledger file silently disables all trust propagation with no diagnostic output to the user.

Confidence Score: 4/5

Two P1 defects should be fixed before merge; both are low-risk to correct and don't affect the happy path.

The overall design is sound and the happy path (trust propagation across worktrees) works correctly. Two P1 issues exist: a lock-state flag set prematurely in the gate command path (wrong cleanup on acquisition failure) and silent swallowing of non-ENOENT read errors in the ledger (breaks trust propagation with no user-visible signal). Both are straightforward one-line fixes.

src/commands/gate-command.ts (lockAcquired flag), src/utils/trust-ledger.ts (readRecords error handling)

Important Files Changed

Filename Overview
src/utils/trust-ledger.ts New trust ledger module — append/read/prune JSONL ledger; non-ENOENT read errors are silently swallowed (returns [] with no log), breaking trust propagation without diagnosis
src/core/reconciliation.ts New startup reconciliation module — handles HEAD trust lookup, single/both-parents trusted merge paths, and merge-tree diff delta scoping; logic is clean
src/commands/gate-command.ts Updated gate command executor — lockAcquired set to true before acquireLock is actually called, causing erroneous state cleanup on lock-acquisition failure
src/commands/skip.ts Skip command now writes a trusted ledger record; pruneIfNeeded is not called, unlike run and gate-command paths, allowing unbounded ledger growth under skip-heavy usage
src/utils/git.ts Extracted shared runGit/gitStdout helpers and resolveBaseBranch; straightforward refactor with no issues
src/core/run-executor.ts Updated run executor — correctly calls pruneIfNeeded and reconcileStartup after lock, before logger init; lock release properly guarded in finally block
src/commands/gate-command-support.ts Gate command support — acquireAndReconcileGateStartup wraps lock acquisition + reconciliation cleanly; trusted short-circuit releases lock and exits correctly
src/types/validator-status.ts Added trusted to ValidatorStatus union and isSuccessStatus; correct exit-code 0 mapping

Comments Outside Diff (1)

  1. src/commands/skip.ts, line 35-51 (link)

    P2 pruneIfNeeded not called before ledger write

    Both executeRun and executeGateCommand call pruneIfNeeded(1000) after acquiring the lock, before touching the ledger. skip acquires the lock but never prunes. On a project that uses agent-validate skip heavily (e.g., skipping many commits in bulk), the ledger can grow past the 1000-line threshold and never be pruned, causing the O(n) lookup in isTrusted to degrade indefinitely. Adding await pruneIfNeeded(1000) after acquireLock would keep behavior consistent.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/commands/skip.ts
    Line: 35-51
    
    Comment:
    **`pruneIfNeeded` not called before ledger write**
    
    Both `executeRun` and `executeGateCommand` call `pruneIfNeeded(1000)` after acquiring the lock, before touching the ledger. `skip` acquires the lock but never prunes. On a project that uses `agent-validate skip` heavily (e.g., skipping many commits in bulk), the ledger can grow past the 1000-line threshold and never be pruned, causing the O(n) lookup in `isTrusted` to degrade indefinitely. Adding `await pruneIfNeeded(1000)` after `acquireLock` would keep behavior consistent.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/commands/gate-command.ts
Line: 394-400

Comment:
**`lockAcquired` set before lock is actually acquired**

`lockAcquired = true` is set on line 394 before `acquireAndReconcileGateStartup` has a chance to call `acquireLock`. If `acquireLock` throws (e.g., another process holds the lock or the lock directory is inaccessible), the catch block calls `handleGateError(..., lockAcquired=true, ...)`, which in turn tries to `writeExecutionState` and `releaseLock` for a lock that was never held. The flag should only be set after the lock is successfully acquired.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/utils/trust-ledger.ts
Line: 100-103

Comment:
**Non-ENOENT read errors silently swallowed**

Both branches of the outer `catch` block return `[]`, making the `code === 'ENOENT'` guard dead code. A real filesystem error — permission denied, I/O error, corrupted directory — is indistinguishable from "ledger doesn't exist yet." The result is that `isTrusted()` will always return `{ trusted: false }` with no console message, silently breaking cross-worktree trust propagation and forcing a full re-validation with no diagnostic output.

The ENOENT branch is correct; the non-ENOENT branch should log before returning:
```ts
} catch (error) {
  if ((error as { code?: string }).code === 'ENOENT') return [];
  console.error(`[trust-ledger] failed to read records: ${(error as Error).message}`);
  return [];
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/commands/skip.ts
Line: 35-51

Comment:
**`pruneIfNeeded` not called before ledger write**

Both `executeRun` and `executeGateCommand` call `pruneIfNeeded(1000)` after acquiring the lock, before touching the ledger. `skip` acquires the lock but never prunes. On a project that uses `agent-validate skip` heavily (e.g., skipping many commits in bulk), the ledger can grow past the 1000-line threshold and never be pruned, causing the O(n) lookup in `isTrusted` to degrade indefinitely. Adding `await pruneIfNeeded(1000)` after `acquireLock` would keep behavior consistent.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "[simplify] refactor: extract shared git ..." | Re-trigger Greptile

Comment thread src/commands/gate-command.ts Outdated
Comment on lines +394 to +400
lockAcquired = true;
const reconciliation = await acquireAndReconcileGateStartup({
commandName,
config,
logDir,
options,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 lockAcquired set before lock is actually acquired

lockAcquired = true is set on line 394 before acquireAndReconcileGateStartup has a chance to call acquireLock. If acquireLock throws (e.g., another process holds the lock or the lock directory is inaccessible), the catch block calls handleGateError(..., lockAcquired=true, ...), which in turn tries to writeExecutionState and releaseLock for a lock that was never held. The flag should only be set after the lock is successfully acquired.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/commands/gate-command.ts
Line: 394-400

Comment:
**`lockAcquired` set before lock is actually acquired**

`lockAcquired = true` is set on line 394 before `acquireAndReconcileGateStartup` has a chance to call `acquireLock`. If `acquireLock` throws (e.g., another process holds the lock or the lock directory is inaccessible), the catch block calls `handleGateError(..., lockAcquired=true, ...)`, which in turn tries to `writeExecutionState` and `releaseLock` for a lock that was never held. The flag should only be set after the lock is successfully acquired.

How can I resolve this? If you propose a fix, please make it concise.

Comment thread src/utils/trust-ledger.ts
Comment on lines +100 to +103
} catch (error) {
if ((error as { code?: string }).code === 'ENOENT') return [];
return [];
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Non-ENOENT read errors silently swallowed

Both branches of the outer catch block return [], making the code === 'ENOENT' guard dead code. A real filesystem error — permission denied, I/O error, corrupted directory — is indistinguishable from "ledger doesn't exist yet." The result is that isTrusted() will always return { trusted: false } with no console message, silently breaking cross-worktree trust propagation and forcing a full re-validation with no diagnostic output.

The ENOENT branch is correct; the non-ENOENT branch should log before returning:

} catch (error) {
  if ((error as { code?: string }).code === 'ENOENT') return [];
  console.error(`[trust-ledger] failed to read records: ${(error as Error).message}`);
  return [];
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/utils/trust-ledger.ts
Line: 100-103

Comment:
**Non-ENOENT read errors silently swallowed**

Both branches of the outer `catch` block return `[]`, making the `code === 'ENOENT'` guard dead code. A real filesystem error — permission denied, I/O error, corrupted directory — is indistinguishable from "ledger doesn't exist yet." The result is that `isTrusted()` will always return `{ trusted: false }` with no console message, silently breaking cross-worktree trust propagation and forcing a full re-validation with no diagnostic output.

The ENOENT branch is correct; the non-ENOENT branch should log before returning:
```ts
} catch (error) {
  if ((error as { code?: string }).code === 'ENOENT') return [];
  console.error(`[trust-ledger] failed to read records: ${(error as Error).message}`);
  return [];
}
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a shared, cross-worktree “trust ledger” that records trusted snapshots in the repo’s git common directory and introduces startup reconciliation to short-circuit validation (new trusted status) when HEAD (or eligible merge scenarios) is already trusted. This also removes the now-unneeded validator-merge skill and updates reporting / console output to represent the new status.

Changes:

  • Introduce src/utils/trust-ledger.ts (append-only JSONL ledger + pruning) and wire ledger writes into run/check/review/skip flows.
  • Add startup reconciliation (src/core/reconciliation.ts) to short-circuit trusted HEADs and scope merge validation via synthetic merge trees.
  • Add trusted validator status and update console/report output, plus remove skills/validator-merge/ and related docs/tests.

Reviewed changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/utils/trust-ledger.ts New shared JSONL trust ledger (append/read/lookup/prune) + record construction.
src/core/reconciliation.ts New startup reconciliation: trust short-circuit + merge-parent trust logic + merge-tree scoping.
src/utils/git.ts Adds shared runGit/gitStdout helpers used by ledger/reconciliation.
src/core/run-executor.ts Runs prune + reconciliation after lock acquisition and returns early on trusted short-circuit; writes ledger on early no-applicable-gates.
src/core/run-executor-helpers.ts Adds trusted messaging and appends trust records after execution state updates.
src/commands/gate-command-support.ts New shared helpers for gate commands: lock + reconcile startup, early-exit handling, error handling.
src/commands/gate-command.ts Refactors gate execution to use support module and appends ledger records after execution state write.
src/commands/skip.ts Writes a trusted ledger record on skip after writing .execution_state.
src/types/validator-status.ts Adds trusted to status union and treats it as success-equivalent.
src/output/report.ts Adds Status: Trusted mapping for report output.
src/output/console.ts Shows trusted summaries in green when status is overridden to “Trusted”.
src/utils/log-parser.ts Improves review-vs-check log detection for @ filenames / review logs without explicit markers.
src/cli-adapters/github-copilot.ts Improves Copilot health checks by distinguishing missing vs unhealthy CLI failures.
test/utils/trust-ledger.test.ts New unit tests for ledger path, read/append behavior, trust lookup rules, pruning, config hash behavior.
test/core/reconciliation.test.ts New integration-style tests validating reconciliation behaviors (trusted HEAD, tree-match materialization, merge scenarios).
test/utils/log-parser.test.ts Adds coverage for @ filenames and review adapter execution errors.
test/core/run-executor.test.ts Confirms trusted is considered a success status.
test/output/report.test.ts Asserts report status line for trusted short-circuit.
test/output/console.test.ts Asserts console summary formatting for trusted override.
test/cli-adapters/copilot-plugin.test.ts Adds test for “unhealthy” Copilot CLI start failures with stderr propagation.
test/skills/merge-state-script.test.ts Removes tests for deleted validator-merge script.
skills/validator-merge/merge-state.sh Deleted (skill removed).
skills/validator-merge/SKILL.md Deleted (skill removed).
docs/skills-guide.md Removes /validator-merge documentation.
.validator/config.yml Updates adapter preference ordering and codex model entry.
.claude/commands/release.md Updates release instructions (commit/push flow).
openspec/changes/better-git-tracking/** Adds proposal/design/spec/task artifacts documenting ledger + reconciliation + skill removal.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/commands/gate-command.ts Outdated
Comment on lines +394 to +400
lockAcquired = true;
const reconciliation = await acquireAndReconcileGateStartup({
commandName,
config,
logDir,
options,
});
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lockAcquired is set to true before the lock is actually acquired (before acquireAndReconcileGateStartup()). If acquireLock() throws, the catch path will treat the lock as held and may attempt writeExecutionState/releaseLock, which can cause misleading errors or unintended unlock behavior. Set lockAcquired = true only after acquireAndReconcileGateStartup() (or after acquireLock) succeeds, or have acquireAndReconcileGateStartup return an explicit lockAcquired flag.

Suggested change
lockAcquired = true;
const reconciliation = await acquireAndReconcileGateStartup({
commandName,
config,
logDir,
options,
});
const reconciliation = await acquireAndReconcileGateStartup({
commandName,
config,
logDir,
options,
});
lockAcquired = true;

Copilot uses AI. Check for mistakes.
Comment thread src/utils/trust-ledger.ts
Comment on lines +164 to +188
function scopeFor(
config: LoadedConfig,
command: ScopeDescriptor['command'],
options: { gate?: string; enableReviews?: Set<string> } = {},
): ScopeDescriptor {
const cli_overrides: Record<string, unknown> = {};
if (options.gate) cli_overrides.gate = options.gate;
const enableReviews = options.enableReviews;
if (enableReviews && enableReviews.size > 0) {
cli_overrides.review = Array.from(enableReviews).sort();
}

const gateNames = [
...Object.keys(config.checks ?? {}).map((name) => `check:${name}`),
...Object.keys(config.reviews ?? {}).map((name) => `review:${name}`),
].sort();

return {
command,
gates: options.gate ? [options.gate] : gateNames,
entry_points: (config.project.entry_points ?? [])
.map((entry) => entry.path)
.sort(),
cli_overrides,
};
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scopeFor() builds gates from both config.checks and config.reviews (and prefixes them with check:/review:), regardless of the command. This makes scope metadata inconsistent with actual execution (e.g., check doesn’t run reviews, and --gate <name> matches Job.name without the check: prefix). Consider generating the gate list based on command (checks-only vs reviews-only vs both) and using the same gate identifiers the CLI/job system uses (typically the raw gate name).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 45 out of 45 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/utils/trust-ledger.ts
Comment on lines +167 to +170
function indexInfoFor(entries: TreeEntry[]): string {
return entries
.map((entry) => `${entry.mode} ${entry.object}\t${entry.path}\0`)
.join('');
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git update-index --index-info expects each NUL-delimited entry to include enough fields to reconstruct the index (typically the object type (e.g. blob) and/or the stage). The current indexInfoFor writes only <mode> <object>\t<path>\0, which is unlikely to be accepted by update-index and would break dirty-tree snapshot tree computation at runtime. Consider preserving the type field from ls-tree (store it on TreeEntry) and emitting the full ls-tree-compatible <mode> <type> <object>\t<path>\0 format (or switch to an update-index mode that matches the fields you have).

Copilot uses AI. Check for mistakes.
Comment thread src/core/reconciliation.ts Outdated
Comment on lines +102 to +108
async function mergeTree(
parent1: string,
parent2: string,
): Promise<string | null> {
const result = await runGit(['merge-tree', '--write-tree', parent1, parent2]);
const match = result.stdout.match(/\b[0-9a-f]{40,64}\b/);
return match?.[0] ?? null;
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mergeTree() extracts the first 40–64 hex substring from merge-tree stdout. merge-tree output can include multiple object IDs, so this can return a non-tree OID and later make git diff fail or scope reconciliation incorrectly. Parse the synthetic tree OID deterministically (e.g., from the first line / a known prefix like "merged tree:") and only return it when it’s clearly the tree ID.

Suggested change
async function mergeTree(
parent1: string,
parent2: string,
): Promise<string | null> {
const result = await runGit(['merge-tree', '--write-tree', parent1, parent2]);
const match = result.stdout.match(/\b[0-9a-f]{40,64}\b/);
return match?.[0] ?? null;
function parseMergeTreeOid(stdout: string): string | null {
const lines = stdout
.split('\n')
.map((line) => line.trim())
.filter(Boolean);
const firstLine = lines[0];
if (!firstLine) {
return null;
}
const directMatch = firstLine.match(/^[0-9a-f]{40,64}$/);
if (directMatch) {
return directMatch[0];
}
const labeledMatch = firstLine.match(/^merged tree:\s*([0-9a-f]{40,64})$/i);
return labeledMatch?.[1] ?? null;
}
async function mergeTree(
parent1: string,
parent2: string,
): Promise<string | null> {
const result = await runGit(['merge-tree', '--write-tree', parent1, parent2]);
return parseMergeTreeOid(result.stdout);

Copilot uses AI. Check for mistakes.
Comment on lines +98 to +102
gate: ctx.options.gate,
enableReviews: ctx.options.enableReviews,
},
trusted: ctx.trustSourceOnPass === 'ledger-reconciled' ? true : undefined,
});
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trusted: ctx.trustSourceOnPass === 'ledger-reconciled' ? true : undefined can incorrectly mark snapshots as trusted even when the user narrowed the run (--gate / --review), which should not propagate trust. Consider only forcing trusted: true for ledger-reconciled writes when the invocation is still default scope; otherwise omit trusted so appendCurrentTrustRecord marks it untrusted.

Copilot uses AI. Check for mistakes.
Comment on lines +324 to +328
gate: options.gate,
enableReviews: options.enableReviews,
},
trusted: trustSourceOnPass === 'ledger-reconciled' ? true : undefined,
});
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trusted: trustSourceOnPass === 'ledger-reconciled' ? true : undefined forces propagating trust even when the user narrowed the command via --gate / --review. That can mark partial validations as trusted. Consider only forcing trusted when there is no narrowing, otherwise omit trusted so appendCurrentTrustRecord records trusted: false.

Copilot uses AI. Check for mistakes.
Comment on lines +82 to +86
enableReviews: ledger.options.enableReviews,
},
trusted: ledger.source === 'ledger-reconciled' ? true : undefined,
});
}
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This early-exit ledger write forces trusted: true whenever source === 'ledger-reconciled', which can incorrectly create propagating trust records for narrowed invocations (--gate / --review). Consider only forcing trusted for default-scope runs, otherwise leave trusted unset so appendCurrentTrustRecord marks it untrusted.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cli-adapters/github-copilot.ts (1)

144-171: ⚠️ Potential issue | 🟡 Minor

which copilot is not available on Windows; use copilot --help for the presence check instead.

The presence detection at line 150 relies on the which command, which exists on POSIX shells (Linux, macOS) but not on Windows. On Windows, which does not exist in cmd.exe or PowerShell by default (the Windows equivalent is where.exe or Get-Command). This means checkHealth() will always fail the first check and report { available: false, status: 'missing' } even when Copilot is installed and functional. In contrast, isAvailable() (line 137) correctly calls copilot --help directly, avoiding this cross-platform issue.

Consolidate the logic to use copilot --help for both presence and health checks. Distinguish missing vs. unhealthy by inspecting the error: if the error code is ENOENT or the message matches a "command not found" pattern, treat it as missing; otherwise, treat it as unhealthy.

Suggested fix
   async checkHealth(): Promise<{
     available: boolean;
     status: 'healthy' | 'missing' | 'unhealthy';
     message?: string;
   }> {
     try {
-      await this.execCopilot('which copilot');
-    } catch {
-      return {
-        available: false,
-        status: 'missing',
-        message: 'Command not found',
-      };
-    }
-
-    try {
       await this.execCopilot('copilot --help');
     } catch (error) {
-      const err = error as { stderr?: string; message?: string };
+      const err = error as { stderr?: string; message?: string; code?: string };
+      // ENOENT / spawn failure → binary not present on PATH
+      if (err.code === 'ENOENT' || /not found|command not found/i.test(err.message ?? '')) {
+        return {
+          available: false,
+          status: 'missing',
+          message: 'Command not found',
+        };
+      }
       return {
         available: true,
         status: 'unhealthy',
         message: (err.stderr || err.message || 'Unhealthy').trim(),
       };
     }

     return { available: true, status: 'healthy', message: 'Ready' };
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli-adapters/github-copilot.ts` around lines 144 - 171, The checkHealth()
method currently uses execCopilot('which copilot') which fails on Windows;
change it to use execCopilot('copilot --help') for the presence check (same
approach as isAvailable()), and then differentiate missing vs unhealthy by
inspecting the caught error from execCopilot: if error has code 'ENOENT' or
message/stderr matches a "command not found" / "not recognized" / "no such file"
pattern treat it as { available: false, status: 'missing', message: ... },
otherwise treat it as { available: true, status: 'unhealthy', message: ... };
update references to execCopilot and the checkHealth() return values accordingly
so cross-platform detection works.
🧹 Nitpick comments (18)
src/core/change-detector.ts (2)

210-231: Tighten branches per Biome noNegationElse and reduce duplicate hash lookups.

Both reconciliation loops invert the natural condition (!== followed by else) which Biome flags, and call Map.get twice for the same key. A single positive comparison removes both warnings.

🧹 Proposed cleanup
     for (const file of currentUntracked) {
       if (!snapshotUntracked.has(file)) {
         files.add(file);
         continue;
       }
-
-      if (snapshotUntracked.get(file) !== currentHashes.get(file)) {
-        files.add(file);
-      } else {
-        files.delete(file);
-      }
+      const snapshotHash = snapshotUntracked.get(file);
+      const currentHash = currentHashes.get(file);
+      if (snapshotHash === currentHash) {
+        files.delete(file);
+      } else {
+        files.add(file);
+      }
     }

     for (const file of snapshotUntracked.keys()) {
       if (currentUntracked.has(file)) continue;
-
-      if (snapshotUntracked.get(file) !== currentHashes.get(file)) {
-        files.add(file);
-      } else {
-        files.delete(file);
-      }
+      const snapshotHash = snapshotUntracked.get(file);
+      const currentHash = currentHashes.get(file);
+      if (snapshotHash === currentHash) {
+        files.delete(file);
+      } else {
+        files.add(file);
+      }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/change-detector.ts` around lines 210 - 231, The two loops in
change-detector using currentUntracked, snapshotUntracked, currentHashes, and
files invert the natural condition and call
snapshotUntracked.get/currentHashes.get twice; refactor each loop to use a
single positive comparison by caching the hash lookups into local variables
(e.g., let snapHash = snapshotUntracked.get(file); let currHash =
currentHashes.get(file)) and then use if (snapHash === currHash) {
files.delete(file); continue; } else { files.add(file); } (similarly for the
second loop) to eliminate duplicate Map.get calls and remove the negated-else
pattern flagged by Biome.

245-267: getStashUntrackedFileHashes silently returns empty for non-stash fixBase refs.

When fixBase is a regular commit (e.g., parent1/parent2 from reconciliation.ts), ${fixBase}^3 does not exist and git ls-tree rejects, which the catch swallows. That's intentional per the design, but it also masks legitimate failures (network, corruption, permissions). Consider differentiating "no third parent" (expected) from "ls-tree failed for other reasons" by inspecting the error/stderr, or at least logging at debug level so dirty-tree reconciliation issues are diagnosable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/change-detector.ts` around lines 245 - 267, The
getStashUntrackedFileHashes function currently swallows all errors from
execFileAsync('git', ['ls-tree','-r','-z', `${fixBase}^3`]), which masks
legitimate failures; update this function to inspect the thrown error (from
execFileAsync) and treat the "no third parent" case (errors indicating the
object `${fixBase}^3` does not exist, e.g., messages like "Not a valid object
name" or exit code specific to git) as the expected case that returns an empty
Map, but for any other error log the error at debug (or warn) level via the
existing logger and rethrow or return an empty Map with a debug message per
project convention; use the error properties (stderr, message, code) from
execFileAsync to distinguish the cases and reference getStashUntrackedFileHashes
and the `${fixBase}^3` git invocation when making the change.
test/core/change-detector.test.ts (1)

105-170: Integration tests execute real git commands, deviating from the project's testing convention.

These tests spawn a real git binary inside a temp repo (init, commit, stash --include-untracked, rev-parse, stash pop) rather than mocking shell calls with spyOn. That makes them slower, dependent on the host's git version, and skipped in environments without git on PATH.

That said, the behavior under test (stash ^3 untracked tree + git hash-object reconciliation) is genuinely hard to fake faithfully, so a small integration suite is justified. Consider:

  • gating these tests behind a describe.skipIf(!hasGit()) so missing git doesn't fail CI, and
  • replacing process.chdir(repoDir) with passing cwd through the API (or runInTempCwd(repoDir, async () => ...) helper) to avoid leaking global state if any other test in the file reads process.cwd().

Based on learnings: "Mock shell commands using spyOn instead of executing real shell commands in tests."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/core/change-detector.test.ts` around lines 105 - 170, The tests in the
"ChangeDetector fixBase dirty snapshots" suite run real git commands and mutate
process.cwd(), which makes CI flaky when git is missing and leaks global state;
update the suite to guard execution with a runtime check (e.g., wrap the
describe with a skip-if when hasGit() is false) and remove global chdir by using
a per-test working-directory strategy: stop calling process.chdir(repoDir) in
beforeEach/afterEach and instead run test logic under the temp repo by passing
cwd to the git helper or using a runInTempCwd(repoDir, async () => { ... })
helper so createDirtySnapshot(), the ChangeDetector instantiation, and all git
calls operate with an explicit cwd; keep the createDirtySnapshot and
ChangeDetector usage but ensure tests are skipped when git is unavailable and no
global process.cwd() is mutated.
src/utils/git.ts (1)

11-93: Significant duplication between runGit and runGitWithInput.

The two functions differ only in the stdio descriptor for stdin and the post-spawn stdin.write/end calls. Same applies to gitStdout/gitStdoutWithInput. Extracting a small core (e.g., parameterized by an optional input?: string) would remove ~40 lines of duplicated decoder/handler wiring and keep behavior in one place.

♻️ Refactor sketch
-export function runGit(
-  args: string[],
-  options: { env?: NodeJS.ProcessEnv } = {},
-): Promise<GitResult> { ... }
-
-export function runGitWithInput(
-  args: string[],
-  input: string,
-  options: { env?: NodeJS.ProcessEnv } = {},
-): Promise<GitResult> { ... }
+export function runGit(
+  args: string[],
+  options: { env?: NodeJS.ProcessEnv; input?: string } = {},
+): Promise<GitResult> {
+  return new Promise((resolve, reject) => {
+    const child = spawn('git', args, {
+      stdio: [options.input !== undefined ? 'pipe' : 'ignore', 'pipe', 'pipe'],
+      env: options.env ? { ...process.env, ...options.env } : undefined,
+    });
+    // ...shared decoder/event wiring...
+    if (options.input !== undefined) {
+      child.stdin.on('error', (err: NodeJS.ErrnoException) => {
+        if (err.code !== 'EPIPE') reject(err);
+      });
+      child.stdin.write(options.input);
+      child.stdin.end();
+    }
+  });
+}
+
+export const runGitWithInput = (args: string[], input: string, options = {}) =>
+  runGit(args, { ...options, input });

As per coding guidelines: "Write clean, DRY (Don't Repeat Yourself) code".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/git.ts` around lines 11 - 93, The runGit/runGitWithInput and
gitStdout/gitStdoutWithInput duplicates the child spawn and decoder wiring;
create a single core function (e.g., runGitCore or extend runGit with an
optional input?: string parameter) that spawns git, sets stdio to
['pipe','pipe','pipe'] when input is provided or ['ignore','pipe','pipe']
otherwise, reuses the StringDecoder/stdout/stderr handling, merges env like
currently done, writes and ends child.stdin only if input is provided, and
resolves the same GitResult ({stdout, stderr, code}) trimmed; then refactor
runGit and runGitWithInput to call that core and refactor
gitStdout/gitStdoutWithInput to call the unified function and preserve the same
error message construction (including stderr detail) and behavior.
src/output/console.ts (1)

164-167: Override-color mapping is binary.

Any non-'Trusted' override falls through to red, including future success-like overrides. If you anticipate adding more override labels (e.g., 'Passed with warnings', 'Error'), consider mapping by label rather than the current === 'Trusted' short-circuit. Not a concern for the current PR scope.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/output/console.ts` around lines 164 - 167, The current statusOverride
handling (variable statusOverride returning { overallStatus, statusColor })
treats any non-'Trusted' value as red; change it to use a lookup mapping (e.g.,
a Record<string, ChalkColor> or switch) keyed by override labels so you can map
'Trusted' → green, 'Passed with warnings' → yellow, 'Error' → red, etc., and
fall back to a sensible default color; update the code that constructs and
returns { overallStatus: statusOverride, statusColor: color } (referencing
statusOverride, color, overallStatus, statusColor) to use the mapping lookup
with a default instead of the current === 'Trusted' ternary.
src/commands/skip.ts (1)

41-41: Extract the prune threshold into a named constant.

The 1000 threshold is duplicated wherever pruneIfNeeded is invoked. Consider exposing a named constant (e.g., TRUST_LEDGER_PRUNE_THRESHOLD) from src/utils/trust-ledger.ts (or making the parameter optional with a default) so the value stays consistent across all call sites.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/skip.ts` at line 41, Extract the hard-coded 1000 into a shared
named constant and use it at call sites: add and export
TRUST_LEDGER_PRUNE_THRESHOLD in src/utils/trust-ledger.ts (or make pruneIfNeeded
accept an optional parameter with default = TRUST_LEDGER_PRUNE_THRESHOLD), then
replace literal calls like await pruneIfNeeded(1000) in skip.ts with await
pruneIfNeeded(TRUST_LEDGER_PRUNE_THRESHOLD) (or call without arg if you added
the default). Update imports to reference the new exported constant where needed
and keep the pruneIfNeeded signature consistent with the new default/constant.
src/core/run-executor-helpers.ts (1)

87-103: Clean extraction; consider exporting and reusing in run-executor.ts.

appendRunTrustRecord neatly centralizes the trust-record write with the correct trusted flag for ledger-reconciled sources. The same pattern is duplicated inline in src/core/run-executor.ts (lines 155-168) for the no_applicable_gates early-exit path, but it omits the explicit trusted flag. Exporting this helper and reusing it there (see related comment on run-executor.ts) keeps both code paths consistent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/run-executor-helpers.ts` around lines 87 - 103, The helper
appendRunTrustRecord centralizes writing the run trust record but is currently
only local to run-executor-helpers.ts while run-executor.ts duplicates similar
logic without setting the trusted flag; export appendRunTrustRecord from
src/core/run-executor-helpers.ts and replace the duplicated inline block in
run-executor.ts (the no_applicable_gates early-exit path) to call
appendRunTrustRecord(ctx, status) so both paths use the same logic and preserve
the trusted: ctx.trustSourceOnPass === 'ledger-reconciled' behavior.
src/commands/gate-command.ts (3)

316-329: Ledger status fidelity: passed_with_warnings is collapsed to 'passed'.

outcome.allPassed ? 'passed' : 'failed' loses the passed_with_warnings distinction that run-executor-helpers.ts:determineStatus preserves (when allPassed && anySkipped). Both are trust-eligible, so the record is still written, but downstream audits will see 'passed' for what was actually a warnings-passed run. The trust-ledger reader for cross-worktree decisions may want to know there were skipped violations.

Consider mirroring determineStatus here, or exposing it and reusing it:

♻️ Suggested fix
-      status: outcome.allPassed ? 'passed' : 'failed',
+      status: outcome.allPassed
+        ? outcome.stats.skipped > 0
+          ? 'passed_with_warnings'
+          : 'passed'
+        : 'failed',
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/gate-command.ts` around lines 316 - 329, The current call to
appendCurrentTrustRecord collapses passed_with_warnings into 'passed' by using
outcome.allPassed ? 'passed' : 'failed'; update this to preserve the warnings
state by deriving status the same way run-executor-helpers.ts does—either call
or import determineStatus(outcome) and pass its result as the status, or
replicate its logic (return 'passed_with_warnings' when outcome.allPassed &&
outcome.anySkipped, 'passed' when allPassed, else 'failed'); ensure the
trusted/trustSourceOnPass logic remains unchanged and continue calling
appendCurrentTrustRecord with the new status value.

211-216: Same startupChangeOptions.fixBase override pattern as run-executor-helpers.ts:220-225.

Two near-identical implementations of the reconciliation-driven fixBase override now live in src/core/run-executor-helpers.ts and here. Consider extracting a shared helper (e.g., applyStartupFixBase(changeOptions, startupChangeOptions, options)) into src/utils/git.ts or a new src/core/change-options.ts to keep semantics in lockstep.

As per coding guideline "Write clean, DRY (Don't Repeat Yourself) code".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/gate-command.ts` around lines 211 - 216, Extract the duplicated
fixBase-override logic into a single helper (e.g., applyStartupFixBase) and
replace the two inline implementations with calls to that helper: create
applyStartupFixBase(changeOptions, startupChangeOptions, options) in a shared
module (suggest src/core/change-options.ts or src/utils/git.ts) that returns a
new changeOptions with fixBase set only when startupChangeOptions?.fixBase is
truthy and options.commit is falsy; update callers in gate-command.ts and
run-executor-helpers.ts to import and use applyStartupFixBase, and run tests to
ensure behavior is unchanged.

264-332: executeAndFinalize parameter list is getting unwieldy.

Twelve positional parameters (with three new optional ones at the end) make call sites easy to misalign and hard to read. Consider grouping into an options object to keep the contract self-documenting:

async function executeAndFinalize(args: {
  config: LoadedConfig;
  logger: Logger;
  debugLogger?: DebugLogger;
  isRerun: boolean;
  failuresMap?: Map<string, Map<string, PreviousViolation[]>>;
  changeOptions?: ChangeOptions;
  effectiveBaseBranch: string;
  passedSlotsMap?: Map<string, Map<number, PassedSlot>>;
  changes: string[];
  jobs: ...;
  contextContent?: string;
  commandName?: GateCommandName;
  options?: GateCommandOptions;
  trustSourceOnPass?: TrustRecordSource;
}): Promise<boolean>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/gate-command.ts` around lines 264 - 332, The executeAndFinalize
function has an unwieldy 12+ parameter list; refactor its signature to accept a
single args object (e.g., args: { config, logger, debugLogger?, isRerun,
failuresMap?, changeOptions?, effectiveBaseBranch, passedSlotsMap?, changes,
jobs, contextContent?, commandName?, options?, trustSourceOnPass? }) and update
the function body to destructure those properties, keeping all existing types
and behavior (including calls to debugLogger.logRunStart/logRunEnd, Runner
instantiation, writeExecutionState, appendCurrentTrustRecord, and return of
outcome.allPassed); update all callers to pass a single object and adjust any
tests, and consider adding a thin overload or migration helper if you need
backward-compatible call sites during the change.
src/commands/detect.ts (1)

76-85: trustedChangeOptions spread risks undefined-overwriting opts.fixBase.

If reconciliation.changeOptions is returned with fixBase explicitly set to undefined, the spread on Line 84 will overwrite a freshly-resolved opts.fixBase from the fresh-run path (Line 72-73). In practice reconcileDetect likely doesn’t set fixBase: undefined, but a small guard avoids a fragile contract:

♻️ Suggested guard
-  if (isRerun) return opts;
-  return { ...opts, ...trustedChangeOptions };
+  if (isRerun) return opts;
+  return {
+    ...opts,
+    ...(trustedChangeOptions.fixBase !== undefined
+      ? { fixBase: trustedChangeOptions.fixBase }
+      : {}),
+  };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/detect.ts` around lines 76 - 85, The current merge "{ ...opts,
...trustedChangeOptions }" can overwrite a valid opts.fixBase with an explicit
undefined from trustedChangeOptions; update the merge so that fixBase from
trustedChangeOptions is only applied if it is not undefined (e.g. build the
result by spreading opts, then conditionally assign trustedChangeOptions.fixBase
only when !== undefined, or omit fixBase from trustedChangeOptions before
spreading). Target the merging logic that combines opts and trustedChangeOptions
in detect (the code block returning "{ ...opts, ...trustedChangeOptions }") so
other properties still merge but fixBase is preserved when
trustedChangeOptions.fixBase is undefined.
test/utils/trust-ledger-full-tree.test.ts (1)

64-67: Make cwd restore robust against early test failures.

If the test throws after process.chdir(repoDir) but before assertions, afterEach runs, but if any future change adds setup steps after chdir that themselves can fail, an unrestored cwd will leak across files. Consider wrapping the body in a try/finally inside the test, or restoring cwd before fs.rm:

♻️ Suggested tweak
 	afterEach(async () => {
-		process.chdir(originalCwd);
-		await fs.rm(repoDir, { recursive: true, force: true });
+		try {
+			process.chdir(originalCwd);
+		} finally {
+			await fs.rm(repoDir, { recursive: true, force: true });
+		}
 	});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/utils/trust-ledger-full-tree.test.ts` around lines 64 - 67, The
afterEach cleanup should always restore the working directory even if fs.rm
throws or earlier test setup fails; update the afterEach block so
process.chdir(originalCwd) runs before awaiting fs.rm(repoDir, { recursive:
true, force: true }) (or ensure tests that call process.chdir(repoDir) use
try/finally to restore cwd inside the test), referencing the existing afterEach
async () => { process.chdir(originalCwd); await fs.rm(repoDir, { recursive:
true, force: true }); } to guarantee cwd is restored before attempting removal.
src/utils/trust-ledger.ts (2)

401-404: Hardcoded 1000 threshold appears in two places.

pruneIfNeeded(1000) is called from gate-command-support.ts:174 (and presumably from run-executor). Consider exporting a DEFAULT_PRUNE_THRESHOLD = 1000 from this module so callers can’t drift, and the spec value lives in code as a single source of truth.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/trust-ledger.ts` around lines 401 - 404, The hardcoded prune
threshold (1000) is duplicated across callers; introduce and export a single
constant DEFAULT_PRUNE_THRESHOLD = 1000 from this module and update callers that
call pruneIfNeeded(1000) to import and use DEFAULT_PRUNE_THRESHOLD instead;
change the export in this file (alongside the existing pruneIfNeeded function)
and replace literal 1000 usages in caller code (e.g., where pruneIfNeeded is
invoked) so the threshold is maintained in one place.

147-165: Use exit code detection instead of substring matching for git error cases.

maybeTree determines "no such tree → null" by checking English substrings (Needed a single revision, unknown revision, ambiguous argument, Not a valid object name). On systems with localized git (LC_ALL/LANG set) or future git versions where messages change, these checks fail silently and rethrow the error. This breaks computeSnapshotTreeSha for the common case of stashes without untracked files (where ^3 doesn't exist).

Use either:

  • Set LC_ALL=C when spawning git to enforce stable English messages
  • Detect missing refs via exit code (both gitStdout and runGit have access to exit codes)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/trust-ledger.ts` around lines 147 - 165, The current maybeTree
function relies on English error message substrings from gitStdout to detect "no
such tree" and returns null, which fails on localized messages; fix by detecting
missing refs via git exit codes or forcing stable English output: update
maybeTree to call git with LC_ALL=C in the spawn/env passed to gitStdout (or
runGit) OR change the error handling to inspect the subprocess exit code (e.g.,
error.exitCode or error.code returned by gitStdout/runGit) and treat the git
exit code for unknown/rev-parse failures (commonly 128) as the "no tree → null"
case; reference maybeTree and the gitStdout/runGit call so you modify the
invocation and the catch branch to use exit code/env instead of substring
matching.
openspec/changes/better-git-tracking/design.md (1)

9-11: Optional: add language tags to fenced code blocks.

markdownlint-cli2 flags lines 9, 77, and 98 (MD040). These are pseudo-flow blocks; adding text (or bash for the shell-style flow on line 9) would silence the lint without changing rendering.

Also applies to: 77-92, 98-118

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/better-git-tracking/design.md` around lines 9 - 11, Markdown
fenced code blocks containing pseudo-flow snippets (e.g., the block with
"loadConfig → initContext → handleAutoClean → acquireLock →
runWithLock(logger.init → gates)" and the similar blocks in ranges 77-92 and
98-118) lack language tags and trigger MD040; add an explicit language tag to
each fenced block (use "bash" for the shell-style flow on the first snippet and
"text" for the purely illustrative arrows in the other blocks) so the lint
passes while preserving rendering; locate the exact fenced blocks in design.md
and append the appropriate language identifier after the opening ``` backticks.
src/commands/gate-command-support.ts (1)

92-125: checkEarlyExit actually terminates the process; signature lies about it.

Both branches end up calling handleNoWork, which is Promise<never> and process.exits. The function returns Promise<void>, so callers can’t infer the no-return semantics, and it’s easy to add a third branch later that does return without realizing the contract is "either return or exit." Consider either:

  • Returning Promise<void> only (let the caller decide to exit), or
  • Marking the early-exit branches with an explicit return type (Promise<void> plus // eslint:noreturn style isn't possible in TS — use a dedicated earlyExitIfNoWork(...): Promise<void> that always exits in those branches and an explicit return in the non-exit branch).

Minor today, but it has the same "looks like a guard" feel as assert functions and will trip readers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/gate-command-support.ts` around lines 92 - 125, The function
checkEarlyExit currently calls handleNoWork (which returns Promise<never> and
exits the process) but is declared as Promise<void>, so update the
implementation and API to be explicit: either (A) stop exiting inside
checkEarlyExit and let callers handle termination — remove calls to handleNoWork
here and instead return early (adjust callers of checkEarlyExit accordingly), or
(B) make the function name and signature reflect that it may not return (rename
to earlyExitIfNoWork or similar) and document/annotate that it will call
handleNoWork and not return; in that case keep the internal calls to
handleNoWork but change all call sites to not expect a return and update
types/comments to indicate non-returning behavior (reference checkEarlyExit and
handleNoWork to locate the behavior). Ensure no callers assume continuation
after calling the function in the chosen approach.
test/utils/trust-ledger.test.ts (2)

75-113: Avoid as any in test helper.

Casting through as any defeats type checking on the LoadedConfig shape — easy to silently drift from the real type. reconciliation.test.ts uses as unknown as LoadedConfig for the same purpose; either consolidate on that pattern or import the type and let TS catch shape mismatches.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/utils/trust-ledger.test.ts` around lines 75 - 113, The test helper
function config currently returns its object with a trailing "as any", which
bypasses TS shape checks; change this to a proper cast to the real LoadedConfig
type (either import the LoadedConfig type and assert the return as "as
LoadedConfig" or follow the pattern used in reconciliation.test.ts and use "as
unknown as LoadedConfig") so TypeScript verifies the shape of the returned
object; update the function signature or its return expression in function
config(...) accordingly and import the LoadedConfig type where needed.

26-26: Use an OS temp dir instead of an in-repo .test-trust-ledger path.

TEST_DIR is a fixed path inside the repo (via import.meta.dir). Two issues:

  1. Concurrent test runs (or a partially-cleaned previous failure) share the same directory, which can cause flakes since beforeEach does rm -rf on it.
  2. The directory is rooted in the working repo, so a stray failure leaves debris under source control.

Prefer a fresh os.tmpdir() + mkdtemp per test (consistent with reconciliation.test.ts).

Based on learnings: "Use temporary directories with absolute paths instead of relying on environment-specific paths."

♻️ Suggested change
-const TEST_DIR = path.join(import.meta.dir, "../../.test-trust-ledger");
+let TEST_DIR: string;
@@
 	beforeEach(async () => {
-		await fs.rm(TEST_DIR, { recursive: true, force: true });
-		await fs.mkdir(TEST_DIR, { recursive: true });
+		TEST_DIR = await fs.mkdtemp(path.join(os.tmpdir(), "validator-trust-ledger-"));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/utils/trust-ledger.test.ts` at line 26, TEST_DIR is currently a fixed
repo-relative path causing shared-state flakes; change creation to use an OS
temp directory created per test (e.g., use os.tmpdir() + fs.mkdtemp or
fs.mkdtempSync) instead of the import.meta.dir path. Replace the TEST_DIR
constant with a per-test temp dir created in the test suite setup (beforeEach),
assign that path to the same TEST_DIR variable or a local testDir used by tests,
and ensure the existing cleanup (rm -rf in beforeEach/afterEach) targets this
temp dir; mirror the approach used in reconciliation.test.ts so each test gets a
unique absolute temp directory.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/commands/release.md:
- Around line 144-157: The non-main flow assumes an existing PR but gh pr view
can fail if the current branch has no PR; update the release instructions to
detect that case and create a PR when none exists by invoking gh pr create with
--base main and a title/body that includes NEW_VERSION (e.g., "chore: release
v${NEW_VERSION}"), then run gh pr view --json url --jq .url to print the PR URL;
reference CURRENT_BRANCH, NEW_VERSION, gh pr view and gh pr create in the doc so
maintainers know where to change the steps.

In `@src/commands/detect.ts`:
- Around line 87-99: Replace the duplicated resolution logic in the
resolveEffectiveBaseBranch function by removing that function and calling the
shared resolveBaseBranch helper from src/utils/git.ts instead; import
resolveBaseBranch at the top of the module and use it where
resolveEffectiveBaseBranch was used (pass the same DetectCliOptions and
LoadedConfig or the same arguments previously supplied), ensuring all references
to resolveEffectiveBaseBranch are updated to call resolveBaseBranch so the
GITHUB_BASE_REF / CI / config.project.base_branch logic is centralized.

In `@src/core/run-executor.ts`:
- Around line 155-168: The inline appendCurrentTrustRecord call in
run-executor.ts for the 'no_applicable_gates' early exit is missing the explicit
trusted flag and duplicates logic; replace this block to reuse the
appendRunTrustRecord helper (export it from run-executor-helpers.ts if not
already exported) so the same trusted logic is applied (i.e., pass trusted:
ctx.trustSourceOnPass === 'ledger-reconciled' ? true : undefined) and remove the
duplicated options construction, ensuring appendRunTrustRecord receives config,
logDir, command: 'run', status: prepared.earlyResult.status, source:
ctx.trustSourceOnPass ?? 'validated', and the options object (gate,
enableReviews).

In `@src/utils/git.ts`:
- Around line 51-80: The runGitWithInput function can crash the process when
writing to child.stdin emits an 'error' (e.g., EPIPE); add a handler on
child.stdin ('error') before calling child.stdin.write(input) that either
rejects the Promise with the stream error or swallows known transient errors
(EPIPE) and then remove that handler when child.stdin.end() or the child process
closes; specifically, in runGitWithInput attach child.stdin.on('error', handler)
that calls reject(err) (or ignores err.code === 'EPIPE'), and ensure you remove
the listener in the child.on('close') and/or after child.stdin.end() to avoid
leaks.

In `@src/utils/trust-ledger.ts`:
- Around line 401-422: In pruneIfNeeded, for TrustRecord entries where
record.commit is null you must only retain the record when the working_tree_ref
still exists in git (per design/spec); remove the additional OR check against
gitObjectExists(record.tree) so the condition becomes solely await
gitObjectExists(record.working_tree_ref) before pushing to survivors;
references: pruneIfNeeded, readRecords, reachableCommits, gitObjectExists,
TrustRecord.
- Around line 190-203: The listTreeEntries function currently calls
gitStdout(['ls-tree','-r', tree]) and parses newline-quoted paths, which breaks
on quoted/non-ASCII names; change the git invocation to use the -z flag (git
ls-tree -r -z <tree>) and parse NUL-terminated records instead of splitting on
'\n'. For each NUL record, split the record at the first tab to separate meta
and file path (the format remains "meta<TAB>path\0"), then split meta by spaces
to extract mode and object as before; keep the early-return on empty output.
Update any callers (e.g., indexInfoFor) assumptions to remain consistent with
NUL-terminated entries.

In `@test/core/reconciliation.test.ts`:
- Around line 246-256: The merge invocation currently swallows failures via
.catch(() => undefined) which can hide whether a real conflict occurred; change
the logic around the git(["merge", "--no-ff", "feature-b", "-m", "merge"]) call
to explicitly detect a conflict: either remove the silent catch and assert that
the promise rejects, or keep the catch but set a flag (e.g., mergeFailed = true)
and assert it was reached before proceeding to the subsequent git(["commit",
"-m", "merge"]). As an additional or alternative check, verify the presence of
the MERGE_HEAD file (path.join(repoDir, ".git", "MERGE_HEAD")) after the merge
attempt to ensure a merge conflict existed, and fail the test if neither the
rejection flag nor MERGE_HEAD is present; update the code around the git merge
call and the following commit/assertion logic accordingly.

In `@test/utils/trust-ledger-full-tree.test.ts`:
- Around line 46-67: The suite mutates process.cwd() in beforeEach/afterEach
causing race conditions when tests run in parallel; as a minimal fix wrap the
test suite in a serial describe by replacing the current describe(...) with
describe.serial(...) (or call describe.serial at the top-level block that
contains beforeEach/afterEach) so that the tests in
trust-ledger-full-tree.test.ts run sequentially and avoid concurrent
process.chdir() conflicts affecting functions like appendCurrentTrustRecord,
readRecords, hasWorkingTreeChanges, getCurrentCommit, gitStdout and spawnGit.

---

Outside diff comments:
In `@src/cli-adapters/github-copilot.ts`:
- Around line 144-171: The checkHealth() method currently uses
execCopilot('which copilot') which fails on Windows; change it to use
execCopilot('copilot --help') for the presence check (same approach as
isAvailable()), and then differentiate missing vs unhealthy by inspecting the
caught error from execCopilot: if error has code 'ENOENT' or message/stderr
matches a "command not found" / "not recognized" / "no such file" pattern treat
it as { available: false, status: 'missing', message: ... }, otherwise treat it
as { available: true, status: 'unhealthy', message: ... }; update references to
execCopilot and the checkHealth() return values accordingly so cross-platform
detection works.

---

Nitpick comments:
In `@openspec/changes/better-git-tracking/design.md`:
- Around line 9-11: Markdown fenced code blocks containing pseudo-flow snippets
(e.g., the block with "loadConfig → initContext → handleAutoClean → acquireLock
→ runWithLock(logger.init → gates)" and the similar blocks in ranges 77-92 and
98-118) lack language tags and trigger MD040; add an explicit language tag to
each fenced block (use "bash" for the shell-style flow on the first snippet and
"text" for the purely illustrative arrows in the other blocks) so the lint
passes while preserving rendering; locate the exact fenced blocks in design.md
and append the appropriate language identifier after the opening ``` backticks.

In `@src/commands/detect.ts`:
- Around line 76-85: The current merge "{ ...opts, ...trustedChangeOptions }"
can overwrite a valid opts.fixBase with an explicit undefined from
trustedChangeOptions; update the merge so that fixBase from trustedChangeOptions
is only applied if it is not undefined (e.g. build the result by spreading opts,
then conditionally assign trustedChangeOptions.fixBase only when !== undefined,
or omit fixBase from trustedChangeOptions before spreading). Target the merging
logic that combines opts and trustedChangeOptions in detect (the code block
returning "{ ...opts, ...trustedChangeOptions }") so other properties still
merge but fixBase is preserved when trustedChangeOptions.fixBase is undefined.

In `@src/commands/gate-command-support.ts`:
- Around line 92-125: The function checkEarlyExit currently calls handleNoWork
(which returns Promise<never> and exits the process) but is declared as
Promise<void>, so update the implementation and API to be explicit: either (A)
stop exiting inside checkEarlyExit and let callers handle termination — remove
calls to handleNoWork here and instead return early (adjust callers of
checkEarlyExit accordingly), or (B) make the function name and signature reflect
that it may not return (rename to earlyExitIfNoWork or similar) and
document/annotate that it will call handleNoWork and not return; in that case
keep the internal calls to handleNoWork but change all call sites to not expect
a return and update types/comments to indicate non-returning behavior (reference
checkEarlyExit and handleNoWork to locate the behavior). Ensure no callers
assume continuation after calling the function in the chosen approach.

In `@src/commands/gate-command.ts`:
- Around line 316-329: The current call to appendCurrentTrustRecord collapses
passed_with_warnings into 'passed' by using outcome.allPassed ? 'passed' :
'failed'; update this to preserve the warnings state by deriving status the same
way run-executor-helpers.ts does—either call or import determineStatus(outcome)
and pass its result as the status, or replicate its logic (return
'passed_with_warnings' when outcome.allPassed && outcome.anySkipped, 'passed'
when allPassed, else 'failed'); ensure the trusted/trustSourceOnPass logic
remains unchanged and continue calling appendCurrentTrustRecord with the new
status value.
- Around line 211-216: Extract the duplicated fixBase-override logic into a
single helper (e.g., applyStartupFixBase) and replace the two inline
implementations with calls to that helper: create
applyStartupFixBase(changeOptions, startupChangeOptions, options) in a shared
module (suggest src/core/change-options.ts or src/utils/git.ts) that returns a
new changeOptions with fixBase set only when startupChangeOptions?.fixBase is
truthy and options.commit is falsy; update callers in gate-command.ts and
run-executor-helpers.ts to import and use applyStartupFixBase, and run tests to
ensure behavior is unchanged.
- Around line 264-332: The executeAndFinalize function has an unwieldy 12+
parameter list; refactor its signature to accept a single args object (e.g.,
args: { config, logger, debugLogger?, isRerun, failuresMap?, changeOptions?,
effectiveBaseBranch, passedSlotsMap?, changes, jobs, contextContent?,
commandName?, options?, trustSourceOnPass? }) and update the function body to
destructure those properties, keeping all existing types and behavior (including
calls to debugLogger.logRunStart/logRunEnd, Runner instantiation,
writeExecutionState, appendCurrentTrustRecord, and return of outcome.allPassed);
update all callers to pass a single object and adjust any tests, and consider
adding a thin overload or migration helper if you need backward-compatible call
sites during the change.

In `@src/commands/skip.ts`:
- Line 41: Extract the hard-coded 1000 into a shared named constant and use it
at call sites: add and export TRUST_LEDGER_PRUNE_THRESHOLD in
src/utils/trust-ledger.ts (or make pruneIfNeeded accept an optional parameter
with default = TRUST_LEDGER_PRUNE_THRESHOLD), then replace literal calls like
await pruneIfNeeded(1000) in skip.ts with await
pruneIfNeeded(TRUST_LEDGER_PRUNE_THRESHOLD) (or call without arg if you added
the default). Update imports to reference the new exported constant where needed
and keep the pruneIfNeeded signature consistent with the new default/constant.

In `@src/core/change-detector.ts`:
- Around line 210-231: The two loops in change-detector using currentUntracked,
snapshotUntracked, currentHashes, and files invert the natural condition and
call snapshotUntracked.get/currentHashes.get twice; refactor each loop to use a
single positive comparison by caching the hash lookups into local variables
(e.g., let snapHash = snapshotUntracked.get(file); let currHash =
currentHashes.get(file)) and then use if (snapHash === currHash) {
files.delete(file); continue; } else { files.add(file); } (similarly for the
second loop) to eliminate duplicate Map.get calls and remove the negated-else
pattern flagged by Biome.
- Around line 245-267: The getStashUntrackedFileHashes function currently
swallows all errors from execFileAsync('git', ['ls-tree','-r','-z',
`${fixBase}^3`]), which masks legitimate failures; update this function to
inspect the thrown error (from execFileAsync) and treat the "no third parent"
case (errors indicating the object `${fixBase}^3` does not exist, e.g., messages
like "Not a valid object name" or exit code specific to git) as the expected
case that returns an empty Map, but for any other error log the error at debug
(or warn) level via the existing logger and rethrow or return an empty Map with
a debug message per project convention; use the error properties (stderr,
message, code) from execFileAsync to distinguish the cases and reference
getStashUntrackedFileHashes and the `${fixBase}^3` git invocation when making
the change.

In `@src/core/run-executor-helpers.ts`:
- Around line 87-103: The helper appendRunTrustRecord centralizes writing the
run trust record but is currently only local to run-executor-helpers.ts while
run-executor.ts duplicates similar logic without setting the trusted flag;
export appendRunTrustRecord from src/core/run-executor-helpers.ts and replace
the duplicated inline block in run-executor.ts (the no_applicable_gates
early-exit path) to call appendRunTrustRecord(ctx, status) so both paths use the
same logic and preserve the trusted: ctx.trustSourceOnPass ===
'ledger-reconciled' behavior.

In `@src/output/console.ts`:
- Around line 164-167: The current statusOverride handling (variable
statusOverride returning { overallStatus, statusColor }) treats any
non-'Trusted' value as red; change it to use a lookup mapping (e.g., a
Record<string, ChalkColor> or switch) keyed by override labels so you can map
'Trusted' → green, 'Passed with warnings' → yellow, 'Error' → red, etc., and
fall back to a sensible default color; update the code that constructs and
returns { overallStatus: statusOverride, statusColor: color } (referencing
statusOverride, color, overallStatus, statusColor) to use the mapping lookup
with a default instead of the current === 'Trusted' ternary.

In `@src/utils/git.ts`:
- Around line 11-93: The runGit/runGitWithInput and gitStdout/gitStdoutWithInput
duplicates the child spawn and decoder wiring; create a single core function
(e.g., runGitCore or extend runGit with an optional input?: string parameter)
that spawns git, sets stdio to ['pipe','pipe','pipe'] when input is provided or
['ignore','pipe','pipe'] otherwise, reuses the StringDecoder/stdout/stderr
handling, merges env like currently done, writes and ends child.stdin only if
input is provided, and resolves the same GitResult ({stdout, stderr, code})
trimmed; then refactor runGit and runGitWithInput to call that core and refactor
gitStdout/gitStdoutWithInput to call the unified function and preserve the same
error message construction (including stderr detail) and behavior.

In `@src/utils/trust-ledger.ts`:
- Around line 401-404: The hardcoded prune threshold (1000) is duplicated across
callers; introduce and export a single constant DEFAULT_PRUNE_THRESHOLD = 1000
from this module and update callers that call pruneIfNeeded(1000) to import and
use DEFAULT_PRUNE_THRESHOLD instead; change the export in this file (alongside
the existing pruneIfNeeded function) and replace literal 1000 usages in caller
code (e.g., where pruneIfNeeded is invoked) so the threshold is maintained in
one place.
- Around line 147-165: The current maybeTree function relies on English error
message substrings from gitStdout to detect "no such tree" and returns null,
which fails on localized messages; fix by detecting missing refs via git exit
codes or forcing stable English output: update maybeTree to call git with
LC_ALL=C in the spawn/env passed to gitStdout (or runGit) OR change the error
handling to inspect the subprocess exit code (e.g., error.exitCode or error.code
returned by gitStdout/runGit) and treat the git exit code for unknown/rev-parse
failures (commonly 128) as the "no tree → null" case; reference maybeTree and
the gitStdout/runGit call so you modify the invocation and the catch branch to
use exit code/env instead of substring matching.

In `@test/core/change-detector.test.ts`:
- Around line 105-170: The tests in the "ChangeDetector fixBase dirty snapshots"
suite run real git commands and mutate process.cwd(), which makes CI flaky when
git is missing and leaks global state; update the suite to guard execution with
a runtime check (e.g., wrap the describe with a skip-if when hasGit() is false)
and remove global chdir by using a per-test working-directory strategy: stop
calling process.chdir(repoDir) in beforeEach/afterEach and instead run test
logic under the temp repo by passing cwd to the git helper or using a
runInTempCwd(repoDir, async () => { ... }) helper so createDirtySnapshot(), the
ChangeDetector instantiation, and all git calls operate with an explicit cwd;
keep the createDirtySnapshot and ChangeDetector usage but ensure tests are
skipped when git is unavailable and no global process.cwd() is mutated.

In `@test/utils/trust-ledger-full-tree.test.ts`:
- Around line 64-67: The afterEach cleanup should always restore the working
directory even if fs.rm throws or earlier test setup fails; update the afterEach
block so process.chdir(originalCwd) runs before awaiting fs.rm(repoDir, {
recursive: true, force: true }) (or ensure tests that call
process.chdir(repoDir) use try/finally to restore cwd inside the test),
referencing the existing afterEach async () => { process.chdir(originalCwd);
await fs.rm(repoDir, { recursive: true, force: true }); } to guarantee cwd is
restored before attempting removal.

In `@test/utils/trust-ledger.test.ts`:
- Around line 75-113: The test helper function config currently returns its
object with a trailing "as any", which bypasses TS shape checks; change this to
a proper cast to the real LoadedConfig type (either import the LoadedConfig type
and assert the return as "as LoadedConfig" or follow the pattern used in
reconciliation.test.ts and use "as unknown as LoadedConfig") so TypeScript
verifies the shape of the returned object; update the function signature or its
return expression in function config(...) accordingly and import the
LoadedConfig type where needed.
- Line 26: TEST_DIR is currently a fixed repo-relative path causing shared-state
flakes; change creation to use an OS temp directory created per test (e.g., use
os.tmpdir() + fs.mkdtemp or fs.mkdtempSync) instead of the import.meta.dir path.
Replace the TEST_DIR constant with a per-test temp dir created in the test suite
setup (beforeEach), assign that path to the same TEST_DIR variable or a local
testDir used by tests, and ensure the existing cleanup (rm -rf in
beforeEach/afterEach) targets this temp dir; mirror the approach used in
reconciliation.test.ts so each test gets a unique absolute temp directory.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f42bcb59-4ab3-457a-ad4f-eb9ff1e21a2e

📥 Commits

Reviewing files that changed from the base of the PR and between d1eb0eb and 908eef8.

📒 Files selected for processing (45)
  • .claude/commands/release.md
  • .validator/config.yml
  • README.md
  • docs/skills-guide.md
  • docs/trusted-snapshots.md
  • docs/user-guide.md
  • openspec/changes/better-git-tracking/.openspec.yaml
  • openspec/changes/better-git-tracking/design.md
  • openspec/changes/better-git-tracking/proposal.md
  • openspec/changes/better-git-tracking/specs/run-lifecycle/spec.md
  • openspec/changes/better-git-tracking/specs/skip-command/spec.md
  • openspec/changes/better-git-tracking/specs/validation-trust-ledger/spec.md
  • openspec/changes/better-git-tracking/specs/validator-merge/spec.md
  • openspec/changes/better-git-tracking/tasks.md
  • openspec/changes/better-git-tracking/tasks/reconciliation-and-cleanup.md
  • openspec/changes/better-git-tracking/tasks/trust-ledger-core.md
  • skills/validator-merge/SKILL.md
  • skills/validator-merge/merge-state.sh
  • src/cli-adapters/github-copilot.ts
  • src/commands/detect.ts
  • src/commands/gate-command-support.ts
  • src/commands/gate-command.ts
  • src/commands/skip.ts
  • src/core/change-detector.ts
  • src/core/reconciliation.ts
  • src/core/run-executor-helpers.ts
  • src/core/run-executor.ts
  • src/core/trusted-message.ts
  • src/output/console.ts
  • src/output/report.ts
  • src/types/validator-status.ts
  • src/utils/git.ts
  • src/utils/log-parser.ts
  • src/utils/trust-ledger.ts
  • test/cli-adapters/copilot-plugin.test.ts
  • test/commands/detect-trust.test.ts
  • test/core/change-detector.test.ts
  • test/core/reconciliation.test.ts
  • test/core/run-executor.test.ts
  • test/output/console.test.ts
  • test/output/report.test.ts
  • test/skills/merge-state-script.test.ts
  • test/utils/log-parser.test.ts
  • test/utils/trust-ledger-full-tree.test.ts
  • test/utils/trust-ledger.test.ts
💤 Files with no reviewable changes (4)
  • skills/validator-merge/SKILL.md
  • docs/skills-guide.md
  • skills/validator-merge/merge-state.sh
  • test/skills/merge-state-script.test.ts

Comment thread .claude/commands/release.md Outdated
Comment thread src/commands/detect.ts Outdated
Comment thread src/core/change-detector.ts
Comment thread src/core/run-executor.ts
Comment thread src/utils/git.ts
Comment on lines +51 to +80
export function runGitWithInput(
args: string[],
input: string,
options: { env?: NodeJS.ProcessEnv } = {},
): Promise<GitResult> {
return new Promise((resolve, reject) => {
const child = spawn('git', args, {
stdio: ['pipe', 'pipe', 'pipe'],
env: options.env ? { ...process.env, ...options.env } : undefined,
});
const stdoutDecoder = new StringDecoder('utf8');
const stderrDecoder = new StringDecoder('utf8');
let stdout = '';
let stderr = '';
child.stdout.on('data', (chunk: Buffer | string) => {
stdout += typeof chunk === 'string' ? chunk : stdoutDecoder.write(chunk);
});
child.stderr.on('data', (chunk: Buffer | string) => {
stderr += typeof chunk === 'string' ? chunk : stderrDecoder.write(chunk);
});
child.on('close', (code) => {
stdout += stdoutDecoder.end();
stderr += stderrDecoder.end();
resolve({ stdout: stdout.trim(), stderr: stderr.trim(), code });
});
child.on('error', reject);
child.stdin.write(input);
child.stdin.end();
});
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Unhandled error on child.stdin can crash the process on EPIPE.

If the spawned git process exits before consuming all of input (e.g., bad args, early failure), writing to child.stdin will emit an 'error' event on the writable stream. With no listener attached, Node.js treats it as an uncaughtException. Attach a swallow-or-reject handler so transient stdin errors don't crash the host process.

🔧 Suggested fix
     child.on('error', reject);
+    child.stdin.on('error', (err: NodeJS.ErrnoException) => {
+      // Process exited before consuming stdin (e.g., EPIPE) — surface via 'close'.
+      if (err.code !== 'EPIPE') reject(err);
+    });
     child.stdin.write(input);
     child.stdin.end();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/git.ts` around lines 51 - 80, The runGitWithInput function can
crash the process when writing to child.stdin emits an 'error' (e.g., EPIPE);
add a handler on child.stdin ('error') before calling child.stdin.write(input)
that either rejects the Promise with the stream error or swallows known
transient errors (EPIPE) and then remove that handler when child.stdin.end() or
the child process closes; specifically, in runGitWithInput attach
child.stdin.on('error', handler) that calls reject(err) (or ignores err.code ===
'EPIPE'), and ensure you remove the listener in the child.on('close') and/or
after child.stdin.end() to avoid leaks.

Comment thread src/utils/trust-ledger.ts
Comment thread src/utils/trust-ledger.ts Outdated
Comment on lines +401 to +422
export async function pruneIfNeeded(threshold: number): Promise<void> {
try {
const ledgerPath = await getLedgerPath();
if ((await lineCount(ledgerPath)) <= threshold) return;

const [records, reachable] = await Promise.all([
readRecords(),
reachableCommits(),
]);
const survivors: TrustRecord[] = [];
for (const record of records) {
if (record.commit) {
if (reachable.has(record.commit)) survivors.push(record);
continue;
}
if (
(await gitObjectExists(record.working_tree_ref)) ||
(await gitObjectExists(record.tree))
) {
survivors.push(record);
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Pruning keeps dirty-tree records on tree-only existence; design.md gates it on working_tree_ref.

The design (and specs/.../spec.md line 282) say: for commit: null records, keep iff working_tree_ref still exists in git. The implementation ORs working_tree_ref existence with tree existence, so a record whose working_tree_ref was gc’d but whose tree happens to still be reachable via another record stays in the ledger — which contradicts the documented rule and could keep references to stash refs the user can no longer recover from.

If the broader behavior is intentional, please update the spec/design to match; otherwise drop the second condition:

♻️ Align prune with spec
-      if (
-        (await gitObjectExists(record.working_tree_ref)) ||
-        (await gitObjectExists(record.tree))
-      ) {
+      if (await gitObjectExists(record.working_tree_ref)) {
         survivors.push(record);
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/trust-ledger.ts` around lines 401 - 422, In pruneIfNeeded, for
TrustRecord entries where record.commit is null you must only retain the record
when the working_tree_ref still exists in git (per design/spec); remove the
additional OR check against gitObjectExists(record.tree) so the condition
becomes solely await gitObjectExists(record.working_tree_ref) before pushing to
survivors; references: pruneIfNeeded, readRecords, reachableCommits,
gitObjectExists, TrustRecord.

Comment thread test/core/reconciliation.test.ts Outdated
Comment thread test/utils/trust-ledger-full-tree.test.ts
@pacaplan pacaplan merged commit ec98f8f into main Apr 27, 2026
8 checks passed
@pacaplan pacaplan deleted the dev branch April 27, 2026 02:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants