Show demo PR annotations in README#8
Merged
Merged
Conversation
Conalh
added a commit
that referenced
this pull request
May 29, 2026
…ion-drift blind spots (#49) * fix: run CLI bin through symlinks and stop false Codex baseline drift Two trust-critical fixes flagged in external review. 1. npm installs the `scopetrail` bin as a symlink, so the entrypoint's `import.meta.url === process.argv[1]` main-module check was false when launched via `npx`/global install: main() never ran and the CLI exited 0 with no output -- a CI gate that silently passed every PR. Resolve both sides through realpathSync. Add a test that runs the built bin through a symlink (the existing CLI tests only ran `node dist/index.js` directly). 2. Codex sandbox/approval drift ranked a missing base value at -1, so a brand-new .codex/config.toml that merely set the narrowest posture (read-only sandbox, untrusted approval) was reported as a high-severity widening/weakening -- a false positive on the safest possible config. Anchor the missing baseline at Codex's safe default so only settings genuinely wider than that default surface; a brand-new danger-full-access sandbox or `never` approval still fires. Add unit tests for both Codex directions and a benign benchmark fixture (codex-baseline-narrowest); regenerate RESULTS.md (29 cases, precision/recall unchanged at 100%/0% FP) and sync the README corpus counts. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * feat: close remaining permission-drift blind spots and tighten docs Implements the rest of the external review (the trust-critical bin/Codex pair landed in the previous commit). Detector blind spots: - #4 Merge all recognized MCP server maps instead of first-map-wins, so an empty `mcpServers: {}` can no longer shadow a populated `servers: {}` in Cursor/VS Code configs. - #3 Diff sensitive MCP fields (env, headers, cwd) on existing servers, which serverCommand()-based comparison ignored. A server keeping the same command but gaining a secret env var or auth header now surfaces; secret-bearing key names escalate to high. Applies to .mcp.json and Codex [mcp_servers.NAME]. Removals stay silent (narrowing, not widening). New kinds: scope_trail.mcp_server_sensitive_field_changed and scope_trail.codex_mcp_sensitive_field_changed. - #5 Model Claude hook entries by (matcher, type, command) rather than command alone, so rebinding a guard from one tool to another (same command) is caught. NOTE: this reverses the prior "matcher change is noise" decision and its test — a PreToolUse guard moved from Bash to Read stops guarding Bash, which is a real enforcement-surface change. - #7 Expand the critical removed-deny pattern set beyond .env/secret/ credential/.pem to SSH keys, *.key, cloud credentials, registry tokens, and kube configs. Docs & supply-chain: - #2/#8 Qualify the README "local-only" claim: the scanner uploads nothing, but the Action's setup step runs `npm ci` from the registry. - #9 Pin agent-gov-core to exact 1.3.0 (lockfile kept in sync) so npm consumers do not silently resolve a newer parser/schema. - #10 Note that the benchmark figures bound regressions, not real-world field accuracy. Each change ships unit tests and a labeled benchmark fixture. Corpus grows to 35 cases (27 rogue, 8 benign) across 21 detector kinds; precision/recall hold at 100%/0% false-positive rate, RESULTS.md regenerated and README counts synced. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix: clean up base snapshot when head materialization fails runDiff materialized the base snapshot, then the head snapshot, and only assigned `cleanup` after both succeeded. If head materialization threw — an unresolvable head ref (a shallow checkout missing it), or a max-buffer error on an oversized config — the base snapshot's temp dir was already on disk and leaked, since cleanup was never wired up. Clean the base snapshot explicitly before the error propagates. Found during a fresh correctness pass over the codebase (not part of the external review). The existing unresolvable-ref test fails on the *base* ref, before any dir exists, so it never exercised this path. Adds a regression test that asserts no scopetrail-snapshot temp dir survives a head-side failure. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Verification