Betterleaks migration #93
Merged
Replace the secret-scanning binary in both Node and Python with betterleaks (github.com/betterleaks/betterleaks v1.1.2), maintained by the gitleaks authors. The JSON report shape is unchanged, so the result-mapping logic survives; the wire-up changes are the CLI subcommand (`detect --no-git -s` → `dir <path>`), the release URL pattern, and the checksum filename.

Internal renames:
- GitleaksScanner → BetterleaksScanner (node + python)
- BinaryManager.get/is/verify/find/download_gitleaks → *_betterleaks
- GITLEAKS_VERSION → BETTERLEAKS_VERSION
- update-gitleaks command → update-betterleaks (with deprecated alias)

User-facing back-compat:
- `--with-gitleaks` accepted as alias of `--with-betterleaks`
- `--engine gitleaks` accepted as alias of `--engine betterleaks`
- `agent update-gitleaks` accepted as hidden alias

Dropped the parenthetical in the `rafter secrets` help description so the "Secrets only" phrase no longer wraps mid-line at typer's default width.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Update README, CLAUDE.md, SKILL.md, llms.txt, shared-docs/CLI_SPEC.md, recipes/*, and the bundled python skill/agent resources to refer to Betterleaks (the gitleaks successor) as the canonical name. Each occurrence of the legacy flag/value/subcommand explicitly notes the back-compat alias so existing docs/scripts/agents that still say `gitleaks` keep working.

Historical docs under docs/ (audits, code reviews, proposals, research) are left as-is: they describe state at a point in time.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ning)

Functional fixes (4 reviewers reported):
- Node BetterleaksScanner now falls back to a PATH-installed binary (Homebrew users were silently demoted to regex; Python already did this).
- agent verify/status soft-degrade when only legacy gitleaks is present: emit a "run rafter agent update-betterleaks" hint instead of hard-failing.
- Update remaining user-facing surfaces: action.yml, .pre-commit-hooks.yaml, Dockerfile (was `--with-gitleaks 2>/dev/null || true`), python/README.md, .github/copilot-instructions.md.
- Python scanner now logs warnings on JSON/timeout errors (previously a silent false negative).

Supply-chain hardening:
- Pin SHA256 hashes for the bundled BETTERLEAKS_VERSION (1.1.2) in source. Default install no longer trusts the release-page checksums.txt to authenticate itself; `--version <other>` still falls back to the upstream file (TOFU at install time).
- Reject symlink/hardlink/device tar entries on extract. Without this, a malicious release could ship `betterleaks` as a symlink to e.g. ~/.ssh/authorized_keys, and the subsequent chmod +x would mode-flip the target. Same defense for zip extraction (Unix mode bits in external_attr). Post-extract lstat confirms the result is a regular file.
- Refuse non-https redirects (Node) and non-https final URLs (Python).
- Validate `--version` against /^[A-Za-z0-9._-]+$/ to neutralize URL injection attempts like `1.1.2/../evil`.
- Add `--` separator before user-supplied paths in betterleaks invocations so a path beginning with `-` isn't parsed as a flag.

Aligns Node scanFile timeout with scanDirectory (60s, matches Python).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
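Two of the hardening checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the PR: the function names `validate_version` and `is_rejected_member` are hypothetical, and only the regular expression is taken from the commit message.

```python
import re
import tarfile

# Pattern quoted in the commit message; any other character is rejected.
_VERSION_RE = re.compile(r"^[A-Za-z0-9._-]+$")


def validate_version(version: str) -> str:
    """Return the version unchanged if it is safe to splice into a release URL.

    Hypothetical helper: fullmatch() also rejects a trailing newline,
    which a bare `$` with match() would let through.
    """
    if not _VERSION_RE.fullmatch(version):
        raise ValueError(f"invalid version: {version!r}")
    return version


def is_rejected_member(member: tarfile.TarInfo) -> bool:
    """True for symlinks, hardlinks, and device nodes (char, block, FIFO).

    Hypothetical helper: a caller would skip or abort on such entries
    before extracting, so a later chmod cannot follow a link.
    """
    return member.issym() or member.islnk() or member.isdev()
```

A slash never matches the character class, so `1.1.2/../evil` fails validation before any URL is built.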
Parity:
- Python BetterleaksScanner.getSeverity now inspects Tags (key+secret => critical, api => high, generic => medium), matching Node behavior; before this, the same rule could classify differently across runtimes.
- Tighten Node BetterleaksResult interface: git-history-only fields (Commit/Author/Email/Date/Message/Fingerprint) are absent on `dir` mode output, so mark them optional rather than lying about the contract.
- Python User-Agent now uses rafter-cli __version__ instead of accidentally reusing BETTERLEAKS_VERSION.

Tests for the legacy alias surface (no coverage before this; it would have silently regressed on the next refactor):
- node: --engine gitleaks (scan), --with-gitleaks (init), legacy gitleaks detection in agent status with the upgrade hint.
- python: --engine gitleaks (secrets), --with-gitleaks (agent init).
- python: tag-based severity (critical/high/medium) parametrized.
- python: new test_binary_manager.py covers --version validation (URL-injection guards), pinned-hash table completeness for the bundled BETTERLEAKS_VERSION, and non-https refusal in _download_file.
- node: matching --version validation tests in binary-manager.test.ts.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
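The tag-to-severity rule from the parity fix might look roughly like the sketch below. It rests on two assumptions that the commit message does not settle: that "key+secret" means both tags are present on the finding, and that tag matching is case-insensitive. The function name is hypothetical, and it returns `None` when no tag matches so that a caller can fall back to whatever other logic the real scanner uses.

```python
from typing import Iterable, Optional


def severity_from_tags(tags: Iterable[str]) -> Optional[str]:
    """Map betterleaks rule tags to a severity label (hypothetical helper).

    Assumption: "key+secret" is read as "both tags present".
    Returns None when no tag-based rule applies.
    """
    normalized = {tag.lower() for tag in tags}
    if {"key", "secret"} <= normalized:
        return "critical"
    if "api" in normalized:
        return "high"
    if "generic" in normalized:
        return "medium"
    return None
```

Checking the most severe rule first keeps a finding tagged with `key`, `secret`, and `api` at critical rather than high.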
The migration kept --with-gitleaks, --engine gitleaks, and the update-gitleaks subcommand as silent aliases. Per product direction, move off gitleaks entirely.

Removed (now hard-error):
- `rafter agent init --with-gitleaks` → "unknown option"
- `rafter agent scan --engine gitleaks` → "Invalid engine"
- `rafter agent baseline create --engine gitleaks` → same
- `rafter agent update-gitleaks` (hidden alias) → "unknown command"
- `gitleaks` value in mcp `scan_secrets` enum → schema rejection

Kept (read-only legacy detection):
- `rafter agent verify` and `rafter agent status` continue to probe for ~/.rafter/bin/gitleaks and gitleaks on PATH so users with leftover installs see "legacy gitleaks at X; run: rafter agent update-betterleaks" instead of a confusing "not found". This is the soft-degrade that prevents a verify hard-fail regression on upgrade.

Docs scrubbed of "legacy alias accepted" copy across README, CLAUDE.md, llms.txt, shared-docs/CLI_SPEC.md, recipes/gemini-cli.md, and the bundled python skill cli-reference. Tests flipped from "alias is accepted" to "alias is rejected".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
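The read-only legacy probe described above could be sketched as follows. This is an illustration, not the PR's implementation: the signature (including the `home` parameter, added here only to make the function testable) and the `upgrade_hint` helper are assumptions; the probed locations and the hint wording come from the commit message.

```python
import shutil
import sys
from pathlib import Path
from typing import Optional


def find_legacy_gitleaks(home: Optional[Path] = None) -> Optional[Path]:
    """Look for a leftover gitleaks binary without ever executing it."""
    # Windows installs carry an .exe suffix; other platforms do not.
    name = "gitleaks.exe" if sys.platform == "win32" else "gitleaks"
    managed = (home or Path.home()) / ".rafter" / "bin" / name
    if managed.is_file():
        return managed
    # shutil.which applies PATHEXT on Windows, so the bare name suffices.
    on_path = shutil.which("gitleaks")
    return Path(on_path) if on_path else None


def upgrade_hint(path: Path) -> str:
    """Format the soft-degrade message quoted in the commit message."""
    return f"legacy gitleaks at {path}; run: rafter agent update-betterleaks"
```

A verify or status command would call the probe only after the betterleaks lookup fails, and print the hint instead of a bare "not found".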
Audit run by 2 codex agents and 2 claude agents on the betterleaks-migration branch (post-alias-removal). Fixes:

Bundled Node skill resources (P0, both reviewers flagged):
The npm-bundled docs at node/resources/** still said "Gitleaks" and documented the removed `update-gitleaks` subcommand and `--with-gitleaks` flag, so a clean install would land docs that recommend commands the CLI now hard-rejects. Mirrored the rename across:
- node/resources/skills/rafter/SKILL.md
- node/resources/skills/rafter/docs/cli-reference.md
- node/resources/agents/rafter.md
- node/resources/rafter-security-skill.md
- node/.claude/skills/rafter/docs/cli-reference.md (dev-side mirror)
- fixtures/vulnerable-repo/README.md (`--engine gitleaks` example)

Python silent-failure bug (P1, codex-functional flagged):
`_run_scan()` ignored the betterleaks subprocess return code, so any non-zero exit other than 1 (panic, OOM, malformed args) returned [] and the scan looked clean. Mirrored Node's contract: 0=clean, 1=findings, anything else => RuntimeError with stderr tail. Catches OSError on the subprocess call too.

Windows .exe extension parity (P2, codex-functional flagged):
Several legacy-detection / status / init paths hardcoded the binary name without `.exe`, so a managed `~/.rafter/bin/betterleaks.exe` (or leftover `gitleaks.exe`) was missed on Windows.
- python/rafter_cli/commands/agent.py: agent init, _check_betterleaks, status output all now select binary name by sys.platform.
- node/src/commands/agent/status.ts: same; derive `.exe` once and reuse.

Test cleanup (P2, codex-functional flagged):
node/tests/mcp-server.test.ts handler mock kept the `engineRaw === "gitleaks" ? "betterleaks" : engineRaw` normalization that production no longer has. Removed so the test reflects current behavior.

Copy polish (both reviewers flagged):
- recipes/pre-commit.md sentence "21+ patterns via Betterleaks" wrongly attributed all 21 patterns to betterleaks. Now: "21+ built-in credential patterns plus optional Betterleaks integration".
- shared-docs/SHOW_HN_DRAFT.md, drafts/show-hn/post.md, drafts/show-hn/faq.md marketing drafts updated to mention Betterleaks (kept gitleaks references where they're historical comparisons or the original FAQ question).

CHANGELOG entry:
Added an [Unreleased] entry for the full migration: scanner change, breaking removal of legacy aliases, soft-degrade detection, supply-chain hardening, and the alias-removal-as-breaking note.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
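The exit-code contract from the Python silent-failure fix (0=clean, 1=findings, anything else is an error) can be sketched as below. The function names are hypothetical and the sketch deliberately leaves out report parsing, since the commit message does not say where the JSON report is read from. It only shows how an unexpected exit code or a failed launch becomes a `RuntimeError` instead of an empty result.

```python
import subprocess
from typing import Sequence


def has_findings(returncode: int, stderr: str) -> bool:
    """Interpret a scanner exit code: 0 is clean, 1 means findings."""
    if returncode == 0:
        return False
    if returncode == 1:
        return True
    # Keep only the last few stderr lines so the error stays readable.
    tail = "\n".join(stderr.strip().splitlines()[-5:])
    raise RuntimeError(f"scanner exited with code {returncode}: {tail}")


def run_scanner(argv: Sequence[str], timeout: float = 60.0) -> bool:
    """Run the scanner and report whether it signalled findings.

    Hypothetical wrapper: a missing or non-executable binary raises
    OSError, which is converted so the caller never sees a "clean" result.
    """
    try:
        proc = subprocess.run(
            list(argv), capture_output=True, text=True, timeout=timeout
        )
    except OSError as exc:
        raise RuntimeError(f"could not start scanner: {exc}") from exc
    return has_findings(proc.returncode, proc.stderr)
```

In this sketch a timeout still surfaces as `subprocess.TimeoutExpired`; how the real scanner handles that case is not stated in the commit message.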
Security hardening (Claude security reviewer flagged 3 real issues; Codex security review confirmed the layered hardening is otherwise solid):
- Node download: add MAX_REDIRECTS=10 (was unbounded recursion on 30x loops), MAX_BYTES=200MB body cap with Content-Length precheck + per-chunk enforcement (was a 50GB-mirror DoS), and REQUEST_TIMEOUT_MS=60s socket timeout (was slow-loris hang). All in node/src/utils/binary-manager.ts downloadFile.
- Python download: matching MAX_BYTES=200MB body cap. Python already had urllib's internal redirect cap and timeout=60.

Simplicity (Claude simplicity reviewer):
- Extract findLegacyGitleaks() onto BinaryManager (Node) and find_legacy_gitleaks() onto BinaryManager (Python). Was duplicated 3x per runtime (verify + status + agent init); now one canonical implementation each side, used everywhere. Drops the duplicated Windows .exe extension handling and the homedir traversal.
- Inline _update_betterleaks_impl() back into update_betterleaks(). The wrapper existed for the (now-removed) update-gitleaks alias and was only called once.

Functional correctness (Codex functional reviewer ran the full test suite, blew context before completing the manual matrix):
- Confirmed: 73/73 node + 164/164 python tests pass.
- Confirmed live: aliases hard-error correctly (--with-gitleaks, --engine gitleaks, update-gitleaks), legacy detection emits the upgrade hint identically across both runtimes, real download still works end-to-end after the hardening additions.

Items deferred (low value or trade-off):
- Re-verify on-disk binary hash before each scan (TOFU drift): local malware threat outside our model.
- Race on parallel agent init writing to same archive path: local DoS only, no security boundary.
- Switch Node verify_betterleaks_verbose from execAsync (shell quoted) to execFile (argv): fragile but trusted inputs only.
- Drop post-extract lstat from Python tarball extract: kept for parity with Node where node-tar's filter typing is loose.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
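The body cap described in the security hardening (a Content-Length precheck plus per-chunk enforcement) could be sketched in Python as follows. The function name, signature, and chunk size are hypothetical, and reading "200MB" as 200 MiB is an assumption. The stream argument stands in for whatever response object the real downloader uses.

```python
from typing import BinaryIO, Optional

# Assumption: "200MB" is interpreted here as 200 MiB.
MAX_BYTES = 200 * 1024 * 1024


def read_capped(
    stream: BinaryIO,
    content_length: Optional[str],
    max_bytes: int = MAX_BYTES,
    chunk_size: int = 64 * 1024,
) -> bytes:
    """Read a response body, refusing anything larger than max_bytes."""
    # Precheck: fail fast when the server declares an oversized body.
    if content_length is not None and content_length.isdigit():
        if int(content_length) > max_bytes:
            raise ValueError("declared Content-Length exceeds the cap")
    body = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Per-chunk check: the header can be absent or wrong, so count bytes.
        if len(body) + len(chunk) > max_bytes:
            raise ValueError("response body exceeds the cap")
        body.extend(chunk)
    return bytes(body)
```

The per-chunk check is what actually bounds memory use; the header precheck only saves bandwidth when the server is honest about the size.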
Rome-1 added a commit that referenced this pull request on May 10, 2026
Minor bump (not patch) because the betterleaks migration removed user-facing CLI surface: `--with-gitleaks`, `--engine gitleaks`, and `rafter agent update-gitleaks` now hard-error. On a 0.x line, removing documented CLI flags is the textbook MINOR trigger; burying it in a patch would mislead anyone scripting against those flags.

Bumps:
- node/package.json: 0.7.9 → 0.8.0
- python/pyproject.toml: 0.7.9 → 0.8.0
- node/resources/rafter-security-skill.md: 0.7.9 → 0.8.0 (ClawHub publish)
- python/rafter_cli/resources/rafter-security-skill.md: 0.7.9 → 0.8.0
- recipes/openclaw.md example frontmatter: 0.7.9 → 0.8.0

CHANGELOG cleanup:
- Move betterleaks bullets from [0.7.9] (where the PR #93 octopus merge parked them under main's `### Changed` heading) into the new [0.8.0] section. The published v0.7.9 (dc81574) does not contain betterleaks code; npm/PyPI 0.7.9 still ships gitleaks.
- New [0.8.0] groups all unreleased bullets (rf-hrtd dry-run, ClawHub auto-publish) and adds three Fixed entries: rf-cfjc (action.yml jq parser, fixed in 941879f), #96 (ClawHub owner handle), rf-z6sv (#97, pre-commit rev pin bump).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>