Skip to content

Correctness fixes + offline catalog (add/search) + pre-commit support (0.3.0)#38

Merged
DataDave-Dev merged 8 commits into
mainfrom
fix/staged-content-and-robustness
Jul 1, 2026
Merged

Correctness fixes + offline catalog (add/search) + pre-commit support (0.3.0)#38
DataDave-Dev merged 8 commits into
mainfrom
fix/staged-content-and-robustness

Conversation

@DataDave-Dev

@DataDave-Dev DataDave-Dev commented Jul 1, 2026

Copy link
Copy Markdown
Owner

Summary

Hardens the enforcement path so becwright keeps its "guaranteed, not probabilistic" promise, and lowers adoption friction with an offline catalog and pre-commit/Husky support. Bumps to 0.3.0.

Correctness (P0)

  • Check staged content, not the working tree. The pre-commit path listed staged file names but opened the working-tree copy, so an unstaged edit could flag a violation that isn't being committed, or hide a staged one. Now the staged blobs are materialized into a temp mirror and checked there. --all still inspects the working tree on purpose.
  • Per-check timeout. A hung/runaway check (bundles can carry third-party code) no longer freezes the commit. 30s cap, override via BECWRIGHT_CHECK_TIMEOUT (0 disables).
  • Validated rules.yaml. A misspelled severity silently downgraded a rule to a warning; malformed YAML dumped a raw traceback. Both now raise a readable RulesError and exit 2.

Adoption (P1)

  • Offline catalog: becwright add <name> and becwright search [query]. The catalog now ships inside the package (src/becwright/becs/, via package-data), so installing a ready-made BEC needs no URL and works offline. Shares the trust-gated install flow with import.
  • pre-commit framework + Husky. New .pre-commit-hooks.yaml and README setup for both.

Polish (P2)

  • Memoize compiled glob patterns (check --all perf).
  • CI now also tests Python 3.14.

Test plan

  • pytest — 152 passing (new: staged-content both directions, timeout, rules validation, catalog/add/search).
  • Wheel inspected: catalog .bec.yaml files are included.
  • Dogfooding (becwright check --all) green; the pre-commit hook validated each commit using staged content.

Summary by CodeRabbit

  • New Features

    • Added built-in catalog browsing and installation commands for ready-to-use rules.
    • Offline rule installation is now supported from the packaged catalog.
    • CI now runs on Python 3.14.
  • Bug Fixes

    • Rule checks now use staged content for pre-commit-style runs.
    • Invalid rule definitions are reported more clearly, including malformed YAML and unsupported severities.
    • Long-running checks now time out cleanly instead of hanging.
  • Documentation

    • Updated usage guides and command references to match the new catalog workflow.

@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@DataDave-Dev, you've reached your PR review limit, so we couldn't start this review.

Next review available in: 48 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.
You're only billed for reviews past your plan's rate limits ($0.25/file).

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer docs for additional details.

Review details
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 46aa4f6b-00c6-4d41-9b2e-8a7c7ed838af

📥 Commits

Reviewing files that changed from the base of the PR and between edf81c9 and 09c5fa7.

📒 Files selected for processing (42)
  • .github/workflows/ci.yml
  • .pre-commit-hooks.yaml
  • CLAUDE.md
  • README.es.md
  • README.md
  • becs/README.md
  • documentation/portability.es.md
  • documentation/portability.md
  • documentation/usage.es.md
  • documentation/usage.md
  • integrations/claude-code/skills/becwright/SKILL.md
  • npm/becwright/package.json
  • npm/platforms/darwin-arm64/package.json
  • npm/platforms/darwin-x64/package.json
  • npm/platforms/linux-arm64/package.json
  • npm/platforms/linux-x64/package.json
  • npm/platforms/win32-x64/package.json
  • pyproject.toml
  • src/becwright/__init__.py
  • src/becwright/becs/README.md
  • src/becwright/becs/no-console-log-js.bec.yaml
  • src/becwright/becs/no-dangerous-eval.bec.yaml
  • src/becwright/becs/no-debug-go.bec.yaml
  • src/becwright/becs/no-debug-remnants.bec.yaml
  • src/becwright/becs/no-debug-rust.bec.yaml
  • src/becwright/becs/no-debugger-js.bec.yaml
  • src/becwright/becs/no-hardcoded-secrets.bec.yaml
  • src/becwright/becs/no-token-in-logs.bec.yaml
  • src/becwright/becs/no-wildcard-imports.bec.yaml
  • src/becwright/catalog.py
  • src/becwright/cli.py
  • src/becwright/engine.py
  • src/becwright/git.py
  • src/becwright/report.py
  • src/becwright/rules.py
  • tests/test_bundle.py
  • tests/test_catalog.py
  • tests/test_cli_and_git.py
  • tests/test_engine_integration.py
  • tests/test_list.py
  • tests/test_rules.py
  • tests/test_staged_content.py
📝 Walkthrough

Walkthrough

This PR adds an offline, package-bundled BEC catalog with new add/search CLI commands, hardens rule-file validation with a RulesError exception, introduces a check timeout and glob caching in the engine, adds staged-worktree evaluation for git checks, bumps package versions to 0.3.0, and updates documentation/CI accordingly.

Changes

Catalog, rules hardening, engine/staging, and packaging

Layer / File(s) Summary
Catalog module and packaging
src/becwright/catalog.py, src/becwright/becs/README.md, pyproject.toml, tests/test_bundle.py, tests/test_catalog.py
Adds CatalogError, catalog_names(), and read_bec() to enumerate/read bundled .bec.yaml files, ships catalog docs, packages catalog data in the wheel, and updates/adds tests.
CLI add/search commands and bundle install refactor
src/becwright/cli.py, tests/test_catalog.py, tests/test_list.py
Adds catalog import, refactors install logic into _install_bundle(), adds _cmd_add/_cmd_search and parser wiring, updates list/rules-header text, and tests these commands.
Rules validation hardening
src/becwright/rules.py, tests/test_rules.py, tests/test_cli_and_git.py
Adds RulesError and _VALID_SEVERITIES, validates rule shape/required fields/severity in _to_rule, wraps YAML parsing errors in load_rules, and adds failure-path tests including a CLI-level invalid severity test.
Check timeout and glob caching in engine
src/becwright/engine.py, tests/test_engine_integration.py
Adds BECWRIGHT_CHECK_TIMEOUT-driven timeout handling around subprocess calls, caches compiled glob regexes via lru_cache, and adds a timeout integration test.
Staged-content evaluation for checks
src/becwright/git.py, src/becwright/report.py, tests/test_staged_content.py
Adds _staged_blob/staged_worktree to materialize staged file content into a temp directory, updates gather() to evaluate staged content unless all_files is set, and adds staged-vs-working-tree tests.
Version bumps across packages
pyproject.toml, src/becwright/__init__.py, npm/becwright/package.json, npm/platforms/*/package.json
Bumps version from 0.2.2 to 0.3.0 across Python and npm package manifests.
Documentation and CI updates for catalog workflow
README.md, README.es.md, CLAUDE.md, documentation/usage*.md, documentation/portability*.md, integrations/claude-code/skills/becwright/SKILL.md, .pre-commit-hooks.yaml, .github/workflows/ci.yml
Updates docs to describe search/add catalog workflow, retargets catalog directory links, adds a pre-commit hook definition, and expands CI Python matrix to include 3.14.

Estimated code review effort: 3 (Moderate) | ~30 minutes

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant CLI
  participant catalog
  participant bundle
  participant report
  participant git
  participant engine

  User->>CLI: becwright search term
  CLI->>catalog: catalog_names()
  CLI->>bundle: parse_bundle(read_bec(name))
  CLI-->>User: matching BECs

  User->>CLI: becwright add name
  CLI->>catalog: read_bec(name)
  CLI->>bundle: _install_bundle(root, text)
  CLI-->>User: install result

  User->>CLI: becwright check
  CLI->>report: gather(rules, files, root, all_files)
  report->>git: staged_worktree(root, files)
  git-->>report: temp staged dir
  report->>engine: evaluate(rules, files, staged_root)
  engine-->>report: results (pass/fail/timeout)
  report-->>CLI: summary
  CLI-->>User: exit code + output
Loading

Possibly related PRs

  • DataDave-Dev/becwright#6: Both PRs extend CLI subcommand wiring in src/becwright/cli.py (adding init versus adding add/search).
  • DataDave-Dev/becwright#12: Both PRs touch the list subcommand in src/becwright/cli.py, one implementing it and this one changing its output text.
  • DataDave-Dev/becwright#2: Both PRs update README install/usage documentation reflecting distribution changes.
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description has useful detail, but it does not follow the required What, Why, and Checklist template headings. Rewrite the description using the template's What, Why, and Checklist sections, and include the required test and docs notes.
Docstring Coverage ⚠️ Warning Docstring coverage is 5.26% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main changes: correctness fixes, offline catalog/search, pre-commit support, and the 0.3.0 release.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/staged-content-and-robustness

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

The pre-commit path listed staged file names but the checks opened the
working-tree copy, so the result could diverge from what the commit records:
an unstaged edit could flag a violation that is not being committed, or hide
a staged one behind a clean working tree.

Materialize the staged blobs (git show :0:<file>) into a temp mirror, copy
the working-tree .bec/ tooling alongside so custom checks resolve, and run
the checks there. `--all` still inspects the working tree on purpose.
A check runs on every commit and may carry imported third-party code (from a
bundle). Without a timeout a hung or runaway check freezes the commit forever.
Add a 30s cap, overridable via BECWRIGHT_CHECK_TIMEOUT (0 disables it for slow
whole-repo runs), and treat a timeout as a failed check with a clear message.
A misspelled severity ("blockign", "bloqueante") silently downgraded a rule
to a warning, quietly disabling enforcement; a malformed rules.yaml surfaced a
raw PyYAML traceback in the middle of the user's commit.

Raise a readable RulesError for invalid YAML, a non-list `rules:`, a missing
required field, or an invalid severity. The CLI prints it and exits 2.
Importing a ready-made BEC required knowing and typing a raw GitHub URL, and
the catalog was not shipped with the package, so it could not work offline.

Bundle the catalog inside the wheel (src/becwright/becs/, packaged via
package-data) and read it through a new catalog module. Add two commands:
`becwright search [query]` lists the catalog with each BEC's intent, and
`becwright add <name>` installs one with the same trust-gated flow as `import`
(shared _install_bundle helper). No network, no URL.

Docs, the init header, `list`, and the repo layout are updated to point at the
new commands and location.
Teams that already manage git hooks could not plug becwright in without letting
it own the pre-commit hook. Ship a .pre-commit-hooks.yaml so becwright can be
added as a pre-commit repo (it determines the staged files itself, so
pass_filenames is false and always_run is true), and document both the
pre-commit and Husky setups in the README.
matches() rebuilt and recompiled each path glob on every (file, pattern) pair,
which adds up on `check --all` over a large repo. Memoize the compiled regex.
New since 0.2.2: staged-content checking, per-check timeout, validated
rules.yaml, the offline catalog with `add`/`search`, and pre-commit/Husky
support. Bump the version across pyproject, the npm launcher and platform
packages, and pin the README pre-commit examples to v0.3.0 (the first tag that
ships .pre-commit-hooks.yaml).
@DataDave-Dev DataDave-Dev force-pushed the fix/staged-content-and-robustness branch from edf81c9 to 09c5fa7 Compare July 1, 2026 04:23
@DataDave-Dev DataDave-Dev merged commit 732b97d into main Jul 1, 2026
3 of 4 checks passed
@DataDave-Dev DataDave-Dev deleted the fix/staged-content-and-robustness branch July 1, 2026 04:25

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
tests/test_staged_content.py (1)

1-30: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Duplicate test helpers across files.

_git, _init_repo, and _rules_yaml re-implement helpers that already exist in tests/test_cli_and_git.py. Consider extracting these into a shared conftest.py fixture/helper module to avoid drift between the two copies.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_staged_content.py` around lines 1 - 30, The staged-content tests
duplicate the same git/setup helpers already defined in
tests/test_cli_and_git.py, so please remove the copy in this test module and
reuse shared helpers instead. Move or import _git, _init_repo, and _rules_yaml
from a common test support location such as conftest.py or a shared helper
module, then update _setup to call those shared symbols so both test files stay
in sync.
src/becwright/git.py (1)

63-84: 🚀 Performance & Scalability | 🔵 Trivial | 🏗️ Heavy lift

Consider batching staged-file materialization instead of one git show per file.

staged_worktree spawns one git show :0:<path> subprocess per staged file. For large commits this means N process spawns just to build the temp worktree. git checkout-index --prefix=<dir>/ -z --stdin can materialize all staged files into a directory with a single git invocation, reading NUL-separated paths from stdin.

♻️ Illustrative sketch (needs adaptation for `.bec/` copy + error handling)
def staged_worktree(root: Path, files: list[str]) -> Iterator[Path]:
    tmp = Path(tempfile.mkdtemp(prefix="becwright-"))
    try:
        subprocess.run(
            ["git", "checkout-index", f"--prefix={tmp}/", "-z", "--stdin"],
            cwd=root, input="\0".join(files).encode(), check=False,
            timeout=_check_timeout(),
        )
        bec = root / ".bec"
        if bec.is_dir():
            shutil.copytree(bec, tmp / ".bec", dirs_exist_ok=True)
        yield tmp
    finally:
        shutil.rmtree(tmp, ignore_errors=True)

Please confirm git checkout-index --stdin behaves as expected for the repo's supported git version range before adopting.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/becwright/git.py` around lines 63 - 84, The staged_worktree helper is
currently materializing each file with a separate _staged_blob/git show call,
which is inefficient for large file sets. Update staged_worktree to batch
staged-file checkout using a single git checkout-index invocation with --prefix
and --stdin, while preserving the existing .bec copy behavior and temp-dir
cleanup in the try/finally block. Verify the supported Git version range for
checkout-index --stdin and keep the function’s path handling compatible with the
current root/files workflow.
src/becwright/engine.py (1)

85-96: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Redundant re-parsing of _check_timeout().

Called once for timeout= and again for the timeout message, re-reading/re-parsing the env var twice per rule for no benefit.

♻️ Suggested fix
+        timeout = _check_timeout()
         try:
             proc = subprocess.run(
                 rule.check, shell=True, cwd=root,
                 input="\n".join(relevant), capture_output=True, text=True,
-                timeout=_check_timeout(),
+                timeout=timeout,
             )
         except subprocess.TimeoutExpired:
             results.append(RuleResult(
                 rule=rule, passed=False,
-                output=f"check timed out after {_check_timeout():g}s (its command hung)",
+                output=f"check timed out after {timeout:g}s (its command hung)",
             ))
             continue
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/becwright/engine.py` around lines 85 - 96, The timeout handling in the
rule execution path re-calls _check_timeout() twice, causing unnecessary
re-reading/parsing of the env var. In the subprocess.run block inside the
rule-checking logic, compute the timeout once in the surrounding scope and reuse
that same value for both the timeout= argument and the TimeoutExpired message;
keep the change localized to the rule execution flow and the RuleResult append
path.
src/becwright/catalog.py (1)

26-33: 🔒 Security & Privacy | 🔵 Trivial | 💤 Low value

Consider sanitizing name against path traversal.

read_bec joins user-supplied name directly into the resource path without checking for /, .., or absolute path components. Since name originates from the local becwright add <name> / becwright search CLI, the practical exploitability is minimal (the user only affects their own filesystem), but a stray ../ could resolve outside the becs/ directory.

🛡️ Optional hardening
 def read_bec(name: str) -> str:
+    if "/" in name or "\\" in name or name in ("..", "."):
+        raise CatalogError(f"Invalid BEC name: {name!r}")
     entry = _catalog_dir().joinpath(name + _SUFFIX)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/becwright/catalog.py` around lines 26 - 33, `read_bec` currently joins
the user-controlled `name` directly into the catalog path, so harden it against
path traversal before resolving the file. Update `read_bec` to validate or
normalize `name` so it cannot contain `/`, `..`, or absolute path components,
and ensure the final resolved path stays under `_catalog_dir()`. Keep the
existing `CatalogError` flow in `read_bec` for invalid or missing entries.
src/becwright/rules.py (1)

54-63: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Chain the original YAML error for a proper traceback.

Ruff (B904) flags line 60: re-raising inside an except block without from e/from None discards the original traceback context.

🔧 Proposed fix
     try:
         data = yaml.safe_load(rules_path.read_text(encoding="utf-8")) or {}
     except yaml.YAMLError as e:
-        raise RulesError(f"{rules_path}: invalid YAML ({e}).")
+        raise RulesError(f"{rules_path}: invalid YAML ({e}).") from e
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/becwright/rules.py` around lines 54 - 63, The YAML parse failure handling
in load_rules currently raises RulesError without preserving the original
traceback. Update the except yaml.YAMLError block to chain the caught exception
when raising RulesError so the original YAML error context is retained; use the
existing load_rules function and RulesError symbol to locate the change.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/becwright/git.py`:
- Around line 53-60: The staged file read in _staged_blob currently runs git
show without any timeout, so a single slow or stuck subprocess can still hang
the commit path. Update the subprocess.run call in _staged_blob to use the same
BECWRIGHT_CHECK_TIMEOUT protection as the rest of the check flow, and handle the
timeout case by returning None or otherwise failing safely so the commit does
not freeze.

In `@src/becwright/rules.py`:
- Around line 32-51: Require the rule parser in the Rule creation flow to reject
missing or empty paths instead of defaulting to an empty tuple. In the parsing
logic that builds the Rule object, validate the raw mapping for a non-empty
paths field alongside id and check, and raise RulesError when paths is absent or
empty so evaluate() never receives a rule that matches nothing. Use the existing
parse/validation function that returns Rule to make this change.

---

Nitpick comments:
In `@src/becwright/catalog.py`:
- Around line 26-33: `read_bec` currently joins the user-controlled `name`
directly into the catalog path, so harden it against path traversal before
resolving the file. Update `read_bec` to validate or normalize `name` so it
cannot contain `/`, `..`, or absolute path components, and ensure the final
resolved path stays under `_catalog_dir()`. Keep the existing `CatalogError`
flow in `read_bec` for invalid or missing entries.

In `@src/becwright/engine.py`:
- Around line 85-96: The timeout handling in the rule execution path re-calls
_check_timeout() twice, causing unnecessary re-reading/parsing of the env var.
In the subprocess.run block inside the rule-checking logic, compute the timeout
once in the surrounding scope and reuse that same value for both the timeout=
argument and the TimeoutExpired message; keep the change localized to the rule
execution flow and the RuleResult append path.

In `@src/becwright/git.py`:
- Around line 63-84: The staged_worktree helper is currently materializing each
file with a separate _staged_blob/git show call, which is inefficient for large
file sets. Update staged_worktree to batch staged-file checkout using a single
git checkout-index invocation with --prefix and --stdin, while preserving the
existing .bec copy behavior and temp-dir cleanup in the try/finally block.
Verify the supported Git version range for checkout-index --stdin and keep the
function’s path handling compatible with the current root/files workflow.

In `@src/becwright/rules.py`:
- Around line 54-63: The YAML parse failure handling in load_rules currently
raises RulesError without preserving the original traceback. Update the except
yaml.YAMLError block to chain the caught exception when raising RulesError so
the original YAML error context is retained; use the existing load_rules
function and RulesError symbol to locate the change.

In `@tests/test_staged_content.py`:
- Around line 1-30: The staged-content tests duplicate the same git/setup
helpers already defined in tests/test_cli_and_git.py, so please remove the copy
in this test module and reuse shared helpers instead. Move or import _git,
_init_repo, and _rules_yaml from a common test support location such as
conftest.py or a shared helper module, then update _setup to call those shared
symbols so both test files stay in sync.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 67e00cea-b541-42c6-8a98-55eadf27ccb8

📥 Commits

Reviewing files that changed from the base of the PR and between 0b13502 and edf81c9.

📒 Files selected for processing (42)
  • .github/workflows/ci.yml
  • .pre-commit-hooks.yaml
  • CLAUDE.md
  • README.es.md
  • README.md
  • becs/README.md
  • documentation/portability.es.md
  • documentation/portability.md
  • documentation/usage.es.md
  • documentation/usage.md
  • integrations/claude-code/skills/becwright/SKILL.md
  • npm/becwright/package.json
  • npm/platforms/darwin-arm64/package.json
  • npm/platforms/darwin-x64/package.json
  • npm/platforms/linux-arm64/package.json
  • npm/platforms/linux-x64/package.json
  • npm/platforms/win32-x64/package.json
  • pyproject.toml
  • src/becwright/__init__.py
  • src/becwright/becs/README.md
  • src/becwright/becs/no-console-log-js.bec.yaml
  • src/becwright/becs/no-dangerous-eval.bec.yaml
  • src/becwright/becs/no-debug-go.bec.yaml
  • src/becwright/becs/no-debug-remnants.bec.yaml
  • src/becwright/becs/no-debug-rust.bec.yaml
  • src/becwright/becs/no-debugger-js.bec.yaml
  • src/becwright/becs/no-hardcoded-secrets.bec.yaml
  • src/becwright/becs/no-token-in-logs.bec.yaml
  • src/becwright/becs/no-wildcard-imports.bec.yaml
  • src/becwright/catalog.py
  • src/becwright/cli.py
  • src/becwright/engine.py
  • src/becwright/git.py
  • src/becwright/report.py
  • src/becwright/rules.py
  • tests/test_bundle.py
  • tests/test_catalog.py
  • tests/test_cli_and_git.py
  • tests/test_engine_integration.py
  • tests/test_list.py
  • tests/test_rules.py
  • tests/test_staged_content.py
💤 Files with no reviewable changes (1)
  • becs/README.md

Comment thread src/becwright/git.py
Comment on lines +53 to +60
def _staged_blob(root: Path, path: str) -> bytes | None:
# `:0:<path>` is the staged (index) version of the file, which is exactly
# what the commit will record — not the working-tree copy that may differ.
res = subprocess.run(
["git", "show", f":0:{path}"],
cwd=root, capture_output=True,
)
return res.stdout if res.returncode == 0 else None

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

No timeout on the per-file git show subprocess call.

This PR adds BECWRIGHT_CHECK_TIMEOUT specifically to stop a hung check from freezing a commit, but _staged_blob's git show call (invoked once per staged file) has no timeout at all. A stuck git process (lock contention, slow clean/smudge filters, network-mounted repo) here would hang the same commit path this feature is meant to protect.

🔒 Suggested fix
 def _staged_blob(root: Path, path: str) -> bytes | None:
     # `:0:<path>` is the staged (index) version of the file, which is exactly
     # what the commit will record — not the working-tree copy that may differ.
     res = subprocess.run(
         ["git", "show", f":0:{path}"],
-        cwd=root, capture_output=True,
+        cwd=root, capture_output=True, timeout=_check_timeout(),
     )
     return res.stdout if res.returncode == 0 else None
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _staged_blob(root: Path, path: str) -> bytes | None:
# `:0:<path>` is the staged (index) version of the file, which is exactly
# what the commit will record — not the working-tree copy that may differ.
res = subprocess.run(
["git", "show", f":0:{path}"],
cwd=root, capture_output=True,
)
return res.stdout if res.returncode == 0 else None
def _staged_blob(root: Path, path: str) -> bytes | None:
# `:0:<path>` is the staged (index) version of the file, which is exactly
# what the commit will record — not the working-tree copy that may differ.
res = subprocess.run(
["git", "show", f":0:{path}"],
cwd=root, capture_output=True, timeout=_check_timeout(),
)
return res.stdout if res.returncode == 0 else None
🧰 Tools
🪛 ast-grep (0.44.0)

[error] 55-58: Command coming from incoming request
Context: subprocess.run(
["git", "show", f":0:{path}"],
cwd=root, capture_output=True,
)
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').

(subprocess-from-request)

🪛 Ruff (0.15.20)

[error] 56-56: subprocess call: check for execution of untrusted input

(S603)


[error] 57-57: Starting a process with a partial executable path

(S607)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/becwright/git.py` around lines 53 - 60, The staged file read in
_staged_blob currently runs git show without any timeout, so a single slow or
stuck subprocess can still hang the commit path. Update the subprocess.run call
in _staged_blob to use the same BECWRIGHT_CHECK_TIMEOUT protection as the rest
of the check flow, and handle the timeout case by returning None or otherwise
failing safely so the commit does not freeze.

Comment thread src/becwright/rules.py
Comment on lines +32 to 51
if not isinstance(raw, dict):
raise RulesError(f"each rule must be a mapping, got {type(raw).__name__}.")
for field in ("id", "check"):
if not raw.get(field):
raise RulesError(f"a rule is missing the required field '{field}'.")
severity = raw.get("severity", "blocking")
if severity not in _VALID_SEVERITIES:
raise RulesError(
f"rule '{raw['id']}': invalid severity {severity!r} "
f"(use one of: {', '.join(_VALID_SEVERITIES)})."
)
return Rule(
id=raw["id"],
paths=tuple(raw.get("paths", [])),
check=raw["check"],
intent=(raw.get("intent") or "").strip(),
why_it_matters=(raw.get("why_it_matters") or "").strip(),
rejected_alternatives=tuple(raw.get("rejected_alternatives", [])),
severity=raw.get("severity", "blocking"),
severity=severity,
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm how empty/missing `paths` is handled during matching in engine.py
rg -n "paths" src/becwright/engine.py -C3

Repository: DataDave-Dev/becwright

Length of output: 435


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the matching logic and the Rule model to determine what empty paths do.
ast-grep outline src/becwright/engine.py --view expanded || true
printf '\n--- engine.py relevant lines ---\n'
rg -n "def matches|matches\(|paths" src/becwright/engine.py -C4

printf '\n--- rules.py model lines ---\n'
rg -n "class Rule|paths:" src/becwright/rules.py -C4

Repository: DataDave-Dev/becwright

Length of output: 1895


Require paths during rule parsing. matches() returns False for an empty tuple, so a rule without paths is accepted as paths=() and then skipped by evaluate() before check ever runs. That turns a bad config into a no-op instead of a RulesError.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/becwright/rules.py` around lines 32 - 51, Require the rule parser in the
Rule creation flow to reject missing or empty paths instead of defaulting to an
empty tuple. In the parsing logic that builds the Rule object, validate the raw
mapping for a non-empty paths field alongside id and check, and raise RulesError
when paths is absent or empty so evaluate() never receives a rule that matches
nothing. Use the existing parse/validation function that returns Rule to make
this change.

Source: Learnings

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant