Skip to content

chore(agents): add Windows autoreview harness support#592

Merged
vincentkoc merged 1 commit into
openclaw:masterfrom
paulcam206:paulcam/autoreview-windows-support
May 30, 2026
Merged

chore(agents): add Windows autoreview harness support#592
vincentkoc merged 1 commit into
openclaw:masterfrom
paulcam206:paulcam/autoreview-windows-support

Conversation

@paulcam206
Copy link
Copy Markdown
Contributor

Summary

  • Replace the autoreview helper's custom executable lookup with shutil.which() and a clear missing-engine diagnostic.
  • Extract the Bash smoke harness logic into a shared Python implementation, then keep Bash and PowerShell wrappers over it.
  • Preserve existing --parallel-tests shell behavior by default while adding explicit --parallel-tests-shell powershell|pwsh support.

Validation

  • .\build.ps1
  • dotnet test .\tests\OpenClaw.Shared.Tests\OpenClaw.Shared.Tests.csproj --no-restore — 2027 passed, 29 skipped
  • dotnet test .\tests\OpenClaw.Tray.Tests\OpenClaw.Tray.Tests.csproj --no-restore — 868 passed
  • Stubbed harness smoke with fake codex.cmd through Python and PowerShell wrappers
  • AUTOREVIEW_ALLOW_UNSANDBOXED_TOOLS=1 python .agents\skills\autoreview\scripts\autoreview --mode local --engine copilot — clean

Extract the autoreview smoke harness into Python so Bash and PowerShell wrappers can share the same fixture logic. Use PATHEXT-aware command resolution for reviewer CLIs and make parallel test shell selection explicit while preserving existing defaults.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented May 29, 2026

Codex review: needs maintainer review before merge. Reviewed May 29, 2026, 7:07 PM ET / 23:07 UTC.

Summary
The PR updates the repo-local autoreview helper and smoke harness for Windows-friendly command resolution, a shared Python harness with Bash/PowerShell wrappers, and explicit parallel-test shell selection.

Reproducibility: not applicable. this is a tooling enhancement PR, not a bug report with a failing current-main reproduction path. The relevant checks are source review and the PR body's reported helper/harness validation.

Review metrics: 2 noteworthy metrics.

  • Patch scope: 5 files affected; 308 added, 183 deleted. The change is confined to the repo-local autoreview skill, helper, and harness wrappers.
  • Reported repo tests: 2,895 passed, 29 skipped. The PR body reports the AGENTS.md-required shared and tray test suites with concrete pass counts.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • none.

Next step before merge

  • No ClawSweeper repair lane is needed because the patch has no actionable correctness or security finding.

Security
Cleared: The diff adds no dependency, secret, permission, or network surface; subprocess use remains limited to explicit reviewer/test command invocation and local fixture setup.

Review details

Best possible solution:

Land the scoped tooling update after normal merge checks accept the reported validation, keeping the release-owned changelog untouched.

Do we have a high-confidence way to reproduce the issue?

Not applicable; this is a tooling enhancement PR, not a bug report with a failing current-main reproduction path. The relevant checks are source review and the PR body's reported helper/harness validation.

Is this the best way to solve the issue?

Yes; extracting the smoke harness into shared Python while keeping Bash and PowerShell wrappers is a narrow maintainable path for Windows support, and the new shell selector preserves the existing default behavior.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 8af4b70360a9.

Label changes

Label changes:

  • add P3: This is low-risk internal agent tooling and documentation work rather than a product runtime bug or urgent regression.
  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-change terminal-style validation for the repo build/tests, Windows wrapper smoke with fake codex.cmd, and a local autoreview run, so no additional contributor proof is needed.
  • add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes after-change terminal-style validation for the repo build/tests, Windows wrapper smoke with fake codex.cmd, and a local autoreview run, so no additional contributor proof is needed.

Label justifications:

  • P3: This is low-risk internal agent tooling and documentation work rather than a product runtime bug or urgent regression.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes after-change terminal-style validation for the repo build/tests, Windows wrapper smoke with fake codex.cmd, and a local autoreview run, so no additional contributor proof is needed.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-change terminal-style validation for the repo build/tests, Windows wrapper smoke with fake codex.cmd, and a local autoreview run, so no additional contributor proof is needed.
Evidence reviewed

What I checked:

  • Repository policy read: AGENTS.md was read fully; its validation guidance is relevant to evaluating the PR body's reported build and test results, while this review remains read-only. (AGENTS.md:1, 8af4b70360a9)
  • Current master lacks this support: Current master has only the existing autoreview skill files and no parallel-tests-shell option, PowerShell harness wrapper, shared Python harness, or resolve_command helper. (8af4b70360a9)
  • PR command resolution change: The PR commit uses shutil.which-based resolution for reviewer engine binaries and gh base detection, matching the Windows PATHEXT goal without adding dependencies. (.agents/skills/autoreview/scripts/autoreview:295, d204f9fb9590)
  • PR harness implementation: The Bash harness becomes a thin Python launcher, the PowerShell wrapper forwards validated parameters, and the shared Python harness preserves the malicious/benign fixture behavior. (.agents/skills/autoreview/scripts/test-review-harness.py:148, d204f9fb9590)
  • Documentation update: The skill docs document Windows invocation, PowerShell harness usage, and the new parallel-tests-shell option. (.agents/skills/autoreview/SKILL.md:106, d204f9fb9590)
  • Whitespace check: The PR diff has no git diff --check whitespace errors across the touched autoreview files. (d204f9fb9590)

Likely related people:

  • Vincent Koc: git log and blame show the current .agents/skills/autoreview helper, docs, and Bash smoke harness were introduced together in f6a2d5d. (role: introduced current autoreview helper; confidence: high; commits: f6a2d5d18f5c; files: .agents/skills/autoreview/SKILL.md, .agents/skills/autoreview/scripts/autoreview, .agents/skills/autoreview/scripts/test-review-harness)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P3 Low-risk cleanup, docs, polish, ergonomics, or speculative feature. labels May 29, 2026
Copy link
Copy Markdown
Member

@vincentkoc vincentkoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maintainer pass: scoped autoreview tooling update. Checked PR diff, no whitespace errors, helper help exposes --parallel-tests-shell, and the Bash smoke wrapper help runs from the PR head. ClawSweeper reported no actionable findings.

@vincentkoc vincentkoc merged commit b163d74 into openclaw:master May 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P3 Low-risk cleanup, docs, polish, ergonomics, or speculative feature. proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants