feat(code-audit): add PR auditor as second code-audit slice by W00DSRULES · Pull Request #18 · DavidHavoc/openworkers

W00DSRULES · 2026-05-15T14:59:10Z

`openworkers audit pr <github-url>` extracts atomic claims from a PR description and assigns each a verdict against the actual unified diff. It uses the same pipeline shape as the README auditor (planner → deterministic researcher → checker + trust gate → critic), which shows that the `SourceAdapter` abstraction generalises to a second backend.

New modules:

core/sources/github.py — GitHubAdapter over a unified diff via a PrSpec value object; live fetch (fetch_pr_from_github) is a sibling helper, decoupled from the adapter so tests stay network-free. parse_pr_url + load_pr_fixture as supporting helpers.
core/orchestrator/pr_flow.py — PrAuditOrchestrator.
core/orchestrator/audit_prompts.py — shared audit-prompt loader (extracted from readme_flow.py so each new auditor registers its templates in one place).
prompts/code_audit/pr_planner.md + pr_checker.md — PR-specific claim types (add | remove | fix | refactor | test | behavior | doc | other) and diff-aware verdict rules.
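To illustrate why keeping the adapter separate from the live fetch keeps tests network-free, here is a minimal sketch of an adapter that searches an in-memory unified diff. `PrSpecSketch`, `DiffHit`, `DiffAdapterSketch`, and the `grep` signature are hypothetical stand-ins, not the actual classes in core/sources/github.py, whose fields and methods are not shown in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrSpecSketch:
    """Hypothetical value object: everything the adapter needs, already fetched."""

    url: str
    description: str
    diff: str  # unified diff text


@dataclass(frozen=True)
class DiffHit:
    path: str
    line: str  # changed line, including its leading '+' or '-'


class DiffAdapterSketch:
    """Searches the diff held by the spec; performs no network I/O."""

    def __init__(self, spec: PrSpecSketch) -> None:
        self._spec = spec

    def grep(self, needle: str) -> list[DiffHit]:
        hits: list[DiffHit] = []
        path = ""
        for raw in self._spec.diff.splitlines():
            if raw.startswith("+++ "):
                # "+++ b/<path>" names the file on the new side of the diff.
                target = raw[4:]
                path = target[2:] if target.startswith("b/") else target
            elif raw.startswith("--- "):
                # Old-side file header. Simplification: a removed line whose
                # content starts with "-- " would also be skipped here.
                continue
            elif raw.startswith(("+", "-")) and needle in raw[1:]:
                hits.append(DiffHit(path, raw))
        return hits


SAMPLE_DIFF = "\n".join(
    [
        "diff --git a/widget.py b/widget.py",
        "--- a/widget.py",
        "+++ b/widget.py",
        "@@ -1,2 +1,2 @@",
        "-TIMEOUT = 5",
        "+TIMEOUT = 30",
        ' print("ok")',
    ]
)
```

Because the adapter only sees a value object, a fixture loader and a live fetcher can both produce the same input, and the adapter never needs to know which one was used.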

Schema:

core/schemas_audit.py exposes AuditClaim / AuditClaimList as aliases of ReadmeClaim / ReadmeClaimList so PR code reads cleanly without churning the README slice. AuditReport.target captures the audited artefact id (PR URL, etc.).
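The aliasing described above can be sketched as plain name bindings. The fields on `ReadmeClaim` and `ReadmeClaimList` below are assumptions made for illustration, and dataclasses are used only to keep the sketch standard-library only; the real models in core/schemas_audit.py may be defined differently.

```python
from dataclasses import dataclass, field


@dataclass
class ReadmeClaim:
    # Hypothetical fields; the real schema is not shown in this PR description.
    text: str
    claim_type: str = "other"


@dataclass
class ReadmeClaimList:
    claims: list[ReadmeClaim] = field(default_factory=list)


# Plain assignment binds a second name to the same class object, so PR code
# can use the neutral names while the README slice keeps its originals.
AuditClaim = ReadmeClaim
AuditClaimList = ReadmeClaimList
```

Since an alias is the same class, `isinstance` checks and serialisation behave identically under either name, which is what lets the README slice stay untouched.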

Agents:

providers/code_audit_agents.py adds PrPlannerAgent + PrCheckerAgent alongside the README versions. The same _enforce_trust_gate runs after the LLM responds, so any claim with no diff evidence is forced to `unsupported` regardless of LLM output. AuditCriticAgent is reused as-is.
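The trust-gate rule can be sketched as a deterministic post-processing step. This is an illustration under assumed names: `CheckedClaim` and `enforce_trust_gate_sketch` are hypothetical and do not reproduce the real `_enforce_trust_gate` signature, which is not shown here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CheckedClaim:
    # Hypothetical result shape: the claim text, the verdict the LLM returned,
    # and the diff lines cited as evidence.
    claim: str
    verdict: str
    evidence: tuple[str, ...] = ()


def enforce_trust_gate_sketch(results: list[CheckedClaim]) -> list[CheckedClaim]:
    """Override the verdict of any claim that cites no diff evidence."""
    gated: list[CheckedClaim] = []
    for result in results:
        if not result.evidence:
            # No evidence: the LLM's verdict is discarded, whatever it was.
            result = replace(result, verdict="unsupported")
        gated.append(result)
    return gated
```

Running the gate after the model responds means a hallucinated "verified" cannot survive without at least one cited diff line, independent of prompt wording.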

CLI:

`openworkers audit pr <url>` (or `--fixture <dir>` for offline runs) with optional `--token` falling back to GITHUB_TOKEN / GH_TOKEN.
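The token fallback might look roughly like the following. `resolve_token_sketch` is a hypothetical helper, and checking GITHUB_TOKEN before GH_TOKEN is an assumption based on the order in which the two variables are listed above.

```python
import os
from typing import Mapping, Optional


def resolve_token_sketch(
    cli_token: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the explicit --token value, else the first non-empty env var."""
    if cli_token:
        return cli_token
    # Assumed precedence: GITHUB_TOKEN first, then GH_TOKEN.
    for name in ("GITHUB_TOKEN", "GH_TOKEN"):
        value = env.get(name)
        if value:
            return value
    # No token found; the caller would proceed unauthenticated.
    return None
```

Passing the environment as a parameter keeps the helper testable without mutating `os.environ`.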

Tests:

tests/fixtures/sample_pr/ — canned PR JSON + unified diff with one verified, one drifted, one contradicted, one fabricated (no-evidence) claim.
tests/code_audit/test_pr_flow.py — URL parser, fixture loader, adapter grep, end-to-end audit with verdict distribution, and an explicit trust-gate-override assertion for the hallucinated WIDGETLIB_DEBUG claim.
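As a rough idea of what the URL-parser cases cover, here is a sketch of a parser for GitHub pull request URLs. `PrRef` and `parse_pr_url_sketch` are hypothetical; the actual `parse_pr_url` return type and error behaviour are not described in this PR, so both are assumptions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PrRef:
    owner: str
    repo: str
    number: int


# Accepts https://github.com/<owner>/<repo>/pull/<number>, optionally followed
# by a sub-path, query string, or fragment (for example "/files").
_PR_URL = re.compile(
    r"^https?://github\.com/(?P<owner>[^/\s]+)/(?P<repo>[^/\s]+)"
    r"/pull/(?P<number>\d+)(?:[/?#].*)?$"
)


def parse_pr_url_sketch(url: str) -> PrRef:
    match = _PR_URL.match(url.strip())
    if match is None:
        # Assumed behaviour: reject anything that is not a pull request URL.
        raise ValueError(f"not a GitHub pull request URL: {url!r}")
    return PrRef(match["owner"], match["repo"], int(match["number"]))
```

Rejecting issue URLs and malformed numbers up front keeps a bad argument from reaching the fetch step.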

Docs: README.md, ROADMAP.md, CHANGELOG.md, AGENTS.md updated with the new slice, shared pipeline section, and PR-audit-flow walkthrough.

Verification: 159/159 tests pass (153 existing + 6 new PR tests), mypy strict clean on new modules, ruff clean on new files, black formatted. CLI smoke-runs in text and JSON modes against the fixture.

Testing

pytest tests/ -v passes
ruff check . passes
black --check . passes
mypy core/ providers/ --strict --ignore-missing-imports passes

Checklist

I have read CONTRIBUTING.md
Code follows project style (black, ruff, mypy strict for core/providers)
Tests added or updated for the changes
Documentation updated if needed

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

DavidHavoc merged commit c8e21ce into main May 15, 2026
5 checks passed

DavidHavoc merged 1 commit into main from feat/pr-auditor