
feat: Policy RFC Engine (proposal-only)#39

Open
Freeak88 wants to merge 1 commit into openclaw:main from Freeak88:feat/policy-rfc-engine

@Freeak88 commented May 5, 2026

Summary

Adds a conservative Policy RFC Engine that scans existing ClawSweeper durable records and generates reviewable RFC-style policy proposals from repeated operational patterns.

This is proposal-only infrastructure. It does not mutate GitHub, dispatch repairs, close issues, alter apply behavior, or change scheduler/automerge paths.

What changed

  • Added src/policy-rfc/ with collector, scorer, synthesizer, types, and CLI runner.
  • Added pnpm run policy-rfc.
  • Added focused tests for pattern extraction, scoring, rejection of one-offs, deterministic RFC generation, JSON output, and malformed-record tolerance.
  • Added docs/policy-rfc-engine.md.

How to try

pnpm run policy-rfc -- --target-repo openclaw/openclaw

Generates RFC proposals from local durable records without mutating GitHub.

Validation

Passed:

  • pnpm install
  • pnpm run build
  • pnpm run build:all
  • node --test test/policy-rfc.test.ts
  • targeted oxlint
  • targeted oxfmt

Full pnpm test reaches the build step successfully and then fails in this local Windows / Node 22.19.0 environment. The package requires Node >=24. Visible failures include path separator / CRLF assertions, Codex spawn EPERM, pnpm ENOENT in validation subprocesses, and one repair hydration assertion. The new Policy RFC tests pass.

RFC and JSON outputs are deterministic for the same input records to ensure stable diffs and testability.

Safety

The first version is intentionally local-record only and proposal-only. It does not hook into scheduler lanes or GitHub mutation paths.

Non-Goals

  • No automatic policy execution
  • No scheduler integration
  • No GitHub mutation
  • No additional load on review shards

@steipete (Contributor) commented May 5, 2026

Thanks for the substantial proposal. I am not treating this as landable in the current queue. It adds a new product surface and about a thousand lines of new policy-generation code, while the latest visible checks only include the notify workflow and the PR body says full local testing was not run under this repo's required Node >=24 environment.

Before this could move forward, it needs an explicit maintainer/product decision that ClawSweeper should own a Policy RFC Engine at all, plus a current rebase and full pnpm run check on Node 24+. Until then, it should stay out of the main maintenance/perf batch.

@ds4psb-ai left a comment

Independent code review — Policy RFC Engine

Posted on behalf of @ds4psb-ai (read-only contributor). Maintainer judgment owns merge. The product-surface question @steipete already raised is upstream of everything below; these are quality findings if you do choose to land it.

Safety audit summary

  • zero GitHub mutation paths: gh pr diff 39 | grep -E '(ghJson|gh api|ghMutate|gh pr|gh issue|gh label|github\.com|fetch\(|http:|https://api|axios|node-fetch)' returns 0 matches across all 5 source files. The engine is genuinely IO-local.
  • output scoped to results/policy-rfc/ — only mkdirSync(outputDir, ...) plus two writeFileSync calls in src/policy-rfc/index.ts:466-473, both under join(options.outputRoot, profile.slug).
  • [~] imports — PR description implies ../repository-profiles.js is the only cross-module import. Actually three are used: repository-profiles.js, clawsweeper-args.js, stable-json.js. All three are pure local helpers with no GitHub side effects, so the safety claim still holds, but the PR body is technically incomplete. Worth a one-line fix to the description.
  • no scheduler/apply/automerge integration — no edits to clawsweeper.ts, repair/, sweep.yml, or any apply path. New code is reachable only via the new pnpm run policy-rfc script.
  • package.json change is minimal — single + "policy-rfc": "node dist/policy-rfc/index.js", line; no other script touched.

Net: the proposal-only safety boundary is upheld in code. Good.

Determinism check

The PR claims: "RFC and JSON outputs are deterministic for the same input records to ensure stable diffs and testability." This is only partially true, and the test suite hides the gap.

Two non-deterministic sources reach the CLI's actual output bytes:

  1. src/policy-rfc/synthesizer.ts:14: const createdAt = options.createdAt ?? new Date().toISOString();
  2. src/policy-rfc/scorer.ts:44: now: options.now ?? new Date()

The CLI in src/policy-rfc/index.ts:486-491 only forwards createdAt when --created-at is passed (and never forwards a now to the scorer at all). Default pnpm run policy-rfc invocations therefore produce:

  • A fresh created_at in every JSON proposal, and a different "Generated by ClawSweeper Policy RFC Engine at ..." footer in every markdown.
  • A confidenceScore that drifts as recencyScore decays: the same input records yield a different confidence_score field in JSON and a different Confidence score: line in markdown.

The tests pin both now and createdAt, so they pass. There is no test that asserts "two consecutive CLI runs produce identical RFC bytes," which is exactly what the PR claims to guarantee.

Suggested fix: have runPolicyRfc resolve a single createdAt once and pass it down as now to the scorer (or freeze now to the latest observedAt in the input — that would also remove time-based drift for replayability). Then add a test that runs the engine twice against a fixture and asserts byte-equal .md and .json.
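A minimal sketch of the "freeze now to the latest observedAt" option, to make the suggestion concrete. The EvidenceRecord and ClockOptions shapes and the resolveRunTimestamp helper are hypothetical names for illustration, not code from the PR; only the createdAt, now, and observedAt names come from the review above.

```typescript
// Hypothetical shapes; the real types in src/policy-rfc/types.ts may differ.
interface EvidenceRecord {
  observedAt: string; // ISO-8601 timestamp taken from the durable record
}

interface ClockOptions {
  createdAt?: string; // explicit override, e.g. from --created-at
}

// Resolve one timestamp per run. An explicit override wins; otherwise use the
// latest parseable observedAt so identical records give identical output.
function resolveRunTimestamp(
  records: EvidenceRecord[],
  options: ClockOptions = {},
): string {
  if (options.createdAt !== undefined) return options.createdAt;
  let latest = Number.NEGATIVE_INFINITY;
  for (const record of records) {
    const ms = Date.parse(record.observedAt);
    if (!Number.isNaN(ms) && ms > latest) latest = ms;
  }
  // Assumption: a fixed epoch fallback for empty or unparseable input,
  // chosen only because it never reads the wall clock.
  return new Date(Number.isFinite(latest) ? latest : 0).toISOString();
}

const records: EvidenceRecord[] = [
  { observedAt: "2026-05-01T10:00:00.000Z" },
  { observedAt: "2026-05-03T08:30:00.000Z" },
  { observedAt: "not-a-date" }, // malformed values are skipped, not fatal
];

const createdAt = resolveRunTimestamp(records);
const now = new Date(createdAt); // the same instant would be handed to the scorer
console.log(createdAt, now.toISOString());
```

With a helper along these lines, the synthesizer footer and the scorer's recency term would both derive from the input records, so a byte-equality test over two consecutive runs becomes meaningful.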

Top findings

P1 / should fix before merge

  • src/policy-rfc/collector.ts:206-208: itemFromPath mishandles closed/ records. The regex \/items\/([^/.]+)\./ only matches the items/ subtree. AGENTS.md explicitly documents records/<repo-slug>/closed/<number>.md as the archive layout and the engine is designed to mine durable evidence — i.e. exactly the closed records. For closed records, item falls back to relativePath, which (a) inflates distinctItems (every closed record looks like its own item) and silently helps a pattern clear the minDistinctItems gate for the wrong reason, and (b) makes evidence lines harder to read. Suggested: extend to \/(?:items|closed)\/([^/.]+)\./ or strip .md/.json from any leaf segment.

  • Determinism claim vs. CLI defaults (see above). I'd treat this as P1 because the PR body explicitly markets determinism as a property.
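For the closed/ finding, the extended regex can be sketched as follows. This is a hypothetical re-implementation of itemFromPath, since the original function body is not quoted in this review; the backslash normalization is an extra assumption and not part of the suggestion above.

```typescript
// Hypothetical re-implementation; the PR's actual itemFromPath body is not shown here.
function itemFromPath(relativePath: string): string {
  // Assumption: normalize Windows separators so the regex sees forward slashes.
  const normalized = relativePath.replace(/\\/g, "/");
  // Accept both the active items/ subtree and the closed/ archive layout.
  const match = /\/(?:items|closed)\/([^/.]+)\./.exec(normalized);
  // Keep the review's described fallback: the relative path itself.
  return match ? match[1] : relativePath;
}

// Illustrative paths only; "example-repo" is a made-up slug.
const paths = [
  "records/example-repo/items/12.md",
  "records/example-repo/closed/12.md",
  "records/example-repo/closed/34.json",
];

// With closed/ recognized, item 12 is counted once across both subtrees.
const distinctItems = new Set(paths.map(itemFromPath));
console.log([...distinctItems]);
```

Under the items-only regex, the two closed/ paths would each fall back to their full relative path, giving three distinct items instead of two.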

P2 / nice to have

  • src/policy-rfc/scorer.ts:130-145: proposedAction switch has no default arm. Today the switch is exhaustive, so TS infers string. The day someone adds a seventh PolicyPatternType, this silently returns undefined and propagates into proposedAction: string in the JSON. A default: throw new Error(...) or a satisfies never exhaustiveness check would surface the regression at compile/run time.

  • src/policy-rfc/collector.ts:210-223: safeRead / safeReadDir / safeIsDirectory swallow every error class. Treating EACCES and EMFILE the same as ENOENT means a permissions-broken records dir produces an empty proposal set with zero diagnostic. At minimum, log error.code to console.warn for non-ENOENT failures so an operator knows the scan was incomplete.

  • src/policy-rfc/collector.ts:277-285: jsonStringValues regex. "${key}"\s*:\s*"([^"]+)" truncates at the first internal \" and silently misses values that span lines. For current ClawSweeper records it's probably fine, but if anyone ever writes JSON evidence with escaped quotes, the extracted value will be wrong, not absent. Consider parsing with JSON.parse when the file ends in .json, regex only as a markdown fallback.

  • src/policy-rfc/collector.ts:302-304: hasSuccessfulOutcome is a permissive substring match. It will mark a record containing the phrase "validation pass failed" as successfulOutcome: true. The success term in the confidence formula then over-rewards noisy records. Consider anchoring (e.g. require the marker to live in a structured result: / outcome: line, or in JSON "status": "...").

  • src/policy-rfc/collector.ts:329-336: frontMatterStringArray JSON failure path returns [] instead of falling back to comma-split. A bracketed but malformed labels line drops all labels for that record rather than degrading. A try { ... } catch { /* fall through */ } instead of return [] would be safer.
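Two of the P2 suggestions are easy to show in a few lines: the exhaustiveness guard for proposedAction and the fall-through parse for frontMatterStringArray. Both functions below are illustrative sketches under stated assumptions. The two PolicyPatternType members and the returned strings are made up, because the PR's six real members are not listed in this review, and the parsing details are a guess at the intent rather than the PR's code.

```typescript
// Hypothetical members; the PR's real PolicyPatternType union is not quoted in the review.
type PolicyPatternType = "repeated-label" | "repeated-close-reason";

// Accepts only `never`, so an unhandled union member becomes a compile error.
function assertNever(value: never): never {
  throw new Error(`Unhandled PolicyPatternType: ${String(value)}`);
}

function proposedAction(type: PolicyPatternType): string {
  switch (type) {
    case "repeated-label":
      return "Propose a labeling policy";
    case "repeated-close-reason":
      return "Propose a closing policy";
    default:
      // Reached only if an unknown value slips through at run time; then it throws.
      return assertNever(type);
  }
}

// Parse a front-matter list value. Bracketed input is tried as JSON first;
// on failure we fall through to comma-splitting instead of returning [].
function frontMatterStringArray(raw: string): string[] {
  const trimmed = raw.trim();
  if (trimmed.startsWith("[")) {
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (Array.isArray(parsed)) return parsed.map(String);
    } catch {
      /* fall through to the comma-split path */
    }
  }
  return trimmed
    .replace(/^\[|\]$/g, "") // drop a leading "[" and a trailing "]" if present
    .split(",")
    .map((part) => part.trim().replace(/^["']|["']$/g, "")) // strip stray quotes
    .filter((part) => part.length > 0);
}

console.log(proposedAction("repeated-label"));
console.log(frontMatterStringArray('["bug", "p1"]'));
console.log(frontMatterStringArray('[bug, "p1"')); // malformed: degrades to comma-split
```

The design point in both cases is the same: an unexpected input should either fail loudly (the switch) or degrade to a weaker but non-empty result (the labels), rather than silently producing undefined or an empty list.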

Recommendation

Proposal-only safety boundary is upheld and the architecture is clean. The maintainer call about whether ClawSweeper should own this surface is the dominant gate (per @steipete's earlier comment); from a pure code-quality lens this PR is in good shape.

If you do choose to merge: I'd ask for the determinism fix and the closed/ regex fix as P1 blockers, since both are claims-vs-code mismatches that the test suite happens not to catch. The P2 items are quality-of-life and can land later.

Not approving (read-only contributor); flagging as a comment for maintainer judgment.

ds4psb-ai pushed a commit to ds4psb-ai/clawsweeper that referenced this pull request May 8, 2026
The Policy RFC engine is proposal-only, but its generated artifacts must be reproducible enough for maintainers to review and diff safely. Derive default timestamps from observed evidence instead of wall-clock time, and preserve archived closed-record item numbers alongside active items.

Constraint: PR openclaw#39 was reviewed with two P1 launch blockers around nondeterminism and closed-record parsing.

Rejected: Require every caller to pass timestamps | exported helpers are still public test surfaces and should be safe by default.

Confidence: high

Scope-risk: narrow

Directive: Do not reintroduce wall-clock defaults in policy-rfc generation without deterministic tests.

Tested: pnpm run build:all; node --test test/policy-rfc.test.ts