Require proof before superseded PR closeout by jesse-merhi · Pull Request #196 · openclaw/clawsweeper

jesse-merhi · 2026-05-25T06:10:02Z

Summary

Require hydrated PR A / PR B proof before normal apply can close PR A as superseded.
Add compact shared supersession proof helpers, prompt templates, and schema-backed model output checks.
Gate repair replacement closeout on source PR security/context checks and fall back to link-only when proof is missing or blocked.

Validation

fnm exec --using v24.14.1 pnpm run format
fnm exec --using v24.14.1 pnpm run build:all
fnm exec --using v24.14.1 node --test --test-name-pattern 'PRs superseded by linked pull requests|linked supersession proof is incomplete|same-file supersession is incomplete|security-labeled PRs open|comments contain security markers|supersession proof prompt' test/clawsweeper.test.ts
fnm exec --using v24.14.1 node --test test/repair/execute-fix-artifact-source.test.ts test/repair/execute-fix-github.test.ts
fnm exec --using v24.14.1 pnpm run check
git diff --check --cached

clawsweeper · 2026-05-25T06:11:16Z

Codex review: needs maintainer review before merge. Reviewed May 25, 2026, 2:14 AM ET / 06:14 UTC.

Summary
The PR adds model-backed supersession proof prompts, schema, shared helpers, and apply/repair gates before ClawSweeper closes source PRs as superseded.

Reproducibility: yes. by source inspection. Current main can promote linked supersession closeout from durable report links without hydrating and comparing PR A and PR B; I did not run a live apply repro.

Review metrics: 2 noteworthy metrics.

Changed surface: 10 files, +1612/-35. The PR touches both normal apply close promotion and repair replacement closeout, so maintainers should review the integrated automation behavior.
Proof contract added: 2 prompts, 1 schema, 1 shared helper module. These files define the model contract that decides whether a source PR can be closed as superseded.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🌊 off-meta tidepool
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

Resolve the dirty merge state and rerun the declared Node 24 validation commands.

Risk before merge

GitHub reports this PR as dirty, so maintainers need to resolve conflicts and review the final merge result before landing.
The PR changes live apply/repair closeout automation and adds a model-backed proof subprocess; maintainers should confirm the fail-closed/link-only fallback is the intended operator behavior.

Maintainer options:

Refresh Before Merge (recommended)
Resolve the dirty merge state, rerun the declared Node 24 checks, and review the final apply/repair closeout behavior on the merged result.
Accept Link-Only Fallback
Maintainers can accept that missing or blocked proof will keep source PRs open with a replacement link instead of allowing superseded closeout.

Next step before merge
Maintainer review should resolve the dirty branch and decide whether the new model-backed proof gate is acceptable for live closeout automation.

Security
Cleared: No concrete security or supply-chain concern found; the diff adds no dependencies or workflow permission changes and routes security-sensitive source PRs away from auto-close.

Review details

Best possible solution:

Resolve the dirty merge state, rerun the declared Node 24 checks, and land the proof-gated closeout path if maintainers accept the added model proof step.

Do we have a high-confidence way to reproduce the issue?

Yes by source inspection. Current main can promote linked supersession closeout from durable report links without hydrating and comparing PR A and PR B; I did not run a live apply repro.

Is this the best way to solve the issue?

Yes, the proposed fail-closed proof gate is a conservative way to solve the closeout safety issue, but the dirty merge state must be resolved before the final implementation can be trusted.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 77e35f078a76.

Label changes

Label changes:

add P2: This is meaningful automation safety hardening with a bounded apply/repair-lane blast radius.
add merge-risk: 🚨 automation: Merging changes live closeout automation and adds a model proof subprocess that unit tests cannot fully validate against production GitHub state.
add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🌊 off-meta tidepool and patch quality is 🐚 platinum hermit.
add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Not applicable: Real behavior proof is not a contributor gate here because the PR author association is MEMBER.

Label justifications:

P2: This is meaningful automation safety hardening with a bounded apply/repair-lane blast radius.
merge-risk: 🚨 automation: Merging changes live closeout automation and adds a model proof subprocess that unit tests cannot fully validate against production GitHub state.
rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🌊 off-meta tidepool and patch quality is 🐚 platinum hermit.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Not applicable: Real behavior proof is not a contributor gate here because the PR author association is MEMBER.

Evidence reviewed

What I checked:

repository policy read: Read the full target AGENTS.md; its conservative apply-lane and maintainer-authored item guidance applies to this review. (AGENTS.md:1, 77e35f078a76)
current closeout behavior: Current main detects linked supersession candidates from durable review report links and promotes closeout without hydrated PR A / PR B coverage proof. (src/clawsweeper.ts:10100, 77e35f078a76)
repair closeout behavior: Current repair replacement closeout comments and closes source PRs after basic state checks, before any source/replacement proof comparison. (src/repair/execute-fix-artifact.ts:1766, 77e35f078a76)
PR diff scope: The PR adds supersession proof prompts/schema/helpers and wires them into apply and repair closeout paths, with focused tests around incomplete proof and security-sensitive source PRs. (src/supersession-proof.ts:1, a29dcdfe5663)
mergeability: GitHub API reports mergeable=false and mergeable_state=dirty for Require proof before superseded PR closeout #196. (a29dcdfe5663)
history provenance: The linked supersession close lane traces to commit 97885cf; repair replacement closeout traces to commit 4ea930a. (src/clawsweeper.ts:10040, 97885cf4e291)

Likely related people:

Peter Steinberger: Git blame and git log show the current stale PR close-promotion lane and repair replacement closeout helpers were introduced in recent commits by Peter Steinberger. (role: introduced behavior and recent area contributor; confidence: high; commits: 97885cf4e291, 4ea930af2074; files: src/clawsweeper.ts, src/repair/execute-fix-artifact.ts, src/repair/execute-fix-github.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Copilot

Pull request overview

This PR strengthens the “superseded PR” auto-close flow by requiring explicit, schema-validated “supersession proof” (plus security/context gating) before ClawSweeper can close a source PR as superseded by a linked replacement PR.

Changes:

Adds shared supersession proof utilities + schema parsing/normalization to ensure “superseded” decisions have concrete coverage evidence.
Updates apply/repair flows to hydrate PR context (labels/comments/files/body) and gate closeout on security signals and model proof; falls back to link-only when blocked or incomplete.
Expands tests and mocks to cover proof-required behavior, security gating, and prompt/content constraints.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`src/supersession-proof.ts`	New shared helpers for compact PR views + schema-backed parsing/normalization of supersession proof output.
`src/clawsweeper.ts`	Adds supersession-proof runtime/prompt/model execution and blocks “superseded close” promotions unless proof allows it.
`src/repair/execute-fix-github.ts`	Hydrates PR view with labels/comments/files/body and introduces a security gate helper for source PRs.
`src/repair/execute-fix-artifact.ts`	Adds replacement closeout proof runner and enforces security + proof checks before `gh pr close`; otherwise link-only.
`schema/clawsweeper-supersession-proof.schema.json`	New JSON schema for model output validation.
`prompts/supersession-proof.md`	New prompt template for general supersession proof in apply flow.
`prompts/repair/replacement-closeout-proof.md`	New prompt template for repair replacement closeout proof.
`test/clawsweeper.test.ts`	Adds mocks + tests covering proof-required close behavior and security/incomplete-proof keep-open behavior.
`test/repair/execute-fix-github.test.ts`	Adds unit tests for the source PR security gating helper.
`test/repair/execute-fix-artifact-source.test.ts`	Adds source-level regression tests ensuring security/proof checks precede closeout calls and prompt changes are in place.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

 export function fetchSourcePullRequestView({
  repo,
  number,
  targetDir,
 }: {
  repo: string;
  number: JsonValue;
  targetDir: string;
 }): LooseRecord {
  return JSON.parse(
    run(
      "gh",
-      ["pr", "view", String(number), "--repo", repo, "--json", "author,state,mergedAt,title,url"],
+      [
+        "pr",
+        "view",
+        String(number),
+        "--repo",
+        repo,
+        "--json",
+        "author,state,mergedAt,title,url,body,labels,comments,files,headRefOid,updatedAt",
+      ],


+  } catch {
+    return { close: false, reason: "replacement closeout proof output was invalid JSON" };


clawsweeper · 2026-05-25T06:18:48Z

ClawSweeper PR egg

✨ Hatched: 🥚 common Tiny Diff Drake

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

Merged PRs are hatchable.
Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: sparkles near resolved comments.
Image traits: location artifact grotto; accessory CI status badge; palette pearl, teal, and neon green; mood proud; pose leaning over a miniature review desk; shell soft speckled shell; lighting warm desk-lamp glow; background delicate sparkle particles.
Share on X: post this hatch
Copy: My PR egg hatched a 🥚 common Tiny Diff Drake in ClawSweeper.

What is this egg doing here?

Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

nxmxbbd · 2026-05-25T18:56:00Z

Adding evidence that the supersession-proof gate in this PR would catch a real
case beyond the closed-unmerged replacement scenario.

openclaw/openclaw#86590 was auto-closed on 2026-05-25 as
duplicate_or_superseded by the merged #85063, but current main still lacks
the follow-up surfaces #86590 was adding:

SessionMcpRuntime.getOmittedServers() and the OmittedMcpServerReason enum
connect-timeout and list-tools-failed omitted-server reasons on
materialized tools
the active-lease rebind guard message
bundle-mcp runtime busy; cannot rebind workspace/config while active
compact's managed-runtime reuse via getOrCreateSessionMcpRuntime(...) in
src/agents/pi-embedded-runner/compact.ts
the compact runtime-context omit for empty sandboxSessionKey

The ClawSweeper review body on #86590 explicitly said "Keep open for maintainer
review: ... sufficient after-fix proof and no blocking code-review findings"
and emitted ,
but the close-applied marker still fired about eight minutes later. Same
family as openclaw/openclaw#86006 (verdict-vs-apply contract gap) and
#197 (supersede-by-closed-unmerged), except here PR B
(#85063) is merged and PR A (#86590) has unique uncovered work.

If the supersession-proof gate in this PR is fed PR A's file paths and PR B's
merged squash (07f500aa562d — bounded tools/list discovery only),
uniqueSourceWork should be non-empty and the decision should land as
keep_open, so this PR is likely to prevent the #86590-style close in
practice.

One question: is the marker-contract layer (apply lane gated independently on
clawsweeper-verdict:needs-human) also in scope here, or planned separately?
The model proof gate looks robust; a static marker check would be a cheap
second layer against future review-vs-apply drift.

fix: require proof before superseded PR closeout

a29dcdf

Copilot AI review requested due to automatic review settings May 25, 2026 06:10

Copilot started reviewing on behalf of jesse-merhi May 25, 2026 06:10 View session

Copilot AI reviewed May 25, 2026

View reviewed changes

clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels May 25, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Require proof before superseded PR closeout#196

Require proof before superseded PR closeout#196
jesse-merhi wants to merge 1 commit into
mainfrom
jesse/pr-supersession-proof

jesse-merhi commented May 25, 2026

Uh oh!

clawsweeper Bot commented May 25, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

clawsweeper Bot commented May 25, 2026

Uh oh!

nxmxbbd commented May 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		} catch {
		return { close: false, reason: "replacement closeout proof output was invalid JSON" };

Conversation

jesse-merhi commented May 25, 2026

Summary

Validation

Uh oh!

clawsweeper Bot commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

clawsweeper Bot commented May 25, 2026

Hatch command

Uh oh!

nxmxbbd commented May 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

clawsweeper Bot commented May 25, 2026 •

edited

Loading