feat: add ExecApprovalsStore write path and wire coordinator side effects #526

Open
AlexAlves87 wants to merge 6 commits into openclaw:master from AlexAlves87:feat/exec-approvals-write-path

Conversation

AlexAlves87 (Contributor) commented May 23, 2026

What

Adds the write path to ExecApprovalsStore and wires the two side-effect calls into ExecApprovalsCoordinator, closing the persistence loop for the exec approvals V2 pipeline.

Store — new public API:

  • AddAllowlistEntryAsync(agentId, pattern) — persists a new allowlist entry after an AllowAlways prompt decision. Deduplicates on write (OrdinalIgnoreCase). Returns true if the entry is present after the call (added or already there), false on empty pattern or I/O failure. New entries carry {id, pattern}; lastUsedAt is stamped later by RecordAllowlistUseAsync on first successful use (matches macOS parity).
  • RecordAllowlistUseAsync(agentId, pattern, command, resolvedPath) — updates lastUsedAt, lastUsedCommand, and lastResolvedPath for every matching entry after a final allow. Returns false if pattern not found or on I/O failure.
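
The two mutators described above can be sketched in a few lines. This is a minimal in-memory illustration in Python, not the C# implementation from this PR: the function names, the use of `.lower()` to approximate OrdinalIgnoreCase, the case-insensitive match in the usage recorder, and the epoch-millisecond timestamp are all assumptions made for the example.

```python
import time
import uuid


def add_allowlist_entry(doc: dict, agent_id: str, pattern: str) -> bool:
    """Return True if the pattern is present after the call, False on an empty pattern."""
    if not pattern:
        return False
    agent = doc.setdefault("agents", {}).setdefault(agent_id, {})
    allowlist = agent.setdefault("allowlist", [])
    # Dedup on write; .lower() only approximates an ordinal case-insensitive compare.
    if any(e.get("pattern", "").lower() == pattern.lower() for e in allowlist):
        return True
    # New entries carry only id and pattern; lastUsedAt is stamped on first use.
    allowlist.append({"id": str(uuid.uuid4()), "pattern": pattern})
    return True


def record_allowlist_use(doc: dict, agent_id: str, pattern: str,
                         command: str, resolved_path: str) -> bool:
    """Stamp usage metadata on every matching entry; False if nothing matched."""
    allowlist = doc.get("agents", {}).get(agent_id, {}).get("allowlist", [])
    matched = False
    for entry in allowlist:
        if entry.get("pattern", "").lower() == pattern.lower():
            entry["lastUsedAt"] = int(time.time() * 1000)  # assumed epoch milliseconds
            entry["lastUsedCommand"] = command
            entry["lastResolvedPath"] = resolved_path
            matched = True
    return matched
```

The sketch leaves out persistence and I/O failure handling, which the store routes through UpdateFileAsync.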

Store — private infrastructure:

  • UpdateFileAsync(mutate) — load → mutate → atomic save, serialized by the existing SemaphoreSlim. Never throws. Refuses to overwrite a malformed file. Handles transient IOException on the atomic move as a degraded path: logs Warn, no retry.
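
As an illustration of the load → mutate → atomic save shape, here is a hedged Python sketch. It is not the store's code: `update_file`, the `threading.Lock` standing in for the SemaphoreSlim, the temp-file-plus-`os.replace` strategy, the boolean `mutate` contract, and the behavior when the file does not exist yet are all assumptions for the example.

```python
import json
import logging
import os
import tempfile
import threading

_write_lock = threading.Lock()  # stands in for the store's SemaphoreSlim


def update_file(path: str, mutate) -> bool:
    """Load, mutate, and atomically save a JSON file. Returns False on I/O or parse failure."""
    with _write_lock:
        try:
            if os.path.exists(path):
                with open(path, "r", encoding="utf-8") as f:
                    doc = json.load(f)
                if not isinstance(doc, dict):
                    return False  # unexpected shape: refuse to overwrite
            else:
                doc = {"version": 1, "agents": {}}  # assumed default for a missing file
        except (OSError, ValueError):
            return False  # unreadable or malformed: leave the file untouched

        if not mutate(doc):  # assumed contract: mutate returns False for "no change"
            return False

        tmp_path = None
        try:
            fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or ".", suffix=".tmp")
            with os.fdopen(fd, "w", encoding="utf-8") as f:
                json.dump(doc, f, indent=2)
            os.replace(tmp_path, path)  # atomic rename within the same directory
            return True
        except OSError as exc:
            logging.warning("allowlist update failed, no retry: %s", exc)
            if tmp_path is not None:
                try:
                    os.remove(tmp_path)
                except OSError:
                    pass
            return False
```

Writing the temporary file into the target's own directory keeps the final rename on one filesystem, which is what makes the replace step atomic in this sketch.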

Coordinator — side effects wired:

  • RecordAllowlistUsageAsync fires on both allow exit points: the pass1 pre-approved branch and the post-pass2 branch. Both are required to cover the common allowlist-satisfied case.
  • PersistAllowlistEntriesAsync fires only after pass2 = Allow and followupDecision == AllowAlways, strictly outside the _promptLock block.
  • Both helpers are best-effort: a store I/O failure is absorbed and does not affect the final Allow result.
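
The ordering above can be sketched as a small control-flow model. This Python sketch is illustrative only and assumes a simplified shape: `evaluate` returns a hypothetical `(decision, matched_pattern)` pair, `prompt` returns the follow-up decision, and `record_use` / `persist_entry` stand in for the two helpers. None of these signatures come from the PR.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    PROMPT = "prompt"


class Followup(Enum):
    ALLOW_ONCE = "allow-once"
    ALLOW_ALWAYS = "allow-always"
    DENY = "deny"


def best_effort(action) -> None:
    """Run a store side effect and absorb any failure."""
    try:
        action()
    except Exception:
        pass  # a real implementation would log a warning here


def coordinate(evaluate, prompt, record_use, persist_entry) -> Decision:
    # Pass 1: pre-approved requests exit here and still record allowlist usage.
    decision, matched = evaluate(None)
    if decision is Decision.ALLOW:
        if matched is not None:
            best_effort(lambda: record_use(matched))
        return Decision.ALLOW
    if decision is Decision.DENY:
        return Decision.DENY

    # In the real coordinator only the prompt runs under the prompt lock.
    followup = prompt()
    if followup is Followup.DENY:
        return Decision.DENY

    # Pass 2: side effects fire only once this pass has confirmed Allow.
    decision, matched = evaluate(followup)
    if decision is not Decision.ALLOW:
        return Decision.DENY
    if matched is not None:
        best_effort(lambda: record_use(matched))
    if followup is Followup.ALLOW_ALWAYS:
        best_effort(persist_entry)
    return Decision.ALLOW
```

Because both side effects sit after the `decision is not Decision.ALLOW` check, a failing store can never turn a denied request into an allowed one, and a store exception never changes the returned Allow.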

Not wired in production yet: the coordinator is still not referenced in any production src/ file. The ProductionWiring_CoordinatorNotReferencedInSrc test remains green.

Design notes

Side effects fire strictly after the final allow decision is confirmed, not before the second evaluator pass. This is deliberate: the guarantee holds by construction rather than depending on a proof that Evaluate(context, AllowAlways) always produces Allow.

Pattern validation in AddAllowlistEntryAsync checks only that the pattern is non-empty, for parity with macOS. Basename-only patterns are inert at match time but are not rejected at persist time.

Testing

145 tests passing, 0 failures. 24 new tests added in this change (17 store, 7 coordinator).

Store tests cover: success paths, dedup, not-found, malformed-file refusal, I/O failure degradation on both mutators, concurrency (5 concurrent writes produce a single entry), and round-trip JSON validation.

Coordinator tests cover: AllowAlways persistence, non-allowlist security guard, duplicate pattern dedup, allowlist usage recording, allowlist-not-satisfied guard, pass1 pre-approved path, and fallback path with allowlist satisfied.

Real behavior proof

End-to-end coordinator/store runtime proof using real filesystem I/O (test RuntimeProof_AllowAlways_PersistsAndRecordsLastUsed in tests/OpenClaw.Shared.Tests/ExecApprovalsCoordinatorTests.cs).

Scope clarification: this is slice runtime proof, not full production runtime proof. The coordinator is intentionally not wired in production yet. A follow-up production wiring slice (SetV2Handler(coordinator) + feature flag) connects the coordinator, and the WinUI prompt dialog is a separate slice after that. UI-driven proof against the live app is only meaningful once those land.

Reproduce locally:

dotnet test tests/OpenClaw.Shared.Tests/OpenClaw.Shared.Tests.csproj \
  --filter "FullyQualifiedName~RuntimeProof_AllowAlways_PersistsAndRecordsLastUsed" \
  --logger "console;verbosity=detailed" --nologo

Captured output (initial / post-AllowAlways / post-allowlist-hit):

=== Initial exec-approvals.json ===
{"version":1,"agents":{"main":{"security":"allowlist","ask":"always"}}}

=== After AllowAlways (correlationId=proof-1) ===
{
  "version": 1,
  "agents": {
    "main": {
      "security": "allowlist",
      "ask": "always",
      "allowlist": [
        {
          "id": "fc9ee349-abe6-4c08-9352-7ed397b7c491",
          "pattern": "C:\\WINDOWS\\system32\\cmd.EXE"
        }
      ]
    }
  }
}

=== After allowlist hit (correlationId=proof-2) ===
{
  "version": 1,
  "agents": {
    "main": {
      "security": "allowlist",
      "ask": "always",
      "allowlist": [
        {
          "id": "fc9ee349-abe6-4c08-9352-7ed397b7c491",
          "pattern": "C:\\WINDOWS\\system32\\cmd.EXE",
          "lastUsedAt": 1779521352175,
          "lastUsedCommand": "cmd",
          "lastResolvedPath": "C:\\WINDOWS\\system32\\cmd.EXE"
        }
      ]
    }
  }
}

The entry id is identical between the two invocations, proving on-disk dedup. The lastUsedAt/lastUsedCommand/lastResolvedPath fields appear only after the second invocation, proving RecordAllowlistUseAsync fires on the allowlist-hit path.
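
The two claims above can be checked mechanically against the captured snapshots. The following Python sketch is only a reader-side verification aid, not part of the PR: it embeds compact copies of the two JSON snapshots shown above, and the helper name `compare_snapshots` is made up for the example.

```python
import json

# Compact copies of the two snapshots captured above.
AFTER_ALLOW_ALWAYS = r'''
{"version": 1, "agents": {"main": {"security": "allowlist", "ask": "always",
 "allowlist": [{"id": "fc9ee349-abe6-4c08-9352-7ed397b7c491",
                "pattern": "C:\\WINDOWS\\system32\\cmd.EXE"}]}}}
'''

AFTER_ALLOWLIST_HIT = r'''
{"version": 1, "agents": {"main": {"security": "allowlist", "ask": "always",
 "allowlist": [{"id": "fc9ee349-abe6-4c08-9352-7ed397b7c491",
                "pattern": "C:\\WINDOWS\\system32\\cmd.EXE",
                "lastUsedAt": 1779521352175,
                "lastUsedCommand": "cmd",
                "lastResolvedPath": "C:\\WINDOWS\\system32\\cmd.EXE"}]}}}
'''

USAGE_FIELDS = ("lastUsedAt", "lastUsedCommand", "lastResolvedPath")


def compare_snapshots(before_text: str, after_text: str, agent: str = "main") -> dict:
    """Summarize how the agent's allowlist changed between two snapshots."""
    before = json.loads(before_text)["agents"][agent]["allowlist"]
    after = json.loads(after_text)["agents"][agent]["allowlist"]
    return {
        "entry_count": (len(before), len(after)),
        "same_ids": [e["id"] for e in before] == [e["id"] for e in after],
        "usage_fields_before": [f for f in USAGE_FIELDS if f in before[0]],
        "usage_fields_after": [f for f in USAGE_FIELDS if f in after[0]],
    }


summary = compare_snapshots(AFTER_ALLOW_ALWAYS, AFTER_ALLOWLIST_HIT)
print(summary)
```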

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

AlexAlves87 and others added 2 commits May 23, 2026 08:15
…ects (PR8)

Implements the store write path (AddAllowlistEntryAsync, RecordAllowlistUseAsync)
and wires the side-effect calls into ExecApprovalsCoordinator.

Side effects fire strictly after the final allow decision is confirmed (both
pass1 pre-approved and post-pass2 branches). Best-effort: never throw; any
I/O failure degrades to a logged warning. UpdateFileAsync refuses to write
a malformed file. SemaphoreSlim serializes intra-process writes.

Pattern validation is non-empty only, matching macOS parity. New entries
carry {id, pattern} only — lastUsedAt is absent on creation and stamped by
RecordAllowlistUseAsync on first use (macOS addAllowlistEntry parity).

Rail 19 preserved: coordinator not referenced in any production src/ file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Strip rail codes, PR numbers, research doc references, D-labels, CVE/ADR
tags, and other planning labels from all ExecApprovals source files.
No logic changed — comments only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

clawsweeper Bot commented May 23, 2026

Codex review: needs maintainer review before merge. Reviewed May 26, 2026, 1:14 PM ET / 17:14 UTC.

Summary
The PR adds ExecApprovalsStore allowlist write/update APIs, wires ExecApprovalsCoordinator persistence and usage-recording side effects, removes internal planning comments, and expands store/coordinator tests with runtime proof output.

Reproducibility: not applicable, as this is a feature PR. The branch includes a concrete runtime proof command and captured output showing real filesystem writes and lastUsed metadata updates for the coordinator/store slice.

Review metrics: 2 noteworthy metrics.

  • Changed surface: 14 files; +752/-54. The PR is a medium-sized exec-approval change with source, comment, and test coverage changes rather than a one-line hook-up.
  • Test/proof coverage: 24 new tests claimed; 1 runtime proof test. Security-policy persistence needs focused store/coordinator coverage plus observable filesystem proof.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • This changes exec approval persistence semantics; green tests do not by themselves prove every future production wiring path preserves the intended command-execution boundary.
  • The PR intentionally leaves production wiring out, so full app-level proof belongs to the follow-up wiring slice before users rely on this path.

Maintainer options:

  1. Accept the contained security slice (recommended)
    Merge after maintainer review accepts the allowlist persistence semantics, with live-app proof deferred to the production-wiring PR.
  2. Pause for bundled runtime proof
    Hold this PR if maintainers want the write path, production handler wiring, and UI proof reviewed as one larger security change.

Next step before merge
No automated repair is currently needed; the remaining action is maintainer review and acceptance of the security-sensitive allowlist persistence semantics.

Security
Cleared: No concrete security or supply-chain defect was found in the diff, though the change remains security-boundary sensitive because it persists exec approval allowlist state.

Review details

Best possible solution:

Land this contained store/coordinator slice after maintainer security review, then require separate production-wiring proof when the coordinator is connected to the live app.

Do we have a high-confidence way to reproduce the issue?

Not applicable as a feature PR. The branch includes a concrete runtime proof command and captured output showing real filesystem writes and lastUsed metadata updates for the coordinator/store slice.

Is this the best way to solve the issue?

Yes, this appears to be the narrowest maintainable slice for the missing write path: it keeps production wiring out, persists only after confirmed allow, and covers the previous wildcard bucket gap. Maintainers still need to accept the security-boundary semantics before merge.

AGENTS.md: found, but no applicable review policy affected this item.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 9de9b5ba0f8a.

Label changes

Label changes:

  • add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes copied live output from a real-filesystem coordinator/store proof showing AllowAlways persistence and later lastUsed metadata recording; full live-app proof is not applicable until production wiring lands.
  • remove rating: 🦐 gold shrimp: Current PR rating is rating: 🦞 diamond lobster, so this older rating label is no longer current.
  • remove status: 🔁 re-review loop: Current PR status label is status: 👀 ready for maintainer look.

Label justifications:

  • P2: This is a normal-priority security-sensitive feature slice with limited blast radius because it is not production-wired yet.
  • merge-risk: 🚨 security-boundary: The PR adds persistent allowlist writes and coordinator side effects for future exec approval decisions.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes copied live output from a real-filesystem coordinator/store proof showing AllowAlways persistence and later lastUsed metadata recording; full live-app proof is not applicable until production wiring lands.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes copied live output from a real-filesystem coordinator/store proof showing AllowAlways persistence and later lastUsed metadata recording; full live-app proof is not applicable until production wiring lands.

Evidence reviewed

What I checked:

Likely related people:

  • AlexAlves87: Introduced the current ExecApprovalsCoordinator and its tests on main, and this PR builds directly on that coordinator path. (role: coordinator feature owner / recent area contributor; confidence: high; commits: 12416d282a23; files: src/OpenClaw.Shared/ExecApprovals/ExecApprovalsCoordinator.cs, tests/OpenClaw.Shared.Tests/ExecApprovalsCoordinatorTests.cs)
  • kenehong: The available main history shows the current ExecApprovalsStore read path added in commit 1287011, which is the store surface this PR extends. (role: store read-path introducer / adjacent owner; confidence: medium; commits: 1287011c9f56; files: src/OpenClaw.Shared/ExecApprovals/ExecApprovalsStore.cs, tests/OpenClaw.Shared.Tests/ExecApprovalsStoreTests.cs)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper Bot added labels on May 23, 2026:
  • rating: 🦪 silver shellfish: Thin PR readiness signal; proof, validation, or implementation needs work.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

clawsweeper Bot added labels on May 23, 2026:
  • rating: 🧂 unranked krab: Not merge-ready due to missing proof or serious correctness/safety concerns.
  • P2: Normal priority bug or improvement with limited blast radius.
  • merge-risk: 🚨 security-boundary: 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data.
and removed:
  • rating: 🦪 silver shellfish: Thin PR readiness signal; proof, validation, or implementation needs work.

clawsweeper Bot commented May 23, 2026

ClawSweeper PR egg

✨ Hatched: 🌱 uncommon Clockwork Patch Peep

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🌱 uncommon.
Trait: sniffs out flaky tests.
Image traits: location flaky test forest; accessory little merge flag; palette rose quartz and slate; mood sleepy but ready; pose peeking out from the egg shell; shell glossy opal shell; lighting warm desk-lamp glow; background gentle dashboard dots.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

…ce and lastUsed recording

End-to-end coordinator/store test using real filesystem I/O. Surfaces the
on-disk exec-approvals.json content at three points (initial, post-AllowAlways,
post-allowlist-hit) via ITestOutputHelper so the JSON is visible under
`dotnet test ... --logger "console;verbosity=detailed"`. Demonstrates both
side-effect paths: AllowAlways persists a new entry, and a later allowlist
hit records lastUsed* metadata against the same entry id (dedup).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

AlexAlves87 (Contributor, Author) commented

@clawsweeper re-review


clawsweeper Bot commented May 23, 2026

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

clawsweeper Bot added labels on May 23, 2026:
  • proof: sufficient: Contributor real behavior proof is sufficient.
  • rating: 🦐 gold shrimp: Decent PR readiness signal, but merge confidence is limited.
  • status: 🔁 re-review loop: A fresh ClawSweeper review was explicitly requested after the latest review.
and removed:
  • rating: 🧂 unranked krab: Not merge-ready due to missing proof or serious correctness/safety concerns.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

clawsweeper Bot commented May 23, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

ResolveReadOnly merges entries from agents["*"] into the resolved
allowlist for any concrete agent, so a hit can be authorized by the
wildcard bucket. RecordAllowlistUseAsync was only searching the concrete
agent bucket, so wildcard-authorized executions never accumulated
lastUsed* metadata.

The method now iterates both the concrete agent bucket and "*", updating
metadata wherever the pattern matches. If agentId is already "*", only
the wildcard bucket is searched (no double-pass).
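
A hedged Python sketch of the bucket iteration described above, for readers who want to see the shape of the fix. The names `buckets_for` and `record_use`, the explicit `now_ms` parameter, and the case-insensitive pattern comparison are assumptions for illustration; the actual change is in the C# store.

```python
def buckets_for(agent_id: str) -> list:
    """Concrete agent bucket first, then the wildcard; no double pass for '*'."""
    return ["*"] if agent_id == "*" else [agent_id, "*"]


def record_use(doc: dict, agent_id: str, pattern: str,
               command: str, resolved_path: str, now_ms: int) -> bool:
    """Update usage metadata wherever the pattern matches; False if no bucket matched."""
    updated = False
    for bucket in buckets_for(agent_id):
        allowlist = doc.get("agents", {}).get(bucket, {}).get("allowlist", [])
        for entry in allowlist:
            if entry.get("pattern", "").lower() == pattern.lower():
                entry["lastUsedAt"] = now_ms
                entry["lastUsedCommand"] = command
                entry["lastResolvedPath"] = resolved_path
                updated = True
    return updated
```

With this shape, an execution authorized only by an entry under "*" still accumulates metadata, and a pattern present in both buckets is stamped in both.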

Tests added:
- Store: wildcard-only bucket hit updates metadata.
- Store: same pattern in both buckets updates both entries.
- Coordinator: end-to-end regression with allowlist living under "*".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

AlexAlves87 (Contributor, Author) commented

@clawsweeper re-review


clawsweeper Bot commented May 23, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
clawsweeper Bot added labels on May 26, 2026:
  • rating: 🦞 diamond lobster: Very strong PR readiness with only minor maintainer review expected.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.
and removed:
  • rating: 🦐 gold shrimp: Decent PR readiness signal, but merge confidence is limited.
  • status: 🔁 re-review loop: A fresh ClawSweeper review was explicitly requested after the latest review.

Labels

  • merge-risk: 🚨 security-boundary: 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data.
  • P2: Normal priority bug or improvement with limited blast radius.
  • proof: sufficient: Contributor real behavior proof is sufficient.
  • rating: 🦞 diamond lobster: Very strong PR readiness with only minor maintainer review expected.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.
