RFC 0010 (draft): Analyst-skill seams on a pipeline-shaped session substrate by david-steeves · Pull Request #11 · openclaw/rfcs

david-steeves · 2026-06-08T22:02:52Z

Summary

Proposes shaping OpenClaw's session/transcript substrate as a three-stage data pipeline (raw → processed → curated) so that stage boundaries become explicit seams where designated analyst skills can read, classify, transform, or hold payloads — and reserves the substrate, the analyst registry, and Policy artifacts to the Gateway control plane so agent-plane code cannot rewrite its own supervisor.

The pipeline is a contract about shape, not backend: local SQLite, an embedded log, an object store, or a managed remote queue can each back any stage. Analyst skills run in the existing skill runtime and emit verdicts (pass, transform, block, escalate) shaped as redacted evidence in the form RFC 0003 already defines.
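To make the seam idea concrete, here is a minimal sketch of one seam evaluating a payload against a list of analyst skills. Every name in it (`Stage`, `Verdict`, `Analyst`, `runSeam`, the two example analysts) is hypothetical, since the RFC defers concrete schemas and registration APIs to a follow-up contract RFC. Running analysts in order and stopping at the first `block` or `escalate` is also an assumption; multi-analyst conflict resolution is one of the open questions.

```typescript
// Hypothetical types for illustration only; not the RFC's contract.
type Stage = "raw" | "processed" | "curated";

type Verdict =
  | { kind: "pass" }
  | { kind: "transform"; payload: string }
  | { kind: "block"; reason: string }
  | { kind: "escalate"; reason: string };

type Analyst = (payload: string) => Verdict;

interface SeamResult {
  promoted: boolean;      // did the payload cross into the next stage?
  payload: string | null; // what the next stage receives, if anything
  verdicts: Verdict[];    // one entry per analyst consulted, in order
}

// Stage order is fixed: raw -> processed -> curated.
function nextStage(stage: Stage): Stage | null {
  if (stage === "raw") return "processed";
  if (stage === "processed") return "curated";
  return null;
}

// Assumption: analysts run sequentially; block/escalate holds the payload at the seam.
function runSeam(payload: string, analysts: Analyst[]): SeamResult {
  let current = payload;
  const verdicts: Verdict[] = [];
  for (const analyst of analysts) {
    const verdict = analyst(current);
    verdicts.push(verdict);
    if (verdict.kind === "block" || verdict.kind === "escalate") {
      return { promoted: false, payload: null, verdicts };
    }
    if (verdict.kind === "transform") current = verdict.payload;
  }
  return { promoted: true, payload: current, verdicts };
}

// Two toy analysts: one masks long digit runs, one blocks private-key material.
const maskDigits: Analyst = (p) =>
  /\d{4,}/.test(p)
    ? { kind: "transform", payload: p.replace(/\d{4,}/g, "[redacted]") }
    : { kind: "pass" };

const blockSecrets: Analyst = (p) =>
  p.includes("BEGIN PRIVATE KEY")
    ? { kind: "block", reason: "private key material" }
    : { kind: "pass" };

const promotedResult = runSeam("card 12345678", [maskDigits, blockSecrets]);
const heldResult = runSeam("-----BEGIN PRIVATE KEY-----", [maskDigits, blockSecrets]);
```

In this sketch the first call is promoted with the digits masked, and the second is held at the seam with a `block` verdict.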

Why now

Three in-flight threads converge on the same structural gap — a place where policy can be evaluated against flowing session data, with the supervisor structurally separated from the supervised:

docs: propose plugin sdk session storage migration #7 (Plugin SDK session/transcript storage migration) is decoupling agents from file-shaped APIs; once the SDK exists, the shape behind it becomes the question.
RFC 0007: Pluggable scheduler seam in gateway #5 (Pluggable scheduler seam in gateway) demonstrates the Gateway-plugin seam pattern this RFC reuses for the pipeline substrate.
docs: update policy conformance exec approvals evidence #6 (updating RFC 0003 Policy conformance) standardizes the redacted-evidence shape this RFC reuses for data-conformance verdicts.

Today's substrate has no named moments where policy can run on flowing data, and no structural separation between the code that produces data and the code that evaluates it. Filing this now, before #7 hardens, lets the pipeline shape inform the SDK rather than retrofit onto it.

Scope

Architectural shape only. Manifest fields, jsonc schemas, registration APIs, and CLI shapes are intentionally deferred to a follow-up contract RFC.
Storage-neutral. Choosing a backend (SQLite, object store, etc.) is out of scope; the contract is about boundaries, not stores.
No bundled analyst catalog. Operators decide which analyst skills to install.
No new isolation primitive. Analyst skills run in the existing skill runtime.
Declarative redaction only. Forensic attestation (pre-image hashes, signer chains) is named as a follow-up; this RFC is honest that the evidence model is observational.

Status

status: draft. Opening as a draft PR to start the maintainer-discussion thread per the recently merged RFC lifecycle docs. I'll start a maintainer-discussion thread on Discord under my identity.

Reviewer-relevant deltas already applied

v1 → v2 closed the must-fix findings from an internal review panel: (a) softened PII/PHI claims to "declare redaction intent" with forensic attestation explicitly deferred; (b) transform retains the original payload in a write-restricted store for Policy audit; (c) escalate is async with a pending-human-review marker and a Policy-configured timeout; (d) stage plugins are constrained to passive I/O, and payload mutation is exclusively the analyst-skill surface; (e) empty-evidence stamp defined inline; (f) citations use PR numbers rather than contested RFC IDs.
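Deltas (b) and (c) can be sketched as follows. This is an illustration under stated assumptions, not the RFC's design: `AuditStore`, `Escalation`, and `resolveEscalation` are hypothetical names, "write-restricted" is modelled as write-once, and what a timeout ultimately means (block, pass, or something else) is left to Policy rather than decided here.

```typescript
// Hypothetical write-once store standing in for the "write-restricted store" of delta (b).
class AuditStore {
  private originals = new Map<string, string>();

  // Retain the pre-transform payload; a second write for the same id is refused.
  retain(payloadId: string, original: string): void {
    if (this.originals.has(payloadId)) {
      throw new Error(`original for ${payloadId} already retained`);
    }
    this.originals.set(payloadId, original);
  }

  read(payloadId: string): string | undefined {
    return this.originals.get(payloadId);
  }
}

// Delta (c): escalation is asynchronous, so the payload carries a marker until resolved.
type EscalationState = "pending-human-review" | "approved" | "rejected" | "timed-out";

interface Escalation {
  payloadId: string;
  raisedAtMs: number;
  timeoutMs: number; // assumed to come from a Policy artifact
  decision?: "approved" | "rejected";
}

// A human decision wins; otherwise the state depends only on elapsed time.
function resolveEscalation(e: Escalation, nowMs: number): EscalationState {
  if (e.decision) return e.decision;
  return nowMs - e.raisedAtMs >= e.timeoutMs ? "timed-out" : "pending-human-review";
}

const store = new AuditStore();
store.retain("msg-1", "original text with 12345678");

const open: Escalation = { payloadId: "msg-2", raisedAtMs: 0, timeoutMs: 60_000 };
const early = resolveEscalation(open, 10_000);
const late = resolveEscalation(open, 60_000);
const decided = resolveEscalation({ ...open, decision: "approved" }, 90_000);
```

Passing `nowMs` in explicitly keeps the state function pure, which makes the timeout behaviour easy to test without a clock.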

Unresolved questions

Eight, listed in the RFC. The ones I'd most like maintainer steer on early:

Should an empty seam be a Policy conformance failure by default for sensitive stage transitions, or always emit a stamp?
Where should the contract RFC live — combined with this one, or sequenced after?
Supply-chain integrity for analyst skills — should evidence records themselves be cryptographically chained?
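For the first question, the two options can be put side by side in a small sketch. The names (`EmptySeamMode`, `SeamOutcome`, `evaluateEmptySeam`) and the evidence fields are hypothetical; the real empty-evidence stamp is whatever the RFC text defines.

```typescript
// Hypothetical shapes for comparing the two defaults; not the RFC's stamp format.
type EmptySeamMode = "stamp" | "fail";

interface SeamOutcome {
  conformant: boolean;
  evidence: { seam: string; analystCount: number; note: string };
}

function evaluateEmptySeam(seam: string, sensitive: boolean, mode: EmptySeamMode): SeamOutcome {
  // Either way, evidence records that no analyst looked at the payload.
  const evidence = { seam, analystCount: 0, note: "no analyst registered" };
  // "stamp": always conformant, the stamp is the whole record.
  // "fail": a sensitive transition with no analyst is itself a conformance failure.
  const conformant = mode === "stamp" || !sensitive;
  return { conformant, evidence };
}

const stamped = evaluateEmptySeam("raw->processed", true, "stamp");
const failed = evaluateEmptySeam("raw->processed", true, "fail");
const benign = evaluateEmptySeam("processed->curated", false, "fail");
```

The two modes differ only for sensitive transitions, which is where a maintainer decision on the default is needed.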

First-time contributor; happy to iterate on shape, scope, or terminology.

…bstrate

- Lead Summary with structural-integrity framing
- Call out local SQLite / embedded log / object store as substrate options
- Cite enabling work by PR number (openclaw#5, openclaw#7) rather than contested RFC IDs
- Soften PII/PHI claims to "declare redaction intent"; defer forensic attestation to a follow-up RFC
- transform verdict retains original payload in write-restricted store
- escalate is async with pending-human-review marker + Policy-set timeout
- New goal: stage plugins are passive I/O; no payload mutation
- Define empty-evidence stamp in Proposal
- Drop "in different costumes" metaphor; tighten throat-clearing
- Anchor "seam" terminology with a definition
- Expand Unresolved Questions: supply chain, ToCToU, multi-analyst conflict, contract-RFC location

clawsweeper · 2026-06-08T22:03:56Z

Codex review: needs real behavior proof before merge. Reviewed June 8, 2026, 6:08 PM ET / 22:08 UTC.

Summary
Adds a new draft RFC proposing a raw -> processed -> curated session/transcript pipeline with Gateway-owned analyst-skill seams, policy evidence, and control-plane boundaries.

Reproducibility: not applicable. This is an RFC proposal, not a bug report. I checked current main for matching analyst-skill pipeline text and found no existing implementation or RFC that would make it obsolete.

Review metrics: 2 noteworthy metrics.

Diff surface: 1 added Markdown file, 334 added lines. The PR is RFC-only, so review should focus on lifecycle and architecture fit rather than runtime tests.
Open design questions: 9 unresolved questions. The RFC intentionally leaves contract, supply-chain, and policy semantics for maintainer discussion before acceptance.

Merge readiness
Overall: 🦐 gold shrimp
Proof: 🌊 off-meta tidepool
Patch quality: 🦐 gold shrimp
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

Set rfc_pr to the pull request URL.
Compact the Summary to the repository's one-paragraph shape.
Resolve or explicitly defer the major contract/security questions in maintainer discussion before acceptance.

Risk before merge

[P1] Merging while status: draft would conflict with the README lifecycle and publish unresolved Gateway/session/policy architecture as repository guidance.
[P1] The RFC defers concrete manifest/API/CLI shapes and leaves supply-chain, time-of-check/time-of-use, evidence, and conflict-resolution questions unresolved.

Maintainer options:

Keep draft until lifecycle conditions are met (recommended)
Resolve or explicitly accept the contract questions, complete the maintainer-discussion step, set rfc_pr to the PR URL, and move to status: accepted only when maintainers are ready to merge.
Accept broad direction now
Maintainers could accept the unresolved details as follow-up RFC work, but should first mark the RFC accepted and understand that the text becomes repository guidance.
Pause until substrate work settles
If the session-storage migration should define the substrate first, leave this draft paused or close it in favor of a narrower follow-up after that contract lands.

Next step before merge

[P1] Needs maintainer review because the draft asks for architecture, security, and RFC-lifecycle decisions that automation should not settle.

Security
Cleared: The diff only adds a Markdown RFC and does not change executable code, dependencies, workflows, permissions, secrets, or package metadata.

Review findings

[P3] Set rfc_pr to the pull request URL — rfcs/0010-agentic-pipeline-substrate.md:9
[P3] Keep the summary to one paragraph — rfcs/0010-agentic-pipeline-substrate.md:27-35

Review details

Best possible solution:

Keep this as a draft RFC until maintainer discussion settles the architecture, then update metadata/status and link an implementation issue before merge.

Do we have a high-confidence way to reproduce the issue?

Not applicable; this is an RFC proposal, not a bug report. I checked current main for matching analyst-skill pipeline text and found no existing implementation or RFC that would make it obsolete.

Is this the best way to solve the issue?

Unclear as a final solution. The draft is a plausible architecture direction, but it deliberately defers the API/config contract and calls out unresolved supply-chain and policy semantics that need maintainer direction.

Full review comments:

[P3] Set rfc_pr to the pull request URL — rfcs/0010-agentic-pipeline-substrate.md:9
The README says new RFC metadata should set rfc_pr to the RFC pull request, but this new RFC leaves it as TBD. Before this can be accepted or merged, set it to https://github.com/openclaw/rfcs/pull/11 so the published RFC links back to its review thread.
Confidence: 0.9
[P3] Keep the summary to one paragraph — rfcs/0010-agentic-pipeline-substrate.md:27-35
The repository authoring notes ask RFC summaries to stay to one paragraph, but this Summary runs three paragraphs. Please compact the backend/runtime detail into one summary paragraph or move it into Motivation/Proposal so the RFC follows the local shape.
Confidence: 0.82

Overall correctness: patch is correct
Overall confidence: 0.84

AGENTS.md: not found in the target repository.

Codex review notes: model gpt-5.5, reasoning high; reviewed against e938e93198f4.

Label changes

Label changes:

add P3: This is a low-risk draft RFC/product-direction proposal with process nits and no current runtime breakage.
add merge-risk: 🚨 other: Merging a draft RFC with unresolved contract and security questions would publish unsettled architecture outside the normal RFC lifecycle.
add rating: 🦐 gold shrimp: Overall readiness is 🦐 gold shrimp; proof is 🌊 off-meta tidepool and patch quality is 🦐 gold shrimp.
add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Not applicable: The PR is an RFC-only documentation proposal with no runtime behavior to demonstrate.

Label justifications:

P3: This is a low-risk draft RFC/product-direction proposal with process nits and no current runtime breakage.
merge-risk: 🚨 other: Merging a draft RFC with unresolved contract and security questions would publish unsettled architecture outside the normal RFC lifecycle.
rating: 🦐 gold shrimp: Overall readiness is 🦐 gold shrimp; proof is 🌊 off-meta tidepool and patch quality is 🦐 gold shrimp.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Not applicable: The PR is an RFC-only documentation proposal with no runtime behavior to demonstrate.

Evidence reviewed

What I checked:

No matching current-main RFC: A search of current main found no existing RFC or README content for the analyst-skill pipeline, raw/processed/curated stage boundaries, or session pipeline proposal, so the central change is not already implemented. (e938e93198f4)
RFC lifecycle guidance: The current README says draft RFCs should not merge while still draft, each new RFC needs a maintainer-discussion thread, and rfc_pr should point to the RFC pull request. (README.md:77, e938e93198f4)
Proposed RFC remains draft: The proposed file is status: draft and leaves rfc_pr: TBD, so it is not in an accepted/merge-ready lifecycle state yet. (rfcs/0010-agentic-pipeline-substrate.md:7, 7383da6952df)
Contract details intentionally deferred: The RFC defers manifest fields, schemas, registration APIs, and CLI shapes to follow-up work, and lists unresolved contract/security questions that need maintainer direction. (rfcs/0010-agentic-pipeline-substrate.md:118, 7383da6952df)
Relevant history: Recent RFC process and adjacent policy/RFC work was concentrated in README/template updates and policy/feed/approval RFCs by the likely routing candidates.

Likely related people:

kevinlin-openai: Authored the current RFC lifecycle and template guidance that governs draft status, rfc_pr, and merge readiness. (role: recent RFC process owner; confidence: high; commits: e366ea9825a4, f4fdf38f4717; files: README.md, rfcs/0000-template.md)
Gio Della-Libera: Authored or updated policy conformance and feeds RFC material that this draft builds on for evidence and catalog/policy direction. (role: adjacent policy/RFC contributor; confidence: medium; commits: e46c2a113cc9, 92f480be5cd9; files: rfcs/needs_refactoring/0003-policy-conformance.md, rfcs/0006-feeds.md)
Omar Shahine: Authored the approval prompt markdown RFC that this draft references for escalation/approval surface context. (role: adjacent RFC author; confidence: low; commits: f346050b2878; files: rfcs/0005-approval-prompt-markdown.md)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

david-steeves · 2026-06-08T22:06:17Z

Working on perf numbers to measure the cost of the pipeline shape alone, without analyst skills doing any real analysis.

david-steeves · 2026-06-09T04:26:40Z

Note on lifecycle step: the RFC lifecycle docs point to #maintainer-discussions on Discord for the discussion thread. I'm a first-time contributor and the channel is currently gated for my role — New Post returns "An error occurred." I've asked in #clawtributors for the role grant, but rather than block the conversation on access, I'm dropping the thread content here so maintainers can react to the substance. Once I have posting permission I'll mirror this into a Discord thread and cross-link both ways.

👋 First-time contributor opening RFC 0010 (draft) — Analyst-skill seams on a pipeline-shaped session substrate (this PR).

TL;DR of the RFC. Shape OpenClaw's session/transcript substrate as a 3-stage data pipeline (raw → processed → curated). Stage boundaries become explicit seams where designated analyst skills can read, classify, transform, or hold payloads — and the substrate, the analyst registry, and Policy artifacts are reserved to the Gateway control plane so agent-plane code cannot rewrite its own supervisor. Storage-neutral (SQLite, log, object store, managed queue — all fine). Evidence shape reuses RFC 0003.

Why I think now is the moment. Three in-flight threads converge on the same gap — no named place where policy can run on flowing session data, and no structural separation between producer and evaluator:

docs: propose plugin sdk session storage migration #7 (Plugin SDK transcript storage migration) is decoupling agents from file-shaped APIs; once that SDK lands, the shape behind it becomes the question.
RFC 0007: Pluggable scheduler seam in gateway #5 (pluggable scheduler seam) is the Gateway-plugin pattern I reuse.
docs: update policy conformance exec approvals evidence #6 (RFC 0003 policy conformance) standardizes the redacted-evidence shape.

Bench numbers

I built a bench lab to put numbers on the cost:

M5 Pro native, 10 concurrent synthetic agents, 3 payload size classes, 6 substrate variants, 3 runs each, median reported.

Headline numbers (p50):

| Payload | file-share baseline | sqlite-flat | 3-stage pipeline (faithful promote) | Pipeline vs baseline |
| --- | --- | --- | --- | --- |
| 512 B | 1.21 ms | 0.94 ms | 5.25 ms | ~4× |
| 5 MB | 5.2 ms | 75 ms | 184 ms | ~35× |
| 100 MB | 15 ms* | 900 ms | 2,689 ms | ~180× |

* file-share's 100 MB number is fake-cheap — RSS ballooned to 6.6 GB holding the read-back cache. Honest substrate comparison has to declare read-back semantics.
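The multipliers in the last column are plain ratios of the p50 figures, and they can be rechecked with the snippet below. The `rows` values are copied from the table; the field and helper names (`baselineMs`, `ratio`, and so on) are mine. It also computes the pipeline against sqlite-flat, which is not in the table but follows from the same numbers.

```typescript
// p50 latencies in milliseconds, copied from the table above.
const rows = [
  { payload: "512 B", baselineMs: 1.21, sqliteFlatMs: 0.94, pipelineMs: 5.25 },
  { payload: "5 MB", baselineMs: 5.2, sqliteFlatMs: 75, pipelineMs: 184 },
  { payload: "100 MB", baselineMs: 15, sqliteFlatMs: 900, pipelineMs: 2689 },
];

// Ratio formatted to one decimal place.
function ratio(numeratorMs: number, denominatorMs: number): string {
  return (numeratorMs / denominatorMs).toFixed(1);
}

const vsBaseline = rows.map((r) => ratio(r.pipelineMs, r.baselineMs));
const vsSqliteFlat = rows.map((r) => ratio(r.pipelineMs, r.sqliteFlatMs));

console.log(vsBaseline);   // [ '4.3', '35.4', '179.3' ]
console.log(vsSqliteFlat); // [ '5.6', '2.5', '3.0' ]
```

Against sqlite-flat the pipeline stays within roughly 2.5× to 5.6× at every size, so much of the growth in the baseline multiplier at 5 MB and 100 MB is already present in the flat SQLite variant. The 100 MB baseline figure carries the read-back caveat noted above, so that row's ratio overstates the gap.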

What the numbers tell me — and the design implication I want maintainer steer on

The pipeline shape is structurally cheap at chat-message sizes, but the multiplier vs file-share grows with payload size — 35× at 5 MB, ~180× at 100 MB. That turns into a hot complaint thread the day someone runs a long-context coding session emitting multi-MB tool outputs.

So my read: the seam-and-evidence pipeline is not a free upgrade you flip on for every OpenClaw instance. It's a real cost worth paying only when something justifies it — regulated deployments, parental-controls, hosted multi-tenant, finance/legal claws with mandatory analyst review. For a hobbyist on a single calculator-claw, paying 5× latency to stamp empty evidence into 5 SQLite databases is the wrong tradeoff.

This flips what I'd originally scoped as a post-MVP nice-to-have into something the MVP probably has to anticipate from day one: governance is configurable, not default-on, and the configuration unit is per-claw (or per-instance) — not per-runtime.

A Gateway-level "governed mode" setting, configurable per-claw:

The claw still calls its existing storage API (today: append JSONL to its workspace).
When governed mode is on for that claw, the Gateway substitutes a pipe-end for the file. The claw is unaware.
ACL enforcement lives in the pipe infrastructure, outside the claw's blast radius.

Three properties: (a) existing claws need zero code changes; (b) different claws can be governed at different strictness levels without forking the runtime; (c) a compromised claw cannot bypass the pipeline — the only "store" handle it has is one end of the pipe.
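A minimal sketch of that substitution, assuming the claw only ever sees a narrow append interface. All names here (`TranscriptSink`, `FileSink`, `PipeSink`, `Gateway.openSink`, the claw ids) are hypothetical, and both sinks are in-memory stand-ins rather than real file or pipe I/O.

```typescript
// The only storage surface the claw sees (hypothetical interface).
interface TranscriptSink {
  append(line: string): void;
}

// Stand-in for today's behaviour: append JSONL to the claw's workspace.
class FileSink implements TranscriptSink {
  readonly lines: string[] = [];
  append(line: string): void {
    this.lines.push(line);
  }
}

// Stand-in for the pipe-end: writes land in the raw stage, owned by the Gateway.
class PipeSink implements TranscriptSink {
  readonly rawStage: string[] = [];
  append(line: string): void {
    this.rawStage.push(line);
  }
}

type GovernanceConfig = Record<string, { governed: boolean }>;

class Gateway {
  private config: GovernanceConfig;

  constructor(config: GovernanceConfig) {
    this.config = config;
  }

  // Opt-in per claw: anything not listed as governed keeps the file-shaped sink.
  openSink(clawId: string): TranscriptSink {
    return this.config[clawId]?.governed ? new PipeSink() : new FileSink();
  }
}

// Claw code is identical in both modes; it cannot tell which sink it was given.
function runClaw(sink: TranscriptSink): void {
  sink.append(JSON.stringify({ role: "user", text: "hi" }));
}

const gateway = new Gateway({
  "kids-tablet": { governed: true },
  "calculator": { governed: false },
});

const governedSink = gateway.openSink("kids-tablet");
const plainSink = gateway.openSink("calculator");
runClaw(governedSink);
runClaw(plainSink);
```

Because the Gateway chooses the sink and the claw holds only the interface, properties (a) and (b) fall out of the shape. Property (c) additionally depends on the claw having no other write path, which this sketch does not enforce.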

Analog: service-mesh sidecar (Istio/Linkerd). If MVP ships pipeline as "on for everything," the bench numbers predict an adoption-blocking complaint thread. If MVP ships it as "available, opt-in per claw," the same numbers become a feature — of course it's slower, that's what governance is paying for.

Newbie questions I'd love steer on before I push this further

Has anyone already tried or thought about this? I've read through the open RFCs and the threads I cite, but I'm new and don't know what's been kicked around in Discord, in older drafts, or in private conversations. If someone has already explored a pipeline-shaped substrate (even if they decided against it), I'd much rather build on that thinking than restate it. Pointers welcome.
Is this worth doing at all? The numbers are encouraging at small payloads and honest about cost at large ones, but "feasible" ≠ "wanted." If the maintainer position is "we like the file-share shape and have no appetite for adding a control-plane substrate," I'd rather hear that early. Genuinely no ego — I'm solving a real problem (per-claw governance for a parental-monitoring use case I'm building), and "you're solving the wrong problem" is a valid answer.
Is there an established place to get research/review teams iterating on RFCs at this depth? I ran v1 through an internal review panel I cobbled together (security + architecture reviewers) which produced the v1→v2 fixes, then built the bench lab to backstop the cost claims. If OpenClaw (or the broader agentic-runtime community) already has a working group for this, I'd love to plug in. If not but other RFC authors have hit the same gap, I'm interested in helping shape something.
Is anyone already working on something adjacent? I'd be surprised if no one's thinking about supervisor/supervised separation, policy-on-flowing-data, or per-claw governance. If there's overlap with someone's WIP, I'd rather coordinate than collide.
(One design call I can't decide alone) Should an empty seam (no analyst registered) be a Policy conformance failure by default for sensitive transitions, or always emit the no-op stamp?

Happy to iterate on shape, scope, or terminology.

david-steeves added 2 commits June 8, 2026 14:47

RFC 0010 (draft): analyst-skill seams on a pipeline-shaped session substrate (a6339d8)

clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 8, 2026

david-steeves wants to merge 2 commits into openclaw:main from david-steeves:rfc/agentic-pipeline-substrate
