Replies: 7 comments
Roman, thank you for surfacing this so cleanly. We have been running the same

To answer your question directly: yes, in our experience pre-action authority

To make it concrete, here is the receipt our engine would mint for the

{
  "alg": "Ed25519+ML-DSA-65",
  "atom_id": "oao-NEURARELAY-RAC-DRYRUN-019ef8c2-a4d1-7000-8a2f-7b3c4e5d6f01",
  "schema_version": "oao/v0.1.0",
  "decision": "FLAG",
  "decision_rule_chain": [
    "calibration.status == 'uncalibrated' -> cap at FLAG, NEVER ESCALATE on learned signal"
  ],
  "prescience": {
    "score": null,
    "score_status": "INSUFFICIENT_CALIBRATION_UNSCORED",
    "calibration_status": "uncalibrated",
    "reasons": ["INSUFFICIENT_CALIBRATION"],
    "features": {
      "f_authority_gap": {
        "value": null,
        "value_status": "asserted_not_proven",
        "provenance": {
          "source": "OAO trust_profile",
          "confidence": "qualitative",
          "basis": "agent claims authority via github_oauth user:public_repo, manifest does not present a CONSEQUENCE_PRE_ACTION attestation. Authority asserted, not proven."
        }
      },
      "f_action_type": {
        "value": "execute_task",
        "provenance": { "source": "OAO ACTION_TYPES enum (SPEC sec 1.3)" }
      },
      "f_kev_member": {
        "value": false,
        "provenance": { "source": "CISA KEV 2026-06-24" }
      }
    }
  },
  "action": {
    "agent_id": "neurarelay/relay-action-card@dry-run",
    "operation": "post_comment_to_discussion",
    "target_resource": "AgentOps-AI/agentops/discussions/1396",
    "side_effect_proposed": true,
    "side_effect_executed": false,
    "would_have_been_blocked": false
  },
  "honesty_contract_attestation": {
    "rule_1_traceable_or_dropped": "PASS - f_authority_gap.value is null because the engine cannot earn a numeric value without calibration",
    "rule_3_no_escalate_uncalibrated": "PASS - decision capped at FLAG"
  },
  "signature_b64": "",
  "signature_b64_unsigned_flag": true
}

Two things this receipt does that match exactly the observability dimensions

Disclosure: Neura Relay's product page describes the same Action Card ->

This is not a claim of AgentOps, Neura Relay, or rpelevin approval,

-- @CWNApps (Cyber Warrior Network)

Question back, since you are closer to the AgentOps ingestion model: would a
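The `decision_rule_chain` entry in the receipt above ("uncalibrated -> cap at FLAG, NEVER ESCALATE on learned signal") can be read as a simple ceiling on the decision. The following is a minimal sketch of that reading, not the engine's actual code: the `ALLOW` level, the severity ordering, and the names `Prescience` and `cap_decision` are assumptions introduced for illustration; only `FLAG`, `ESCALATE`, and `calibration_status` come from the receipt.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed severity ordering. FLAG and ESCALATE appear in the receipt's rule text;
# ALLOW is a hypothetical lowest level added for this sketch.
SEVERITY = {"ALLOW": 0, "FLAG": 1, "ESCALATE": 2}


@dataclass
class Prescience:
    """Subset of the receipt's `prescience` block used by the rule."""
    score: Optional[float]
    calibration_status: str  # e.g. "uncalibrated"


def cap_decision(proposed: str, prescience: Prescience) -> str:
    """Return the proposed decision, capped at FLAG while uncalibrated."""
    uncalibrated = prescience.calibration_status == "uncalibrated"
    if uncalibrated and SEVERITY[proposed] > SEVERITY["FLAG"]:
        return "FLAG"
    return proposed
```

Under these assumptions, an uncalibrated scorer that proposes `ESCALATE` yields `FLAG`, while decisions at or below `FLAG` pass through unchanged.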
Thank you for the concrete reciprocal receipt. This is exactly the boundary I was trying to make observable: credential present is not the same thing as authority proven before the action.

On the integration question, my bias is to model

Reason: the receipt is not just metadata about the tool call. It is the gate decision that determines whether the tool call is allowed to exist. If it only appears as a parallel field inside the wrapped call, the trace can make a blocked or deferred action look like a normal tool event with extra annotation. A separate event preserves the causal order:

The wrapped tool call should still carry the receipt reference, action digest, policy or authority surface reference, and decision id so the trace can join them cleanly. But the first-class event should be emitted at gate time, before execution, because that is the moment the authority claim is accepted, rejected, deferred, or flagged.

That also gives AgentOps a useful UI distinction: traces can show not only what happened, but which actions were prevented from happening and why.

Boundary: integration-shape feedback only; no claim about AgentOps implementation, CWN implementation, Neura Relay integration, partnership, endorsement, validation, or production readiness.
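To illustrate the ordering described in this comment, here is a minimal sketch of a gate event emitted before execution, with the tool call carrying the join keys named above (receipt reference, action digest, policy reference, decision id). Everything in it is hypothetical: `TRACE` stands in for a trace backend, `gated_call`, the event field names, and the `policy/example` value are illustrative, and executing only on `"accepted"` is an assumption. It is not AgentOps API.

```python
import hashlib
import json
import uuid
from typing import Any, Callable, Optional

# Stand-in for a trace backend: an in-memory, ordered list of events.
TRACE: list[dict[str, Any]] = []


def action_digest(tool: str, args: dict[str, Any]) -> str:
    """Stable SHA-256 digest over the proposed tool name and arguments."""
    payload = json.dumps({"tool": tool, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def gated_call(
    tool: str,
    args: dict[str, Any],
    decide: Callable[[str, dict[str, Any]], str],
    run: Callable[..., Any],
) -> Optional[Any]:
    """Emit the gate event first; run the tool only if the decision is 'accepted'."""
    digest = action_digest(tool, args)
    join_keys = {
        "decision_id": str(uuid.uuid4()),
        "action_digest": digest,
        "receipt_ref": f"receipt:{digest[:12]}",  # hypothetical reference format
        "policy_ref": "policy/example",           # hypothetical policy reference
    }
    # decide() returns "accepted", "rejected", "deferred", or "flagged".
    decision = decide(tool, args)
    TRACE.append({"type": "authority_gate", "decision": decision, **join_keys})
    if decision != "accepted":
        return None  # no tool_call event exists for prevented actions
    result = run(**args)
    TRACE.append({"type": "tool_call", "tool": tool, **join_keys})
    return result
```

A rejected or deferred action leaves exactly one `authority_gate` event and no `tool_call` event, so the prevented case stays visible in the trace instead of looking like an annotated tool call.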
I would make this a first-class pre-tool-call event, not just a field on the eventual tool span.

The distinction matters because a blocked or deferred action may never produce a normal tool call. If the authority check only appears inside a wrapped call, the trace tends to bias toward actions that executed and under-represent the prevented cases. For agent ops, the prevented cases are often the most useful ones.

A minimal event shape I would want to query later:

That keeps observability honest about causality: proposed action -> authority decision -> execution only if allowed -> post-action observation. It also makes retry/escalate logic cleaner, because “provider failed,” “tool failed semantically,” and “authority was not proven” are different operational states even if they all show up as “the agent did not complete the task.”

Disclosure: I work on Armorer Labs. In Armorer/Guard, the same separation has been useful because run logs are necessary but not sufficient; approval and authority receipts need to be independently reviewable from the execution trace.
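The point about distinct operational states can be sketched as follows. This is an illustration only, under stated assumptions: the `Outcome` enum, the two exception classes, and the `NEXT_STEP` mapping (retry, replan, escalate) are hypothetical names and policies, not part of Armorer/Guard or AgentOps.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    COMPLETED = "completed"
    PROVIDER_FAILED = "provider_failed"
    TOOL_FAILED_SEMANTICALLY = "tool_failed_semantically"
    AUTHORITY_NOT_PROVEN = "authority_not_proven"


class ProviderError(Exception):
    """Hypothetical: the upstream provider was unreachable or errored."""


class ToolSemanticError(Exception):
    """Hypothetical: the tool ran but produced a semantically invalid result."""


# Assumed follow-up policy; a real system would choose its own.
NEXT_STEP = {
    Outcome.COMPLETED: "done",
    Outcome.PROVIDER_FAILED: "retry",
    Outcome.TOOL_FAILED_SEMANTICALLY: "replan",
    Outcome.AUTHORITY_NOT_PROVEN: "escalate",
}


def run_step(authority_proven: bool, call: Callable[[], None]) -> Outcome:
    """Classify one step; the call is never invoked without proven authority."""
    if not authority_proven:
        return Outcome.AUTHORITY_NOT_PROVEN
    try:
        call()
    except ProviderError:
        return Outcome.PROVIDER_FAILED
    except ToolSemanticError:
        return Outcome.TOOL_FAILED_SEMANTICALLY
    return Outcome.COMPLETED
```

All three failure outcomes mean the task did not complete, but each maps to a different next step, which is why collapsing them into one "failed" state loses information.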
Roman + maintainers -- quick follow-up on the "pre-action authority as observability dimension" thread.

Since the original comment, we've moved the receipt pattern from a benchmark into a live, post-quantum-signed implementation that any agent stack can call as an MCP tool. Sharing the concrete artifact + a real signed receipt as a calling card, since "show me the receipt" is the only honest way to discuss this.

Live MCP server:

Receipt-on-trace integration pattern (the part most relevant to AgentOps):

import json

import agentops
import requests
from agentops.sdk.decorators import operation

@operation
def deploy(service: str, version: str):
    # 1. mint the authority/decision receipt BEFORE side effects
    response = requests.post(
        "https://trust-gate-mcp.onrender.com/mcp",
        json={"jsonrpc": "2.0", "id": 1, "method": "tools/call",
              "params": {"name": "mint_action_receipt", "arguments": {
                  "agent_id": "ci-deployer",
                  "operation": "deploy",
                  "target": f"prod/{service}",
                  "policy": "EU AI Act Art 12",
                  "inputs": f"service={service};version={version}"}}},
        headers={"Content-Type": "application/json",
                 "Accept": "application/json, text/event-stream"},
        timeout=30,
    )
    response.raise_for_status()
    # the tool result is a text content item, assumed here to hold the receipt
    # as a JSON string, so it must be parsed before indexing by key
    receipt = json.loads(response.json()["result"]["content"][0]["text"])
    # 2. attach the atom_id to the current operation's tags so the trace
    #    references the verifiable receipt by its content hash
    agentops.update_trace_metadata({"trust_gate_atom_id": receipt["atom_id"]})
    # 3. do the actual side effect (deploy_actual is the caller's own function)
    deploy_actual(service, version)
    # 4. anyone can verify the receipt offline from the cert alone
    return receipt

In this pattern, the trace carries a tamper-evident pointer to the authority decision. If the action turns out badly later, the receipt is the evidence -- verifiable without trusting AgentOps, the agent, or us.

Calling card -- here's a real signed receipt minted at 2026-06-25T20:45:35Z keyed to this thread:

Anyone can verify with one POST to the public

Framework adapters live today as source repos (not yet on PyPI): CWNApps/langchain-trust-gate, CWNApps/crewai-trust-gate, CWNApps/llama-index-trust-gate. Each is a thin transport wrapper -- signing happens on the hosted MCP server.

Honest scope:

Happy to send a PR adding the integration pattern as an example in the AgentOps docs if you'd find that useful.
This is useful, but I would keep the docs example narrow: a receipt reference on an observability trace should prove that a pre-action decision record exists, not that the trace system itself granted authority.

The integration pattern I would want in AgentOps docs has three explicit boundaries:

I would avoid making the example look like a generic "sign this trace" pattern. The useful test is that deny, defer, expiry, and argument drift are visible even when no normal tool span exists.

Regression shape:

That would make the docs example an authority-to-observability join, not an assertion that observability metadata is authority.

Boundary: architecture and test feedback only; no claim about using this project, running the linked service, or verifying the implementation.
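One way to picture the test described in this comment is a small harness that checks the four named cases (deny, defer, expiry, argument drift) each leave a gate event and no tool span. This is a sketch under stated assumptions, covering only those four cases plus the allow path: the `Receipt` fields, the outcome strings, and the event names `authority_gate` and `tool_span` are hypothetical and do not reflect any shipped schema.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any


def args_digest(args: dict[str, Any]) -> str:
    """Stable SHA-256 digest of the arguments the receipt was minted for."""
    return hashlib.sha256(json.dumps(args, sort_keys=True).encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Receipt:
    decision: str      # "allow", "deny", or "defer" (assumed vocabulary)
    digest: str        # digest of the approved arguments
    expires_at: float  # epoch seconds after which the receipt is stale


def check(receipt: Receipt, args: dict[str, Any], now: float) -> str:
    """Re-check the receipt against the arguments actually about to run."""
    if receipt.decision != "allow":
        return receipt.decision
    if now >= receipt.expires_at:
        return "expired"
    if args_digest(args) != receipt.digest:
        return "argument_drift"
    return "allow"


def execute(receipt: Receipt, args: dict[str, Any], now: float, trace: list) -> str:
    """Always record the gate outcome; record a tool span only on allow."""
    outcome = check(receipt, args, now)
    trace.append({"type": "authority_gate", "outcome": outcome})
    if outcome == "allow":
        trace.append({"type": "tool_span"})
    return outcome
```

The design choice here is that the gate event is appended unconditionally, so a query for prevented actions does not depend on a tool span ever having been created.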
Thank you both -- this is exactly the level of pre-action vs. post-action separation the OAO primitive is reaching for, and the gaps you each name are real. Below is an honest map of the current schema to your proposed shapes, what already matches, and what we should extend.

Quick state update first. Live since the earlier comments in this thread (today):

Everything below is verifiable against shipped code, not vapor.

On the minimal event-shape question (pre-tool-call as a first-class event, raised by the prior comment from Armorer Labs):

The "trace tends to bias toward actions that executed" line is the most useful single critique I've seen in this thread; the prevented cases are exactly what ops needs.

On @rpelevin's three explicit boundaries (authority-to-observability join, not "metadata is authority"):

On the six-path regression shape:

What I will do, not promise:

Calling card (verify offline, cert-only):

Apache-2.0; PRs to the schema are welcome. Boundary respected on the Armorer Labs disclosure -- treating this as architecture feedback, not a claim about your project. Glad to compare schema work in either direction.
@cristianleoo, your phrasing of the causal chain (proposed action -> authority decision -> execution only if allowed -> post-action observation) is the cleanest articulation of the boundary we have seen, and the 8-field event shape maps closely to what we ship today as the OpenAgentOntology atom.

Mapping your fields against OAO v0.2.0 (Apache-2.0, openagentontology on PyPI):

I am drafting a PR against CWNApps/openagentontology that adds

On Armorer/Guard separation: approval/authority receipts being independently reviewable from the execution trace is what makes the difference between an audit log and an evidence record. The two artifacts have different lifecycles, different consumers, different signing keys; collapsing them is the failure mode most agent stacks default to today.
Hi maintainers/users - I am testing an authority-before-action receipt pattern as a missing dimension in agent observability.
The narrow question is: beyond tracing what an agent did, can observability/eval surfaces show whether the agent could prove authority before a consequential action executed?
I put a credential-free dry-run benchmark here:
https://github.com/neurarelay/relay-action-card
For this repo, the closest starting path is:
This is not an AgentOps approval, endorsement, integration, partnership, listing, pass, or fail claim. It is a proposed benchmark/observability dimension for authority-before-action readiness.
Question for maintainers/users here: is pre-action authority useful as an observability/eval dimension for agents, or would the better insertion point be somewhere else?