Grok connector integrity: proof-of-write verification (Ring 1)#1

Merged
templetwo merged 3 commits into main from feat/grok-connector-integrity
May 25, 2026

Conversation

@templetwo
Owner

@templetwo templetwo commented May 25, 2026

Why

Addresses the diagnosed root cause of the Grok stack-write hallucination (open thread thread_20260524_141309): Grok narrates performing a Ring 2 write, but nothing lands in pending_writes.

Root cause (verified from code): the gap is xAI-side, not ours. The Grok Ring 2 path is correct end to end, and the OpenAI bridge uses the same bridge_core pipeline and creates proposals fine. xAI's connector dispatches Ring 1 reads to our SSE handler but never dispatches Ring 2 write-class calls. The membrane is blind by design to a dispatch that never arrives.

The fix is integrity, not coercion: a model's narrated write-claim is never evidence. The only trustworthy proof of a write is a bridge-issued, queryable artifact (a real proposal_id in pending_writes with a valid audit hash). Reads do dispatch, so verification rides Ring 1.

What's in this PR (3 commits)

1. Proof-of-write primitive (bridge_core, additive):

  • get_proposal_by_id() / verify_proposal(): recompute a proposal's creation-time audit hash and return found / chain_valid. found=False means "this write never landed."
  • Audit variants: NARRATED_BUT_NOT_EXECUTED, RING2_CAPABILITY_FAILED, RING2_CAPABILITY_VERIFIED.
  • First grok_bridge smoke test (hermetic).

2. Ring 1 exposure (LIVE — needs restart to take effect):

  • verify_proposal + list_bridge_proposals as Ring 1 tools so Grok / a relay can self-verify against the real queue.

3. Connect-time capability probe (generic, detector-mode default):

  • bridge_core/probe.py — pure arm/resolve/await primitive, per-connection keyed, cleans up in finally.
  • probe_ring2_dispatch sentinel (Ring 2, intercepted before proposal creation as a dry-run that writes zero proposals and resolves the probe). Detects whether the connector dispatches writes at handshake.
  • Detector mode by default. BridgeContext.require_ring2_probe defaults to False: a timeout records RING2_CAPABILITY_FAILED but never disables Ring 2. Only require_ring2_probe=True (opt-in, e.g. Gemini from day one) hard-gates, and only for that connection. Global RING_2_ENABLED is never mutated. OpenAI is byte-for-byte unchanged (smoke 25/28 on this branch, identical to main).
  • Live connect-await wired in handle_grok_sse behind PROBE_ON_CONNECT (env, default off), as a background task so the connection never blocks. grok_welcome instructs the model to call the probe first.

Tests

  • grok smoke 62/62 (incl. real-proposal chain_valid=True, fabricated-UUID found=False, arm/resolve/timeout + cleanup, detector-vs-hard-gate invariant, dry-run writes zero proposals).
  • pytest -k "bridge or pending or interceptor or audit": 9/9. Full suite: 968/968.
  • OpenAI smoke unchanged from main baseline (25/28, pre-existing failures, untouched by this PR).

Operator steps (intentionally NOT done here)

This PR does not restart the running bridge and leaves the live probe off. To activate against a real Grok session: merge → restart the grok bridge → set PROBE_ON_CONNECT=true → verify the sentinel arrives (or times out, which is itself the diagnosis).

Test-depth honesty

The live SSE connect-await and the handle_bridge_tool dispatch branches are tested at the primitive/interceptor level, not through a full MCP SSE handshake: there is no mock for connect_sse, and handle_bridge_tool reads the global _CONTEXTS registry, which is populated only at startup. The decision logic, sentinel classification, dry-run zero-proposal guarantee, and audit events are all directly exercised.

Generalize to Gemini

Bake in from day one: write-confirmation depends on a bridge-issued queryable artifact (never response text); mandatory consumer-side verify_proposal; require_ring2_probe=True; session-level narrated-vs-landed tripwire.
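One possible shape for the session-level narrated-vs-landed tripwire is sketched below. It is an illustration under assumptions, not part of this PR: the `tripwire` function, the UUID regex, and the idea of scanning response text for proposal ids are hypothetical. The `verify` callable stands in for whatever Ring 1 `verify_proposal` call the consumer has available and is assumed to return a mapping with a `found` key.

```python
import re
from typing import Callable

# Assumption: proposal ids are UUIDs that appear verbatim in the response text.
UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)


def tripwire(response_text: str, verify: Callable[[str], dict]) -> list[tuple[str, str]]:
    """Return an audit event for every claimed id the queue does not contain."""
    events = []
    for proposal_id in UUID_RE.findall(response_text):
        result = verify(proposal_id)
        if not result.get("found"):
            # The text claims a write, but no bridge-issued artifact exists.
            events.append(("NARRATED_BUT_NOT_EXECUTED", proposal_id))
    return events
```

The point of the design is that response text is only ever a source of claims to check; confirmation comes solely from the queue lookup.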

🤖 Generated with Claude Code

thetempleoftwo and others added 3 commits May 25, 2026 02:24
…ity)

Additive foundation for closing the Grok narrated-but-not-dispatched write
gap. xAI's connector dispatches Ring 1 reads but silently never dispatches
Ring 2 writes, so Grok narrates proposals that never land in pending_writes
and the membrane is blind to a dispatch that never arrives. The fix is
integrity, not coercion: provide a trustworthy READ-only way to confirm
whether a claimed proposal actually exists (reads DO dispatch).

- bridge_core/pending_writes.py: get_proposal_by_id() loads a proposal by id
  and recomputes the creation-time audit hash (mutable lifecycle fields
  restored) to report found / chain_valid / hash_mismatch. found=False is the
  canonical signal that a narrated write never landed.
- bridge_core/interceptor.py: verify_proposal() public surface (thin wrapper).
- bridge_core/audit.py: three additive AuditEvent variants
  (NARRATED_BUT_NOT_EXECUTED, RING2_CAPABILITY_FAILED/VERIFIED).
- clients/grok_bridge/_smoke_test.py: first grok-side smoke test (hermetic,
  tmp BridgeContext). 34/34 pass incl. real-proposal chain_valid=True and
  fabricated-UUID found=False regression test.
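The found / chain_valid / hash_mismatch check can be illustrated with a small self-contained sketch. It assumes a one-JSON-file-per-proposal queue, SHA-256 over canonical JSON, UUID proposal ids, and a single mutable lifecycle field (`status`); none of these details are confirmed by the PR, and the real bridge_core/pending_writes.py may differ on every one of them. `create_proposal` and `creation_hash` are hypothetical helpers added only so the example runs.

```python
import hashlib
import json
import tempfile
import uuid
from pathlib import Path

# Assumption: "status" is the only mutable lifecycle field, created as "pending".
CREATION_DEFAULTS = {"status": "pending"}
NOT_FOUND = {"found": False, "chain_valid": False, "hash_mismatch": False}


def creation_hash(proposal: dict) -> str:
    """Hash the proposal as it looked at creation time."""
    restored = {k: v for k, v in proposal.items() if k != "audit_hash"}
    restored.update(CREATION_DEFAULTS)  # restore mutable lifecycle fields
    canonical = json.dumps(restored, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def create_proposal(queue_dir: Path, payload: dict) -> str:
    """Hypothetical writer: store the proposal with its creation-time hash."""
    proposal = {"proposal_id": str(uuid.uuid4()), "payload": payload, "status": "pending"}
    proposal["audit_hash"] = creation_hash(proposal)
    (queue_dir / f"{proposal['proposal_id']}.json").write_text(json.dumps(proposal))
    return proposal["proposal_id"]


def get_proposal_by_id(queue_dir: Path, proposal_id: str) -> dict:
    """Read-only check: does the proposal exist, and does its hash recompute?"""
    try:
        uuid.UUID(proposal_id)  # reject anything that is not a UUID
    except ValueError:
        return dict(NOT_FOUND)
    path = queue_dir / f"{proposal_id}.json"
    if not path.exists():
        return dict(NOT_FOUND)  # the narrated write never landed
    proposal = json.loads(path.read_text())
    valid = creation_hash(proposal) == proposal.get("audit_hash")
    return {"found": True, "chain_valid": valid, "hash_mismatch": not valid}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        queue = Path(tmp)
        real_id = create_proposal(queue, {"op": "write", "key": "demo"})
        print(get_proposal_by_id(queue, real_id))            # a real proposal
        print(get_proposal_by_id(queue, str(uuid.uuid4())))  # a fabricated id
```

Because the lifecycle field is restored before hashing, a later status change leaves chain_valid true, while any edit to the payload shows up as a hash mismatch.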

Additive only — no live SSE dispatch path touched. MCP tool registration
(Ring 1 exposure of verify_proposal) and the connect-time Ring 2 capability
probe are held for sign-off + bridge restart. 968 + bridge tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Ring 1

Wire the read-only proof-of-write primitive into the Grok bridge as Ring 1
MCP tools so Grok (or a relay/verifier) can confirm whether a claimed Ring 2
write actually landed in the queue. Reads dispatch fine through xAI's
connector; writes do not, so a read-class verification tool is how a
consumer checks the queue instead of trusting narrated text.

- rings.py: add verify_proposal + list_bridge_proposals to RING_1_TOOLS.
- tool_adapter.py: descriptions (found=False means the write never landed)
  + input schemas in _ring1_schemas().
- mcp_filtered.py: bridge-local dispatch branches (read the local
  pending_writes_dir; in the not-found case, verify_proposal states explicitly
  that a narrated write is not a real write).
- _smoke_test.py: assert both tools classify Ring 1, appear in the Ring 1
  schema list, and round-trip at the interceptor level (found=True for a real
  proposal, found=False for a fabricated UUID). 47/47 pass.
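A rough picture of the classification and schema side of this commit follows. The tool names and the `RING_1_TOOLS` identifier come from the commit message; everything else is an assumption for illustration, including `RING_2_TOOLS`, `classify_ring`, the public `ring1_schemas` name, the description strings, and the exact schema fields (modelled on the name / description / inputSchema shape that MCP tool listings use).

```python
RING_1_TOOLS = frozenset({"verify_proposal", "list_bridge_proposals"})
# Hypothetical: a minimal Ring 2 set so the classifier has something to contrast with.
RING_2_TOOLS = frozenset({"probe_ring2_dispatch"})


def classify_ring(tool_name: str) -> int:
    """Return the ring a tool belongs to; unknown tools are rejected."""
    if tool_name in RING_1_TOOLS:
        return 1
    if tool_name in RING_2_TOOLS:
        return 2
    raise KeyError(f"unknown tool: {tool_name}")


def ring1_schemas() -> list[dict]:
    """Tool descriptors for the read-only verification tools."""
    return [
        {
            "name": "verify_proposal",
            "description": "Check whether a proposal exists in the queue. "
            "found=False means the write never landed.",
            "inputSchema": {
                "type": "object",
                "properties": {"proposal_id": {"type": "string"}},
                "required": ["proposal_id"],
            },
        },
        {
            "name": "list_bridge_proposals",
            "description": "List proposals currently in the pending-writes queue.",
            "inputSchema": {"type": "object", "properties": {}},
        },
    ]
```

Keeping both tools in the read class is what lets them dispatch through a connector that drops write-class calls.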

LIVE change: requires a bridge restart to take effect; does not alter Ring 2.
The connect-time capability probe is still held pending design discussion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ault)

Detects at handshake whether a connector actually dispatches Ring 2
write-class calls to our SSE handler, vs silently only doing Ring 1 — the
failure mode behind Grok's narrated-but-never-landed writes. Generic in
bridge_core so the future Gemini connector inherits it.

Safety: DETECTOR MODE BY DEFAULT. BridgeContext.require_ring2_probe defaults
False — a probe timeout records an audit event (RING2_CAPABILITY_FAILED) but
NEVER disables Ring 2. Only require_ring2_probe=True opts a connector into
hard-gating, and even then the disable is scoped to that connection (global
RING_2_ENABLED is never mutated). The OpenAI bridge is byte-for-byte
unchanged and provably unaffected (smoke 25/28 on this branch == on main).

- bridge_core/probe.py: pure, testable arm/resolve/await primitive; module
  registry keyed per connection; await_probe cleans up in finally.
- bridge_core/context.py: require_ring2_probe flag (default False).
- grok_bridge: probe_ring2_dispatch sentinel (Ring 2, intercepted before
  proposal creation as a dry-run that writes ZERO proposals + resolves the
  probe); per-connection id via ContextVar; connection-scoped hard-gate check;
  grok_welcome instructs the model to call the probe first.
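The sentinel interception can be sketched as below: the probe call is caught before any proposal is created, so it writes nothing and only marks the current connection as resolved. The `Ring2Interceptor` class, its attributes, and the `CONNECTION_ID` variable are hypothetical stand-ins; the PR states only that the sentinel is named probe_ring2_dispatch, that it is a dry-run writing zero proposals, and that the connection id travels via a ContextVar.

```python
import uuid
from contextvars import ContextVar

# Hypothetical per-connection id, set by the SSE handler for each connection.
CONNECTION_ID: ContextVar[str] = ContextVar("connection_id", default="unknown")
SENTINEL_TOOL = "probe_ring2_dispatch"


class Ring2Interceptor:
    """Toy interceptor: real writes create proposals, the sentinel does not."""

    def __init__(self) -> None:
        self.proposals: dict[str, dict] = {}
        self.resolved_connections: set[str] = set()

    def handle(self, tool: str, args: dict) -> dict:
        if tool == SENTINEL_TOOL:
            # Dry-run: prove the write-class call arrived, create nothing.
            self.resolved_connections.add(CONNECTION_ID.get())
            return {"dry_run": True, "proposal_id": None}
        proposal_id = str(uuid.uuid4())
        self.proposals[proposal_id] = {"tool": tool, "args": args}
        return {"dry_run": False, "proposal_id": proposal_id}
```

A handler would typically call `CONNECTION_ID.set(...)` at connect time and `reset` the returned token on disconnect, so concurrent connections never see each other's id.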

Live connect-await is wired in handle_grok_sse behind PROBE_ON_CONNECT (env,
DEFAULT OFF): it launches a background task so the connection never blocks.
Off by default because the live await needs a real xAI session to verify
timing — that, plus the bridge restart, is the operator's step. Primitives,
sentinel dispatch, detector/hard-gate decision, and audit events are all
unit-tested. grok smoke 62/62; full suite 968/968.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@templetwo templetwo merged commit 6f6fdf8 into main May 25, 2026
0 of 3 checks passed
@templetwo templetwo deleted the feat/grok-connector-integrity branch May 25, 2026 07:10
2 participants