OT-RFC-38 LU-6 C4 — two-laptop testnet validation runbook #625
branarakic wants to merge 1 commit into
The local devnet harnesses (`scripts/devnet-test-rfc38-*.sh`) all use a libp2p-private mesh: loopback dialing, no DHT, no NAT traversal. They do not cover the failure modes that only surface when peers traverse real internet hops, which is the LAST remaining gate before declaring LU-6 mainnet-ready.

This runbook is the C4 companion: an end-to-end checklist for operating two laptops on different NATs + a core operator's existing testnet node, exercising the full LU-6 lifecycle on real network conditions:

- Curated CG creation + on-chain registration on Base Sepolia
- Discovery-beacon propagation across the public mesh
- Cross-NAT/DHT gossip delivery of opaque SWM ciphertext to a core
- Host-catchup wire protocol over real RTT
- Signature-based host-catchup authorization (B1, #618)
- Member catchup resume across NAT/connection churn (B-series)
- VM publish from edge nodes with deferred on-chain registration
- Cross-laptop attestation cross-verification
- Stress + unclean-restart scenarios over the public mesh

Explicitly out of scope:

- LU-11 chunked ciphertext commitment (still in design, #617)
- RFC-39 random sampling (depends on LU-11)
- Multi-million-message scale (use the synthetic harnesses)

User-led (requires two real laptops on separate networks).

Co-authored-by: Cursor <cursoragent@cursor.com>
Capture the three agent addresses + libp2p peer IDs:
```bash
dkg show
```
🔴 Bug: `dkg show` is not a top-level CLI command in this repo, and later references to `dkg show-cg` / `dkg shared-memory host-mode stats` also don't exist. An operator following the runbook will fail before step 1. Replace these with existing surfaces such as `dkg status`, `dkg wallet`, `GET /api/agent/identity`, `dkg context-graph info`, and `GET /api/shared-memory/host-mode/stats`.
Verify on laptop A:
```bash
curl -sH "Authorization: Bearer $(cat ~/.dkg/auth.token)" \
```
🔴 Bug: `~/.dkg/auth.token` is a commented file by default, so `$(cat ~/.dkg/auth.token)` injects both the comment and the token into the `Authorization` header. These curl examples can 401 even on a healthy node. Use `dkg auth show` or strip comments/blank lines before interpolating the token.
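For the "strip comments/blank lines" option, a minimal sketch could look like the following. The helper name `read_token` is hypothetical (it is not part of the `dkg` CLI), and the sketch assumes the token is the first line that is neither blank nor starts with `#`; `dkg auth show` remains the simpler route where it is available.

```bash
#!/usr/bin/env bash
# Sketch only: read_token is a hypothetical helper, not a dkg command.
# Assumption: the token is the first non-comment, non-blank line of the file.
read_token() {
  local file="$1"
  # 1) strip trailing whitespace (including a stray \r)
  # 2) drop lines whose first non-space character is '#'
  # 3) drop lines that are now empty
  # then keep only the first surviving line
  sed -e 's/[[:space:]]*$//' \
      -e '/^[[:space:]]*#/d' \
      -e '/^$/d' "$file" | head -n 1
}

# Usage (path and URL taken from the runbook snippet under review):
#   TOKEN="$(read_token ~/.dkg/auth.token)"
#   curl -sH "Authorization: Bearer ${TOKEN}" http://localhost:9200/api/context-graph/list
```

Reading the token into a variable once also avoids re-reading the file on every curl call.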
Verify on laptop A:
```bash
curl -sH "Authorization: Bearer $(cat ~/.dkg/auth.token)" \
  http://localhost:9200/api/context-graph/list | jq '.[] | select(.access=="curated")'
```
🔴 Bug: `/api/context-graph/list` returns an object with a `contextGraphs` array, not a bare array, and the items expose `accessPolicy`, not `access`. This jq filter will never match, so the verification step gives a false negative. Query `.contextGraphs[]` and filter on `id`/`accessPolicy` instead.
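A corrected filter might look like the sketch below. It assumes the envelope shape described in this comment and the numeric `accessPolicy` encoding given in the follow-up commit message (0=public, 1=curated); the function name `curated_graph_ids` is hypothetical, and `jq` must be installed, as the runbook already requires.

```bash
#!/usr/bin/env bash
# Sketch only: curated_graph_ids is a hypothetical helper name.
# Assumptions: the response is { "contextGraphs": [ ... ] } and
# accessPolicy == 1 means "curated".
curated_graph_ids() {
  # Reads the list response on stdin, prints one curated CG id per line.
  jq -r '.contextGraphs[] | select(.accessPolicy == 1) | .id'
}

# Usage (TOKEN obtained separately, e.g. via `dkg auth show`):
#   curl -sH "Authorization: Bearer ${TOKEN}" \
#     http://localhost:9200/api/context-graph/list | curated_graph_ids
```

Printing only the `id` makes it easy to `grep` for the expected `<cg-id>` in the verification step.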
```bash
curl -sH "Authorization: Bearer $(cat ~/.dkg/auth.token)" \
  -H 'Content-Type: application/json' \
  -d '{ "contextGraphId": "<cg-id>" }' \
```
🔴 Bug: omitting `peerId` means `/api/shared-memory/catchup` fans out to whatever peers happen to be connected. In this topology that may hit laptop A directly or no peer at all, so it does not reliably validate the intended 'via the core' host-catchup path. Pass the core's `peerId`, or use `/api/shared-memory/host-catchup` if the goal is to exercise hosted ciphertext replay specifically.
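Pinning the request to the core could be sketched as below. The helper name `catchup_body` is hypothetical, and the sketch assumes both ids contain no characters that need JSON escaping (no quotes or backslashes); the field names `contextGraphId` and `peerId` are the ones named in this thread.

```bash
#!/usr/bin/env bash
# Sketch only: catchup_body is a hypothetical helper name.
# Assumption: neither id contains quotes or backslashes, so plain
# printf interpolation yields valid JSON.
catchup_body() {
  local cg_id="$1" core_peer_id="$2"
  printf '{"contextGraphId":"%s","peerId":"%s"}' "$cg_id" "$core_peer_id"
}

# Usage (TOKEN and CORE_PEER_ID are obtained separately):
#   curl -sH "Authorization: Bearer ${TOKEN}" \
#     -H 'Content-Type: application/json' \
#     -d "$(catchup_body '<cg-id>' "${CORE_PEER_ID}")" \
#     http://localhost:9200/api/shared-memory/catchup
```

With `peerId` set explicitly, a failure points at the core path rather than at whichever peer happened to be connected.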
List the local triples:
```bash
curl -sH "Authorization: Bearer $(cat ~/.dkg/auth.token)" \
  "http://localhost:9200/api/shared-memory/list?contextGraphId=$(printf %s '<cg-id>' | jq -sRr @uri)" | jq '.triples | length'
```
🔴 Bug: there is no `GET /api/shared-memory/list` route in the daemon, so this verification command will 404. To prove the member received data, either inspect `totalInsertedTriples` from the catchup response or issue `/api/query` against the CG's `_shared_memory` graph.
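For the `/api/query` option, the count query itself could be built as in the sketch below. The helper name `swm_count_query` is hypothetical, and the graph IRI is left as a caller-supplied argument because the exact naming of the CG's `_shared_memory` graph is not spelled out in this thread; the request body shape expected by `/api/query` is likewise not shown here and should be taken from the existing devnet scripts.

```bash
#!/usr/bin/env bash
# Sketch only: swm_count_query is a hypothetical helper name.
# Assumption: the caller passes the full IRI of the CG's _shared_memory
# graph; this sketch does not know how that IRI is derived.
swm_count_query() {
  local graph_iri="$1"
  # Standard SPARQL 1.1 aggregate: count every triple in the named graph.
  printf 'SELECT (COUNT(*) AS ?n) WHERE { GRAPH <%s> { ?s ?p ?o } }' "$graph_iri"
}

# Usage (placeholder IRI, not a real graph name):
#   QUERY="$(swm_count_query '<shared-memory-graph-iri>')"
```

A non-zero `?n` on the member after catchup is the signal that ciphertext was replayed and decrypted locally.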
…ANGELOG fix

PR #625's runbook documented HTTP and CLI surfaces that don't exist on the rc.10 daemon: operators following it from step 1 would 404 / 401 their way through every section. PR #638's LU-8 CHANGELOG entry promised a server-side reconstruction path for `verify-batch` that the actual route refuses. Both are doc bugs only, no code change.

docs/RFC38_LU6_TWO_LAPTOP_TESTNET_RUNBOOK.md:

- Replaces `dkg show` (never a real top-level command) with the actual surface: `dkg status` (peerId/multiaddrs/role), `dkg auth show` (token, stripped of comments), and `GET /api/agent/identity` for the agent EOA. Same for `dkg show-cg` (compute the wire id as `keccak256(<cgId>)` since there's no CLI shortcut) and `dkg shared-memory host-mode stats` (use `GET /api/shared-memory/host-mode/stats` directly).
- Every `$(cat ~/.dkg/auth.token)` interpolation now uses `dkg auth show` instead. The token file has a commented-header preamble by default, so the literal `cat` would inject `#\n<token>` into the `Authorization` header and 401 even on a healthy node. `dkg auth show` strips comments + blank lines, matching what `packages/node-ui/vite.config.ts` does for the same reason.
- `/api/context-graph/list` filter corrected: the response is `{ contextGraphs: [...] }` (envelope, not bare array) and items expose `accessPolicy` (numeric: 0=public, 1=curated), not `access`. The old `jq '.[] | select(.access=="curated")'` filter would have silently matched nothing.
- Member catchup now pins `peerId` to the core's libp2p identity. Without `peerId`, `/api/shared-memory/catchup` fans out to every connected peer, which in a two-laptop+core topology can hit laptop A directly or no peer at all, and won't reliably validate the "via the core" host-catchup path the runbook claims to verify. Showed how to grab the peer id from `/api/status` first.
- Replaced the `/api/shared-memory/list?contextGraphId=...` curl (404, no such route) with a SPARQL `SELECT (COUNT(*) AS ?n)` via `/api/query` against the `_shared_memory` graph suffix, the same shape every devnet script uses for the equivalent assertion.

CHANGELOG.md:

- LU-8 entry no longer claims "when `quads` is omitted the route reconstructs from the local SWM or post-publish CG data graph". That hasn't been true since the safety hardening in the published route (`packages/cli/src/daemon/routes/memory.ts:1042-1049`): `quads` is **required**, HTTP 400 on omission, because the daemon can't safely identify a single published batch's leaves inside a CG-wide store. Devnet scenario `scripts/devnet-test-rfc38-lu8.sh` SCENARIO 1 pins this contract with a HTTP-400 assertion. The CHANGELOG now matches the route's actual behaviour + rationale.

Co-authored-by: Cursor <cursoragent@cursor.com>
Summary
The local devnet harnesses (`scripts/devnet-test-rfc38-*.sh`) all use a libp2p-private mesh — loopback dialing, no DHT, no NAT traversal. They do not cover the failure modes that only surface when peers traverse real internet hops, which is the last remaining gate before declaring LU-6 mainnet-ready.
This runbook is the C4 companion: an end-to-end checklist for operating two laptops on different NATs + a core operator's existing testnet node, exercising the full LU-6 lifecycle on real network conditions.
Stacked on #610. Pure documentation, no production-code changes.
What the runbook covers
Explicitly out of scope
Test plan
Made with Cursor