OT-RFC-38 LU-7/8/9/10 — catchup + verify + attestation + devnet harness#609

Merged: branarakic merged 4 commits into feat/ot-rfc-38-lu5 from feat/ot-rfc-38-lu7-10 on May 25, 2026

@branarakic
Contributor

Summary

Closes the curated-CG verification & late-joiner surface that LU-5 (#608) opened, plus end-to-end devnet validation for the whole Phase A slice. See docs/specs/SPEC_CG_HOSTING_MEMBERSHIP.md §7.1.1 for the implementation-status table and the documented LU-6 gap.

Stacked on: #608 (LU-5 edge publish), which is itself stacked on #595 (SPEC_CG_MEMORY_MODEL LU-1..LU-4). Reviewing in order keeps each diff small.

Source surfaces

  • LU-7 POST /api/shared-memory/catchup — caller-initiated SWMCatchupRequest. Single-peer mode or parallel fan-out across all connected peers. Public CGs accept anonymous catchup; curated CGs run authorizePrivateSyncRequest against the requester's signed envelope.
  • LU-8 POST /api/shared-memory/{verify-batch,report-batch-rejection} + packages/agent/src/swm/verify-batch.ts — member post-decrypt root recompute using V10's computeFlatKCRootV10 / computeFlatKCMerkleLeafCountV10. Mismatch → structured BatchRejection record gossiped via agent.share() so other members can refetch from a different host.
  • LU-9 POST /api/attestation/{mint,verify} + packages/agent/src/swm/member-attestation.ts — member signs an envelope binding (chainId, kavAddress, contextGraphId, batchId, merkleRoot, plaintextLeafHash, attesterAddress, attestedAt) with keccak256(abi.encodePacked(...)) + EIP-191 secp256k1, matching V10 chain-side signature layout so outsiders can hand-verify. Verify route runs signature recovery, signer-matches-attester, optional candidateLeaf rehash, and an optional async membershipResolver chain hook.
  • enumerate-cg-hosts (packages/agent/src/swm/enumerate-cg-hosts.ts) — distinct from enumerate-cg-members; returns dialable peer set for LU-7 catchup. Phase A returns all connected peers minus self; Phase B will refine to the sharding-table-eligible subset once shard count > 1.
  • packages/cli/src/daemon/routes/assertion.ts — small read surface additions the new attestation flow leans on.

Devnet harness (scripts/devnet-test-rfc38-*.sh)

11 standalone end-to-end scenarios, all driven through the daemon HTTP API (no custom libraries). devnet-test-rfc38-all.sh runs the full suite end-to-end and prints a consolidated pass/fail summary. Covered:

| id | what it exercises |
| --- | --- |
| lu5-pub | LU-5 public CG regression (edge publishes plaintext to VM, no encryption) |
| lu5-cur | LU-5 curated CG edge publish (chain-key AEAD wrap + no-attribution VM publish) |
| lu7 | LU-7 SWMCatchupRequest (public anonymous + curated member-auth + outsider denial) |
| lu8 | LU-8 verify-batch + report-batch-rejection (member post-decrypt root recompute + gossip) |
| lu9 | LU-9 member-attestation mint+verify (roundtrip + 3 negative-path scenarios) |
| lu10 | LU-10 public-CG regression sweep (publish + anonymous catchup + verify-batch + attestation, all on a public CG) |
| e2e | end-to-end lifecycle (LU-5 → LU-7 → LU-8 → LU-9 composed in one user-visible scenario) |
| xcg | cross-CG isolation (member of CG-A cannot read CG-B; outsider catchup denied; curator can still decrypt its own CGs) |
| mm | multi-member CG (3 distinct member wallets; each verify-batches the same root; outsider cross-verifies all 3 attestations) |
| scale | scale probe (50 triples / 25 KAs in one curated batch; full verify + attestation roundtrip) |
| lj | late-joiner (member-from-curator + member-from-member-with-curator-offline; documented LU-6 cores-only gap as passing fail-soft assertion) |

Also adds a scripts/devnet.sh restart-node N operation, which restarts a single node without wiping its state. The late-joiner scenario uses it to take the curator offline mid-test and bring it back.

Test plan

Unit:

  • packages/agent/test/verify-batch.test.ts — pure recompute helper unit tests
  • packages/agent/test/member-attestation.test.ts — mint+verify roundtrip + tamper detection + membership resolver paths
  • packages/agent/test/enumerate-cg-hosts.test.ts — dialable-peer enumeration

Devnet — all 11 scenarios PASS against a fresh 6-node devnet (4 cores + 2 edges, all wallets unique + funded, no on-chain identity for the edges):

[ok] lu5-pub  PASS
[ok] lu5-cur  PASS
[ok] lu7      PASS
[ok] lu8      PASS
[ok] lu9      PASS
[ok] lu10     PASS
[ok] e2e      PASS
[ok] xcg      PASS
[ok] mm       PASS
[ok] scale    PASS
[ok] lj       PASS

All 11 scenarios PASSED.

Run instructions

./scripts/devnet.sh start 6          # 4 cores + 2 edges, fresh wallets
./scripts/devnet-test-rfc38-all.sh   # ~10 min, 11 scenarios end-to-end

For UI manual testing — point Vite at any edge node:

DEVNET_UI_NODE=5 ./scripts/devnet.sh ui start
# http://localhost:5173/ui/ now proxies /api/* to node 5 (edge curator)

Deferred (Phase A sub-task, tracked for follow-up)

LU-6 substrate hosting on cores — cores do not yet subscribe to the curated-CG SWM gossip topic via the sharding-table assignment (RFC §5.1 + §5.1.1 pre-registration staging). Today's catchup model works when the curator OR any other current member is online; if every member is offline, a late joiner's catchup against cores returns 0 triples cleanly (no crash). devnet-test-rfc38-late-joiner.sh SCENARIO C asserts this fail-soft shape. Full LU-6 lands the encrypted SWM substrate (the SwmSenderKey two-layer Sender Keys construction already in packages/core/src/crypto/swm-sender-key.ts but not yet wired to the workspace-gossip topic) plus the TTL + byte-cap staging policies in §5.1.1. Path forward documented in docs/specs/SPEC_CG_HOSTING_MEMBERSHIP.md §7.1.1.

Made with Cursor

const swmGraphUri = contextGraphSharedMemoryUri(contextGraphId, subGraphName);
const dataGraphUri = `did:dkg:context-graph:${contextGraphId}`;
try {
const swmResult = await (agent as any).store.query(

🔴 Bug: when quads are omitted this reconstructs the candidate payload from the entire SWM/data graph, but expectedMerkleRoot is for a single KC/batch. As soon as a context graph contains more than one published batch, verify-batch will deterministically report root-mismatch for valid batches because unrelated triples are mixed in. Scope the local read to the requested batch/KC (or require callers to pass the exact batch quads) instead of querying every triple in the graph.
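
A minimal sketch of the failure mode described above. `toyRoot` is a hypothetical stand-in (SHA-256 over sorted leaf hashes), not the V10 `computeFlatKCRootV10`; it only illustrates that hashing a superset of a batch's quads cannot reproduce that batch's root.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical stand-in for the real root computation: hash each quad,
// sort the leaf hashes, then hash their concatenation.
function toyRoot(quads: string[]): string {
  const leaves = quads
    .map((q) => createHash('sha256').update(q).digest('hex'))
    .sort();
  return createHash('sha256').update(leaves.join('')).digest('hex');
}

// Two batches published into the same context graph.
const batchA = ['<urn:s1> <urn:p> "a" .', '<urn:s2> <urn:p> "b" .'];
const batchB = ['<urn:s3> <urn:p> "c" .'];

// The expected root covers batch A only.
const expectedRootA = toyRoot(batchA);

// Scoped read: recompute over exactly the batch's quads.
const scopedMatches = toyRoot(batchA) === expectedRootA;

// Whole-graph read: batch B's quads are mixed in, so the root differs.
const wholeGraphMatches = toyRoot([...batchA, ...batchB]) === expectedRootA;

console.log({ scopedMatches, wholeGraphMatches });
```

Under this assumption the mismatch does not depend on timing or ordering: any second batch in the graph is enough to change the recomputed root.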

try {
const cgList = await (agent as any).listContextGraphs?.();
const match = (cgList ?? []).find((cg: any) => cg.id === contextGraphId);
onChainCgId = match?.onChainId ?? '0';

🔴 Bug: this silently signs attestations with contextGraphId = "0" whenever the local subscription metadata cannot resolve an on-chain id. That produces tokens bound to the wrong domain even though the KC already exists on-chain. Resolve the CG id from chain truth (getKCContextGraphId(BigInt(batchId)) / getContextGraphOnChainId) and fail if it cannot be determined instead of minting an attestation with a placeholder id.
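
One possible shape for the fail-closed behaviour, as a sketch. `pickOnChainCgId` is a hypothetical helper, and it assumes on-chain CG ids are positive decimal strings and that the candidate ids (chain lookup, local metadata, and so on) have already been fetched and are passed in priority order.

```typescript
// Hypothetical helper: returns the first usable on-chain CG id from
// candidates listed in priority order, and throws instead of defaulting to "0".
function pickOnChainCgId(candidates: Array<string | undefined | null>): string {
  for (const candidate of candidates) {
    // Only positive decimal integers count; "0", "" and undefined do not.
    if (typeof candidate === 'string' && /^[1-9][0-9]*$/.test(candidate)) {
      return candidate;
    }
  }
  throw new Error('on-chain context graph id could not be resolved; refusing to mint');
}

// Example: the chain lookup returned nothing, local metadata has the id.
const resolved = pickOnChainCgId([undefined, '42']);
console.log(resolved);

// Example: nothing usable is available, so minting is refused.
let refused = false;
try {
  pickOnChainCgId([undefined, '0', '']);
} catch {
  refused = true;
}
console.log(refused);
```

A route handler could translate the thrown error into a client-facing 4xx response rather than signing with a placeholder id.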

{ subject, predicate: `${NS}rejectedByPeer`, object: `"${record.rejectedBy.peerId ?? ''}"`, graph: '' },
{ subject, predicate: `${NS}rejectionReportedAt`, object: `"${record.reportedAt}"`, graph: '' },
...(record.batchId !== undefined
? [{ subject, predicate: `${NS}rejectedBatchId`, object: `"${record.batchId}"`, graph: '' }]

🔴 Bug: batchId comes from the HTTP body and is interpolated into an RDF literal without escaping. A value containing ", newlines, or RDF syntax will either break the SWM write or let callers smuggle malformed triples through this endpoint. Escape the literal with the existing RDF helper (or reject unsafe input) before passing it to agent.share().
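
For illustration, a minimal escaping sketch. `escapeLiteralBody` is a hypothetical stand-in for the project's existing RDF helper (its actual code is not shown in this thread); it escapes the characters that may not appear raw inside a quoted N-Quads literal.

```typescript
// Hypothetical stand-in for the existing RDF helper. Escapes backslash,
// double quote, line feed and carriage return inside a literal body.
function escapeLiteralBody(value: string): string {
  return value
    .replace(/\\/g, '\\\\') // backslashes first, so later escapes are not doubled
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r');
}

// Wraps an escaped body in quotes to form a plain string literal.
function literal(value: string): string {
  return `"${escapeLiteralBody(value)}"`;
}

// A hostile batchId that tries to close the literal and start a new statement.
const hostile = '1" .\n<urn:x> <urn:y> "z';
const safe = literal(hostile);
console.log(safe);
```

With the quote and newline escaped, the whole value stays inside one literal, so the trailing text can no longer be parsed as a separate statement.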

Comment thread packages/agent/src/swm/verify-batch.ts Outdated
input.verifyResult.actualRoot,
input.verifyResult.reason ?? 'unknown',
input.rejectedBy.agentAddress,
reportedAt,

🟡 Issue: reportedAt is part of the rejection digest, so retries of the same rejection from the same member always mint a new subject URI instead of deduping. That defeats the stated hash-dedupe identical rejection reports behavior and makes idempotent re-reporting impossible. Use a stable digest key derived from the batch/root/rejecter fields and keep reportedAt as metadata outside the digest.
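
A sketch of a retry-stable digest. The field names and the SHA-256 over JSON encoding are assumptions made for illustration, not the record layout or hash used by `buildBatchRejectionRecord`.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical record shape for the sketch.
interface RejectionInput {
  contextGraphId: string;
  batchId?: string;
  expectedRoot: string;
  actualRoot: string;
  reason: string;
  rejecterAddress: string;
  reportedAt: string; // metadata only, deliberately excluded from the digest
}

// Digest over identity-bearing fields only, so a retry of the same rejection
// maps to the same digest (and therefore the same subject URI).
function rejectionDigest(r: RejectionInput): string {
  const identity = [
    r.contextGraphId,
    r.batchId ?? '',
    r.expectedRoot.toLowerCase(),
    r.actualRoot.toLowerCase(),
    r.reason,
    r.rejecterAddress.toLowerCase(),
  ];
  return createHash('sha256').update(JSON.stringify(identity)).digest('hex');
}

const first: RejectionInput = {
  contextGraphId: 'cg-1',
  batchId: '12',
  expectedRoot: '0xAA',
  actualRoot: '0xBB',
  reason: 'root-mismatch',
  rejecterAddress: '0xC0FFEE',
  reportedAt: '2026-05-24T17:00:00Z',
};
// Same rejection, reported again later.
const retry: RejectionInput = { ...first, reportedAt: '2026-05-24T17:05:00Z' };

console.log(rejectionDigest(first) === rejectionDigest(retry));
```

Encoding the fields as a JSON array keeps field boundaries unambiguous, which plain string concatenation would not.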

Comment thread scripts/devnet-test-rfc38-all.sh Outdated
log " OT-RFC-38 INTEGRATION RUN SUMMARY"
log "================================================================"
note "OT-RFC-38 INTEGRATION RUN SUMMARY"
note "Run started: $(date -u -r "$START_TS" +'%Y-%m-%dT%H:%M:%SZ')"

🔴 Bug: date -r <epoch-seconds> is BSD/macOS syntax; on GNU/Linux -r expects a file path, so the summary step fails on the project's primary dev/CI environment. Use a portable epoch conversion (date -u -d "@${START_TS}" ... on GNU, or a small POSIX-compatible helper) before relying on this runner in Linux devnets.

branarakic pushed a commit that referenced this pull request May 24, 2026
… review

Five correctness / safety / CI-portability fixes flagged in Codex's
latest review of #609. All are real bugs (injection, wrong-domain
attestation, deterministic false-positive, dedupe defeat, broken
non-macOS CI), not BC-only filler.

1. memory.ts /api/shared-memory/report-batch-rejection — RDF literal injection
   `batchId` (and every other HTTP-body-sourced value) was
   interpolated directly into an N-Quads literal body:
     `${NS}rejectedBatchId`, object: `"${record.batchId}"`
   A value containing `"`, newlines, or RDF syntax would either
   break the SWM insert outright OR let the caller smuggle
   attacker-controlled triples through this endpoint (the literal
   closes early, then the rest is parsed as fresh N-Quads). Now
   pipes every interpolated literal body through
   `escapeDkgRdfLiteral` from `@origintrail-official/dkg-core` —
   defense in depth so even fields like `expectedMerkleRoot` that
   are structurally constrained to 0x-hex still get escaped
   (input-validation regressions can't reopen the hole).

2. memory.ts /api/attestation/mint — fail-closed on unresolvable CG id
   Previously silently fell back to `onChainCgId = '0'` when local
   CG metadata couldn't resolve the on-chain id. That minted
   attestation tokens bound to ContextGraphId=0 (the sentinel for
   "no on-chain CG") even though a real KC for the batch already
   existed on-chain — outsiders verifying the token saw it pass
   cryptographic checks but reject as wrong-domain, with no
   diagnostic linking back to the actual CG.

   New three-layer resolver, all fail-closed:
     (a) Caller-supplied `onChainContextGraphId` (explicit override).
     (b) Chain-truth via `chain.getKCContextGraphId(batchId)` —
         authoritative because the KC ↔ CG binding is on-chain.
     (c) Local CG listing (last-resort; may be stale post-event-replay).
   If none resolve, returns 400 with a clear next-step ("pass
   `onChainContextGraphId` explicitly"). The endpoint NEVER mints
   against id=0 anymore.

3. memory.ts /api/shared-memory/verify-batch — refuse whole-graph
   reconstruction when `batchId` is supplied
   When `quads` were omitted, the endpoint loaded EVERY triple in
   the context graph's `_shared_memory` and data graphs. For a CG
   with more than one published batch (i.e. nearly every real CG
   after the second publish), this deterministically false-positives
   `root-mismatch` because we hash a superset of leaves against a
   single-batch `expectedMerkleRoot`. The triple store carries no
   per-batch label on SWM data, so we can't safely scope the read
   here without additional metadata. Now rejects with a 400 that
   tells the caller exactly what to do: supply the exact batch
   quads (typically already in hand from the LU-7 catch-up that
   produced the batch) or omit `batchId` if you really do want to
   verify the entire CG against a single root.

4. verify-batch.ts buildBatchRejectionRecord — `reportedAt` out of
   the digest (idempotent retry-dedupe)
   The jsdoc contract says "hash-dedupe identical rejection reports"
   but `reportedAt` was part of the digest input. Every retry of
   the same rejection (transient gossip drop, restart, etc.)
   therefore produced a NEW digest → NEW `did:dkg:batch-rejection:<digest>`
   subject URI → SWM substrate stored every retry as a distinct
   record, defeating dedupe on the consumer side. The digest now
   covers only identity-bearing fields (CG, batch, expected/actual
   root, reason, rejecter address); `reportedAt` stays in the
   record as metadata but doesn't gate identity. New focused test
   confirms two rejections with identical inputs but different
   `reportedAt` produce the same digest.

5. devnet-test-rfc38-all.sh — portable epoch→ISO date
   `date -r <epoch>` is BSD/macOS syntax; on GNU/Linux `-r` expects
   a FILE PATH, so the run-summary step previously errored on every
   GNU dev environment (incl. CI). Probe once and use the portable
   form: BSD path uses `-u -r`, GNU path uses `-u -d "@<epoch>"`.

Verification:
- `pnpm --filter @origintrail-official/dkg-agent exec vitest run
  test/verify-batch.test.ts` → 11/11 passing, including the new
  "digest is independent of reportedAt" retry-dedupe test.
- `pnpm exec tsc --noEmit` (cli package) → clean.
- `bash -n scripts/devnet-test-rfc38-all.sh` → syntax OK.

Branch was first rebased onto the updated `feat/ot-rfc-38-lu5` so
PR #608's auth-bypass / data-leak fixes flow up the RFC-38 stack
into this PR's diff.

Co-authored-by: Cursor <cursoragent@cursor.com>
@branarakic branarakic force-pushed the feat/ot-rfc-38-lu7-10 branch from 94c96bd to 1361ec9 Compare May 24, 2026 17:20
// 2. CG data graph (post-publish — selection moves quads from
// SWM into the named-graph as part of the seal step)
const swmGraphUri = contextGraphSharedMemoryUri(contextGraphId, subGraphName);
const dataGraphUri = `did:dkg:context-graph:${contextGraphId}`;

🔴 Bug: the durable fallback always reads the root data graph here. If the batch was published from a registered sub-graph, those quads live under the sub-graph data URI, so this path will report empty-quads/root-mismatch after SWM has been drained even though the publish succeeded. Use contextGraphDataUri(contextGraphId, subGraphName) (or equivalent) for the post-publish lookup.

// a proper chain-side resolver lands in Phase B with the
// membership-at-epoch SPARQL query). Returning undefined here
// surfaces `membership: 'unknown'` to the caller.
const result = await verifyMemberAttestation({

🔴 Bug: this unconditionally supplies a membership resolver, so /api/attestation/verify returns membership: "unknown" even when the caller did not request chainCheckMembership. That changes the route contract and makes it impossible to distinguish “not checked” from “checked but unavailable”. Only pass a resolver when chainCheckMembership === true; otherwise let the result stay "skipped".
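
A sketch of the opt-in wiring. All names are hypothetical, and the `'member'` / `'non-member'` statuses are assumptions; only `"skipped"` and `"unknown"` are named in this thread.

```typescript
// Hypothetical resolver type: undefined means "could not determine".
type MembershipResolver = (attester: string) => Promise<boolean | undefined>;

interface VerifyOptions {
  membershipResolver?: MembershipResolver;
}

// Route-side helper: forward the resolver only on an explicit boolean opt-in.
function buildVerifyOptions(
  body: { chainCheckMembership?: unknown },
  resolver: MembershipResolver,
): VerifyOptions {
  return body.chainCheckMembership === true ? { membershipResolver: resolver } : {};
}

// Maps "was a resolver supplied" plus its answer onto the reported status.
function membershipStatus(
  resolverSupplied: boolean,
  answer: boolean | undefined,
): 'skipped' | 'unknown' | 'member' | 'non-member' {
  if (!resolverSupplied) return 'skipped'; // caller did not ask for the check
  if (answer === undefined) return 'unknown'; // asked, but no answer available
  return answer ? 'member' : 'non-member';
}

const stubResolver: MembershipResolver = async () => undefined;
const notRequested = buildVerifyOptions({}, stubResolver);
const requested = buildVerifyOptions({ chainCheckMembership: true }, stubResolver);
console.log(notRequested.membershipResolver === undefined, requested.membershipResolver !== undefined);
```

Keeping the two "no answer" cases distinct lets a caller tell a check that was never requested from one that was requested but could not be completed.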

input: VerifyMemberAttestationInput,
): Promise<VerifyMemberAttestationResult> {
const { attestation, candidateLeaf, membershipResolver } = input;
const digest = computeAttestationDigest(attestation.payload);

🔴 Bug: computeAttestationDigest(attestation.payload) runs before any structural validation or try/catch. A malformed payload (chainId, bad hex in roots/addresses, etc.) will throw out of verifyMemberAttestation, which turns the new HTTP verifier into a 500 instead of a clean failed verification. Please validate the payload fields on the verify path as well, or catch digest-construction errors and return ok: false.

merkleRoot: rootHex,
author,
});
} catch (err: any) {

🔴 Bug: this route maps every chain lookup failure to 500. For an unknown KC id, the existing /api/kc/:id/author route already translates the same adapter errors to 404, and callers rely on that distinction to tell “not published yet” from “server failure”. Mirror the same unknown kcId|nonexistent|out-of-bounds handling here instead of always returning 500.
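
A minimal sketch of that mapping. `statusForChainLookupError` is a hypothetical helper; the message patterns are the ones quoted in the comment above, and matching on error message text is an assumption about how the adapter reports these cases.

```typescript
// Patterns taken from the review comment; case-insensitive on purpose.
const NOT_FOUND_PATTERN = /unknown kcId|nonexistent|out-of-bounds/i;

// Hypothetical helper: 404 for "this KC does not exist", 500 for everything else.
function statusForChainLookupError(err: unknown): 404 | 500 {
  const message = err instanceof Error ? err.message : String(err);
  return NOT_FOUND_PATTERN.test(message) ? 404 : 500;
}

console.log(statusForChainLookupError(new Error('unknown kcId 7')));
console.log(statusForChainLookupError(new Error('connection refused')));
```

Sharing one helper like this between the two routes would keep their 404/500 behaviour from drifting apart.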

@branarakic branarakic force-pushed the feat/ot-rfc-38-lu7-10 branch from 1361ec9 to f75623d Compare May 24, 2026 17:52
@branarakic branarakic force-pushed the feat/ot-rfc-38-lu5 branch from acd3841 to 9e1f0c2 Compare May 24, 2026 23:20
Branimir Rakic and others added 2 commits May 25, 2026 01:20
…late-joiner devnet harness

Closes the curated-CG verification & late-joiner surface that LU-5 (edge
publish) opened, plus end-to-end devnet validation for the whole Phase A
slice. See `docs/specs/SPEC_CG_HOSTING_MEMBERSHIP.md` §7.1.1 for the
implementation-status table and the documented LU-6 gap.

Source surfaces

  - LU-7 `POST /api/shared-memory/catchup` — caller-initiated
    SWMCatchupRequest. Single-peer mode or parallel fan-out across all
    connected peers. Public CGs accept anonymous catchup; curated CGs run
    `authorizePrivateSyncRequest` against the requester's signed envelope.
  - LU-8 `POST /api/shared-memory/{verify-batch,report-batch-rejection}` +
    `packages/agent/src/swm/verify-batch.ts` — member post-decrypt root
    recompute using V10's `computeFlatKCRootV10`/`computeFlatKCMerkleLeafCountV10`.
    Mismatch → structured `BatchRejection` record gossiped via `agent.share()`
    so other members can refetch from a different host.
  - LU-9 `POST /api/attestation/{mint,verify}` +
    `packages/agent/src/swm/member-attestation.ts` — member signs an
    envelope binding (chainId, kavAddress, contextGraphId, batchId,
    merkleRoot, plaintextLeafHash, attesterAddress, attestedAt) with
    keccak256(abi.encodePacked(...)) + EIP-191 secp256k1, matching the V10
    chain-side signature layout so outsiders can hand-verify. Verify route
    runs signature recovery, signer-matches-attester, optional candidateLeaf
    rehash, and an optional async membershipResolver chain hook.
  - `packages/agent/src/swm/enumerate-cg-hosts.ts` — distinct from
    `enumerate-cg-members`; returns dialable peer set for LU-7 catchup.
    Phase A returns all connected peers minus self; Phase B will refine to
    the sharding-table-eligible subset once shard count > 1.
  - `packages/cli/src/daemon/routes/assertion.ts` — small read surface
    additions that the new attestation flow leans on.

Devnet harness (`scripts/devnet-test-rfc38-*.sh`)

  - 11 standalone end-to-end scenarios, all driven through the daemon
    HTTP API (no custom libraries). `devnet-test-rfc38-all.sh` runs the
    full suite end-to-end and prints a consolidated pass/fail summary.
  - Covered: LU-5 (curated + public), LU-7, LU-8, LU-9, LU-10 (public-CG
    regression sweep), `e2e` (LU-5→LU-7→LU-8→LU-9 composed in one
    user-visible lifecycle), `cross-cg` (isolation: member of CG-A cannot
    decrypt CG-B; outsider catchup denied), `multi-member` (3 distinct
    member wallets cross-verify the same batch + cross-verify each
    other's attestations), `scale` (50 triples / 25 KAs single batch),
    `late-joiner` (member-from-curator + member-from-member with curator
    offline; plus a documented LU-6 cores-only gap as a passing fail-soft
    assertion).
  - `scripts/devnet.sh restart-node N` op surface (restart a single node
    without wiping state). The late-joiner scenario uses it to take the
    curator offline mid-test and bring it back.

Documentation

  - `docs/specs/SPEC_CG_HOSTING_MEMBERSHIP.md` §7.1.1 — implementation
    status table for Phase A: LU-5/7/8/9/10 landed, LU-6 deferred. The
    "deferred LU-6" subsection explains what still works on the current
    branch (member-from-curator and member-from-member catchup) vs what
    requires the substrate-subscription work (cores-only catchup when
    every member is offline).
  - `CHANGELOG.md` — Unreleased entry, scoped to OT-RFC-38 Phase A, with
    one bullet per LU and a single "Deferred" callout.

Run instructions

    ./scripts/devnet.sh start 6        # 4 cores + 2 edges, fresh wallets
    ./scripts/devnet-test-rfc38-all.sh # ~10 min, 11 scenarios

Tested

  - All 11 devnet scenarios PASS against a fresh 6-node devnet (4 cores
    + 2 edges, all wallets unique + funded, no on-chain identity for the
    edges). Per-scenario logs land under `.devnet/integration-runs/<ts>/`.

Co-authored-by: Cursor <cursoragent@cursor.com>
@branarakic branarakic force-pushed the feat/ot-rfc-38-lu7-10 branch from f75623d to 6e61370 Compare May 24, 2026 23:20
// 2. CG data graph (post-publish — selection moves quads from
// SWM into the named-graph as part of the seal step)
const swmGraphUri = contextGraphSharedMemoryUri(contextGraphId, subGraphName);
const dataGraphUri = `did:dkg:context-graph:${contextGraphId}`;

🔴 Bug: when subGraphName is set and the batch has already been published, this fallback still queries the root CG graph. That misses finalized triples stored under the sub-graph-specific data graph, so verify-batch will report empty-quads / root-mismatch for published sub-graph batches even though the data exists locally. Please resolve the durable graph URI through the same sub-graph-aware helper used elsewhere instead of hardcoding the root graph.

input: VerifyMemberAttestationInput,
): Promise<VerifyMemberAttestationResult> {
const { attestation, candidateLeaf, membershipResolver } = input;
const digest = computeAttestationDigest(attestation.payload);

🔴 Bug: verifyMemberAttestation digests caller-controlled payloads without validating them first. Non-numeric chainId / contextGraphId / attestedAt will throw here, while malformed hex in merkleRoot / plaintextLeafHash / attesterAddress gets coerced to zero bytes by the parsing helpers, so /api/attestation/verify can flip between 500s and false positives on bad input. Please validate the payload before calling computeAttestationDigest and return a structured verification failure for malformed attestations.
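
A sketch of a structural pre-check that runs before any digest is computed. The field names come from the PR description; the expected formats (decimal integer strings, 32-byte hex roots, 20-byte hex addresses) and the function names are assumptions made for illustration.

```typescript
const UINT = /^(0|[1-9][0-9]*)$/;
const HEX32 = /^0x[0-9a-fA-F]{64}$/;
const ADDRESS = /^0x[0-9a-fA-F]{40}$/;

function matches(value: unknown, pattern: RegExp): boolean {
  return typeof value === 'string' && pattern.test(value);
}

// Hypothetical validator: returns a list of problems, empty when well-formed.
function validatePayloadSketch(payload: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const field of ['chainId', 'contextGraphId', 'batchId', 'attestedAt']) {
    if (!matches(payload[field], UINT)) errors.push(`${field}: expected a decimal integer string`);
  }
  for (const field of ['merkleRoot', 'plaintextLeafHash']) {
    if (!matches(payload[field], HEX32)) errors.push(`${field}: expected 0x + 64 hex chars`);
  }
  for (const field of ['kavAddress', 'attesterAddress']) {
    if (!matches(payload[field], ADDRESS)) errors.push(`${field}: expected 0x + 40 hex chars`);
  }
  return errors;
}

// Verify-path guard: malformed input becomes a structured failure, not a throw.
// A result of ok: true only means the payload is safe to digest.
function precheckAttestation(payload: Record<string, unknown>) {
  const errors = validatePayloadSketch(payload);
  return errors.length > 0
    ? { ok: false as const, reason: 'malformed-payload', errors }
    : { ok: true as const };
}

// Array(33).join('ab') yields 'ab' repeated 32 times (64 hex chars).
const wellFormed = {
  chainId: '31337', contextGraphId: '42', batchId: '7', attestedAt: '1779640000',
  merkleRoot: '0x' + Array(33).join('ab'),
  plaintextLeafHash: '0x' + Array(33).join('cd'),
  kavAddress: '0x' + Array(21).join('11'),
  attesterAddress: '0x' + Array(21).join('22'),
};
console.log(precheckAttestation(wellFormed));
console.log(precheckAttestation({ ...wellFormed, chainId: 'abc', merkleRoot: '0xzz' }));
```

Rejecting malformed hex up front also removes the coercion-to-zero path, since the parsing helpers never see input they would silently normalise.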

if (peerIdParam) {
candidatePeers = [peerIdParam];
} else {
candidatePeers = agent.node.libp2p

🟡 Issue: this route reimplements host enumeration even though the PR adds createCGHostEnumerator for that exact policy. Once LU-6 makes host selection sharding-aware, /api/shared-memory/catchup will keep probing every connected peer unless both copies are updated in lockstep. Please reuse the shared enumerator here so the hosting-peer policy stays centralized.
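
A rough sketch of what routing the catchup handler through one shared enumerator might look like. `createHostEnumeratorSketch` and `selectCatchupPeers` are hypothetical names, and the signature of the PR's `createCGHostEnumerator` is not assumed here. The Phase A policy (all connected peers minus self) is taken from the PR summary.

```typescript
type HostEnumerator = (contextGraphId: string) => string[];

// Single place where the hosting-peer policy lives. Phase A ignores the
// context graph id; a shard-aware filter would be added here later.
function createHostEnumeratorSketch(
  connectedPeers: () => string[],
  selfPeerId: string,
): HostEnumerator {
  return (_contextGraphId: string): string[] => {
    const unique = Array.from(new Set(connectedPeers()));
    return unique.filter((peerId) => peerId !== selfPeerId);
  };
}

// Route-side selection: an explicit peer wins; otherwise take the union of
// the enumerator's answer for every requested context graph.
function selectCatchupPeers(
  enumerate: HostEnumerator,
  contextGraphIds: string[],
  peerIdParam?: string,
): string[] {
  if (peerIdParam) {
    return [peerIdParam];
  }
  const union = new Set<string>();
  for (const contextGraphId of contextGraphIds) {
    for (const peerId of enumerate(contextGraphId)) {
      union.add(peerId);
    }
  }
  return Array.from(union);
}
```

With this shape the route holds no policy of its own, so a later change to host selection only touches the enumerator.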

Addresses the three Codex bugs still open after the previous round.

1. memory.ts /api/shared-memory/verify-batch — sub-graph data URI
   (line 810)

   When `subGraphName` is supplied and the batch has already been
   published (SWM empty post-promote), the durable fallback used the
   root CG graph URI. Finalized triples live under the sub-graph-
   specific data graph, so the fallback missed them and returned
   `empty-quads` / `root-mismatch` for valid published sub-graph
   batches. Routed through `contextGraphDataUri(cg, subGraphName)`
   from `@origintrail-official/dkg-core` — the same helper the
   publisher uses on the write side.

2. member-attestation.ts — validate caller-controlled payloads in
   verifyMemberAttestation BEFORE digesting (line 243)

   `verifyMemberAttestation` digested untrusted payload fields without
   structural validation: non-numeric `chainId` / `contextGraphId` /
   `attestedAt` threw out of the parsing helpers (HTTP 500), while
   malformed hex got coerced to zero bytes (false positive on
   `signerMatchesAttester`). Added `validateAttestationPayload` —
   used by BOTH mint (throws) and verify (returns structured
   ok=false). Verify path also wraps `computeAttestationDigest` in a
   try/catch as defence in depth. Three new tests cover the failure
   modes (non-numeric chainId, malformed merkleRoot hex, non-integer
   attestedAt). All 18 member-attestation tests pass.

3. memory.ts /api/shared-memory/catchup — centralize host enumeration
   (line 636)

   Previously reimplemented `libp2p.getConnections() → strings` here
   even though the PR added `createCGHostEnumerator` for that exact
   policy. When LU-6 ships shard-aware host selection, both call
   sites would need lockstep updates. Now imports and uses the
   shared enumerator (per-CG enumeration unioned across all
   requested CGs), so phase B's filter lands centrally.

Build: cli + agent + publisher pass `tsc` clean.
Tests: member-attestation.test.ts → 18/18 PASS (3 new for the
validator path).

Co-authored-by: Cursor <cursoragent@cursor.com>

@github-actions github-actions Bot left a comment

Codex review skipped: filtered diff is 5486 lines (cap: 5,000). Please consider splitting this into smaller PRs for reviewability.

Two route-contract regressions surfaced by Codex round 2:

1. `/api/attestation/verify` unconditionally supplied a
   `membershipResolver` stub, so every response carried
   `membership: "unknown"` regardless of the caller's
   `chainCheckMembership` flag. That erased the distinction between
   "not asked" (caller didn't opt in) and "asked but unavailable"
   (Phase B chain-side resolver missing). Gate the resolver on the
   flag so omitting it returns no `membership` field (preserved
   contract); passing `true` still returns `unknown` until the
   Phase B resolver lands.

2. `/api/kc/:id` (merkle-root + author probe) mapped every
   chain-adapter exception to 500, including the same
   `unknown kcId` revert that the sibling `/api/kc/:id/author`
   route already translates to 404. Callers that branch on
   "not published yet" vs. "server failure" got the wrong signal.
   Mirror the same regex test the sibling route uses so both
   routes report 404 consistently for the unknown-id case.

Co-authored-by: Cursor <cursoragent@cursor.com>

@github-actions github-actions Bot left a comment

Codex review skipped: filtered diff is 5537 lines (cap: 5,000). Please consider splitting this into smaller PRs for reviewability.

@branarakic
Contributor Author

Codex R2 follow-up — attestation+KC route hardening

Addressed both route-contract bugs in 44778ff9:

1. /api/attestation/verify always supplied membership resolver (R2 memory.ts:1130)

Old code unconditionally passed `membershipResolver: async () => undefined`, so every response carried `membership: "unknown"` regardless of whether the caller asked. Gated on `parsed.chainCheckMembership === true`; omitting the flag now returns no `membership` field (route contract preserved), passing `true` still returns `unknown` until the Phase B resolver lands.

2. /api/kc/:id unknown-kcId → 500 instead of 404 (R2 assertion.ts:3190)

Old code caught all chain-adapter exceptions and rethrew/returned 500, including the same `unknown kcId|nonexistent|out-of-bounds` revert that the sibling `/api/kc/:id/author` route already translates to 404. Callers that branch on "not published yet" vs. "server failure" got the wrong signal. Mirrored the same regex test the sibling uses so both routes are in lockstep.
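
For reference, the status mapping in item 2 boils down to something like the sketch below. `statusForChainError` is a hypothetical helper name, and the pattern is copied from the wording of this comment rather than from the route source, so the real expression may differ (flags, anchoring).

```typescript
// Pattern taken from the comment text above; treat it as illustrative.
const UNKNOWN_KC_PATTERN = /unknown kcId|nonexistent|out-of-bounds/;

// Map a chain-adapter failure to an HTTP status: a revert that means
// "this kcId does not exist" becomes 404, everything else stays 500.
function statusForChainError(err: unknown): 404 | 500 {
  const message = err instanceof Error ? err.message : String(err);
  return UNKNOWN_KC_PATTERN.test(message) ? 404 : 500;
}
```

Sharing one helper (or one exported pattern) between `/api/kc/:id` and `/api/kc/:id/author` would keep the two routes from drifting apart again.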

Sub-graph fallback (R2 memory.ts:810), member-attestation digest order (R2 member-attestation.ts:243), and portable `date -r` (R1) were already fixed in the integration branch.
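
The membership gating from item 1 above, as a small sketch. `gateResolver` and `membershipField` are hypothetical names, and the `"member"` / `"non-member"` values are assumptions; only `"unknown"` and the omitted-field behaviour come from this thread.

```typescript
type Membership = 'member' | 'non-member' | 'unknown';
type MembershipResolver = (attesterAddress: string) => Promise<boolean | undefined>;

// Only hand a resolver to the verifier when the caller explicitly opted in.
function gateResolver(
  chainCheckMembership: unknown,
  resolver: MembershipResolver,
): MembershipResolver | undefined {
  return chainCheckMembership === true ? resolver : undefined;
}

// Build the optional response fragment: no resolver means "not asked" and the
// field is omitted; an undefined verdict means "asked but unavailable".
function membershipField(
  resolverSupplied: boolean,
  verdict: boolean | undefined,
): { membership?: Membership } {
  if (!resolverSupplied) {
    return {};
  }
  if (verdict === undefined) {
    return { membership: 'unknown' };
  }
  return { membership: verdict ? 'member' : 'non-member' };
}
```

Keeping "not asked" and "asked but unavailable" as distinct shapes lets clients tell whether they need to retry once a chain-side resolver exists.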

@branarakic branarakic merged commit 3f6cd2b into feat/ot-rfc-38-lu5 May 25, 2026
3 checks passed