OT-RFC-38 LU-6 C3 — pre-registration byte-cap stress harness#623

Closed
branarakic wants to merge 2 commits into feat/ot-rfc-38-lu6-host-mode from feat/lu6-followup-c3-prereg-bytecap-stress

Conversation

@branarakic
Contributor

Summary

Validates that the per-CG byte cap on `SwmHostModeStore` and the sliding-window rate limit on `DiscoveryRateLimit` actually enforce when a curator floods cores with ciphertext for an unregistered (freemium-tier) CG — the abuse scenario spelled out in RFC §1.2.4.

Stacked on #610. Pure devnet harness; no production-code changes.

Test phases

  1. Curator (N5) creates a curated CG locally WITHOUT `register=true`.
  2. A core (N1) explicitly host-mode subscribes (short-circuits the beacon-driven auto-subscribe).
  3. Curator pushes a burst of fat triples (default: 16 × 80 KiB = 1.25 MiB, exceeding the 1 MiB default cap).
  4. Asserts core's `perCg[CG_ID].bytes ≤ 2 MiB` ceiling (cap enforced, not just absorbed).
  5. Greps core `daemon.log` for "Host-mode rejected pre-reg envelope" lines (advisory; passes even when only the size clamp absorbs traffic — both controls are complementary).
  6. Confirms the core process is still alive (no crash regression).

Re-runnable: timestamp-suffixed CG id. Operators can tune the burst via `WRITES_COUNT` and `WRITE_PAYLOAD_BYTES` env vars.
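
As a rough illustration of the tuning knobs and the re-runnable id described above, the setup could look like the sketch below. The defaults mirror the numbers in this description; the `prereg-stress-` id prefix and the `BURST_BYTES` variable are hypothetical names, not taken from the script.

```shell
# Sketch only: defaults mirror the PR description (16 writes of 80 KiB each).
WRITES_COUNT="${WRITES_COUNT:-16}"
WRITE_PAYLOAD_BYTES="${WRITE_PAYLOAD_BYTES:-81920}"   # 80 KiB

# Hypothetical id prefix; the timestamp suffix keeps reruns from colliding.
CG_ID="prereg-stress-$(date +%s)"

# Total bytes the burst submits, for comparison with the 1 MiB default cap.
BURST_BYTES=$((WRITES_COUNT * WRITE_PAYLOAD_BYTES))
echo "CG $CG_ID: $WRITES_COUNT writes x $WRITE_PAYLOAD_BYTES bytes = $BURST_BYTES bytes"
```

With the defaults the burst totals 1310720 bytes (1.25 MiB), which is above the 1048576-byte cap.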

Test plan

  • `bash -n` syntax check passes
  • Devnet run — invoked manually post-merge as part of the LU-6 mainnet validation sweep

Made with Cursor

…arness

Validates that the per-CG byte cap on `SwmHostModeStore` and the
sliding-window rate limit on `DiscoveryRateLimit` actually enforce
when a curator floods cores with ciphertext for an unregistered
(freemium-tier) CG — the abuse scenario spelled out in RFC §1.2.4.

Test phases:
  1. Curator (N5) creates a curated CG locally WITHOUT register=true.
  2. A core (N1) explicitly host-mode subscribes (short-circuits
     the beacon-driven auto-subscribe).
  3. Curator pushes a burst of fat triples (default: 16 × 80 KiB
     = 1.25 MiB total, exceeding the 1 MiB default cap).
  4. Asserts the core's perCg[CG_ID].bytes ≤ 2 MiB ceiling (cap
     enforced, not just absorbed without complaint).
  5. Greps core daemon.log for "Host-mode rejected pre-reg envelope"
     since the burst start (advisory; passes even when only the
     size clamp absorbs traffic).
  6. Confirms the core process is still alive (no crash regression).

Re-runnable: timestamp-suffixed CG id. Operators can tune the
burst via `WRITES_COUNT` and `WRITE_PAYLOAD_BYTES` env vars.

Co-authored-by: Cursor <cursoragent@cursor.com>
"allowedAgents": ["$CURATOR_AGENT"] }
EOF
)")
CREATED_ID=$(parse_json "$CREATE" '.id')

🔴 Bug: `POST /api/context-graph/create` returns `{ created, uri }` for the non-register path, not `id`. `CREATED_ID` will stay empty here, so the script aborts before it ever runs the stress case. Read `.created` instead, or just reuse `$CG_ID` for the success check/logging.
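
A minimal sketch of the suggested fix, assuming `created` is a plain string field. The script's real `parse_json` helper is not shown in this hunk, so `extract_string_field` and `created_id_from_response` are hypothetical stand-ins that only handle flat JSON objects.

```shell
# Stand-in for the script's own parse_json helper (implementation not shown here):
# prints the value of a top-level string field, or nothing when it is absent.
extract_string_field() {
  local json="$1" key="$2"
  printf '%s\n' "$json" |
    sed -n "s/.*\"$key\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}

# Resolve the created CG id and fail loudly when the field is missing.
created_id_from_response() {
  local response="$1" id
  id=$(extract_string_field "$response" created)
  if [ -z "$id" ]; then
    echo "create response has no .created field: $response" >&2
    return 1
  fi
  printf '%s\n' "$id"
}

# Example response; the field values are made up for illustration.
CREATE='{"created":"cg-example","uri":"urn:example:cg-example"}'
CREATED_ID=$(created_id_from_response "$CREATE") && echo "created: $CREATED_ID"
```

Failing with the raw response in the message keeps a future response-shape change from looking like an unrelated abort.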

}];
console.log(JSON.stringify({ contextGraphId: cgId, quads }));
')
RESP=$(api_call "$CURATOR_NODE" POST /api/shared-memory/write "$PAYLOAD" || true)

🔴 Bug: `|| true` hides every `/api/shared-memory/write` failure, so the loop can complete even if no envelope was ever emitted. Combined with the later warning-only checks, this can produce a false PASS on a completely broken write path. Fail fast on error responses, or at least count successful writes and require `>0` before continuing.
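
One way to count successes instead of swallowing errors is sketched below. It assumes a successful write response contains `"triplesWritten":1` in compact JSON; `run_burst` and `fake_write` are hypothetical names, with `fake_write` standing in for the real `api_call` invocation.

```shell
# Run the burst and count writes that the API reports as successful.
# $1 = number of writes, $2 = command that performs one write and prints the response.
run_burst() {
  local total="$1" write_cmd="$2" ok=0 i=1 resp
  while [ "$i" -le "$total" ]; do
    if resp=$("$write_cmd" "$i"); then
      case $resp in
        *'"triplesWritten":1}'* | *'"triplesWritten":1,'*) ok=$((ok + 1)) ;;
        *) echo "write $i rejected: $resp" >&2 ;;
      esac
    else
      echo "write $i failed: ${resp:-no response}" >&2
    fi
    i=$((i + 1))
  done
  echo "$ok"
  [ "$ok" -gt 0 ]   # non-zero exit when nothing was written
}

# Hypothetical stand-in for the real write call: the third write fails.
fake_write() { if [ "$1" -le 2 ]; then echo '{"triplesWritten":1}'; else return 1; fi; }

SUCCESSFUL_WRITES=$(run_burst 3 fake_write 2>/dev/null) || echo "no writes succeeded"
echo "successful writes: $SUCCESSFUL_WRITES"
```

Each failure is still reported on stderr, so a partially broken write path stays visible even when the burst as a whole proceeds.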


# Default unregistered cap = 1 MiB = 1048576. We accept anything ≤ 2 MiB
# as "cap enforced" because the cap is a soft hint, not a hard wall.
CAP_CEILING=2097152

🔴 Bug: the implementation prunes unregistered host-mode storage down to `perCgByteCap` after each append, so the expected post-burst ceiling is the configured cap (1 MiB by default), not 2 MiB. Accepting anything up to 2 MiB will let a doubled-cap regression pass. Assert against the actual configured unregistered cap instead of this loose constant.

if [ -f "$CORE_LOG" ]; then
REJ_COUNT=$(tail -c +"$((LOG_OFFSET + 1))" "$CORE_LOG" 2>/dev/null | grep -c "Host-mode rejected pre-reg envelope" || true)
log "Rejection lines since burst: $REJ_COUNT"
if [ "$REJ_COUNT" = "0" ]; then

🔴 Bug: this downgrades the missing `Host-mode rejected pre-reg envelope` signal to a warning, but the script claims to validate `DiscoveryRateLimit` as well as the byte cap. If curator-to-beacon binding or the rate limiter is broken, the byte clamp alone can still make the script report PASS. Once the burst exceeds the per-minute budget, make missing rejection logs a hard failure, or add an explicit precondition that the beacon binding is present before starting the burst.
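
The offset-based counting in the hunk above, plus an opt-in hard failure, could be sketched as follows. `log_offset`, `count_since`, and `check_rejections` are hypothetical helper names, and the second argument to `check_rejections` is an assumed switch rather than an existing option of the script.

```shell
# Byte size of the log right now; record this just before the burst starts.
log_offset() {
  if [ -f "$1" ]; then wc -c < "$1" | tr -d ' '; else echo 0; fi
}

# Count matching lines written after the recorded offset.
# grep -c prints 0 and exits 1 when nothing matches, so that status is ignored.
count_since() {
  local log="$1" offset="$2" pattern="$3"
  tail -c +"$((offset + 1))" "$log" 2>/dev/null | grep -c -- "$pattern" || true
}

# $1 = rejection count, $2 = 1 to treat a zero count as a failure (assumed switch).
check_rejections() {
  local count="$1" require="${2:-0}"
  if [ "$count" -eq 0 ] && [ "$require" = "1" ]; then
    echo "no rejection lines logged since burst start" >&2
    return 1
  fi
}

CORE_LOG=$(mktemp)
echo "Host-mode rejected pre-reg envelope (earlier run)" >> "$CORE_LOG"
LOG_OFFSET=$(log_offset "$CORE_LOG")
echo "Host-mode rejected pre-reg envelope (this run)" >> "$CORE_LOG"
REJ_COUNT=$(count_since "$CORE_LOG" "$LOG_OFFSET" "Host-mode rejected pre-reg envelope")
echo "Rejection lines since burst: $REJ_COUNT"
rm -f "$CORE_LOG"
```

Recording the offset before the burst keeps rejection lines from earlier runs out of the count.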

…, configured cap, honest scope

Addresses four Codex bugs flagged on PR #623:

1. Read response field that doesn't exist (prereg-bytecap-stress.sh:105)

   `/api/context-graph/create` (no `register: true`) returns
   `{ created, uri }`, not `{ id }`. Reading `.id` produced an
   empty string and the next `[ -n ]` check aborted the script before
   the burst ever ran — silently turning every devnet run into a
   no-op that reported success based on absent state.

   Fix: read `.created`.

2. `|| true` hid every write failure (prereg-bytecap-stress.sh:152)

   The burst loop swallowed every error from /api/shared-memory/write,
   so the script could complete a "burst" with zero envelopes ever
   emitted, hit phase 4 with the core empty, and warn-but-pass.

   Fix: drop `|| true`, count successful writes (triplesWritten=1),
   require ≥1 before proceeding to phase 4.

3. CAP_CEILING was 2 MiB instead of configured cap
   (prereg-bytecap-stress.sh:178)

   `enforceLimitsAfterAppend` guarantees `survivorBytes ≤ perCgByteCap`
   after every oversized append. The configured default for pre-reg
   CGs is 1 MiB. Accepting anything ≤ 2 MiB would let a doubled-cap
   regression pass undetected.

   Fix: assert against `EXPECTED_CAP_BYTES + CAP_OVERHEAD_BYTES`
   (1 MiB + 64 KiB framing slop by default). Both override-able via
   env for operators running non-default configs.

4. Over-claimed DiscoveryRateLimit coverage
   (prereg-bytecap-stress.sh:192)

   The script header claimed to validate BOTH the byte cap AND
   the rate limiter. The byte cap is the authoritative control;
   explicit rejection log lines are observability — and demanding
   them was a false-negative risk if the cap absorbed the burst
   before the rate limiter kicked in.

   Fix: header re-scoped to "byte cap enforced + observable
   rejections" (the rate limiter is complementary, not authoritative).
   The grep step is now logging-only, the byte cap assertion is hard.

Bonus: phase 4 hard-fails when the core stored 0 bytes (previously
warn-only), preventing silent passes when the gossip path didn't
engage at all.

Co-authored-by: Cursor <cursoragent@cursor.com>
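
Fix 3 and the phase 4 bonus from the commit message above can be sketched as a single check. `EXPECTED_CAP_BYTES` and `CAP_OVERHEAD_BYTES` are the names used in the commit message; `assert_cap_enforced` is a hypothetical helper, and how the stored byte count is read from the core is left out.

```shell
EXPECTED_CAP_BYTES="${EXPECTED_CAP_BYTES:-1048576}"   # 1 MiB default pre-reg cap
CAP_OVERHEAD_BYTES="${CAP_OVERHEAD_BYTES:-65536}"     # 64 KiB framing slop

# $1 = bytes the core reports for the CG after the burst.
assert_cap_enforced() {
  local stored="$1"
  local ceiling=$((EXPECTED_CAP_BYTES + CAP_OVERHEAD_BYTES))
  if [ "$stored" -eq 0 ]; then
    echo "FAIL: core stored 0 bytes; the gossip path never engaged" >&2
    return 1
  fi
  if [ "$stored" -gt "$ceiling" ]; then
    echo "FAIL: stored $stored bytes exceeds ceiling $ceiling" >&2
    return 1
  fi
  echo "OK: stored $stored bytes <= ceiling $ceiling"
}

assert_cap_enforced 1000000
```

With the defaults the ceiling is 1114112 bytes, so a regression that doubles the cap to 2 MiB would now fail.
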
SUCCESSFUL_WRITES=$((SUCCESSFUL_WRITES + 1))
fi
done
[ "$SUCCESSFUL_WRITES" -gt 0 ] || fail "no SWM writes succeeded — can't validate cap enforcement (precondition broken)"

🔴 Bug: Requiring only `>0` successful writes makes the cap assertion vacuous. With the default payload size, anything up to 12 successful writes stays under the 1 MiB limit, so phase 4 can pass even if pruning never ran. Gate this test on the successfully submitted bytes actually exceeding `EXPECTED_CAP_BYTES` (or track admitted bytes directly) before treating a pass as proof that the byte cap enforced.
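
A sketch of the suggested gate, assuming every successful write carries exactly `WRITE_PAYLOAD_BYTES` of payload. `burst_exceeds_cap` is a hypothetical helper name.

```shell
# Succeeds only when the successfully submitted bytes exceed the cap,
# so a later "stored <= cap" result can only hold if pruning actually ran.
# $1 = successful writes, $2 = payload bytes per write, $3 = cap in bytes.
burst_exceeds_cap() {
  local successful_writes="$1" payload_bytes="$2" cap_bytes="$3"
  local submitted=$((successful_writes * payload_bytes))
  if [ "$submitted" -le "$cap_bytes" ]; then
    echo "burst too small: $submitted bytes <= cap $cap_bytes" >&2
    return 1
  fi
  echo "$submitted"
}

SUBMITTED_BYTES=$(burst_exceeds_cap 16 81920 1048576) && echo "submitted: $SUBMITTED_BYTES"
```

At 80 KiB per write, 12 writes total 983040 bytes and stay under the cap, while 13 writes total 1064960 bytes and cross it.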

act "6. Confirm the core process is still alive"
# ===========================================================================
CORE_PIDFILE=$(node_pidfile "$CORE_NODE")
if [ -f "$CORE_PIDFILE" ]; then

🔴 Bug: If `daemon.pid` is missing, this branch is skipped and the script still prints `PASS` plus `Core process: still alive after burst`. That turns a missing/stale pidfile into a false positive for the crash check. Fail when the pidfile is absent/empty, or fall back to an HTTP liveness probe such as `/api/status` before reporting success.
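
A sketch of the pidfile check with an HTTP fallback, assuming `curl` is available for the probe. `status_probe` and `core_alive` are hypothetical names, and the status URL in the usage line is a placeholder.

```shell
# Default HTTP probe; assumes curl is installed. Override for other setups.
status_probe() {
  curl -fsS --max-time 5 "$1" >/dev/null 2>&1
}

# $1 = pidfile path, $2 = status URL. Succeeds if either check proves liveness.
core_alive() {
  local pidfile="$1" status_url="$2" pid
  if [ -s "$pidfile" ]; then
    pid=$(tr -d '[:space:]' < "$pidfile")
    case $pid in
      '' | *[!0-9]*) ;;                          # not a usable pid
      *) kill -0 "$pid" 2>/dev/null && return 0 ;;
    esac
  fi
  # Pidfile missing, empty, or stale: fall back to the HTTP probe.
  status_probe "$status_url"
}

# Usage with this shell's own pid, so the probe is never reached.
PIDFILE=$(mktemp)
echo "$$" > "$PIDFILE"
core_alive "$PIDFILE" "http://127.0.0.1:9/api/status" && echo "core alive"
rm -f "$PIDFILE"
```

The function fails only when both checks fail, so the caller can report PASS strictly on a zero exit status.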

@branarakic
Contributor Author

Superseded by PR #649 (release: rc.10 testnet-ready cut). All commits from this PR are now on main via #649. Unaddressed Codex review feedback (C3 prereg-bytecap stress harness reliability) is being tracked and fixed in a dedicated post-rc.10 followup PR.

@branarakic closed this May 25, 2026
branarakic pushed a commit that referenced this pull request May 25, 2026
…38 scripts

The four LU-6 devnet scripts (#621/#622/#623/#624) shipped with several
control-flow gaps that let regressions silently sneak through:
errors swallowed with `|| true`, fixed sleeps where bounded retries
were needed, and assertions that passed even when the scenario under
test never happened. Codex's review on the closed PRs flagged 24
specific items; the rc.10 integration merge fixed most of them. This
commit closes the remaining ones — the ones that materially affect
whether a PASS is meaningful.

devnet-test-rfc38-revocation.sh (#621):
 - Member pre-creates (M1, M2) no longer swallow EVERY error with
   `|| true`. The new `member_pre_create` helper captures the response
   and tolerates ONLY the idempotent "already exists" signal; any
   other failure (wrong auth, malformed body, daemon down) now fails
   the script immediately with the actual error visible, instead of
   surfacing later as an opaque catchup timeout.
 - Phase 5's single "sleep 3 + one-shot read" replaced with a 30s
   bounded-retry loop (`wait_for_count_or_steady`) — gossip /
   catchup latency between the post-revocation write and a member's
   final triple count is variable, and the one-shot snapshot was
   reporting M1's partial state as a regression.
 - Added the forward-only-rotation lower bound: `M2_FINAL >= M2_PRE`.
   The pre-existing `<= 3` check alone would pass if revocation also
   wiped M2's previously-decryptable triples — violating the contract
   in the script header that the kicked member RETAINS what they
   could already decrypt; they just stop learning anything new.
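
The "tolerate only already exists" rule could be sketched like this. It assumes error responses carry an `"error"` key and that the idempotent case mentions "already exists" in its message; `pre_create_ok` is a hypothetical name, not the script's `member_pre_create`.

```shell
# $1 = response body from the create call. Succeeds on a clean create or on
# the idempotent "already exists" signal; every other error fails.
pre_create_ok() {
  local resp="$1"
  case $resp in
    '')
      echo "pre-create: empty response (daemon down?)" >&2
      return 1 ;;
    *"already exists"*)
      echo "pre-create: already exists, continuing" >&2
      return 0 ;;
    *'"error"'*)
      echo "pre-create failed: $resp" >&2
      return 1 ;;
    *)
      return 0 ;;
  esac
}

# Made-up responses for illustration.
pre_create_ok '{"error":"Context graph already exists"}' && echo "tolerated"
pre_create_ok '{"error":"unauthorized"}' || echo "rejected"
```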

devnet-test-rfc38-curator-offline-midbatch.sh (#622):
 - Phase 1's `sleep 3` after member pre-create replaced with an
   explicit `wait_for_m1_onchain_id` poll. Phase 5's non-curator
   publish requires M1 to have observed the CG's `onChainId`,
   otherwise it bounces with "Context graph ... is not registered
   on-chain" — gossip lag for the `ContextGraphCreated` event can
   easily exceed 3s under devnet load.
 - EXIT trap now waits up to 60s for `/api/status` to respond
   before declaring the curator healthy again. `devnet.sh restart-node`
   returns after spawning the daemon, not after it's actually ready
   to serve, so CI runners that chain another scenario immediately
   were inheriting a half-started devnet.
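
Both fixes above replace a fixed sleep with a bounded poll. A generic version of that pattern is sketched below; `wait_until` is a hypothetical helper, not one of the functions named in this commit, and it relies on `date +%s` for the deadline.

```shell
# Poll a command until it succeeds or the deadline passes.
# $1 = timeout in seconds, $2 = delay between attempts, rest = command to run.
wait_until() {
  local timeout="$1" interval="$2" deadline
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  while :; do
    "$@" && return 0
    [ "$(date +%s)" -ge "$deadline" ] && return 1
    sleep "$interval"
  done
}

# Demonstration with a stand-in check that succeeds on its third attempt.
ATTEMPTS=0
ready_on_third_try() { ATTEMPTS=$((ATTEMPTS + 1)); [ "$ATTEMPTS" -ge 3 ]; }
wait_until 5 0 ready_on_third_try && echo "ready after $ATTEMPTS attempts"
```

The command always runs at least once, and a timeout returns a non-zero status that the caller can turn into a hard failure.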

devnet-test-rfc38-prereg-bytecap-stress.sh (#623):
 - Added a precondition that the burst's SUBMITTED_BYTES actually
   exceeds the configured cap. With the default 80 KiB payload size
   anything up to 12 writes stayed under the 1 MiB cap, so the
   downstream clamp assertion was vacuously satisfied if anyone
   misconfigured WRITES_COUNT/WRITE_PAYLOAD_BYTES — a TEST bug
   passing as a daemon PASS.
 - Phase 6's liveness check no longer silently skips when the
   pidfile is missing/empty. Falls back to a `/api/status` HTTP
   probe so containerised devnets (where pidfiles aren't written)
   still get a meaningful liveness check; hard-fails when both
   pidfile AND status are unreachable.

devnet-test-rfc38-unclean-restart.sh (#624):
 - Captures the killed core's `peerId` from `/api/status` BEFORE
   the SIGKILL. The post-restart catchup calls in phase 6 now pin
   to that peerId so we're explicitly exercising recovery from the
   restarted node — previously the catchup fanned out to any
   connected peer and could pass by pulling data from the curator
   (still online), validating nothing about the unclean-restart
   contract.
 - Phase 3 now waits for STRICT mid-batch state (`0 < M1_PARTIAL <
   WRITES_COUNT`) before the SIGKILL. The previous one-shot snapshot
   accepted any partial value including 0 or already-complete; both
   meant the kill below never actually exercised the
   `lastHostCatchupSeqno` resume path this test claims to cover.
 - Catchup responses are captured (not discarded to `/dev/null`)
   and asserted free of `error` / `swmError` / `durableError` via
   the new `assert_catchup_clean` helper. HTTP 500s, auth denials,
   host-catchup failures, etc. were previously invisible — the
   final triple-count check would still go green if data arrived
   via background gossip.
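
A sketch of the response check described in the last item, assuming the three error fields appear in the response only when something went wrong. `catchup_is_clean` is a hypothetical stand-in for `assert_catchup_clean`, whose implementation is not shown here.

```shell
# $1 = captured catchup response body. Fails on an empty body or when any
# of the error fields named in the commit message is present.
catchup_is_clean() {
  local resp="$1" key
  if [ -z "$resp" ]; then
    echo "catchup returned an empty response" >&2
    return 1
  fi
  for key in error swmError durableError; do
    case $resp in
      *"\"$key\""*)
        echo "catchup response contains $key: $resp" >&2
        return 1 ;;
    esac
  done
  return 0
}

# Made-up response bodies for illustration.
catchup_is_clean '{"triples":40}' && echo "clean"
catchup_is_clean '{"swmError":"host catchup failed"}' || echo "not clean"
```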

Bash-syntax-checked (`bash -n` on all four). No behaviour change
when the underlying scenarios are healthy; the changes only convert
silent false-positives into loud failures.

Co-authored-by: Cursor <cursoragent@cursor.com>