OT-RFC-38 LU-6 C3 — pre-registration byte-cap stress harness#623
branarakic wants to merge 2 commits into
…arness
Validates that the per-CG byte cap on `SwmHostModeStore` and the
sliding-window rate limit on `DiscoveryRateLimit` actually enforce
when a curator floods cores with ciphertext for an unregistered
(freemium-tier) CG — the abuse scenario spelled out in RFC §1.2.4.
Test phases:
1. Curator (N5) creates a curated CG locally WITHOUT register=true.
2. A core (N1) explicitly host-mode subscribes (short-circuits
the beacon-driven auto-subscribe).
3. Curator pushes a burst of fat triples (default: 16 × 80 KiB
≈ 1.3 MiB total, exceeding the 1 MiB default cap).
4. Asserts the core's perCg[CG_ID].bytes ≤ 2 MiB ceiling (cap
enforced, not just absorbed without complaint).
5. Greps core daemon.log for "Host-mode rejected pre-reg envelope"
since the burst start (advisory; passes even when only the
size clamp absorbs traffic).
6. Confirms the core process is still alive (no crash regression).
Re-runnable: timestamp-suffixed CG id. Operators can tune the
burst via `WRITES_COUNT` and `WRITE_PAYLOAD_BYTES` env vars.
Co-authored-by: Cursor <cursoragent@cursor.com>
  "allowedAgents": ["$CURATOR_AGENT"] }
EOF
)")
CREATED_ID=$(parse_json "$CREATE" '.id')
🔴 Bug: `POST /api/context-graph/create` returns `{ created, uri }` for the non-register path, not `id`. `CREATED_ID` will stay empty here, so the script aborts before it ever runs the stress case. Read `.created` instead, or just reuse `$CG_ID` for the success check/logging.
}];
console.log(JSON.stringify({ contextGraphId: cgId, quads }));
')
RESP=$(api_call "$CURATOR_NODE" POST /api/shared-memory/write "$PAYLOAD" || true)
🔴 Bug: `|| true` hides every `/api/shared-memory/write` failure, so the loop can complete even if no envelope was ever emitted. Combined with the later warning-only checks, this can produce a false PASS on a completely broken write path. Fail fast on error responses, or at least count successful writes and require >0 before continuing.
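A minimal sketch of the "count successful writes" option from this comment. `try_write` is a hypothetical stand-in for the script's real `api_call ... /api/shared-memory/write` request; here it simply fails on every third attempt so the counter has something to skip.

```shell
# Sketch: count successful writes instead of masking failures with `|| true`.
# try_write is a hypothetical stand-in for the real API call.
try_write() {
  # Simulate a failing write on every third attempt.
  [ $(( $1 % 3 )) -ne 0 ]
}

count_successful_writes() {
  local total="$1" ok=0 i=1
  while [ "$i" -le "$total" ]; do
    if try_write "$i"; then
      ok=$((ok + 1))
    else
      # Surface the failure instead of hiding it.
      echo "write $i failed" >&2
    fi
    i=$((i + 1))
  done
  echo "$ok"
}

SUCCESSFUL_WRITES=$(count_successful_writes 6)
echo "successful writes: $SUCCESSFUL_WRITES"
# Refuse to continue when nothing was written at all.
[ "$SUCCESSFUL_WRITES" -gt 0 ] || { echo "no writes succeeded" >&2; exit 1; }
```

Each failure is reported on stderr as it happens, so a broken write path shows up in the run log even when enough writes succeed for the script to continue.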
# Default unregistered cap = 1 MiB = 1048576. We accept anything ≤ 2 MiB
# as "cap enforced" because the cap is a soft hint, not a hard wall.
CAP_CEILING=2097152
🔴 Bug: the implementation prunes unregistered host-mode storage down to `perCgByteCap` after each append, so the expected post-burst ceiling is the configured cap (1 MiB by default), not 2 MiB. Accepting anything up to 2 MiB will let a doubled-cap regression pass. Assert against the actual configured unregistered cap instead of this loose constant.
if [ -f "$CORE_LOG" ]; then
  REJ_COUNT=$(tail -c +"$((LOG_OFFSET + 1))" "$CORE_LOG" 2>/dev/null | grep -c "Host-mode rejected pre-reg envelope" || true)
  log "Rejection lines since burst: $REJ_COUNT"
  if [ "$REJ_COUNT" = "0" ]; then
🔴 Bug: this downgrades the missing `Host-mode rejected pre-reg envelope` signal to a warning, but the script claims to validate `DiscoveryRateLimit` as well as the byte cap. If curator-to-beacon binding or the rate limiter is broken, the byte clamp alone can still make the script report PASS. Once the burst exceeds the per-minute budget, make missing rejection logs a hard failure, or add an explicit precondition that the beacon binding is present before starting the burst.
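For reference, the offset-based log scan in the hunk above can be isolated as a small helper. This is only a sketch: `count_since_offset` is a hypothetical name, and the temporary file stands in for the core's `daemon.log`.

```shell
# Sketch: count matching log lines written after a recorded byte offset.
count_since_offset() {
  local log_file="$1" offset="$2" pattern="$3"
  # tail -c +N starts output at byte N (1-based), so offset+1 skips
  # everything that was already in the log before the burst.
  # grep -c prints 0 and exits 1 when nothing matches; `|| true` keeps that
  # exit status from aborting a caller that runs under `set -e`.
  tail -c +"$((offset + 1))" "$log_file" | grep -c -- "$pattern" || true
}

LOG=$(mktemp)
# One rejection from an earlier run, which must not be counted.
printf 'Host-mode rejected pre-reg envelope (old run)\n' > "$LOG"
LOG_OFFSET=$(wc -c < "$LOG" | tr -d ' ')
# Lines appended during the simulated burst.
printf 'Host-mode rejected pre-reg envelope\nunrelated line\nHost-mode rejected pre-reg envelope\n' >> "$LOG"

REJ_COUNT=$(count_since_offset "$LOG" "$LOG_OFFSET" "Host-mode rejected pre-reg envelope")
echo "rejections since burst: $REJ_COUNT"
rm -f "$LOG"
```

A caller that wants the stricter behaviour requested here could fail when `REJ_COUNT` is `0`; whether that is safe depends on the burst actually exceeding the rate limiter's budget, which this sketch does not model.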
…, configured cap, honest scope

Addresses four Codex bugs flagged on PR #623:

1. Read response field that doesn't exist (prereg-bytecap-stress.sh:105)
   `/api/context-graph/create` (no `register: true`) returns `{ created, uri }`, not `{ id }`. Reading `.id` produced an empty string and the next `[ -n ]` check aborted the script before the burst ever ran, silently turning every devnet run into a no-op that reported success based on absent state.
   Fix: read `.created`.
2. `|| true` hid every write failure (prereg-bytecap-stress.sh:152)
   The burst loop swallowed every error from `/api/shared-memory/write`, so the script could complete a "burst" with zero envelopes ever emitted, hit phase 4 with the core empty, and warn-but-pass.
   Fix: drop `|| true`, count successful writes (triplesWritten=1), require ≥1 before proceeding to phase 4.
3. CAP_CEILING was 2 MiB instead of configured cap (prereg-bytecap-stress.sh:178)
   `enforceLimitsAfterAppend` guarantees `survivorBytes ≤ perCgByteCap` after every oversized append. The configured default for pre-reg CGs is 1 MiB. Accepting anything ≤ 2 MiB would let a doubled-cap regression pass undetected.
   Fix: assert against `EXPECTED_CAP_BYTES + CAP_OVERHEAD_BYTES` (1 MiB + 64 KiB framing slop by default). Both override-able via env for operators running non-default configs.
4. Over-claimed DiscoveryRateLimit coverage (prereg-bytecap-stress.sh:192)
   The script header claimed to validate BOTH the byte cap AND the rate limiter. The byte cap is the authoritative control; explicit rejection log lines are observability, and demanding them was a false-negative risk if the cap absorbed the burst before the rate limiter kicked in.
   Fix: header re-scoped to "byte cap enforced + observable rejections" (the rate limiter is complementary, not authoritative). The grep step is now logging-only, the byte cap assertion is hard.

Bonus: phase 4 hard-fails when the core stored 0 bytes (previously warn-only), preventing silent passes when the gossip path didn't engage at all.

Co-authored-by: Cursor <cursoragent@cursor.com>
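The tightened phase 4 check described in item 3 and the bonus note might look roughly like the sketch below. The `EXPECTED_CAP_BYTES` and `CAP_OVERHEAD_BYTES` names and their defaults come from the commit message; the `check_stored_bytes` helper is hypothetical and is not taken from the script.

```shell
# Sketch: assert stored bytes against the configured cap plus framing slop.
# Both values can be overridden from the environment.
EXPECTED_CAP_BYTES=${EXPECTED_CAP_BYTES:-1048576}   # 1 MiB default cap
CAP_OVERHEAD_BYTES=${CAP_OVERHEAD_BYTES:-65536}     # 64 KiB framing slop

# check_stored_bytes is a hypothetical helper name.
check_stored_bytes() {
  local stored="$1"
  local ceiling=$((EXPECTED_CAP_BYTES + CAP_OVERHEAD_BYTES))
  if [ "$stored" -eq 0 ]; then
    # Nothing stored means the scenario never ran, so this is a failure too.
    echo "FAIL: core stored 0 bytes; the gossip path never engaged" >&2
    return 1
  fi
  if [ "$stored" -gt "$ceiling" ]; then
    echo "FAIL: stored $stored bytes exceeds ceiling $ceiling" >&2
    return 1
  fi
  echo "OK: stored $stored bytes within ceiling $ceiling"
}

check_stored_bytes 1000000
```

With the defaults the ceiling is 1114112 bytes, so a regression that let storage grow to 2 MiB (2097152 bytes) would fail this check rather than slip under the old `CAP_CEILING`.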
    SUCCESSFUL_WRITES=$((SUCCESSFUL_WRITES + 1))
  fi
done
[ "$SUCCESSFUL_WRITES" -gt 0 ] || fail "no SWM writes succeeded — can't validate cap enforcement (precondition broken)"
🔴 Bug: Requiring only >0 successful writes makes the cap assertion vacuous. With the default payload size, anything up to 12 successful writes stays under the 1 MiB limit, so phase 4 can pass even if pruning never ran. Gate this test on the successfully submitted bytes actually exceeding `EXPECTED_CAP_BYTES` (or track admitted bytes directly) before treating a pass as proof that the byte cap enforced.
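The arithmetic behind this comment can be checked with a short sketch. `WRITE_PAYLOAD_BYTES` and `EXPECTED_CAP_BYTES` are names used in the PR; the helper functions are hypothetical, and the sketch assumes every successful write submits exactly one payload of the configured size.

```shell
# Sketch: gate the cap assertion on the burst actually exceeding the cap.
WRITE_PAYLOAD_BYTES=${WRITE_PAYLOAD_BYTES:-81920}   # 80 KiB per write
EXPECTED_CAP_BYTES=${EXPECTED_CAP_BYTES:-1048576}   # 1 MiB cap

# Bytes submitted by N successful writes (assumes one payload per write).
submitted_bytes() {
  echo $(( $1 * WRITE_PAYLOAD_BYTES ))
}

# True only when N successful writes push more bytes than the cap allows.
burst_exceeds_cap() {
  [ "$(submitted_bytes "$1")" -gt "$EXPECTED_CAP_BYTES" ]
}

# Smallest number of writes whose total is strictly greater than the cap.
min_writes_to_exceed_cap() {
  echo $(( EXPECTED_CAP_BYTES / WRITE_PAYLOAD_BYTES + 1 ))
}

SUCCESSFUL_WRITES=16
if burst_exceeds_cap "$SUCCESSFUL_WRITES"; then
  echo "precondition met: $(submitted_bytes "$SUCCESSFUL_WRITES") bytes submitted"
else
  echo "precondition broken: need at least $(min_writes_to_exceed_cap) successful writes" >&2
fi
```

With the defaults, 12 writes submit 983040 bytes, which is below the cap, and the 13th write is the first to cross it at 1064960 bytes; the default burst of 16 submits 1310720 bytes.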
act "6. Confirm the core process is still alive"
# ===========================================================================
CORE_PIDFILE=$(node_pidfile "$CORE_NODE")
if [ -f "$CORE_PIDFILE" ]; then
🔴 Bug: If `daemon.pid` is missing, this branch is skipped and the script still prints PASS plus `Core process: still alive after burst`. That turns a missing/stale pidfile into a false positive for the crash check. Fail when the pidfile is absent/empty, or fall back to an HTTP liveness probe such as `/api/status` before reporting success.
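One possible shape for the fallback suggested here, as a sketch only. `core_alive` is a hypothetical helper, the status URL in the demo is a placeholder, and treating "pidfile present but process gone" as a failure is an assumption rather than something the PR specifies.

```shell
# Sketch: liveness check that never silently skips.
# core_alive PIDFILE STATUS_URL (hypothetical helper)
core_alive() {
  local pidfile="$1" status_url="$2" pid=""
  if [ -s "$pidfile" ]; then
    pid=$(cat "$pidfile")
    # kill -0 delivers no signal; it only reports whether the PID exists
    # (it can also fail with a permission error for another user's process).
    kill -0 "$pid" 2>/dev/null
    return $?
  fi
  # Missing or empty pidfile: fall back to an HTTP probe.
  # -f makes curl exit non-zero on HTTP error statuses.
  curl -fsS --max-time 5 "$status_url" >/dev/null 2>&1
}

# Demo: point the pidfile at this shell, which is certainly running.
PIDFILE=$(mktemp)
echo "$$" > "$PIDFILE"
if core_alive "$PIDFILE" "http://127.0.0.1:1/api/status"; then
  echo "core process: alive"
else
  echo "core process: NOT alive" >&2
fi
rm -f "$PIDFILE"
```

The function returns non-zero when both the pidfile and the probe come up empty, so the caller can hard-fail instead of printing PASS.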
…38 scripts

The four LU-6 devnet scripts (#621/#622/#623/#624) shipped with several control-flow gaps that let regressions silently sneak through: errors swallowed with `|| true`, fixed sleeps where bounded retries were needed, and assertions that passed even when the scenario under test never happened. Codex's review on the closed PRs flagged 24 specific items; the rc.10 integration merge fixed most of them. This commit closes the remaining ones: the ones that materially affect whether a PASS is meaningful.

devnet-test-rfc38-revocation.sh (#621):
- Member pre-creates (M1, M2) no longer swallow EVERY error with `|| true`. The new `member_pre_create` helper captures the response and tolerates ONLY the idempotent "already exists" signal; any other failure (wrong auth, malformed body, daemon down) now fails the script immediately with the actual error visible, instead of surfacing later as an opaque catchup timeout.
- Phase 5's single "sleep 3 + one-shot read" replaced with a 30s bounded-retry loop (`wait_for_count_or_steady`). Gossip / catchup latency between the post-revocation write and a member's final triple count is variable, and the one-shot snapshot was reporting M1's partial state as a regression.
- Added the forward-only-rotation lower bound: `M2_FINAL >= M2_PRE`. The pre-existing `<= 3` check alone would pass if revocation also wiped M2's previously-decryptable triples, violating the contract in the script header that the kicked member RETAINS what they could already decrypt; they just stop learning anything new.

devnet-test-rfc38-curator-offline-midbatch.sh (#622):
- Phase 1's `sleep 3` after member pre-create replaced with an explicit `wait_for_m1_onchain_id` poll. Phase 5's non-curator publish requires M1 to have observed the CG's `onChainId`, otherwise it bounces with "Context graph ... is not registered on-chain"; gossip lag for the `ContextGraphCreated` event can easily exceed 3s under devnet load.
- EXIT trap now waits up to 60s for `/api/status` to respond before declaring the curator healthy again. `devnet.sh restart-node` returns after spawning the daemon, not after it's actually ready to serve, so CI runners that chain another scenario immediately were inheriting a half-started devnet.

devnet-test-rfc38-prereg-bytecap-stress.sh (#623):
- Added a precondition that the burst's SUBMITTED_BYTES actually exceeds the configured cap. With the default 80 KiB payload size anything up to 12 writes stayed under the 1 MiB cap, so the downstream clamp assertion was vacuously satisfied if anyone misconfigured WRITES_COUNT/WRITE_PAYLOAD_BYTES: a TEST bug passing as a daemon PASS.
- Phase 6's liveness check no longer silently skips when the pidfile is missing/empty. Falls back to a `/api/status` HTTP probe so containerised devnets (where pidfiles aren't written) still get a meaningful liveness check; hard-fails when both pidfile AND status are unreachable.

devnet-test-rfc38-unclean-restart.sh (#624):
- Captures the killed core's `peerId` from `/api/status` BEFORE the SIGKILL. The post-restart catchup calls in phase 6 now pin to that peerId so we're explicitly exercising recovery from the restarted node. Previously the catchup fanned out to any connected peer and could pass by pulling data from the curator (still online), validating nothing about the unclean-restart contract.
- Phase 3 now waits for STRICT mid-batch state (`0 < M1_PARTIAL < WRITES_COUNT`) before the SIGKILL. The previous one-shot snapshot accepted any partial value including 0 or already-complete; both meant the kill below never actually exercised the `lastHostCatchupSeqno` resume path this test claims to cover.
- Catchup responses are captured (not discarded to `/dev/null`) and asserted free of `error` / `swmError` / `durableError` via the new `assert_catchup_clean` helper. HTTP 500s, auth denials, host-catchup failures, etc. were previously invisible; the final triple-count check would still go green if data arrived via background gossip.

Bash-syntax-checked (`bash -n` on all four). No behaviour change when the underlying scenarios are healthy; the changes only convert silent false-positives into loud failures.

Co-authored-by: Cursor <cursoragent@cursor.com>
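Several of the fixes above replace a fixed `sleep` with a bounded poll. The general pattern can be sketched as follows; `wait_until` and `count_reached` are hypothetical names and are not the `wait_for_count_or_steady` or `wait_for_m1_onchain_id` helpers from the scripts, whose implementations are not shown here.

```shell
# Sketch: bounded retry instead of "sleep N and read once".
# wait_until MAX_ATTEMPTS INTERVAL_SECS CMD [ARGS...]
wait_until() {
  local attempts="$1" interval="$2" n=1
  shift 2
  # Run the condition; stop as soon as it succeeds.
  while ! "$@"; do
    # Give up once the attempt budget is spent.
    [ "$n" -lt "$attempts" ] || return 1
    n=$((n + 1))
    sleep "$interval"
  done
  return 0
}

# Demo condition: becomes true on the Nth poll. A real script would query
# the node here instead of bumping a counter.
POLLS=0
count_reached() {
  POLLS=$((POLLS + 1))
  [ "$POLLS" -ge "$1" ]
}

if wait_until 30 0 count_reached 3; then
  echo "condition met after $POLLS polls"
else
  echo "timed out after $POLLS polls" >&2
fi
```

With a one-second interval, 30 attempts gives roughly the 30s bound mentioned for phase 5; the demo uses an interval of 0 only so it finishes immediately.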
Summary
Validates that the per-CG byte cap on `SwmHostModeStore` and the sliding-window rate limit on `DiscoveryRateLimit` actually enforce when a curator floods cores with ciphertext for an unregistered (freemium-tier) CG — the abuse scenario spelled out in RFC §1.2.4.
Stacked on #610. Pure devnet harness; no production-code changes.
Test phases
Re-runnable: timestamp-suffixed CG id. Operators can tune the burst via `WRITES_COUNT` and `WRITE_PAYLOAD_BYTES` env vars.
Test plan