fix(test/v0.1.25.28.1): soak AS4 must sum __admin__ sentinel delta #113

Merged
amavashev merged 1 commit into main from fix/audit-soak-admin-sentinel-v0.1.25.28.1
Apr 18, 2026

@amavashev
Collaborator

Summary

Nightly audit-soak went red on 2026-04-18 (run 24599992000): AS4 failed with `expected: 14000L, but was: 8923L`. The shortfall of 5077 exactly matches the number of 400 responses the soak driver issues (401=3920, 403=5003, 400=5077).

Root cause is a test-only coverage gap, not a production audit-integrity regression. v0.1.25.28's sentinel split added a second pre-auth tier (__admin__, for valid-admin-key requests failing downstream validation), but AS4's tier-equality check still summed only __unauth__ + tenant-soak. The production write path routes correctly — AS3 (counter-sum) and AS4's first invariant (globalDelta == written) both passed on the failing run.
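
The arithmetic behind that diagnosis can be replayed in a few lines. This is only an illustrative sketch: the class and method names are hypothetical and do not come from the soak test, and the counts are the ones quoted above (401 → `__unauth__`, 400 → `__admin__`, 403 → tenant-soak).

```java
// Hypothetical helper, not part of the repository: replays the failing
// run's numbers to show why the old two-tier sum fell short.
final class SoakShortfall {

    // Pre-fix AS4 tier sum: __admin__ is not included.
    static long oldTierSum(long unauthDelta, long tenantDelta) {
        return unauthDelta + tenantDelta;
    }

    // Post-fix AS4 tier sum: all three tiers.
    static long fixedTierSum(long unauthDelta, long adminDelta, long tenantDelta) {
        return unauthDelta + adminDelta + tenantDelta;
    }

    public static void main(String[] args) {
        long unauth = 3920L;       // 401 responses
        long admin = 5077L;        // 400 responses
        long tenant = 5003L;       // 403 responses
        long globalDelta = 14000L; // total audit entries written

        long oldSum = oldTierSum(unauth, tenant);
        System.out.println("old tier sum   = " + oldSum);                 // 8923
        System.out.println("shortfall      = " + (globalDelta - oldSum)); // 5077
        System.out.println("fixed tier sum = " + fixedTierSum(unauth, admin, tenant)); // 14000
    }
}
```

The shortfall equals the `__admin__` delta, which is what points at the test's sum rather than at the write path.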

Fix

  • Capture baselines for audit:logs:__unauth__ and audit:logs:__admin__ after setup (same pattern as existing _all / tenant-soak baselines).
  • AS4 now sums all three tier deltas: __unauth__ + __admin__ + tenant-soak == globalDelta.
  • Assertion description names all three tiers and prints every delta on failure, so future regressions triangulate the offending family immediately.
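
A minimal sketch of the pattern the three bullets describe, under stated assumptions: the counter source is modelled as a plain `ToLongFunction<String>` over an in-memory map, the key `audit:logs:tenant-soak` is a guess based on the naming of the other keys, and none of the class or method names below are taken from the actual test file.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical sketch of an AS4-style tier-equality check.
final class TierEqualityCheck {
    static final String GLOBAL_KEY = "audit:logs:_all";
    static final String[] TIER_KEYS = {
        "audit:logs:__unauth__",
        "audit:logs:__admin__",
        "audit:logs:tenant-soak" // assumed key name
    };

    // Capture the global counter plus one counter per tier.
    static Map<String, Long> snapshot(ToLongFunction<String> counter) {
        Map<String, Long> out = new LinkedHashMap<>();
        out.put(GLOBAL_KEY, counter.applyAsLong(GLOBAL_KEY));
        for (String key : TIER_KEYS) {
            out.put(key, counter.applyAsLong(key));
        }
        return out;
    }

    // Fail unless the per-tier deltas add up to the global delta;
    // the message lists every tier delta to localise the gap.
    static void assertTierEquality(Map<String, Long> before, Map<String, Long> after) {
        long globalDelta = after.get(GLOBAL_KEY) - before.get(GLOBAL_KEY);
        long tierSum = 0L;
        StringBuilder detail = new StringBuilder();
        for (String key : TIER_KEYS) {
            long delta = after.get(key) - before.get(key);
            tierSum += delta;
            detail.append(key).append('=').append(delta).append(' ');
        }
        if (tierSum != globalDelta) {
            throw new AssertionError(
                "__unauth__ + __admin__ + tenant-soak must equal globalDelta: expected "
                    + globalDelta + " but was " + tierSum
                    + " [" + detail.toString().trim() + "]");
        }
    }

    public static void main(String[] args) {
        Map<String, Long> store = new HashMap<>();
        ToLongFunction<String> counter = key -> store.getOrDefault(key, 0L);

        store.put(GLOBAL_KEY, 100L); // entries that predate the soak
        store.put(TIER_KEYS[0], 100L);
        Map<String, Long> before = snapshot(counter); // baseline after setup

        // Simulated soak traffic using the counts from the failing run.
        store.merge(TIER_KEYS[0], 3920L, Long::sum);
        store.merge(TIER_KEYS[1], 5077L, Long::sum);
        store.merge(TIER_KEYS[2], 5003L, Long::sum);
        store.merge(GLOBAL_KEY, 14000L, Long::sum);

        assertTierEquality(before, snapshot(counter));
        System.out.println("tier equality holds");
    }
}
```

Driving the tier list from a single array means a future sentinel tier only needs one new entry to be baselined, summed, and reported.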

Scope: one test file + AUDIT.md entry. No server, spec, or data changes. Shipping as v0.1.25.28.1 — attributable to the sentinel-split that introduced the gap, not to v0.1.25.29 which is in flight.

Test plan

  • Static review — the 400-wave arithmetic works out: unauth(3920) + admin(5077) + tenant-soak(5003) = 14000.
  • Manual soak run on this branch (10 min × 500 rps with Docker), triggered through workflow_dispatch on the nightly workflow.
  • CI unit + integration green.
  • Next scheduled nightly (2026-04-19 08:09 UTC) green.

Relates to the 2026-04-18 soak failure following PR #111 (v0.1.25.28 sentinel split).

Nightly audit-soak workflow went red on 2026-04-18 (run 24599992000)
with AS4 tier-equality failing `expected: 14000L, but was: 8923L`.
Shortfall was exactly 5077 — matching the 400-response count the
soak driver issues (401=3920, 403=5003, 400=5077).

Root cause is a test-only coverage gap introduced alongside v0.1.25.28.
That release split the pre-auth tenant_id sentinel in
AuditFailureService into two tiers: __unauth__ (no / invalid key) and
__admin__ (valid admin key, downstream validation failed). The
production write path routes correctly — AS3 (counter-sum) and AS4's
first invariant (globalDelta == written) both passed on the failing
run. But AS4's per-tier equality check still summed only
__unauth__ + tenant-soak; the 5077 entries in __admin__ were not
included, so the sum fell short by exactly that amount.

Fix:
- Capture baselines for audit:logs:__unauth__ and audit:logs:__admin__
  after setup (same pattern as existing _all / tenant-soak baselines).
- AS4 now sums all three tier deltas: __unauth__ + __admin__ +
  tenant-soak == globalDelta.
- Assertion description names all three tiers and prints every delta
  on failure so future regressions triangulate the offending family.

No server, spec, or data changes — test-only.
AUDIT.md: adds v0.1.25.28.1 dated entry above v0.1.25.29.

@amavashev merged commit fce3712 into main on Apr 18, 2026
2 checks passed