test(integration): rigorous-integration layer across 5 reliability tracks #119

Merged: mastermanas805 merged 1 commit into master from feat/integration-tests-2026-05-20 on May 20, 2026

Conversation

@mastermanas805 (Member)

Adds the next layer up from existing unit tests + chaos drills. Five tracks, all green locally:

Track 1 — Backup/restore integration (api/e2e/, integration_backup tag)

  • Wraps infra/scripts/restore-drill.sh; asserts RPO/RTO/cleanup
  • Pure-parse tests for NR alert + Prom rule thresholds (run anywhere)
  • New weekly CI workflow at .github/workflows/integration-backup.yml

Track 2 — Brevo webhook full pipeline (api/e2e/, e2e tag)

  • Registry-walk over every documented Brevo event with live PG round-trip
  • Idempotent re-delivery + delivered-then-bounce contract pinning

Track 3 — Propagation runner integration (worker/internal/jobs/, no tag)

  • Backoff exact-schedule + dead-letter at maxAttempts + unknown_kind bounded retries
  • FOR UPDATE SKIP LOCKED concurrency + enum-vs-handler-map registry walk (TEST_DATABASE_URL-gated)

Track 4 — Deep /readyz cross-service (api/e2e/, e2e tag)

  • Envelope shape + secret-leak scan + cache TTL + P95 SLA
  • Criticality-matrix registry walk
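
For reviewers who want a feel for the secret-leak scan and the P95 check, here is a minimal Go sketch. The 20-hex-character pattern and the 500ms budget come from the commit message of this PR. The function names (`leaksSecret`, `p95`, `millis`) and the nearest-rank percentile method are assumptions made for illustration, not the code in api/e2e/.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"time"
)

// hexSecret flags any run of 20 or more hex characters, which is how
// tokens and API keys usually show up when they leak into a response body.
var hexSecret = regexp.MustCompile(`[0-9a-fA-F]{20,}`)

// leaksSecret reports whether a /readyz body contains a suspicious hex run.
func leaksSecret(body string) bool {
	return hexSecret.MatchString(body)
}

// millis is a small helper that turns integer milliseconds into durations.
func millis(values ...int) []time.Duration {
	out := make([]time.Duration, 0, len(values))
	for _, v := range values {
		out = append(out, time.Duration(v)*time.Millisecond)
	}
	return out
}

// p95 returns the 95th percentile using the nearest-rank method.
// It returns 0 for an empty sample set.
func p95(samples []time.Duration) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	rank := (95*len(sorted) + 99) / 100 // ceil(0.95 * n)
	return sorted[rank-1]
}

func main() {
	body := `{"status":"degraded","commit":"f41ed2b"}`
	fmt.Println("leak:", leaksSecret(body))

	latencies := millis(12, 15, 11, 14, 13, 18, 16, 12, 17, 480)
	fmt.Println("p95 under budget:", p95(latencies) < 500*time.Millisecond)
}
```

A short commit hash such as `f41ed2b` stays below the 20-character threshold, so it does not trip the scan. With only ten samples the nearest-rank P95 is the slowest sample, which is why a latency suite normally takes a larger burst.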

Track 5 — Cross-track contract (no orphan kinds) (api/e2e/, no tag)

  • Walks AuditKind* constants from source via regex
  • Verifies every kind has a declared downstream consumer

Companion: INTEGRATION-TESTS-2026-05-20.md (repo root) — per-test failure-mode map.

Local verified:

$ go test -tags integration_backup -run 'TestBackupRestore_NRAlert|TestBackupRestore_PromRule' -v ./e2e/...
--- PASS: TestBackupRestore_NRAlert_AggregationWindow (0.00s)
--- PASS: TestBackupRestore_PromRule_ThresholdsPresent (0.00s)
PASS

$ go test -short -count=1 -run TestReliability ./e2e/...
ok      instant.dev/e2e 0.755s

🤖 Generated with Claude Code

…acks

Adds the next layer up from existing unit tests + chaos drills:

Track 1 — Backup/restore integration tests (api/e2e/, build tag
integration_backup):
  - Wraps infra/scripts/restore-drill.sh in Go test scaffolding
  - Asserts RTO < 5min (postgres) / 3min (mongo), RPO < 30h
  - Asserts cleanup of throwaway namespace
  - Pure-parse tests for NR alert aggregation_window + Prom rule
    36h/60h thresholds (run anywhere)
  - .github/workflows/integration-backup.yml: weekly cron + manual
    dispatch against KUBECONFIG_TEST_CLUSTER; defensive context-name
    gate to prevent prod runs
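
The budget assertions and the context-name gate above can be sketched in a few lines of Go. The RTO and RPO numbers are the ones listed in this commit message. Everything else is hypothetical: the `drillResult` type, the `checkDrill` and `isSafeContext` helpers, and the rule that a safe context name contains "test" and does not contain "prod" are assumptions for illustration, not the actual gate in the workflow.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// drillResult is a hypothetical summary of one restore-drill run.
type drillResult struct {
	Engine      string        // "postgres" or "mongo"
	RestoreTook time.Duration // observed recovery time
	BackupAge   time.Duration // age of the backup that was restored
}

// rtoBudget holds the per-engine recovery-time budgets.
var rtoBudget = map[string]time.Duration{
	"postgres": 5 * time.Minute,
	"mongo":    3 * time.Minute,
}

// rpoBudget is the maximum tolerated backup age.
const rpoBudget = 30 * time.Hour

// checkDrill returns one message per breached budget. An engine without
// a declared budget is itself reported, so new engines cannot slip through.
func checkDrill(r drillResult) []string {
	budget, ok := rtoBudget[r.Engine]
	if !ok {
		return []string{fmt.Sprintf("no RTO budget declared for engine %q", r.Engine)}
	}
	var violations []string
	if r.RestoreTook >= budget {
		violations = append(violations,
			fmt.Sprintf("%s RTO %s is not under %s", r.Engine, r.RestoreTook, budget))
	}
	if r.BackupAge >= rpoBudget {
		violations = append(violations,
			fmt.Sprintf("%s RPO %s is not under %s", r.Engine, r.BackupAge, rpoBudget))
	}
	return violations
}

// isSafeContext is an assumed gate: the kube context must look like a
// test cluster and must not mention prod anywhere in its name.
func isSafeContext(name string) bool {
	n := strings.ToLower(name)
	return strings.Contains(n, "test") && !strings.Contains(n, "prod")
}

func main() {
	if !isSafeContext("kind-test-cluster") {
		fmt.Println("refusing to run: context does not look like a test cluster")
		return
	}
	r := drillResult{Engine: "mongo", RestoreTook: 4 * time.Minute, BackupAge: 12 * time.Hour}
	for _, v := range checkDrill(r) {
		fmt.Println("FAIL:", v)
	}
}
```

Checking the gate before any drill work starts keeps a misconfigured KUBECONFIG from ever reaching the restore script.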

Track 2 — Brevo webhook full pipeline (api/e2e/, e2e build tag):
  - Registry-iterating round-trip over every documented Brevo event:
    delivered/soft_bounce/hard_bounce/blocked/complaint/deferred/
    unsubscribed/error
  - Idempotent re-delivery pins GREATEST(delivered_at) clause
  - Delivered-then-bounce pins makeClassUpdater contract (no time-travel)
  - Malformed payload 400 + unhandled event 200 end-to-end
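
The idempotency property that the GREATEST(delivered_at) clause protects can be modelled in memory. The sketch below is not the webhook handler and does not touch Postgres. `emailRow`, `applyDelivered`, and `at` are hypothetical names, and the code only mirrors the "keep the later timestamp" semantics of SQL GREATEST.

```go
package main

import (
	"fmt"
	"time"
)

// emailRow is a hypothetical in-memory stand-in for one tracked email.
type emailRow struct {
	DeliveredAt time.Time // zero value means "not delivered yet"
}

// at builds a UTC timestamp from Unix seconds, to keep examples short.
func at(sec int64) time.Time {
	return time.Unix(sec, 0).UTC()
}

// applyDelivered mirrors GREATEST(delivered_at, incoming): the stored
// timestamp only ever moves forward, so replaying the same webhook or
// receiving an older one leaves the row unchanged.
func applyDelivered(r emailRow, incoming time.Time) emailRow {
	if incoming.After(r.DeliveredAt) {
		r.DeliveredAt = incoming
	}
	return r
}

func main() {
	r := applyDelivered(emailRow{}, at(1000))
	r = applyDelivered(r, at(1000)) // duplicate webhook delivery
	r = applyDelivered(r, at(400))  // late, out-of-order event
	fmt.Println(r.DeliveredAt.Unix()) // prints 1000
}
```

A test that pins this behavior delivers the same event twice, then an older one, and asserts the stored value is unchanged after each step.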

Track 3 — Propagation runner integration tests (worker/internal/jobs/,
no build tag, runs under regular make gate):
  - Backoff exact-schedule via markRetry persistence for every position
  - Dead-letter at maxAttempts via markDeadLettered direct
  - F2 P1 guard: unknown_kind bounded retries
  - FOR UPDATE SKIP LOCKED concurrency (TEST_DATABASE_URL-gated)
  - Enum-vs-handler-map registry walk (TEST_DATABASE_URL-gated)
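
An exact-schedule backoff check with a dead-letter cutoff might look like the following. The schedule values, the `maxAttempts` value of 5, and the `nextAction` helper are all hypothetical. The real schedule lives in worker/internal/jobs/ and is exercised through markRetry and markDeadLettered, neither of which is reproduced here.

```go
package main

import (
	"fmt"
	"time"
)

// backoffSchedule is a hypothetical retry schedule, one delay per position.
var backoffSchedule = []time.Duration{
	30 * time.Second,
	2 * time.Minute,
	10 * time.Minute,
	time.Hour,
}

// maxAttempts is a hypothetical cutoff after which a job is dead-lettered.
const maxAttempts = 5

// nextAction decides what happens after the attempt-th failure (1-based).
// It returns either the delay before the next retry or deadLetter=true.
func nextAction(attempt int) (delay time.Duration, deadLetter bool) {
	if attempt >= maxAttempts {
		return 0, true
	}
	idx := attempt - 1
	if idx < 0 {
		idx = 0
	}
	if idx >= len(backoffSchedule) {
		idx = len(backoffSchedule) - 1 // clamp to the last delay
	}
	return backoffSchedule[idx], false
}

func main() {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		delay, dead := nextAction(attempt)
		fmt.Printf("attempt=%d delay=%s deadLetter=%v\n", attempt, delay, dead)
	}
}
```

Walking every position of the schedule, rather than spot-checking one or two, is what makes a change to any single delay fail the test.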

Track 4 — Deep /readyz cross-service (api/e2e/, e2e build tag):
  - Envelope-shape walk across api/worker/provisioner
  - Brevo unreachable → 200 degraded (NOT 503) contract
  - Cache TTL: 50-burst latency assertion
  - Secret-leak scan (20+ hex chars regex)
  - P95 response time < 500ms
  - Criticality-matrix registry walk
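
The "200 degraded, not 503" contract can be illustrated with a small aggregation function. That an unreachable Brevo yields 200 degraded is stated above. The `dependency` type, the `aggregate` function, the status labels, and the choice of postgres as a critical dependency are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"net/http"
)

// dependency is a hypothetical row of the criticality matrix.
type dependency struct {
	Name     string
	Critical bool // a critical dependency takes the service out of rotation
	Healthy  bool
}

// aggregate folds dependency health into an HTTP status and a label.
// Any unhealthy critical dependency yields 503. An unhealthy non-critical
// dependency only downgrades the label and keeps the status at 200.
func aggregate(deps []dependency) (int, string) {
	label := "ok"
	for _, d := range deps {
		if d.Healthy {
			continue
		}
		if d.Critical {
			return http.StatusServiceUnavailable, "unavailable"
		}
		label = "degraded"
	}
	return http.StatusOK, label
}

func main() {
	deps := []dependency{
		{Name: "postgres", Critical: true, Healthy: true},
		{Name: "brevo", Critical: false, Healthy: false},
	}
	code, label := aggregate(deps)
	fmt.Println(code, label) // prints 200 degraded
}
```

Keeping non-critical failures at 200 means an email-provider outage does not pull otherwise healthy pods out of the load balancer.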

Track 5 — Cross-track contract test (api/e2e/, no build tag, runs
under regular gate when TEST_DATABASE_URL is set):
  - Walks AuditKind* constants from api/internal/models/audit_kinds.go
    via source-file regex
  - Forward: every constant has a consumer spec entry
  - Reverse: every spec entry refers to a real constant
  - Emails=true implies Forwards=true (F4 ledger drift class)
  - Propagation kinds match pending_propagations.kind enum
  - Forwarder_sent.classification populated for rows > 5min old
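
The forward and reverse checks above amount to a two-way set comparison between constants scraped from source and a consumer spec. The following sketch assumes untyped declarations of the form `AuditKindX = "value"`. The regex, the `consumerSpec` type, the `checkContract` function, and the example constant names are hypothetical and may differ from the real test.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// auditKindRe matches declarations such as: AuditKindFoo = "foo".
// Typed declarations would need a wider pattern.
var auditKindRe = regexp.MustCompile(`(AuditKind\w+)\s*=\s*"[^"]*"`)

// consumerSpec is a hypothetical description of who consumes a kind.
type consumerSpec struct {
	Forwards bool
	Emails   bool
}

// declaredKinds extracts the set of AuditKind* constant names from source text.
func declaredKinds(source string) map[string]bool {
	kinds := map[string]bool{}
	for _, m := range auditKindRe.FindAllStringSubmatch(source, -1) {
		kinds[m[1]] = true
	}
	return kinds
}

// checkContract returns a sorted list of contract violations:
// orphan constants, stale spec entries, and Emails without Forwards.
func checkContract(source string, specs map[string]consumerSpec) []string {
	kinds := declaredKinds(source)
	var problems []string
	for name := range kinds {
		if _, ok := specs[name]; !ok {
			problems = append(problems, name+": constant has no consumer spec")
		}
	}
	for name, spec := range specs {
		if !kinds[name] {
			problems = append(problems, name+": spec refers to a constant that does not exist")
		}
		if spec.Emails && !spec.Forwards {
			problems = append(problems, name+": Emails=true requires Forwards=true")
		}
	}
	sort.Strings(problems)
	return problems
}

func main() {
	source := `const (
	AuditKindExampleCreated = "example.created"
	AuditKindExampleDeleted = "example.deleted"
)`
	specs := map[string]consumerSpec{
		"AuditKindExampleCreated": {Forwards: true, Emails: true},
	}
	for _, p := range checkContract(source, specs) {
		fmt.Println(p)
	}
}
```

Reading the constants from the source file instead of a hand-typed list is what makes a newly added kind fail the test until a consumer is declared for it.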

Companion: INTEGRATION-TESTS-2026-05-20.md (repo-root) lists every
new test with the failure mode it catches, per CLAUDE.md rule 17
coverage-block discipline.

Per CLAUDE.md rule 18 (registry-iterating regression tests, not
hand-typed lists), every Track has at least one registry walk
that fails LOUD on additions to the upstream registry without
matching downstream wiring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

mastermanas805 force-pushed the feat/integration-tests-2026-05-20 branch from bbe82d9 to f41ed2b on May 20, 2026, 10:13
mastermanas805 merged commit 9e7c173 into master on May 20, 2026
3 checks passed