docs: castore ES gap-analysis for internal-fork takeover decision #1

Merged: selmeci merged 32 commits into main from claude/determined-kare-235d90 on Apr 17, 2026


@selmeci selmeci commented Apr 17, 2026

Summary

Research deliverable supporting the decision on whether to adopt castore as the base of an internal event-sourcing framework for a greenfield financial / payments product (profile D1 + N1 GDPR + N4 zero-event-loss + N5 snapshots + N6 schema evolution).

This PR adds 3 documents — no code changes, no packages/** touched:

  • Spec (docs/superpowers/specs/2026-04-16-castore-es-gap-analysis-design.md, 427 lines) — how the analysis was structured: 8-section skeleton, 26-feature catalogue, competitor set, gap-entry template, MoSCoW + effort rubric, risk register format.
  • Plan (docs/superpowers/plans/2026-04-16-castore-es-gap-analysis.md, 689 lines) — 7 chunks operationalizing the spec: skeleton → castore audit → competitor harvest → gap catalogue → risk verification → roadmap → finalization. Absence-evidence protocol, gh auth fallback, go/no-go handoff ritual all included.
  • Deliverable (docs/superpowers/research/2026-04-16-castore-es-gap-analysis.md, 2224 lines) — the gap analysis & roadmap itself.

Scope is locked to 8 in-scope packages (core, event-storage-adapter-postgres, event-storage-adapter-in-memory, message-bus-adapter-event-bridge, message-bus-adapter-event-bridge-s3, event-type-zod, command-zod, lib-test-tools). The other 11 packages are named out-of-scope and will be removed in a separate "Fork & Trim" sub-project.

Key findings

Castore audit (§4) — 7 ✅ / 1 🔶 / 4 ⚠️ / 14 ❌ of 26 canonical ES features.
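
The headline tally can be cross-checked against the per-category tallies reported in this PR's audit commits (categories A to E). The sketch below is only that arithmetic; the field names `full`, `diamond`, `warn` and `absent` are hypothetical labels for ✅, 🔶, ⚠️ and ❌.

```typescript
// Hypothetical labels for the four audit symbols: ✅ / 🔶 / ⚠️ / ❌.
interface Tally {
  full: number;
  diamond: number;
  warn: number;
  absent: number;
}

// Per-category counts as reported in the audit commit messages.
const categoryTallies: Array<[string, Tally]> = [
  ["A", { full: 3, diamond: 0, warn: 0, absent: 2 }],
  ["B", { full: 1, diamond: 0, warn: 1, absent: 3 }],
  ["C", { full: 0, diamond: 1, warn: 1, absent: 2 }],
  ["D", { full: 2, diamond: 0, warn: 1, absent: 2 }],
  ["E", { full: 1, diamond: 0, warn: 1, absent: 5 }],
];

// Sum the four counters across all categories.
const total: Tally = categoryTallies.reduce(
  (acc, [, t]) => ({
    full: acc.full + t.full,
    diamond: acc.diamond + t.diamond,
    warn: acc.warn + t.warn,
    absent: acc.absent + t.absent,
  }),
  { full: 0, diamond: 0, warn: 0, absent: 0 },
);

const featureCount = total.full + total.diamond + total.warn + total.absent;
console.log(total, featureCount);
```

The sums come out to 7 / 1 / 4 / 14 over 26 features, matching the line above.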

Competitor matrix (§3) — 4 competitors + DIY Postgres baseline.

  • F20 GDPR crypto-shredding absent in all 4 competitors — Emmett, EventStoreDB, Marten, Equinox. This confirms N1 is build-from-scratch regardless of framework choice.
  • Emmett licence RFC (potentially SSPL/AGPL) is the single highest non-technical risk against adopting Emmett.
  • Equinox scores best as pattern reference despite F#/.NET language lock-in (2/5 adoption, 5/5 pattern reference).
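
For readers unfamiliar with F20, the following is a minimal sketch of the general crypto-shredding idea: PII is encrypted with a per-subject key, and erasure deletes the key rather than the events. It is not the G-04 design. The in-memory key map and the names `encryptPii`, `decryptPii` and `shredSubject` are assumptions made for illustration; the sketch uses Node's built-in `node:crypto` module.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical in-memory key store: one AES-256 key per data subject.
// A real system would hold these in a KMS or a separate, deletable table.
const subjectKeys = new Map<string, Buffer>();

interface EncryptedPayload {
  iv: string; // base64 nonce, fresh for every encryption
  tag: string; // base64 GCM authentication tag
  ciphertext: string; // base64 encrypted JSON
}

function keyFor(subjectId: string): Buffer {
  let key = subjectKeys.get(subjectId);
  if (!key) {
    key = randomBytes(32);
    subjectKeys.set(subjectId, key);
  }
  return key;
}

function encryptPii(subjectId: string, pii: unknown): EncryptedPayload {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", keyFor(subjectId), iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(pii), "utf8"),
    cipher.final(),
  ]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    ciphertext: ciphertext.toString("base64"),
  };
}

// Returns the decrypted PII, or null once the subject's key has been shredded.
function decryptPii(subjectId: string, payload: EncryptedPayload): unknown {
  const key = subjectKeys.get(subjectId);
  if (!key) return null;
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(payload.iv, "base64"));
  decipher.setAuthTag(Buffer.from(payload.tag, "base64"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(payload.ciphertext, "base64")),
    decipher.final(),
  ]);
  return JSON.parse(plaintext.toString("utf8"));
}

// Shredding deletes only the key; the append-only event log stays untouched.
function shredSubject(subjectId: string): void {
  subjectKeys.delete(subjectId);
}
```

The point of the pattern is that the event store stays append-only: after `shredSubject`, the stored ciphertext is still present but can no longer be read.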

Risk register (§7) — 10 kept, 2 dropped/relocated.

  • R-07 (aws-sdk v2→v3 migration) dropped — verification showed every AWS adapter already on @aws-sdk/client-* v3; spec preview was stale.
  • R-06 (pg major lock-in) downgraded — adapter uses postgres (postgres.js v3), not pg (node-postgres); more stable than assumed.
  • R-09 (outbox relay SPOF) relocated — future design risk for G-01, not a current castore risk.
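
As background for the outbox relay mentioned in R-09, here is a minimal in-memory sketch of the transactional-outbox pattern in general. It is not castore code and not the G-01 design: the class `InMemoryOutboxStore` and its methods are hypothetical, and the single method call stands in for what would be one SQL transaction.

```typescript
interface StoredEvent {
  aggregateId: string;
  version: number;
  type: string;
  payload: unknown;
}

interface OutboxRow {
  id: number;
  event: StoredEvent;
  publishedAt: Date | null;
}

// Hypothetical store: events and outbox rows live side by side.
class InMemoryOutboxStore {
  private events: StoredEvent[] = [];
  private outbox: OutboxRow[] = [];
  private nextId = 1;

  // In SQL both inserts would share one BEGIN...COMMIT. Here the collision
  // check runs before either write, so both happen or neither does.
  appendWithOutbox(event: StoredEvent): void {
    const collision = this.events.some(
      (e) => e.aggregateId === event.aggregateId && e.version === event.version,
    );
    if (collision) throw new Error("version collision");
    this.events.push(event);
    this.outbox.push({ id: this.nextId++, event, publishedAt: null });
  }

  // Publishes pending rows in order. If publish throws, the row stays
  // unpublished and is retried on the next call (at-least-once delivery).
  relayOnce(publish: (event: StoredEvent) => void): number {
    let sent = 0;
    for (const row of this.outbox) {
      if (row.publishedAt !== null) continue;
      publish(row.event);
      row.publishedAt = new Date();
      sent++;
    }
    return sent;
  }
}
```

Because the relay is the only component that drains the outbox, a single relay instance is exactly the kind of single point of failure that R-09 describes.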

Gaps (§5) — 10 entries: 4 MUST (G-01 outbox, G-02 snapshots, G-03 idempotency, G-04 crypto-shredding) · 3 SHOULD · 3 COULD, plus 8 WON'T. The dependency DAG has 3 edges (G-07→G-08, G-07→G-10, G-05→G-04) and is acyclic.
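
The acyclicity claim can be checked mechanically. The sketch below runs Kahn's topological sort over the three edges listed above; the function name `topoOrder` is illustrative and not part of the deliverable.

```typescript
type Edge = [string, string]; // [prerequisite, dependent]

// The three dependency edges listed in the Gaps paragraph.
const gapEdges: Edge[] = [
  ["G-07", "G-08"],
  ["G-07", "G-10"],
  ["G-05", "G-04"],
];

// Kahn's algorithm: returns a topological order, or null if a cycle exists.
function topoOrder(edges: Edge[]): string[] | null {
  const inDegree = new Map<string, number>();
  const successors = new Map<string, string[]>();
  for (const [from, to] of edges) {
    if (!inDegree.has(from)) inDegree.set(from, 0);
    inDegree.set(to, (inDegree.get(to) ?? 0) + 1);
    successors.set(from, (successors.get(from) ?? []).concat(to));
  }
  // Start from every node that has no unmet prerequisite.
  const ready = Array.from(inDegree.keys()).filter((n) => inDegree.get(n) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const node = ready.shift() as string;
    order.push(node);
    for (const next of successors.get(node) ?? []) {
      const remaining = (inDegree.get(next) ?? 0) - 1;
      inDegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }
  // If some node was never released, it sits on a cycle.
  return order.length === inDegree.size ? order : null;
}

console.log(topoOrder(gapEdges));
```

A non-null result means the graph is acyclic, and any returned order places G-05 before G-04 and G-07 before both G-08 and G-10.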

Roadmap (§6) — Phase 1 = 68 person-days (midpoints). Calendar @ 4 effective days/week: ~17 weeks at 1 FTE, ~11 at 1.5 FTE, ~12 at 2 FTE (DAG-limited — most MUST gaps serialize).
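
The calendar figures follow from simple division, which the sketch below reproduces. `naiveWeeks` is an illustrative name; it assumes work is perfectly divisible across people, which is exactly the assumption the DAG limit breaks for the 2-FTE row.

```typescript
// Figures taken from the Roadmap paragraph.
const PHASE_1_PERSON_DAYS = 68;
const EFFECTIVE_DAYS_PER_WEEK = 4;

// Calendar weeks if effort split perfectly across people (ignores DAG ordering).
function naiveWeeks(personDays: number, fte: number, daysPerWeek: number): number {
  return personDays / (fte * daysPerWeek);
}

const weeksAt1Fte = naiveWeeks(PHASE_1_PERSON_DAYS, 1, EFFECTIVE_DAYS_PER_WEEK); // 17
const weeksAt15Fte = naiveWeeks(PHASE_1_PERSON_DAYS, 1.5, EFFECTIVE_DAYS_PER_WEEK); // about 11.3
const weeksAt2Fte = naiveWeeks(PHASE_1_PERSON_DAYS, 2, EFFECTIVE_DAYS_PER_WEEK); // 8.5

console.log({ weeksAt1Fte, weeksAt15Fte, weeksAt2Fte });
```

The ~17 and ~11 week figures match this naive bound. The 2-FTE figure of ~12 weeks is higher than the naive 8.5 because, as stated above, most MUST gaps serialize along the DAG.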

Verdict (§6.0): Go — conditional. Castore is a sound foundation for a D1 financial product, but Phase 1 is a regulatory / correctness blocker: it MUST be complete before any PII-bearing event reaches production. OQ-5 (crypto-shredding spike / POC) should be resolved before committing to the G-04 implementation plan.

Methodology notes

  • All 26 castore audit entries carry file:line evidence or "Absence confirmed on 2026-04-16 via: <grep patterns + locations>. Confidence: <high|medium|low>" per the plan's absence-evidence protocol.
  • Competitors were evaluated doc-only per spec OQ-4 default (no hands-on POC). Inspection date 2026-04-16 recorded in §8.2 for reproducibility.
  • Risk register entries R-01..R-04 were re-verified against current code in this branch; spec preview wording was discarded where invalidated.
  • Each chunk of execution passed a spec-compliance + quality review; 5 issues were surfaced by reviewers and fixed in targeted commits (3c4fbdc, 343f5a9).
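
The two evidence shapes described in the first bullet can be modelled as a discriminated union. This is a sketch of how such entries could be represented, not a type that exists in the deliverable or in castore; `Evidence` and `formatEvidence` are hypothetical names.

```typescript
type Confidence = "high" | "medium" | "low";

// Either a concrete file:line citation, or a documented absence.
type Evidence =
  | { kind: "present"; file: string; line: number }
  | {
      kind: "absent";
      confirmedOn: string; // ISO date of the grep run
      grepPatterns: string[];
      locations: string[];
      confidence: Confidence;
    };

// Renders an entry in the wording used by the absence-evidence protocol.
function formatEvidence(e: Evidence): string {
  if (e.kind === "present") return `${e.file}:${e.line}`;
  return (
    `Absence confirmed on ${e.confirmedOn} via: ` +
    `${e.grepPatterns.join(", ")} in ${e.locations.join(", ")}. ` +
    `Confidence: ${e.confidence}`
  );
}

// Example values: the grep pattern and location below are made up for illustration.
console.log(formatEvidence({ kind: "present", file: "eventStore.ts", line: 35 }));
console.log(
  formatEvidence({
    kind: "absent",
    confirmedOn: "2026-04-16",
    grepPatterns: ["snapshot"],
    locations: ["packages/core"],
    confidence: "high",
  }),
);
```

Making the absence case carry its own patterns, locations and confidence keeps a negative finding as reproducible as a positive file:line citation.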

Decision ritual (at the end of the deliverable)

The deliverable closes with a ## Decision section presenting three options for the reviewer:

  • (a) Go — proceed with castore fork; open first brainstorming cycle for the top-priority MUST gap per §5.
  • (b) No-go — pivot to an alternative (Emmett or DIY Postgres per §3); close the castore worktree.
  • (c) Defer — resolve specific open questions (OQ-1 FTE, OQ-2 deadline, OQ-5 crypto-shredding spike) before committing.

A template line, "**Decision (<date>):** <a|b|c> — <rationale>.", is left blank for the reviewer to fill.

Review plan

  • Read §6.0 Section opener in the deliverable first — it's the synthesis of everything else (2 minutes)
  • Skim §5 Gap detail catalogue — these are the 10 concrete pieces of work that Phase 1+2+3 would turn into per-gap brainstorming cycles
  • Inspect §7 Risk register — especially R-01 (EventStore concrete class) and R-04 (EventBridge 256KB limit affecting outbox design) as they most shape G-01 / G-02 implementation
  • Spot-check one or two audit entries in §4 against the cited file:line — validate the evidence protocol
  • Decide: fill the decision line at the end of the deliverable, or comment here requesting clarifications
  • If the decision is (a) Go: the next step is a separate brainstorming cycle for G-01 (outbox), the top-priority MUST gap per §5
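
To make the R-04 concern concrete, the sketch below shows the kind of size guard an outbox relay might run before publishing. It is an illustration only: reading "256KB" as 256 × 1024 bytes, measuring only the serialized payload, and the `headroomBytes` allowance for envelope fields are all assumptions, and the function names are hypothetical.

```typescript
// Limit as cited in R-04; treating "256KB" as 256 * 1024 bytes is an assumption.
const EVENT_BRIDGE_LIMIT_BYTES = 256 * 1024;

// UTF-8 byte length of the JSON-serialized payload (not the string length).
function serializedSizeBytes(payload: unknown): number {
  return new TextEncoder().encode(JSON.stringify(payload)).length;
}

// Hypothetical pre-publish guard; headroomBytes reserves space for envelope
// fields that this sketch does not measure.
function fitsInEventBridge(payload: unknown, headroomBytes: number = 1024): boolean {
  return serializedSizeBytes(payload) + headroomBytes <= EVENT_BRIDGE_LIMIT_BYTES;
}

console.log(fitsInEventBridge({ type: "ACCOUNT_OPENED", payload: { amount: 100 } }));
```

Counting bytes rather than characters matters because non-ASCII text takes more than one byte per character in UTF-8.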

Anti-scope (explicit)

This PR does not:

  • Implement any gap (each MUST gap is a separate brainstorming → writing-plans → subagent-driven-development cycle).
  • Execute the Fork & Trim sub-project (removal of 11 out-of-scope packages).
  • Execute the Health Audit & Upgrade sub-project (dep audit, CI reset).
  • Touch packages/** — strictly read-only on the codebase.

🤖 Generated with Claude Code

selmeci and others added 30 commits April 16, 2026 21:37
Output of the brainstorming session for the internal-fork takeover of castore.
Defines the structure, methodology, competitor set, per-feature audit template,
gap-entry template, prioritization framework (MoSCoW) and risk register format
for the upcoming castore ES gap-analysis & roadmap document.

Scope locked to 8 in-scope packages driven by the product stack
(Postgres + EventBridge + Zod + test-tools) and the D1/N1/N4/N5/N6 profile
(finance domain, GDPR crypto-shredding, zero event loss, snapshots, schema evolution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Non-blocking improvements from spec-document-reviewer pass:
- §2 tally example explicitly labeled illustrative
- §4 Impact surface field constrained to the 8 in-scope packages
- §5 DAG edge criterion defined (design/shipping dependency, not theme)
- §6 per-phase table Owner/ETA marked TBD with rationale (OQ-1/OQ-2)
- §6 risk register preview annotated as verify-during-audit hypotheses
- §7 success criterion #8 made measurable (glossary, file:line, back-links)
- §9 open questions split into blocking vs. non-blocking-with-defaults

Spec status: approved by reviewer, pending user review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plan operationalizes the design spec in 7 chunks:
1. Skeleton & methodology
2. Castore code audit (26 features across 5 categories)
3. Competitor harvest & matrix (Emmett, EventStoreDB, Marten, Equinox + DIY baseline)
4. Gap catalogue with dependency DAG
5. Risk register verification (R-01..R-12 validated against code)
6. Prioritized roadmap (phases, conditional calendar, checkpoint gate)
7. Finalization (appendices, success-criteria validation, handoff)

Each chunk delivers a reviewable slice; task steps are grep/read/write/commit-shaped
for a research deliverable (not TDD code-shaped). Anti-scope explicit: no
implementation plans for MUST gaps, no Fork & Trim, no Health Audit — those are
separate brainstorming cycles after this deliverable is accepted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eight issues surfaced by plan-document-reviewer:
- Chunk 2 preamble: introduce named "Absence evidence protocol"
  (code + docs + package-metadata grep with confidence calibration)
- Tasks 2.1/2.3/2.5: reference the protocol by name instead of improvising
- Task 2.5 F22 multi-tenancy: write §4 audit entry (WON'T belongs in §5, not §4)
- Task 2.6: add gh auth prerequisite with browser fallback for reproducibility
- Task 4.2 Step 0: reconcile gap IDs/priorities against §4 audit before writing
- Task 4.6 Mermaid snippet: mark explicitly as stylistic example, not template
- Task 5.4 R-09: drop from §7 and relocate into G-01 design considerations
- Task 6.4 Step 0: conditional path if OQ-1 (FTE) resolves before this task
- Task 7.4: mandate explicit go/no-go acceptance ritual with 3-option choice
  and a terminal decision line written into the deliverable

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
F1 ✅ append-only (UNIQUE constraint + EventAlreadyExistsError)
F2 ✅ OCC (version collision via pg 23505)
F3 ✅ multi-aggregate tx (pushEventGroup in BEGIN…COMMIT)
F4 ❌ idempotent writes absent (absence confirmed via grep)
F5 ❌ snapshots absent (absence confirmed via grep)
Category A tally: 3/5 ✅ · 2/5 ❌

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
F6 ❌ projection runner absent (no checkpoint/catch-up in scope)
F7 ❌ projection rebuild absent
F8 ❌ projection lag monitoring absent
F9 ⚠️ inline projections via onEventPushed (post-commit, not same-tx)
F10 ✅ async projections via ConnectedEventStore + EventBridge adapter
Category B tally: 1/5 ✅ · 1/5 ⚠️ · 3/5 ❌

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
F11 ⚠️ event versioning via type-string convention only (no schemaVersion field)
F12 ❌ upcaster pipeline absent (confirmed via grep)
F13 ❌ event type retirement absent (confirmed via grep)
F14 🔶 tolerant deserialization via Zod strip default (optional, not enforced)
Category C tally: 0/4 ✅ · 1/4 🔶 · 1/4 ⚠️ · 2/4 ❌

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
F15 ❌ transactional outbox absent (dual-write gap confirmed in connectedEventStore.ts:134-140)
F16 ⚠️ at-least-once delivery via EventBridge; natural key exists but no dedup helper
F17 ✅ message bus abstraction (NotificationMessageBus / StateCarryingMessageBus)
F18 ✅ message queue abstraction (queue types mirror bus types)
F19 ❌ DLQ/poison-pill absent at framework level (AWS-native only)
Category D tally: 2/5 ✅ · 1/5 ⚠️ · 2/5 ❌

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…F26)

F20 ❌ crypto-shredding absent (payload stored as plaintext JSONB)
F21 ❌ encryption at rest absent at framework level (infra-layer concern)
F22 ❌ multi-tenancy absent (WON'T — out of profile for single-tenant D1)
F23 ❌ causation/correlation absent (generic metadata field only, no enforcement)
F24 ⚠️ replay tooling partial (replay flag + force option; no orchestration CLI)
F25 ❌ observability absent (no OTel hooks in framework)
F26 ✅ testing utilities (mockEventStore, muteEventStore, in-memory adapter)
Category E tally: 1/7 ✅ · 1/7 ⚠️ · 5/7 ❌

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ibration + F9 justification + strengths back-ref)

- Recalibrate F6, F12, F19, F21, F25 Evidence confidence from high to medium (triple-negative-grep only; absence evidence protocol)
- Add ⚠️-vs-❌ distinction sentence to F9 Known limits (onEventPushed extension point exists; transactional coupling not achievable)
- Replace inaccurate eventDetail.ts:14 back-reference in strengths bullet with eventStore.ts:35 (REDUCER generic declaration)
- Update §8 Appendices stub to note §8.1 filled, §8.2/§8.3 pending per plan Task 7.1
- Update frontmatter status to reflect Chunks 1 and 2 complete

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Technical risks R-01..R-04 confirmed with file:line evidence.
Dependency risks: R-05 confirmed (upstream dormant since 2025-10-12),
R-06 rewritten (adapter uses postgres.js not pg),
R-07 INVALIDATED (all AWS adapters already on SDK v3).
Governance risks R-08 and R-10 confirmed via zero-match grep.
Organizational risks R-11 and R-12 kept as-is.
R-09 dropped (reclassified to §5 G-01). Final register: 10 rows kept,
2 dropped. Dropped section present with one-line reasons.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fills §6.1 (MoSCoW counts), §6.2 (phase descriptions), §6.3 (per-phase
gap tables with TBD owner/ETA), §6.4 (conditional calendar — 3-row FTE
table with DAG parallelization analysis), and §6.5 (post-Phase-1
checkpoint decision gate with concrete review inputs and three outcome
options).

Phase 1 total effort: 68 person-days (midpoints: G-01 L=16, G-02 L=16,
G-03 M=6, G-04 XL=30). DAG edge G-05→G-04 creates the critical path
that caps 2-FTE Phase 1 at ~12 weeks despite parallelism.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- §8.2: 4-row table with repo URL, docs URL, last inspected 2026-04-16,
  and version tag for Emmett (0.42.0), EventStoreDB (v26.0.2),
  Marten (V8.30.1), Equinox (4.1.0) — obtained via gh API
- §8.3: 52-term alphabetized glossary covering every semantic acronym
  in the deliverable (ES, OCC, DLQ, PII, KMS, CDC, FTE, DDD, CQRS,
  NFR, DAG, KPI, TDE, SPOF + 38 additional terms)
- Updated status metadata line to "Draft — complete, pending user review"
- Updated §8 preamble to reflect all subsections filled

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All 8 criteria pass:
1. 26 features confirmed (grep count = 26, Evidence: count = 26)
2. 4 competitor profiles complete with dealbreakers in §3.1–§3.4
3. All 18 ❌/⚠️ castore cells mapped to G-NN or WON'T in §5.0
4. All 10 gap entries have design sketch, effort, priority, deps, rollout
5. DAG in §5.7 is Mermaid-rendered with 3 edges, no cycles
6. Risk register has Likelihood/Impact/Mitigation for all 10 kept risks
7. §6.0 opens with explicit "Go — conditional" recommendation
8. §8.3 glossary has 52 terms; 72 file:line refs; forward refs resolve

Refined criterion 3 evidence to clarify F14's 🔶 status exemption.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
selmeci and others added 2 commits April 16, 2026 23:04
- Fix G-04/G-05 header inconsistency: G-04 now correctly shows
  "Depends on: G-05 API design"; G-05 now shows "Blocks: G-04"
  (was "none" in both — contradicted the §5.7 DAG edge G-05→G-04)
- Update EventStoreDB version from v24.10.13 to v26.0.2 to match
  gh API result and §8.2 competitor references table
- Update ESDB docs URL to canonical root (version-pinned URL was stale)
- Refine §8.4 criterion 3 evidence note on F14's 🔶 exemption

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a final ## Decision section after §8 containing:
- Three-sentence summary surfacing §6.0 (go/no-go), §5 (gap catalogue
  entry points), and §7 (risk register) as the three critical reads
- Reminder that this is a decision input, not a green light
- Three-option ritual: (a) Go → open G-01 brainstorming cycle,
  (b) No-go → pivot to Emmett or DIY Postgres, (c) Defer → named
  blockers returned
- Blank decision template line for user to fill in

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@selmeci selmeci merged commit 73db53b into main Apr 17, 2026
@selmeci selmeci deleted the claude/determined-kare-235d90 branch April 17, 2026 13:33