simulate-spec misses domain fidelity gaps: simulation verifies internal consistency, not external validity #172

## Problem

Three simulation iterations all passed (zero unresolved pain points, with the pain point count decreasing each iteration). A domain expert review then found 31 critical gaps, including trading calendar mismatches across asset classes, no corporate action handling, missing indicator dependency ordering, no pipeline scheduling, and missing data deduplication fields.

The simulation never caught these because it asked whether the spec is internally consistent, not whether it matches how the domain actually behaves.

## Root Cause

The simulate-spec skill walks each bounded context in isolation, verifying that internal logic is consistent (no dead ends, no contradictions, all entities exercised). It does **not** verify that the spec accurately models real-world behavior.

Specific structural issues:

1. **Per-context walkthroughs miss cross-cutting constraints.** Trading calendars, corporate actions, and operational scheduling affect every bounded context. Walking contexts independently means these concerns fall through the cracks.

2. **No "day in the life" end-to-end scenarios.** The simulation walks individual rules but never traces a complete real-world scenario (e.g., "what happens on a weekend?" or "what happens after a stock split?").

3. **Convergence is a false signal.** Decreasing pain point count across iterations is interpreted as improvement, but it only measures internal consistency. A spec can be perfectly internally consistent while being fundamentally wrong about how the domain works.

4. **The adversarial review-simulation gate only checks internal consistency.** It checks for unresolved pain points, untested entities, untested quality attributes, and cross-context data shape mismatches. It does not check whether the spec accurately reflects real-world domain behavior.
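To make the "what happens on a weekend?" question concrete, here is a minimal sketch of a calendar mismatch between two asset classes. The two calendars are assumptions for illustration only (one instrument trades every calendar day, the other on weekdays only, holidays ignored); they are not taken from the spec under review.

```python
from datetime import date, timedelta


def daily_sessions(start: date, days: int) -> list[date]:
    """Assumed calendar for an asset class that trades every calendar day."""
    return [start + timedelta(days=n) for n in range(days)]


def weekday_sessions(start: date, days: int) -> list[date]:
    """Assumed calendar for an asset class that trades Monday to Friday only."""
    return [d for d in daily_sessions(start, days) if d.weekday() < 5]


start = date(2024, 1, 1)  # a Monday
always_on = set(daily_sessions(start, 7))
weekdays_only = set(weekday_sessions(start, 7))

# Dates where one asset class has data and the other does not.
unmatched = sorted(always_on - weekdays_only)
print(unmatched)
```

Each context can handle its own calendar correctly and still leave the two unmatched dates unaddressed, because the mismatch only appears when both calendars meet in one scenario.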

## Proposed Fixes

### 1. Add "Cross-Cutting Walkthrough" requirement to simulate-spec

After per-context walkthroughs, require at least 3 cross-cutting scenarios that trace data through multiple contexts:
- A normal operational day (full pipeline from ingestion to output)
- An edge-case operational day (weekend, holiday, empty database cold start)
- A domain anomaly day (data split/delisting, external service failure, flash crash)
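One way this requirement could be checked mechanically is sketched below. The `CrossCuttingScenario` type, the `ScenarioKind` values, and the context names are hypothetical; simulate-spec does not currently define them.

```python
from dataclasses import dataclass
from enum import Enum


class ScenarioKind(Enum):
    NORMAL_DAY = "normal operational day"
    EDGE_CASE_DAY = "edge-case operational day"
    ANOMALY_DAY = "domain anomaly day"


@dataclass(frozen=True)
class CrossCuttingScenario:
    name: str
    kind: ScenarioKind
    contexts: tuple[str, ...]  # bounded contexts traced, in order


def missing_cross_cutting_coverage(scenarios: list[CrossCuttingScenario]) -> list[str]:
    """Return a list of problems; an empty list means the requirement is met."""
    problems = []
    for s in scenarios:
        if len(set(s.contexts)) < 2:
            problems.append(f"'{s.name}' touches fewer than two contexts")
    # Only scenarios that really span contexts count toward coverage.
    covered = {s.kind for s in scenarios if len(set(s.contexts)) >= 2}
    for kind in ScenarioKind:
        if kind not in covered:
            problems.append(f"no cross-cutting scenario for: {kind.value}")
    return problems


scenarios = [
    CrossCuttingScenario("full pipeline", ScenarioKind.NORMAL_DAY,
                         ("ingestion", "indicators", "output")),
    CrossCuttingScenario("weekend run", ScenarioKind.EDGE_CASE_DAY, ("ingestion",)),
]
problems = missing_cross_cutting_coverage(scenarios)
for p in problems:
    print(p)
```

Requiring one multi-context scenario per kind implies the "at least 3" minimum without counting scenarios separately.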

### 2. Add "Domain Fidelity Check" to review-simulation gate

The adversarial review should explicitly verify that the spec accurately models real-world constraints, not just internal consistency. Add a checklist item:

> "I have verified that the spec accurately handles: operational scheduling, data source alignment, calendar/datetime edge cases, and domain-specific anomaly events."
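As a rough sketch, the checklist item could be backed by evidence rather than a bare assertion: each of the four areas must point to at least one walkthrough that exercised it. The function names and the shape of the `evidence` mapping are assumptions, not part of the existing gate.

```python
DOMAIN_FIDELITY_ITEMS = (
    "operational scheduling",
    "data source alignment",
    "calendar/datetime edge cases",
    "domain-specific anomaly events",
)


def domain_fidelity_gaps(evidence: dict[str, list[str]]) -> list[str]:
    """Return checklist items with no supporting walkthrough recorded."""
    return [item for item in DOMAIN_FIDELITY_ITEMS if not evidence.get(item)]


def gate_passes(unresolved_pain_points: int, evidence: dict[str, list[str]]) -> bool:
    """Pass only if the spec is internally consistent AND every fidelity item has evidence."""
    return unresolved_pain_points == 0 and not domain_fidelity_gaps(evidence)


# Hypothetical review state: zero pain points, but only one area has evidence.
evidence = {
    "operational scheduling": ["normal operational day walkthrough"],
    "calendar/datetime edge cases": [],
}
gaps = domain_fidelity_gaps(evidence)
passed = gate_passes(0, evidence)
print(gaps, passed)
```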

### 3. Add "Day 1 Deployment" walkthrough

A dedicated walkthrough that traces the system from empty state to first successful output. This catches cold-start gaps (historical data backfill, benchmark initialization, indicator warmup periods).
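The warmup question in particular lends itself to a small calculation. The sketch below estimates how many historical bars must be backfilled before every indicator can produce its first value, assuming each indicator consumes a fixed window of inputs and may depend on other indicators. The indicator names and windows are made up for illustration.

```python
# name -> (window, dependencies). Hypothetical indicators, not from the spec.
Indicators = dict[str, tuple[int, tuple[str, ...]]]


def bars_required(name: str, indicators: Indicators,
                  _visiting: frozenset[str] = frozenset()) -> int:
    """Bars of raw history needed before `name` yields its first value."""
    if name in _visiting:
        raise ValueError(f"dependency cycle at {name!r}")
    window, deps = indicators[name]
    # A raw input is available from bar 1; a dependency is available
    # only once its own warmup has elapsed.
    upstream = max(
        (bars_required(d, indicators, _visiting | {name}) for d in deps),
        default=1,
    )
    return window - 1 + upstream


def backfill_needed(indicators: Indicators) -> int:
    """Bars of history needed before all indicators are valid."""
    return max(bars_required(n, indicators) for n in indicators)


indicators: Indicators = {
    "sma_20": (20, ()),
    "sma_20_smoothed": (5, ("sma_20",)),  # 5-bar window over sma_20 output
    "momentum_10": (10, ()),
}
print(backfill_needed(indicators))
```

A Day 1 walkthrough would compare this number against what the spec says about historical backfill. If the spec is silent, the first run on an empty database produces no output at all.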

### 4. Separate convergence metrics

Track two metrics independently:
- **Internal consistency** (pain point count — should decrease to zero)
- **External coverage** (number of real-world scenarios verified — should increase each iteration)

These are orthogonal. A spec can have zero pain points and zero real-world coverage.
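A minimal sketch of tracking the two metrics separately follows. The `IterationResult` type and the sample numbers are hypothetical; the point is only that "converged" should require both signals.

```python
from dataclasses import dataclass


@dataclass
class IterationResult:
    pain_points: int          # unresolved internal-consistency issues
    scenarios_verified: int   # cumulative real-world scenarios verified


def convergence_report(history: list[IterationResult]) -> dict[str, bool]:
    """Evaluate internal consistency and external coverage independently."""
    latest = history[-1]
    internal_ok = latest.pain_points == 0
    coverage_growing = all(
        later.scenarios_verified > earlier.scenarios_verified
        for earlier, later in zip(history, history[1:])
    )
    external_ok = latest.scenarios_verified > 0 and coverage_growing
    return {
        "internal_consistency": internal_ok,
        "external_coverage": external_ok,
        "converged": internal_ok and external_ok,
    }


# Pain points fall to zero over three iterations, but no real-world
# scenario is ever verified (illustrative numbers).
history = [IterationResult(9, 0), IterationResult(4, 0), IterationResult(0, 0)]
report = convergence_report(history)
print(report)
```

Under the current single-metric view this run looks converged. With the metrics separated, it reports internal consistency without external coverage.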
