Crucible State Machine — Conformance Harness #7
joshua-temple
started this conversation in
State Machine
The conformance harness answers one question: does a machine implementation behave correctly? It rests on three pillars:
1. Oracle comparison: the effects `Fire` produces are compared against a trusted reference implementation for the same input.
2. Golden scenarios: committed event sequences are replayed against the machine and their outcome is asserted.
3. Round-trip identity: a machine built in code and the same machine loaded from JSON ([…] `Provide`) behave identically. This pillar exists because the config/implementation split makes the JSON IR a co-equal authoring format; round-trip identity is what makes that promise enforceable.

The first matters most when a machine is meant to be the canonical model of behavior that some other code path also implements. For example, when a state machine is introduced alongside an existing hand-written handler, you want to prove the two agree before you let the machine take over. The harness is the drift detector.
The shape of the problem
You have two implementations of the same behavior:
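- The reference: existing imperative code whose effects leave through a seam (a publisher, a writer, an RPC client).
- The machine: `m.Cast(state).Fire(ctx, event)` returns effects directly (no IO).

Conformance proves: for the same starting entity and the same event, both sides emit equivalent effects.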
```mermaid
flowchart TD
    Entity[precondition entity] --> Snap[snapshot entity]
    Snap --> Ref[run reference implementation<br/>capture effects via intercepting sink]
    Ref --> Reset[reset entity to snapshot]
    Reset --> Fire[m.Cast state then Fire ctx event<br/>capture result.Effects]
    Fire --> Diff[DiffEffects with tolerances]
    Diff --> Verdict{equal?}
    Verdict -->|yes| Pass[MATCH]
    Verdict -->|no| Fail[MISMATCH: side-by-side field diff]
```

The snapshot-and-reset on a single entity instance is deliberate: running both halves against the same entity avoids cosmetic timestamp/ID drift that two freshly-built fixtures would introduce, so tolerances stay as tight as possible.
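The flow above can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not the harness's actual code: `Effect`, `Entity`, `Sink`, `recordingSink`, `referenceApprove`, `machineFire`, and `runBothHalves` are all hypothetical names, and `machineFire` merely stands in for `m.Cast(state).Fire(ctx, event)`.

```go
package main

import "fmt"

// Effect is a hypothetical captured side effect: a kind plus flat fields.
type Effect struct {
	Kind   string
	Fields map[string]string
}

// Entity is a hypothetical domain entity whose State drives the machine.
type Entity struct {
	ID    string
	State string
}

// Sink is the seam the reference implementation publishes through.
type Sink interface {
	Publish(e Effect) error
}

// recordingSink records every effect and performs no real IO.
type recordingSink struct{ effects []Effect }

func (s *recordingSink) Publish(e Effect) error {
	s.effects = append(s.effects, e)
	return nil // report success so the reference code proceeds normally
}

// referenceApprove stands in for the existing hand-written handler.
func referenceApprove(ent *Entity, sink Sink) error {
	if ent.State != "pending" {
		return fmt.Errorf("cannot approve from %q", ent.State)
	}
	ent.State = "approved"
	return sink.Publish(Effect{Kind: "OrderApproved", Fields: map[string]string{"id": ent.ID}})
}

// machineFire stands in for m.Cast(state).Fire(ctx, event): it performs
// no IO and returns its effects directly.
func machineFire(ent Entity, event string) ([]Effect, error) {
	if ent.State == "pending" && event == "approve" {
		return []Effect{{Kind: "OrderApproved", Fields: map[string]string{"id": ent.ID}}}, nil
	}
	return nil, fmt.Errorf("no transition for %q from %q", event, ent.State)
}

// runBothHalves snapshots the entity, runs the reference against an
// intercepting sink, resets the entity, then fires the machine.
func runBothHalves(ent *Entity, event string) (ref, got []Effect, err error) {
	// A value copy is enough here because Entity has no pointer or map
	// fields; a real entity may need a deep copy.
	snapshot := *ent
	sink := &recordingSink{}
	err = referenceApprove(ent, sink)
	*ent = snapshot // reset so the machine sees the same precondition
	if err != nil {
		return nil, nil, err
	}
	got, err = machineFire(*ent, event)
	return sink.effects, got, err
}
```

The two returned slices are what a diff step would then compare. Resetting before checking the reference's error keeps the entity usable for further cases even when the reference rejects the event.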
Capturing the reference's effects
The reference implementation reaches the outside world through some seam: a publisher, a writer, an RPC client. Swap an intercepting sink into that seam for the duration of the test: it records every effect and performs no real IO, while the reference code proceeds as if the call succeeded. The machine half needs no sink, because `Fire` returns its effects directly.
Equivalence & the diff
`DiffEffects` compares two ordered slices of captured effects positionally: effect `i` against effect `i`. Each pair is compared field by field under the configured tolerances, and an effect with no counterpart at its position is reported as a `missing`/`extra` mismatch.

On mismatch the harness prints a side-by-side, field-level diff so the reviewer can see exactly which field diverged and on which side, and whether a field that passed did so only because it was within tolerance.
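As an illustration of positional comparison, the sketch below diffs two effect slices index by index. It is not the real `DiffEffects` signature: `diffEffects`, `Mismatch`, and the flat `Effect` shape are assumptions, tolerances are left out, and the meaning given to `missing` (the reference emitted it, the machine did not) and `extra` (the reverse) is a guess at the convention.

```go
package main

import (
	"fmt"
	"sort"
)

// Effect is a hypothetical captured effect with flat string fields.
type Effect struct {
	Kind   string
	Fields map[string]string
}

// Mismatch describes one divergence at a given position.
type Mismatch struct {
	Index  int
	Reason string // "missing", "extra", "kind", or "field"
	Field  string
	Ref    string
	Got    string
}

// String renders the mismatch as one side-by-side line.
func (m Mismatch) String() string {
	return fmt.Sprintf("effect[%d] %s %s: ref=%q got=%q", m.Index, m.Reason, m.Field, m.Ref, m.Got)
}

// diffEffects compares ref[i] against got[i] for every position.
func diffEffects(ref, got []Effect) []Mismatch {
	var out []Mismatch
	n := len(ref)
	if len(got) > n {
		n = len(got)
	}
	for i := 0; i < n; i++ {
		switch {
		case i >= len(got): // reference emitted it, machine did not
			out = append(out, Mismatch{Index: i, Reason: "missing", Ref: ref[i].Kind})
		case i >= len(ref): // machine emitted it, reference did not
			out = append(out, Mismatch{Index: i, Reason: "extra", Got: got[i].Kind})
		case ref[i].Kind != got[i].Kind:
			out = append(out, Mismatch{Index: i, Reason: "kind", Ref: ref[i].Kind, Got: got[i].Kind})
		default:
			out = append(out, diffFields(i, ref[i].Fields, got[i].Fields)...)
		}
	}
	return out
}

// diffFields reports differing fields in sorted order so output is stable.
// A key absent on one side compares as the empty string.
func diffFields(i int, ref, got map[string]string) []Mismatch {
	seen := map[string]bool{}
	for k := range ref {
		seen[k] = true
	}
	for k := range got {
		seen[k] = true
	}
	names := make([]string, 0, len(seen))
	for k := range seen {
		names = append(names, k)
	}
	sort.Strings(names)
	var out []Mismatch
	for _, k := range names {
		if ref[k] != got[k] {
			out = append(out, Mismatch{Index: i, Reason: "field", Field: k, Ref: ref[k], Got: got[k]})
		}
	}
	return out
}
```

Sorting field names before comparing keeps the report deterministic, which matters when the diff itself ends up in CI logs.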
Tolerances
Some fields can differ legitimately. Tolerances are configurable, with sensible defaults.
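One way such a configuration could look is sketched below. Only the names `AnyUUID` and `IgnoreFields` come from the harness; the `Tolerances` struct, the `TimeSkew` field, the `fieldEqual` helper, and the assumption that field values are strings with RFC 3339 timestamps are all hypothetical. The second return value marks a field that passed only because a tolerance applied.

```go
package main

import (
	"regexp"
	"time"
)

// Tolerances is a hypothetical configuration shape.
type Tolerances struct {
	TimeSkew     time.Duration   // assumed: max allowed gap between timestamps
	AnyUUID      map[string]bool // fields where any two well-formed UUIDs match
	IgnoreFields map[string]bool // fields skipped entirely
}

var uuidRE = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// fieldEqual reports whether ref and got are equivalent for the named
// field, and whether that equivalence relied on a tolerance.
func (t Tolerances) fieldEqual(name, ref, got string) (equal, tolerated bool) {
	if ref == got {
		return true, false // exact match, no tolerance involved
	}
	if t.IgnoreFields[name] {
		return true, true
	}
	if t.AnyUUID[name] && uuidRE.MatchString(ref) && uuidRE.MatchString(got) {
		return true, true
	}
	// Treat both values as timestamps only if both parse as RFC 3339.
	rt, errR := time.Parse(time.RFC3339, ref)
	gt, errG := time.Parse(time.RFC3339, got)
	if errR == nil && errG == nil {
		d := rt.Sub(gt)
		if d < 0 {
			d = -d
		}
		if d <= t.TimeSkew {
			return true, true
		}
	}
	return false, false
}
```

Checking exact equality first means a tolerance is only ever consulted for fields that actually differ, so the "passed within tolerance" flag stays meaningful.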
`AnyUUID` and `IgnoreFields` are escape hatches: every entry is a coverage hole, so use them sparingly.
Round-trip identity (pillar 3)
The config/implementation split promises that a machine authored in Go and a machine loaded from JSON are the same machine. Round-trip identity is the test that keeps that promise honest. It is a v1-core conformance check, run for every domain machine.
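The serialization side of such a check can be sketched with the standard library alone: serialize, load, serialize again, and compare bytes. The `IR` and `Transition` types here are a hypothetical, much-reduced stand-in for the real JSON IR, and `roundTripStable` is an illustrative helper name.

```go
package main

import (
	"bytes"
	"encoding/json"
)

// Transition is a hypothetical IR entry; Action is a named reference
// and Params its bound parameters.
type Transition struct {
	From   string            `json:"from"`
	Event  string            `json:"event"`
	To     string            `json:"to"`
	Action string            `json:"action,omitempty"`
	Params map[string]string `json:"params,omitempty"`
}

// IR is a hypothetical, much-reduced machine description.
type IR struct {
	Name        string       `json:"name"`
	Initial     string       `json:"initial"`
	Transitions []Transition `json:"transitions"`
}

// roundTripStable serializes ir, loads the result back, serializes it
// again, and reports whether the two serialized forms are byte-identical.
func roundTripStable(ir IR) (bool, error) {
	first, err := json.Marshal(ir)
	if err != nil {
		return false, err
	}
	var loaded IR
	if err := json.Unmarshal(first, &loaded); err != nil {
		return false, err
	}
	second, err := json.Marshal(loaded)
	if err != nil {
		return false, err
	}
	return bytes.Equal(first, second), nil
}
```

`encoding/json` writes map keys in sorted order, which is what makes the `Params` map safe to compare byte for byte here; a custom encoder would need to guarantee the same.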
Two checks, both required: structural (the IR serializes to a byte-stable form; step 4) and behavioral (every golden scenario produces an identical `ScenarioResult` against the code-built and the JSON-loaded machine; step 5). Because behavior is named-ref + params bound from the same registry, "identical" is exact, not approximate, so no tolerances are needed on this pillar. A divergence here means the IR is lossy or the registry binding drifted, and is a hard failure.
Golden scenarios (pillar 2)
Beyond pairwise oracle comparison, the harness replays golden scenarios (the Scenario JSON format from the JSON, Mermaid & DOT discussion). A scenario fires a known event sequence and asserts the final state, the set of emitted effects, the trace length, and that no errors occurred. Golden scenarios are committed fixtures; a change to the machine that breaks one is a visible diff in CI. They also feed pillar 3 — the same scenarios run against both the code-built and JSON-loaded machine.
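A scenario replay of this kind can be sketched as follows. The `Machine`, `Scenario`, and `ScenarioResult` shapes are assumptions for illustration and do not reproduce the real Scenario JSON format; `LoadFromJSON` simulates a JSON-loaded machine by passing the code-built one through `encoding/json`. Effects are compared in order here for simplicity, which is stricter than a set comparison.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Transition and Machine are a hypothetical, much-reduced machine model.
type Transition struct {
	From   string `json:"from"`
	Event  string `json:"event"`
	To     string `json:"to"`
	Effect string `json:"effect"`
}

type Machine struct {
	Initial     string       `json:"initial"`
	Transitions []Transition `json:"transitions"`
}

// Scenario is an assumed fixture shape: events in, expectations out.
type Scenario struct {
	Events       []string `json:"events"`
	WantState    string   `json:"want_state"`
	WantEffects  []string `json:"want_effects"`
	WantTraceLen int      `json:"want_trace_len"`
}

// ScenarioResult records what a replay produced.
type ScenarioResult struct {
	FinalState string
	Effects    []string
	TraceLen   int
	Err        string
}

// Run fires each event in order and stops at the first unmatched event.
func (m Machine) Run(events []string) ScenarioResult {
	res := ScenarioResult{FinalState: m.Initial}
	for _, ev := range events {
		matched := false
		for _, t := range m.Transitions {
			if t.From == res.FinalState && t.Event == ev {
				res.FinalState = t.To
				res.Effects = append(res.Effects, t.Effect)
				res.TraceLen++
				matched = true
				break
			}
		}
		if !matched {
			res.Err = fmt.Sprintf("no transition for %q from %q", ev, res.FinalState)
			return res
		}
	}
	return res
}

func sameEffects(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// Check asserts no errors, the final state, the effects, and the trace length.
func Check(sc Scenario, res ScenarioResult) error {
	switch {
	case res.Err != "":
		return fmt.Errorf("unexpected error: %s", res.Err)
	case res.FinalState != sc.WantState:
		return fmt.Errorf("final state: want %q, got %q", sc.WantState, res.FinalState)
	case !sameEffects(res.Effects, sc.WantEffects):
		return fmt.Errorf("effects: want %v, got %v", sc.WantEffects, res.Effects)
	case res.TraceLen != sc.WantTraceLen:
		return fmt.Errorf("trace length: want %d, got %d", sc.WantTraceLen, res.TraceLen)
	}
	return nil
}

// LoadFromJSON simulates the JSON-loaded machine via a marshal/unmarshal pass.
func LoadFromJSON(m Machine) (Machine, error) {
	raw, err := json.Marshal(m)
	if err != nil {
		return Machine{}, err
	}
	var out Machine
	err = json.Unmarshal(raw, &out)
	return out, err
}
```

Running the same `Scenario` through `m.Run` and through the result of `LoadFromJSON(m)`, then requiring both `Check` calls to pass and both results to match exactly, is the behavioral half of round-trip identity in miniature.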
Runs as a normal test
Conformance tests run inside the standard test command (`go test ./...`): no separate pipeline, no new CI step. A conformance failure is an ordinary test failure that blocks merge. When a reference implementation eventually retires (the machine becomes the sole source of truth), its conformance test is deleted in the same change. The mapping is 1:1, so there is no orphan risk.
Why this is the right place to invest
The harness is what lets a machine be introduced safely next to existing behavior: you get a continuously-verified proof that the declarative model and the imperative code agree, field for field, before you cut over. It is also the foundation the Phase 2 Visualizer builds on — the same scenario-runs-against-a-machine path powers a stakeholder scenario builder.
Crucible State Machine series: Overview & Roadmap · Kernel Core · HSM · Path Planning · JSON / Mermaid / DOT · Evolution Guide · Conformance · Phase 2