Test infrastructure for Go. testkit reads your interfaces and types, generates the test doubles, builders, fixtures, conformance suites, and benchmarks you would otherwise write by hand, and generates the tests that prove the generated code works. You write domain logic. testkit writes plumbing.
```go
//go:generate testkit stub -o storetest/store_stub.gen.go Store
//go:generate testkit builder -o storetest/user_builder.gen.go User
//go:generate testkit suite -o storetest/store_spec.gen.go Store
//go:generate testkit bench -o storetest/store_bench.gen.go Store
//go:generate testkit sentinel -o errors.gen_test.go
//go:generate testkit enum -o status_enum.gen_test.go Status
```

```sh
go generate ./...   # produces *.gen.go and *.gen_test.go
go test ./...       # 100% branch coverage of generated plumbing
```

Each conformance generator targets a tier. Lower tiers run in seconds; higher tiers run for minutes-to-hours and stand in for production load.
| Tier | Generator | What it proves |
|---|---|---|
| 1 | `suite` | Single-call contract per documented directive across 21 method shapes |
| 2-3 | `model` | Property-based state-machine, differential, workload (planned) |
| 4 | `bench` | Allocation, mean latency, and per-percentile latency budgets across 21 shapes |
| 5 | `sim`, `chaos`, `differential-rollout`, `replay` | Subsystem simulation, continuous fault, shadow traffic, trace replay (planned) |

stub provides the runtime primitives the conformance tiers compose with — recording, fault injection, gating, virtual clocks, strict-mode dispatch. sim is the subsystem-level harness Tier 5 generators (chaos, replay) run on top of.
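To make recording and counted fault injection concrete, here is a minimal hand-written sketch of the kind of double `stub` is described as generating. Everything in it (`Store`, `StoreStub`, `FailNext`, `Calls`, `ErrNotFound`) is a hypothetical name chosen for illustration; it is not testkit's generated API.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Store is a hypothetical interface standing in for your own.
type Store interface {
	Get(key string) (string, error)
}

// ErrNotFound is a hypothetical sentinel error.
var ErrNotFound = errors.New("store: not found")

// StoreStub is a hand-written approximation of a test double.
// It records every call and can fail the next N calls with a chosen error.
type StoreStub struct {
	mu        sync.Mutex
	data      map[string]string
	calls     []string // recorded Get keys, in call order
	faultErr  error
	faultLeft int // remaining calls that should fail
}

// Compile-time check that the stub satisfies the interface.
var _ Store = (*StoreStub)(nil)

func NewStoreStub(data map[string]string) *StoreStub {
	return &StoreStub{data: data}
}

// FailNext arms a counted fault: the next n calls return err.
func (s *StoreStub) FailNext(n int, err error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.faultLeft, s.faultErr = n, err
}

// Get records the call, then applies any armed fault before serving data.
func (s *StoreStub) Get(key string) (string, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.calls = append(s.calls, key)
	if s.faultLeft > 0 {
		s.faultLeft--
		return "", s.faultErr
	}
	v, ok := s.data[key]
	if !ok {
		return "", ErrNotFound
	}
	return v, nil
}

// Calls returns a copy of the recorded keys.
func (s *StoreStub) Calls() []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return append([]string(nil), s.calls...)
}

func main() {
	stub := NewStoreStub(map[string]string{"a": "1"})
	stub.FailNext(1, ErrNotFound)

	_, err := stub.Get("a") // the armed fault consumes this call
	fmt.Println(errors.Is(err, ErrNotFound))

	v, err := stub.Get("a") // fault budget exhausted, data is served
	fmt.Println(v, err, len(stub.Calls()))
}
```

The other strategies listed for `stub` (probabilistic, time-windowed, predicate) would swap the counter for a different decision at the same point in `Get`.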
Each generator emits both the artifact and the tests that exercise it. testkit's roadmap covers 14 generators; six ship today and the remaining eight are documented as planned.
| Generator | Tier | What it produces | Injection point |
|---|---|---|---|
| `stub` | primitives | Per-method test doubles with strategy-pattern fault injection (counted, probabilistic, time-windowed, predicate, retry), virtual clock, recording, gates, concurrent-test primitives. Modes: `BenchMode`, `DelegateTo`. Auto-detects `iter.Seq[T]` / `iter.Seq2[V, error]` and emits stream helpers. | Domain state for in-memory companions; per-method overrides via `WithIfaceMethod` constructor options. |
| `builder` | fixtures | Fluent `With*` per exported field, `Append*` for slices, `WithEntry` / `WithEntries` for maps, `WithDataString` for `[]byte`, `Mutate`, `Clone`. Generic types, embedded structs, nested fields, sets-as-maps. | `<Type>Defaults() <Type>` function; optional per-field literal defaults via `//testkit:default`. |
| `sentinel` | static | Prefix consistency, uniqueness, non-overlap (`errors.Is` asymmetry), unwrap-chain, format-string field round-trip, optional-method detection (`Is`, `Unwrap`). | None. |
| `enum` | static | Exhaustiveness, all-values-distinct, stringer round-trip, out-of-range fallback, optional `ParseX` round-trip, optional `MarshalText` / `UnmarshalText` round-trip. | None. |
| `codec` | wire | `codectest.Spec[T]` round-trip suite + benchmark + fuzz seeds + binary wire fixtures (`testdata/wire/*.bin`) regenerated when codec semantics change. Single source of truth across spec and wire. Modes: spec emission, `-update-wire` regeneration. | Sample value overrides via `<Type>Sample()` convention. |
| `suite` | 1 | `Assert<Iface>Contract(t, factory, opts...)` with shape-detected subtests across 21 method shapes plus one subtest per applied directive. Skip-with-diagnostic for missing options. | Factory `func() Iface` closure plus typed `<Iface>On<Method>(...)` plug-in slots and `<Iface>PrePopulate` seeder. |
| `model` | 2-3 | `rapid` property-based state-machine: differential SUT vs reference testing, auto-derived shape-specific laws (`ReadAfterWrite`, `DeleteReturnsNotFound`, `PureDeterminism`, `PredicateConsistency`, `StreamReentrancy`), concurrent stress with Porcupine linearizability checking, trace combinators (`AfterEvery`, `EventuallyAfter`, `Never`), goroutine leak detection, `TestClock` integration for time-aware interfaces, fuzz target generation. (planned) | Factory `func() T`, optional `RefFactory func() T` for differential mode; per-method action helpers auto-emitted by shape; extension via `ExtraActions`, `ExtraLaws`, `WithConcurrent`. |
| `bench` | 4 | `Benchmark<Iface>Contract(b, factory, opts...)` with shape-detected `<Method>/hot-path` and `<Method>/concurrent-4` benchmarks across 21 shapes, plus opt-in `allocs-within-N` / `latency-within-D` / `percentiles` gates per `//testkit:allocs` / `//testkit:latency` / `//testkit:percentiles`. Auto-enables stub `BenchMode`. | Factory `func() Iface` closure plus typed `<Iface>BenchOn<Method>(...)` plug-in slots and `<Iface>BenchPrePopulate` seeder. |
| `sim` | 5 | Subsystem-shaped deterministic simulation harness (`sim.NewDispatcherSim(t, seed, cfg)`) wrapping the full production stack: stubs auto-wrapped with recording-stamped `OnRecord` hooks emitting into the engine trace; `Clock` / `RandSource` plumbed from engine seeds; completion-event sinks; capture-on-failure with minimal-reproducer seed extraction; `Workload[T]` and `Invariant[T]` registration verbs; cooperative-quiescence `AssertAll`. Per-subsystem composition (one Sim per top-level interface). Replaces hand-rolled per-package sim packages. | Top-level interface; `Workload[T]` and `Invariant[T]` registrations; optional seed and dispatcher config. |
| `chaos` | 5 | Continuous deterministic simulation harness driving randomized fault schedules, network partitions, clock skew, and process restarts across operation sequences. Seeded reproducible runs; on failure emits trace + minimal-reproducer seed. Integrates with sim via `OnRecord` hooks for trace correlation. | Faults configuration, `RunFault` / `PartitionSpec` declarations, soak-budget hints. |
| `differential-rollout` | 5 | Shadow-traffic harness running an interface across N implementations in parallel with response comparison and divergence reporting. Migration-grade testing; pluggable equivalence relations for non-deterministic fields (timestamps, IDs). | Implementation list, equivalence-class declarations, divergence threshold. |
| `replay` | 5 | Trace-replay harness consuming captured production call traces (or sim-engine traces) and replaying them through impls to verify behavioral preservation across versions. | Trace source (file path or producer function), version-skew tolerance hints. |
| `smoke` | CLI surface | CLI command coverage: invokes each declared `cobra.Command` (or equivalent) with sampled flag combinations, asserts exit code and stdout/stderr shape per command. Auto-detects subcommand trees, flag types, required-flag validation. Captures golden output for stable commands; diffs on regression. | None for declared commands; optional flag-value distributions. |
| `pkgdoc` | compliance | Compliance audit-doc skeleton (`docs/compliance/package-audit/<pkg>.md`) with REQ table, refactor-history banner, evidence section. Auto-fills mechanical parts; refreshes when source changes; validates REQ IDs against source directives. | Domain analysis (design notes, refactor narrative, exceptions). |

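
As an illustration of the `builder` row above, the following sketch shows what a fluent builder with `With*`, `Append*`, `Clone`, and a `<Type>Defaults()` seed might look like when written by hand. The `User` type, its fields, and the `NewUserBuilder` constructor are assumptions made for this example; the generated code may differ.

```go
package main

import "fmt"

// User is a hypothetical domain type.
type User struct {
	Name  string
	Email string
	Tags  []string
}

// UserDefaults follows the <Type>Defaults() convention named in the table.
// The values are made up for this example.
func UserDefaults() User {
	return User{Name: "test-user", Email: "test@example.com"}
}

// UserBuilder accumulates field overrides on top of the defaults.
type UserBuilder struct{ u User }

// NewUserBuilder is a hypothetical constructor name.
func NewUserBuilder() *UserBuilder { return &UserBuilder{u: UserDefaults()} }

func (b *UserBuilder) WithName(n string) *UserBuilder  { b.u.Name = n; return b }
func (b *UserBuilder) WithEmail(e string) *UserBuilder { b.u.Email = e; return b }

// AppendTags adds to the slice field instead of replacing it.
func (b *UserBuilder) AppendTags(tags ...string) *UserBuilder {
	b.u.Tags = append(b.u.Tags, tags...)
	return b
}

// Clone copies the builder, including the slice, so branches do not alias.
func (b *UserBuilder) Clone() *UserBuilder {
	c := *b
	c.u.Tags = append([]string(nil), b.u.Tags...)
	return &c
}

// Build returns an independent copy of the accumulated value.
func (b *UserBuilder) Build() User {
	out := b.u
	out.Tags = append([]string(nil), b.u.Tags...)
	return out
}

func main() {
	base := NewUserBuilder().AppendTags("admin")
	alice := base.Clone().WithName("alice").Build()
	bob := base.Clone().WithName("bob").AppendTags("ops").Build()
	fmt.Println(alice.Name, alice.Tags)
	fmt.Println(bob.Name, bob.Tags)
}
```

Cloning before branching keeps the shared base fixture untouched, which is the main reason to give a builder a `Clone` method at all.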
Conformance generators are driven by //testkit: directives on interface methods. Directives are machine-readable, grep-able, and validated at codegen time — unknown directives error, conflicting combinations error, redundant pairs warn.
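A sketch of how directives might sit on an interface. The directive names (`errors`, `idempotent`, `read-after-write`) come from this document; the space-separated argument syntax, the `Store` interface, and the `memStore` implementation are assumptions for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel error.
var ErrNotFound = errors.New("store: not found")

// User is a hypothetical domain type.
type User struct{ ID, Name string }

// Store is a hypothetical annotated interface. The //testkit: lines are
// plain comments to the Go compiler; their argument syntax is assumed.
type Store interface {
	// Get returns the user stored under id.
	//
	//testkit:errors ErrNotFound
	Get(id string) (User, error)

	// Put stores u under u.ID.
	//
	//testkit:idempotent
	//testkit:read-after-write Get
	Put(u User) error
}

// memStore is a trivial in-memory implementation used to run the example.
type memStore struct{ users map[string]User }

func newMemStore() *memStore { return &memStore{users: map[string]User{}} }

func (m *memStore) Get(id string) (User, error) {
	u, ok := m.users[id]
	if !ok {
		return User{}, ErrNotFound
	}
	return u, nil
}

func (m *memStore) Put(u User) error {
	m.users[u.ID] = u
	return nil
}

func main() {
	var s Store = newMemStore()
	_, err := s.Get("u1")
	fmt.Println(errors.Is(err, ErrNotFound))

	_ = s.Put(User{ID: "u1", Name: "alice"})
	_ = s.Put(User{ID: "u1", Name: "alice"}) // repeating Put leaves the same state
	u, _ := s.Get("u1")
	fmt.Println(u.Name)
}
```

Because the directives are ordinary comments, something like `grep -rn '//testkit:' .` lists the annotated methods in a package.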
The matrix below covers every directive shipped in generator/directive/known.go, grouped by intent. Columns reflect the shipped consumers (stub, suite, bench, builder, sentinel); the planned generators (model, sim, chaos, differential-rollout, replay, smoke, codec, pkgdoc) document their planned directive surface in their own doc files. enum is directive-free.
| Directive | Args | stub | suite | bench |
|---|---|---|---|---|
| `errors` | `<ErrName>...` | ✓ `Fault<Sentinel>()` helpers | ✓ `<Method>/returns <ErrX>` | — |
| `wrapped-via` | `<ErrName>` | ✓ wraps via target in helpers | ✓ `<Method>/wrapped-via` | — |

| Directive | Args | stub | suite | bench |
|---|---|---|---|---|
| `idempotent` | — | — | ✓ `<Method>/idempotent` | — |
| `pure` | — | — | ✓ `<Method>/pure` | — |
| `cacheable` | — | — | ✓ `<Method>/cacheable` (implies pure) | — |
| `monotonic` | — | — | ✓ `<Method>/monotonic` | — |

| Directive | Args | stub | suite | bench |
|---|---|---|---|---|
| `concurrent` | — | — | ✓ `<Method>/concurrent` | — |
| `concurrent-readers` | — | — | ✓ `<Method>/concurrent-readers` | — |
| `nilsafe` | — | — | ✓ `<Method>/nilsafe` | — |
| `atomic` | — | — | ✓ `<Method>/atomic` | — |

| Directive | Args | stub | suite | bench |
|---|---|---|---|---|
| `ctx` | — | — | (documentation hint; auto-emitted ctx subtests cover the semantics) | — |
| `timeout` | `<duration>` | — | ✓ `<Method>/timeout` | — |
| `deprecated` | `<Replacement>` | ✓ `tb.Logf` in dispatch + `// Deprecated:` doc comment | ✓ `<Method>/deprecated` (skip with hint) | — |
| `lease` | `<ReleaseMethod>` | — | ✓ `<Method>/lease` | — |
| `integration-only` | — | ✓ skip dispatch | ✓ skip method block | ✓ skip method helper |

| Directive | Args | stub | suite | bench |
|---|---|---|---|---|
| `allocs` | `<N>` | — | — | ✓ `<Method>/allocs-within-N` (gate) |
| `latency` | `<duration>` | — | — | ✓ `<Method>/latency-within-D` (gate) |
| `percentiles` | `p<N>=<duration>...` | — | — | ✓ `<Method>/percentiles` (per-percentile gate; reports p50/p95/p99) |
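
To show what a per-percentile gate has to compute, here is a minimal sketch that derives nearest-rank percentiles from latency samples and reports any that exceed their budget. The nearest-rank method and every name here (`percentile`, `checkBudgets`, `Budgets`, `rampSamples`) are assumptions for this example; how testkit actually measures and reports percentiles is not specified in this document.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Budgets maps a percentile (for example 99) to its latency budget.
type Budgets map[int]time.Duration

// percentile returns the nearest-rank p-th percentile (1 <= p <= 100).
// It sorts a copy, so the caller's slice is left untouched.
func percentile(samples []time.Duration, p int) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	rank := (p*len(sorted) + 99) / 100 // integer ceil(p/100 * n)
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

// checkBudgets returns one message per percentile that exceeds its budget,
// ordered by percentile so the output is deterministic.
func checkBudgets(samples []time.Duration, budgets Budgets) []string {
	ps := make([]int, 0, len(budgets))
	for p := range budgets {
		ps = append(ps, p)
	}
	sort.Ints(ps)
	var violations []string
	for _, p := range ps {
		if got := percentile(samples, p); got > budgets[p] {
			violations = append(violations,
				fmt.Sprintf("p%d=%v exceeds budget %v", p, got, budgets[p]))
		}
	}
	return violations
}

// rampSamples returns n synthetic samples: 1µs, 2µs, ..., nµs.
func rampSamples(n int) []time.Duration {
	out := make([]time.Duration, n)
	for i := range out {
		out[i] = time.Duration(i+1) * time.Microsecond
	}
	return out
}

func main() {
	samples := rampSamples(100)
	budgets := Budgets{50: 60 * time.Microsecond, 99: 90 * time.Microsecond}
	for _, v := range checkBudgets(samples, budgets) {
		fmt.Println(v)
	}
}
```

With the synthetic ramp, p50 stays inside its 60µs budget while p99 exceeds 90µs, so only the p99 gate reports a violation.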

| Directive | Args | stub | suite | bench |
|---|---|---|---|---|
| `retryable` | — | ✓ companion marker for `retry-succeeds-on-attempt` | — | — |
| `retry-succeeds-on-attempt` | `<N>` | ✓ `RetrySchedule(err)` helper | ✓ `<Method>/retry-succeeds-on-attempt` | — |

| Directive | Args | stub | suite | bench |
|---|---|---|---|---|
| `sideeffect` | `<Method>` | — | ✓ `<Method>/sideeffect` | — |
| `order-after` | `<Method>` | ✓ `AssertAfter` (strict mode) | ✓ `<Method>/order-after` | — |
| `partition` | `<Field>` | ✓ `FaultForPartition`, `FaultForOtherPartitions` | ✓ `<Method>/partition` | — |

| Directive | Args | stub | suite | bench |
|---|---|---|---|---|
| `validates` | `<Field>` | — | ✓ `<Method>/validates` | — |
| `bounded` | `<min..max>` | — | ✓ `<Method>/bounded` | — |

| Directive | Args | stub | suite | bench |
|---|---|---|---|---|
| `invariant` | `<description>...` | — | (documentation hint) | — |
| `fuzz` | — | — | (planned) | — |
| `hooks` | `<HookName>...` | — | ✓ `<Method>/hooks` | — |
| `req` | `<REQ-ID>...` | — | ✓ name suffix on emitted subtests | ✓ name suffix on emitted subtests |

| Directive | Args | stub | suite | bench |
|---|---|---|---|---|
| `eventually` | `<timeout>` | — | ✓ `<Method>/eventually` | — |
| `scope` | `<ScopeName>` | — | ✓ `<Method>/scope` | — |
| `pagination` | `<CursorField>` | — | ✓ `<Method>/pagination` | — |

| Directive | Args | Effect |
|---|---|---|
| `deleter` | — | Routes `func(ctx?, K) error` to Deleter shape (vs. Writer). |
| `mutator` | — | Marks `func(ctx?, V)` as Mutator (auto-detected from signature). |
| `not-mutator` | — | Opt-out of Mutator auto-detection (treat as Writer). |
| `keyfield` | `<FieldName>` | Reference-synthesis hint for the planned model generator. |

| Directive | Args | stub | suite | bench |
|---|---|---|---|---|
| `sample` | `<Func>...` | — | ✓ replaces zero-value args in smoke / plug-in / hot-path call sites | ✓ replaces synthesized literals in hot-path / gate calls |

Each invariant directive has a first-class consumer: the suite emits a paired-method subtest inside the carrier method's t.Run block. The cross directive remains the escape hatch for invariants not yet shaped into a per-invariant directive.

| Directive | Args | suite |
|---|---|---|
| `read-after-write` | `<Reader>` | ✓ `<Method>/read-after-write` |
| `delete-removes` | `<Reader>` | ✓ `<Method>/delete-removes` |
| `stream-reflects-mutations` | `<Stream>` | ✓ `<Method>/stream-reflects-mutations` |
| `lifecycle-after-close` | `<Reader>` | ✓ `<Method>/lifecycle-after-close` |
| `crdt-merge` | `<Other>` | ✓ `<Method>/crdt-merge` |
| `cross` | `<name> <Methods>...` | ✓ generic invariant escape hatch |
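
A paired-method invariant such as `read-after-write` boils down to a small check across two methods. The sketch below writes that check by hand against a hypothetical string-keyed `Store`; the interface, the helper `checkReadAfterWrite`, and both implementations are illustrative assumptions, not the subtest the suite emits.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical writer/reader pair.
type Store interface {
	Put(key, value string) error
	Get(key string) (string, error)
}

// checkReadAfterWrite returns an error if a value written with Put
// is not returned by the very next Get for the same key.
func checkReadAfterWrite(s Store, key, value string) error {
	if err := s.Put(key, value); err != nil {
		return fmt.Errorf("Put: %w", err)
	}
	got, err := s.Get(key)
	if err != nil {
		return fmt.Errorf("Get after Put: %w", err)
	}
	if got != value {
		return fmt.Errorf("read-after-write violated: got %q, want %q", got, value)
	}
	return nil
}

// mapStore is a correct in-memory implementation.
type mapStore struct{ m map[string]string }

func newMapStore() *mapStore { return &mapStore{m: map[string]string{}} }

func (s *mapStore) Put(k, v string) error { s.m[k] = v; return nil }

func (s *mapStore) Get(k string) (string, error) {
	v, ok := s.m[k]
	if !ok {
		return "", errors.New("not found")
	}
	return v, nil
}

// dropStore is deliberately broken: it accepts writes and discards them.
type dropStore struct{}

func (dropStore) Put(k, v string) error        { return nil }
func (dropStore) Get(k string) (string, error) { return "", errors.New("not found") }

func main() {
	fmt.Println(checkReadAfterWrite(newMapStore(), "k", "v"))
	fmt.Println(checkReadAfterWrite(dropStore{}, "k", "v"))
}
```

The same check run against both implementations passes for `mapStore` and fails for `dropStore`, which is the property a conformance subtest is meant to surface.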

| Directive | Args | Generator | Effect |
|---|---|---|---|
| `sentinel-no-overlap-with` | `<ImportPath>...` | sentinel | Declare additional packages to verify sentinel non-overlap with. |
| `default` | `<Value>` | builder | Per-field literal default seeded into `Build()` when no Defaults factory exists. |

Composition is enforced: pure and sideeffect together is a codegen error; pure and monotonic together is a codegen error; retry-succeeds-on-attempt requires retryable; cacheable implies pure and inherits its conflicts; concurrent and concurrent-readers are mutually exclusive.
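The composition rules above are simple enough to express as a lookup over one method's directive set. This is a minimal sketch of such a check, not testkit's validator; the function name `validate` and the message wording are assumptions.

```go
package main

import "fmt"

// validate checks one method's directive names against the composition
// rules listed above and returns one message per violation.
func validate(directives []string) []string {
	set := map[string]bool{}
	for _, d := range directives {
		set[d] = true
	}
	// cacheable implies pure, so it inherits pure's conflicts.
	// Messages therefore name "pure" even when only cacheable was written.
	if set["cacheable"] {
		set["pure"] = true
	}

	var violations []string
	conflict := func(a, b string) {
		if set[a] && set[b] {
			violations = append(violations, fmt.Sprintf("%s conflicts with %s", a, b))
		}
	}
	conflict("pure", "sideeffect")
	conflict("pure", "monotonic")
	conflict("concurrent", "concurrent-readers")

	if set["retry-succeeds-on-attempt"] && !set["retryable"] {
		violations = append(violations, "retry-succeeds-on-attempt requires retryable")
	}
	return violations
}

func main() {
	fmt.Println(validate([]string{"cacheable", "monotonic"}))
	fmt.Println(validate([]string{"retry-succeeds-on-attempt"}))
	fmt.Println(validate([]string{"retryable", "retry-succeeds-on-attempt", "idempotent"}))
}
```

Expanding `cacheable` into `pure` before checking conflicts is what makes the "inherits its conflicts" rule fall out without a separate case.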
A focused utility set that earns its place. Every assertion uses go-cmp for structural diffs, not reflect.DeepEqual.
```go
testkit.Equal(t, got, want, "Get must return the stored item")
testkit.ErrorIs(t, err, store.ErrNotFound, "Get on missing key must return ErrNotFound")
testkit.Assert(t, user).
	IsNotNil("must exist").
	HasLen(3, "must have 3 fields populated")

c := testkit.StartContract(b).AllocsMax(0).LatencyMax(5 * time.Microsecond)
for c.Loop() {
	store.Get(b.Context(), key)
}
c.End()

rec := testkit.NewRecorder[PutCall]()
rec.OnRecord(func(c PutCall) { trace.Append(tick, c) })
rec.WaitForN(t, 3, 5*time.Second)
gate := rec.NewGate()
```

Full reference: docs/testkit/primitives/.
Static CI checks. No test execution required; pure code and config analysis. testkit ships 18 validators across four categories.
- Structural: proto-sync, migration chain, depguard, wire freshness, error prefix, skip expiry.
- Test quality: assertion-free tests, test naming, `time.Sleep` detection, orphaned test doubles, parallel safety, contract-benchmark completeness.
- Quality gates: benchmark contracts, benchmark regression vs baseline, per-layer coverage thresholds, per-layer mutation thresholds.
- Compliance: audit-doc completeness, REQ-to-test traceability.

Full reference: docs/testkit/validators/.
```sh
go install go.thesmos.sh/testkit/cmd/testkit@latest
go get go.thesmos.sh/testkit@latest
```

Add directives to the package that owns the types:

```go
// store/generate.go
package store

//go:generate testkit stub -o storetest/store_stub.gen.go Store
//go:generate testkit builder -o storetest/user_builder.gen.go User
//go:generate testkit suite -o storetest/store_spec.gen.go Store
//go:generate testkit sentinel -o errors.gen_test.go
```

Generate, scaffold the companion file, fill in domain logic, and run tests:

```sh
go generate ./...
testkit scaffold stub storetest Store
$EDITOR storetest/store_stub.go
go test ./...
```

Generated files default to your existing layout. Override with -o for any non-default placement.

| Generator | Default output |
|---|---|
| `stub` | `<pkg>test/<subject>_stub.gen.go` |
| `builder` | `<pkg>test/<subject>_builder.gen.go` |
| `suite` | `<pkg>test/<subject>_spec.gen.go` |
| `model` | `<pkg>test/<subject>_model.gen.go` |
| `bench` | `<pkg>test/<subject>_bench.gen.go` |
| `sim` | `<pkg>test/<subject>_sim.gen.go` |
| `chaos` | `<pkg>test/<subject>_chaos.gen.go` |
| `replay` | `<pkg>test/<subject>_replay.gen.go` |
| `sentinel` | `errors.gen_test.go` |
| `enum` | `<subject>_enum.gen_test.go` |
| `codec` | `<subject>_codec.gen_test.go` |
| `smoke` | `cmd/<binary>/smoke.gen_test.go` |
| `pkgdoc` | `docs/compliance/package-audit/<pkg>.md` |

Combine multiple types into one file by passing several arguments:
```go
//go:generate testkit enum -o status_enum.gen_test.go OrderStatus PaymentStatus RefundStatus
```

```makefile
check: lint test check-structural check-test-quality check-quality check-compliance

check-structural:
	go generate ./... && git diff --exit-code
	testkit validate proto-sync migration depguard wire error-prefix skip-expiry

check-test-quality:
	testkit validate assertion-free test-naming time-sleep \
		orphaned-doubles parallel-safety contract-completeness

check-quality:
	testkit validate benchmarks bench-regression coverage mutation

check-compliance:
	testkit validate audit reqs
```

Pre-1.0. The runtime primitives (MethodStub[T], Recorder[T], FaultInjector, StartContract, shape-typed assertion and bench contexts, golden-file helpers) and the generator engine (generator/) are stable. Six generators ship today: stub, builder, sentinel, enum, suite, bench. The remaining eight (model, sim, chaos, differential-rollout, replay, smoke, codec, pkgdoc) are designed and documented but not yet implemented. Generator vocabulary and directive semantics may change in minor versions until the V1 cut. Consumers should pin and regenerate on upgrade.

V1 commits to:
- Stable directive vocabulary and composition rules.
- Stable generated-file layout and naming conventions.
- Backward-compatible runtime primitives (additive only).
- Documented deprecation cycle for any directive removal.
Core testkit package:
- `github.com/google/go-cmp`: structural diffs for assertions.
- `pgregory.net/rapid`: property-based generators for `model` and stub stream helpers.

Optional sub-packages with isolated dependencies:
| Package | Adds |
|---|---|
| `testkit/container` | `testcontainers-go` |
| `testkit/httptest` | stdlib only |
| `testkit/oteltest` | `go.opentelemetry.io/otel/sdk` |
| `testkit/clitest` | stdlib only |

- Primitives: assertions, recording, fault injection, benchmarking, golden files, polling.
- Generators: per-generator semantics, output, injection points.
- Validators: 18 CI checks.
- Configuration: `.testkit.yml` reference.
- Layout: test package directory structure and file roles.
- Linter config: copy-pasteable `.golangci.yml` for testkit consumers.
- Adoption: incremental adoption guide.