Quality engineering methodologies for AI coding agents. Deterministic concurrency testing, combinatorial coverage, breaking change detection, client resilience patterns, fault injection, state machine testing, contract validation, and observability assertions.
These skills teach AI coding agents (Claude Code, Cursor, etc.) rigorous testing methodologies -- not just "write a test", but how to test concurrent code, what combinations to cover, and which invariants to assert.
| Skill | What It Does | Key Innovation |
|---|---|---|
| barrier-concurrency-testing | Deterministic race condition testing via barriers | Replaces flaky setTimeout-based timing tests with reproducible interleaving |
| breaking-change-detector | 6-category breaking change analysis | Tolerant reader pattern for safe schema evolution |
| fault-injection-testing | Circuit breaker, retry policy, queue preservation | Executable resilience primitives with state machine transitions |
| model-based-testing | State machine transition matrices and guard truth tables | N*N transition coverage, context mutation assertions |
| observability-testing | Structured log assertions and level policy enforcement | Mock logger with 5 assertion types, level classification heuristic |
| pairwise-test-coverage | Combinatorial testing with matrix generator | Zero-dep greedy algorithm covers all factor pairs in near-minimal test cases |
| websocket-client-resilience | Client-side WebSocket resilience patterns | Mobile-aware timeouts, circuit breakers, heartbeat hysteresis |
| zod-contract-testing | Schema boundary testing with compound state matrices | 2^N optional field coverage, refinement assertions, schema evolution |
Prove detection in under a minute. Clone the repo and run one command per skill:
git clone https://github.com/apankov1/quality-engineering.git
cd quality-engineering
# Run all tests across all skills
node --experimental-strip-types --test skills/*/*.spec.ts
# Or run individual skills:
# Barrier concurrency: 7 tests for deterministic race condition patterns
node --experimental-strip-types --test skills/barrier-concurrency-testing/test-fixtures.spec.ts
# Breaking change detector: 16 tests for field classification, schema validation, event type changes
node --experimental-strip-types --test skills/breaking-change-detector/breaking-change.spec.ts
# Fault injection: 29 tests for circuit breaker, retry policy, queue preservation
node --experimental-strip-types --test skills/fault-injection-testing/fault-injection.spec.ts
# Model-based testing: 31 tests for state machines, guard truth tables, context mutations
node --experimental-strip-types --test skills/model-based-testing/state-machine.spec.ts
# Observability testing: 27 tests for mock logger, log assertions, level classification
node --experimental-strip-types --test skills/observability-testing/structured-logger.spec.ts
# Pairwise coverage: 13 tests including safety rails and edge cases
node --experimental-strip-types --test skills/pairwise-test-coverage/pairwise.spec.ts
# WebSocket resilience: 27 tests for backoff, circuit breaker, heartbeat, command ack, gaps, timeouts
node --experimental-strip-types --test skills/websocket-client-resilience/resilience.spec.ts
# Zod contract testing: 24 tests for schema boundary validation, compound state matrices
node --experimental-strip-types --test skills/zod-contract-testing/schema-boundary.spec.ts
Each skill ships importable utilities alongside its tests. Import what you need:
import { generatePairwiseMatrix } from './skills/pairwise-test-coverage/pairwise.ts';
import { classifyFieldChange } from './skills/breaking-change-detector/breaking-change.ts';
import { circuitBreakerTransition, CommandAckTracker } from './skills/websocket-client-resilience/resilience.ts';
import { createBarrier, createTrackedBarrier, releaseAllBarriers } from './skills/barrier-concurrency-testing/test-fixtures.ts';
import { CircuitBreaker, RetryPolicy, createFaultInjector } from './skills/fault-injection-testing/fault-injection.ts';
import { createStateMachine, testTransitionMatrix, assertGuardTruthTable } from './skills/model-based-testing/state-machine.ts';
import { createMockLogger, assertLogEntry, assertNoLogsAbove } from './skills/observability-testing/structured-logger.ts';
import { testValidInput, testInvalidInput, generateCompoundStateMatrix } from './skills/zod-contract-testing/schema-boundary.ts';
Node.js 18-20 (LTS)? Replace node --experimental-strip-types with npx tsx:
npx tsx --test skills/*/*.spec.ts
# Install a single skill
npx skills add apankov1/quality-engineering --skill barrier-concurrency-testing
# Install another
npx skills add apankov1/quality-engineering --skill pairwise-test-coverage
Do not test race conditions with setTimeout and hope. This skill teaches agents to use barriers -- deterministic interleave points that make concurrency tests reproducible on every run.
- Barrier interface + tracked cleanup pattern
- 5 named invariant assertions for queue/sequence correctness
- Deferred promise alternative for simple cases
- Decision guide: when to use barriers vs deferred
- Violation rules: inadequate_barrier_coverage, flaky_timing_test
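The barrier idea can be illustrated with a minimal sketch. Every name here (makeSketchBarrier, SketchCounter, demonstrateLostUpdate) is hypothetical and is not the repo's createBarrier API; the sketch only assumes a barrier exposes wait() and release(). It forces two read-modify-write calls to both read before either writes, so the lost update shows up on every run instead of depending on timer luck.

```typescript
// Hypothetical names throughout; this is an illustration, not the repo's createBarrier.
interface SketchBarrier {
  wait(): Promise<void>;
  release(): void;
}

function makeSketchBarrier(): SketchBarrier {
  let open: () => void = () => {};
  // The Promise executor runs synchronously, so `open` is wired up before we return.
  const gate = new Promise<void>((resolve) => {
    open = () => resolve();
  });
  return { wait: () => gate, release: () => open() };
}

// Code under test: a read-modify-write with an injectable pause between read and write.
class SketchCounter {
  value = 0;
  async increment(afterRead: () => Promise<void>): Promise<void> {
    const seen = this.value; // read
    await afterRead(); // interleave point controlled by the test
    this.value = seen + 1; // write based on a possibly stale read
  }
}

async function demonstrateLostUpdate(): Promise<number> {
  const counter = new SketchCounter();
  const barrier = makeSketchBarrier();
  // Each call runs synchronously up to the barrier, so both read 0.
  const first = counter.increment(() => barrier.wait());
  const second = counter.increment(() => barrier.wait());
  barrier.release();
  await Promise.all([first, second]);
  return counter.value; // one of the two increments is lost
}
```

Because the interleaving is fixed by the barrier rather than by a delay, a test built this way fails for the same reason on every run, which is the property the skill is after.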
Detects breaking changes across 6 categories that could disrupt active sessions or lose client compatibility. Uses the tolerant reader pattern for safe schema evolution.
- breaking-change.ts -- Field change classifier and serialized schema validator
- 6 detection categories: contracts, database schema, RPC/API, WebSocket protocol, serialized state, event sourcing
- Backward compatibility checklist for schema/contract changes
- Output format template: CRITICAL (disrupts sessions) / WARNING (migration required) / SAFE
- Violation rules: contract_field_removal, schema_without_catch, strict_parse_in_deserialize, migration_drops_column, endpoint_removed, event_type_renamed
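As a rough illustration of the tolerant reader pattern, the sketch below deserializes a stored session while ignoring unknown fields and defaulting a field that older payloads lack. The SessionSnapshot shape, the readSessionTolerantly name, and the 'light' default are assumptions made up for this example; the repo's own classifier and validator may look quite different.

```typescript
// Hypothetical shape: `theme` is assumed to have been added in a later version.
interface SessionSnapshot {
  userId: string;
  theme: string;
}

// Tolerant reader sketch: require only what is essential, default what is
// missing, and drop what is unknown instead of rejecting the whole payload.
function readSessionTolerantly(raw: unknown): SessionSnapshot | null {
  if (typeof raw !== 'object' || raw === null) return null;
  const fields = raw as Record<string, unknown>;

  // Required field: without it the payload is unusable.
  if (typeof fields.userId !== 'string') return null;

  // Field added later: older payloads get a default rather than a parse error.
  const theme = typeof fields.theme === 'string' ? fields.theme : 'light';

  // Unknown fields (e.g. written by a newer client) are simply not copied.
  return { userId: fields.userId, theme };
}
```

A strict parser in the same position would reject both the older payload (missing theme) and the newer one (extra field), which is the kind of session disruption the CRITICAL category describes.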
Simulates storage and network failures for resilience testing. Ships executable circuit breaker, retry policy with exponential backoff, and queue preservation assertions.
- fault-injection.ts -- Circuit breaker state machine (closed/open/half-open), RetryPolicy with jitter, fault injector, queue assertions
- Circuit breaker with configurable failure/success thresholds and reset timeout
- Exponential backoff with jitter and max delay cap
- Queue preservation and trimming assertions for message ordering
- Violation rules: missing_circuit_breaker_test, missing_retry_backoff_test, missing_queue_preservation_test
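The closed/open/half-open cycle can be sketched as a pure transition function with an injected clock, which keeps the test free of real timers. This is a simplified illustration under assumptions: stepBreaker, BreakerSnapshot, and BreakerConfig are hypothetical names, and a single successful probe closes the breaker, whereas the shipped CircuitBreaker is described as having a configurable success threshold.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

interface BreakerSnapshot {
  state: BreakerState;
  failures: number; // consecutive failures counted while closed
  openedAt: number; // timestamp (ms) of the most recent trip
}

interface BreakerConfig {
  failureThreshold: number;
  resetTimeoutMs: number;
}

const closedBreaker: BreakerSnapshot = { state: 'closed', failures: 0, openedAt: 0 };

// Pure transition: (snapshot, call outcome, current time) -> next snapshot.
function stepBreaker(
  snapshot: BreakerSnapshot,
  outcome: 'success' | 'failure',
  now: number,
  config: BreakerConfig,
): BreakerSnapshot {
  // An open breaker becomes half-open once the reset timeout has elapsed.
  const timedOut = now - snapshot.openedAt >= config.resetTimeoutMs;
  const state: BreakerState =
    snapshot.state === 'open' && timedOut ? 'half-open' : snapshot.state;

  // Still open: calls are rejected up front, so the outcome changes nothing.
  if (state === 'open') return snapshot;

  // Any success while closed or half-open resets the breaker.
  if (outcome === 'success') {
    return { state: 'closed', failures: 0, openedAt: snapshot.openedAt };
  }

  // A failed half-open probe trips immediately; otherwise count up to the threshold.
  const failures = state === 'half-open' ? config.failureThreshold : snapshot.failures + 1;
  if (failures >= config.failureThreshold) {
    return { state: 'open', failures, openedAt: now };
  }
  return { state: 'closed', failures, openedAt: snapshot.openedAt };
}
```

Passing `now` explicitly means a test can jump straight past the reset timeout and assert each transition, including the "circuit stays closed despite repeated failures" defect, without sleeping.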
Tests state machine transitions with XState-style patterns. Validates transition matrices, guard truth tables, context mutations, and terminal state handling.
- state-machine.ts -- State machine factory, transition validator, matrix generator, guard truth table, context mutation assertions
- N*N transition matrix generation for exhaustive state pair coverage
- Guard truth tables for boolean condition testing (2^N for N ≤ 4, pairwise for 5+)
- Context mutation assertions with deep structural equality and field removal detection
- Violation rules: missing_transition_coverage, missing_guard_truth_table, missing_context_mutation_test, untested_terminal_state
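An N*N transition matrix can be sketched without any library: enumerate every (from, to) pair, derive the expected verdict from a declarative table, and compare it with the implementation. The document workflow, the legalMoves table, and the canMove function below are hypothetical stand-ins, not the repo's createStateMachine or testTransitionMatrix.

```typescript
type DocState = 'draft' | 'review' | 'published';
const docStates: readonly DocState[] = ['draft', 'review', 'published'];

// Specification: which moves are legal from each state.
const legalMoves: Record<DocState, readonly DocState[]> = {
  draft: ['review'],
  review: ['draft', 'published'],
  published: [], // terminal state: nothing is allowed out
};

interface TransitionCase {
  from: DocState;
  to: DocState;
  shouldAllow: boolean;
}

// Every ordered pair of states becomes a test case, legal or not.
function buildTransitionMatrix(): TransitionCase[] {
  const cases: TransitionCase[] = [];
  for (const from of docStates) {
    for (const to of docStates) {
      cases.push({ from, to, shouldAllow: legalMoves[from].includes(to) });
    }
  }
  return cases;
}

// Hypothetical implementation under test.
function canMove(from: DocState, to: DocState): boolean {
  if (from === 'draft') return to === 'review';
  if (from === 'review') return to === 'draft' || to === 'published';
  return false;
}

// Cases where the implementation disagrees with the specification.
function findMismatches(): TransitionCase[] {
  return buildTransitionMatrix().filter((c) => canMove(c.from, c.to) !== c.shouldAllow);
}
```

Most of the nine cases assert that a move is rejected. Those negative cases are where a bug such as draft to published slipping through would surface, and they are the ones a happy-path suite tends to omit.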
Verifies structured log output and error context. Provides mock logger creation, log level policy enforcement, and assertion patterns for observability testing.
- structured-logger.ts -- Mock logger, log assertions, level policy, level classifier
- 5 assertion types: assertLogEntry, assertNoLogsAbove, assertHasLogLevel, assertErrorLogged, classifyLogLevel
- Log level policy with production sampling guidance
- Common misclassification table for alert fatigue prevention
- Violation rules: missing_error_log_assertion, happy_path_logs_error, log_level_misclassification, missing_context_fields, console_spy_instead_of_mock
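A capturing logger makes log output assertable like any other return value. The sketch below is a stand-in written for illustration: makeCapturingLogger, logsAbove, and saveProfile are hypothetical names and do not reproduce the signatures of createMockLogger or assertNoLogsAbove.

```typescript
type SketchLevel = 'debug' | 'info' | 'warn' | 'error';
const levelRank: Record<SketchLevel, number> = { debug: 0, info: 1, warn: 2, error: 3 };

interface CapturedLog {
  level: SketchLevel;
  message: string;
  context: Record<string, unknown>;
}

// Records structured entries in memory instead of writing to the console.
function makeCapturingLogger() {
  const entries: CapturedLog[] = [];
  const at = (level: SketchLevel) =>
    (message: string, context: Record<string, unknown> = {}): void => {
      entries.push({ level, message, context });
    };
  return { entries, debug: at('debug'), info: at('info'), warn: at('warn'), error: at('error') };
}

// Entries strictly more severe than `max`; an empty result means the policy holds.
function logsAbove(entries: CapturedLog[], max: SketchLevel): CapturedLog[] {
  return entries.filter((entry) => levelRank[entry.level] > levelRank[max]);
}

// Hypothetical code under test: logs info on success, error with context on failure.
function saveProfile(
  logger: ReturnType<typeof makeCapturingLogger>,
  userId: string,
  storageUp: boolean,
): boolean {
  if (!storageUp) {
    logger.error('profile save failed', { userId, reason: 'storage_unavailable' });
    return false;
  }
  logger.info('profile saved', { userId });
  return true;
}
```

With this in place, the happy path can assert that nothing above info was logged, and the failure path can assert that exactly one error entry carries the userId needed to debug it.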
When your system has 4 factors with 3-4 values each, exhaustive testing can mean 100+ cases (for example, 4 × 3 × 3 × 3 = 108). Pairwise testing covers all pair interactions in ~12 cases.
Ships with real runnable code:
- pairwise.ts -- Zero-dependency greedy covering algorithm (generates near-minimal test matrices)
- test-fixtures.ts -- Pairwise test case helpers (name generation, expected-value mapping)
- Step-by-step workflow from factor identification to table-driven tests
- 6 testing technique examples in references (pairwise matrices, property-based, model-based, fault injection, contract validation, observability assertions)
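To make the greedy idea concrete, here is a deliberately naive sketch: it enumerates the full Cartesian product as candidates and repeatedly picks the row that covers the most still-uncovered pairs. It is not the algorithm in pairwise.ts, whose internals are not shown here; the names greedyPairwise, cartesian, pairsOf, and the demoFactors values are assumptions, and the approach only suits small factor sets because it materializes every combination.

```typescript
type FactorTable = Record<string, readonly string[]>;
type TestRow = Record<string, string>;

// All combinations of one value per factor.
function cartesian(factors: FactorTable): TestRow[] {
  let rows: TestRow[] = [{}];
  for (const [factor, values] of Object.entries(factors)) {
    rows = rows.flatMap((row) => values.map((value) => ({ ...row, [factor]: value })));
  }
  return rows;
}

// Every two-factor value pair contained in a row, encoded as a string key.
// Assumes factor names and values contain neither '=' nor '&'.
function pairsOf(row: TestRow): string[] {
  const keys = Object.keys(row).sort();
  const pairs: string[] = [];
  for (let i = 0; i < keys.length; i++) {
    for (let j = i + 1; j < keys.length; j++) {
      pairs.push(`${keys[i]}=${row[keys[i]]}&${keys[j]}=${row[keys[j]]}`);
    }
  }
  return pairs;
}

function greedyPairwise(factors: FactorTable): TestRow[] {
  const candidates = cartesian(factors);
  const uncovered = new Set(candidates.flatMap(pairsOf));
  const chosen: TestRow[] = [];
  // Each uncovered pair occurs in some candidate, so every round covers at least one.
  while (uncovered.size > 0) {
    let best = candidates[0];
    let bestGain = -1;
    for (const candidate of candidates) {
      const gain = pairsOf(candidate).filter((pair) => uncovered.has(pair)).length;
      if (gain > bestGain) {
        best = candidate;
        bestGain = gain;
      }
    }
    chosen.push(best);
    for (const pair of pairsOf(best)) uncovered.delete(pair);
  }
  return chosen;
}

// Hypothetical factors: 4 x 3 x 3 x 3 = 108 exhaustive combinations.
const demoFactors: FactorTable = {
  auth: ['valid', 'expired', 'missing', 'revoked'],
  role: ['admin', 'member', 'guest'],
  plan: ['free', 'pro', 'enterprise'],
  region: ['us', 'eu', 'apac'],
};
```

For these factors any pairwise suite needs at least 4 × 3 = 12 rows, since every auth/role pair must appear in some row. The greedy result may land somewhat above that bound, which is what "near-minimal" refers to.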
6 resilience patterns for WebSocket clients, designed for real-world mobile network conditions where P99 latency is 5-8 seconds.
- resilience.ts -- Backoff calculator, circuit breaker state machine, heartbeat hysteresis, gap detector, timeout classifier
- Command acknowledgment, sequence gap detection, mobile-aware timeouts
- Before/after code examples for each pattern
- Violation rules with severity levels (must-fail, should-fail, nice-to-have)
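Heartbeat hysteresis, one of the patterns listed above, can be sketched as a small pure function: a connection is only declared unhealthy after several consecutive misses, and only declared healthy again after several consecutive acknowledgments. The names and thresholds below (observeHeartbeat, unhealthyAfterMisses, healthyAfterAcks) are illustrative assumptions, not the API of resilience.ts.

```typescript
interface HeartbeatHealth {
  healthy: boolean;
  consecutiveMisses: number;
  consecutiveAcks: number;
}

interface HysteresisConfig {
  unhealthyAfterMisses: number; // misses in a row before declaring unhealthy
  healthyAfterAcks: number; // acks in a row before declaring healthy again
}

const freshConnection: HeartbeatHealth = {
  healthy: true,
  consecutiveMisses: 0,
  consecutiveAcks: 0,
};

// Fold one heartbeat result into the health state.
function observeHeartbeat(
  health: HeartbeatHealth,
  acked: boolean,
  config: HysteresisConfig,
): HeartbeatHealth {
  if (acked) {
    const acks = health.consecutiveAcks + 1;
    return {
      // Already healthy stays healthy; unhealthy must earn its way back.
      healthy: health.healthy || acks >= config.healthyAfterAcks,
      consecutiveMisses: 0,
      consecutiveAcks: acks,
    };
  }
  const misses = health.consecutiveMisses + 1;
  return {
    // A single slow heartbeat is tolerated; only a run of misses flips the state.
    healthy: health.healthy && misses < config.unhealthyAfterMisses,
    consecutiveMisses: misses,
    consecutiveAcks: 0,
  };
}
```

The two separate thresholds are what prevent flapping: one delayed heartbeat on a slow mobile link does not trigger a reconnect, and one lucky acknowledgment does not mask a connection that is still failing.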
Tests Zod schemas at system boundaries -- not just happy-path inputs. Compound state matrices cover all 2^N optional field combinations.
- schema-boundary.ts -- Valid/invalid input testing, schema evolution, refinement coverage, compound state matrix generator
- Compound state matrix for 2^N optional field combinations (exhaustive for ≤4, pairwise for 5+)
- Schema evolution testing for backward compatibility
- Refinement coverage (both passing and failing cases)
- Violation rules: missing_invalid_input_test, missing_refinement_coverage, missing_compound_state_test, schema_not_at_boundary, type_assertion_instead_of_parse
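The 2^N compound state idea is easy to sketch with a bitmask: each optional field is either present or absent, so N fields yield 2^N inputs. To stay dependency-free, the example uses a hand-written validator in place of a Zod schema; optionalFieldCombinations, isValidBooking, and the booking fields are hypothetical and do not mirror generateCompoundStateMatrix.

```typescript
type FieldBag = Record<string, unknown>;

// One object per subset of the sample's keys: 2^N combinations for N optional fields.
function optionalFieldCombinations(sample: FieldBag): FieldBag[] {
  const keys = Object.keys(sample);
  const combos: FieldBag[] = [];
  for (let mask = 0; mask < 2 ** keys.length; mask++) {
    const combo: FieldBag = {};
    keys.forEach((key, bit) => {
      // Bit `bit` of the mask decides whether this field is present.
      if ((mask >> bit) & 1) combo[key] = sample[key];
    });
    combos.push(combo);
  }
  return combos;
}

// Stand-in for a schema refinement: endDate only makes sense with a startDate.
function isValidBooking(input: FieldBag): boolean {
  return !('endDate' in input) || 'startDate' in input;
}

// Hypothetical optional fields with representative values.
const bookingSample: FieldBag = {
  nickname: 'sam',
  startDate: '2024-01-01',
  endDate: '2024-01-05',
};

const bookingCombos = optionalFieldCombinations(bookingSample);
const rejectedCombos = bookingCombos.filter((combo) => !isValidBooking(combo));
```

With three optional fields there are eight combinations, and the refinement should reject exactly the two that carry endDate without startDate. Testing only "all fields present" and "no fields present" would miss both.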
Run the test suites with Node.js 22.6+ (the first release with --experimental-strip-types; no install needed):
git clone https://github.com/apankov1/quality-engineering.git
cd quality-engineering
# Run all tests across all skills
node --experimental-strip-types --test skills/*/*.spec.ts
# Run the pairwise CLI demo (3×3×3 matrix + 8×4 stress test)
node --experimental-strip-types skills/pairwise-test-coverage/pairwise.ts
Node.js older than 22.6? Use npx tsx instead:
npx tsx --test skills/*/*.spec.ts
GitHub Actions snippet for CI:
- name: Quality engineering skill tests
  run: node --experimental-strip-types --test skills/*/*.spec.ts
Start with the change you're making. Each skill targets a different failure mode.
| What Changed | Skill | Why |
|---|---|---|
| Shared types, API signatures, DB schema | breaking-change-detector | Catch incompatibilities before merge |
| Code with concurrent access or shared state | barrier-concurrency-testing | Expose race windows deterministically |
| 3+ interacting parameters (config, modes, states) | pairwise-test-coverage | Cover pair interactions without exhaustive explosion |
| WebSocket client reconnection or health checks | websocket-client-resilience | Survive real mobile network conditions |
| Named states, lifecycles, workflow progressions | model-based-testing | N*N transition matrix catches invalid transitions |
| Error paths, retry logic, circuit breakers | fault-injection-testing | Verify resilience under simulated failures |
| Structured logging, error context, alert levels | observability-testing | Assert log output as first-class behavior |
| Zod schemas at system boundaries | zod-contract-testing | Test rejection of invalid data, not just acceptance |
| Skill | Defects Caught | Example |
|---|---|---|
| barrier-concurrency-testing | Race conditions, write ordering bugs, stale reads | Flush conflict: two writers interleave, last-write-wins silently drops data |
| breaking-change-detector | Backward-incompatible schema/API/protocol changes | Renamed field breaks active sessions still using old format |
| fault-injection-testing | Missing retry logic, uncapped backoff, queue data loss | Circuit stays closed despite repeated failures, queue items silently dropped |
| model-based-testing | Invalid transitions allowed, guard logic gaps | Draft→Published allowed (skipping review), context mutation side effects |
| observability-testing | Missing error logs, wrong log levels, alert fatigue | Error path logs at info level, happy path produces spurious warnings |
| pairwise-test-coverage | Interaction bugs in untested parameter combinations | Auth=expired + role=admin works, but auth=expired + role=guest crashes |
| websocket-client-resilience | Reconnection storms, false disconnects, lost messages | All clients retry at once after outage (thundering herd) |
| zod-contract-testing | Schema accepts invalid data, rejects valid data | Optional field combination triggers refinement bug, old data rejected |
Design         → pairwise-test-coverage (define factor matrix for new feature)
               → model-based-testing (define state machine transitions)
Implementation → barrier-concurrency-testing (test concurrent paths as you write them)
               → observability-testing (verify logging as you add error paths)
               → fault-injection-testing (test resilience of new integrations)
Pre-merge      → breaking-change-detector (audit contract/schema diffs)
               → zod-contract-testing (verify boundary schemas)
Client deploy  → websocket-client-resilience (verify reconnection patterns)
When reviewing a pull request, walk the diff and select skills based on what changed:
- Scan the diff -- git diff --name-only base...HEAD
- Match files to skills:
  - Contract/schema/migration files → run breaking-change-detector + zod-contract-testing
  - Concurrent or stateful code → run barrier-concurrency-testing
  - Multi-factor config or mode logic → run pairwise-test-coverage
  - WebSocket client code → run websocket-client-resilience
  - State machines, lifecycles, workflow states → run model-based-testing
  - Error handling, retry logic, circuit breakers → run fault-injection-testing
  - Logging statements, error context → run observability-testing
- Check for overlap -- a single PR may trigger multiple skills (e.g., a schema migration that also adds concurrent flush logic)
- Verify each finding has a test -- every violation the skill flags should map to a test case in the PR (see Proving Defect Detection below)
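The file-to-skill matching step could be partly automated with a small lookup. The sketch below is only an assumption about how that might look: the path patterns are invented heuristics, selectSkills is a hypothetical helper, and concurrency or multi-factor logic usually cannot be detected from file names at all, so a reviewer still needs to read the diff.

```typescript
// Hypothetical path heuristics; tune these to your own repository layout.
const skillRules: Array<{ pattern: RegExp; skills: string[] }> = [
  { pattern: /(schema|contract|migration)/i, skills: ['breaking-change-detector', 'zod-contract-testing'] },
  { pattern: /(websocket|ws-client)/i, skills: ['websocket-client-resilience'] },
  { pattern: /(machine|workflow|lifecycle)/i, skills: ['model-based-testing'] },
  { pattern: /(retry|circuit|breaker)/i, skills: ['fault-injection-testing'] },
  { pattern: /(logger|logging)/i, skills: ['observability-testing'] },
];

// Input: the file list printed by `git diff --name-only base...HEAD`.
function selectSkills(changedFiles: string[]): string[] {
  const selected = new Set<string>();
  for (const file of changedFiles) {
    for (const rule of skillRules) {
      if (rule.pattern.test(file)) {
        rule.skills.forEach((skill) => selected.add(skill));
      }
    }
  }
  // Sorted and de-duplicated: one PR may trigger several skills.
  return Array.from(selected).sort();
}
```

For example, a diff touching db/migrations/001_add_col.sql and src/ws-client/reconnect.ts would select breaking-change-detector, websocket-client-resilience, and zod-contract-testing.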
A test that only passes is not evidence. To prove a test catches the bug:
- Write the test first -- before any fix, write a test that exercises the defect
- Confirm it fails -- run the test and verify it fails for the expected reason (not a syntax error or import failure)
- Apply the fix -- make the minimal change to correct the behavior
- Confirm it passes -- the same test now passes, proving the fix addresses the defect
- Document the proof -- add a comment in the test referencing the failure:
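import assert from 'node:assert/strict';
import { it } from 'node:test';
// Import path assumes this spec sits at the repo root, as in the import list earlier.
import { createBarrier } from './skills/barrier-concurrency-testing/test-fixtures.ts';

// Regression: before fix, barrier.wait() resolved immediately
// because release() was called in constructor. See commit abc1234.
it('blocks until explicitly released', async () => {
  const barrier = createBarrier();
  let resolved = false;
  barrier.wait().then(() => { resolved = true; });
  // Short pause so a buggy barrier has the chance to resolve early.
  await new Promise(r => setTimeout(r, 10));
  assert.equal(resolved, false); // Would have been true before fix
  barrier.release();
  await barrier.wait();
  assert.equal(resolved, true);
});
This is the bug_detection_not_validated rule from pairwise-test-coverage, applied as a cross-cutting practice.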
Use this template when documenting skill results. Each finding maps to a severity, the skill that detected it, and the risk it traces to.
## Findings: PR #142
### MUST-FIX
| # | Skill | Violation | File | Risk |
|---|-------|-----------|------|------|
| 1 | breaking-change-detector | `contract_field_removal` | types.ts:42 | Active sessions fail on state load |
| 2 | barrier-concurrency-testing | `inadequate_barrier_coverage` | flush.spec.ts | Race window untested: concurrent writes |
### SHOULD-FIX
| # | Skill | Violation | File | Risk |
|---|-------|-----------|------|------|
| 3 | websocket-client-resilience | heartbeat hysteresis | client.ts:88 | False disconnects on slow networks |
### ADVISORY
| # | Skill | Violation | File | Risk |
|---|-------|-----------|------|------|
| 4 | pairwise-test-coverage | `missing_pairwise_coverage` | retry.spec.ts | 3 factors × 4 values, only happy path tested |
**Severity mapping**: must-fail → MUST-FIX, should-fail → SHOULD-FIX, nice-to-have → ADVISORY
Each row traces from violation (what's wrong) → file (where) → risk (why it matters). Reviewers can triage by severity and verify each finding has a corresponding test using the fail-before/fix-after proof above.
These skills cover correctness and compatibility, not performance benchmarking or load testing. But several non-functional concerns are addressed directly:
| Concern | Covered By | How |
|---|---|---|
| Fault tolerance | fault-injection-testing | Circuit breaker state machine, retry with backoff, queue preservation |
| Observability | observability-testing | Mock logger assertions, log level policy, error context validation |
| State machine correctness | model-based-testing | N*N transition matrix, guard truth tables, context mutation assertions |
| Resilience under degraded networks | websocket-client-resilience | Circuit breakers, backoff with jitter, mobile-aware timeouts |
| Concurrency under contention | barrier-concurrency-testing | Deterministic interleaving for write ordering and stale-read detection |
| Contract safety | zod-contract-testing | Boundary validation, compound state matrices, schema evolution |
Not covered: load/stress testing, latency percentile benchmarks, throughput profiling, SLO threshold validation, large-scale chaos engineering. These require runtime infrastructure (load generators, APM tooling, distributed tracing) that is outside the scope of static analysis skills.
These skills grew out of solving real race conditions, breaking changes, and mobile network failures in a multiplayer platform on Cloudflare Workers. Generalized for any tech stack -- no framework dependencies.
All skills are framework-agnostic. The patterns work with:
- Test frameworks: Vitest, Jest, Node test runner, Go testing, Rust #[test]
- Languages: TypeScript/JavaScript (examples), but concepts apply to any language
- CI systems: GitHub Actions, GitLab CI, CircleCI, Jenkins