@noetaris/harness-testing

Testing utilities for @noetaris/harness agents. Provides isolated step execution, route testing, and an Observer test double, so you can write unit and integration tests for individual steps without spinning up the full agent loop.

Overview

- `runStep`: executes a single named step in isolation with a synthetic ctx, capturing state updates, interrupts, and errors in a typed result
- `runRoute`: invokes a step's route function directly with a synthetic state snapshot
- `MockObserver`: records all harness lifecycle calls (onRunStart, onStepEnd, onEvent, …) for assertion

Installation

pnpm add -D @noetaris/harness-testing

Peer dependency:

pnpm add @noetaris/harness

Quick Start

import { runStep, MockObserver } from '@noetaris/harness-testing'

// `harness`, `agent`, and `mockLlm` are assumed to be defined in your own test setup

// test a step in isolation
const result = await runStep(harness, 'fetchData', {
  slots: { llm: mockLlm },
  state: { query: 'hello' },
})

expect(result.state).toEqual({ response: 'world' })
expect(result.interrupted).toBeNull()
expect(result.error).toBeNull()

// observe a full run
const obs = new MockObserver()
await agent.run({}, { observer: obs })
expect(obs.calls.onStepEnd).toHaveLength(3)
expect(obs.events('llm.response')).toHaveLength(1)

API Reference

`runStep`

runStep<S, Ctx>(
  harness: Harness<Ctx, S, any, any>,
  stepName: string,
  options: RunStepOptions<S, Ctx>,
): Promise<StepResult<S>>

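Executes one step from the harness loop definition in isolation. Once the step has been located, the returned promise always resolves: interrupts and errors raised inside the step are captured in StepResult rather than thrown.

The only failures that escape are lookup errors raised before the step runs. runStep throws StepNotFoundError if stepName doesn't exist in the loop, or NoRunFunctionError if the step has no run function (decision-only steps).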
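The split between thrown and captured failures can be pictured with a small stand-in. The sketch below is not the library's implementation: `runStepSketch`, `StepDef`, `SketchResult`, and the two `Sketch…` error classes are hypothetical names that only mirror the behavior described above, and interrupt handling is left out for brevity.

```typescript
// Hypothetical stand-ins for illustration only; they are not the classes
// documented under "Error Classes".
class SketchStepNotFoundError extends Error {
  constructor(stepName: string) {
    super(`step not found: ${stepName}`);
    this.name = 'SketchStepNotFoundError';
  }
}

class SketchNoRunFunctionError extends Error {
  constructor(stepName: string) {
    super(`step has no run function: ${stepName}`);
    this.name = 'SketchNoRunFunctionError';
  }
}

// Assumed step shape: a decision-only step simply omits `run`.
type StepDef<S> = { run?: (state: Partial<S>) => Promise<Partial<S>> };
type SketchResult<S> = { state: Partial<S> | null; error: Error | null };

async function runStepSketch<S>(
  steps: Record<string, StepDef<S>>,
  stepName: string,
  state: Partial<S>,
): Promise<SketchResult<S>> {
  // Lookup problems are mistakes in the test itself, so they escape to the caller.
  const known = Object.prototype.hasOwnProperty.call(steps, stepName);
  const step = known ? steps[stepName] : undefined;
  if (step === undefined) throw new SketchStepNotFoundError(stepName);
  if (step.run === undefined) throw new SketchNoRunFunctionError(stepName);

  // Failures inside the step are what the test wants to inspect, so they are captured.
  try {
    return { state: await step.run(state), error: null };
  } catch (err) {
    const error = err instanceof Error ? err : new Error(String(err));
    return { state: null, error };
  }
}
```

With this shape, a test can assert on `result.error` for a failing step without a `try`/`catch`, while a misspelled step name still fails loudly.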

`RunStepOptions<S, Ctx>`

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `slots` | `Ctx` | none | User-defined context slots passed as the step's `ctx` argument. |
| `state` | `Partial<S>` | none | Initial state snapshot passed as the step's first argument. |
| `interruptResponses` | `Record<string, unknown>` | none | Pre-loaded interrupt responses keyed by interrupt ID. When a key is present the interrupt is answered immediately (replay mode). When absent the interrupt is captured and execution stops (pause mode). |
| `sessionId` | `string` | `"test-session"` | Injected as `ctx.sessionId`. |
| `agentId` | `string` | `"test-agent"` | Injected as `ctx.agentId`. |
| `runId` | `string` | `crypto.randomUUID()` | Injected as `ctx.runId`. |
| `signal` | `AbortSignal` | never-aborts signal | Injected as `ctx.signal`. |
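The two modes described for `interruptResponses` can be sketched without the library. Everything in this block is illustrative: `makeInterrupt`, `InterruptSignal`, `confirmDelete`, and the `(interruptId, prompt)` call shape are assumptions, since the source does not show the signature of `ctx.interrupt()`.

```typescript
// Thrown to unwind the step when no response is pre-loaded (pause mode).
class InterruptSignal extends Error {
  readonly interruptId: string;
  readonly prompt: unknown;

  constructor(interruptId: string, prompt: unknown) {
    super(`interrupted: ${interruptId}`);
    this.name = 'InterruptSignal';
    this.interruptId = interruptId;
    this.prompt = prompt;
  }
}

// Assumed call shape for an interrupt function.
type Interrupt = (interruptId: string, prompt: unknown) => unknown;

function makeInterrupt(responses: Record<string, unknown>): Interrupt {
  return (interruptId, prompt) => {
    // Replay mode: the key is present, so answer immediately.
    // Presence is checked by key, so a pre-loaded `false` still counts.
    if (Object.prototype.hasOwnProperty.call(responses, interruptId)) {
      return responses[interruptId];
    }
    // Pause mode: nothing to replay, so stop the step and carry the prompt out.
    throw new InterruptSignal(interruptId, prompt);
  };
}

// A toy step body that asks for confirmation before acting.
function confirmDelete(interrupt: Interrupt): { confirmed: boolean } {
  const answer = interrupt('confirm-delete', { message: 'Are you sure?' });
  return { confirmed: answer === true };
}
```

Calling `confirmDelete(makeInterrupt({ 'confirm-delete': true }))` runs to completion, while `confirmDelete(makeInterrupt({}))` stops at the interrupt and surfaces its ID and prompt to the caller.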

`StepResult<S>`

Exactly one of state, interrupted, or error is non-null per call.

| Field | Type | Description |
| --- | --- | --- |
| `state` | `Partial<S> \| null` | State update returned by the step. Null if interrupted or errored. |
| `interrupted` | `{ interruptId: string; prompt: unknown } \| null` | Populated when the step called `ctx.interrupt()` without a matching `interruptResponses` entry. |
| `error` | `Error \| null` | Error thrown by the step (non-interrupt errors only). |
| `events` | `Array<{ name: string; payload: unknown }>` | All events emitted via `ctx.emit()` during the step, in call order. |
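Because exactly one of the three nullable fields is set, tests often want to branch on which one it was. A small helper along these lines may be convenient; `StepResultLike` is a local mirror of the table above and `outcomeOf` is a hypothetical name, neither of which is exported by the package.

```typescript
// Local mirror of the StepResult<S> table; declared here so the sketch is self-contained.
interface StepResultLike<S> {
  state: Partial<S> | null;
  interrupted: { interruptId: string; prompt: unknown } | null;
  error: Error | null;
  events: Array<{ name: string; payload: unknown }>;
}

type Outcome = 'state' | 'interrupted' | 'error';

// Returns the name of the populated field, or throws if the
// "exactly one is non-null" invariant does not hold.
function outcomeOf<S>(result: StepResultLike<S>): Outcome {
  const candidates: Array<[Outcome, boolean]> = [
    ['state', result.state !== null],
    ['interrupted', result.interrupted !== null],
    ['error', result.error !== null],
  ];

  let found: Outcome | null = null;
  let count = 0;
  for (const [name, populated] of candidates) {
    if (populated) {
      found = name;
      count += 1;
    }
  }

  if (found === null || count !== 1) {
    throw new Error(`expected exactly one populated outcome, found ${count}`);
  }
  return found;
}
```

A test can then write `expect(outcomeOf(result)).toBe('interrupted')` before inspecting the details, which gives a clearer failure message than a null dereference.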

Examples

Happy path:

const result = await runStep(harness, 'classify', {
  slots: { llm: mockLlm },
  state: { input: 'hello world' },
})
expect(result.state?.label).toBe('greeting')
expect(result.events).toEqual([{ name: 'classify.done', payload: { label: 'greeting' } }])

Interrupt — pause mode (no pre-loaded response):

const result = await runStep(harness, 'confirmAction', {
  slots: { llm: mockLlm },
  state: { action: 'delete' },
})
expect(result.interrupted).toEqual({
  interruptId: 'confirm-delete',
  prompt: { message: 'Are you sure?' },
})

Interrupt — replay mode (pre-loaded response):

const result = await runStep(harness, 'confirmAction', {
  slots: { llm: mockLlm },
  state: { action: 'delete' },
  interruptResponses: { 'confirm-delete': true },
})
expect(result.state?.confirmed).toBe(true)

Error capture:

const result = await runStep(harness, 'riskyStep', {
  slots: { llm: failingLlm },
  state: {},
})
expect(result.error?.message).toMatch('upstream failed')
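The examples above reference `mockLlm` and `failingLlm` without defining them. One way to build such stubs by hand is sketched below. The `LlmSlot` interface and its `complete` method are hypothetical; adapt them to whatever shape the slots in your own `Ctx` have.

```typescript
// Hypothetical slot shape; your real Ctx defines the actual interface.
interface LlmSlot {
  complete(prompt: string): Promise<string>;
}

// Returns canned replies in order and records every prompt it receives.
function makeMockLlm(replies: string[]): LlmSlot & { prompts: string[] } {
  const prompts: string[] = [];
  const queue = replies.slice(); // copy so the caller's array is not mutated
  return {
    prompts,
    async complete(prompt: string): Promise<string> {
      prompts.push(prompt);
      const reply = queue.shift();
      if (reply === undefined) {
        throw new Error('mock LLM ran out of canned replies');
      }
      return reply;
    },
  };
}

// Always rejects, for exercising a step's error path.
function makeFailingLlm(message: string): LlmSlot {
  return {
    complete(): Promise<string> {
      return Promise.reject(new Error(message));
    },
  };
}

const mockLlm = makeMockLlm(['greeting']);
const failingLlm = makeFailingLlm('upstream failed');
```

Recording prompts on the stub lets a test assert on what the step sent as well as on what it returned. Running out of replies is treated as an error so that an unexpected extra call fails the test instead of passing silently.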

`runRoute`

runRoute<S, Ctx>(
  harness: Harness<Ctx, S, any, any>,
  stepName: string,
  options: RunRouteOptions<S>,
): string

Invokes a step's route function synchronously with a synthetic state snapshot. Returns the next step name.

Throws StepNotFoundError if stepName doesn't exist, or NoRouteFunctionError if the step has no route function.

`RunRouteOptions<S>`

| Field | Type | Description |
| --- | --- | --- |
| `state` | `Partial<S>` | State snapshot passed to the `route` function. |

Example

const next = runRoute(harness, 'decide', {
  state: { score: 0.9 },
})
expect(next).toBe('approve')

const next2 = runRoute(harness, 'decide', {
  state: { score: 0.1 },
})
expect(next2).toBe('reject')
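For context, a route function that would satisfy the example above could look like the sketch below. The `ReviewState` type, the 0.5 threshold, and the handling of a missing score are assumptions made for illustration; the source only shows that 0.9 routes to `approve` and 0.1 routes to `reject`.

```typescript
// Hypothetical state shape for the `decide` step.
interface ReviewState {
  score: number;
}

// A route is a plain synchronous mapping from a state snapshot to the next step name.
function decideRoute(state: Partial<ReviewState>): 'approve' | 'reject' {
  // The snapshot is partial, so a missing score is treated as the lowest score.
  const score = state.score === undefined ? 0 : state.score;
  return score >= 0.5 ? 'approve' : 'reject';
}

// Table-driven check covering both branches, the boundary, and the empty snapshot.
const routeCases: Array<[Partial<ReviewState>, string]> = [
  [{ score: 0.9 }, 'approve'],
  [{ score: 0.5 }, 'approve'],
  [{ score: 0.1 }, 'reject'],
  [{}, 'reject'],
];

for (const [state, expected] of routeCases) {
  if (decideRoute(state) !== expected) {
    throw new Error(`unexpected route for ${JSON.stringify(state)}`);
  }
}
```

Since route options accept a `Partial<S>`, it is worth including the boundary value and an empty snapshot in route tests, as those are the cases a threshold check is most likely to get wrong.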

`MockObserver`

A test double implementing the harness Observer interface. Records every lifecycle call for assertion.

const obs = new MockObserver()
await agent.run({}, { observer: obs })

obs.calls.onRunStart   // OnRunStartCall[]
obs.calls.onRunEnd     // OnRunEndCall[]
obs.calls.onStepStart  // OnStepStartCall[]
obs.calls.onStepEnd    // OnStepEndCall[]
obs.calls.onStepError  // OnStepErrorCall[]
obs.calls.onInterrupt  // OnInterruptCall[]
obs.calls.onEvent      // OnEventCall[]

All arrays are always present (never null) even if no calls were made.

`obs.events(name): unknown[]`

Filters onEvent records by name and returns the payload values in call order.

const payloads = obs.events('llm.response')
expect(payloads).toHaveLength(1)
expect(payloads[0]).toMatchObject({ model: 'claude-sonnet-4-6' })

`obs.reset(): void`

Clears all recorded calls. Useful when reusing the same observer across multiple agent.run() calls in one test.

await agent.run({}, { observer: obs })
obs.reset()
await agent.run({}, { observer: obs })
expect(obs.calls.onRunStart).toHaveLength(1) // only the second run
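To make the recording behavior concrete, here is a stripped-down test double in the same spirit. It is a sketch, not the package's `MockObserver`: `RecordingObserver` is a hypothetical name, only two hooks are modeled, and the argument shapes of `onRunStart` and `onEvent` are assumptions.

```typescript
type EventCall = { name: string; payload: unknown };

class RecordingObserver {
  // Arrays exist from construction, so assertions never hit null.
  calls: { onRunStart: unknown[]; onEvent: EventCall[] } = {
    onRunStart: [],
    onEvent: [],
  };

  // Each hook appends its arguments instead of doing real work.
  onRunStart(info: unknown): void {
    this.calls.onRunStart.push(info);
  }

  onEvent(name: string, payload: unknown): void {
    this.calls.onEvent.push({ name, payload });
  }

  // Payloads of the events with the given name, in call order.
  events(name: string): unknown[] {
    return this.calls.onEvent
      .filter((call) => call.name === name)
      .map((call) => call.payload);
  }

  // Start over with fresh arrays; earlier references to `calls` keep the old data.
  reset(): void {
    this.calls = { onRunStart: [], onEvent: [] };
  }
}
```

The same pattern extends to the remaining hooks by adding one array and one method per hook.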

Error Classes

`StepNotFoundError`

Thrown by runStep and runRoute when the step name does not exist in the loop definition.

`NoRunFunctionError`

Thrown by runStep when the named step has no run function. This happens for decision-only steps that only define a route function — test their routing logic with runRoute instead.

`NoRouteFunctionError`

Thrown by runRoute when the named step has no route function.

License

MIT
