Harness as Code

Declarative AI agent governance in Go. Define tools, hooks, delegation rules, and entire sub-agents in version-controlled Markdown files — reviewable in PRs, testable in CI, reproducible across environments.

Like Infrastructure as Code, but for AI agent behavior. Every prompt ships with its governance. Every agent behavior is reproducible, reviewable, and testable.

The Problem

Most agent frameworks force a choice:

Approach	Tradeoff
Rigid plugin systems	Hard to customize, vendor lock-in
Raw LLM wrappers	No guardrails, no governance
Python mega-frameworks	200+ deps, hidden state, slow iteration

Harness as Code takes a different approach: your harness.md file IS the control plane for an AI agent — governance, tools, delegation limits, and system prompt in one reviewable artifact.

What Makes It Different

Markdown-first — harness.md (YAML frontmatter + body as system prompt) defines your agent declaratively. .harness/ directory adds tools, hooks, and sub-agents as individual .md files.
Governed by default — Hooks enforce safety through architecture, not instructions. You don't make agents trustworthy by writing better prompts — you make them trustworthy by engineering harnesses where wrong behavior is architecturally impossible.
Self-extending — The delegate meta-tool lets agents create tools and spawn sub-agents recursively at runtime.
Minimal — Single Go binary, ~5 dependencies, compiles in seconds. tools.Handler is just func(ctx, args) (string, error).
Portable — Works with GitHub Copilot, OpenAI, or any compatible chat completions API.
Testable — Built-in eval framework validates agent behavior against real models in CI.

Core Philosophy

Make the right thing to do the easy thing to do.

The harness enforces safety through architecture:

fs.replace fails if the match isn't unique (forces surgical edits)
Recursive delegation is depth-limited (configurable, hard cap at 5)
Iterations decrease per depth level (20 → 10 → 5 → 3)
Retry guards auto-block tools after 2 consecutive errors
Path operations are jailed to the working directory at the Go level
Hooks run at every lifecycle point — blocking is a first-class action
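
To make the first guardrail concrete, here is a minimal sketch of a unique-match replace. It is not the harness's actual fs.replace implementation; the function name replaceUnique and the sentinel ErrNotUnique are hypothetical and only illustrate the rule that an edit is rejected unless the search text occurs exactly once.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrNotUnique is a hypothetical sentinel error used only in this sketch.
var ErrNotUnique = errors.New("match is not unique")

// replaceUnique replaces old with replacement only when old occurs exactly
// once in content. Zero or multiple matches are errors, so the caller has to
// supply enough surrounding text to pin down a single location.
func replaceUnique(content, old, replacement string) (string, error) {
	if old == "" {
		return "", errors.New("empty search string")
	}
	switch n := strings.Count(content, old); n {
	case 0:
		return "", errors.New("no match found")
	case 1:
		return strings.Replace(content, old, replacement, 1), nil
	default:
		return "", fmt.Errorf("%w: %d occurrences", ErrNotUnique, n)
	}
}

func main() {
	src := "a := 1\nb := 1\n"

	// ":= 1" appears twice, so the edit is refused.
	if _, err := replaceUnique(src, ":= 1", ":= 2"); err != nil {
		fmt.Println("rejected:", err)
	}

	// "a := 1" appears once, so the edit goes through.
	out, err := replaceUnique(src, "a := 1", "a := 2")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Failing on ambiguity pushes the agent to quote more context instead of silently editing the wrong occurrence.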

The DevOps Parallel

DevOps Gave Humans	Harness as Code Gives Agents
Infrastructure as Code	Agent governance as code
CI/CD pipelines	Agent loops with termination and retry
Deployment gates	Autonomy levels and approval gates
Git hooks	Pre-tool hookflows
RBAC / least privilege	Tool registry access control
Observability	Agent event streams and metrics

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     HARNESS AS CODE                               │
├─────────────────────────────────────────────────────────────────┤
│                                                                   │
│  ┌──────────┐    ┌──────────┐    ┌──────────────┐               │
│  │  Config  │───▶│  Agent   │───▶│  Completion  │               │
│  │ (MD+Dir) │    │  Loop    │    │  Client      │               │
│  └──────────┘    └────┬─────┘    └──────────────┘               │
│                       │                                           │
│       ┌───────────────┼───────────────┐                          │
│       ▼               ▼               ▼                          │
│  ┌──────────┐   ┌──────────┐   ┌──────────────┐                │
│  │  Hook    │   │  Tool    │   │   Context    │                │
│  │  System  │   │ Registry │   │   Manager    │                │
│  └──────────┘   └────┬─────┘   └──────────────┘                │
│                       │                                           │
│       ┌───────────────┼───────────────┐                          │
│       ▼               ▼               ▼                          │
│  ┌──────────┐   ┌──────────┐   ┌──────────────┐                │
│  │ Starlark │   │ Delegate │   │  fs/edit     │                │
│  │ Engine   │   │  System  │   │  Built-ins   │                │
│  └──────────┘   └──────────┘   └──────────────┘                │
│                       │                                           │
│       ┌───────────────┼───────────────┐                          │
│       ▼               ▼               ▼                          │
│  ┌──────────┐   ┌──────────┐   ┌──────────────┐                │
│  │  Model   │   │  Agent   │   │   Task       │                │
│  │ Registry │   │ Registry │   │   Store      │                │
│  └──────────┘   └──────────┘   └──────────────┘                │
│                                                                   │
└─────────────────────────────────────────────────────────────────┘

Installation

CLI (recommended)

go install github.com/htekdev/ai-harness/cmd/harness@latest

Pre-built binaries

Download from Releases — available for Linux, macOS, and Windows (amd64 + arm64).

# Example: Linux amd64
curl -Lo harness.tar.gz https://github.com/htekdev/ai-harness/releases/latest/download/harness_linux_amd64.tar.gz
tar xzf harness.tar.gz
sudo mv harness /usr/local/bin/

As a library

go get github.com/htekdev/ai-harness

Requirements: Go 1.25+, an API key for any OpenAI-compatible endpoint (GitHub Copilot, OpenAI, etc.)

Quick Start

1. Scaffold a new project

harness init my-agent
cd my-agent

This creates:

harness.md — main configuration (YAML frontmatter + system prompt)
.harness/tools/ — tool definitions
.harness/hooks/ — hook definitions
.harness/agents/ — sub-agent definitions

2. Validate your harness

harness validate
# ✅ harness.md — valid (6 tools, 2 hooks, 0 agents) [3ms]

3. Run interactively

harness run

4. Inspect your configuration

harness tools          # List all registered tools
harness hooks -v       # List hooks with details
harness agents         # List configured sub-agents
harness artifacts      # List typed artifacts in the registry
harness artifacts -v   # Detailed view with tools, hooks, conditions
harness artifacts --type plugin  # Filter by artifact type
harness context        # Show context window composition (what the agent sees)
harness context -v     # Detailed provenance (which file, which artifact, why)
harness context --json # Machine-readable context snapshot
harness context --budget 64000  # Show token budget utilization

5. Define your harness in Markdown

<!-- harness.md -->
---
model:
  provider: copilot
  name: gpt-4o
  max_tokens: 4096
  temperature: 0.7
  api_key_env: GH_TOKEN

models:
  - name: gpt-4o
    provider: copilot
    api_key_env: GH_TOKEN
  - name: gpt-4o-mini
    provider: copilot
    api_key_env: GH_TOKEN

delegation:
  max_depth: 3
  max_concurrent: 5
  iterations_per_depth: [20, 10, 5, 3]

context:
  system_prompt: ""  # body below is the system prompt

tools:
  - name: greet
    description: Greet someone by name
    parameters:
      name:
        type: string
        description: Name to greet
        required: true
    script: |
      def run(args):
          return "Hello, " + args["name"] + "!"

hooks:
  - event: tool.pre
    handler: secret_guard
    priority: 10
    script: |
      def handle(event, payload):
          encoded = json.encode(payload)
          if "password" in encoded or "secret" in encoded:
              return block("potential secret detected")
          return allow()
---

# AI Assistant

You are a helpful AI assistant powered by the AI Harness framework.

## Rules

- Use the delegate tool to spawn sub-agents when you need specialized capabilities
- Never say "I can't do that" — delegate to a specialist agent
- Be concise and helpful

6. Add file-based tools (optional)

<!-- .harness/tools/read_file.md -->
---
parameters:
  path: { type: string, required: true }
script: |
  def run(args):
      return fs.read(args["path"])
---

# read_file

Read a file from the workspace and return its contents.

7. Add custom agents (optional)

<!-- .harness/agents/code-writer.md -->
---
model: gpt-4o
description: Writes and tests Go code
tools:
  - read_file
  - write_file
  - name: run_tests
    parameters: {}
    script: |
      def run(args):
          return exec.run("go", ["test", "./..."])
hooks:
  - path_guard
---

# Code Writer

You are a senior Go developer. Write clean, idiomatic, well-tested code.
Always run tests after writing code.

8. Run it

package main

import "github.com/htekdev/ai-harness/harness"

func main() {
    h, err := harness.New("harness.md")
    if err != nil {
        panic(err)
    }
    h.Interactive()
}

export GH_TOKEN=$(gh auth token)
go run ./cmd/example/

That's it. The harness auto-discovers .harness/tools/, .harness/hooks/, and .harness/agents/ directories. Inline definitions and file-based definitions are additive — mix freely.

Directory convention

project/
  harness.md                     # root harness (frontmatter + system prompt)
  .harness/
    agents/
      code-writer.md             # custom agent: "code-writer"
      researcher.md              # custom agent: "researcher"
    tools/
      read_file.md               # tool: "read_file"
      write_file.md              # tool: "write_file"
      edit_file.md               # tool: "edit_file"
    hooks/
      path_guard.md              # hook: "path_guard"
      command_guard.md           # hook: "command_guard"

Loading rules:

harness.md frontmatter is loaded first (inline tools/hooks registered)
.harness/tools/*.md are scanned and ADDED to the tool registry
.harness/hooks/*.md are scanned and ADDED to the hook system
.harness/agents/*.md are scanned and registered in the agent registry
On name collision, file wins (allows overriding inline defaults)
.harness/ is optional — inline-only works perfectly
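
The collision rule above can be sketched as a two-pass merge. This is an illustration under assumptions, not the config package's real code: the Def type and mergeDefs function are hypothetical stand-ins for tool or hook definitions.

```go
package main

import "fmt"

// Def is a hypothetical stand-in for a tool or hook definition.
type Def struct {
	Name   string
	Source string // "inline" or the path of the defining file
}

// mergeDefs registers inline definitions first and file-based definitions
// second, so a file with the same name replaces the inline entry.
func mergeDefs(inline, files []Def) map[string]Def {
	merged := make(map[string]Def, len(inline)+len(files))
	for _, d := range inline {
		merged[d.Name] = d
	}
	for _, d := range files {
		merged[d.Name] = d // name collision: the file definition wins
	}
	return merged
}

func main() {
	inline := []Def{{Name: "greet", Source: "inline"}}
	files := []Def{
		{Name: "greet", Source: ".harness/tools/greet.md"},
		{Name: "read_file", Source: ".harness/tools/read_file.md"},
	}
	for name, d := range mergeDefs(inline, files) {
		fmt.Printf("%s <- %s\n", name, d.Source)
	}
}
```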

Delegation system

Recursive delegation (agent trees)

Delegates can spawn their own delegates, creating trees of specialized workers:

{
  "task": "Build and test a REST API",
  "agent": "code-writer",
  "model": "gpt-4o-mini"
}

Guardrails (harness-level, not prompting):

Depth-limited: configurable max (hard cap at 5 regardless)
Iterations decrease per depth: [20, 10, 5, 3] by default
Retry guard blocks tools after 2 consecutive errors
delegate.pre / delegate.post hooks can block or rewrite at any level
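
As a rough sketch of how the depth and iteration guardrails could combine, the function below looks up the tool-iteration budget for a given delegation depth. The name iterationBudget is hypothetical, and two behaviors are assumptions rather than documented facts: that a delegate at exactly max_depth is still allowed, and that depths beyond the end of the list reuse the last entry.

```go
package main

import (
	"errors"
	"fmt"
)

// hardDepthCap mirrors the hard cap of 5 stated in this README.
const hardDepthCap = 5

// iterationBudget returns the tool-iteration budget for an agent at the given
// depth (0 is the root agent). A configured maxDepth above the hard cap is
// clamped. Reusing the last budget for deeper levels is an assumption.
func iterationBudget(depth, maxDepth int, perDepth []int) (int, error) {
	if maxDepth > hardDepthCap {
		maxDepth = hardDepthCap
	}
	if depth < 0 || depth > maxDepth {
		return 0, errors.New("delegation depth limit exceeded")
	}
	if len(perDepth) == 0 {
		return 0, errors.New("no iteration budgets configured")
	}
	if depth >= len(perDepth) {
		depth = len(perDepth) - 1
	}
	return perDepth[depth], nil
}

func main() {
	budgets := []int{20, 10, 5, 3}
	for depth := 0; depth <= 4; depth++ {
		n, err := iterationBudget(depth, 3, budgets)
		if err != nil {
			fmt.Printf("depth %d: %v\n", depth, err)
			continue
		}
		fmt.Printf("depth %d: %d iterations\n", depth, n)
	}
}
```

Because the check lives in code rather than in the prompt, a sub-agent cannot talk its way into a deeper tree or a longer loop.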

Async delegation

{
  "task": "Research the latest Go release notes",
  "agent": "researcher",
  "async": true
}

Returns a task handle immediately. Query status with delegate_status, get results with delegate_result, or block with delegate_await.
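
The handle-then-await pattern can be sketched with a small in-memory task store. This is an illustration only: TaskStore and its Start, Status, and Await methods are hypothetical names, loosely corresponding to an async delegate call, delegate_status, and delegate_await. The real delegation package may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// task holds one background job. done is closed when result is ready.
type task struct {
	done   chan struct{}
	result string
}

// TaskStore is a hypothetical in-memory store of async tasks.
type TaskStore struct {
	mu    sync.Mutex
	next  int
	tasks map[string]*task
}

func NewTaskStore() *TaskStore {
	return &TaskStore{tasks: make(map[string]*task)}
}

// Start launches work in a goroutine and returns a handle immediately.
func (s *TaskStore) Start(work func() string) string {
	s.mu.Lock()
	s.next++
	id := fmt.Sprintf("task-%d", s.next)
	t := &task{done: make(chan struct{})}
	s.tasks[id] = t
	s.mu.Unlock()

	go func() {
		t.result = work() // written before close, so readers see it
		close(t.done)
	}()
	return id
}

func (s *TaskStore) lookup(id string) (*task, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	t, ok := s.tasks[id]
	return t, ok
}

// Status reports "running", "done", or "unknown" without blocking.
func (s *TaskStore) Status(id string) string {
	t, ok := s.lookup(id)
	if !ok {
		return "unknown"
	}
	select {
	case <-t.done:
		return "done"
	default:
		return "running"
	}
}

// Await blocks until the task finishes and returns its result.
func (s *TaskStore) Await(id string) (string, bool) {
	t, ok := s.lookup(id)
	if !ok {
		return "", false
	}
	<-t.done
	return t.result, true
}

func main() {
	store := NewTaskStore()
	id := store.Start(func() string { return "release notes summary" })
	result, _ := store.Await(id)
	fmt.Println(id, store.Status(id), result)
}
```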

Custom agents

Named agents in .harness/agents/ bundle:

Model — which model to use
System prompt — the markdown body
Tools — references to .harness/tools/ or inline definitions
Hooks — references to .harness/hooks/ or inline definitions

Agent tools can be string references (loaded from .harness/tools/) or inline objects. Hooks work the same way. This makes tools and hooks composable — define once, reuse across agents.

Parallel tool execution

All tool calls within a single model turn execute concurrently (goroutines + WaitGroup). Results are collected in order and added to context sequentially.
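
A minimal sketch of that pattern follows. The Call and Result types and the runAll function are simplified, hypothetical stand-ins rather than the agent package's real types. Each goroutine writes only to its own slice index, which keeps results in request order without extra locking.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Call and Result are simplified, hypothetical versions of a tool call and
// its outcome.
type Call struct{ ID, Name, Args string }
type Result struct{ ID, Output string }

// runAll executes every call concurrently and returns results in the same
// order as the input, regardless of which goroutine finishes first.
func runAll(calls []Call, exec func(Call) string) []Result {
	results := make([]Result, len(calls))
	var wg sync.WaitGroup
	for i, c := range calls {
		wg.Add(1)
		go func(i int, c Call) {
			defer wg.Done()
			results[i] = Result{ID: c.ID, Output: exec(c)}
		}(i, c)
	}
	wg.Wait()
	return results
}

func main() {
	calls := []Call{
		{ID: "call_1", Name: "echo", Args: "first"},
		{ID: "call_2", Name: "echo", Args: "second"},
	}
	results := runAll(calls, func(c Call) string {
		return strings.ToUpper(c.Args)
	})
	// Results are appended to the conversation one at a time, in order.
	for _, r := range results {
		fmt.Println(r.ID, r.Output)
	}
}
```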

Starlark scripting engine

All tools and hooks are implemented in Starlark (a Python-like language) embedded in the Markdown frontmatter. No Go code needed for tool logic.

Available built-ins:

Category	Functions
Time	`time.now()`
JSON	`json.encode(val)`, `json.decode(s)`
Math	`math.abs`, `math.min`, `math.max`, `math.floor`, `math.ceil`
Runtime	`os.cwd()`, `os.hostname()`, `os.platform()`, `os.args()`
URL / IDs	`url.parse(s)`, `url.encode(params)`, `uuid.v4()`
Flow control	`random(min, max)`, `sleep(ms)`, `assert(condition, msg?)`
Network	`http.get(url, headers?, timeout_seconds?)`, `http.post(url, body?, headers?, timeout_seconds?)`
Regex	`re.match(pattern, text)`, `re.find_all(pattern, text)`, `re.replace(pattern, repl, text)`
Hashing	`hash.sha256(text)`, `hash.md5(text)`
Encoding / crypto	`base64.encode(s)`, `base64.decode(s)`, `crypto.hmac_sha256(key, msg)`
Strings / templating	`string.upper/lower/trim/split/join/truncate/pad_left/pad_right`, `template.render(tmpl, vars)`
Validation / sets	`validate.email/url/json`, `set.new/contains/union/intersect/diff/values/size`
State	`cache.set/get/has/delete/clear`, `metrics.incr/get/reset/snapshot`, `ctx.set/get/has/delete/clear/snapshot`
I/O	`env(key)`, `log(msg)`, `emit("custom.event", payload)`, `exec.run(cmd, args?, timeout_ms?, dir?)`
File read	`fs.read(path)`, `fs.exists(path)`, `fs.list(path)`, `fs.stat(path)`, `fs.glob(pattern)`, `fs.line_count(path)`, `fs.find(path, pattern)`, `fs.read_lines(path, start, end)`
File write	`fs.write(path, content)`, `fs.append(path, content)`, `fs.mkdir(path)`, `fs.remove(path)`, `fs.copy(src, dst)`, `fs.move(src, dst)`
File edit / preview	`fs.replace(path, old, new)`, `fs.replace_all(path, old, new)`, `fs.insert_at(path, line, content)`, `fs.replace_lines(path, start, end, content)`, `fs.delete_lines(path, start, end)`, `fs.diff(old_content, new_content, old_name?, new_name?)`
Hooks	`allow()`, `block(reason)`, `modify(payload)`

Lower-level API

client := completion.NewClient(completion.ClientConfig{
  BaseURL:    "https://api.githubcopilot.com",
  APIKey:     os.Getenv("GITHUB_TOKEN"),
  Model:      "gpt-4o",
  MaxRetries: 3,
})

registry := tools.NewRegistry()
system := hooks.NewSystem()
ctxMgr := contextpkg.NewManager(contextpkg.Config{
  SystemPrompt: "You are a helpful assistant.",
})

a := agent.New(agent.Options{
  Client:  client,
  Tools:   registry,
  Hooks:   system,
  Context: ctxMgr,
})

CLI example

# Linux/macOS
export GH_TOKEN=$(gh auth token)

# Windows PowerShell
$env:GH_TOKEN = $(gh auth token)

go run ./cmd/example/

API reference

`harness`

The high-level entry point. It loads config, creates the completion client, context manager, hook system, tool registry, model registry, agent registry, and agent.

h, err := harness.New("harness.md")
result, err := h.Run(ctx, "Summarize this file")

Key methods:

New(configPath string) (*Harness, error) — auto-detects .md vs .yaml
NewFromConfig(cfg *config.Config, agents map[string]*config.AgentConfig) (*Harness, error)
Run(ctx, input)
RunSession(ctx) / EndSession(ctx)
RegisterTool(def, handler)
RegisterHook(reg)
Agent()

Notes:

Tools with a script field are fully functional immediately — no Go handler needed.
Tools without a script are registered with a placeholder handler until you supply one via RegisterTool.
Hooks with a script field are fully functional immediately.
The delegate meta-tool is auto-registered when using the harness package.

`agent`

The core turn loop. It sends messages to the model, executes requested tools, appends tool results, and continues until the model returns a final response.

result, err := a.Run(ctx, "What time is it?")
fmt.Println(result.Response)
fmt.Println(result.ToolCalls)
fmt.Println(result.ToolResults)

Behavior highlights:

supports tool-call loops
aggregates token usage across completion calls
enforces MaxToolIterations
emits lifecycle hooks around sessions, turns, tools, and completions

`completion`

OpenAI-compatible chat client with retry support and both non-streaming and streaming modes.

resp, err := client.Complete(ctx, completion.Request{
  Messages: []completion.Message{{Role: completion.RoleUser, Content: "Hello"}},
})

Streaming example

stream, err := client.CompleteStream(ctx, completion.Request{
  Messages: []completion.Message{{Role: completion.RoleUser, Content: "Stream the answer"}},
})
if err != nil {
  panic(err)
}

for chunk := range stream {
  if chunk.Err != nil {
    panic(chunk.Err)
  }
  if chunk.Done {
    break
  }

  if chunk.Delta != "" {
    fmt.Print(chunk.Delta)
  }
  for _, tc := range chunk.ToolCallDeltas {
    fmt.Printf("\npartial tool call: %s %s", tc.Function.Name, tc.Function.Arguments)
  }
}

Streaming details:

parses Server-Sent Events in OpenAI chat format (data: {...}\n\n)
handles the [DONE] sentinel
returns StreamChunk values with text deltas, tool call deltas, finish reason, done state, and stream errors

`tools`

Runtime tool registry used by the agent.

registry := tools.NewRegistry()
err := registry.Register(definition, handler)
result := registry.Execute(ctx, tools.Call{
  ID:        "call_1",
  Name:      "echo",
  Arguments: json.RawMessage(`{"message":"hello"}`),
})

Capabilities:

register/unregister tools
inspect definitions with Get and List
execute handlers with JSON arguments
convert registered tools to OpenAI tool schema with ToOpenAIFormat()

`hooks`

Lifecycle hook system for governance and cross-cutting behavior.

sys := hooks.NewSystem()
sys.Register(hooks.Registration{
  Name:     "block-dangerous-tool",
  Event:    hooks.EventToolPre,
  Priority: 10,
  Handler: func(ctx context.Context, event hooks.Event, payload any) hooks.Result {
    return hooks.Result{Action: hooks.ActionContinue}
  },
})

Capabilities:

register ordered handlers by event
inspect handlers by event
block, modify, or continue execution

`context`

Conversation history manager with a system prompt and basic token-aware truncation.

ctxMgr := contextpkg.NewManager(contextpkg.Config{
  SystemPrompt: "You are concise.",
  MaxMessages:  50,
  MaxTokens:    128000,
})
ctxMgr.AddMessage(completion.Message{Role: completion.RoleUser, Content: "Hi"})
messages := ctxMgr.Messages()

Capabilities:

maintain conversation history
preserve system prompt outside history
estimate tokens approximately
truncate oldest messages when limits are exceeded
fork contexts for branching workflows
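
The truncation behavior can be sketched as follows. Everything here is an assumption made for illustration: the Message type, the four-characters-per-token estimate, and the choice to always keep the newest message are not taken from the context package.

```go
package main

import "fmt"

// Message is a simplified, hypothetical conversation entry.
type Message struct{ Role, Content string }

// estimateTokens is a crude heuristic (about four characters per token).
// The real estimator may differ.
func estimateTokens(m Message) int { return len(m.Content)/4 + 1 }

// truncate drops the oldest messages until the history fits both limits.
// It always keeps at least the newest message, which is an assumption.
func truncate(history []Message, maxMessages, maxTokens int) []Message {
	if maxMessages >= 0 && len(history) > maxMessages {
		history = history[len(history)-maxMessages:]
	}
	total := 0
	for _, m := range history {
		total += estimateTokens(m)
	}
	for len(history) > 1 && total > maxTokens {
		total -= estimateTokens(history[0])
		history = history[1:]
	}
	return history
}

func main() {
	history := []Message{
		{Role: "user", Content: "message1"},
		{Role: "assistant", Content: "message2"},
		{Role: "user", Content: "message3"},
	}
	for _, m := range truncate(history, 2, 1000) {
		fmt.Println(m.Role, m.Content)
	}
}
```

The system prompt is kept outside this slice, so truncation never removes it.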

`config`

Markdown/YAML config loader, directory scanner, and validator.

cfg, agents, err := config.LoadFull("harness.md")
if err != nil {
  panic(err)
}

apiKey := cfg.ResolveAPIKey()
baseURL := cfg.BaseURL()

Capabilities:

parse Markdown (frontmatter + body) and legacy YAML/JSON
auto-detect format by extension
scan .harness/tools/, .harness/hooks/, .harness/agents/ directories
merge file-based definitions with inline (additive, files win on collision)
apply defaults and validate

Configuration reference

Example harness.md frontmatter:

model:
  provider: copilot         # openai | copilot | custom string
  name: gpt-4o              # required, non-empty
  max_tokens: 4096          # must be > 0
  temperature: 0.7          # must be between 0 and 2
  base_url: ""             # optional override; provider default used when empty
  api_key_env: GH_TOKEN     # env var to read API key from

models:                      # named model registry for delegation
  - name: gpt-4o
    provider: copilot
    api_key_env: GH_TOKEN
  - name: gpt-4o-mini
    provider: copilot
    api_key_env: GH_TOKEN

delegation:
  max_depth: 3              # max recursive depth (hard cap: 5)
  max_concurrent: 5         # max async tasks running
  iterations_per_depth:     # tool iterations allowed per depth level
    - 20
    - 10
    - 5
    - 3

context:
  max_history: 50           # max non-system messages retained
  max_tokens: 128000        # approximate context budget
  system_prompt: ""         # overridden by markdown body if empty

tools:
  - name: echo
    description: Echo back a message
    parameters:
      message:
        type: string        # string | number | boolean | object | array
        description: Message to echo back
        required: true

hooks:
  - event: tool.pre         # see valid hook events below
    handler: audit_log      # symbolic hook name

Full schema

`model`

Field	Type	Required	Notes
`provider`	string	no	Defaults to `openai`
`name`	string	yes	Model name; validated as non-empty
`max_tokens`	int	yes	Must be greater than 0
`temperature`	float	yes	Must be between 0 and 2
`base_url`	string	no	Overrides provider-based default
`api_key_env`	string	no	Defaults to `GITHUB_TOKEN`

`context`

Field	Type	Required	Notes
`max_history`	int	no	Defaults to `50`
`max_tokens`	int	no	Defaults to `128000`
`system_prompt`	string	no	Prepended as a system message

`tools[]`

Field	Type	Required	Notes
`name`	string	yes	Must be unique and non-empty
`description`	string	no	Sent to the model
`parameters`	map	no	Parameter definitions keyed by name
`script`	string	no	Starlark script implementing `def run(args): ...`

`tools[].parameters.*`

Field	Type	Required	Notes
`type`	string	yes	JSON-schema-like primitive type
`description`	string	no	Parameter description
`required`	bool	no	Marks the field required

`hooks[]`

Field	Type	Required	Notes
`event`	string	yes	Must match a defined lifecycle event
`handler`	string	yes	Symbolic handler name
`when`	string	no	Optional Starlark expression; hook runs only when it evaluates truthy
`priority`	int	no	Lower numbers execute first (default: 100)
`script`	string	no	Starlark script implementing `def handle(event, payload): ...`

Hook system

Events

Valid events:

session.start
session.end
turn.start
turn.end
tool.pre
tool.post
completion.pre
completion.post

Hooks may also include a when: expression that can inspect event, payload, and the standard Starlark built-ins before the main handle(event, payload) function runs.

Actions

A hook handler returns hooks.Result with one of these actions:

ActionContinue: continue normally
ActionBlock: stop execution and return an error
ActionModify: replace the payload passed to subsequent handlers
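
To show how the three actions interact, here is a simplified dispatch loop. It is a sketch under assumptions, not the hooks package itself: the types are hypothetical, payloads are plain strings instead of any, and handlers with equal priority run in registration order.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Action, Result, and Hook are simplified, hypothetical versions of the
// hooks package types.
type Action int

const (
	Continue Action = iota
	Block
	Modify
)

type Result struct {
	Action  Action
	Payload string
	Reason  string
}

type Hook struct {
	Name     string
	Priority int
	Handle   func(payload string) Result
}

// dispatch runs hooks in ascending priority order. Block stops the chain
// with an error; Modify replaces the payload seen by later hooks.
func dispatch(hooks []Hook, payload string) (string, error) {
	ordered := append([]Hook(nil), hooks...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Priority < ordered[j].Priority
	})
	for _, h := range ordered {
		switch res := h.Handle(payload); res.Action {
		case Block:
			return "", fmt.Errorf("blocked by %s: %s", h.Name, res.Reason)
		case Modify:
			payload = res.Payload
		}
	}
	return payload, nil
}

func main() {
	hooks := []Hook{
		{Name: "prefix", Priority: 10, Handle: func(p string) Result {
			return Result{Action: Modify, Payload: "Q: " + p}
		}},
		{Name: "trim", Priority: 1, Handle: func(p string) Result {
			return Result{Action: Modify, Payload: strings.TrimSpace(p)}
		}},
	}
	out, err := dispatch(hooks, "  hello  ")
	fmt.Printf("%q %v\n", out, err)
}
```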

Priority

Lower priority numbers run first.

sys.Register(hooks.Registration{
  Name:     "normalize-input",
  Event:    hooks.EventTurnStart,
  Priority: 1,
  Handler: func(ctx context.Context, event hooks.Event, payload any) hooks.Result {
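    // Checked assertion: an unexpected payload type passes through instead of panicking.
    input, ok := payload.(string)
    if !ok {
      return hooks.Result{Action: hooks.ActionContinue}
    }
    return hooks.Result{Action: hooks.ActionModify, Payload: strings.TrimSpace(input)}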
  },
})

Common hook examples

Block a tool call

sys.Register(hooks.Registration{
  Name:  "block-delete",
  Event: hooks.EventToolPre,
  Handler: func(ctx context.Context, event hooks.Event, payload any) hooks.Result {
    // Checked assertion: ignore payloads that are not tool calls.
    call, ok := payload.(*tools.Call)
    if !ok || call.Name != "delete_file" {
      return hooks.Result{Action: hooks.ActionContinue}
    }
    return hooks.Result{Action: hooks.ActionBlock, Reason: "delete_file is disabled"}
  },
})

Modify user input

sys.Register(hooks.Registration{
  Name:  "rewrite-input",
  Event: hooks.EventTurnStart,
  Handler: func(ctx context.Context, event hooks.Event, payload any) hooks.Result {
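    // Checked assertion: leave non-string payloads untouched.
    input, ok := payload.(string)
    if !ok {
      return hooks.Result{Action: hooks.ActionContinue}
    }
    return hooks.Result{Action: hooks.ActionModify, Payload: "Answer in one sentence: " + input}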
  },
})

Tool registration

A tool is made of two parts:

a definition the model sees
a handler that executes the tool

def := tools.Definition{
  Name:        "calculate",
  Description: "Add two numbers",
  Parameters: []tools.Parameter{
    {Name: "a", Type: tools.TypeNumber, Description: "First number", Required: true},
    {Name: "b", Type: tools.TypeNumber, Description: "Second number", Required: true},
  },
}

handler := func(ctx context.Context, args json.RawMessage) (string, error) {
  var input struct {
    A float64 `json:"a"`
    B float64 `json:"b"`
  }
  if err := json.Unmarshal(args, &input); err != nil {
    return "", err
  }
  return fmt.Sprintf("%g", input.A+input.B), nil
}

registry.Register(def, handler)

Execution flow:

model requests a tool call
agent converts the request into tools.Call
registry invokes the handler
result is added back to conversation context as a tool message
model continues with the new information

Testing

go test ./... -cover

The test suite aims for high coverage across the core libraries: streaming, config validation, agent loop behavior, hooks, tools, context management, and the top-level harness package.

Project structure

ai-harness/
├── .harness/           # File-based tools, hooks, agents (auto-discovered)
│   ├── agents/         # Custom named agents (.md files)
│   ├── hooks/          # Hook definitions (.md files)
│   └── tools/          # Tool definitions (.md files)
├── agent/              # Agent loop orchestration (parallel tool execution)
├── cmd/example/        # Example CLI (auto-detects harness.md vs harness.yaml)
├── completion/         # OpenAI-compatible client, including streaming
├── config/             # Markdown/YAML config, directory loader, validation
├── context/            # Conversation history manager
├── delegation/         # Recursive delegation, depth tracking, async task store
├── harness/            # High-level API (model registry, agent resolver, wiring)
├── hooks/              # Lifecycle hook system
├── scripting/          # Starlark engine + fs/edit built-ins
├── tools/              # Tool registry and execution
├── harness.md          # Root harness configuration
└── go.mod

Typed Artifact System

Artifacts are the fundamental building blocks of a harness. Each artifact is a single Markdown file that bundles identity, tools, hooks, and models into one composable unit.

Artifact Types (priority order)

Type	Priority	Purpose
`override`	100	Project-local overrides that supersede anything
`harness`	80	Root identity and policy (exactly one per project)
`builtin`	60	Core capabilities shipped with the runtime
`plugin`	40	User-authored or third-party capability bundles
`model`	20	Provider/model onboarding configurations

One file = one capability

---
name: git-safety
type: plugin
version: 1.0.0
description: Prevent force-pushes and history rewrites
tags: [governance, git]
condition: '"*.git*" in ctx.get("active_files", [])'
tools:
  - name: git-status
    description: Show git status safely
    timeout_ms: 5000
hooks:
  - event: tool.pre
    handler: block_force_push
    script: |
      def handle(event, payload):
          if "force" in payload.get("args", ""):
              return block("Force push blocked by governance")
          return allow()
---

Git safety context: this plugin ensures all git operations
go through the governed workflow. Force pushes are blocked
at the architectural level.

Per-turn evaluation

Artifacts with condition expressions are evaluated every turn using Starlark:

condition: 'ctx.get("turn", 0) > 3'          # Activate after turn 3
condition: 'ctx.get("mode") == "review"'      # Activate in review mode
condition: 'len(time.now()) > 0'              # Always active (time-based)

After EvaluateConditions() runs, each artifact's Active field reflects whether it should participate in composition. This is the key differentiator: governance adapts per-turn, not just at startup.

Composition & Options Pattern

The Composer merges all active artifacts into a unified view using priority-based resolution.
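
As an illustration of priority-based resolution, the sketch below decides which active artifact supplies each tool name. It is not the artifact package's Composer: the Artifact struct and resolveTools function are hypothetical, and the rule that the higher-priority artifact wins a name collision is an assumption inferred from the priority table above.

```go
package main

import (
	"fmt"
	"sort"
)

// Artifact is a simplified, hypothetical view of a registered artifact.
type Artifact struct {
	Name     string
	Priority int
	Active   bool
	Tools    []string
}

// resolveTools maps each tool name to the active artifact that provides it.
// Artifacts are visited from highest to lowest priority, and the first one
// to claim a tool name keeps it (assumed behavior).
func resolveTools(arts []Artifact) map[string]string {
	ordered := append([]Artifact(nil), arts...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Priority > ordered[j].Priority
	})
	owners := make(map[string]string)
	for _, a := range ordered {
		if !a.Active {
			continue // inactive artifacts do not participate
		}
		for _, t := range a.Tools {
			if _, taken := owners[t]; !taken {
				owners[t] = a.Name
			}
		}
	}
	return owners
}

func main() {
	owners := resolveTools([]Artifact{
		{Name: "git-safety", Priority: 40, Active: true, Tools: []string{"git-status"}},
		{Name: "local-override", Priority: 100, Active: true, Tools: []string{"git-status"}},
		{Name: "review-mode", Priority: 40, Active: false, Tools: []string{"review"}},
	})
	fmt.Println(owners)
}
```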

Basic usage

import "github.com/htekdev/ai-harness/artifact"

reg := artifact.NewRegistry()
// ... register artifacts ...

composer := artifact.NewComposer(reg)

// Default: compose only Active artifacts (respects EvaluateConditions)
result, err := composer.Compose(nil)

// With dynamic condition evaluation at compose time
result, err = composer.Compose(func(cond string) (bool, error) {
    return evaluateStarlark(cond)
})

Functional options (ComposeWith)

For fine-grained control over composition:

// Only active artifacts (default)
result, _ := composer.ComposeWith()

// Include inactive artifacts (debugging/observability)
result, _ = composer.ComposeWith(artifact.WithIncludeInactive())

// Filter by type
result, _ = composer.ComposeWith(artifact.WithTypeFilter(artifact.TypePlugin))

// Filter by tag
result, _ = composer.ComposeWith(artifact.WithTagFilter("governance"))

// Dynamic evaluation (overrides cached Active state)
result, _ = composer.ComposeWith(artifact.WithEvalFn(myEvalFn))

// Combine options
result, _ = composer.ComposeWith(
    artifact.WithTypeFilter(artifact.TypePlugin, artifact.TypeBuiltin),
    artifact.WithTagFilter("security"),
)

Per-turn lifecycle

The full per-turn workflow:

// 1. Set turn state
ctx := scripting.WithTurnState(context.Background())
scripting.SetTurnState(ctx, "turn", turnNumber)
scripting.SetTurnState(ctx, "mode", "coding")

// 2. Evaluate all artifact conditions against current state
composer.EvaluateConditions(ctx)

// 3. Compose only the artifacts that passed evaluation
result, err := composer.Compose(nil)
// result.Tools       — deduplicated, priority-ordered tools
// result.Hooks       — all hooks from active artifacts
// result.Identity    — merged system prompt from harness artifact
// result.ContextBlocks — context from all active non-harness artifacts

Status

v0.3.0 — Typed artifact system, context observability, per-turn evaluation engine.

Component	Status
Config (Markdown + YAML)	✅ Stable
Agent loop (parallel tools)	✅ Stable
Hook system (8 lifecycle events)	✅ Stable
Tool registry + Starlark engine	✅ Stable
Delegation (sync + async)	✅ Stable
Completion client (streaming)	✅ Stable
Eval framework	✅ Stable
`.harness/` directory convention	✅ Stable
Typed artifact registry	✅ Stable
Context observability	✅ Stable
Per-turn evaluation engine	✅ Stable
Composition options pattern	✅ Stable

Contributing

Contributions are welcome. Keep changes small, add tests with code changes, and run:

go build ./...
go test ./... -cover
go vet ./...

License

MIT
