Tracebound

Tracebound closes the loop between your LLM agent in production and the next fix you ship.

The problem

Teams ship LLM agents and then operate them blind. Telemetry is collected but rarely fed back into the development loop. Failure modes are discovered ad hoc, usually by one engineer scrolling Langfuse on a Friday afternoon. Fixes are one-off and rarely regression-tested against the trace that motivated them. Expensive SME review effort is thrown away after a single Slack comment.

What Tracebound does

Tracebound ingests production traces from any telemetry source, harvests feedback from users and subject-matter experts, and clusters recurring failures into a persistent, versioned catalogue that lives as a diffable file in your repo. For every failure mode, a coding agent drafts a fix spec you review before code is touched, implements the change in your working tree, and replays the original failing cohort against the patched agent to confirm the fix actually worked. Every state transition is human-reviewed; nothing gets committed, pushed, or merged without you.

ingest traces → extract feedback → cluster failure modes → propose fix → verify → (optionally) harden

Tracebound is opinionated about the workflow and agnostic about the telemetry source, feedback signal, agent framework, and test infrastructure.

How it works

Tracebound has two parts:

A CLI — deterministic primitives: scaffold, validate, query. Stateless, fast, used as a subroutine by the skills and by you.
Agent skills — LLM-driven judgment work. Each skill is a .md file that an AI coding agent (e.g. GitHub Copilot, Cursor, Claude) reads and follows. Skills call the CLI as a subroutine.

Everything lives as diffable files in your repo under a tracebound/ folder. No server, no database.

Installation

npm install -g @tracebound/cli

Or use without installing:

npx @tracebound/cli <command>

Quick start

1. Scaffold your project

Run this inside the repo that contains your agent. Pick a name for the agent — Tracebound scopes everything by it, so a single repo can track multiple agents side by side:

tracebound init support-bot

This creates:

tracebound/
└── support-bot/
    ├── tracebound.config.md      # edit this first
    ├── failure_modes.json       # starts empty
    ├── traces/                  # drop JSONL files here
    ├── failure_modes/           # one folder per failure mode
    └── adapters/                # converter scripts go here

Run tracebound init <other-agent> again to track a second agent. List the configured agents at any time with tracebound agents.

2. Configure the project

Edit tracebound/support-bot/tracebound.config.md. At minimum fill in:

Agent under test — repo root, entry point, framework.
Sanity checks — npm run typecheck, npm run lint, npm test, or whatever your project uses.
Model test command — a CLI that accepts --input "<message>" and invokes your live agent. Used by implement-failure-mode to verify fixes.
Hard rules — constraints the coding agent must never violate (e.g. "never edit src/payments/** without human review").
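
As an illustration of the model test command, the argument handling for such a CLI might look like the sketch below. The function name parseInputFlag is hypothetical, and how the script calls your agent and reports its reply is left to your project; only the --input "<message>" contract comes from the configuration described above.

```typescript
// Hypothetical helper: pulls the message out of an argument list such as
// ["--input", "Can you cancel my order?"].
function parseInputFlag(argv: string[]): string {
  const flagIndex = argv.indexOf("--input");
  const value = flagIndex === -1 ? undefined : argv[flagIndex + 1];
  if (value === undefined) {
    throw new Error('usage: <your-command> --input "<message>"');
  }
  return value;
}

// In a real script you would pass process.argv.slice(2) and hand the
// message to your agent's entry point (assumption: your agent exposes one).
const message = parseInputFlag(["--input", "Can you cancel my order?"]);
console.log(message); // Can you cancel my order?
```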

3. Import traces

Write or generate an adapter script under tracebound/support-bot/adapters/ that reads your telemetry provider's export and writes Tracebound-format JSONL to tracebound/support-bot/traces/.

The create-adapter skill can generate this script from a sample of your data:

"Create a Tracebound adapter for this Langfuse JSON export: <paste sample>"

Each line in the output JSONL is a Trace:

{
  "id": "trc_abc123",
  "input": "Can you cancel my order?",
  "output": "I've cancelled order #5551. You'll get a confirmation email shortly.",
  "feedback": [{ "sentiment": "negative", "source": "sme", "comment": "No cancel tool exists — hallucinated side-effect." }],
  "originalTraceFile": "original/trc_abc123.json",
  "failureModeIds": [],
  "analysis": { "status": "pending" }
}
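
To make the adapter step concrete, here is a minimal sketch of a converter. The Trace and Feedback interfaces are inferred from the single example above and may be narrower or looser than the real schema. ExportRecord is an invented stand-in for a provider export, not Langfuse's actual format, and toTrace and toJsonl are illustrative names.

```typescript
// Shape inferred from the example trace above; the real Trace schema may
// define more fields or stricter value sets.
interface Feedback {
  sentiment: string; // "negative" is the only value shown in the README
  source: string;    // "sme" is the only value shown in the README
  comment: string;
}

interface Trace {
  id: string;
  input: string;
  output: string;
  feedback: Feedback[];
  originalTraceFile: string;
  failureModeIds: string[];
  analysis: { status: string };
}

// HYPOTHETICAL provider record. Replace with your provider's real shape.
interface ExportRecord {
  traceId: string;
  prompt: string;
  completion: string;
  reviews: { thumbsDown: boolean; reviewer: string; note: string }[];
}

function toTrace(record: ExportRecord): Trace {
  return {
    id: record.traceId,
    input: record.prompt,
    output: record.completion,
    // Only thumbs-down reviews are mapped, because "negative" is the only
    // sentiment value this sketch can ground in the README.
    feedback: record.reviews
      .filter((r) => r.thumbsDown)
      .map((r) => ({ sentiment: "negative", source: r.reviewer, comment: r.note })),
    originalTraceFile: `original/${record.traceId}.json`,
    failureModeIds: [],
    analysis: { status: "pending" },
  };
}

// JSONL: one JSON object per line, each line newline-terminated.
function toJsonl(records: ExportRecord[]): string {
  return records.map((r) => JSON.stringify(toTrace(r)) + "\n").join("");
}

const sample: ExportRecord = {
  traceId: "trc_abc123",
  prompt: "Can you cancel my order?",
  completion: "I've cancelled order #5551.",
  reviews: [{ thumbsDown: true, reviewer: "sme", note: "No cancel tool exists." }],
};
console.log(toJsonl([sample]));
```

A real adapter would read the export from disk and write the result under the agent's traces/ folder; file handling is omitted here to keep the mapping itself in view.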

4. Discover failure modes

Point the analyze-traces skill at a trace file (always under one agent):

"Run analyze-traces for support-bot on tracebound/support-bot/traces/langfuse-2026-04-26.jsonl"

The skill processes negatively-signalled traces in configurable batches, clusters them into failure modes, writes failure_modes.json, and validates after every batch. It self-corrects on validation errors.
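
The clustering itself is LLM-driven, but the selection and batching that precede it are mechanical. The sketch below shows one way to do that step. It assumes a trace counts as negatively signalled when at least one feedback entry has sentiment "negative", which is an interpretation rather than a documented rule; hasNegativeSignal and toBatches are illustrative names.

```typescript
interface TraceLike {
  id: string;
  feedback: { sentiment: string }[];
}

// Assumption: one negative feedback entry is enough to select a trace.
function hasNegativeSignal(trace: TraceLike): boolean {
  return trace.feedback.some((f) => f.sentiment === "negative");
}

// Splits items into consecutive batches of at most batchSize elements.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (batchSize < 1 || Math.floor(batchSize) !== batchSize) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

const traces: TraceLike[] = [
  { id: "trc_1", feedback: [{ sentiment: "negative" }] },
  { id: "trc_2", feedback: [] },
  { id: "trc_3", feedback: [{ sentiment: "negative" }] },
  { id: "trc_4", feedback: [{ sentiment: "negative" }] },
];

const batches = toBatches(traces.filter(hasNegativeSignal), 2);
console.log(batches.map((b) => b.map((t) => t.id).join(",")).join(" | "));
// trc_1,trc_3 | trc_4
```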

Example failure_modes.json:
{
  "failureModes": [
    {
      "id": "fm_2026_04_hallucinated_action",
      "title": "Hallucinated side-effect confirmations",
      "description": "Agent confirms destructive actions (cancellations, refunds) it has no tool to perform.",
      "status": "discovered",
      "severity": "high",
      "tags": ["hallucination", "tool-use"],
      "discoveredAt": "2026-04-26T14:30:00Z",
      "lastUpdated": "2026-04-26T14:30:00Z",
      "affectedTraces": [{ "filename": "langfuse-2026-04-26.jsonl", "traceId": "trc_abc123" }]
    }
  ]
}

5. Research and spec a fix

Hand an agent and a failure mode id to the research-failure-mode skill:

"Research fm_2026_04_hallucinated_action for support-bot"

The skill reads the cohort, reads the agent source, forms hypotheses, then writes tracebound/support-bot/failure_modes/fm_2026_04_hallucinated_action/SPEC.md — a structured fix spec with root cause, proposed changes, acceptance criteria, and a test plan.

You review the spec before any code is touched. When you're happy, tell the skill to mark it approved, or set status to fix_approved in failure_modes.json yourself.
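
If you prefer to script the manual approval, the status change could be sketched as follows. The function name approveFix is hypothetical, and bumping lastUpdated alongside status is an assumption; the README only says that status becomes fix_approved.

```typescript
// Minimal view of the catalogue; unknown extra fields are preserved as-is.
interface FailureMode {
  id: string;
  status: string;
  lastUpdated: string;
  [key: string]: unknown;
}

interface FailureModesFile {
  failureModes: FailureMode[];
  [key: string]: unknown;
}

// Returns a new catalogue object; the input is not mutated.
function approveFix(file: FailureModesFile, id: string, now: string): FailureModesFile {
  if (!file.failureModes.some((fm) => fm.id === id)) {
    throw new Error(`unknown failure mode: ${id}`);
  }
  return {
    ...file,
    failureModes: file.failureModes.map((fm) =>
      // Assumption: lastUpdated should track the status change.
      fm.id === id ? { ...fm, status: "fix_approved", lastUpdated: now } : fm
    ),
  };
}

const before: FailureModesFile = {
  failureModes: [
    {
      id: "fm_2026_04_hallucinated_action",
      status: "discovered",
      lastUpdated: "2026-04-26T14:30:00Z",
    },
  ],
};
const after = approveFix(before, "fm_2026_04_hallucinated_action", "2026-04-27T09:00:00Z");
console.log(after.failureModes[0].status); // fix_approved
```

After writing the result back to failure_modes.json, running tracebound validate --agent support-bot is a cheap way to confirm the file still parses.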

6. Implement and verify

"Implement fm_2026_04_hallucinated_action for support-bot"

The implement-failure-mode skill reads the approved spec, writes a PLAN.md, makes the code changes, runs your sanity checks, then invokes the live agent with inputs derived from the failure mode's cohort to confirm the failure is resolved. Status moves to verified.

CLI reference

tracebound <command> [options]

Commands:
  init <agent-name>    Scaffold tracebound/<agent-name>/ in the current repo.
  agents               List the agents configured under tracebound/.
  validate             Validate one agent's tracebound/<agent>/ tree.
  status               Print catalogue health for one agent.
  trace get <id>       Find a trace by id within one agent.
  fm get <id>          Print a failure mode by id within one agent.

Global options:
  -h, --help           Show this help.
  -v, --version        Print the CLI version.

Every command except init and agents requires --agent <name> (alias -a). Without it, the command exits with code 2 and prints the list of agents currently configured under tracebound/.

`tracebound init <agent-name>`

Scaffolds tracebound/<agent-name>/. Pre-existing files are left untouched. Agent names must match ^[a-z0-9][a-z0-9_-]*$ (lowercase letters, digits, underscores, hyphens; must start with a letter or digit).

Positionals:
  <agent-name>       Required.

Options:
  -C, --cwd <path>   Directory to initialise inside (default: cwd)
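
The naming rule can be checked before calling init, for example in a setup script. This is a minimal sketch; the pattern is the one quoted above, while isValidAgentName is an illustrative helper and not part of the Tracebound CLI.

```typescript
// Pattern quoted from the init documentation: lowercase letters, digits,
// underscores, hyphens; the first character must be a letter or digit.
const AGENT_NAME_PATTERN = /^[a-z0-9][a-z0-9_-]*$/;

// Hypothetical helper (not a Tracebound API).
function isValidAgentName(name: string): boolean {
  return AGENT_NAME_PATTERN.test(name);
}

console.log(isValidAgentName("support-bot")); // true
console.log(isValidAgentName("Support Bot")); // false
console.log(isValidAgentName("-bot"));        // false
```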

`tracebound agents`

Lists every subdirectory of tracebound/ that contains a tracebound.config.md file (sorted alphabetically). Exits 0 even when no agents are configured.

Options:
  -C, --cwd <path>   Directory to inspect (default: cwd)
  --json             Emit { "agents": [{ "name", "path" }] }

Exit codes:
  0   listing printed (may be empty)
  2   could not run
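
A script that consumes the --json output might parse it as sketched below. The top-level shape follows the { "agents": [{ "name", "path" }] } description above; the sample path value is made up, since the README does not say whether paths are relative or absolute, and parseAgentNames is an illustrative name.

```typescript
interface AgentsReport {
  agents: { name: string; path: string }[];
}

// Parses the text printed by `tracebound agents --json` and returns the names.
function parseAgentNames(stdout: string): string[] {
  const report = JSON.parse(stdout) as AgentsReport;
  if (!report || !Array.isArray(report.agents)) {
    throw new Error("unexpected output shape: missing agents array");
  }
  return report.agents.map((agent) => agent.name);
}

// Illustrative input only; the path value is an assumption.
const sampleOutput = '{ "agents": [{ "name": "support-bot", "path": "tracebound/support-bot" }] }';
console.log(parseAgentNames(sampleOutput).join(", ")); // support-bot
```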

`tracebound validate`

Checks structure, schemas, and invariants for one agent:

Required files and folders exist (tracebound.config.md, failure_modes.json, traces/, failure_modes/, adapters/).
failure_modes.json parses against the FailureModesFile schema.
Every traces/*.jsonl line parses against the Trace schema.
Failure mode ids are unique.
affectedTraces[] entries point to files and trace ids that exist.
Bidirectional links: every affectedTraces[n].traceId has a backlink in failureModeIds[], and vice versa.
No duplicate (filename, traceId) entries within a failure mode.

Options:
  -a, --agent <name>   Required.
  -C, --cwd <path>     Directory to validate (default: cwd)
  --json               Emit a structured JSON report

Exit codes:
  0   passed
  1   validation issues found
  2   could not run (IO error, missing or unknown --agent)
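
The link invariants in the list above can be pictured with a short sketch. This is not the CLI's implementation, only an illustration of the duplicate, existence, and bidirectional-link checks over already-parsed data; checkLinks and the reduced interfaces are assumptions.

```typescript
interface TraceRef { filename: string; traceId: string }
interface FailureModeLinks { id: string; affectedTraces: TraceRef[] }
interface TraceLinks { id: string; failureModeIds: string[] }

// Maps each traces/*.jsonl filename to the traces parsed from it.
type TracesByFile = Record<string, TraceLinks[]>;

function checkLinks(failureModes: FailureModeLinks[], tracesByFile: TracesByFile): string[] {
  const issues: string[] = [];

  // Forward direction: failure mode -> trace.
  for (const fm of failureModes) {
    const seen: Record<string, boolean> = {};
    for (const ref of fm.affectedTraces) {
      const key = ref.filename + "\u0000" + ref.traceId;
      if (seen[key]) issues.push(`${fm.id}: duplicate entry ${ref.filename}#${ref.traceId}`);
      seen[key] = true;

      const trace = (tracesByFile[ref.filename] || []).filter((t) => t.id === ref.traceId)[0];
      if (!trace) {
        issues.push(`${fm.id}: ${ref.filename}#${ref.traceId} does not exist`);
      } else if (trace.failureModeIds.indexOf(fm.id) === -1) {
        issues.push(`${fm.id}: ${ref.traceId} has no backlink in failureModeIds`);
      }
    }
  }

  // Reverse direction: trace -> failure mode.
  for (const filename of Object.keys(tracesByFile)) {
    for (const trace of tracesByFile[filename] || []) {
      for (const fmId of trace.failureModeIds) {
        const fm = failureModes.filter((f) => f.id === fmId)[0];
        const linked =
          fm !== undefined &&
          fm.affectedTraces.some((r) => r.filename === filename && r.traceId === trace.id);
        if (!linked) issues.push(`${trace.id}: ${fmId} does not list this trace`);
      }
    }
  }
  return issues;
}

const issues = checkLinks(
  [{ id: "fm_a", affectedTraces: [{ filename: "t.jsonl", traceId: "trc_1" }] }],
  { "t.jsonl": [{ id: "trc_1", failureModeIds: [] }] }
);
console.log(issues.length); // 1
```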

`tracebound status`

Prints catalogue health for one agent: failure-mode counts by lifecycle status, recently updated failure modes, specs awaiting approval, and per-file trace counts.

Options:
  -a, --agent <name>   Required.
  -C, --cwd <path>     Directory to inspect (default: cwd)
  --json               Emit structured JSON

Exit codes:
  0   report printed
  2   could not run
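
Two of the figures in the status report, counts by lifecycle status and recently updated failure modes, are simple aggregations over failure_modes.json. The sketch below shows the idea; it is not the CLI's code, and countByStatus and recentlyUpdated are illustrative names.

```typescript
interface FailureModeSummary {
  id: string;
  status: string;
  lastUpdated: string; // ISO 8601 timestamp, as in the example catalogue
}

// Tallies failure modes per lifecycle status.
function countByStatus(failureModes: FailureModeSummary[]): Record<string, number> {
  return failureModes.reduce<Record<string, number>>((counts, fm) => {
    counts[fm.status] = (counts[fm.status] || 0) + 1;
    return counts;
  }, {});
}

// Returns up to `limit` failure modes, most recently updated first.
function recentlyUpdated(failureModes: FailureModeSummary[], limit: number): FailureModeSummary[] {
  return failureModes
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => Date.parse(b.lastUpdated) - Date.parse(a.lastUpdated))
    .slice(0, limit);
}

const catalogue: FailureModeSummary[] = [
  { id: "fm_a", status: "discovered", lastUpdated: "2026-04-26T14:30:00Z" },
  { id: "fm_b", status: "verified", lastUpdated: "2026-04-28T09:00:00Z" },
  { id: "fm_c", status: "discovered", lastUpdated: "2026-04-27T11:15:00Z" },
];

console.log(JSON.stringify(countByStatus(catalogue))); // {"discovered":2,"verified":1}
console.log(recentlyUpdated(catalogue, 2).map((fm) => fm.id).join(", ")); // fm_b, fm_c
```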

`tracebound trace get <id>`

Scans all traces/*.jsonl files for one agent and prints the first trace whose id matches.

Options:
  -a, --agent <name>   Required.
  -C, --cwd <path>     Directory to inspect (default: cwd)
  --json               Emit the raw JSON object

Exit codes:
  0   found
  1   not found
  2   could not run
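
The first-match lookup described above can be sketched over in-memory file contents. The alphabetical file order and the skipping of blank lines are assumptions, since the README does not specify either, and findTrace is an illustrative name rather than the CLI's internals.

```typescript
interface FoundTrace {
  filename: string;
  trace: { id: string };
}

// `files` maps a JSONL filename to its raw text content.
function findTrace(files: Record<string, string>, id: string): FoundTrace | undefined {
  // Assumption: files are scanned in alphabetical order.
  const filenames = Object.keys(files).sort();
  for (const filename of filenames) {
    const lines = (files[filename] || "").split("\n");
    for (const line of lines) {
      if (line.trim() === "") continue; // tolerate a trailing newline
      const trace = JSON.parse(line) as { id: string };
      if (trace.id === id) return { filename, trace };
    }
  }
  return undefined;
}

const found = findTrace(
  { "langfuse-2026-04-26.jsonl": '{"id":"trc_abc123","input":"Can you cancel my order?"}\n' },
  "trc_abc123"
);
console.log(found ? found.filename : "not found"); // langfuse-2026-04-26.jsonl
```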

`tracebound fm get <id>`

Looks up a failure mode by id in one agent's failure_modes.json and prints it.

Options:
  -a, --agent <name>   Required.
  -C, --cwd <path>     Directory to inspect (default: cwd)
  --json               Emit the raw JSON object

Exit codes:
  0   found
  1   not found
  2   could not run
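
The exit codes listed for the commands above follow one convention, which a wrapper script or CI step could encode as sketched below. The Outcome labels and the function names are invented for illustration, and commands that list no exit codes (such as init) are outside what this sketch can claim.

```typescript
type Outcome = "success" | "negative" | "could-not-run" | "unexpected";

// Maps an exit code to the meaning shared by the documented commands.
function classifyExit(code: number): Outcome {
  switch (code) {
    case 0:
      return "success"; // passed, found, or report printed
    case 1:
      return "negative"; // validation issues found, or id not found
    case 2:
      return "could-not-run"; // IO error, missing or unknown --agent
    default:
      return "unexpected";
  }
}

// Hypothetical CI policy: anything other than a clean result fails the build.
function shouldFailBuild(code: number): boolean {
  return classifyExit(code) !== "success";
}

console.log(classifyExit(1));    // negative
console.log(shouldFailBuild(0)); // false
```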

Skills reference

Skills are instruction files for your AI coding agent. Drop the skills/ folder into your agent's context or reference individual files.

Every skill takes the agent name as a required input. Pass it explicitly in the trigger phrase (e.g. "for support-bot"); otherwise the skill runs tracebound agents and asks which one to use.

| Skill | Trigger phrase | What it does |
| --- | --- | --- |
| `analyze-traces` | "Analyze `traces/foo.jsonl` for support-bot" | Clusters negatively-signalled traces into failure modes; writes `failure_modes.json`. |
| `research-failure-mode` | "Research `fm_…` for support-bot" | Investigates root cause, reads source, drafts `SPEC.md`. Read-only against agent code. |
| `implement-failure-mode` | "Implement `fm_…` for support-bot" | Applies an approved spec, runs sanity checks, verifies fix against the live agent. |
| `create-adapter` | "Create an adapter for support-bot from this Langfuse export" | Generates a converter script (per-agent) from a sample of your telemetry data. |

All skills:

Run tracebound validate --agent <name> as a preflight check and refuse to proceed on a broken catalogue.
Quote the Hard rules from tracebound/<agent>/tracebound.config.md before doing any work.
Operate on one agent at a time — they never read or write under another agent's directory.
Never commit, push, or open PRs — they leave the working tree ready and stop.

License

MIT
