🇨🇳 简体中文 · 🇯🇵 日本語 · 🇰🇷 한국어 · 🇪🇸 Español · 🇫🇷 Français · 🇩🇪 Deutsch · 🇵🇹 Português · 🇷🇺 Русский
In LLM training, the aha moment is when a model suddenly learns to reason.
For agents, the aha moment is when they go from "demo-ready" to truly reliable.
The gap is enormous: context management, tool governance, cost control, observability, session persistence... These are the patterns that separate a toy from a system. We call this harness engineering.
AutoHarness is a lightweight governance framework so every agent can have its aha moment.
Agent = Model + Harness. The model reasons. The harness does everything else.
```bash
git clone https://github.com/aiming-lab/AutoHarness.git
cd AutoHarness && pip install -e .
```

```python
from openai import OpenAI
from autoharness import AutoHarness

client = AutoHarness.wrap(OpenAI())
# That's it. Your agent just had its aha moment.
```

- [04/01/2026] v0.1.0 Released: Three-tier pipeline modes (Core / Standard / Enhanced), 6-step governance pipeline, risk pattern matching, YAML constitution, trace-based diagnostics, multi-agent profiles, session persistence with cost tracking. 958 tests passing.
```python
# Wrap any LLM client (2 lines, instant governance)
from openai import OpenAI
from autoharness import AutoHarness

client = AutoHarness.wrap(OpenAI())
response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Refactor auth.py"}],
    tools=[{"type": "function", "function": {"name": "Bash", "description": "Run shell commands",
            "parameters": {"type": "object", "properties": {"command": {"type": "string"}}}}}],
)
```

```python
# Or use the full agent loop
from autoharness import AgentLoop

loop = AgentLoop(model="gpt-5.4", constitution="constitution.yaml")
result = loop.run("Fix the failing tests in auth.py")
```

AutoHarness supports three pipeline modes. Choose the level of governance that fits your needs:
| Mode | Pipeline | Hooks | Multi-Agent | Use Case |
|---|---|---|---|---|
| Core | 6-step | Secret scanner + path guard + output sanitizer | Single agent | Lightweight governance |
| Standard | 8-step | + Risk classifier + pre-hooks | Basic profiles | Production agents |
| Enhanced | 14-step | + Turn governor + alias resolution + failure hooks | Fork / Swarm / Background | Maximum governance |
```yaml
# Switch modes via constitution
# constitution.yaml
mode: core  # or "standard" or "enhanced"
```

```bash
# Or via CLI
autoharness mode enhanced
```

> ⚠️ Enhanced is the default mode, so users get the strongest governance out of the box. Switch to Core for minimal overhead.
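Beyond the `mode` key shown above, this README does not document the rest of the constitution schema, so the sketch below is purely illustrative: every field name other than `mode` (`permissions`, `risk`, `budget`, and so on) is an assumption about what a governance constitution could plausibly express, not confirmed AutoHarness syntax.

```yaml
# constitution.yaml — illustrative sketch only.
# Only `mode` appears in the docs above; all other fields are hypothetical.
mode: standard

permissions:          # hypothetical: which tools an agent may call
  allow: [Read, Bash]
  deny: [Network]

risk:                 # hypothetical: extra patterns to block outright
  deny_patterns:
    - "rm -rf /"
    - "curl * | sh"

budget:               # hypothetical: context and cost ceilings
  max_tokens: 120000
  max_cost_usd: 5.00
```

Check any real file against the actual schema with `autoharness validate constitution.yaml`.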
| Without Harness | With AutoHarness |
|---|---|
| Agent runs `rm -rf /`, nothing stops it | 6-step pipeline blocks it, logs it, explains why |
| Context explodes past token limit | Token budget + truncation keeps context under control |
| No idea which tool call cost how much | Per-call cost attribution with model-aware pricing |
| Prompt injection sneaks through | Layered validation: input rails, execution, output rails |
| No audit trail for compliance | JSONL audit logs every decision with full provenance |
| Agents share one permission set | Multi-agent profiles with role-based governance |
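The per-call cost attribution row above can be pictured with a small sketch. Everything here is an assumption for illustration: the pricing numbers are made up, and `call_cost`, `record`, and the `ledger` structure are hypothetical helpers, not the AutoHarness API.

```python
# Illustrative sketch of per-call cost attribution: multiply prompt and
# completion token counts by model-specific rates, one ledger entry per call.

# Hypothetical per-million-token prices; real pricing varies by provider.
PRICING = {
    "gpt-5.4": {"prompt": 2.50, "completion": 10.00},  # USD per 1M tokens
}

def call_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """USD cost of one call under the assumed pricing table."""
    rates = PRICING[model]
    return (prompt_tokens * rates["prompt"]
            + completion_tokens * rates["completion"]) / 1_000_000

ledger = []  # one entry per call, so spend can be attributed call by call

def record(model, prompt_tokens, completion_tokens, label):
    cost = call_cost(model, prompt_tokens, completion_tokens)
    ledger.append({"label": label, "cost_usd": round(cost, 6)})
    return cost

record("gpt-5.4", 1200, 300, "Refactor auth.py")
total = sum(entry["cost_usd"] for entry in ledger)
```

Summing the ledger by label then answers "which tool call cost how much" directly.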
Every tool call flows through a structured pipeline:
1. Parse & Validate → 2. Risk Classify → 3. Permission Check
4. Execute → 5. Output Sanitize → 6. Audit Log
Built-in risk patterns detect dangerous operations, secret exposure, path traversal, and more.
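The six steps above can be sketched as a single guard function. This is a toy model under stated assumptions, not the real implementation: the pattern list is a tiny illustrative subset, the permission model is a plain set, execution is stubbed, and the audit log is an in-memory list standing in for a JSONL file.

```python
import json
import re

# Toy risk patterns (the real built-in set is much richer).
RISK_PATTERNS = [
    (re.compile(r"rm\s+-rf\s+/"), "destructive-delete"),
    (re.compile(r"sk-[A-Za-z0-9]{20,}"), "secret-exposure"),
    (re.compile(r"\.\./"), "path-traversal"),
]

AUDIT_LOG = []  # stand-in for a JSONL audit file

def govern(tool: str, args: dict, allowed_tools=frozenset({"Bash", "Read"})):
    call = json.dumps({"tool": tool, "args": args})
    # 1. Parse & Validate
    if not tool or not isinstance(args, dict):
        return _audit(call, "reject", "malformed call")
    # 2. Risk Classify
    for pattern, label in RISK_PATTERNS:
        if pattern.search(call):
            return _audit(call, "block", f"risk pattern: {label}")
    # 3. Permission Check
    if tool not in allowed_tools:
        return _audit(call, "block", "tool not permitted")
    # 4. Execute (stubbed here)
    output = f"<ran {tool}>"
    # 5. Output Sanitize: redact anything secret-shaped in the result
    output = re.sub(r"sk-[A-Za-z0-9]{20,}", "[REDACTED]", output)
    # 6. Audit Log
    return _audit(call, "allow", "ok", output)

def _audit(call, decision, reason, output=None):
    AUDIT_LOG.append({"call": call, "decision": decision, "reason": reason})
    return {"decision": decision, "reason": reason, "output": output}
```

Under this sketch, `govern("Bash", {"command": "rm -rf /"})` is blocked with an explanation while `govern("Bash", {"command": "ls"})` is allowed, and both land in the audit log.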
6-step governance pipeline · Risk pattern matching · YAML constitution
Token budget management · Multi-agent profiles · JSONL audit trail
2 lines to integrate · 0 vendor lock-in · MIT licensed
```bash
autoharness init                          # Generate constitution (default/strict/soc2/hipaa/financial)
autoharness init --mode core              # Generate with specific pipeline mode
autoharness mode                          # Show current pipeline mode
autoharness mode enhanced                 # Switch pipeline mode
autoharness validate constitution.yaml    # Validate a constitution file
autoharness check --stdin --format json   # Check a tool call against your rules
autoharness audit summary                 # View audit summary
autoharness install --target claude-code  # Install as a Claude Code hook (one command)
autoharness export --format cursor        # Export cross-harness constitution
```

| Capability | AutoHarness | LangGraph | Guardrails AI | OpenAI SDK |
|---|---|---|---|---|
| Tool governance pipeline | ✅ 6-step (up to 14) | ❌ | ❌ | |
| Context management | ✅ Multi-layer | ❌ | ❌ | |
| Multi-agent profiles | ✅ | ✅ Graph | ❌ | |
| Validation (input+output) | ✅ | ❌ | ✅ Rails | ❌ |
| Trace-based diagnostics | ✅ | ❌ | ❌ | ❌ |
| Cost attribution | ✅ Per-call | ❌ | ❌ | ❌ |
| Vendor lock-in | None | LangChain | None | OpenAI |
| Setup | 2 lines | Graph DSL | RAIL XML | SDK |
- Claude Code by Anthropic: engineering patterns that inspired some of our features in the Enhanced mode
- Codex by OpenAI: context engineering practices that informed our context management design
If you use AutoHarness in your research, please cite:
```bibtex
@software{autoharness2026,
  title   = {AutoHarness: The Harness Engineering Framework for AI Agents},
  author  = {{AutoHarness Team}},
  year    = {2026},
  url     = {https://github.com/aiming-lab/AutoHarness},
  license = {MIT}
}
```

Some architectural decisions in the Enhanced mode were informed by publicly available analysis and community discussion of Claude Code's design following its inadvertent publication via Anthropic's npm registry on 2026-03-31. We acknowledge that Claude Code's original source code is the intellectual property of Anthropic. AutoHarness does not contain, redistribute, or directly translate any of Anthropic's proprietary code. We respect Anthropic's IP rights and will promptly address any concerns; please contact us via issue or autoharness.aha@gmail.com.
MIT. See LICENSE for details.

