⚠️ Work in progress. The API is under active development and may change between versions without notice.
Running a single LLM call is easy. Running a reliable, observable, cost-controlled AI system is not.
ActiveHarness is a Ruby framework for building production-grade LLM pipelines — with deep observability, consensus-based decisions, automatic fallbacks, and real-time cost and timing control. Made for Rails, works in plain Ruby too.
The ActiveHarness gem gives you the scaffolding to build multi-step pipelines where every agent is under full control: its inputs are directed, its outputs are observed, its errors are retried, and its cost is tracked. You define the logic; ActiveHarness handles the infrastructure.
A harness in software is scaffolding that keeps a component under control — directing its inputs, observing its outputs, and enforcing rules around it. ActiveHarness does exactly that for LLM agents.
Build multi-step, trackable, cost-effective, and reliable AI flows with a clean, Rails-native DSL.
Group related steps into reusable sub-pipelines, and compose complex workflows from smaller ones. Each pipeline is just another step, with its own stop conditions, context forwarding, and execution time tracking.
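The idea that a pipeline is just another step can be sketched in plain Ruby. This is an illustration only and not the ActiveHarness DSL: `SketchPipeline`, `step`, `stop_if`, and the context hash are hypothetical names chosen for this example, and lambdas stand in for agent calls.

```ruby
# Illustrative sketch only: SketchPipeline is a hypothetical class,
# not part of the ActiveHarness API.
class SketchPipeline
  Step = Struct.new(:name, :callable, :stop_if)

  attr_reader :name, :timings

  def initialize(name)
    @name = name
    @steps = []
    @timings = {}
  end

  # Register a step: anything that responds to #call(context),
  # including another SketchPipeline.
  def step(name, callable, stop_if: nil)
    @steps << Step.new(name, callable, stop_if)
    self
  end

  # Run steps in order, forwarding the context hash from step to step
  # and recording how long each step took (in seconds).
  def call(context = {})
    @steps.each do |s|
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      context = s.callable.call(context)
      @timings[s.name] = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      break if s.stop_if&.call(context)
    end
    context
  end
end

# Deterministic steps stand in for agent calls here.
normalize = ->(ctx) { ctx.merge(text: ctx[:text].strip.downcase) }
classify  = ->(ctx) { ctx.merge(spam: ctx[:text].include?("free money")) }
reply     = ->(ctx) { ctx.merge(reply: "Thanks for your message!") }

# A reusable sub-pipeline
triage = SketchPipeline.new(:triage)
  .step(:normalize, normalize)
  .step(:classify, classify)

# used as a single step of a larger one, with its own stop condition.
support = SketchPipeline.new(:support)
  .step(:triage, triage, stop_if: ->(ctx) { ctx[:spam] })
  .step(:reply, reply)

result = support.call(text: "  FREE MONEY inside  ")
```

Because the stop condition on the `triage` step fires for this input, the `reply` step never runs and `result` carries only the forwarded context from the sub-pipeline.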
Orchestrate deterministic and AI steps together.
With ActiveHarness you can track time, tokens, and dollars for every agent call, pipeline step, and tribunal.
Screenshots: Cost in Application, Provider's Cost.
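Tracking time, tokens, and dollars per call comes down to a small amount of arithmetic, sketched here in plain Ruby. Everything in this block is an assumption for illustration: the `Usage` struct, the `track` helper, the model name, and the prices are made up and are neither the ActiveHarness API nor real provider prices.

```ruby
# Illustrative sketch only. The model name and prices are hypothetical.
PRICES_PER_MILLION_TOKENS = {
  "example-model" => { input: 3.00, output: 15.00 } # USD per 1M tokens, made up
}.freeze

Usage = Struct.new(:model, :input_tokens, :output_tokens, :seconds, keyword_init: true) do
  # Dollars = tokens * price per million tokens / 1,000,000.
  def cost_usd
    price = PRICES_PER_MILLION_TOKENS.fetch(model)
    (input_tokens * price[:input] + output_tokens * price[:output]) / 1_000_000.0
  end
end

# Wrap a call, measure wall-clock time, and record the token counts
# that the block reports back.
def track(model)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  tokens = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  Usage.new(model: model, input_tokens: tokens[:input],
            output_tokens: tokens[:output], seconds: elapsed)
end

# The block stands in for a real model call that returns token counts.
usage = track("example-model") { { input: 1_200, output: 300 } }
total = usage.cost_usd.round(6)
```

Summing `cost_usd` and `seconds` over the `Usage` records of a pipeline gives per-pipeline totals in the same way.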
Use Tribunals to run multiple agents in parallel and make Verdicts based on their agreement — improving reliability and reducing biases and hallucinations.
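A majority verdict over parallel agents can be sketched as follows. This is a plain-Ruby illustration, not the ActiveHarness Tribunal API: `majority_verdict` is a hypothetical helper and the judges are lambdas standing in for agents.

```ruby
# Illustrative sketch only: majority_verdict is a hypothetical helper.
def majority_verdict(judges, input)
  # Run every judge in its own thread, then collect the answers.
  answers = judges.map { |judge| Thread.new { judge.call(input) } }.map(&:value)

  tally = answers.tally
  winner, votes = tally.max_by { |_answer, count| count }
  agreed = votes > judges.size / 2.0 # strict majority

  { verdict: agreed ? winner : nil, agreed: agreed, votes: tally }
end

judges = [
  ->(text) { text.include?("refund") ? :billing : :other },
  ->(text) { text.match?(/refund|invoice/) ? :billing : :other },
  ->(_text) { :other } # a dissenting judge
]

result = majority_verdict(judges, "I would like a refund")
```

When no answer reaches a strict majority, the sketch returns a `nil` verdict so the caller can decide whether to retry, escalate, or fall back.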
Use the power of event hooks to log and trace every step of your AI flows, from individual agent calls to multi-step pipelines and parallel tribunals.
Figures: Event Tracing Architecture, Grafana Dashboard.
Backend Agnostic — Built on OpenTelemetry, ready for any collector (Jaeger, Datadog, Honeycomb, or custom).
Store conversation history in JSON files, SQLite, or PostgreSQL. Inject memory into prompts to build agents that remember past interactions.
Screenshots: Rails App, Console.
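Storing history in a JSON file and injecting it into a prompt can be sketched like this. The class below is a hypothetical stand-in written for this example (`SketchJsonMemory`, `append`, and `to_prompt` are assumed names), not one of the memory backends that ship with the gem.

```ruby
require "json"
require "tmpdir"

# Illustrative sketch only: SketchJsonMemory is a hypothetical class.
class SketchJsonMemory
  def initialize(path)
    @path = path
  end

  # All stored messages, oldest first.
  def messages
    File.exist?(@path) ? JSON.parse(File.read(@path), symbolize_names: true) : []
  end

  # Append one message and rewrite the file.
  def append(role, content)
    updated = messages << { role: role.to_s, content: content }
    File.write(@path, JSON.pretty_generate(updated))
    updated
  end

  # Render the last `limit` messages as text that can be placed in a prompt.
  def to_prompt(limit: 10)
    messages.last(limit).map { |m| "#{m[:role]}: #{m[:content]}" }.join("\n")
  end
end

path = File.join(Dir.mktmpdir, "conversation.json")
memory = SketchJsonMemory.new(path)
memory.append(:user, "My name is Ada.")
memory.append(:assistant, "Nice to meet you, Ada.")

prompt = "Conversation so far:\n#{memory.to_prompt}\n\nuser: What is my name?"
```

The `limit` argument is one simple way to keep the injected history within a model's context window; a real backend might filter by tokens instead of message count.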
| Capability | What it means |
|---|---|
| Multi-step Pipelines | Chain agents sequentially, with per-step stop conditions and context forwarding |
| Tribunal Consensus | Run multiple agents in parallel and accept the result only if they agree (unanimous, majority, or custom) |
| Automatic Fallbacks | If a model fails, the next one in the chain takes over — zero extra code |
| Retry Policy | Exponential backoff per model, globally configurable or per-agent |
| Full Observability | Lifecycle hooks on every agent event: before_call, after_call, retry, failure — log, stream, or act |
| Real-time Streaming | SSE-ready token streaming from any agent into your Rails response |
| Execution Time Tracking | Per-agent and per-pipeline timing built in |
| Token Cost Tracking | Know exactly what each call cost in tokens and dollars |
| Rails-native DSL | Clean file structure, Railtie integration, generator support |
| Event Tracing | OpenTelemetry integration for distributed tracing of agents, tribunals, and pipelines |
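
The fallback, retry, and lifecycle-hook rows of the table can be illustrated together in plain Ruby. The event names (`before_call`, `after_call`, `retry`, `failure`) come from the table; everything else is an assumption for this sketch, including the `call_with_fallbacks` helper, its parameters, and the lambdas standing in for models. It is not the ActiveHarness API.

```ruby
# Illustrative sketch only: call_with_fallbacks is a hypothetical helper.
# `models` is an ordered hash of name => callable; `hooks` maps event names
# to callables that receive a payload hash.
def call_with_fallbacks(models, prompt, max_retries: 2, base_delay: 0.01, hooks: {})
  emit = ->(event, **payload) { hooks[event]&.call(payload) }

  models.each do |name, model|
    attempt = 0
    begin
      emit.call(:before_call, model: name, attempt: attempt)
      answer = model.call(prompt)
      emit.call(:after_call, model: name, attempt: attempt)
      return { model: name, answer: answer }
    rescue StandardError => e
      if attempt < max_retries
        sleep(base_delay * (2**attempt)) # exponential backoff: 1x, 2x, 4x, ...
        attempt += 1
        emit.call(:retry, model: name, attempt: attempt, error: e.message)
        retry
      end
      # Retries exhausted: report the failure and move on to the next model.
      emit.call(:failure, model: name, error: e.message)
    end
  end

  raise "all models failed"
end

flaky  = ->(_prompt) { raise "timeout" }      # always fails
backup = ->(prompt) { "echo: #{prompt}" }     # always succeeds

# Record every event as [event, model] so the flow can be inspected.
events = []
hooks = %i[before_call after_call retry failure].to_h do |event|
  [event, ->(payload) { events << [event, payload[:model]] }]
end

result = call_with_fallbacks({ primary: flaky, backup: backup }, "ping", hooks: hooks)
```

In this run the primary model is retried twice with growing delays, reports a `failure`, and the backup model answers. The same hooks could write to a logger or open tracing spans instead of appending to an array.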
File structure for Ruby and Ruby on Rails applications:
Place all of your AI-related code in app/ai to keep it organized and separate from your core application logic. You can further organize it into subdirectories for prompts, agents, tribunals, pipelines, and memory.
app/
├── models/
├── controllers/
├── views/
└── ai/
    ├── prompts/    # system prompt classes
    ├── agents/     # agent classes
    ├── tribunals/  # parallel verdict panels
    ├── pipelines/  # multi-step pipelines
    └── memory/     # custom memory classes
- How Prompts Work
- Minimal Prompt
- Multi-line Prompt
- Dynamic Content with @context
- Tuning with @params
- Using @input in the Prompt
- JSON Output Prompt
- Using @memory for Conversation History
- Respecting @context_window
- Generator
- How to Create Your First Agent in 5 Minutes
- How to Provide Fallbacks
- Model Options
- Modifying the Model Chain at Runtime
- How to Track Retries and Failures
- How to Use with RubyLLM
- JSON Output and Parsing
- Lifecycle Events
- Custom Providers
- Streaming in the Console
- Streaming in a Rails App
- How to Create Your First Pipeline in 5 Minutes
- Step Types
- Stop Conditions
- Tribunal Step
- Context: Accessing Previous Step Results
- Lifecycle Events
- Event Streams
- Full Example
- How It Works
- Minimal Example
- Stopping the Outer Pipeline from the Inside
- transform — Why It Is Required
- Accessing Inner Results from the Outer Context
- Event Streams
- Multiple Levels of Nesting
- Reference
- How to Create Your First Tribunal in 5 Minutes
- Tribunal from Different Agents
- Verdict Strategies
- Custom Verdict Logic
- Tolerating Partial Failures
- Same Agent, Different Models
- Runtime Model Prepend per Agent
- Direct Usage
- Lifecycle Events
- How Memory Works
- JsonFile Memory
- Custom Memory Class
- Managing Memory via Agent Callbacks
- Injection Patterns
- Filtering History with to_messages
- Memory API Reference
- Memory with namespace:
- Sharing Memory Across Pipeline Steps
- PostgreSQL Backend
- SQLite Backend
- Installation
- Rails Setup
- Plain Ruby Setup
- Configuration
- Providers Reference
- Global Settings
- Generators
- How It Works
- Simple Logging with Hooks
- OpenTelemetry Setup
- AgentTracing Concern
- TribunalTracing Concern
- PipelineTracing Concern
- Span Hierarchy
- Connecting to a Backend
MIT © the-teacher







