
AI Harness

A plug-and-play LLM connectivity layer for TypeScript applications. Clone, configure one API key, and you have a working streaming chat connected to any major model provider — with tool calling, observability, MCP integration, and local inference support included from the start.

The harness solves the startup tax that every LLM-powered project pays: provider wiring, streaming transport, tool schemas, system prompt coupling, observability, and web capabilities. Pay it once here; every future app starts at application logic, not plumbing.


What's included

Provider abstraction (src/ai/provider.ts)

Single getModel(key) call returns a configured AI SDK v6 model. Supports:

| Provider  | Key         | Notes                                   |
| --------- | ----------- | --------------------------------------- |
| Anthropic | anthropic   | Claude models                           |
| OpenAI    | openai      | GPT models                              |
| Google    | google      | Gemini models                           |
| Ollama    | ollama      | Local inference via localhost:11434/v1  |
| LM Studio | lmstudio    | Local inference via localhost:1234/v1   |

Local providers use @ai-sdk/openai-compatible pointed at the standard OpenAI-compatible endpoint. Swapping between local and cloud requires only a different key — no code changes.
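The key-to-provider resolution can be pictured with a minimal sketch. This is illustrative only: the real src/ai/provider.ts returns a configured AI SDK model, while the descriptor shape, `resolveProvider`, and `PROVIDERS` names here are invented for the example.

```typescript
// Illustrative sketch — the actual provider.ts returns an AI SDK model;
// here each key resolves to an endpoint / env-var descriptor instead.
type ProviderKey = "anthropic" | "openai" | "google" | "ollama" | "lmstudio";

interface ProviderDescriptor {
  apiKeyEnv?: string; // env var holding the API key (cloud providers)
  baseURL?: string;   // OpenAI-compatible endpoint (local providers)
}

const PROVIDERS: Record<ProviderKey, ProviderDescriptor> = {
  anthropic: { apiKeyEnv: "ANTHROPIC_API_KEY" },
  openai:    { apiKeyEnv: "OPENAI_API_KEY" },
  google:    { apiKeyEnv: "GOOGLE_API_KEY" },
  ollama:    { baseURL: "http://localhost:11434/v1" },
  lmstudio:  { baseURL: "http://localhost:1234/v1" },
};

function resolveProvider(key: ProviderKey): ProviderDescriptor {
  const descriptor = PROVIDERS[key];
  if (!descriptor) throw new Error(`Unknown provider key: ${key}`);
  return descriptor;
}
```

Because local providers differ only in their descriptor, swapping ollama for anthropic is a data change, not a code change.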

Streaming chat API (src/app/api/chat/route.ts)

Next.js App Router route using AI SDK v6 streamText. Handles:

  • Provider and model selection per request
  • System prompt construction with live data injection
  • Role-based tool selection (dev vs app)
  • Multi-step agent loops (up to 50 steps) with automatic continuation support
  • Structured telemetry via experimental_telemetry

Step budget and continuation: The agent has a 50-step tool-call budget per turn. The system prompt instructs the model to plan first, work incrementally, and checkpoint progress. When a turn ends with tool calls still in flight, the frontend shows a "Continue" button so the user can seamlessly resume multi-step tasks across turns.
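The frontend's decision to show the "Continue" button can be sketched as a pure check on the last message. The part shape below is a simplified stand-in for the AI SDK's UIMessage parts, and `needsContinuation` is a hypothetical name, not the actual frontend API.

```typescript
// Illustrative sketch — part shapes are simplified stand-ins for the
// AI SDK's UIMessage parts; the "state" field is an assumption.
interface MessagePart {
  type: "text" | "tool-call";
  state?: "done" | "in-progress";
}

// Show the "Continue" banner when the turn ended with a tool call still
// pending — which is what hitting the 50-step budget mid-task looks like.
function needsContinuation(lastParts: MessagePart[]): boolean {
  const last = lastParts[lastParts.length - 1];
  return last?.type === "tool-call" && last.state !== "done";
}
```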

Gemini thought signature handling: The route uses convertToModelMessages() (AI SDK's own conversion function) rather than manually constructing history. This preserves provider-specific metadata — including Gemini 3's required thought signatures — through the UIMessage round-trip. Do not add custom history transformation between convertToModelMessages() and streamText without verifying signatures survive.

Tool layer (src/ai/tools/)

All tools use tool() from AI SDK v6 with Zod schemas. Registered in index.ts by role:

Dev tools (shell, file-read, file-write) — privileged tools for development-mode chat. These execute shell commands and read/write the filesystem. Currently run in the Next.js API route process; in the Tauri phase they move to Rust Tauri commands.

Sandbox enforcement: All file writes are automatically routed to the development/ directory. The model cannot modify harness source code, configuration, or anything outside the sandbox — paths are rewritten via sandboxPath(), not just blocked. File reads can access the full project for context, but .env / .env.local are blocked to protect secrets. See protected-paths.ts for the implementation.
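The rewrite-not-block idea can be sketched with plain path manipulation. This is a minimal illustration of the concept, assuming the function and directory names from the description above; the actual protected-paths.ts implementation may differ in detail.

```typescript
import path from "node:path";

// Illustrative sketch of sandboxPath(): every write path is re-rooted under
// development/ regardless of what the model asked for, so traversal or
// absolute paths cannot escape the sandbox.
const SANDBOX_ROOT = path.resolve("development");

function sandboxPath(requested: string): string {
  // Strip leading "../" traversal and absolute-path prefixes, then re-root.
  const relative = path
    .normalize(requested)
    .replace(/^(\.\.(\/|\\|$))+/, "")
    .replace(/^[/\\]+/, "");
  return path.join(SANDBOX_ROOT, relative);
}
```

With this shape, a request for ../../etc/passwd silently becomes development/etc/passwd instead of being rejected, which keeps the agent loop moving.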

Web tools (web-search, web-ingest) — conditionally registered based on available API keys:

  • Web search activates with TAVILY_API_KEY or BRAVE_SEARCH_API_KEY
  • Web ingestion activates when Crawl4AI is installed locally (pip install crawl4ai)
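The key-gated activation above amounts to a conditional registry. A sketch, with tool values stubbed as strings (the real registry holds AI SDK tool() objects) and `registerWebTools` as a hypothetical name:

```typescript
// Illustrative sketch of key-gated tool registration — values are stubs.
type Env = Record<string, string | undefined>;

function registerWebTools(
  env: Env,
  crawl4aiInstalled: boolean,
): Record<string, string> {
  const tools: Record<string, string> = {};
  // Search activates with either provider's key.
  if (env.TAVILY_API_KEY || env.BRAVE_SEARCH_API_KEY) {
    tools["web-search"] = "web-search tool";
  }
  // Ingestion activates only when Crawl4AI is installed locally.
  if (crawl4aiInstalled) {
    tools["web-ingest"] = "web-ingest tool";
  }
  return tools;
}
```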

App tools — empty by default; add project-specific tools here.
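The role split in index.ts can be pictured as a layered merge: the dev role gets the privileged tools on top of whatever the app role exposes. A sketch with stubbed tool values; `toolsForRole` is an illustrative name, not the registry's actual export.

```typescript
// Illustrative sketch of role-based tool selection — values are stubs.
type Role = "dev" | "app";

const APP_TOOLS: Record<string, string> = {}; // empty by default
const DEV_TOOLS: Record<string, string> = {
  shell: "shell",
  "file-read": "file-read",
  "file-write": "file-write",
};

function toolsForRole(role: Role): Record<string, string> {
  return role === "dev" ? { ...APP_TOOLS, ...DEV_TOOLS } : { ...APP_TOOLS };
}
```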

MCP integration (src/ai/mcp.ts)

MCPHost class manages connections to one or more MCP servers and exposes their combined tool surface to the AI SDK orchestration layer:

```typescript
const host = new MCPHost();
await host.connect('http://localhost:3001/sse', 'my-server');
await host.connect('http://localhost:3002/sse', 'another-server');

const tools = host.getTools(); // merged tool surface from both servers
// pass to streamText({ tools: { ...appTools, ...tools } })

await host.close();
```

MCP handles the tool surface. It does not replace provider SDKs or the orchestration loop.
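One way to merge multiple servers' tools without name collisions is to prefix each tool with its server name. This is an assumption about the merge strategy, not the documented behavior of mcp.ts; tool values are stubbed as strings.

```typescript
// Illustrative sketch of the merge a multi-server MCP host performs —
// prefixing with the server name avoids collisions when two servers
// expose a tool with the same name. Values are stubs.
function mergeToolSurfaces(
  servers: Record<string, Record<string, string>>,
): Record<string, string> {
  const merged: Record<string, string> = {};
  for (const [serverName, tools] of Object.entries(servers)) {
    for (const [toolName, tool] of Object.entries(tools)) {
      merged[`${serverName}_${toolName}`] = tool;
    }
  }
  return merged;
}
```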

Chat UI (src/components/chat/)

React component using useChat from @ai-sdk/react. Features:

  • Persistent sidebar layout — chat lives in a fixed sidebar on the left; agent-built content renders in a preview pane on the right. The chat is never displaced by model output.
  • Live preview pane — when the agent writes an HTML file to development/, the preview pane auto-opens with the content in a sandboxed iframe. Includes Reload and Close controls.
  • Continuation support — when a response ends with tool calls still in progress (step limit hit), an amber "Continue" banner appears with one-click resume.
  • Stop button — cancel streaming mid-response.
  • Streaming message display with markdown rendering (remark-gfm)
  • Provider and model selector
  • Tool call visibility: each tool invocation appears inline with tool name, state indicator, and collapsible args/result view
  • Per-message part rendering — text and tool calls appear in order as the model produces them

Observability (src/ai/telemetry.ts)

Langfuse integration via experimental_telemetry. Traces every LLM call with token counts, latency, and tool call chains. Falls back to structured console logging when LANGFUSE_SECRET_KEY is absent — the same interface, no conditional code in callers.
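The fallback pattern — one interface, two backends — can be sketched as follows. Interface and function names are illustrative, and the Langfuse client wiring is elided.

```typescript
// Illustrative sketch: callers get the same trace interface whether or not
// Langfuse is configured, so there is no conditional code at call sites.
interface TraceSink {
  trace(name: string, payload: Record<string, unknown>): void;
}

function createTelemetry(env: Record<string, string | undefined>): TraceSink {
  if (env.LANGFUSE_SECRET_KEY) {
    // Real implementation would construct and forward to the Langfuse client.
    return { trace: (_name, _payload) => {} };
  }
  // Structured console fallback — same interface as the Langfuse path.
  return {
    trace: (name, payload) =>
      console.log(JSON.stringify({ trace: name, ...payload })),
  };
}
```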

System prompt builder (src/ai/system-prompt.ts)

buildSystemPrompt() constructs the system prompt with optional live data injection. Pass application state (database records, user context, current page data) via the data field to keep the model grounded in real app state rather than working from stale context.
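A minimal sketch of the live-data injection idea — the real system-prompt.ts also injects sandbox and step-budget instructions, and its actual signature may differ; the context shape and section wording below are assumptions.

```typescript
// Illustrative sketch: append serialized app state to the base prompt so the
// model works from real data rather than stale context.
interface PromptContext {
  data?: Record<string, unknown>; // live app state to ground the model
}

function buildSystemPrompt(base: string, ctx: PromptContext = {}): string {
  if (!ctx.data || Object.keys(ctx.data).length === 0) return base;
  return `${base}\n\nCurrent application state:\n${JSON.stringify(ctx.data, null, 2)}`;
}
```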


Getting started

```bash
# 1. Install dependencies
npm install

# 2. Configure providers
cp .env.example .env.local
# Edit .env.local — add at least one provider API key

# 3. Run
npm run dev
```

Open http://localhost:3000. The dev chat connects to whichever provider is configured.

Minimum configuration

One provider API key is all that's required:

```bash
ANTHROPIC_API_KEY=sk-ant-...
```

Progressive activation

Every optional capability activates by adding its env key — no code changes required:

| Capability          | Env key(s) required                          |
| ------------------- | -------------------------------------------- |
| Anthropic           | ANTHROPIC_API_KEY                            |
| OpenAI              | OPENAI_API_KEY                               |
| Google Gemini       | GOOGLE_API_KEY                               |
| Ollama (local)      | Running at localhost:11434 — no key needed   |
| LM Studio (local)   | Running at localhost:1234 — no key needed    |
| Observability       | LANGFUSE_SECRET_KEY + LANGFUSE_PUBLIC_KEY    |
| Web search (Tavily) | TAVILY_API_KEY                               |
| Web search (Brave)  | BRAVE_SEARCH_API_KEY                         |
| Web ingestion       | pip install crawl4ai locally                 |

Architecture

```
src/
├── ai/                        # THE HARNESS — extend, don't rewrite
│   ├── provider.ts            # getModel(key) → AI SDK model
│   ├── types.ts               # ChatConfig, ProviderKey, SystemPromptContext
│   ├── system-prompt.ts       # Prompt builder with sandbox + step budget instructions
│   ├── telemetry.ts           # Langfuse / console fallback
│   ├── mcp.ts                 # MCPHost — multi-server MCP client/host
│   └── tools/
│       ├── index.ts           # Tool registry by role
│       ├── protected-paths.ts # Sandbox enforcement (all writes → development/)
│       ├── shell.ts           # [PRIVILEGED] Shell execution
│       ├── file-read.ts       # [PRIVILEGED] Filesystem read (secrets blocked)
│       ├── file-write.ts      # [PRIVILEGED] Filesystem write (sandboxed)
│       ├── web-search.ts      # Tavily / Brave search
│       └── web-ingest.ts      # Crawl4AI URL ingestion
├── app/
│   ├── api/chat/route.ts      # Streaming chat endpoint (50-step budget)
│   ├── api/preview/route.ts   # Serves sandboxed files for iframe preview
│   ├── api/providers/route.ts # Available providers endpoint
│   └── page.tsx               # Sidebar chat + preview pane layout
├── components/
│   ├── chat/
│   │   ├── chat.tsx           # Chat component with auto-preview + continuation
│   │   ├── message.tsx        # Text message rendering (markdown)
│   │   ├── tool-call.tsx      # Tool invocation rendering (args + result)
│   │   ├── input.tsx          # Input with provider selector + stop button
│   │   └── error-message.tsx  # Error display
│   └── copilot/
│       └── layer.tsx          # CopilotKit layer (opt-in, see file for instructions)
└── development/               # SANDBOX — all agent file output goes here
```

Trust boundary: Tools marked [PRIVILEGED] perform filesystem and shell operations. In the current Next.js architecture these run in the API route process. In the Tauri phase (see roadmap) they move to Rust Tauri commands, enforcing a hard trust boundary between the webview and the privileged backend.


Stack

| Component          | Package                          | Version |
| ------------------ | -------------------------------- | ------- |
| Native app shell   | tauri                            | v2      |
| Rust secrets store | tauri-plugin-store               | v2      |
| Rust SQLite        | tauri-plugin-sql                 | v2      |
| Tauri TypeScript API | @tauri-apps/api                | v2      |
| AI orchestration   | ai (Vercel AI SDK)               | v6      |
| React streaming    | @ai-sdk/react                    | v3      |
| Schema validation  | zod                              | v4      |
| Web framework      | next                             | 16      |
| MCP                | @modelcontextprotocol/sdk        | v1      |
| Observability      | langfuse                         | v3      |
| Styling            | tailwindcss                      | v4      |
| Testing            | vitest + @testing-library/react  |         |

Development

```bash
npm run dev        # Start Next.js dev server (localhost:3000)
npm test           # Run test suite (vitest)
npm run test:watch # Watch mode
npm run lint       # ESLint

# Tauri (requires Rust toolchain — install via rustup.rs)
npm run tauri:dev   # Tauri + Next.js dev server in a native window
npm run tauri:build # Production .app + .dmg bundle
```

Tests cover provider factory behavior, system prompt construction, tool execution, and message rendering.

Production builds

Production Tauri builds (npm run tauri:build) require a static Next.js export. The next.config.ts enables output: 'export' automatically when TAURI_ENV=production is set — this is done by tauri.conf.json's beforeBuildCommand. API routes do not exist in static exports; all server-side logic moves to Rust commands in Phase 3.

Secrets in Tauri vs development

| Context                 | How secrets are set                                    |
| ----------------------- | ------------------------------------------------------ |
| Development (npm run dev) | .env.local file — standard Next.js env               |
| Tauri app (production)  | Settings UI → setSecret() → Rust store at app data dir |

The Rust store is OS-protected (macOS: ~/Library/Application Support/com.asimpleharness.app/). No API keys ship in the binary or sit in plain text alongside the bundle.


Roadmap

This harness is designed with a layered migration path toward a Mac-native Tauri application. The Next.js layer is a thin adapter; the src/ai/ core is framework-portable TypeScript.

Phase 1 — Correctness (complete)

  • Tool call rendering: tool invocations visible in chat with args and result
  • Gemini thought signature safety: convertToModelMessages() used end-to-end; risk documented at call site
  • MCP host pattern: MCPHost class replaces one-shot connector; manages multiple server connections
  • Trust boundary documentation: privileged tools annotated for Tauri migration

Phase 2 — Tauri scaffold (complete)

  • src-tauri/ Rust workspace: tauri 2.10, tauri-plugin-store, tauri-plugin-sql (SQLite)
  • tauri.conf.json: dev points at localhost:3000; prod builds from static export
  • Capabilities enforce the trust boundary: sql:* allowed for TypeScript, store:* excluded (secrets only via Rust commands)
  • get_secret / set_secret / list_configured_secrets / delete_secret Rust commands
  • SQLite migrations: conversations + messages schema applied on startup
  • src/lib/tauri.ts: isTauri() detection and secret IPC wrappers
  • src/lib/db.ts: conversation and message CRUD via @tauri-apps/plugin-sql
  • next.config.ts: static export mode gated on TAURI_ENV=production
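The runtime detection in src/lib/tauri.ts can be sketched as a check for Tauri's injected global. The exact global name is an assumption here (Tauri v2 injects __TAURI_INTERNALS__ into the webview), and the parameterized signature is for testability, not necessarily the real export.

```typescript
// Illustrative sketch of isTauri() — returns true only inside a Tauri webview,
// where the runtime injects its global before any app code runs.
function isTauri(
  globalObject: Record<string, unknown> = globalThis as unknown as Record<string, unknown>,
): boolean {
  return "__TAURI_INTERNALS__" in globalObject;
}
```

Frontends can use this to route secrets through IPC in the native app while falling back to env-based config under npm run dev.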

Phase 2.5 — Agent sandbox and resilience (complete)

  • Sandbox enforcement: all agent file writes routed to development/ via sandboxPath() — the model cannot modify harness source, config, or any file outside the sandbox
  • Sidebar + preview layout: persistent chat sidebar on the left; sandboxed iframe preview pane on the right with auto-open on HTML writes, Reload/Close controls
  • Step budget and continuation: 50-step tool-call limit with system prompt instructions for planning, incremental work, and checkpointing. Frontend "Continue" button for seamless multi-turn task completion.
  • Stop button: cancel streaming mid-response
  • Shell injection fix: web-ingest.ts uses execFile() with args array instead of exec() with template strings
  • MCP error isolation: per-server and per-tool try/catch prevents cascading failures
  • Secret protection: .env / .env.local blocked from file-read tool
  • Dead code cleanup: removed unused durable.ts, trigger.config.ts, @trigger.dev/sdk
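The shell injection fix above rests on one distinction: execFile() passes arguments as an array, so no shell interprets them. A sketch of the safe pattern — the command here is echo purely for illustration, not the actual Crawl4AI invocation in web-ingest.ts:

```typescript
import { execFileSync } from "node:child_process";

// Illustrative sketch: with an args array there is no shell involved, so
// metacharacters in `url` are passed literally instead of being executed.
function runIngest(url: string): string {
  // "; rm -rf /" inside url stays a plain string argument to the binary.
  return execFileSync("echo", ["crawl", url], { encoding: "utf8" }).trim();
}
```

By contrast, exec() with a template string (`exec(\`crawl ${url}\`)`) would hand the interpolated string to a shell, making the URL an injection vector.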

Phase 3 — Privilege migration

  • Move shell, file-read, file-write to Rust Tauri commands; TypeScript execute() functions become IPC callers
  • Add rig or genai crate for Rust-side provider abstraction; local model calls (Ollama/LM Studio) move to Rust
  • Move MCPHost to Rust backend; tool discovery and execution happen in the privileged layer

Phase 4 — Hardening

  • Local vector store (sqlite-vss or qdrant local) for RAG
  • Explicit Gemini thought signature pass-through for custom agent loops
  • Production monitoring with canary checks and performance baselines
