RFC: Establish pkg/ai as the foundational AI kernel #2409

@shaj13

Description


Overview

This RFC proposes restructuring the Docker Agent so it can grow beyond a CLI tool for running agentic AI. There are many open-source AI frameworks out there, each with its own limitations and tradeoffs. The Docker Agent already has strong core capabilities — model fallback, streaming, tool execution, structured output, multi-provider support — but they're buried inside runtime internals, tightly coupled and not reusable.

The focus of this proposal is modularization and establishing a strong core. By redesigning around a central pkg/ai package with clean, idiomatic Go APIs, we create clear boundaries between layers. With that modularity in place, targeting different use cases becomes straightforward, adding new features becomes incremental instead of invasive, and we avoid the tradeoffs that come from having everything tangled together.

Pain

The Docker Agent was built around pkg/runtime as the primary entry point for model interaction. This made sense early on — the runtime powers the TUI agent loop and it does that job well. But as the project grew, more features needed to talk to models (session titles, compaction, background agents), and each one had to work around the runtime rather than with a shared foundation.

Today, the core LLM interaction logic — streaming, fallback, retry, tool execution — lives inside runtime alongside orchestration concerns like sessions, events, permissions, and hooks. These are tightly coupled, which makes it difficult to use one without pulling in the other.

Specific challenges:

  • Model interaction requires runtime. There is no lightweight way to call a model. Session compaction creates an entire nested runtime instance for a single summarization call. Session title generation reimplements its own stream drain and fallback loop to avoid depending on runtime.

  • Core types lack a shared root. Message and Usage live in pkg/chat, Tool and ToolCall in pkg/tools, Provider in pkg/model/provider, error handling in pkg/modelerrors. These all describe the same domain — LLM interaction — but there is no common package they stem from, which leads to cross-cutting imports and circular dependency pressure as new features are added.

  • Extending means working around coupling. Adding a new capability that needs model access (evaluation, RAG rewriting, structured extraction) requires either importing runtime with all its dependencies or duplicating the call-and-stream pattern. Neither scales well.

  • One shape for all use cases. The runtime is designed for long-running interactive agents. But not every model interaction is agentic — some are one-shot completions, some are structured data extraction, some are background tasks. These different shapes are forced through the same path today.

Goal

Establish pkg/ai as the core package — a dependency-free foundation that owns all LLM interaction primitives. Everything needed to communicate with a model lives here: types, interfaces, streaming, fallback, retry, tool execution, and structured output.

The package should:

  • Be the single import for model interaction. One package gives you messages, tools, providers, streaming, and completion — no need to assemble five imports to make a call.

  • Be generic and use-case agnostic. It should serve one-shot text generation, structured data extraction, and multi-turn tool loops equally well — without assuming agents, sessions, or UI.

  • Sit at the bottom of the dependency graph. pkg/ai depends on nothing internal. Everything else — runtime, agents, sessions, providers — depends on it. This breaks circular dependency pressure and gives the codebase a clear direction.

  • Make the runtime thinner. Runtime becomes focused on what it's good at: orchestrating long-running interactive agents. It calls into pkg/ai for model interaction instead of owning that logic. Session compaction and title generation become simple callers.

  • Lower the cost of new features. Any new capability that needs to talk to a model — evaluation, background summarization, RAG pipelines — imports pkg/ai and calls Generate. No runtime, no duplication.

Proposed Design

Package Structure

pkg/ai/
├── ai.go           // Generate, GenerateText, GenerateData[T]
├── message.go      // Message, Role, Usage, FinishReason (from pkg/chat)
├── tool.go         // Tool, ToolCall, ToolResult (from pkg/tools)
├── provider.go     // Provider interface (from pkg/model/provider)
├── stream.go       // StreamDelta, stream draining logic (from runtime/streaming.go)
├── fallback.go     // Retry, backoff, model chain (from runtime/fallback.go)
├── option.go       // Functional options: WithModel, WithTools, WithStream, etc.
├── errors.go       // Error classification, context overflow (from pkg/modelerrors)
└── result.go       // Result type returned by Generate

Core API

package ai

// Generate runs the full completion loop: call model → stream response →
// execute tools → repeat, until the model stops or max turns is reached.
// Fallback, retry, and streaming are handled internally.
func Generate(ctx context.Context, opts ...Option) (*Result, error)

// GenerateText is a convenience wrapper that returns the text content.
func GenerateText(ctx context.Context, opts ...Option) (string, error)

// GenerateData calls the model with structured output and unmarshals into T.
func GenerateData[T any](ctx context.Context, opts ...Option) (T, error)

Result

type Result struct {
    Text             string
    ReasoningContent string
    ToolCalls        []ToolCall
    FinishReason     FinishReason
    Usage            *Usage
    Model            string       // which model actually responded
}

Options (functional)

ai.WithModel(model)              // primary provider
ai.WithFallbacks(models...)      // fallback chain
ai.WithMessages(msgs...)         // conversation messages
ai.WithTools(tools...)           // available tools
ai.WithMaxTurns(n)               // max tool-call round trips
ai.WithMaxTokens(n)              // model max output tokens
ai.WithStream(callback)          // streaming delta callback
ai.WithRetries(n)                // per-model retry count
// ... more options as needed

Provider Interface

type Provider interface {
    ID() string
    CreateChatCompletionStream(ctx context.Context, messages []Message, tools []Tool) (MessageStream, error)
}

Existing provider implementations (OpenAI, Anthropic, etc.) implement this interface. They live outside pkg/ai and are passed in via WithModel().

What Generate Does Internally

1. Build model chain: primary + fallbacks
2. Call model (with retry/backoff on failure, fallback on exhaustion)
3. Drain stream → aggregate into Result
4. If model returned tool calls and turns remain:
   a. Execute tools (through event/hook system — design TBD)
   b. Append tool results to messages
   c. Go to step 2
5. Return final Result

Event / Hook System

The ai package will expose an event and hook system that allows callers to observe, intercept, and control the completion loop from outside. This includes:

  • Observing streaming deltas, tool calls, model fallbacks
  • Allowing or rejecting tool calls before execution
  • Modifying tool inputs/outputs
  • Injecting side effects (session recording, telemetry, UI events)

The specific design of this system is out of scope for this RFC and will be addressed in a follow-up.

Migration

Existing packages (pkg/chat, pkg/tools, pkg/model/provider, pkg/modelerrors) will alias their types to pkg/ai to avoid breaking existing code. Over time, consumers migrate to importing pkg/ai directly.
