Provider System

M31 Autonomous (M31A) supports multiple LLM providers with automatic fallback, model caching, capability detection, and SSE streaming.

Provider Interface

Source: internal/provider/interface.go

type LLMProvider interface {
    Name() string
    APIKey() string
    FetchModels(ctx context.Context) ([]types.ModelInfo, error)
    CachedModels() []types.ModelInfo
    ChatCompletionStream(ctx context.Context, req ChatRequest) (*types.StreamIterator, error)
    EstimateCost(modelID string, usage types.Usage) float64
    HealthCheck(ctx context.Context) types.HealthStatus
    GetModel(id string) (*types.ModelInfo, error)
}

ChatRequest

type ChatRequest struct {
    Model            string           `json:"model"`
    Messages         []types.Message  `json:"messages"`
    MaxTokens        int              `json:"max_tokens,omitempty"`
    Tools            []ToolDefinition `json:"tools,omitempty"`
    ReasoningEnabled bool             `json:"reasoning_enabled,omitempty"`
}

Registry

Source: internal/provider/registry.go

The Registry manages multiple providers with thread-safe operations:

Method	Description
`Register(name, provider)`	Add a provider
`SetActive(name)`	Switch active provider
`TrySetActive(name)`	Atomic set (prevents TOCTOU races)
`RollbackActive(from, to)`	Revert on health check failure
`Active()`	Current active provider name
`ActiveProvider()`	Current active provider instance
`Get(name)`	Look up provider by name
`List()`	All registered provider names (sorted)
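
To illustrate why `TrySetActive` and `RollbackActive` matter, here is a minimal sketch of a mutex-guarded registry. It is not the code in registry.go: the return values, the `ErrUnknownProvider` error, the `stubProvider` type, and the "roll back only if `from` is still active" rule are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// Provider stands in for LLMProvider; only Name() matters for this sketch.
type Provider interface{ Name() string }

// stubProvider is a hypothetical provider used for the demo.
type stubProvider string

func (s stubProvider) Name() string { return string(s) }

// ErrUnknownProvider is an assumed error value, not taken from the source.
var ErrUnknownProvider = errors.New("unknown provider")

// Registry guards its map and active name with a single RWMutex.
type Registry struct {
	mu        sync.RWMutex
	providers map[string]Provider
	active    string
}

func NewRegistry() *Registry {
	return &Registry{providers: make(map[string]Provider)}
}

func (r *Registry) Register(name string, p Provider) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.providers[name] = p
}

// TrySetActive validates and switches under one write lock, so no other
// goroutine can change the active provider between the check and the update.
// It returns the previously active name for a later rollback.
func (r *Registry) TrySetActive(name string) (string, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.providers[name]; !ok {
		return r.active, ErrUnknownProvider
	}
	prev := r.active
	r.active = name
	return prev, nil
}

// RollbackActive restores `to` only if `from` is still active, so a newer
// switch made in the meantime is not overwritten.
func (r *Registry) RollbackActive(from, to string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.active != from {
		return false
	}
	r.active = to
	return true
}

func (r *Registry) Active() string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.active
}

// List returns all registered names in sorted order.
func (r *Registry) List() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.providers))
	for name := range r.providers {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	r := NewRegistry()
	r.Register("zen", stubProvider("zen"))
	r.Register("openrouter", stubProvider("openrouter"))
	prev, _ := r.TrySetActive("openrouter")
	fmt.Println(r.List(), r.Active())
	// Pretend the health check on the new provider failed.
	r.RollbackActive("openrouter", prev)
	fmt.Println(r.Active() == prev)
}
```

Doing the existence check and the assignment inside one critical section is what removes the time-of-check/time-of-use window that separate `Get` and `SetActive` calls would leave open.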

Implementations

OpenRouter

Source: internal/provider/openrouter/client.go

Aggregates 300+ models from multiple providers
Default base URL: https://openrouter.ai/api/v1
Sends HTTP-Referer and X-Title headers
Custom base URL supported for proxies

Zen

Source: internal/provider/zen/client.go

OpenCode Zen API gateway
Default base URL: https://opencode.ai/zen/v1
Default context length support

Base Client

Source: internal/provider/base_client.go

Shared HTTP transport used by all provider implementations:

Setting	Value
Max idle connections	100
Max idle per host	10
Idle timeout	90 seconds
Dial timeout	30 seconds (`HTTPDialTimeout`)
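
The settings above map directly onto fields of the standard library's `http.Transport`. The following is a sketch of how such a transport could be built, not the contents of base_client.go; the `NewTransport` helper name is an assumption.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// HTTPDialTimeout mirrors the constant named in the table above.
const HTTPDialTimeout = 30 * time.Second

// NewTransport (hypothetical helper) applies the documented pool settings.
func NewTransport() *http.Transport {
	return &http.Transport{
		// The dial timeout bounds TCP connection establishment only.
		DialContext:         (&net.Dialer{Timeout: HTTPDialTimeout}).DialContext,
		MaxIdleConns:        100,              // across all hosts
		MaxIdleConnsPerHost: 10,               // per provider endpoint
		IdleConnTimeout:     90 * time.Second, // idle connections are closed after this
	}
}

func main() {
	tr := NewTransport()
	client := &http.Client{Transport: tr}
	fmt.Println(client.Transport == tr, tr.MaxIdleConns, tr.MaxIdleConnsPerHost, tr.IdleConnTimeout)
}
```

Sharing one transport across providers lets idle keep-alive connections be reused instead of re-dialed for every request.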

Provides common operations: APIKey() (masked), EstimateCost(), GetModel(), CachedModels(), MakeIterator().

Model Cache

Source: internal/provider/cache.go

TTL-based in-memory cache with stale-while-revalidate semantics:

Setting	Default	Description
TTL	5 minutes (`ModelCacheTTL`)	Fresh cache lifetime
Stale TTL	24 hours (`StaleCacheTTL`)	Fallback cache lifetime

When the fresh TTL expires, stale entries are still served while a background refresh runs. Uses golang.org/x/sync/singleflight to prevent a thundering herd on cache misses.
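
The stale-while-revalidate behaviour can be sketched with the standard library alone. This is an illustration, not the code in cache.go: it swaps singleflight for a simple `refreshing` flag (so concurrent cold misses are not deduplicated here), models are plain strings, and `NewModelCache`, `Wait`, and the `fakeClock` helper are assumptions made so the example can be run deterministically.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const (
	ModelCacheTTL = 5 * time.Minute // fresh lifetime
	StaleCacheTTL = 24 * time.Hour  // fallback lifetime
)

// ModelCache is a simplified single-entry cache.
type ModelCache struct {
	mu         sync.Mutex
	wg         sync.WaitGroup
	models     []string
	fetchedAt  time.Time
	loaded     bool
	refreshing bool
	now        func() time.Time
	fetch      func() ([]string, error)
}

func NewModelCache(now func() time.Time, fetch func() ([]string, error)) *ModelCache {
	return &ModelCache{now: now, fetch: fetch}
}

// Get returns fresh data as-is, serves stale data while triggering at most
// one background refresh, and fetches synchronously when nothing is usable.
func (c *ModelCache) Get() ([]string, error) {
	c.mu.Lock()
	age := c.now().Sub(c.fetchedAt)
	if c.loaded && age < StaleCacheTTL {
		models := c.models
		if age >= ModelCacheTTL && !c.refreshing {
			c.refreshing = true
			c.wg.Add(1)
			go c.refreshInBackground()
		}
		c.mu.Unlock()
		return models, nil
	}
	c.mu.Unlock()
	return c.reload()
}

func (c *ModelCache) refreshInBackground() {
	defer c.wg.Done()
	_, _ = c.reload() // on error the stale entry simply stays in place
	c.mu.Lock()
	c.refreshing = false
	c.mu.Unlock()
}

// reload calls the fetcher outside the lock and stores the result.
func (c *ModelCache) reload() ([]string, error) {
	models, err := c.fetch()
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.models, c.fetchedAt, c.loaded = models, c.now(), true
	return models, nil
}

// Wait blocks until any background refresh has finished (demo helper).
func (c *ModelCache) Wait() { c.wg.Wait() }

// fakeClock is a controllable time source for the demo.
type fakeClock struct{ t time.Time }

func newFakeClock() *fakeClock                 { return &fakeClock{t: time.Unix(0, 0)} }
func (f *fakeClock) Now() time.Time            { return f.t }
func (f *fakeClock) Advance(d time.Duration)   { f.t = f.t.Add(d) }

func main() {
	clk := newFakeClock()
	version := 0
	cache := NewModelCache(clk.Now, func() ([]string, error) {
		version++
		return []string{fmt.Sprintf("catalog-v%d", version)}, nil
	})
	first, _ := cache.Get() // cold miss: synchronous fetch
	clk.Advance(ModelCacheTTL + time.Second)
	stale, _ := cache.Get() // stale hit: old data now, refresh in background
	cache.Wait()
	fresh, _ := cache.Get() // refreshed data
	fmt.Println(first, stale, fresh)
}
```

The caller on the stale path never waits for the network, which is the point of the pattern: latency stays flat and only one refresh is in flight at a time.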

Model Metadata

Source: internal/provider/model_metadata.go

Enriches raw model data with additional metadata:

Tokenizer family detection
Variant classification ("thinking", "fast", "extended", "vision")

Capability Detection

Source: internal/provider/capabilities.go

Heuristic inference of model capabilities from model ID patterns:

Capability	Detection Heuristic
Tool Use	claude, gpt, gemini, deepseek, qwen, llama, mistral, command-r, command-a
Reasoning	`/o1`, `/o3`, `/o4` patterns, or "reason" or "thinking" in ID
Vision	"vision" or "multimodal" in ID
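
A minimal sketch of these heuristics follows. It assumes case-insensitive substring matching and treats the reasoning patterns as alternatives (any single match suffices); the real rules in capabilities.go may be stricter. The `Capabilities` struct and `DetectCapabilities` name are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Capabilities is a hypothetical result type for this sketch.
type Capabilities struct {
	ToolUse   bool
	Reasoning bool
	Vision    bool
}

// Pattern lists copied from the table above.
var (
	toolUseFamilies   = []string{"claude", "gpt", "gemini", "deepseek", "qwen", "llama", "mistral", "command-r", "command-a"}
	reasoningPatterns = []string{"/o1", "/o3", "/o4", "reason", "thinking"}
	visionPatterns    = []string{"vision", "multimodal"}
)

// containsAny reports whether s contains at least one of the substrings.
func containsAny(s string, subs []string) bool {
	for _, sub := range subs {
		if strings.Contains(s, sub) {
			return true
		}
	}
	return false
}

// DetectCapabilities infers capabilities purely from the model ID.
func DetectCapabilities(modelID string) Capabilities {
	id := strings.ToLower(modelID)
	return Capabilities{
		ToolUse:   containsAny(id, toolUseFamilies),
		Reasoning: containsAny(id, reasoningPatterns),
		Vision:    containsAny(id, visionPatterns),
	}
}

func main() {
	for _, id := range []string{
		"anthropic/claude-3.5-sonnet",
		"openai/o3-mini",
		"meta-llama/llama-3.2-11b-vision-instruct",
	} {
		fmt.Printf("%-42s %+v\n", id, DetectCapabilities(id))
	}
}
```

Because the detection is name-based, it can produce false negatives: in this sketch `openai/o3-mini` is flagged for reasoning but not for tool use, since its ID contains none of the listed family names.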

Automatic Fallback

Source: internal/provider/fallback.go

When the active provider degrades, M31A automatically switches to a healthy alternative:

Health Check Flow

Active provider fails
    │
    ▼
FindFallbackProvider()
    ├── Collect candidate providers (exclude current)
    ├── Parallel health checks (10s timeout)
    ├── Pick first "live" or "slow" provider
    └── Commit switch via TrySetActive()
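
The flow above can be sketched as follows. This is an assumption-laden illustration rather than the code in fallback.go: the `Checker` interface, the `"down"` status, the `FindFallback` signature, and the rule that the first healthy candidate in list order wins are all hypothetical. The final commit through `TrySetActive()` is left out, and the function derives its own context to keep the example short.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

type HealthStatus string

const (
	StatusLive HealthStatus = "live"
	StatusSlow HealthStatus = "slow"
	StatusDown HealthStatus = "down" // hypothetical value for an unhealthy provider
)

// HealthCheckTimeout bounds all parallel checks together.
const HealthCheckTimeout = 10 * time.Second

// Checker is the slice of LLMProvider this sketch needs.
type Checker interface {
	Name() string
	HealthCheck(ctx context.Context) HealthStatus
}

// stubChecker returns a fixed status (demo only).
type stubChecker struct {
	name   string
	status HealthStatus
}

func (s stubChecker) Name() string                             { return s.name }
func (s stubChecker) HealthCheck(context.Context) HealthStatus { return s.status }

// FindFallback checks every provider except current in parallel and returns
// the first candidate, in list order, whose status is live or slow.
func FindFallback(current string, all []Checker) (string, HealthStatus, bool) {
	ctx, cancel := context.WithTimeout(context.Background(), HealthCheckTimeout)
	defer cancel()

	// Collect candidates, excluding the failing provider.
	var candidates []Checker
	for _, p := range all {
		if p.Name() != current {
			candidates = append(candidates, p)
		}
	}

	// Each goroutine writes to its own slot, so no extra locking is needed.
	statuses := make([]HealthStatus, len(candidates))
	var wg sync.WaitGroup
	for i, p := range candidates {
		wg.Add(1)
		go func(i int, p Checker) {
			defer wg.Done()
			statuses[i] = p.HealthCheck(ctx)
		}(i, p)
	}
	wg.Wait()

	for i, s := range statuses {
		if s == StatusLive || s == StatusSlow {
			return candidates[i].Name(), s, true
		}
	}
	return "", StatusDown, false
}

func main() {
	providers := []Checker{
		stubChecker{"openrouter", StatusDown},
		stubChecker{"zen", StatusSlow},
	}
	name, status, ok := FindFallback("openrouter", providers)
	fmt.Println(name, status, ok)
}
```

Running the checks concurrently means the total wait is bounded by the slowest single check (or the shared timeout), not by the sum of all checks.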

Retry-After Awareness

FindFallbackWithRetryAfter() handles HTTP 429 responses:

Extracts Retry-After header value
Caps wait at 120 seconds (MaxRetryAfterWait)
Returns FallbackAfterWait struct for async scheduling (avoids blocking the Bubble Tea event loop)
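
The header handling described above might look like the sketch below. The `ParseRetryAfter` name and signature are assumptions; the fields of `FallbackAfterWait` are not shown in this page, so the example returns only the capped duration. Both forms allowed by HTTP (delta-seconds and an HTTP-date) are handled.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// MaxRetryAfterWait caps how long a 429 response can delay a retry.
const MaxRetryAfterWait = 120 * time.Second

// ParseRetryAfter (hypothetical helper) converts a Retry-After header value
// into a wait duration clamped to [0, MaxRetryAfterWait].
// ok is false when the value is missing or unparseable.
func ParseRetryAfter(value string) (wait time.Duration, ok bool) {
	value = strings.TrimSpace(value)
	if value == "" {
		return 0, false
	}
	if secs, err := strconv.Atoi(value); err == nil {
		// Form 1: delay in whole seconds.
		wait = time.Duration(secs) * time.Second
	} else if at, err := http.ParseTime(value); err == nil {
		// Form 2: absolute HTTP-date.
		wait = time.Until(at)
	} else {
		return 0, false
	}
	if wait < 0 {
		wait = 0
	}
	if wait > MaxRetryAfterWait {
		wait = MaxRetryAfterWait
	}
	return wait, true
}

func main() {
	h := http.Header{}
	h.Set("Retry-After", "3600")
	wait, ok := ParseRetryAfter(h.Get("Retry-After"))
	fmt.Println(wait, ok)

	// Schedule the retry without blocking the caller.
	timer := time.AfterFunc(0, func() {})
	timer.Stop()
}
```

Returning a duration instead of sleeping lets the caller schedule the retry asynchronously (for example with a timer), which is what keeps a UI event loop responsive.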

Fallback Event

type FallbackEvent struct {
    From   string `json:"from"`
    To     string `json:"to"`
    Reason string `json:"reason"` // "fallback_live", "fallback_slow", "rate_limited"
}

SSE Streaming

Source: internal/provider/sse.go

Server-Sent Events parser for streaming LLM responses:

Parses data: [DONE] sentinel for stream completion
Handles delta chunks for text content
Handles tool_call chunks for native tool invocation
Detects truncated streams (ErrStreamTruncated)
Enforces MaxLLMResponseBytes (1 MB) to prevent OOM
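
A stripped-down version of such a parser is sketched below. It is not the code in sse.go: the `ReadSSE` and `CollectSSE` names and the `ErrResponseTooLarge` error are assumptions, the size limit is assumed to apply to the cumulative bytes read, and multi-line `data:` events are not joined. JSON decoding of each payload into delta or tool_call chunks is left to the callback.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// MaxLLMResponseBytes is the 1 MB cap described above.
const MaxLLMResponseBytes = 1 << 20

var (
	ErrStreamTruncated  = errors.New("stream ended before [DONE]")
	ErrResponseTooLarge = errors.New("response exceeds MaxLLMResponseBytes") // hypothetical
)

// ReadSSE passes each data payload to onData until the [DONE] sentinel.
// Reaching EOF without the sentinel is reported as a truncated stream.
func ReadSSE(r io.Reader, onData func(payload string) error) error {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), MaxLLMResponseBytes)
	total := 0
	for sc.Scan() {
		line := sc.Text()
		total += len(line) + 1 // +1 for the newline
		if total > MaxLLMResponseBytes {
			return ErrResponseTooLarge
		}
		if !strings.HasPrefix(line, "data:") {
			continue // blank separators, comments, other fields
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" {
			return nil
		}
		if err := onData(payload); err != nil {
			return err
		}
	}
	if err := sc.Err(); err != nil {
		return err
	}
	return ErrStreamTruncated
}

// CollectSSE is a convenience wrapper that gathers all payloads of a string.
func CollectSSE(stream string) ([]string, error) {
	var payloads []string
	err := ReadSSE(strings.NewReader(stream), func(p string) error {
		payloads = append(payloads, p)
		return nil
	})
	return payloads, err
}

func main() {
	stream := "data: {\"delta\":\"Hel\"}\n\n: keep-alive\n\ndata: {\"delta\":\"lo\"}\n\ndata: [DONE]\n\n"
	payloads, err := CollectSSE(stream)
	fmt.Println(payloads, err)

	_, err = CollectSSE("data: {\"delta\":\"Hel\"}\n\n")
	fmt.Println(errors.Is(err, ErrStreamTruncated))
}
```

Treating a missing `[DONE]` as an error, rather than as a normal end of stream, is what lets the caller distinguish a complete answer from a dropped connection.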

Stream Iterator

type StreamIterator struct {
    Next  func() (*StreamChunk, error)
    Close func() error
}

The workflow engine's consumeStreamWithTools() reads chunks and:

Accumulates text deltas into a strings.Builder
Collects tool_call chunks by index into toolCallBuilder structs
Finalizes into ToolCall objects with parsed JSON arguments
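
The three steps above can be sketched as a single loop. The `StreamChunk`, `ToolCallDelta`, and `ToolCall` field layouts here are assumptions (the source does not show them), as are the use of `io.EOF` to signal the end of the stream and the `SliceIterator` test helper.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"sort"
	"strings"
)

// Hypothetical chunk shapes for this sketch.
type ToolCallDelta struct {
	Index        int
	ID, Name     string
	ArgsFragment string // partial JSON
}

type StreamChunk struct {
	Text     string
	ToolCall *ToolCallDelta
}

type StreamIterator struct {
	Next  func() (*StreamChunk, error)
	Close func() error
}

type ToolCall struct {
	ID, Name string
	Args     map[string]any
}

// toolCallBuilder accumulates the fragments of one tool call.
type toolCallBuilder struct {
	id, name string
	args     strings.Builder
}

// ConsumeStream drains the iterator and returns the text and tool calls.
func ConsumeStream(it *StreamIterator) (string, []ToolCall, error) {
	defer it.Close()
	var text strings.Builder
	builders := map[int]*toolCallBuilder{}
	for {
		chunk, err := it.Next()
		if errors.Is(err, io.EOF) {
			break
		}
		if err != nil {
			return "", nil, err
		}
		text.WriteString(chunk.Text)
		if d := chunk.ToolCall; d != nil {
			b, ok := builders[d.Index]
			if !ok {
				b = &toolCallBuilder{}
				builders[d.Index] = b
			}
			if d.ID != "" {
				b.id = d.ID
			}
			if d.Name != "" {
				b.name = d.Name
			}
			b.args.WriteString(d.ArgsFragment)
		}
	}
	// Finalize in index order so the result is deterministic.
	indexes := make([]int, 0, len(builders))
	for i := range builders {
		indexes = append(indexes, i)
	}
	sort.Ints(indexes)
	calls := make([]ToolCall, 0, len(indexes))
	for _, i := range indexes {
		b := builders[i]
		raw := b.args.String()
		if raw == "" {
			raw = "{}" // a call without arguments
		}
		var args map[string]any
		if err := json.Unmarshal([]byte(raw), &args); err != nil {
			return "", nil, fmt.Errorf("tool call %d (%s): %w", i, b.name, err)
		}
		calls = append(calls, ToolCall{ID: b.id, Name: b.name, Args: args})
	}
	return text.String(), calls, nil
}

// SliceIterator replays fixed chunks and then returns io.EOF (demo helper).
func SliceIterator(chunks []StreamChunk) *StreamIterator {
	i := 0
	return &StreamIterator{
		Next: func() (*StreamChunk, error) {
			if i >= len(chunks) {
				return nil, io.EOF
			}
			i++
			return &chunks[i-1], nil
		},
		Close: func() error { return nil },
	}
}

func main() {
	it := SliceIterator([]StreamChunk{
		{Text: "Reading "},
		{Text: "file."},
		{ToolCall: &ToolCallDelta{Index: 0, ID: "call_1", Name: "read_file", ArgsFragment: `{"path":`}},
		{ToolCall: &ToolCallDelta{Index: 0, ArgsFragment: `"main.go"}`}},
	})
	text, calls, err := ConsumeStream(it)
	fmt.Println(text, calls, err)
}
```

Arguments are parsed only after the stream ends because each fragment on its own is usually not valid JSON.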

Provider Registration

Source: internal/tui/provider_registration.go

Provider registration happens at startup in main.go:

registry := provider.NewRegistry()
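// Register only the providers that have an API key configured.
if cfg.Provider.OpenRouter.APIKey != "" {
    RegisterProvider(registry, cfg, "openrouter", cfg.Provider.OpenRouter.APIKey, version)
}
if cfg.Provider.Zen.APIKey != "" {
    RegisterProvider(registry, cfg, "zen", cfg.Provider.Zen.APIKey, version)
}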
if cfg.Provider.Default != "" {
    registry.SetActive(cfg.Provider.Default)
}

Reasoning Support

Source: internal/provider/reasoning.go

Extended thinking / reasoning mode support:

Detects models that support reasoning via capability flags
ReasoningEnabled flag in ChatRequest
Thinking duration tracked in StreamChunk.ThinkingDuration
TUI renders thinking blocks with configurable opacity

Common Utilities

Source: internal/provider/common.go

Shared helpers:

HTTP response body reading with size limits
Error sanitization (caps at MaxProviderErrorChars = 200)
Retry-After header parsing
Rate limit detection (HTTP 429)
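
Three of these helpers are simple enough to sketch with the standard library. The function names, the `ErrBodyTooLarge` error, the whitespace collapsing, and the choice to count the 200-character cap in runes are assumptions; common.go may differ on each point.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// MaxProviderErrorChars is the cap named above.
const MaxProviderErrorChars = 200

// ErrBodyTooLarge is a hypothetical error for oversized responses.
var ErrBodyTooLarge = errors.New("response body exceeds limit")

// ReadLimited reads at most limit bytes and fails if the body is larger.
// Reading limit+1 bytes is what makes "exactly at the limit" distinguishable
// from "over the limit".
func ReadLimited(r io.Reader, limit int64) ([]byte, error) {
	data, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > limit {
		return nil, ErrBodyTooLarge
	}
	return data, nil
}

// SanitizeError collapses whitespace and truncates to the character cap so
// that provider error bodies cannot flood logs or the UI.
func SanitizeError(msg string) string {
	msg = strings.Join(strings.Fields(msg), " ")
	runes := []rune(msg)
	if len(runes) > MaxProviderErrorChars {
		return string(runes[:MaxProviderErrorChars])
	}
	return msg
}

// IsRateLimited reports whether a status code signals rate limiting.
func IsRateLimited(status int) bool {
	return status == http.StatusTooManyRequests
}

func main() {
	body, err := ReadLimited(strings.NewReader(`{"error":"quota exceeded"}`), 1024)
	fmt.Println(string(body), err)

	_, err = ReadLimited(strings.NewReader("0123456789"), 4)
	fmt.Println(errors.Is(err, ErrBodyTooLarge))

	fmt.Println(SanitizeError("upstream\n  failure\t(502)"))
	fmt.Println(IsRateLimited(http.StatusTooManyRequests))
}
```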
