LLM Providers

CortexPrism supports 24 LLM providers out of the box. Each provider is implemented in `src/llm/` and exposes a common `LLMProvider` interface.

Provider List

Provider	File	Type	Notes
Anthropic Claude	`anthropic.ts`	Native SDK	SSE streaming, extended thinking
OpenAI	`openai.ts`	Native SDK	GPT-4o, o-series models, reasoning effort
Google Gemini	`google.ts`	Native SDK	Native vision, thinking budget
Mistral AI	`mistral.ts`	OpenAI-compatible	Mistral API
Groq	`groq.ts`	OpenAI-compatible	Ultra-fast inference
DeepSeek	`deepseek.ts`	OpenAI-compatible	Chat + Reasoner models
OpenRouter	`openrouter.ts`	OpenAI-compatible	Routes to 200+ models
xAI Grok	`xai.ts`	OpenAI-compatible	Grok-2 and newer
Together AI	`together.ts`	OpenAI-compatible	100+ open models
AWS Bedrock	`bedrock.ts`	AWS SDK	Converse API (Claude, Llama, Titan)
Cohere	`cohere.ts`	Native v2 API	Command R+ models
Ollama	`ollama.ts`	Native	Local models, NDJSON streaming
Cerebras	`cerebras.ts`	OpenAI-compatible	Cerebras inference
Fireworks	`fireworks.ts`	OpenAI-compatible	Fireworks AI
Perplexity	`perplexity.ts`	OpenAI-compatible	Online LLM with citations
NVIDIA NIM	`nvidia.ts`	OpenAI-compatible	NVIDIA inference
Moonshot (Kimi)	`moonshot.ts`	OpenAI-compatible	Moonshot AI
Novita AI	`novita.ts`	OpenAI-compatible	Novita inference
LM Studio	`lmstudio.ts`	OpenAI-compatible	Local LLM server
LiteLLM	`litellm.ts`	OpenAI-compatible	Multi-provider proxy
Hugging Face	`huggingface.ts`	OpenAI-compatible	HF Inference Router
Alibaba (Qwen)	`alibaba.ts`	OpenAI-compatible	Alibaba Cloud
Venice AI	`venice.ts`	OpenAI-compatible	Privacy-focused LLMs
Kilo AI	`kilo.ts`	OpenAI-compatible	Kilo-hosted models

Provider Interface

All providers implement:

interface LLMProvider {
  readonly name: string;
  readonly defaultModel: string;
  complete(options: CompletionOptions): Promise<CompletionResult>;
  stream(options: CompletionOptions): AsyncIterable<CompletionChunk>;
}
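
As an illustration of what implementing this interface involves, here is a minimal sketch of a toy provider. The shapes of `CompletionOptions`, `CompletionResult`, and `CompletionChunk` are assumptions made for the example (the real definitions live in the codebase), and `EchoProvider` and `collectStream` are hypothetical names.

```typescript
// Hypothetical shapes: the page names these types but does not define them.
interface CompletionOptions {
  prompt: string;
  model?: string;
}
interface CompletionResult {
  text: string;
  model: string;
}
interface CompletionChunk {
  delta: string;
}

interface LLMProvider {
  readonly name: string;
  readonly defaultModel: string;
  complete(options: CompletionOptions): Promise<CompletionResult>;
  stream(options: CompletionOptions): AsyncIterable<CompletionChunk>;
}

// A toy provider that echoes the prompt back; useful as a test double.
class EchoProvider implements LLMProvider {
  readonly name = "echo";
  readonly defaultModel = "echo-1";

  async complete(options: CompletionOptions): Promise<CompletionResult> {
    return { text: options.prompt, model: options.model ?? this.defaultModel };
  }

  // Streams the prompt back one space-separated word at a time.
  async *stream(options: CompletionOptions): AsyncIterable<CompletionChunk> {
    for (const word of options.prompt.split(" ")) {
      yield { delta: word };
    }
  }
}

// Callers depend only on the interface, so any provider can be swapped in.
async function collectStream(provider: LLMProvider, prompt: string): Promise<string[]> {
  const parts: string[] = [];
  for await (const chunk of provider.stream({ prompt })) {
    parts.push(chunk.delta);
  }
  return parts;
}
```

Because `collectStream` accepts any `LLMProvider`, the same call works unchanged whether it is handed a real provider or a test double like this one.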

Provider-Specific Settings

Some providers accept additional, provider-specific fields in their config entry:

Provider	Field	Purpose
Anthropic, OpenAI, Google	`reasoningEffort`	Extended thinking budget (low/medium/high)
OpenRouter	`httpReferer`, `xTitle`	HTTP headers for ranking
Perplexity	`searchRecencyFilter`	Recency filter (month/week/day/hour)
Together, Fireworks, Novita	`repetitionPenalty`	Token repetition penalty
Ollama, LM Studio	`numCtx`, `keepAlive`	Context window size, keep-alive duration
LiteLLM	`dropParams`	Drop unsupported parameters silently
Venice AI	`includeVeniceSystemPrompt`	Venice system prompt toggle
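
One way to picture these fields is as a typed union keyed on `kind`. The sketch below is an assumption about how such config entries could be typed, not the project's actual types: only the field names come from the table, while the value types (for example `keepAlive` as a string) and the helper `openRouterHeaders` are hypothetical. The header names `HTTP-Referer` and `X-Title` are the ones OpenRouter documents for app attribution.

```typescript
// Hypothetical config shapes; only the field names come from the table above.
type ReasoningEffort = "low" | "medium" | "high";
type Recency = "month" | "week" | "day" | "hour";

interface BaseProviderConfig {
  kind: string;
  model?: string;
  apiKey?: string;
}
interface AnthropicConfig extends BaseProviderConfig {
  kind: "anthropic";
  reasoningEffort?: ReasoningEffort;
}
interface OpenRouterConfig extends BaseProviderConfig {
  kind: "openrouter";
  httpReferer?: string;
  xTitle?: string;
}
interface PerplexityConfig extends BaseProviderConfig {
  kind: "perplexity";
  searchRecencyFilter?: Recency;
}
interface OllamaConfig extends BaseProviderConfig {
  kind: "ollama";
  numCtx?: number;    // context window size in tokens
  keepAlive?: string; // assumed to be a duration string such as "5m"
}
type ProviderConfig = AnthropicConfig | OpenRouterConfig | PerplexityConfig | OllamaConfig;

// Maps the OpenRouter-specific fields onto the HTTP headers they stand for.
function openRouterHeaders(config: OpenRouterConfig): Record<string, string> {
  const headers: Record<string, string> = {};
  if (config.httpReferer) headers["HTTP-Referer"] = config.httpReferer;
  if (config.xTitle) headers["X-Title"] = config.xTitle;
  return headers;
}

// Narrowing on `kind` gives access to the provider-specific fields.
function describe(config: ProviderConfig): string {
  switch (config.kind) {
    case "anthropic":
      return `anthropic (effort: ${config.reasoningEffort ?? "default"})`;
    case "openrouter":
      return `openrouter (${Object.keys(openRouterHeaders(config)).length} extra headers)`;
    case "perplexity":
      return `perplexity (recency: ${config.searchRecencyFilter ?? "any"})`;
    case "ollama":
      return `ollama (numCtx: ${config.numCtx ?? "default"})`;
  }
}
```

A discriminated union like this lets the compiler reject a field set on the wrong provider, such as `numCtx` on an OpenRouter entry.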

Multimodal Support

How each provider handles image and document attachments:

Provider	Image Support	Document Support
Anthropic	Native image blocks	Native document blocks
Google Gemini	`inlineData` parts	—
OpenAI	`image_url` parts	—
Ollama	`images` array	—
Bedrock	Text extraction	Text extraction

For text-only models, uploaded files are saved to disk and a descriptive message is appended to the prompt.
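
To make the table concrete, the sketch below converts one uploaded image into the request shape each API expects. The `ImageAttachment` type, the function names, and the wording of the fallback message are assumptions for illustration; the payload shapes follow the public Anthropic, OpenAI, Gemini, and Ollama APIs.

```typescript
// Hypothetical internal representation of an uploaded image.
interface ImageAttachment {
  mimeType: string; // e.g. "image/png"
  base64: string;   // raw base64 payload, without a "data:" prefix
  path: string;     // where the upload was saved on disk
}

// Anthropic: a native image content block with a base64 source.
function toAnthropicBlock(img: ImageAttachment) {
  return {
    type: "image",
    source: { type: "base64", media_type: img.mimeType, data: img.base64 },
  };
}

// OpenAI: an `image_url` content part carrying a data URL.
function toOpenAIPart(img: ImageAttachment) {
  return {
    type: "image_url",
    image_url: { url: `data:${img.mimeType};base64,${img.base64}` },
  };
}

// Google Gemini: an `inlineData` part.
function toGeminiPart(img: ImageAttachment) {
  return { inlineData: { mimeType: img.mimeType, data: img.base64 } };
}

// Ollama: base64 strings in an `images` array on the message itself.
function toOllamaMessage(text: string, images: ImageAttachment[]) {
  return { role: "user", content: text, images: images.map((i) => i.base64) };
}

// Text-only models: describe the saved file in the prompt instead.
// The exact wording here is made up for the example.
function textOnlyFallback(prompt: string, img: ImageAttachment): string {
  return `${prompt}\n\n[Attached file saved at ${img.path} (${img.mimeType})]`;
}
```

Keeping one converter per provider means the rest of the pipeline can pass around a single attachment type and defer the format decision to the provider boundary.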

Model Router

The model router wraps providers with one of two routing strategies:

Cascade: Tries the cheapest provider first and escalates to a stronger one when confidence is low
Threshold (RouteLLM-style): Scores prompt complexity and routes to either a strong or a weak model

Both routers implement `LLMProvider`, making them drop-in replacements for a single provider.
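
The two strategies can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's router code: the `confidence` field on the result, the default thresholds, the streaming behavior, and the `lengthScore` heuristic are all hypothetical.

```typescript
// Hypothetical shapes; `confidence` in particular is an assumption of this sketch.
interface CompletionOptions { prompt: string }
interface CompletionResult { text: string; confidence?: number }
interface CompletionChunk { delta: string }
interface LLMProvider {
  readonly name: string;
  readonly defaultModel: string;
  complete(options: CompletionOptions): Promise<CompletionResult>;
  stream(options: CompletionOptions): AsyncIterable<CompletionChunk>;
}

// Cascade: providers are ordered cheapest first; escalate while confidence is low.
class CascadeRouter implements LLMProvider {
  readonly name = "cascade";
  readonly defaultModel: string;
  private readonly tiers: LLMProvider[];
  private readonly minConfidence: number;

  constructor(tiers: LLMProvider[], minConfidence = 0.7) {
    if (tiers.length === 0) throw new Error("CascadeRouter needs at least one provider");
    this.tiers = tiers;
    this.minConfidence = minConfidence;
    this.defaultModel = tiers[0].defaultModel;
  }

  async complete(options: CompletionOptions): Promise<CompletionResult> {
    let result = await this.tiers[0].complete(options);
    for (const provider of this.tiers.slice(1)) {
      if ((result.confidence ?? 0) >= this.minConfidence) break;
      result = await provider.complete(options);
    }
    return result; // if every tier was unsure, this is the last tier's answer
  }

  // Confidence is only known after a full completion, so this sketch
  // simply streams from the last (strongest) tier.
  stream(options: CompletionOptions): AsyncIterable<CompletionChunk> {
    return this.tiers[this.tiers.length - 1].stream(options);
  }
}

// Threshold: score the prompt, then pick the strong or the weak model.
class ThresholdRouter implements LLMProvider {
  readonly name = "threshold";
  readonly defaultModel: string;
  private readonly weak: LLMProvider;
  private readonly strong: LLMProvider;
  private readonly score: (prompt: string) => number;
  private readonly threshold: number;

  constructor(weak: LLMProvider, strong: LLMProvider, score: (prompt: string) => number, threshold = 0.5) {
    this.weak = weak;
    this.strong = strong;
    this.score = score;
    this.threshold = threshold;
    this.defaultModel = weak.defaultModel;
  }

  private pick(options: CompletionOptions): LLMProvider {
    return this.score(options.prompt) >= this.threshold ? this.strong : this.weak;
  }
  complete(options: CompletionOptions): Promise<CompletionResult> {
    return this.pick(options).complete(options);
  }
  stream(options: CompletionOptions): AsyncIterable<CompletionChunk> {
    return this.pick(options).stream(options);
  }
}

// Toy complexity score: longer prompts count as harder.
// A RouteLLM-style router would use a learned scorer here instead.
const lengthScore = (prompt: string): number => Math.min(prompt.length / 200, 1);
```

Since both classes satisfy `LLMProvider`, routers can also be nested, for example a `CascadeRouter` used as the strong arm of a `ThresholdRouter`.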

Model Quartermaster (MQM)

MQM is a learning-based model selection engine that dynamically routes requests based on:

Historical performance by task category
Episodic memory hits and quality signals
Cost optimization
Trajectory patterns (recent model usage)
Reflection feedback

MQM offers three arbiter strategies: conservative, balanced, and aggressive.
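
The page does not describe how the arbiter weighs these signals, so the following is only a rough sketch of the general idea: score each candidate on historical quality and cost, with weights that depend on the strategy. The `ModelStats` shape, the weight values, and the reading of "conservative" as quality-leaning and "aggressive" as cost-leaning are all assumptions, and the other signals (episodic memory, trajectory, reflection) are left out.

```typescript
type ArbiterStrategy = "conservative" | "balanced" | "aggressive";

// Hypothetical per-model statistics for one task category.
interface ModelStats {
  model: string;
  successRate: number;     // 0..1, historical success on this task category
  costPer1kTokens: number; // in any consistent currency unit
}

// Hypothetical weights: conservative leans on proven quality,
// aggressive leans on cost savings.
const WEIGHTS: Record<ArbiterStrategy, { quality: number; cost: number }> = {
  conservative: { quality: 0.9, cost: 0.1 },
  balanced: { quality: 0.6, cost: 0.4 },
  aggressive: { quality: 0.3, cost: 0.7 },
};

function pickModel(candidates: ModelStats[], strategy: ArbiterStrategy): string {
  if (candidates.length === 0) throw new Error("no candidate models");
  const weights = WEIGHTS[strategy];
  const maxCost = Math.max(...candidates.map((c) => c.costPer1kTokens));

  let best = candidates[0];
  let bestScore = -Infinity;
  for (const candidate of candidates) {
    // Normalize cost to 0..1, where 1 means free and 0 means the priciest candidate.
    const cheapness = maxCost > 0 ? 1 - candidate.costPer1kTokens / maxCost : 1;
    const score = weights.quality * candidate.successRate + weights.cost * cheapness;
    if (score > bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best.model;
}
```

With one strong but expensive candidate and one cheaper, weaker candidate, this scoring picks the strong model under `conservative` and the cheap one under `aggressive`, which is the kind of trade-off the strategy names suggest.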

Configuration

Providers and model selection are configured in `~/.cortex/config.json`:

{
  "defaultProvider": "anthropic",
  "providers": {
    "anthropic": {
      "kind": "anthropic",
      "model": "claude-sonnet-4-5",
      "apiKey": "sk-ant-..."
    }
  },
  "modelSelection": {
    "enabled": true,
    "mode": "balanced",
    "observeThreshold": 50
  }
}
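
To show how the pieces of this file relate, here is a small sketch that parses the JSON and looks up the entry named by `defaultProvider`. The `CortexConfig` type and `resolveDefaultProvider` function are hypothetical and mirror only the fields visible in the example above; reading the file from disk is left to the caller.

```typescript
// Hypothetical types mirroring only the fields shown in the example config.
interface ProviderEntry {
  kind: string;
  model?: string;
  apiKey?: string;
}
interface CortexConfig {
  defaultProvider: string;
  providers: Record<string, ProviderEntry>;
  modelSelection?: { enabled: boolean; mode: string; observeThreshold: number };
}

// Parses the config text and returns the entry that `defaultProvider` points at.
function resolveDefaultProvider(jsonText: string): ProviderEntry {
  const config = JSON.parse(jsonText) as CortexConfig;
  const entry = config.providers?.[config.defaultProvider];
  if (!entry) {
    throw new Error(`defaultProvider "${config.defaultProvider}" is not defined under "providers"`);
  }
  return entry;
}
```

The check matters because `defaultProvider` is a key into `providers` rather than a provider kind, so a typo in either place would otherwise surface only at request time.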
