LLM Providers

CortexPrism edited this page Jun 17, 2026 · 1 revision

CortexPrism supports 24 LLM providers out of the box. Each provider is implemented in src/llm/ and exposes a common LLMProvider interface.

Provider List

| Provider | File | Type | Notes |
| --- | --- | --- | --- |
| Anthropic Claude | anthropic.ts | Native SDK | SSE streaming, extended thinking |
| OpenAI | openai.ts | Native SDK | GPT-4o, o-series models, reasoning effort |
| Google Gemini | google.ts | Native SDK | Native vision, thinking budget |
| Mistral AI | mistral.ts | OpenAI-compatible | Mistral API |
| Groq | groq.ts | OpenAI-compatible | Ultra-fast inference |
| DeepSeek | deepseek.ts | OpenAI-compatible | Chat + Reasoner models |
| OpenRouter | openrouter.ts | OpenAI-compatible | Routes to 200+ models |
| xAI Grok | xai.ts | OpenAI-compatible | Grok-2 and newer |
| Together AI | together.ts | OpenAI-compatible | 100+ open models |
| AWS Bedrock | bedrock.ts | AWS SDK | Converse API (Claude, Llama, Titan) |
| Cohere | cohere.ts | Native v2 API | Command R+ models |
| Ollama | ollama.ts | Native | Local models, NDJSON streaming |
| Cerebras | cerebras.ts | OpenAI-compatible | Cerebras inference |
| Fireworks | fireworks.ts | OpenAI-compatible | Fireworks AI |
| Perplexity | perplexity.ts | OpenAI-compatible | Online LLM with citations |
| NVIDIA NIM | nvidia.ts | OpenAI-compatible | NVIDIA inference |
| Moonshot (Kimi) | moonshot.ts | OpenAI-compatible | Moonshot AI |
| Novita AI | novita.ts | OpenAI-compatible | Novita inference |
| LM Studio | lmstudio.ts | OpenAI-compatible | Local LLM server |
| LiteLLM | litellm.ts | OpenAI-compatible | Multi-provider proxy |
| Hugging Face | huggingface.ts | OpenAI-compatible | HF Inference Router |
| Alibaba (Qwen) | alibaba.ts | OpenAI-compatible | Alibaba Cloud |
| Venice AI | venice.ts | OpenAI-compatible | Privacy-focused LLMs |
| Kilo AI | kilo.ts | OpenAI-compatible | Kilo-hosted models |

Provider Interface

All providers implement:

```typescript
interface LLMProvider {
  readonly name: string;
  readonly defaultModel: string;
  complete(options: CompletionOptions): Promise<CompletionResult>;
  stream(options: CompletionOptions): AsyncIterable<CompletionChunk>;
}
```
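
As an illustration of what implementing this interface might look like, here is a minimal sketch of a provider that echoes its prompt back. The field layouts of CompletionOptions, CompletionResult, and CompletionChunk are assumptions (this page names the types but does not define them), and EchoProvider and collect are hypothetical names, not part of CortexPrism.

```typescript
// Assumed shapes: the wiki names these types but does not define them.
interface CompletionOptions {
  prompt: string;
  model?: string;
}
interface CompletionResult {
  text: string;
  model: string;
}
interface CompletionChunk {
  delta: string;
}

// Copied from the interface above so the sketch is self-contained.
interface LLMProvider {
  readonly name: string;
  readonly defaultModel: string;
  complete(options: CompletionOptions): Promise<CompletionResult>;
  stream(options: CompletionOptions): AsyncIterable<CompletionChunk>;
}

// Hypothetical provider that echoes the prompt; handy as a test double.
class EchoProvider implements LLMProvider {
  readonly name = "echo";
  readonly defaultModel = "echo-1";

  async complete(options: CompletionOptions): Promise<CompletionResult> {
    return { text: options.prompt, model: options.model ?? this.defaultModel };
  }

  // Yields the prompt one word at a time to mimic token streaming.
  async *stream(options: CompletionOptions): AsyncIterable<CompletionChunk> {
    for (const word of options.prompt.split(" ")) {
      yield { delta: word };
    }
  }
}

// Drains a provider's stream and joins the chunks with single spaces.
async function collect(provider: LLMProvider, prompt: string): Promise<string> {
  const parts: string[] = [];
  for await (const chunk of provider.stream({ prompt })) {
    parts.push(chunk.delta);
  }
  return parts.join(" ");
}
```

Because callers depend only on LLMProvider, a stub like this can stand in for a real provider in tests without network access.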

Provider-Specific Settings

Some providers accept additional fields in their config entry:

| Provider | Field | Purpose |
| --- | --- | --- |
| Anthropic, OpenAI, Google | reasoningEffort | Extended thinking budget (low/medium/high) |
| OpenRouter | httpReferer, xTitle | HTTP headers for ranking |
| Perplexity | searchRecencyFilter | Recency filter (month/week/day/hour) |
| Together, Fireworks, Novita | repetitionPenalty | Token repetition penalty |
| Ollama, LM Studio | numCtx, keepAlive | Context window size, keep-alive duration |
| LiteLLM | dropParams | Drop unsupported parameters silently |
| Venice AI | includeVeniceSystemPrompt | Venice system prompt toggle |
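
To make the mapping concrete, the sketch below checks a provider entry for fields that do not apply to its kind. The field names come from the table above. The kind strings other than "anthropic" are guessed from the file names in the provider list, and PROVIDER_FIELDS, COMMON_FIELDS, and unsupportedFields are hypothetical helpers. CortexPrism may validate settings differently, or not at all.

```typescript
// Provider-specific fields, keyed by an assumed "kind" string.
const PROVIDER_FIELDS: Record<string, readonly string[]> = {
  anthropic: ["reasoningEffort"],
  openai: ["reasoningEffort"],
  google: ["reasoningEffort"],
  openrouter: ["httpReferer", "xTitle"],
  perplexity: ["searchRecencyFilter"],
  together: ["repetitionPenalty"],
  fireworks: ["repetitionPenalty"],
  novita: ["repetitionPenalty"],
  ollama: ["numCtx", "keepAlive"],
  lmstudio: ["numCtx", "keepAlive"],
  litellm: ["dropParams"],
  venice: ["includeVeniceSystemPrompt"],
};

// Fields that appear in the sample provider entry in the Configuration section.
const COMMON_FIELDS: readonly string[] = ["kind", "model", "apiKey"];

// Returns the keys in `settings` that are neither common nor listed for `kind`.
function unsupportedFields(kind: string, settings: Record<string, unknown>): string[] {
  const allowed = new Set<string>([...COMMON_FIELDS, ...(PROVIDER_FIELDS[kind] ?? [])]);
  return Object.keys(settings).filter((key) => !allowed.has(key));
}
```

For example, passing numCtx to a provider of kind "groq" would be reported, while searchRecencyFilter on "perplexity" would pass.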

Multimodal Support

How providers with multimodal support receive images and documents:

| Provider | Image Support | Document Support |
| --- | --- | --- |
| Anthropic | Native image blocks | Native document blocks |
| Google Gemini | inlineData parts | |
| OpenAI | image_url parts | |
| Ollama | images array | |
| Bedrock | Text extraction | Text extraction |

For text-only models, uploaded files are saved to disk and a descriptive message is appended to the prompt.
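
The payload shapes named in the table can be sketched as follows. The object layouts follow the public request formats of each vendor's API as best understood here. ImageInput and the function names are hypothetical, and the wording of the text-only fallback note is an assumption, since this page does not show the actual message.

```typescript
// Hypothetical input type; the wiki does not define one.
interface ImageInput {
  mimeType: string; // e.g. "image/png"
  base64: string;   // image bytes, base64-encoded
}

// Anthropic Messages API: an image content block.
function toAnthropicBlock(img: ImageInput) {
  return {
    type: "image",
    source: { type: "base64", media_type: img.mimeType, data: img.base64 },
  };
}

// Gemini API: an inlineData part.
function toGeminiPart(img: ImageInput) {
  return { inlineData: { mimeType: img.mimeType, data: img.base64 } };
}

// OpenAI Chat Completions: an image_url part carrying a data URL.
function toOpenAIPart(img: ImageInput) {
  return {
    type: "image_url",
    image_url: { url: `data:${img.mimeType};base64,${img.base64}` },
  };
}

// Ollama chat API: base64 strings in the message's images array.
function toOllamaMessage(prompt: string, img: ImageInput) {
  return { role: "user", content: prompt, images: [img.base64] };
}

// Text-only fallback: append a note describing the saved file.
// The note format is an assumption made for this sketch.
function appendFileNote(prompt: string, savedPath: string, mimeType: string): string {
  return `${prompt}\n\n[Attached file saved to ${savedPath} (${mimeType})]`;
}
```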

Model Router

The model router wraps providers with one of two routing strategies:

  • Cascade: Tries cheapest provider first, escalates on low confidence
  • Threshold (RouteLLM-style): Scores prompt complexity, routes to strong or weak model

Both routers implement LLMProvider, making them drop-in replacements.
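
The two strategies can be sketched roughly as below. This is a simplified illustration, not CortexPrism's router: it uses a reduced ScoredProvider interface instead of the full LLMProvider, it assumes each completion carries a confidence value between 0 and 1, and the complexity heuristic is a toy stand-in for whatever scoring the real threshold router uses.

```typescript
// Simplified provider for the sketch; the confidence signal is an assumption.
interface ScoredProvider {
  name: string;
  complete(prompt: string): Promise<{ text: string; confidence: number }>;
}

// Cascade: try providers in ascending cost order and stop at the first
// answer whose confidence reaches minConfidence. If none does, the answer
// from the last (most expensive) provider is returned.
async function cascade(providersByCost: ScoredProvider[], prompt: string, minConfidence: number) {
  if (providersByCost.length === 0) throw new Error("no providers configured");
  let last = { provider: "", text: "", confidence: 0 };
  for (const p of providersByCost) {
    const result = await p.complete(prompt);
    last = { provider: p.name, ...result };
    if (result.confidence >= minConfidence) break;
  }
  return last;
}

// Toy complexity score in [0, 1]: longer prompts and code-related keywords
// score higher. A real router would likely use a learned scorer.
function complexity(prompt: string): number {
  const words = prompt.trim().split(/\s+/).filter(Boolean).length;
  const lengthScore = Math.min(words / 200, 1);
  const codeBonus = /\b(function|class|def)\b/.test(prompt) ? 0.3 : 0;
  return Math.min(lengthScore + codeBonus, 1);
}

// Threshold: route to the strong model when the score reaches the threshold.
function pickTier(prompt: string, threshold: number): "strong" | "weak" {
  return complexity(prompt) >= threshold ? "strong" : "weak";
}
```

The cascade pays for a second call only when the cheap answer looks weak, whereas the threshold router makes a single call but depends entirely on the quality of its up-front score.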

Model Quartermaster (MQM)

MQM is a learning-based model selection engine that dynamically routes requests based on:

  • Historical performance by task category
  • Episodic memory hits and quality signals
  • Cost optimization
  • Trajectory patterns (recent model usage)
  • Reflection feedback

MQM provides three arbiter strategies: conservative, balanced, and aggressive.
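
This page does not describe how the arbiter combines these signals, so the following is only a guess at the general idea, reduced to two of the inputs (historical performance and cost). It assumes that conservative favors historical quality and aggressive favors cost savings. ModelStats, QUALITY_WEIGHT, selectModel, and the weight values are all hypothetical.

```typescript
type Strategy = "conservative" | "balanced" | "aggressive";

// Hypothetical per-model statistics for one task category.
interface ModelStats {
  model: string;
  successRate: number; // historical success rate, 0..1
  costPer1k: number;   // assumed cost per 1k tokens, arbitrary units
}

// Assumed weights: how much each strategy values quality over cost.
const QUALITY_WEIGHT: Record<Strategy, number> = {
  conservative: 0.9,
  balanced: 0.6,
  aggressive: 0.3,
};

// Scores each candidate as a weighted mix of success rate and relative
// cheapness, then returns the name of the highest-scoring model.
function selectModel(candidates: ModelStats[], strategy: Strategy): string {
  if (candidates.length === 0) throw new Error("no candidate models");
  const maxCost = Math.max(...candidates.map((c) => c.costPer1k));
  const weight = QUALITY_WEIGHT[strategy];
  let bestModel = "";
  let bestScore = -Infinity;
  for (const c of candidates) {
    // Cheapness is 1 for a free model and 0 for the most expensive one.
    const cheapness = maxCost > 0 ? 1 - c.costPer1k / maxCost : 1;
    const score = weight * c.successRate + (1 - weight) * cheapness;
    if (score > bestScore) {
      bestScore = score;
      bestModel = c.model;
    }
  }
  return bestModel;
}
```

Under these assumed weights, a costly model with a 0.95 success rate wins under conservative, while a model ten times cheaper with a 0.70 success rate wins under aggressive.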

Configuration

Providers and model selection are configured in ~/.cortex/config.json:

```json
{
  "defaultProvider": "anthropic",
  "providers": {
    "anthropic": {
      "kind": "anthropic",
      "model": "claude-sonnet-4-5",
      "apiKey": "sk-ant-..."
    }
  },
  "modelSelection": {
    "enabled": true,
    "mode": "balanced",
    "observeThreshold": 50
  }
}
```
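
A loader for this file might validate it along these lines. The types below mirror only the fields visible in the sample above. Whether apiKey and modelSelection are optional is an assumption, and parseConfig is a hypothetical helper rather than a CortexPrism function. Reading the file from disk is left out so the sketch has no runtime dependencies.

```typescript
// Types inferred from the sample config; optionality is assumed.
interface ProviderConfig {
  kind: string;
  model: string;
  apiKey?: string;
}
interface CortexConfig {
  defaultProvider: string;
  providers: Record<string, ProviderConfig>;
  modelSelection?: { enabled: boolean; mode: string; observeThreshold: number };
}

// Parses the JSON text and checks that defaultProvider points at a
// provider entry and that each entry has a kind and a model.
function parseConfig(json: string): CortexConfig {
  const raw = JSON.parse(json) as Partial<CortexConfig>;
  const defaultProvider = raw.defaultProvider;
  if (typeof defaultProvider !== "string") {
    throw new Error("defaultProvider must be a string");
  }
  const providers = raw.providers ?? {};
  if (!(defaultProvider in providers)) {
    throw new Error(`defaultProvider "${defaultProvider}" has no entry under providers`);
  }
  for (const [id, entry] of Object.entries(providers)) {
    if (typeof entry.kind !== "string" || typeof entry.model !== "string") {
      throw new Error(`provider "${id}" needs string fields "kind" and "model"`);
    }
  }
  return { ...raw, defaultProvider, providers };
}
```

Checking the defaultProvider reference at load time surfaces a mistyped provider name immediately instead of on the first request.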
